From tim.one@home.com  Sun Jul  1 02:58:29 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 30 Jun 2001 21:58:29 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3E4487.40054EAE@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEKLKLAA.tim.one@home.com>

[Paul Prescod]
> "The Energy is the mass of the object times the speed of light times
> two."

[David Ascher]
> Actually, it's "squared", not times two.  At least in my universe =)

This is something for Guido to Pronounce on, then.  Who's going to write the
PEP?  The threat of nuclear war seems almost laughable in Paul's universe,
so it's certainly got attractions.  OTOH, it's got to be a lot colder too.

energy-will-do-what-guido-tells-it-to-do-ly y'rs  - tim



From paulp@ActiveState.com  Sun Jul  1 04:59:02 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sat, 30 Jun 2001 20:59:02 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com>
Message-ID: <3B3EA006.14882609@ActiveState.com>

David Ascher wrote:
> 
> > "The Energy is the mass of the object times the speed of light times
> > two."
> 
> Actually, it's "squared", not times two.  At least in my universe =)

Pedant. Next you're going to claim that these silly equations effect my
life somehow.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From paulp@ActiveState.com  Sun Jul  1 05:04:49 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sat, 30 Jun 2001 21:04:49 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com>
Message-ID: <3B3EA161.1375F74C@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> 
> The term "character" in Python should really only be used for
> the 8-bit strings. 

Are we going to change chr() and unichr() to one_element_string() and
unicode_one_element_string()?

u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
character. No Python user will find that confusing no matter how Unicode
knuckle-dragging, mouth-breathing, wife-by-hair-dragging they are.

> In Unicode a "character" can mean any of:

Mark Davis said that "people" can use the word to mean any of those
things. He did not say that it was imprecisely defined in Unicode.
Nevertheless I'm not using the Unicode definition any more than our
standard library uses an ancient Greek definition of integer. Python has
a concept of integer and a concept of character.

> >     It has been proposed that there should be a module for working
> >     with UTF-16 strings in narrow Python builds through some sort of
> >     abstraction that handles surrogates for you. If someone wants
> >     to implement that, it will be another PEP.
> 
> Uhm, narrow builds don't support UTF-16... it's UCS-2 which
> is supported (basically: store everything in range(0x10000));
> the codecs can map code points to surrogates, but it is solely
> their responsibility and the responsibility of the application
> using them to take care of dealing with surrogates.

The user can view the data as UCS-2, UTF-16, Base64, ROT-13, XML, ....
Just as we have a base64 module, we could have a UTF-16 module that
interprets the data in the string as UTF-16 and does surrogate
manipulation for you.

Anyhow, if any of those is the "real" encoding of the data, it is
UTF-16. After all, if the codec reads in four non-BMP characters in,
let's say, UTF-8, we represent them as 8 narrow-build Python characters.
That's the definition of UTF-16! But it's easy enough for me to take
that word out so I will.
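
The arithmetic in the paragraph above can be checked with a minimal
sketch. It is an illustration only, not code from the thread, and it
assumes a modern Python 3 interpreter, where str is indexed by code point
and the "utf-16-le" codec exposes the 16-bit code units a narrow build
would have stored. The sample characters and the helper name
utf16_code_units are hypothetical choices.

```python
# Four code points outside the Basic Multilingual Plane (all above U+FFFF).
text = "\U00010000\U0001D11E\U0001F600\U00020000"

utf8_bytes = text.encode("utf-8")  # what a UTF-8 codec would read in

def utf16_code_units(s):
    """Count the 16-bit code units needed to store s as UTF-16."""
    # An explicit byte order is used so that no byte-order mark is added.
    return len(s.encode("utf-16-le")) // 2

print(len(utf8_bytes))         # 16: four bytes per non-BMP character in UTF-8
print(utf16_code_units(text))  # 8: one surrogate pair per non-BMP character
```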

>...
> Also, the module will be useful for both narrow and wide builds,
> since the notion of an encoded character can involve multiple code
> points. In that sense Unicode is always a variable length
> encoding for characters and that's the application field of
> this module.

I wouldn't advise that you do all different types of normalization in a
single module but I'll wait for your PEP.

> Here's the adjusted text:
> 
>      It has been proposed that there should be a module for working
>      with Unicode objects using character-, word- and line- based
>      indexing. The details of the implementation are left to
>      another PEP.
 
     It has been proposed that there should be a module that handles
     surrogates in narrow Python builds for programmers. If someone 
     wants to implement that, it will be another PEP. It might also be 
     combined with features that allow other kinds of character-, 
     word- and line- based indexing.



From DavidA@ActiveState.com  Sun Jul  1 07:09:40 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Sat, 30 Jun 2001 23:09:40 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com>
Message-ID: <3B3EBEA4.3EC84EAF@ActiveState.com>

Paul Prescod wrote:
> 
> David Ascher wrote:
> >
> > > "The Energy is the mass of the object times the speed of light times
> > > two."
> >
> > Actually, it's "squared", not times two.  At least in my universe =)
> 
> Pedant. Next you're going to claim that these silly equations effect my
> life somehow.

Although one stretch the argument to say that the equations _effect_
your life, I'd limit the claim to stating that they _affect_ your life. 

pedantly y'rs,

--dr david


From paulp@ActiveState.com  Sun Jul  1 07:15:46 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sat, 30 Jun 2001 23:15:46 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com>
Message-ID: <3B3EC012.A3A05E64@ActiveState.com>

David Ascher wrote:
> 
> Paul Prescod wrote:
> >
> > David Ascher wrote:
> > >
> > > > "The Energy is the mass of the object times the speed of light times
> > > > two."
> > >
> > > Actually, it's "squared", not times two.  At least in my universe =)
> >
> > Pedant. Next you're going to claim that these silly equations effect my
> > life somehow.
> 
> Although one stretch the argument to say that the equations _effect_
              ^               
might    -----

> your life, I'd limit the claim to stating that they _affect_ your life.

And you just bought such a shiny, new glass, house. Pity.



From nhodgson@bigpond.net.au  Sun Jul  1 14:00:15 2001
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Sun, 1 Jul 2001 23:00:15 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>
Message-ID: <00dd01c1022d$c61e4160$0acc8490@neil>

Paul Prescod:
<PEP: 261>

   The problem I have with this PEP is that it is a compile time option
which makes it hard to work with both 32 bit and 16 bit strings in one
program. Can not the 32 bit string type be introduced as an additional type?

> Are we going to change chr() and unichr() to one_element_string() and
> unicode_one_element_string()?
>
> u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> character.

   This wasn't usefully true in the past for DBCS strings and is not the
right way to think of either narrow or wide strings now. The idea that
strings are arrays of characters gets in the way of dealing with many
encodings and is the primary difficulty in localising software for Japanese.
Iteration through the code units in a string is a problem waiting to bite
you and string APIs should encourage behaviour which is correct when faced
with variable width characters, both DBCS and UTF style. Iteration over
variable width characters should be performed in a way that preserves the
integrity of the characters. M.-A. Lemburg's proposed set of iterators could
be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to
provide for the various intended string uses such as "for c in
s.inVisualOrder()" reversing the receipt of right-to-left substrings.

   Neil




From guido@digicool.com  Sun Jul  1 14:44:29 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 01 Jul 2001 09:44:29 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Sun, 01 Jul 2001 23:00:15 +1000."
 <00dd01c1022d$c61e4160$0acc8490@neil>
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>
 <00dd01c1022d$c61e4160$0acc8490@neil>
Message-ID: <200107011344.f61DiTM03548@odiug.digicool.com>

> <PEP: 261>
> 
>    The problem I have with this PEP is that it is a compile time option
> which makes it hard to work with both 32 bit and 16 bit strings in one
> program. Can not the 32 bit string type be introduced as an additional type?

Not without an outrageous amount of additional coding (every place in
the code that currently uses PyUnicode_Check() would have to be
bifurcated in a 16-bit and a 32-bit variant).

I doubt that the desire to work with both 16- and 32-bit characters in
one program is typical for folks using Unicode -- that's mostly
limited to folks writing conversion tools.  Python will offer the
necessary codecs so you shouldn't have this need very often.

You can use the array module to manipulate 16- and 32-bit arrays, and
you can use the various Unicode encodings to do the necessary
encodings.
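
A rough sketch of that combination follows. It is an illustration added
here, not code from the message, and it assumes a modern Python 3, where
array.frombytes() and the "utf-16-le"/"utf-32-le" family of codecs are
available. The helper name code_units is hypothetical. The C sizes behind
array typecodes are platform-dependent, so the sketch checks itemsize
rather than assuming it.

```python
import sys
from array import array

def code_units(s, typecode):
    """Return s as native-endian UTF-16 ('H') or UTF-32 ('I') code units."""
    bits = {"H": 16, "I": 32}[typecode]
    suffix = "le" if sys.byteorder == "little" else "be"
    units = array(typecode)
    # Typecode sizes are only minimums; verify before reinterpreting bytes.
    if units.itemsize * 8 != bits:
        raise RuntimeError("typecode %r is not %d bits here" % (typecode, bits))
    units.frombytes(s.encode("utf-%d-%s" % (bits, suffix)))
    return units

narrow = code_units("a\U00010000", "H")  # 'a' plus a surrogate pair
wide = code_units("a\U00010000", "I")    # 'a' plus one 32-bit unit
print(len(narrow), len(wide))
```

The two arrays hold the same text; only the code unit size differs, which
is the distinction between the narrow and wide builds being discussed.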

> > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> > character.
> 
>    This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea that
> strings are arrays of characters gets in the way of dealing with many
> encodings and is the primary difficulty in localising software for Japanese.

Can you explain the kind of problems encountered in some more detail?

> Iteration through the code units in a string is a problem waiting to bite
> you and string APIs should encourage behaviour which is correct when faced
> with variable width characters, both DBCS and UTF style.

But this is not the Unicode philosophy.  All the variable-length
character manipulation is supposed to be taken care of by the codecs,
and then the application can deal in arrays of characters.
Alternatively, the application can deal in opaque objects representing
variable-length encodings, but then it should be very careful with
concatenation and even more so with slicing.

> Iteration over
> variable width characters should be performed in a way that preserves the
> integrity of the characters. M.-A. Lemburg's proposed set of iterators could
> be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to
> provide for the various intended string uses such as "for c in
> s.inVisualOrder()" reversing the receipt of right-to-left substrings.

I think it's a good idea to provide a set of higher-level tools as
well.  However nobody seems to know what these higher-level tools
should do yet.  PEP 261 is specifically focused on getting the
lower-level foundations right (i.e. the objects that represent arrays
of code units), so that the authors of higher level tools will have a
solid base.  If you want to help author a PEP for such higher-level
tools, you're welcome!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From loewis@informatik.hu-berlin.de  Sun Jul  1 14:52:58 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 1 Jul 2001 15:52:58 +0200 (MEST)
Subject: [Python-Dev] Support for "wide" Unicode characters
Message-ID: <200107011352.PAA27645@pandora.informatik.hu-berlin.de>

> The problem I have with this PEP is that it is a compile time option
> which makes it hard to work with both 32 bit and 16 bit strings in
> one program.

Can you elaborate why you think this is a problem?

> Can not the 32 bit string type be introduced as an additional type?

Yes, but not just "like that". You'd have to define an API for
creating values of this type, you'd have to teach all functions which
ought to accept it to process it, you'd have to define conversion
operations and all that: In short, you'd have to go through all the
trouble that introduction of the Unicode type gave us once again.
Also, I cannot see any advantages in introducing yet another type.

Implementing this PEP is straightforward, with almost no visible
effect on Python programs.

People have suggested making it a run-time decision, having the
internal representation switch on demand, but that would give an API
nightmare for C code that has to access such values.

> u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> character.

>  This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea
> that strings are arrays of characters gets in the way of dealing
> with many encodings and is the primary difficulty in localising
> software for Japanese.

While I don't know much about localising software for Japanese (*), I
agree that 'u[i] is a character' isn't useful to say in many cases. If
this is the old Python string type, I'd much prefer calling u[i] a
'byte'.

Regards,
Martin

(*) Methinks that the primary difficulty still is translating all the
documentation, and messages. Actually, keeping the translations
up-to-date is even more challenging.


From aahz@rahul.net  Sun Jul  1 15:19:41 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Sun, 1 Jul 2001 07:19:41 -0700 (PDT)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3EC012.A3A05E64@ActiveState.com> from "Paul Prescod" at Jun 30, 2001 11:15:46 PM
Message-ID: <20010701141941.A323099C80@waltz.rahul.net>

Paul Prescod wrote:
> David Ascher wrote:
>> Paul Prescod wrote:
>>> David Ascher wrote:
>>>>>
>>>>> "The Energy is the mass of the object times the speed of light times
>>>>> two."
>>>>
>>>> Actually, it's "squared", not times two.  At least in my universe =)
>>>
>>> Pedant. Next you're going to claim that these silly equations effect my
>>> life somehow.
>> 
>> Although one stretch the argument to say that the equations _effect_
>               ^               
> might    -----
> 
>> your life, I'd limit the claim to stating that they _affect_ your life.
> 
> And you just bought such a shiny, new glass, house. Pity.

All speeling falmes contain at least one erorr.
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From just@letterror.com  Sun Jul  1 15:43:08 2001
From: just@letterror.com (Just van Rossum)
Date: Sun,  1 Jul 2001 16:43:08 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <200107011344.f61DiTM03548@odiug.digicool.com>
Message-ID: <20010701164315-r01010600-c2d5b07d@213.84.27.177>

Guido van Rossum wrote:

> > <PEP: 261>
> > 
> >    The problem I have with this PEP is that it is a compile time option
> > which makes it hard to work with both 32 bit and 16 bit strings in one
> > program. Can not the 32 bit string type be introduced as an additional type?
> 
> Not without an outrageous amount of additional coding (every place in
> the code that currently uses PyUnicode_Check() would have to be
> bifurcated in a 16-bit and a 32-bit variant).

Alternatively, a Unicode object could *internally* be either 8, 16 or 32 bits
wide (to be clear: not per character, but per string). Also a lot of work, but
it'll be a lot less wasteful.

> I doubt that the desire to work with both 16- and 32-bit characters in
> one program is typical for folks using Unicode -- that's mostly
> limited to folks writing conversion tools.  Python will offer the
> necessary codecs so you shouldn't have this need very often.

Not a lot of people will want to work with 16 or 32 bit chars directly, but I
think a less wasteful solution to the surrogate pair problem *will* be desired
by people. Why use 32 bits for all strings in a program when only a tiny
percentage actually *needs* more than 16? (Or even 8...)
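
The per-string width idea can be sketched in a few lines. This is an
illustration of the selection rule only, under the assumption that the
width is chosen from the largest code point in the string. It says
nothing about how the C implementation would have to change, and the
function name storage_width is hypothetical.

```python
def storage_width(s):
    """Smallest element width in bytes that holds every code point of s."""
    highest = max(map(ord, s), default=0)
    if highest < 0x100:
        return 1   # fits in 8 bits
    if highest < 0x10000:
        return 2   # fits in 16 bits (Basic Multilingual Plane)
    return 4       # needs a 32-bit element

for sample in ("spam", "\u20ac 5", "\U00010000"):
    print(repr(sample), storage_width(sample))
```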

> > Iteration through the code units in a string is a problem waiting to bite
> > you and string APIs should encourage behaviour which is correct when faced
> > with variable width characters, both DBCS and UTF style.
> 
> But this is not the Unicode philosophy.  All the variable-length
> character manipulation is supposed to be taken care of by the codecs,
> and then the application can deal in arrays of characters.

Right: this is the way it should be.

My difficulty with PEP 261 is that I'm afraid few people will actually enable
32-bit support (*what*?! all unicode strings become 32 bits wide? no way!),
therefore making programs non-portable in very subtle ways.

Just


From DavidA@ActiveState.com  Sun Jul  1 18:13:30 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Sun, 01 Jul 2001 10:13:30 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com> <3B3EC012.A3A05E64@ActiveState.com>
Message-ID: <3B3F5A3A.A88B54B2@ActiveState.com>

Paul: 
> And you just bought such a shiny, new glass, house. Pity.

What kind of comma placement is that?

--david


From paulp@ActiveState.com  Sun Jul  1 19:08:10 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 11:08:10 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> <00dd01c1022d$c61e4160$0acc8490@neil>
Message-ID: <3B3F670A.B5396D61@ActiveState.com>

Neil Hodgson wrote:
> 
> Paul Prescod:
> <PEP: 261>
> 
>    The problem I have with this PEP is that it is a compile time option
> which makes it hard to work with both 32 bit and 16 bit strings in one
> program. Can not the 32 bit string type be introduced as an additional type?

The two solutions are not mutually exclusive. If you (or someone)
supplies a 32-bit type and Guido accepts it, then the compile option
might fall into disuse. But this solution was chosen because it is much
less work. Really though, I think that having 16-bit and 32-bit types is
extra confusion for very little gain. I would much rather have a single
space-efficient type that hid the details of its implementation. But
nobody has volunteered to code it and Guido might not accept it even if
someone did.

>...
>    This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea that
> strings are arrays of characters gets in the way of dealing with many
> encodings and is the primary difficulty in localising software for Japanese.

The whole benefit of moving to 32-bit character strings is to allow
people to think of strings as arrays of characters. Forcing them to
consider variable-length encodings is precisely what we are trying to
avoid.

> Iteration through the code units in a string is a problem waiting to bite
> you and string APIs should encourage behaviour which is correct when faced
> with variable width characters, both DBCS and UTF style. Iteration over
> variable width characters should be performed in a way that preserves the
> integrity of the characters. 

On wide Python builds there is no such thing as variable width Unicode
characters. It doesn't make sense to combine two 32-bit characters to
get a 64-bit one. On narrow Python builds you might want to treat a
surrogate pair as a single character but I would strongly advise against
it. If you want wide characters, move to a wide build. Even if a narrow
build is more space efficient, you'll lose a ton of performance
emulating wide characters in Python code.

> ... M.-A. Lemburg's proposed set of iterators could
> be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to
> provide for the various intended string uses such as "for c in
> s.inVisualOrder()" reversing the receipt of right-to-left substrings.

A floor wax and a dessert topping. <0.5 wink>

I don't think that the average Python programmer would want
s.asCharacters('utf-8') when they already have s.decode('utf-8'). We
decided a long time ago that the model for standard users would be
fixed-length (1!), abstract characters. That's the way Python's Unicode
subsystem has always worked.
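
The decode-then-index model described above looks roughly like this. It
is a small sketch added for illustration, not code from the thread. It
assumes Python 3 spelling, where the encoded data is a bytes object and
the decoded result is str, and the sample text is an arbitrary choice.

```python
# Bytes as they might arrive from a file or socket.
data = "na\u00efve caf\u00e9".encode("utf-8")
text = data.decode("utf-8")

print(len(data))   # 12 bytes: each accented letter takes two bytes
print(len(text))   # 10 characters
print(text[2])     # the third character, whatever its encoded width
```

After decoding, indexing and slicing work on fixed-length elements, so
the variable-length handling stays inside the codec.
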


From paulp@ActiveState.com  Sun Jul  1 19:19:17 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 11:19:17 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010701164315-r01010600-c2d5b07d@213.84.27.177>
Message-ID: <3B3F69A5.D7CE539D@ActiveState.com>

Just van Rossum wrote:
> 
> Guido van Rossum wrote:
> 
> > > <PEP: 261>
> > >
> > >    The problem I have with this PEP is that it is a compile time option
> > > which makes it hard to work with both 32 bit and 16 bit strings in one
> > > program. Can not the 32 bit string type be introduced as an additional type?
> >
> > Not without an outrageous amount of additional coding (every place in
> > the code that currently uses PyUnicode_Check() would have to be
> > bifurcated in a 16-bit and a 32-bit variant).
> 
> Alternatively, a Unicode object could *internally* be either 8, 16 or 32 bits
> wide (to be clear: not per character, but per string). Also a lot of work, but
> it'll be a lot less wasteful.

I hope this is where we end up one day. But the compile-time option is
better than where we are today. Even though PEP 261 is not my favorite
solution, it buys us a couple of years of wait-and-see time.

Consider that computer memory is growing much faster than textual data.
People's text processing techniques get more and more "wasteful" because
it is now almost always possible to load the entire "text" into memory
at once. I remember how some text editors used to boast that they only
loaded your text "on demand". 

Maybe so much data will be passed to us from UCS-4 APIs that trying to
"compress it" will actually be inefficient.

Maybe two years from now Guido will make UCS-4 the default and only a
tiny minority will notice or care.

> ...
> My difficulty with PEP 261 is that I'm afraid few people will actually enable
> 32-bit support (*what*?! all unicode strings become 32 bits wide? no way!),
> therefore making programs non-portable in very subtle ways.

It really depends on what the default build option is.


From paulp@ActiveState.com  Sun Jul  1 19:22:01 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 11:22:01 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com> <3B3EC012.A3A05E64@ActiveState.com> <3B3F5A3A.A88B54B2@ActiveState.com>
Message-ID: <3B3F6A49.6E82B7DE@ActiveState.com>

David Ascher wrote:
> 
> Paul:
> > And you just bought such a shiny, new glass, house. Pity.
> 
> What kind of comma placement is that?

I had to leave you something to complain about;


From guido@digicool.com  Sun Jul  1 19:37:48 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 01 Jul 2001 14:37:48 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Sun, 01 Jul 2001 16:43:08 +0200."
 <20010701164315-r01010600-c2d5b07d@213.84.27.177>
References: <20010701164315-r01010600-c2d5b07d@213.84.27.177>
Message-ID: <200107011837.f61IbmZ03645@odiug.digicool.com>

> Alternatively, a Unicode object could *internally* be either 8, 16
> or 32 bits wide (to be clear: not per character, but per
> string). Also a lot of work, but it'll be a lot less wasteful.

Depending on what you prefer to waste: developers' time or computer
resources.  I bet that if you try to measure the wasted space you'll
find that it wastes very little compared to all the other overheads
in a typical Python program: CPU time compared to writing your code in
C, memory overhead for integers, etc.

It so happened that the Unicode support was written to make it very
easy to change the compile-time code unit size; but making this a
per-string (or even global) run-time variable is much harder without
touching almost every place that uses Unicode (not to mention slowing
down the common case).

Nobody was enthusiastic about fixing this, so our choice was really
between staying with 16 bits or making 32 bits an option for those who
need it.

> Not a lot of people will want to work with 16 or 32 bit chars
> directly,

How do you know?  There are more Chinese than Americans and Europeans
together, and they will soon all have computers. :-)

> but I think a less wasteful solution to the surrogate pair
> problem *will* be desired by people. Why use 32 bits for all strings
> in a program when only a tiny percentage actually *needs* more than
> 16? (Or even 8...)

So work in UTF-8 -- a lot of work can be done in UTF-8.

> > But this is not the Unicode philosophy.  All the variable-length
> > character manipulation is supposed to be taken care of by the codecs,
> > and then the application can deal in arrays of characters.
> 
> Right: this is the way it should be.
> 
> My difficulty with PEP 261 is that I'm afraid few people will
> actually enable 32-bit support (*what*?! all unicode strings become
> 32 bits wide? no way!), therefore making programs non-portable in
> very subtle ways.

My hope and expectation is that those folks who need 32-bit support
will enable it.  If this solution is not sufficient, we may have to
provide something else in the future, but given that the
implementation effort for PEP 261 was very minimal (certainly less
than the time expended in discussing it) I am very happy with it.

It will take quite a while until lots of folks will need the 32-bit
support (there aren't that many characters defined outside the basic
plane yet).  In the meantime, those that need 32-bit support
should be happy that we allow them to rebuild Python with 32-bit
support.  In the next 5-10 years, the 32-bit support requirement will
become more common -- as will be the memory upgrades to make it
painless.

It's not like Python is making this decision in a vacuum either: Linux
already has 32-bit wchar_t.  32-bit characters will eventually be
common (even in Windows, which probably has the largest investment in
16-bit Unicode at the moment of any system).  Like IPv6, we're trying
to enable uncommon uses of Python without breaking things for the
not-so-early adopters.

Again, don't see PEP 261 as the ultimate answer to all your 32-bit
Unicode questions.  Just consider that realistically we have two
choices: stick with 16-bit support only or make 32-bit support an
option.  Other approaches (more surrogate support, run-time choices,
transparent variable-length encodings) simply aren't realistic --
no-one has the time to code them.

It should be easy to write portable Python programs that work
correctly with 16-bit Unicode characters on a "narrow" interpreter and
also work correctly with 21-bit Unicode on a "wide" interpreter:
just avoid using surrogates.  If you *need* to work with surrogates,
try to limit yourself to very simple operations like concatenations of
valid strings, and splitting strings at known delimiters only.
There's a lot you can do with this.
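
For readers who do need to look inside surrogate pairs, the standard
UTF-16 pairing rule can be sketched as below. This is an illustration
added here, not code from the message. It assumes the input is a sequence
of integer 16-bit code units, which is roughly what a narrow build
stores, and the generator name code_points is hypothetical.

```python
def code_points(units):
    """Yield code points from a sequence of UTF-16 code units (integers)."""
    units = iter(units)
    for unit in units:
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: it must be followed by a low surrogate.
            low = next(units, None)
            if low is None or not 0xDC00 <= low <= 0xDFFF:
                raise ValueError("unpaired high surrogate")
            yield 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
        elif 0xDC00 <= unit <= 0xDFFF:
            raise ValueError("unpaired low surrogate")
        else:
            yield unit

print([hex(cp) for cp in code_points([0x61, 0xD834, 0xDD1E])])
```

Slicing between the two halves of a pair is what produces the unpaired
cases, which is why concatenating valid strings and splitting at known
delimiters are the safe operations.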

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Sun Jul  1 19:52:36 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 1 Jul 2001 14:52:36 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3F69A5.D7CE539D@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEMBKLAA.tim.one@home.com>

[Paul Prescod]
> ...
> Consider that computer memory is growing much faster than textual data.
> People's text processing techniques get more and more "wasteful" because
> it is now almost always possible to load the entire "text" into memory
> at once.

Indeed, the entire text of the Bible fits in a corner of my year-old box's
RAM, even at 32 bits per character.

> I remember how some text editors used to boast that they only loaded
> your text "on demand".

Well, they still do -- fancy editors use fancy data structures, so that,
e.g., inserting characters at the start of the file doesn't cause a 50Mb
memmove each time.  Response time is still important, but I'd wager
relatively insensitive to basic character size (you need tricks that cut
factors of 1000s off potential worst cases to give the appearance of
instantaneous results; a factor of 2 or 4 is in the noise compared to what's
needed regardless).



From aahz@rahul.net  Sun Jul  1 20:21:26 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Sun, 1 Jul 2001 12:21:26 -0700 (PDT)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3F670A.B5396D61@ActiveState.com> from "Paul Prescod" at Jul 01, 2001 11:08:10 AM
Message-ID: <20010701192126.9EB8299C80@waltz.rahul.net>

Paul Prescod wrote:
> 
> On wide Python builds there is no such thing as variable width Unicode
> characters. It doesn't make sense to combine two 32-bit characters to
> get a 64-bit one. On narrow Python builds you might want to treat a
> surrogate pair as a single character but I would strongly advise against
> it. If you want wide characters, move to a wide build. Even if a narrow
> build is more space efficient, you'll lose a ton of performance
> emulating wide characters in Python code.

This needn't go into the PEP, I think, but I'd like you to say something
about what you expect the end result of this PEP to look like under
Windows, where "rebuild" isn't really a valid option for most Python
users.  Are we simply committing to make two builds available?  If so,
what happens the next time we run into a situation like this?


From paulp@ActiveState.com  Sun Jul  1 20:21:09 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 12:21:09 -0700
Subject: [Python-Dev] Text editors
References: <LNBBLJKPBEHFEDALKOLCIEMBKLAA.tim.one@home.com>
Message-ID: <3B3F7825.CA3D1B5B@ActiveState.com>

Tim Peters wrote:
> 
>...
> 
> > I remember how some text editors used to boast that they only loaded
> > your text "on demand".
> 
> Well, they still do -- fancy editors use fancy data structures, so that,
> e.g., inserting characters at the start of the file doesn't cause a 50Mb
> memmove each time.  

Yes, but most modern text editors take O(n) time to open the file. There
was a time when the more advanced ones did not. Or maybe that was just
SGML editors...I can't remember.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From guido@digicool.com  Sun Jul  1 20:32:52 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 01 Jul 2001 15:32:52 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Sun, 01 Jul 2001 12:21:26 PDT."
 <20010701192126.9EB8299C80@waltz.rahul.net>
References: <20010701192126.9EB8299C80@waltz.rahul.net>
Message-ID: <200107011932.f61JWq803843@odiug.digicool.com>

> This needn't go into the PEP, I think, but I'd like you to say something
> about what you expect the end result of this PEP to look like under
> Windows, where "rebuild" isn't really a valid option for most Python
> users.  Are we simply committing to make two builds available?  If so,
> what happens the next time we run into a situation like this?

I imagine that we will pick a choice (I expect it'll be UCS2) and
make only that build available, until there are loud enough cries from
folks who have a reasonable excuse not to have a copy of VCC around.

Given that the rest of Windows uses 16-bit Unicode, I think we'll be
able to get away with this for quite a while.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paulp@ActiveState.com  Sun Jul  1 20:33:20 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 12:33:20 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010701192126.9EB8299C80@waltz.rahul.net>
Message-ID: <3B3F7B00.29D6832@ActiveState.com>

Aahz Maruch wrote:
> 
>...
> 
> This needn't go into the PEP, I think, but I'd like you to say something
> about what you expect the end result of this PEP to look like under
> Windows, where "rebuild" isn't really a valid option for most Python
> users.  Are we simply committing to make two builds available?  If so,
> what happens the next time we run into a situation like this?

Windows itself is strongly biased towards 16-bit characters. Therefore I
expect that to be the default for a while. Then I expect Guido to
announce that 32-bit characters are the new default with version 3000
(perhaps right after Windows 3000 ships) and we'll all change. So most
Windows users will not be able to work with 32-bit characters for a
while. But since Windows itself doesn't like those characters, they
probably won't run into them much.

I strongly doubt that we'll ever make two builds available because it
would cause a mess of extension module incompatibilities.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From paulp@ActiveState.com  Sun Jul  1 20:57:09 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 12:57:09 -0700
Subject: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" Unicode characters
Message-ID: <3B3F8095.8D58631D@ActiveState.com>

PEP: 261
Title: Support for "wide" Unicode characters
Version: $Revision: 1.3 $
Author: paulp@activestate.com (Paul Prescod)
Status: Draft
Type: Standards Track
Created: 27-Jun-2001
Python-Version: 2.2
Post-History: 27-Jun-2001


Abstract

    Python 2.1 unicode characters can have ordinals only up to
    2**16 - 1.
    This range corresponds to a range in Unicode known as the Basic
    Multilingual Plane. There are now characters in Unicode that live
    on other "planes". The largest addressable character in Unicode
    has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we
    will call this TOPCHAR and call characters in this range "wide 
    characters".
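The arithmetic behind TOPCHAR can be checked directly. The following is a minimal sketch in modern Python; apart from TOPCHAR, the names (PLANE_SIZE, NUM_PLANES, BMP_MAX, is_wide) are illustrative assumptions and not part of the PEP, and reading "wide" as "above the Basic Multilingual Plane" is likewise an assumption.

```python
# Illustrative constants; only TOPCHAR is a name used by the PEP.
PLANE_SIZE = 2 ** 16      # code points per Unicode plane
NUM_PLANES = 17           # plane 0 (the BMP) plus 16 supplementary planes

BMP_MAX = PLANE_SIZE - 1                 # largest ordinal in Python 2.1
TOPCHAR = NUM_PLANES * PLANE_SIZE - 1    # largest Unicode code point


def is_wide(ordinal):
    """Return True if the ordinal lies above the BMP but within Unicode."""
    return BMP_MAX < ordinal <= TOPCHAR
```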


Glossary

    Character 
        
        Used by itself, means the addressable units of a Python 
        Unicode string.

    Code point

        A code point is an integer between 0 and TOPCHAR.
        If you imagine Unicode as a mapping from integers to
        characters, each integer is a code point. But the 
        integers between 0 and TOPCHAR that do not map to
        characters are also code points. Some will someday 
        be used for characters. Some are guaranteed never 
        to be used for characters.

    Codec

        A set of functions for translating between physical
        encodings (e.g. on disk or coming in from a network)
        and logical Python objects.

    Encoding

        Mechanism for representing abstract characters in terms of
        physical bits and bytes. Encodings allow us to store
        Unicode characters on disk and transmit them over networks
        in a manner that is compatible with other Unicode software.

    Surrogate pair

        Two physical characters that represent a single logical
        character. Part of a convention for representing 32-bit
        code points in terms of two 16-bit code points.

    Unicode string

        A Python type representing a sequence of code points with
        "string semantics" (e.g. case conversions, regular
        expression compatibility, etc.) Constructed with the 
        unicode() function.
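The surrogate pair convention in the glossary can be sketched as plain integer arithmetic. This is an illustration of the standard UTF-16 scheme, not code from the PEP; the function names are hypothetical.

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into (high, low) 16-bit units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary range")
    offset = code_point - 0x10000        # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

Round-tripping any code point between 0x10000 and TOPCHAR through both functions should return the original value.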


Proposed Solution

    One solution would be to merely increase the maximum ordinal 
    to a larger value. Unfortunately the only straightforward
    implementation of this idea is to use 4 bytes per character.
    This has the effect of doubling the size of most Unicode 
    strings. In order to avoid imposing this cost on every
    user, Python 2.2 will allow the 4-byte implementation as a
    build-time option. Users can choose whether they care about
    wide characters or prefer to preserve memory.

    The 4-byte option is called "wide Py_UNICODE". The 2-byte option
    is called "narrow Py_UNICODE".

    Most things will behave identically in the wide and narrow worlds.

    * unichr(i) for 0 <= i < 2**16 (0x10000) always returns a
      length-one string.

    * unichr(i) for 2**16 <= i <= TOPCHAR will return a
      length-one string on wide Python builds. On narrow builds it will 
      raise ValueError.

        ISSUE 

            Python currently allows \U literals that cannot be
            represented as a single Python character. It generates two
            Python characters known as a "surrogate pair". Should this
            be disallowed on future narrow Python builds?

        Pro:

            Python already allows the construction of a surrogate pair
            for a large unicode literal character escape sequence.
            This is basically designed as a simple way to construct
            "wide characters" even in a narrow Python build. It is also
            somewhat logical considering that the Unicode-literal syntax
            is basically a short-form way of invoking the unicode-escape
            codec.

        Con:

            Surrogates could be easily created this way but the user
            still needs to be careful about slicing, indexing, printing 
            etc. Therefore some have suggested that Unicode
            literals should not support surrogates.


        ISSUE 

            Should Python allow the construction of characters that do
            not correspond to Unicode code points?  Unassigned Unicode 
            code points should obviously be legal (because they could 
            be assigned at any time). But code points above TOPCHAR are 
            guaranteed never to be used by Unicode. Should we allow
            access to them anyhow?

        Pro:

            If a Python user thinks they know what they're doing why
            should we try to prevent them from violating the Unicode
            spec? After all, we don't stop 8-bit strings from
            containing non-ASCII characters.

        Con:

            Codecs and other Unicode-consuming code will have to be
            careful of these characters which are disallowed by the
            Unicode specification.

    * ord() is always the inverse of unichr()

    * There is an integer value in the sys module that describes the
      largest ordinal for a character in a Unicode string on the current
      interpreter. sys.maxunicode is 2**16-1 (0xffff) on narrow builds
      of Python and TOPCHAR on wide builds.

        ISSUE: Should there be distinct constants for accessing
               TOPCHAR and the real upper bound for the domain of 
               unichr (if they differ)? There has also been a
               suggestion of sys.unicodewidth which can take the 
               values 'wide' and 'narrow'.

    * every Python Unicode character represents exactly one Unicode code 
      point (i.e. Python Unicode Character = Abstract Unicode
      character).

    * codecs will be upgraded to support "wide characters"
      (represented directly in UCS-4, and as variable-length sequences
      in UTF-8 and UTF-16). This is the main part of the implementation 
      left to be done.

    * There is a convention in the Unicode world for encoding a 32-bit
      code point in terms of two 16-bit code points. These are known
      as "surrogate pairs". Python's codecs will adopt this convention
      and encode 32-bit code points as surrogate pairs on narrow Python
      builds. 

        ISSUE 

            Should there be a way to tell codecs not to generate
            surrogates and instead treat wide characters as 
            errors?

        Pro:

            I might want to write code that works only with
            fixed-width characters and does not have to worry about
            surrogates.


        Con:

            No clear proposal of how to communicate this to codecs.

    * there are no restrictions on constructing strings that use 
      code points "reserved for surrogates" improperly. These are
      called "isolated surrogates". The codecs should disallow reading
      these from files, but you could construct them using string 
      literals or unichr().
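A rough sketch of how code might inspect sys.maxunicode follows. It is written for modern Python 3, where unichr() is spelled chr() and sys.maxunicode is always 0x10FFFF (since 3.3), so the narrow case can only be simulated by passing a value in. The helper names are hypothetical.

```python
import sys

NARROW_MAX = 0xFFFF    # sys.maxunicode on a narrow build, per the PEP
TOPCHAR = 0x10FFFF     # sys.maxunicode on a wide build, per the PEP


def build_width(maxunicode=sys.maxunicode):
    """Classify an interpreter by its largest character ordinal."""
    return "narrow" if maxunicode <= NARROW_MAX else "wide"


def fits_in_one_character(code_point, maxunicode=sys.maxunicode):
    """True if code_point can be held by a single string element."""
    return 0 <= code_point <= maxunicode
```

For example, fits_in_one_character(0x10000, NARROW_MAX) is false, which corresponds to the case where the PEP has unichr() raise ValueError on a narrow build.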


Implementation

    There is a new (experimental) define:

        #define PY_UNICODE_SIZE 2

    There is a new configure option:

        --enable-unicode=ucs2 configures a narrow Py_UNICODE, and uses
                              wchar_t if it fits
        --enable-unicode=ucs4 configures a wide Py_UNICODE, and uses
                              wchar_t if it fits
        --enable-unicode      same as "=ucs2"

    The intention is that --disable-unicode, or --enable-unicode=no
    removes the Unicode type altogether; this is not yet implemented.

    It is also proposed that one day --enable-unicode will just
    default to the width of your platform's wchar_t.

    Windows builds will be narrow for a while based on the fact that
    there have been few requests for wide characters, those requests
    are mostly from hard-core programmers with the ability to buy
    their own Python and Windows itself is strongly biased towards
    16-bit characters.


Notes

    This PEP does NOT imply that people using Unicode need to use a
    4-byte encoding for their files on disk or sent over the network. 
    It only allows them to do so. For example, ASCII is still a 
    legitimate (7-bit) Unicode-encoding.
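To illustrate that the in-memory width is independent of the encoding used on disk or on the wire, here is a small sketch in modern Python 3 that measures how many bytes the same text occupies under several encodings. The function name and the choice of encodings are assumptions made for the example.

```python
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_sizes(text):
    """Map each encoding name to the byte length of text in it."""
    return {name: len(text.encode(name)) for name in ENCODINGS}


ascii_sizes = encoded_sizes("A")
wide_sizes = encoded_sizes("\U0001D11E")   # a code point outside the BMP
```

A pure ASCII string stays at one byte per character in UTF-8 regardless of how the interpreter stores it internally, while a code point outside the BMP takes four bytes in each of the three encodings.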

    It has been proposed that there should be a module that handles
    surrogates in narrow Python builds for programmers. If someone 
    wants to implement that, it will be another PEP. It might also be 
    combined with features that allow other kinds of character-, 
    word- and line- based indexing.


Rejected Suggestions

    More or less the status-quo

        We could officially say that Python characters are 16-bit and
        require programmers to implement wide characters in their
        application logic by combining surrogate pairs. This is a heavy 
        burden because emulating 32-bit characters is likely to be
        very inefficient if it is coded entirely in Python. Plus these
        abstracted pseudo-strings would not be legal as input to the
        regular expression engine.

    "Space-efficient Unicode" type

        Another class of solution is to use some efficient storage
        internally but present an abstraction of wide characters to
        the programmer. Any of these would require a much more complex
        implementation than the accepted solution. For instance consider
        the impact on the regular expression engine. In theory, we could
        move to this implementation in the future without breaking Python
        code. A future Python could "emulate" wide Python semantics on
        narrow Python. Guido is not willing to undertake the
        implementation right now.

    Two types

        We could introduce a 32-bit Unicode type alongside the 16-bit
        type. There is a lot of code that expects there to be only a 
        single Unicode type.

    This PEP represents the least-effort solution. Over the next
    several years, 32-bit Unicode characters will become more common
    and that may either convince us that we need a more sophisticated 
    solution or (on the other hand) convince us that simply 
    mandating wide Unicode characters is an appropriate solution.
    Right now the two options on the table are do nothing or do
    this.


References

    Unicode Glossary: http://www.unicode.org/glossary/


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From thomas@xs4all.net  Sun Jul  1 23:12:48 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 2 Jul 2001 00:12:48 +0200
Subject: [Python-Dev] Python 2.1.1 release 'schedule'
Message-ID: <20010702001248.H8098@xs4all.nl>

This is just a heads-up to everyone. I plan to release Python 2.1.1c1
(release candidate 1) somewhere on Friday the 13th (of July) and, barring
any serious problems, the full release the Friday following that, July 20.

The python 2.1.1 CVS branch (tagged 'release21-maint') should be stable, and
should contain most bugfixes that will be in 2.1.1. If you care about
2.1.1's stability and portability, or you found bugs in 2.1 and aren't sure
they are fixed, and you can check things out of CVS, please give the CVS
branch a try: just 'checkout' python with

cvs co -rrelease21-maint python

(with the -d option from the SourceForge CVS page that applies to you) and
follow the normal compile procedure. Binaries for Windows as well as source
tarballs will be provided for the release candidate and the final release
(obviously) but the more bugs people point out before the final release, the
more bugs will be fixed in 2.1.1 :-)

Python 2.1.1 (as well as the CVS branch) will fall under the new
GPL-compatible PSF licence, just like Python 2.0.1. The only notable thing
missing from the CVS branch is an updated NEWS file -- I'm working on it.
I'm also not done searching the open bugs for ones that might need to be
addressed in 2.1.1, but feel free to point me to bugs you think are
important!

2.1.1-Patch-Czar-ly y'rs,
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From greg@cosc.canterbury.ac.nz  Mon Jul  2 03:06:50 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:06:50 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3EBEA4.3EC84EAF@ActiveState.com>
Message-ID: <200107020206.OAA00427@s454.cosc.canterbury.ac.nz>

David Ascher <DavidA@ActiveState.com>:

> I'd limit the claim to stating that they _affect_ your life.

If matter didn't have any rest energy, everything
would fly about at the speed of light, which would
make life very hectic.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Mon Jul  2 03:36:39 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:36:39 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <20010701164315-r01010600-c2d5b07d@213.84.27.177>
Message-ID: <200107020236.OAA00432@s454.cosc.canterbury.ac.nz>

Just van Rossum <just@letterror.com>:

> My difficulty with PEP 261 is that I'm afraid few people will actually enable
> 32-bit support (*what*?! all unicode strings become 32 bits wide? no way!),
> therefore making programs non-portable in very subtle ways.

I agree. This can only be a stopgap measure. Ultimately the
Unicode type needs to be made smarter.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Mon Jul  2 03:42:12 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:42:12 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3F5A3A.A88B54B2@ActiveState.com>
Message-ID: <200107020242.OAA00436@s454.cosc.canterbury.ac.nz>

David Ascher <DavidA@ActiveState.com>:
> > And you just bought such a shiny, new glass, house. Pity.
>
> What kind of comma placement is that?

Obviously it's only the glass that is new, not the
whole house. :-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From nhodgson@bigpond.net.au  Mon Jul  2 03:42:11 2001
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Mon, 2 Jul 2001 12:42:11 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107011352.PAA27645@pandora.informatik.hu-berlin.de>
Message-ID: <01d601c102a0$98671580$0acc8490@neil>

Martin von Loewis:


> > The problem I have with this PEP is that it is a compile time option
> > which makes it hard to work with both 32 bit and 16 bit strings in
> > one program.
>
> Can you elaborate why you think this is a problem?

   A common role for Python is to act as glue between various modules. If
Paul produces some interesting code that depends on 32 bit strings and I
want to use that in conjunction with some Win32 specific or COM dependent
code that wants 16 bit strings then it may not be possible or may require
difficult workarounds.

> (*) Methinks that the primary difficulty still is translating all the
> documentation, and messages. Actually, keeping the translations
> up-to-date is even more challenging.

   Translation of documentation and strings can be performed by almost
anyone who writes both languages ("even managers") and can be budgeted by
working out the amount of text and applying a conversion rate. Code requires
careful thought and can lead to the typical buggy software schedule
blowouts.

   Neil




From greg@cosc.canterbury.ac.nz  Mon Jul  2 03:49:56 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:49:56 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <200107011837.f61IbmZ03645@odiug.digicool.com>
Message-ID: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>

> It so happened that the Unicode support was written to make it very
> easy to change the compile-time code unit size

What about extension modules that deal with Unicode strings?
Will they have to be recompiled too? If so, is there anything
to detect an attempt to import an extension module with an
incompatible Unicode character width?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From nhodgson@bigpond.net.au  Mon Jul  2 03:52:45 2001
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Mon, 2 Jul 2001 12:52:45 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>              <00dd01c1022d$c61e4160$0acc8490@neil>  <200107011344.f61DiTM03548@odiug.digicool.com>
Message-ID: <01ea01c102a2$128491c0$0acc8490@neil>

Guido van Rossum:

> >    This wasn't usefully true in the past for DBCS strings and is
> > not the right way to think of either narrow or wide strings
> > now. The idea that strings are arrays of characters gets in
> > the way of dealing with many encodings and is the primary
> > difficulty in localising software for Japanese.
>
> Can you explain the kind of problems encountered in some more detail?

   Programmers used to working with character == indexable code unit will
often split double wide characters when performing an action. For example
searching for a particular double byte character "bc" may match "abcd"
incorrectly where "ab" and "cd" are the characters. DBCS is not normally
self synchronising although UTF-8 is. Another common problem is counting
characters, for example when filling a line, hitting the line width and
forcing half a character onto the next line.
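The false match described above can be reproduced with a deliberately simplified model in which every character is exactly two bytes wide (real DBCS encodings mix one- and two-byte characters, so this is an assumption made to keep the sketch short). The function names are hypothetical.

```python
def naive_find(buffer, needle):
    """Search at every byte offset, ignoring character boundaries."""
    return buffer.find(needle)


def aligned_find(buffer, needle, width=2):
    """Search only at offsets that start a fixed-width character."""
    for i in range(0, len(buffer) - len(needle) + 1, width):
        if buffer[i:i + len(needle)] == needle:
            return i
    return -1


text = b"abcd"    # two double-byte characters: "ab" and "cd"
```

Here naive_find(text, b"bc") reports a hit at offset 1, straddling the two characters, whereas aligned_find(text, b"bc") reports no match.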

> I think it's a good idea to provide a set of higher-level tools as
> well.  However nobody seems to know what these higher-level tools
> should do yet.  PEP 261 is specifically focused on getting the
> lower-level foundations right (i.e. the objects that represent arrays
> of code units), so that the authors of higher level tools will have a
> solid base.  If you want to help author a PEP for such higher-level
> tools, you're welcome!

   It's more likely I'll publish some of the low level pieces of
Scintilla/SinkWorld as a Python extension providing some of these facilities
in an editable-text class. Then we can see if anyone else finds the code
worthwhile.

   Neil




From nhodgson@bigpond.net.au  Mon Jul  2 04:00:41 2001
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Mon, 2 Jul 2001 13:00:41 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <LNBBLJKPBEHFEDALKOLCIEMBKLAA.tim.one@home.com>
Message-ID: <020b01c102a3$2dd23440$0acc8490@neil>

Tim Peters:

> Well, they still do -- fancy editors use fancy data structures, so that,
> e.g., inserting characters at the start of the file doesn't cause a 50Mb
> memmove each time.  Response time is still important, but I'd wager
> relatively insensitive to basic character size (you need tricks that cut
> factors of 1000s off potential worst cases to give the appearance of
> instantaneous results; a factor of 2 or 4 is in the noise compared to
> what's needed regardless).

   I actually have some numbers here. Early versions of some new editor
buffer code used UCS-2 on .NET and the JVM. Moving to an 8 bit buffer saved
10-20% of execution time on the insert string, delete string and global
replace benchmarks using strings that fit into ASCII. These buffers did have
some other overhead for line management and other features but I expect
these did not affect the proportions much.

   Neil





From tim.one@home.com  Mon Jul  2 05:36:20 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 2 Jul 2001 00:36:20 -0400
Subject: [Python-Dev] RE: Python 2.1.1 release 'schedule'
In-Reply-To: <20010702001248.H8098@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENEKLAA.tim.one@home.com>

Woo hoo!

[Thomas Wouters]
> ...
> Binaries for Windows as well as source tarballs will be provided ...

Building a Windows installer isn't straightforward, so you'd better let us
do that part (e.g., you need the Wise installer program, Fred needs to
supply appropriate HTML docs for the Windows installer to zip up, Tcl/Tk has
to get unpacked and rearranged, etc).  I just checked in 2.1.1c1 changes to
the Windows part of the release21-maint tree, but the rest of it isn't in
CVS.



From thomas@xs4all.net  Mon Jul  2 07:27:24 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 2 Jul 2001 08:27:24 +0200
Subject: [Python-Dev] Re: Python 2.1.1 release 'schedule'
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEENEKLAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCEENEKLAA.tim.one@home.com>
Message-ID: <20010702082724.K32419@xs4all.nl>

On Mon, Jul 02, 2001 at 12:36:20AM -0400, Tim Peters wrote:

> [Thomas Wouters]
> > ...
> > Binaries for Windows as well as source tarballs will be provided ...

> Building a Windows installer isn't straightforward, so you'd better let us
> do that part (e.g., you need the Wise installer program, Fred needs to
> supply appropriate HTML docs for the Windows installer to zip up, Tcl/Tk has
> to get unpacked and rearranged, etc).  I just checked in 2.1.1c1 changes to
> the Windows part of the release21-maint tree, but the rest of it isn't in
> CVS.

Oh yeah, I was entirely going to let you guys do it, or at least find
another set of wintendows-weenies to do it :) That's part of why I posted
the tentative release dates.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From loewis@informatik.hu-berlin.de  Mon Jul  2 08:25:18 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 2 Jul 2001 09:25:18 +0200 (MEST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <01d601c102a0$98671580$0acc8490@neil> (nhodgson@bigpond.net.au)
References: <200107011352.PAA27645@pandora.informatik.hu-berlin.de> <01d601c102a0$98671580$0acc8490@neil>
Message-ID: <200107020725.JAA25925@pandora.informatik.hu-berlin.de>

> > > The problem I have with this PEP is that it is a compile time option
> > > which makes it hard to work with both 32 bit and 16 bit strings in
> > > one program.
> >
> > Can you elaborate why you think this is a problem?
> 
>    A common role for Python is to act as glue between various modules. If
> Paul produces some interesting code that depends on 32 bit strings and I
> want to use that in conjunction with some Win32 specific or COM dependent
> code that wants 16 bit strings then it may not be possible or may require
> difficult workarounds.

Neither. All it will require is for you to recompile your Python
installation to use wide Unicode.

On Win32 APIs, this will mean that you cannot directly interpret
PyUnicode object representations as WCHAR_T pointers. This is no
problem, as you can transparently copy unicode objects into wchar_t
strings; it's a matter of coming up with a good C API for doing so
conveniently.
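At the Python level, the copy described above amounts to re-encoding a string as 16-bit code units. The sketch below, in modern Python 3, is only an illustration of that conversion and not the C API under discussion; the helper name is hypothetical.

```python
import struct


def to_utf16_units(text):
    """Return text as a list of 16-bit code units (UTF-16, little-endian)."""
    data = text.encode("utf-16-le")
    # Every UTF-16 code unit is two bytes, so len(data) is always even.
    return list(struct.unpack("<%dH" % (len(data) // 2), data))
```

A code point outside the BMP comes out as two units (a surrogate pair), so the unit count can exceed the number of characters in the wide string.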

Regards,
Martin


From fredrik@pythonware.com  Mon Jul  2 09:20:09 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 10:20:09 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020236.OAA00432@s454.cosc.canterbury.ac.nz>
Message-ID: <03b301c102cf$e0e3dd00$0900a8c0@spiff>

greg wrote:

> I agree. This can only be a stopgap measure. Ultimately the
> Unicode type needs to be made smarter.

PIL uses 8 bits per pixel to store bilevel images, and 32 bits
per pixel to store 16- and 24-bit images.

back in 1995, some people claimed that the image type had
to be made smarter to be usable.  these days, nobody ever
notices...

</F>




From fredrik@pythonware.com  Mon Jul  2 09:08:10 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 10:08:10 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> <00dd01c1022d$c61e4160$0acc8490@neil>
Message-ID: <03b201c102cf$e0dab540$0900a8c0@spiff>

Neil Hodgson wrote:
> > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> > character.
>
>    This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea that
> strings are arrays of characters gets in the way

if you stop confusing binary buffers with text strings, all such
problems will go away.

</F>




From mal@egenix.com  Mon Jul  2 10:39:55 2001
From: mal@egenix.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 11:39:55 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>
Message-ID: <3B40416B.6438D1F7@egenix.com>

Greg Ewing wrote:
> 
> > It so happened that the Unicode support was written to make it very
> > easy to change the compile-time code unit size
> 
> What about extension modules that deal with Unicode strings?
> Will they have to be recompiled too? If so, is there anything
> to detect an attempt to import an extension module with an
> incompatible Unicode character width?

That's a good question ! 

The answer is: yes, extensions which use Unicode will have to
be recompiled for narrow and wide builds of Python. The question
is however, how to detect cases where the user imports an
extension built for narrow Python into a wide build and
vice versa.

The standard way of looking at the API level won't help. We'd
need some form of introspection API at the C level... hmm,
perhaps looking at the sys module will do the trick for us ?!

In any case, this is certainly going to cause trouble one
of these days...
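One way the sys module could be used for such a check is sketched below in Python. It assumes, hypothetically, that an extension records the sys.maxunicode value it was built against; no such mechanism is described in this thread, and the function name is made up for the example.

```python
import sys


def check_unicode_width(extension_maxunicode,
                        interpreter_maxunicode=sys.maxunicode):
    """Raise ImportError if an extension's Unicode width differs.

    extension_maxunicode is a hypothetical value recorded when the
    extension was compiled (0xFFFF for narrow, 0x10FFFF for wide).
    """
    if extension_maxunicode != interpreter_maxunicode:
        raise ImportError(
            "extension built for maxunicode=%#x, interpreter has %#x"
            % (extension_maxunicode, interpreter_maxunicode)
        )
```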

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From mal@lemburg.com  Mon Jul  2 11:13:59 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:13:59 +0200
Subject: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" Unicode
 characters
References: <3B3F8095.8D58631D@ActiveState.com>
Message-ID: <3B404967.14FE180F@lemburg.com>

Paul Prescod wrote:
> 
> PEP: 261
> Title: Support for "wide" Unicode characters
> Version: $Revision: 1.3 $
> Author: paulp@activestate.com (Paul Prescod)
> Status: Draft
> Type: Standards Track
> Created: 27-Jun-2001
> Python-Version: 2.2
> Post-History: 27-Jun-2001
> 
> Abstract
> 
>     Python 2.1 unicode characters can have ordinals only up to
>     2**16 - 1.
>     This range corresponds to a range in Unicode known as the Basic
>     Multilingual Plane. There are now characters in Unicode that live
>     on other "planes". The largest addressable character in Unicode
>     has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we
>     will call this TOPCHAR and call characters in this range "wide
>     characters".
> 
> Glossary
> 
>     Character
> 
>         Used by itself, means the addressable units of a Python
>         Unicode string.

Please add: also known as "code unit".
 
>     Code point
> 
>         A code point is an integer between 0 and TOPCHAR.
>         If you imagine Unicode as a mapping from integers to
>         characters, each integer is a code point. But the
>         integers between 0 and TOPCHAR that do not map to
>         characters are also code points. Some will someday
>         be used for characters. Some are guaranteed never
>         to be used for characters.
> 
>     Codec
> 
>         A set of functions for translating between physical
>         encodings (e.g. on disk or coming in from a network)
>         into logical Python objects.
> 
>     Encoding
> 
>         Mechanism for representing abstract characters in terms of
>         physical bits and bytes. Encodings allow us to store
>         Unicode characters on disk and transmit them over networks
>         in a manner that is compatible with other Unicode software.
> 
>     Surrogate pair
> 
>         Two physical characters that represent a single logical

Eeek... two code units (or have you ever seen a physical character
walking around ;-)

>         character. Part of a convention for representing 32-bit
>         code points in terms of two 16-bit code points.
> 
>     Unicode string
> 
>           A Python type representing a sequence of code points with
>           "string semantics" (e.g. case conversions, regular
>           expression compatibility, etc.) Constructed with the
>           unicode() function.
> 
> Proposed Solution
> 
>     One solution would be to merely increase the maximum ordinal
>     to a larger value. Unfortunately the only straightforward
>     implementation of this idea is to use 4 bytes per character.
>     This has the effect of doubling the size of most Unicode
>     strings. In order to avoid imposing this cost on every
>     user, Python 2.2 will allow the 4-byte implementation as a
>     build-time option. Users can choose whether they care about
>     wide characters or prefer to preserve memory.
> 
>     The 4-byte option is called "wide Py_UNICODE". The 2-byte option
>     is called "narrow Py_UNICODE".
> 
>     Most things will behave identically in the wide and narrow worlds.
> 
>     * unichr(i) for 0 <= i < 2**16 (0x10000) always returns a
>       length-one string.
> 
>     * unichr(i) for 2**16 <= i <= TOPCHAR will return a
>       length-one string on wide Python builds. On narrow builds it will
>       raise ValueError.
> 
>         ISSUE
> 
>             Python currently allows \U literals that cannot be
>             represented as a single Python character. It generates two
>             Python characters known as a "surrogate pair". Should this
>             be disallowed on future narrow Python builds?
> 
>         Pro:
> 
>             Python already allows the construction of a surrogate pair
>             for a large unicode literal character escape sequence.
>             This is basically designed as a simple way to construct
>             "wide characters" even in a narrow Python build. It is also
>             somewhat logical considering that the Unicode-literal syntax
>             is basically a short-form way of invoking the unicode-escape
>             codec.
> 
>         Con:
> 
>             Surrogates could be easily created this way but the user
>             still needs to be careful about slicing, indexing, printing
>             etc. Therefore some have suggested that Unicode
>             literals should not support surrogates.
> 
>         ISSUE
> 
>             Should Python allow the construction of characters that do
>             not correspond to Unicode code points?  Unassigned Unicode
>             code points should obviously be legal (because they could
>             be assigned at any time). But code points above TOPCHAR are
>             guaranteed never to be used by Unicode. Should we allow
>             access to them anyhow?
> 
>         Pro:
> 
>             If a Python user thinks they know what they're doing why
>             should we try to prevent them from violating the Unicode
>             spec? After all, we don't stop 8-bit strings from
>             containing non-ASCII characters.
> 
>         Con:
> 
>             Codecs and other Unicode-consuming code will have to be
>             careful of these characters which are disallowed by the
>             Unicode specification.
> 
>     * ord() is always the inverse of unichr()
> 
>     * There is an integer value in the sys module that describes the
>       largest ordinal for a character in a Unicode string on the current
>       interpreter. sys.maxunicode is 2**16-1 (0xffff) on narrow builds
>       of Python and TOPCHAR on wide builds.
> 
>         ISSUE: Should there be distinct constants for accessing
>                TOPCHAR and the real upper bound for the domain of
>                unichr (if they differ)? There has also been a
>                suggestion of sys.unicodewidth which can take the
>                values 'wide' and 'narrow'.
> 
>     * every Python Unicode character represents exactly one Unicode code
>       point (i.e. Python Unicode Character = Abstract Unicode
>       character).
> 
>     * codecs will be upgraded to support "wide characters"
>       (represented directly in UCS-4, and as variable-length sequences
>       in UTF-8 and UTF-16). This is the main part of the implementation
>       left to be done.
> 
>     * There is a convention in the Unicode world for encoding a 32-bit
>       code point in terms of two 16-bit code points. These are known
>       as "surrogate pairs". Python's codecs will adopt this convention
>       and encode 32-bit code points as surrogate pairs on narrow Python
>       builds.
> 
>         ISSUE
> 
>             Should there be a way to tell codecs not to generate
>             surrogates and instead treat wide characters as
>             errors?
> 
>         Pro:
> 
>             I might want to write code that works only with
>             fixed-width characters and does not have to worry about
>             surrogates.
> 
>         Con:
> 
>             No clear proposal of how to communicate this to codecs.

No need to pass this information to the codec: simply write
a new one and give it a clear name, e.g. "ucs-2" will generate
errors while "utf-16-le" converts them to surrogates.
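
A minimal sketch of that distinction, written for a current Python 3
interpreter rather than taken from any actual codec. The surrogate
arithmetic is the one fixed by the Unicode standard; the function names
encode_utf16_units and encode_ucs2_units are hypothetical and exist only
for illustration.

```python
def encode_utf16_units(code_point):
    """Return the 16-bit UTF-16 code units (as ints) for one code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode range: %#x" % code_point)
    if code_point < 0x10000:
        # BMP values are stored as-is (this sketch does not reject
        # isolated surrogates in the range 0xD800-0xDFFF).
        return [code_point]
    offset = code_point - 0x10000          # 20 bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return [high, low]


def encode_ucs2_units(code_point):
    """Hypothetical strict UCS-2 behaviour: refuse anything beyond the BMP."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("not representable in UCS-2: %#x" % code_point)
    return [code_point]
```

With these definitions, encode_utf16_units(0x1D11E) yields the pair
0xD834, 0xDD1E, while encode_ucs2_units(0x1D11E) raises ValueError. The
error-versus-surrogates choice lives entirely in which function (codec)
the caller picks, so no extra flag has to be passed.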
 
>     * there are no restrictions on constructing strings that use
>       code points "reserved for surrogates" improperly. These are
>       called "isolated surrogates". The codecs should disallow reading
>       these from files, but you could construct them using string
>       literals or unichr().
> 
> Implementation
> 
>     There is a new (experimental) define:
> 
>         #define PY_UNICODE_SIZE 2
> 
>     There is a new configure option:
> 
>         --enable-unicode=ucs2 configures a narrow Py_UNICODE, and uses
>                               wchar_t if it fits
>         --enable-unicode=ucs4 configures a wide Py_UNICODE, and uses
>                               wchar_t if it fits
>         --enable-unicode      same as "=ucs2"
> 
>     The intention is that --disable-unicode, or --enable-unicode=no
>     removes the Unicode type altogether; this is not yet implemented.
> 
>     It is also proposed that one day --enable-unicode will just
>     default to the width of your platform's wchar_t.
> 
>     Windows builds will be narrow for a while based on the fact that
>     there have been few requests for wide characters, those requests
>     are mostly from hard-core programmers with the ability to buy
>     their own Python and Windows itself is strongly biased towards
>     16-bit characters.
> 
> Notes
> 
>     This PEP does NOT imply that people using Unicode need to use a
>     4-byte encoding for their files on disk or sent over the network.
>     It only allows them to do so. For example, ASCII is still a
>     legitimate (7-bit) Unicode-encoding.
> 
>     It has been proposed that there should be a module that handles
>     surrogates in narrow Python builds for programmers. If someone
>     wants to implement that, it will be another PEP. It might also be
>     combined with features that allow other kinds of character-,
>     word- and line- based indexing.
> 
> Rejected Suggestions
> 
>     More or less the status-quo
> 
>         We could officially say that Python characters are 16-bit and
>         require programmers to implement wide characters in their
>         application logic by combining surrogate pairs. This is a heavy
>         burden because emulating 32-bit characters is likely to be
>         very inefficient if it is coded entirely in Python. Plus these
>         abstracted pseudo-strings would not be legal as input to the
>         regular expression engine.
> 
>     "Space-efficient Unicode" type
> 
>         Another class of solution is to use some efficient storage
>         internally but present an abstraction of wide characters to
>         the programmer. Any of these would require a much more complex
>         implementation than the accepted solution. For instance consider
>         the impact on the regular expression engine. In theory, we could
>         move to this implementation in the future without breaking
>         Python code. A future Python could "emulate" wide Python
>         semantics on narrow Python. Guido is not willing to undertake
>         the implementation right now.
> 
>     Two types
> 
>         We could introduce a 32-bit Unicode type alongside the 16-bit
>         type. There is a lot of code that expects there to be only a
>         single Unicode type.
> 
>     This PEP represents the least-effort solution. Over the next
>     several years, 32-bit Unicode characters will become more common
>     and that may either convince us that we need a more sophisticated
>     solution or (on the other hand) convince us that simply
>     mandating wide Unicode characters is an appropriate solution.
>     Right now the two options on the table are do nothing or do
>     this.
> 
> References
> 
>     Unicode Glossary: http://www.unicode.org/glossary/

Plus perhaps the Mark Davis paper at:

http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/
 
> Copyright
> 
>     This document has been placed in the public domain.

Good work, Paul !

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From mal@lemburg.com  Mon Jul  2 11:08:53 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:08:53 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>
Message-ID: <3B404835.4CE77C60@lemburg.com>

Paul Prescod wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> >...
> >
> > The term "character" in Python should really only be used for
> > the 8-bit strings.
> 
> Are we going to change chr() and unichr() to one_element_string() and
> unicode_one_element_string()

No. I am just suggesting that we make use of the crisp, clear
definitions which the Unicode Consortium has developed for us.
 
> u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> character. No Python user will find that confusing no matter how Unicode
> knuckle-dragging, mouth-breathing, wife-by-hair-dragging they are.

Except that u[i] maps to a code unit which may or may not be
a code point. Whether a code point matches a grapheme (this
is what users tend to regard as character) is yet another
story due to combining code points.
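
The three levels can be told apart with a short sketch. It assumes
Python 3.3 or later, where len() counts code points; on the narrow 2.x
builds under discussion here, len() of a non-BMP character would count
code units instead. The helper name utf16_code_units is made up for
this example.

```python
import unicodedata


def utf16_code_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


clef = "\U0001D11E"    # MUSICAL SYMBOL G CLEF, outside the BMP
e_acute = "e\u0301"    # 'e' followed by COMBINING ACUTE ACCENT

# One code point, but two UTF-16 code units (a surrogate pair).
clef_counts = (len(clef), utf16_code_units(clef))

# Two code points that a reader sees as one accented letter; NFC
# normalization happens to fold this particular pair into one code point.
accent_counts = (len(e_acute), len(unicodedata.normalize("NFC", e_acute)))
```

Here clef_counts is (1, 2) and accent_counts is (2, 1): the first pair
shows code point versus code unit, the second shows code point versus
what a user would call a character.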

> > In Unicode a "character" can mean any of:
> 
> Mark Davis said that "people" can use the word to mean any of those
> things. He did not say that it was imprecisely defined in Unicode.
> Nevertheless I'm not using the Unicode definition anymore than our
> standard library uses an ancient Greek definition of integer. Python has
> a concept of integer and a concept of character.

Ok, I'll stop whining. Just as a final remark, let me say that
our little discussion is a perfect example of how people can
misunderstand each other by using the terms in different ways
(Kant tried to solve this for Philosophy and did not succeed;
so I guess the Unicode Consortium doesn't stand a chance 
either ;-)
 
> > >     It has been proposed that there should be a module for working
> > >     with UTF-16 strings in narrow Python builds through some sort of
> > >     abstraction that handles surrogates for you. If someone wants
> > >     to implement that, it will be another PEP.
> >
> > Uhm, narrow builds don't support UTF-16... it's UCS-2 which
> > is supported (basically: store everything in range(0x10000));
> > the codecs can map code points to surrogates, but it is solely
> > their responsibility and the responsibility of the application
> > using them to take care of dealing with surrogates.
> 
> The user can view the data as UCS-2, UTF-16, Base64, ROT-13, XML, ....
> Just as we have a base64 module, we could have a UTF-16 module that
> interprets the data in the string as UTF-16 and does surrogate
> manipulation for you.
> 
> Anyhow, if any of those is the "real" encoding of the data, it is
> UTF-16. After all, if the codec reads in four non-BMP characters in,
> let's say, UTF-8, we represent them as 8 narrow-build Python characters.
> That's the definition of UTF-16! But it's easy enough for me to take
> that word out so I will.

u[i] gives you a code unit and whether this maps to a code point
or not is dependent on the implementation which in turn depends
on the narrow/wide choice.

In UCS-2, I believe, surrogates are regarded as two code points;
in UTF-16 they always have to come in pairs. There's a semantic
difference here which is for the codecs and these additional
tools to be aware of -- not the Unicode type implementation.
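
As a rough sketch of what such an additional tool might check (the
function name find_isolated_surrogates is hypothetical, and the input is
simply a list of integer code units): under the UTF-16 reading every
high surrogate must be followed directly by a low surrogate, whereas a
UCS-2 reading would accept each unit on its own.

```python
def find_isolated_surrogates(units):
    """Return the indexes of surrogate code units that are not part of a
    well-formed (high, low) pair."""
    isolated = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:                 # high surrogate
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2                               # well-formed pair
                continue
            isolated.append(i)
        elif 0xDC00 <= unit <= 0xDFFF:               # low surrogate alone
            isolated.append(i)
        i += 1
    return isolated
```

A sequence such as [0x41, 0xD834, 0xDD1E, 0x42] reports no problems,
while [0xDD1E, 0xD834] reports both positions because the two halves are
in the wrong order.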

> >...
> > Also, the module will be useful for both narrow and wide builds,
> > since the notion of an encoded character can involve multiple code
> > points. In that sense Unicode is always a variable length
> > encoding for characters and that's the application field of
> > this module.
> 
> I wouldn't advise that you do all different types of normalization in a
> single module but I'll wait for your PEP.

I'll see if I find some time at the Bordeaux Python Meeting
next week.
 
> > Here's the adjusted text:
> >
> >      It has been proposed that there should be a module for working
> >      with Unicode objects using character-, word- and line- based
> >      indexing. The details of the implementation are left to
> >      another PEP.
> 
>      It has been proposed that there should be a module that handles
>      surrogates in narrow Python builds for programmers. If someone
>      wants to implement that, it will be another PEP. It might also be
>      combined with features that allow other kinds of character-,
>      word- and line- based indexing.

Hmm, I liked my version better, but what the heck ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Mon Jul  2 11:43:38 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:43:38 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com>
 <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>
Message-ID: <3B40505A.2F03EEC4@lemburg.com>

Guido van Rossum wrote:
> 
> Hi Marc-Andre,
> 
> I'm dropping the i18n-sig from the distribution list.
> 
> I hear you:
> 
> > You didn't get my point. I feel responsible for the Unicode
> > implementation design and would like to see it become a continued
> > success.
> 
> I'm sure we all share this goal!
> 
> > In that sense and taking into account that I am the
> > maintainer of all this stuff, I think it is very reasonable to
> > ask me before making any significant changes to the implementation
> > and also respect any comments I put forward.
> 
> I understand you feel that we've rushed this in without waiting for
> your comments.
> 
> Given how close your implementation was, I still feel that the changes
> weren't that significant, but I understand that you get nervous.  If
> Christian were to check in his speed hack changes to the guts of
> ceval.c I would be nervous too!  (Heck, I got nervous when Eric
> checked in his library-wide string method changes without asking.)
> 
> Next time I'll try to be more sensitive to situations that require
> your review before going forward.

Good.
 
> > Currently, I have to watch the checkins list very closely
> > to find out who changed what in the implementation and then to
> > take actions only after the fact. Since I'm not supporting Unicode
> > as my full-time job this is simply impossible. We have the SF manager
> > and there is really no need to rush anything around here.
> 
> Hm, apart from the fact that you ought to be left in charge, I think
> that in this case the live checkins were a big win over the usual SF
> process.  At least two people were making changes, sometimes to each
> other's code, and many others on at least three continents were
> checking out the changes on many different platforms and immediately
> reporting problems.  We would definitely not have a patch as solid as
> the code that's now checked in, after two days of using SF!  (We
> could've used a branch, but I've found that getting people to actually
> check out the branch is not easy.)

True, but I was thinking of the concept and design questions
which should be resolved *before* taking the direct checkin 
approach.
 
> So I think that the net result was favorable.  Sometimes you just have
> to let people work in the spur of the moment to get the results of
> their best thinking, otherwise they lose interest or their train of
> thought.

Understood, but then I'd like to at least receive a summary
of the changes in some way, so that I continue to understand
how the implementation works after the checkins and which
corners to keep in mind for future additions, changes, etc.
 
> > If I am offline or too busy with other things for a day or two,
> > then I want to see patches on SF and not find new versions of
> > the implementation already checked in.
> 
> That's still the general rule, but in our enthusiasm (and mine was
> definitely part of this!) we didn't want to wait.  Also, I have to
> admit that I mistook your silence for consent -- I didn't think the
> main proposed changes (making the size of Py_UNICODE a config choice)
> were controversial at all, so I didn't realize you would have a problem
> with it.

I don't have a problem with it; I was just seeing things
slip through my fingers and getting worried about this.
 
> > This has worked just fine during the last year, so I can only explain
> > the latest actions in this direction with an urge to bypass my comments
> > and any discussion this might cause.
> 
> I think you're projecting your own stuff here. 

Not really. I have processed many patches on SF, gave comments
etc. and did the final checkin. This has worked great over
the last months and I intend to keep working this way since
it is by far the best way to both manage and document the
issues and questions which arise during the process.

E.g. I'm currently processing a patch by Walter Dörwald 
which adds support for callback error handlers. He has done
some great work there which was the result of many lively
discussions. Working like this is fun while staying
manageable at the same time... and again, there's really no
need to rush things !

> I honestly didn't
> think there was much disagreement on your part and thought we were
> doing you a favor by implementing the consensus.  IMO, Martin and
> Fredrik are familiar enough with both the code and the issues to do a
> good job.

Well, the above was my interpretation of how things went. 
I may have been wrong (and honestly do hope that I am wrong),
but my gut feeling simply said: hey, what are these guys doing
there... is this some kind of 
 
> > Needless to say that
> > quality control is not possible anymore.
> 
> Unclear.  Lots of other people looked over the changes in your
> absence.  And CVS makes code review after it's checked in easy enough.
> (Hey, in many other open source projects that's the normal procedure
> once the rough characteristics of a feature have been agreed upon:
> check in first and review later!)

That was not my point: quality control also includes checking
the design approach. This is something which should normally
be done in design/implementation/design/... phases -- just like 
I worked with you on the Unicode implementation late in 1999.
 
> > Conclusion:
> > I am not going to continue this work if this does not change.
> 
> That would be sad, and I hope you will stay with us.  We certainly
> don't plan to ignore your comments!
> 
> > Another problem for me is the continued hostility I feel on i18n
> > against parts of the design and some of my decisions. I am
> > not talking about your feedback and the feedback from many other
> > people on the list which was excellent and to high standards.
> > But reading the postings of the last few months you will
> > find notices of what I am referring to here (no, I don't want
> > to be specific).
> 
> I don't know what to say about this, and obviously nobody has the time
> to go back and read the archives.  I'm sure it's not you as a person
> that was attacked.  If the design isn't perfect -- and hey, since
> Python is the 80 percent language, few things in it are quite perfect!
> -- then (positive) criticism is an attempt to help, to move it closer
> to perfection.
> 
> If people have at times said "the Unicode support sucks", well, that
> may hurt.  You can't always stay friends with everybody.  I get flames
> occasionally for features in Python that folks don't like.  I get used
> to them, and it doesn't affect my confidence any more.  Be the same!

I'll try.
 
> But sometimes, after saying "it sucks", people make specific
> suggestions for improvements, and it's important to be open for those
> even from sources that use offending language.  (Within reason, of
> course.  I don't ask you to listen to somebody who is persistently
> hostile to you as a person.)

Ok.
 
> > If people don't respect my comments or decision, then how can
> > I defend the design and how can I stop endless discussions which
> > simply don't lead anywhere ? So either I am missing something
> > or there is a need for a clear statement from you about
> > my status in all this.
> 
> Do you really *want* to be the Unicode BDFL?  Being something's BDFL is a
> full-time job, and you've indicated you're too busy.  (Or is that
> temporary?)

I am currently doing a lot of consulting work, so things sometimes
tighten up and are less work intense at other times. Given
this setup, I think that I will be able to play the BD (without
the FL) for Unicode for some time. I will certainly pass on the
flag to someone else if I find myself not spending enough
time on it.

The only thing I'm asking for, is some more professional
work mentality at times. If people make it hard for me to follow
the development, then I cannot manage this task in a satisfying
way.

> I see you as the original coder, which means that you know that
> section of the code better than anyone, and whenever there's a
> question that others can't answer about its design, implementation, or
> restrictions, I refer to you.  But given that you've said you wouldn't
> be able to work much on it, I welcome contributions by others as long
> as they seem knowledgeable.

Same here.
 
> > If I don't have the right to comment on proposals and patches,
> > possibly even rejecting them, then I simply don't see any
> > ground for keeping the implementation in a state which I can
> > maintain.
> 
> Nobody said you couldn't comment, and you know that.

If I don't get a chance to comment on a summary of changes
(be it before or after a batch of checkins), how am I
supposed to follow up on them ? Keeping a close eye
on the checkin mailing list doesn't help: it simply doesn't
always give you the big picture.

We are all professional quality programmers and I respect
Fredrik and Martin for their coding quality and ideas. What
I am asking for is some more teamwork.

> When it comes to rejecting or accepting, I feel that I am still the
> final arbiter, even for Unicode, until I get hit by a bus.  Since I
> don't always understand the implementation or the issues, I'll of
> course defer to you in cases where I think I can't make the decision,
> but I do reserve the right to be convinced by others to override your
> judgement, occasionally, if there's a good reason.  And when you're
> not responsive, I may try to channel you.  (I'll try to be more
> explicit about that.)

That's perfectly OK (and indeed can be very useful at times).
 
> > And last but not least: The fun-factor has faded which was
> > the main motor driving me into working on Unicode in the first
> > place. Nothing much you can do about this, though :-/
> 
> Yes, that happens to all of us at times.  The fun factor goes up and
> down, and sometimes we must look for fun elsewhere for a while.  Then
> the fun may come back where it appeared lost.  Go on vacation, read a
> book, tackle a new project in a totally different area!  Then come
> back and see if you can find some fun in the old stuff again.

I'll visit the Bordeaux Python conference later this week. That should
give me some time to breathe (and hopefully to write some more
PEPs :=).
 
> > > Paul Prescod offered to write a PEP on this issue.  My cynical half
> > > believes that we'll never hear from him again, but my optimistic half
> > > hopes that he'll actually write one, so that we'll be able to discuss
> > > the various issues for the users with the users.  I encourage you to
> > > co-author the PEP, since you have a lot of background knowledge about
> > > the issues.
> >
> > I guess your optimistic half won :-) I think Paul already did all the
> > work, so I'll simply comment on what he wrote.
> 
> Your suggestions were very valuable.  My opinion of Paul also went up
> a notch!
> 
> > > BTW, I think that Misc/unicode.txt should be converted to a PEP, for
> > > the historic record.  It was very much a PEP before the PEP process
> > > was invented.  Barry, how much work would this be?  No editing needed,
> > > just formatting, and assignment of a PEP number (the lower the better).
> >
> > Thanks for converting the text to PEP format, Barry.
> >
> > Thanks for reading this far,
> 
> You're welcome, and likewise.
> 
> Just one more thing, Marc-Andre.  Please know that I respect your work
> very much even if we don't always agree.  We would get by without you,
> but Python would be hurt if you turned your back on us.

Thanks. Be assured that I'll stay around for quite some time --
you won't get by that easily ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From mal@lemburg.com  Mon Jul  2 11:56:00 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:56:00 +0200
Subject: [Python-Dev] Bordeaux Python Meeting 04.07.-07.07.
Message-ID: <3B405340.31C5AA11@lemburg.com>

Hi everybody,

I think nobody has posted an announcement for the conference
yet, so I'll at least provide a pointer:

	http://www.lsm.abul.org/program/topic19/

Marc Poinot, who also organized the "First Python Day" in France,
is chair of this subtopic at the "Debian One" conference in
Bordeaux:

	http://www.lsm.abul.org/

Cheers,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From fredrik@pythonware.com  Mon Jul  2 12:41:51 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 13:41:51 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com>              <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com> <3B40505A.2F03EEC4@lemburg.com>
Message-ID: <001e01c102eb$fe4995d0$4ffa42d5@hagrid>

mal wrote:

> The only thing I'm asking for, is some more professional
> work mentality at times.

for the record, your recent posts under this subject don't strike
me as very professional.

think about it.

</F>



From paulp@ActiveState.com  Mon Jul  2 15:25:55 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 02 Jul 2001 07:25:55 -0700
Subject: [I18n-sig] Re: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide"
 Unicodecharacters
References: <3B3F8095.8D58631D@ActiveState.com> <3B404967.14FE180F@lemburg.com>
Message-ID: <3B408473.77AB6C8@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> >     Character
> >
> >         Used by itself, means the addressable units of a Python
> >         Unicode string.
> 
> Please add: also known as "code unit".

I'm not entirely comfortable with that. As you yourself pointed out, the
same Python Unicode object can be interpreted as either a series of
single-width code points *or* as a UTF-16 string where the characters
are code units. You could also interpret it as a BASE64'd region or an
XML document... It all depends on how you look at it.

> ....
> >     Surrogate pair
> >
> >         Two physical characters that represent a single logical
> 
> Eeek... two code units (or have you ever seen a physical character
> walking around ;-)

No, that's sort of my point. The user can decide to adopt the convention
of looking at the two characters as code units or they can ignore that
interpretation and look at them as two code points. It's all relative,
man. Dig it? That's why I use the word "convention" below:

> >         character. Part of a convention for representing 32-bit
> >         code points in terms of two 16-bit code points.

"Surrogates are all in your head. Python doesn't know or care about
them!"

I'll change this to:

    Surrogate pair

        Two Python Unicode characters that represent a single logical
        Unicode code point. Part of a convention for representing
        32-bit code points in terms of two 16-bit code points. Python
        has limited support for reading, writing and constructing
        strings that use this convention (described below). Otherwise
        Python ignores the convention.

> No need to pass this information to the codec: simply write
> a new one and give it a clear name, e.g. "ucs-2" will generate
> errors while "utf-16-le" converts them to surrogates.

That's a good point, but what if I want a UTF-8 codec that doesn't
generate surrogates? Or even a UCS4 one?

> Plus perhaps the Mark Davis paper at:
> 
> http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/

Okay.

> > Copyright
> >
> >     This document has been placed in the public domain.
> 
> Good work, Paul !

Thanks for your help. You did help me to clarify many things even though
I argued with you as I was doing it. 
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From guido@digicool.com  Mon Jul  2 16:23:56 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 02 Jul 2001 11:23:56 -0400
Subject: [Python-Dev] Unicode Maintenance
In-Reply-To: Your message of "Mon, 02 Jul 2001 12:43:38 +0200."
 <3B40505A.2F03EEC4@lemburg.com>
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>
 <3B40505A.2F03EEC4@lemburg.com>
Message-ID: <200107021523.f62FNun01807@odiug.digicool.com>

Thanks for your response, Marc-Andre.  I'd like to close this topic
now.  I'm not sure how to get you a "summary of changes", but I think
you can ask Fredrik directly (Martin announced he's away on vacation).

One thing you can do is pipe the output of "cvs log" through
tools/scripts/logmerge.py -- this gives you the checkin messages in
(reverse?) chronological order.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Mon Jul  2 16:29:39 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 02 Jul 2001 11:29:39 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Mon, 02 Jul 2001 11:39:55 +0200."
 <3B40416B.6438D1F7@egenix.com>
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>
 <3B40416B.6438D1F7@egenix.com>
Message-ID: <200107021529.f62FTdx01823@odiug.digicool.com>

> Greg Ewing wrote:
> > 
> > > It so happened that the Unicode support was written to make it very
> > > easy to change the compile-time code unit size
> > 
> > What about extension modules that deal with Unicode strings?
> > Will they have to be recompiled too? If so, is there anything
> > to detect an attempt to import an extension module with an
> > incompatible Unicode character width?
> 
> That's a good question ! 
> 
> The answer is: yes, extensions which use Unicode will have to
> be recompiled for narrow and wide builds of Python. The question
> is however, how to detect cases where the user imports an
> extension built for narrow Python into a wide build and
> vice versa.
> 
> The standard way of looking at the API level won't help. We'd
> need some form of introspection API at the C level... hmm,
> perhaps looking at the sys module will do the trick for us ?!
> 
> In any case, this is certainly going to cause trouble one
> of these days...

Here are some alternative ways to deal with this:

(1) Use the preprocessor to rename all the Unicode APIs to get "Wide"
    appended to their name in wide mode.  This makes any use of a
    Unicode API in an extension compiled for the wrong Py_UNICODE_SIZE
    fail with a link-time error.  (Which should cause an ImportError
    for shared libraries.)

(2) Ditto but only rename the PyModule_Init function.  This is much
    less work but more coarse: a module that doesn't use any Unicode
    APIs (and I expect these will be a large majority) still would not
    be accepted.

(3) Change the interpretation of PYTHON_API_VERSION so that a low bit
    of '1' means wide Unicode.  Then you only get a warning (followed
    by a core dump when actually trying to use Unicode).

I mentioned (1) and (3) in an earlier post.
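
To make option (3) concrete, here is a minimal sketch in Python rather than C. It does not reflect how PYTHON_API_VERSION is interpreted in any released Python; the helper name check_api_version, the WIDE_FLAG constant and the sample version numbers are all hypothetical. It only illustrates the "low bit means wide" convention and the fact that a mismatch yields a warning rather than a hard failure.

```python
import warnings

# Hypothetical convention from option (3): the low bit of the API
# version is 1 for a wide-Unicode build and 0 for a narrow one.
WIDE_FLAG = 1


def check_api_version(module_version, interpreter_version):
    """Return True if both sides agree on Unicode width.

    On a mismatch this only warns, mirroring the weakness noted for
    option (3): the import is not actually refused.
    """
    same_width = (module_version & WIDE_FLAG) == (interpreter_version & WIDE_FLAG)
    if not same_width:
        warnings.warn("extension built for a different Unicode width",
                      RuntimeWarning)
    return same_width
```

Because the check can only warn, a mismatched module would still load, which is why options (1) and (2) fail harder and earlier.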

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@beowolf.digicool.com  Mon Jul  2 16:37:45 2001
From: fdrake@beowolf.digicool.com (Fred Drake)
Date: Mon,  2 Jul 2001 11:37:45 -0400 (EDT)
Subject: [Python-Dev] [maintenance doc updates]
Message-ID: <20010702153745.B304B28929@beowolf.digicool.com>

The development version of the documentation has been updated:

	http://python.sourceforge.net/maint-docs/


Updated to reflect the current state of the Python 2.1.1 maintenance
release branch.



From mal@lemburg.com  Mon Jul  2 17:51:58 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 18:51:58 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>
 <3B40416B.6438D1F7@egenix.com> <200107021529.f62FTdx01823@odiug.digicool.com>
Message-ID: <3B40A6AE.EDE30857@lemburg.com>

Guido van Rossum wrote:
> 
> > Greg Ewing wrote:
> > >
> > > > It so happened that the Unicode support was written to make it very
> > > > easy to change the compile-time code unit size
> > >
> > > What about extension modules that deal with Unicode strings?
> > > Will they have to be recompiled too? If so, is there anything
> > > to detect an attempt to import an extension module with an
> > > incompatible Unicode character width?
> >
> > That's a good question !
> >
> > The answer is: yes, extensions which use Unicode will have to
> > be recompiled for narrow and wide builds of Python. The question
> > is however, how to detect cases where the user imports an
> > extension built for narrow Python into a wide build and
> > vice versa.
> >
> > The standard way of looking at the API level won't help. We'd
> > need some form of introspection API at the C level... hmm,
> > perhaps looking at the sys module will do the trick for us ?!
> >
> > In any case, this is certainly going to cause trouble one
> > of these days...
> 
> Here are some alternative ways to deal with this:
> 
> (1) Use the preprocessor to rename all the Unicode APIs to get "Wide"
>     appended to their name in wide mode.  This makes any use of a
>     Unicode API in an extension compiled for the wrong Py_UNICODE_SIZE
>     fail with a link-time error.  (Which should cause an ImportError
>     for shared libraries.)
>
> (2) Ditto but only rename the PyModule_Init function.  This is much
>     less work but more coarse: a module that doesn't use any Unicode
>     APIs (and I expect these will be a large majority) still would not
>     be accepted.
> 
> (3) Change the interpretation of PYTHON_API_VERSION so that a low bit
>     of '1' means wide Unicode.  Then you only get a warning (followed
>     by a core dump when actually trying to use Unicode).
>
> I mentioned (1) and (3) in an earlier post.

(4) Add a feature flag to PyModule_Init() which then looks up the
    features in the sys module and uses this as basis for
    processing the import request.

In this case, I think that (5) would be the best solution,
since old code will notice the change in width too.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/


From paulp@ActiveState.com  Mon Jul  2 19:15:41 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 02 Jul 2001 11:15:41 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>
 <3B40416B.6438D1F7@egenix.com> <200107021529.f62FTdx01823@odiug.digicool.com> <3B40A6AE.EDE30857@lemburg.com>
Message-ID: <3B40BA4D.9C85A202@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> 
> (4) Add a feature flag to PyModule_Init() which then looks up the
>     features in the sys module and uses this as basis for
>     processing the import request.

Could an extension be carefully written so that a single binary could be
compatible with both types of Python build? I'm thinking that it would
pass data buffers with the "right width" based on checking a runtime
flag...
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From just@letterror.com  Mon Jul  2 19:20:38 2001
From: just@letterror.com (Just van Rossum)
Date: Mon,  2 Jul 2001 20:20:38 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B40BA4D.9C85A202@ActiveState.com>
Message-ID: <20010702202041-r01010600-d5c62b95@213.84.27.177>

Paul Prescod wrote:

> Could an extension be carefully written so that a single binary could be
> compatible with both types of Python build? I'm thinking that it would
> pass data buffers with the "right width" based on checking a runtime
> flag...

But then it would also be compatible with a unicode object using different
internal storage units per string, so I'm sure this is a dead end ;-)

Just


From mal@lemburg.com  Mon Jul  2 19:59:06 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 20:59:06 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010702202041-r01010600-d5c62b95@213.84.27.177>
Message-ID: <3B40C47A.94317663@lemburg.com>

Just van Rossum wrote:
> 
> Paul Prescod wrote:
> 
> > Could an extension be carefully written so that a single binary could be
> > compatible with both types of Python build? I'm thinking that it would
> > pass data buffers with the "right width" based on checking a runtime
> > flag...
> 
> But then it would also be compatible with a unicode object using different
> internal storage units per string, so I'm sure this is a dead end ;-)

Agreed :-)

Extension writers will have to provide two versions of the binary.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/


From mal@lemburg.com  Mon Jul  2 20:12:45 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 21:12:45 +0200
Subject: [I18n-sig] Re: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide"
 Unicodecharacters
References: <3B3F8095.8D58631D@ActiveState.com> <3B404967.14FE180F@lemburg.com> <3B408473.77AB6C8@ActiveState.com>
Message-ID: <3B40C7AD.F2646D56@lemburg.com>

Paul Prescod wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> >...
> > >     Character
> > >
> > >         Used by itself, means the addressable units of a Python
> > >         Unicode string.
> >
> > Please add: also known as "code unit".
> 
> I'm not entirely comfortable with that. As you yourself pointed out, the
> same Python Unicode object can be interpreted as either a series of
> single-width code points *or* as a UTF-16 string where the characters
> are code units. You could also interpret it as a BASE64'd region or an
> XML document... It all depends on how you look at it.

Well, that's what code unit tries to capture too: it's the basic storage
unit used by the implementation for storing characters. Never mind, it's
just a detail...
 
> > ....
> > >     Surrogate pair
> > >
> > >         Two physical characters that represent a single logical
> >
> > Eeek... two code units (or have you ever seen a physical character
> > walking around ;-)
> 
> No, that's sort of my point. The user can decide to adopt the convention
> of looking at the two characters as code units or they can ignore that
> interpretation and look at them as two code points. It's all relative,
> man. Dig it? That's why I use the word "convention" below:

Ok.
 
> > >         character. Part of a convention for representing 32-bit
> > >         code points in terms of two 16-bit code points.
> 
> "Surrogates are all in your head. Python doesn't know or care about
> them!"
> 
> I'll change this to:
> 
>     Surrogate pair
> 
>         Two Python Unicode characters that represent a single logical
>         Unicode code point. Part of a convention for representing
>         32-bit code points in terms of two 16-bit code points. Python
>         has limited support for reading, writing and constructing
>         strings that use this convention (described below). Otherwise
>         Python ignores the convention.

Good.
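
The surrogate-pair convention in the definition above comes down to a little arithmetic. The following is a minimal sketch of the standard UTF-16 mapping, not code taken from the PEP or from unicodeobject.c; the function names to_surrogates and from_surrogates are made up for illustration.

```python
def to_surrogates(code_point):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the surrogate-encodable range")
    offset = code_point - 0x10000           # 20 significant bits remain
    high = 0xD800 + (offset >> 10)          # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)         # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

Nothing in this arithmetic requires the interpreter to know about the convention, which is the point being made: the two units are ordinary characters until some piece of code chooses to read them as a pair.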
 
> > No need to pass this information to the codec: simply write
> > a new one and give it a clear name, e.g. "ucs-2" will generate
> > errors while "utf-16-le" converts them to surrogates.
> 
> That's a good point, but what if I want a UTF-8 codec that doesn't
> generate surrogates? Or even a UCS4 one?

With Walter's patch for callback error handlers, you should be able to
provide handlers which implement whatever you see fit. 
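
As a rough illustration of what a callback error handler can look like, here is a sketch using codecs.register_error as it exists in later Python releases. Whether the patch under discussion exposes exactly this interface is an assumption; the handler name "uplus" and the function uplus_handler are invented for the example.

```python
import codecs


def uplus_handler(exc):
    """Replace each unencodable character with a U+XXXX marker."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc  # this sketch only handles encoding errors
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("U+%04X" % ord(ch) for ch in bad)
    # Return the replacement text and the position at which to resume.
    return replacement, exc.end


# Register under a hypothetical name so it can be passed as errors=...
codecs.register_error("uplus", uplus_handler)
```

With the handler registered, "a\u20acb".encode("ascii", "uplus") substitutes the marker text instead of raising, so a caller can choose per call how unencodable characters are treated without needing a separate codec.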
 
I think that codecs should work the same on all platforms and always
apply the needed conversion for the platform in question; could be wrong
though... it's really only a minor issue.

> > Plus perhaps the Mark Davis paper at:
> >
> > http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/
> 
> Okay.
> 
> > > Copyright
> > >
> > >     This document has been placed in the public domain.
> >
> > Good work, Paul !
> 
> Thanks for your help. You did help me to clarify many things even though
> I argued with you as I was doing it.

Thank you for taking the suggestions into account.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/


From fredrik@pythonware.com  Mon Jul  2 20:41:33 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 21:41:33 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com>
Message-ID: <013101c1032f$022770d0$4ffa42d5@hagrid>

guido wrote: 
> I'm not sure how to get you a "summary of changes", but I think you
> can ask Fredrik directly (Martin announced he's away on vacation).

summary:

- portability: made unicode object behave properly also if
  sizeof(Py_UNICODE) > 2 and >= sizeof(long) (FL)
- same for unicode codecs and the unicode database (MvL)
- base unicode feature selection on unicode defines, not platform (FL)
- wrap surrogate handling in #ifdef Py_UNICODE_WIDE (MvL, FL)
- tweaked unit tests to work with wide unicode, by replacing explicit
  surrogates with \U escapes (MvL)
- configure options for narrow/wide unicode (MvL)
- removed bogus const and register from some scalars (GvR, FL)
- default unicode configuration for PC (Tim, FL)
- default unicode configuration for Mac (Jack)
- added sys.maxunicode (MvL)
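
A small sketch of how Python code might use sys.maxunicode to tell the two builds apart. The helper name unicode_build is hypothetical; the only assumption about the interpreter is that a narrow build reports 0xFFFF and a wide build a larger value. On Python 3.3 and later there is no narrow build, so the value is always 0x10FFFF.

```python
import sys


def unicode_build(maxunicode=sys.maxunicode):
    """Classify a build as "narrow" or "wide" from its largest code point."""
    return "wide" if maxunicode > 0xFFFF else "narrow"
```

The parameter defaults to the running interpreter's value, and passing an explicit number makes the helper easy to exercise for both cases.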

most changes were really trivial (e.g. ~0xFC00 => 0x3FF). martin's
big patch was reviewed and tested by both me and him before checkin
(tim managed to check out and build before I'd gotten around to check
in my windows tweaks, but that's what makes distributed egoless
development so fun ;-)

</F>



From greg@cosc.canterbury.ac.nz  Tue Jul  3 01:20:37 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 03 Jul 2001 12:20:37 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <03b301c102cf$e0e3dd00$0900a8c0@spiff>
Message-ID: <200107030020.MAA00584@s454.cosc.canterbury.ac.nz>

Fredrik Lundh <fredrik@pythonware.com>:

> back in 1995, some people claimed that the image type had
> to be made smarter to be usable.

But at least you can use more than one depth of
image in the same program...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From mal@lemburg.com  Tue Jul  3 09:31:50 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 03 Jul 2001 10:31:50 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid>
Message-ID: <3B4182F6.DAC4C1@lemburg.com>

Fredrik Lundh wrote:
> 
> guido wrote:
> > I'm not sure how to get you a "summary of changes", but I think you
> > can ask Fredrik directly (Martin announced he's away on vacation).
> 
> summary:
> 
> - portability: made unicode object behave properly also if
>   sizeof(Py_UNICODE) > 2 and >= sizeof(long) (FL)
> - same for unicode codecs and the unicode database (MvL)
> - base unicode feature selection on unicode defines, not platform (FL)
> - wrap surrogate handling in #ifdef Py_UNICODE_WIDE (MvL, FL)
> - tweaked unit tests to work with wide unicode, by replacing explicit
>   surrogates with \U escapes (MvL)
> - configure options for narrow/wide unicode (MvL)
> - removed bogus const and register from some scalars (GvR, FL)
> - default unicode configuration for PC (Tim, FL)
> - default unicode configuration for Mac (Jack)
> - added sys.maxunicode (MvL)

Thank you for the summary. 

Please let me suggest that for the next coding party you prepare a patch 
which spans all party checkins and upload that patch with a summary
like the above to SF. That way we can keep the documentation of the overall
changes in one place and make the process more transparent for everybody.

Now let's get on with business...

Thanks,
-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/




From fredrik@pythonware.com  Tue Jul  3 11:21:27 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 3 Jul 2001 12:21:27 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid> <3B4182F6.DAC4C1@lemburg.com>
Message-ID: <05aa01c103a9$ec29e710$0900a8c0@spiff>

mal wrote:

> Please let me suggest that for the next coding party you prepare a patch
> which spans all party checkins and upload that patch with a summary
> like the above to SF. That way we can keep the documentation of the overall
> changes in one place and make the process more transparent for everybody.

Sorry, but as long as Guido wants an open development approach
based on collective code ownership (aka "egoless programming"),
that's what he gets.

The current environment provides several tools to track changes
to the code base.  The python-checkins list provides instant info
on every single change to the code base; the investment to track
the list is a few minutes per day.  The CVS history is also easy to
access; you can reach it via the viewcvs interface, or from the
command line.

Using both CVS and SF's patch manager to track development history
is a waste of time.  A development project manned by volunteers
doesn't need bureaucrats; the version control system provides
all the accountability we'll ever need.

(commercial development projects don't need bureaucrats
either, and usually don't have them, but that's another story).

I'd also argue that using many incremental checkins improves
quality -- the smaller a change is, the easier it is to understand,
and the more likely it is that also non-experts will notice simple
mistakes or portability issues.  (I regularly comment on checkin
messages that look suspicious codewise, even if I don't know
anything about the problem area.  I'm even right, sometimes).
Reviewing big patches on SF is really hard, even for experts.

And every hour a patch sits on sourceforge instead of in the code
repository is ten hours less burn-in in a heterogeneous testing
environment.  That's worth a lot.

Finally, my experience from this and other projects is that the
"visible heartbeat" you get from a continuous flow of checkin
messages improves team productivity and team morale.  Nothing
is more inspiring than seeing others working for a common
goal.  It's the final product that matters, not who's in charge of
what part of it.  The end user couldn't care less.

I'd prefer if you didn't feel the need to play miniboss on the Python
project (I'm sure you have plenty of 'mx' projects that you can use
that approach, if you have to).  And I'd rather see you at the next
party than out there whining over how you missed the last one.

Cheers /F




From mal@lemburg.com  Tue Jul  3 12:30:05 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 03 Jul 2001 13:30:05 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid> <3B4182F6.DAC4C1@lemburg.com> <05aa01c103a9$ec29e710$0900a8c0@spiff>
Message-ID: <3B41ACBD.9FA8FB25@lemburg.com>

Fredrik Lundh wrote:
> 
> > Please let me suggest that for the next coding party you prepare a patch
> > which spans all party checkins and upload that patch with a summary
> > like the above to SF. That way we can keep the documentation of the overall
> > changes in one place and make the process more transparent for everybody.
> 
> Sorry, but as long as Guido wants an open development approach
> based on collective code ownership (aka "egoless programming"),
> that's what he gets.
> 
> The current environment provides several tools to track changes
> to the code base.  The python-checkins list provides instant info
> on every single change to the code base; the investment to track
> the list is a few minutes per day.  The CVS history is also easy to
> access; you can reach it via the viewcvs interface, or from the
> command line.

I think you misunderstood my suggestion: I didn't say you can't have
a coding party with lots of small checkins, I just suggested that *after*
the party someone does a diff before-and-after-the-party.diff and
uploads this diff to SF with a description of the overall changes.

You simply don't get the big picture from looking at various small 
checkin messages which are sometimes spread across multiple files/checkins.
 
> Using both CVS and SF's patch manager to track development history
> is a waste of time.  A development project manned by volunteers
> doesn't need bureaucrats; the version control system provides
> all the accountability we'll ever need.
> 
> (commercial development projects don't need bureaucrats
> either, and usually don't have them, but that's another story).

Wasn't talking about bureaucrats... 
 
> I'd also argue that using many incremental checkins improves
> quality -- the smaller a change is, the easier it is to understand,
> and the more likely it is that also non-experts will notice simple
> mistakes or portability issues.  (I regularly comment on checkin
> messages that look suspicious codewise, even if I don't know
> anything about the problem area.  I'm even right, sometimes).
> Reviewing big patches on SF is really hard, even for experts.

It's just for keeping a combined record of changes. Following up on
dozens of checkins spanning another dozen files using CVS is 
harder, IMHO, than looking at one single before/after diff.
 
> And every hour a patch sits on sourceforge instead of in the code
> repository is ten hours less burn-in in a heterogeneous testing
> environment.  That's worth a lot.

Agreed.
 
> Finally, my experience from this and other projects is that the
> "visible heartbeat" you get from a continuous flow of checkin
> messages improves team productivity and team morale.  Nothing
> is more inspiring than seeing others working for a common
> goal.  It's the final product that matters, not who's in charge of
> what part of it.  The end user couldn't care less.
> 
> I'd prefer if you didn't feel the need to play miniboss on the Python
> project (I'm sure you have plenty of 'mx' projects that you can use
> that approach, if you have to). 

I have no intention of playing "miniboss" (I have enough of that being
the boss of a small company), I'm just trying to keep the task of a code
maintainer manageable; that's all. 'nuff said.

> And I'd rather see you at the next
> party than out there whining over how you missed the last one.

Perhaps you can send around invitations first, before starting the party 
next time ?!

BTW, do you have plans to update the Unicode database to the 3.1
version ? If not, I'll look into this next week.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/



From thomas@xs4all.net  Tue Jul  3 12:41:51 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 3 Jul 2001 13:41:51 +0200
Subject: [Python-Dev] CVS
Message-ID: <20010703134151.P8098@xs4all.nl>

Slightly off-topic, but I've depleted all my other sources :) I'm trying to
get CVS to give me all logentries for all checkins in a specific branch (the
2.1.1 branch) so I can pipe it through logmerge. It seems the one thing I'm
missing now is a branchpoint tag (which should translate to a revision with
an even number of dots, apparently) but 'release21' and 'release21-maint'
both don't qualify. Even the usage logmerge suggests (cvs log -rrelease21)
doesn't work, gives me a bunch of "no revision `release21' in <file>"
warnings and just all logentries for those files.

Am I missing something simple, here, or should I hack logmerge to parse the
symbolic names, figure out the even-dotted revision for each file from the
uneven-dotted branch-tag, and filter out stuff outside that range ? :P
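
For what it's worth, the revision arithmetic itself is small. The sketch below assumes the usual CVS convention that a branch tag appears among the symbolic names in "cvs log" output as a "magic" number with a 0 in the second-to-last position (for example 1.5.0.2 for branch 1.5.2). The helper name branch_info is hypothetical and is not part of logmerge.py.

```python
def branch_info(tag_revision):
    """Return (branch_number, branchpoint_revision) for a CVS branch tag.

    Accepts either the magic form ("1.5.0.2") or the plain branch
    number ("1.5.2"); raises ValueError for an ordinary revision.
    """
    parts = tag_revision.split(".")
    if len(parts) >= 4 and len(parts) % 2 == 0 and parts[-2] == "0":
        # Magic branch number: drop the 0 to get the real branch number.
        branch = parts[:-2] + parts[-1:]
    elif len(parts) >= 3 and len(parts) % 2 == 1:
        # Already a plain branch number (odd number of components).
        branch = parts
    else:
        raise ValueError("%r does not look like a branch tag" % tag_revision)
    # The branchpoint is the branch number minus its last component.
    return ".".join(branch), ".".join(branch[:-1])
```

Revisions on the branch then start with the branch number followed by a dot, which gives a per-file filter for the log entries.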

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From gregor@hoffleit.de  Tue Jul  3 13:09:51 2001
From: gregor@hoffleit.de (Gregor Hoffleit)
Date: Tue, 3 Jul 2001 14:09:51 +0200
Subject: [Python-Dev] PEP 250, site-python, site-packages
Message-ID: <20010703140951.A27647@mediasupervision.de>

PEP 250 talks about adopting site-packages for Windows systems. I'd like
to discuss the sitedirs as a whole.

Currently, site.py appends the following sitedirs to sys.path:

    * <prefix>/lib/python<version>/site-packages
    * <prefix>/lib/site-python

If exec-prefix is different from prefix, then also

    * <exec-prefix>/lib/python<version>/site-packages
    * <exec-prefix>/lib/site-python
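
The rule just described can be written down as a short sketch. This is not the code in site.py, only a restatement of the behaviour listed above; the function name sitedirs is hypothetical, and posixpath is used so the result looks the same on every platform.

```python
import posixpath


def sitedirs(prefix, exec_prefix, version):
    """List the site directories described above, in order."""
    prefixes = [prefix]
    if exec_prefix != prefix:
        # Only add the exec-prefix entries when they would differ.
        prefixes.append(exec_prefix)
    dirs = []
    for base in prefixes:
        dirs.append(posixpath.join(base, "lib", "python" + version,
                                   "site-packages"))
        dirs.append(posixpath.join(base, "lib", "site-python"))
    return dirs
```

For example, sitedirs("/usr", "/usr", "2.1") yields one version-specific directory and one shared one; only the first changes from release to release.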


>From the viewpoint of a Linux distribution, putting pure Python
extension packages in <prefix>/lib/python<version>/site-packages is
quite awkward: Debian has Python extension packages that would work
unmodified with all Python versions since 1.4 up to now; and still, for
every new <version> of Python, we have to make a new package, with only
the installation path changed.

Due to Python's good tradition of compatibility, this is the vast
majority of packages; only packages with binary modules necessarily need
to be recompiled anyway for each major new <version>.

What makes me wonder is that nobody seems to use site-python; Distutils
is completely unaware of it, and besides a few generic Debian packages
(reportbug, dpkg-python), no extension package on my machine is in
site-python. site-packages OTOH is used by Distutils, and this PEP 250
would recommend its use even on Windows systems.


I would suggest turning this upside down:

Python extensions should be installed in <prefix>/lib/site-python by
default. Only if they contain things that definitely should not be used
with any other Python <version> (e.g. binary modules), they might be
installed in the version-specific extension directory,
<prefix>/lib/python<version>/site-packages.


I'm thinking about modifying Debian's distutils in order to install all
architecture independent stuff in site-python. This would vastly ease
the maintenance of Python packages.


    Gregor


From jepler@mail.inetnebr.com  Tue Jul  3 13:38:00 2001
From: jepler@mail.inetnebr.com (Jeff Epler)
Date: Tue, 3 Jul 2001 07:38:00 -0500
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703140951.A27647@mediasupervision.de>; from gregor@mediasupervision.de on Tue, Jul 03, 2001 at 02:09:51PM +0200
References: <20010703140951.A27647@mediasupervision.de>
Message-ID: <20010703073759.A4972@localhost.localdomain>

On Tue, Jul 03, 2001 at 02:09:51PM +0200, Gregor Hoffleit wrote:
> Due to Python's good tradition of compatibility, this is the vast
> majority of packages; only packages with binary modules necessarily need
> to be recompiled anyway for each major new <version>.

Aren't there bytecode changes in 1.6, 2.0, and 2.1, compared to 1.5.2?  If
so, this either means that each version of Python does need a separate copy
(for the .pyc/.pyo file), or if all versions are compatible with 1.5.2
bytecodes (and I don't know that they are) then all packages would need to
be bytecompiled with 1.5.2.

For instance, it appears that between 1.5.2 and 2.1, the UNPACK_LIST
and UNPACK_TUPLE bytecode instructions were removed and replaced with
a single UNPACK_SEQUENCE opcode.

Information gathered by executing:
	python -c 'import dis
	for name in dis.opname:
	    if name[0] != "<": print name' | sort -u > opcodes-1.5.2
and similarly for python2.
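
Comparing the two resulting lists is then a set difference. A minimal sketch in present-day Python syntax follows; the function name compare_opcodes is hypothetical, and the names used in the usage note are just the three opcodes mentioned above, not a full listing from either interpreter.

```python
def compare_opcodes(old_names, new_names):
    """Return (removed, added) opcode names between two listings."""
    old, new = set(old_names), set(new_names)
    removed = sorted(old - new)   # present before, gone now
    added = sorted(new - old)     # new in the later version
    return removed, added
```

Fed the names from the two files, an old list containing UNPACK_LIST and UNPACK_TUPLE and a new list containing UNPACK_SEQUENCE would report the first two as removed and the last as added.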

Jeff


From gregor@hoffleit.de  Tue Jul  3 13:53:11 2001
From: gregor@hoffleit.de (Gregor Hoffleit)
Date: Tue, 3 Jul 2001 14:53:11 +0200
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703073759.A4972@localhost.localdomain>
References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain>
Message-ID: <20010703145311.A12350@mediasupervision.de>

On Tue, Jul 03, 2001 at 07:38:00AM -0500, Jeff Epler wrote:
> On Tue, Jul 03, 2001 at 02:09:51PM +0200, Gregor Hoffleit wrote:
> > Due to Python's good tradition of compatibility, this is the vast
> > majority of packages; only packages with binary modules necessarily need
> > to be recompiled anyway for each major new <version>.
> 
> Aren't there bytecode changes in 1.6, 2.0, and 2.1, compared to 1.5.2?  If
> so, this either means that each version of Python does need a separate copy
> (for the .pyc/.pyo file), or if all versions are compatible with 1.5.2
> bytecodes (and I don't know that they are) then all packages would need to
> be bytecompiled with 1.5.2.
> 
> For instance, it appears that between 1.5.2 and 2.1, the UNPACK_LIST
> and UNPACK_TUPLE bytecode instructions were removed and replaced with
> a single UNPACK_SEQUENCE opcode.
> 
> Information gathered by executing:
> 	python -c 'import dis
> 	for name in dis.opname:
> 	    if name[0] != "<": print name' | sort -u > opcodes-1.5.2
> and similarly for python2.

Right, I forgot about that. It's not so bad for Debian though, since
most of our packages byte-compile the stuff only when unpacking the
package. Since installation of a new python-base package recompiles the
complete site-packages tree (but not yet site-python, you got me ;-),
we're not hurt by that problem.

Any other arguments contra ? ;-)

    Gregor


From thomas@xs4all.net  Tue Jul  3 13:53:34 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 3 Jul 2001 14:53:34 +0200
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703073759.A4972@localhost.localdomain>
References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain>
Message-ID: <20010703145334.Q8098@xs4all.nl>

On Tue, Jul 03, 2001 at 07:38:00AM -0500, Jeff Epler wrote:
> On Tue, Jul 03, 2001 at 02:09:51PM +0200, Gregor Hoffleit wrote:
> > Due to Python's good tradition of compatibility, this is the vast
> > majority of packages; only packages with binary modules necessarily need
> > to be recompiled anyway for each major new <version>.

> Aren't there bytecode changes in 1.6, 2.0, and 2.1, compared to 1.5.2?  If
> so, this either means that each version of Python does need a separate copy
> (for the .pyc/.pyo file), or if all versions are compatible with 1.5.2
> bytecodes (and I don't know that they are) then all packages would need to
> be bytecompiled with 1.5.2.

None are compatible. This might change, but I don't think so -- I think the
CVS tree already has a different bytecode magic than 2.1, though I haven't
checked. Perhaps what Gregor wants is a set of symlinks in each python
version's site-packages directory, to a system-wide one, and a
'register-python-version' script like the emacs/xemacs stuff has that adds
those symlinks. That way, the .pyc/.pyo versions would remain in the
version-specific directory.
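
The bytecode magic is easy to check by hand. The sketch below is written for current Python, where the running interpreter's magic is available as importlib.util.MAGIC_NUMBER and a .pyc file begins with those four bytes; the helper name pyc_matches_interpreter is hypothetical.

```python
import importlib.util


def pyc_matches_interpreter(path):
    """Return True if the .pyc at 'path' carries this interpreter's magic."""
    with open(path, "rb") as f:
        # The first four bytes of a compiled file are the magic number.
        return f.read(4) == importlib.util.MAGIC_NUMBER
```

A file compiled by a different release fails the comparison, which is why compiled files cannot simply be shared between versions in one directory.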

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Tue Jul  3 14:00:03 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 3 Jul 2001 15:00:03 +0200
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703145311.A12350@mediasupervision.de>
References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> <20010703145311.A12350@mediasupervision.de>
Message-ID: <20010703150003.R8098@xs4all.nl>

On Tue, Jul 03, 2001 at 02:53:11PM +0200, Gregor Hoffleit wrote:

> Right, I forgot about that. It's not so bad for Debian though, since
> most of our packages byte-compile the stuff only when unpacking the
> package. Since installation of a new python-base package recompiles the
> complete site-packages tree (but not yet site-python, you got me ;-),
> we're not hurt by that problem.

What about when you want to have multiple python versions, like python
1.5.2, 2.0.1, 2.1.1 and 2.2-CVS-snapshot ? :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From gregor@hoffleit.de  Tue Jul  3 14:02:50 2001
From: gregor@hoffleit.de (Gregor Hoffleit)
Date: Tue, 3 Jul 2001 15:02:50 +0200
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703145334.Q8098@xs4all.nl>
References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> <20010703145334.Q8098@xs4all.nl>
Message-ID: <20010703150250.B12350@mediasupervision.de>

On Tue, Jul 03, 2001 at 02:53:34PM +0200, Thomas Wouters wrote:
> On Tue, Jul 03, 2001 at 07:38:00AM -0500, Jeff Epler wrote:
> > On Tue, Jul 03, 2001 at 02:09:51PM +0200, Gregor Hoffleit wrote:
> > > Due to Python's good tradition of compatibility, this is the vast
> > > majority of packages; only packages with binary modules necessarily need
> > > to be recompiled anyway for each major new <version>.
> 
> > Aren't there bytecode changes in 1.6, 2.0, and 2.1, compared to 1.5.2?  If
> > so, this either means that each version of Python does need a separate copy
> > (for the .pyc/.pyo file), or if all versions are compatible with 1.5.2
> > bytecodes (and I don't know that they are) then all packages would need to
> > be bytecompiled with 1.5.2.
> 
> None are compatible. This might change, but I don't think so -- I think the
> CVS tree already has a different bytecode magic than 2.1, though I haven't
> checked. Perhaps what Gregor wants is a set of symlinks in each python
> version's site-packages directory, to a system-wide one, and a
> 'register-python-version' script like the emacs/xemacs stuff has that adds
> those symlinks. That way, the .pyc/.pyo versions would remain in the
> version-specific directory.

Sounds like a LOT of symlinks. To be honest, I would prefer to postulate
that there's only one official Python version on a Debian system at a
time. Then, the postinst and prerm scripts of python-base could take
care of removing and recompiling .pyc and .pyo files at install time of
a new Python version.

Certainly, this won't work for packages that ship with precompiled
.pyc/.pyo files, and we have to provide a method for registering .py
files in non-standard places.

If all of this was in place, I don't see a reason *not* to use
site-python instead of site-packages...

    Gregor



From gregor@hoffleit.de  Tue Jul  3 14:05:35 2001
From: gregor@hoffleit.de (Gregor Hoffleit)
Date: Tue, 3 Jul 2001 15:05:35 +0200
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703150003.R8098@xs4all.nl>
References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> <20010703145311.A12350@mediasupervision.de> <20010703150003.R8098@xs4all.nl>
Message-ID: <20010703150535.C12350@mediasupervision.de>

On Tue, Jul 03, 2001 at 03:00:03PM +0200, Thomas Wouters wrote:
> On Tue, Jul 03, 2001 at 02:53:11PM +0200, Gregor Hoffleit wrote:
> 
> > Right, I forgot about that. It's not so bad for Debian though, since
> > most of our packages byte-compile the stuff only when unpacking the
> > package. Since installation of a new python-base package recompiles the
> > complete site-packages tree (but not yet site-python, you got me ;-),
> > we're not hurt by that problem.
> 
> What about when you want to have multiple python versions, like python
> 1.5.2, 2.0.1, 2.1.1 and 2.2-CVS-snapshot ? :-)

You've hit the forbidden question ;-)

Seriously, does anybody (besides the Python developers) feel a need to
have multiple Python versions on the same system ?

If there's a real world need for this, then, yes, we would have to come up with
a completely different setup. I guess this setup might involve symlink
farms (urghh).

    Gregor


From thomas@xs4all.net  Tue Jul  3 14:16:08 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 3 Jul 2001 15:16:08 +0200
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703150535.C12350@mediasupervision.de>
Message-ID: <20010703151608.S8098@xs4all.nl>

On Tue, Jul 03, 2001 at 03:05:35PM +0200, Gregor Hoffleit wrote:

> Seriously, does anybody (besides the Python developers) feel a need to
> have multiple Python versions on the same system ?

Well, currently anyone who wants to use python2.0+ does, yes. It's up to
you, not me, whether that should be continued :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From gregor@hoffleit.de  Tue Jul  3 14:28:09 2001
From: gregor@hoffleit.de (Gregor Hoffleit)
Date: Tue, 3 Jul 2001 15:28:09 +0200
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703151608.S8098@xs4all.nl>
References: <20010703151608.S8098@xs4all.nl>
Message-ID: <20010703152809.E12350@mediasupervision.de>

On Tue, Jul 03, 2001 at 03:16:08PM +0200, Thomas Wouters wrote:
> On Tue, Jul 03, 2001 at 03:05:35PM +0200, Gregor Hoffleit wrote:
> 
> > Seriously, does anybody (besides the Python developers) feel a need to
> > have multiple Python versions on the same system ?
> 
> Well, currently anyone who wants to use python2.0+ does, yes. It's up to
> you, not me, whether that should be continued :-)

Well, that's certainly quite OT since Debian-specific, but the need for
an unofficial python2.0+ only arises due to the fact that a controlled
and concurrent upgrade of the various Python packages is really, really
awkward with the current setup. That's why I brought up this question in
the first place.

So let me paraphrase: Provided the maintainer of the Debian Python
package would do a good job and keep the package always up-to-date,
would you think there's a real world need for concurrent Python versions
on the same system ?

(Python developers still could use symlink farms to link the stuff from
/usr/lib/site-python into /usr/local/lib/python3.1/site-packages...)

    Gregor



From barry@digicool.com  Tue Jul  3 14:31:32 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 3 Jul 2001 09:31:32 -0400
Subject: [Python-Dev] PEP 250, site-python, site-packages
References: <20010703140951.A27647@mediasupervision.de>
 <20010703073759.A4972@localhost.localdomain>
 <20010703145311.A12350@mediasupervision.de>
 <20010703150003.R8098@xs4all.nl>
 <20010703150535.C12350@mediasupervision.de>
Message-ID: <15169.51508.176575.33388@anthem.wooz.org>

>>>>> "GH" == Gregor Hoffleit <gregor@mediasupervision.de> writes:

    GH> You've hit the forbidden question ;-)

    GH> Seriously, does anybody (besides the Python developers) feel a
    GH> need to have multiple Python versions on the same system ?

Yes, definitely as both a Zope and Mailman developer <wink> I need
multiple Python versions.  But I suspect even normal users of the
system will need multiple versions.  Different Python-based apps are
requiring their users to upgrade Python on their own schedule, so
multiple versions will still be required.

-Barry


From gward@python.net  Tue Jul  3 14:51:23 2001
From: gward@python.net (Greg Ward)
Date: Tue, 3 Jul 2001 09:51:23 -0400
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703152809.E12350@mediasupervision.de>; from gregor@mediasupervision.de on Tue, Jul 03, 2001 at 03:28:09PM +0200
References: <20010703151608.S8098@xs4all.nl> <20010703152809.E12350@mediasupervision.de>
Message-ID: <20010703095122.A558@gerg.ca>

On 03 July 2001, Gregor Hoffleit said:
> So let me paraphrase: Provided the maintainer of the Debian Python
> package would do a good job and keep the package always up-to-date,
> would you think there's a real world need for concurrent Python versions
> on the same system ?

Speaking as someone who uses Python day-to-day and occasionally worries
about compatibility across Python versions: yes, it would be really nice
if Python better supported multiple versions installed at the same time.
lib/python1.5, lib/python2.0, and lib/python2.1 just don't cut it: I
remember running 1.5.1, 1.5.2, and alpha/beta versions of 1.6
simultaneously.  I had to install each to a separate prefix, which was
ugly but workable.  It would be nice if Python (and, yes, the Distutils
now) had better native support for multiple simultaneous versions.

Speaking as the main perpetrator of the Distutils: AAUUGGHGHHHH!!!!
NOOOOO!!!  Please, don't make me look at this stuff AGAIN!!!!
Aiiieeee!!

But seriously: I think I once attempted to convince Guido that a
revamped organization of the library directories would be a good idea,
and that the Distutils would be a good way to introduce that scheme.
Obviously, I didn't convince him, so we still have the same system.  The
one glimmer of good news is that the Distutils "install" command is
insanely flexible; if you can manage to wrap your head around the 17,000
levels of indirection, it should be a simple matter of changing a few
hard-coded dictionaries (there are two for Unix, and one each for
Windows and Mac OS) to introduce a completely new installation scheme.
I probably had some expectation that someday this discussion would open
up again.

BTW, I'm skeptical about keeping .py and .pyc code in a
non-Python-version-specific directory (ie. site-python).  Debian's
bytecode-recompilation at installation time scheme sounds cool, but the
desire/need to have multiple Python versions available kind of nixes it.
Bummer.

Oh yeah, another thing I vaguely recall from the pre-Distutils-0.1 era:
Guido doesn't (didn't?) like site-python and wanted to deprecate it.
Perhaps the above paragraph explains why.

        Greg
-- 
Greg Ward - Linux geek                                  gward@python.net
http://starship.python.net/~gward/
Drive defensively -- buy a tank.


From fdrake@acm.org  Tue Jul  3 15:02:33 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 3 Jul 2001 10:02:33 -0400 (EDT)
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703150535.C12350@mediasupervision.de>
References: <20010703140951.A27647@mediasupervision.de>
 <20010703073759.A4972@localhost.localdomain>
 <20010703145311.A12350@mediasupervision.de>
 <20010703150003.R8098@xs4all.nl>
 <20010703150535.C12350@mediasupervision.de>
 <15169.51508.176575.33388@anthem.wooz.org>
Message-ID: <15169.53369.55827.570681@cj42289-a.reston1.va.home.com>

Gregor Hoffleit writes:
 > Seriously, does anybody (besides the Python developers) feel a need to
 > have multiple Python versions on the same system ?

  Absolutely!  Anyone that wants to write cross-version Python code
needs to be able to have multiple versions available.  I'd even like
to be able to have both Python 2.0 and Python 2.0.1 available on the
same $prefix/$exec_prefix -- that can't be done currently.  This kind
of thing is pretty important when you want to take cross-version
compatibility seriously.

Barry A. Warsaw writes:
 > Yes, definitely as both a Zope and Mailman developer <wink> I need
 > multiple Python versions.  But I suspect even normal users of the
 > system will need multiple versions.  Different Python-based apps are
 > requiring their users to upgrade Python on their own schedule, so
 > multiple versions will still be required.

  Another excellent reason to support multiple versions!  As more
widely distributed applications are written using Python and don't
want to include the interpreter, this becomes a more noticeable issue.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From fdrake@acm.org  Tue Jul  3 15:09:37 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 3 Jul 2001 10:09:37 -0400 (EDT)
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703095122.A558@gerg.ca>
References: <20010703151608.S8098@xs4all.nl>
 <20010703152809.E12350@mediasupervision.de>
 <20010703095122.A558@gerg.ca>
Message-ID: <15169.53793.422966.868795@cj42289-a.reston1.va.home.com>

Greg Ward writes:
 > Oh yeah, another thing I vaguely recall from the pre-Distutils-0.1 era:
 > Guido doesn't (didn't?) like site-python and wanted to deprecate it.
 > Perhaps the above paragraph explains why.

  Another reason not to use site-python is that it is actually still
hard to write cross-version Python code -- there are enough
differences that any substantial volume of code (and in Python, you
don't need many KLoC to get substantial code!) is bound to encounter a
few, especially if you get used to using only Python 2.0+ -- it's easy
to get used to features like string methods, list comprehensions, and
augmented assignment!
  The site-packages directory was introduced to avoid the deficiencies
of the site-python directory.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From guido@digicool.com  Tue Jul  3 15:31:40 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 03 Jul 2001 10:31:40 -0400
Subject: [Python-Dev] CVS
In-Reply-To: Your message of "Tue, 03 Jul 2001 13:41:51 +0200."
 <20010703134151.P8098@xs4all.nl>
References: <20010703134151.P8098@xs4all.nl>
Message-ID: <200107031431.f63EVem05167@odiug.digicool.com>

> Slightly off-topic, but I've depleted all my other sources :) I'm trying to
> get CVS to give me all logentries for all checkins in a specific branch (the
> 2.1.1 branch) so I can pipe it through logmerge. It seems the one thing I'm
> missing now is a branchpoint tag (which should translate to a revision with
> an even number of dots, apparently) but 'release21' and 'release21-maint'
> both don't qualify. Even the usage logmerge suggests (cvs log -rrelease21)
> doesn't work, gives me a bunch of "no revision `release21' in <file>"
> warnings and just all logentries for those files.

But those files should be old, so logmerge should safely sort their
messages last, right?

> Am I missing something simple, here, or should I hack logmerge to parse the
> symbolic names, figure out the even-dotted revision for each file from the
> uneven-dotted branch-tag, and filter out stuff outside that range ? :P

You're lucky: at least the fork point is tagged (release21).  For the
descr-branch, if I want to do some kind of reasonable merge, I'll have
to write a tool that figures out the fork point and tags it.  That's
one "cvs tag" call for each file...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jul  3 15:38:07 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 03 Jul 2001 10:38:07 -0400
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: Your message of "Tue, 03 Jul 2001 15:05:35 +0200."
 <20010703150535.C12350@mediasupervision.de>
References: <20010703140951.A27647@mediasupervision.de> <20010703073759.A4972@localhost.localdomain> <20010703145311.A12350@mediasupervision.de> <20010703150003.R8098@xs4all.nl>
 <20010703150535.C12350@mediasupervision.de>
Message-ID: <200107031438.f63Ec7K05210@odiug.digicool.com>

> > What about when you want to have multiple python versions, like python
> > 1.5.2, 2.0.1, 2.1.1 and 2.2-CVS-snapshot ? :-)
> 
> You've hit the forbidden question ;-)
> 
> Seriously, does anybody (besides the Python developers) feel a need to
> have multiple Python versions on the same system ?

I've had enough requests over the years for this, so it is indeed
supported, and I believe there is a need.  Quite often people have
important programs that for some minor reason don't work on a newer
version yet and they can't find the person or the time to fix it.

Python's standard installation makes this possible.  You can have only
one "python" but you can request a specific version by appending the
"major dot minor" part of the version number, e.g. python1.5,
python2.0, python2.1, python2.2.  "python" is a hard link to one of
these.  You can't (easily) have multiple versions with the same
major.minor, but that should never be needed.  I've heard though that
some Linux distributors break this versioning scheme in favor of their
own.
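
A small sketch of the naming rule just described, assuming only that the
versioned executable is named "python" plus the major.minor part of the
version. The function name versioned_name is hypothetical, not part of
any installer.

```python
import sys


def versioned_name(version_info=sys.version_info):
    """Build the per-version executable name from a version tuple."""
    major, minor = version_info[0], version_info[1]
    # The micro version is deliberately ignored, so 2.0 and 2.0.1
    # map to the same name and cannot coexist under one prefix.
    return "python%d.%d" % (major, minor)
```

For example, versioned_name((1, 5, 2)) gives "python1.5", while
(2, 0, 0) and (2, 0, 1) both give "python2.0".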

> If there's a real world need for this, then, yes, we would have to come up with
> a completely different setup. I guess this setup might involve symlink
> farms (urghh).

Ugh maybe, but it's the only thing that scales.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jul  3 15:45:34 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 03 Jul 2001 10:45:34 -0400
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: Your message of "Tue, 03 Jul 2001 09:51:23 EDT."
 <20010703095122.A558@gerg.ca>
References: <20010703151608.S8098@xs4all.nl> <20010703152809.E12350@mediasupervision.de>
 <20010703095122.A558@gerg.ca>
Message-ID: <200107031445.f63EjYF05265@odiug.digicool.com>

> Speaking as someone who uses Python day-to-day and occasionally worries
> about compatibility across Python versions: yes, it would be really nice
> if Python better supported multiple versions installed at the same time.
> lib/python1.5, lib/python2.0, and lib/python2.1 just don't cut it: I
> remember running 1.5.1, 1.5.2, and alpha/beta versions of 1.6
> simultaneously.  I had to install each to a separate prefix, which was
> ugly but workable.  It would be nice if Python (and, yes, the Distutils
> now) had better native support for multiple simultaneous versions.

That was mostly because we were abusing the version numbering scheme
to roll out feature releases with a micro version number (1.5.1,
1.5.2).  We don't do that any more -- feature releases have a minor
(middle) version number change.

If you really need to distinguish Python 2.0 and 2.0.1 on the same
system, you're a Python developer by definition. :-)

> Speaking as the main perpetrator of the Distutils: AAUUGGHGHHHH!!!!
> NOOOOO!!!  Please, don't make me look at this stuff AGAIN!!!!
> Aiiieeee!!

BTW, Greg, there's this bug I've found in Distutils, but the margin of
this email isn't wide enough to describe it. :-)

> But seriously: I think I once attempted to convince Guido that a
> revamped organization of the library directories would be a good idea,
> and that the Distutils would be a good way to introduce that scheme.
> Obviously, I didn't convince him, so we still have the same system.

Which I think isn't so bad given that we now have a well-behaved
versioning policy in place.

> The one glimmer of good news is that the Distutils "install" command
> is insanely flexible; if you can manage to wrap your head around the
> 17,000 levels of indirection, it should be a simple matter of
> changing a few hard-coded dictionaries (there are two for Unix, and
> one each for Windows and Mac OS) to introduce a completely new
> installation scheme.  I probably had some expectation that someday
> this discussion would open up again.
> 
> BTW, I'm skeptical about keeping .py and .pyc code in a
> non-Python-version-specific directory (ie. site-python).  Debian's
> bytecode-recompilation at installation time scheme sounds cool, but the
> desire/need to have multiple Python versions available kind of nixes it.
> Bummer.

Yes, good point.  Bytecode is generally not compatible between
versions -- its specification is considered an internal detail of the
implementation (again, it can't vary with a micro-version, but it can
and usually does vary with the minor version number).

> Oh yeah, another thing I vaguely recall from the pre-Distutils-0.1 era:
> Guido doesn't (didn't?) like site-python and wanted to deprecate it.
> Perhaps the above paragraph explains why.

Indeed, /usr/local/lib/python<major>.<minor>/site-packages/ is where
site packages should go.
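
As a rough illustration of that layout, the sketch below builds the
directory for a given feature release. The helper site_packages_dir and
its default prefix are assumptions for the example; a real installation
may use a different prefix.

```python
import posixpath


def site_packages_dir(major, minor, prefix="/usr/local"):
    """Return <prefix>/lib/python<major>.<minor>/site-packages."""
    version_dir = "python%d.%d" % (major, minor)
    return posixpath.join(prefix, "lib", version_dir, "site-packages")
```

Each feature release gets its own directory this way, so the .pyc files
of 2.0 and 2.1 never end up next to each other.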

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paul@pfdubois.com  Tue Jul  3 15:44:48 2001
From: paul@pfdubois.com (Paul F. Dubois)
Date: Tue, 3 Jul 2001 07:44:48 -0700
Subject: [Python-Dev] site-python, multiple installations
Message-ID: <ADEOIFHFONCLEEPKCACCMELICKAA.paul@pfdubois.com>

I'm on vacation and haven't followed this discussion well but read with
alarm some talk about how it would be expected that there would only be "one
official Python" on a system. This is categorically a false assumption for
almost everyone at LLNL. Please do not attempt to make any changes that
assume there is one place into which everything should be put, or that there
should be some system-wide registry of packages.

I thought this demon had been killed in distutils-sig long ago.

Paul






From thomas@xs4all.net  Tue Jul  3 16:05:28 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 3 Jul 2001 17:05:28 +0200
Subject: [Python-Dev] CVS
In-Reply-To: <200107031431.f63EVem05167@odiug.digicool.com>
References: <20010703134151.P8098@xs4all.nl> <200107031431.f63EVem05167@odiug.digicool.com>
Message-ID: <20010703170528.U32419@xs4all.nl>

On Tue, Jul 03, 2001 at 10:31:40AM -0400, Guido van Rossum wrote:
> > Slightly off-topic, but I've depleted all my other sources :) I'm trying to
> > get CVS to give me all logentries for all checkins in a specific branch (the
> > 2.1.1 branch) so I can pipe it through logmerge. It seems the one thing I'm
> > missing now is a branchpoint tag (which should translate to a revision with
> > an even number of dots, apparently) but 'release21' and 'release21-maint'
> > both don't qualify. Even the usage logmerge suggests (cvs log -rrelease21)
> > doesn't work, gives me a bunch of "no revision `release21' in <file>"
> > warnings and just all logentries for those files.

> But those files should be old, so logmerge should safely sort their
> messages last, right?

Yes, but it also lists all checkin messages for all branches, including the
trunk, after release21... so I end up no smarter than without '-rrelease21'.

> > Am I missing something simple, here, or should I hack logmerge to parse the
> > symbolic names, figure out the even-dotted revision for each file from the
> > uneven-dotted branch-tag, and filter out stuff outside that range ? :P

> You're lucky: at least the fork point is tagged (release21).  For the
> descr-branch, if I want to do some kind of reasonable merge, I'll have
> to write a tool that figures out the fork point and tags it.  That's
> one "cvs tag" call for each file...

No, that's one 'cvs log' command; for each entry, it contains all symbolic
names. All you need to do <wink> is to search for the descr-branch symbolic
name in that list, grab the revision it lists (if any), chop off the last
dot-and-digit, and you're done. You can almost do that in a shell oneliner:

centurion:~/python/python-CVS > cvs log | egrep "(RCS file:|descr-branch:)" | python -c "
import fileinput
lastline = ''
for line in fileinput.input():
    if lastline and line[0] == '\t':
        filename = lastline[33:-3]
        revision = line.split()[1]
        branchpoint = revision[:revision.rindex('.')]
        print filename, branchpoint
        lastline = ''
    else:
        lastline = line
"

(adjust quotes for (t)csh, I guess)
Tadaaa! Hmm... I'll just use that myself, too <wink>.
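
The same idea as the one-liner above, sketched as a standalone function
for present-day Python 3. It works on text already captured from 'cvs
log' instead of a pipe. One assumption beyond the mail: if the tag is
listed in CVS's "magic branch number" form (for example 2.60.0.2), the
sketch also drops the extra 0 after chopping the last dot-and-digit. The
names fork_point and fork_points, and the sample file names in the usage
note, are made up for illustration.

```python
def fork_point(revision):
    """Turn the revision listed for a branch tag into its fork point."""
    # Chop off the last dot-and-digit, as in the mail.
    parts = revision.split(".")[:-1]
    # Assumed magic form such as 2.60.0.2: drop the inserted 0 as well.
    if len(parts) > 2 and parts[-1] == "0":
        parts = parts[:-1]
    return ".".join(parts)


def fork_points(log_text, tag):
    """Map each RCS file in 'cvs log' output to the fork point of tag."""
    result = {}
    current = None
    for line in log_text.splitlines():
        if line.startswith("RCS file:"):
            current = line[len("RCS file:"):].strip()
            if current.endswith(",v"):
                current = current[:-2]
        elif (current and line.startswith("\t")
              and line.strip().startswith(tag + ":")):
            # Symbolic-name lines look like "<TAB>tag: revision".
            result[current] = fork_point(line.split()[1])
    return result
```

Files that do not carry the tag are simply left out of the result, so a
log entry for a file that predates the branch is skipped.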

But why not merge the trunk into your tree ? You can do that with

cvs update -j HEAD

inside your (sticky-tagged) working tree, IIRC. It doesn't change the
repository either, just your working directory, so it's safe to try in a
separate directory. Then, when you're satisfied it all works, you can commit
the whole thing.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Tue Jul  3 16:26:16 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 3 Jul 2001 17:26:16 +0200
Subject: [Python-Dev] site-python, multiple installations
In-Reply-To: <ADEOIFHFONCLEEPKCACCMELICKAA.paul@pfdubois.com>
Message-ID: <20010703172615.V8098@xs4all.nl>

On Tue, Jul 03, 2001 at 07:44:48AM -0700, Paul F. Dubois wrote:
> I'm on vacation and haven't followed this discussion well but read with
> alarm some talk about how it would be expected that there would only be "one
> official Python" on a system. This is categorically a false assumption for
> almost everyone at LLNL. Please do not attempt to make any changes that
> assume there is one place into which everything should be put, or that there
> should be some system-wide registry of packages.

That wasn't for Python, it was for Debian. You'll note that Gregor actually
said "this is getting off-topic" in one of the mails :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From aahz@rahul.net  Tue Jul  3 19:11:45 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Tue, 3 Jul 2001 11:11:45 -0700 (PDT)
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703152809.E12350@mediasupervision.de> from "Gregor Hoffleit" at Jul 03, 2001 03:28:09 PM
Message-ID: <20010703181145.8D05D99C8A@waltz.rahul.net>

Gregor Hoffleit wrote:
> 
> So let me paraphrase: Provided the maintainer of the Debian Python
> package would do a good job and keep the package always up-to-date,
> would you think there's a real world need for concurrent Python versions
> on the same system ?

Yes.  Thing is, you're going to have Debian system scripts that will
possibly rely on a specific version of Python.  It's not fair to expect
users to upgrade the OS every time they want a newer version of Python,
yet you can't take a chance on the system scripts breaking.
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From guido@digicool.com  Tue Jul  3 20:08:04 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 03 Jul 2001 15:08:04 -0400
Subject: [Python-Dev] CVS
In-Reply-To: Your message of "Tue, 03 Jul 2001 17:05:28 +0200."
 <20010703170528.U32419@xs4all.nl>
References: <20010703134151.P8098@xs4all.nl> <200107031431.f63EVem05167@odiug.digicool.com>
 <20010703170528.U32419@xs4all.nl>
Message-ID: <200107031908.f63J84008549@odiug.digicool.com>

> But why not merge the trunk into your tree ? You can do that with
> 
> cvs update -j HEAD
> 
> inside your (sticky-tagged) working tree, IIRC. It doesn't change the
> repository either, just your working directory, so it's safe to try in a
> separate directory. Then, when you're satisfied it all works, you can commit
> the whole thing.

I believe that's what I tried last time, and it suddenly revived a
bunch of files that had been dead for years.  But you're right, I
should probably try this.

But in the light of multiple merges, it's important to tag the tree
three times: (1) tag the HEAD at the point where you want to do the
merge; (2) tag the branch at the point where you want to merge into;
(3) after resolving conflicts and making the resulting checkins in the
branch, tag the branch again.  Well, maybe (2) is redundant.  But (1)
is essential to be able to do another merge later.  And I think I
recall that (3) was good for something, too.  (Maybe if you want to
merge in the other direction.)

Sigh.  Not my day, it seems.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Tue Jul  3 20:10:36 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 3 Jul 2001 15:10:36 -0400
Subject: [Python-Dev] CVS
References: <20010703134151.P8098@xs4all.nl>
Message-ID: <15170.6316.543729.545921@anthem.wooz.org>

>>>>> "TW" == Thomas Wouters <thomas@xs4all.net> writes:

    TW> Slightly off-topic, but I've depleted all my other sources :)
    TW> I'm trying to get CVS to give me all logentries for all
    TW> checkins in a specific branch (the 2.1.1 branch) so I can pipe
    TW> it through logmerge.

I had a lot of problems trying to do the same thing with the (slightly
misnamed) Mailman Release_2_0_1-branch.  I basically could not get CVS
to give me just the log messages for all changes to that branch.  It
would either give me nothing or give me all changes in all branches
and trunk.  It may just be a CVS bug, I dunno.  I eventually gave up.

-Barry



From thomas.heller@ion-tof.com  Wed Jul  4 17:28:49 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 4 Jul 2001 18:28:49 +0200
Subject: [Python-Dev] Checkin problems (slightly off-topic)
Message-ID: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook>

After I tried to set up syncmail notification for
the anygui project, I did not get it to work.
Then I found out that exactly the same problem
occurs if I try to checkin into other SF projects
I have, also python. Here is the message I get:

C:\sf\distutils\distutils\command>cvs commit -m "dummy checkin for testing, please ignore"

cvs commit: Examining .
Checking in bdist_wininst.py;
/cvsroot/python/distutils/distutils/command/bdist_wininst.py,v  <--  bdist_wininst.py
new revision: 1.22; previous revision: 1.21
done
Mailing python-checkins@python.org...
Generating notification message...
Generating notification message... done.
2001-07-04 09:52:14 Failed to get user name for uid 34174

The checkin succeeded, but no mail is sent :-(
I have no clue what uid 34174 is, surely not
my SF user id (which is 11105).

Has anyone seen this problem before, or can
offer other help?

Thanks,

Thomas



From fdrake@acm.org  Wed Jul  4 18:27:16 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 4 Jul 2001 13:27:16 -0400 (EDT)
Subject: [Python-Dev] Checkin problems (slightly off-topic)
In-Reply-To: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook>
References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook>
Message-ID: <15171.20980.397264.341559@cj42289-a.reston1.va.home.com>

Thomas Heller writes:
 > Has anyone seen this problem before, or can
 > offer other help?

  I saw this last night, but don't know that we can deal with it.
Have you filed a SourceForge support request?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From thomas.heller@ion-tof.com  Wed Jul  4 18:05:40 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 4 Jul 2001 19:05:40 +0200
Subject: [Python-Dev] Checkin problems (slightly off-topic)
References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook> <15171.20980.397264.341559@cj42289-a.reston1.va.home.com>
Message-ID: <038901c104ab$8e0deac0$e000a8c0@thomasnotebook>

> 
> Thomas Heller writes:
>  > Has anyone seen this problem before, or can
>  > offer other help?
> 
>   I saw this last night, but don't know that we can deal with it.
> Have you filed a SourceForge support request?

Seems to be a problem with my account. I will file
a support request.

Thanks,

Thomas



From thomas.heller@ion-tof.com  Wed Jul  4 18:06:48 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 4 Jul 2001 19:06:48 +0200
Subject: [Python-Dev] Checkin problems (slightly off-topic)
References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook> <15171.20980.397264.341559@cj42289-a.reston1.va.home.com>
Message-ID: <039301c104ab$b67898c0$e000a8c0@thomasnotebook>

> 
> Thomas Heller writes:
>  > Has anyone seen this problem before, or can
>  > offer other help?
> 
>   I saw this last night, but don't know that we can deal with it.
So you mean _you_ have seen the same problem for yourself?

Thomas



From fdrake@acm.org  Wed Jul  4 18:43:52 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 4 Jul 2001 13:43:52 -0400 (EDT)
Subject: [Python-Dev] Checkin problems (slightly off-topic)
In-Reply-To: <039301c104ab$b67898c0$e000a8c0@thomasnotebook>
References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook>
 <15171.20980.397264.341559@cj42289-a.reston1.va.home.com>
 <039301c104ab$b67898c0$e000a8c0@thomasnotebook>
Message-ID: <15171.21976.338066.16853@cj42289-a.reston1.va.home.com>

Thomas Heller writes:
 > >   I saw this last night, but don't know that we can deal with it.
 > So you mean _you_ have seen the same problem for yourself?

  Yes.  It started about 2:00am (east coast time); things had been
fine before that.  I think it affected both mail sent by syncmail and
the trackers.  I don't know about other systems at SourceForge.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From thomas.heller@ion-tof.com  Wed Jul  4 18:20:42 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 4 Jul 2001 19:20:42 +0200
Subject: [Python-Dev] Checkin problems (slightly off-topic)
References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook><15171.20980.397264.341559@cj42289-a.reston1.va.home.com><039301c104ab$b67898c0$e000a8c0@thomasnotebook> <15171.21976.338066.16853@cj42289-a.reston1.va.home.com>
Message-ID: <03fb01c104ad$a8c5e780$e000a8c0@thomasnotebook>

From: "Fred L. Drake, Jr." <fdrake@acm.org>
> 
> Thomas Heller writes:
>  > >   I saw this last night, but don't know that we can deal with it.
>  > So you mean _you_ have seen the same problem for yourself?
> 
>   Yes.  It started about 2:00am (east coast time); things had been
> fine before that.  I think it affected both mail sent by syncmail and
> the trackers.  I don't know about other systems at SourceForge.
> 
I found at least two (open) support requests from other people
reporting exactly the same problem. I don't think they need another
one.

Thomas



From thomas@xs4all.net  Wed Jul  4 21:46:50 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 4 Jul 2001 22:46:50 +0200
Subject: [Python-Dev] Checkin problems (slightly off-topic)
In-Reply-To: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook>
References: <02ff01c104a6$67a5b5c0$e000a8c0@thomasnotebook>
Message-ID: <20010704224650.Z8098@xs4all.nl>

On Wed, Jul 04, 2001 at 06:28:49PM +0200, Thomas Heller wrote:

> Generating notification message... done.
> 2001-07-04 09:52:14 Failed to get user name for uid 34174

> The checkin succeeded, but no mail is sent :-(
> I have no clue what uid 34174 is, surely not
> my SF user id (which is 11105).

Actually, it *is* your SF userid ;)

twouters@usw-pr-shell2:~$ python
Python 1.5.2 (#0, Dec 27 2000, 13:59:38)  [GCC 2.95.2 20000220 (Debian
GNU/Linux)] on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import pwd
>>> pwd.getpwuid(34174)
('theller', 'x', 34174, 100, 'Thomas Heller', '/home/users/t/th/theller',
'/bin/bash')

It's your unix user-id, not the SF websystem one. Probably the cvs machine's
PAM setup is broken in some way... From a quick look on the shell machine it
seems they don't quite run the typical setup ;-)
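
A lookup like the one above can be wrapped so that an unknown uid is
reported instead of raising.  This is only a sketch: describe_uid and its
lookup parameter are made-up names for illustration, and nothing here
claims to be how the SourceForge checkin scripts resolve user names.

```python
import pwd


def describe_uid(uid, lookup=pwd.getpwuid):
    """Return the login name for a Unix uid, or None if it is unknown.

    `lookup` defaults to pwd.getpwuid; it is a parameter only so the
    function can be exercised without a real password database.
    """
    try:
        entry = lookup(uid)
    except KeyError:
        # pwd.getpwuid raises KeyError when no entry matches the uid.
        return None
    # Field 0 of a passwd entry is the login name.
    return entry[0]
```

Called as describe_uid(34174) on the shell machine, this would hand back
the login name from the entry shown above; a None result would point at a
missing passwd entry on that particular host.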

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From jack@oratrix.nl  Wed Jul  4 22:46:05 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Wed, 04 Jul 2001 23:46:05 +0200
Subject: [Python-Dev] Will the new type system allow automatic coercion?
Message-ID: <20010704214610.6D99CDA740@oratrix.oratrix.nl>

Something I've suddenly started to need is automatic coercion implemented
by the source type. I have a type implemented in C, and while
automatic coercion from any other type to my type is easy to implement
(in your O& routine you simply check whether the passed object is of a
type you can coerce) there is no way to do the reverse (at least, not
that I'm aware of, please enlighten me).

And now I have this CFString type (a wrapper around the MacOS
CoreFoundation object. Nice things, by the way, these CoreFoundation
objects, sort-of inherited from NextStep and they share a lot of
design with Python objects, but I digress) that can show itself as a
Unicode string or an 8 bit string or a number of other things. It
would be nice if users could pass these CFString objects in places
where a string or unicode is expected. Simply said, if PyArg_Parse's "s"
format would accept my objects.

Will the new type system allow me to do this?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | ++++ see http://www.xs4all.nl/~tank/ ++++


From mwh@python.net  Wed Jul  4 23:30:42 2001
From: mwh@python.net (Michael Hudson)
Date: 04 Jul 2001 23:30:42 +0100
Subject: [Python-Dev] summer summaries
Message-ID: <m366d8z5rh.fsf@atrus.jesus.cam.ac.uk>

My internet connection is going to get drastically worse tomorrow, and
while I could still do the python-dev summaries over the summer, it
would be significantly more tedious.  Would someone else be able to do
them for a bit?  I can provide the scripts I use to generate the
distributions and format into text and xhtml, and continue to archive
them on my starship pages.

It's not that much work; a few hours a fortnight.

I could also do with a break from it for more general reasons.

My internet connection should be back to full strength in October.

Cheers,
M.

-- 
  While preceding your entrance with a grenade is a good tactic in
  Quake, it can lead to problems if attempted at work.    -- C Hacking
               -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html



From tim.one@home.com  Wed Jul  4 23:59:34 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 4 Jul 2001 18:59:34 -0400
Subject: [Python-Dev] Checkin problems (slightly off-topic)
In-Reply-To: <039301c104ab$b67898c0$e000a8c0@thomasnotebook>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHBKMAA.tim.one@home.com>

[Thomas Heller, on
    2001-07-04 09:52:14 Failed to get user name for uid 34174
at the end of a checkin
]

> Has anyone seen this problem before, or can offer other help?

[Fred]
> I saw this last night, but don't know that we can deal with it.

[Thomas Heller]
> So you mean _you_ have seen the same problem for yourself?

I saw the same thing today when I did a checkin, although with a different
uid.  Can't speak for Fred, but can't imagine what else he could have meant
(unless he was running a spy monitor watching it happen to you <wink>).



From akuchlin@mems-exchange.org  Thu Jul  5 01:34:03 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 4 Jul 2001 20:34:03 -0400
Subject: [Python-Dev] summer summaries
In-Reply-To: <m366d8z5rh.fsf@atrus.jesus.cam.ac.uk>; from mwh@python.net on Wed, Jul 04, 2001 at 11:30:42PM +0100
References: <m366d8z5rh.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <20010704203403.A11589@ute.cnri.reston.va.us>

On Wed, Jul 04, 2001 at 11:30:42PM +0100, Michael Hudson wrote:
>would be significantly more tedious.  Would someone else be able to do
>them for a bit?  I can provide the scripts I use to generate the

I'm willing to pick them up again for a bit.  

>I could also do with a break from it for more general reasons.

Been there, done that. :)

--amk                                          (www.amk.ca)
This is the moment when I get a real sense of job satisfaction.
    -- The Collector, at Leela's execution, in "The Sunmakers"


From guido@digicool.com  Thu Jul  5 02:06:43 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 04 Jul 2001 21:06:43 -0400
Subject: [Python-Dev] Will the new type system allow automatic coercion?
In-Reply-To: Your message of "Wed, 04 Jul 2001 23:46:05 +0200."
 <20010704214610.6D99CDA740@oratrix.oratrix.nl>
References: <20010704214610.6D99CDA740@oratrix.oratrix.nl>
Message-ID: <200107050106.f6516hm10144@odiug.digicool.com>

> Something I've suddenly started to need is automatic coercion implemented
> by the source type. I have a type implemented in C, and while
> automatic coercion from any other type to my type is easy to implement
> (in your O& routine you simply check whether the passed object is of a
> type you can coerce) there is no way to do the reverse (at least, not
> that I'm aware of, please enlighten me).
> 
> And now I have this CFString type (a wrapper around the MacOS
> CoreFoundation object. Nice things, by the way, these CoreFoundation
> objects, sort-of inherited from NextStep and they share a lot of
> design with Python objects, but I digress) that can show itself as a
> Unicode string or an 8 bit string or a number of other things. It
> would be nice if users could pass these CFString objects in places
> where a string or unicode is expected. Simply said, if PyArg_Parse s
> format would accept my objects.
> 
> Will the new type system allow me to do this?

I don't know that the new type system (which isn't really a type
system, just a generalized implementation of class construction
through a formalization of the Don Beaudry hook :-) has anything to do
with this, but can't the buffer interface come to the rescue?  The s
format accepts anything that conforms to the buffer interface, AFAIK.

Does that help?  (Alas, I don't think there's a similar API for
Unicode.)
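
For illustration only, this is how the idea looks from the Python side on
a present-day Python 3, where the old buffer interface has become the
"bytes-like object" protocol.  It is not the 2001 C API under discussion;
it merely shows that a routine written for byte strings also accepts other
objects that expose their memory as a buffer.

```python
import array
import binascii

# Three different objects that all expose the same three bytes
# through the buffer protocol.
as_bytes = b"abc"
as_array = array.array("B", [0x61, 0x62, 0x63])
as_view = memoryview(bytearray(b"abc"))

# binascii.hexlify is documented to take any bytes-like object, so it
# does not care which concrete type it is handed.
results = [binascii.hexlify(obj) for obj in (as_bytes, as_array, as_view)]
```

All three calls produce the same hex string, which is the kind of
type-independence being asked for.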

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Thu Jul  5 02:22:31 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 4 Jul 2001 21:22:31 -0400
Subject: [Python-Dev] "Becoming a Python Developer"
Message-ID: <20010704212231.A11629@ute.cnri.reston.va.us>

Inspired by some discussion on c.l.py, I've written a draft of a guide
to working on Python:

http://www.amk.ca/python/writing/python-dev.html

Comments welcomed!

--amk


From greg@cosc.canterbury.ac.nz  Thu Jul  5 06:23:58 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 05 Jul 2001 17:23:58 +1200 (NZST)
Subject: [Python-Dev] Making a .pyd using Cygwin?
Message-ID: <200107050523.RAA00926@s454.cosc.canterbury.ac.nz>

Is it feasible to compile a Python extension module
for Windows using Cygwin?

I have tried this, and the linker tells me that it can't
export '_bss_start__', '_bss_end__', '_data_start__'
and '_data_end__' because they're not defined.

I tried defining some symbols with those names in a
c file and got it to link, but importing the resulting
extension causes the interpreter to hang.

I'm using Python 2.1, Windows 2000 Professional, and
whatever version of Cygwin was the latest as of a
couple of days ago.

Thanks for any help,

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From mwh@python.net  Thu Jul  5 09:22:30 2001
From: mwh@python.net (Michael Hudson)
Date: 05 Jul 2001 09:22:30 +0100
Subject: [Python-Dev] summer summaries
In-Reply-To: Andrew Kuchling's message of "Wed, 4 Jul 2001 20:34:03 -0400"
References: <m366d8z5rh.fsf@atrus.jesus.cam.ac.uk> <20010704203403.A11589@ute.cnri.reston.va.us>
Message-ID: <m33d8bzsxl.fsf@atrus.jesus.cam.ac.uk>

Andrew Kuchling <akuchlin@mems-exchange.org> writes:

> On Wed, Jul 04, 2001 at 11:30:42PM +0100, Michael Hudson wrote:
> >would be significantly more tedious.  Would someone else be able to do
> >them for a bit?  I can provide the scripts I use to generate the
> 
> I'm willing to pick them up again for a bit.  

Thanks!  I've got one in the boiler for today, which I'll post soon.

> >I could also do with a break from it for more general reasons.
> 
> Been there, done that. :)

I thought you might know what I was talking about here...

Cheers,
M.

-- 
  If your telephone company installs a system in the woods with no
  one around to see them, do they still get it wrong?
                                 -- Robert Moir, alt.sysadmin.recovery



From thomas@xs4all.net  Thu Jul  5 11:10:25 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 5 Jul 2001 12:10:25 +0200
Subject: [Python-Dev] While we're deprecating...
Message-ID: <20010705121024.A8098@xs4all.nl>

While we're in the deprecation mood (not that I changed my mind on xrage()
;P) how about we deprecate the alternate-tab-size-comment checks of the
parser. That is, generate a deprecation warning for these comments:

                        "tab-width:",           /* Emacs */
                        ":tabstop=",            /* vim, full form */
                        ":ts=",                 /* vim, abbreviated form */
                        "set tabsize=",         /* will vi never die? */

with sizes other than '8', and rip out the code in 2.3 ?
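
To make the proposal concrete, a checker along these lines could flag
affected files ahead of time.  It is a rough sketch and not the parser's
real logic: the name find_tab_size_comments is invented, and reading the
size as the first run of digits after the marker is an assumption.

```python
import re

# The four markers quoted above, as they would appear inside a comment.
MARKERS = ("tab-width:", ":tabstop=", ":ts=", "set tabsize=")

# A marker, optional blanks, then the digits giving the tab size.
_PATTERN = re.compile(
    "(" + "|".join(re.escape(m) for m in MARKERS) + r")[ \t]*(\d+)"
)


def find_tab_size_comments(source):
    """Return (line_number, marker, size) for each non-8 tab-size comment."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), 1):
        # Only look at the comment part of the line (naive: ignores
        # '#' characters that sit inside string literals).
        _, hash_mark, comment = line.partition("#")
        if not hash_mark:
            continue
        match = _PATTERN.search(comment)
        if match and int(match.group(2)) != 8:
            hits.append((lineno, match.group(1), int(match.group(2))))
    return hits
```

Comments that set the size to 8 are deliberately not reported, since they
do not change how the file is read.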

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mwh@python.net  Thu Jul  5 11:10:48 2001
From: mwh@python.net (Michael Hudson)
Date: Thu, 5 Jul 2001 11:10:48 +0100 (BST)
Subject: [Python-Dev] python-dev summary 2001-06-21 - 2001-07-05
Message-ID: <Pine.LNX.4.30.0107051110260.27229-100000@localhost.localdomain>

 This is a summary of traffic on the python-dev mailing list between
 June 21 and July 4 (inclusive) 2001.  It is intended to inform the
 wider Python community of ongoing developments.  To comment, just
 post to python-list@python.org or comp.lang.python in the usual
 way. Give your posting a meaningful subject line, and if it's about a
 PEP, include the PEP number (e.g. Subject: PEP 201 - Lockstep
 iteration) All python-dev members are interested in seeing ideas
 discussed by the community, so don't hesitate to take a stance on a
 PEP if you have an opinion.

 This is the eleventh summary written by Michael Hudson.
 Summaries are archived at:

  <http://starship.python.net/crew/mwh/summaries/>

   Posting distribution (with apologies to mbm)

   Number of articles in summary: 252

    40 |                 [|]
       |                 [|]
       |                 [|]
       |                 [|]
       |                 [|]
    30 |                 [|]
       |                 [|]
       |                 [|]     [|]                 [|]
       |                 [|]     [|]                 [|] [|]
       |                 [|]     [|]                 [|] [|]
    20 |                 [|]     [|]                 [|] [|]
       |     [|]         [|]     [|]                 [|] [|]
       |     [|]         [|] [|] [|]                 [|] [|]
       |     [|]     [|] [|] [|] [|]         [|] [|] [|] [|]
       |     [|]     [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
    10 |     [|]     [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
       |     [|]     [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
       |     [|] [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
       |     [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|]
       | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|]
     0 +-004-019-007-016-042-018-028-013-005-015-016-029-027-013
        Thu 21| Sat 23| Mon 25| Wed 27| Fri 29| Sun 01| Tue 03|
            Fri 22  Sun 24  Tue 26  Thu 28  Sat 30  Mon 02  Wed 04

 This will be my last python-dev summary for a while, as I'm going to
 be mostly away from the internet for the summer.  However, Andrew
 Kuchling has agreed to take up writing them again, so there should be
 no interruption in the summaries.


    * Support for "wide" Unicode characters *

 Paul Prescod posted a draft of PEP 261 'Support for "wide" Unicode
 characters':

  <http://mail.python.org/pipermail/python-dev/2001-June/015644.html>

 which proposes adding a compile time option to configure unicode
 objects to store "code points" (the integers that the unicode
 specification maps to "characters" -- though that word is dangerously
 overloaded in the Unicode arena) in 32 bit integers -- they're
 currently stored in 16 bits.

 This was (I believe) at least partially inspired by the Unicode
 Consortium assigning code points outside the "Basic Multilingual
 Plane" (i.e. the range of 16 bit integers).
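
 As an illustration of what "outside the range of 16 bit integers" means
 in practice, the following sketch (for a current Python 3, not the 2001
 interpreter) encodes one code point from inside the Basic Multilingual
 Plane and one from outside it.  In UTF-16 the latter needs two 16-bit
 units, a surrogate pair; in UTF-32 one 32-bit unit is always enough.
 The helper name utf16_units is made up for this example.

```python
import struct


def utf16_units(ch):
    """Return the 16-bit code units used to encode `ch` in UTF-16."""
    data = ch.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


inside_bmp = utf16_units("\u20ac")        # EURO SIGN, U+20AC
outside_bmp = utf16_units("\U00010000")   # first code point past the BMP
utf32_size = len("\U00010000".encode("utf-32-le"))  # bytes for one code point
```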

 No one is convinced that this is the best possible solution (a better
 solution might be to have unicode objects that could either store
 code points in 16 bits or 32 bits as necessary, and this solution
 could have binary compatibility problems), but it seems no one has the
 time to implement a better one (and a better one would probably have
 compatibility problems that couldn't be fixed by a simple recompile):

  <http://mail.python.org/pipermail/python-dev/2001-July/015674.html>

 (I apologise for any abuse of terminology in the above - I know very
 little about the issues surrounding Unicode).


    * Python specializing compiler *

 Armin Rigo announced his "Python specializing compiler", psyco:

  <http://mail.python.org/pipermail/python-dev/2001-June/015503.html>

 It works on the principle that you can compile a faster version of a
 function if you know stuff about the arguments it's likely to be
 called with.  This is one of the more aesthetically pleasing of the
 possible ways to speed Python up (it's similar to some tactics used
 by the seemingly defunct self compiler), but it's still a very large
 amount of work away from being useful...


    * IPv6 *

 *Very* preliminary support for IPv6 - the "next generation internet
 protocol" - was checked in.  The support thus far doesn't actually
 support IPv6 at all, but rather emulates IPv6's new functions for
 IPv4 addresses, so that code for Python 2.2 will hopefully be
 portable between machines that do and do not support IPv6, whilst
 being able to use IPv6 where it is supported (I hope that makes
 sense).
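
 A sketch of the protocol-independent style this is meant to enable,
 using socket.getaddrinfo as found in current Python (the summary does
 not say that the checkin had exactly this shape).  The numeric address
 and port are arbitrary example values.

```python
import socket

# Ask for stream endpoints without naming an address family; each
# result says which family (IPv4 or IPv6) it belongs to.
# AI_NUMERICHOST keeps the call from doing any DNS lookup.
infos = socket.getaddrinfo(
    "127.0.0.1", 80, socket.AF_UNSPEC, socket.SOCK_STREAM, 0,
    socket.AI_NUMERICHOST,
)
family, socktype, proto, canonname, sockaddr = infos[0]
```

 Code written this way creates its socket from whatever family comes
 back, so the same lines can serve IPv4-only and IPv6-capable machines.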

 Unfortunately the checkin broke the build on some platforms (OSF1,
 Windows) but I believe these problems are now sorted out.

 IPv6 support has been muttered about for years now, so it's nice to
 finally see some movement, even if it is causing some x-platform
 pain.


    * PEP 260: simplifying xrange *

 Guido posted PEP 260, a proposal to remove some of the less useful
 aspects of the xrange type:

  <http://mail.python.org/pipermail/python-dev/2001-June/015590.html>

 Support was muted; there's the usual concern on removing "little
 used" features -- what if someone (who maybe doesn't read
 comp.lang.python or these summaries) uses them?


    * site-python, site-packages *

 Gregor Hoffleit posted a request that <prefix>/lib/site-python be
 considered a standard install target:

  <http://mail.python.org/pipermail/python-dev/2001-July/015715.html>

 as the current standard of <prefix>/lib/pythonX.X/site-packages/
 makes life awkward for packagers.

 It's possible Gregor asked the wrong bunch of people; a non-version
 dependent path makes life awkward for those who want to maintain more
 than one version of Python, and that includes most of the people on
 python-dev.  OTOH, it probably also includes everyone who cares about
 the cross-version portability of the code they write, so it seems
 that movement is unlikely here (could be wrong, though).

Cheers,
M.



From mwh@python.net  Thu Jul  5 11:12:48 2001
From: mwh@python.net (Michael Hudson)
Date: Thu, 5 Jul 2001 11:12:48 +0100 (BST)
Subject: [Python-Dev] python-dev summary 2001-06-21 - 2001-07-05
Message-ID: <Pine.LNX.4.30.0107051112150.27229-100000@localhost.localdomain>

The less typo-ed version!

 This is a summary of traffic on the python-dev mailing list between
 June 21 and July 4 (inclusive) 2001.  It is intended to inform the
 wider Python community of ongoing developments.  To comment, just
 post to python-list@python.org or comp.lang.python in the usual
 way. Give your posting a meaningful subject line, and if it's about a
 PEP, include the PEP number (e.g. Subject: PEP 201 - Lockstep
 iteration) All python-dev members are interested in seeing ideas
 discussed by the community, so don't hesitate to take a stance on a
 PEP if you have an opinion.

 This is the eleventh summary written by Michael Hudson.
 Summaries are archived at:

  <http://starship.python.net/crew/mwh/summaries/>

   Posting distribution (with apologies to mbm)

   Number of articles in summary: 252

    40 |                 [|]
       |                 [|]
       |                 [|]
       |                 [|]
       |                 [|]
    30 |                 [|]
       |                 [|]
       |                 [|]     [|]                 [|]
       |                 [|]     [|]                 [|] [|]
       |                 [|]     [|]                 [|] [|]
    20 |                 [|]     [|]                 [|] [|]
       |     [|]         [|]     [|]                 [|] [|]
       |     [|]         [|] [|] [|]                 [|] [|]
       |     [|]     [|] [|] [|] [|]         [|] [|] [|] [|]
       |     [|]     [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
    10 |     [|]     [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
       |     [|]     [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
       |     [|] [|] [|] [|] [|] [|] [|]     [|] [|] [|] [|] [|]
       |     [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|]
       | [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|] [|]
     0 +-004-019-007-016-042-018-028-013-005-015-016-029-027-013
        Thu 21| Sat 23| Mon 25| Wed 27| Fri 29| Sun 01| Tue 03|
            Fri 22  Sun 24  Tue 26  Thu 28  Sat 30  Mon 02  Wed 04

 This will be my last python-dev summary for a while, as I'm going to
 be mostly away from the internet for the summer.  However, Andrew
 Kuchling has agreed to take up writing them again, so there should be
 no interruption in the summaries.


    * Support for "wide" Unicode characters *

 Paul Prescod posted a draft of PEP 261 'Support for "wide" Unicode
 characters':

  <http://mail.python.org/pipermail/python-dev/2001-June/015644.html>

 which proposes adding a compile time option to configure unicode
 objects to store "code points" (the integers that the unicode
 specification maps to "characters" -- though that word is dangerously
 overloaded in the Unicode arena) in 32 bit integers -- they're
 currently stored in 16 bits.

 This was (I believe) at least partially inspired by the Unicode
 Consortium assigning code points outside the "Basic Multilingual
 Plane" (i.e. the range of 16 bit integers).

 No one is convinced that this is the best possible solution (a better
 solution might be to have unicode objects that could either store
 code points in 16 bits or 32 bits as necessary, and this solution
 could have binary compatibility problems), but it seems no one has the
 time to implement a better one (and a better one would probably have
 compatibility problems that couldn't be fixed by a simple recompile):

  <http://mail.python.org/pipermail/python-dev/2001-July/015674.html>

 (I apologise for any abuse of terminology in the above - I know very
 little about the issues surrounding Unicode).


    * Python specializing compiler *

 Armin Rigo announced his "Python specializing compiler", psyco:

  <http://mail.python.org/pipermail/python-dev/2001-June/015503.html>

 It works on the principle that you can compile a faster version of a
 function if you know stuff about the arguments it's likely to be
 called with.  This is one of the more aesthetically pleasing of the
 possible ways to speed Python up (it's similar to some tactics used
 by the seemingly defunct self compiler), but it's still a very large
 amount of work away from being useful...


    * IPv6 *

 *Very* preliminary support for IPv6 - the "next generation internet
 protocol" - was checked in.  The support thus far doesn't actually
 support IPv6 at all, but rather emulates IPv6's new functions for
 IPv4 addresses, so that code for Python 2.2 will hopefully be
 portable between machines that do and do not support IPv6, whilst
 being able to use IPv6 where it is supported (I hope that makes
 sense).

 Unfortunately the checkin broke the build on some platforms (OSF1,
 Windows) but I believe these problems are now sorted out.

 IPv6 support has been muttered about for years now, so it's nice to
 finally see some movement, even if it is causing some x-platform
 pain.


    * PEP 260: simplifying xrange *

 Guido posted PEP 260, a proposal to remove some of the less useful
 aspects of the xrange type:

  <http://mail.python.org/pipermail/python-dev/2001-June/015590.html>

 Support was muted; there's the usual concern on removing "little
 used" features -- what if someone (who maybe doesn't read
 comp.lang.python or these summaries) uses them?


    * site-python, site-packages *

 Gregor Hoffleit posted a request that <prefix>/lib/site-python be
 considered a standard install target:

  <http://mail.python.org/pipermail/python-dev/2001-July/015715.html>

 as the current standard of <prefix>/lib/pythonX.X/site-packages/
 makes life awkward for packagers.

 It's possible Gregor asked the wrong bunch of people; a non-version
 dependent path makes life awkward for those who want to maintain more
 than one version of Python, and that includes most of the people on
 python-dev.  OTOH, it probably also includes everyone who cares about
 the cross-version portability of the code they write, so it seems
 that movement is unlikely here (could be wrong, though).

Cheers,
M.



From guido@digicool.com  Thu Jul  5 14:12:46 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 05 Jul 2001 09:12:46 -0400
Subject: [Python-Dev] While we're deprecating...
In-Reply-To: Your message of "Thu, 05 Jul 2001 12:10:25 +0200."
 <20010705121024.A8098@xs4all.nl>
References: <20010705121024.A8098@xs4all.nl>
Message-ID: <200107051312.f65DCkv10572@odiug.digicool.com>

> While we're in the deprecation mood (not that I changed my mind on xrage()
> ;P) how about we deprecate the alternate-tab-size-comment checks of the
> parser. That is, generate a deprecation warning for these comments:
> 
>                         "tab-width:",           /* Emacs */
>                         ":tabstop=",            /* vim, full form */
>                         ":ts=",                 /* vim, abbreviated form */
>                         "set tabsize=",         /* will vi never die? */
> 
> with sizes other than '8', and rip out the code in 2.3 ?

Was this ever even documented?  Is it worth being so careful?  We
could rip out the functionality now, replacing it with a warning, and
lose the warning in 2.3.  Or we could just rip it out now, and always
enable the -t option.  (Hm, that should be unified with the warnings
framework, although I'm not sure how easy that will be.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Thu Jul  5 14:35:22 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 5 Jul 2001 15:35:22 +0200
Subject: [Python-Dev] While we're deprecating...
In-Reply-To: <200107051312.f65DCkv10572@odiug.digicool.com>
References: <20010705121024.A8098@xs4all.nl> <200107051312.f65DCkv10572@odiug.digicool.com>
Message-ID: <20010705153522.B8098@xs4all.nl>

On Thu, Jul 05, 2001 at 09:12:46AM -0400, Guido van Rossum wrote:
> > While we're in the deprecation mood (not that I changed my mind on xrage()
> > ;P) how about we deprecate the alternate-tab-size-comment checks of the
> > parser. That is, generate a deprecation warning for these comments:
> > 
> >                         "tab-width:",           /* Emacs */
> >                         ":tabstop=",            /* vim, full form */
> >                         ":ts=",                 /* vim, abbreviated form */
> >                         "set tabsize=",         /* will vi never die? */
> > 
> > with sizes other than '8', and rip out the code in 2.3 ?

> Was this ever even documented?  Is it worth being so careful?  We
> could rip out the functionality now, replacing it with a warning, and
> lose the warning in 2.3.  Or we could just rip it out now, and always
> enable the -t option.  (Hm, that should be unified with the warnings
> framework, although I'm not sure how easy that will be.)

Uhmm... if it wasn't documented, all the more reason to be careful.
Imagine, say,

# tab-width:4 (or however it's done)

<quadzillion of lines>

<4 spaces>for record in database:
< 8  spaces >  process_record
< 1 tab > del database

Now, I completely agree that that is very fragile code (imagine some emacs
loathing colleague removing the tab-width line) but that doesn't mean we
should just break it for the hell of it... We have it, we want to lose it,
we deprecate it. If we rip it out now (which I'd be -0 on) we should replace
it with an *error*, not a warning, since code has a high chance of breaking.
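
The effect is easy to reproduce by expanding the leading whitespace with
each tab size.  The snippet below is a sketch using str.expandtabs;
indent_columns is a hypothetical helper, and the three source lines are
modelled on the example above.

```python
def indent_columns(lines, tabsize):
    """Return the indentation width of each line for a given tab size."""
    widths = []
    for line in lines:
        expanded = line.expandtabs(tabsize)
        # Width of the leading whitespace after tabs are expanded.
        widths.append(len(expanded) - len(expanded.lstrip(" ")))
    return widths


example = [
    "    for record in database:",      # 4 spaces
    "        process_record(record)",   # 8 spaces
    "\tdel database",                   # 1 tab
]

with_tab_4 = indent_columns(example, 4)   # tab lines up with the 'for'
with_tab_8 = indent_columns(example, 8)   # tab lines up with the loop body
```

With a tab size of 4 the del runs once, after the loop; with the default
of 8 it lands inside the loop body, which is exactly the silent change in
meaning being worried about.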

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas.heller@ion-tof.com  Thu Jul  5 14:36:55 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 5 Jul 2001 15:36:55 +0200
Subject: [Python-Dev] Playing with descr-branch
Message-ID: <009001c10557$8eb82150$e000a8c0@thomasnotebook>

Guido,

some feedback from first experiments with descr-branch:

The test-suite seems to work, as does the test_descr.py script
run standalone.

Immediate crash (access violation) on executing:

C:\sf\desc-branch\python\dist\src\PCbuild>python
Python 2.2a0 (#16, Jul  5 2001, 12:26:08) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> class C:
...   def foo(*a): return a
...   goo = classmethod(foo)
...
>>> C.goo

The crash can be avoided by executing C = C()
before calling C.goo.

Just an observation...
Currently this code does not seem stable enough to
play with.
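
For reference, this is the behaviour one would expect from the example
once things work, shown as a sketch for a current Python 3 rather than
the descr-branch build: the class itself is passed as the first argument,
whether goo is reached through the class or through an instance.

```python
class C:
    def foo(*a):
        return a

    # Wrapping foo makes the class the implicit first argument.
    goo = classmethod(foo)


via_class = C.goo()          # called on the class
via_instance = C().goo(1)    # called on an instance, with one extra argument
```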

Thomas



From thomas@xs4all.net  Thu Jul  5 14:54:47 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 5 Jul 2001 15:54:47 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <E15I97e-00025R-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010705155447.C8098@xs4all.nl>

On Thu, Jul 05, 2001 at 06:24:46AM -0700, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Include
> In directory usw-pr-cvs1:/tmp/cvs-serv8011
> 
> Modified Files:
> 	rangeobject.h 
> Log Message:
> Rip out the fancy behaviors of xrange that nobody uses: repeat, slice,
> contains, tolist(), and the start/stop/step attributes.  This includes
> removing the 4th ('repeat') argument to PyRange_New().

Eek... What do we have the fucking warning framework and deprecation
warnings for, anyway ?! It may sound overly conservative, but I *really*
don't like ripping things out just because you don't like them, without even
as much as a release with a warning (and no, 2.1.1 can't have the warning;
PEP 6 won't allow it.)

You're basically telling people "You didn't use it the way I thought people
would use it but never documented anywhere, so if you used them the way they
are documented, you're screwed." Defense offers exhibit A: the
standard library reference:

http://www.python.org/doc/current/lib/typesseq-xrange.htm:
"XRange objects behave like tuples, and offer a single method"

(Not to mention http://www.python.org/doc/current/lib/typesseq.html which
*strongly* suggests all the operations in the table apply to range-objects
as well as to strings, unicode strings, lists, tuples and buffers.)

The API change also means binary and source level breakages... Is it really
that much trouble to, just for *one* release, keep the functionality and
just generate a warning when the 4th argument is something other than '1' ?

I can live with (though not agree with, sorry ;P) the removal of xrange advanced
features... just not from supported to *gone* in a single step.
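
A one-release transition of the kind argued for here could look roughly
like this in Python terms.  It is a sketch only: make_range is a
hypothetical stand-in for PyRange_New(), and since today's range takes no
repeat argument, the repetition is emulated with a list.

```python
import warnings


def make_range(start, stop, step=1, repeat=1):
    """Build the values of a range, still honouring a deprecated `repeat`."""
    if repeat != 1:
        # Keep the old behaviour for one release, but tell the caller
        # that it is going away.
        warnings.warn(
            "the 'repeat' argument is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
    return list(range(start, stop, step)) * repeat
```

Callers that never pass repeat see no change at all; the others keep
working for one more release and get a warning pointing at their own
call site.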

Wishing-I-hadn't-mentioned-xrange-in-the-other-thread-ly y'rs, ;)
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From Jason.Tishler@dothill.com  Thu Jul  5 14:58:16 2001
From: Jason.Tishler@dothill.com (Jason Tishler)
Date: Thu, 5 Jul 2001 09:58:16 -0400
Subject: [Python-Dev] Making a .pyd using Cygwin?
In-Reply-To: <200107050523.RAA00926@s454.cosc.canterbury.ac.nz>
Message-ID: <20010705095816.B6130@dothill.com>

Greg,

On Thu, Jul 05, 2001 at 05:23:58PM +1200, Greg Ewing wrote:
> Is it feasible to compile a Python extension module
> for Windows using Cygwin?

By the above, do you mean a Win32 or Cygwin extension module?  The answer
is yes in either case.  However, using Cygwin Python to build a Cygwin
extension module is more straightforward than a Win32 one.  Actually,
Cygwin Python behaves the same as other Unix platforms.

> I have tried this, and the linker tells me that it can't
> export '_bss_start__', '_bss_end__', '_data_start__'
> and '_data_end__' because they're not defined.

The above is due to the fact that your extension module (i.e., DLL) is
not exporting any symbols.  You can rectify this problem by adding a
DL_EXPORT macro to the definition of the module's initialization function.
See the following for an example of the solution:

    http://sources.redhat.com/ml/cygwin/2001-06/msg00442.html

BTW, the cygwin@cygwin.com mailing list is a more appropriate forum for
this type of question.

Jason

-- 
Jason Tishler
Director, Software Engineering       Phone: 732.264.8770 x235
Dot Hill Systems Corp.               Fax:   732.264.8798
82 Bethany Road, Suite 7             Email: Jason.Tishler@dothill.com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com


From guido@digicool.com  Thu Jul  5 15:09:58 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 05 Jul 2001 10:09:58 -0400
Subject: [Python-Dev] "Becoming a Python Developer"
In-Reply-To: Your message of "Wed, 04 Jul 2001 21:22:31 EDT."
 <20010704212231.A11629@ute.cnri.reston.va.us>
References: <20010704212231.A11629@ute.cnri.reston.va.us>
Message-ID: <200107051409.f65E9wS10645@odiug.digicool.com>

Nice work, Andrew!

I surely hope this will bring us some new contributors...

Some answers to your XXX marks:

> You like hacking on language interpreters.  (XXX is that a bit
> snarky? not sure I phrased that well...)

Sounds fine to me, or you could extend it to "large software packages".

> Python is over 10 years old, and its development process is quite
> (XXX elaborate? evolved? mature?)

I'd say mature.

> Python is developed by a group of about 30 people,

True, but may sound off-putting to would-be contributors.  Maybe you
can add that lots of others contribute significantly?  (E.g. ask Fred
how many folks have contributed to the docs!)

> XXX should something be written about CPython / JPython / Stackless
> / Python.NET? I only know about CPython.

Explaining the distinction would be helpful, and you could add that
for Java programmers, participating in Jython would be a logical step.

> Guido van Rossum has the title of Benevolent Dictator For Life, or
> BDFL.

Lest people who aren't familiar with Python culture (or those who are
but lack a sense of humor) take this at face value, can you explain
that this is a tongue-in-cheek title?

The section on CVS is redundant -- this information is already on the
SF website, isn't it?  (Or most of it.)  I don't think detailed
instructions need to be in a high-level motivational article -- a link
to http://sourceforge.net/cvs/?group_id=5470 is all that's needed
(like you do for other services).

> diff -C2. (XXX is that correct?)

I use "diff -c" which seems to have the same effect.

> Python's standard style, described at XXX.


Alas, there's no description.  Let me try to summarize the rules here.


C dialect:

- Use ANSI/ISO standard C (the 1989 version of the standard).

- All function declarations and definitions must use full prototypes
  (i.e. specify the types of all arguments).

- Never use C++ style // one-line comments.

- No compiler warnings with major compilers (gcc, VC++, a few others).


Code lay-out:

- Use single-tab indents, where a tab is worth 8 spaces.

- No line should be longer than 79 characters.  If this and the
  previous rule together don't give you enough room to code, your code
  is too complicated -- consider using subroutines.

- Function definition style: function name in column 1, outermost
  curly braces in column 1, blank line after local variable
  declarations.

	static int
	extra_ivars(PyTypeObject *type, PyTypeObject *base)
	{
		int t_size = PyType_BASICSIZE(type);
		int b_size = PyType_BASICSIZE(base);

		assert(t_size >= b_size); /* type smaller than base! */
		...
		return 1;
	}

- Code structure: one space between keywords like 'if', 'for' and the
  following left paren; no spaces inside the paren; braces as shown:

	if (mro != NULL) {
		...
	}
	else {
		...
	}

- The return statement should *not* get redundant parentheses:

	return Py_None; /* correct */
	return(Py_None); /* incorrect */

- Function and macro call style: foo(a, b, c) -- no space before the
  open paren, no spaces inside the parens, no spaces before commas,
  one space after each comma.

- Always put spaces around assignment, Boolean and comparison
  operators.  In expressions using a lot of operators, add spaces
  around the outermost (lowest-priority) operators.

- Breaking long lines: if you can, break after commas in the outermost
  argument list.  Always indent continuation lines appropriately, e.g.:

	PyErr_Format(PyExc_TypeError,
		     "cannot create '%.100s' instances",
		     type->tp_name);

- When you break a long expression at a binary operator, the operator
  goes at the end of the previous line, e.g.:

	if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
	    type->tp_dictoffset == b_size &&
	    (size_t)t_size == b_size + sizeof(PyObject *))
		return 0; /* "Forgive" adding a __dict__ only */

- Put blank lines around functions, structure definitions, and major
  sections inside functions.

- Comments go before the code they describe.

- All functions and global variables should be declared static unless
  they are to be part of a published interface.

- For external functions and variables, we always have a declaration
  in an appropriate header file in the "Include" directory, which uses
  the DL_IMPORT() macro, like this:

	extern DL_IMPORT(PyObject *) PyObject_Repr(PyObject *);
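
As a rough illustration of how the tab and line-length rules interact, here
is a small Python sketch that measures each line with tabs counted as
advancing to the next multiple of 8 columns.  The function name
overlong_lines and the sample text are assumptions made for this example,
not part of any existing tool:

```python
def overlong_lines(source, limit=79, tabsize=8):
    """Return (line_number, width) for each line wider than `limit` columns."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        # expandtabs() advances each tab to the next multiple of `tabsize`.
        width = len(line.expandtabs(tabsize))
        if width > limit:
            problems.append((number, width))
    return problems


# Nine levels of tab indentation leave only 7 usable columns.
sample = "\tint x;\n" + "\t" * 9 + "x = 12345;\n"
report = overlong_lines(sample)
```

With an 8-column tab, deeply nested code runs out of room quickly, which is
the point of the "consider using subroutines" advice.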


Naming conventions:

- Use a Py prefix for public functions; never for static functions.
  The Py_ prefix is reserved for global service routines like
  Py_FatalError; specific groups of routines (e.g. specific object
  type APIs) use a longer prefix, e.g. PyString_ for string functions.

- Public functions and variables use MixedCase with underscores, like
  this: PyObject_GetAttr, Py_BuildValue, PyExc_TypeError.

- Occasionally an "internal" function has to be visible to the loader;
  we use the _Py prefix for this, e.g.: _PyObject_Dump.

- Macros should have a MixedCase prefix and then use upper case, for
  example: PyString_AS_STRING, Py_PRINT_RAW.
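
The naming rules above can be approximated with a few regular expressions.
This is only a sketch: the patterns and the function name classify are
assumptions made for illustration, not an official grammar, and they are
checked against the example names given in the rules:

```python
import re

# Approximate patterns for the conventions listed above.
INTERNAL = re.compile(r"^_Py[A-Za-z]*_\w+$")        # e.g. _PyObject_Dump
MACRO = re.compile(r"^Py[A-Za-z]*_[A-Z0-9_]+$")     # e.g. PyString_AS_STRING
PUBLIC = re.compile(r"^Py[A-Za-z]*_\w*[a-z]\w*$")   # e.g. PyObject_GetAttr


def classify(name):
    """Classify a C identifier as 'internal', 'macro', 'public' or 'other'."""
    if INTERNAL.match(name):
        return "internal"
    if MACRO.match(name):
        return "macro"
    if PUBLIC.match(name):
        return "public"
    return "other"
```

A static helper such as extra_ivars falls into "other", which matches the
rule that static functions never carry the Py prefix.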


I'm sure there's more.  I'll make this into a PEP.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Thu Jul  5 15:06:38 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 5 Jul 2001 10:06:38 -0400 (EDT)
Subject: [Python-Dev] While we're deprecating...
In-Reply-To: <200107051312.f65DCkv10572@odiug.digicool.com>
References: <20010705121024.A8098@xs4all.nl>
 <200107051312.f65DCkv10572@odiug.digicool.com>
Message-ID: <15172.29806.432822.805282@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > Was this ever even documented?  Is it worth being so careful?  We

  It's not in the LaTeX documentation.

 > could rip out the functionality now, replacing it with a warning, and
 > lose the warning in 2.3.  Or we could just rip it out now, and always
 > enable the -t option.  (Hm, that should be unified with the warnings
 > framework, although I'm not sure how easy that will be.)

  I'd rather see the warning added one version before ripping it out.
(And no, I don't see any real reason to change xrange either.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From skip@pobox.com (Skip Montanaro)  Thu Jul  5 15:46:59 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 5 Jul 2001 09:46:59 -0500
Subject: [Python-Dev] "Becoming a Python Developer"
In-Reply-To: <20010704212231.A11629@ute.cnri.reston.va.us>
References: <20010704212231.A11629@ute.cnri.reston.va.us>
Message-ID: <15172.32227.320917.531063@beluga.mojam.com>

    Andrew> Inspired by some discussion on c.l.py, I've written a draft of a
    Andrew> guide to working on Python:

    ...

Andrew,

Good work.  I would recommend that new people wanting to contribute to
Python be urged to look first at the libraries (batteries) instead of the
language (radio) itself.  The language itself is growing new features on an
occasional basis, but its interaction with the outside world (e.g. XML) is
just as important (or perhaps more important).

Skip



From akuchlin@mems-exchange.org  Thu Jul  5 16:03:35 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 5 Jul 2001 11:03:35 -0400
Subject: [Python-Dev] "Becoming a Python Developer"
In-Reply-To: <200107051409.f65E9wS10645@odiug.digicool.com>; from guido@digicool.com on Thu, Jul 05, 2001 at 10:09:58AM -0400
References: <20010704212231.A11629@ute.cnri.reston.va.us> <200107051409.f65E9wS10645@odiug.digicool.com>
Message-ID: <20010705110334.C12027@ute.cnri.reston.va.us>

On Thu, Jul 05, 2001 at 10:09:58AM -0400, Guido van Rossum wrote:
>> You like hacking on language interpreters.  (XXX is that a bit
>> snarky? not sure I phrased that well...)

Whoops, that XXX is now redundant.  The original text had something
like "... and you want to do something more interesting than writing
the 187th Scheme interpreter", but then I took the Scheme reference
out.  

>Explaining the distinction would be helpful, and you could add that
>for Java programmers, participating in Jython would be a logical step.

The thing is, I'm not sure how Jython is developed.  Is Finn Bock the
BDFL, do the developers vote, or what?  (Same question for .NET.)  If
some Jython developer offered to contribute a description, I
certainly wouldn't turn it down.

>I'm sure there's more.  I'll make this into a PEP.

Should the Python style guide, currently at
/doc/essays/styleguide.html, also become a PEP?  It's often
referenced...

I'll make the other suggested changes, and refer to PEP 7; thanks!

Anyone have suggestions for other topics/issues that should be covered
in this document?  

--amk


From guido@digicool.com  Thu Jul  5 16:07:32 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 05 Jul 2001 11:07:32 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: Your message of "Thu, 05 Jul 2001 15:54:47 +0200."
 <20010705155447.C8098@xs4all.nl>
References: <20010705155447.C8098@xs4all.nl>
Message-ID: <200107051507.f65F7Wf12155@odiug.digicool.com>

> > Rip out the fancy behaviors of xrange that nobody uses: repeat, slice,
> > contains, tolist(), and the start/stop/step attributes.  This includes
> > removing the 4th ('repeat') argument to PyRange_New().
> 
> Eek... What do we have the fucking warning framework and deprecation
> warnings for, anyway ?!

For more important things?!

I posted the PEP about this, and I got mostly favorable or lukewarm
responses.  *One* person admitted they were using advanced xrange()
features now, but he said he wouldn't miss them.  The warning
framework and deprecation warnings are important for things that will
change the semantics of things *without* causing an error message
(like nested scopes).  They are also important for things that will
require lots of folks to change their code.

From the responses to the PEP posting it doesn't seem like there are
many people using xrange() in non-idiomatic ways, so the latter risk
seems very small to me.

With the exception of the change to PyRange_New(), the changes here
will give people a clear error message when they try to use the existing
features.  If you insist, I can change the signature for PyRange_New()
back and add a warning if the 4th argument is not 1, but I'm reluctant
there too.

> It may sound overly conservative, but I *really*
> don't like ripping things out just because you don't like them, without even
> as much as a release with a warning (and no, 2.1.1 can't have the warning;
> PEP 6 won't allow it.)

Of course not.

> You're basically telling people "You didn't use it the way I thought people
> would use it but never documented anywhere, so if you used them the way they
> are documented, you're screwed." Defense offers exhibit A: the
> standard library reference:
> 
> http://www.python.org/doc/current/lib/typesseq-xrange.htm:
> "XRange objects behave like tuples, and offer a single method"

Well, they never did behave like tuples (s+t never worked, and you
couldn't slice a repeated xrange object).  But more importantly,
(almost) nobody has used them as such.

> (Not to mention http://www.python.org/doc/current/lib/typesseq.html which
> *strongly* suggests all the operations in the table apply to range-objects
> as well as to strings, unicode strings, lists, tuples and buffers.)

I'll update this to explain that concat, repeat and slice don't work
for xrange() objects.

> The API change also means binary and source level breakages... Is it
> really that much trouble to, just for *one* release, keep the
> functionality and just generate a warning when the 4th argument is
> something other than '1' ?

No binary breakage -- the 4th argument is normally 1 anyway.

The source breakage is easy to fix.

Again, the real point of the deprecation policy is not to *never* get
an error in old code.  It is to make sure that you don't get burned by
*silent* changes in semantics, and to make sure that *common* usage
that will stop working is caught.  Advanced xrange() is not common.
Calling PyRange_New() from C is not common.

> I can live with (though not agree with, sorry ;P) the removal of xrange
> advanced features... just not from supported to *gone* in a single
> step.

Sorry, then you better commit suicide. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Thu Jul  5 16:12:14 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 05 Jul 2001 11:12:14 -0400
Subject: [Python-Dev] "Becoming a Python Developer"
In-Reply-To: Your message of "Thu, 05 Jul 2001 11:03:35 EDT."
 <20010705110334.C12027@ute.cnri.reston.va.us>
References: <20010704212231.A11629@ute.cnri.reston.va.us> <200107051409.f65E9wS10645@odiug.digicool.com>
 <20010705110334.C12027@ute.cnri.reston.va.us>
Message-ID: <200107051512.f65FCEI12208@odiug.digicool.com>

> >Explaining the distinction would be helpful, and you could add that
> >for Java programmers, participating in Jython would be a logical step.
> 
> The thing is, I'm not sure how Jython is developed.  Is Finn Bock the
> BDFL, do the developers vote, or what?  (Same question for .NET.)  If
> some Jython developer offered to contribute a description, I
> certainly wouldn't turn it down.

I really don't know how Jython is developed, but Finn and Samuele are
on this list.  I expect it's more democratic.  I think .NET is purely
an ActiveState venture -- Paul Prescod may care to comment.

> >I'm sure there's more.  I'll make this into a PEP.
> 
> Should the Python style guide, currently at
> /doc/essays/styleguide.html, also become a PEP?  It's often
> referenced...

Yes.  I vaguely recall some group of volunteers was planning to rework
it into a more current document, but I don't know what happened to
that effort.  Some of the doc string conventions are now immortalized
as PEP 257.

> I'll make the other suggested changes, and refer to PEP 7; thanks!

You're welcome, and thank *you* for doing this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Thu Jul  5 16:16:37 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 05 Jul 2001 11:16:37 -0400
Subject: [Python-Dev] Playing with descr-branch
In-Reply-To: Your message of "Thu, 05 Jul 2001 15:36:55 +0200."
 <009001c10557$8eb82150$e000a8c0@thomasnotebook>
References: <009001c10557$8eb82150$e000a8c0@thomasnotebook>
Message-ID: <200107051516.f65FGbE12901@odiug.digicool.com>

> some feedback from first experiments with descr-branch:
> 
> The test-suite seems to work, as does the test_descr.py script
> run standalone.
> 
> Immediate crash (access violation) on executing:
> 
> C:\sf\desc-branch\python\dist\src\PCbuild>python
> Python 2.2a0 (#16, Jul  5 2001, 12:26:08) [MSC 32 bit (Intel)] on win32
> Type "copyright", "credits" or "license" for more information.
> >>> class C:
> ...   def foo(*a): return a
> ...   goo = classmethod(foo)
> ...
> >>> C.goo
> 
> The crash can be avoided by executing C = c()
> before calling C.goo.

Hm, I can't reproduce this; it works for me!  Have you tried cvs
update and rebuilding?
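
For reference, a minimal sketch of what the quoted session is expected to do
on a working build.  It is written for a current Python 3 interpreter, which
is an assumption; the thread itself concerns the 2.2a0 descr-branch:

```python
class C:
    def foo(*a):
        return a

    goo = classmethod(foo)


# Accessing the attribute on the class yields a method bound to the class;
# calling it passes the class itself as the first positional argument.
bound = C.goo
result = bound()
```

Here result is a one-element tuple holding the class C, and the same holds
when goo is reached through an instance.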

> Just an observation...
> Currently this code does not seem stable enough to
> play with.

What you observe sounds like a reference count error.  If you can
still reproduce it, can you try to dig a little deeper?  Linking with
-lefence and running it under gdb, then reporting the backtrace when
it fails would help.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tismer@tismer.com  Thu Jul  5 16:16:20 2001
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 05 Jul 2001 17:16:20 +0200
Subject: [Python-Dev] Psyco1 with Stackless
Message-ID: <3B4484C4.8196A4A9@tismer.com>

Hi Armin, developers,

I had a closer look at your Python Specializing Compiler.
This is a very promising approach, going into directions
which I have been trying a little myself.

In its current state, Psyco introduces a nice little extra engine,
which mostly deals with efficient integer operations. There are
a lot of other optimizations possible, and lots of opcodes need
to be implemented in order to make it usable for real world
applications. Anyway, I found this proof of concept very interesting,
and so I built the extensions for Win32 (with very small changes)
and did some testing with mctest.py .

Here are my results with stock Python 2.0:
result 1952145856 in 2.43729775712 seconds
result 1952145856 in 2.18692571263 seconds
result 1952145856 in 5.60894890272 seconds

(run1, run2, original func)

But before, I tested with Stackless Python by chance, and I got this:
result 1952145856 in 2.42536300005 seconds
result 1952145856 in 2.18817615088 seconds
result 1952145856 in 3.51236064919 seconds

While your result outperforms standard Python by a factor of 2.56, it is
only a factor of 1.605 faster than Stackless!
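
The two factors can be reproduced from the timings listed above.  This
sketch assumes they were computed against the second Psyco run in each case,
which is an inference from the numbers rather than something stated in the
message; the variable names are made up for the example:

```python
# Timings in seconds, copied from the results above.
psyco_run2_stock = 2.18692571263        # Psyco, second run, stock Python 2.0
original_stock = 5.60894890272          # original function, stock Python 2.0
psyco_run2_stackless = 2.18817615088    # Psyco, second run, Stackless
original_stackless = 3.51236064919      # original function, Stackless

# Speedup = time of the plain interpreter / time with Psyco.
speedup_vs_stock = original_stock / psyco_run2_stock
speedup_vs_stackless = original_stackless / psyco_run2_stackless

print(f"{speedup_vs_stock:.3f} {speedup_vs_stackless:.3f}")
```

Dividing by the first run instead would give smaller factors, so the quoted
figures are the more favourable of the two readings.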

This doesn't say anything against your implementation, instead it
tells me that Stackless' code optimization is much better than
Standard Python's, especially for integer operations on Win32.
For sure, your version could be much faster when it is generating
machine code, or if even more optimizations of data flow are
done.
Your little vm looks already very efficient. There is of course some
room for optimizations, like this: the SGET macro is used all around,
and it always uses explicit stack addressing. A function like
       CODE_INT_BINARY(IntSub, -)
expands into 16 machine instructions with Visual Studio 6.
For common cases, like  [TOS-1] = [TOS] - [TOS-1], special
accessors might save about half of these opcodes, again.

In other words, I assume that you can get three times as fast
as Python on integer operations with just a VM.

Congratulations, and keep up this work! - chris


p.s.: Note that at the moment, you don't do any overflow checks
on integers. This is not compatible with Python, while I would
love to have an option to switch off overflow checks in Python,
of course!

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net/
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com/


From Samuele Pedroni <pedroni@inf.ethz.ch>  Thu Jul  5 16:27:59 2001
From: Samuele Pedroni <pedroni@inf.ethz.ch> (Samuele Pedroni)
Date: Thu, 5 Jul 2001 17:27:59 +0200 (MET DST)
Subject: [Python-Dev] "Becoming a Python Developer"
Message-ID: <200107051528.RAA11938@core.inf.ethz.ch>

Hi.

[Andrew Kuchling]
> 
> >Explaining the distinction would be helpful, and you could add that
> >for Java programmers, participating in Jython would be a logical step.
> 
> The thing is, I'm not sure how Jython is developed.  Is Finn Bock the
> BDFL, do the developers vote, or what?  (Same question for .NET.)  If
> some Jython developer offered to contribute a description, I
> certainly wouldn't turn it down.
> 
The situation for jython development is as follows (Finn could have
a different opinion):
There are 2 active core developers.
Until now we never needed to vote, also because most of what we do
is mimicking Python semantics. The java/jython specific stuff is
already subtle enough for two minds and we simply try to converge
to a decent solution. I don't think Finn considers himself a BDFL.
Until now (jython phase, not jpython) we have received only small
contributions, and we have rejected a few patches; we are severe censors
wrt. quality, at least we try.

The fact of the matter is that so far we have never been challenged
by a patch or feature addition from a promising new contributor.
That's a pity. We have (I imagine) lots of users, and we get praised but...

Maybe we are bad at diplomacy or we have somehow closed the development process ...

In any case we would definitely like to have some more contributors, also in order
to better keep up with Python's quick pace ... Java is more productive than multi-platform
portable C but ...

regards, Samuele.




From thomas.heller@ion-tof.com  Thu Jul  5 16:33:45 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 5 Jul 2001 17:33:45 +0200
Subject: [Python-Dev] Playing with descr-branch
References: <009001c10557$8eb82150$e000a8c0@thomasnotebook>  <200107051516.f65FGbE12901@odiug.digicool.com>
Message-ID: <004901c10567$e0de25f0$e000a8c0@thomasnotebook>

> > some feedback from first experiments with descr-branch:
> > 
> > The test-suite seems to work, as does the test_descr.py script
> > run standalone.
> > 
> > Immediate crash (access violation) on executing:
> > 
> > C:\sf\desc-branch\python\dist\src\PCbuild>python
> > Python 2.2a0 (#16, Jul  5 2001, 12:26:08) [MSC 32 bit (Intel)] on win32
> > Type "copyright", "credits" or "license" for more information.
> > >>> class C:
> > ...   def foo(*a): return a
> > ...   goo = classmethod(foo)
> > ...
> > >>> C.goo
> > 
> > The crash can be avoided by executing C = c()
> > before calling C.goo.
> 
> Hm, I can't reproduce this; it works for me!  Have you tried cvs
> update and rebuilding?
> 

I thought so.

> > Just an observation...
> > Currently this code does not seem stable enough to
> > play with.
> 
> What you observe sounds like a reference count error.  If you can
> still reproduce it, can you try to dig a little deeper?  Linking with
> -lefence and running it under gdb, then reporting the backtrace when
> it fails would help.
I have seen several variants of this and once tried it with the debug
build in MSVC6. IIRC the code was calling ->tp_alloc(...) somewhere
which was NULL.
I will try again this night and report back.
gdb? You use gdb under Windows?
And what is -lefence?


Thanks, Thomas



From guido@digicool.com  Thu Jul  5 16:40:57 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 05 Jul 2001 11:40:57 -0400
Subject: [Python-Dev] Playing with descr-branch
In-Reply-To: Your message of "Thu, 05 Jul 2001 17:33:45 +0200."
 <004901c10567$e0de25f0$e000a8c0@thomasnotebook>
References: <009001c10557$8eb82150$e000a8c0@thomasnotebook> <200107051516.f65FGbE12901@odiug.digicool.com>
 <004901c10567$e0de25f0$e000a8c0@thomasnotebook>
Message-ID: <200107051540.f65FewK14436@odiug.digicool.com>

> I have seen several variants of this and once tried it with the debug
> build in MSVC6. IIRC the code was calling ->tp_alloc(...) somewhere
> which was NULL.
> I will try again this night and report back.
> gdb? You use gdb under Windows?
> And what is -lefence?

Sorry, I missed the subtle clue that you were reporting a Windows
bug... :-(

I can indeed reproduce this on Windows, and I'll look into it now (if
Tim doesn't beat me to it).

Under Linux, it *is* stable enough! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Thu Jul  5 16:43:17 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 5 Jul 2001 17:43:17 +0200
Subject: [Python-Dev] Playing with descr-branch
References: <009001c10557$8eb82150$e000a8c0@thomasnotebook> <200107051516.f65FGbE12901@odiug.digicool.com>              <004901c10567$e0de25f0$e000a8c0@thomasnotebook>  <200107051540.f65FewK14436@odiug.digicool.com>
Message-ID: <00a901c10569$3653c890$e000a8c0@thomasnotebook>

> I can indeed reproduce this on Windows, and I'll look into it now (if
> Tim doesn't beat me to it).
> 
> Under Linux, it *is* stable enough! :-)
I know there _are_ reasons to switch :-)

Thomas



From guido@digicool.com  Thu Jul  5 16:48:01 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 05 Jul 2001 11:48:01 -0400
Subject: [Python-Dev] Playing with descr-branch
In-Reply-To: Your message of "Thu, 05 Jul 2001 17:43:17 +0200."
 <00a901c10569$3653c890$e000a8c0@thomasnotebook>
References: <009001c10557$8eb82150$e000a8c0@thomasnotebook> <200107051516.f65FGbE12901@odiug.digicool.com> <004901c10567$e0de25f0$e000a8c0@thomasnotebook> <200107051540.f65FewK14436@odiug.digicool.com>
 <00a901c10569$3653c890$e000a8c0@thomasnotebook>
Message-ID: <200107051548.f65Fm1B14472@odiug.digicool.com>

> > I can indeed reproduce this on Windows, and I'll look into it now (if
> > Tim doesn't beat me to it).
> > 
> > Under Linux, it *is* stable enough! :-)
> I know there _are_ reasons to switch :-)

I looked at your problem with the debugger, and it seems that the
classmethod and staticmethod types don't get initialized (the
constructor would have been a good time to do this :-).  But now I
don't understand why this isn't a hard failure on Linux.

Checking will follow ASAP.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward@python.net  Thu Jul  5 16:47:53 2001
From: gward@python.net (Greg Ward)
Date: Thu, 5 Jul 2001 11:47:53 -0400
Subject: [Python-Dev] Making a .pyd using Cygwin?
In-Reply-To: <200107050523.RAA00926@s454.cosc.canterbury.ac.nz>; from greg@cosc.canterbury.ac.nz on Thu, Jul 05, 2001 at 05:23:58PM +1200
References: <200107050523.RAA00926@s454.cosc.canterbury.ac.nz>
Message-ID: <20010705114752.A954@gerg.ca>

On 05 July 2001, Greg Ewing said:
> Is it feasible to compile a Python extension module
> for Windows using Cygwin?

I was under the impression that the Distutils supported gcc under
Cygwin.  I know several people put in a lot of work to make this happen,
and I eventually approved all the patches.

        Greg
-- 
Greg Ward - Unix bigot                                  gward@python.net
http://starship.python.net/~gward/
"What do you mean -- a European or an African swallow?"


From akuchlin@mems-exchange.org  Thu Jul  5 16:54:15 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 5 Jul 2001 11:54:15 -0400
Subject: [Python-Dev] "Becoming a Python Developer"
In-Reply-To: <200107051528.RAA11938@core.inf.ethz.ch>; from pedroni@inf.ethz.ch on Thu, Jul 05, 2001 at 05:27:59PM +0200
References: <200107051528.RAA11938@core.inf.ethz.ch>
Message-ID: <20010705115415.F12027@ute.cnri.reston.va.us>

On Thu, Jul 05, 2001 at 05:27:59PM +0200, Samuele Pedroni wrote:
>Maybe we are bad at diplomacy or we have somehow closed the
>development process ...

It might also be Java's development culture.  Back when I occasionally
looked for Java classes, it was amazing how little free Java software
there was.  People would write, say, a specialized AWT Layout class,
but instead of putting it on a Web page with source code and an example,
they'd want you to pay $25 or $300 for it!  This probably isn't helped
by Java support on Unixes (other than Solaris) having been so dodgy
for so long, as Unix seems to have the strongest such culture.  So
people may not be accustomed to the idea that if they use Jython, they
can also *improve* it.

BTW, I assume the Jython source uses the standard Java indentation and
formatting style?

--amk


From barry@digicool.com  Thu Jul  5 17:03:21 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 5 Jul 2001 12:03:21 -0400
Subject: [Python-Dev] "Becoming a Python Developer"
References: <20010704212231.A11629@ute.cnri.reston.va.us>
 <200107051409.f65E9wS10645@odiug.digicool.com>
 <20010705110334.C12027@ute.cnri.reston.va.us>
Message-ID: <15172.36809.897670.477227@anthem.wooz.org>

>>>>> "AK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:

    AK> Should the Python style guide, currently at
    AK> /doc/essays/styleguide.html, also become a PEP?  It's often
    AK> referenced...

    AK> I'll make the other suggested changes, and refer to PEP 7;
    AK> thanks!

PEP 8 will be "Style Guide for Python Code".  I'll adapt the contents
of the styleguide to PEP form and check that in.

-Barry


From aahz@rahul.net  Thu Jul  5 17:40:01 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Thu, 5 Jul 2001 09:40:01 -0700 (PDT)
Subject: [Python-Dev] "Becoming a Python Developer"
In-Reply-To: <200107051409.f65E9wS10645@odiug.digicool.com> from "Guido van Rossum" at Jul 05, 2001 10:09:58 AM
Message-ID: <20010705164002.E662599C81@waltz.rahul.net>

Guido van Rossum wrote:
> AMK:
>>
>> Guido van Rossum has the title of Benevolent Dictator For Life, or
>> BDFL.
> 
> Lest people who aren't familiar with Python culture (or those who are
> but lack a sense of humor) take this at face value, can you explain
> that this is a tongue-in-cheek title?

It should be made clear, I think, that while the title is tongue-in-cheek, 
the semantics of the title are not.
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From DavidA@ActiveState.com  Thu Jul  5 17:48:13 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Thu, 05 Jul 2001 09:48:13 -0700
Subject: [Python-Dev] "Becoming a Python Developer"
References: <20010704212231.A11629@ute.cnri.reston.va.us> <200107051409.f65E9wS10645@odiug.digicool.com>
 <20010705110334.C12027@ute.cnri.reston.va.us> <200107051512.f65FCEI12208@odiug.digicool.com>
Message-ID: <3B449A4C.4B541061@ActiveState.com>

Guido van Rossum wrote:
> 
> > >Explaining the distinction would be helpful, and you could add that
> > >for Java programmers, participating in Jython would be a logical step.
> >
> > The thing is, I'm not sure how Jython is developed.  Is Finn Bock the
> > BDFL, do the developers vote, or what?  (Same question for .NET.)  If
> > some Jython developer offered to contribute a description, I
> > certainly wouldn't turn it down.
> 
> I really don't know how Jython is developed, but Finn and Samuele are
> on this list.  I expect it's more democratic.  I think .NET is purely
> an ActiveState venture -- Paul Prescod may care to comment.

For the purposes of Andrew's document, Mark Hammond, of ActiveState, is
the BDFL on the .NET research project, which shouldn't be considered at
the same level of maturity as Jython by any means.  In other words, it's
fun, but it's not useful yet.

-- David Ascher
   ActiveState

   New! ASPN - ActiveState Programmer Network
   Essential programming tools and information
   http://www.ActiveState.com/ASPN


From barry@digicool.com  Thu Jul  5 17:51:39 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 5 Jul 2001 12:51:39 -0400
Subject: [Python-Dev] "Becoming a Python Developer"
References: <200107051528.RAA11938@core.inf.ethz.ch>
 <20010705115415.F12027@ute.cnri.reston.va.us>
Message-ID: <15172.39707.597180.746@anthem.wooz.org>

>>>>> "AK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:

    AK> BTW, I assume the Jython source uses the standard Java
    AK> indentation and formatting style?

Finn and Samuele are better arbiters of this, but when I was hacking
JPython, the answer was yes, with one exception: the opening brace for
a class should be on a line by itself, in column zero (not, as is the
Java convention, hanging at the right end of the first line of code).
I.e. the convention was the same as Guido has for C code.

This was done primarily for Emacs' sake, but I don't think Finn or
Samuele use Emacs for their development.

-Barry


From tismer@tismer.com  Thu Jul  5 17:52:51 2001
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 05 Jul 2001 18:52:51 +0200
Subject: [Python-Dev] Psyco1 with Stackless
References: <3B4484C4.8196A4A9@tismer.com>
Message-ID: <3B449B63.5DB59864@tismer.com>


Christian Tismer wrote:
...
> For common cases, like  [TOS-1] = [TOS] - [TOS-1], special
> accessors might save about half of these opcodes, again.

The above suggestion was a little thoughtless. The code generator
makes no attempt to keep the used slots together, therefore
the slot addressing cannot be saved easily. 

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net/
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com/


From Samuele Pedroni <pedroni@inf.ethz.ch>  Thu Jul  5 17:53:16 2001
From: Samuele Pedroni <pedroni@inf.ethz.ch> (Samuele Pedroni)
Date: Thu, 5 Jul 2001 18:53:16 +0200 (MET DST)
Subject: [Python-Dev] "Becoming a Python Developer"
Message-ID: <200107051653.SAA13631@core.inf.ethz.ch>

[Andrew Kuchling]
> It might also be Java's development culture.  Back when I occasionally
> looked for Java classes, it was amazing how little free Java software
> there was.  People would write, say, a specialized AWT Layout class,
> but instead of putting it on a Web page with source code and an example,
> they'd want you to pay $25 or $300 for it!  This probably isn't helped
> by Java support on Unixes (other than Solaris) having been so dodgy
> for so long, as Unix seems to have the strongest such culture.  So
> people may not be accustomed to the idea that if they use Jython, they
> can also *improve* it.
Or maybe there are not a lot of people with a core-hacking attitude in
the Java world. How many people know the classfile format and the
JVM instruction set? ... Or their boss does not want them to spend
office time on one of the keys to their company's success ;)
Any conspiracy theory can do the job here <wink> Nevertheless the situation
is a bit sad ...

> BTW, I assume the Jython source uses the standard Java indentation and
> formatting style?
Yes, and indentation = 4 spaces

A note: the code is a bit messy sometimes ;)



From Samuele Pedroni <pedroni@inf.ethz.ch>  Thu Jul  5 18:08:25 2001
From: Samuele Pedroni <pedroni@inf.ethz.ch> (Samuele Pedroni)
Date: Thu, 5 Jul 2001 19:08:25 +0200 (MET DST)
Subject: [Python-Dev] "Becoming a Python Developer"
Message-ID: <200107051708.TAA13910@core.inf.ethz.ch>

[BAW]
> 
> >>>>> "AK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:
> 
>     AK> BTW, I assume the Jython source uses the standard Java
>     AK> indentation and formatting style?
> 
> Finn and Samuele are better arbiters of this, but when I was hacking
> JPython, the answer was yes, with one exception: the opening brace for
> a class should be on a line by itself, in column zero (not, as is the
> Java convention, hanging at the right end of the first line of code).
> I.e. the convention was the same as Guido has for C code.
> 
> This was done primarily for Emacs' sake, but I don't think Finn or
> Samuele use Emacs for their development.
> 
My bad, now both styles:

class A { and
class A
{

appear in the code, I use Forte.

>LOL!  You should have seen it before it was imported into CVS!  It
>would have made you cry!

I was unaware of that. In any case I have added my personal amount
of entropy to the code, and it wasn't in any way a targeted critique.
CPython code simply looks mostly better; maybe that is just a virtue of C <wink>.

Samuele.



From barry@digicool.com  Thu Jul  5 20:03:53 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 5 Jul 2001 15:03:53 -0400
Subject: [Python-Dev] [ANNOUNCE] PEP 8, Style Guide for Python Code
Message-ID: <15172.47641.290174.183847@yyz.digicool.com>

I've taken Guido's original style guide essay and converted it to PEP
form.  It is available as

    http://www.python.org/peps/pep-0008.html

I've done some mild spellchecking and editorial formatting on the
file, but left it as incomplete as the original essay.  Hopefully
though, having this document in PEP form will encourage contributions
to update and expand it.

Cheers,
-Barry


From thomas@xs4all.net  Fri Jul  6 09:58:08 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 6 Jul 2001 10:58:08 +0200
Subject: [Python-Dev] "Becoming a Python Developer"
In-Reply-To: <20010705110334.C12027@ute.cnri.reston.va.us>
Message-ID: <20010706105808.D8098@xs4all.nl>

On Thu, Jul 05, 2001 at 11:03:35AM -0400, Andrew Kuchling wrote:

> The thing is, I'm not sure how Jython is developed.  Is Finn Bock the
> BDFL, do the developers vote, or what?  (Same question for .NET.)  If
> some Jython developer offered to contribute a description, I
> certainly wouldn't turn it down.

I always find myself thinking the BDFL job covers the *language* 'Python'
more than the CPython implementation, and in that respect Guido is the BDFL
of all implementations :) If you look at BDFL pronouncements, they are
usually about semantics and syntax.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From akuchlin@mems-exchange.org  Fri Jul  6 13:26:35 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 6 Jul 2001 08:26:35 -0400
Subject: [Python-Dev] "Becoming a Python Developer"
In-Reply-To: <3B449A4C.4B541061@ActiveState.com>; from DavidA@activestate.com on Thu, Jul 05, 2001 at 09:48:13AM -0700
References: <20010704212231.A11629@ute.cnri.reston.va.us> <200107051409.f65E9wS10645@odiug.digicool.com> <20010705110334.C12027@ute.cnri.reston.va.us> <200107051512.f65FCEI12208@odiug.digicool.com> <3B449A4C.4B541061@ActiveState.com>
Message-ID: <20010706082635.A14026@ute.cnri.reston.va.us>

On Thu, Jul 05, 2001 at 09:48:13AM -0700, David Ascher wrote:
>For the purposes of Andrew's document, Mark Hammond, of ActiveState, is
>the BDFL on the .NET research project, which shouldn't be considered at
>the same level of maturity as Jython by any means.  In other words, it's
>fun, but it's not useful yet.

OK.  Here are the descriptions I've written.  If people want to
rewrite for accuracy, please make suggestions (or just rewrite the
text and send it to me):

\item Stackless Python is a fork of CPython, but not one that diverges
very far from the main tree.  Its author, Christian Tismer, rewrote
the main interpreter loop of CPython to minimize its use of the C
stack; in particular, calling a Python function doesn't occupy any
more room on the C stack.  This means that, while CPython can only
recurse a few thousand levels deep before filling up the C stack and
crashing, Stackless can recurse to an unlimited depth.  Stackless is
also significantly faster than CPython (around 10\%), supports
continuations and lightweight threads, and has found a community of
highly skilled users, who use it to do things such as writing
massively-multiplayer online games.  The Stackless Python home page is
at \url{http://www.stackless.com}.

\item Jython is a reimplementation of Python, written in Java instead
of C.  (It was originally named JPython, but the name had to be
changed for stupid trademark reasons.)  Jython compiles Python code
into Java bytecodes, and can seamlessly use any Java class directly
from Python code, with no need to write an extension module first, as
is necessary for CPython.   The Jython home page is at
\url{http://www.jython.org}.

\item Python for .NET is an experimental implementation of Python for
the .NET Framework.  Currently this seems to be a research effort,
because while compiling Python to .NET bytecodes has been implemented,
and the resulting code works, making the resulting code \emph{fast}
seems to be a difficult problem.  See the Python.NET home page, at
\url{http://www.activestate.com/Initiatives/NET/Research.html}, to get
an overview of the current state of progress.

--amk


From Samuele Pedroni <pedroni@inf.ethz.ch>  Fri Jul  6 14:35:06 2001
From: Samuele Pedroni <pedroni@inf.ethz.ch> (Samuele Pedroni)
Date: Fri, 6 Jul 2001 15:35:06 +0200 (MET DST)
Subject: [Python-Dev] Python and e-art
Message-ID: <200107061335.PAA17961@core.inf.ethz.ch>

Hi. For the curious I just discovered this (maybe someone knew that already).

Isn't python incredible <wink>.

A group of e-artists
has presented an e-art "virus" biennale.py written in python:

http://www.0100101110101101.org/home/biennale_py/

at the Biennale, the famous international contemporary art exposition and gathering in Venezia.

It seems a t-shirt with the source code is available too.

Samuele Pedroni.



From Samuele Pedroni <pedroni@inf.ethz.ch>  Fri Jul  6 16:22:58 2001
From: Samuele Pedroni <pedroni@inf.ethz.ch> (Samuele Pedroni)
Date: Fri, 6 Jul 2001 17:22:58 +0200 (MET DST)
Subject: [Python-Dev] Q: import logic
Message-ID: <200107061523.RAA28861@core.inf.ethz.ch>

Hi. I have looked at CPython import logic (C code) ...

is the following true (ignoring relative import issues and None markers):

trying to import s.p.a.m the logic checks for:

s.p.a.m
s.p.a
s.p
s

in sys.modules in this order until it finds an already present module and starts the effective
loading from there.

Is that an implementation detail, or should it be considered an important semantic aspect?

Jython has a different logic, but then some tricky Python code (substituting packages with classes)
can run into infinite recursion.

Thanks, Samuele Pedroni.



From guido@digicool.com  Fri Jul  6 16:48:03 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 06 Jul 2001 11:48:03 -0400
Subject: [Python-Dev] Q: import logic
In-Reply-To: Your message of "Fri, 06 Jul 2001 17:22:58 +0200."
 <200107061523.RAA28861@core.inf.ethz.ch>
References: <200107061523.RAA28861@core.inf.ethz.ch>
Message-ID: <200107061548.f66Fm3P18082@odiug.digicool.com>

> Hi. I have looked at CPython import logic (C code) ...
> 
> is the following true (ignoring relative import issues and None markers):
> 
> trying to import s.p.a.m the logic checks for:
> 
> s.p.a.m
> s.p.a
> s.p
> s
> 
> in sys.modules in this order until it finds an already present
> module and starts the effective loading from there.
> 
> Is that an implementation detail, or should it be considered an
> important semantic aspect?
> 
> Jython has a different logic, but then some tricky Python code
> (substituting packages with classes) can run into infinite recursion.
> 
> Thanks, Samuele Pedroni.

I'm not sure what alternative you had in mind, so I'm not sure how to
answer this (fearing it is a trick question :-).

This is supposed to look for s first, then s.p, then s.p.a, and then
s.p.a.m.  So exactly the opposite order of what you state!

I hesitate to call this an implementation detail -- it really is
intentional behavior that packages s, s.p, and s.p.a must be loaded
and initialized before the import of s.p.a.m is attempted.
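
As a rough illustration of that order (a sketch only: import_chain is a
hypothetical helper, not the C code in CPython's import machinery), the
names that have to be loaded and initialized, outermost first, can be
listed like this:

```python
def import_chain(dotted_name):
    """Return the names to load for a dotted import, outermost first.

    Hypothetical helper for illustration only; it computes names and
    does not look at sys.modules or perform any import.
    """
    parts = dotted_name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


for name in import_chain("s.p.a.m"):
    print(name)
```

Each name in the list would be checked in sys.modules (and loaded if
missing) before the next one is attempted.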

Can you clarify the background of your question?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paulp@ActiveState.com  Fri Jul  6 17:44:41 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Fri, 06 Jul 2001 09:44:41 -0700
Subject: [Python-Dev] IPv6
Message-ID: <3B45EAF9.5F91ABB4@ActiveState.com>

I don't know if this is interesting to anyone but...

-------- Original Message --------
Subject: Re: New python-dev summary
Date: Fri, 06 Jul 2001 15:03:39 +0900
From: matz@ruby-lang.org (Yukihiro Matsumoto)
To: language-dev@netthink.co.uk
References: <15172.62152.816000.876075@gargle.gargle.HOWL>

>....

Ruby's socket extension has been IPv6 aware for more than 2 years, by
help from the BSD IPv6 stack developers.  It may be useful for Python
too.  Unfortunately I myself have little knowledge about it.

							matz.

Oops, I mailed to Nathan directly, sorry.


From fdrake@acm.org  Sat Jul  7 00:40:37 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Fri,  6 Jul 2001 19:40:37 -0400 (EDT)
Subject: [Python-Dev] [development doc updates]
Message-ID: <20010706234037.432972892B@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/


Lots of updates!  Mostly small style adjustments.

Documentation for some new markup of the documentation has been added.

There is a bunch of new content in the Python/C API manual.  I have started
describing the new interface to support high-performance profiling and
tracing.  Some of the PyObject_*() functions which are used in creating
objects have been described and some related reference count information
has been added as well.  Some small corrections have also been made in the
C API manual.  The updates to this manual have not yet been checked in.



From akuchlin@mems-exchange.org  Sat Jul  7 04:59:43 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 6 Jul 2001 23:59:43 -0400
Subject: [Python-Dev] "Becoming", rev. 2
Message-ID: <20010706235943.A15689@ute.cnri.reston.va.us>

Another update:

http://www.amk.ca/python/writing/python-dev.html

Added a section on design principles, mostly so I can quote Tim's 19
theses, a conclusion and acks sections, and the previously posted
descriptions of Stackless, Jython, and Python.NET.  At this point I'm
ready to go more public with it, and will send off notes to the usual
places to announce it.  Don't hesitate to send more comments, before
or after any announcements go out.  Now I have to go sleepy-dodos or I
shall be all cross in the morning.

--amk


From thomas@xs4all.net  Sat Jul  7 18:07:15 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 7 Jul 2001 19:07:15 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <200107051507.f65F7Wf12155@odiug.digicool.com>
References: <20010705155447.C8098@xs4all.nl> <200107051507.f65F7Wf12155@odiug.digicool.com>
Message-ID: <20010707190715.J8098@xs4all.nl>

On Thu, Jul 05, 2001 at 11:07:32AM -0400, Guido van Rossum wrote:
> > > Rip out the fancy behaviors of xrange that nobody uses: repeat, slice,
> > > contains, tolist(), and the start/stop/step attributes.  This includes
> > > removing the 4th ('repeat') argument to PyRange_New().

> > Eek... What do we have the fucking warning framework and deprecation
> > warnings for, anyway ?!

> For more important things?!

> I posted the PEP about this, and I got mostly favorable or lukewarm
> responses.  *One* person admitted they were using advanced xrange()
> features now, but he said he wouldn't miss them.  The warning
> framework and deprecation warnings are important for things that will
> change the semantics of things *without* causing an error message
> (like nested scopes).  They are also important for things that will
> require lots of folks to change their code.

I'm sorry, but I have to disagree with this, and vehemently disagree.
Violently, even. Had I not taken the time to write this, it would have been
riddled with cusswords ;) I'll tell you why we disagree, though: we look at
Python from two entirely different angles. Let me try to explain mine, and
why it is *bad* to change something, even something that should be rarely
used, without warning.

You seem to argue from the belief that everyone installs their own Python
version, or upgrades by choice, being fully aware of all the changes it
carries with it. This is (unfortunately) probably true for most of the
Python users. I say unfortunately, because it means Python still hasn't hit
the main stream :)

XS4ALL is an ISP. We provide a bunch of services, like webhosting, machine
hosting, shell access, etc. We have something like 100k shell users, and 8k
webservers with CGI access, and all of them can use Python. Upgrading
anything in that setup is a bitch. Upgrading something that *might* break
'broken' customer code is even worse. We had a client threaten to sue us for
upgrading a Perl version where

    close <FILEHANDLE>;

was changed from a warning into a (compile time) error. Never mind that it
never did anything in the first place, suddenly their scripts generated an
HTTP error 500, without them changing anything. And believe me, when you
have 8k clueless companies hire wannabe-scriptkiddies to grab some Matt's
Scripting Archive perl scripts from the 'net and get them working the way
the company wants, you accumulate a *lot* of broken-but-barely-working code.

Upgrading something that might break *perfectly valid code* is a lot, lot
worse. The advanced xrange behaviour being gone in 2.2, as well as a 'yield'
keyword added (which you hinted at in a c.l.py posting) without future
statement, would make it practically impossible for me to upgrade Python
from 2.0/2.1 to stock 2.2.

I can't imagine it's any different for Gregor or any of the other
package/distribution maintainers. How are they supposed to provide a smooth
upgrade path if code breaks in silent and unobvious ways ? How can they
decide for their millions of 'customers' whether or not they should have
used xrange's advanced features ?? About the only thing I can think of that
people like Gregor and/or people like me can do, is revert the xrange change
and add warnings ourselves.

I'm sorry, but "it shouldn't have been used this way" is simply not enough
justification to rip something out without as much as a warning in advance.
Range-objects aren't broken now. They aren't blocking the advancement of
Python in any significant way like 'import *' and 'exec' were for nested
scopes. Adding warnings should not be that hard, or the warnings framework
is very broken. And I don't see why we bother with future statements and
warnings at all if we still won't give the guarantee that code won't go from
'documentation-correct' to 'silently-broken' in a *single release*. It
doesn't quite give the message that Python cares about backward
compatibility or code-stability at all, so why bother trusting it at all ?

Unfortunately, I can't decide *not* to upgrade Python either. One of our
customers once threatened to sue us for not upgrading GCC. (And, of course,
when we did, one of our other customers threatened to sue us for upgrading
GCC, because of the damned C++ ABI/API changes.) We've had (and still have!)
similar upgrade nightmares with F-secure SSH and OpenSSH, where you don't
really have the option not to upgrade if you care about system security. I
really don't need another package to worry about.

> Again, the real point of the deprecation policy is not to *never* get
> an error in old code.  It is to make sure that you don't get burned by
> *silent* changes in semantics, and to make sure that *common* usage
> that will stop working is caught.  Advanced xrange() is not common.
> Calling PyRange_New() from C is not common.

Not for you, probably not for most people. But I don't trust my customers,
so I can't know what they do or what they rely on. But I do know that the
removal of the advanced xrange() behaviour is very silent indeed, and it
definitely warrants a warning in a release before it is ripped out.
Especially because there seems to be no reason to remove it, other than "I
don't like it". Guido, I trust your language instincts; I know you are
probably right about the advanced features of xrange, and I would never try
to persuade you to do what you think is wrong, just supply my own opinion.

But in maintenance issues, both in a technical and in a PR sense, I trust my
own instincts a lot more than yours, and my instincts are running around in
bright red bodypaint, smacking themselves over the head with cluebricks,
going "don't do it, don't do it".

> > I can live with (though not agree with, sorry ;P) the removal of xrange
> > advanced features... just not from supported to *gone* in a single
> > step.

> Sorry, then you better commit suicide. :-)

And leave you to finish 2.1.1 as well as 2.0.1 ? Hmmm. But I'll tell you one
thing: if you make me be 2.2 Patch Czar with xrange still lobotomized, I'll
have to consider that a bug and fix it the same week 2.2 comes out.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From skip@pobox.com (Skip Montanaro)  Sat Jul  7 19:00:14 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Sat, 7 Jul 2001 13:00:14 -0500
Subject: [Python-Dev] Re: Comment on PEP-0238
In-Reply-To: <9ngektgqnrl17er73ukukqids95p5158dp@4ax.com>
References: <mailman.994427714.26169.python-list@python.org>
 <cpsngaja4g.fsf@cj20424-a.reston1.va.home.com>
 <FFv17.1178$Y6.656565@news1.rdc2.pa.home.com>
 <cpelrskgrr.fsf@cj20424-a.reston1.va.home.com>
 <9i7a7k$h9ks6$1@ID-11957.news.dfncis.de>
 <9ngektgqnrl17er73ukukqids95p5158dp@4ax.com>
Message-ID: <15175.20014.876425.627044@beluga.mojam.com>

    Guido> (Hm.  For various reasons I'm very tempted to introduce 'yield'
    Guido> as a new keyword without warnings or future statements in Python
    Guido> 2.2, so maybe I should bite the bullet and add 'div' as well...)

    C//> If one is going to add keywords to a language, I suggest that a
    C//> list of possible future keywords -- even ones that aren't planned
    C//> on being supported any time soon -- be reserved at the same time.

And that warnings be issued for their use for at least one version.

-- 
Skip Montanaro (skip@pobox.com)
(847)971-7098


From guido@digicool.com  Sat Jul  7 19:14:31 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 07 Jul 2001 14:14:31 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: Your message of "Sat, 07 Jul 2001 19:07:15 +0200."
 <20010707190715.J8098@xs4all.nl>
References: <20010705155447.C8098@xs4all.nl> <200107051507.f65F7Wf12155@odiug.digicool.com>
 <20010707190715.J8098@xs4all.nl>
Message-ID: <200107071814.f67IEWT18834@odiug.digicool.com>

Yes, the lawyers have a way of scaring us all, don't they. :-)

I hear clearly that you want the advanced xrange() behavior to
generate a warning before I take it out.  I still think that's
unnecessary, given that nobody in their right mind uses it.  But since
people who are out of their mind have access to lawyers too, you can
go ahead and restore the old code and stuff it with warnings.  Make
sure to add a warning for every feature that I've taken out!
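
A minimal sketch of what "stuffing it with warnings" could look like,
using the standard warnings module.  The class below is a hypothetical
pure-Python stand-in for illustration, not the rangeobject.c code, and
the warning text is made up:

```python
import warnings


class OldStyleRange:
    """Toy stand-in for an xrange-like object (not the real C type)."""

    def __init__(self, start, stop, step=1):
        self._values = range(start, stop, step)

    def __iter__(self):
        return iter(self._values)

    def tolist(self):
        # Keep the old behaviour, but tell the caller it is going away.
        warnings.warn(
            "tolist() is deprecated; use list(obj) instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return list(self._values)


def tolist_with_capture(obj):
    """Call obj.tolist() and return (result, list of warning categories)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = obj.tolist()
    return result, [w.category for w in caught]
```

The old call keeps working for one more release, and anyone still
relying on it gets a message pointing at their own line of code
(stacklevel=2) instead of a silent break later.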

(Do you think you'll need to add a warning to the __contains__
implementation?  Taking that away doesn't change the functionality,
but changes the *performance* from O(1) to O(n).)
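
To make the O(1) versus O(n) difference concrete, here is a sketch of
the two ways a membership test on an arithmetic progression can work.
Both function names are hypothetical and neither is taken from the C
implementation:

```python
def contains_constant_time(value, start, stop, step=1):
    """O(1) test: is value one of start, start+step, ... short of stop?"""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        in_bounds = start <= value < stop
    else:
        in_bounds = stop < value <= start
    # Inside the bounds, value is a member only if it sits on the step grid.
    return in_bounds and (value - start) % step == 0


def contains_linear(value, start, stop, step=1):
    """O(n) fallback: compare against every element in turn."""
    for item in range(start, stop, step):
        if item == value:
            return True
    return False
```

Both give the same answers; only the cost differs, which is why taking
the dedicated implementation away changes performance but not results.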

Regarding the yield statement: I'd love to require a future statement,
but the current support for future statements doesn't support
modifying the parser based on the presence of future statements, and I
don't know how to resolve that, short of totally rewriting the parser
or scanning ahead looking for a future statement with some regular
expression.

Sobering thought: It's possible, given all the other changes that I'm
thinking about, that it just won't be possible to make Python 2.2
fully backwards compatible.  Should we rename it to 3.0?  Forget about
the changes?  Label it as experimental and encourage ISPs to install
it as an "alternative" version, only available by using "python2.2"?

PS: I am beginning to believe that the ThreadingTCPServer /
SocketServer problems reported on SF are serious enough to warrant
fixing in 2.1.1.  I'll try to get to the bottom of it ASAP, but if
someone else could look into this I'd be grateful too.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Sat Jul  7 19:37:01 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 7 Jul 2001 20:37:01 +0200
Subject: [Python-Dev] Re: CVS: python/dist/src/Include rangeobject.h,2.16,2.17
References: <20010705155447.C8098@xs4all.nl> <200107051507.f65F7Wf12155@odiug.digicool.com>              <20010707190715.J8098@xs4all.nl>  <200107071814.f67IEWT18834@odiug.digicool.com>
Message-ID: <00c201c10713$d19a7be0$4ffa42d5@hagrid>

guido wrote:
> Sobering thought: It's possible, given all the other changes that I'm
> thinking about, that it just won't be possible to make Python 2.2
> fully backwards compatible.  Should we rename it to 3.0?  Forget about
> the changes?  Label it as experimental and encourage ISPs to install
> it as an "alternative" version, only available by using "python2.2"?

every single Python release ever made has broken some of my
code (often in rather esoteric ways).  does that make them all
"experimental"?

imo, the only reasonable strategy for an ISP (or anyone offering
a "standard python install" for a group of users) is of course to
install new versions beside the old ones, notify users, and switch
the default a couple of months after the new version has been
installed.

</F>



From loewis@informatik.hu-berlin.de  Sat Jul  7 19:38:35 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Sat, 7 Jul 2001 20:38:35 +0200 (MEST)
Subject: [Python-Dev] from future import yield
Message-ID: <200107071838.UAA11256@pandora.informatik.hu-berlin.de>

> Regarding the yield statement: I'd love to require a future statement,
> but the current support for future statements doesn't support
> modifying the parser based on the presence of future statements, and I
> don't know how to resolve that, short of totally rewriting the parser
> or scanning ahead looking for a future statement with some regular
> expression.

The "directive" patch manages to conditionally introduce a new keyword,
namely directive. The trick is to introduce it into the grammar, but only
recognize it as a keyword if a flag is set. That approach could
be used for future imports also, although I'd much prefer to spell it

directive transitional yield
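
The flag trick can be sketched in a few lines.  This is a toy
classifier under my own assumptions (the names and the keyword subsets
are made up for illustration); it is not the code from the directive
patch or from CPython's tokenizer:

```python
ALWAYS_KEYWORDS = {"def", "return", "if", "else"}   # small illustrative subset
CONDITIONAL_KEYWORDS = {"yield"}                    # in the grammar, off by default


def classify(token, enabled=frozenset()):
    """Return 'KEYWORD' or 'NAME' for an identifier-like token.

    A conditional keyword is only treated as a keyword when the module
    being compiled has switched it on; otherwise it stays a plain name.
    """
    if token in ALWAYS_KEYWORDS:
        return "KEYWORD"
    if token in CONDITIONAL_KEYWORDS and token in enabled:
        return "KEYWORD"
    return "NAME"
```

With the flag off, old code that uses yield as a variable name keeps
compiling; with it on, the same token is handed to the grammar as a
keyword.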

Regards,
Martin



From skip@pobox.com (Skip Montanaro)  Sat Jul  7 19:51:21 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Sat, 7 Jul 2001 13:51:21 -0500
Subject: [Python-Dev] Re: Comment on PEP-0238
In-Reply-To: <cp4rsok3i7.fsf@cj20424-a.reston1.va.home.com>
References: <mailman.994427714.26169.python-list@python.org>
 <cpsngaja4g.fsf@cj20424-a.reston1.va.home.com>
 <FFv17.1178$Y6.656565@news1.rdc2.pa.home.com>
 <cpelrskgrr.fsf@cj20424-a.reston1.va.home.com>
 <9i7a7k$h9ks6$1@ID-11957.news.dfncis.de>
 <cp4rsok3i7.fsf@cj20424-a.reston1.va.home.com>
Message-ID: <15175.23081.888138.584693@beluga.mojam.com>

    Guido> "Emile van Sebille" <emile@fenx.com> writes:
    >> If you're going to add keywords, why not add precision and allow
    >> those who want non-integer division to set it to the level of
    >> precision they require.  That breaks no more code (presumably) than
    >> adding div or yield does.

    Guido> I'm not sure what you're asking about.  If you're serious, please
    Guido> submit a PEP!  This is the time to do it.  Posting to the
    Guido> newsgroup is *not* sufficient to let an idea be heard by me --
    Guido> you *have* to mail it to me directly or to python-dev.  (While I
    Guido> like to read c.l.py sometimes, I cannot guarantee that I see
    Guido> every post.)

Isn't this similar to Paul DuBois' floating point ideas?

Skip



From paulp@ActiveState.com  Sat Jul  7 20:03:14 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sat, 07 Jul 2001 12:03:14 -0700
Subject: [Python-Dev] Mobius2
References: <20010705155447.C8098@xs4all.nl> <200107051507.f65F7Wf12155@odiug.digicool.com>
 <20010707190715.J8098@xs4all.nl> <200107071814.f67IEWT18834@odiug.digicool.com>
Message-ID: <3B475CF2.8EDE5E5A@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> Regarding the yield statement: I'd love to require a future statement,
> but the current support for future statements doesn't support
> modifying the parser based on the presence of future statements, and I
> don't know how to resolve that, short of totally rewriting the parser
> or scanning ahead looking for a future statement with some regular
> expression.

Jeff Epler has an extension to Python that allows the grammar to be
loaded at runtime. That might help:

http://aspn.activestate.com/ASPN/Mail/Message/585636

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From thomas@xs4all.net  Sat Jul  7 20:31:58 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 7 Jul 2001 21:31:58 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <200107071814.f67IEWT18834@odiug.digicool.com>
Message-ID: <20010707213158.K8098@xs4all.nl>

On Sat, Jul 07, 2001 at 02:14:31PM -0400, Guido van Rossum wrote:
> Yes, the lawyers have a way of scaring us all, don't they. :-)

I don't care about the lawyers, we have lawyers to do that (and they're
*good*, just look at the court cases we've won against all odds :) but I do
care about angry customers badmouthing the company I work for, which I like
to think isn't an ordinary one :)

> I hear clearly that you want the advanced xrange() behavior to
> generate a warning before I take it out.  I still think that's
> unnecessary, given that nobody in their right mind uses it.  But since
> people who are out of their mind have access to lawyers too, you can
> go ahead and restore the old code and stuff it with warnings.  Make
> sure to add a warning for every feature that I've taken out!

Great, thanx, I will, right after I cancel my lawyer's appointment. ;)

> (Do you think you'll need to add a warning to the __contains__
> implementation?  Taking that away doesn't change the functionality,
> but changes the *performance* from O(1) to O(n).)

I hadn't even noticed you took that one out... I can't say I see much point
in removing it, but I don't see a reason to add a warning for it.

> Regarding the yield statement: I'd love to require a future statement,
> but the current support for future statements doesn't support
> modifying the parser based on the presence of future statements, and I
> don't know how to resolve that, short of totally rewriting the parser
> or scanning ahead looking for a future statement with some regular
> expression.

Aha. Hrm... 

> Sobering thought: It's possible, given all the other changes that I'm
> thinking about, that it just won't be possible to make Python 2.2
> fully backwards compatible.  Should we rename it to 3.0?  Forget about
> the changes?  Label it as experimental and encourage ISPs to install
> it as an "alternative" version, only available by using "python2.2"?

Well... hrm... Iterators, generators and the type/class unification strike
me as more than enough reason to call it Python 3.0. Or we could ship 2.2
with iterators, but not the other features, warn against identifiers called
'yield' in that one, and ship 3.0 not long after. I have to admit I object
less to adding 'yield' without warning than removing advanced xrange
features, for two reasons: a new keyword breaks at compilation time, 
whereas missing xrange features appear at runtime, and secondly, I *like*
generators :)
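
The compile-time versus runtime distinction can be shown with a small
sketch.  It assumes a current Python 3 interpreter, where yield is
already a keyword and range objects have no tolist() method; fails_at
is a hypothetical helper written for this illustration:

```python
def fails_at(source):
    """Report whether a snippet breaks when compiled, when run, or not at all."""
    try:
        code = compile(source, "<demo>", "exec")
    except SyntaxError:
        return "compile time"
    try:
        exec(code, {})
    except Exception:
        return "run time"
    return "never"


print(fails_at("yield = 1"))           # identifier that became a keyword
print(fails_at("range(3).tolist()"))   # method that was removed
```

The first kind of breakage shows up as soon as the module is loaded;
the second stays hidden until that particular line is actually
executed.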

A new parser that handles keywords more gracefully would also be an
excellent reason for a 3.0 version number :-)

> PS: I am beginning to believe that the ThreadingTCPServer /
> SocketServer problems reported on SF are serious enough to warrant
> fixing in 2.1.1.  I'll try to get to the bottom of it ASAP, but if
> someone else could look into this I'd be grateful too.

Unsure which problems those are, but I'll keep an eye open for them (I'll be
going through the CVS logs, now that I've figured out how to get them working,
and the SF bug/patch database in the coming week.)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From jack@oratrix.nl  Sat Jul  7 22:43:37 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Sat, 07 Jul 2001 23:43:37 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: Message by Guido van Rossum <guido@digicool.com> ,
 Sat, 07 Jul 2001 14:14:31 -0400 , <200107071814.f67IEWT18834@odiug.digicool.com>
Message-ID: <20010707214342.DEFF0DA742@oratrix.oratrix.nl>

Recently, Guido van Rossum <guido@digicool.com> said:
> Sobering thought: It's possible, given all the other changes that I'm
> thinking about, that it just won't be possible to make Python 2.2
> fully backwards compatible.  Should we rename it to 3.0?  Forget about
> the changes?  Label it as experimental and encourage ISPs to install
> it as an "alternative" version, only available by using "python2.2"?

In this respect you should also think of the people
embedding/extending Python. From the checkin messages I get the
impression that all the new inheritance stuff could well break things
there, and if you're going to break, say, pyapache or somesuch then a
major version jump may well be called for...
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From barry@digicool.com  Sun Jul  8 01:19:08 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Sat, 7 Jul 2001 20:19:08 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
References: <200107071814.f67IEWT18834@odiug.digicool.com>
 <20010707213158.K8098@xs4all.nl>
Message-ID: <15175.42748.753199.958152@anthem.wooz.org>

>>>>> "TW" == Thomas Wouters <thomas@xs4all.net> writes:

    TW> Well... hrm... Iterators, generators and the type/class
    TW> unification strike me as more than enough reason to call it
    TW> Python 3.0.

I think this is something to seriously consider.  Especially because I
suspect that the types/class stuff may be rather green at first, and
(as Guido implied) may not be able to be done in a backwards
compatible way.  Bumping the rev number to 3.0 also makes me a little
more comfortable with adding stuff like the yield keyword with no
future statement.

/If/ we do that, then we shouldn't necessarily abandon the 2.x series
immediately.  We can do things like work on performance improvements,
library enhancements, and bug fixes.  This strategy might also calm
the fears about Python-the-language moving too quickly.

-Barry


From akuchlin@mems-exchange.org  Sun Jul  8 02:35:09 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Sat, 7 Jul 2001 21:35:09 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <20010707213158.K8098@xs4all.nl>; from thomas@xs4all.net on Sat, Jul 07, 2001 at 09:31:58PM +0200
References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl>
Message-ID: <20010707213508.A16251@ute.cnri.reston.va.us>

On Sat, Jul 07, 2001 at 09:31:58PM +0200, Thomas Wouters wrote:
>Well... hrm... Iterators, generators and the type/class unification strike
>me as more than enough reason to call it Python 3.0. Or we could ship 2.2

Agreed.  The version number being a few decimal place shifts away from
Python 3000 is cute, too.  

--amk


From guido@digicool.com  Sun Jul  8 12:45:14 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 08 Jul 2001 07:45:14 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: Your message of "Sat, 07 Jul 2001 21:35:09 EDT."
 <20010707213508.A16251@ute.cnri.reston.va.us>
References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl>
 <20010707213508.A16251@ute.cnri.reston.va.us>
Message-ID: <200107081145.f68BjE824353@odiug.digicool.com>

> On Sat, Jul 07, 2001 at 09:31:58PM +0200, Thomas Wouters wrote:
> >Well... hrm... Iterators, generators and the type/class unification strike
> >me as more than enough reason to call it Python 3.0. Or we could ship 2.2
> 
> Agreed.  The version number being a few decimal place shifts away from
> Python 3000 is cute, too.  
> 
> --amk

Well, that's one of the reasons why I *don't* want this to be the 3.0
release.  Python 2.2 is *not* Python 3000, it is only a small step on
the way.  I also think that as soon as we announce something that
smells like Py3k to the users, there will be a huge effort to keep
Python 2.x alive.  This could cause a split in the user community of
gigantic proportions, and we'd run the risk that most of the users
would stay at Python 2.x forever.  This in turn would require us to
maintain that, probably release 2.2, 2.3 and further versions.

Despite what started this discussion, I think there will only be a
very small number of real incompatibilities between 2.1 and 2.2: one
or two new keywords (and we may have a way to reduce this to zero by
using a future or directive statement), and the object introspection
API will change.  I'm not planning on breaking classic classes in any
significant way -- that will be reserved for 2.3 or later (this is the
domain of PEP 254 which is deliberately empty so far).

Q. If an operation that failed with an AttributeError now fails with a
TypeError (or the other way around), how important is that
incompatibility?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Sat Jul  7 21:01:32 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sat, 7 Jul 2001 16:01:32 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <200107081145.f68BjE824353@odiug.digicool.com>; from guido@digicool.com on Sun, Jul 08, 2001 at 07:45:14AM -0400
References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com>
Message-ID: <20010707160132.A8791@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> Q. If an operation that failed with an AttributeError now fails with a
> TypeError (or the other way around), how important is that
> incompatibility?

Not very, in my opinion.  I don't believe I've ever coded an except for
either of them.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The abortion rights and gun control debates are twin aspects of a deeper
question --- does an individual ever have the right to make decisions
that are literally life-or-death?  And if not the individual, who does?


From fredrik@pythonware.com  Sun Jul  8 13:41:23 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 8 Jul 2001 14:41:23 +0200
Subject: [Python-Dev] Re: changing AttributeError to TypeError
References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl>              <20010707213508.A16251@ute.cnri.reston.va.us>  <200107081145.f68BjE824353@odiug.digicool.com>
Message-ID: <007101c107ad$27fccc60$4ffa42d5@hagrid>

guido wrote:

> Q. If an operation that failed with an AttributeError now fails with a
> TypeError (or the other way around), how important is that
> incompatibility?

what operations do you have in mind?  

    cd Lib
    grep "except.*\(AttributeError\|TypeError\)" *.py */*.py */*/*.py

gives me about 75 hits in the 2.0 standard library; looks like all but
one would break if you changed *all* attribute errors to type errors,
and vice versa...

if this change doesn't affect any code in the standard library,
changes are that it'll only break a few of the ~1000 uses I found
in my company's code repository...
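
For anyone without grep at hand, roughly the same count can be sketched in
Python. This is an approximation written for a modern Python 3 interpreter:
the function names are invented for the example, and os.walk() descends to
any depth, whereas the globs above stop at two directory levels.

```python
import os
import re

# Same idea as the grep pattern: 'except' followed, on the same line,
# by either exception name.
PATTERN = re.compile(r"except.*(AttributeError|TypeError)")

def count_in_lines(lines):
    """Count the lines that look like handlers for either exception."""
    return sum(1 for line in lines if PATTERN.search(line))

def count_in_tree(root):
    """Apply count_in_lines() to every .py file below `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    total += count_in_lines(handle)
    return total
```

Like the grep, this only finds handlers that name the exception on the
'except' line itself; it says nothing about whether a given handler would
actually be affected by a change.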

</F>



From gward@python.net  Mon Jul  9 00:14:01 2001
From: gward@python.net (Greg Ward)
Date: Sun, 8 Jul 2001 19:14:01 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <200107081145.f68BjE824353@odiug.digicool.com>; from guido@digicool.com on Sun, Jul 08, 2001 at 07:45:14AM -0400
References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com>
Message-ID: <20010708191401.B779@gerg.ca>

On 08 July 2001, Guido van Rossum said:
> Q. If an operation that failed with an AttributeError now fails with a
> TypeError (or the other way around), how important is that
> incompatibility?

I generally think of those exceptions as meaning, "You've got a bug in
your code, bozo" so I don't bother catching them (except in the main
loop of GUIs and servers, to show a big scary traceback to the poor user
or dump it in a logfile).

However, I think that AttributeError is pretty aptly used for the most
part, and I don't see a great benefit in changing an incorrect
"thing.property" to raise TypeError.

        Greg
-- 
Greg Ward - Linux geek                                  gward@python.net
http://starship.python.net/~gward/
God is real, unless declared integer.


From tim.one@home.com  Mon Jul  9 00:44:22 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 8 Jul 2001 19:44:22 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <20010708191401.B779@gerg.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBMKNAA.tim.one@home.com>

[Greg Ward]
> However, I think that AttributeError is pretty aptly used for the most
> part, and I don't see a great benefit in changing an incorrect
> "thing.property" to raise TypeError.

"The problem" comes up again and again, and in every release (this isn't
something new!), in the specific context of instance objects.  Like what do
you do for

    class C:
        pass

    c = C()
    print len(c)

?  Instance objects have *every* interesting tp_xxx slot filled in "just in
case", so the len() implementation code that first checks for the existence
of tp_as_sequence or tp_as_mapping says "NO!" to len(3) but "YES!" to
len(c).  The result is that len(3) produces

    TypeError: len() of unsized object

but len(c) eventually gets around to raising the superficially different

    AttributeError: C instance has no attribute '__len__'

instead.  So in cases like this TypeError really means "and it's damned
obvious", while AttributeError means "but it might have been otherwise had
you defined your class differently, but you didn't, and I'm not exactly sure
*why* someone is asking me for "__len__", so the safest thing to say is that
I don't have such an attribute".

Then the problem is that "just try it and see whether it works" code gets
written based on trying an example in a shell, to see whether AttributeError
or TypeError gets raised in the specific case the author is worried about,
and in a later release it raises the other one instead.

As an old-time Pythoneer, I took the almost total lack of "which exceptions
get raised when, exactly" docs as a warning that this stuff was *expected*
to change frequently, so I've always written "try it and see" code via

    try:
        it._and(see)
    except (TypeError, AttributeError):
        pass

That almost never "breaks" across releases.
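
Spelled out as a small helper, the pattern might look like the sketch below.
It is written for a modern Python 3 interpreter, and length_or_none() is a
hypothetical name used only for illustration.

```python
def length_or_none(obj):
    """Return len(obj), or None if obj does not support len()."""
    try:
        return len(obj)
    except (TypeError, AttributeError):
        # Which of the two gets raised has varied by type and by release,
        # so catch both instead of depending on either one.
        return None

class C:
    pass

print(length_or_none([1, 2, 3]))
print(length_or_none(3))
print(length_or_none(C()))
```

The caller never learns which exception was raised, which is the point: the
code keeps working if a later release swaps one for the other.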

Here's a specific 2.1 vs 2.2 example:

>>> for i in C(): pass # 2.1
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: C instance has no attribute '__getitem__'
>>>

>>> for i in C(): pass # 2.2a0
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: iter() of non-sequence
>>>

Since for-loops no longer require __getitem__ at all, even if we *could*
raise the same error in 2.2, it wouldn't make *sense* in 2.2.
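
A rough sketch of the two routes a for-loop can take, written for a modern
Python 3 interpreter (where classes are new-style, so the failing case
raises TypeError). All class and function names are invented for the
example.

```python
class Countdown:
    """Iterable through __iter__, the protocol added in 2.2."""
    def __init__(self, start):
        self.start = start
    def __iter__(self):
        return iter(range(self.start, 0, -1))

class Squares:
    """Iterable through the older __getitem__ fallback."""
    def __getitem__(self, index):
        if index >= 4:
            raise IndexError(index)  # tells the loop to stop
        return index * index

class Plain:
    """Defines neither method, so it cannot be iterated."""

def try_iterate(obj):
    """Return the items of obj, or the name of the exception raised."""
    try:
        return list(obj)
    except (TypeError, AttributeError) as exc:
        return type(exc).__name__

print(try_iterate(Countdown(3)))
print(try_iterate(Squares()))
print(try_iterate(Plain()))
```

Because __getitem__ is only one of two ways to be iterable, an error message
that names __getitem__ would be misleading for the Plain case.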

In every case I've seen, a switch from AttributeError to TypeError makes
better sense in the end.  Can break code, though!



From guido@digicool.com  Mon Jul  9 01:33:15 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 08 Jul 2001 20:33:15 -0400
Subject: [Python-Dev] Re: changing AttributeError to TypeError
In-Reply-To: Your message of "Sun, 08 Jul 2001 14:41:23 +0200."
 <007101c107ad$27fccc60$4ffa42d5@hagrid>
References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com>
 <007101c107ad$27fccc60$4ffa42d5@hagrid>
Message-ID: <200107090033.f690XFg24570@odiug.digicool.com>

> guido wrote:
> 
> > Q. If an operation that failed with an AttributeError now fails with a
> > TypeError (or the other way around), how important is that
> > incompatibility?
> 
> what operations do you have in mind?  

The specific example was this:

    class C: pass
    list(C())

The second line used to raise AttributeError: 'C' instance has no
attribute '__len__'; now it raises TypeError: iter() of non-sequence.

But I imagine there will be others, caused by the different (IMO
better) way of implementing getattr for most built-in types.  (Note
that I'm hardly touching "classic" classes -- that's a post-2.2 job if
there ever was one.)

>     cd Lib
>     grep "except.*\(AttributeError\|TypeError\)" *.py */*.py */*/*.py
> 
> gives me about 75 hits in the 2.0 standard library; looks like all but
> one would break if you changed *all* attribute errors to type errors,
> and vice versa...
> 
> if this change doesn't affect any code in the standard library,
> changes are that it'll only break a few of the ~1000 uses I found
> in my company's code repository...
> 
> </F>

Not clear what that means...

I tend to fix the test suite when it tests for too specific an error.
I don't think there are many cases in the library proper that are
sensitive to the kind of thing that might change.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Mon Jul  9 01:40:47 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 08 Jul 2001 20:40:47 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: Your message of "Sun, 08 Jul 2001 19:14:01 EDT."
 <20010708191401.B779@gerg.ca>
References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com>
 <20010708191401.B779@gerg.ca>
Message-ID: <200107090040.f690elP24596@odiug.digicool.com>

> On 08 July 2001, Guido van Rossum said:
> > Q. If an operation that failed with an AttributeError now fails with a
> > TypeError (or the other way around), how important is that
> > incompatibility?

Greg Ward:
> I generally think of those exceptions as meaning, "You've got a bug in
> your code, bozo" so I don't bother catching them (except in the main
> loop of GUIs and servers, to show a big scary traceback to the poor user
> or dump it in a logfile).

That's my view on them too.

> However, I think that AttributeError is pretty aptly used for the most
> part, and I don't see a great benefit in changing an incorrect
> "thing.property" to raise TypeError.

Fortunately, that wasn't what I attempted to propose.  As I mentioned
in my reply to Fredrik, there are/were some cases where you get a
surprise AttributeError because a type inconsistency reveals itself
when an object doesn't support a required operation.  This can go
either way: what used to be an AttributeError may become a TypeError,
or vice versa.  (Sorry, no concrete examples right now besides the
previous list(C()) example.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward@python.net  Mon Jul  9 02:14:22 2001
From: gward@python.net (Greg Ward)
Date: Sun, 8 Jul 2001 21:14:22 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEBMKNAA.tim.one@home.com>; from tim.one@home.com on Sun, Jul 08, 2001 at 07:44:22PM -0400
References: <20010708191401.B779@gerg.ca> <LNBBLJKPBEHFEDALKOLCMEBMKNAA.tim.one@home.com>
Message-ID: <20010708211422.A1546@gerg.ca>

On 08 July 2001, Tim Peters said:
> "The problem" comes up again and again, and in every release (this isn't
> something new!), in the specific context of instance objects.  Like what do
> you do for
> 
>     class C:
>         pass
> 
>     c = C()
>     print len(c)

Good point -- this is one place where AttributeError is misused and
confusing.  +1 on changing it to TypeError -- this sounds like a
definite usability increase.  (IOW, it won't break *my* code.  ;-)

        Greg
-- 
Greg Ward - Unix nerd                                   gward@python.net
http://starship.python.net/~gward/
I repeat myself when under stress I repeat myself when under stress I repeat---


From tim.one@home.com  Mon Jul  9 03:40:19 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 8 Jul 2001 22:40:19 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <20010708211422.A1546@gerg.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECDKNAA.tim.one@home.com>

[Tim]
>     class C:
>         pass
>
>     c = C()
>     print len(c)

[Greg Ward]
> Good point -- this is one place where AttributeError is misused and
> confusing.  +1 on changing it to TypeError -- this sounds like a
> definite usability increase.  (IOW, it won't break *my* code.  ;-)

We're not actually *proposing* to change anything; in fact, that specific
example works the same in 2.2a0 (even with the type/class changes) as in
2.1.  The problem is that which of {TypeError, AttributeError} you get when
a specific object doesn't support a specific operation is at least partly an
accident, and changes from time to time whether or not intended.

Since instance objects have always been the flakiest in this respect, and
the instance/class machinery is undergoing radical surgery on descr-branch
(in particular, classes are themselves becoming instances (of metaclasses)),
I think Guido is trying to get a feel for how loudly people will howl if we
don't add reams of obscure code seeking to reproduce old accidents exactly.

it's-not-whether-they'll-howl-it's-the-volume-ly y'rs  - tim



From tim.one@home.com  Mon Jul  9 04:51:30 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 8 Jul 2001 23:51:30 -0400
Subject: [Python-Dev] Python and e-art
In-Reply-To: <200107061335.PAA17961@core.inf.ethz.ch>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECGKNAA.tim.one@home.com>

[Samuele Pedroni]
> Hi. For the curious I just discovered this (maybe someone knew
> that already).
>
> Isn't python incredible <wink>.
>
> A group of e-artists has presented an e-art "virus" biennale.py
> written in python:
>
> http://www.0100101110101101.org/home/biennale_py/
>
> at the Biennale, the famous international contemporary art
> exposition and gathering in Venezia.
>
> It seems a t-shirt with the source code is available too.

Ya, and last week Python-Help got its first question about how concerned
Python users should be about this.

It's a cute and silly "virus":  it's just a bit of Python code that reads
its own source code from disk (up to a "stop here!" marker), looks for some
other Python files, and prepends itself to them.  Thus the files it alters
will (probably) do the same kind of thing when *they're* run; and so on.
The infected files clearly say that they're infected (in comments), and the
"stop here!" marker makes it easy to remove the mutation later.

All in all, it's more an example of marketing savvy than virus technology.
At the Biennale, their "exhibit" is simply a computer infected with this
virus.  An article in Wired said they managed to sucker 3 people so far into
paying something like $1000.00 a pop for a CD containing the virus source
code.

all's-fair-in-war-and-art-ly y'rs  - tim



From barry@digicool.com  Mon Jul  9 05:20:46 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 9 Jul 2001 00:20:46 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
References: <20010708211422.A1546@gerg.ca>
 <LNBBLJKPBEHFEDALKOLCCECDKNAA.tim.one@home.com>
Message-ID: <15177.12574.96784.801833@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim.one@home.com> writes:

    TP> Since instance objects have always been the flakiest in this
    TP> respect, and the instance/class machinery is undergoing
    TP> radical surgery on descr-branch (in particular, classes are
    TP> themselves becoming instances (of metaclasses)), I think Guido
    TP> is trying to get a feel for how loudly people will howl if we
    TP> don't add reams of obscure code seeking to reproduce old
    TP> accidents exactly.

As you say, this has always been flaky, inconsistent, underspecified,
and unpredictable, so IMO Guido's free to change this kind of thing as
he sees fit.  Builtins like list() or len() which implicitly do
attribute access under the covers should be free to raise either
exception, and good defensive programs have already probably been
catching both.

I know there's no danger in changing the behavior for an explicit
instance.attr access.  We all agree that that should always raise
AttributeError, right? :)

-Barry


From tim.one@home.com  Mon Jul  9 06:08:50 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 9 Jul 2001 01:08:50 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <15177.12574.96784.801833@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECJKNAA.tim.one@home.com>

[Barry A. Warsaw]
> ...
> I know there's no danger in changing the behavior for an explicit
> instance.attr access.  We all agree that that should always raise
> AttributeError, right? :)

Please, let's be serious here.  How an instance looks up attributes is
obviously a policy of the instance's class, and class policies are obviously
set by the class of which the instance's class is an instance, or, in other
words, by the instance's class's metaclass.  Unless you want to say that
class policies are inherited from base classes, in which case an entirely
different line of obvious argument obviously applies -- but that would be
wrong.  Now if the metaclass is of type type, then all you have to do is
look at PyType_Type.tp_getattr == type_getattr, and we see that it raises
AttributeError unless the attribute is one of "__name__" or "__doc__" or
"__members__".  So, yes,

    instance.attr

will *always* raise AttributeError in this case, because PyType_Type doesn't
allow for the existence of any attribute named "attr".  From that we deduce
that looking up instance attributes is probably not a class policy
determined by the metaclass after all, so some other obvious argument must
apply.

I'll get back to you after rereading all the PEPs.  But it would be better
for all if you didn't ask such obvious questions to begin with <wink>.

thinking-too-much-is-a-symptom-of-disease-ly y'rs  - tim



From fredrik@pythonware.com  Mon Jul  9 09:11:02 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 9 Jul 2001 10:11:02 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
References: <20010708211422.A1546@gerg.ca><LNBBLJKPBEHFEDALKOLCCECDKNAA.tim.one@home.com> <15177.12574.96784.801833@anthem.wooz.org>
Message-ID: <01df01c1084e$b3921090$4ffa42d5@hagrid>

barry wrote:
> As you say, this has always been flaky, inconsistent, underspecified,
> and unpredictable

>>> class C: pass
...
>>> c = C()
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'C' instance has no attribute '__len__'

looks pretty predictable to me...

</F>



From fredrik@pythonware.com  Mon Jul  9 09:08:59 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 9 Jul 2001 10:08:59 +0200
Subject: [Python-Dev] Re: changing AttributeError to TypeError
References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com>              <007101c107ad$27fccc60$4ffa42d5@hagrid>  <200107090033.f690XFg24570@odiug.digicool.com>
Message-ID: <01dc01c1084e$b33aefe0$4ffa42d5@hagrid>

guido wrote:

> > > Q. If an operation that failed with an AttributeError now fails with a
> > > TypeError (or the other way around), how important is that
> > > incompatibility?
>
> > what operations do you have in mind?  
> 
> The specific example was this:
> 
>     class C: pass
>     list(C())
> 
> The second line used to raise AttributeError: 'C' instance has no
> attribute '__len__'; now it raises TypeError: iter() of non-sequence.

so "an operation" in your original question is limited to operations that
may have resulted in an AttributeError or a TypeError depending on the
type, and the change means that they will now be more consistent?

doesn't sound too bad to me.

> > gives me about 75 hits in the 2.0 standard library; looks like all but
> > one would break if you changed *all* attribute errors to type errors,
> > and vice versa...
> > 
> > if this change doesn't affect any code in the standard library,
> > changes are that it'll only break a few of the ~1000 uses I found
> > in my company's code repository...
> 
> Not clear what that means...

the second sentence should have been:

    on the other hand, if this change DOESN'T affect any code in the
    standard library, chances are that it'll only break a few of the ~1000
    uses I found in my company's code repository...

> I tend to fix the test suite when it tests for too specific an error.
> I don't think there are many cases in the library proper that are
> sensitive to the kind of thing that might change.

have you made ANY changes to the library this far?

</F>



From thomas@xs4all.net  Mon Jul  9 13:31:04 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 9 Jul 2001 14:31:04 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include rangeobject.h,2.16,2.17
In-Reply-To: <200107071814.f67IEWT18834@odiug.digicool.com>
References: <20010705155447.C8098@xs4all.nl> <200107051507.f65F7Wf12155@odiug.digicool.com> <20010707190715.J8098@xs4all.nl> <200107071814.f67IEWT18834@odiug.digicool.com>
Message-ID: <20010709143104.R8098@xs4all.nl>

On Sat, Jul 07, 2001 at 02:14:31PM -0400, Guido van Rossum wrote:

> I hear clearly that you want the advanced xrange() behavior to
> generate a warning before I take it out.  I still think that's
> unnecessary, given that nobody in their right mind uses it.  But since
> people who are out of their mind have access to lawyers too, you can
> go ahead and restore the old code and stuff it with warnings.  Make
> sure to add a warning for every feature that I've taken out!

Done:

>>> (xrange(9)[8:7]*6 == xrange(5) or xrange(4).start or xrange(3)).tolist() 
__main__:1: DeprecationWarning: xrange object slicing is deprecated; convert to list instead
__main__:1: DeprecationWarning: xrange object multiplication is deprecated; convert to list instead
__main__:1: DeprecationWarning: PyRange_New's 'repetitions' argument is deprecated
__main__:1: DeprecationWarning: xrange object comparision is deprecated; convert to list instead
__main__:1: DeprecationWarning: xrange object's 'start', 'stop' and 'step' attributes are deprecated
__main__:1: DeprecationWarning: xrange.tolist() is deprecated; use list(xrange) instead
[0, 1, 2]

Those are all the warnings I added: for PyRange_New's 'reps' argument, for
slicing, multiplication, comparison (did you really mean to take it out?),
the start/stop/step attributes, and tolist().
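
The actual warnings live in the C implementation of xrange; the following is
only a Python-level sketch of the same idea, for a modern Python 3
interpreter. LazyRange is a made-up toy class, not the real xrange.

```python
import warnings

class LazyRange:
    """Toy stand-in for xrange: yields 0 <= i < stop on demand."""

    def __init__(self, stop):
        self._stop = stop

    def __len__(self):
        return self._stop

    def __getitem__(self, index):
        if not 0 <= index < self._stop:
            raise IndexError(index)
        return index

    def tolist(self):
        # Keep the old behaviour working, but tell the caller it is
        # going away.  stacklevel=2 attributes the warning to the caller.
        warnings.warn(
            "LazyRange.tolist() is deprecated; use list(obj) instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return list(self)
```

The deprecated method still returns the old result, so existing callers keep
running for a release while the warning points them at the replacement.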

I did leave the range_concat function out, though, so the error

>>> xrange(1) + xrange(1)
Traceback (innermost last):
  File "<stdin>", line 1, in ?
TypeError: cannot concatenate xrange objects

still changes into

>>> xrange(1) + xrange(1)  
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsupported operand types for +

but I don't see a problem with that. I also left out the 'contains'
implementation.

Still-not-understanding-*why*-<wink>-ly y'rs,
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Mon Jul  9 14:00:11 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 09 Jul 2001 09:00:11 -0400
Subject: [Python-Dev] Re: changing AttributeError to TypeError
In-Reply-To: Your message of "Mon, 09 Jul 2001 10:08:59 +0200."
 <01dc01c1084e$b33aefe0$4ffa42d5@hagrid>
References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com> <007101c107ad$27fccc60$4ffa42d5@hagrid> <200107090033.f690XFg24570@odiug.digicool.com>
 <01dc01c1084e$b33aefe0$4ffa42d5@hagrid>
Message-ID: <200107091300.f69D0Bo25004@odiug.digicool.com>

> so "an operation" in your original question is limited to operations that
> may have resulted in an AttributeError or a TypeError depending on the
> type, and the change means that they will now be more consistent?

Yes.

> doesn't sound too bad to me.

Me neither. :-)

> > I tend to fix the test suite when it tests for too specific an error.
> > I don't think there are many cases in the library proper that are
> > sensitive to the kind of thing that might change.
> 
> have you made ANY changes to the library this far?

Can't recall.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Samuele Pedroni <pedroni@inf.ethz.ch>  Mon Jul  9 15:16:28 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Mon, 9 Jul 2001 16:16:28 +0200 (MET DST)
Subject: [Python-Dev] Python and e-art
Message-ID: <200107091416.QAA05547@core.inf.ethz.ch>

[Tim Peters]
> all's-fair-in-war-and-art-ly y'rs  - tim

and commerce]  No, I was not very positively impressed
by biennale.py; as a program, e.g., it is not a work of art.
I think something more along the lines of Perl (my bad) poetry
would be better than a self-replicating juxtaposition of
body, soul, fornicate etc. (as identifiers) <wink>.
It's a poor representation of "sex" ...



From gball@cfa.harvard.edu  Mon Jul  9 15:50:16 2001
From: gball@cfa.harvard.edu (Greg Ball)
Date: Mon, 9 Jul 2001 10:50:16 -0400 (EDT)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include
 rangeobject.h,2.16,2.17
Message-ID: <Pine.OSF.4.10.10107091032110.20583-100000@cfata6.harvard.edu>

> but I don't see a problem with that. I also left out the 'contains'
> implementation.

> Still-not-understanding-*why*-<wink>-ly y'rs,

In the fine tradition of xrange, that 'contains' implementation is
slightly broken.  It doesn't have proper object equality semantics.

Python 2.1 (#1, Jul  4 2001, 14:48:37)
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-81)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> r, xr = range(10), xrange(10)
>>> 1.1 in r
0
>>> 1.1 in xr
1
>>> 1+0j in r
1
>>> 1+0j in xr
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: can't convert complex to int; use e.g. int(abs(z))
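
For illustration, here is a rough sketch of a membership test that keeps
plain == semantics for an arithmetic progression.  It is written in
present-day Python 3 rather than the 2.1 shown above, and the name
range_contains and its argument layout are made up for this example; it is
not the xrange implementation.

```python
def range_contains(start, stop, step, value):
    """Membership test for start, start+step, ... using == semantics."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Fast path only for exact ints, where arithmetic gives the same
    # answer as comparing against every element.
    if type(value) is int:
        if step > 0:
            in_bounds = start <= value < stop
        else:
            in_bounds = stop < value <= start
        return in_bounds and (value - start) % step == 0
    # Everything else (floats, complex, arbitrary objects) is compared
    # element by element with ==, without coercing the value to int.
    i = start
    while (step > 0 and i < stop) or (step < 0 and i > stop):
        if i == value:
            return True
        i += step
    return False


if __name__ == "__main__":
    print(range_contains(0, 10, 1, 1.1))
    print(range_contains(0, 10, 1, 1 + 0j))
```

The slow path is linear in the length of the progression, which is the
price of not guessing how an arbitrary object compares to an integer.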


--Greg Ball




From guido@digicool.com  Mon Jul  9 19:39:21 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 09 Jul 2001 14:39:21 -0400
Subject: [Python-Dev] Re: changing AttributeError to TypeError
In-Reply-To: Your message of "Mon, 09 Jul 2001 09:00:11 EDT."
 <200107091300.f69D0Bo25004@odiug.digicool.com>
References: <200107071814.f67IEWT18834@odiug.digicool.com> <20010707213158.K8098@xs4all.nl> <20010707213508.A16251@ute.cnri.reston.va.us> <200107081145.f68BjE824353@odiug.digicool.com> <007101c107ad$27fccc60$4ffa42d5@hagrid> <200107090033.f690XFg24570@odiug.digicool.com> <01dc01c1084e$b33aefe0$4ffa42d5@hagrid>
 <200107091300.f69D0Bo25004@odiug.digicool.com>
Message-ID: <200107091839.f69IdLT30186@odiug.digicool.com>

I just noticed another place that swaps a TypeError for an
AttributeError.

In 2.1 and before, assigning to an attribute of an object that doesn't
support attribute assignment (like a list) raises TypeError.
Under the new scheme, this will raise AttributeError.

(On the other hand, assigning to a read-only attribute of an object
that *does* support attribute assignment raises TypeError in the old
and new scheme.)
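
Code that has to run on both sides of a change like this can catch both
exceptions.  A minimal sketch in present-day Python 3, with a made-up
helper name (try_setattr); which exception a given object raises depends
on the interpreter version, so the helper just reports what it saw:

```python
def try_setattr(obj, name, value):
    """Set obj.name = value; return None on success, else the exception's class name."""
    try:
        setattr(obj, name, value)
    except (AttributeError, TypeError) as exc:
        # Either exception means "this assignment is not allowed".
        return type(exc).__name__
    return None


class Plain:
    """An ordinary class whose instances accept new attributes."""
    pass


if __name__ == "__main__":
    print(try_setattr(Plain(), "spam", 1))   # assignment is allowed here
    print(try_setattr([], "spam", 1))        # lists reject new attributes
```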

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com (Skip Montanaro)  Tue Jul 10 05:14:01 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 9 Jul 2001 23:14:01 -0500
Subject: [Python-Dev] Silly little benchmark
Message-ID: <15178.33033.669776.824095@beluga.mojam.com>

I don't know what motivated me to try this, but based on the 

    print ``1`+`2``

thing that came up in c.l.py I came up with the following "benchmarks":

    for i in xrange(100000): pass
    for i in xrange(100000): x = 1
    for i in xrange(100000): x = ``1`+`2``

user mode times on my computer (sys mode was always 0.0) were

		   Python 1.6	       Python 2.1	  change
    pass	   0.12		       0.20		  1.67x
    x = 1	   0.17		       0.30		  1.76x
    x = ``1`+`2``  1.60		       2.13		  1.33x

Startup times (python -S -c 'pass') are 0.0 for both versions on my 'puter.
It appears loop execution overhead has gotten substantially worse between
1.6 and 2.1.  I know new stuff that will affect looping is going into 2.2
(the generator stuff), but it would seem a good time to reserve a minor
version for mostly performance improvements.  2.3 perhaps?

Or do we wait for Armin's magic pixie dust to sprinkle down upon our heads
so we can crush those Perl swine once and for all? ;-)

at-least-on-linux-ly y'rs,

Skip


From tim.one@home.com  Tue Jul 10 08:33:09 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 10 Jul 2001 03:33:09 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15178.33033.669776.824095@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEFOKNAA.tim.one@home.com>

[Skip Montanaro]
> ...
> I came up with the following "benchmarks":
>
>     for i in xrange(100000): pass
>     for i in xrange(100000): x = 1
>     for i in xrange(100000): x = ``1`+`2``
>
> user mode times on my computer (sys mode was always 0.0) were
>
>              Python 1.6   Python 2.1   change
>     pass           0.12         0.20    1.67x
>     x = 1          0.17         0.30    1.76x
>     x = ``1`+`2``  1.60         2.13    1.33x

Please don't post stuff with hard tab characters (I took them out by hand so
this wasn't an unreadable mess).

> Startup times (python -S -c 'pass') are 0.0 for both versions on
> my 'puter.  It appears loop execution overhead has gotten substantially
> worse between 1.6 and 2.1.

AFAIK, nothing relevant changed between 1.6 and 2.1.  Anyone else?  Indeed,
AFAIK, *nothing* plausibly relevant about for-loops or xrange has
changed since 1.5 (when some general eval-loop speedups got done).

> ...
> but it would seem a good time to reserve a minor version for mostly
> performance improvements.  2.3 perhaps?

The loop speedup in 2.2 requires changes in the PVM as well as adopting the
iterator protocol.  If you've got some *easy* performance improvements,
sure, but then I have to wonder why you've been holding them back <wink>.



From fdrake@acm.org  Tue Jul 10 17:27:12 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Tue, 10 Jul 2001 12:27:12 -0400 (EDT)
Subject: [Python-Dev] [development doc updates]
Message-ID: <20010710162712.D2A8F2892B@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Updated to reflect the recent checkins for the Python/C API manual,
which cover a number of the object creation and initialization functions.



From skip@pobox.com (Skip Montanaro)  Tue Jul 10 18:23:15 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 10 Jul 2001 12:23:15 -0500
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEFOKNAA.tim.one@home.com>
References: <15178.33033.669776.824095@beluga.mojam.com>
 <LNBBLJKPBEHFEDALKOLCKEFOKNAA.tim.one@home.com>
Message-ID: <15179.14851.485783.763990@beluga.mojam.com>

    >>          Python 1.6   Python 2.1   change
    >> pass           0.12         0.20    1.67x
    >> x = 1          0.17         0.30    1.76x
    >> x = ``1`+`2``  1.60         2.13    1.33x

    Tim> Please don't post stuff with hard tab characters (I took them out
    Tim> by hand so this wasn't an unreadable mess).

Damn!  I meant to run untabify before posting (I knew you'd bitch about hard
tabs ;-) but then I went and forgot.  I just added an untab hook to my Emacs
mail-send-hooks, so this shouldn't happen in the future.

    Tim> The loop speedup in 2.2 requires changes in the PVM as well as
    Tim> adopting the iterator protocol.  If you've got some *easy*
    Tim> performance improvements, sure, but then I have to wonder why
    Tim> you've been holding them back <wink>.

I've not been holding anything back.  Like I said, I don't know what made me
take the 10 minutes right then to whip up a couple trivial benchmarks.
(Bored, I guess.)  I'll try playing around to see what I can dig up.

Skip


From esr@snark.thyrsus.com  Tue Jul 10 18:40:30 2001
From: esr@snark.thyrsus.com (Eric S. Raymond)
Date: Tue, 10 Jul 2001 13:40:30 -0400
Subject: [Python-Dev] Leading with XML-RPC
Message-ID: <200107101740.f6AHeUC21223@snark.thyrsus.com>

I just got off the phone with Dave Winer, the designer of the Frontier 
scripting language.  Dave is concerned that the open-source
community's response to Microsoft's .NET and Hailstorm proposals
isn't active enough; he views Miguel de Icaza's MONO proposal as a good
thing but essentially playing catch-up with a Microsoft-defined
standard.  

Dave suggests that the open-source community can turn up the heat on
Microsoft by visibly supporting and promoting open RPC standards that
compete with .NET, such as XML-RPC and SOAP 1.1.  He thinks that the
implementors of scripting languages like Perl and Python are in a
particularly good position to make this happen, by making XML-RPC
and/or SOAP 1.1 fully documented parts of their standard libraries.  

I agree with both parts of Dave's assessment, and am willing to put my
own effort into making it happen by doing some of the integration work.

Therefore the concrete proposal: we should make XML-RPC support in the
Python standard library a goal for 2.2.  I'd like to see votes and/or
a BDFL pronouncement on this goal.

I've copied Fredrik Lundh and Eric Kidd, the implementors of two
XML-RPC implementations that might serve.  Dave (who designed XML-RPC)
likes them both.  I hope they'll report on which, if either, they
consider production-ready for integration with Python.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Idealism is the noble toga that political gentlemen drape over their
will to power.
	-- Aldous Huxley 


From Petra_Recter@prenhall.com  Tue Jul 10 18:28:33 2001
From: Petra_Recter@prenhall.com (Petra_Recter@prenhall.com)
Date: 10 Jul 2001 13:28:33 -0400
Subject: [Python-Dev] Publisher seeking technical  reviewers for books on Python
Message-ID: <"/GUID:Qzwpffjx11RGZOABgCI2PYQ*/G=Petra/S=Recter/OU=exchange/O=pearsontc/PRMD=pearson/ADMD=telemail/C=us/"@MHS>

Prentice Hall, a leading college publisher, is seeking knowledgeable Python programmers to review chapters from technical computer science books.  The chapters are posted on an FTP site and reviewers are asked to download the chapters, print them out and make comments on the hard copy.  The requested turnaround time is approximately 3 days per chapter.  The token honorarium we are offering is $75 per chapter reviewed.

If you are interested, please contact Petra Recter (petra_recter@prenhall.com) and include your resume.

Thanks,

Petra

Petra Recter
Senior Acquisitions Editor, Computer Science
Prentice Hall
One Lake Street - #3F66
Upper Saddle River, NJ 07458

Email: petra_recter@prenhall.com
Tel: (201) 236-7186       Fax: (201) 236-7170



From skip@pobox.com (Skip Montanaro)  Tue Jul 10 18:52:27 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 10 Jul 2001 12:52:27 -0500
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <200107101740.f6AHeUC21223@snark.thyrsus.com>
References: <200107101740.f6AHeUC21223@snark.thyrsus.com>
Message-ID: <15179.16603.865875.309079@beluga.mojam.com>

    Eric> Therefore the concrete proposal: we should make XML-RPC support in
    Eric> the Python standard library a goal for 2.2.  I'd like to see votes
    Eric> and/or a BDFL pronouncement on this goal.

+1 from me.  I use a slightly doctored version of /F's 0.9.8 version of
xmlrpclib (current version is, I think, 0.9.9).  Perhaps inclusion in the
Python core would be a good reason to bump the version number to 1.0.  

The only potential problem I see is that nagging "gotta be ASCII for
interoperability" bug up Dave W's butt.  It flies in the face of attempts to
make Python more Unicode-friendly.

Skip


From guido@digicool.com  Tue Jul 10 19:10:59 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 10 Jul 2001 14:10:59 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: Your message of "Tue, 10 Jul 2001 13:40:30 EDT."
 <200107101740.f6AHeUC21223@snark.thyrsus.com>
References: <200107101740.f6AHeUC21223@snark.thyrsus.com>
Message-ID: <200107101811.f6AIAx312199@odiug.digicool.com>

> Therefore the concrete proposal: we should make XML-RPC support in the
> Python standard library a goal for 2.2.  I'd like to see votes and/or
> a BDFL pronouncement on this goal.
> 
> I've copied Fredrik Lundh and Eric Kidd, the implementors of two
> XML-RPC implementations that might serve.  Dave (who designed XML-RPC)
> likes them both.  I hope they'll report on which, if either, they
> consider production-ready for integration with Python.

Fredrik Lundh's xmlrpclib.py looks ready for the Python standard
library, if Fredrik agrees.  The license is right.  I'm not sure but I
believe that Eric Kidd's version is C or C++ code that *could* be
linked into Python?  This seems less attractive because there will
always have to be a separate distribution (for non-Python targets).

But maybe the motivation is wrong.  We should decide to include (or
not to include) xml-rpc based on a user need, not based on political
motives.  There may be a user need; Fredrik, do you know how popular
your xmlrpc module is?

Technical issues: should the server stubs also be included?  It might
benefit from also including the sgmlop.c extension.
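
To make the client/server split concrete, here is a small sketch of the
XML-RPC wire format with no network involved.  It uses xmlrpc.client, the
Python 3 standard-library descendant of xmlrpclib; the METHODS table and
the handle() function are hypothetical stand-ins for a real server stub,
not part of any library.

```python
import xmlrpc.client as xmlrpclib


def add(a, b):
    return a + b


# Hypothetical dispatch table standing in for a real server stub.
METHODS = {"add": add}


def handle(request_xml):
    """Decode a methodCall document, run the method, encode a methodResponse."""
    params, method = xmlrpclib.loads(request_xml)
    result = METHODS[method](*params)
    # A response carries exactly one value, passed as a 1-tuple.
    return xmlrpclib.dumps((result,), methodresponse=True)


# Client side: marshal the call, hand it to the "server", unmarshal the reply.
request = xmlrpclib.dumps((2, 3), methodname="add")
(reply,), _ = xmlrpclib.loads(handle(request))

if __name__ == "__main__":
    print(request)
    print(reply)
```

A real server stub would wrap handle() in an HTTP POST handler; the
marshalling shown here is the part the client library already has to do.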

--Guido van Rossum (home page: http://www.python.org/~guido/)


From eric.kidd@pobox.com  Tue Jul 10 19:12:36 2001
From: eric.kidd@pobox.com (Eric Kidd)
Date: Tue, 10 Jul 2001 14:12:36 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <15179.16603.865875.309079@beluga.mojam.com>; from skip@pobox.com on Tue, Jul 10, 2001 at 12:52:27PM -0500
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com>
Message-ID: <20010710141236.C9416@h00104b370897.ne.mediaone.net>

On Tue, Jul 10, 2001 at 12:52:27PM -0500, Skip Montanaro wrote:
> 
>     Eric> Therefore the concrete proposal: we should make XML-RPC support in
>     Eric> the Python standard library a goal for 2.2.  I'd like to see votes
>     Eric> and/or a BDFL pronouncement on this goal.
> 
> +1 from me.  I use a slightly doctored version of /F's 0.9.8 version of
> xmlrpclib (current version is, I think, 0.9.9).  Perhaps inclusion in the
> Python core would be a good reason to bump the version number to 1.0.  

I recommend using /F's library, too--it's just a tiny snippet of native
Python code.

My library, although nice, is intended for C programmers, and needlessly
duplicates a lot of Python functionality.  It has its own data model (based
on Python's), UTF-8 processing (based on Python's), structure builder
(based on Python's), and so on.  You get the picture.

> The only potential problem I see is that nagging "gotta be ASCII for
> interoperability" bug up Dave W's butt.  It flies in the face of attempts to
> make Python more Unicode-friendly.

Fredrik's library supports Unicode.  My library supports Unicode.  The Java
libraries all support Unicode.  And furthermore, we all appear to have
interop.
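
As a quick check of the Unicode point, a sketch using xmlrpc.client (the
Python 3 standard-library descendant of xmlrpclib).  The sample string and
the "echo" method name are arbitrary choices for the example.

```python
import xmlrpc.client as xmlrpclib

# Non-ASCII sample text, written with escapes so the source stays ASCII.
text = "Gr\u00fc\u00dfe, \u4e16\u754c"

# Marshal the string into a methodCall document and read it back.
wire = xmlrpclib.dumps((text,), methodname="echo")
(decoded,), method = xmlrpclib.loads(wire)

if __name__ == "__main__":
    print(decoded == text, method)
```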

On a related note: XML-RPC is easy to implement, but a bit of a niche.  SOAP,
on the other hand, is hard to implement but widely used.  But there's a third
option--"SOAP BDG" ("Busy Developer's Guide").

Dave Winer and his employees prepared a short summary of the SOAP
specification--leaving out many of the vaguer features--and convinced many
people to support this feature set.  So if you read the SOAP BDG paper and
implement it, you can interoperate with many, many commercial SOAP stacks.

So either XML-RPC or SOAP BDG would be good strategic options.

Cheers,
Eric


From guido@digicool.com  Tue Jul 10 19:16:51 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 10 Jul 2001 14:16:51 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: Your message of "Tue, 10 Jul 2001 14:12:36 EDT."
 <20010710141236.C9416@h00104b370897.ne.mediaone.net>
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com>
 <20010710141236.C9416@h00104b370897.ne.mediaone.net>
Message-ID: <200107101817.f6AIGtr12278@odiug.digicool.com>

> So either XML-RPC or SOAP BDG would be good strategic options.

Or both?

And how does WebDAV fit in this picture?  That's another open protocol
that Python could easily support out of the box.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Tue Jul 10 19:19:26 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 10 Jul 2001 14:19:26 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <15179.16603.865875.309079@beluga.mojam.com>; from skip@pobox.com on Tue, Jul 10, 2001 at 12:52:27PM -0500
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com>
Message-ID: <20010710141926.E2528@ute.cnri.reston.va.us>

On Tue, Jul 10, 2001 at 12:52:27PM -0500, Skip Montanaro wrote:
>    Eric> Therefore the concrete proposal: we should make XML-RPC support in
>    Eric> the Python standard library a goal for 2.2.  I'd like to see votes
>    Eric> and/or a BDFL pronouncement on this goal.

+0, I think.  Having the module available might lead people to make
more services available through XML-RPC.  My misgiving is that XML-RPC
is pretty limited, the lack of support for None being particularly
painful to a Python programmer.  Perhaps, if we can only have one,
SOAP would be better, but I haven't used SOAP seriously for anything
yet.
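
The None limitation is easy to see with xmlrpc.client, the Python 3
standard-library descendant of xmlrpclib.  This sketch assumes that
module; its allow_none flag enables a <nil/> extension that is not part of
the base XML-RPC spec, so peers written in other languages may not accept
it.

```python
import xmlrpc.client as xmlrpclib

# By default the marshaller refuses None, since XML-RPC has no null type.
try:
    xmlrpclib.dumps((None,))
    strict_ok = True
except TypeError:
    strict_ok = False

# With allow_none=True the value goes over the wire as <nil/>.
wire = xmlrpclib.dumps((None,), allow_none=True)
(value,), _ = xmlrpclib.loads(wire)

if __name__ == "__main__":
    print(strict_ok)
    print(wire)
    print(value)
```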

--amk


From esr@thyrsus.com  Tue Jul 10 19:47:35 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 10 Jul 2001 14:47:35 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <20010710141236.C9416@h00104b370897.ne.mediaone.net>; from eric.kidd@pobox.com on Tue, Jul 10, 2001 at 02:12:36PM -0400
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net>
Message-ID: <20010710144735.A22087@thyrsus.com>

Eric Kidd <eric.kidd@pobox.com>:
> On a related note: XML-RPC is easy to implement, but a bit of a niche.  SOAP,
> on the other hand, is hard to implement but widely used.  But there's a third
> option--"SOAP BDG" ("Busy Developer's Guide").
> 
> Dave Winer and his employees prepared a short summary of the SOAP
> specification--leaving out many of the vaguer features--and convinced many
> people to support this feature set.  So if you read the SOAP BDG paper and
> implement it, you can interoperate with many, many commercial SOAP stacks.
> 
> So either XML-RPC or SOAP BDG would be good strategic options.

Are there, as yet, any SOAP-BDG implementations we could use?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

He that would make his own liberty secure must guard even his enemy from
oppression: for if he violates this duty, he establishes a precedent that
will reach unto himself.
	-- Thomas Paine


From esr@thyrsus.com  Tue Jul 10 19:56:32 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 10 Jul 2001 14:56:32 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <200107101811.f6AIAx312199@odiug.digicool.com>; from guido@digicool.com on Tue, Jul 10, 2001 at 02:10:59PM -0400
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <200107101811.f6AIAx312199@odiug.digicool.com>
Message-ID: <20010710145632.C22087@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> Fredrik Lundh's xmlrpclib.py looks ready for the Python standard
> library, if Fredrik agrees.  The license is right.

Eric Kidd agrees, but Fredrik has not checked in yet.

> But maybe the motivation is wrong.  We should decide to include (or
> not to include) xml-rpc based on a user need, not based on political
> motives.  There may be a user need; Fredrik, do you know how popular
> your xmlrpc module is?

There are good political reasons and bad political reasons.  

I think helping promote an open and well-designed RPC standard is a
good political reason. And XML-RPC is very good work; I wouldn't be
pushing it if I hadn't evaluated it myself and liked it a lot.  

One other thing that makes Python support particularly appropriate is
that Zope objects are XML-RPC accessible (or so I'm told; I have not
tried this myself yet).

> Technical issues: should the server stubs also be included?  It might
> benefit from also including the sgmlop.c extension.

I would say yes to both.  The code is there and it's tested.  I'm willing
to merge in the documentation.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The danger (where there is any) from armed citizens, is only to the
*government*, not to *society*; and as long as they have nothing to
revenge in the government (which they cannot have while it is in their
own hands) there are many advantages in their being accustomed to the 
use of arms, and no possible disadvantage.
        -- Joel Barlow, "Advice to the Privileged Orders", 1792-93


From eric.kidd@pobox.com  Tue Jul 10 20:00:48 2001
From: eric.kidd@pobox.com (Eric Kidd)
Date: Tue, 10 Jul 2001 15:00:48 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <200107101811.f6AIAx312199@odiug.digicool.com>; from guido@digicool.com on Tue, Jul 10, 2001 at 02:10:59PM -0400
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <200107101811.f6AIAx312199@odiug.digicool.com>
Message-ID: <20010710150048.D9416@h00104b370897.ne.mediaone.net>

On Tue, Jul 10, 2001 at 02:10:59PM -0400, Guido van Rossum wrote:
> Fredrik Lundh's xmlrpclib.py looks ready for the Python standard
> library, if Fredrik agrees.  The license is right.  I'm not sure but I
> believe that Eric Kidd's version is C or C++ code that *could* be
> linked into Python?  This seems less attractive because there will
> always have to be a separate distribution (for non-Python targets).

I recommend using /F's library.  It's less than a thousand lines of nice,
clean Python, and it doesn't duplicate any code in the Python core.

My library is quite a bit faster, but it contains lots of C code which
duplicates Python features.

The right solution is to use /F's code.  And if his code isn't fast enough,
small sections can be rewritten in C without breaking the API.
 
> But maybe the motivation is wrong.  We should decide to include (or
> not to include) xml-rpc based on a user need, not based on political
> motives.  There may be a user need; Fredrik, do you know how popular
> your xmlrpc module is?

Moderately popular, AFAIK--it's currently bundled with Zope, and it's one
of the nicest XML-RPC libraries out there.

I've actually used Fredrik's library in more projects than my own.  This is
probably because I'd rather program in Python than C. :-)

Cheers,
Eric



From skip@pobox.com (Skip Montanaro)  Tue Jul 10 20:43:07 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 10 Jul 2001 14:43:07 -0500
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <20010710144735.A22087@thyrsus.com>
References: <200107101740.f6AHeUC21223@snark.thyrsus.com>
 <15179.16603.865875.309079@beluga.mojam.com>
 <20010710141236.C9416@h00104b370897.ne.mediaone.net>
 <20010710144735.A22087@thyrsus.com>
Message-ID: <15179.23243.253539.305609@beluga.mojam.com>

    Eric> Are there, as yet, any SOAP-BDG implementations we could use?

There was a SOAP.py module announced for Python recently:

    http://groups.google.com/groups?q=SOAP.py&hl=en&safe=off&rnum=2&ic=1&selm=mailman.990088321.2387.clpa-moderators%40python.org
    http://www.actzero.com/soap/SOAPpy.html

I can't get to the download page at the moment though, so I can't tell where
it falls on the spectrum between SOAP-BDG and SOAP.

In my opinion supporting both XML-RPC and SOAP in the core library would be
a good thing.  It's sort of like PIL supporting both GIF and JPEG image
files.  Both have their uses.

Skip


From esr@thyrsus.com  Tue Jul 10 20:52:46 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 10 Jul 2001 15:52:46 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <15179.23243.253539.305609@beluga.mojam.com>; from skip@pobox.com on Tue, Jul 10, 2001 at 02:43:07PM -0500
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com>
Message-ID: <20010710155246.B23638@thyrsus.com>

Skip Montanaro <skip@pobox.com>:
> In my opinion supporting both XML-RPC and SOAP in the core library would be
> a good thing.  It's sort of like PIL supporting both GIF and JPEG image
> files.  Both have their uses.

+1.  I think supporting XML-RPC is close to being a no-brainer at this point.
What to do about SOAP is a less trivial question.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Among the many misdeeds of British rule in India, history will look
upon the Act depriving a whole nation of arms as the blackest."
        -- Mohandas Gandhi, An Autobiography, pg 446


From paulp@ActiveState.com  Tue Jul 10 20:56:29 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 10 Jul 2001 12:56:29 -0700
Subject: [Python-Dev] Leading with XML-RPC
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> <20010710155246.B23638@thyrsus.com>
Message-ID: <3B4B5DED.9AC9E280@ActiveState.com>

I agree that XML-RPC is a no-brainer. I think it is too early for SOAP.
We need to wait for real interop to shake out before we commit to a SOAP
library. I don't see why waiting for SOAP should in any way dissuade us
from putting in XML-RPC. They are different protocols used by different
people in different projects, like POP and IMAP.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From DavidA@ActiveState.com  Tue Jul 10 21:55:30 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 10 Jul 2001 13:55:30 -0700
Subject: [Python-Dev] Leading with XML-RPC
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> <20010710155246.B23638@thyrsus.com>
Message-ID: <3B4B6BC2.74F4984D@ActiveState.com>

"Eric S. Raymond" wrote:
> 
> Skip Montanaro <skip@pobox.com>:
> > In my opinion supporting both XML-RPC and SOAP in the core library would be
> > a good thing.  It's sort of like PIL supporting both GIF and JPEG image
> > files.  Both have their uses.
> 
> +1.  I think supporting XML-RPC is close to being a no-brainer at this point.
> What to do about SOAP is a less trivial question.

FYI: We use a slightly doctored version of /F's xmlrpclib, IIRC, as well
as a slightly doctored version of /F's SOAP library.

I'm +1 on adding both of those to the library, so that we can get rid of
these various 'slight doctorings' =).

I'm +1 on adding good DAV support as well, although I think that that
will have to be through the addition of Neon.  Greg's davlib.py isn't
really industrial-strength, from what Greg tells me (e.g. no support for
authentication), and I don't think Greg is spending much time on it. 
Alas, Neon is C code, and still in flux.  Greg will speak up whenever he
resurfaces =).  If we had 'stubs' like Tcl, we could ship a Neon wrapper
w/o Neon, which would be good.  But we don't. =)

Documentation is probably the bigger problem, though, as usual.

However, as much as I like XML-RPC and SOAP and WebDAV, I don't know
that adding support for these protocols will have much impact on the
folks that are being exposed to the .NET story.  SOAP, especially, works
well with .NET, rather than competing with it.  I don't think anyone
wants to set up an "XML-RPC vs. SOAP" war, that'd be pretty pointless.

-- David Ascher

As a PS, I'm all for adding support for these protocols to Python, but I
don't see the relationship to 'turning up the heat on Microsoft'.  I'd
think there would be better ways of doing so, should you be so inclined
=).  [After reading Dave's piece on Mono, I understand why he thinks
that would 'work' -- but now I think he underestimates the scope and
depth of .NET -- adding interop through SOAP does not make the
alternatives competitive -- see the discussion on language-dev].


From esr@thyrsus.com  Tue Jul 10 22:10:08 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 10 Jul 2001 17:10:08 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <3B4B6BC2.74F4984D@ActiveState.com>; from DavidA@ActiveState.com on Tue, Jul 10, 2001 at 01:55:30PM -0700
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> <20010710155246.B23638@thyrsus.com> <3B4B6BC2.74F4984D@ActiveState.com>
Message-ID: <20010710171008.A31430@thyrsus.com>

David Ascher <DavidA@ActiveState.com>:
> Documentation is probably the bigger problem, though, as usual.

I'm willing to put some personal elbow grease into solving that problem.
 
> However, as much as I like XML-RPC and SOAP and WebDAV, I don't know
> that adding support for these protocols will have much impact on the
> folks that are being exposed to the .NET story.  SOAP, especially, works
> well with .NET, rather than competing with it.  I don't think anyone
> wants to set up an "XML-RPC vs. SOAP" war, that'd be pretty pointless.

Dave's theory (which I agree with) is that the open-source community as
a whole can get ahead of Microsoft in things like identity services -- 
*if* there is uniform support for XML-RPC or SOAP in our development tools.

He approached me because he thought I'd be able to get something moving
in the Python world.  He'll be talking with other people about Perl.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Those who make peaceful revolution impossible 
will make violent revolution inevitable."
	-- John F. Kennedy


From tim@digicool.com  Tue Jul 10 22:26:13 2001
From: tim@digicool.com (Tim Peters)
Date: Tue, 10 Jul 2001 17:26:13 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15179.14851.485783.763990@beluga.mojam.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com>

Here are results on Win2K.  In part it just confirms that
xrange(large_number) is a poor way to drive benchmarks (the overhead of
creating and destroying gagilliblobs of unused integers is no help; OTOH,
2.0 certainly appears to be speedier at creating and destroying gagilliblobs
of useless integers!  a common cause for slowdowns of that nature is
ill-considered "special case" optimizations that turn out to cost more than
they save, although I have no particular reason to suspect that here).

Note that Windows Python has an excellent clock() function (it's real time,
not user time, and has better than microsecond resolution).

File skip.py:

N = 100000
TRIPS = 3

if 0:
    indices = xrange(N)   # common but ill-advised
else:
    indices = [None] * N  # better

def t1():
    for i in indices: pass

def t2():
    for i in indices: x = 1

def t3():
    for i in indices: x = ``1`+`2``


def timeit(f):
    from time import clock
    start = clock()
    f()
    finish = clock()
    return finish - start

for f, tag in (t1, "pass"), (t2, "x=1"), (t3, "x=``1`+`2``"):
    print "%-12s" % tag,
    # Warm up.
    f(); f(); f()
    for i in range(TRIPS):
        elapsed = timeit(f)
        print "%6.3f" % elapsed,
    print

"""
Results:

With

    indices = xrange(N)

C:\Code>\Python20\python.exe skip.py
pass          0.038  0.038  0.039
x=1           0.049  0.049  0.049
x=``1`+`2``   0.421  0.420  0.421

C:\Code>\Python21\python.exe skip.py
pass          0.042  0.042  0.042
x=1           0.053  0.053  0.053
x=``1`+`2``   0.456  0.456  0.455

C:\Code>python\dist\src\PCbuild\python skip.py # CVS
pass          0.040  0.039  0.039
x=1           0.050  0.051  0.050
x=``1`+`2``   0.449  0.452  0.452


With

    indices = [None] * N

instead:

C:\Code>\Python20\python.exe skip.py
pass          0.035  0.034  0.034
x=1           0.046  0.046  0.046
x=``1`+`2``   0.414  0.413  0.413

C:\Code>\Python21\python.exe skip.py
pass          0.037  0.037  0.037
x=1           0.048  0.048  0.048
x=``1`+`2``   0.451  0.448  0.453

C:\Code>python\dist\src\PCbuild\python skip.py # CVS
pass          0.031  0.030  0.031
x=1           0.041  0.042  0.041
x=``1`+`2``   0.438  0.447  0.444
"""
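
For anyone repeating this on a current interpreter, a rough equivalent of
skip.py in Python 3 might look like the sketch below.  It assumes the
standard timeit module and replaces the backtick expressions with repr(),
since backticks no longer exist; the helper name best_of is made up.  It
is a sketch, not a claim about how the numbers above were produced.

```python
import timeit

N = 100000
indices = [None] * N   # a prebuilt list, so no integers are created per trip


def t1():
    for i in indices:
        pass


def t2():
    for i in indices:
        x = 1


def t3():
    for i in indices:
        # Spelled ``1`+`2`` in Python 2: repr of the concatenated reprs.
        x = repr(repr(1) + repr(2))


def best_of(f, trips=3):
    """Call f once per trip and return the fastest elapsed time in seconds."""
    return min(timeit.repeat(f, number=1, repeat=trips))


if __name__ == "__main__":
    for f, tag in (t1, "pass"), (t2, "x=1"), (t3, "x=repr(...)"):
        print("%-12s %6.3f" % (tag, best_of(f)))
```

Taking the minimum over a few trips is one common way to damp noise from
other processes; the original script prints every trip instead.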



From skip@pobox.com (Skip Montanaro)  Tue Jul 10 23:27:10 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 10 Jul 2001 17:27:10 -0500
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com>
References: <15179.14851.485783.763990@beluga.mojam.com>
 <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com>
Message-ID: <15179.33086.641542.664190@beluga.mojam.com>

One thing that occurs to me as I rebuild 1.6 is that it would be real nice
to be able to query the interpreter for the compilation flags at runtime so
I could be more certain I was comparing apples and apples.  In my case, the
last time I built 1.6 was June 2000, so I have no idea what my compilation
flags were.  I can tell by the startup message that it was compiled with gcc
2.95.3, but not what optimization flags were used.
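
On a current Python 3 something along these lines is possible through the
standard sysconfig module.  A minimal sketch: the helper name build_flags
is made up, and the variable names queried (CC, OPT, CFLAGS) come from the
build's Makefile, so on platforms that don't build with one (Windows, for
instance) the values may simply be None.

```python
import sys
import sysconfig


def build_flags():
    """Collect a few build-time settings; a value is None if the build did not record it."""
    names = ("CC", "OPT", "CFLAGS")
    return {name: sysconfig.get_config_var(name) for name in names}


if __name__ == "__main__":
    print(sys.version)
    for name, value in build_flags().items():
        print("%-8s %r" % (name, value))
```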

Skip


From skip@pobox.com (Skip Montanaro)  Tue Jul 10 23:58:04 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 10 Jul 2001 17:58:04 -0500
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com>
References: <15179.14851.485783.763990@beluga.mojam.com>
 <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com>
Message-ID: <15179.34940.785035.23896@beluga.mojam.com>

    Tim> Note that Windows Python has an excellent clock() function (it's
    Tim> real time, not user time, and has better than microsecond
    Tim> resolution).

Real time doesn't mean much on an operating system that can juggle multiple
tasks, no matter how quiescent you try to make it.

Okay, so here are some hopefully more comparable numbers.  I cvs up'd both
Python 1.6 (release16 tag) and Python 2.1 (release21-maint tag) directories,
reconfigured, executed make clean, then ran make and make install.  The
optimization/debug flags were the default: "-g -O2".  Both were compiled
with gcc 2.96.0.48mdk, the version of gcc that comes with Linux Mandrake
8.0.

Using the xrange(N) version:

    % python1.6 -S skip.py
    pass          0.090  0.090  0.090
    x=1           0.110  0.120  0.120
    x=``1`+`2``   1.080  1.070  1.060

    % python2.1 -S skip.py
    pass          0.090  0.100  0.090
    x=1           0.110  0.120  0.110
    x=``1`+`2``   1.700  1.680  1.700

Using the [None]*N version:

    % python1.6 -S skip.py
    pass          0.070  0.070  0.080
    x=1           0.100  0.110  0.100
    x=``1`+`2``   1.040  1.030  1.040

    % python2.1 -S skip.py
    pass          0.070  0.080  0.070
    x=1           0.110  0.100  0.100
    x=``1`+`2``   1.680  1.690  1.690

So, my observations about loop overhead were almost certainly artifacts of
differences in the way the two interpreters were compiled.  My apologies for
that flub.  It still appears there's a big slowdown between 1.6 and 2.1 in
the backtick operations though.

Aside: Tim, can I assume by your return address that Digital Creations
finally gave you an office and you're not computing from some seedy motel
room on US 1? ;-)

-- 
Skip Montanaro (skip@pobox.com)
(847)971-7098


From thomas@xs4all.net  Wed Jul 11 00:04:57 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 11 Jul 2001 01:04:57 +0200
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15179.34940.785035.23896@beluga.mojam.com>
References: <15179.14851.485783.763990@beluga.mojam.com> <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com> <15179.34940.785035.23896@beluga.mojam.com>
Message-ID: <20010711010456.D8098@xs4all.nl>

On Tue, Jul 10, 2001 at 05:58:04PM -0500, Skip Montanaro wrote:

> Okay, so here are some hopefully more comparable numbers.  I cvs up'd both
> Python 1.6 (release16 tag) and Python 2.1 (release21-maint tag) directories,

Wrong tag; you're testing the 2.1.1 branch, not the 2.1 release. 2.1.1
contains at least one small performance optimization of which the impact has
not been determined :)

Feel-free-to-determine-though-ly y'rs,
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From skip@pobox.com (Skip Montanaro)  Wed Jul 11 00:28:44 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 10 Jul 2001 18:28:44 -0500
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <20010711010456.D8098@xs4all.nl>
References: <15179.14851.485783.763990@beluga.mojam.com>
 <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com>
 <15179.34940.785035.23896@beluga.mojam.com>
 <20010711010456.D8098@xs4all.nl>
Message-ID: <15179.36780.675079.11961@beluga.mojam.com>

    >> Okay, so here are some hopefully more comparable numbers.  I cvs up'd
    >> both Python 1.6 (release16 tag) and Python 2.1 (release21-maint tag)
    >> directories,

    Thomas> Wrong tag; you're testing the 2.1.1 branch, not the 2.1
    Thomas> release. 2.1.1 contains at least one small performance
    Thomas> optimization of which the impact has not been determined :)

I thought the whole idea of the dot dot releases was that they were supposed
to just be bug fixes. 

Skip


From guido@digicool.com  Wed Jul 11 00:41:56 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 10 Jul 2001 19:41:56 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: Your message of "Tue, 10 Jul 2001 17:27:10 CDT."
 <15179.33086.641542.664190@beluga.mojam.com>
References: <15179.14851.485783.763990@beluga.mojam.com> <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com>
 <15179.33086.641542.664190@beluga.mojam.com>
Message-ID: <200107102341.f6ANfvL12944@odiug.digicool.com>

> One thing that occurs to me as I rebuild 1.6 is that it would be real nice
> to be able to query the interpreter for the compilation flags at runtime so
> I could be more certain I was comparing apples and apples.  In my case, the
> last time I built 1.6 was June 2000, so I have no idea what my compilation
> flags were.  I can tell by the startup message that it was compiled with gcc
> 2.95.3, but not what optimization flags were used.

If you did a full "make install" then and the results are still
around, look in <prefix>/lib/python1.6/config/Makefile .

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Wed Jul 11 02:44:26 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 10 Jul 2001 21:44:26 -0400 (EDT)
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15179.33086.641542.664190@beluga.mojam.com>
References: <15179.14851.485783.763990@beluga.mojam.com>
 <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com>
 <15179.33086.641542.664190@beluga.mojam.com>
 <200107102341.f6ANfvL12944@odiug.digicool.com>
Message-ID: <15179.44922.465917.740723@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > One thing that occurs to me as I rebuild 1.6 is that it would be real nice
 > to be able to query the interpreter for the compilation flags at runtime so
 > I could be more certain I was comparing apples and apples.  In my case, the

Guido van Rossum writes:
 > If you did a full "make install" then and the results are still
 > around, look in <prefix>/lib/python1.6/config/Makefile .

  And this can all be extracted from Python without having to delve
into obscure installed files, as well.  ;-)

For Python 1.6:

    >>> import distutils.sysconfig
    >>> distutils.sysconfig.OPT
    '-g -O2'

For Python 2.0 and newer:

    >>> import distutils.sysconfig
    >>> distutils.sysconfig.get_config_var('OPT')
    '-g -O2 -Wall -Wstrict-prototypes -fPIC'
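
On a current Python 3 the same lookup is usually spelled through the
top-level sysconfig module, since distutils has been removed from recent
releases.  A minimal sketch, assuming a Python 3 interpreter (not one of the
versions discussed in this thread); build_flags is a hypothetical helper
name, and names that were not recorded for a given build (common on
Windows) come back as None:

```python
import sysconfig


def build_flags(names=("CC", "OPT", "CFLAGS")):
    """Return the recorded build-time value for each name (None if absent)."""
    # get_config_var() returns None for variables the build did not record.
    return {name: sysconfig.get_config_var(name) for name in names}


if __name__ == "__main__":
    for name, value in build_flags().items():
        print("%-8s %r" % (name, value))
```

Running this under each interpreter being compared shows whether the two
builds really used the same optimization flags.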


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From skip@pobox.com (Skip Montanaro)  Wed Jul 11 03:30:39 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 10 Jul 2001 21:30:39 -0500
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <200107102341.f6ANfvL12944@odiug.digicool.com>
References: <15179.14851.485783.763990@beluga.mojam.com>
 <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com>
 <15179.33086.641542.664190@beluga.mojam.com>
 <200107102341.f6ANfvL12944@odiug.digicool.com>
Message-ID: <15179.47695.555421.793300@beluga.mojam.com>

    >> I can tell by the startup message that it was compiled with gcc
    >> 2.95.3, but not what optimization flags were used.

    Guido> If you did a full "make install" then and the results are still
    Guido> around, look in <prefix>/lib/python1.6/config/Makefile .

Thanks for the tip.  Since I just rebuilt 1.6 this evening I wiped out
whatever was there from last June.  Still, I now have that information at my
fingertips:

    def getbuildinfo():
        # Parse "NAME = value" assignments out of the installed Makefile.
        # sys.version_info requires Python 2.0 or newer.
        import sys, re
        makefile = "%s/lib/python%d.%d/config/Makefile" % \
                   (sys.prefix, sys.version_info[0], sys.version_info[1])
        # Raw string so the \s escapes reach the regex engine untouched.
        pat = re.compile(r"^([A-Z_]+)\s*=\s*(.*)")
        f = open(makefile)
        lines = f.readlines()
        f.close()
        opts = {}
        for line in lines:
            mat = pat.match(line)
            if mat:
                opts[mat.group(1)] = mat.group(2).strip()
        return opts

Skip



From skip@pobox.com (Skip Montanaro)  Wed Jul 11 03:35:13 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 10 Jul 2001 21:35:13 -0500
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15179.44922.465917.740723@cj42289-a.reston1.va.home.com>
References: <15179.14851.485783.763990@beluga.mojam.com>
 <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com>
 <15179.33086.641542.664190@beluga.mojam.com>
 <200107102341.f6ANfvL12944@odiug.digicool.com>
 <15179.44922.465917.740723@cj42289-a.reston1.va.home.com>
Message-ID: <15179.47969.50003.441171@beluga.mojam.com>

    Fred> And this can all be extracted from Python without having to delve
    Fred> into obscure installed files, as well.  ;-)

Dang!  Wasted another five minutes...

Skip


From tim@digicool.com  Wed Jul 11 06:05:52 2001
From: tim@digicool.com (Tim Peters)
Date: Wed, 11 Jul 2001 01:05:52 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15179.34940.785035.23896@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEJAKNAA.tim@digicool.com>

[Skip Montanaro]
> Real time doesn't mean much on an operating system that can
> juggle multiple tasks, no matter how quiescent you try to make it.

If this made any difference in the results I reported, they wouldn't have
been reproducible to nearly 3 significant digits.  That's why I printed the
times for 3 runs of each -- you can trust that I know what I'm doing here.
These things run for a fraction of a second each, the machine was as quiet
as possible, and the output showed no cause for suspicion (indeed, I threw
out a few runs where one of the three numbers was 10x larger than the other
two -- *that's* how you know you got socked by a background task, provided
you've got a sensitive timer to work with).
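
The check described above (three runs per statement, discarding a set when
one run is roughly 10x the others) can be sketched with the standard timeit
module.  This is an illustration for a current Python 3, not the skip.py
script from this thread; time_statement, looks_disturbed and the default
factor of 10 are assumptions taken from the description:

```python
import timeit


def time_statement(stmt, repeat=3, number=100000):
    """Run stmt `number` times per trial; return `repeat` elapsed times."""
    return timeit.repeat(stmt, repeat=repeat, number=number)


def looks_disturbed(times, factor=10.0):
    """True if the slowest trial is at least `factor` times the fastest."""
    return max(times) >= factor * min(times)


if __name__ == "__main__":
    for stmt in ("pass", "x=1"):
        times = time_statement(stmt)
        flag = "  (rerun: possible background task)" if looks_disturbed(times) else ""
        print("%-12s" % stmt, "  ".join("%.3f" % t for t in times) + flag)
```

Trials that agree closely with one another are the evidence that a
background task did not interfere; a single large outlier is the signal to
throw the set away and rerun.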

> ...
> It still appears there's a big slowdown between 1.6 and 2.1 in
> the backtick operations though.

This assumes too much.  ``1`+`2`` triggers three reprs and a string
concatenation.  I suspect both slowed, but that the latter is the more
important hit.

First the repr(string) hit (first blob from the release20 rev of
stringobject.c):

*** 374,383 ****
  			c = op->ob_sval[i];
  			if (c == quote || c == '\\')
  				*p++ = '\\', *p++ = c;
! 			else if (c < ' ' || c >= 0177) {
! 				sprintf(p, "\\%03o", c & 0377);
! 				while (*p != '\0')
! 					p++;
  			}
  			else
  				*p++ = c;
--- 442,456 ----
  			c = op->ob_sval[i];
  			if (c == quote || c == '\\')
  				*p++ = '\\', *p++ = c;
! 			else if (c == '\t')
! 				*p++ = '\\', *p++ = 't';
! 			else if (c == '\n')
! 				*p++ = '\\', *p++ = 'n';
! 			else if (c == '\r')
! 				*p++ = '\\', *p++ = 'r';
! 			else if (c < ' ' || c >= 0x7f) {
! 				sprintf(p, "\\x%02x", c & 0xff);
!                                 p += 4;
  			}
  			else
  				*p++ = c;

"The usual" string char endures twice as many tests+branches now.

The other thing Jeremy has noted before:  string+string is slower than it
used to be, because BINARY_ADD now tries oodles of "sophisticated" ways to
coerce the operands to numbers before considering it might be asking for a
sequence catenation instead.  Given that the benchmark pastes together two
1-character strings, this overhead is overwhelming compared to the
concatenation work.
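
To see which part dominates, the statement can be timed piece by piece.  A
rough sketch, assuming a current Python 3, where backticks no longer exist
and repr() is the equivalent spelling; PIECES and time_pieces are
hypothetical names, and the numbers it prints say nothing about the 1.6 and
2.1 interpreters compared in this thread:

```python
import timeit

# Python 3 spelling of the benchmark statement x=``1`+`2`` and its parts.
PIECES = {
    "repr(int)": "repr(1)",
    "repr(str)": "repr('12')",
    "str + str": "a + b",
    "whole statement": "repr(repr(1) + repr(2))",
}


def time_pieces(number=200000, repeat=3):
    """Return the best-of-`repeat` time in seconds for each piece."""
    results = {}
    for label, stmt in PIECES.items():
        trials = timeit.repeat(stmt, setup="a = '1'; b = '2'",
                               number=number, repeat=repeat)
        results[label] = min(trials)
    return results


if __name__ == "__main__":
    for label, seconds in time_pieces().items():
        print("%-16s %.3f" % (label, seconds))
```

Comparing the "str + str" row with the "repr" rows across two interpreters
would show whether the concatenation path or the repr path accounts for
more of a slowdown.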

> Aside: Tim, can I assume by your return address that Digital Creations
> finally gave you an office and you're not computing from some seedy
> motel room on US 1? ;-)

I'm not entirely sure DC gave it to us, but there is indeed a Luxurious
PythonLabs World Headquarters now, in Falls Church, VA.  Conveniently
located atop an inaccessible hill, it overlooks the fabulous Leesburg Pike,
a stunning continuous strip mall stretching from the Potomac to the Arctic
Circle (or France, whichever is closer -- geography isn't my strong suit).

join-us-for-lunch!-we-need-the-company-ly y'rs  - tim



From skip@pobox.com (Skip Montanaro)  Wed Jul 11 07:24:32 2001
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 11 Jul 2001 01:24:32 -0500
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEJAKNAA.tim@digicool.com>
References: <15179.34940.785035.23896@beluga.mojam.com>
 <LNBBLJKPBEHFEDALKOLCIEJAKNAA.tim@digicool.com>
Message-ID: <15179.61728.255760.814673@beluga.mojam.com>

    Tim> [Skip Montanaro]
    >> Real time doesn't mean much on an operating system that can
    >> juggle multiple tasks, no matter how quiescent you try to make it.

    Tim> If this made any difference in the results I reported, they
    Tim> wouldn't have been reproducible to nearly 3 significant digits.
    Tim> That's why I printed the times for 3 runs of each -- you can trust
    Tim> that I know what I'm doing here. 

I wasn't suggesting you weren't trustworthy.  On a Linux system, wall clock
time doesn't mean much when timing processes.  I have no control over when
sendmail or any of a number of other daemons might wake up to process
something.  Hence, for my purposes in my environment, user mode time (or
user+sys when sys > 0) is more useful than elapsed time (does Windows even
distinguish between user and system time?).  That time.clock means different
things on Windows and Unix-like systems bothers me a bit.  (It would bother
me more if I had to write timing code that was portable across both Unix and
Windows.)  But that, as they say, is something left for another time.
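
On a current Python 3 the two kinds of clock have separate names, so the
distinction no longer depends on the platform: time.perf_counter() measures
elapsed (wall-clock) time and time.process_time() measures the user+system
CPU time of the current process.  A minimal sketch, assuming Python 3.3 or
newer (time.clock itself was removed in 3.8); measure is a hypothetical
helper name:

```python
import time


def measure(func):
    """Call func() once; return (elapsed_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed time, includes sleeping/waiting
    cpu_start = time.process_time()    # user+system CPU time of this process
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


if __name__ == "__main__":
    wall, cpu = measure(lambda: sum(range(1000000)))
    print("busy loop: wall %.3f  cpu %.3f" % (wall, cpu))
    wall, cpu = measure(lambda: time.sleep(0.2))
    print("sleep:     wall %.3f  cpu %.3f" % (wall, cpu))
```

For the sleep case the elapsed time is about 0.2 seconds while the CPU time
stays near zero, which is the difference between the two measurements being
discussed here.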

    Tim> The other thing Jeremy has noted before: string+string is slower
    Tim> than it used to be, because BINARY_ADD now tries oodles of
    Tim> "sophisticated" ways to coerce the operands to numbers before
    Tim> considering it might be asking for a sequence catenation instead.
    Tim> Given that the benchmark pastes together two 1-character strings,
    Tim> this overhead is overwhelming compared to the concatenation work.

I can buy that.  Wasn't there some discussion about improving this
situation?  If so, I guess I should be using the head branch of the CVS tree
instead of release21-maint.

Skip


From paulp@ActiveState.com  Wed Jul 11 09:05:00 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Wed, 11 Jul 2001 01:05:00 -0700
Subject: [Python-Dev] Silly little benchmark
References: <LNBBLJKPBEHFEDALKOLCIEJAKNAA.tim@digicool.com>
Message-ID: <3B4C08AC.BB6014DB@ActiveState.com>

Tim Peters wrote:
> 
>...
> 
> I'm not entirely sure DC gave it to us, but there is indeed a Luxurious
> PythonLabs World Headquarters now, in Falls Church, VA.  Conveniently
> located atop an inaccessible hill, it overlooks the fabulous Leesburg Pike,
> a stunning continuous strip mall stretching from the Potomac to the Arctic
> Circle (or France, whichever is closer -- geography isn't my strong suit).

When I lived in Ontario I constantly wondered how far south that mall
went! I tried to walk to the end of it once. I gave up sometime after I
crossed the line where Roots franchises were replaced with Gaps.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From fredrik@pythonware.com  Wed Jul 11 09:09:06 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 11 Jul 2001 10:09:06 +0200
Subject: [Python-Dev] Leading with XML-RPC
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> <20010710155246.B23638@thyrsus.com> <3B4B6BC2.74F4984D@ActiveState.com>
Message-ID: <018501c109e0$c345a450$4ffa42d5@hagrid>

I'm in a hurry, so here's a short version of what I think:

+1 on xmlrpclib.py in 2.2

+1 on a pure-python davlib.py in 2.2 (greg, please?)

-0 on soap support in 2.2 (it's still a moving target; a new spec draft
was released this weekend).  if we want something now, it should be
cayce ullman's SOAP.py, not my soaplib.py.  but I don't think we need
SOAP in the standard library for another year or two.

(fwiw, my current thinking is that SOAP is a flawed idea, and that the
need for SOAP will go away when people get better XML/Schema tools,
but that's another story.  and don't get me started on SOAP BDG...)

Cheers /F



From thomas@xs4all.net  Wed Jul 11 09:23:20 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 11 Jul 2001 10:23:20 +0200
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15179.36780.675079.11961@beluga.mojam.com>
References: <15179.14851.485783.763990@beluga.mojam.com> <BIEJKCLHCIOIHAGOKOLHAEHCCCAA.tim@digicool.com> <15179.34940.785035.23896@beluga.mojam.com> <20010711010456.D8098@xs4all.nl> <15179.36780.675079.11961@beluga.mojam.com>
Message-ID: <20010711102320.R32419@xs4all.nl>

On Tue, Jul 10, 2001 at 06:28:44PM -0500, Skip Montanaro wrote:

>     >> Okay, so here are some hopefully more comparable numbers.  I cvs up'd
>     >> both Python 1.6 (release16 tag) and Python 2.1 (release21-maint tag)
>     >> directories,

>     Thomas> Wrong tag; you're testing the 2.1.1 branch, not the 2.1
>     Thomas> release. 2.1.1 contains at least one small performance
>     Thomas> optimization of which the impact has not been determined :)

> I thought the whole idea of the dot dot releases was that they were supposed
> to just be bug fixes. 

They are; it just depends on what you classify as a bug. This one was a small
bug in the implementation of function calls that made calling 'normal' C
functions (without keyword arguments) from Python code a bit slower than
necessary.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From Paul.Moore@atosorigin.com  Wed Jul 11 09:32:06 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 11 Jul 2001 09:32:06 +0100
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB:
 strawman PEP)
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEDF@ukrux002.rundc.uk.origin-it.com>

From: Pete Shinners [mailto:pete@shinners.org]
> for the love of all things good, can we please make a recommendation
> in our PEP that the windows installation location be something other
> than "C:\PYTHON21"? something like "C:\PYTHON21\SITE-PACKAGES" would
> be a big improvement. i thought i heard that macpython recently made
> this "fix", why is the windows version lagging on this?

PEP 250 covers this. I have sent in the final PEP for approval, plus a
patch, but the process appears to be stalled. I guess I need to nag again.
The PEP process doesn't seem to cover non-core Python developers well (eg,
people like me who don't have a way of integrating with the Sourceforge
mechanisms...)

Paul.


From martin@strakt.com  Wed Jul 11 13:46:26 2001
From: martin@strakt.com (Martin Sjögren)
Date: Wed, 11 Jul 2001 14:46:26 +0200
Subject: [Python-Dev] Python and SSL
Message-ID: <20010711144626.A2998@strakt.com>

Hello

I'm currently in the process of developing a basic OpenSSL module for
Python. Before you say anything, yes I know about M2Crypto and its SSL
support, but for a number of reasons, it doesn't fulfill our needs.

We found the SSL support in Python to be insufficient (nonexistent :-))
for our needs.  We thus decided to write our own module.

The module is faaaar from complete as an interface to the general
cryptographic functionality of OpenSSL, but it does have basic SSL
support, including authorization using certificates, PRNG seeding
functions and an error handling system.

Since we are using Python extensively and don't have to pay for it, we
would like to reply in kind and offer the module back to the Python
project.

(This is, in case you're missing it, a hint that now that security is
the hot subject it is, it's silly for an otherwise so complete language to
lack SSL support ;-))

The whole kit (including some documentation) can be found here:
http://www.strakt.com/~martin/pyOpenSSL.tar.gz

My question is... What do I do now? Where to proceed?

Please CC me replies, since I'm (of course) not on the list.

Regards,
Martin Sjögren
AB Strakt

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 405242        Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html


From guido@digicool.com  Wed Jul 11 14:02:42 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 11 Jul 2001 09:02:42 -0400
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP)
In-Reply-To: Your message of "Wed, 11 Jul 2001 09:32:06 BST."
 <714DFA46B9BBD0119CD000805FC1F53B01B5AEDF@ukrux002.rundc.uk.origin-it.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEDF@ukrux002.rundc.uk.origin-it.com>
Message-ID: <200107111302.f6BD2gP13353@odiug.digicool.com>

> From: Pete Shinners [mailto:pete@shinners.org]
> > for the love of all things good, can we please make a recommendation
> > in our PEP that the windows installation location be something other
> > than "C:\PYTHON21"? something like "C:\PYTHON21\SITE-PACKAGES" would
> > be a big improvement. i thought i heard that macpython recently made
> > this "fix", why is the windows version lagging on this?

[Paul Moore]
> PEP 250 covers this. I have sent in the final PEP for approval, plus a
> patch, but the process appears to be stalled. I guess I need to nag again.
> The PEP process doesn't seem to cover non-core Python developers well (eg,
> people like me who don't have a way of integrating with the Sourceforge
> mechanisms...)

I just read that PEP over, and I agree with it.  I think it should be
implemented.  If anyone with sourceforge permission would like to
champion this PEP further (by implementing the modest change it
suggests so that it can be rolled out with Python 2.2a1 next week),
that would really help!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jul 11 14:26:51 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 11 Jul 2001 09:26:51 -0400
Subject: [Python-Dev] Python and SSL
In-Reply-To: Your message of "Wed, 11 Jul 2001 14:46:26 +0200."
 <20010711144626.A2998@strakt.com>
References: <20010711144626.A2998@strakt.com>
Message-ID: <200107111326.f6BDQpY13417@odiug.digicool.com>

> Hello
> 
> I'm currently in the process of developing a basic OpenSSL module for
> Python. Before you say anything, yes I know about M2Crypto and its SSL
> support, but for a number of reasons, it doesn't fulfill our needs.
> 
> We found the SSL support in Python to be insufficient (nonexistent :-))
> for our needs.  We thus decided to write our own module.
> 
> The module is faaaar from complete as an interface to the general
> cryptographic functionality of OpenSSL, but it does have basic SSL
> support, including authorization using certificates, PRNG seeding
> functions and an error handling system.
> 
> Since we are using Python extensively and don't have to pay for it, we
> would like to reply in kind and offer the module back to the Python
> project.
> 
> (This is, in case you're missing it, a hint that now that security is
> the hot subject it is, it's silly for an otherwise so complete language to
> lack SSL support ;-))
> 
> The whole kit (including some documentation) can be found here:
> http://www.strakt.com/~martin/pyOpenSSL.tar.gz
> 
> My question is... What do I do now? Where to proceed?
> 
> Please CC me replies, since I'm (of course) not on the list.
> 
> Regards,
> Martin Sjögren
> AB Strakt

Hi Martin,

You can actually subscribe to python-dev.  Just go to
http://mail.python.org/mailman/listinfo/python-dev and enter your
email and password; you will magically be approved.

The best thing you can do is try to find someone with Python SF commit
privileges who is willing to review your code and check it in.  (I
would have recommended Jeremy Hylton, but he's still away on paternity
leave, so you'll have to find someone outside PythonLabs.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Wed Jul 11 14:33:21 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 11 Jul 2001 15:33:21 +0200
Subject: [Python-Dev] Python and SSL
References: <20010711144626.A2998@strakt.com>
Message-ID: <3B4C55A1.73BB5A78@lemburg.com>

"Martin Sjögren" wrote:
> 
> Hello
> 
> I'm currently in the process of developing a basic OpenSSL module for
> Python. Before you say anything, yes I know about M2Crypto and its SSL
> support, but for a number of reasons, it doesn't fulfill our needs.

Note that there's also amkCrypto (the successor of mxCrypto which
is a wrapper of the low-level blazing fast tools in OpenSSL):

	http://www.amk.ca/python/code/crypto.html
 
> We found the SSL support in Python to be insufficient (nonexistent :-))
> for our needs.  We thus decided to write our own module.
> 
> The module is faaaar from complete as an interface to the general
> cryptographic functionality of OpenSSL, but it does have basic SSL
> support, including authorization using certificates, PRNG seeding
> functions and an error handling system.

There is some support in the socket module for dealing with HTTPS.
Which level of OpenSSL are you focusing on (ciphers, certificates
or protocol)?
 
> Since we are using Python extensively and don't have to pay for it, we
> would like to reply in kind and offer the module back to the Python
> project.
> 
> (This is, in case you're missing it, a hint that now that security is
> the hot subject it is, it's silly for an otherwise so complete language to
> lack SSL support ;-))
>
> The whole kit (including some documentation) can be found here:
> http://www.strakt.com/~martin/pyOpenSSL.tar.gz
> 
> My question is... What do I do now? Where to proceed?

Since the module is "far from complete", I'd suggest putting the project
up on the web somewhere to let it mature.

I am not sure whether it's a good idea to put
crypto code into the standard Python distribution due to the issues
involved in this (import/export restrictions, etc.), but
perhaps we could open up the Python core a bit for these
"extra" utilities and make them available as separate download
alongside the standard ones.
 
> Please CC me replies, since I'm (of course) not on the list.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From martin@strakt.com  Wed Jul 11 15:12:21 2001
From: martin@strakt.com (Martin Sjögren)
Date: Wed, 11 Jul 2001 16:12:21 +0200
Subject: [Python-Dev] Re: Python and SSL
In-Reply-To: <3B4C55A1.73BB5A78@lemburg.com>
References: <20010711144626.A2998@strakt.com> <3B4C55A1.73BB5A78@lemburg.com>
Message-ID: <20010711161221.A3684@strakt.com>

On Wed, Jul 11, 2001 at 03:33:21PM +0200, M.-A. Lemburg wrote:
> "Martin Sjögren" wrote:
> > I'm currently in the process of developing a basic OpenSSL module for
> > Python. Before you say anything, yes I know about M2Crypto and its SSL
> > support, but for a number of reasons, it doesn't fulfill our needs.
> 
> Note that there's also amkCrypto (the successor of mxCrypto which
> is a wrapper of the low-level blazing fast tools in OpenSSL):
> 
> 	http://www.amk.ca/python/code/crypto.html

Yeah I looked at this too, but it doesn't have the things I'm interested
in (SSL_write,read,... etc). At first glance it looks like this module and
my module complement each other, but I may be wrong :-)

> > We found the SSL support in Python to be insufficient (nonexistent :-))
> > for our needs.  We thus decided to write our own module.
> > 
> > The module is faaaar from complete as an interface to the general
> > cryptographic functionality of OpenSSL, but it does have basic SSL
> > support, including authorization using certificates, PRNG seeding
> > functions and an error handling system.
> 
> There is some support in the socket module for dealing with HTTPS.
> Which level of OpenSSL are you focusing on (ciphers, certificates
> or protocol)?

We're using SSL to secure the communication in a client/server situation,
using certificates for authentication. Basically, my module is what we
think we need right now, no more, no less. Given that, I may continue work
on it, as our needs change.

> > The whole kit (including some documentation) can be found here:
> > http://www.strakt.com/~martin/pyOpenSSL.tar.gz
> > 
> > My question is... What do I do now? Where to proceed?
> 
> Since the module is "far from complete", I'd suggest putting the project
> up on the web somewhere to let it mature.

"faaaar from complete" in that it doesn't do everything OpenSSL does! I'd
like to think that it's pretty well contained, and can be used for exactly
the kind of things we are going to use it for.

Nevertheless, letting it mature isn't a bad idea. What is badly needed is
getting it compiled and checked on windows. We're doing all our
development under Linux, and while it's sufficient that the server (which
is written in C and Python) runs on *IX, the client most definitely must
run on Windows.

Any suggestion where to put it so that it's found? The Vaults of Parnassus
I guess, are there any other interesting spots?

> I am not sure whether it's a good idea to put
> crypto code into the standard Python distribution due to the issues
> involved in this (import/export restrictions, etc.), but
> perhaps we could open up the Python core a bit for these
> "extra" utilities and make them available as separate download
> alongside the standard ones.

I agree with that, but one can argue that since all cryptographic stuff is
actually done by the OpenSSL library, this module won't even get compiled
and installed unless you have OpenSSL on your machine already. As they say
on SlashDot, IANAL, and I'm not American so it's not that big a problem
for me personally.

> > Please CC me replies, since I'm (of course) not on the list.

This is still relevant ;) I haven't seen a reply to my subscribe-request
yet.

Martin

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 405242        Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html


From thomas@xs4all.net  Wed Jul 11 15:36:22 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 11 Jul 2001 16:36:22 +0200
Subject: [Python-Dev] Re: Python and SSL
In-Reply-To: <20010711161221.A3684@strakt.com>
Message-ID: <20010711163622.A5396@xs4all.nl>

On Wed, Jul 11, 2001 at 04:12:21PM +0200, Martin Sjögren wrote:

> Any suggestion where to put it so that it's found? The Vaults of Parnassus
> I guess, are there any other interesting spots?

Posting to comp.lang.python.announce or python-announce@python.org always
works well ;)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From Greg.Wilson@baltimore.com  Wed Jul 11 17:10:56 2001
From: Greg.Wilson@baltimore.com (Greg Wilson)
Date: Wed, 11 Jul 2001 12:10:56 -0400
Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1470 - 11 msgs
Message-ID: <930BBCA4CEBBD411BE6500508BB3328F36831E@nsamcanms1.ca.baltimore.com>

> From: Martin Sjögren <martin@strakt.com>
> Any suggestion where to put it so that it's found?

SourceForge, please --- it'll make it easy for others
to contribute, and to get.  I'm also finding that a lot
of sys admins at places I teach have figured SF out, so
I can say, "Install this, this, and this," and it just
happens.  Kind of cool...

Thanks,
Greg


From thomas.heller@ion-tof.com  Wed Jul 11 20:12:23 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 11 Jul 2001 21:12:23 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEDF@ukrux002.rundc.uk.origin-it.com>  <200107111302.f6BD2gP13353@odiug.digicool.com>
Message-ID: <0e2e01c10a3d$6a950b40$e000a8c0@thomasnotebook>

> [Paul Moore]
> > PEP 250 covers this. I have sent in the final PEP for approval, plus a
> > patch, but the process appears to be stalled. I guess I need to nag again.
> > The PEP process doesn't seem to cover non-core Python developers well (eg,
> > people like me who don't have a way of integrating with the Sourceforge
> > mechanisms...)
> 
> I just read that PEP over, and I agree with it.  I think it should be
> implemented.  If anyone with sourceforge permission would like to
> champion this PEP further (by implementing the modest change it
> suggests so that it can be rolled out with Python 2.2a1 next week),
> that would really help!
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
If no one else shows up, I'll take it (hoping I find the time
for it).

Thomas




From akuchlin@mems-exchange.org  Wed Jul 11 20:16:22 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 11 Jul 2001 15:16:22 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <018501c109e0$c345a450$4ffa42d5@hagrid>; from fredrik@pythonware.com on Wed, Jul 11, 2001 at 10:09:06AM +0200
References: <200107101740.f6AHeUC21223@snark.thyrsus.com> <15179.16603.865875.309079@beluga.mojam.com> <20010710141236.C9416@h00104b370897.ne.mediaone.net> <20010710144735.A22087@thyrsus.com> <15179.23243.253539.305609@beluga.mojam.com> <20010710155246.B23638@thyrsus.com> <3B4B6BC2.74F4984D@ActiveState.com> <018501c109e0$c345a450$4ffa42d5@hagrid>
Message-ID: <20010711151622.J6846@ute.cnri.reston.va.us>

On Wed, Jul 11, 2001 at 10:09:06AM +0200, Fredrik Lundh wrote:
>(fwiw, my current thinking is that SOAP is a flawed idea, and that the
>need for SOAP will go away when people get better XML/Schema tools,
>but that's another story.  and don't get me started on SOAP BDG...)

*blink* Really?  XML/Schema is the canonical example I use of
XML-related standards having grown overcomplicated beyond all reason;
SOAP will have far to go before reaching that level.  (Or maybe 1.2 
would surprise me...?)

--amk



From James_Althoff@i2.com  Wed Jul 11 23:02:23 2001
From: James_Althoff@i2.com (James_Althoff@i2.com)
Date: Wed, 11 Jul 2001 15:02:23 -0700
Subject: [Python-Dev] TypeError and AttributeError
Message-ID: <OF309BC21F.6DD28D2F-ON88256A86.0077DCE9@i2.com>

Given that

except (TypeError, AttributeError):

is the "safest across releases" idiom (as suggested by Tim Peters),

would it make sense to make AttributeError a subclass of TypeError so that

except (TypeError):

would become equally "safe" (and simpler)?

Potentially, some exception handling code would behave differently
(break?), but apparently this is already the case given that what is raised
(AttributeError or TypeError) changes between releases.
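
For concreteness, a minimal sketch of the two-exception idiom mentioned
above. The helper name describe_length and the class Lazy are made up for
illustration; the point is only that either exception can surface from
the same call, so both are caught.

```python
def describe_length(obj):
    """Return len(obj), or None if obj does not support len()."""
    try:
        return len(obj)
    except (TypeError, AttributeError):
        # Which of the two is raised can depend on the object and on the
        # Python release, so both are caught here.
        return None


class Lazy:
    """Hypothetical class whose __len__ trips over a missing attribute."""

    def __len__(self):
        return self.missing  # raises AttributeError


print(describe_length([1, 2, 3]))
print(describe_length(42))
print(describe_length(Lazy()))
```

The int case fails with TypeError and the Lazy case with AttributeError,
yet the caller handles both the same way.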

Jim



From barry@digicool.com  Wed Jul 11 23:05:20 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 11 Jul 2001 18:05:20 -0400
Subject: [Python-Dev] CVS build failures?  posixmodule.c on Linux RH6.1
Message-ID: <15180.52640.666460.449616@anthem.wooz.org>

I just did a fresh cvs update, make distclean, configure, make and
posixmodule.c is now failing to compile.  Looks like it's Thomas's
recent nice() patch that's failing because PRIO_PROCESS isn't defined
unless sys/resource.h is #included (at least on this RH6.1-ish Linux
box).

The following patch seems to fix the problem for me.  Comments?

-Barry

-------------------- snip snip --------------------
Index: posixmodule.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/posixmodule.c,v
retrieving revision 2.191
diff -u -r2.191 posixmodule.c
--- posixmodule.c	2001/07/11 14:45:34	2.191
+++ posixmodule.c	2001/07/11 22:04:29
@@ -32,6 +32,12 @@
 #include <sys/wait.h>		/* For WNOHANG */
 #endif
 
+#ifdef HAVE_GETPRIORITY
+#ifndef PRIO_PROCESS
+#include <sys/resource.h>
+#endif /* !PRIO_PROCESS */
+#endif /* HAVE_GETPRIORITY */
+
 #ifdef HAVE_SIGNAL_H
 #include <signal.h>
 #endif


From barry@digicool.com  Wed Jul 11 23:08:05 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 11 Jul 2001 18:08:05 -0400
Subject: [Python-Dev] TypeError and AttributeError
References: <OF309BC21F.6DD28D2F-ON88256A86.0077DCE9@i2.com>
Message-ID: <15180.52805.511953.909752@anthem.wooz.org>

>>>>> "JA" == James Althoff <James_Althoff@i2.com> writes:

    JA> Given that

    JA> except (TypeError, AttributeError):

    JA> is the "safest across releases" idiom (as suggested by Tim
    JA> Peters),

    JA> would it make sense to make AttributeError a subclass of
    JA> TypeError so that

    JA> except (TypeError):

    JA> would become equally "safe" (and simpler)?

No, but it /might/ make sense to give them a new common base class
between them and Exception.  If so, called what?
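
A precedent for this shape already exists among the built-ins: IndexError
and KeyError share the base class LookupError.  The sketch below uses that
existing pair rather than TypeError and AttributeError, whose bases cannot
be changed from Python code; the helper name fetch is an assumption made
for the example.

```python
def fetch(container, key):
    """Return container[key], or None if the lookup fails."""
    try:
        return container[key]
    except LookupError:
        # Catches IndexError (sequences) and KeyError (mappings) alike,
        # because both derive from LookupError.
        return None


print(fetch([10, 20], 0))
print(fetch([10, 20], 5))
print(fetch({"a": 1}, "b"))
```

A common base for TypeError and AttributeError would allow the same kind
of single except clause, under whatever name was chosen.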

-Barry


From barry@digicool.com  Wed Jul 11 23:11:39 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 11 Jul 2001 18:11:39 -0400
Subject: [Python-Dev] CVS build failures?  posixmodule.c on Linux RH6.1
References: <15180.52640.666460.449616@anthem.wooz.org>
Message-ID: <15180.53019.440723.13978@anthem.wooz.org>

Patch uploaded to SF bug #440522.

-Barry


From thomas@xs4all.net  Wed Jul 11 23:16:07 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 00:16:07 +0200
Subject: [Python-Dev] CVS build failures?  posixmodule.c on Linux RH6.1
In-Reply-To: <15180.52640.666460.449616@anthem.wooz.org>
References: <15180.52640.666460.449616@anthem.wooz.org>
Message-ID: <20010712001607.B5396@xs4all.nl>

On Wed, Jul 11, 2001 at 06:05:20PM -0400, Barry A. Warsaw wrote:
> 
> I just did a fresh cvs update, make distclean, configure, make and
> posixmodule.c is now failing to compile.  Looks like it's Thomas
> recent nice() patch that's failing because PRIO_PROCESS isn't defined
> unless sys/resource.h is #included (at least on this RH6.1-ish Linux
> box).

The same problem exists on BSD-ish systems, and I changed the patch for some
other reasons as well. I'm double-checking it works on more systems right
now, and will commit in a few minutes :) Sorry it was delayed long enough
for most of you to run into this problem (or so it seems) but we had a big
network outage that had a bit higher priority ;P

FWIW, not just Linux has a broken nice()... BSDI and FreeBSD have it, too,
and they also note nice() has been obsoleted. I'll provide a patch to use
setpriority/getpriority for the 2.2 tree, falling back to nice() only if
they aren't available.
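
The fallback order described here can be sketched at the Python level.
This is only an illustration, not the posixmodule.c patch: it assumes a
modern Python 3 on a Unix-like system, where os.getpriority and os.nice
are both exposed, and the function name current_niceness is made up.

```python
import os


def current_niceness():
    """Return this process's niceness, preferring getpriority() over nice()."""
    if hasattr(os, "getpriority"):
        # who=0 with PRIO_PROCESS means "the calling process".
        return os.getpriority(os.PRIO_PROCESS, 0)
    # Fallback: nice(0) changes nothing and returns the current niceness.
    return os.nice(0)


print(current_niceness())
```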

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From nas@python.ca  Wed Jul 11 23:19:43 2001
From: nas@python.ca (Neil Schemenauer)
Date: Wed, 11 Jul 2001 15:19:43 -0700
Subject: [Python-Dev] Method resolution order
In-Reply-To: <E15KRUo-0000so-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Wed, Jul 11, 2001 at 02:26:10PM -0700
References: <E15KRUo-0000so-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010711151943.A15462@glacier.fnational.com>

Guido van Rossum wrote:
> Using the classic [depth-first, left-right] lookup rule, construct the
> list of classes that would be searched, including duplicates.  Now for
> each class that occurs in the list multiple times, remove all
> occurrences except for the last.  The resulting list contains each
> ancestor class exactly once

Is this original or is it used by other languages as well?  My books on
Dylan and CLOS are at home but I think they do something similar.

  Neil


From akuchlin@mems-exchange.org  Wed Jul 11 23:19:36 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 11 Jul 2001 18:19:36 -0400
Subject: [Python-Dev] beopen.com e-mail addresses in CVS
Message-ID: <E15KSKW-0004qZ-00@ute.cnri.reston.va.us>

I noticed that Tools/compiler has jeremy@beopen.com as the author's
address.  A quick grep:

./Lib/test/test_gettext.py:# Barry Warsaw <bwarsaw@beopen.com>, 2000.
./Lib/test/test_gettext.py:"Last-Translator: Barry A. Warsaw <bwarsaw@beopen.com>\n"
./Misc/RPM/Tkinter/setup.cfg:packager = Jeremy Hylton <jeremy@beopen.com>
./Misc/RPM/Tkinter/setup.py:      author_email="pythoneers@beopen.com",
./Misc/RPM/beopen-python.spec:Packager: Jeremy Hylton <jeremy@beopen.com>
./Misc/RPM/beopen-python.spec:* Mon Oct  9 2000 Jeremy Hylton <jeremy@beopen.com>
./Misc/RPM/beopen-python.spec:* Thu Oct  5 2000 Jeremy Hylton <jeremy@beopen.com>
./Misc/RPM/beopen-python.spec:* Tue Sep 26 2000 Jeremy Hylton <jeremy@beopen.com>
./Misc/RPM/beopen-python.spec:* Tue Sep 12 2000 Jeremy Hylton <jeremy@beopen.com>
./Tools/compiler/setup.py:      author_email = "jeremy@beopen.com",
./changes:a suggestion from Bob Weiner <weiner@beopen.com>.

All but the last (and maybe the second) seem to be worth fixing.

--amk



From bckfnn@worldonline.dk  Wed Jul 11 23:44:54 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Wed, 11 Jul 2001 22:44:54 GMT
Subject: [Python-Dev] TypeError and AttributeError
In-Reply-To: <15180.52805.511953.909752@anthem.wooz.org>
References: <OF309BC21F.6DD28D2F-ON88256A86.0077DCE9@i2.com> <15180.52805.511953.909752@anthem.wooz.org>
Message-ID: <3b4cd444.59275613@mail.wanadoo.dk>

>>>>>> "JA" == James Althoff <James_Althoff@i2.com> writes:
>
>    JA> would it make sense to make AttributeError a subclass of
>    JA> TypeError so that
>
>    JA> except (TypeError):
>
>    JA> would become equally "safe" (and simpler)?

[Barry]

>No, but it /might/ make sense to give them a new common base class
>between them and Exception.  If so, called what?

ProtocolException.

I think it would have made sense, but it probably wouldn't have helped.
Users still see a specific exception thrown and write code against that.

regards,
finn


From guido@digicool.com  Thu Jul 12 00:21:49 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 11 Jul 2001 19:21:49 -0400
Subject: [Python-Dev] Re: Method resolution order
In-Reply-To: Your message of "Wed, 11 Jul 2001 15:19:43 PDT."
 <20010711151943.A15462@glacier.fnational.com>
References: <E15KRUo-0000so-00@usw-pr-cvs1.sourceforge.net>
 <20010711151943.A15462@glacier.fnational.com>
Message-ID: <200107112321.f6BNLn614044@odiug.digicool.com>

> Guido van Rossum wrote:
> > Using the classic [depth-first, left-right] lookup rule, construct the
> > list of classes that would be searched, including duplicates.  Now for
> > each class that occurs in the list multiple times, remove all
> > occurrences except for the last.  The resulting list contains each
> > ancestor class exactly once
> 
> Is this original or is it used by other languages as well?  My books on
> Dylan and CLOS are at home but I think they do something similar.
> 
>   Neil

I didn't make it up!  I got it from the reference [1] in the PEP.  C++
seems to do something similar (with added conflict checking).

It would be good to mention that this is not a new invention.  If you
can confirm that Dylan and CLOS have this, I'll add that.
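
The quoted rule is short enough to sketch directly.  The function names
below are made up, and the code runs on a modern Python 3, where every
class also inherits from object.  Later Python releases (2.3 onward)
compute the order with the C3 algorithm instead; the two agree on the
simple diamond used here but can differ on other hierarchies.

```python
def classic_order(cls):
    """Depth-first, left-to-right list of cls and its ancestors, duplicates kept."""
    order = [cls]
    for base in cls.__bases__:
        order.extend(classic_order(base))
    return order


def keep_last_occurrence(classes):
    """Drop every occurrence of a class except the last one."""
    result = []
    for index, cls in enumerate(classes):
        if cls not in classes[index + 1:]:
            result.append(cls)
    return result


def proposed_order(cls):
    return keep_last_occurrence(classic_order(cls))


class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass


print([c.__name__ for c in classic_order(D)])
print([c.__name__ for c in proposed_order(D)])
```

With the classic rule alone, A would be searched before C; removing all
but the last occurrence moves A behind C.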

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Thu Jul 12 00:26:33 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 11 Jul 2001 19:26:33 -0400
Subject: [Python-Dev] TypeError and AttributeError
In-Reply-To: Your message of "Wed, 11 Jul 2001 22:44:54 GMT."
 <3b4cd444.59275613@mail.wanadoo.dk>
References: <OF309BC21F.6DD28D2F-ON88256A86.0077DCE9@i2.com> <15180.52805.511953.909752@anthem.wooz.org>
 <3b4cd444.59275613@mail.wanadoo.dk>
Message-ID: <200107112326.f6BNQYG14059@odiug.digicool.com>

> >>>>>> "JA" == James Althoff <James_Althoff@i2.com> writes:
> >
> >    JA> would it make sense to make AttributeError a subclass of
> >    JA> TypeError so that
> >
> >    JA> except (TypeError):
> >
> >    JA> would become equally "safe" (and simpler)?
> 
> [Barry]
> 
> >No, but it /might/ make sense to give them a new common base class
> >between them and Exception.  If so, called what?
> 
> ProtocolException.
> 
> I think it would have made sense, but it probably wouldn't have helped.
> Users still see a specific exception thrown and write code against that.

Yeah, the problem with Jim's proposal is that users who write

    try:
       "try something"
    except TypeError:
       "one way of handling it"
    except AttributeError:
       "another way of handling it"

will still see a change in behavior, as will users who catch
only AttributeError in a situation that now raises TypeError...
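
One reading of the first point, as a sketch: if AttributeError became a
subclass of TypeError, code with separate handlers would silently stop
reaching its second branch.  The Fake* classes below are hypothetical
stand-ins, since the built-in hierarchy cannot be rearranged from Python.

```python
class FakeTypeError(Exception):
    """Hypothetical stand-in for TypeError."""


class FakeAttributeError(FakeTypeError):
    """Hypothetical stand-in for AttributeError under the subclass proposal."""


def handle_today(exc):
    """Separate handlers with the real, unrelated built-in exceptions."""
    try:
        raise exc
    except TypeError:
        return "one way of handling it"
    except AttributeError:
        return "another way of handling it"


def handle_proposed(exc):
    """The same handler layout if one exception subclassed the other."""
    try:
        raise exc
    except FakeTypeError:
        return "one way of handling it"
    except FakeAttributeError:
        # Never reached: the clause above already matches the subclass.
        return "another way of handling it"


print(handle_today(AttributeError()))
print(handle_proposed(FakeAttributeError()))
```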

--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg@cosc.canterbury.ac.nz  Thu Jul 12 01:54:08 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 12 Jul 2001 12:54:08 +1200 (NZST)
Subject: [Python-Dev] TypeError and AttributeError
In-Reply-To: <200107112326.f6BNQYG14059@odiug.digicool.com>
Message-ID: <200107120054.MAA01835@s454.cosc.canterbury.ac.nz>

Maybe TypeError and AttributeError should be merged,
and made aliases for the same exception?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From esr@snark.thyrsus.com  Thu Jul 12 04:01:57 2001
From: esr@snark.thyrsus.com (Eric S. Raymond)
Date: Wed, 11 Jul 2001 23:01:57 -0400
Subject: [Python-Dev] XML-RPC docs are in
Message-ID: <200107120301.f6C31vj16739@snark.thyrsus.com>

I have spent the afternoon writing, and the first version of
xmlrpclib docs is checked in.

It probably has markup errors; it is not complete, and could probably stand
to have some of the internal things like Marshaller documented.  But I
think it does a decent job on the entry points and externally visible
things.

We have more to do.  Fred D. and Fredrik L. should proof this sucker 
for errors.  Fredrik L. should add stuff on some of the internals and
quasi-internals I haven't described, like the loads and dumps
functions.
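
For reference, a minimal round trip through dumps and loads.  This sketch
assumes Python 3, where the module is named xmlrpc.client rather than
xmlrpclib; the method name "add" and its arguments are made up.

```python
import xmlrpc.client  # Python 3 name for the module called xmlrpclib here

# Marshal a call to a hypothetical remote method add(2, 3) into XML text.
request = xmlrpc.client.dumps((2, 3), methodname="add")

# Unmarshal it again: loads returns the parameter tuple and the method name.
params, method = xmlrpc.client.loads(request)

print(request)
print(params, method)
```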

Then, as Eric Kidd pointed out, we really ought to support an XML-RPC
server class wrapped around Fredrik's stubs.  There appears to be one
in the xmlrpclib distribution.  Fredrik, are you planning to document
that and check it in?

This is a feature set to make serious noise about in the publicity for 2.2.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"They that can give up essential liberty to obtain a little temporary 
safety deserve neither liberty nor safety."
	-- Benjamin Franklin, Historical Review of Pennsylvania, 1759.


From tim.one@home.com  Thu Jul 12 05:51:15 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 12 Jul 2001 00:51:15 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15179.61728.255760.814673@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKENGKNAA.tim.one@home.com>

[Skip Montanaro]
> I wasn't suggesting you weren't trustworthy.

Neither was I <wink>.

> On a Linux system, wall clock time doesn't mean much when timing
> processes.

Sure -- different OS.  I'm not telling you how to time things on Linux; I
just explained what I did on Windows because it was questioned.

> ... (does Windows even distinguish between user and system time?).

I've never seen anything in Win9x that does, and not surprised:  they're at
best <0.7 wink> single-user systems.  On NT there's an elaborate performance
monitoring subsystem tied to the HKEY_PERFORMANCE_DATA registry hive, from
which-- given enough programming pain, none of which Python endures --you
can find out almost anything.

> That time.clock means different things on Windows and Unix-like systems
> bothers me a bit.

Blame X3J11 -- ANSI C is vague about what clock() is supposed to do.  I've
got no use for the *native* Windows implementation of clock() (Python maps
time.clock() to the high-resolution Win32 QueryPerformanceCounter API
instead); Unices in general don't have usable high-resolution timers;
Windowses in general don't have usable notions of user process time; so we
take what we can get.
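
In later Pythons the two meanings were split into separate functions:
time.clock was deprecated in 3.3 and removed in 3.8, with
time.perf_counter for high-resolution elapsed time and time.process_time
for CPU time of the current process.  A small sketch under that
assumption (the helper name time_call is made up):

```python
import time


def time_call(func, *args):
    """Run func(*args) and return (result, elapsed seconds, CPU seconds)."""
    wall_start = time.perf_counter()   # high-resolution elapsed time
    cpu_start = time.process_time()    # user + system CPU time of this process
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


result, wall, cpu = time_call(sum, range(1000))
print(result, wall, cpu)
```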

> (It would bother me more if I had to write timing code that was
> portable across both Unix and Windows.)

Hmm.  Unless you're happy with wall-clock time, it may well drive you insane
just to write timing code portable across Unices.  At my last employer, we
wrote all our base timing routines in assembler, because it's generally easy
to suck what you need out of modern HW, but darned near impossible after
seventeen warring stds committees finish taking turns hiding it <wink>.

[about BINARY_ADD slowing string+string]
> I can buy that.  Wasn't there some discussion about improving this
> situation?

Yes.

> If so, I guess I should be using the head branch of the CVS tree
> instead of release21-maint.

AFAIK, nobody did anything *except* discuss it so far.  Insert an early
special case for sequence cat, and you slow each numeric addition by the
time it takes to fail that test, so there's no killer argument either way.
int+int is special-cased by BINARY_ADD, but everything else goes thru the
general machinery.
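
The trade-off can be shown with a toy model.  This is not the
interpreter's C code, just a sketch in which each fast-path test that
fails is counted before the next one is tried; all names are made up.

```python
import operator


def both_ints(a, b):
    return type(a) is int and type(b) is int


def both_strs(a, b):
    return type(a) is str and type(b) is str


def add_with_fast_paths(left, right, fast_paths):
    """Try each fast-path test in order; return (result, tests performed)."""
    tests = 0
    for applies in fast_paths:
        tests += 1
        if applies(left, right):
            return operator.add(left, right), tests
    # Nothing matched: stand-in for the general dispatch machinery.
    return operator.add(left, right), tests


INT_ONLY = [both_ints]               # roughly the situation described above
STR_FIRST = [both_strs, both_ints]   # an early special case for sequence cat

print(add_with_fast_paths(1, 2, INT_ONLY))
print(add_with_fast_paths(1, 2, STR_FIRST))
print(add_with_fast_paths("a", "b", STR_FIRST))
```

With the string test first, int+int pays for one extra failed test, which
is the cost weighed against the faster string path.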



From tim.one@home.com  Thu Jul 12 06:23:17 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 12 Jul 2001 01:23:17 -0400
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP)
In-Reply-To: <200107111302.f6BD2gP13353@odiug.digicool.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKENHKNAA.tim.one@home.com>

[Guido]
> I just read that PEP over, and I agree with it.  I think it should be
> implemented.  If anyone with sourceforge permission would like to
> champion this PEP further (by implementing the modest change it
> suggests so that it can be rolled out with Python 2.2a1 next week),
> that would really help!

Umm, what am I missing?  The change to site.py was so simple you could have
committed it yourself quicker than it took to write the above.  I committed
it a few minutes ago.  If something else is needed, someone else will have
to do it (or explain it to me in detail so precise they could do it themself
10x quicker <wink>).



From thomas@xs4all.net  Thu Jul 12 08:09:03 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 09:09:03 +0200
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKENGKNAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCKENGKNAA.tim.one@home.com>
Message-ID: <20010712090903.E5396@xs4all.nl>

On Thu, Jul 12, 2001 at 12:51:15AM -0400, Tim Peters wrote:

> > On a Linux system, wall clock time doesn't mean much when timing
> > processes.

> Sure -- different OS.  I'm not telling you how to time things on Linux; I
> just explained what I did on Windows because it was questioned.

Actually, it wasn't <wink>. Skip just said realtime didn't make sense on a
system that did multiple things at the same time. He obviously didn't mean
MS Windows :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Thu Jul 12 08:57:49 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 12 Jul 2001 09:57:49 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]
 Package DB: strawman PEP)
References: <LNBBLJKPBEHFEDALKOLCKENHKNAA.tim.one@home.com>
Message-ID: <3B4D587D.FDB733C0@lemburg.com>

Tim Peters wrote:
> 
> [Guido]
> > I just read that PEP over, and I agree with it.  I think it should be
> > implemented.  If anyone with sourceforge permission would like to
> > champion this PEP further (by implementing the modest change it
> > suggests so that it can be rolled out with Python 2.2a1 next week),
> > that would really help!
> 
> Umm, what am I missing?  The change to site.py was so simple you could have
> committed it yourself quicker than it took to write the above.  I committed
> it a few minutes ago.  If something else is needed, someone else will have
> to do it (or explain it to me in detail so precise they could do it themself
> 10x quicker <wink>).

Cool, but what about the changes needed in distutils to actually
utilize the new directory and the changes to the Windows installer
to create the directory at installation time ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From Paul.Moore@atosorigin.com  Thu Jul 12 09:32:51 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Thu, 12 Jul 2001 09:32:51 +0100
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distu
 tils]  Package DB: strawman PEP)
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEE9@ukrux002.rundc.uk.origin-it.com>

From: M.-A. Lemburg [mailto:mal@lemburg.com]
Tim Peters wrote:
> Umm, what am I missing?  The change to site.py was so simple you could
> have committed it yourself quicker than it took to write the above.  I
> committed it a few minutes ago.  If something else is needed, someone
> else will have to do it (or explain it to me in detail so precise they
> could do it themself 10x quicker <wink>).

> Cool, but what about the changes needed in distutils to actually
> utilize the new directory and the changes to the Windows installer
> to create the directory at installation time ?

The patch I sent along with the final version of the PEP included the
distutils change (it's only one line, but it's on the PC at home, so I can't
quote it here). I assume that the Python install should ensure that the
site-packages exists (it does at the moment) so I don't see a need for the
wininst installer to check.

Paul.

PS [After a quick rummage...] I *think* the following patch is what is
needed for distutils: I haven't tested it, though, so it would be better to
check the original version (which I did test...)

--- sysconfig.py.orig	Thu Apr 19 10:24:24 2001
+++ sysconfig.py	Thu Jul 12 09:32:34 2001
@@ -87,7 +87,7 @@
 
     elif os.name == "nt":
         if standard_lib:
-            return os.path.join(PREFIX, "Lib")
+            return os.path.join(PREFIX, "Lib", "site-packages")
         else:
             return prefix
 


From mal@lemburg.com  Thu Jul 12 10:14:35 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 12 Jul 2001 11:14:35 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]
 Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEE9@ukrux002.rundc.uk.origin-it.com>
Message-ID: <3B4D6A7B.F493D611@lemburg.com>

"Moore, Paul" wrote:
> 
> From: M.-A. Lemburg [mailto:mal@lemburg.com]
> Tim Peters wrote:
> > Umm, what am I missing?  The change to site.py was so simple you could
> > have committed it yourself quicker than it took to write the above.  I
> > committed it a few minutes ago.  If something else is needed, someone
> > else will have to do it (or explain it to me in detail so precise they
> > could do it themself 10x quicker <wink>).
> 
> > Cool, but what about the changes needed in distutils to actually
> > utilize the new directory and the changes to the Windows installer
> > to create the directory at installation time ?
> 
> The patch I sent along with the final version of the PEP included the
> distutils change (it's only one line, but it's on the PC at home, so I can't
> quote it here). I assume that the Python install should ensure that the
> site-packages exists (it does at the moment) so I don't see a need for the
> wininst installer to check.

I don't have a site-packages dir in my installations. Could it be that
you installed some distutils package which automagically created one 
or that this changed in Python 2.1.1?
 
> Paul.
> 
> PS [After a quick rummage...] I *think* the following patch is what is
> needed for distutils: I haven't tested it, though, so it would be better to
> check the original version (which I did test...)
> 
> --- sysconfig.py.orig   Thu Apr 19 10:24:24 2001
> +++ sysconfig.py        Thu Jul 12 09:32:34 2001
> @@ -87,7 +87,7 @@
> 
>      elif os.name == "nt":
>          if standard_lib:
> -            return os.path.join(PREFIX, "Lib")
> +            return os.path.join(PREFIX, "Lib", "site-packages")
>          else:
>              return prefix

This doesn't seem to do the trick: the Windows installer still installs
the packages directly to \Python21.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From Paul.Moore@atosorigin.com  Thu Jul 12 11:02:28 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Thu, 12 Jul 2001 11:02:28 +0100
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distu
 tils]   Package DB: strawman PEP)
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEB@ukrux002.rundc.uk.origin-it.com>

From: M.-A. Lemburg [mailto:mal@lemburg.com]
> I don't have a site-packages dir in my installations. Could it be that
> you installed some distutils package which automagically created one 
> or that this change in Python 2.1.1 ?

I don't believe so. I have ActiveState Python - it's possible (although
unlikely, I would think) that that version creates site-packages specially.
It's vaguely possible (although unlikely) that I created the directory
manually - it was missing in one of the 2.1 betas, IIRC, but I thought it
reappeared in 2.1 final. In any case, the necessary changes to make sure
that directory exists should be in the Windows Installer package(s) for
Python. I guess that means somewhere in the Wise installer scripts - which I
don't have access to, nor would I know how to change.

It should just be a case of reinstating the behaviour in 2.0, if the
directory really has been lost in 2.1.

> This doesn't seem to do the trick: the Windows installer still installs
> the packages directly to \Python21.

This change should (as I said, it's untested) have ensured that "python
setup.py install" puts the module into site-packages. I don't know what the
installer code in bdist_wininst.py does, as it's a base64-encoded EXE, and I
don't have the sources - surely it uses the distutils sysconfig stuff to get
the value (it has no other way of knowing...)?

Paul.


From thomas.heller@ion-tof.com  Thu Jul 12 11:57:18 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 12 Jul 2001 12:57:18 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]   Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEB@ukrux002.rundc.uk.origin-it.com>
Message-ID: <100d01c10ac1$6b7fd470$e000a8c0@thomasnotebook>

From: "Moore, Paul" <Paul.Moore@atosorigin.com>
> From: M.-A. Lemburg [mailto:mal@lemburg.com]
> > This doesn't seem to do the trick: the Windows installer still installs
> > the packages directly to \Python21.
> 
> This change should (as I said, it's untested) have ensured that "python
> setup.py install" puts the module into site-packages. I don't know what the
> installer code in bdist_wininst.py does, as it's a base64-encoded EXE, and I
> don't have the sources - surely it uses the distutils sysconfig stuff to get
> the value (it has no other way of knowing...)?
The sources are in CVS:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/distutils/misc/

The bdist_wininst installer simply installs into prefix,
this is what the registry has under
HKEY_LOCAL_MACHINE\Software\Python\PythonCore\2.1\InstallPath.

Now what should it do?

There are probably some issues here.
Currently it installs the package into prefix,
and creates a prefix/Remove<xxx>.exe uninstaller, and
*appends* info about uninstallation into the prefix/<xxx>-wininst.log
file.
In the future (after PEP250) it should install the package into
prefix/lib/site-packages. Also for older Python versions?
Or only for the newer ones? Depending on the distutils' version
used to create the installer? Depending on the actual site.py file?
Hardcoding a version check ('version >= 2.2') into the installer
doesn't seem so nice, but would probably do the correct thing.

Note that 'python setup.py install' requires distutils to be present
- the bdist_wininst installer does not.
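
The version-check option could look roughly like this.  It is a sketch
only: the helper name install_dir is hypothetical, the version is passed
as a tuple rather than read from the registry, and ntpath is used so the
Windows path rules apply on any platform.

```python
import ntpath  # Windows path semantics, usable on any platform


def install_dir(prefix, python_version):
    """Pick the package directory for a Windows install (hypothetical helper)."""
    if python_version >= (2, 2):
        # Target Pythons whose site.py adds Lib\site-packages to sys.path.
        return ntpath.join(prefix, "Lib", "site-packages")
    # Older Pythons: keep installing straight into prefix.
    return prefix


print(install_dir("C:\\Python21", (2, 1)))
print(install_dir("C:\\Python22", (2, 2)))
```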

Thomas



From mal@lemburg.com  Thu Jul 12 12:51:06 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 12 Jul 2001 13:51:06 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]
 Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEB@ukrux002.rundc.uk.origin-it.com>
Message-ID: <3B4D8F2A.A1D54B9D@lemburg.com>

"Moore, Paul" wrote:
> 
> From: M.-A. Lemburg [mailto:mal@lemburg.com]
> > I don't have a site-packages dir in my installations. Could it be that
> > you installed some distutils package which automagically created one
> > or that this change in Python 2.1.1 ?
> 
> I don't believe so. I have ActiveState Python - it's possible (although
> unlikely, I would think) that that version creates site-packages specially.
> It's vaguely possible (although unlikely) that I created the directory
> manually - it was missing in one of the 2.1 betas, IIRC, but I thought it
> reappeared in 2.1 final. In any case, the necessary changes to make sure
> that directory exists should be in the Windows Installer package(s) for
> Python. I guess that means somewhere in the Wise installer scripts - which I
> don't have access to, nor would I know how to change.

They should be in the CVS tree of Python on SourceForge.
 
> It should just be a case of reinstating the behaviour in 2.0, if the
> directory really has been lost in 2.1.
> 
> > This doesn't seem to do the trick: the Windows installer still installs
> > the packages directly to \Python21.
> 
> This change should (as I said, it's untested) have ensured that "python
> setup.py install" puts the module into site-packages. 

About the change: I think distutils should look up the path in Python's
site.py file - that way you ensure that distutils will work on all
Python installations rather than only on those which have the site.py
patch. Otherwise, Python won't find the packages installed in 
Lib/site-packages.

> I don't know what the
> installer code in bdist_wininst.py does, as it's a base64-encoded EXE, and I
> don't have the sources - surely it uses the distutils sysconfig stuff to get
> the value (it has no other way of knowing...)?

The sources for the Windows installer are on SourceForge CVS too (under the
distutils branch). I believe that Thomas Heller who wrote the installer
will know best what to do about this...

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From Paul.Moore@atosorigin.com  Thu Jul 12 12:57:55 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Thu, 12 Jul 2001 12:57:55 +0100
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distu
 tils]   Package DB: strawman PEP)
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEED@ukrux002.rundc.uk.origin-it.com>

From: Thomas Heller [mailto:thomas.heller@ion-tof.com]
> The sources are in CVS:
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/distutils/misc/

Unfortunately, I don't have CVS access...

> The bdist_wininst installer simply installs into prefix,
> this is what the registry has under
> HKEY_LOCAL_MACHINE\Software\Python\PythonCore\2.1\InstallPath.
>
> Now what should it do?

What does that key *mean*? If it is the directory into which packages should
get installed, then bdist_wininst should keep doing what it does now, and
the Python installer should be changed to put site-packages into that key.

If, on the other hand, this key has a meaning elsewhere in Python, and
changing it would cause a problem, then I would say that this is a bug in
the Windows Installer, which should use a key of its own. In that case, my
recommendation would be to have the Python 2.2 installer create a new key,
and wininst use that if it exists, otherwise fall back to the current key.
That would provide the correct behaviour in the new release, but retain
backward compatibility with earlier versions of Python.
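
A minimal sketch of that fallback, under stated assumptions: the value name
"SitePackagesPath" is hypothetical (it is not an actual Python registry key),
and `lookup` is a stand-in for the real registry read so the logic can be
shown without touching the registry.

```python
def find_install_dir(lookup, version="2.1"):
    """Prefer a dedicated package-install key; fall back to InstallPath.

    `lookup` stands in for a registry read: it takes a key path and
    returns the stored string, or None when the key does not exist.
    "SitePackagesPath" is a hypothetical name used only for illustration.
    """
    base = "Software\\Python\\PythonCore\\%s\\" % version
    for name in ("SitePackagesPath", "InstallPath"):
        value = lookup(base + name)
        if value:
            return value
    return None


# Simulated registries: the older install has only InstallPath.
old_registry = {
    "Software\\Python\\PythonCore\\2.1\\InstallPath": "C:\\Python21",
}
new_registry = {
    "Software\\Python\\PythonCore\\2.2\\InstallPath": "C:\\Python22",
    "Software\\Python\\PythonCore\\2.2\\SitePackagesPath":
        "C:\\Python22\\Lib\\site-packages",
}
```

With dictionaries standing in for the registry, `old_registry.get` and
`new_registry.get` can be passed as `lookup`; an older install resolves to
its InstallPath, a newer one to the dedicated key.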

> There are probably some issues here.

Agreed. I apologise if I didn't publicise the PEP in the right places for
these to get picked up earlier - I thought I had. I believe my suggestion
above will do the right thing, but I am not an expert in the intricacies of
Python's use of the registry, so I'd like someone more knowledgeable to
comment, if possible.

Paul.


From Paul.Moore@atosorigin.com  Thu Jul 12 13:05:52 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Thu, 12 Jul 2001 13:05:52 +0100
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distu
 tils]    Package DB: strawman PEP)
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEE@ukrux002.rundc.uk.origin-it.com>

From: M.-A. Lemburg [mailto:mal@lemburg.com]
> They should be in the CVS tree of Python on SourceForge.
I don't have CVS access, so I can't get at these, unfortunately...
 
> About the change: I think distutils should look up the path in Python's
> site.py file - that way you ensure that distutils will work on all
> Python installations rather than only on those which have the site.py
> patch. Otherwise, Python won't find the packages installed in 
> Lib/site-packages.

I'm not sure what you intend, here. site.py doesn't export this directory -
it is just one of the directories which gets added to sys.path in site.py.
On Unix, there is more than one such directory (both version-specific and
version-independent), so there isn't, in general, just one such directory. I
don't know how you could encapsulate this in a way which would not clash
with other platforms' policies.

The intention of this change was to be the smallest possible change which
would work. I believe it (or at least, the patch I sent when I submitted the
final version of the PEP) does that for everything except the Windows
Installer. I'll have to defer judgement on how best to address that area to
others better qualified to comment, but see my message to Thomas for my
suggestion.

Hope this helps,
Paul.


From guido@digicool.com  Thu Jul 12 13:32:47 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 12 Jul 2001 08:32:47 -0400
Subject: [Python-Dev] Improving nested-scope warnings?
Message-ID: <200107121232.f6CCWlu14533@odiug.digicool.com>

Someone on SF made a point: the warnings you get about nested scopes
aren't always as helpful as they could be.

> My "user experience" of porting my 2.0 code to 2.1 is
> however fairly pitiful. Here are some distilled suggestions:
> * separate warnings for "potential" "import *" problems for
> standard
>    modules (as in the examples) -- sure we know what math
> exports
>    right now and "from math import *" is a common idiom.
> * run-time warnings for shadowed constructs
> * listing of the variables that are imported and one may
> want to
>    import by name instead (or qualify)
> 
> While I really like the new scoping rules and they support
> my programming style their practical impact on existing code
> is quite
> large. A better support would be fairly important -- I have
> 50.000 lines of code to port ....

(From http://sourceforge.net/tracker/?func=detail&atid=105470&aid=440497&group_id=5470)

Anybody interested in implementing some of these?  I guess this would
be in the 2.1.1 branch, as in the 2.2 branch we're about to enable the
future...
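
As a rough illustration of the third suggestion (listing the names a star
import brings in), here is a sketch in plain Python. The helper names
`star_import_names` and `shadowed` are made up for this example; this is not
how the compiler warning itself is implemented.

```python
import math


def star_import_names(module):
    """Names that `from module import *` would bind."""
    names = getattr(module, "__all__", None)
    if names is None:
        # Without __all__, every name not starting with "_" is imported.
        names = [n for n in dir(module) if not n.startswith("_")]
    return sorted(names)


def shadowed(module, local_names):
    """Local names that a star import of `module` would collide with."""
    return sorted(set(star_import_names(module)) & set(local_names))
```

A porting aid could print `star_import_names(math)` so the user can replace
"from math import *" with an explicit import list, and use `shadowed` to
flag locals that clash.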

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas.heller@ion-tof.com  Thu Jul 12 13:51:05 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 12 Jul 2001 14:51:05 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]   Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEED@ukrux002.rundc.uk.origin-it.com>
Message-ID: <115b01c10ad1$50ea9cc0$e000a8c0@thomasnotebook>

From: "Moore, Paul" <Paul.Moore@atosorigin.com>
> > The bdist_wininst installer simply installs into prefix,
> > this is what the registry has under
> > HKEY_LOCAL_MACHINE\Software\Python\PythonCore\2.1\InstallPath.
> >
> > Now what should it do?
> 
> What does that key *mean*? If it is the directory into which packages should
> get installed, then bdist_wininst should keep doing what it does now, and
> the Python installer should be changed to put site-packages into that key.
> 
> If, on the other hand, this key has a meaning elsewhere in Python, and
> changing it would cause a problem, then I would say that this is a bug in
> the Windows Installer, which should use a key of its own. In that case, my
> recommendation would be to have the Python 2.2 installer create a new key,
> and wininst use that if it exists, otherwise fall back to the current key.
> That would provide the correct behaviour in the new release, but retain
> backward compatibility with earlier versions of Python.
Good idea. But remember that there is still Pythonware's distribution,
which neither creates nor requires registry entries; they are also
not available if you compile from source.

OTOH, bdist_wininst installers currently do not recognize these
Python installations, which is probably the next bug.

> 
> > There are probably some issues here.
> 
> Agreed. I apologise if I didn't publicise the PEP in the right places for
> these to get picked up earlier - I thought I had. I believe my suggestion
> above will do the right thing, but I am not an expert in the intricacies of
> Python's use of the registry, so I'd like someone more knowledgeable to
> comment, if possible.
It's my fault, I'm afraid. Didn't think enough about these things
earlier.

> 
> Paul.
> 

Thomas
BTW: We should narrow the TO: and CC: fields in this discussion.
I'm receiving every message threefold. What would be appropriate?



From skip@pobox.com (Skip Montanaro)  Thu Jul 12 14:13:16 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 12 Jul 2001 08:13:16 -0500
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKENGKNAA.tim.one@home.com>
References: <15179.61728.255760.814673@beluga.mojam.com>
 <LNBBLJKPBEHFEDALKOLCKENGKNAA.tim.one@home.com>
Message-ID: <15181.41580.258091.412915@beluga.mojam.com>

    Tim> AFAIK, nobody did anything *except* discuss it so far.  Insert an
    Tim> early special case for sequence cat, and you slow each numeric
    Tim> addition by the time it takes to fail that test, so there's no
    Tim> killer argument either way.  int+int is special-cased by
    Tim> BINARY_ADD, but everything else goes thru the general machinery.

Hmmm... What file we talkin' about Willis?  If we did

    test for int+int
    test for string+string
    general machinery

we might speed up a couple very common cases enough to have an overall win.
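
To make the proposed check order concrete, here is a pure-Python model of
it. This is only a sketch of the dispatch order; the real code is C inside
the interpreter, and `binary_add` is a name invented for this example.

```python
def binary_add(v, w):
    """Pure-Python model of the proposed check order; not the C code."""
    if type(v) is int and type(w) is int:
        return int.__add__(v, w)      # fast path 1: int + int
    if type(v) is str and type(w) is str:
        return str.__add__(v, w)      # fast path 2: string + string
    return v + w                      # everything else: general machinery
```

Every addition that is not int+int now pays for one extra failed type test
before it reaches the general machinery, which is the cost Tim describes.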

Skip



From mal@lemburg.com  Thu Jul 12 14:46:52 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 12 Jul 2001 15:46:52 +0200
Subject: [Distutils] RE: [Python-Dev] PEP 250 - site-packages on Windows:
 (Was: [Distutils]    Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEE@ukrux002.rundc.uk.origin-it.com>
Message-ID: <3B4DAA4C.A6F0859D@lemburg.com>

"Moore, Paul" wrote:
> 
> From: M.-A. Lemburg [mailto:mal@lemburg.com]
> > They should be in the CVS tree of Python on SourceForge.
> I don't have CVS access, so I can't get at these, unfortunately...

There should be a tarball of the CVS archive available somewhere
on SF.
 
> > About the change: I think distutils should look up the path in Python's
> > site.py file - that way you ensure that distutils will work on all
> > Python installations rather than only on those which have the site.py
> > patch. Otherwise, Python won't find the packages installed in
> > Lib/site-packages.
> 
> I'm not sure what you intend, here. site.py doesn't export this directory -
> it is just one of the directories which gets added to sys.path in site.py.
> On Unix, there is more than one such directory (both version-specific and
> version-independent), so there isn't, in general, just one such directory. I
> don't know how you could encapsulate this in a way which would not clash
> with other platforms' policies.
> 
> The intention of this change was to be the smallest possible change which
> would work. I believe it (or at least, the patch I sent when I submitted the
> final version of the PEP) does that for everything except the Windows
> Installer. I'll have to defer judgement on how best to address that area to
> others better qualified to comment, but see my message to Thomas for my
> suggestion.

Well, site.py could be modified to set a symbol in the sys module
which could then be queried by distutils, e.g. sys.extinstallprefix.

Alternatively, distutils could be made to default to 
Lib\site-packages and then revert to Lib\ in case this directory
is not available.
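
A minimal sketch of that second alternative, assuming `prefix` is the Python
installation directory; `default_install_dir` is a hypothetical helper, not
an existing distutils function.

```python
import os


def default_install_dir(prefix):
    """Return <prefix>/Lib/site-packages if it exists, else <prefix>/Lib."""
    site_packages = os.path.join(prefix, "Lib", "site-packages")
    if os.path.isdir(site_packages):
        return site_packages
    # Older installations without site-packages keep the previous location.
    return os.path.join(prefix, "Lib")
```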

BTW, I don't think that using Windows registry keys for determining the
installation path is a good idea -- this information should be kept
in the site.py or sitecustomize.py module for easy editing.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From nas@python.ca  Thu Jul 12 14:51:05 2001
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 12 Jul 2001 06:51:05 -0700
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15181.41580.258091.412915@beluga.mojam.com>; from skip@pobox.com on Thu, Jul 12, 2001 at 08:13:16AM -0500
References: <15179.61728.255760.814673@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKENGKNAA.tim.one@home.com> <15181.41580.258091.412915@beluga.mojam.com>
Message-ID: <20010712065105.A16964@glacier.fnational.com>

Skip Montanaro wrote:
> Hmmm... What file we talkin' about Willis?  If we did
> 
>     test for int+int
>     test for string+string
>     general machinery
> 
> we might speed up a couple very common cases enough to have an overall win.

BINARY_ADD in ceval.c.  I would guess that special casing strings would
be an overall loss.

  Neil


From thomas@xs4all.net  Thu Jul 12 14:54:09 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 15:54:09 +0200
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15181.41580.258091.412915@beluga.mojam.com>
References: <15179.61728.255760.814673@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKENGKNAA.tim.one@home.com> <15181.41580.258091.412915@beluga.mojam.com>
Message-ID: <20010712155409.K5396@xs4all.nl>

On Thu, Jul 12, 2001 at 08:13:16AM -0500, Skip Montanaro wrote:

>     Tim> AFAIK, nobody did anything *except* discuss it so far.  Insert an
>     Tim> early special case for sequence cat, and you slow each numeric
>     Tim> addition by the time it takes to fail that test, so there's no
>     Tim> killer argument either way.  int+int is special-cased by
>     Tim> BINARY_ADD, but everything else goes thru the general machinery.

> Hmmm... What file we talkin' about Willis?

ceval.c, just look for BINARY_ADD.

> If we did

>     test for int+int
>     test for string+string
>     general machinery

> we might speed up a couple very common cases enough to have an overall win.

Don't forget to do meaningful performance comparisons before and after ;P

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Thu Jul 12 15:00:33 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 12 Jul 2001 10:00:33 -0400
Subject: [Python-Dev] "Brennan, Bernadette": Python 1.5.2 & Solaris 8
Message-ID: <200107121400.f6CE0XA14567@odiug.digicool.com>

Does anyone remember what the problems with Solaris-8 were?  Shallow,
I hope?

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Thu, 12 Jul 2001 09:11:08 -0400
From:    "Brennan, Bernadette" <Bernadette.Brennan@GD-CS.COM>
To:      "'guido@python.org'" <guido@python.org>
Subject: Python 1.5.2 & Solaris 8

We are currently using Python 1.5.2 on Solaris 5.  I have been tasked to
upgrade to Solaris 8, and I am running into problems compiling with Python.
Can you tell me if Python 1.5.2 is compatible with Solaris 8?  If 1.5.2 is
not compatible are any of the newer releases of Python?  Thank you for your
help.

Bernadette Brennan

------- End of Forwarded Message



From thomas@xs4all.net  Thu Jul 12 15:33:14 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 16:33:14 +0200
Subject: [Python-Dev] Python 2.1.1
Message-ID: <20010712163314.L5396@xs4all.nl>

I'm done checking in bugfixes into 2.1.1. I went through all checkins since
release21 (using some evil python & shell scriptery to get it in the first
place) and caught a few more, today. However, I see two bugs on SF that still
bug me, and I'd like to see fixed:

  [ #425007 ] Python 2.1 installs shared libs with mode 0700
https://sourceforge.net/tracker/index.php?func=detail&aid=425007&group_id=5470&atid=105470

  [ #230075 ] dbmmodule build fails on Debian GNU/Linux unstable (Sid)
https://sourceforge.net/tracker/index.php?func=detail&aid=230075&group_id=5470&atid=105470

Both of these are distutils-build related, and I'm not sure on the 'right'
fix on either. The latter also applies to 'bsddb', by the way, and is
especially annoying to me, because I'm running Debian on more and more
machines :) Does anyone who understands setup.py have time to look at these
before a week from friday, when 2.1.1-final is scheduled ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From fdrake@acm.org  Thu Jul 12 15:48:03 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 12 Jul 2001 10:48:03 -0400 (EDT)
Subject: [Python-Dev] Docs for 2.1.1c1 frozen
Message-ID: <15181.47267.392908.120928@cj42289-a.reston1.va.home.com>

  I'm freezing the Doc/ tree on the release21-maint branch until the
2.1.1c1 release is out.  If you find a bug in that version of the
docs, please report it via the SourceForge bug tracker, even if you
have checkin permission, at least until the freeze is lifted.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From thomas@xs4all.net  Thu Jul 12 15:55:01 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 16:55:01 +0200
Subject: [Python-Dev] "Brennan, Bernadette": Python 1.5.2 & Solaris 8
In-Reply-To: <200107121400.f6CE0XA14567@odiug.digicool.com>
References: <200107121400.f6CE0XA14567@odiug.digicool.com>
Message-ID: <20010712165501.M5396@xs4all.nl>

On Thu, Jul 12, 2001 at 10:00:33AM -0400, Guido van Rossum wrote:
> Does anyone remember what the problems with Solaris-8 were?  Shallow,
> I hope?

No clue, sorry. I don't have Solaris 8, either, but I do have access to
Solaris 7 (currently, but not for long) and will attempt to build a couple
of releases on it.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tim@digicool.com  Thu Jul 12 15:57:30 2001
From: tim@digicool.com (Tim Peters)
Date: Thu, 12 Jul 2001 10:57:30 -0400
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]  Package DB: strawman PEP)
In-Reply-To: <3B4D587D.FDB733C0@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHIEIPCCAA.tim@digicool.com>

[MAL]
> Cool, but what about the changes needed in distutils to actually
> utilize the new directory

Be my guest -- don't know anything about that, and no time to learn.

> and the changes to the Windows installer to create the directory
> at installation time ?

OK, I'll look into that, although it doesn't seem necessary.



From skip@pobox.com (Skip Montanaro)  Thu Jul 12 16:09:36 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 12 Jul 2001 10:09:36 -0500
Subject: [Python-Dev] What are these defines doing in OPT?
Message-ID: <15181.48560.980298.911962@beluga.mojam.com>

I just tried Fred's distutils.sysconfig.OPT thing, which failed.  I then
poked around and found distutils.sysconfig.get_config_var.  That wouldn't work
because I hadn't installed the 2.2 interpreter (there is as yet no
/usr/local/lib/python2.2).  So, I finally just grepped my Makefile for OPT
and found this definition:

    OPT=  -g -O2 -Wall -Wstrict-prototypes -Dss_family=__ss_family -Dss_len=__ss_len

What are those -D flags doing in OPT?  Shouldn't they be in CPPFLAGS or
CFLAGS? 
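
To show which part of that OPT value is preprocessor defines rather than
optimization or warning flags, here is a small sketch. `split_defines` is a
helper invented for this example; it only recognises the joined "-DNAME"
form, not "-D NAME" with a space.

```python
import shlex


def split_defines(flags):
    """Separate -D preprocessor defines from the remaining compiler flags."""
    defines, others = [], []
    for flag in shlex.split(flags):
        if flag.startswith("-D"):
            defines.append(flag)
        else:
            others.append(flag)
    return defines, others


opt = ("-g -O2 -Wall -Wstrict-prototypes "
       "-Dss_family=__ss_family -Dss_len=__ss_len")
defines, others = split_defines(opt)
```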

Skip



From thomas.heller@ion-tof.com  Thu Jul 12 16:11:19 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 12 Jul 2001 17:11:19 +0200
Subject: [Distutils] RE: [Python-Dev] PEP 250 - site-packages on Windows:  (Was: [Distutils]    Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEE@ukrux002.rundc.uk.origin-it.com> <3B4DAA4C.A6F0859D@lemburg.com>
Message-ID: <12c801c10ae4$e79d1fe0$e000a8c0@thomasnotebook>

[I've cut down the To: and CC: headers to only include python-dev
and distutils]
> Well, site.py could be modified to set a symbol in the sys module
> which could then be queried by distutils, e.g. sys.extinstallprefix.
> 
> Alternatively, distutils could be made to default to 
> Lib\site-packages and then revert to Lib\ in case this directory
> is not available.
> 
> BTW, I don't think that using Windows registry keys for determining the
> installation path is a good idea -- this information should be kept
> in the site.py or sitecustomize.py module for easy editing.

The problem is that the 'installation path' information must be
loaded at run time by the windows installer, and it may not always be
successful to embed python at run time and let python code retrieve it.
Remember the problems we had with Python2.0 on win95/98, when win32all
was not installed? The installer was not able to compile the installed
files to pyc/pyo because of this path bug.

Anyway, how does bdist_rpm do it? Should be the same problem
there...

Thomas



From nas@python.ca  Thu Jul 12 16:18:09 2001
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 12 Jul 2001 08:18:09 -0700
Subject: [Python-Dev] What are these defines doing in OPT?
In-Reply-To: <15181.48560.980298.911962@beluga.mojam.com>; from skip@pobox.com on Thu, Jul 12, 2001 at 10:09:36AM -0500
References: <15181.48560.980298.911962@beluga.mojam.com>
Message-ID: <20010712081809.A17168@glacier.fnational.com>

Skip Montanaro wrote:
>     OPT=  -g -O2 -Wall -Wstrict-prototypes -Dss_family=__ss_family -Dss_len=__ss_len
> 
> What are those -D flags doing in OPT?  Shouldn't they be in CPPFLAGS or
> CFLAGS? 

IMHO, they should be in DEFS.  Any objections to moving them there?

  Neil


From skip@pobox.com (Skip Montanaro)  Thu Jul 12 16:54:26 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 12 Jul 2001 10:54:26 -0500
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <20010712155409.K5396@xs4all.nl>
References: <15179.61728.255760.814673@beluga.mojam.com>
 <LNBBLJKPBEHFEDALKOLCKENGKNAA.tim.one@home.com>
 <15181.41580.258091.412915@beluga.mojam.com>
 <20010712155409.K5396@xs4all.nl>
Message-ID: <15181.51250.50466.870207@beluga.mojam.com>

    Thomas> Don't forget to do meaningful performance comparisons before and
    Thomas> after ;P

It's that adjective "meaningful" that makes it difficult...  Obviously
pystone wouldn't be meaningful since it doesn't do much string stuff.  I
tried timing the following:

    PYTHONPATH= time ./python -tt ../Lib/test/regrtest.py -l

after making sure the .py[co] files were deleted.  I got "102.96user
1.47system" before and "103.24user 1.57system" after.  I then removed the
.py[co] files again and ran the same test under gdb, with breakpoints in
each of the three branches whose break commands incremented counters.  After
letting it run for *a while*, I got tired of waiting for it to complete (it
was in the midst of test___all__).  I broke into the debugger then examined
the counters.  The int/int branch had been taken 5432 times, the
string/string branch had been taken 635 times and the else branch 673 times.
It would appear that string/string add is perhaps the second-most executed
type of add, but that it is executed infrequently enough (at least by the
test suite) that special-casing it will have no effect.  Still, if you are
doing lots of string concatenation, perhaps looking at other methods (append
to list, then join the result, for example) would be worthwhile.
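
For reference, the two approaches look like this; both functions are
illustrative names only, and no timings are claimed here.

```python
def concat_with_plus(pieces):
    """Build the result with repeated string addition."""
    result = ""
    for piece in pieces:
        result = result + piece   # each + may allocate a new string
    return result


def concat_with_join(pieces):
    """Collect the pieces in a list, then join once at the end."""
    parts = []
    for piece in pieces:
        parts.append(piece)
    return "".join(parts)
```

Both return the same string; the join version does a single concatenation
at the end instead of one per piece.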

Skip




From trentm@ActiveState.com  Thu Jul 12 17:01:13 2001
From: trentm@ActiveState.com (Trent Mick)
Date: Thu, 12 Jul 2001 09:01:13 -0700
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]  Package DB: strawman PEP)
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHIEIPCCAA.tim@digicool.com>; from tim@digicool.com on Thu, Jul 12, 2001 at 10:57:30AM -0400
References: <3B4D587D.FDB733C0@lemburg.com> <BIEJKCLHCIOIHAGOKOLHIEIPCCAA.tim@digicool.com>
Message-ID: <20010712090113.E10387@ActiveState.com>

On Thu, Jul 12, 2001 at 10:57:30AM -0400, Tim Peters wrote:
> > and the changes to the Windows installer to create the directory
> > at installation time ?
> 
> OK, I'll look into that, although it doesn't seem necessary.

I have to agree with Tim. If distutils is going to install a package to
site-packages then it should create the directory itself if it does not
exist. Certainly it should not fail if the directory does not exist.

Trent


-- 
Trent Mick
TrentM@ActiveState.com


From trentm@ActiveState.com  Thu Jul 12 17:08:58 2001
From: trentm@ActiveState.com (Trent Mick)
Date: Thu, 12 Jul 2001 09:08:58 -0700
Subject: [Distutils] RE: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distu tils]   Package DB: strawman PEP)
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEB@ukrux002.rundc.uk.origin-it.com>; from Paul.Moore@atosorigin.com on Thu, Jul 12, 2001 at 11:02:28AM +0100
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEB@ukrux002.rundc.uk.origin-it.com>
Message-ID: <20010712090858.F10387@ActiveState.com>

On Thu, Jul 12, 2001 at 11:02:28AM +0100, Moore, Paul wrote:
> From: M.-A. Lemburg [mailto:mal@lemburg.com]
> > I don't have a site-packages dir in my installations. Could it be that
> > you installed some distutils package which automagically created one 
> > or that this changed in Python 2.1.1 ?
> 
> I don't believe so. I have ActiveState Python - it's possible (although
> unlikely, I would think) that that version creates site-packages specially.

The ActivePython 2.1 installer *does* create <installdir>\Lib\site-packages
on Windows.


Trent

-- 
Trent Mick
TrentM@ActiveState.com


From jeremy@alum.mit.edu  Thu Jul 12 17:26:50 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 12 Jul 2001 12:26:50 -0400
Subject: [Python-Dev] Python 2.1.1
In-Reply-To: <20010712163314.L5396@xs4all.nl>
Message-ID: <AJEAKILOCCJMDILAPGJNCENDCFAA.jeremy@alum.mit.edu>

There's also a nested scopes bug, related to classes that use the same free
variable in several classes.  Evan Simpson said he posted a SF report about
it, but I can't find it.  I may be able to look into it. I'd rather not be
on the hook for it, but I'm not sure anyone else understands the code :-(.

Jeremy



From jeremy@alum.mit.edu  Thu Jul 12 17:26:51 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 12 Jul 2001 12:26:51 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15181.51250.50466.870207@beluga.mojam.com>
Message-ID: <AJEAKILOCCJMDILAPGJNEENDCFAA.jeremy@alum.mit.edu>

    Thomas> Don't forget to do meaningful performance comparisons before and
    Thomas> after ;P

It's that adjective "meaningful" that makes it difficult...

>From http://mail.python.org/pipermail/python-dev/2001-May/014911.html:

"""It looks like the new coercion rules have optimized number ops at the
expense of string ops.  If you're writing programs with lots of
numbers, you probably think that's peachy.  If you're parsing HTML,
perhaps you don't :-).

I looked at the test suite to see how often it is called with
non-number arguments.  The answer is 77% of the time, but almost all
of those calls are from test_unicodedata.  If that one test is
excluded, the majority of the calls (~90%) are with numbers.  But the
majority of those calls just come from a few tests -- test_pow,
test_long, test_mutants, test_strftime.

If I were to do something about the coercions, I would see if there
was a way to quickly determine that PyNumber_Add() ain't gonna have
any luck.  Then we could bail to things like string_concat more
quickly."""

Jeremy



From thomas@xs4all.net  Thu Jul 12 17:33:06 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 18:33:06 +0200
Subject: [Python-Dev] Python 2.1.1
In-Reply-To: <AJEAKILOCCJMDILAPGJNCENDCFAA.jeremy@alum.mit.edu>
References: <AJEAKILOCCJMDILAPGJNCENDCFAA.jeremy@alum.mit.edu>
Message-ID: <20010712183305.S5391@xs4all.nl>

On Thu, Jul 12, 2001 at 12:26:50PM -0400, Jeremy Hylton wrote:
> There's also a nested scopes bug, related to classes that use the same free
> variable in several classes.  Evan Simpson said he posted a SF report about
> it, but I can't find it.  I may be able to look into it. I'd rather not be
> on the hook for it, but I'm not sure anyone else understands the code :-(.

As far as I could determine, that bug is the one you fixed shortly after
2.1-release:

----------------------------
compile.c revision 2.198
date: 2001/04/27 02:29:40;  author: jhylton;  state: Exp;  lines: +20 -6
Fix 2.1 nested scopes crash reported by Evan Simpson

The new test case demonstrates the bug.  Be more careful in
symtable_resolve_free() to add a var to cells or frees only if it
won't be added under some other rule.

XXX Add new assertion that will catch this bug.
-----------------------------

I couldn't reproduce his bugreport using 2.2/2.1.1-with-this-fix, but I
could with 2.1-final, so I mentioned that, marked it fixed and closed it.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From skip@pobox.com (Skip Montanaro)  Thu Jul 12 17:33:13 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 12 Jul 2001 11:33:13 -0500
Subject: [Python-Dev] Python 2.1.1
In-Reply-To: <20010712163314.L5396@xs4all.nl>
References: <20010712163314.L5396@xs4all.nl>
Message-ID: <15181.53577.543414.970113@beluga.mojam.com>

    Thomas> Both of these are distutils-build related, and I'm not sure on
    Thomas> the 'right' fix on either. The latter also applies to 'bsddb',
    Thomas> by the way, and is especially annoying to me, because I'm
    Thomas> running Debian on more and more machines :) Does anyone who
    Thomas> understands setup.py have time to look at these before a week
    Thomas> from friday, when 2.1.1-final is scheduled ?

I just added another variant (with a patch): bsddb build on Mandrake 8.0 is
broken because it doesn't account for the libdb* shared library when
creating bsddb.so:

    https://sourceforge.net/tracker/index.php?func=detail&aid=440725&group_id=5470&atid=105470

Thomas, I'm not sure if this applies to your Debian build woes, but perhaps
it will help.

-- 
Skip Montanaro (skip@pobox.com)
(847)971-7098


From mal@lemburg.com  Thu Jul 12 17:38:41 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 12 Jul 2001 18:38:41 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]
 Package DB: strawman PEP)
References: <3B4D587D.FDB733C0@lemburg.com> <BIEJKCLHCIOIHAGOKOLHIEIPCCAA.tim@digicool.com> <20010712090113.E10387@ActiveState.com>
Message-ID: <3B4DD291.7FB22BC6@lemburg.com>

Trent Mick wrote:
> 
> On Thu, Jul 12, 2001 at 10:57:30AM -0400, Tim Peters wrote:
> > > and the changes to the Windows installer to create the directory
> > > at installation time ?
> >
> > OK, I'll look into that, although it doesn't seem necessary.
> 
> I have to agree with Tim. If distutils is going to install a package to
> site-packages then it should create the directory itself if it does not
> exist. Certainly it should not fail if the directory does not exist.

I believe that it creates the directory (distutils has a mkpath()
API for this), but having it there for testing would sure help
in figuring out what to do. Please keep in mind that distutils
has to work with Python versions 1.5.2, 2.0 and 2.1.

Also, I think that it is cleaner to have existing directories
on sys.path. Indeed, it may be worthwhile having Python eliminate
non-existing dirs at startup time (i.e. in site.py).
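
A sketch of what such a cleanup could look like; `existing_entries` is a
hypothetical helper, not something site.py provides. It keeps the empty
string (current directory) and anything that exists on disk.

```python
import os


def existing_entries(paths):
    """Drop sys.path-style entries that do not exist, keeping order."""
    kept = []
    for p in paths:
        # '' means "current directory" on sys.path, so it is always kept.
        if p == "" or os.path.exists(p):
            kept.append(p)
    return kept
```

In site.py this would amount to something like
`sys.path[:] = existing_entries(sys.path)`.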

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From mal@lemburg.com  Thu Jul 12 17:43:36 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 12 Jul 2001 18:43:36 +0200
Subject: [Distutils] RE: [Python-Dev] PEP 250 - site-packages on Windows:
 (Was: [Distutils]    Package DB: strawman PEP)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEEE@ukrux002.rundc.uk.origin-it.com> <3B4DAA4C.A6F0859D@lemburg.com> <12c801c10ae4$e79d1fe0$e000a8c0@thomasnotebook>
Message-ID: <3B4DD3B8.8A176981@lemburg.com>

Thomas Heller wrote:
> 
> [I've cut down the To: and CC: headers to only include python-dev
> and distutils]
> > Well, site.py could be modified to set a symbol in the sys module
> > which could then be queried by distutils, e.g. sys.extinstallprefix.
> >
> > Alternatively, distutils could be made to default to
> > Lib\site-packages and then revert to Lib\ in case this directory
> > is not available.
> >
> > BTW, I don't think that using Windows registry keys for determining the
> > installation path is a good idea -- this information should be kept
> > in the site.py or sitecustomize.py module for easy editing.
> 
> The problem is that the 'installation path' information must be
> loaded at run time by the windows installer, and it may not always be
> successful to embed python at run time and let python code retrieve it.
> Remember the problems we had with Python2.0 on win95/98, when win32all
> was not installed? The installer was not able to compile the installed
> files to pyc/pyo because of this path bug.

Ok. Point taken (this time ;-).
 
> Anyway, how does bdist_rpm do it? Should be the same problem
> there...

bdist_rpm runs the Python interpreter to figure out the install
dirs, etc. at rpm build time. The paths are then hard-coded into
the rpm file.
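
The idea of asking the running interpreter for its install directories can
be sketched as below. Note the assumption: this uses the standalone
`sysconfig` module of later Python versions as a stand-in, not the
distutils code bdist_rpm actually ran at the time.

```python
import sysconfig


def install_dirs():
    """Ask the running interpreter where modules get installed."""
    return {
        "purelib": sysconfig.get_path("purelib"),   # pure-Python modules
        "platlib": sysconfig.get_path("platlib"),   # extension modules
    }
```

A build step would call this once and write the resulting paths into the
package metadata, so nothing has to be computed at install time.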

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From thomas@xs4all.net  Thu Jul 12 17:43:19 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 18:43:19 +0200
Subject: [Python-Dev] Python 2.1.1
In-Reply-To: <15181.53577.543414.970113@beluga.mojam.com>
References: <20010712163314.L5396@xs4all.nl> <15181.53577.543414.970113@beluga.mojam.com>
Message-ID: <20010712184319.T5391@xs4all.nl>

On Thu, Jul 12, 2001 at 11:33:13AM -0500, Skip Montanaro wrote:

>     Thomas> Both of these are distutils-build related, and I'm not sure on
>     Thomas> the 'right' fix on either. The latter also applies to 'bsddb',
>     Thomas> by the way, and is especially annoying to me, because I'm
>     Thomas> running Debian on more and more machines :) Does anyone who
>     Thomas> understands setup.py have time to look at these before a week
>     Thomas> from friday, when 2.1.1-final is scheduled ?

> I just added another variant (with a patch): bsddb build on Mandrake 8.0 is
> broken because it doesn't account for the libdb* shared library when
> creating bsddb.so:
> 
>     https://sourceforge.net/tracker/index.php?func=detail&aid=440725&group_id=5470&atid=105470

> Thomas, I'm not sure if this applies to your Debian build woes, but
> perhaps it will help.

Yes, it does! Now bsddb builds, but dbmmodule still doesn't. It seems that's
because setup.py only checks for libndbm.so, and not for libdbX.so, which
also have a DBM implementation (IIRC), or libgdbm.so, which has one too.
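
The kind of fallback check described above might be sketched as follows.
This is not the setup.py code: the finder is passed in as a plain callable
standing in for the compiler's find_library_file(), and the candidate
names ("ndbm", "gdbm") are assumptions taken from the libraries named in
this message. The versioned libdbX name is left out because it varies by
system.

```python
def pick_dbm_library(find_library_file, lib_dirs, candidates=("ndbm", "gdbm")):
    """Return the first candidate library the finder can locate, or None.

    find_library_file is any callable taking (lib_dirs, name) and
    returning a truthy value when the library exists.
    """
    for name in candidates:
        if find_library_file(lib_dirs, name):
            return name
    return None


def fake_finder(lib_dirs, name):
    """Hypothetical finder that pretends only libgdbm is installed."""
    return "/usr/lib/libgdbm.so" if name == "gdbm" else None


if __name__ == "__main__":
    print(pick_dbm_library(fake_finder, ["/usr/lib"]))
```

Whether a given library really provides the symbols dbmmodule needs (such
as dbm_firstkey) still has to be checked separately; the sketch only shows
the search order.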

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Thu Jul 12 17:50:29 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 18:50:29 +0200
Subject: [Python-Dev] Python 2.1.1
In-Reply-To: <20010712184319.T5391@xs4all.nl>
Message-ID: <20010712185029.N5396@xs4all.nl>

On Thu, Jul 12, 2001 at 06:43:19PM +0200, Thomas Wouters wrote:

> Now bsddb builds, but dbmmodule still doesn't.

I should have said 'works'. They both build, dbmmodule just doesn't work:

test dbm skipped -- /home/thomas/python/python-2.1.1/dist/src/build/lib.linux-i686-2.1/dbm.so: undefined symbol: dbm_firstkey


-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From cgw@transamoeba.dyndns.org  Thu Jul 12 18:10:33 2001
From: cgw@transamoeba.dyndns.org (charles g waldman)
Date: Thu, 12 Jul 2001 12:10:33 -0500
Subject: [Python-Dev] Re: Python 1.5.2 and Solaris 8
In-Reply-To: <E15Kiv0-0001Zw-00@mail.python.org>
References: <E15Kiv0-0001Zw-00@mail.python.org>
Message-ID: <15181.55817.853438.785937@transamoeba.dyndns.org>

I have access to a Solaris 8 machine, but it's only 32 bits wide.

I did a quick build of Py 1.5.2  (which I haven't run for quite some time!)
under both the native (SunPro) compiler and also using gcc 2.95.2

I configured  --with-thread and ran the test suite.

There were no compile-time errors or warnings, but one test failed:

 test test_popen2 crashed -- exceptions.AssertionError : 
  Traceback (innermost last):
   File "./Lib/test/regrtest.py", line 204, in runtest
     __import__(test, globals(), locals(), [])
   File "./Lib/test/test_popen2.py", line 16, in ?
     main()
   File "./Lib/test/test_popen2.py", line 14, in main
     popen2._test()
   File "./Lib/popen2.py", line 95, in _test
     assert not _active
 AssertionError: 


This happened with both the gcc and SunPro builds.

Everything else looks OK, but I did not do any extensive tests beyond
"import test.testall"








From thomas@xs4all.net  Thu Jul 12 18:19:40 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 19:19:40 +0200
Subject: [Python-Dev] Re: Python 1.5.2 and Solaris 8
In-Reply-To: <15181.55817.853438.785937@transamoeba.dyndns.org>
References: <E15Kiv0-0001Zw-00@mail.python.org> <15181.55817.853438.785937@transamoeba.dyndns.org>
Message-ID: <20010712191940.O5396@xs4all.nl>

On Thu, Jul 12, 2001 at 12:10:33PM -0500, charles g waldman wrote:

> I have access to a Solaris 8 machine, but it's only 32 bits wide.

Weren't there a bunch of 64-bit-system fixes in 2.0/2.1 ? Or were they just
for the Windows flavour, where pointers were bigger than longs ?

> I did a quick build of Py 1.5.2  (which I haven't run for quite some time!)
> under both the native (SunPro) compiler and also using gcc 2.95.2

Could I bug you to do the same thing with Python 2.1.1c1 ? My own attempts
on Solaris 7 worked okay, but two things failed: readline, and socket (with
SSL support.) The latter works okay without SSL. I suspect that's because
both libreadline and libcrypto/libssl are static libraries, not shared ones,
and the linker barfs on it, but that's just something I realized on the way
home, so I haven't doublechecked it :) All the other modules seem to compile
fine, and all tests pass, too.

Still, it would be nice to test 2.1.1c1 on as many obscure systems as
possible ;)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From jeremy@alum.mit.edu  Thu Jul 12 18:39:25 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 12 Jul 2001 13:39:25 -0400
Subject: [Python-Dev] Python 2.1.1
In-Reply-To: <20010712183305.S5391@xs4all.nl>
Message-ID: <AJEAKILOCCJMDILAPGJNGENJCFAA.jeremy@alum.mit.edu>

There was a second bug reported recently.  I'll try to dig up the email.

Jeremy



From fdrake@acm.org  Thu Jul 12 21:10:16 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 12 Jul 2001 16:10:16 -0400 (EDT)
Subject: [Distutils] Re: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]
 Package DB: strawman PEP)
In-Reply-To: <3B4DD291.7FB22BC6@lemburg.com>
References: <3B4D587D.FDB733C0@lemburg.com>
 <BIEJKCLHCIOIHAGOKOLHIEIPCCAA.tim@digicool.com>
 <20010712090113.E10387@ActiveState.com>
 <3B4DD291.7FB22BC6@lemburg.com>
Message-ID: <15182.1064.74187.737679@cj42289-a.reston1.va.home.com>

M.-A. Lemburg writes:
 > I believe that it creates the directory (distutils has a make_path()
 > API for this), but having it there for testing would sure help
 > in figuring out what to do. Please keep in mind that distutils
 > has to work with Python versions 1.5.2, 2.0 and 2.1.

  Yes; the os.path.isdir(...) seems the right test for this.

 > Also, I think that it is cleaner to have existing directories
 > on sys.path. Indeed, it may be worthwhile having Python eliminate
 > non-existing dirs at startup time (i.e. in site.py).

  It should be doing that now.  If not, please file a bug report and
assign it to me.
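
  A minimal sketch of the os.path.isdir() filtering discussed above,
assuming a plain list of path strings. The function name existing_dirs is
invented for illustration, and this is not the code in site.py; in
particular it would also drop entries such as '' that are not
directories.

```python
import os


def existing_dirs(paths):
    """Keep only entries that are existing directories, preserving
    order and dropping duplicates."""
    seen = set()
    result = []
    for entry in paths:
        # Skip repeats and anything os.path.isdir() does not accept.
        if entry in seen or not os.path.isdir(entry):
            continue
        seen.add(entry)
        result.append(entry)
    return result


if __name__ == "__main__":
    here = os.getcwd()
    missing = os.path.join(here, "no_such_dir_for_this_example")
    print(existing_dirs([here, missing, here]))
```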


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tim@digicool.com  Thu Jul 12 21:23:13 2001
From: tim@digicool.com (Tim Peters)
Date: Thu, 12 Jul 2001 16:23:13 -0400
Subject: [Distutils] Re: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils] Package DB: strawman PEP)
In-Reply-To: <3B4DD291.7FB22BC6@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHMEKECCAA.tim@digicool.com>

FYI, I fiddled the Windows Wise install script to create

     Lib\site-packages\

Of course this only applies to PythonLabs Windows installers created at or
after 2.2a1.



From fdrake@acm.org  Thu Jul 12 21:46:41 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 12 Jul 2001 16:46:41 -0400 (EDT)
Subject: [Distutils] Re: [Python-Dev] PEP 250 - site-packages on
 Windows: (Was: [Distutils] Package DB: strawman PEP)
In-Reply-To: <15182.1064.74187.737679@cj42289-a.reston1.va.home.com>
References: <3B4D587D.FDB733C0@lemburg.com>
 <BIEJKCLHCIOIHAGOKOLHIEIPCCAA.tim@digicool.com>
 <20010712090113.E10387@ActiveState.com>
 <3B4DD291.7FB22BC6@lemburg.com>
 <15182.1064.74187.737679@cj42289-a.reston1.va.home.com>
Message-ID: <15182.3249.195312.99147@cj42289-a.reston1.va.home.com>

Fred L. Drake, Jr. writes:
 >   It should be doing that now.  If not, please file a bug report and
 > assign it to me.

  Never mind.  It is a bug, and I'm about to check in the fix.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From fredrik@pythonware.com  Thu Jul 12 21:54:02 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 12 Jul 2001 22:54:02 +0200
Subject: [Python-Dev] one more thing for 2.2?
Message-ID: <005401c10b14$ca994ba0$4ffa42d5@hagrid>

has anyone looked at Paul Svensson's "unreserved words" patch?

http://mail.python.org/pipermail/python-list/2001-June/047996.html

    "The bottom line: apply this patch, and you can use all of Python's
    'reserved words' as identifiers; in most cases right away, in all other
    cases by wrapping parens around them."

</F>



From guido@digicool.com  Thu Jul 12 22:02:30 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 12 Jul 2001 17:02:30 -0400
Subject: [Python-Dev] one more thing for 2.2?
In-Reply-To: Your message of "Thu, 12 Jul 2001 22:54:02 +0200."
 <005401c10b14$ca994ba0$4ffa42d5@hagrid>
References: <005401c10b14$ca994ba0$4ffa42d5@hagrid>
Message-ID: <200107122102.f6CL2V215763@odiug.digicool.com>

> has anyone looked at Paul Svensson's "unreserved words" patch?
> 
> http://mail.python.org/pipermail/python-list/2001-June/047996.html
> 
>     "The bottom line: apply this patch, and you can use all of Python's
>     'reserved words' as identifiers; in most cases right away, in all other
>     cases by wrapping parens around them."

Wow, an impressive hack.  But a hack!  Lots of special casing, and
breaks abstractions: the parser driver is supposed to know nothing
about the actual grammar embodied in its tables.

And it won't help with yield: things like

    yield (1)
    yield [1]

are as valid in the old syntax as they are with the yield statement
added.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Thu Jul 12 22:13:11 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 12 Jul 2001 23:13:11 +0200
Subject: [Python-Dev] one more thing for 2.2?
In-Reply-To: <200107122102.f6CL2V215763@odiug.digicool.com>
Message-ID: <20010712231311.P5396@xs4all.nl>

On Thu, Jul 12, 2001 at 05:02:30PM -0400, Guido van Rossum wrote:
> > has anyone looked at Paul Svensson's "unreserved words" patch?
> > 
> > http://mail.python.org/pipermail/python-list/2001-June/047996.html
> > 
> >     "The bottom line: apply this patch, and you can use all of Python's
> >     'reserved words' as identifiers; in most cases right away, in all other
> >     cases by wrapping parens around them."

> Wow, an impressive hack.  But a hack!  Lots of special casing, and
> breaks abstractions: the parser driver is supposed to know nothing
> about the actual grammar embodied in its tables.

But does it hurt if it does ? It's not like we use it as a general purpose
parser right now, and would we really want to use the current parser as a
general purpose one ? I have to agree that a nice, clean, powerful parser
that can deal better with ambiguities (an LR parser, is that what it's
called ? :P) is a much better solution, but in some cases, a hack is better
than nothing.

> And it won't help with yield: things like

>     yield (1)
>     yield [1]

> are as valid in the old syntax as they are with the yield statement
> added.

No, but it will help with bindings to languages that require keywords. .NET
comes to mind, again, as does Java. It would also be very cool if we could
rename pprint.pprint to pprint.print ;P

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tim@digicool.com  Thu Jul 12 22:22:53 2001
From: tim@digicool.com (Tim Peters)
Date: Thu, 12 Jul 2001 17:22:53 -0400
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]   Package DB: strawman PEP)
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AEED@ukrux002.rundc.uk.origin-it.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEKICCAA.tim@digicool.com>

[Thomas Heller]
> The bdist_wininst installer simply installs into prefix,
> this is what the registry has under
> HKEY_LOCAL_MACHINE\Software\Python\PythonCore\2.1\InstallPath.
>
> Now what should it do?

[Moore, Paul]
> What does that key *mean*?

Mark Hammond documented it as being the directory into which Python is
installed; python.exe lives here.

> If it is the directory into which packages should get installed,

No; it's much older than the package mechanism <0.1 wink>.

> then bdist_wininst should keep doing what it does now, and the
> Python installer should be changed to put site-packages into that key.

Not its purpose.

> If, on the other hand, this key has a meaning elsewhere in Python,

Not in the PythonLabs distribution, but I expect Mark's Win32 extensions
make use of it.

> and changing it would cause a problem,

IMO, any change to the registry settings requires Mark Hammond's blessing.

> then I would say that this is a bug in the Windows Installer, which
> should use a key of its own.

Couldn't follow that one.

> In that case, my recommendation would be to have the Python 2.2
> installer create a new key and wininst use that if it exists,
> ...

If you have to use the registry, why not paste Lib/site-packages on to the
end of InstallPath and use that?



From guido@digicool.com  Thu Jul 12 22:28:42 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 12 Jul 2001 17:28:42 -0400
Subject: [Python-Dev] one more thing for 2.2?
In-Reply-To: Your message of "Thu, 12 Jul 2001 23:13:11 +0200."
 <20010712231311.P5396@xs4all.nl>
References: <20010712231311.P5396@xs4all.nl>
Message-ID: <200107122128.f6CLSh615890@odiug.digicool.com>

> > > http://mail.python.org/pipermail/python-list/2001-June/047996.html
> >
> > Wow, an impressive hack.  But a hack!  Lots of special casing, and
> > breaks abstractions: the parser driver is supposed to know nothing
> > about the actual grammar embodied in its tables.
> 
> But does it hurt if it does ? It's not like we use it as a general purpose
> parser right now, and would we really want to use the current parser as a
> general purpose one ?

Well, it *is* used to parse its own input. :-)

> I have to agree that a nice, clean, powerful parser that can deal
> better with ambiguities (an LR parser, is that what it's called ?
> :P)

Yes, why the :P)?

LR parsers deal better with ambiguities at the grammar level --
actually, not so much with real ambiguities, but things that look
ambiguous until you've seen more of the input.  For example an LR
grammar can correctly be told that

   f(a, b) = 12

is invalid; the current LL parser can't.  Therefore this has to be
rejected in a separate pass.  Currently I believe that's the code
generation pass but it could be a separate pass altogether.
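
The observable effect can be checked from Python itself with the built-in
compile(). This small sketch only shows that the statement is rejected
with a SyntaxError; it says nothing about which internal pass does the
rejecting, and the helper name is_rejected is made up for the example.

```python
def is_rejected(source):
    """Return True if compiling the source text raises SyntaxError."""
    try:
        # compile() only parses and compiles; nothing is executed,
        # so the undefined names f, a and b do not matter.
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False


if __name__ == "__main__":
    print(is_rejected("f(a, b) = 12"))     # assignment to a call
    print(is_rejected("f(a, b)[0] = 12"))  # assignment to a subscript
```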

> is a much better solution, but in some cases, a hack is better than
> nothing.

Adopting this particular hack means you can never go back.  It
effectively "unreserves" most keywords most of the time, and that
means that you can no longer use other parser technologies to parse
Python.  E.g. suppose someone has a Yacc-based parser for Python.  It
would be quite a feat to hack the Yacc driver to do the same retrying
that his hack does.  I bet it would also require a major effort to get
tokenize.py to work correctly again.

The hack effectively makes it impossible to give a specification of
the real grammar of the language -- you have to try and see if the
parser accepts something or not.

> No, but it will help with bindings to languages that require
> keywords. .NET comes to mind, again, as does Java. It would also be
> very cool if we could rename pprint.pprint to pprint.print ;P

An approach that might work for this is to pick a FEW keywords
(e.g. those that are not reserved words in C or Java or C++) and add
those to a FEW places in the grammar.  E.g. add a rule

    extended_name: NAME | 'print'   # plus a few others

and then use extended_name instead of NAME in the rules for attribute
selection and function definition:

    funcdef: 'def' extended_name parameters ':' suite
       .
       .
       .
    trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' extended_name

This would be unambiguous.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@cj42289-a.reston1.va.home.com  Thu Jul 12 22:37:23 2001
From: fdrake@cj42289-a.reston1.va.home.com (Fred Drake)
Date: Thu, 12 Jul 2001 17:37:23 -0400 (EDT)
Subject: [Python-Dev] [maintenance doc updates]
Message-ID: <20010712213723.CE4202892B@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

	http://python.sourceforge.net/maint-docs/


Final documentation build for Python 2.1.1 release candidate 1.  This
version is also available at the Python FTP site:

    ftp://ftp.python.org/pub/python/doc/2.1.1c1/



From skip@pobox.com (Skip Montanaro)  Thu Jul 12 23:08:06 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 12 Jul 2001 17:08:06 -0500
Subject: [Python-Dev] one more thing for 2.2?
In-Reply-To: <20010712231311.P5396@xs4all.nl>
References: <200107122102.f6CL2V215763@odiug.digicool.com>
 <20010712231311.P5396@xs4all.nl>
Message-ID: <15182.8134.100759.669643@beluga.mojam.com>

    Thomas> No, but it will help with bindings to languages that require
    Thomas> keywords. .NET comes to mind, again, as does Java. It would also
    Thomas> be very cool if we could rename pprint.pprint to pprint.print ;P

Or with Python bindings to various external packages we want to wrap.  They
sometimes have function, variable or attribute names that are keywords in
Python and must therefore be mangled in one fashion or another.  James
Henstridge has to add trailing underscores to a number of attribute names in
his PyGtk wrappers: "in_", "del_" and "raise_".
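
The trailing-underscore convention can be sketched with the standard
keyword module. The function name mangle is invented here, and this is
not how PyGtk generates its names; it only illustrates the renaming rule.
The set of keywords is whatever the running interpreter reports.

```python
import keyword


def mangle(name):
    """Append an underscore to names that collide with Python keywords."""
    return name + "_" if keyword.iskeyword(name) else name


if __name__ == "__main__":
    for original in ("in", "del", "raise", "class", "window"):
        print(original, "->", mangle(original))
```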

Another one that always grates on me is "class".  "class_" or "klass" both
look ugly.

The biggest drawback I see is that in some situations people will have to
enclose variable names in parens to sneak them by the parser.  That seems
inelegant to me.  I'm not sure I want to explain this to new users.

Skip



From thomas@xs4all.net  Thu Jul 12 23:42:27 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 13 Jul 2001 00:42:27 +0200
Subject: [Python-Dev] one more thing for 2.2?
In-Reply-To: <200107122128.f6CLSh615890@odiug.digicool.com>
References: <20010712231311.P5396@xs4all.nl> <200107122128.f6CLSh615890@odiug.digicool.com>
Message-ID: <20010713004227.R5396@xs4all.nl>

On Thu, Jul 12, 2001 at 05:28:42PM -0400, Guido van Rossum wrote:

> > I have to agree that a nice, clean, powerful parser that can deal
> > better with ambiguities (an LR parser, is that what it's called ?
> > :P)

> Yes, why the :P)?

Because I was guessing, as I know practically naught about parsers and
parsing techniques. For instance, I was not aware that a yacc-based parser
would be LL(x) (for some small value of x). ':P)' was tongue-in-cheek,
followed by a closing parenthesis.

> > is a much better solution, but in some cases, a hack is better than
> > nothing.
> 
> Adopting this particular hack means you can never go back.  It
> effectively "unreserves" most keywords most of the time, and that
> means that you can no longer use other parser technologies to parse
> Python.  E.g. suppose someone has a Yacc-based parser for Python.  It
> would be quite a feat to hack the Yacc driver to do the same retrying
> that his hack does.  I bet it would also require a major effort to get
> tokenize.py to work correctly again.

[ and ]

> An approach that might work for this is to pick a FEW keywords
> (e.g. those that are not reserved words in C or Java or C++) and add
> those to a FEW places in the grammar.  E.g. add a rule

>     extended_name: NAME | 'print'   # plus a few others
> 
> and then use extended_name instead of NAME in the rules for attribute
> selection and function definition:
> 
>     funcdef: 'def' extended_name parameters ':' suite
>        .
>        .
>        .
>     trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' extended_name

> This would be unambiguous.

This has been discussed before. The main problem with this is that no one's
done it :) I've done a quick test-hack, but ran into so many unguarded
'STR(node)' calls in compile.c that expected a NAME, not an extended_name,
that I gave up. It also wouldn't really alleviate the tokenize.py problem --
if adding a few keywords-as-identifiers is doable, so is adding a lot of
them :) And there's the maintenance problem on the Grammar... when adding a
new keyword, you need to carefully consider where to allow it. However, it's
not like adding a new keyword is done more than once a lustrum ;)

But I don't have any real need for keywords as identifiers, so I don't mind
if we keep the current limitations.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From greg@cosc.canterbury.ac.nz  Fri Jul 13 00:49:04 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Jul 2001 11:49:04 +1200 (NZST)
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <AJEAKILOCCJMDILAPGJNEENDCFAA.jeremy@alum.mit.edu>
Message-ID: <200107122349.LAA02015@s454.cosc.canterbury.ac.nz>

> """It looks like the new coercion rules have optimized number ops at the
> expense of string ops.

Is there still an intention to get rid of centralised
coercion and move it all into the relevant methods?

If that were done, wouldn't problems like this go
away (or at least turn into a different set of
problems)?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From fdrake@acm.org  Fri Jul 13 00:50:43 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Thu, 12 Jul 2001 19:50:43 -0400 (EDT)
Subject: [Python-Dev] [development doc updates]
Message-ID: <20010712235043.5A42D2892B@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Lots of small updates.

Added Eric Raymond's documentation for the XML-RPC module added to
the standard library.



From esr@thyrsus.com  Fri Jul 13 01:04:23 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 12 Jul 2001 20:04:23 -0400
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <20010712235043.5A42D2892B@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Thu, Jul 12, 2001 at 07:50:43PM -0400
References: <20010712235043.5A42D2892B@cj42289-a.reston1.va.home.com>
Message-ID: <20010712200423.A13553@thyrsus.com>

Fred L. Drake <fdrake@acm.org>:
> The development version of the documentation has been updated:
> 
>     http://python.sourceforge.net/devel-docs/
> 
> Lots of small updates.
> 
> Added Eric Raymond's documentation for the XML-RPC module added to
> the standard library.

Calling the effbot!  Calling the effbot!  Fredrik, please proofread 
my stuff and fill in any important bits you think are missing.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"They that can give up essential liberty to obtain a little temporary 
safety deserve neither liberty nor safety."
	-- Benjamin Franklin, Historical Review of Pennsylvania, 1759.


From tim.one@home.com  Fri Jul 13 01:16:57 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 12 Jul 2001 20:16:57 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <15181.51250.50466.870207@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBAKOAA.tim.one@home.com>

[Skip Montanaro]
> It's that adjective "meaningful" that makes it difficult...  Obviously
> pystone wouldn't be meaningful since it doesn't do much string stuff.

pystone is always meaningful, and *especially* when "it shouldn't" change
but does anyway <0.4 wink>.  For example, while staring at strings, you may
miss that other kinds of + slow down.

> I tried timing the following:
>
>     PYTHONPATH= time ./python -tt ../Lib/test/regrtest.py -l
>
> after making sure the .py[co] files were deleted.  I got "102.96user
> 1.47system" before and "103.24user 1.57system" after.

I'm afraid this is useless except to get the sense of highly significant
changes:  several of the tests do a varying amount of work depending on
results from random.py (which initializes itself from system time when it's
first imported).
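
One way to take that source of noise out of a timing run is to give the
random-driven work a fixed seed. The following is only a sketch of the
idea, not something regrtest.py does in this thread; workload_sizes is a
hypothetical stand-in for "amount of work chosen at random".

```python
import random


def workload_sizes(seed, count=5):
    """Return a list of pseudo-random work sizes that is identical for
    identical seeds."""
    rng = random.Random(seed)  # private generator; global state untouched
    return [rng.randint(1, 1000) for _ in range(count)]


if __name__ == "__main__":
    # Same seed, same sizes, so two timed runs do comparable work.
    print(workload_sizes(42) == workload_sizes(42))
```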

pystone is the only shared "speed benchmark" we have.



From guido@digicool.com  Fri Jul 13 02:16:16 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 12 Jul 2001 21:16:16 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: Your message of "Fri, 13 Jul 2001 11:49:04 +1200."
 <200107122349.LAA02015@s454.cosc.canterbury.ac.nz>
References: <200107122349.LAA02015@s454.cosc.canterbury.ac.nz>
Message-ID: <200107130116.f6D1GGq16019@odiug.digicool.com>

> > """It looks like the new coercion rules have optimized number ops at the
> > expense of string ops.
> 
> Is there still an intention to get rid of centralised
> coercion and move it all into the relevant methods?

This has been done (except for complex).

> If that were done, wouldn't problems like this go
> away (or at least turn into a different set of
> problems)?

I'm not sure what that remark refers to, actually.

BINARY_ADD and BINARY_SUBTRACT just test if both args are ints and
then in-line the work; BINARY_SUBSCRIPT does the same thing for
list[int].  I don't think it has anything to do with coercions.  When
the operands are strings, the costs are one pointer deref + compare to
link-time constant, and one jump (over the inlined code).

Small things add up, but I doubt that this is responsible for any
particular slow-down.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Fri Jul 13 04:19:57 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 12 Jul 2001 23:19:57 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <200107130116.f6D1GGq16019@odiug.digicool.com>
Message-ID: <AJEAKILOCCJMDILAPGJNEEOICFAA.jeremy@alum.mit.edu>

[Greg Ewing:]
>> If that were done, wouldn't problems like this go
>> away (or at least turn into a different set of
>> problems)?

[Guido:]
>I'm not sure what that remark refers to, actually.
>
>BINARY_ADD and BINARY_SUBTRACT just test if both args are ints and
>then in-line the work; BINARY_SUBSCRIPT does the same thing for
>list[int].  I don't think it has anything to do with coercions.  When
>the operands are strings, the costs are one pointer deref + compare to
>link-time constant, and one jump (over the inlined code).
>
>Small things add up, but I doubt that this is responsible for any
>particular slow-down.

The big change is the coercion work being done in binary_op1(), which
tries to turn strings into numbers in a variety of ways.  BINARY_ADD
calls PyNumber_Add(), which calls binary_op1().  When the binary_op1()
call fails, it then tries sequence concatenation.

If it were possible for binary_op1() to fail quickly for non-numeric
sequences like strings, we would not see the slowdown for small string
operations.  (I believe that's what the silly little benchmark shows
and what one of the pybench tests shows.)

Jeremy




From tim.one@home.com  Fri Jul 13 04:27:32 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 12 Jul 2001 23:27:32 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <200107130116.f6D1GGq16019@odiug.digicool.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEBGKOAA.tim.one@home.com>

[Greg Ewing]
>> Is there still an intention to get rid of centralised
>> coercion and move it all into the relevant methods?

[Guido]
> This has been done (except for complex).

>> If that were done, wouldn't problems like this go
>> away (or at least turn into a different set of
>> problems)?

> I'm not sure what that remark refers to, actually.
>
> BINARY_ADD and BINARY_SUBTRACT just test if both args are ints and
> then in-line the work; BINARY_SUBSCRIPT does the same thing for
> list[int].  I don't think it has anything to do with coercions.  When
> the operands are strings, the costs are one pointer deref + compare to
> link-time constant, and one jump (over the inlined code).

It's not BINARY_ADD, it's the PyNumber_Add() called by BINARY_ADD, which,
given two strings, calls binary_op1, which does a few failing tests, then
calls PyNumber_CoerceEx, which fails quickly enough to coerce, and then
pokes around a little looking for number methods, and finally says "hmm!
maybe it's a sequence?".
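
The order of attempts can be pictured with a toy Python model. This is
emphatically not the C code: the step names merely paraphrase the
description above, and toy_binary_add is a made-up function that records
each step in a list so the extra work done for strings is visible.

```python
def toy_binary_add(left, right, trace):
    """Toy model: try the numeric route first, sequences last."""
    numeric = (int, float, complex)

    trace.append("try numeric add")
    if isinstance(left, numeric) and isinstance(right, numeric):
        return left + right

    # Strings have no common numeric type, so this step just fails.
    trace.append("try coercion")

    trace.append("fall back to sequence concatenation")
    if isinstance(left, (str, list, tuple)) and type(left) is type(right):
        return left + right

    raise TypeError("unsupported operand types")


if __name__ == "__main__":
    steps = []
    print(toy_binary_add("spam", "eggs", steps), len(steps))
    steps = []
    print(toy_binary_add(1, 2, steps), len(steps))
```

In this model two strings pass through three steps where two ints need
one, which is the shape of the overhead being discussed.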

> Small things add up, but I doubt that this is responsible for any
> particular slow-down.

Jeremy earlier pinned the blame on this for one of the "dramatic" pybench
slowdowns; Skip may or may not have bumped into it again with his "silly
little benchmark" (read the Subject line <wink>).  I doubt it's responsible
for significant real-life slowdowns.



From greg@cosc.canterbury.ac.nz  Fri Jul 13 05:29:38 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Jul 2001 16:29:38 +1200 (NZST)
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEBGKOAA.tim.one@home.com>
Message-ID: <200107130429.QAA02078@s454.cosc.canterbury.ac.nz>

Tim Peters <tim.one@home.com>:

> It's not BINARY_ADD, it's the PyNumber_Add() called by BINARY_ADD, which,
> given two strings, calls binary_op1, which does a few failing tests, then
> calls PyNumber_CoerceEx, which fails quickly enough to coerce, and then
> pokes around a little looking for number methods, and finally says "hmm!
> maybe it's a sequence?".

This seems to contradict what Guido just said about
centralised coercion having been removed. Is one or
the other of us talking nonsense, or do we misunderstand
each other?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From thomas.heller@ion-tof.com  Fri Jul 13 09:27:44 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 13 Jul 2001 10:27:44 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]   Package DB: strawman PEP)
References: <BIEJKCLHCIOIHAGOKOLHEEKICCAA.tim@digicool.com>
Message-ID: <027201c10b75$b18224a0$e000a8c0@thomasnotebook>

From: "Tim Peters" <tim@digicool.com>
> [Thomas Heller]
> > The bdist_wininst installer simply installs into prefix,
> > this is what the registry has under
> > HKEY_LOCAL_MACHINE\Software\Python\PythonCore\2.1\InstallPath.
> >
> > Now what should it do?
> 
> [Moore, Paul]
> > What does that key *mean*?
> 
> Mark Hammond documented it as being the directory into which Python is
> installed; python.exe lives here.
> 
> > If it is the directory into which packages should get installed,
> 
> No; it's much older than the package mechanism <0.1 wink>.
> 
By _accident_ it is also the location (pre-PEP 250) where packages
should get installed.

> > In that case, my recommendation would be to have the Python 2.2
> > installer create a new key and wininst use that if it exists,
> > ...
> 
> If you have to use the registry, why not paste Lib/site-packages on to the
> end of InstallPath and use that?
The problem is that the same wininst executable should behave differently
depending on the policy Python has chosen for the installation directory:
Python 2.1 and before: Use prefix, Python 2.2 (and higher) should use
prefix/lib/site-packages.
That's why I said a (very hacky) solution would be to simply check 
for the version number at install time.
A better solution would be to somehow query site.py at install time?
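
The version check mentioned above could look roughly like this. It is a
sketch under assumptions: install_dir is a hypothetical helper, the
version is passed in as a tuple, and the Lib\site-packages spelling
follows the directory name used elsewhere in this thread. ntpath is used
so the example behaves the same on any platform.

```python
import ntpath


def install_dir(prefix, version):
    """Pick the package install directory for a given Python version.

    version is a (major, minor) tuple, e.g. (2, 1).
    """
    if version >= (2, 2):
        # Policy from 2.2 on: install below Lib\site-packages.
        return ntpath.join(prefix, "Lib", "site-packages")
    # 2.1 and earlier: install straight into the prefix.
    return prefix


if __name__ == "__main__":
    print(install_dir(r"C:\Python21", (2, 1)))
    print(install_dir(r"C:\Python22", (2, 2)))
```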

Thomas



From thomas@xs4all.net  Fri Jul 13 11:57:20 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 13 Jul 2001 12:57:20 +0200
Subject: [Python-Dev] Python 2.1.1
In-Reply-To: <20010712184319.T5391@xs4all.nl>
Message-ID: <20010713125720.S5396@xs4all.nl>

On Thu, Jul 12, 2001 at 06:43:19PM +0200, Thomas Wouters wrote:
> On Thu, Jul 12, 2001 at 11:33:13AM -0500, Skip Montanaro wrote:

> >     Thomas> Both of these are distutils-build related, and I'm not sure on
> >     Thomas> the 'right' fix on either. The latter also applies to 'bsddb',
> >     Thomas> by the way, and is especially annoying to me, because I'm
> >     Thomas> running Debian on more and more machines :) Does anyone who
> >     Thomas> understands setup.py have time to look at these before a week
> >     Thomas> from friday, when 2.1.1-final is scheduled ?

> > I just added another variant (with a patch): bsddb build on Mandrake 8.0 is
> > broken because it doesn't account for the libdb* shared library when
> > creating bsddb.so:
> > 
> >     https://sourceforge.net/tracker/index.php?func=detail&aid=440725&group_id=5470&atid=105470

> > Thomas, I'm not sure if this applies to your Debian build woes, but
> > perhaps it will help.

> Yes, it does! Now bsddb builds, but dbmmodule still doesn't. It seems that's
> because setup.py only checks for libndbm.so, and not for libdbX.so, which
> also have a DBM implementation (IIRC), or libgdbm.so, which has one too.

This does fix my problem:

Index: setup.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/setup.py,v
retrieving revision 1.38
diff -c -r1.38 setup.py
*** setup.py    2001/04/15 15:16:12     1.38
--- setup.py    2001/07/13 10:51:14
***************
*** 323,331 ****
  
          # The standard Unix dbm module:
          if platform not in ['cygwin']:
!             if (self.compiler.find_library_file(lib_dirs, 'ndbm')):
                  exts.append( Extension('dbm', ['dbmmodule.c'],
!                                        libraries = ['ndbm'] ) )
              else:
                  exts.append( Extension('dbm', ['dbmmodule.c']) )
  
--- 323,337 ----
  
          # The standard Unix dbm module:
          if platform not in ['cygwin']:
!             for lib in ('ndbm', 'db', 'db1', 'db2', 'db3', 'dbm'):
!                 if self.compiler.find_library_file(lib_dirs, lib):
!                     break
!             else:
!                 lib = None
! 
!             if lib:
                  exts.append( Extension('dbm', ['dbmmodule.c'],
!                                        libraries = [lib]) )
              else:
                  exts.append( Extension('dbm', ['dbmmodule.c']) )
  

The problem is very simple: distutils does not play well with autoconf. The
problem is that I have at least two implementations of 'dbm' available:
'libdbm', which comes with GDBM, and 'libdb1', which comes with libc.
Autoconf tries to figure out which include file to use, and it does a decent
job, but then distutils goes ahead and just tries to link with 'libndbm',
which I don't have. The search path I give above works because I need
'libdb1', but it would still barf if autoconf found a different header than
distutils tries to link with. 

In other words: it's a mess. Distutils should do the include-file-finding
*and* the library-file-finding, and pass the right arguments, *or* autoconf
should find both the include file and the library file, and pass that info
to distutils somehow.
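
The for/else search from the patch can be tried in isolation. In this
sketch, pick_dbm_library and the `available` set are stand-ins invented
for illustration; the real code asks the distutils compiler object via
find_library_file().

```python
def pick_dbm_library(available,
                     candidates=("ndbm", "db", "db1", "db2", "db3", "dbm")):
    """Return the first candidate present in `available`, or None."""
    for lib in candidates:
        if lib in available:
            break          # keep the first hit
    else:
        lib = None         # loop ended without a break: nothing found
    return lib
```

Note that the order of `candidates` decides which library wins, and
nothing here checks that it matches the header autoconf picked.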

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Fri Jul 13 12:59:31 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 13 Jul 2001 13:59:31 +0200
Subject: [Python-Dev] "Brennan, Bernadette": Python 1.5.2 & Solaris 8
In-Reply-To: <20010712165501.M5396@xs4all.nl>
References: <200107121400.f6CE0XA14567@odiug.digicool.com> <20010712165501.M5396@xs4all.nl>
Message-ID: <20010713135931.U5396@xs4all.nl>

On Thu, Jul 12, 2001 at 04:55:01PM +0200, Thomas Wouters wrote:
> On Thu, Jul 12, 2001 at 10:00:33AM -0400, Guido van Rossum wrote:
> > Does anyone remember what the problems with Solaris-8 were?  Shallow,
> > I hope?

> No clue, sorry. I don't have Solaris 8, either, but I do have access to
> Solaris 7 (currently, but not for long) and will attempt to build a couple
> of releases on it.

I managed to get it working on Solaris, though I had some problems with
readline and socket-with-ssl -- presumably because distutils tried to link
against static libraries, not shared ones. However, I just noticed
SourceForge has a compilefarm that includes Solaris 8. I compiled Python
2.1.1 and it worked fine.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Fri Jul 13 13:03:29 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 14:03:29 +0200
Subject: [Python-Dev] PEP: Defining Unicode Literal Encodings
Message-ID: <3B4EE391.19995171@lemburg.com>

Please comment...

--

PEP: 0263 (?)
Title: Defining Unicode Literal Encodings
Version: $Revision: 1.0 $
Author: mal@lemburg.com (Marc-André Lemburg)
Status: Draft
Type: Standards Track
Python-Version: 2.3
Created: 06-Jun-2001
Post-History:

Abstract

    This PEP proposes to use the PEP 244 statement "directive" to make
    the encoding used in Unicode string literals u"..." (and their raw
    counterparts ur"...") definable on a per source file basis.

Problem

    In Python 2.1, Unicode literals can only be written using the
    Latin-1 based encoding "unicode-escape". This makes the
    programming environment rather unfriendly to Python users who live
    and work in non-Latin-1 locales such as many of the eastern
    countries. Programmers can write their 8-bit strings using their
    favourite encoding, but are bound to the "unicode-escape" encoding
    for Unicode literals.

Proposed Solution

    I propose to make the Unicode literal encodings (both standard and
    raw) a per-source file option which can be set using the
    "directive" statement proposed in PEP 244.

Syntax

    The syntax for the directives is as follows:

    'directive' WS+ 'unicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL
    'directive' WS+ 'rawunicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL

    with the PYTHONSTRINGLITERAL representing the encoding name to be
    used as standard Python 8-bit string literal and WS being the
    whitespace characters [ \t].
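
A rough way to recognise such a directive line is sketched below. The
names DIRECTIVE and parse_directive are invented for this illustration,
and PYTHONSTRINGLITERAL is simplified to a plain quoted string without
escapes, which is narrower than the grammar above.

```python
import re

# WS is [ \t]; the string literal is simplified to a plain quoted string.
DIRECTIVE = re.compile(
    r"""^directive[ \t]+(unicodeencoding|rawunicodeencoding)"""
    r"""[ \t]*=[ \t]*(['"])([^'"]*)\2[ \t]*$"""
)

def parse_directive(line):
    """Return (flag_name, encoding_name), or None if line is no directive."""
    m = DIRECTIVE.match(line)
    if m is None:
        return None
    return m.group(1), m.group(3)
```

For example, parse_directive('directive unicodeencoding = "utf-8"')
yields the flag name and the encoding name as a pair.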

Semantics

    Whenever the Python compiler sees such an encoding directive
    during the compiling process, it updates an internal flag which
    holds the encoding name used for the specific literal form. The
    encoding name flags are initialized to "unicode-escape" for u"..."
    literals and "raw-unicode-escape" for ur"..." respectively.

    ISSUE:
         Maybe we should restrict the directive usage to once per file
         and additionally to a placement before the first Unicode
         literal in the source file.

    If the Python compiler has to convert a Unicode literal to a
    Unicode object, it will pass the 8-bit string data given by the
    literal to the Python codec registry and have it decode the data
    using the current setting of the encoding name flag for the
    requested type of Unicode literal. It then checks the result of
    the decoding operation for being a Unicode object and stores it
    in the byte code stream.
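
The decoding step can be approximated with the codec registry as it
exists in present-day Python 3. This is a sketch under that assumption,
not the compiler's code; decode_literal is a hypothetical name, and
`str` plays the role of the Unicode object.

```python
import codecs

def decode_literal(raw, encoding="unicode-escape"):
    """Decode the 8-bit data of a literal with the named codec."""
    decoder = codecs.lookup(encoding).decode   # codec registry lookup
    text, _consumed = decoder(raw)
    if not isinstance(text, str):
        # Mirrors the check that the result is a Unicode object.
        raise TypeError("codec %r did not return a Unicode object" % encoding)
    return text
```

With the default flag the data is read as before; passing, say,
"koi8-r" would let the same bytes be interpreted in that encoding.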

Scope

    This PEP only affects Python source code which makes use of the
    proposed directives. It does not affect the coercion handling of
    8-bit strings and Unicode in the given module.

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Fri Jul 13 13:04:16 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 14:04:16 +0200
Subject: [Python-Dev] PEP: Unicode Indexing Helper Module
Message-ID: <3B4EE3C0.9875AB3D@lemburg.com>

Please comment...

--

PEP: 0262 (?)
Title: Unicode Indexing Helper Module
Version: $Revision: 1.0 $
Author: mal@lemburg.com (Marc-André Lemburg)
Status: Draft
Type: Standards Track
Python-Version: 2.3
Created: 06-Jun-2001
Post-History:

Abstract

    This PEP proposes a new module "unicodeindex" which provides
    means to index Unicode objects in various higher level abstractions
    of "characters".

Problem and Terminology

    Unicode objects can be indexed just like string objects using what
    in Unicode terms is called a code unit as index basis.

    Code units are the storage entities used by the Unicode
    implementation to store a single Unicode information unit and do
    not necessarily map 1-1 to code points which are the smallest
    entities encoded by the Unicode standard.

    These code points can sometimes be composed to form graphemes
    which are then displayed by the Unicode output device as one
    character. A word is then a sequence of characters separated by
    space characters or punctuation, a line is a sequence of code
    points separated by line breaking code point sequences.

    For addressing Unicode, there are basically five different methods
    by which you can reference the data:

    1. per code unit    (codeunit)
    2. per code point   (codepoint)
    3. per grapheme     (grapheme)
    4. per word         (word)
    5. per line         (line)

    The indexing type name is given in parentheses and used in the
    module interface.
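
The first three levels can be told apart with a small experiment in
present-day Python 3, where a str is indexed by code point. The grapheme
count below is a deliberately rough approximation (it only skips
combining marks) and is an assumption of this sketch.

```python
import unicodedata

# 'e' + COMBINING ACUTE ACCENT + MUSICAL SYMBOL G CLEF
text = "e\u0301\U0001D11E"

# Code points: what a Python 3 str is indexed by.
code_points = len(text)

# UTF-16 code units: the G clef lies outside the BMP and needs a
# surrogate pair, so it counts twice.
utf16_code_units = len(text.encode("utf-16-le")) // 2

# Rough grapheme count: do not count combining marks on their own.
# Real grapheme segmentation follows more rules than this.
graphemes = sum(1 for ch in text if unicodedata.combining(ch) == 0)
```

The three counts differ for the same text, which is why an index is only
meaningful once the indexing type is named.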

Proposed Solution

    I propose to add a new module to the standard Python library which
    provides interfaces implementing the above indexing methods.

Module Interface

    The module should provide the following interfaces for all five
    indexing styles:

    next_<indextype>(u, index) -> integer

        Returns the Unicode object index for the start of the next
        <indextype> found after u[index] or -1 in case no next element
        of this type exists.

    prev_<indextype>(u, index) -> integer

        Returns the Unicode object index for the start of the previous
        <indextype> found before u[index] or -1 in case no previous
        element of this type exists.

    <indextype>_index(u, n) -> integer

        Returns the Unicode object index for the start of the n-th
        <indextype> element in u. Raises an IndexError in case no n-th
        element can be found.

    <indextype>_count(u, index) -> integer

        Counts the number of complete <indextype> elements found in
        u[:index] and returns the count as integer.

    <indextype>_start(u, index) -> integer

        Returns 1 or 0 depending on whether u[index] marks the start
        of an <indextype> element.

    <indextype>_end(u, index) -> integer

        Returns 1 or 0 depending on whether u[index] marks the end of
        an <indextype> element.

    Used symbols:
       <indextype>   one of: codeunit, codepoint, grapheme, word, line
       u             is the Unicode object
       index         the Unicode object index
       n             is an integer
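
To make the interface concrete, here is a sketch of three of the
functions for <indextype> = codepoint. It assumes the Unicode object is
modelled as a list of UTF-16 code units (integers); to_code_units and
the underscore helpers are invented for the illustration and are not
part of the proposal.

```python
def to_code_units(text):
    """Split a str into a list of UTF-16 code units (plain integers)."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]

def _is_high(unit):
    return 0xD800 <= unit <= 0xDBFF

def _is_low(unit):
    return 0xDC00 <= unit <= 0xDFFF

def codepoint_start(u, index):
    """Return 1 if u[index] starts a code point, else 0."""
    if _is_low(u[index]) and index > 0 and _is_high(u[index - 1]):
        return 0
    return 1

def next_codepoint(u, index):
    """Index of the next code point start after u[index], or -1."""
    i = index + 1
    while i < len(u) and not codepoint_start(u, i):
        i += 1
    return i if i < len(u) else -1

def codepoint_count(u, index):
    """Number of complete code points in u[:index]."""
    count = i = 0
    while i < index:
        pair = _is_high(u[i]) and i + 1 < len(u) and _is_low(u[i + 1])
        width = 2 if pair else 1
        if i + width > index:
            break              # the pair is cut in half by the slice
        count += 1
        i += width
    return count
```

All indexes passed in and returned are code unit indexes, i.e. the same
numbers one would use in u[index].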

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From mal@lemburg.com  Fri Jul 13 12:39:54 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 13:39:54 +0200
Subject: [Python-Dev] PEP 250 - site-packages on Windows: (Was: [Distutils]
 Package DB: strawman PEP)
References: <BIEJKCLHCIOIHAGOKOLHEEKICCAA.tim@digicool.com> <027201c10b75$b18224a0$e000a8c0@thomasnotebook>
Message-ID: <3B4EDE0A.5D7391D6@lemburg.com>

> [where to get the installation path from on Windows]
> 
> That's why I said a (very hacky) solution would be to simply check
> for the version number at install time.
> A better solution would be to somehow query site.py at install time?

Ideal would be looking at the sys module for e.g. sys.extinstallpath
(which site.py could set). Is the problem of not being able to
embed Python at install time really a problem ? After all, if
it doesn't work for the installer, how should it work at all
in a different setting...

Alternatively, the installer could also simply query the
install path from the user and suggest the sys.extinstallpath
dir as default.

The installer should also make sure that the sys.extinstallpath
is on the sys.path (if not, the Python user won't be able to
use the installed package and should be warned about this).
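
A sketch of that lookup and the sys.path check follows. Note that
sys.extinstallpath is only the attribute proposed here and does not
exist in the real sys module, so the code falls back to a caller-supplied
default; ext_install_target and _norm are names made up for the example.

```python
import os
import sys

def _norm(path):
    # Compare paths in a normalized, absolute form.
    return os.path.normcase(os.path.abspath(path))

def ext_install_target(fallback, sys_module=sys):
    """Return (directory, is_importable) for installing an extension.

    `extinstallpath` is the proposed attribute; when it is missing,
    getattr() returns `fallback` instead.
    """
    target = getattr(sys_module, "extinstallpath", fallback)
    on_path = _norm(target) in [_norm(p) for p in sys_module.path]
    return target, on_path
```

An installer could offer the returned directory as the default and warn
the user when the second value is false.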

A totally different problem is that of upgrading from the
old installation (in Python21\) to a new one 
(in Python\Lib\site-packages)... but that one is on the extension
writer, I guess.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From Samuele Pedroni <pedroni@inf.ethz.ch>  Fri Jul 13 13:08:19 2001
From: Samuele Pedroni <pedroni@inf.ethz.ch> (Samuele Pedroni)
Date: Fri, 13 Jul 2001 14:08:19 +0200 (MET DST)
Subject: [Python-Dev] descr-branch, ExtensionClasses
Message-ID: <200107131208.OAA21211@core.inf.ethz.ch>

Hi. Some questions:

- What is the probability that descr-branch goes into 2.2?
- Will those changes obsolete ExtensionClasses in the long run?

Why the questions:

There is a guy on jython-dev who is trying to port some ExtensionClasses-like functionality
to jython.

Concretely, he is fighting with the fact that jython internals are there to make things work,
not to enable extensibility in any explicit way. At least their messy side makes me believe that.

My plan was to mimic the new descr logic in jython 2.2 as far as possible, and to try
to polish all the internals accordingly.

So if the answer to both questions is yes, I can promise that to the guy; otherwise I have
to be more helpful or diplomatic ...

It's a kind of "political" matter. Samuele.






From nas@python.ca  Fri Jul 13 13:23:49 2001
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 13 Jul 2001 05:23:49 -0700
Subject: [Python-Dev] Python 2.1.1
In-Reply-To: <20010713125720.S5396@xs4all.nl>; from thomas@xs4all.net on Fri, Jul 13, 2001 at 12:57:20PM +0200
References: <20010712184319.T5391@xs4all.nl> <20010713125720.S5396@xs4all.nl>
Message-ID: <20010713052349.A19240@glacier.fnational.com>

Thomas Wouters wrote:
> In other words: it's a mess.

It sure is.  You don't want to change the DB implementation used if it
worked in 2.1.  I believe that different DBs use different storage
formats.  People would not be happy if they upgraded to a point release
and all their DBs broke (i.e. with 2.1 dbm was actually gdbm but with
2.1.1 it is db1).

  Neil


From nas@python.ca  Fri Jul 13 13:38:40 2001
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 13 Jul 2001 05:38:40 -0700
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <AJEAKILOCCJMDILAPGJNEEOICFAA.jeremy@alum.mit.edu>; from jeremy@alum.mit.edu on Thu, Jul 12, 2001 at 11:19:57PM -0400
References: <200107130116.f6D1GGq16019@odiug.digicool.com> <AJEAKILOCCJMDILAPGJNEEOICFAA.jeremy@alum.mit.edu>
Message-ID: <20010713053840.B19240@glacier.fnational.com>

Jeremy Hylton wrote:
> The big change is the coercion work being done in binary_op1(), which
> tries to turn strings into numbers in a variety of ways.  BINARY_ADD
> calls PyNumber_Add(), which calls binary_op1().  When the binary_op1()
> call fails, it then tries sequence concatenation.
> 
> If it were possible for binary_op1() to fail quickly for non-numeric
> sequences like strings, we would not see the slowdown for small string
> operations.  (I believe that's what the silly little benchmark shows
> and what one of the pybench tests shows.)

I had a patch that did this:

    * Added an ordinal number to some builtin types.  All other types
      had ordinal 0.

    * Built a 2-D table of binary methods.

    * Had operations like PyNumber_Add look into this table and use the
      method there.

It turned out to not give much of a speedup but I think the idea is
interesting.
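
The table idea can be mimicked in pure Python. This is only a toy model
of the three steps listed above, not the C patch: ORDINAL, TABLE and
table_add are names invented here, and the choice of types is arbitrary.

```python
import operator

# Ordinals for a few builtin types; every other type gets ordinal 0.
ORDINAL = {int: 1, float: 2, str: 3}
SIZE = 4

# TABLE[i][j] is the add method for (left ordinal i, right ordinal j).
TABLE = [[None] * SIZE for _ in range(SIZE)]
TABLE[1][1] = operator.add        # int + int
TABLE[2][2] = operator.add        # float + float
TABLE[3][3] = operator.concat     # str + str, no numeric coercion tried

def table_add(a, b):
    """Add a and b, consulting the 2-D table before the generic path."""
    method = TABLE[ORDINAL.get(type(a), 0)][ORDINAL.get(type(b), 0)]
    if method is not None:
        return method(a, b)
    return a + b                  # generic fallback for everything else
```

Two strings go straight to concatenation, while mixed or unknown types
still take the generic route.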

  Neil


From thomas@xs4all.net  Fri Jul 13 13:48:14 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 13 Jul 2001 14:48:14 +0200
Subject: [Python-Dev] Python 2.1.1
In-Reply-To: <20010713052349.A19240@glacier.fnational.com>
References: <20010712184319.T5391@xs4all.nl> <20010713125720.S5396@xs4all.nl> <20010713052349.A19240@glacier.fnational.com>
Message-ID: <20010713144814.V5396@xs4all.nl>

On Fri, Jul 13, 2001 at 05:23:49AM -0700, Neil Schemenauer wrote:
> Thomas Wouters wrote:
> > In other words: it's a mess.

> It sure is.  You don't want to change the DB implementation used if it
> worked in 2.1.  I believe that different DBs use different storage
> formats.  People would not be happy if they upgraded to a point release
> and all their DBs broke (i.e. with 2.1 dbm was actually gdbm but with
> 2.1.1 it is db1).

I didn't touch the autoconf code that finds the include file, nor the #ifdef
mess in dbmmodule.c that decides which to use, so it could only lead to
unrunnable/uncompilable code, not to a new .db silently being used. But I
agree that this is not a suitable fix for 2.1.1, I just wish we could fix it
better :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From nhodgson@bigpond.net.au  Fri Jul 13 14:13:40 2001
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Fri, 13 Jul 2001 23:13:40 +1000
Subject: [Python-Dev] PEP: Unicode Indexing Helper Module
References: <3B4EE3C0.9875AB3D@lemburg.com>
Message-ID: <072101c10b9d$a28068e0$0acc8490@neil>

M.-A. Lemburg:
>    next_<indextype>(u, index) -> integer
>
>        Returns the Unicode object index for the start of the next
>        <indextype> found after u[index] or -1 in case no next
>        element of this type exists.
>
>    prev_<indextype>(u, index) -> integer
> ...

   It's not clear to me from the description whether the term "object index"
is used for a code unit index or an <indextype> index. Code unit index seems
to make the most sense but this should be explicit.

   Neil



From mal@lemburg.com  Fri Jul 13 14:44:55 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 15:44:55 +0200
Subject: [Python-Dev] Re: PEP 262: Unicode Indexing Helper Module
Message-ID: <3B4EFB57.1427EF35@lemburg.com>

This is a multi-part message in MIME format.
--------------4273B7E264E4649CF795A2CF
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

> Paul Moore (in private mail):
>
> You have methods for finding
> the start and end of various <indextypes>, but you don't have a method for
> finding the length of an <indextype>. In the case of words (which is the one
> I understand :-), the length of a word is not the same as the difference
> between the starts of consecutive words - the intervening whitespace should
> be excluded (at least for some applications). I would suggest
> 
> length_<indextype>(u, index) -> integer
> Returns the length in Unicode objects of the <indextype> found at u[index]
> or -1 in case u[index] is not in an element of this type (for example, in
> the whitespace between words). [XXX Should this be the number of Unicode
> objects between index and the end of the element, or should it be the length
> from start to end even if you are in the middle?]
> 
> or maybe better
> 
> nextend_<indextype>(u, index) -> integer
> Returns the Unicode object index for the end of the next <indextype> found
> after u[index] or -1 in case no next element of this type exists.
> 
> [But that runs into issues when you are in a word - If index is not the
> first Unicode object, nextend is the end of *this* element, whereas next is
> the start of the *next* element. I think I'm starting to show my
> ignorance...]
> 
> Even though I suspect my suggested methods are too simplistic, I'd suggest
> at least a comment in the PEP on how to work out the length of the element
> you're in (or why it's hard, and you'd never want to do it :-)...

The two suggested APIs probe into the Unicode object. I think it would
be more useful to return the slice (as a slice object) which represents
the <indextype> element found at the given index in u, e.g.

<indextype>_slice(u, index) -> slice object or None

    Returns the slice pointing to the <indextype> element found in 
    u at the given index or None in case no such element can be found
    at that position.

Hmm, I wonder whether slice objects can be "applied" to sequences
somehow...
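
In present-day Python a slice object can be applied by ordinary
subscripting, u[s]. A sketch of what a word_slice function might look
like is given below; it treats a word as a run of non-whitespace
characters, which is a simplification of the PEP's definition (that one
also splits on punctuation).

```python
def word_slice(u, index):
    """Return the slice of the word around u[index], or None."""
    if not 0 <= index < len(u) or u[index].isspace():
        return None                      # not inside a word
    start = index
    while start > 0 and not u[start - 1].isspace():
        start -= 1
    end = index + 1
    while end < len(u) and not u[end].isspace():
        end += 1
    return slice(start, end)
```

The length asked for in the quoted mail then falls out as
s.stop - s.start, and the element itself as u[s].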

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/
--------------4273B7E264E4649CF795A2CF
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Received: from gw-nl1.origin-it.com (gw-nl1.origin-it.com [193.79.128.34])
	by www.egenix.com (8.11.2/8.11.2/SuSE Linux 8.11.1-0.5) with ESMTP id f6DCTY816219
	for <mal@lemburg.com>; Fri, 13 Jul 2001 14:29:34 +0200
Received: from exchsmtp-nl1.origin-it.com (localhost.origin-it.com [127.0.0.1])
          by gw-nl1.origin-it.com with ESMTP id OAA11738
          for <mal@lemburg.com>; Fri, 13 Jul 2001 14:26:54 +0200 (MEST)
          (envelope-from Paul.Moore@atosorigin.com)
Received: from exchsmtp-nl1.origin-it.com(172.16.127.66) by gw-nl1.origin-it.com via mwrap (4.0a)
	id xma011736; Fri, 13 Jul 01 14:26:54 +0200
Received: from mail.origin-it.com (mail.origin-it.com [172.16.127.3]) 
	by exchsmtp-nl1.origin-it.com (8.9.3/8.8.5-1.2.2m-19990317) with ESMTP id OAA04126
	for <mal@lemburg.com>; Fri, 13 Jul 2001 14:26:53 +0200 (MET DST)
Received: from ukrax001.ras.uk.origin-it.com (ukrax001.ras.uk.origin-it.com [172.16.201.234]) 
	by mail.origin-it.com (8.9.3/8.8.5-1.2.2m-19990317) with ESMTP id OAA12785
	for <mal@lemburg.com>; Fri, 13 Jul 2001 14:26:53 +0200 (MET DST)
Received: by ukrax001.ras.uk.origin-it.com with Internet Mail Service (5.5.2650.21)
	id <NBW9YQM2>; Fri, 13 Jul 2001 13:26:53 +0100
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEF5@ukrux002.rundc.uk.origin-it.com>
From: "Moore, Paul" <Paul.Moore@atosorigin.com>
To: "'mal@lemburg.com'" <mal@lemburg.com>
Subject: PEP 262: Unicode Indexing Helper Module
Date: Fri, 13 Jul 2001 13:26:52 +0100
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
	charset="iso-8859-1"

Excuse me for commenting on an area which I know virtually nothing about,
but one point struck me when I saw this PEP. You have methods for finding
the start and end of various <indextypes>, but you don't have a method for
finding the length of an <indextype>. In the case of words (which is the one
I understand :-), the length of a word is not the same as the difference
between the starts of consecutive words - the intervening whitespace should
be excluded (at least for some applications). I would suggest

length_<indextype>(u, index) -> integer
Returns the length in Unicode objects of the <indextype> found at u[index]
or -1 in case u[index] is not in an element of this type (for example, in
the whitespace between words). [XXX Should this be the number of Unicode
objects between index and the end of the element, or should it be the length
from start to end even if you are in the middle?]

or maybe better

nextend_<indextype>(u, index) -> integer
Returns the Unicode object index for the end of the next <indextype> found
after u[index] or -1 in case no next element of this type exists.

[But that runs into issues when you are in a word - If index is not the
first Unicode object, nextend is the end of *this* element, whereas next is
the start of the *next* element. I think I'm starting to show my
ignorance...]

Even though I suspect my suggested methods are too simplistic, I'd suggest
at least a comment in the PEP on how to work out the length of the element
you're in (or why it's hard, and you'd never want to do it :-)...

Paul.

--------------4273B7E264E4649CF795A2CF--



From mal@lemburg.com  Fri Jul 13 14:49:42 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 15:49:42 +0200
Subject: [Python-Dev] PEP: Unicode Indexing Helper Module
References: <3B4EE3C0.9875AB3D@lemburg.com> <072101c10b9d$a28068e0$0acc8490@neil>
Message-ID: <3B4EFC76.606FBC83@lemburg.com>

Neil Hodgson wrote:
> 
> M.-A. Lemburg:
> >    next_<indextype>(u, index) -> integer
> >
> >        Returns the Unicode object index for the start of the next
> >        <indextype> found after u[index] or -1 in case no next
> >        element of this type exists.
> >
> >    prev_<indextype>(u, index) -> integer
> > ...
> 
>    It's not clear to me from the description whether the term "object index"
> is used for a code unit index or an <indextype> index. Code unit index seems
> to make the most sense but this should be explicit.

Good point.

The "Unicode object index" refers to the index you use for slicing
or indexing Unicode objects, i.e. like in "u[10]" or "u[12:15]".
As such it refers to the Unicode code unit as implemented by the
Unicode implementation (and is application specific).

I'll add a note to the PEP.

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From sjoerd.mullender@oratrix.com  Fri Jul 13 15:27:37 2001
From: sjoerd.mullender@oratrix.com (Sjoerd Mullender)
Date: Fri, 13 Jul 2001 16:27:37 +0200
Subject: [Python-Dev] re with Unicode broken?
Message-ID: <20010713142737.EDBA8301CF7@bireme.oratrix.nl>

This is not for the faint of heart.

My validating XML parser doesn't work anymore, even though I didn't
change a thing (except update Python from CVS).  I use re extensively
in this parser, and all my expressions use Unicode extensively.
If I replace the Unicode stuff by ASCII, the expression works.

The expression which now fails to match is:

entity = re.compile('<!ENTITY'+_S+'(?:%'+_S+'(?P<pname>'+_Name+')'+_S+
	'(?P<pvalue>'+_EntityVal+'|'+_ExternalId+')|(?P<ename>'+_Name+')'+_S+
	'(?P<value>'+_EntityVal+'|'+_ExternalId+'(?:'+_S+'NDATA'+_S+_Name+
	')?))'+_opS+'>')

and the string it fails on is

<!ENTITY % SMIL.prefix "" >

In order to actually use the above expression, you need some more
variable definitions.  To assemble this into a working program, first
copy and paste the bottom part, then the middle part, and finally the
top part.

The resulting pattern is huge: (len(entity.pattern) == 15193).

First the ones that don't use Unicode.

_Letter = _BaseChar + _Ideographic
_NameChar = '-' + _Letter + _Digit + '._:' + _CombiningChar + _Extender
_S = '[ \t\r\n]+'                       # white space
_opS = '[ \t\r\n]*'                     # optional white space
_Name = '['+_Letter+'_:]['+_NameChar+']*' # XML Name
ref = '&(?:(?P<name>'+_Name+')|#(?P<char>(?:[0-9]+|x[0-9a-fA-F]+)));'
_QStr = "(?:'[^']*'|\"[^\"]*\")"        # quoted XML string
_EntityVal = '"(?:[^"&%]|'+ref+'|%'+_Name+';)*"|' \
             "'(?:[^'&%]|"+ref+"|%"+_Name+";)*'"
_SystemLiteral = '(?P<syslit>'+_QStr+')'
_PublicLiteral = '(?P<publit>"[-\'()+,./:=?;!*#@$_%% \n\ra-zA-Z0-9]*"|' \
                            "'[-()+,./:=?;!*#@$_%% \n\ra-zA-Z0-9]*')"
_ExternalId = '(?:SYSTEM|PUBLIC'+_S+_PublicLiteral+')'+_S+_SystemLiteral

The ASCII versions of the Unicode strings are (if you use these
definitions the re matches):

_BaseChar = 'A-Za-z'
_Ideographic = ''
_Digit = '0-9'
_CombiningChar = ''
_Extender = ''

and the Unicode versions (the ones that I actually use and that now
fail):

# The character sets below are taken directly from the XML spec.
_BaseChar = u'\u0041-\u005A\u0061-\u007A\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF' \
            u'\u0100-\u0131\u0134-\u013E\u0141-\u0148\u014A-\u017E' \
            u'\u0180-\u01C3\u01CD-\u01F0\u01F4-\u01F5\u01FA-\u0217' \
            u'\u0250-\u02A8\u02BB-\u02C1\u0386\u0388-\u038A\u038C' \
            u'\u038E-\u03A1\u03A3-\u03CE\u03D0-\u03D6\u03DA\u03DC\u03DE' \
            u'\u03E0\u03E2-\u03F3\u0401-\u040C\u040E-\u044F\u0451-\u045C' \
            u'\u045E-\u0481\u0490-\u04C4\u04C7-\u04C8\u04CB-\u04CC' \
            u'\u04D0-\u04EB\u04EE-\u04F5\u04F8-\u04F9\u0531-\u0556\u0559' \
            u'\u0561-\u0586\u05D0-\u05EA\u05F0-\u05F2\u0621-\u063A' \
            u'\u0641-\u064A\u0671-\u06B7\u06BA-\u06BE\u06C0-\u06CE' \
            u'\u06D0-\u06D3\u06D5\u06E5-\u06E6\u0905-\u0939\u093D' \
            u'\u0958-\u0961\u0985-\u098C\u098F-\u0990\u0993-\u09A8' \
            u'\u09AA-\u09B0\u09B2\u09B6-\u09B9\u09DC-\u09DD\u09DF-\u09E1' \
            u'\u09F0-\u09F1\u0A05-\u0A0A\u0A0F-\u0A10\u0A13-\u0A28' \
            u'\u0A2A-\u0A30\u0A32-\u0A33\u0A35-\u0A36\u0A38-\u0A39' \
            u'\u0A59-\u0A5C\u0A5E\u0A72-\u0A74\u0A85-\u0A8B\u0A8D' \
            u'\u0A8F-\u0A91\u0A93-\u0AA8\u0AAA-\u0AB0\u0AB2-\u0AB3' \
            u'\u0AB5-\u0AB9\u0ABD\u0AE0\u0B05-\u0B0C\u0B0F-\u0B10' \
            u'\u0B13-\u0B28\u0B2A-\u0B30\u0B32-\u0B33\u0B36-\u0B39\u0B3D' \
            u'\u0B5C-\u0B5D\u0B5F-\u0B61\u0B85-\u0B8A\u0B8E-\u0B90' \
            u'\u0B92-\u0B95\u0B99-\u0B9A\u0B9C\u0B9E-\u0B9F\u0BA3-\u0BA4' \
            u'\u0BA8-\u0BAA\u0BAE-\u0BB5\u0BB7-\u0BB9\u0C05-\u0C0C' \
            u'\u0C0E-\u0C10\u0C12-\u0C28\u0C2A-\u0C33\u0C35-\u0C39' \
            u'\u0C60-\u0C61\u0C85-\u0C8C\u0C8E-\u0C90\u0C92-\u0CA8' \
            u'\u0CAA-\u0CB3\u0CB5-\u0CB9\u0CDE\u0CE0-\u0CE1\u0D05-\u0D0C' \
            u'\u0D0E-\u0D10\u0D12-\u0D28\u0D2A-\u0D39\u0D60-\u0D61' \
            u'\u0E01-\u0E2E\u0E30\u0E32-\u0E33\u0E40-\u0E45\u0E81-\u0E82' \
            u'\u0E84\u0E87-\u0E88\u0E8A\u0E8D\u0E94-\u0E97\u0E99-\u0E9F' \
            u'\u0EA1-\u0EA3\u0EA5\u0EA7\u0EAA-\u0EAB\u0EAD-\u0EAE\u0EB0' \
            u'\u0EB2-\u0EB3\u0EBD\u0EC0-\u0EC4\u0F40-\u0F47\u0F49-\u0F69' \
            u'\u10A0-\u10C5\u10D0-\u10F6\u1100\u1102-\u1103\u1105-\u1107' \
            u'\u1109\u110B-\u110C\u110E-\u1112\u113C\u113E\u1140\u114C' \
            u'\u114E\u1150\u1154-\u1155\u1159\u115F-\u1161\u1163\u1165' \
            u'\u1167\u1169\u116D-\u116E\u1172-\u1173\u1175\u119E\u11A8' \
            u'\u11AB\u11AE-\u11AF\u11B7-\u11B8\u11BA\u11BC-\u11C2\u11EB' \
            u'\u11F0\u11F9\u1E00-\u1E9B\u1EA0-\u1EF9\u1F00-\u1F15' \
            u'\u1F18-\u1F1D\u1F20-\u1F45\u1F48-\u1F4D\u1F50-\u1F57\u1F59' \
            u'\u1F5B\u1F5D\u1F5F-\u1F7D\u1F80-\u1FB4\u1FB6-\u1FBC\u1FBE' \
            u'\u1FC2-\u1FC4\u1FC6-\u1FCC\u1FD0-\u1FD3\u1FD6-\u1FDB' \
            u'\u1FE0-\u1FEC\u1FF2-\u1FF4\u1FF6-\u1FFC\u2126\u212A-\u212B' \
            u'\u212E\u2180-\u2182\u3041-\u3094\u30A1-\u30FA\u3105-\u312C' \
            u'\uAC00-\uD7A3'
_Ideographic = u'\u4E00-\u9FA5\u3007\u3021-\u3029'
_CombiningChar = u'\u0300-\u0345\u0360-\u0361\u0483-\u0486\u0591-\u05A1\u05A3-\u05B9' \
                 u'\u05BB-\u05BD\u05BF\u05C1-\u05C2\u05C4\u064B-\u0652\u0670' \
                 u'\u06D6-\u06DC\u06DD-\u06DF\u06E0-\u06E4\u06E7-\u06E8' \
                 u'\u06EA-\u06ED\u0901-\u0903\u093C\u093E-\u094C\u094D' \
                 u'\u0951-\u0954\u0962-\u0963\u0981-\u0983\u09BC\u09BE\u09BF' \
                 u'\u09C0-\u09C4\u09C7-\u09C8\u09CB-\u09CD\u09D7\u09E2-\u09E3' \
                 u'\u0A02\u0A3C\u0A3E\u0A3F\u0A40-\u0A42\u0A47-\u0A48' \
                 u'\u0A4B-\u0A4D\u0A70-\u0A71\u0A81-\u0A83\u0ABC\u0ABE-\u0AC5' \
                 u'\u0AC7-\u0AC9\u0ACB-\u0ACD\u0B01-\u0B03\u0B3C\u0B3E-\u0B43' \
                 u'\u0B47-\u0B48\u0B4B-\u0B4D\u0B56-\u0B57\u0B82-\u0B83' \
                 u'\u0BBE-\u0BC2\u0BC6-\u0BC8\u0BCA-\u0BCD\u0BD7\u0C01-\u0C03' \
                 u'\u0C3E-\u0C44\u0C46-\u0C48\u0C4A-\u0C4D\u0C55-\u0C56' \
                 u'\u0C82-\u0C83\u0CBE-\u0CC4\u0CC6-\u0CC8\u0CCA-\u0CCD' \
                 u'\u0CD5-\u0CD6\u0D02-\u0D03\u0D3E-\u0D43\u0D46-\u0D48' \
                 u'\u0D4A-\u0D4D\u0D57\u0E31\u0E34-\u0E3A\u0E47-\u0E4E\u0EB1' \
                 u'\u0EB4-\u0EB9\u0EBB-\u0EBC\u0EC8-\u0ECD\u0F18-\u0F19\u0F35' \
                 u'\u0F37\u0F39\u0F3E\u0F3F\u0F71-\u0F84\u0F86-\u0F8B' \
                 u'\u0F90-\u0F95\u0F97\u0F99-\u0FAD\u0FB1-\u0FB7\u0FB9' \
                 u'\u20D0-\u20DC\u20E1\u302A-\u302F\u3099\u309A'
_Digit = u'\u0030-\u0039\u0660-\u0669\u06F0-\u06F9\u0966-\u096F\u09E6-\u09EF' \
         u'\u0A66-\u0A6F\u0AE6-\u0AEF\u0B66-\u0B6F\u0BE7-\u0BEF' \
         u'\u0C66-\u0C6F\u0CE6-\u0CEF\u0D66-\u0D6F\u0E50-\u0E59' \
         u'\u0ED0-\u0ED9\u0F20-\u0F29'
_Extender = u'\u00B7\u02D0\u02D1\u0387\u0640\u0E46\u0EC6\u3005\u3031-\u3035' \
            u'\u309D-\u309E\u30FC-\u30FE'
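
The range strings above look like the bodies of regular-expression
character classes. As a rough sketch of how they might be used (this is
an assumption, since the parser itself is not shown here), each one can
be wrapped in square brackets and compiled. The names digit_re and
ideographic_re are made up for the illustration:

```python
import re

# Two of the range strings from the listing, copied verbatim.
_Digit = u'\u0030-\u0039\u0660-\u0669\u06F0-\u06F9\u0966-\u096F\u09E6-\u09EF' \
         u'\u0A66-\u0A6F\u0AE6-\u0AEF\u0B66-\u0B6F\u0BE7-\u0BEF' \
         u'\u0C66-\u0C6F\u0CE6-\u0CEF\u0D66-\u0D6F\u0E50-\u0E59' \
         u'\u0ED0-\u0ED9\u0F20-\u0F29'
_Ideographic = u'\u4E00-\u9FA5\u3007\u3021-\u3029'

# Hypothetical usage: wrap each string in [...] to form a character class.
digit_re = re.compile(u'[' + _Digit + u']+')
ideographic_re = re.compile(u'[' + _Ideographic + u']')
```

Character classes this large are the kind of input that exercises the
code path for sets containing characters outside the 8-bit range.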

-- Sjoerd Mullender <sjoerd.mullender@oratrix.com>


From mal@lemburg.com  Fri Jul 13 15:41:18 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 16:41:18 +0200
Subject: [Python-Dev] re with Unicode broken?
References: <20010713142737.EDBA8301CF7@bireme.oratrix.nl>
Message-ID: <3B4F088E.14D3B2D3@lemburg.com>

[re failing with CVS sre and Unicode]

I believe Fredrik checked in some changes to sre which affected the
handling of Unicode character ranges. Could this be related to
what you are seeing ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From fredrik@pythonware.com  Fri Jul 13 15:44:22 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 13 Jul 2001 16:44:22 +0200
Subject: [Python-Dev] re with Unicode broken?
References: <20010713142737.EDBA8301CF7@bireme.oratrix.nl>
Message-ID: <002501c10baa$4ea3fb80$0900a8c0@spiff>

sjoerd wrote:

> This is not for the faint of heart.
>
> My validating XML parser doesn't work anymore, even though I didn't
> change a thing (except update Python from CVS).

when did you last update without problems?

the likely cause for this is MvL's "big char set" patch, which
I checked in on July 6.

here's a workaround: tweak sre_compile.py so it doesn't generate
BIGCHARSET op codes. in _optimize_charset, change this:

    except IndexError:
        # character set contains unicode characters
        return _optimize_unicode(charset, fixup)
    # compress character map

to

    except IndexError:
        # character set contains unicode characters
        return charset # WORKAROUND: no compression
    # compress character map

I'll look into this over the weekend.

Cheers /F




From guido@digicool.com  Fri Jul 13 15:59:10 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 13 Jul 2001 10:59:10 -0400
Subject: [Python-Dev] descr-branch, ExtensionClasses
In-Reply-To: Your message of "Fri, 13 Jul 2001 14:08:19 +0200."
 <200107131208.OAA21211@core.inf.ethz.ch>
References: <200107131208.OAA21211@core.inf.ethz.ch>
Message-ID: <200107131459.f6DExAv16504@odiug.digicool.com>

> Hi. Some questions:
> 
> - What is the probability that descr-branch go in 2.2?

At least 75%.  I'm planning to release 2.2a1 next week from
descr-branch.  If this is well received, I'll merge the descr-branch
into the main trunk.

> - Will those changes obsolete ExtensionClasses in the long run?

Yes, that's the whole point.

> Why the questions:
> 
> There is a guy on jython-dev that is trying to port some
> ExtensionClasses-like functionality to jython.

Let him use the design from descr-branch instead (PEP 252 and 253 are
much more up to date now).

> Concretely he is fighting with the fact that jython internals are
> there to make things work, not to enable extensibility in any
> explicit way. At least their messy side makes me believe that.

I'll have to trust you there, I'm not familiar with Jython internals.

> My plans were to try to mimic as long as possible the new descr
> logic in jython 2.2, and try to polish all the internals
> accordingly.

Sounds like a good plan.

> So if the answer to both questions is yes, I can promise that to the
> guy, otherwise I have to be more helpful or diplomatic ...
> 
> It's a kind of "political" matter. Samuele.

I'd say that even if descr-branch doesn't make it into 2.2, it will
make it into the next release, so by all means study the design and
tell me if it has any problems for Jython!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Fri Jul 13 16:08:55 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 13 Jul 2001 17:08:55 +0200
Subject: [Python-Dev] Solaris 8
Message-ID: <20010713170855.W5396@xs4all.nl>

After much frobbing, I managed to compile Python using the SUNpro compiler
as well as gcc. gcc was no problem, aside from the inability to link with
static libraries as if they were shared, but the Sun compiler (which is an
optional, fairly expensive piece of software, IIRC :) is a nasty little
thing.

It defaults to a half-ANSI, half-K&R mode with Sun extensions, that has
broken thread support and refuses to compile most of the modules because of
ANSI-style token-stringification and -concatenation ('#x' and 'x ## y').
Adding '-mt' to the flags, as suggested by the README, didn't help much.

Passing '-Xa' to the compiler switches it into an
ANSI-compliant-with-Sun-extensions mode, and though threads were still
broken, I managed to compile Python and most of the modules. And it
remarkably passed all the tests it could find: all 6 of them. For some
reason (I couldn't figure it out for the life of me) 'readdir' silently
chopped off the first two characters of the entry name, causing the
'findtests' function in the regrtest to not find any tests besides the
standard ones. Sounds like a mismatch between include-files and structs
actually used by the operating system, but none of the manual pages hinted
at anything like it.

Finally, '-Xc' turns it into a strictly ANSI compiler, though apparently not
as strict as 'gcc -ansi': it compiles with only a few warnings, passes all
tests, and with '-mt' it even had working thread support! There seems to be
only one oddness: audioop.so uses sqrt() without being linked to libm,
though why this isn't an issue on other systems, I'm not sure. I've added a
blob to the 2.1.1 README to mention this all (but not in time for the
2.1.1c1 release), and will add it to the 2.2 one as soon as I've tested that
tree on Solaris, too.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Fri Jul 13 16:33:54 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 13 Jul 2001 11:33:54 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: Your message of "Fri, 13 Jul 2001 16:29:38 +1200."
 <200107130429.QAA02078@s454.cosc.canterbury.ac.nz>
References: <200107130429.QAA02078@s454.cosc.canterbury.ac.nz>
Message-ID: <200107131533.f6DFXsV16565@odiug.digicool.com>

> Tim Peters <tim.one@home.com>:
> 
> > It's not BINARY_ADD, it's the PyNumber_Add() called by BINARY_ADD, which,
> > given two strings, calls binary_op1, which does a few failing tests, then
> > calls PyNumber_CoerceEx, which fails quickly enough to coerce, and then
> > pokes around a little looking for number methods, and finally says "hmm!
> > maybe it's a sequence?".
> 
> This seems to contradict what Guido just said about
> centralised coercion having been removed. Is one or
> the other of us talking nonsense, or do we misunderstand
> each other?

It's complicated.  I didn't know everything that was going on when I
wrote that before.  Now I've seen a bit more.  PyNumber_CoerceEx() is
called in order to accommodate old-style numbers for backwards
compatibility (and for complex, which hasn't been converted to
new-style yet).

We could add a new-style numeric add operation to strings so that
s1+s2 takes an earlier path in binary_op1().

I also note that binary_op1() tries PyNumber_CoerceEx() even when both
arguments have a NULL tp_as_number pointer -- at the cost of extra
tests the call to PyNumber_CoerceEx() could be avoided.  (I guess
binary_op1() could add such a test at the top and save itself some
work.)
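
The early exit suggested here can be modelled in a few lines of Python.
This is only a toy model of the dispatch order, not the C code in
abstract.c: the "as_number" attribute stands in for the tp_as_number
pointer, and the Num and Str classes are invented for the illustration.

```python
def _num_add(v, w):
    # Number slot for the toy Num type; declines operands it doesn't know.
    if isinstance(v, Num) and isinstance(w, Num):
        return Num(v.value + w.value)
    return NotImplemented


class Num:
    as_number = {"add": _num_add}   # stands in for a non-NULL tp_as_number

    def __init__(self, value):
        self.value = value


class Str:
    pass                            # no number slots at all


def binary_op1(v, w, op_name):
    """Toy model of the dispatch order, with the proposed quick test."""
    v_slots = getattr(v, "as_number", None)
    w_slots = getattr(w, "as_number", None)

    # Quick test: if neither operand has number slots, nothing further
    # down (including coercion) could succeed, so give up right away.
    if v_slots is None and w_slots is None:
        return NotImplemented

    # Otherwise try the left operand's slot, then the right operand's.
    for slots in (v_slots, w_slots):
        if slots is not None and op_name in slots:
            result = slots[op_name](v, w)
            if result is not NotImplemented:
                return result
    return NotImplemented
```

With two Str operands the function returns before looking at any slot,
which is the saving being discussed for s1+s2.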

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Fri Jul 13 16:34:25 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 13 Jul 2001 17:34:25 +0200
Subject: [Python-Dev] Possible solution for PEP250 and bdist_wininst
Message-ID: <025d01c10bb1$4c0f69c0$e000a8c0@thomasnotebook>

I have a possible solution for this problem.

(I'll use the name INSTALLPATH for installation directory stored
in the registry under the key
HKEY_LOCAL_MACHINE\Software\Python\PythonCore\<version>\InstallPath).

The bdist_wininst installer at _install_ time sets the PYTHONHOME
environment variable to INSTALLPATH, then loads the python dll
and retrieves the 'extinstallpath' attribute from the sys module:

    wsprintf(buffer, "PYTHONHOME=%s", INSTALLPATH);
    _putenv(buffer);
    Py_SetProgramName(modulename);
    Py_Initialize();
    pextinstallpath = PySys_GetObject("extinstallpath");
    Py_Finalize();

If this is successful, the (string contents of) pextinstallpath
is appended to INSTALLPATH, and that will be the directory where
the package will be installed. If unsuccessful, INSTALLPATH will
be used as before.
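
The fallback logic described above can be sketched in Python as
follows. The function name choose_install_dir is made up for the
illustration, and the real installer does this in C. ntpath is used so
that Windows path rules apply on any platform.

```python
import ntpath  # Windows path semantics, importable everywhere


def choose_install_dir(install_path, extinstallpath=None):
    """Return the directory a package would be installed into.

    install_path   -- the INSTALLPATH value read from the registry
    extinstallpath -- value of sys.extinstallpath, or None if the
                      attribute could not be retrieved
    """
    if isinstance(extinstallpath, str) and extinstallpath:
        # ntpath.join keeps an absolute second argument as-is, so both
        # relative and absolute values of extinstallpath are handled.
        return ntpath.join(install_path, extinstallpath)
    # Attribute missing or unusable: behave as before.
    return install_path
```

For example, choose_install_dir(r"C:\Python21", r"lib\site-packages")
gives C:\Python21\lib\site-packages, while passing None gives
C:\Python21 unchanged.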

I'm unsure about the change to site.py, but this should work:

diff -c -r1.26 site.py
*** site.py     2001/03/23 17:53:49     1.26
--- site.py     2001/07/13 15:32:27
***************
*** 140,153 ****
                                   "python" + sys.version[:3],
                                   "site-packages"),
                          makepath(prefix, "lib", "site-python")]
-         elif os.sep == ':':
-             sitedirs = [makepath(prefix, "lib", "site-packages")]
          else:
!             sitedirs = [prefix]
          for sitedir in sitedirs:
              if os.path.isdir(sitedir):
                  addsitedir(sitedir)

  # Define new built-ins 'quit' and 'exit'.
  # These are simply strings that display a hint on how to exit.
  if os.sep == ':':
--- 140,154 ----
                                   "python" + sys.version[:3],
                                   "site-packages"),
                          makepath(prefix, "lib", "site-python")]
          else:
!             sitedirs = [prefix, os.path.join(prefix, "lib", "site-packages")]
          for sitedir in sitedirs:
              if os.path.isdir(sitedir):
                  addsitedir(sitedir)

+ if os.sep == '\\':
+     sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages")
+
  # Define new built-ins 'quit' and 'exit'.
  # These are simply strings that display a hint on how to exit.
  if os.sep == ':':

If anyone cares, I can post the diffs for the bdist_wininst sources.

Thomas



From thomas.heller@ion-tof.com  Fri Jul 13 16:45:04 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 13 Jul 2001 17:45:04 +0200
Subject: [Python-Dev] Possible solution for PEP250 and bdist_wininst
References: <025d01c10bb1$4c0f69c0$e000a8c0@thomasnotebook>
Message-ID: <02ee01c10bb2$c9460ce0$e000a8c0@thomasnotebook>

> I'm unsure about the change to site.py, but this should work:
This was wrong, of course. Sorry for the confusion, should simply be:

diff -c -r1.30 site.py
*** site.py     2001/07/12 21:08:33     1.30
--- site.py     2001/07/13 15:43:49
***************
*** 151,156 ****
--- 151,159 ----
              if os.path.isdir(sitedir):
                  addsitedir(sitedir)

+ if os.sep == ':':
+     sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages")
+
  del dirs_in_sys_path

  # Define new built-ins 'quit' and 'exit'.

Thomas



From Paul.Moore@atosorigin.com  Fri Jul 13 16:46:09 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Fri, 13 Jul 2001 16:46:09 +0100
Subject: [Python-Dev] RE: Possible solution for PEP250 and bdist_wininst
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AEF7@ukrux002.rundc.uk.origin-it.com>

From: Thomas Heller [mailto:thomas.heller@ion-tof.com]
> I have a possible solution for this problem.
[Description cut]

Sounds OK to me, but if I knew much about this area I'd have covered it in
the PEP :-)

One question: Should sys.extinstallpath be set for all platforms? Clearly,
nothing but Windows will use it at present, but is there a meaningful value
it could have on other platforms? If so, exposing it uniformly seems
sensible.

Paul.


From sjoerd.mullender@oratrix.com  Fri Jul 13 16:54:07 2001
From: sjoerd.mullender@oratrix.com (Sjoerd Mullender)
Date: Fri, 13 Jul 2001 17:54:07 +0200
Subject: [Python-Dev] re with Unicode broken?
In-Reply-To: Your message of Fri, 13 Jul 2001 16:44:22 +0200.
 <002501c10baa$4ea3fb80$0900a8c0@spiff>
References: <20010713142737.EDBA8301CF7@bireme.oratrix.nl>
 <002501c10baa$4ea3fb80$0900a8c0@spiff>
Message-ID: <20010713155407.CCCBE301CF7@bireme.oratrix.nl>

On Fri, Jul 13 2001 "Fredrik Lundh" wrote:

> sjoerd wrote:
> 
> > This is not for the faint of heart.
> >
> > My validating XML parser doesn't work anymore, even though I didn't
> > change a thing (except update Python from CVS).
> 
> when did you last update without problems?

I have no idea.  I update regularly (only on the main branch), but I
don't run the program very often.

> the likely cause for this is MvL's "big char set" patch, which
> I checked in on July 6.
> 
> here's a workaround: tweak sre_compile.py so it doesn't generate
> BIGCHARSET op codes. in _optimize_charset, change this:
> 
>     except IndexError:
>         # character set contains unicode characters
>         return _optimize_unicode(charset, fixup)
>     # compress character map
> 
> to
> 
>     except IndexError:
>         # character set contains unicode characters
>         return charset # WORKAROUND: no compression
>     # compress character map
> 
> I'll look into this over the weekend.

Yes, this works.


While you're looking at this, maybe you can also look at speeding up
stuff?  :-)

Importing the module with my XML parser takes an inordinate amount of
time.  This is entirely due to compiling all the regular expressions.
There are a lot of them, and since many of them use the _Name pattern
that I included in my previous message, they tend to be big.

Unfortunately, I can't use any abbreviations that re might provide for
Unicode character sets, since then I don't know for sure that my
expressions are compatible with the XML definition.

Maybe it's possible to add a way of saving precompiled expressions in
the Python file?
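
Saving precompiled expressions is not something the sketch below
attempts. It shows a different and much smaller trick that can hide the
import-time cost: defer each compilation until the pattern is first
used. The LazyPattern class is hypothetical and not part of re.

```python
import re


class LazyPattern:
    """Compile a regular expression on first use, not at import time."""

    def __init__(self, source, flags=0):
        self.source = source
        self.flags = flags
        self._compiled = None

    def __getattr__(self, name):
        # Only called for attributes not found normally, i.e. the
        # methods of the compiled pattern (match, search, ...).
        if name.startswith("_"):
            raise AttributeError(name)
        if self._compiled is None:
            self._compiled = re.compile(self.source, self.flags)
        return getattr(self._compiled, name)
```

A module could then write name_re = LazyPattern(u'[a-z]+') at top level
and pay for compilation only when name_re.match() is first called.
Patterns that are never used are never compiled.
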
-- Sjoerd Mullender <sjoerd.mullender@oratrix.com>


From thomas.heller@ion-tof.com  Fri Jul 13 16:58:17 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 13 Jul 2001 17:58:17 +0200
Subject: [Python-Dev] Re: [Distutils] RE: Possible solution for PEP250 and bdist_wininst
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AEF7@ukrux002.rundc.uk.origin-it.com>
Message-ID: <03fc01c10bb4$a2754de0$e000a8c0@thomasnotebook>

> From: Thomas Heller [mailto:thomas.heller@ion-tof.com]
> > I have a possible solution for this problem.
> [Description cut]
> 
> Sounds OK to me, but if I knew much about this area I'd have covered it in
> the PEP :-)
> 
> One question: Should sys.extinstallpath be set for all platforms? Clearly,
> nothing but Windows will use it at present, but is there a meaningful value
> it could have on other platforms? If so, exposing it uniformly seems
> sensible.

This must be answered by other people, I only use windows.
If it would be exposed uniformly, probably distutils itself
should also use it.

Thomas



From guido@digicool.com  Fri Jul 13 17:41:47 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 13 Jul 2001 12:41:47 -0400
Subject: [Python-Dev] Python book reviewers wanted
Message-ID: <200107131641.f6DGfmA16706@odiug.digicool.com>

Prentice Hall needs reviewers...

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    13 Jul 2001 12:39:19 -0400
From:    Kristen_Blanco@prenhall.com
To:      webmaster@python.org
Subject: Python reviewers

I am writing from Prentice Hall publishing and I am seeking reviewers for an
upcoming publication. We are one of the largest college textbook publishers
in the US.

We are publishing a book entitled "Python How to Program" by
Harvey and Paul Deitel.  They are premier programming language authors, with
the best-selling C++ and Java books in the college market place. More
information on their suite of publications can be found here:

  http://www.prenhall.com/deitel

We are presently seeking qualified technical reviewers to
verify that the Deitels' coverage of Python in their forthcoming book is
accurate.  In return, we are offering a token honorarium.

Might you be willing to participate? If not, could you perhaps suggest a
colleague?

If you are interested, or have any questions, please contact my colleague,
Crissy Statuto, at Crissy_Statuto@prenhall.com

Thank you in advance for your assistance and consideration.

Sincerely,
Crissy Statuto


Crissy Statuto
Project Manager, Computer Science
Prentice Hall
One Lake Street- #3F54
Upper Saddle River, NJ  07458


------- End of Forwarded Message



From mal@lemburg.com  Fri Jul 13 18:20:54 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 19:20:54 +0200
Subject: [Python-Dev] Re: Possible solution for PEP250 and bdist_wininst
References: <025d01c10bb1$4c0f69c0$e000a8c0@thomasnotebook>
Message-ID: <3B4F2DF6.85158478@lemburg.com>

Thomas Heller wrote:
> 
> I have a possible solution for this problem.
> 
> (I'll use the name INSTALLPATH for installation directory stored
> in the registry under the key
> HKEY_LOCAL_MACHINE\Software\Python\PythonCore\<version>\InstallPath).
> 
> The bdist_wininst installer at _install_ time sets the PYTHONHOME
> environment variable to INSTALLPATH, then loads the python dll
> and retrieves the 'extinstallpath' attribute from the sys module:
> 
>     wsprintf(buffer, "PYTHONHOME=%s", INSTALLPATH);
>     _putenv(buffer);
>     Py_SetProgramName(modulename);
>     Py_Initialize();
>     pextinstallpath = PySys_GetObject("extinstallpath");
>     Py_Finalize();
> 
> If this is successful, the (string contents of) pextinstallpath
> is appended to INSTALLPATH, and that will be the directory where
> the package will be installed. If unsuccessful, INSTALLPATH will
> be used as before.

Sounds OK.
 
> I'm unsure about the change to site.py, but this should work:
> 
> diff -c -r1.26 site.py
> *** site.py     2001/03/23 17:53:49     1.26
> --- site.py     2001/07/13 15:32:27
> ***************
> *** 140,153 ****
>                                    "python" + sys.version[:3],
>                                    "site-packages"),
>                           makepath(prefix, "lib", "site-python")]
> -         elif os.sep == ':':
> -             sitedirs = [makepath(prefix, "lib", "site-packages")]
>           else:
> !             sitedirs = [prefix]
>           for sitedir in sitedirs:
>               if os.path.isdir(sitedir):
>                   addsitedir(sitedir)
> 
>   # Define new built-ins 'quit' and 'exit'.
>   # These are simply strings that display a hint on how to exit.
>   if os.sep == ':':
> --- 140,154 ----
>                                    "python" + sys.version[:3],
>                                    "site-packages"),
>                           makepath(prefix, "lib", "site-python")]
>           else:
> !             sitedirs = [prefix, os.path.join(prefix, "lib", "site-packages")]
>           for sitedir in sitedirs:
>               if os.path.isdir(sitedir):
>                   addsitedir(sitedir)
> 
> + if os.sep == '\\':
> +     sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages")
> +

Why not do this for all platforms (which support site-packages) ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@digicool.com  Fri Jul 13 18:21:42 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 13 Jul 2001 13:21:42 -0400
Subject: [Python-Dev] Python 2.1.1c1 released
Message-ID: <200107131721.f6DHLgX16757@odiug.digicool.com>

I'm happy to announce the release today of Python 2.1.1c1, a release
candidate for Python 2.1.1:

  http://www.python.org/2.1.1/

This is a pure bugfix release; see the website for details.  One fixed
"bug" deserves special attention: this release is GPL-compatible.
I hope it's in time for inclusion in the Debian release.

Thanks to Thomas Wouters for all his work on making this a perfect
candidate, despite today's date. :-)

The final 2.1.1 release is expected a week from now.

Enjoy!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Fri Jul 13 18:49:19 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Fri, 13 Jul 2001 13:49:19 -0400
Subject: [Python-Dev] Python 2.1.1c1 released
In-Reply-To: <200107131721.f6DHLgX16757@odiug.digicool.com>; from guido@digicool.com on Fri, Jul 13, 2001 at 01:21:42PM -0400
References: <200107131721.f6DHLgX16757@odiug.digicool.com>
Message-ID: <20010713134919.B7279@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> This is a pure bugfix release; see the website for details.  One fixed
> "bug" deserves special attention: this release is GPL-compatible.
> I hope it's in time for inclusion in the Debian release.

Should be.  They haven't even settled their freeze policy yet, let alone
declared a freeze.  I've been tracking this because I'm coming up on a
fetchmail-5.9.0 stable release and want to get that in, too.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Freedom begins between the ears.
	-- Edward Abbey


From thomas.heller@ion-tof.com  Fri Jul 13 19:03:06 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 13 Jul 2001 20:03:06 +0200
Subject: [Python-Dev] Re: Possible solution for PEP250 and bdist_wininst
References: <025d01c10bb1$4c0f69c0$e000a8c0@thomasnotebook> <3B4F2DF6.85158478@lemburg.com>
Message-ID: <050a01c10bc6$11e401b0$e000a8c0@thomasnotebook>

From: "M.-A. Lemburg" <mal@lemburg.com>
[about setting sys.extinstallpath in site.py]
> 
> Why not do this for all platforms (which support site-packages) ?
Would probably make sense. But in this case,
distutils should also use this setting.

Thomas



From paulp@ActiveState.com  Fri Jul 13 20:03:36 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Fri, 13 Jul 2001 12:03:36 -0700
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings
References: <3B4EE391.19995171@lemburg.com>
Message-ID: <3B4F4608.207DBA7B@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
> Please comment...

I think that there should be a single directive for:

 * unicode strings
 * 8-bit strings
 * comments

If a user uses UTF-8 for 8-bit strings and Shift-JIS for Unicode, there
is basically no text editor in the world that is going to do the right
thing. And it isn't possible for a web server to properly associate an
encoding. In general, it isn't a useful configuration.

Also, no matter what the directive says, I think that \uXXXX should
continue to work. Just as in 8-bit strings, it should be possible to mix
and match direct encoded input and backslash-escaped characters.
Sometimes one is convenient (because of your keyboard setup) and
sometimes the other is convenient. This proposal exists only to improve
typing convenience so we should go all the way and allow both.

I strongly think we should restrict the directive to one per file and in
fact I would say it should be one of the first two lines. It should be
immediately following the shebang line if there is one. This is to allow
text editors to detect it as they detect XML encoding declarations.
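
To make the "first two lines" rule concrete, here is a minimal sketch
of what an editor or tool might do. It assumes the directive is a line
whose first word is "directive"; that spelling is still under
discussion, and the function name find_directive is made up.

```python
def find_directive(source_text):
    """Return the directive line if it is in the first two lines, else None."""
    for line in source_text.splitlines()[:2]:
        words = line.split()
        # A shebang line ("#!...") simply fails this test and is skipped.
        if words and words[0] == "directive":
            return line.strip()
    return None
```

Because only two lines are ever inspected, a tool can decide on the
encoding before it has to interpret any other part of the file.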

My opinions are influenced by the fact that I've helped implement
Unicode support in a Python/XML editor. XML makes it easy to give the
user a good experience. Python could too if we are careful.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From guido@digicool.com  Fri Jul 13 20:16:21 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 13 Jul 2001 15:16:21 -0400
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings
In-Reply-To: Your message of "Fri, 13 Jul 2001 12:03:36 PDT."
 <3B4F4608.207DBA7B@ActiveState.com>
References: <3B4EE391.19995171@lemburg.com>
 <3B4F4608.207DBA7B@ActiveState.com>
Message-ID: <200107131916.f6DJGLU16857@odiug.digicool.com>

> I strongly think we should restrict the directive to one per file and in
> fact I would say it should be one of the first two lines. It should be
> immediately following the shebang line if there is one. This is to allow
> text editors to detect it as they detect XML encoding declarations.

Hm, then the directive would syntactically have to *precede* the
docstring.  That currently doesn't work -- the docstring may only be
preceded by blank lines and comments.  Lots of tools for processing
docstrings already have this built into them.  Is it worth breaking
them so that editors can remain stupid?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paulp@ActiveState.com  Fri Jul 13 20:38:49 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Fri, 13 Jul 2001 12:38:49 -0700
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings
References: <3B4EE391.19995171@lemburg.com>
 <3B4F4608.207DBA7B@ActiveState.com> <200107131916.f6DJGLU16857@odiug.digicool.com>
Message-ID: <3B4F4E49.C299C356@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> 
> Hm, then the directive would syntactically have to *precede* the
> docstring.  

It makes sense for the directive to precede the docstring because the
directive should be able to change the definition of the docstring!

> That currently doesn't work -- the docstring may only be
> preceded by blank lines and comments.  Lots of tools for processing
> docstrings already have this built into them.

The directive statement is inherently a backwards incompatible
extension. It is a grammar change. Many tools sniff out the docstring
from the loaded module anyhow.

>   Is it worth breaking
> them so that editors can remain stupid?

I would say that the more important consideration is that it just makes
sense to figure out what encoding you are using before you start
processing strings!

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From skip@pobox.com (Skip Montanaro)  Fri Jul 13 20:41:39 2001
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 13 Jul 2001 14:41:39 -0500
Subject: [Python-Dev] my nomination for quote-of-the-week
Message-ID: <15183.20211.386837.739757@beluga.mojam.com>

This gets my vote for quote-of-the-week.  Andrew, I seem to recall you are
collecting this sort of stuff.

    From: quinn@yak.ugcs.caltech.edu (Quinn Dunkan)
    To: python-list@python.org
    Subject: Re: not safe at all
    Date: 13 Jul 2001 19:12:51 GMT

    ...

    The static people talk about rigorously enforced interfaces, correctness
    proofs, contracts, etc.  The dynamic people talk about rigorously
    enforced testing and say that types only catch a small portion of
    possible errors.  The static people retort that they don't trust tests
    to cover everything or not have bugs and why write tests for stuff the
    compiler should test for you, so you shouldn't rely on *only* tests, and
    besides static types don't catch a small portion, but a large portion of
    errors.  The dynamic people say no program or test is perfect and static
    typing is not worth the cost in language complexity and design
    difficulty for the gain in eliminating a few tests that would have been
    easy to write anyway, since static types catch a small portion of
    errors, not a large portion.  The static people say static types don't
    add that much language complexity, and it's not design "difficulty" but
    an essential part of the process, and they catch a large portion, not a
    small portion.  The dynamic people say they add enormous complexity, and
    they catch a small portion, and point out that the static people have
    bad breath.  The static people assert that the dynamic people must be
    too stupid to cope with a real language and rigorous requirements, and
    are ugly besides.

    This is when both sides start throwing rocks.

    ...

Skip


From guido@digicool.com  Fri Jul 13 21:25:48 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 13 Jul 2001 16:25:48 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: Your message of "Fri, 13 Jul 2001 11:33:54 EDT."
 <200107131533.f6DFXsV16565@odiug.digicool.com>
References: <200107130429.QAA02078@s454.cosc.canterbury.ac.nz>
 <200107131533.f6DFXsV16565@odiug.digicool.com>
Message-ID: <200107132025.f6DKPmo16935@odiug.digicool.com>

Here's a patch to abstract.c that does to binary_op1() what I had in
mind.  My own attempts at timing this only serve to confuse me, but
I'm sure the experts will be able to assess it.  I think it may make
pystone about 1% faster.

Note that this assumes that a type object only sets the
NEW_STYLE_NUMBER flag when it has a non-NULL tp_as_number structure
pointer.  This makes sense, but just to be sure I add an assert().

In a bizarre twist of benchmarking, if I comment the asserts out,
pystone is 1% *slower* than without the patch....  I guess I'm going
to ignore that.

Enjoy.

--Guido van Rossum (home page: http://www.python.org/~guido/)

Index: abstract.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/abstract.c,v
retrieving revision 2.60.2.5
diff -c -r2.60.2.5 abstract.c
*** abstract.c	2001/07/07 22:55:30	2.60.2.5
--- abstract.c	2001/07/13 20:14:01
***************
*** 318,324 ****
  {
  	PyObject *x;
  	binaryfunc *slot;
! 	if (v->ob_type->tp_as_number != NULL && NEW_STYLE_NUMBER(v)) {
  		slot = NB_BINOP(v->ob_type->tp_as_number, op_slot);
  		if (*slot) {
  			x = (*slot)(v, w);
--- 318,334 ----
  {
  	PyObject *x;
  	binaryfunc *slot;
! 
! 	/* Quick test if anything down here could work */
! 	if (v->ob_type->tp_as_number == NULL &&
! 	    w->ob_type->tp_as_number == NULL)
! 	{
! 		Py_INCREF(Py_NotImplemented);
! 		return Py_NotImplemented;
! 	}
! 
! 	if (NEW_STYLE_NUMBER(v)) {
! 		assert (v->ob_type->tp_as_number != NULL);
  		slot = NB_BINOP(v->ob_type->tp_as_number, op_slot);
  		if (*slot) {
  			x = (*slot)(v, w);
***************
*** 331,337 ****
  			goto binop_error;
  		}
  	}
! 	if (w->ob_type->tp_as_number != NULL && NEW_STYLE_NUMBER(w)) {
  		slot = NB_BINOP(w->ob_type->tp_as_number, op_slot);
  		if (*slot) {
  			x = (*slot)(v, w);
--- 341,348 ----
  			goto binop_error;
  		}
  	}
! 	if (NEW_STYLE_NUMBER(w)) {
! 		assert (w->ob_type->tp_as_number != NULL);
  		slot = NB_BINOP(w->ob_type->tp_as_number, op_slot);
  		if (*slot) {
  			x = (*slot)(v, w);


From fredrik@pythonware.com  Fri Jul 13 21:30:45 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 13 Jul 2001 22:30:45 +0200
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings
References: <3B4EE391.19995171@lemburg.com> <3B4F4608.207DBA7B@ActiveState.com>
Message-ID: <012e01c10bda$b3927320$4ffa42d5@hagrid>

paul wrote:    

> I think that there should be a single directive for:
>
> * unicode strings
> * 8-bit strings
> * comments

I'd say "the entire program".

> If a user uses UTF-8 for 8-bit strings and Shift-JIS for Unicode, there
> is basically no text editor in the world that is going to do the right
> thing. And it isn't possible for a web server to properly associate an
> encoding. In general, it isn't a useful configuration.

exactly.

any proposal that assumes that different parts of a text file are
going to use different encodings is seriously flawed, and totally
ignorant of reality.  things just don't work that way.

</F>



From tim.one@home.com  Fri Jul 13 22:20:12 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 13 Jul 2001 17:20:12 -0400
Subject: [Python-Dev] RE: Defining Unicode Literal Encodings
In-Reply-To: <3B4EE391.19995171@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com>

[M.-A. Lemburg]
> PEP: 0263 (?)
> Title: Defining Unicode Literal Encodings
> Version: $Revision: 1.0 $
> Author: mal@lemburg.com (Marc-André Lemburg)
> Status: Draft
> Type: Standards Track
> Python-Version: 2.3
> Created: 06-Jun-2001
> Post-History:

Since this depends on PEP 244, it should also have a

  Requires: 244

header line.


> ...
> ... can be set using the "directive" statement proposed in PEP 244.
>
>     The syntax for the directives is as follows:
>
>     'directive' WS+ 'unicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL
>     'directive' WS+ 'rawunicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL

PEP 244 doesn't allow these spellings:  at most one atom is allowed after
the directive name, and

    = "whatever"

isn't an atom.  Remove the '=' and PEP 244 is happy, though.  If you want to
keep the "=", PEP 244 has to change.

> ...

[Guido]
> Hm, then the directive would syntactically have to *precede* the
> docstring.  That currently doesn't work -- the docstring may only be
> preceded by blank lines and comments.  Lots of tools for processing
> docstrings already have this built into them.  Is it worth breaking
> them so that editors can remain stupid?

No.



From fredrik@pythonware.com  Fri Jul 13 22:44:36 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 13 Jul 2001 23:44:36 +0200
Subject: [Python-Dev] RE: Defining Unicode Literal Encodings
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com>
Message-ID: <001b01c10be5$0479c4f0$4ffa42d5@hagrid>

tim wrote:
> [Guido]
> > Hm, then the directive would syntactically have to *precede* the
> > docstring.  That currently doesn't work -- the docstring may only be
> > preceded by blank lines and comments.  Lots of tools for processing
> > docstrings already have this built into them.  Is it worth breaking
> > them so that editors can remain stupid?
> 
> No.

that's why the "directive" statement shouldn't be used as
an encoding directive.

(and since I don't see any other use for it, that's also why
the "directive" statement doesn't belong in Python at all ;-)

</F>



From mal@lemburg.com  Fri Jul 13 22:56:40 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 13 Jul 2001 23:56:40 +0200
Subject: [Python-Dev] RE: Defining Unicode Literal Encodings
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com>
Message-ID: <3B4F6E98.733B90DC@lemburg.com>

Tim Peters wrote:
>
> [M.-A. Lemburg]
> > PEP: 0263 (?)
> > Title: Defining Unicode Literal Encodings
> > Version: $Revision: 1.0 $
> > Author: mal@lemburg.com (Marc-André Lemburg)
> > Status: Draft
> > Type: Standards Track
> > Python-Version: 2.3
> > Created: 06-Jun-2001
> > Post-History:
>
> Since this depends on PEP 244, it should also have a
>
>   Requires: 244
>
> header line.

Ok, I'll add that.

> > ...
> > ... can be set using the "directive" statement proposed in PEP 244.
> >
> >     The syntax for the directives is as follows:
> >
> >     'directive' WS+ 'unicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL
> >     'directive' WS+ 'rawunicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL
>
> PEP 244 doesn't allow these spellings:  at most one atom is allowed after
> the directive name, and
>
>     = "whatever"
>
> isn't an atom.  Remove the '=' and PEP 244 is happy, though.  If you want to
> keep the "=", PEP 244 has to change.

True... would that pose a problem ?

[Paul]
> I think that there should be a single directive for:
>
>  * unicode strings
>  * 8-bit strings
>  * comments
>
> If a user uses UTF-8 for 8-bit strings and Shift-JIS for Unicode, there
> is basically no text editor in the world that is going to do the right
> thing. And it isn't possible for a web server to properly associate an
> encoding. In general, it isn't a useful configuration.

Please don't mix 8-bit strings with Unicode literals: 8-bit
strings don't carry any encoding information, so providing encoding
information cannot be stored anywhere.

Comments, OTOH, are part of the program text, so they have to be ASCII
just like the Python source itself.

Note that it doesn't make sense to use an encoding that is not an ASCII
superset for the Unicode literal encoding (as you and others have noted).
Since all builtin Python encodings are ASCII-supersets, this
shouldn't pose much of a problem, though ;-)

> Also, no matter what the directive says, I think that \uXXXX should
> continue to work. Just as in 8-bit strings, it should be possible to mix
> and match direct encoded input and backslash-escaped characters.
> Sometimes one is convenient (because of your keyboard setup) and
> sometimes the other is convenient. This proposal exists only to improve
> typing convenience so we should go all the way and allow both.

Hmm, good point, but hard to implement. We'd probably need a two
phase decoding for this to work:

1. decode the given Unicode literal encoding
2. decode any Unicode escapes in the Unicode string

> I strongly think we should restrict the directive to one per file and in
> fact I would say it should be one of the first two lines. It should be
> immediately following the shebang line if there is one. This is to allow
> text editors to detect it as they detect XML encoding declarations.
>
> My opinions are influenced by the fact that I've helped implement
> Unicode support in a Python/XML editor. XML makes it easy to give the
> user a good experience. Python could too if we are careful.

I think that allowing one directive per file is the way to go,
but I'm not sure about the exact position. Basically, I think it
should go "near" the top, but not necessarily before any doc-string
in the file.

> [Guido]
> > Hm, then the directive would syntactically have to *precede* the
> > docstring.  That currently doesn't work -- the docstring may only be
> > preceded by blank lines and comments.  Lots of tools for processing
> > docstrings already have this built into them.  Is it worth breaking
> > them so that editors can remain stupid?
>
> No.

Agreed.

Note that the PEP doesn't require the directive to be placed before the
doc-string. That point is still open. Technically, the compiler
will only need to know about the encoding before the first
Unicode literal in the source file.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From fredrik@pythonware.com  Fri Jul 13 23:10:42 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 14 Jul 2001 00:10:42 +0200
Subject: [Python-Dev] RE: Defining Unicode Literal Encodings
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com> <3B4F6E98.733B90DC@lemburg.com>
Message-ID: <004c01c10be8$af2face0$4ffa42d5@hagrid>

M.-A. Lemburg wrote:

> Please don't mix 8-bit strings with Unicode literals: 8-bit
> strings don't carry any encoding information, so providing
> encoding information cannot be stored anywhere. 

doesn't change a thing: the SOURCE CODE still has an
encoding.

I'm strongly -1 on your proposal.

it's not representing current best practices (xml, java),
and it's not future proof.  we can do better.

</F>



From mal@lemburg.com  Fri Jul 13 23:21:32 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 14 Jul 2001 00:21:32 +0200
Subject: [Python-Dev] PEP: Defining Unicode Literal Encodings (revision 1.1)
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com> <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid>
Message-ID: <3B4F746C.827BD177@lemburg.com>

Here's an updated version which clarifies some issues...

--

PEP: 0263 (?)
Title: Defining Unicode Literal Encodings
Version: $Revision: 1.1 $
Author: mal@lemburg.com (Marc-André Lemburg)
Status: Draft
Type: Standards Track
Python-Version: 2.3
Created: 06-Jun-2001
Post-History:
Requires: 244

Abstract

    This PEP proposes to use the PEP 244 statement "directive" to make
    the encoding used in Unicode string literals u"..." (and their raw
    counterparts ur"...") definable on a per source file basis.

Problem

    In Python 2.1, Unicode literals can only be written using the
    Latin-1 based encoding "unicode-escape". This makes the
    programming environment rather unfriendly to Python users who live
    and work in non-Latin-1 locales such as many of the Asian
    countries. Programmers can write their 8-bit strings using their
    favourite encoding, but are bound to the "unicode-escape" encoding
    for Unicode literals.

Proposed Solution

    I propose to make the Unicode literal encodings (both standard and
    raw) a per-source file option which can be set using the
    "directive" statement proposed in PEP 244 in a slightly extended
    form (by adding the '=' between the directive name and its value).

Syntax

    The syntax for the directives is as follows:

    'directive' WS+ 'unicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL
    'directive' WS+ 'rawunicodeencoding' WS* '=' WS* PYTHONSTRINGLITERAL

    with the PYTHONSTRINGLITERAL representing the encoding name,
    written as a standard Python 8-bit string literal, and WS being the
    whitespace characters [ \t].

Semantics

    Whenever the Python compiler sees such an encoding directive
    during the compiling process, it updates an internal flag which
    holds the encoding name used for the specific literal form. The
    encoding name flags are initialized to "unicode-escape" for u"..."
    literals and "raw-unicode-escape" for ur"..." respectively.

    ISSUE:
         Maybe we should restrict the directive usage to once per file
         and additionally to a placement before the first Unicode literal
         in the source file.

         (Comments suggest that this approach suits the goal best.)

    If the Python compiler has to convert a Unicode literal to a
    Unicode object, it will pass the 8-bit string data given by the
    literal to the Python codec registry and have it decode the data
    using the current setting of the encoding name flag for the
    requested type of Unicode literal. It then checks the result of
    the decoding operation for being a Unicode object and stores it
    in the byte code stream.

    Since Python source code is defined to be ASCII, the Unicode literal
    encodings (both standard and raw) should be supersets of ASCII and
    match the encoding used elsewhere in the program text, e.g. in
    comments and maybe even 8-bit strings (even though their encoding
    is only implicit and completely under the programmer's control).
    It is the responsibility of the programmer to choose reasonable
    encodings.
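
As a rough illustration of the decoding step described in this section,
the following sketch looks up a codec by name and decodes the raw bytes
of a literal. It is written for a current Python 3 interpreter rather
than the 2.x compiler the PEP targets, and the names
decode_unicode_literal and encoding_flag are hypothetical; nothing here
is part of the proposal itself.

```python
import codecs


def decode_unicode_literal(raw_bytes, encoding_flag="unicode-escape"):
    """Decode the bytes of a literal using the per-file encoding flag.

    Hypothetical helper: the PEP only says the compiler asks the codec
    registry to decode the data and checks the result type.
    """
    # Ask the codec registry for the codec; unknown names raise LookupError.
    codec_info = codecs.lookup(encoding_flag)
    # Stateless decoders return a (result, bytes_consumed) pair.
    text, _consumed = codec_info.decode(raw_bytes)
    # Some codecs map bytes to bytes, so verify we really got text back.
    if not isinstance(text, str):
        raise TypeError("codec %r did not return a Unicode object" % encoding_flag)
    return text
```

With the default flag, backslash escapes are interpreted as in Python 2.1;
with an explicit encoding such as "utf-8" the bytes are decoded directly.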

Scope

    This PEP only affects Python source code which makes use of the
    proposed directives. It does not affect the coercion handling of
    8-bit strings and Unicode in the given module.

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From paulp@ActiveState.com  Fri Jul 13 23:46:02 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Fri, 13 Jul 2001 15:46:02 -0700
Subject: [Python-Dev] RE: Defining Unicode Literal Encodings
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com> <3B4F6E98.733B90DC@lemburg.com>
Message-ID: <3B4F7A2A.5C29C909@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
> ....
> 
> Please don't mix 8-bit strings with Unicode literals: 8-bit
> strings don't carry any encoding information, so providing encoding
> information cannot be stored anywhere.

First, we could store the information if we want.

Second, whether we choose to store the information or not, the point is
that the source file should not mix encodings.

> Comments, OTOH, are part of the program text, so they have to be ASCII
> just like the Python source itself.

The Python interpreter allows non-ASCII characters in comments.

> Hmm, good point, but hard to implement. We'd probably need a two
> 
> phase decoding for this to work:
> 
> 1. decode the given Unicode literal encoding
> 2. decode any Unicode escapes in the Unicode string

That doesn't sound so hard. :)

> I think that allowing one directive per file is the way to go,
> but I'm not sure about the exact position. Basically, I think it
> should go "near" the top, but not necessarily before any doc-string
> in the file.

If Guido is violently opposed to having it before the docstring then we
could allow it either before or after the docstring to give tools time
to catch up.

I'm not sure what tools in particular have the problem, though. Any tool
that uses introspection or inspect.py will be fine.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From skip@pobox.com (Skip Montanaro)  Sat Jul 14 00:50:21 2001
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 13 Jul 2001 18:50:21 -0500
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1)
In-Reply-To: <3B4F746C.827BD177@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com>
 <3B4F6E98.733B90DC@lemburg.com>
 <004c01c10be8$af2face0$4ffa42d5@hagrid>
 <3B4F746C.827BD177@lemburg.com>
Message-ID: <15183.35133.264728.399408@beluga.mojam.com>

    mal> Here's an updated version which clarifies some issues...
    ...
    mal>     I propose to make the Unicode literal encodings (both standard
    mal>     and raw) a per-source file option which can be set using the
    mal>     "directive" statement proposed in PEP 244 in a slightly
    mal>     extended form (by adding the '=' between the directive name and
    mal>     its value).

I think you need to motivate the need for a different syntax than is defined
in PEP 244.  I didn't see any obvious reason why the '=' is required.

Also, how do you propose to address /F's objections, particularly that the
directive can't syntactically appear before the module's docstring (where it
makes sense that the module author would logically want to use a non-default
encoding)?

Skip



From paulp@ActiveState.com  Sat Jul 14 01:23:43 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Fri, 13 Jul 2001 17:23:43 -0700
Subject: [Python-Dev] RE: Defining Unicode Literal Encodings
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com> <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid>
Message-ID: <3B4F910F.9A1BEC73@ActiveState.com>

Fredrik Lundh wrote:
> 
>...
> 
> doesn't change a thing: the SOURCE CODE still has an
> encoding.
> 
> I'm strongly -1 on your proposal.
> 
> it's not representing current best practices (xml, java),
> and it's not future proof.  we can do better.

I think that with minor tweaks, the PEP can be a real step forward from
where we are.

I was disappointed with Guido's quick dismissal because I do think we
have a problem in that people can send around Python programs with a
bunch of encoded text without any declaration. Neither text editors nor
even the Python interpreter itself know how to display that information
on someone else's machine. Having a declaration would be a big step
towards breaking the implicit dependence of those files on their "home"
machines.

For the declaration to have the effect I hope for, it would have to be
file-scoped and apply to all binary data in the file.
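
To make the "home machine" dependence concrete, here is a small sketch
for a current Python 3 interpreter. The byte string and the choice of
Latin-1, cp1251 and UTF-8 are illustrative assumptions, not taken from
any real file.

```python
# Bytes as an editor configured for Latin-1 might have saved the word
# "Gr\u00fc\u00dfe" inside a string literal.
raw = b"Gr\xfc\xdfe"

# The author's machine assumes Latin-1 and sees the intended text.
as_latin1 = raw.decode("latin-1")

# A machine that assumes cp1251 silently shows Cyrillic letters instead.
as_cp1251 = raw.decode("cp1251")

# A machine that assumes UTF-8 cannot decode the bytes at all.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

Without a declaration in the file, none of the three readers can tell
which interpretation the author intended.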
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From guido@digicool.com  Sat Jul 14 02:25:55 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 13 Jul 2001 21:25:55 -0400
Subject: [Python-Dev] RE: Defining Unicode Literal Encodings
In-Reply-To: Your message of "Fri, 13 Jul 2001 17:23:43 PDT."
 <3B4F910F.9A1BEC73@ActiveState.com>
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com> <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid>
 <3B4F910F.9A1BEC73@ActiveState.com>
Message-ID: <200107140125.f6E1PtG17067@odiug.digicool.com>

> I was disappointed with Guido's quick dismissal

Huh????!!!  I haven't dismissed anything.  I just said I saw a problem.

Don't be so quick to jump to conclusions. :-)

I'm still watching the discussion...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Sat Jul 14 04:20:22 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Fri, 13 Jul 2001 23:20:22 -0400 (EDT)
Subject: [Python-Dev] [development doc updates]
Message-ID: <20010714032022.74B0B28927@beowolf.digicool.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/




From mal@lemburg.com  Sat Jul 14 12:32:10 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 14 Jul 2001 13:32:10 +0200
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1)
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com>
 <3B4F6E98.733B90DC@lemburg.com>
 <004c01c10be8$af2face0$4ffa42d5@hagrid>
 <3B4F746C.827BD177@lemburg.com> <15183.35133.264728.399408@beluga.mojam.com>
Message-ID: <3B502DBA.761D4F69@lemburg.com>

Skip Montanaro wrote:
>
>     mal> Here's an updated version which clarifies some issues...
>     ...
>     mal>     I propose to make the Unicode literal encodings (both standard
>     mal>     and raw) a per-source file option which can be set using the
>     mal>     "directive" statement proposed in PEP 244 in a slightly
>     mal>     extended form (by adding the '=' between the directive name and
>     mal>     its value).
>
> I think you need to motivate the need for a different syntax than is defined
> in PEP 244.  I didn't see any obvious reason why the '=' is required.

I'm not picky about the '='; if people don't want it, I'll
happily drop it from the PEP. The only reason I think it may be
worthwhile adding it is because it simply looks right:

directive unicodeencoding = 'latin-1'

rather than

directive unicodeencoding 'latin-1'

(Note that internally this will set a flag to a value, so the
assigning character of '=' seems to fit in nicely.)

> Also, how do you propose to address /F's objections, particularly that the
> directive can't syntactically appear before the module's docstring (where it
> makes sense that the module author would logically want to use a non-default
> encoding)?

Guido hinted at the problem of breaking code, Tim objected
to requiring this.

I don't see the need to use Unicode literals
as module doc-strings, so I think the problem is not a real one
(8-bit strings can be written using any encoding just like you can
now).

Still, if people would like to use Unicode literals for module
doc-strings, then they should place the directive *before* the
doc-string accepting that this could break some tools (the PEP currently
does not restrict the placement of the directive). Alternatively,
we could allow placing the directive into a comment, e.g.

#!/usr/local/python
#directive unicodeencoding = 'utf-8'
u"""
     This is a Unicode doc-string
"""

About Fredrik's idea that the source code should only use one
encoding:

Well, that's possible with the proposed directive, since
only Unicode literals carry data for which Python is encoding-aware
and all other parts are under the programmer's control, e.g.

#!/usr/local/python
""" Module Docs...
"""
directive unicodeencoding = 'latin-1'
...
u = "Héllô Wörld !"
...

will give you pretty much what Fredrik asked for.

Note that since Python does not assign encoding information to
8-bit strings, comments etc. the only parts in a Python program
for which the programmer must explicitly tell Python which
encoding to assume are the Unicode literals.
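
The comment form of the directive suggested earlier in this message
could be recognised by a small scanner along the following lines. This
is only a sketch for a current Python 3 interpreter: the function name
find_comment_directive and the exact regular expression are assumptions,
and the rule that scanning stops at the first non-comment line is one
possible reading of "near the top".

```python
import re

# Matches e.g.  #directive unicodeencoding = 'utf-8'
_DIRECTIVE = re.compile(
    r"""^#\s*directive\s+unicodeencoding\s*=\s*(['"])([^'"]+)\1\s*$"""
)


def find_comment_directive(source_text):
    """Return the encoding named in a leading comment directive, or None."""
    for line in source_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue            # blank lines may precede the docstring
        if not stripped.startswith("#"):
            break               # first real token: stop looking
        match = _DIRECTIVE.match(stripped)
        if match:
            return match.group(2)
    return None
```

Because only blank lines and comments are inspected, a docstring that
follows the directive is still the first statement in the module.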

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From mal@lemburg.com  Sat Jul 14 12:45:10 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 14 Jul 2001 13:45:10 +0200
Subject: [Python-Dev] RE: Defining Unicode Literal Encodings
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com> <3B4F6E98.733B90DC@lemburg.com> <3B4F7A2A.5C29C909@ActiveState.com>
Message-ID: <3B5030C6.E244ADC2@lemburg.com>

Paul Prescod wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> > ....
> >
> > Please don't mix 8-bit strings with Unicode literals: 8-bit
> > strings don't carry any encoding information, so providing encoding
> > information cannot be stored anywhere.
> 
> First, we could store the information if we want.
> 
> Second, whether we choose to store the information or not, the point is
> that the source file should not mix encodings.

I have added a new paragraph to the PEP (see my rev. 1.1 posting)
pointing out that it is the programmer's responsibility to choose
reasonable encodings; in particular, the encodings used should be
compatible so that a text editor can display the data correctly.
 
> > Comments, OTOH, are part of the program text, so they have to be ASCII
> > just like the Python source itself.
> 
> The Python interpreter allows non-ASCII characters in comments.
> 
> > Hmm, good point, but hard to implement. We'd probably need a two
> >
> > phase decoding for this to work:
> >
> > 1. decode the given Unicode literal encoding
> > 2. decode any Unicode escapes in the Unicode string
> 
> That doesn't sound so hard. :)

True. The issue here is very similar to standard literals
vs. raw ones. Perhaps step 2 should only be imposed on standard
literals while raw ones stop after step 1.
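
A minimal sketch of the two-step decoding discussed above, for a current
Python 3 interpreter. The function name decode_literal, its raw flag and
the restriction to four-digit \uXXXX escapes are assumptions made for
illustration; a real implementation would also have to handle escaped
backslashes and the other escape sequences.

```python
import re

# Four hex digits after a backslash-u, as in \u20ac.
_UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_literal(raw_bytes, encoding, raw=False):
    """Decode a literal in two phases (simplified, hypothetical helper)."""
    # Step 1: decode the bytes using the per-file literal encoding.
    text = raw_bytes.decode(encoding)
    if raw:
        # Raw literals stop after step 1, as suggested above.
        return text
    # Step 2: expand \uXXXX escapes in the already-decoded text.
    return _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
```

Doing the escape expansion on the decoded text, rather than on the bytes,
is what lets directly typed characters and escapes be mixed freely.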
 
> > I think that allowing one directive per file is the way to go,
> > but I'm not sure about the exact position. Basically, I think it
> > should go "near" the top, but not necessarily before any doc-string
> > in the file.
> 
> If Guido is violently opposed to having it before the docstring then we
> could allow it either before or after the docstring to give tools time
> to catch up.
> 
> I'm not sure what tools in particular have the problem, though. Any tool
> that uses introspection or inspect.py will be fine.

See my other posting for ways to work around this problem.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From pedroni@inf.ethz.ch  Sat Jul 14 17:50:54 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Sat, 14 Jul 2001 18:50:54 +0200
Subject: [Python-Dev] descr-branch, ExtensionClasses
References: <200107131208.OAA21211@core.inf.ethz.ch>  <200107131459.f6DExAv16504@odiug.digicool.com>
Message-ID: <000b01c10c85$27a82840$8a73fea9@newmexico>

Hi. First thanks for the answers.

> I'd say that even if descr-branch doesn't make it into 2.2, it will
> make it into the next release, so by all means study the design and
> tell me if it has any problems for Jython!
Yup. I will do that as soon as I have time to do it seriously.
I imagine that you need such kind of feedback at least before 2.2 goes beta.

Samuele Pedroni.



From pedroni@inf.ethz.ch  Sat Jul 14 19:45:19 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Sat, 14 Jul 2001 20:45:19 +0200
Subject: [Python-Dev] descr-branch, ExtensionClasses
Message-ID: <000b01c10c95$22b60860$8a73fea9@newmexico>

[GvR]
>Yes, that would be good.  Are you aware of the schedule in PEP 251?

I was, thank you for reminding me of that.
I will try to come up with some comments before a2 or at least before the
middle of August.
Things are a bit complicated with jython because of all the support for java
integration playing with classes and instances internals.
As you might know, we are still working on jython 2.1, we have just an a1 for
it out. We have not yet started working on 2.2.

Samuele Pedroni.




From mal@lemburg.com  Sat Jul 14 19:52:21 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 14 Jul 2001 20:52:21 +0200
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1)
References: <Pine.LNX.4.30.0107142120210.11347-100000@rnd.onego.ru>
Message-ID: <3B5094E5.F239D0E9@lemburg.com>

Roman Suzi wrote:
> 
> On Sat, 14 Jul 2001, M.-A. Lemburg wrote:
> 
> >> #!/usr/bin/python
> >> # -*- coding=utf-8 -*-
> >> ...
> >
> >I already mentioned allowing directives in comments to work around
> >the problem of directive placement before the first doc-string.
> >
> >The above would then look like this:
> >
> >#!/usr/local/bin/python
> ># directive unicodeencoding='utf-8'
> >u""" UTF-8 doc-string """
> >
> >The downside of this is that parsing comments breaks the current
> >tokenizing scheme in Python: the tokenizer removes comments before
> >passing the tokens to the compiler ...wouldn't be hard to
> >fix though ;-) (note that tokenize.py does not)
> 
> BTW, it is possible to write variable names in a national alphabet
> if the locale is set. But I do not know if this is a side-effect
> which will be corrected or behaviour one can rely on ;-)

It is a side-effect of Python relying on the isalpha() C API.
I wouldn't count on it though since it is not compliant with
the Python reference and other Python implementations may
very well not offer this possibility.
 
> It could be also nice to be able replace keywords with localised ones.
> Python remains nice even after translating into Russian.

Eek, no please !  VisualBasic went down that road and
backed out again... it's simply a complete nightmare.

> This + mending broken IDLE (which doesn't allow to enter cyrillic) will
> allow beginners to think and write. Currently "writing while thinking"
> works only for those who think in English ;-)
> 
> And such a move opens Python to secondary schools. For example, Logo has
> national variants without any losses. Why Python, also targeted for
> education requires to use English?
> 
> And unicoding (utf-8-ing) Python source could be the solution.
> 
> What do you think?

I personally think that programs should always be written in ASCII 
and all national language string literals be moved out into
gettext() (or similar) support files. Of course, for beginners
and small projects this is overkill, so the proposed Unicode literal 
variant might help... which is why I wrote the PEP -- adding this
support to Python is really simple and does not require a 
major rewrite of the tokenizer/compiler components in Python.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From tim.one@home.com  Sat Jul 14 20:37:39 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 14 Jul 2001 15:37:39 -0400
Subject: [Python-Dev] RE: PEP: Defining Unicode Literal Encodings (revision 1.1)
In-Reply-To: <3B502DBA.761D4F69@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHKKOAA.tim.one@home.com>

[M.-A. Lemburg]
> ...
> I'm not picky about the '='; if people don't want it, I'll
> happily drop it from the PEP. The only reason I think it may be
> worthwhile adding it is because it simply looks right:
>
> directive unicodeencoding = 'latin-1'
>
> rather than
>
> directive unicodeencoding 'latin-1'

The hangup is finding someone who cares enough <0.9 wink> to change the text
and implementation of the directive PEP.  There was no significant debate
about the proposed directive syntax in that, and in years past similar
crusades that did attract debate floundered on the inability to reach
consensus on overall syntax; it's not a good sign that the first proposed
use wanted syntax the PEP doesn't support.

> ...
> Still, if people would like to use Unicode literals for module
> doc-strings, then they should place the directive *before* the
> doc-string accepting that this could break some tools (the PEP
> currently does not restrict the placement of the directive).
> Alternatively, we could allow placing the directive into a
> comment, e.g.
>
> #!/usr/local/python
> #directive unicodeencoding = 'utf-8'
> u"""
>      This is a Unicode doc-string
> """

Another alternative:

#!/usr/local/python
directive unicodeencoding 'utf-8'

__doc__ = u"""
        This is a Unicode doc-string
"""

That is, the module docstring is just the module's __doc__ attr, and that
can be bound explicitly (a trick I've sometimes used for *computed* module
docstrings).
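
The trick can be checked with a short sketch, written for a current
Python 3 interpreter (where the proposed directive statement does not
exist, so an ordinary assignment stands in for it). The helper name
load_module and the sample sources are hypothetical.

```python
import types


def load_module(name, source):
    """Execute source text in a fresh module object and return it."""
    module = types.ModuleType(name)
    exec(compile(source, name + ".py", "exec"), module.__dict__)
    return module


# A leading string literal is picked up as the docstring.
plain = load_module("plain", '"""plain"""\n')

# Once another statement comes first, the string is no longer the docstring.
late = load_module("late", 'x = 1\n"""late"""\n')

# Binding __doc__ explicitly works wherever the assignment appears.
bound = load_module("bound", 'x = 1\n__doc__ = "bound " + "explicitly"\n')
```

Here plain.__doc__ and bound.__doc__ are set while late.__doc__ stays
None, which is the behaviour the explicit binding works around.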



From paulp@ActiveState.com  Sat Jul 14 23:04:47 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sat, 14 Jul 2001 15:04:47 -0700
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1)
References: <LNBBLJKPBEHFEDALKOLCGEHKKOAA.tim.one@home.com>
Message-ID: <3B50C1FF.7E73739B@ActiveState.com>

Tim Peters wrote:
> 
>...
> 
> That is, the module docstring is just the module's __doc__ attr, and that
> can be bound explicitly (a trick I've sometimes used for *computed* module
> docstrings).

I must be missing something fundamental.

Why wouldn't we just redefine the algorithm used to find the docstring
to allow a directive and implement it in the interpreter? *What tools*
in particular are we worried about breaking?

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From tim.one@home.com  Sat Jul 14 23:39:25 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 14 Jul 2001 18:39:25 -0400
Subject: [Python-Dev] Silly little benchmark
In-Reply-To: <200107132025.f6DKPmo16935@odiug.digicool.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEIDKOAA.tim.one@home.com>

[Guido]
> Here's a patch to abstract.c that does to binary_op1() what I had in
> mind.  My own attempts at timing this only serve to confuse me, but
> I'm sure the experts will be able to assess it.  I think it may make
> pystone about 1% faster.

I would have tried this in PyNumber_Add instead (which pystone never enters!
it doesn't do any string cats, and all its adds are integer adds
special-cased away by BINARY_ADD).

binary_op1() is entered often by pystone, but only for int * and /, and
(just) a few times each for float subtract and the one-shot

    Array1Glob = [0]*51
    Array2Glob = map(lambda x: x[:], [Array1Glob]*51)

module initialization lines.  So adding an early-out in binary_op1()
"should" only harm pystone.  Adding an early-out in PyNumber_Add instead
should be neutral for pystone (but should slow, e.g., floating-point code a
little).

> Note that this assumes that a type object only sets the
> NEW_STYLE_NUMBER flag when it has a non-NULL tp_as_number structure
> pointer.  This makes sense, but just to be sure I add an assert().

Good.

> In a bizarre twist of benchmarking, if I comment the asserts out,
> pystone is 1% *slower* than without the patch....  I guess I'm going
> to ignore that.

Is the Unix build such that release mode doesn't manage to disable asserts?
I wouldn't ignore this, because the source code the C compiler sees in
release builds *should* be the same as if the assert lines had been

    ((void)0);

lines instead.  I don't see anything in the non-Windows builds that's
#define'ing NDEBUG in release builds, which is what they have to do to turn
asserts off.

Note that I understand that the effect of an assert "should be" to slow
things down, but that you're seeing it slow down when they're commented out.
That's not what I'm pursuing in this part:  I'm wondering why you see *any*
difference when commenting out asserts, regardless of direction.  You
shouldn't, and since I don't see anything that ever turns asserts off except
in the Windows build, that makes me twice as suspicious.



From tim.one@home.com  Sun Jul 15 01:12:56 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 14 Jul 2001 20:12:56 -0400
Subject: asserts sure look broken to me (was RE: [Python-Dev] Silly little benchmark)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEIDKOAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIHKOAA.tim.one@home.com>

[Tim]
> ...
> and since I don't see anything that ever turns asserts off except
> in the Windows build, that makes me twice as suspicious.

I built a release-mode Python under Cygwin after including a
guaranteed-to-trigger assert, and sure enough it triggered.  If that's
generally true of non-MSVC builds, it may go quite a way toward explaining,
e.g., why the Linux release-mode Python is significantly slower than the
Windows release-mode Python on our otherwise-identical office boxes.

Ubiquitous screwup or unique to Cygwin?  Disabling asserts in release mode
requires that NDEBUG be #define'd before including assert.h (this is all std
ANSI C, so should work the same way across platforms).  The MSVC project
defines NDEBUG "on the command line" during release builds, which is a good
way to accomplish this.
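
For anyone wanting to check their own installation, here is a minimal
sketch. It assumes a current Python 3, where the sysconfig module (not
available in 2.1) records the compiler flags used for the build. The helper
names flags_define_ndebug and built_with_ndebug are made up for
illustration, and the list of variables inspected is a guess at where the
flag might show up. On builds that record no such variables (typically
Windows) it just reports False, which says nothing about how that build was
actually compiled.

```python
import sysconfig


def flags_define_ndebug(flags):
    """Return True if a compiler flag string defines NDEBUG."""
    return any(tok == "-DNDEBUG" or tok.startswith("-DNDEBUG=")
               for tok in (flags or "").split())


def built_with_ndebug():
    """Best-effort look at the build flags recorded for this interpreter."""
    # Assumed places where the flag may appear; not an exhaustive list.
    names = ("OPT", "CFLAGS", "PY_CFLAGS")
    return any(flags_define_ndebug(sysconfig.get_config_var(name))
               for name in names)


print("NDEBUG found in recorded build flags:", built_with_ndebug())
```

If this reports False on a Unix release build, the asserts in the C sources
are probably still being compiled in.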



From guido@digicool.com  Sun Jul 15 02:26:43 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 14 Jul 2001 21:26:43 -0400
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1)
In-Reply-To: Your message of "Sat, 14 Jul 2001 15:04:47 PDT."
 <3B50C1FF.7E73739B@ActiveState.com>
References: <LNBBLJKPBEHFEDALKOLCGEHKKOAA.tim.one@home.com>
 <3B50C1FF.7E73739B@ActiveState.com>
Message-ID: <200107150126.VAA23781@cj20424-a.reston1.va.home.com>

Explain again why a directive is better than a specially marked
comment, when your main goal seems to be to make it easy for
non-parsing tools like editors to find it?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Sun Jul 15 02:35:07 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 14 Jul 2001 21:35:07 -0400
Subject: asserts sure look broken to me (was RE: [Python-Dev] Silly little benchmark)
In-Reply-To: Your message of "Sat, 14 Jul 2001 20:12:56 EDT."
 <LNBBLJKPBEHFEDALKOLCIEIHKOAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCIEIHKOAA.tim.one@home.com>
Message-ID: <200107150135.VAA23850@cj20424-a.reston1.va.home.com>

Yup, I think assert is always on with the Unix build.  I think I knew
this, because I was uncomfortable for a long time with adding asserts
to frequently run code.

We should fix this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From rnd@onego.ru  Sun Jul 15 08:38:04 2001
From: rnd@onego.ru (Roman Suzi)
Date: Sun, 15 Jul 2001 11:38:04 +0400 (MSD)
Subject: [Python-Dev] This is spoiling Python image!
Message-ID: <Pine.LNX.4.30.0107151126400.4017-100000@rnd.onego.ru>

The problem: it is impossible to use IDLE with non-latin1
encodings under Windows.

IDLE is the standard IDE for Python, and it is what beginning users of
Python see in their Start->Programs. Unfortunately, IDLE can't work
with non-latin1 characters any more. This could lead beginners to
reconsider their choice of language because of unfriendly i18n issues.

The problem is explained in detail below.

Let's consider the errors one at a time.

1. Tcl can't find encodings (they are in \Python21\tcl\tcl8.3\encoding\).
Without them it is impossible to enter Cyrillic and other kinds of letters
in Text and Entry widgets under Windows.

Tkinter tries to help Tcl by means of FixTk.py:

import sys, os, _tkinter
ver = str(_tkinter.TCL_VERSION)
for t in "tcl", "tk":
    v = os.path.join(sys.prefix, "tcl", t+ver)
    if os.path.exists(os.path.join(v, "tclIndex")):
        os.environ[t.upper() + "_LIBRARY"] = v

This sets env. variables TCL_LIBRARY and TK_LIBRARY to
"C:\Python21\tcl\tcl8.3".

The problem is that it imports _tkinter which initialises
and calls Tcl_FindExecutable before TCL_LIBRARY is set.

It is easy to fix this error in FixTk.py:

import sys, os
if not os.environ.has_key('TCL_LIBRARY'):
    tcl_library = os.path.join(sys.prefix, "tcl", "tclX.Y")
    os.environ['TCL_LIBRARY'] = tcl_library

Tcl is smart enough to look into "C:\Python21\tcl\tclX.Y\..\tcl8.3"
as well.
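
The ordering point can be sketched as follows, in current Python 3 rather
than the 2.1 of this report. The helper name ensure_tcl_library and the
default directory name "tcl8.3" are assumptions for illustration only. The
idea is simply that the environment is prepared first and _tkinter is
imported afterwards.

```python
import os


def ensure_tcl_library(environ, prefix, tcl_dir="tcl8.3"):
    """Set TCL_LIBRARY in `environ` unless the user already provided one."""
    if "TCL_LIBRARY" not in environ:
        environ["TCL_LIBRARY"] = os.path.join(prefix, "tcl", tcl_dir)
    return environ["TCL_LIBRARY"]


# Intended order of operations in the real module:
#   ensure_tcl_library(os.environ, sys.prefix)
#   import _tkinter    # only now, so Tcl sees the variable when it starts
demo_env = {}
print(ensure_tcl_library(demo_env, "C:\\Python21"))
```

Passing the mapping in explicitly keeps the sketch testable without touching
the real os.environ.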

2. Now we are able to type in IDLE:
>>> print "Привет"

and we will see the Russian letters... until we press Enter,
after which:

UnicodeError: ASCII decoding error: ordinal not in range(128)

appears.

Tcl recoded "Привет" into Unicode. Python tries to recode it back
into an ordinary string, assuming ordinary strings use
sys.getdefaultencoding().
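
The failure can be reproduced in miniature. This is a sketch in current
Python 3, where bytes stands in for the 8-bit string of Python 2.1. The
sample is the byte sequence F0 D2 C9 D7 C5 D4; treating it as KOI8-R is an
assumption about the charset of the original message, and implicit_decode
is a made-up name for the implicit conversion.

```python
# Bytes F0 D2 C9 D7 C5 D4; reading them as KOI8-R is an assumption.
RAW = b"\xf0\xd2\xc9\xd7\xc5\xd4"


def implicit_decode(raw, default_encoding="ascii"):
    """Stand-in for an implicit 8-bit string -> Unicode conversion."""
    return raw.decode(default_encoding)


try:
    implicit_decode(RAW)
    ascii_worked = True
except UnicodeDecodeError:
    # Every byte here is >= 0x80, so the ASCII codec rejects the first one.
    ascii_worked = False

# With the assumed encoding named explicitly, the conversion succeeds.
as_koi8r = implicit_decode(RAW, "koi8-r")
```

The same bytes decode without complaint once the right encoding is named,
which is why the default encoding matters so much here.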

Now we need to set the default encoding.
Let's look into site.py:

# Set the string encoding used by the Unicode implementation.  The
# default is 'ascii', but if you're willing to experiment, you can
# change this.

encoding = "ascii" # Default value set by _PyUnicode_Init()

if 0:
    # Enable to support locale aware default string encodings.
    import locale
    loc = locale.getdefaultlocale()
    if loc[1]:
        encoding = loc[1]

if 0:
    # Enable to switch off string to Unicode coercion and implicit
    # Unicode to string conversion.
    encoding = "undefined"

if encoding != "ascii":
    sys.setdefaultencoding(encoding)

The code for setting the default encoding is disabled with "if 0:" (maybe
to allow faster startup?).

Then comes:

#
# Run custom site specific code, if available.
#
try:
    import sitecustomize
except ImportError:
    pass

#
# Remove sys.setdefaultencoding() so that users cannot change the
# encoding after initialization.  The test for presence is needed when
# this module is run as a script, because this code is executed twice.
#
if hasattr(sys, "setdefaultencoding"):
    del sys.setdefaultencoding

So, sys.setdefaultencoding is deleted after we used it in
sitecustomize.py.

That's too bad, because the program can't set the default encoding,
and implicit string<->unicode conversions are very common in Python
and IDLE.

The solution could be as follows. Let's put sitecustomize.py in
C:\Python21\ with the following:

import locale, sys
encoding = locale.getdefaultlocale()[1]
if encoding:
    sys.setdefaultencoding(encoding)
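
A slightly more defensive way to pick the encoding might look like the
sketch below. It is written for current Python 3, which no longer has
sys.setdefaultencoding, so it only shows the selection step. The function
name choose_default_encoding and the "ascii" fallback are assumptions. The
point is to reject a locale-reported name for which Python has no codec
before relying on it.

```python
import codecs
import locale


def choose_default_encoding(candidate, fallback="ascii"):
    """Return a normalized codec name for `candidate`, or `fallback`."""
    if not candidate:
        return fallback
    try:
        return codecs.lookup(candidate).name
    except LookupError:
        # The locale named an encoding that Python has no codec for.
        return fallback


# getpreferredencoding(False) reports the locale's encoding without
# changing the process locale.
encoding = choose_default_encoding(locale.getpreferredencoding(False))
print("would use:", encoding)
```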


* It would be wonderful if IDLE itself could set up the encoding based
on the locale, or issue a warning and point to a solution somehow.

3. Now we can try it again in IDLE:

>>> print "Привет"

After hitting Enter we get... latin1.

It's time to look at how _tkinter.c communicates with Tcl.

The cheap&dirty solution for IDLE is as follows:

--- Percolator.py.orig	Sat Jul 14 19:38:16 2001
+++ Percolator.py	Sat Jul 14 19:38:16 2001
@@ -22,6 +22,8 @@

     def insert(self, index, chars, tags=None):
         # Could go away if inheriting from Delegator
+        if index != 'insert':
+        	chars = unicode(chars)
         self.top.insert(index, chars, tags)

     def delete(self, index1, index2=None):


--- PyShell.py.orig	Sat Jul 14 19:38:37 2001
+++ PyShell.py	Sat Jul 14 19:38:37 2001
@@ -469,6 +469,8 @@
         finally:
             self.reading = save
         line = self.text.get("iomark", "end-1c")
+        if type(line) == type(u""):
+        	line = line.encode()
         self.resetoutput()
         if self.canceled:
             self.canceled = 0

But alas these patches only mask the problem.

What is really needed?

Starting with version 8.1, Tcl is Unicode throughout. It is very simple:
it wants utf-8 strings from us and returns utf-8 strings as well.
(As an exception, Tcl may assume latin1 if it is unable to decode a
string.)

_tkinter.c just sends Python strings to Tcl as is, and does so
correctly for Unicode strings. The receiving side is
slightly more complicated:

The Tkapp_Call function (aka root.tk.call) handles most of the Tkinter
Tcl/Tk commands. If the result is 7-bit clean, Tkapp_Call returns an
ordinary string; if not, it converts the result from utf-8 into Unicode
and returns a Unicode string.

Only Tkapp_Call does this. All the others (Tkapp_Eval, GetVar, PythonCmd)
return a utf-8 string!
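
The rule described for Tkapp_Call can be modelled in a few lines. This is
only a sketch in current Python 3, with bytes standing in for the 8-bit
string of Python 2.1 and str for its Unicode string; it is not the C code
from _tkinter.c, and call_result_to_python is a made-up name.

```python
def call_result_to_python(raw):
    """Model of the described rule: 7-bit data stays as is, the rest is decoded."""
    if all(byte < 0x80 for byte in raw):
        return raw                  # 7-bit clean: hand back the 8-bit string
    return raw.decode("utf-8")      # otherwise: decode into a Unicode string


plain = call_result_to_python(b"hello")
# Six Cyrillic letters, sent by Tcl as utf-8.
cyrillic = call_result_to_python(
    "\u041f\u0440\u0438\u0432\u0435\u0442".encode("utf-8"))
```

So the caller gets two different types back depending on the data, while
the other entry points always hand back undecoded utf-8.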

IDLE makes extensive use of Tkinter's capabilities, and all kinds of
strings go back and forth between Python and Tcl.

Clearly, _tkinter.c works incorrectly.

i) before sending a string to Tcl, it must recode it
FROM the default encoding TO utf-8

ii) upon receiving a string from Tcl, it must recode it from
utf-8 to the default encoding, if possible.
[R.S.: Or return it as Unicode, if impossible]
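
Steps i) and ii) could look roughly like this. It is a sketch in current
Python 3, with bytes playing the role of the 8-bit string; to_tcl and
from_tcl are hypothetical names, and the real change would have to be made
in C inside _tkinter.c.

```python
def to_tcl(data, default_encoding):
    """Step i: bytes in the default encoding -> utf-8 bytes for Tcl."""
    return data.decode(default_encoding).encode("utf-8")


def from_tcl(data, default_encoding):
    """Step ii: utf-8 bytes from Tcl -> default-encoded bytes, else text."""
    text = data.decode("utf-8")
    try:
        return text.encode(default_encoding)
    except UnicodeEncodeError:
        # Not representable in the default encoding: return Unicode text.
        return text


koi8_bytes = b"\xf0\xd2\xc9\xd7\xc5\xd4"      # assumed to be KOI8-R text
on_the_wire = to_tcl(koi8_bytes, "koi8-r")
round_trip = from_tcl(on_the_wire, "koi8-r")
```

A round trip through both functions gives back the original bytes whenever
the text fits the default encoding.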

It is possible to optimize the conversions. Of course, this will have
an impact on the speed of Tkinter. But in our opinion correct behaviour is
more important than speed.

Solution checked under Win98.

Note from R.S.: yes, IDLE is not ideal, there are better IDEs (Emacs,
for example), and "serious" programmers rarely use it. Tkinter is also
criticized a lot, etc. But the problem described above is very bad for
Python's image as a user-friendly language. That is why it is very
important to FIX the problem as soon as possible.

We can prepare patches for _tkinter.c as well.

Before we proceed to submitting bug reports and patches, we would be glad
to hear if somebody has a better solution to the problem described.

(The biggest part of the problem is the need to patch _tkinter.c and
recompile it. Everything else even a beginner could fix if supplied with
clues and files with fixes. But of course, Python's IDLE must run correctly
out of the box.)


Author: Kirill Simonov <kirill(at)xyz.donetsk.ua>
Translator: Roman Suzi <rnd@onego.ru>



From mal@lemburg.com  Sat Jul 14 20:57:29 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 14 Jul 2001 21:57:29 +0200
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1)
References: <LNBBLJKPBEHFEDALKOLCGEHKKOAA.tim.one@home.com>
Message-ID: <3B50A429.59B8FB7B@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > ...
> > I'm not picky about the '='; if people don't want it, I'll
> > happily drop it from the PEP. The only reason I think it may be
> > worthwhile adding it is because it simply looks right:
> >
> > directive unicodeencoding = 'latin-1'
> >
> > rather than
> >
> > directive unicodeencoding 'latin-1'
> 
> The hangup is finding someone who cares enough <0.9 wink> to change the text
> and implementation of the directive PEP.  There was no significant debate
> about the proposed directive syntax in that, and in years past similar
> crusades that did attract debate floundered on the inability to reach
> consensus on overall syntax; it's not a good sign that the first proposed
> use wanted syntax the PEP doesn't support.

Well, I guess I would care enough :-) Martin has to change the PEP
though, since he's the PEP author (and currently on vacation if
I'm not mistaken).

I think that supporting the typical "key = value" format is
quite reasonable for setting flags in the compiler. The PEP's
original idea of replacing your "from __future__ import spam"
does not require this format, since it only needs to support
switches.

> > ...
> > Still, if people would like to use Unicode literals for module
> > doc-strings, then they should place the directive *before* the
> > doc-string accepting that this could break some tools (the PEP
> > currently does not restrict the placement of the directive).
> > Alternatively, we could allow placing the directive into a
> > comment, e.g.
> >
> > #!/usr/local/python
> > #directive unicodeencoding = 'utf-8'
> > u"""
> >      This is a Unicode doc-string
> > """
> 
> Another alternative:
> 
> #!/usr/local/python
> directive unicodeencoding 'utf-8'
> 
> __doc__ = u"""
>         This is a Unicode doc-string
> """
> 
> That is, the module docstring is just the module's __doc__ attr, and that
> can be bound explicitly (a trick I've sometimes used for *computed* module
> docstrings).

Hmm, that looks a little cumbersome, but it would work (at least for
doc string extraction tools which import the module rather than
tokenize it).
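
To make the difference between the two kinds of tools concrete, here is a
small sketch in current Python 3. The module source and the name "demo"
are invented for the example. A tool that only parses the source finds no
docstring literal, while one that executes the module sees the value bound
to __doc__.

```python
import ast
import types

# Invented module source: the docstring is computed, then bound explicitly.
SOURCE = '__doc__ = "Supports %d codecs" % len(("ascii", "utf-8"))\n'

# A purely static tool looks for a leading string literal and finds none.
static_doc = ast.get_docstring(ast.parse(SOURCE))

# A tool that imports (here: executes) the module sees the bound attribute.
module = types.ModuleType("demo")
exec(SOURCE, module.__dict__)
runtime_doc = module.__doc__
```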

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From paulp@ActiveState.com  Sun Jul 15 18:15:28 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 15 Jul 2001 10:15:28 -0700
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision
 1.1)
References: <LNBBLJKPBEHFEDALKOLCGEHKKOAA.tim.one@home.com>
 <3B50C1FF.7E73739B@ActiveState.com> <200107150126.VAA23781@cj20424-a.reston1.va.home.com>
Message-ID: <3B51CFB0.ACE9070D@ActiveState.com>

Guido van Rossum wrote:
> 
> Explain again why a directive is better than a specially marked
> comment, when your main goal seems to be to make it easy for
> non-parsing tools like editors to find it?
>...

Parsing tools do need it. The directive changes the file's semantics.
Both parsing and non-parsing tools need it.

I could live with a comment but I think that that is actually harder to
implement so I don't understand the benefit...I'm still trying to
understand what tools we are protecting. compiler.py can be easily
fixed. The real parser/compiler can be easily fixed. The other tools
mostly take their cue from one of these two modules, right?
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From guido@digicool.com  Sun Jul 15 18:29:24 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 15 Jul 2001 13:29:24 -0400
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1)
In-Reply-To: Your message of "Sun, 15 Jul 2001 10:15:28 PDT."
 <3B51CFB0.ACE9070D@ActiveState.com>
References: <LNBBLJKPBEHFEDALKOLCGEHKKOAA.tim.one@home.com> <3B50C1FF.7E73739B@ActiveState.com> <200107150126.VAA23781@cj20424-a.reston1.va.home.com>
 <3B51CFB0.ACE9070D@ActiveState.com>
Message-ID: <200107151729.NAA00455@cj20424-a.reston1.va.home.com>

> > Explain again why a directive is better than a specially marked
> > comment, when your main goal seems to be to make it easy for
> > non-parsing tools like editors to find it?
> >...
> 
> Parsing tools do need it. The directive changes the file's semantics.
> Both parsing and non-parsing tools need it.

I understand that.

> I could live with a comment but I think that that is actually harder to
> implement so I don't understand the benefit...I'm still trying to
> understand what tools we are protecting. compiler.py can be easily
> fixed. The real parser/compiler can be easily fixed. The other tools
> mostly take their cue from one of these two modules, right?

I disagree with the first sentence -- I believe a comment is easier to
implement.  The directive statement is still problematic.  Martin's
hack falls short of doing the right thing in all cases: you can't have
the first statement of your program be "directive = ..." or
"directive(...)".

Another argument for a comment: I expect there could be situations
where you want to declare an encoding that doesn't affect the Python
parser, but that does affect the editor (e.g. when you use the
encoding only in comments and/or 8-bit strings).  A comment would
back-port to older Python versions; a directive statement wouldn't.  I
don't know how important this is though.
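
For illustration, this is roughly all a non-parsing tool would need in
order to find such a comment. It is a sketch in current Python 3. The
marker format is hypothetical, modelled on the "#directive unicodeencoding
= 'utf-8'" comment floated earlier in this thread, and is not an agreed
syntax; find_declared_encoding and the two-line limit are likewise
assumptions.

```python
import re

# Hypothetical marker comment; the "=" is optional in this sketch.
_MARKER = re.compile(
    r"^#\s*directive\s+unicodeencoding\s*=?\s*['\"]([-\w.]+)['\"]\s*$")


def find_declared_encoding(source, max_lines=2):
    """Return the encoding named in a marker comment near the top, or None."""
    for line in source.splitlines()[:max_lines]:
        match = _MARKER.match(line.strip())
        if match:
            return match.group(1)
    return None


example = "#!/usr/local/python\n#directive unicodeencoding = 'utf-8'\nx = 1\n"
print(find_declared_encoding(example))
```

No tokenizer or grammar change is involved, and older Pythons would simply
skip the line as a comment.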

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Sun Jul 15 19:07:50 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 15 Jul 2001 20:07:50 +0200
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision
 1.1)
References: <LNBBLJKPBEHFEDALKOLCGEHKKOAA.tim.one@home.com> <3B50C1FF.7E73739B@ActiveState.com> <200107150126.VAA23781@cj20424-a.reston1.va.home.com>
 <3B51CFB0.ACE9070D@ActiveState.com> <200107151729.NAA00455@cj20424-a.reston1.va.home.com>
Message-ID: <3B51DBF6.6456A750@lemburg.com>

Guido van Rossum wrote:
> 
> > > Explain again why a directive is better than a specially marked
> > > comment, when your main goal seems to be to make it easy for
> > > non-parsing tools like editors to find it?
> > >...
> >
> > Parsing tools do need it. The directive changes the file's semantics.
> > Both parsing and non-parsing tools need it.
> 
> I understand that.
> 
> > I could live with a comment but I think that that is actually harder to
> > implement so I don't understand the benefit...I'm still trying to
> > understand what tools we are protecting. compiler.py can be easily
> > fixed. The real parser/compiler can be easily fixed. The other tools
> > mostly take their cue from one of these two modules, right?
> 
> I disagree with the first sentence -- I believe a comment is easier to
> implement.  The directive statement is still problematic.  Martin's
> hack falls short of doing the right thing in all cases: you can't have
> the first statement of your program be "directive = ..." or
> "directive(...)".
> 
> Another argument for a comment: I expect there could be situations
> where you want to declare an encoding that doesn't affect the Python
> parser, but that does affect the editor (e.g. when you use the
> encoding only in comments and/or 8-bit strings).  A comment would
> back-port to older Python versions; a directive statement wouldn't.  I
> don't know how important this is though.

Even though putting the information into a comment would
indeed be easier to implement, I think that from a design point
of view, it is a hack and not a clean design.

Note that a programmer can always place the encoding information
in the format needed for the editor into an additional comment
in front of the doc-string if that's needed (the comment format
needed for the editor will be editor-specific !).

I think that apart from adding a new keyword to the language
the argument about breaking doc-string tools is not a valid
one. Non-Unicode doc-strings will continue to work like they
always have:

#!/usr/local/bin/python
# -*- encoding='utf-8' -*-
""" Binary doc-string using UTF-8
"""
directive unicodeencoding = 'utf-8'
...
print u"Unicode encoded as UTF-8 rather than unicode-escape"
...

Or am I missing something ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Sun Jul 15 19:09:10 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 15 Jul 2001 20:09:10 +0200
Subject: [Python-Dev] Re: Possible solution for PEP250 and bdist_wininst
References: <025d01c10bb1$4c0f69c0$e000a8c0@thomasnotebook> <3B4F2DF6.85158478@lemburg.com> <050a01c10bc6$11e401b0$e000a8c0@thomasnotebook>
Message-ID: <3B51DC46.6DD434F2@lemburg.com>

Thomas Heller wrote:
> 
> From: "M.-A. Lemburg" <mal@lemburg.com>
> [about setting sys.extinstallpath in site.py]
> >
> > Why not do this for all platforms (which support site-packages) ?
> Would probably make sense. But in this case,
> distutils should also use this setting.

Sure. (That was the point of inventing sys.extinstallpath ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From paulp@ActiveState.com  Sun Jul 15 20:48:08 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 15 Jul 2001 12:48:08 -0700
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1)
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com> <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid> <3B4F746C.827BD177@lemburg.com>
Message-ID: <3B51F378.7DC06482@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
>     Since Python source code is defined to be ASCII, the Unicode literal
>     encodings (both standard and raw) should be supersets of ASCII and
>     match the encoding used elsewhere in the program text, e.g. in
>     comments and maybe even 8-bit strings (even though their encoding
>     is only implicit and completely under the programmer's control).

Python programmers do not read PEPs to learn how to use new features. I
think it makes the whole thing much simpler if we define it on the file
level explicitly. To me, the feature is most helpful if it helps the
interpreter and various code inspection tools to understand all of the
non-ASCII information in the file.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From c_ullman@yahoo.com  Mon Jul 16 06:08:08 2001
From: c_ullman@yahoo.com (Cayce Ullman)
Date: Sun, 15 Jul 2001 22:08:08 -0700 (PDT)
Subject: [Python-Dev] Leading with XML-RPC
Message-ID: <20010716050808.79446.qmail@web11003.mail.yahoo.com>

/F wrote:

>-0 on soap support in 2.2 (it's still a moving target; a new spec draft
>was released this weekend).  if we want something now, it should be
>cayce ullman's SOAP.py, not my soaplib.py.  but I don't think we need
>SOAP in the standard library for another year or two.

Agreed on it being too early for any existing SOAP library to be included in the standard library. SOAP.py for example is not very clean at the moment as it evolved (and does some things wrong) in an attempt to interop with most impls.  I do think that if xmlrpclib.py is included in the std lib (which IMHO is a very good idea), any future included SOAP lib should be similar in structure and use (ie more like soaplib.py than SOAP.py). Hopefully, bringing soaplib.py up to speed or cleaning up SOAP.py should be an increasingly easier task as the interop process continues to settle down.

I do think Python has an opportunity to become an excellent choice for doing "web services" or .NET type of work out of the box.  If these concepts do take off I would hope to see some included SOAP functionality sooner rather than later in the std lib.

Also, for the record SOAP.py has moved to http://pywebsvcs.sourceforge.net

Cayce





From thomas@xs4all.net  Mon Jul 16 07:57:29 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 16 Jul 2001 08:57:29 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include parsetok.h,2.15,2.16 pythonrun.h,2.42,2.43
In-Reply-To: <E15M14Q-0005uu-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010716085729.E5396@xs4all.nl>

On Sun, Jul 15, 2001 at 10:37:26PM -0700, Tim Peters wrote:

> Modified Files:
> 	parsetok.h pythonrun.h 
> Log Message:
> Ugly.  A pile of new xxxFlags() functions, to communicate to the parser
> that 'yield' is a keyword.  This doesn't help test_generators at all!  I
> don't know why not.  These things do work now (and didn't before this
> patch):

What's the problem with this, anyway ? Why would "from __future__ import
generators" or special flags be necessary to enable the existence of
generators ? I'd have thought it's just a parser directive (okay, so that's
tricky to implement) but to code that doesn't use 'yield' a generator is
just another iterator, right ?
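
To illustrate the consumer's side of that, a small sketch in current
Python 3 (where 'yield' is always a keyword and no __future__ import is
needed); countdown is just an illustrative name. Code that receives the
generator object only uses the iterator protocol and never mentions
'yield' itself.

```python
def countdown(n):
    """A generator function: calling it returns an iterator."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
same_object = iter(gen) is gen   # a generator is its own iterator
values = list(gen)               # consumed like any other iterator
```

Only the module that defines countdown needs the parser to treat 'yield'
as a keyword.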

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Mon Jul 16 08:53:18 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 16 Jul 2001 09:53:18 +0200
Subject: [Python-Dev] Re: CVS: python/dist/src/Parser parsetok.c,2.25,2.26
References: <E15M14Q-0005v0-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3B529D6E.A4BAD3FA@lemburg.com>

[Tim]
> A pile of new xxxFlags() functions, to communicate to the parser
> that 'yield' is a keyword.

Would those APIs also be usable for a new "directive" keyword ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From atehwa@iki.fi  Mon Jul 16 10:52:30 2001
From: atehwa@iki.fi (Panu A Kalliokoski)
Date: Mon, 16 Jul 2001 12:52:30 +0300 (EET DST)
Subject: [Python-Dev] A replacement for asyncore / asynchat
Message-ID: <Pine.OSF.4.30.0107161246001.18906-100000@sirppi.helsinki.fi>

Hello all, I've developed a Python module (in Python) to make somewhat
higher abstraction over select.select(). The package is called
"Selecting".  The package is somewhat similar to asyncore, but has many
advantages over it:

- It's made in OO fashion, allowing for greater flexibility in
  overriding default behaviour;
- Event queues, which allow you to schedule events that should happen
  sometime in the future (nicely synced with select()) (permanent /
  one-shot events);
- Cleaner API;
- Channel interfaces. It's possible to make many different channels as
  long as they have a fd to select() on; with this, you can implement,
  for example, inter-thread locking with pipes.
- Simpler buffering scheme, which makes it unnecessary to use non-blocking
  fd's, and might even give some speed;
- No exception handling (I found exception packing of asyncore to be a
  real nuisance)
- Clearer (?) division of responsibility: the API of channel handlers,
  etc.  (asyncore puts part of message handling into the socket wrapper)

For these reasons, I think that the asyncore package in the Python main
distribution should be replaced with Selecting or at least Selecting
should be put in the main distribution.

The package is available at
http://sange.fi/~atehwa-u/selecting/		(for browsing)  and
http://sange.fi/~atehwa-u/selecting-0.89.tar.gz (for downloading).

The package is quite well tested and has been used to build ircd-style
daemons, but more testing and comments are always welcome.

Panu Kalliokoski



From mal@lemburg.com  Mon Jul 16 13:29:11 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 16 Jul 2001 14:29:11 +0200
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision
 1.1)
References: <LNBBLJKPBEHFEDALKOLCAEEMKOAA.tim.one@home.com> <3B4F6E98.733B90DC@lemburg.com> <004c01c10be8$af2face0$4ffa42d5@hagrid> <3B4F746C.827BD177@lemburg.com> <3B51F378.7DC06482@ActiveState.com>
Message-ID: <3B52DE17.E283521@lemburg.com>

Paul Prescod wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> >...
> >     Since Python source code is defined to be ASCII, the Unicode literal
> >     encodings (both standard and raw) should be supersets of ASCII and
> >     match the encoding used elsewhere in the program text, e.g. in
> >     comments and maybe even 8-bit strings (even though their encoding
> >     is only implicit and completely under the programmer's control).
> 
> Python programmers do not read PEPs to learn how to use new features. I
> think it makes the whole thing much simpler if we define it on the file
> level explicitly. To me, the feature is most helpful if it helps the
> interpreter and various code inspection tools to understand all of the
> non-ASCII information in the file.

I don't think I understand your point... please clarify.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From martin@loewis.home.cs.tu-berlin.de  Mon Jul 16 16:15:01 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 16 Jul 2001 17:15:01 +0200
Subject: [Python-Dev] Leading with XML-RPC
Message-ID: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de>

> It might benefit from also including the sgmlop.c extension.

+1 on including this one (after fixing the bugs, that is). People want
a "good" XML parser in Python, regardless of XML-RPC; they complain
that expat requires an external library.

sgmlop should then go into xml.parsers.sgmlop; making sgmllib and
xmllib use sgmlop is optional.

Regards,
Martin


From guido@digicool.com  Mon Jul 16 17:17:37 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 16 Jul 2001 12:17:37 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: Your message of "Mon, 16 Jul 2001 17:15:01 +0200."
 <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de>
References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de>
Message-ID: <200107161617.f6GGHeE31369@odiug.digicool.com>

> > It might benefit from also including the sgmlop.c extension.
> 
> +1 on including this one (after fixing the bugs, that is). People want
> a "good" XML parser in Python, regardless of XML-RPC; they complain
> that expat requires an external library.
> 
> sgmlop should then go into xml.parsers.sgmlop; making sgmllib and
> xmllib use sgmlop is optional.

+0

I believe sgmlop can crash on grossly bad input (admittedly I looked
at the source once over a year ago).  If this were fixed I'd be +1.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@loewis.home.cs.tu-berlin.de  Mon Jul 16 17:05:47 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 16 Jul 2001 18:05:47 +0200
Subject: [Python-Dev] guido@digicool.com
Message-ID: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de>

> Martin's hack falls short of doing the right thing in all cases: you
> can't have the first statement of your program be "directive = ..."
> or "directive(...)".

If that is considered as a serious problem, I'll try to solve it with
an additional lookahead token: If the next token is a name, then it is
a directive.
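
A minimal sketch of the lookahead idea, assuming the standard tokenize
module stands in for the real parser. The function name
looks_like_directive is hypothetical, and the keyword check is an added
assumption so that an expression such as "directive if x else y" is not
mistaken for a directive.

```python
import io
import keyword
import tokenize


def looks_like_directive(source):
    """Hypothetical check: is the first statement 'directive NAME ...'?"""
    skip = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE)
    tokens = (
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type not in skip
    )
    first = next(tokens, None)
    second = next(tokens, None)  # the one additional lookahead token
    if first is None or second is None:
        return False
    return (
        first.type == tokenize.NAME
        and first.string == "directive"
        and second.type == tokenize.NAME
        and not keyword.iskeyword(second.string)
    )


is_directive = looks_like_directive("directive transitional generators\n")
is_assignment = looks_like_directive("directive = 1\n")
is_call = looks_like_directive("directive(1)\n")
```

With this rule, "directive = ..." and "directive(...)" remain ordinary
statements because the token after "directive" is an operator, not a name.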

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Mon Jul 16 16:59:08 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 16 Jul 2001 17:59:08 +0200
Subject: [Python-Dev] PEP 244 syntax
Message-ID: <200107161559.f6GFx8V03899@mira.informatik.hu-berlin.de>

> Well, I guess I would care enough :-) Martin has to change the PEP
> though, since he's the PEP author.

I don't like having an equal sign there, but I can add this as an
alternative and leave it for BDFL pronouncement (and count votes in
favour or against).

In any case, I'd need to know what the exact proposed change to PEP
244 is. The syntax currently reads

directive_statement: 'directive' NAME [atom] [';'] NEWLINE

How do you want this to change?

> I think that supporting the typical "key = value" format is
> quite reasonable for setting flags in the compiler. The PEP's
> original idea of replacing your "from __future__ import spam"
> does not require this format, since it only needs to support
> switches.

Actually, based on Tim's objections, I need the syntax in a different
way:

directive transitional generators

Here, "directive transitional" indicates that a transitional feature
is being activated, followed by the name of the feature. This is in
line with

directive transitional nested_scopes

Spelling them as

directive transitional = nested_scopes
# or
directive transitional = 'nested_scopes'

doesn't sound right, since I'm not assigning to "transitional".

Of course, since this directive is spelled "from __future__ import"
these days, the only remaining application for directives is the
unicodeencoding directive. I'm just pointing out that adding an equal
sign likely restricts the applicability of directives.

Regards,
Martin


From mal@lemburg.com  Mon Jul 16 17:49:40 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 16 Jul 2001 18:49:40 +0200
Subject: [Python-Dev] Re: PEP 244 syntax
References: <200107161559.f6GFx8V03899@mira.informatik.hu-berlin.de>
Message-ID: <3B531B24.C8CCC74F@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > Well, I guess I would care enough :-) Martin has to change the PEP
> > though, since he's the PEP author.
> 
> I don't like having an equal sign there, but I can add this as an
> alternative and leave it for BDFL pronouncement (and count votes in
> favour or against).
> 
> In any case, I'd need to know what the exact proposed change to PEP
> 244 is. The syntax currently reads
> 
> directive_statement: 'directive' NAME [atom] [';'] NEWLINE
> 
> How do you want this to change?

To make the directive statement useful for setting compiler
parameters, the syntax should be extended to allow
for an (optional) '='. Whether or not this '=' sign must be there
is up to the definition of the directive NAME.

It may also be worthwhile using a testlist (see Grammar)
instead of the fixed atom for cases where the compiler
parameter needs to be, e.g., a list of options.

I'd also suggest removing the optional ';' since it does
not conform with the rest of Python....

directive_statement: 'directive' NAME ['='] [testlist] NEWLINE
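
A rough sketch of how a line matching this rule could be split apart,
assuming a regular expression is good enough for illustration. The helper
name parse_directive is hypothetical, and the testlist is kept as raw text
rather than parsed by the real grammar.

```python
import re

# 'directive' NAME ['='] [testlist], with the testlist left unparsed.
DIRECTIVE_RE = re.compile(r"^directive\s+([A-Za-z_]\w*)\s*(=)?\s*(.*)$")


def parse_directive(line):
    """Return (name, has_equals, rest) or None if the line is no directive."""
    match = DIRECTIVE_RE.match(line.strip())
    if match is None:
        return None
    name, equals, rest = match.groups()
    return name, equals is not None, rest


with_equals = parse_directive("directive unicodeencoding = 'utf-8'")
without_equals = parse_directive("directive transitional nested_scopes")
not_a_directive = parse_directive("directive = 3")
```

Whether the '=' is required or forbidden for a given NAME would then be
decided by whatever code handles that particular directive.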
 
> > I think that supporting the typical "key = value" format is
> > quite reasonable for setting flags in the compiler. The PEP's
> > original idea of replacing your "from __future__ import spam"
> > does not require this format, since it only needs to support
> > switches.
> 
> Actually, based on Tim's objections, I need the syntax in a different
> way:
> 
> directive transitional generators
> 
> Here, "directive transitional" indicates that a transitional feature
> is being activated, followed by the name of the feature. This is in
> line with
> 
> directive transitional nested_scopes
> 
> Spelling them as
> 
> directive transitional = nested_scopes
> # or
> directive transitional = 'nested_scopes'
> 
> doesn't sound right, since I'm not assigning to "transitional".

True.
 
> Of course, since this directive is spelled "from __future__ import"
> these days, the only remaining application for directives is the
> unicodeencoding directive. I'm just pointing out that adding an equal
> sign likely restricts the applicability of directives.

It doesn't need to: simply leave the requirement whether to
use or not to use an equal sign to the definition of the 
directive.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From tim@digicool.com  Mon Jul 16 17:52:42 2001
From: tim@digicool.com (Tim Peters)
Date: Mon, 16 Jul 2001 12:52:42 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include parsetok.h,2.15,2.16 pythonrun.h,2.42,2.43
In-Reply-To: <20010716085729.E5396@xs4all.nl>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEMECCAA.tim@digicool.com>

[Tim]
> 	parsetok.h pythonrun.h
> Log Message:
> Ugly.  A pile of new xxxFlags() functions, to communicate to the parser
> that 'yield' is a keyword.  This doesn't help test_generators at all!
> I don't know why not.  These things do work now (and didn't before this
> patch):

[Thomas Wouters]
> What's the problem with this, anyway ? Why would "from __future__ import
> generators" or special flags be necessary to enable the existence of
> generators ?

Sorry, I'm lost.  Guido introduced a generators future-statement, and now
we're trying to get it to work the way PEP 236 says future statements work.
A future statement is needed because yield *will* be a new keyword in 2.3,
but is not in 2.2 (unless a module includes the generators
future-statement).
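
A small sketch of what the future statement looks like from the Python
side, assuming a current interpreter where the __future__ module still
records the generators feature. The source string and the names gen and
values are made up for illustration.

```python
import __future__

# The feature record states when 'yield' became optional and mandatory.
feature = __future__.generators
optional_release = feature.getOptionalRelease()
mandatory_release = feature.getMandatoryRelease()

# A module that opts in with the future statement as its first statement.
source = (
    "from __future__ import generators\n"
    "def gen():\n"
    "    yield 1\n"
    "    yield 2\n"
)
namespace = {}
exec(compile(source, "<demo>", "exec"), namespace)
values = list(namespace["gen"]())

print(optional_release, mandatory_release, values)
```

The compile() call plays the role that the new xxxFlags() functions play
at the C level: it is the point where the parser learns which keywords
are active for this piece of source.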

> I'd have thought it's just a parser directive (okay, so that's
> tricky to implement)

The new xxxFlags() functions allow passing in flags to the parser, and I
guess that's what "a parser directive" means to you.

> but to code that doesn't use 'yield' a generator
> is just another iterator, right ?

Right.  Now what?  I don't think I grasped what you were getting at.



From tim@digicool.com  Mon Jul 16 18:01:28 2001
From: tim@digicool.com (Tim Peters)
Date: Mon, 16 Jul 2001 13:01:28 -0400
Subject: [Python-Dev] RE: CVS: python/dist/src/Parser parsetok.c,2.25,2.26
In-Reply-To: <3B529D6E.A4BAD3FA@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEMFCCAA.tim@digicool.com>

[Tim]
> A pile of new xxxFlags() functions, to communicate to the parser
> that 'yield' is a keyword.

[MAL]
> Would those APIs also be usable for a new "directive" keyword ?

Sure, but there's no general machinery here, just the raw existence of a new
int "flags" argument, and a ton of teensy special-casing in two dozen other
files to support "from __future__ import generators" only and specifically.
No new general "parser API" exists or should be inferred.



From guido@digicool.com  Mon Jul 16 18:05:37 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 16 Jul 2001 13:05:37 -0400
Subject: [Python-Dev] guido@digicool.com
In-Reply-To: Your message of "Mon, 16 Jul 2001 18:05:47 +0200."
 <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de>
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de>
Message-ID: <200107161705.f6GH5bA32228@odiug.digicool.com>

(Where did this subject come from???)

> > Martin's hack falls short of doing the right thing in all cases: you
> > can't have the first statement of your program be "directive = ..."
> > or "directive(...)".
> 
> If that is considered as a serious problem, I'll try to solve it with
> an additional lookahead token: If the next token is a name, then it is
> a directive.

Wait.

MAL seems to want two other changes: directive should be allowed
(required???) before the module docstring, and it should support the
syntax from his proto-PEP (directive key = value).

But MAL and PaulP don't seem to agree on the semantics of this
directive, and I haven't gotten a good answer why we can't do that
with a magic comment.

In the meantime, I've decided to enable the yield keyword with a
future statement.  In general I now prefer using future statements for
enabling future features over the directive statement.

So it's still unclear if we want a directive...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon Jul 16 18:40:25 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 16 Jul 2001 19:40:25 +0200
Subject: [Python-Dev] directive statement (PEP 244)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com>
Message-ID: <3B532709.56972A46@lemburg.com>

Guido van Rossum wrote:
> 
> > > Martin's hack falls short of doing the right thing in all cases: you
> > > can't have the first statement of your program be "directive = ..."
> > > or "directive(...)".
> >
> > If that is considered as a serious problem, I'll try to solve it with
> > an additional lookahead token: If the next token is a name, then it is
> > a directive.
> 
> Wait.
> 
> MAL seems to want two other changes: directive should be allowed
> (required???) 

"allowed" not "required".

> before the module docstring, and it should support the
> syntax from his proto-PEP (directive key = value).
> 
> But MAL and PaulP don't seem to agree on the semantics of this
> directive, and I haven't gotten a good answer why we can't do that
> with a magic comment.

We don't ? 

Paul suggested adding encoding directives for 8-bit 
strings and comments, but these cannot be used by the Python
compiler in any way and would only be for the benefit of an
editor, so I don't really see the need for them. A programmer
can still add some editor specific comment to the source file
to tell the editor in what encoding to display the file, but this
information is really only useful for the editor, not the
Python compiler.

About the magic comment: Unicode literals are translated into
Unicode objects at compile time. The encoding information is
vital for the decoding to succeed. If you place this information
into a comment of the Python source code and have the compiler
depend on it, removing the comment would break your program.
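
To illustrate why the encoding information matters, here is a minimal
sketch in which the same three bytes stand for the body of a Unicode
literal in a source file. The byte values and variable names are chosen
only for the example.

```python
# Bytes as they might appear inside a Unicode literal in a source file.
raw = b"\xe4\xf6\xfc"

# The resulting Unicode object depends entirely on the assumed encoding.
as_latin1 = raw.decode("latin-1")
as_cp1251 = raw.decode("cp1251")

# Under some encodings the bytes are not valid at all.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

print(repr(as_latin1), repr(as_cp1251), utf8_failed)
```

Without a declared encoding the compiler has no way to choose between
these readings.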

I don't think that's good language design (besides, we have
enough Unicode magic in Python already...), but then
people may feel differently about this.
 
> In the meantime, I've decided to enable the yield keyword with a
> future statement.  In general I now prefer using future statements for
> enabling future features over the directive statement.
> 
> So it's still unclear if we want a directive...

One way or another we need a way to specify compiler parameters
and settings on a per-source file basis. Whether you call it
directive, pragma or magic comment is really secondary and only
a matter of language design.

I've only chosen PEP 244 as basis for the PEP because it seemed
to fit the need. If you decide to go down some other path,
then I'll happily update the PEP to whatever becomes part of 
Python.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@digicool.com  Mon Jul 16 19:24:21 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 16 Jul 2001 14:24:21 -0400
Subject: [Python-Dev] directive statement (PEP 244)
In-Reply-To: Your message of "Mon, 16 Jul 2001 19:40:25 +0200."
 <3B532709.56972A46@lemburg.com>
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com>
 <3B532709.56972A46@lemburg.com>
Message-ID: <200107161824.f6GIOL532466@odiug.digicool.com>

> > MAL seems to want two other changes: directive should be allowed
> > (required???) 
> 
> "allowed" not "required".

but last I looked if there was a docstring before the directive you
couldn't guarantee that the directive applied.

> > before the module docstring, and it should support the
> > syntax from his proto-PEP (directive key = value).
> > 
> > But MAL and PaulP don't seem to agree on the semantics of this
> > directive, and I haven't gotten a good answer why we can't do that
> > with a magic comment.
> 
> We don't ? 

It seems to me that each post from you gets a response from Paul with
some kind of objection, and vice versa.  Maybe you're converging, but
I don't see where you are converging yet.  Also, your arguments
sometimes seem contradictory.  For example, Paul has said that you may
need a comment with an editor-specific encoding indicator, while you
were expecting editors to look at the directive and made this a reason
why the directive should precede the docstring.

> Paul suggested adding encoding directives for 8-bit 
> strings and comments, but these cannot be used by the Python
> compiler in any way and would only be for the benefit of an
> editor, so I don't really see the need for them.

Another indication you two aren't on the same page just yet.

> A programmer
> can still add some editor specific comment to the source file
> to tell the editor in what encoding to display the file, but this
> information is really only useful for the editor, not the
> Python compiler.

This redundancy worries me though.  Are we going to encourage people
to use an editor-specific comment for each editor out there that could
be used to touch the file?

> About the magic comment: Unicode literals are translated into
> Unicode objects at compile time. The encoding information is
> vital for the decoding to succeed. If you place this information
> into a comment of the Python source code and have the compiler
> depend on it, removing the comment would break your program.

Yes, and so would removing a directive.  I don't see the point at
all.

> I don't think that's good language design (besides, we have
> enough Unicode magic in Python already...), but then
> people may feel differently about this.

Directives come with their own set of magic.

> > In the meantime, I've decided to enable the yield keyword with a
> > future statement.  In general I now prefer using future statements for
> > enabling future features over the directive statement.
> > 
> > So it's still unclear if we want a directive...
> 
> One way or another we need a way to specify compiler parameters
> and settings on a per-source file basis. Whether you call it
> directive, pragma or magic comment is really secondary and only
> a matter of language design.

I still haven't seen this need demonstrated.  Most purported uses of
these are better done with existing mechanisms.  For example, in PEP
253 I propose an assignment to a global __metaclass__ to set the
default class for a baseless class statement.

> I've only chosen PEP 244 as basis for the PEP because it seemed
> to fit the need. If you decide to go down some other path,
> then I'll happily update the PEP to whatever becomes part of 
> Python.

But you're implying without clearly specifying all sorts of amendments
to PEP 244, which weakens your position.

For example, PEP 244 allows a doc string before the directive, but you
indicated that the directive can only affect strings that occur after
it.  I don't think this is true: the creation of actual string objects
is done after the whole file has been parsed, so it wouldn't be hard
to collect and interpret all directives before creating code objects.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paulp@ActiveState.com  Mon Jul 16 19:36:58 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 16 Jul 2001 11:36:58 -0700
Subject: [Python-Dev] directive statement (PEP 244)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com>
Message-ID: <3B53344A.25AE6EB4@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>....
> 
> We don't ?
> 
> Paul suggested adding encoding directives for 8-bit
> strings and comments, but these cannot be used by the Python
> compiler in any way and would only be for the benefit of an
> editor, so I don't really see the need for them. 

Sorry I wasn't clear. Like \F, I think that the best model is that of
XML, Java and (I've learned recently) Perl. There should be a single
encoding for the file. Logically speaking it should be decoded before
tokenization or parsing. Practically speaking it may be simpler to fake
this logical decoding in the implementation. I don't care how it is
implemented. Logically the model should be that any encoding declaration
affects the interpretation of the *file* not some particular construct
in the file.

If this is too difficult to implement today then maybe we should wait on
the whole feature until someone has time to do it right.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From mal@lemburg.com  Mon Jul 16 20:02:58 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 16 Jul 2001 21:02:58 +0200
Subject: [Python-Dev] directive statement (PEP 244)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com>
Message-ID: <3B533A62.73ECD605@lemburg.com>

Paul Prescod wrote:
> 
> "M.-A. Lemburg" wrote:
> > Paul suggested adding encoding directives for 8-bit
> > strings and comments, but these cannot be used by the Python
> > compiler in any way and would only be for the benefit of an
> > editor, so I don't really see the need for them.
> 
> Sorry I wasn't clear. Like \F, I think that the best model is that of
> XML, Java and (I've learned recently) Perl. There should be a single
> encoding for the file. Logically speaking it should be decoded before
> tokenization or parsing. Practically speaking it may be simpler to fake
> this logical decoding in the implementation. I don't care how it is
> implemented. Logically the model should be that any encoding declaration
> affects the interpretation of the *file* not some particular construct
> in the file.
> 
> If this is too difficult to implement today then maybe we should wait on
> the whole feature until someone has time to do it right.

Hmm, I guess you have something like this in mind...

1. read the file
2. decode it into Unicode assuming some fixed per-file encoding
3. tokenize the Unicode content
4. compile it, creating Unicode objects from the given Unicode data
   and creating string objects from the Unicode literal data
   by first reencoding the Unicode data into 8-bit string data

To make this backwards compatible, the implementation would have to
assume Latin-1 as the original file encoding if not given (otherwise,
binary data currently stored in 8-bit strings wouldn't make the
roundtrip).
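
The four steps can be sketched as follows, assuming the built-in
compile() covers both tokenizing and compiling. The function name
compile_with_encoding and the sample source are hypothetical; Latin-1 is
used as the fallback as suggested above.

```python
def compile_with_encoding(raw_bytes, encoding="latin-1"):
    """Decode the whole file first, then tokenize and compile the text."""
    text = raw_bytes.decode(encoding)          # step 2: per-file decoding
    return compile(text, "<source>", "exec")   # steps 3 and 4


# Step 1 is simulated by building the file contents in memory.
source_text = 'greeting = "gr\u00fc\u00dfe"\n'
raw = source_text.encode("utf-8")

namespace = {}
exec(compile_with_encoding(raw, "utf-8"), namespace)

# Latin-1 maps every byte value to one code point, so it round-trips.
all_bytes = bytes(range(256))
round_trip_ok = all_bytes.decode("latin-1").encode("latin-1") == all_bytes
```

The round-trip property is what makes Latin-1 a candidate default here:
no byte sequence is rejected and none is altered by decoding and
re-encoding.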

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From thomas@xs4all.net  Mon Jul 16 20:07:45 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 16 Jul 2001 21:07:45 +0200
Subject: [Python-Dev] Python 2.1.1 and distutils
Message-ID: <20010716210744.H5396@xs4all.nl>

I've got a few distutils fixes pending (the unixcompiler thing, and
Just/Jack mentioned a few Mac/Metrowerks fixes they wanted in) but I'm not
sure how to handle this; distutils has a separate version number, and I seem
to recall it is/was developed separately. Basically I'm distutils-ignorant,
as I hardly have a need to distribute my scripts :)

Anyway, should I apply the fixes and up the version number ? Apply the fixes
but keep quiet about them ? Hand the fixes over to someone with distutils
clue ? Scream and shout ? (Always my favorite, that ;P)

(BTW, Jack, Just, I'm waiting for one of you to follow up on the Metrowerks
thing; just mail me the patches, preferably written in blood, with a signed
confession that they won't break any code whatsoever :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Mon Jul 16 20:14:43 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 16 Jul 2001 21:14:43 +0200
Subject: [Python-Dev] directive statement (PEP 244)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com>
 <3B532709.56972A46@lemburg.com> <200107161824.f6GIOL532466@odiug.digicool.com>
Message-ID: <3B533D23.E2099A20@lemburg.com>

Guido van Rossum wrote:
> 
> > > MAL seems to want two other changes: directive should be allowed
> > > (required???)
> >
> > "allowed" not "required".
> 
> but last I looked if there was a docstring before the directive you
> couldn't guarantee that the directive applied.

That was due to a misunderstanding of how the implementation could
work... after reading your explanation below, here's a way which
would work around this "requirement":

If the tokenizer gets to do the directive processing
(rather than the compiler), then the placement of the directive 
becomes irrelevant: it may only appear once per file and the tokenizer
will see it before the compiler, so the encoding setting will already 
have been made before the compiler even starts to compile the
first doc-string.
 
> > > before the module docstring, and it should support the
> > > syntax from his proto-PEP (directive key = value).
> > >
> > > But MAL and PaulP don't seem to agree on the semantics of this
> > > directive, and I haven't gotten a good answer why we can't do that
> > > with a magic comment.
> >
> > We don't ?
> 
> It seems to me that each post from you gets a response from Paul with
> some kind of objection, and vice versa.  Maybe you're converging, but
> I don't see where you are converging yet.  Also, your arguments
> sometimes seem contradictory.  For example, Paul has said that you may
> need a comment with an editor-specific encoding indicator, while you
> were expecting editors to look at the directive and made this a reason
> why the directive should precede the docstring.

No, I was never talking about editors. Paul brought that up.
I am only concerned about telling the Python interpreter which
encoding to assume when converting Unicode literals into
Unicode objects -- that's all.
 
> > Paul suggested adding encoding directives for 8-bit
> > strings and comments, but these cannot be used by the Python
> > compiler in any way and would only be for the benefit of an
> > editor, so I don't really see the need for them.
> 
> Another indication you two aren't on the same page just yet.

He posted a clarification of what he thinks is the way to go.
I think this settles the argument.
 
> > A programmer
> > can still add some editor specific comment to the source file
> > to tell the editor in what encoding to display the file, but this
> > information is really only useful for the editor, not the
> > Python compiler.
> 
> This redundancy worries me though.  Are we going to encourage people
> to use an editor-specific comment for each editor out there that could
> be used to touch the file?

Let's put it this way: are you expecting that all editors out
there will be able to parse the Python way of defining the
encoding of Unicode literals ?

My point is that I don't see editors as an issue in this discussion.
 
> > About the magic comment: Unicode literals are translated into
> > Unicode objects at compile time. The encoding information is
> > vital for the decoding to succeed. If you place this information
> > into a comment of the Python source code and have the compiler
> > depend on it, removing the comment would break your program.
> 
> Yes, and so would removing a directive.  I don't see the point at
> all.

Sure, but a user would normally not expect his program to
fail just because he removes a comment...
 
> > I don't think that's good language design (besides, we have
> > enough Unicode magic in Python already...), but then
> > people may feel differently about this.
> 
> Directives come with their own set of magic.
> 
> > > In the meantime, I've decided to enable the yield keyword with a
> > > future statement.  In general I now prefer using future statements for
> > > enabling future features over the directive statement.
> > >
> > > So it's still unclear if we want a directive...
> >
> > One way or another we need a way to specify compiler parameters
> > and settings on a per-source file basis. Whether you call it
> > directive, pragma or magic comment is really secondary and only
> > a matter of language design.
> 
> I still haven't seen this need demonstrated.  Most purported uses of
> these are better done with existing mechanisms.  For example, in PEP
> 253 I propose an assignment to a global __metaclass__ to set the
> default class for a baseless class statement.

Hmm, are you suggesting to use something like the following
instead:

__unicodeencoding__ = 'utf-8'

> > I've only chosen PEP 244 as basis for the PEP because it seemed
> > to fit the need. If you decide to go down some other path,
> > then I'll happily update the PEP to whatever becomes part of
> > Python.
> 
> But you're implying without clearly specifying all sorts of amendments
> to PEP 244, which weakens your position.
>
> For example, PEP 244 allows a doc string before the directive, but you
> indicated that the directive can only affect strings that occur after
> it.  I don't think this is true: the creation of actual string objects
> is done after the whole file has been parsed, so it wouldn't be hard
> to collect and interpret all directives before creating code objects.

Please see the correction I gave above and my reply to Martin which has 
the specification of my proposed amendment.
 
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@digicool.com  Mon Jul 16 20:19:27 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 16 Jul 2001 15:19:27 -0400
Subject: [Python-Dev] directive statement (PEP 244)
In-Reply-To: Your message of "Mon, 16 Jul 2001 11:36:58 PDT."
 <3B53344A.25AE6EB4@ActiveState.com>
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com>
 <3B53344A.25AE6EB4@ActiveState.com>
Message-ID: <200107161919.f6GJJRE00537@odiug.digicool.com>

> Sorry I wasn't clear. Like \F, I think that the best model is that of
> XML, Java and (I've learned recently) Perl. There should be a single
> encoding for the file. Logically speaking it should be decoded before
> tokenization or parsing. Practically speaking it may be simpler to fake
> this logical decoding in the implementation. I don't care how it is
> implemented. Logically the model should be that any encoding declaration
> affects the interpretation of the *file* not some particular construct
> in the file.

This is the *only* model that makes sense.

> If this is too difficult to implement today then maybe we should wait on
> the whole feature until someone has time to do it right.

Right-o!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon Jul 16 20:21:00 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 16 Jul 2001 21:21:00 +0200
Subject: [Python-Dev] Python 2.1.1 and distutils
References: <20010716210744.H5396@xs4all.nl>
Message-ID: <3B533E9C.4307923A@lemburg.com>

Thomas Wouters wrote:
> 
> I've got a few distutils fixes pending (the unixcompiler thing, and
> Just/Jack mentioned a few Mac/Metrowerks fixes they wanted in) but I'm not
> sure how to handle this; distutils has a separate version number, and I seem
> to recall it is/was developed separately. Basically I'm distutils-ignorant,
> as I hardly have a need to distribute my scripts :)
> 
> Anyway, should I apply the fixes and up the version number ? Apply the fixes
> but keep quiet about them ? Hand the fixes over to someone with distutils
> clue ? Scream and shout ? (Always my favorite, that ;P)
> 
> (BTW, Jack, Just, I'm waiting for one of you to follow up on the Metrowerks
> thing; just mail me the patches, preferably written in blood, with a signed
> confession that they won't break any code whatsoever :-)

Why not simply include the latest stable distutils version in
Python 2.1.1 and add the new patches/features to the next
distutils release (which would then go into 2.1.2, etc.) ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From paulp@ActiveState.com  Mon Jul 16 20:22:43 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 16 Jul 2001 12:22:43 -0700
Subject: [Python-Dev] directive statement (PEP 244)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com> <3B533A62.73ECD605@lemburg.com>
Message-ID: <3B533F03.A5FD37D8@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> 
> Hmm, I guess you have something like this in mind...
> 
> 1. read the file
> 2. decode it into Unicode assuming some fixed per-file encoding
> 3. tokenize the Unicode content
> 4. compile it, 

Right. This is how XML, Java, Perl etc. work. XML and Python would be
the only languages to actually declare the encoding in use (in ASCII). I
think that the declaration way is clearly superior to depending on
command line arguments or BOMs.

But this is just how it has to *look* to the user. If there is an
implementation that behind the scenes only decodes Unicode literals,
that would be fine.

> ... creating Unicode objects from the given Unicode data
>    and creating string objects from the Unicode literal data
>    by first reencoding the Unicode data into 8-bit string data

Or we could just disallow non-ASCII 8-bit string literals in files that
use the declaration. That was never a feature Guido really intended to
support (as I understand it!) and I don't see a need to carry it
forward. If you are in the Unicode universe then the need to put binary
data in 8-bit string literals is massively reduced.

> To make this backwards compatible, the implementation would have to
> assume Latin-1 as the original file encoding if not given (otherwise,
> binary data currently stored in 8-bit strings wouldn't make the
> roundtrip).

Another way to think about it is that files without the declaration skip
directly to the tokenize step and skip the decoding step.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From thomas@xs4all.net  Mon Jul 16 20:31:25 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 16 Jul 2001 21:31:25 +0200
Subject: [Python-Dev] Python 2.1.1 and distutils
In-Reply-To: <3B533E9C.4307923A@lemburg.com>
Message-ID: <20010716213125.K5391@xs4all.nl>

On Mon, Jul 16, 2001 at 09:21:00PM +0200, M.-A. Lemburg wrote:

> > Anyway, should I apply the fixes and up the version number ? Apply the fixes
> > but keep quiet about them ? Hand the fixes over to someone with distutils
> > clue ? Scream and shout ? (Always my favorite, that ;P)

> Why not simply include the latest stable distutils version in
> Python 2.1.1 and add the new patches/features to the next
> distutils release (which would then go into 2.1.2, etc.) ?

Two reasons:

1) Like I said, I have *no* clue about distutils :) What is the 'latest stable
distutils version' ? Where can I find it ? Who has an idea of what, exactly,
changed, and whether all changes are appropriate in a bugfix release (I can
be lenient in the case of distutils, but bugfix releases are supposed to
keep *even broken code* working, up to a point) ?

2) I'm not sure if the fixes I talked about are in the 'latest stable
distutils version', since one of them was checked in mere hours ago.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Mon Jul 16 20:46:07 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 16 Jul 2001 15:46:07 -0400
Subject: [Python-Dev] directive statement (PEP 244)
In-Reply-To: Your message of "Mon, 16 Jul 2001 21:02:58 +0200."
 <3B533A62.73ECD605@lemburg.com>
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com>
 <3B533A62.73ECD605@lemburg.com>
Message-ID: <200107161946.f6GJk7Q00944@odiug.digicool.com>

> Hmm, I guess you have something like this in mind...
> 
> 1. read the file
> 2. decode it into Unicode assuming some fixed per-file encoding
> 3. tokenize the Unicode content
> 4. compile it, creating Unicode objects from the given Unicode data
>    and creating string objects from the Unicode literal data
>    by first reencoding the Unicode data into 8-bit string data
> 
> To make this backwards compatible, the implementation would have to
> assume Latin-1 as the original file encoding if not given (otherwise,
> binary data currently stored in 8-bit strings wouldn't make the
> roundtrip).

To be compatible with the current default encoding, I would use ASCII
as the default encoding and issue an error if any non-ASCII characters
are found.  One should always use hex/oct escapes to enter binary data
in literals!
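
A minimal sketch of the rule suggested above, written in present-day
Python rather than 2.x: source with no declared encoding is decoded as
ASCII, and any other byte is an error. The name check_ascii_source is
made up for illustration and is not part of any Python API.

```python
def check_ascii_source(data: bytes) -> str:
    """Decode undeclared source as ASCII, rejecting anything else."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        # Report the offset of the first offending byte.
        raise SyntaxError(
            "non-ASCII byte 0x%02x at offset %d; use an escape instead"
            % (data[exc.start], exc.start)
        ) from None


# Binary data written with hex escapes keeps the source itself pure ASCII.
source = b'payload = b"\\xe9\\xf4"\n'
text = check_ascii_source(source)
```

The escaped form passes the check and still produces the two bytes 0xE9
and 0xF4 when run; a file holding the raw bytes would be rejected.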

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Mon Jul 16 20:56:16 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 16 Jul 2001 15:56:16 -0400
Subject: [Python-Dev] directive statement (PEP 244)
In-Reply-To: Your message of "Mon, 16 Jul 2001 21:14:43 +0200."
 <3B533D23.E2099A20@lemburg.com>
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <200107161824.f6GIOL532466@odiug.digicool.com>
 <3B533D23.E2099A20@lemburg.com>
Message-ID: <200107161956.f6GJuG600983@odiug.digicool.com>

> > but last I looked if there was a docstring before the directive you
> > couldn't guarantee that the directive applied.
> 
> That was due to a misunderstanding of how the implementation could
> work... after reading your explanation below, here's a way which
> would work around this "requirement":
> 
> If the tokenizer gets to do the directive processing
> (rather than the compiler), then the placement of the directive 
> becomes irrelevant: it may only appear once per file and the tokenizer
> will see it before the compiler, so the encoding setting will already 
> have been made before the compiler even starts to compile the
> first doc-string.

Sure.  (Technically, it's not the tokenizer that interprets the
directives, but a pass that runs before the code generator runs.  The
compiler has sprouted quite a few passes lately... :-)

> No, I was never talking about editors. Paul brought that up.
> I am only concerned about telling the Python interpreter which
> encoding to assume when converting Unicode literals into
> Unicode objects -- that's all.

Well, I believe that for XML everybody (editors and other processors)
looks in the same place, right?

> He posted a clarification of what he thinks is the way to go.
> I think this settles the argument.

I agree.

> Let's put it this way: are you expecting that all editors out
> there will be able to parse the Python way of defining the
> encoding of Unicode literals ?

Not right away, but this is what I would hope would happen eventually.

> My point is that I don't see editors as an issue in this discussion.

Well, anything we can do to make parsing the encoding indicator easier
for editors helps.

> > > About the magic comment: Unicode literals are translated into
> > > Unicode objects at compile time. The encoding information is
> > > vital for the decoding to succeed. If you place this information
> > > into a comment of the Python source code and have the compiler
> > > depend on it, removing the comment would break your program.
> > 
> > Yes, and so would removing a directive.  I don't see the point at
> > all.
> 
> Sure, but a user would normally not expect his program to
> fail just because he removes a comment...

Weak argument.  A magic comment is specially marked as such, e.g.

    #*encoding utf-8

You might as well say that users are prone to remove the #! comment...

> Hmm, are you suggesting to use something like the following
> instead:
> 
> __unicodeencoding__ = 'utf-8'

Not in this particular case, but for other cases where directives have
been suggested.  In this case (encoding) I'd prefer a magic comment.
I still haven't seen a good example of something for which directives
are the best solution.  Of course, it should be '__fileencoding__'. :-)

> Please see the correction I gave above and my reply to Martin which has 
> the specification of my proposed amendment.

I've seen them now.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Mon Jul 16 20:55:11 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 16 Jul 2001 15:55:11 -0400
Subject: [Python-Dev] Python 2.1.1 and distutils
In-Reply-To: <20010716210744.H5396@xs4all.nl>; from thomas@xs4all.net on Mon, Jul 16, 2001 at 09:07:45PM +0200
References: <20010716210744.H5396@xs4all.nl>
Message-ID: <20010716155511.A13393@ute.cnri.reston.va.us>

On Mon, Jul 16, 2001 at 09:07:45PM +0200, Thomas Wouters wrote:
>Anyway, should I apply the fixes and up the version number ? Apply the fixes
>but keep quiet about them ? Hand the fixes over to someone with distutils
>clue ? Scream and shout ? (Always my favorite, that ;P)

Apply the fixes and don't bother increasing the version number.  The
standalone Distutils releases happen in sync with Python releases, and
are so that users of older Python versions, particularly 1.5.2, can
get the current set of Distutils fixes.  I don't think there have been
enough changes at this point to make it worth issuing a new Distutils
release; indeed, I don't know if it's worth the bother of issuing them
any longer.

--amk


From fdrake@acm.org  Mon Jul 16 21:09:24 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 16 Jul 2001 16:09:24 -0400 (EDT)
Subject: [Python-Dev] directive statement (PEP 244)
In-Reply-To: <200107161956.f6GJuG600983@odiug.digicool.com>
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de>
 <200107161705.f6GH5bA32228@odiug.digicool.com>
 <3B532709.56972A46@lemburg.com>
 <200107161824.f6GIOL532466@odiug.digicool.com>
 <3B533D23.E2099A20@lemburg.com>
 <200107161956.f6GJuG600983@odiug.digicool.com>
Message-ID: <15187.18932.675886.239925@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > Well, I believe that for XML everybody (editors and other processors)
 > looks in the same place, right?

  It also assumes a pretty strict set of expected characters:  If you
don't have UTF-8, you have:

	[byte-order-mark] "<?xml " [version-spec] [encoding-spec]
				   [standalone-spec] "?>"

  Basically, the encoding can be discovered very easily given an
assumption about legal content.  Once that assumption doesn't hold,
the encoding can't be discovered reliably.
  We could probably make a pretty reasonable statement of how to
auto-detect enough so that Python files could have an encoded
declaration (however we spell it), but it's hard to beat the
assumption of mandated structure.  (Some assumption, huh? ;)
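
To illustrate how much the mandated structure helps, here is a rough
sketch of reading the encoding from an XML declaration. It assumes the
declaration itself is in an ASCII-compatible encoding (so it ignores
UTF-16 and friends), and sniff_xml_encoding is a name invented for this
example.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration.
_XML_DECL = re.compile(
    rb"""^<\?xml\s[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_xml_encoding(data: bytes) -> str:
    """Return the declared encoding, or UTF-8 when none is declared."""
    if data.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte-order mark
        data = data[3:]
    match = _XML_DECL.match(data)
    return match.group(1).decode("ascii") if match else "utf-8"
```

Because the declaration must come first and has a fixed shape, one
anchored pattern is enough; nothing comparable holds for arbitrary
Python source.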


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From paulp@ActiveState.com  Mon Jul 16 21:11:10 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 16 Jul 2001 13:11:10 -0700
Subject: [Python-Dev] directive statement (PEP 244)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com>
 <3B532709.56972A46@lemburg.com> <200107161824.f6GIOL532466@odiug.digicool.com> <3B533D23.E2099A20@lemburg.com>
Message-ID: <3B534A5E.D38DD416@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> 
> My point is that I don't see editors as an issue in this discussion.

There are two points where this touches editors:

 * if we keep the encoding consistent throughout the file then at least
a Unicode-aware text editor like Notepad or Visual Studio will be able
to do something intelligent with the files. The user will choose
"Shift-JIS" from their menu and go ahead.

 * if we make the declaration easy for an editor to find,
we increase the likelihood of people writing editors that magically guess
the right encoding instead of requiring the user to instruct them.

The first is more important to me than the second.

>...
> 
> Sure, but a user would normally not expect his program to
> fail just because he removes a comment...

#!/usr/bin/python

:)

I am usually not a fan of putting semantic information in comments but
the practical difficulties in doing so in this case seem small. And the
benefit would be that we could require the declaration to precede the
first non-comment line in the file. That means that we (both the
tokenizer and editors) don't have to search the file for the declaration.
We just read two lines and then give up.
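
The "read two lines and then give up" idea might look roughly like this.
The comment syntax matched here (the word "coding", then "=" or ":",
then a name) is only an assumption for the sketch, since the thread has
not settled on a spelling, and find_declared_encoding is a made-up name.

```python
import re

# Hypothetical declaration syntax: "coding" followed by "=" or ":" and a name.
_DECLARATION = re.compile(rb"^[ \t]*#.*?coding[ \t]*[:=][ \t]*([-\w.]+)")


def find_declared_encoding(source: bytes, default: str = "ascii") -> str:
    """Look for an encoding declaration in the first two lines only."""
    for line in source.splitlines()[:2]:
        match = _DECLARATION.match(line)
        if match:
            return match.group(1).decode("ascii")
    return default
```

An editor could apply the same two-line scan without tokenizing
anything, which is the point of restricting where the declaration may
appear.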

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From guido@digicool.com  Mon Jul 16 22:09:43 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 16 Jul 2001 17:09:43 -0400
Subject: [Python-Dev] Heads up: Python 2.2a1 to be released from descr-branch
Message-ID: <200107162109.f6GL9iL05351@odiug.digicool.com>

PEP 251 promises a 2.2a1 release on July 18 (coming Wednesday), and I
have every intention to fulfill this promise.  (That's why we added
the future statement for generators.)

The PEP also promises that the release will be done from a branch.
Rather than forming a new branch, I intend to do the release, for
once, from the descr-branch.

This means that the release will contain the experimental code that
implements most (but not all) of PEP 252 and 253.  This is intended to
be backwards compatible.  One purpose of the release is to see *how*
backwards compatible.

If the descr-branch release turns out to be a disaster, I may decide
to hold off on the descr-branch work and we'll release 2.2 without all
the good stuff from the descr-branch.  But I don't expect that this
will happen.  The worst that I really expect is that we'll have to do
a bunch more backwards compatibility work.  If 2.2a1 is a success,
I'll merge the descr-branch into the trunk.

I realize that the descr-branch work is not finished and not
sufficiently documented (despite the 10K words in the two PEPs).
That's OK, it's an alpha release.

In preparation for this event, Tim is semi-continuously merging the
trunk into the descr-branch, and I've added the branch tag to all
files in the trunk (so the branch is now a complete set of files).

If you have something that should go into the 2.2a1 release, please
check it in on the trunk and add a note to the checkin message "please
merge into 2.2a1".


Backwards incompatibility
-------------------------

99% of the features on descr-branch are only invoked when you use a
class statement with a built-in object as a base class (or when you
use an explicit __metaclass__ assignment).

Some descr-branch things that might affect old code:

- Introspection works differently (see PEP 252).  In particular, most
  objects now have a __class__ attribute, and the __methods__ and
  __members__ attributes no longer work.  This means that dir([]) will
  return an empty list.  Use dir(type([])) instead -- this is
  consistent with regular classes.  See the example in PEP 252.

- Several built-ins that can be seen as coercions or constructors are
  now type objects rather than factory functions; the type objects
  support the same behaviors as the old factory functions.  Affected
  are: complex, float, long, int, str, tuple, list, unicode, and
  type.  (There are also new ones: dictionary, object, classmethod,
  staticmethod, but since these are new built-ins I can't see how this
  would break old code.)

- There's one very specific (and fortunately uncommon) bug that used
  to go undetected, but which is now reported as an error:

    class A:
        def foo(self): pass

    class B(A): pass

    class C(A):
        def foo(self):
            B.foo(self)

  Here, C.foo wants to call A.foo, but by mistake calls B.foo.  In the
  old system, because B doesn't define foo, B.foo is identical to
  A.foo, so the call would succeed.  In the new system, B.foo is
  marked as a method requiring a B instance, and a C is not a B, so
  the call fails.

- Binary compatibility with old extensions is not guaranteed.  We'll
  tighten this in future releases.  I also very much doubt that
  extensions based on Jim Fulton's ExtensionClass will work --
  although I encourage folks to try this to see how much breaks, so we
  can hopefully fix this for 2.2a2.  While the ultimate goal of PEP
  253 is to do away with ExtensionClass, I believe that ExtensionClass
  should still work in 2.2, breaking it in 2.3.
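
For the A/B/C bug described in the list above, the repair is simply to
name the class that really defines the method. A small sketch follows;
the return value is added here only so the result can be checked.

```python
class A:
    def foo(self):
        return "A.foo"


class B(A):
    pass


class C(A):
    def foo(self):
        # Name the class that really defines foo (and is a base of C).
        return A.foo(self)


result = C().foo()
```

Calling A.foo(self) is valid under both the old and the new rules,
because a C instance is always an A instance.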

I should also note that PEP 254 will probably remain unimplemented for
now, since it would create way more incompatibilities.  I promise to
reopen it for Python 2.3.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon Jul 16 22:38:50 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 16 Jul 2001 23:38:50 +0200
Subject: [Python-Dev] directive statement (PEP 244)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com>
 <3B533A62.73ECD605@lemburg.com> <200107161946.f6GJk7Q00944@odiug.digicool.com>
Message-ID: <3B535EEA.A629E61C@lemburg.com>


Guido van Rossum wrote:
> 
> > Hmm, I guess you have something like this in mind...
> >
> > 1. read the file
> > 2. decode it into Unicode assuming some fixed per-file encoding
> > 3. tokenize the Unicode content
> > 4. compile it, creating Unicode objects from the given Unicode data
> >    and creating string objects from the Unicode literal data
> >    by first reencoding the Unicode data into 8-bit string data
> >
> > To make this backwards compatible, the implementation would have to
> > assume Latin-1 as the original file encoding if not given (otherwise,
> > binary data currently stored in 8-bit strings wouldn't make the
> > roundtrip).
> 
> To be compatible with the current default encoding, I would use ASCII
> as the default encoding and issue an error if any non-ASCII characters
> are found.  One should always use hex/oct escapes to enter binary data
> in literals!

Hmm, Latin-1 and other locale-specific encodings
are currently being used in 8-bit strings by far too many people 
in Europe and elsewhere... people won't feel good about it.

Note that the reason for using Latin-1 is that Latin-1 decoded
into Unicode and then reencoded into Latin-1 is a 1-1
mapping for all 8-bit values -- this gives us binary 
backward compatibility.
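
The round-trip property can be checked directly. This sketch uses
present-day Python, where 8-bit data is a bytes object, so the spelling
differs from what 2.x code would use.

```python
# Every possible 8-bit value, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# Latin-1 maps byte N to code point N, so decoding never fails ...
as_unicode = all_bytes.decode("latin-1")

# ... and encoding brings back exactly the original bytes.
round_tripped = as_unicode.encode("latin-1")

# By contrast, arbitrary bytes are not always valid UTF-8.
try:
    all_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

Here round_tripped equals all_bytes, while the UTF-8 attempt fails, so
UTF-8 could not serve as the silent default for existing binary
literals.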

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From martin@loewis.home.cs.tu-berlin.de  Mon Jul 16 22:40:59 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 16 Jul 2001 23:40:59 +0200
Subject: [Python-Dev] Re: PEP 244 syntax
In-Reply-To: <3B531B24.C8CCC74F@lemburg.com> (mal@lemburg.com)
References: <200107161559.f6GFx8V03899@mira.informatik.hu-berlin.de> <3B531B24.C8CCC74F@lemburg.com>
Message-ID: <200107162140.f6GLexo09608@mira.informatik.hu-berlin.de>

> > directive_statement: 'directive' NAME [atom] [';'] NEWLINE
> > 
> > How do you want this to change?
> 
> To make the directive statement useful for setting compiler
> parameters, the syntax should be extended to allow
> for an (optional) '='. Whether or not this '=' sign must be there
> is up to the definition of the directive NAME.

Ok.

> It may also be worthwhile using a testlist (see Grammar)
> instead of the fixed atom for cases where the compiler
> parameter needs to be e.g. a list of options.

Ok.

> 
> I'd also suggest to remove the optional ';' since this is
> not conformant with the rest of Python....

Sure it is; disallowing the semicolon would not conform:

simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE

You can have a semicolon after each small_stmt
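
A quick check of that grammar rule with ordinary statements (this says
nothing about the proposed directive statement itself, which does not
exist in any released Python):

```python
namespace = {}

# Several small statements on one line, with an optional trailing ';'.
exec(compile("x = 1; y = 2;\n", "<example>", "exec"), namespace)

# A single small statement may also end with ';'.
exec(compile("z = x + y;\n", "<example>", "exec"), namespace)
```

Both lines compile, so a trailing semicolon is already accepted after
any simple statement.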

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Mon Jul 16 22:48:21 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 16 Jul 2001 23:48:21 +0200
Subject: [Python-Dev] directive statement (PEP 244)
In-Reply-To: <200107161705.f6GH5bA32228@odiug.digicool.com> (message from
 Guido van Rossum on Mon, 16 Jul 2001 13:05:37 -0400)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com>
Message-ID: <200107162148.f6GLmLl09639@mira.informatik.hu-berlin.de>

> (Where did this subject come from???)

I meant to put it into CC:, not into Subject: ...

> So it's still unclear if we want a directive...

It seems to me that to reasonably use non-ASCII *characters* in
strings (as opposed to using mere byte sequences), we have to offer a
declaration-type statement. A comment should not be used since
comments should not change the outcome of a program, whereas this
thing may change the program result.

The question is whether a general-purpose syntax is needed. I think
the answer is yes: I'd also like to say "all strings are Unicode" on a
per-module basis, perhaps combined with providing an encoding. But
then, this might be a future import:

from __future__ import all_strings_are_unicode

Regards,
Martin



From skip@pobox.com (Skip Montanaro)  Mon Jul 16 22:56:19 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 16 Jul 2001 16:56:19 -0500
Subject: [Python-Dev] directive statement (PEP 244)
In-Reply-To: <3B535EEA.A629E61C@lemburg.com>
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de>
 <200107161705.f6GH5bA32228@odiug.digicool.com>
 <3B532709.56972A46@lemburg.com>
 <3B53344A.25AE6EB4@ActiveState.com>
 <3B533A62.73ECD605@lemburg.com>
 <200107161946.f6GJk7Q00944@odiug.digicool.com>
 <3B535EEA.A629E61C@lemburg.com>
Message-ID: <15187.25347.383123.602253@beluga.mojam.com>

Please excuse the naive interruption to this discussion.  

I'm a bit removed from this debate, being someone who is generally happy
with ASCII (and who really doesn't understand all the fur that is flying),
however I would imagine that programmers in Moscow writing code to be read
by other Russian programmers would want to enter Cyrillic characters
directly into their module doc strings and not have to insert hex escapes.
Can they safely do that now if they set the encoding variable in site.py
appropriately?  If so, what is the need for the proposed directive to set
encodings?  Is it an attempt simply to allow different encodings on a
per-module basis?

On a related note, can the "Defining Unicode Literal Encodings" PEP be added
to the PEP site/page so those of us who don't save every message that flows
into our inboxes have it to refer to?

Thx,

Skip


From skip@pobox.com (Skip Montanaro)  Mon Jul 16 22:59:50 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 16 Jul 2001 16:59:50 -0500
Subject: [Python-Dev] Re: PEP 244 syntax
In-Reply-To: <200107162140.f6GLexo09608@mira.informatik.hu-berlin.de>
References: <200107161559.f6GFx8V03899@mira.informatik.hu-berlin.de>
 <3B531B24.C8CCC74F@lemburg.com>
 <200107162140.f6GLexo09608@mira.informatik.hu-berlin.de>
Message-ID: <15187.25558.243182.290606@beluga.mojam.com>

    >> I'd also suggest to remove the optional ';' since this is not conformant
    >> with the rest of Python....

    Martin> Sure it is; disallowing the semicolon would not conform:

    Martin> simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE

    Martin> You can have a semicolon after each small_stmt

But it doesn't appear that PEP 244 allows multiple directives per line:

    A directive_statement is a statement of the form

        directive_statement: 'directive' NAME [atom] [';'] NEWLINE

If you decide to allow it, then the semicolon makes sense, but not
otherwise.

Skip


From mal@lemburg.com  Sat Jul 14 17:04:04 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 14 Jul 2001 18:04:04 +0200
Subject: [Python-Dev] Re: PEP: Defining Unicode Literal Encodings (revision 1.1)
References: <Pine.LNX.4.30.0107141805550.5125-100000@rnd.onego.ru>
Message-ID: <3B506D74.634EA1F@lemburg.com>

Roman Suzi wrote:
> 
> On Sat, 14 Jul 2001, M.-A. Lemburg wrote:
> 
> >directive unicodeencoding = 'latin-1'
> 
> >#!/usr/local/python
> >""" Module Docs...
> >"""
> >directive unicodeencoding = 'latin-1'
> >...
> >u = "Héllô Wörld !"
> >...
> 
> Is there any need for new directive like that?
> Maybe it is possible to use Emacs-style "coding" directive
> in the second line instead:
> 
> #!/usr/bin/python
> # -*- coding=utf-8 -*-
> ...

I already mentioned allowing directives in comments to work around
the problem of directive placement before the first doc-string.

The above would then look like this:

#!/usr/local/bin/python
# directive unicodeencoding='utf-8'
u""" UTF-8 doc-string """

The downside of this is that parsing comments breaks the current
tokenizing scheme in Python: the tokenizer removes comments before
passing the tokens to the compiler ...wouldn't be hard to
fix though ;-) (note that tokenize.py does not)
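
The remark about tokenize.py can be seen with the standard tokenize
module, shown here in its present-day form. The source text is the
comment-directive example from this message; the directive itself is
still only a proposal.

```python
import io
import tokenize

source = (
    "#!/usr/local/bin/python\n"
    "# directive unicodeencoding='utf-8'\n"
    "x = 1\n"
)

# generate_tokens() yields COMMENT tokens instead of discarding them.
comments = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.COMMENT
]
```

Both comment lines come back as tokens, so a tool built on tokenize.py
could pick up a directive written as a comment.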

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From martin@loewis.home.cs.tu-berlin.de  Mon Jul 16 23:01:37 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 17 Jul 2001 00:01:37 +0200
Subject: [Python-Dev] directive statement (PEP 244)
In-Reply-To: <200107161824.f6GIOL532466@odiug.digicool.com> (message from
 Guido van Rossum on Mon, 16 Jul 2001 14:24:21 -0400)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com>
 <3B532709.56972A46@lemburg.com> <200107161824.f6GIOL532466@odiug.digicool.com>
Message-ID: <200107162201.f6GM1bX09702@mira.informatik.hu-berlin.de>

> > A programmer
> > can still add some editor specific comment to the source file
> > to tell the editor in what encoding to display the file, but this
> > information is really only useful for the editor, not the
> > Python compiler.
> 
> This redundancy worries me though.  Are we going to encourage people
> to use an editor-specific comment for each editor out there that could
> be used to touch the file?

For non-ASCII source code? Certainly, this is the only option
(although many editors might choose a "display something, even as
garbage" mode without being further instructed).

We cannot expect all editors to correctly detect the encoding. So if
some provide customization through comments, users will use that. A
dedicated Python editor would look at the encoding directive, of
course.

> Yes, and so would removing a directive.  I don't see the point at
> all.

It contradicts what most users expect from comments, and contradicts
what the language reference says:

# Comments are ignored by the syntax; they are not tokens.

Comments are ignored; putting a meaning into them for program
execution is a hack.

> Directives come with their own set of magic.

There is no magic to the directive statement. Instead, it does what
all statements do: It has a certain meaning to the language. Python
has few declarations, the directive statement would be one of them.
Is that bothering you?

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Mon Jul 16 23:12:12 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 17 Jul 2001 00:12:12 +0200
Subject: [Python-Dev] directive statement (PEP 244)
In-Reply-To: <3B533F03.A5FD37D8@ActiveState.com> (message from Paul Prescod on
 Mon, 16 Jul 2001 12:22:43 -0700)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de> <200107161705.f6GH5bA32228@odiug.digicool.com> <3B532709.56972A46@lemburg.com> <3B53344A.25AE6EB4@ActiveState.com> <3B533A62.73ECD605@lemburg.com> <3B533F03.A5FD37D8@ActiveState.com>
Message-ID: <200107162212.f6GMCCp09767@mira.informatik.hu-berlin.de>

> But this is just how it has to *look* to the user. If there is an
> implementation that behind the scenes only decodes Unicode literals,
> that would be fine.

Formally, you would have to decode everything just to make sure that
everything follows the declared encoding (i.e. no invalid byte
sequences).

I'm also not sure what an "ASCII superset" exactly is. Is it an
encoding where all ASCII strings just mean themselves? If so, and if
you allow encodings that have a shift state, you need to keep track of
the shift state for tokenization: char 39 might not always mean
APOSTROPHE, e.g. if you are in a shift state.
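
The shift-state concern can be made concrete with ISO-2022-JP, which
switches character sets through escape sequences. The sketch below
(present-day Python; the helper name is invented for the example) looks
for hiragana letters whose encoded form contains byte 39.

```python
APOSTROPHE = 0x27  # "char 39"


def chars_hiding_apostrophe(codec: str, chars: str) -> list:
    """Return the characters whose encoded form contains byte 0x27."""
    return [c for c in chars if APOSTROPHE in c.encode(codec)]


# All hiragana letters from U+3041 to U+3093.
hiragana = "".join(chr(cp) for cp in range(0x3041, 0x3094))

# ISO-2022-JP switches character sets with escape sequences (a shift state).
hits = chars_hiding_apostrophe("iso2022_jp", hiragana)

# UTF-8 has no shift state; non-ASCII characters never reuse ASCII bytes.
utf8_hits = chars_hiding_apostrophe("utf-8", hiragana)
```

In ISO-2022-JP at least one letter contains byte 39, so a tokenizer that
scans raw bytes for a closing quote would be misled; in UTF-8 no letter
does.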

> Or we could just disallow non-ASCII 8-bit string literals in files that
> use the declaration. 

+1.

> > To make this backwards compatible, the implementation would have to
> > assume Latin-1 as the original file encoding if not given (otherwise,
> > binary data currently stored in 8-bit strings wouldn't make the
> > roundtrip).
> 
> Another way to think about it is that files without the declaration skip
> directly to the tokenize step and skip the decoding step.

That's the way I would think of it also. You don't have Latin-1 values
in such strings - they are just byte strings.

Regards,
Martin


From PyChecker <pychecker@metaslash.com>  Tue Jul 17 03:23:26 2001
From: PyChecker <pychecker@metaslash.com> (Neal Norwitz)
Date: Mon, 16 Jul 2001 22:23:26 -0400
Subject: [Python-Dev] ANN: PyChecker version 0.7
Message-ID: <3B53A19E.899A5084@metaslash.com>

A new version of PyChecker is available for your hacking pleasure.

        PyChecker is a tool for finding common bugs in python source code.
        It finds problems that are typically caught by a compiler for less
        dynamic languages, like C and C++.

Comments, criticisms, new ideas, and other feedback are welcome.

Change Log:
  * Improve import warning messages, add from ... import ... checks
  * checker.py -h prints defaults after processing .pycheckrc file
  * Add config option -k/--pkgimport to disable unused imports from __init__.py
  * Add warning for variable used before being set
  * Improve format string checks/warnings
  * Check arguments to constructors
  * Check that self is first arg to base constructor
  * Add -e/--errors option to only warn about likely errors
  * Make 'self' configurable as the first argument to methods
  * Add check that there is a c'tor when instantiating an object and 
	passing arguments
  * Add config option (-N/--initreturn) to turn off warnings 
	when returning None from __init__()
  * Fix internal error with python 2.1 which defines a new op: LOAD_DEREF
  * Check in lambda functions for module/variable use
  * Fix inability to evaluate { 1: 'a' } inline,
	led to incorrect __init__() not called warnings
  * Fix exception when class overrides __special__() methods & raise exception
  * Fix check in format strings when using '%*g %*.*g', etc
  * Add check for static class attributes
  * Fix checking of module attributes
  * Fix wrong filename in 'Base class (xxx) __init__() not called'
  	when doing a from X import *
  * Fix 'No attribute found' for very dynamic classes
  	(may also work for classes that use __getattr__)

PyChecker is available on Source Forge:
    Web page:           http://pychecker.sourceforge.net/
    Project page:       http://sourceforge.net/projects/pychecker/

Neal
--
pychecker@metaslash.com


From fredrik@pythonware.com  Tue Jul 17 08:49:16 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 17 Jul 2001 09:49:16 +0200
Subject: [Python-Dev] guido@digicool.com
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de>  <200107161705.f6GH5bA32228@odiug.digicool.com>
Message-ID: <007301c10e95$00a7d6c0$4ffa42d5@hagrid>

guido wrote:
> But MAL and PaulP don't seem to agree on the semantics of this
> directive

and I don't agree with either of them.  the encoding should
apply to the entire source file, and there are lots of tricky
issues related to source code embedding (e.g. python code
in XML) and encoding-aware transports (e.g. python code
over HTTP) that need to be covered.

> and I haven't gotten a good answer why we can't do that
> with a magic comment.

probably because we can ;-)

I'm on vacation; assuming neither encodings nor directives will
go into 2.2a1, I'll prepare a counter-PEP when I have some time
to spare (not today, most likely).

</F>



From fredrik@pythonware.com  Tue Jul 17 09:07:58 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 17 Jul 2001 10:07:58 +0200
Subject: [Python-Dev] Leading with XML-RPC
References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de>
Message-ID: <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid>

martin wrote:
> > It might benefit from also including the sgmlop.c extension.
> 
> +1 on including this one (after fixing the bugs, that is). People want
> a "good" XML parser in Python, regardless of XML-RPC; they complain
> that expat requires an external library.
> 
> sgmlop should then go into xml.parsers.sgmlop; making sgmllib and
> xmllib use sgmlop is optional.

any reason we cannot ship a snapshot of the expat sources
with Python?  (just the necessary files, that is: three C files,
and some header files)

</F>



From mal@lemburg.com  Tue Jul 17 11:08:58 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 17 Jul 2001 12:08:58 +0200
Subject: [Python-Dev] PEP: Defining Python Source Code Encodings
Message-ID: <3B540EBA.EE5372BD@lemburg.com>

After having been through two rounds of comments with the "Unicode
Literal Encoding" pre-PEP, it has turned out that people actually
prefer to go for the full Monty meaning that the PEP should handle
the complete Python source code encoding and not just the encoding
of the Unicode literals (which are currently the only parts in a
Python source code file for which Python assumes a fixed encoding).

Here's a summary of what I've learned from the comments:

1. The complete Python source file should use a single encoding.

2. Handling of escape sequences should continue to work as it does 
   now, but with all possible source code encodings, that is
   standard string literals (both 8-bit and Unicode) are subject to 
   escape sequence expansion while raw string literals only expand
   a very small subset of escape sequences.

3. Python's tokenizer/compiler combo will need to be updated to
   work as follows:

   1. read the file
   2. decode it into Unicode assuming a fixed per-file encoding
   3. tokenize the Unicode content
   4. compile it, creating Unicode objects from the given Unicode data
      and creating string objects from the Unicode literal data
      by first reencoding the Unicode data into 8-bit string data
      using the given file encoding

   To make this backwards compatible, the implementation would have to
   assume Latin-1 as the original file encoding if not given (otherwise,
   binary data currently stored in 8-bit strings wouldn't make the
   roundtrip).

4. The encoding used in a Python source file should be easily
   parseable for an editor; a magic comment at the top of the
   file seems to be what people want to see, so I'll drop the
   directive (PEP 244) requirement in the PEP.
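
The four steps in item 3 can be sketched roughly as follows. This is
an illustration in present-day Python, not the proposed implementation:
decode_source and reencode_literal are made-up helper names, and the
built-in compile() stands in for the tokenizer/compiler combo.

```python
def decode_source(raw, encoding=None):
    """Step 2: decode raw source bytes using a fixed per-file encoding.

    Falls back to Latin-1 when no encoding is given, so that every
    byte value survives the trip through Unicode unchanged.
    """
    return raw.decode(encoding or "latin-1")


def reencode_literal(text, encoding=None):
    """Step 4 (8-bit strings): turn literal text back into bytes using
    the same per-file encoding."""
    return text.encode(encoding or "latin-1")


# Step 1: "read the file" (here the bytes are already in memory).
raw = b"s = '\xe9\xff binary'\n"

# Steps 2-3: decode, then tokenize/compile the Unicode text.
source = decode_source(raw)
code = compile(source, "<example>", "exec")

# Step 4: with the Latin-1 fallback, re-encoding gives back the
# original bytes, which is the backwards-compatibility argument.
assert reencode_literal(source) == raw
```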

Issues that still need to be resolved:

- how to enable embedding of differently encoded data in Python
  source code (e.g. UTF-8 encoded XML data in a Latin-1
  source file)

- what to do with non-literal data in the source file, e.g.
  variable names and comments:

  * reencode them just as would be done for literals
  * only allow ASCII for certain elements like variable names
  etc.

- which format to use for the magic comment, e.g.

  * Emacs style:

      #!/usr/bin/python
      # -*- encoding = 'utf-8' -*-

  * Via meta-option to the interpreter:

      #!/usr/bin/python --encoding=utf-8

  * Using a special comment format:

      #!/usr/bin/python
      #!encoding = 'utf-8'

Comments are welcome !

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From mal@lemburg.com  Tue Jul 17 11:24:16 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 17 Jul 2001 12:24:16 +0200
Subject: [Python-Dev] directive statement (PEP 244)
References: <200107161605.f6GG5lO03935@mira.informatik.hu-berlin.de>
 <200107161705.f6GH5bA32228@odiug.digicool.com>
 <3B532709.56972A46@lemburg.com>
 <3B53344A.25AE6EB4@ActiveState.com>
 <3B533A62.73ECD605@lemburg.com>
 <200107161946.f6GJk7Q00944@odiug.digicool.com>
 <3B535EEA.A629E61C@lemburg.com> <15187.25347.383123.602253@beluga.mojam.com>
Message-ID: <3B541250.6FD73F11@lemburg.com>

Skip Montanaro wrote:
> 
> Please excuse the naive interruption to this discussion.
> 
> I'm a bit removed from this debate, being someone who is generally happy
> with ASCII (and who really doesn't understand all the fur that is flying),
> however I would imagine that programmers in Moscow writing code to be read
> by other Russian programmers would want to enter Cyrillic characters
> directly into their module doc strings and not have to insert hex escapes.
> Can they safely do that now if they set the encoding variable in site.py
> appropriately?  If so, what is the need for the proposed directive to set
> encodings?  Is it an attempt simply to allow different encodings on a
> per-module basis?

You can only set the default encoding in site.py and this only
affects magic conversions from strings to Unicode and back. Unicode
literals must currently always use the unicode-escape encoding.
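
To illustrate the restriction (a sketch in present-day Python, where
the codec is spelled "unicode_escape"; the variable names are only for
this example): a Cyrillic word has to be written as pure ASCII escapes,
and the codec turns those escapes into the actual characters.

```python
# The Cyrillic word for "hello", spelled with \uXXXX escapes only.
# The raw bytes literal keeps the backslashes exactly as typed.
escaped = rb"\u041f\u0440\u0438\u0432\u0435\u0442"

# Decoding with the unicode-escape codec yields the six characters.
text = escaped.decode("unicode_escape")
print(len(escaped), len(text))
```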

The PEP tries to do away with the latter restriction by allowing
flexible Python source code encodings on a per-file basis, e.g. a
Japanese programmer would be able to write source files which
use Japanese characters in the Unicode literals, which should
enhance code readability and user acceptance.
 
> On a related note, can the "Defining Unicode Literal Encodings" PEP be added
> to the PEP site/page so those of us who don't save every message that flows
> into their inboxes have it to refer to?

I will update the pre-PEP according to the findings I've posted under
the subject "PEP: Defining Python Source Code Encodings" and then
ask Barry to assign a PEP number for the upload.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From mal@lemburg.com  Tue Jul 17 13:11:21 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 17 Jul 2001 14:11:21 +0200
Subject: [Python-Dev] Re: PEP: Defining Python Source Code Encodings
References: <Pine.LNX.4.21.BCL.0107171439080.704-100000@suzi.com.onego.ru>
Message-ID: <3B542B69.8C092964@lemburg.com>

Roman Suzi wrote:
> 
> On Tue, 17 Jul 2001, M.-A. Lemburg wrote:
> 
> > After having been through two rounds of comments with the "Unicode
> > Literal Encoding" pre-PEP, it has turned out that people actually
> > prefer to go for the full Monty meaning that the PEP should handle
> > the complete Python source code encoding and not just the encoding
> > of the Unicode literals (which are currently the only parts in a
> > Python source code file for which Python assumes a fixed encoding).
> >
> > Here's a summary of what I've learned from the comments:
> >
> > 1. The complete Python source file should use a single encoding.
> 
> Yes, certainly
> 
> > 2. Handling of escape sequences should continue to work as it does
> >    now, but with all possible source code encodings, that is
> >    standard string literals (both 8-bit and Unicode) are subject to
> >    escape sequence expansion while raw string literals only expand
> >    a very small subset of escape sequences.
> >
> > 3. Python's tokenizer/compiler combo will need to be updated to
> >    work as follows:
> >
> >    1. read the file
> >    2. decode it into Unicode assuming a fixed per-file encoding
> >    3. tokenize the Unicode content
> >    4. compile it, creating Unicode objects from the given Unicode data
> >       and creating string objects from the Unicode literal data
> >       by first reencoding the Unicode data into 8-bit string data
> >       using the given file encoding
> 
> I think, that if encoding is not given, it must silently assume "UNKNOWN"
> encoding and do nothing, that is be 8-bit clean (as it is now).

To be 8-bit clean it will have to use Latin-1 as fallback encoding
since this encoding assures the roundtrip safety (decode to Unicode,
then reencode).
 
> Otherwise, it will slow down parser considerably.

Yes, that could be an issue (I don't think it matters much though,
since parsing is usually only done during byte-code compilation and
the results are buffered in .pyc files).
 
> I also think that if encoding is chosen, there is no need to reencode it
> back to literal strings: let them be in Unicode.

That would be nice, but is not feasible at the moment (just try
to run Python with -U option and see what happens...).
 
> Or the encoding must _always_ be ASCII+something, as utf-8 for example.
> Eliminating the need to bother with tokenizer (Because only docstrings,
> comments and string-literals are entities which require encoding /
> decoding).
> 
> If I understood correctly, Python will soon switch to "unicode-only"
> strings, as Java and Tcl did. (This is of course disaster for some Python
> usage areas such as fast text-processing, but...)
> 
> Or am I missing something?

It won't switch any time soon... there's still too much work
ahead and I'm also pretty sure that the 8-bit string type won't
go away for backward compatibility reasons.
 
> >    To make this backwards compatible, the implementation would have to
> >    assume Latin-1 as the original file encoding if not given (otherwise,
> >    binary data currently stored in 8-bit strings wouldn't make the
> >    roundtrip).
> 
> ...as I said, there must be no assumed charset. Things must
> be left as is now when no explicit encoding given.

This is what the Latin-1 encoding assures.
 
> > 4. The encoding used in a Python source file should be easily
> >    parseable for an editor; a magic comment at the top of the
> >    file seems to be what people want to see, so I'll drop the
> >    directive (PEP 244) requirement in the PEP.
> >
> > Issues that still need to be resolved:
> >
> > - how to enable embedding of differently encoded data in Python
> >   source code (e.g. UTF-8 encoded XML data in a Latin-1
> >   source file)
> 
> Probably, adding explicit conversions.

Yes, but there are cases where the source file having the embedded
data will not decode into Unicode (I got the example backwards:
think of a UTF-8 encoded source file with a Latin-1 string literal).

Perhaps we should simply rule out this case and have the 
programmer stick to the source file encoding + some escaping
or a run-time recoding of the literal data into the preferred
encoding.
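
A small sketch of the two workarounds, written in present-day Python
and assuming the source file itself is UTF-8 (the variable names are
only for illustration): either escape the literal and recode it at
run time, or spell the foreign bytes out with escapes.

```python
# Escaping keeps the source pure ASCII, whatever the file encoding is.
title = "caf\u00e9"

# Run-time recoding: produce Latin-1 bytes where another component
# needs them, instead of embedding raw Latin-1 bytes in the file.
latin1_payload = title.encode("latin-1")

# Alternative: write the bytes themselves as escapes.
escaped_payload = b"caf\xe9"
assert latin1_payload == escaped_payload

# The same bytes pasted raw into a UTF-8 file would not decode,
# which is the problem case described above.
try:
    escaped_payload.decode("utf-8")
except UnicodeDecodeError:
    decodes_as_utf8 = False
else:
    decodes_as_utf8 = True
```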
 
> > - what to do with non-literal data in the source file, e.g.
> >   variable names and comments:
> >
> >   * reencode them just as would be done for literals
> >   * only allow ASCII for certain elements like variable names
> >   etc.
> 
> I think non-literal data must be in ASCII.
> But it could be too cheesy to have variable names in national
> alphabet ;-)

That's for Guido to decide...
 
> > - which format to use for the magic comment, e.g.
> >
> >   * Emacs style:
> >
> >       #!/usr/bin/python
> >       # -*- encoding = 'utf-8' -*-
> >
> >   * Via meta-option to the interpreter:
> >
> >       #!/usr/bin/python --encoding=utf-8
> >
> >   * Using a special comment format:
> >
> >       #!/usr/bin/python
> >       #!encoding = 'utf-8'
> 
> No variant is ideal. The 2nd is worse/best than all
> (it depends on how to look at it!)
> 
> Python has no macro directives. In this situation
> they could help greatly!

We've been discussing these on python-dev, but Guido is not
too keen on having them.
 
> That "#!encoding" is special case of macro directive.
> 
> Maybe just put something like ''# <!DOCTYPE HTML PUBLIC''
> at the beginning...
> 
> Or, even greater idea occurred to me: allow some XML
> with meta-information (not only encoding) somehow escaped.
> 
> I think, GvR could come with some advice here...
> 
> > Comments are welcome !

Thanks for your comments,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From guido@digicool.com  Tue Jul 17 15:21:54 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 17 Jul 2001 10:21:54 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: Your message of "Tue, 17 Jul 2001 10:07:58 +0200."
 <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid>
References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de>
 <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid>
Message-ID: <200107171421.KAA20380@cj20424-a.reston1.va.home.com>

> any reason we cannot ship a snapshot of the expat sources
> with Python?  (just the necessary files, that is: three C files,
> and some header files)

Well, there's maintenance (someone has to sync these files from the
expat source tree into the Python tree regularly) and possibly
licensing (I don't know about the expat license, but who knows if it's
GPL compatible).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jul 17 15:28:03 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 17 Jul 2001 10:28:03 -0400
Subject: [Python-Dev] Re: PEP: Defining Python Source Code Encodings
In-Reply-To: Your message of "Tue, 17 Jul 2001 14:11:21 +0200."
 <3B542B69.8C092964@lemburg.com>
References: <Pine.LNX.4.21.BCL.0107171439080.704-100000@suzi.com.onego.ru>
 <3B542B69.8C092964@lemburg.com>
Message-ID: <200107171428.KAA20424@cj20424-a.reston1.va.home.com>

> I think, GvR could come with some advice here...

I have to bow out of this discussion for now.  There are too many
things requesting my attention, and I have to shed load.  Ditto for
the "directive" proposal.

Sorry,

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gregor@hoffleit.de  Tue Jul 17 15:32:09 2001
From: gregor@hoffleit.de (Gregor Hoffleit)
Date: Tue, 17 Jul 2001 16:32:09 +0200
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <200107171421.KAA20380@cj20424-a.reston1.va.home.com>
References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> <200107171421.KAA20380@cj20424-a.reston1.va.home.com>
Message-ID: <20010717163209.A30372@mediasupervision.de>

On Tue, Jul 17, 2001 at 10:21:54AM -0400, Guido van Rossum wrote:
> > any reason we cannot ship a snapshot of the expat sources
> > with Python?  (just the necessary files, that is: three C files,
> > and some header files)
> 
> Well, there's maintenance (someone has to sync these files from the
> expat source tree into the Python tree regularly) and possibly
> licensing (I don't know about the expat license, but who knows if it's
> GPL compatible).

Grumble, browse, murmur... The license should be fine for all practical
uses: It's a MIT/X style license, i.e. very similar to the old Python
license, and therefore compatible with the GPL without any doubt ;-) !

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/expat/expat/COPYING


    Gregor


From fdrake@acm.org  Tue Jul 17 15:34:06 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 17 Jul 2001 10:34:06 -0400 (EDT)
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <200107171421.KAA20380@cj20424-a.reston1.va.home.com>
References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de>
 <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid>
 <200107171421.KAA20380@cj20424-a.reston1.va.home.com>
Message-ID: <15188.19678.344303.182227@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > Well, there's maintenance (someone has to sync these files from the
 > expat source tree into the Python tree regularly) and possibly

  We even know who that someone would be, and how annoyed he'd be at
having one more place to check things in.  ;-)
  Expat (not pyexpat) *really* needs to have some attention.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From mwh@python.net  Tue Jul 17 16:14:11 2001
From: mwh@python.net (Michael Hudson)
Date: 17 Jul 2001 11:14:11 -0400
Subject: [Python-Dev] PEP: Defining Python Source Code Encodings
References: <3B540EBA.EE5372BD@lemburg.com>
Message-ID: <2m1ynfr3jw.fsf@starship.python.net>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> - which format to use for the magic comment, e.g.
> 
>   * Emacs style:
> 
>       #!/usr/bin/python
>       # -*- encoding = 'utf-8' -*-

Emacs already has a name for this; you'd write that

# -*- coding: utf-8; -*-

Seems reasonable to me.
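
As a rough sketch of how a tool might pick that up (present-day
Python; the pattern, the function name find_declared_encoding and the
Latin-1 default are assumptions made for this example, not anything
Emacs or Python defines here):

```python
import re

# Hypothetical pattern: "coding", then ':' or '=', then the name.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def find_declared_encoding(source_bytes, default="latin-1"):
    """Return the encoding named in a comment on line 1 or 2, else default."""
    for line in source_bytes.splitlines()[:2]:
        text = line.decode("ascii", "replace")
        if not text.lstrip().startswith("#"):
            continue  # only comment lines may carry the declaration
        match = CODING_RE.search(text)
        if match:
            return match.group(1)
    return default


example = b"#!/usr/bin/python\n# -*- coding: utf-8; -*-\nprint('hi')\n"
print(find_declared_encoding(example))
```

Restricting the search to the first two lines leaves room for a
"#!" line while keeping the check cheap for editors and for the parser.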

Cheers,
M.

-- 
  We've had a lot of problems going from glibc 2.0 to glibc 2.1.
  People claim binary compatibility.  Except for functions they
  don't like.                       -- Peter Van Eynde, comp.lang.lisp


From mal@lemburg.com  Tue Jul 17 17:06:12 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 17 Jul 2001 18:06:12 +0200
Subject: [Python-Dev] A replacement for asyncore / asynchat
References: <Pine.OSF.4.30.0107161246001.18906-100000@sirppi.helsinki.fi>
Message-ID: <3B546274.86B0FD7B@lemburg.com>

Panu A Kalliokoski wrote:
> 
> Hello all, I've developed a Python module (in Python) to make somewhat
> higher abstraction over select.select(). The package is called
> "Selecting".  The package is somewhat similar to asyncore, but has many
> advantages over it:
>
> [...]
>
> For these reasons, I think that the asyncore package in the Python main
> distribution should be replaced with Selecting or at least Selecting
> should be put in the main distribution.

Is your package backwards compatible to asyncore ? If not, then
it might be a better idea, to place it on the web (e.g. on SourceForge) 
and  register the URLs with Parnassus so that Python users can
easily find it.
 
> The package is available at
> http://sange.fi/~atehwa-u/selecting/            (for browsing)  and
> http://sange.fi/~atehwa-u/selecting-0.89.tar.gz (for downloading).
> 
> The package is quite well tested and has been used to build ircd-style
> daemons, but more testing and comments are always welcome.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From skip@pobox.com (Skip Montanaro)  Tue Jul 17 17:32:26 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 17 Jul 2001 11:32:26 -0500
Subject: [Python-Dev] A replacement for asyncore / asynchat
In-Reply-To: <3B546274.86B0FD7B@lemburg.com>
References: <Pine.OSF.4.30.0107161246001.18906-100000@sirppi.helsinki.fi>
 <3B546274.86B0FD7B@lemburg.com>
Message-ID: <15188.26778.547568.706846@beluga.mojam.com>

    >> The package is available at
    >> http://sange.fi/~atehwa-u/selecting/            (for browsing)  and
    >> http://sange.fi/~atehwa-u/selecting-0.89.tar.gz (for downloading).
    >> 
    >> The package is quite well tested and has been used to build ircd-style
    >> daemons, but more testing and comments are always welcome.

    mal> Is your package backwards compatible to asyncore ? If not, then it
    mal> might be a better idea, to place it on the web (e.g. on
    mal> SourceForge) and register the URLs with Parnassus so that Python
    mal> users can easily find it.

You might also bring it to Sam Rushing's attention.  I think he's been
working on "async, the next generation".  He will probably have some good
ideas and could provide some useful critique of the selecting code.  I don't
recall if he is on python-dev or not.

Skip


From atehwa@iki.fi  Tue Jul 17 18:02:21 2001
From: atehwa@iki.fi (Panu A Kalliokoski)
Date: Tue, 17 Jul 2001 20:02:21 +0300 (EET DST)
Subject: [Python-Dev] A replacement for asyncore / asynchat
In-Reply-To: <3B546274.86B0FD7B@lemburg.com>
Message-ID: <Pine.OSF.4.30.0107171951000.19617-100000@sirppi.helsinki.fi>

On Tue, 17 Jul 2001, M.-A. Lemburg wrote:

| > For these reasons, I think that the asyncore package in the Python main
| > distribution should be replaced with Selecting or at least Selecting
| > should be put in the main distribution.
|
| Is your package backwards compatible to asyncore ? If not, then
| it might be a better idea, to place it on the web (e.g. on SourceForge)
| and  register the URLs with Parnassus so that Python users can
| easily find it.

I've registered the module in Parnassus. My point is mostly that because
Selecting really is (at least in my opinion) much easier to work with
than asyncore, it should be placed where people will look for standard
solutions, and that is the standard library.

Selecting is not backwards compatible with asyncore. It is not
impossible to write such glue code that one could use Selecting directly
in projects that have been written for asyncore, but I doubt whether it
is worth it. Selecting does not offer great advantages in performance
(at least not currently, but this might change), but in customisability
and clean API. asyncore does its work well in projects that use it, and
Selecting is mostly better because it is easier to make a new project
that uses it.

The sensible solution, as I see it, would be to deprecate
asyncore/asynchat but leave them there for supporting old projects, and
add Selecting as the primary way of abstracting over select() (and
poll(), kqueue and rtsig, which I plan to transparently add to
Selecting).

Panu




From fdrake@acm.org  Tue Jul 17 19:21:29 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 17 Jul 2001 14:21:29 -0400 (EDT)
Subject: [Python-Dev] Docs for 2.2a1 frozen
Message-ID: <15188.33321.534672.664230@cj42289-a.reston1.va.home.com>

  Please do not make any documentation checkins on the trunk or
descr-branch; we're getting things ready for the 2.2a1 release.
Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From bsass@freenet.edmonton.ab.ca  Tue Jul 17 19:31:24 2001
From: bsass@freenet.edmonton.ab.ca (Bruce Sass)
Date: Tue, 17 Jul 2001 12:31:24 -0600 (MDT)
Subject: [Python-Dev] Re: PEP: Defining Python Source Code Encodings
In-Reply-To: <3B540EBA.EE5372BD@lemburg.com>
Message-ID: <Pine.LNX.4.33.0107171117070.11975-100000@bms>

On Tue, 17 Jul 2001, M.-A. Lemburg wrote:
<...>
> - which format to use for the magic comment, e.g.
>
>   * Emacs style:
>
>       #!/usr/bin/python
>       # -*- encoding = 'utf-8' -*-

This should work for everyone, but will it confuse Emacs?
I suppose, "# # ...", or "### ...", or almost any short sequence
starting with "#" will work, eh.

>   * Via meta-option to the interpreter:
>
>       #!/usr/bin/python --encoding=utf-8

This will require editing if python is not in /usr/bin, and can not be
used to pass more than one argument to the command (python, in this
case).

>   * Using a special comment format:
>
>       #!/usr/bin/python
>       #!encoding = 'utf-8'

This is confusing, and will only work on *nix (linux?) iff it is the
second (or later) line; if it is the first line... it will fail
because there is probably no executable named "encoding" available,
and if there is, "= 'utf8'" is unlikely to exist.

please,
Avoid character sequences that have other meanings in this context.


I think this should be done as a generic method for pre-processing
Python source before the compiler/interpreter has a look at it.

e.g.,

	# ## encoding utf-8

triggers whatever you encoding fans want,

	# ## format noweb

runs the source through a filter which can extract noweb marked-up
code, and maybe even installs the woven docs and tangled code
(via distutils?)

	# ## MySpecialMarkup

runs the source through a filter named MySpecialMarkup.
MySpecialMarkup could be anything: extensions to docstrings, a
proprietary binary format, an entire package-in-a-file!

Generally:  #<magic> <directive> [<arguments>]

If Python does not know what the <directive> is it should either look
in a set location for a program of the same name then use its output as
the source, or look into a table that maps <directive> to a procedure
which results in Python source.
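
A minimal sketch of the table-lookup variant, in present-day Python.
Everything here is hypothetical: the "# ## name [args]" marker is taken
from the examples above, FILTERS holds only one made-up entry, and an
unknown directive simply raises instead of searching for a program of
the same name.

```python
import re

# Hypothetical marker syntax from the examples: "# ## name [args]".
DIRECTIVE_RE = re.compile(r"^#\s##\s+(\w+)(?:\s+(.*))?$")


def decode_with(raw, args):
    """'encoding' directive: args names the codec used to decode the body."""
    return raw.decode(args)


# Table mapping <directive> to a procedure that results in Python source.
FILTERS = {"encoding": decode_with}


def preprocess(raw):
    """Apply the filter named on the first line, if any; return source text."""
    first, _, rest = raw.partition(b"\n")
    match = DIRECTIVE_RE.match(first.decode("ascii", "replace").rstrip())
    if match is None:
        return raw.decode("latin-1")  # no directive: pass the bytes through
    name, args = match.group(1), match.group(2) or ""
    if name not in FILTERS:
        raise LookupError("no filter registered for %r" % name)
    return FILTERS[name](rest, args)


print(preprocess(b"# ## encoding utf-8\ns = '\xc3\xa9'\n"))
```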


- Bruce



From martin@loewis.home.cs.tu-berlin.de  Tue Jul 17 23:50:27 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 18 Jul 2001 00:50:27 +0200
Subject: [Python-Dev] Re: PEP 244 syntax
In-Reply-To: <15187.25558.243182.290606@beluga.mojam.com> (message from Skip
 Montanaro on Mon, 16 Jul 2001 16:59:50 -0500)
References: <200107161559.f6GFx8V03899@mira.informatik.hu-berlin.de>
 <3B531B24.C8CCC74F@lemburg.com>
 <200107162140.f6GLexo09608@mira.informatik.hu-berlin.de> <15187.25558.243182.290606@beluga.mojam.com>
Message-ID: <200107172250.f6HMoR501665@mira.informatik.hu-berlin.de>

> But it doesn't appear that PEP 244 allows multiple directives per line:
> 
>     A directive_statement is a statement of the form
> 
>         directive_statement: 'directive' NAME [atom] [';'] NEWLINE
> 
> If you decide to allow it, then the semicolon makes sense, but not
> otherwise.

Ok, then I guess I should put the directive statement into the
small_stmt category, and allow multiple of those in a single line.

Regards,
Martin



From martin@loewis.home.cs.tu-berlin.de  Tue Jul 17 23:45:42 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 18 Jul 2001 00:45:42 +0200
Subject: [Python-Dev] PEP: Defining Python Source Code Encodings
Message-ID: <200107172245.f6HMjgp01656@mira.informatik.hu-berlin.de>

> To be 8-bit clean it will have to use Latin-1 as fallback encoding
> since this encoding assures the roundtrip safety (decode to Unicode,
> then reencode).

No, that is not true. Any other 8-bit encoding that has all code
points assigned (e.g. Latin-2, or KOI8-R) would also give you full
round-trip encoding.
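
This is easy to check with the codecs shipped in present-day Python
(a sketch; round_trips is just a helper name for this example, and
ASCII is included as an encoding that does not have all code points
assigned):

```python
ALL_BYTES = bytes(range(256))


def round_trips(encoding):
    """True if every byte value survives decode-to-Unicode then re-encode."""
    try:
        return ALL_BYTES.decode(encoding).encode(encoding) == ALL_BYTES
    except UnicodeError:
        # Some byte has no assigned character in this encoding.
        return False


for name in ("latin-1", "iso8859-2", "koi8-r", "ascii"):
    print(name, round_trips(name))
```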

Regards,
Martin



From martin@loewis.home.cs.tu-berlin.de  Wed Jul 18 00:05:07 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 18 Jul 2001 01:05:07 +0200
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> (fredrik@pythonware.com)
References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid>
Message-ID: <200107172305.f6HN57T01730@mira.informatik.hu-berlin.de>

> > +1 on including this one (after fixing the bugs, that is). People want
> > a "good" XML parser in Python, regardless of XML-RPC; they complain
> > that expat requires an external library.
> > 
> > sgmlop should then go into xml.parsers.sgmlop; making sgmllib and
> > xmllib use sgmlop is optional.
> 
> any reason we cannot ship a snapshot of the expat sources
> with Python?  (just the necessary files, that is: three C files,
> and some header files)

Would be fine with me, and I would contribute the necessary changes to
the CVS - I just would need permission to do so (and advice on whether
to stuff everything into Modules, or to create an expat subdirectory).

Regards,
Martin


From fdrake@acm.org  Wed Jul 18 00:31:07 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Tue, 17 Jul 2001 19:31:07 -0400 (EDT)
Subject: [Python-Dev] [development doc updates]
Message-ID: <20010717233107.C0EAE2892B@beowolf.digicool.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/


Final update of the 2.2a1 documentation.



From guido@digicool.com  Wed Jul 18 00:53:36 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 17 Jul 2001 19:53:36 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: Your message of "Wed, 18 Jul 2001 01:05:07 +0200."
 <200107172305.f6HN57T01730@mira.informatik.hu-berlin.de>
References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid>
 <200107172305.f6HN57T01730@mira.informatik.hu-berlin.de>
Message-ID: <200107172353.TAA21022@cj20424-a.reston1.va.home.com>

> > any reason we cannot ship a snapshot of the expat sources
> > with Python?  (just the necessary files, that is: three C files,
> > and some header files)
> 
> Would be fine with me, and I would contribute the necessary changes to
> the CVS - I just would need permission to do so (and advice on whether
> to stuff everything into Modules, or to create an expat subdirectory).

If Fred (Drake) approves, that's fine with me.  Negotiate the details
with him.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Wed Jul 18 08:36:50 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 18 Jul 2001 09:36:50 +0200
Subject: [Python-Dev] Re: PEP: Defining Python Source Code Encodings
References: <200107172245.f6HMjgp01656@mira.informatik.hu-berlin.de>
Message-ID: <3B553C92.D030234F@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > To be 8-bit clean it will have to use Latin-1 as fallback encoding
> > since this encoding assures the roundtrip safety (decode to Unicode,
> > then reencode).
> 
> No, that is not true. Any other 8-bit encoding that has all code
> points assigned (e.g. Latin-2, or KOI8-R) would also give you full
> round-trip encoding.

True. Still, Latin-1 gives you the best performance and is also
compatible with the unicode-escape encoding which is currently in
use.
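
The performance point comes from Latin-1 being an identity mapping,
which a short check in present-day Python makes visible (a sketch; the
variable names are only for this example). Other full 8-bit encodings
round-trip as well but need a lookup table per byte.

```python
# Latin-1 decoding is an identity mapping: byte value n becomes
# code point n, so no table lookup is needed.
identity = all(ord(bytes([n]).decode("latin-1")) == n for n in range(256))

# The same byte under another full 8-bit encoding lands elsewhere:
# 0xC1 is U+00C1 in Latin-1 and a Cyrillic letter in KOI8-R.
latin1_char = b"\xc1".decode("latin-1")
koi8_char = b"\xc1".decode("koi8-r")
print(identity, hex(ord(latin1_char)), hex(ord(koi8_char)))
```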

I'll add a note about this to the PEP.

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From dee@investorcafe.net  Wed Jul 18 09:00:56 2001
From: dee@investorcafe.net (InvestorCafe)
Date: Wed, 18 Jul 2001  01:00:56 -0700
Subject: [Python-Dev] NEWS ALERT!!! - RateXchange (RTX: AMEX)
Message-ID: <E15MpN8-0003dn-00@mail.python.org>

<HTML>
<HEAD>
<META NAME="GENERATOR" Content="Microsoft DHTML Editing Control">
<TITLE></TITLE>
</HEAD>
<BODY>
<DIV><FONT face=Arial size=2>
<DIV><FONT face=Arial size=2><IMG 
src="mhtml:mid://00000223/!http://investorcafe.net/RTX/rtxheader.gif" 
border=0></FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2><STRONG>InvestorCafe.Net Profile: RateXchange 
Corporation (AMEX: RTX)<BR><BR></STRONG></FONT><FONT face=Arial size=2>According 
to&nbsp;President and Chief Executive Jon Merriman, "RateXchange is now poised 
for success, having completed a transition </FONT></DIV>
<DIV><FONT face=Arial size=2>into a dynamic electronic trading system. Last 
year, the company raised Dollars 30 million in an institutional private 
placement, </FONT><FONT face=Arial size=2>and </FONT></DIV>
<DIV><FONT face=Arial size=2>moved its stock listing from the OTC Bulletin Board 
to the American Stock Exchange."<BR></DIV><FONT face=Arial size=2><B></B></FONT>
<TABLE height=540 border=0>
  <TBODY>
  <TR>
    <TD width=245 bgColor=#dddddd height=21><FONT face=Arial size=2><B>RTX 
      Info</B></FONT></TD>
    <TD width=540 bgColor=#dddddd height=21><B><FONT face=Arial size=2>About 
      <U><A href="http://www.ratexchange.com/"><FONT 
      color=#000000>RateXchange</FONT></A></U></FONT></B> </TD></TR>
  <TR>
    <TD vAlign=top width=245 height=294><FONT face=Arial size=1>
      <TABLE cellSpacing=0 cellPadding=0 width=230 bgColor=#ffffff border=0>
        
        <TR>
          <TD bgColor=#29749c colSpan=4>
            <P align=left><FONT face=Verdana color=#ffffff size=2><B>RTX 
            Quote</FONT> </B></P></TD></TR>
        <TR>
          <TD>&nbsp;<IMG height=9 
            src="mhtml:mid://00000223/!http://www.financialcontent.com/images/arrow_dash2.gif" 
            width=11>&nbsp;<BR></TD>
          <TD><A 
            href="http://user.financialcontent.com/mspad/quote.cgi?account=mspad&amp;ticker=RTX"><FONT 
            face=Verdana color=#29749c 
            size=2>RTX</FONT></A>&nbsp;&nbsp;&nbsp;<BR></TD>
          <TD align=right><FONT face=Verdana color=#000000 
            size=2>1.00</B>&nbsp;&nbsp;&nbsp;<BR></FONT></TD>
          <TD noWrap align=right><FONT face=Verdana color=#000000 
            size=2>0.00%</B>&nbsp;<BR></FONT></TD></TR>
        <TR>
          <TD bgColor=#000000 colSpan=4>
            <TABLE cellSpacing=0 cellPadding=0 width="100%" border=0>
              
              <TR>
                <TD height=1><SPACER type="block" width="1" 
              height="1"></TD></TR></TABLE></TD></TR>
        <TR>
          <TD align=middle bgColor=#ffffff colSpan=4><FONT face=Verdana 
            color=#000000 size=2><SMALL>Delayed 15 
        minutes</SMALL></FONT><BR></TD></TR></TABLE>
      <DIV>RTX Intraday Chart <BR><IMG 
      src="mhtml:mid://00000223/!http://chart.financialcontent.com/intraday/mspad/RTX/210/60/FFFFFF/29749C/000000/000000/000000/000000/CCCCCC/FF0000" 
      border=0> <BR><BR>RTX 1-Month Chart<BR><IMG 
      src="mhtml:mid://00000223/!http://chart.financialcontent.com/historical/mspad/RTX/20/210/60/FFFFFF/29749C/000000/000000/000000/000000/CCCCCC/FF0000" 
      border=0> <BR><BR>RTX 1-Year Chart<BR><IMG 
      src="mhtml:mid://00000223/!http://chart.financialcontent.com/historical/mspad/RTX/240/210/60/FFFFFF/29749C/000000/000000/000000/000000/CCCCCC/FF0000" 
      border=0> </FONT></DIV></TD>
    <TD vAlign=top width=540 height=461 rowSpan=2><FONT face=Arial 
      size=2>RateXchange Corporation provides trading, consulting and 
      information solutions enabling market participants to maximize their 
      assets. These trading solutions allow network providers, energy merchants, 
      financial institutions and asset managers engaged on the <STRONG><FONT 
      color=#000000>RateXchange Trading System (RTS)<SUP 
      class=ten>TM</SUP></FONT></STRONG> and <STRONG><FONT 
      color=#000000>RateXchange Futures System (RFS)<SUP 
      class=ten>TM</SUP></FONT></STRONG> the ability to trade bandwidth and 
      futures globally. The Company's consulting solutions practice provides 
      asset valuation tools, risk management strategies and analytics. The 
      information solutions group provides pricing information, market research 
      and comprehensive industry background. Founded in 1997, RateXchange is a 
      publicly-traded <FONT color=#000000>(AMEX: RTX) </FONT>trading solutions 
      company recognized as an industry leader in enabling the creation of 
      liquid marketplaces for bandwidth and other telecommunications 
      products.<BR><BR><B>RateXchange Trading System (RTS)TM: </B>RateXchange 
      Trading System (RTS)TM is RateXchange's platform for bandwidth trading and 
      is analogous to trading systems that dominate online natural gas and 
      electricity commodity trading. RateXchange's proprietary trading platform 
      is available for customization to meet your organization's needs in the 
      form of Private Label Exchanges.</FONT> 
      <P><FONT face=Arial size=2><B>OffXchange Trading: </B>Market participants 
      can trade bandwidth with the assistance of an experienced broker as a 
      result of RateXchange's alliance with Amerex, the world's largest 
      over-the-counter (OTC) energy &amp; power broker.</FONT></P>
      <P><FONT face=Arial size=2><B>RateXchange Futures System:&nbsp;</B> 
      RateXchange Future System's high-speed online trading and risk management 
      system enables brokerages, institutional fund managers and financial 
      institutions to both execute and clear listed futures contracts on a 
      global basis.<BR><BR><STRONG>Recent News</STRONG></FONT></P>
      <P><FONT face=Arial size=2>
      <TABLE cellSpacing=0 cellPadding=2 width="100%" border=0>
        
        <TR>
          <TD vAlign=top noWrap align=left width="1%"><FONT face=verdana 
            size=1>7/16/01</FONT> </TD>
          <TD vAlign=top align=left width="99%">
            <DIV><A 
            href="http://user.financialcontent.com/mspad/newsread.cgi?account=mspad&amp;article=010716-cgm080!pr"><FONT 
            face=arial color=#0000ff size=2>RateXchange Selects Avantrust to 
            Provide Trade Credit Insurance For RateXchange's Telecom 
            Clients</FONT></A></DIV></TD></TR></TABLE></FONT></P>
      <P><FONT face=Arial size=2>
      <TABLE cellSpacing=0 cellPadding=2 width="100%" border=0>
        
        <TR>
          <TD vAlign=top noWrap align=left width="1%"><FONT face=verdana 
            size=1>6/28/01</FONT> </TD>
          <TD vAlign=top align=left width="99%">
            <DIV><A 
            href="http://user.financialcontent.com/mspad/newsread.cgi?account=mspad&amp;article=010628-cgth019!pr"><FONT 
            face=arial size=2>RateXchange Announces Teleglobe as Major User of 
            RateXchange Trading 
      System</FONT></A></DIV></TD></TR></TABLE></FONT><FONT face=Arial 
      size=2></FONT></P>
      <P><FONT face=Arial size=2><A 
      href="mailto:investor_relations@ratexchange.com"></A></FONT>&nbsp;</P>
      <P><FONT size=2></FONT>&nbsp;</P>
      <P align=right><FONT size=2></FONT>&nbsp;</P></TD></TR>
  <TR>
    <TD vAlign=top width=245 height=163>
      <FORM name=RTXdatabase onsubmit="" action=index.asp method=post 
      webbot-action="--WEBBOT-SELF--"><FONT face=arial size=1>Want more info on 
      RTX? Join our database, and we'll contact you with further 
      information.<BR>&nbsp;</FONT><BR><FONT face=arial size=2>Your Name:</FONT> 
      <FONT size=1><FONT face=arial size=2><BR><INPUT tabIndex=1 size=30 
      name=name><BR>Email </FONT><FONT face=arial color=#ff0000 
      size=1>*required</FONT><FONT face=arial size=2>:<BR><INPUT tabIndex=2 
      size=30 name=email><BR><INPUT tabIndex=3 type=submit value="Sign Up for Newsletter" name=B1></FONT></FORM></FONT></TD></TR></TBODY></TABLE>
<DIV><B><FONT face=Arial>Disclaimer:</FONT></B></DIV>
<P><FONT face=Arial size=1>Verify all claims and do your own due diligence. This 
profile is not a solicitation or recommendation to buy, sell or hold 
securities.&nbsp;InvestorCafe.Net is not offering securities for sale. An offer 
to buy or sell can be made only with accompanying disclosure documents and only 
in the states and provinces for which they are approved. All statements and 
expressions are the sole opinion of the editor and are subject to change without 
notice. InvestorCafe.Net is not liable for any investment decisions by its 
readers or subscribers. It is strongly recommended that any purchase or sale 
decision be discussed with a financial adviser, or a broker-dealer, or a member 
of any financial regulatory bodies. The information contained herein has been 
provided as an information service only. The accuracy or completeness of the 
information is not warranted and is only as reliable as the sources from which 
it was obtained. It should be understood there is no guarantee that past 
performance will be indicative of future results. Investors are cautioned that 
they may lose all or a portion of their investment in this or any other company. 
In order to be in full compliance with the Securities Act of 1933, Section 
17(b), InvestorCafe.net and its management fully disclose that they receive fees 
from profiled companies or agents representing the profiled companies. These 
fees may be paid in cash or in stock and they will be fully disclosed in each 
profile. Neither InvestorCafe.Net nor any of its affiliates, or employees shall 
be liable to you or anyone else for any loss or damages from use of this 
Internet Web Site or e-mail, caused in whole or part by its negligence or 
contingencies beyond its control in procuring, compiling, interpreting, 
reporting, or delivering this Web Site or e-mail and any contents. Since 
InvestorCafe.Net&nbsp;receives compensation and its employees or members of 
their families may hold stock in the profiled companies, there is an inherent 
conflict of interest in InvestorCafe.Net's statements and opinions and such 
statements and opinions cannot be considered independent. InvestorCafe.Net and 
its management may benefit from any increase in the share prices of the profiled 
companies. Information contained herein contains "forward looking statements" 
within the meaning of Section 27A of the Securities Act of 1933 and Section 21E 
of the Securities and Exchange Act of 1934. Any statements that express or 
involve discussions with respect to predictions, expectations, beliefs, plans, 
projections, objectives, goals, assumptions or future events or performance are 
not statements of historical facts and may be "forward looking statements". 
Forward looking statements are based on expectations, estimates and projections 
at the time the statements are made that involve a number of risks and 
uncertainties which could cause actual results or events to differ materially 
from those presently anticipated. Forward looking statements may be identified 
through the use of words such as "expects", "will", "anticipates", "estimates", 
"believes", or by statements indicating certain actions "may", "could", "should" 
or "might" occur. InvestorCafe.Net has been compensated a total of 6,000 free 
trading shares of RTX for the production and distribution of this document. We 
will benefit from any increase in the share price or liquidity of RTX. We retain 
the option of liquidating all or part of our compensation (RTX shares) before, 
during, or immediately after the dissemination of this report. 
EquInvestorCafe.Net encourages its readers to invest carefully and read the 
investor information available at the Web sites of the Securities and Exchange 
Commission (SEC) and/or the National Association of Securities Dealers (NASD). 
The NASD offers very good information on its site about how to invest with 
caution. Readers can also review all public filings by companies at the SEC's 
EDGAR page. InvestorCafe.Net may make use of information including, but not 
limited to, Company Press Releases, SEC Filings, Company Profiles, Other 
Research Sites, Brokerages, Newspapers, Magazines, Journals, Electronic 
Databases, Company Interviews, and other Publicly accessible sources. 
InvestorCafe.Net neither represents nor warrants the accuracy of information 
provided by any of the sources mentioned above and therefore: THE READER SHOULD 
VERIFY ALL CLAIMS AND DO THEIR OWN DUE DILIGENCE BEFORE INVESTING IN ANY 
SECURITIES MENTIONED. INVESTING IN SECURITIES IS SPECULATIVE AND CARRIES A HIGH 
DEGREE OF RISK. THE INFORMATION FOUND IN THIS WEB SITE OR E-MAIL IS PROTECTED BY 
THE COPYRIGHT LAWS OF THE UNITED STATES AND MAY NOT BE COPIED OR REPRODUCED IN 
ANY WAY WITHOUT THE EXPRESSED, WRITTEN CONSENT OF THE EDITOR OF 
INVESTORCAFE.NET.</FONT></P>
<P align=left><FONT size=1><IMG alt="" hspace=0 
src="mhtml:mid://00000223/!http://investorcafe.net/images/icheader.jpg" 
align=baseline border=0><FONT size=2><BR></FONT></FONT><A 
href="http://investorcafe.net/optout.asp">Click</A> here to 
unsubscribe</P></FONT></FONT></DIV>
</BODY>
</HTML>




From guido@digicool.com  Wed Jul 18 14:23:04 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 18 Jul 2001 09:23:04 -0400
Subject: [Python-Dev] 2.2a1 released
Message-ID: <200107181323.JAA27606@cj20424-a.reston1.va.home.com>

The 2.2a1 release is on the website: http://www.python.org/2.2/

I'm waiting for two things before I send out a wider announcement:

- SF has a 30 minute cron job delay before the .tgz file is visible;

- I have to finish an introduction to the features added on behalf of
  PEP 252 and PEP 253 (http://www.python.org/2.2/descrintro.html).

I'm feeling shitty today so I may not complete that intro right away,
but I'll post the announcement anyway once SF has all three files.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Paul.Moore@atosorigin.com  Wed Jul 18 14:57:06 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 18 Jul 2001 14:57:06 +0100
Subject: [Python-Dev] PEP 250: Summary of comments
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com>

Having waited a few days to let the dust settle, I believe that the
following is the current state of affairs:

1. The change to site.py, to include site-packages in sys.path, is in.
2. The change to distutils.sysconfig to change to site-packages, is in.
3. The Windows Installer still needs changes:
   a) site.py should change to export a "sys.extinstallpath" which points to
site-packages
   b) the Windows Installer should use this, rather than the registry

I see no great issue with 3a - it should be a pretty trivial change. Can
someone with access to the sources make it? I attach a suggested patch. Item
3b is the key point - it's pretty critical that the Windows Installer change
to use the new directory, otherwise, most of the work is a waste. I can't do
much about this, as I haven't even seen the source yet. Can someone do
something about this?

On point 3a, sys.extinstallpath should be set for all platforms, but I have
to admit that I don't know what to do for non-Windows platforms. The best I
can suggest is that we do something like

if os.sep == '/':
    sys.extinstallpath = os.path.join(sys.prefix, "lib",
                                      "python" + sys.version[:3],
                                      "site-packages")
else:
    sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages")

which matches the sys.path setting for Unix - but I couldn't really offer
this as a patch, as I don't understand the issues around site-packages vs
site-python on Unix. All I could say is that it's better than leaving
sys.extinstallpath unset on some platforms.

To summarise the summary:

1. The patch to site.py to expose sys.extinstallpath should be made, at
least for Windows.
2. The Windows Installer needs to be updated to use sys.extinstallpath for
Python 2.2 and greater.
3. If the Mac people want, the same can be done for Mac.
4. If the Unix people have a consensus, that should go in too (affects
site.py, bdist_rpm, at least).

As a side benefit, if this goes in, bdist_wininst will start working for
Pythons which don't use the registry (such as the PythonLabs ones).

Sadly, I note that I've just missed the 2.2a1 release. Is anyone likely to
be able to do anything about this prior to 2.2a2? (If someone sends me a
pointer to the wininst sources, I'll look into what's involved in a patch,
assuming no-one else has adequate time).

Paul.

PS As documentation of sys.extinstallpath, I'd suggest something like:

sys.extinstallpath: The directory into which Python extensions should be
installed. This is merely a recommendation - Python will pick up extensions
which are located anywhere along sys.path. However, extension installers
should use this directory by default. The distutils package (and installers
built with it) will use this directory (XXX - currently not true for
bdist_rpm, I guess).

Patch for site.py (point 3a - Windows only, from Thomas Heller)

--- \Applications\Python\lib\site.py.orig    Tue Jun 26 10:07:06 2001
+++ \Applications\Python\lib\site.py   Wed Jul 18 14:43:54 2001
@@ -148,6 +148,9 @@
             if os.path.isdir(sitedir):
                 addsitedir(sitedir)

+if os.sep == '\\': # != '/' if you want to do all except Unix like this...
+    sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages")
+
 # Define new built-ins 'quit' and 'exit'.
 # These are simply strings that display a hint on how to exit.
 if os.sep == ':':


From mal@lemburg.com  Wed Jul 18 15:21:37 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 18 Jul 2001 16:21:37 +0200
Subject: [Python-Dev] PEP: Defining Python Source Code Encodings
Message-ID: <3B559B71.C08C6145@lemburg.com>

Here's an update of the pre-PEP. After this round of comments, the
PEP will be checked into CVS (provided Barry assigns a PEP number,
hi Barry ;-)

--

PEP: 0263 (?)
Title: Defining Python Source Code Encodings
Version: $Revision: 1.2 $
Author: mal@lemburg.com (Marc-André Lemburg)
Status: Draft
Type: Standards Track
Python-Version: 2.3
Created: 06-Jun-2001
Post-History:
Requires: 244

Abstract

    This PEP proposes to introduce a syntax to declare the encoding of
    a Python source file. The encoding information is then used by the
    Python parser to interpret the file using the given encoding. Most
    notably this enhances the interpretation of Unicode literals in
    the source code and makes it possible to write Unicode literals
    using e.g. UTF-8 directly in a Unicode-aware editor.

Problem

    In Python 2.1, Unicode literals can only be written using the
    Latin-1 based encoding "unicode-escape". This makes the
    programming environment rather unfriendly to Python users who live
    and work in non-Latin-1 locales such as many of the Asian
    countries. Programmers can write their 8-bit strings using their
    favourite encoding, but are bound to the "unicode-escape" encoding
    for Unicode literals.

Proposed Solution

    I propose to make the Python source code encoding both visible and
    changeable on a per-source file basis by using a special comment
    at the top of the file to declare the encoding.

    To make Python aware of this encoding declaration a number of
    concept changes are necessary with respect to the handling of
    Python source code data.

Concepts

    The PEP is based on the following concepts which would have to be
    implemented to enable usage of such a magic comment:

    1. The complete Python source file should use a single encoding.
       Embedding of differently encoded data is not allowed and will
       result in a decoding error during compilation of the Python
       source code.

    2. Handling of escape sequences should continue to work as it does
       now, but with all possible source code encodings, that is
       standard string literals (both 8-bit and Unicode) are subject to
       escape sequence expansion while raw string literals only expand
       a very small subset of escape sequences.

    3. Python's tokenizer/compiler combo will need to be updated to
       work as follows:

       1. read the file

       2. decode it into Unicode assuming a fixed per-file encoding

       3. tokenize the Unicode content

       4. compile it, creating Unicode objects from the given Unicode
          data and creating string objects from the Unicode literal data
          by first reencoding the Unicode data into 8-bit string data
          using the given file encoding

       5. variable names and other identifiers will be reencoded into
          8-bit strings using the file encoding to assure backward
          compatibility with the existing implementation

          ISSUE:

              Should we restrict identifiers to ASCII ?

       To make this backwards compatible, the implementation would have
       to assume Latin-1 as the original file encoding if not given
       (otherwise, binary data currently stored in 8-bit strings wouldn't
       make the roundtrip).
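
    The decode and re-encode steps above can be sketched as follows.
    This is only an illustration in present-day Python 3, not the
    proposed implementation; the function name roundtrip_source is
    made up for the example. It shows why Latin-1 is a safe default:
    every byte value maps to exactly one code point, so arbitrary
    binary data survives the roundtrip, which is not true of UTF-8.

```python
def roundtrip_source(raw, encoding="latin-1"):
    """Decode source bytes to Unicode, then re-encode them to 8-bit data.

    Hypothetical helper: mirrors step 2 (decode with a fixed per-file
    encoding) and step 4 (re-encode for 8-bit string objects).
    """
    text = raw.decode(encoding)        # step 2: bytes -> Unicode
    reencoded = text.encode(encoding)  # step 4: Unicode -> 8-bit string data
    return text, reencoded


# Every possible byte value, standing in for binary data in an 8-bit string.
binary = bytes(range(256))

text, back = roundtrip_source(binary)  # Latin-1 default
print(back == binary)

try:
    roundtrip_source(binary, "utf-8")
except UnicodeDecodeError as exc:
    # Arbitrary binary data is not valid UTF-8, so decoding fails.
    print("utf-8 cannot decode it:", exc.reason)
```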

Comment Syntax

    The magic comment will use the following syntax. It will have to
    appear as first or second line in the Python source file.

    ISSUE:

        Possible choices for the format:

        1. Emacs style:

          #!/usr/bin/python
          # -*- coding: utf-8; -*-

        2. Via a pseudo-option to the interpreter (one which is not used
           by the interpreter):

          #!/usr/bin/python --encoding=utf-8

        3. Using a special comment format:

          #!/usr/bin/python
          #!encoding = 'utf-8'

        4. XML-style format:

          #!/usr/bin/python
          #?python encoding = 'utf-8'

    Usage of a new keyword "directive" (see PEP 244) for this purpose
    has been proposed, but was put aside due to PEP 244 not being
    widely accepted (yet).
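
    As a rough sketch of how a tool might look for the magic comment,
    assuming the Emacs style (choice 1) were picked: the pattern
    CODING_RE and the function declared_encoding below are
    hypothetical, written in present-day Python 3, and only the first
    two lines are examined, as required above. The Latin-1 default
    follows the backwards-compatibility assumption under Concepts.

```python
import re

# Hypothetical pattern covering only the Emacs-style declaration.
CODING_RE = re.compile(r"^#.*?coding[:=]\s*([-\w.]+)")


def declared_encoding(source_lines, default="latin-1"):
    """Return the encoding named in the first or second line, else default."""
    for line in source_lines[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return default


example = [
    "#!/usr/bin/python",
    "# -*- coding: utf-8; -*-",
    "print('hello')",
]
print(declared_encoding(example))
```

    A declaration on the third line or later is ignored by this
    sketch, so such a file would be treated as Latin-1.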

Scope

    This PEP only affects Python source code which makes use of the
    proposed magic comment. Without the magic comment in the proposed
    position, Python will treat the source file as it does currently
    to maintain backwards compatibility.

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From Paul.Moore@atosorigin.com  Wed Jul 18 15:21:03 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 18 Jul 2001 15:21:03 +0100
Subject: [Python-Dev] RE: [Distutils] PEP 250: Summary of comments
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AF16@UKRUX002.rundc.uk.origin-it.com>

From: Moore, Paul [mailto:Paul.Moore@atosorigin.com]
> Having waited a few days to let the dust settle, I believe that the
> following is the current state of affairs:
>
> 1. The change to site.py, to include site-packages in sys.path, is in.
> 2. The change to distutils.sysconfig to change to site-packages, is in.

Urk. I just downloaded 2.2a1, and the sysconfig.py change isn't in :-(

Attached are a couple of possible patches - one which just tweaks Windows
(os.name is 'nt' for Win9x???) and another which tries to slim down on the
platform-specific special casing, but which may have larger effects on
non-Windows platforms.

Can someone put one of these in?

Thanks,
Paul.

Trivial version of the change:

--- sysconfig.py.orig        Sat Jul 07 23:55:28 2001
+++ sysconfig.py   Wed Jul 18 15:13:45 2001
@@ -89,7 +89,7 @@
         if standard_lib:
             return os.path.join(PREFIX, "Lib")
         else:
-            return prefix
+            return os.path.join(PREFIX, "Lib", "site-packages")

     elif os.name == "mac":
         if plat_specific:

Better version, which covers all platforms (but which removes a couple of
"OK, where DO site-specific modules go on the Mac?" errors, which may be
glossing over an issue...) is

--- sysconfig.py.orig	Sat Jul 07 23:55:28 2001
+++ sysconfig.py	Wed Jul 18 15:18:55 2001
@@ -80,35 +80,23 @@
     if os.name == "posix":
         libpython = os.path.join(prefix,
                                  "lib", "python" + sys.version[:3])
-        if standard_lib:
-            return libpython
-        else:
-            return os.path.join(libpython, "site-packages")
-
     elif os.name == "nt":
-        if standard_lib:
-            return os.path.join(PREFIX, "Lib")
-        else:
-            return prefix
-
+        libpython = os.path.join(PREFIX, "Lib")
     elif os.name == "mac":
         if plat_specific:
-            if standard_lib:
-                return os.path.join(EXEC_PREFIX, "Mac", "Plugins")
-            else:
-                raise DistutilsPlatformError, \
-                      "OK, where DO site-specific extensions go on the Mac?"
+            libpython = os.path.join(EXEC_PREFIX, "Mac", "Plugins")
         else:
-            if standard_lib:
-                return os.path.join(PREFIX, "Lib")
-            else:
-                raise DistutilsPlatformError, \
-                      "OK, where DO site-specific modules go on the Mac?"
+            libpython = os.path.join(PREFIX, "Lib")
     else:
         raise DistutilsPlatformError, \
               ("I don't know where Python installs its library " +
                "on platform '%s'") % os.name
 
+    if standard_lib:
+        return libpython
+    else:
+        return os.path.join(libpython, "site-packages")
+
 # get_python_lib()
         


From akuchlin@mems-exchange.org  Wed Jul 18 15:23:41 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 18 Jul 2001 10:23:41 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <200107172353.TAA21022@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jul 17, 2001 at 07:53:36PM -0400
References: <200107161515.f6GFF1h03790@mira.informatik.hu-berlin.de> <00ce01c10e97$99ee0cd0$4ffa42d5@hagrid> <200107172305.f6HN57T01730@mira.informatik.hu-berlin.de> <200107172353.TAA21022@cj20424-a.reston1.va.home.com>
Message-ID: <20010718102341.B16348@ute.cnri.reston.va.us>

On Tue, Jul 17, 2001 at 07:53:36PM -0400, Guido van Rossum wrote:
>> the CVS - I just would need permission to do so (and an advise whether
>> to stuff everything into Modules, or to create an expat subdirectory).
>
>If Fred (Drake) approves, that's fine with me.  Negotiate the details
>with him.

Note that the last time this idea came up, the issue of
version mismatches was raised.  What if the platform has a newer
version of Expat?  What if you have an extension module for a library
that also links with Expat internally?

--amk



From thomas@xs4all.net  Wed Jul 18 15:24:40 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 18 Jul 2001 16:24:40 +0200
Subject: [Python-Dev] Python 2.1.1 & Mac/
Message-ID: <20010718162439.E2054@xs4all.nl>

When I updated the 2.1.1 tree this morning, I noticed it checked out the
entire Mac/ subtree... As it wasn't part of 2.1, I don't think it should be
part of 2.1.1, and I don't remember seeing any add's for it (at least not
with the release21-maint tag.) Jack/Just, did either of you add it
explicitly ? If so, there's something wrong with the checkin messages for
those adds. If not, it's likely something went wrong with Guido's attempt to
add the date-snapshot/descr-branch tags to all files :P In that case, I
guess we'll have to manually exclude the Mac/ tree from the source tarball of
the final release, next Friday.


-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From fdrake@acm.org  Wed Jul 18 15:26:26 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 18 Jul 2001 10:26:26 -0400 (EDT)
Subject: [Python-Dev] PEP 250: Summary of comments
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <15189.40082.973330.760541@cj42289-a.reston1.va.home.com>

Moore, Paul writes:
 > On point 3a, sys.extinstallpath should be set for all platforms, but I have
 > to admit that I don't know what to do for non-Windows platforms. The best I
 > can suggest is that we do something like
 > 
 > if os.sep == '/':
 >     sys.extinstallpath = os.path.join(sys.prefix, "lib", "python" +
 > sys.version[:3], "site-packages")
 > else:
 >     sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages")

  There's one aspect that doesn't appear to have been addressed for
Unix: there are two reasonable values for extinstallpath.  In
multi-architecture installations, where the Python portions of the
library are shared among architectures, there are two site-packages
directories:

	$prefix/lib/pythonX.Y/site-packages/

and

	$exec_prefix/lib/pythonX.Y/site-packages/

  When $prefix and $exec_prefix are the same, this isn't an issue, but
it is a problem for multi-platform installations.
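
  To make the two cases concrete, here is a small sketch in
present-day Python 3. The function site_packages_dirs and the example
prefixes are made up for illustration; posixpath is used so the result
looks the same on any platform.

```python
import posixpath


def site_packages_dirs(prefix, exec_prefix, version):
    """Return the candidate site-packages directories, without duplicates.

    Hypothetical helper: the first entry holds pure-Python code shared
    between architectures, the second holds platform-specific extensions.
    """
    candidates = [
        posixpath.join(prefix, "lib", "python" + version, "site-packages"),
        posixpath.join(exec_prefix, "lib", "python" + version, "site-packages"),
    ]
    unique = []
    for path in candidates:
        if path not in unique:
            unique.append(path)
    return unique


# Single-architecture install: one directory, so no ambiguity.
print(site_packages_dirs("/usr/local", "/usr/local", "2.2"))

# Multi-architecture install: two directories, so a single
# extinstallpath value cannot name both.
print(site_packages_dirs("/usr/share", "/usr/arch", "2.2"))
```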


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From guido@digicool.com  Wed Jul 18 15:36:07 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 18 Jul 2001 10:36:07 -0400
Subject: [Python-Dev] RE: [Distutils] PEP 250: Summary of comments
In-Reply-To: Your message of "Wed, 18 Jul 2001 15:21:03 BST."
 <714DFA46B9BBD0119CD000805FC1F53B01B5AF16@UKRUX002.rundc.uk.origin-it.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF16@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <200107181436.KAA28132@cj20424-a.reston1.va.home.com>

> Can someone put one of these in?

Don't count on me -- I haven't followed this discussion at all.  Sorry.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jul 18 15:41:55 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 18 Jul 2001 10:41:55 -0400
Subject: [Python-Dev] Python 2.1.1 & Mac/
In-Reply-To: Your message of "Wed, 18 Jul 2001 16:24:40 +0200."
 <20010718162439.E2054@xs4all.nl>
References: <20010718162439.E2054@xs4all.nl>
Message-ID: <200107181441.KAA28173@cj20424-a.reston1.va.home.com>

> When I updated the 2.1.1 tree this morning, I noticed it checked out
> the entire Mac/ subtree... As it wasn't part of 2.1, I don't think
> it should be part of 2.1.1, and I don't remember seeing any add's
> for it (at least not with the release21-maint tag.) Jack/Just, did
> either of you add it explicitly ? If so, there's something wrong
> with the checkin messages for those adds. If not, it's likely
> something went wrong with Guido's attempt to add the
> date-snapshot/descr-branch tags to all files :P In that case, I
> guess we'll have to manually exclude the Mac/ tree from the source
> tarball of the final release, next friday.

I don't recall doing this, but it's possible that it happened this
way.  "cvs tag" operations don't create email notifications!  E.g. on
Mac/Relnotes, I see that the release21-maint tag is set.

If Jack&Just agree, the best way to go about this (I think) is to
execute the command "cvs tag -d release21-maint" in the Mac branch, to
delete the tag.  Make very sure to only do this in the Mac branch!!!

On the other hand, it could be that Jack/Just intend to release a
2.1.1 version of MacPython, and then the tagging is correct.  But I
would still exclude the Mac tree from the 2.1.1 source distribution.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Paul.Moore@atosorigin.com  Wed Jul 18 15:41:05 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 18 Jul 2001 15:41:05 +0100
Subject: [Python-Dev] PEP 250: Summary of comments
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AF17@UKRUX002.rundc.uk.origin-it.com>

From: Fred L. Drake, Jr. [mailto:fdrake@acm.org]
> There's one aspect that doesn't appear to have been addressed for
> Unix: there are two reasonable values for extinstallpath.  In
> multi-architecture installations, where the Python portions of the
> library are shared among architectures, there are two site-packages
> directories:

I agree entirely. I have no knowledge of the issues on Unix, and hence I can
make no comment.

As far as I am concerned, I feel very strongly that this should go in for
Windows (where I believe that the current use of a "bare" sys.prefix is
wrong), but I have no view at all on other platforms. My PEP was originally
entitled "Using site-packages on Windows" - I am happy if the scope gets
extended, but don't rely on me to do it - and please don't let the key point
(for me), which is fixing Windows, get sidetracked by issues for Unix, where
(as far as I know) the current status quo is perfectly acceptable to the
majority of users.

So I'd vote to leave sys.extinstallpath undefined except on Windows, and
leave the other platforms for a new PEP.

Paul.


From mal@lemburg.com  Wed Jul 18 16:00:41 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 18 Jul 2001 17:00:41 +0200
Subject: [Distutils] Re: [Python-Dev] PEP 250: Summary of comments
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com> <15189.40082.973330.760541@cj42289-a.reston1.va.home.com>
Message-ID: <3B55A499.42F55B5A@lemburg.com>

"Fred L. Drake, Jr." wrote:
> 
> Moore, Paul writes:
>  > On point 3a, sys.extinstallpath should be set for all platforms, but I have
>  > to admit that I don't know what to do for non-Windows platforms. The best I
>  > can suggest is that we do something like
>  >
>  > if os.sep == '/':
>  >     sys.extinstallpath = os.path.join(sys.prefix, "lib", "python" +
>  > sys.version[:3], "site-packages")
>  > else:
>  >     sys.extinstallpath = os.path.join(sys.prefix, "lib", "site-packages")
> 
>   There's one aspect that doesn't appear to have been addressed for
> Unix: there are two reasonable values for extinstallpath.  In
> multi-architecture installations, where the Python portions of the
> library are shared among architectures, there are two site-packages
> directories:
> 
>         $prefix/lib/pythonX.Y/site-packages/
> 
> and
> 
>         $exec_prefix/lib/pythonX.Y/site-packages/
> 
>   When $prefix and $exec_prefix are the same, this isn't an issue, but
> for this is a problem for multi-platform installations.

I don't think this is an issue since distutils already knows
that extension packages live in .../site-packages on Unix. 

The Windows install and unix_home are the only ones which copy
the files into non-standard dirs (Unix seems to be the only
target which supports multi-platform installs out-of-the-box):

[taken from distutils.command.install]
"""
INSTALL_SCHEMES = {
    'unix_prefix': {
        'purelib': '$base/lib/python$py_version_short/site-packages',
        'platlib': '$platbase/lib/python$py_version_short/site-packages',
        'headers': '$base/include/python$py_version_short/$dist_name',
        'scripts': '$base/bin',
        'data'   : '$base',
        },
    'unix_home': {
        'purelib': '$base/lib/python',
        'platlib': '$base/lib/python',
        'headers': '$base/include/python/$dist_name',
        'scripts': '$base/bin',
        'data'   : '$base',
        },
    'nt': {
        'purelib': '$base',
        'platlib': '$base',
        'headers': '$base/Include/$dist_name',
        'scripts': '$base/Scripts',
        'data'   : '$base',
        },
    'mac': {
        'purelib': '$base/Lib/site-packages',
        'platlib': '$base/Lib/site-packages',
        'headers': '$base/Include/$dist_name',
        'scripts': '$base/Scripts',
        'data'   : '$base',
        }
    }
"""

Paul, note that your patches don't even touch install.py -- are 
you sure that the patch to sysconfig.py suffices to have distutils
install the extensions into site-packages on Windows ?

(I believe that install.py would have to be told about
sys.extinstallpath too and that it should fall back to the
defaults given in the install schemes if it is not set.)
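
The fallback could look roughly like the following sketch, written in
present-day Python 3. Note that sys.extinstallpath is only the
attribute proposed in this thread and is not assumed to exist; the
function name and its sys_module parameter are made up so the
behaviour can be shown without modifying the real sys module.

```python
import sys
import types


def extension_install_dir(scheme_default, sys_module=sys):
    """Prefer the proposed sys.extinstallpath, else the scheme default.

    Hypothetical helper; 'extinstallpath' is the attribute proposed in
    this thread and may well be absent from the interpreter.
    """
    return getattr(sys_module, "extinstallpath", scheme_default)


# Interpreter without the attribute: the install scheme default wins.
plain = types.SimpleNamespace()
print(extension_install_dir("C:/Python22", plain))

# Interpreter that exports the attribute: the installer follows it.
patched = types.SimpleNamespace(extinstallpath="C:/Python22/Lib/site-packages")
print(extension_install_dir("C:/Python22", patched))
```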

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From fdrake@acm.org  Wed Jul 18 16:06:00 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 18 Jul 2001 11:06:00 -0400 (EDT)
Subject: [Distutils] Re: [Python-Dev] PEP 250: Summary of comments
In-Reply-To: <3B55A499.42F55B5A@lemburg.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com>
 <15189.40082.973330.760541@cj42289-a.reston1.va.home.com>
 <3B55A499.42F55B5A@lemburg.com>
Message-ID: <15189.42456.461317.848018@cj42289-a.reston1.va.home.com>

M.-A. Lemburg writes:
 > I don't think this is an issue since distutils already knows
 > that extension packages live in .../site-packages on Unix.

  Frankly, I'm not convinced that there's a need for extinstallpath.
Why not define INSTALL_SCHEMES like this:

if sys.version < "2.2":
    WINDOWS_SCHEME = {
        'purelib': '$base',
        'platlib': '$base',
        'headers': '$base/Include/$dist_name',
        'scripts': '$base/Scripts',
        'data'   : '$base',
        }
else:
    WINDOWS_SCHEME = {
        'purelib': '$base/Lib/site-packages',
        'platlib': '$base/Lib/site-packages',
        'headers': '$base/Include/$dist_name',
        'scripts': '$base/Scripts',
        'data'   : '$base',
        }

INSTALL_SCHEMES = {
    'nt': WINDOWS_SCHEME,
    ...
    }
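
One caveat with the test above: sys.version is a string, so the comparison
is lexicographic and a hypothetical "2.10" would sort before "2.2". A
minimal sketch of the same selection using version tuples instead; the
helper name windows_scheme is an assumption, not part of distutils.

```python
import sys

OLD_WINDOWS_SCHEME = {
    'purelib': '$base',
    'platlib': '$base',
    'headers': '$base/Include/$dist_name',
    'scripts': '$base/Scripts',
    'data': '$base',
}

NEW_WINDOWS_SCHEME = {
    'purelib': '$base/Lib/site-packages',
    'platlib': '$base/Lib/site-packages',
    'headers': '$base/Include/$dist_name',
    'scripts': '$base/Scripts',
    'data': '$base',
}


def windows_scheme(version_info=sys.version_info):
    """Hypothetical helper: pick the 'nt' scheme by comparing version tuples."""
    if tuple(version_info[:2]) < (2, 2):
        return OLD_WINDOWS_SCHEME
    return NEW_WINDOWS_SCHEME


INSTALL_SCHEMES = {
    'nt': windows_scheme(),
}
```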


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From loewis@informatik.hu-berlin.de  Wed Jul 18 16:18:07 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Wed, 18 Jul 2001 17:18:07 +0200 (MEST)
Subject: [Python-Dev] Freeze hacks
Message-ID: <200107181518.RAA05425@pandora.informatik.hu-berlin.de>

A number of modules in the standard library make use of dynamic
imports, or import modules through C code. In either case, no import
statement can be found.

Unfortunately, this means that tools like freeze or py2exe cannot
detect that those modules are used, so the frozen applications will
then fail at runtime. To make this work, I suggest to add explicit
import statements, which are put into a conditional 'if 0:'.
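
To illustrate the problem, here is a minimal sketch of a bytecode scan for
imports. It is not the code freeze or py2exe actually use; the function name
static_imports and the two sample sources are assumptions. The sources are
only compiled, never executed.

```python
import dis
import types


def static_imports(source):
    """Return the module names referenced by IMPORT_NAME opcodes in source."""
    found = set()
    pending = [compile(source, "<sketch>", "exec")]
    while pending:
        code = pending.pop()
        for instr in dis.get_instructions(code):
            if instr.opname == "IMPORT_NAME":
                found.add(instr.argval)
        # Function and class bodies are nested code objects in co_consts.
        pending.extend(c for c in code.co_consts
                       if isinstance(c, types.CodeType))
    return sorted(found)


# A dynamic import: the module name is only data, so the scan sees nothing.
DYNAMIC = '''
default_parser_list = ["xml.sax.expatreader"]
for name in default_parser_list:
    __import__(name)
'''

# An explicit import statement is visible to the scan.
EXPLICIT = '''
import xml.sax.expatreader
'''

print(static_imports(DYNAMIC))
print(static_imports(EXPLICIT))
```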

In particular, I found that the following modules need to be
referenced somewhere:
- xml.sax.expatreader, from xml.sax.__init__
- encodings.__init__, probably from codecs
- encodings.*, from encodings.__init__
- dbhash, gdbm, dbm, dumbdbm, from anydbm
- unixccompiler, msvccompiler, cygwinccompiler,
  bcppcompiler, mwerkscompiler, from distutils.ccompiler
- distutils.command.* from distutils.dist

What is the purpose of dumbdbm not importing os directly?

To give a specific example, I'd change xml.sax.__init__ to read

default_parser_list = ["xml.sax.expatreader"]
if 0:
    # freeze hack: the import relationship is not visible without this
    # statement
    import xml.sax.expatreader

Is that a desirable change? If so, I'll produce a patch.

The case of encodings is particularly troubling: I don't think there
is a way to tell freeze/py2exe/installer that

print u"Hallo".encode("iso8859-2")

will require additional modules. As a convention, I'd still recommend
to link all this to codecs, so that an application requiring any
codecs can do

if 0:
    import codecs

explicitly, or just tell the freeze tool to use codecs, and then will
get all codecs that are known statically.
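
A minimal sketch of why the encodings case is invisible to such tools: the
codec module is loaded as a side effect of the encode call. The module name
encodings.iso8859_2 is what CPython is expected to load for this codec; treat
that as an assumption about the implementation rather than a documented
guarantee.

```python
import sys

# Module CPython is expected to import lazily for this codec (assumption).
MODULE = "encodings.iso8859_2"

loaded_before = MODULE in sys.modules

# No import statement is involved; the codec registry finds the module by name.
encoded = "Hallo".encode("iso8859-2")

loaded_after = MODULE in sys.modules
print(loaded_before, loaded_after)
```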

Regards,
Martin


From thomas.heller@ion-tof.com  Wed Jul 18 16:32:37 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 18 Jul 2001 17:32:37 +0200
Subject: [Python-Dev] Freeze hacks
References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de>
Message-ID: <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook>

From: "Martin von Loewis" <loewis@informatik.hu-berlin.de>
> A number of modules in the standard library make use of dynamic
> imports, or import modules through C code. In either case, no import
> statement can be found.
> 
> Unfortunately, this means that tools like freeze or py2exe cannot
> detect that those modules are used, so the frozen applications will
> then fail at runtime. To make this work, I suggest to add explicit
> import statements, which are put into a conditional 'if 0:'.
Very good idea IMO, but 'if 0:' is optimized away.
This one works:

_FAKE=0
if _FAKE:
     import whatever

Thomas



From Paul.Moore@atosorigin.com  Wed Jul 18 16:52:02 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 18 Jul 2001 16:52:02 +0100
Subject: [Distutils] Re: [Python-Dev] PEP 250: Summary of comments
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AF1B@UKRUX002.rundc.uk.origin-it.com>

From: M.-A. Lemburg [mailto:mal@lemburg.com]
> (I believe that install.py would have to be told about
> sys.extinstallpath too and that it should fallback to the
> defaults given in the install schemes if it is not set.)

Hmm, browsing this a bit more, I'm getting further confused. The cause of
this is the INSTALL_SCHEMES stuff, which has a purelib/platlib distinction,
which is only used on unix_prefix (all other cases use the same value for
both of these). I can't see how sys.extinstallpath relates - I could use it
as default for both purelib and platlib, but that somewhat defeats the point
of having the two. Does this imply that sys.extinstallpath should be split
into two parts (pure & plat)? I can't comment, as this is a Unix-only thing.

This is getting silly. I feel that the correct approach is to go back to my
original stance, of *only* changing Windows behaviour - leave the Unix and
Mac camps as they are. With that in mind, sys.extinstallpath seems like an
overgeneralisation, and the attached patch does everything bar handle
bdist_wininst. The Windows Installer should then do the same thing - load
Python, and generate os.path.join(sys.prefix, "lib", "site-packages") as the
destination directory. OK, so the same thing is hard-coded in four places,
but this whole area is rife with duplicated code, and fixing that issue is
way outside the scope of PEP 250.
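
A minimal sketch of the destination directory described above. It uses
ntpath and a literal prefix so that it gives the Windows-style result on any
platform; on a real Windows install the prefix would be sys.prefix and
os.path.join would do the same job. The helper name windows_site_packages is
an assumption.

```python
import ntpath


def windows_site_packages(prefix):
    """Hypothetical helper: the install directory PEP 250 proposes on Windows."""
    return ntpath.join(prefix, "lib", "site-packages")


# Sample prefix for illustration; normally this would be sys.prefix.
target = windows_site_packages(r"C:\Python22")
print(target)
```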

For the limited purpose of making site-packages appear in sys.path, and
making python setup.py install install to site-packages, the attached patch
works. I've only tested it on a simple Python module, but that's all I have
to hand. I can try some C modules tonight when I get home, but I see no
reason why they wouldn't work as well. The patch is pretty much trivial,
which (IMHO) is very much in its favour as Python 2.2a1 is already out...

Unless someone comes up with a *very strong* argument as to why I should be
going further than this, I would like to request that this goes into Python
as it stands. If someone can supply the source of the bdist_wininst
installer, I will make a corresponding change to that.

I will NOT make any changes which affect Unix, or Mac platforms. I don't
know the issues. If someone wants to supply a patch which does this, I'll be
happy to see it go in, and I am quite comfortable with it going under the
banner of PEP 250, but I will not get involved in the issues - I simply am
not qualified to comment.

Paul.

------------------------------------------------------

diff -u site.py.orig site.py
--- site.py.orig	Tue Jun 26 10:07:06 2001
+++ site.py	Wed Jul 18 16:33:37 2001
@@ -143,7 +143,7 @@
         elif os.sep == ':':
             sitedirs = [makepath(prefix, "lib", "site-packages")]
         else:
-            sitedirs = [prefix]
+            sitedirs = [prefix, makepath(prefix, "lib", "site-packages")]
         for sitedir in sitedirs:
             if os.path.isdir(sitedir):
                 addsitedir(sitedir)
diff -u distutils\sysconfig.py.orig distutils\sysconfig.py
--- distutils\sysconfig.py.orig	Thu Apr 19 10:24:24 2001
+++ distutils\sysconfig.py	Wed Jul 18 16:20:20 2001
@@ -87,7 +87,7 @@
 
     elif os.name == "nt":
         if standard_lib:
-            return os.path.join(PREFIX, "Lib")
+            return os.path.join(PREFIX, "Lib", "site-packages")
         else:
             return prefix
 
diff -u distutils\command\install.py.orig distutils\command\install.py
--- distutils\command\install.py.orig	Thu Apr 19 10:24:24 2001
+++ distutils\command\install.py	Wed Jul 18 16:29:29 2001
@@ -31,8 +31,8 @@
         'data'   : '$base',
         },
     'nt': {
-        'purelib': '$base',
-        'platlib': '$base',
+        'purelib': '$base/Lib/site-packages',
+        'platlib': '$base/Lib/site-packages',
         'headers': '$base/Include/$dist_name',
         'scripts': '$base/Scripts',
         'data'   : '$base',


From loewis@informatik.hu-berlin.de  Wed Jul 18 16:53:41 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Wed, 18 Jul 2001 17:53:41 +0200 (MEST)
Subject: [Python-Dev] Freeze hacks
In-Reply-To: <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook>
 (thomas.heller@ion-tof.com)
References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook>
Message-ID: <200107181553.RAA19593@pandora.informatik.hu-berlin.de>

> Very good idea IMO, but 'if 0:' is optimized away.

I'm not sure I understand. freeze does not optimize away such a code
block. Under which condition is that optimized away?

Regards,
Martin


From loewis@informatik.hu-berlin.de  Wed Jul 18 17:01:05 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Wed, 18 Jul 2001 18:01:05 +0200 (MEST)
Subject: [Python-Dev] Bumping the API version
Message-ID: <200107181601.SAA20887@pandora.informatik.hu-berlin.de>

I'm about to commit patch #412229, which will add an additional field at
the end of PyInterpreterState if HAVE_DLOPEN is defined. Do I need to
bump the API version for that?

Regards,
Martin


From thomas.heller@ion-tof.com  Wed Jul 18 17:10:24 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 18 Jul 2001 18:10:24 +0200
Subject: [Python-Dev] Freeze hacks
References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de>
Message-ID: <013e01c10fa4$27376700$e000a8c0@thomasnotebook>

From: "Martin von Loewis" <loewis@informatik.hu-berlin.de>
> > Very good idea IMO, but 'if 0:' is optimized away.
> 
> I'm not sure I understand. freeze does not optimize away such a code
> block. Under which condition is that optimized away?
> 
The Python compiler itself. 'if 0: import whatever' does
not generate any byte code. Modulefinder (used by freeze,
py2exe, and Gordon's installer) checks the compiled byte code
for import statements.

Thomas

C:\Python21>c:\python21\python.exe
ActivePython 2.1, build 211 (ActiveState)
based on Python 2.1 (#15, Jun 18 2001, 21:42:28) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> _FAKE=0
>>> def f():
...   if 0:
...     import win32api
...
>>> def g():
...   if _FAKE:
...     import win32api
...
>>> dis.dis(f)
          0 SET_LINENO               1

          3 SET_LINENO               2
          6 LOAD_CONST               0 (None)
          9 RETURN_VALUE
>>> dis.dis(g)
          0 SET_LINENO               1

          3 SET_LINENO               2
          6 LOAD_GLOBAL              0 (_FAKE)
          9 JUMP_IF_FALSE           16 (to 28)
         12 POP_TOP

         13 SET_LINENO               3
         16 LOAD_CONST               0 (None)
         19 IMPORT_NAME              1 (win32api)
         22 STORE_FAST               0 (win32api)
         25 JUMP_FORWARD             1 (to 29)
    >>   28 POP_TOP
    >>   29 LOAD_CONST               0 (None)
         32 RETURN_VALUE
>>> ^Z



From dubois1@llnl.gov  Wed Jul 18 17:19:59 2001
From: dubois1@llnl.gov (Paul F. Dubois)
Date: Wed, 18 Jul 2001 09:19:59 -0700
Subject: [Python-Dev] 2.2a1 and Numerical
Message-ID: <01071809231800.14475@almanac>

Numerical Python 20.1 compiles and passes all its tests with 2.2a1.

It is about 10% slower than 2.1 both with and without optimization in
executing the test suite. That test suite uses PyUnit and most of the
operations involve small arrays so I suspect most of its time is in Python
not the C code for Numeric.


From guido@digicool.com  Wed Jul 18 17:32:03 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 18 Jul 2001 12:32:03 -0400
Subject: [Python-Dev] Freeze hacks
In-Reply-To: Your message of "Wed, 18 Jul 2001 17:18:07 +0200."
 <200107181518.RAA05425@pandora.informatik.hu-berlin.de>
References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de>
Message-ID: <200107181632.MAA28401@cj20424-a.reston1.va.home.com>

> - unixccompiler, msvccompiler, cygwinccompiler,
>   bcppcompiler, mwerkscompiler, from distutils.ccompiler
> - distutils.command.* from distutils.dist

I don't expect frozen programs to use distutils.

> What is the purpose of dumbdbm not importing os directly?

I'm afraid that it's just an old way of spelling

    import os as _os

With the intention of not exporting anything undesired on "from
dumbdbm import *".  Feel free to fix it.

> To give a specific example, I'd change xml.sax.__init__ to read
> 
> default_parser_list = ["xml.sax.expatreader"]
> if 0:
>     # freeze hack: the import relationship is not visible without this
>     # statement
>     import xml.sax.expatreader
> 
> Is that a desirable change? If so, I'll produce a patch.

Sounds good to me.

> The case of encodings is particularly troubling: I don't think there
> is a way to tell freeze/py2exe/installer that
> 
> print u"Hallo".encode("iso8859-2")
> 
> will require additional modules. As a convention, I'd still recommend
> to link all this to codecs, so that an application requiring any
> codecs can do
> 
> if 0:
>     import codecs
> 
> explicitly, or just tell the freeze tool to use codecs, and then will
> get all codecs that are known statically.

Won't this create enormously bloated frozen binaries?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From loewis@informatik.hu-berlin.de  Wed Jul 18 17:29:27 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Wed, 18 Jul 2001 18:29:27 +0200 (MEST)
Subject: [Python-Dev] Freeze hacks
In-Reply-To: <013e01c10fa4$27376700$e000a8c0@thomasnotebook>
 (thomas.heller@ion-tof.com)
References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de> <013e01c10fa4$27376700$e000a8c0@thomasnotebook>
Message-ID: <200107181629.SAA29368@pandora.informatik.hu-berlin.de>

> The Python compiler itself. 'if 0: import whatever' does
> not generate any byte code. Modulefinder (used by freeze,
> py2exe, and Gordon's installer) checks the compiled byte code
> for import statements.

Ah, thanks for the explanation. I'll consider it in my patch.

Regards,
Martin


From guido@digicool.com  Wed Jul 18 17:42:23 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 18 Jul 2001 12:42:23 -0400
Subject: [Python-Dev] Bumping the API version
In-Reply-To: Your message of "Wed, 18 Jul 2001 18:01:05 +0200."
 <200107181601.SAA20887@pandora.informatik.hu-berlin.de>
References: <200107181601.SAA20887@pandora.informatik.hu-berlin.de>
Message-ID: <200107181642.MAA28484@cj20424-a.reston1.va.home.com>

> I'm about to commit patch #412229, which will add an additional field at
> the end of PyInterpreterState if HAVE_DLOPEN is defined. Do I need to
> bump the API version for that?

I don't think so, since PyInterpreterState objects are always
allocated by Python, not by 3rd party code.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jul 18 17:44:40 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 18 Jul 2001 12:44:40 -0400
Subject: [Python-Dev] Freeze hacks
In-Reply-To: Your message of "Wed, 18 Jul 2001 18:10:24 +0200."
 <013e01c10fa4$27376700$e000a8c0@thomasnotebook>
References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de>
 <013e01c10fa4$27376700$e000a8c0@thomasnotebook>
Message-ID: <200107181644.MAA28513@cj20424-a.reston1.va.home.com>

> > > Very good idea IMO, but 'if 0:' is optimized away.
> > 
> > I'm not sure I understand. freeze does not optimize away such a code
> > block. Under which condition is that optimized away?
> > 
> The Python compiler itself. 'if 0: import whatever' does
> not generate any byte code. Modulefinder (used by freeze,
> py2exe, and Gordon's installer) checks the compiled byte code
> for import statements.

Good catch, Thomas.

I find defining a variable _FAKE a bit cumbersome as a work-around.  I
would suggest instead:

    if 1==0:
        import whatever

since the optimizer only optimizes out "if 0:".

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jul 18 17:46:33 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 18 Jul 2001 12:46:33 -0400
Subject: [Python-Dev] 2.2a1 and Numerical
In-Reply-To: Your message of "Wed, 18 Jul 2001 09:19:59 PDT."
 <01071809231800.14475@almanac>
References: <01071809231800.14475@almanac>
Message-ID: <200107181646.MAA28536@cj20424-a.reston1.va.home.com>

> Numerical Python 20.1 compiles and passes all its tests with 2.2a1.
> 
> It is about 10% slower than 2.1 both with and without optimization
> in executing the test suite. That test suite uses PyUnit and most of
> the operations involve small arrays so I suspect most of its time is
> in Python not the C code for Numeric.

Thanks for the report, Paul!

We'll definitely put performance on the agenda for 2.2; it's already
got our attention since Jim Fulton posted some Zope benchmarks in an
internal list at Digital Creations...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Wed Jul 18 17:50:34 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 18 Jul 2001 18:50:34 +0200
Subject: [Python-Dev] Freeze hacks
References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de>              <013e01c10fa4$27376700$e000a8c0@thomasnotebook>  <200107181644.MAA28513@cj20424-a.reston1.va.home.com>
Message-ID: <01fb01c10fa9$c3c4d170$e000a8c0@thomasnotebook>

From: "Guido van Rossum" <guido@digicool.com>
> > > > Very good idea IMO, but 'if 0:' is optimized away.
> > > 
> > > I'm not sure I understand. freeze does not optimize away such a code
> > > block. Under which condition is that optimized away?
> > > 
> > The Python compiler itself. 'if 0: import whatever' does
> > not generate any byte code. Modulefinder (used by freeze,
> > py2exe, and Gordon's installer) checks the compiled byte code
> > for import statements.
> 
> Good catch, Thomas.
> 
> I find defining a variable _FAKE a bit cumbersome as a work-around.  I
> would suggest instead:
> 
>     if 1==0:
>         import whatever
> 
> since the optimizer only optimizes out "if 0:".
In case the optimizer becomes more intelligent in the future, a safer
approach (and one whose purpose is probably easier to document) would be
to use an (uncalled) function:

def _freeze_hints():
    import spam

Just another idea,

Thomas




From fdrake@acm.org  Wed Jul 18 17:54:05 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 18 Jul 2001 12:54:05 -0400 (EDT)
Subject: [Python-Dev] Freeze hacks
In-Reply-To: <01fb01c10fa9$c3c4d170$e000a8c0@thomasnotebook>
References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de>
 <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook>
 <200107181553.RAA19593@pandora.informatik.hu-berlin.de>
 <013e01c10fa4$27376700$e000a8c0@thomasnotebook>
 <200107181644.MAA28513@cj20424-a.reston1.va.home.com>
 <01fb01c10fa9$c3c4d170$e000a8c0@thomasnotebook>
Message-ID: <15189.48941.975174.479751@cj42289-a.reston1.va.home.com>

Thomas Heller writes:
 > def _freeze_hints():
 >     import spam

  Much nicer!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From guido@digicool.com  Wed Jul 18 18:05:42 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 18 Jul 2001 13:05:42 -0400
Subject: [Python-Dev] Freeze hacks
In-Reply-To: Your message of "Wed, 18 Jul 2001 18:50:34 +0200."
 <01fb01c10fa9$c3c4d170$e000a8c0@thomasnotebook>
References: <200107181518.RAA05425@pandora.informatik.hu-berlin.de> <008f01c10f9e$e10316d0$e000a8c0@thomasnotebook> <200107181553.RAA19593@pandora.informatik.hu-berlin.de> <013e01c10fa4$27376700$e000a8c0@thomasnotebook> <200107181644.MAA28513@cj20424-a.reston1.va.home.com>
 <01fb01c10fa9$c3c4d170$e000a8c0@thomasnotebook>
Message-ID: <200107181705.NAA30880@cj20424-a.reston1.va.home.com>

> In case the optimizer becomes more intelligent in the future, a safer
> approach (and one whose purpose is probably easier to document) would be
> to use an (uncalled) function:
> 
> def _freeze_hints():
>     import spam
> 
> Just another idea,

Very nice one!  An optimizer could never remove this, because it could
be called from outside.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From loewis@informatik.hu-berlin.de  Wed Jul 18 19:27:52 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Wed, 18 Jul 2001 20:27:52 +0200 (MEST)
Subject: [Python-Dev] re with Unicode broken?
Message-ID: <200107181827.UAA00284@pandora.informatik.hu-berlin.de>

> The expression which now fails to match is:

Did you, by any chance, use a big-endian system for that? If so, could
you please try the patch

http://sourceforge.net/tracker/?func=detail&aid=442512&group_id=5470&atid=305470

With that patch, your example code matches fine on my SPARC box.

Regards,
Martin


From guido@digicool.com  Wed Jul 18 19:38:55 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 18 Jul 2001 14:38:55 -0400
Subject: [Python-Dev] re with Unicode broken?
In-Reply-To: Your message of "Wed, 18 Jul 2001 20:27:52 +0200."
 <200107181827.UAA00284@pandora.informatik.hu-berlin.de>
References: <200107181827.UAA00284@pandora.informatik.hu-berlin.de>
Message-ID: <200107181838.OAA00994@cj20424-a.reston1.va.home.com>

> > The expression which now fails to match is:
> 
> Did you, by any chance, use a big-endian system for that? If so, could
> you please try the patch
> 
> http://sourceforge.net/tracker/?func=detail&aid=442512&group_id=5470&atid=305470
> 
> With that patch, your example code matches fine on my SPARC box.

I'm guessing this is a showstopper fix for 2.1.1 too...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From loewis@informatik.hu-berlin.de  Wed Jul 18 19:44:51 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Wed, 18 Jul 2001 20:44:51 +0200 (MEST)
Subject: [Python-Dev] re with Unicode broken?
In-Reply-To: <200107181838.OAA00994@cj20424-a.reston1.va.home.com> (message
 from Guido van Rossum on Wed, 18 Jul 2001 14:38:55 -0400)
References: <200107181827.UAA00284@pandora.informatik.hu-berlin.de> <200107181838.OAA00994@cj20424-a.reston1.va.home.com>
Message-ID: <200107181844.UAA00379@pandora.informatik.hu-berlin.de>

> > With that patch, your example code matches fine on my SPARC box.
> 
> I'm guessing this is a showstopper fix for 2.1.1 too...

No, the BIGCHARSET support was added to the CVS only recently; this
bug is not in 2.1. In case it got merged to the 2.2a1 branch, it might
be worthwhile applying the patch to that branch as well - provided /F
approves the patch. Or, we could live with the bug until 2.2a2, since
it only triggers if Unicode character classes are used on a big-endian
machine.

Regards,
Martin


From guido@digicool.com  Wed Jul 18 19:49:36 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 18 Jul 2001 14:49:36 -0400
Subject: [Python-Dev] re with Unicode broken?
In-Reply-To: Your message of "Wed, 18 Jul 2001 20:44:51 +0200."
 <200107181844.UAA00379@pandora.informatik.hu-berlin.de>
References: <200107181827.UAA00284@pandora.informatik.hu-berlin.de> <200107181838.OAA00994@cj20424-a.reston1.va.home.com>
 <200107181844.UAA00379@pandora.informatik.hu-berlin.de>
Message-ID: <200107181849.OAA01109@cj20424-a.reston1.va.home.com>

> No, the BIGCHARSET support was added to the CVS only recently; this
> bug is not in 2.1. In case it got merged to the 2.2a1 branch, it might
> be worthwhile applying the patch to that branch as well - provided /F
> approves the patch. Or, we could live with the bug until 2.2a2, since
> it only triggers if Unicode character classes are used on a big-endian
> machine.

When you check it in to the trunk, it will be merged into the branch
the next time we do a merge.  We've adopted a policy of "merge early,
merge often", by the way -- my procrastination was ill-guided. :-)

If 2.2a1 receives good marks, we'll do a merge back to the trunk and
then we'll be out of the merge business.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Wed Jul 18 20:22:37 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 18 Jul 2001 15:22:37 -0400
Subject: [Python-Dev] re with Unicode broken?
In-Reply-To: <200107181849.OAA01109@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFAKPAA.tim.one@home.com>

Want to emphasize a point:  NOBODY check anything into descr-branch!  The
only people who have a legitimate reason to check into descr-branch already
know that scream wasn't directed at them -- and if you had to think even an
instant, you're not one of them.  The way we're merging the trunk back into
the branch works much better if you let trunk changes show up in the branch
by magic (which means Tim at 3 in the morning, but that's close enough to
magic that Guido can't tell the difference <wink>).



From mal@lemburg.com  Wed Jul 18 20:24:01 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 18 Jul 2001 21:24:01 +0200
Subject: [Distutils] Re: [Python-Dev] PEP 250: Summary of comments
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF15@UKRUX002.rundc.uk.origin-it.com>
 <15189.40082.973330.760541@cj42289-a.reston1.va.home.com>
 <3B55A499.42F55B5A@lemburg.com> <15189.42456.461317.848018@cj42289-a.reston1.va.home.com>
Message-ID: <3B55E251.7BEE00F0@lemburg.com>


"Fred L. Drake, Jr." wrote:
> 
> M.-A. Lemburg writes:
>  > I don't think this is an issue since distutils already knows
>  > that extension packages live in .../site-packages on Unix.
> 
>   Frankly, I'm not convinced that there's a need for extinstallpath.

Uhm... that's what I implied (or at least tried to imply) with 
my reply ;-)

> Why not define INSTALL_SCHEMES like this:
> 
> if sys.version < "2.2":
>     WINDOWS_SCHEME = {
>         'purelib': '$base',
>         'platlib': '$base',
>         'headers': '$base/Include/$dist_name',
>         'scripts': '$base/Scripts',
>         'data'   : '$base',
>         }
> else:
>     WINDOWS_SCHEME = {
>         'purelib': '$base/Lib/site-packages',
>         'platlib': '$base/Lib/site-packages',
>         'headers': '$base/Include/$dist_name',
>         'scripts': '$base/Scripts',
>         'data'   : '$base',
>         }
> 
> INSTALL_SCHEMES = {
>     'nt': WINDOWS_SCHEME,
>     ...
>     }
> 
>   -Fred
> 
> --
> Fred L. Drake, Jr.  <fdrake at acm.org>
> PythonLabs at Digital Creations
> 
> _______________________________________________
> Distutils-SIG maillist  -  Distutils-SIG@python.org
> http://mail.python.org/mailman/listinfo/distutils-sig

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From simon@netthink.co.uk  Wed Jul 18 20:21:28 2001
From: simon@netthink.co.uk (Simon Cozens)
Date: Wed, 18 Jul 2001 15:21:28 -0400
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <20010718102341.B16348@ute.cnri.reston.va.us>
Message-ID: <20010718152128.C27569@netthink.co.uk>

On Wed, Jul 18, 2001 at 10:23:41AM -0400, Andrew Kuchling wrote:
> Note that, the last time this idea was brought up, the issue of
> version mismatches was brought up.  What if the platform has a newer
> version of Expat?  What if you have an extension module for a library
> that also links with Expat internally?

For what it's worth, when I (very recently) raised the issue on
perl5-porters, the concerns were:
    1) Expat isn't *necessarily* the best tool for the job.
    2) No support for linking as shared library (this has been fixed, 
       though, apparently)
    3) Symbol conflicts between, eg, apache's expat and Perl's expat.
       Apparently this affects PHP and mod_python too.
    4) Worries about portability
    5) Version mismatches between Perl's expat and expat's expat

Simon



From fdrake@beowolf.digicool.com  Wed Jul 18 21:10:31 2001
From: fdrake@beowolf.digicool.com (Fred Drake)
Date: Wed, 18 Jul 2001 16:10:31 -0400 (EDT)
Subject: [Python-Dev] [maintenance doc updates]
Message-ID: <20010718201031.F34352892C@beowolf.digicool.com>

The development version of the documentation has been updated:

	http://python.sourceforge.net/maint-docs/

Current status of the 2.1.1 documentation -- very few changes since the
2.1.1c1 release.



From skip@pobox.com (Skip Montanaro)  Wed Jul 18 21:30:11 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Wed, 18 Jul 2001 15:30:11 -0500
Subject: [Python-Dev] Please have a look at proposed doc changes for time epoch
Message-ID: <15189.61907.883300.127987@beluga.mojam.com>

I was just reminded by an update to another bug I had submitted that I was
assigned bug #434143.  Anything that uses time.mktime will fail if the time
tuple it is passed is "too old".  Unfortunately, by trying to be precise,
the message that goes along with the ValueError that's raised ends up a
little misleading:

    ValueError: year out of range (00-99, 1900-*)

Obviously, on Unix systems the baseline date is more like 1970.  I suspect
this error message was written when Python was being developed mostly (or
entirely) on Macs.

The context in which this arose was a user trying to generate a calendar
using the calendar module.

    https://sourceforge.net/tracker/?func=detail&aid=434143&group_id=5470&atid=105470

I think this is a difficult problem to solve properly without adding an
alternative to time.mktime and making some changes to the various modules
that use it (calendar, imaplib and rfc822 in the current CVS tree).  I
propose instead to make a few documentation changes:

    * in Modules/timemodule.c, make the error message more vague ;-)

    * in Doc/lib/lib{time,calendar}.tex indicate that the "epoch" is
      platform-dependent

I'm more than happy to add the necessary ifdefs to Modules/timemodule.c if
we can settle on what the actual epochs are for the various platforms.  For
Unix it is 1970-01-01.  For Macs I think it is 1900-01-01.  Is it 1904-01-01
on Windows?
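
A minimal sketch of how a caller could cope in the meantime: ask the
platform where its epoch starts, and treat out-of-range dates as "no
answer" rather than letting the exception escape. The wrapper name
checked_mktime is an assumption, and whether a given old date succeeds or
fails depends on the platform's C library.

```python
import time

# time.gmtime(0) is the broken-down UTC time at the start of the epoch.
epoch = time.gmtime(0)


def checked_mktime(time_tuple):
    """Hypothetical wrapper: return None when mktime rejects the date."""
    try:
        return time.mktime(time_tuple)
    except (OverflowError, ValueError):
        return None


# A date well after any plausible epoch; tm_isdst=-1 lets the library decide.
recent = checked_mktime((2001, 7, 18, 12, 0, 0, 0, 0, -1))
print(epoch.tm_year, recent)
```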

If you have a moment, please have a look at the above sourceforge url.  I'd
like to get this off my plate in the next few days.

Skip


From mal@lemburg.com  Wed Jul 18 21:31:14 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 18 Jul 2001 22:31:14 +0200
Subject: [Python-Dev] PEP: Defining Python Source Code Encodings
References: <3B559B71.C08C6145@lemburg.com>
Message-ID: <3B55F212.9B11346A@lemburg.com>

Barry has assigned the PEP number 0263 to this PEP. If you
prefer to read the PEP online, here is the URL:

	http://python.sourceforge.net/peps/pep-0263.html

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From skip@pobox.com (Skip Montanaro)  Wed Jul 18 21:37:43 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Wed, 18 Jul 2001 15:37:43 -0500
Subject: [Python-Dev] some unassigned bugs - where should they go?
Message-ID: <15189.62359.164818.924865@beluga.mojam.com>

https://sourceforge.net/my/ is a very useful page, 'cuz it reminds you of
all the bugs assigned to you as well as all the bugs you submitted that
haven't been closed. ;-)

In reviewing bugs that I submitted, I find these that have yet to be
assigned to anybody:

    https://sourceforge.net/tracker/?func=detail&aid=440725&group_id=5470&atid=105470
    https://sourceforge.net/tracker/?func=detail&aid=427073&group_id=5470&atid=105470

one bug and one patch that were assigned to Ping (probably by me, 'cuz I
knew they were in his inspect and pydoc code)

    https://sourceforge.net/tracker/?func=detail&aid=426740&group_id=5470&atid=105470
    https://sourceforge.net/tracker/?func=detail&aid=419419&group_id=5470&atid=305470

and one bug that got assigned to Guido:

    https://sourceforge.net/tracker/?func=detail&aid=424554&group_id=5470&atid=305470

I assume the last one should probably be reassigned.  I can't believe it
would actually need BDFL eyes to figure out or approve.

Should I just randomly (or otherwise) assign the first two to someone?  Ping
seems to have disappeared from the face of the earth.  Has anyone heard from
him lately?  In fact, there were seven bugs assigned to Ping between March
and May, all related to pydoc or inspect.

Skip


From tim.one@home.com  Thu Jul 19 00:20:20 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 18 Jul 2001 19:20:20 -0400
Subject: [Python-Dev] Please have a look at proposed doc changes for time epoch
In-Reply-To: <15189.61907.883300.127987@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGDKPAA.tim.one@home.com>

[Skip Montanaro]
> ...
> Unfortunately, by trying to be precise, the message that goes along
> with the ValueError that's raised is a little misleading:
>
>     ValueError: year out of range (00-99, 1900-*)
>
> Obviously, on Unix systems the baseline date is more like 1970.  I
> suspect this error message was written when Python was being
> developed mostly (or entirely) on Macs.

C mktime is broken all over the place.  Since the tm_year member of a struct
tm is defined as the number of years since 1900, reasonable implementers
should have assumed the committee intended that years before 1970 weren't
anything special -- but apparently only the Mac implementers were reasonable
<0.9 wink>.  C99 spells this out in severe detail, making clear that there's
nothing special even about negative tm_year offsets (they're for years
before 1900, of course).

> ...
>    * in Modules/timemodule.c, make the error message more vague ;-)
>
>   * in Doc/lib/lib{time,calendar}.tex indicate that the "epoch" is
>     platform-dependent

The defn. of mktime makes no reference to epoch (indeed, the C std doesn't
mention "the epoch" anywhere!), so that's mixing concepts that shouldn't get
mixed even when making excuses for bad implementations.  If we can't fix it
ourselves, better to say that mktime simply isn't well-defined across
platforms.

> Is it 1904-01-01 on Windows?

The MS docs say:

    mktime returns the specified calendar time encoded as a value of
    type time_t. If timeptr references a date before midnight,
    January 1, 1970, or if the calendar time cannot be represented,
    the function returns -1 cast to type time_t.
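
The platform dependence described above can be probed from Python.  This is
a sketch for a current Python, assuming time.mktime() signals an
unrepresentable date with OverflowError or ValueError; try_mktime is an
illustrative name, and what it returns for years before 1970 is exactly the
part that varies by C library:

```python
import time

def try_mktime(year):
    """Return mktime()'s result for January 1 of `year`, or None if refused."""
    # Fields: year, month, day, hour, minute, second, weekday, yearday, dst.
    # A dst value of -1 lets the C library decide about daylight saving time.
    fields = (year, 1, 1, 0, 0, 0, 0, 1, -1)
    try:
        return time.mktime(fields)
    except (OverflowError, ValueError):
        return None

for year in (1960, 1970, 2001):
    print(year, try_mktime(year))
```

No expected output is given, since the result for 1960 (and, depending on
the local time zone, for 1970) differs between platforms.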



From tim.one@home.com  Thu Jul 19 00:47:23 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 18 Jul 2001 19:47:23 -0400
Subject: [Python-Dev] Python 2.1.1 & Mac/
In-Reply-To: <20010718162439.E2054@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGFKPAA.tim.one@home.com>

[Thomas Wouters]
> When I updated the 2.1.1 tree this morning, I noticed it checked
> out the entire Mac/ subtree... As it wasn't part of 2.1, I don't
> think it should be part of 2.1.1, and I don't remember seeing any
> add's for it (at least not with the release21-maint tag.) Jack/Just,
> did either of you add it explicitly ?

Ever get an answer?  Looks like all of Jack's checkins today were made on
the release21-maint branch (and not the trunk) too.



From tim.one@home.com  Thu Jul 19 02:28:54 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 18 Jul 2001 21:28:54 -0400
Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1
Message-ID: <LNBBLJKPBEHFEDALKOLCKEGIKPAA.tim.one@home.com>

A very recent patch to webbrowser.py broke this module on Windows; the patch
also appears in the 2.1.1 maintenance branch.

C:\Code\2.1.1\dist\src\PCbuild>python
Python 2.1.1c1 (#19, Jul 13 2001, 00:25:06) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import webbrowser
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\CODE\2.1.1\DIST\SRC\lib\webbrowser.py", line 312, in ?
    if _iscommand(cmd.lower()):
NameError: name '_iscommand' is not defined
>>>

This also causes test___all__.py to fail on Windows, in 2.1.1 and CVS.  Note
that we intend to build 2.1.1 final tomorrow (Thursday) night, so please fix
it or rip it out ASAP.



From akuchlin@mems-exchange.org  Thu Jul 19 02:55:46 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 18 Jul 2001 21:55:46 -0400
Subject: [Python-Dev] 2.2 Unicode questions
Message-ID: <20010718215546.A16539@ludwig.cnri.reston.va.us>

I've written some text on Unicode for the 2.2 article, but it's
doubtful I actually understand what's going on.  Can people who
actually understand where Unicode has been please take a look at the
following?  

First, a short one, Mark Hammond's patch for supporting MBCS on
Windows.  I trust everyone can handle a little bit of TeX markup?

  % XXX is this explanation correct?  
  \item When presented with a Unicode filename on Windows, Python will
  now correctly convert it to a string using the MBCS encoding.
  Filenames on Windows are a case where Python's choice of ASCII as
  the default encoding turns out to be an annoyance.  

  This patch also adds \samp{et} as a format sequence to
  \cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and
  an encoding name, and converts it to the given encoding if the
  parameter turns out to be a Unicode string, or leaves it alone if
  it's an 8-bit string, assuming it to already be in the desired
  encoding.  (This differs from the \samp{es} format character, which
  assumes that 8-bit strings are in Python's default ASCII encoding
  and converts them to the specified new encoding.)
   
  (Contributed by Mark Hammond with assistance from Marc-Andr\'e
  Lemburg.)
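
The difference between the two format characters, as the draft describes
it, can be modelled in a few lines of Python.  This is only a hypothetical
model of the described behaviour, not the C API: Python 3 bytes stands in
for an 8-bit string and str for a Unicode string, and convert_es and
convert_et are invented names.

```python
def convert_es(value, encoding):
    """Model of "es": 8-bit input is taken to be ASCII and recoded."""
    if isinstance(value, bytes):
        value = value.decode("ascii")   # fails on non-ASCII bytes
    return value.encode(encoding)

def convert_et(value, encoding):
    """Model of "et": 8-bit input is passed through untouched."""
    if isinstance(value, bytes):
        return value                    # assumed to be in `encoding` already
    return value.encode(encoding)

raw = b"caf\xe9"                        # already Latin-1 encoded
print(convert_et(raw, "latin-1"))       # passed through unchanged
print(convert_et("caf\xe9", "latin-1")) # Unicode input gets encoded
try:
    convert_es(raw, "latin-1")
except UnicodeDecodeError:
    print("es model rejects non-ASCII 8-bit input")
```

Latin-1 is used instead of MBCS so the sketch runs on any platform.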

Second, the --enable-unicode changes:

%======================================================================
\section{Unicode Changes}

Python's Unicode support has been enhanced a bit in 2.2.  Unicode
strings are usually stored as UCS-2, as 16-bit unsigned integers.
Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned
integers, as its internal encoding by supplying
\longprogramopt{enable-unicode=ucs4} to the configure script.  When
built to use UCS-4, in theory Python could handle Unicode characters
from U-00000000 to U-7FFFFFFF.  Being able to use UCS-4 internally is
a necessary step to do that, but it's not the only step, and in Python
2.2alpha1 the work isn't complete yet.  For example, the
\function{unichr()} function still only accepts values from 0 to
65535, and there's no \code{\e U} notation for embedding characters
greater than 65535 in a Unicode string literal.  All this is the
province of the still-unimplemented PEP 261, ``Support for `wide'
Unicode characters''; consult it for further details, and please offer
comments and suggestions on the proposal it describes.

% ... section on decode() deleted; on firmer ground there...

\method{encode()} and \method{decode()} were implemented by
Marc-Andr\'e Lemburg.  The changes to support using UCS-4 internally
were implemented by Fredrik Lundh and Martin von L\"owis.

\begin{seealso}

\seepep{261}{Support for `wide' Unicode characters}{PEP written by
Paul Prescod.  Not yet accepted or fully implemented.}

\end{seealso}

Corrections?  Thanks in advance...

--amk


From fdrake@acm.org  Thu Jul 19 04:49:06 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 18 Jul 2001 23:49:06 -0400 (EDT)
Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEGIKPAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCKEGIKPAA.tim.one@home.com>
Message-ID: <15190.22706.479809.871806@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > A very recent patch to webbrowser.py broke this module on Windows; the patch
 > also appears in the 2.1.1 maintenance branch.

  Please try again; I think this is fixed in the patch I just checked
in, but I don't have a convenient Windows box to try it on right now.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From skip@pobox.com (Skip Montanaro)  Thu Jul 19 04:57:22 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Wed, 18 Jul 2001 22:57:22 -0500
Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEGIKPAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCKEGIKPAA.tim.one@home.com>
Message-ID: <15190.23202.463412.71847@beluga.mojam.com>

    Tim> A very recent patch to webbrowser.py broke this module on Windows;
    Tim> the patch also appears in the 2.1.1 maintenance branch.

Ah shit.

This whole branching thing has me very confused, so I don't dare check
anything in.  To get things to work, I think all you need to do at the end
of webbrowser.py is replace the for loop with

    try:
        _iscommand
        for cmd in _tryorder:
            if not _browsers.has_key(cmd.lower()):
                if _iscommand(cmd.lower()):
                    register(cmd.lower(), None, GenericBrowser("%s %%s" % cmd.lower()))
    except NameError:
        pass
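
The guard above relies on a bare name lookup raising NameError when no
platform-specific branch defined the helper.  A stripped-down sketch of the
idiom for a current Python; commands_to_register and its arguments are
illustrative and not part of webbrowser.py:

```python
def commands_to_register(tryorder, browsers):
    """Return the lowercased commands that still need registering."""
    try:
        iscommand = _iscommand      # NameError if no branch defined the helper
    except NameError:
        return []
    return [cmd.lower() for cmd in tryorder
            if cmd.lower() not in browsers and iscommand(cmd.lower())]

# _iscommand is deliberately left undefined, as on the broken Windows path.
print(commands_to_register(["Netscape", "Lynx"], {}))
```

With the helper undefined, the function quietly returns an empty list
instead of failing at import time.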

My version suddenly looks a hell of a lot different than what I checked in
earlier today.  I suspect someone may have backed stuff out and went too far
back in time.

Skip


From fdrake@acm.org  Thu Jul 19 04:56:43 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 18 Jul 2001 23:56:43 -0400 (EDT)
Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1
In-Reply-To: <15190.23202.463412.71847@beluga.mojam.com>
References: <LNBBLJKPBEHFEDALKOLCKEGIKPAA.tim.one@home.com>
 <15190.23202.463412.71847@beluga.mojam.com>
Message-ID: <15190.23163.302172.511620@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > My version suddenly looks a hell of a lot different than what I checked in
 > earlier today.  I suspect someone may have backed stuff out and went too far
 > back in time.

  No; this was strictly forward motion, at least in my book.  The
patch you submitted was *not* reverted.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tim.one@home.com  Thu Jul 19 05:30:06 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 19 Jul 2001 00:30:06 -0400
Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1
In-Reply-To: <15190.23163.302172.511620@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEHAKPAA.tim.one@home.com>

Fred fixed webbrowser.py on Windows, in both the trunk and 2.1.1 (thank
you, Fred!).

[Skip]
> This whole branching thing has me very confused, so I don't dare
> check anything in. ...

Feel free to check things into the trunk.  This close to the release,
though, I strongly advise checking anything into the maintenance branch,
unless you're Thomas Wouters, or one of the PythonLabs guys (Guido is able
to make us stay up until Friday to fix anything we screw up <0.9 wink>).



From thomas@xs4all.net  Thu Jul 19 08:30:01 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 19 Jul 2001 09:30:01 +0200
Subject: [Python-Dev] Python 2.1.1 & Mac/
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEGFKPAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCCEGFKPAA.tim.one@home.com>
Message-ID: <20010719093001.G2054@xs4all.nl>

On Wed, Jul 18, 2001 at 07:47:23PM -0400, Tim Peters wrote:
> [Thomas Wouters]
> > When I updated the 2.1.1 tree this morning, I noticed it checked
> > out the entire Mac/ subtree... As it wasn't part of 2.1, I don't
> > think it should be part of 2.1.1, and I don't remember seeing any
> > add's for it (at least not with the release21-maint tag.) Jack/Just,
> > did either of you add it explicitly ?

> Ever get an answer?  Looks like all of Jack's checkins today were made on
> the release21-maint branch (and not the trunk) too.

Yeah, the issue was resolved. Jack added it to make a MacPython 2.1.1,
hadn't realized it would make it trickier for the rest of us, and apologized.
Guido (and you <wink>) agreed to remember to keep the Mac/ subdirectory out
of the release 'by hand'.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From sjoerd.mullender@oratrix.com  Thu Jul 19 09:57:47 2001
From: sjoerd.mullender@oratrix.com (Sjoerd Mullender)
Date: Thu, 19 Jul 2001 10:57:47 +0200
Subject: [Python-Dev] Re: re with Unicode broken?
In-Reply-To: Your message of Wed, 18 Jul 2001 20:27:52 +0200.
 <200107181827.UAA00284@pandora.informatik.hu-berlin.de>
References: <200107181827.UAA00284@pandora.informatik.hu-berlin.de>
Message-ID: <20010719085747.CC051301CF7@bireme.oratrix.nl>

Yes, I was using a big-endian system (SGI), and yes, the patch worked.

On Wed, Jul 18 2001 Martin von Loewis wrote:

> > The expression which now fails to match is:
> 
> Did you, by any chance, use a big-endian system for that? If so, could
> you please try the patch
> 
> http://sourceforge.net/tracker/?func=detail&aid=442512&group_id=5470&atid=305470
> 
> With that patch, your example code matches fine on my SPARC box.
> 
> Regards,
> Martin
> 

-- Sjoerd Mullender <sjoerd.mullender@oratrix.com>


From thomas@xs4all.net  Thu Jul 19 10:18:06 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 19 Jul 2001 11:18:06 +0200
Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEHAKPAA.tim.one@home.com>
Message-ID: <20010719111806.H2054@xs4all.nl>

On Thu, Jul 19, 2001 at 12:30:06AM -0400, Tim Peters wrote:

> Feel free to check things into the trunk.  This close to the release,
> though, I strongly advise checking anything into the maintenance branch,
> unless you're Thomas Wouters, or one of the PythonLabs guys (Guido is able
> to make us stay up until Friday to fix anything we screw up <0.9 wink>).

And there's another reason not to check things into the maint branch unless
you know I won't (which is true only for Jack and Just :): I have a hard
enough time keeping track of all the checkins and the stuff people *want* me
to check in, without people actually checking things in themselves ;P

No harm done, this time, but it's definitely something to remember for the
next branch :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Thu Jul 19 10:03:30 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 19 Jul 2001 11:03:30 +0200
Subject: [Python-Dev] Please have a look at proposed doc changes for time
 epoch
References: <LNBBLJKPBEHFEDALKOLCEEGDKPAA.tim.one@home.com>
Message-ID: <3B56A262.C461CB@lemburg.com>

Tim Peters wrote:
> 
> [Skip Montanaro about deficiencies in the time module]

Why don't you use mxDateTime ? It provides a platform independent
layer on top of all the C lib confusion underneath.

Also, the representable time range is 

	-5851455-01-01 00:00:00.00 - 5867440-12-31 00:00:00.00

... should cover most people's needs ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Thu Jul 19 12:04:02 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 19 Jul 2001 13:04:02 +0200
Subject: [Python-Dev] 2.2 Unicode questions
References: <20010718215546.A16539@ludwig.cnri.reston.va.us>
Message-ID: <3B56BEA2.3472A44F@lemburg.com>

After looking at the web-page I found:

"""
Since their introduction, Unicode strings have supported an encode()
method to convert the string to a selected encoding such as UTF-8 or
Latin-1. A symmetric decode([encoding]) method
has been added to both 8-bit and Unicode strings in 2.2, which assumes
that the string is in the specified encoding and
decodes it. This means that encode() and decode() can be called on
both types of strings, and can be used for tasks
not directly related to Unicode.
"""

I did want to add unicode_string.decode(), but there was unexpected
opposition to this small addition, so I decided to postpone the
change. As a result, things are not as symmetric as they could be
in 2.2.

I hope that Walter Dörwald finishes the codec callback
error handling patch before 2.2a2... it would make a great
difference to the XML crowd.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@digicool.com  Thu Jul 19 13:10:08 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 19 Jul 2001 08:10:08 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: Your message of "Wed, 18 Jul 2001 21:55:46 EDT."
 <20010718215546.A16539@ludwig.cnri.reston.va.us>
References: <20010718215546.A16539@ludwig.cnri.reston.va.us>
Message-ID: <200107191210.IAA07020@cj20424-a.reston1.va.home.com>

> First, a short one, Mark Hammond's patch for supporting MBCS on
> Windows.  I trust everyone can handle a little bit of TeX markup?
> 
>   % XXX is this explanation correct?  
>   \item When presented with a Unicode filename on Windows, Python will
>   now correctly convert it to a string using the MBCS encoding.
>   Filenames on Windows are a case where Python's choice of ASCII as
>   the default encoding turns out to be an annoyance.  
> 
>   This patch also adds \samp{et} as a format sequence to
>   \cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and
>   an encoding name, and converts it to the given encoding if the
>   parameter turns out to be a Unicode string, or leaves it alone if
>   it's an 8-bit string, assuming it to already be in the desired
>   encoding.  (This differs from the \samp{es} format character, which
>   assumes that 8-bit strings are in Python's default ASCII encoding
>   and converts them to the specified new encoding.)
>    
>   (Contributed by Mark Hammond with assistance from Marc-Andr\'e
>   Lemburg.)

I learned something here, so I hope this is correct. :-)

> Second, the --enable-unicode changes:
> 
> %======================================================================
> \section{Unicode Changes}
> 
> Python's Unicode support has been enhanced a bit in 2.2.  Unicode
> strings are usually stored as UCS-2, as 16-bit unsigned integers.
> Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned
> integers, as its internal encoding by supplying
> \longprogramopt{enable-unicode=ucs4} to the configure script.  When
> built to use UCS-4, in theory Python could handle Unicode characters
> from U-00000000 to U-7FFFFFFF.

I think the Unicode folks use U+, not U-, and the largest Unicode
character is "only" U+10FFFF.  (Never mind that the data type can
handle larger values.)
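
The U+10FFFF ceiling is easy to check from a current Python 3 interpreter,
where chr() plays the role unichr() has in this discussion.  A small
sketch; is_valid_code_point is an illustrative name:

```python
import sys

def is_valid_code_point(n):
    """True if chr() accepts n, i.e. n lies in range(0x110000)."""
    try:
        chr(n)
    except (ValueError, OverflowError):
        return False
    return True

print(hex(sys.maxunicode))
print(is_valid_code_point(0x10FFFF), is_valid_code_point(0x110000))
```

The data type could hold larger values, but the interpreter refuses
anything past the last Unicode code point.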

> Being able to use UCS-4 internally is
> a necessary step to do that, but it's not the only step, and in Python
> 2.2alpha1 the work isn't complete yet.  For example, the
> \function{unichr()} function still only accepts values from 0 to
> 65535,

Untrue: it supports range(0x110000) (in UCS-2 mode this returns a
surrogate pair).  Now, maybe that's not what it *should* do...

> and there's no \code{\e U} notation for embedding characters
> greater than 65535 in a Unicode string literal.

Not true either -- correct \U has been part of Python since 2.0.  It
does the same thing as unichr() described above.
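
For reference, the surrogate pair mentioned above follows from simple
arithmetic on the code point.  A sketch of the standard UTF-16 calculation;
to_surrogate_pair is an illustrative helper, not a function in Python's
source:

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000        # 20 significant bits remain
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low

print([hex(unit) for unit in to_surrogate_pair(0x1D11E)])
```

The 20 bits available across the two surrogates are what cap the scheme at
U+10FFFF.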

> All this is the
> province of the still-unimplemented PEP 261, ``Support for `wide'
> Unicode characters''; consult it for further details, and please offer
> comments and suggestions on the proposal it describes.
> 
> % ... section on decode() deleted; on firmer ground there...
> 
> \method{encode()} and \method{decode()} were implemented by
> Marc-Andr\'e Lemburg.  The changes to support using UCS-4 internally
> were implemented by Fredrik Lundh and Martin von L\"owis.
> 
> \begin{seealso}
> 
> \seepep{261}{Support for `wide' Unicode characters}{PEP written by
> Paul Prescod.  Not yet accepted or fully implemented.}
> 
> \end{seealso}
> 
> Corrections?  Thanks in advance...

If I were you, I would make sure that Marc-Andre and Martin agree
with me before adopting my comments above...

And thank *you* for doing this very useful write-up again!  (I'm doing
my part by writing up the types/class unification thing -- now mostly
complete at http://www.python.org/2.2/descrintro.html.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@digicool.com  Thu Jul 19 13:29:25 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 19 Jul 2001 08:29:25 -0400
Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1
In-Reply-To: Your message of "Thu, 19 Jul 2001 00:30:06 EDT."
 <LNBBLJKPBEHFEDALKOLCCEHAKPAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCCEHAKPAA.tim.one@home.com>
Message-ID: <200107191229.IAA07241@cj20424-a.reston1.va.home.com>

[Tim]
> Feel free to check things into the trunk.  This close to the release,
> though, I strongly advise checking anything into the maintenance branch,
                           ^AGAINST!
> unless you're Thomas Wouters, or one of the PythonLabs guys (Guido is able
> to make us stay up until Friday to fix anything we screw up <0.9 wink>).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Thu Jul 19 14:05:55 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 19 Jul 2001 15:05:55 +0200
Subject: [Python-Dev] 2.2 Unicode questions
References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com>
Message-ID: <3B56DB33.71C9161B@lemburg.com>

Guido van Rossum wrote:
> 
> > First, a short one, Mark Hammond's patch for supporting MBCS on
> > Windows.  I trust everyone can handle a little bit of TeX markup?
> >
> >   % XXX is this explanation correct?
> >   \item When presented with a Unicode filename on Windows, Python will
> >   now correctly convert it to a string using the MBCS encoding.
> >   Filenames on Windows are a case where Python's choice of ASCII as
> >   the default encoding turns out to be an annoyance.
> >
> >   This patch also adds \samp{et} as a format sequence to
> >   \cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and
> >   an encoding name, and converts it to the given encoding if the
> >   parameter turns out to be a Unicode string, or leaves it alone if
> >   it's an 8-bit string, assuming it to already be in the desired
> >   encoding.  (This differs from the \samp{es} format character, which
> >   assumes that 8-bit strings are in Python's default ASCII encoding
> >   and converts them to the specified new encoding.)
> >
> >   (Contributed by Mark Hammond with assistance from Marc-Andr\'e
> >   Lemburg.)
> 
> I learned something here, so I hope this is correct. :-)

The last part is... the rest is for Mark to comment on.
 
> > Second, the --enable-unicode changes:
> >
> > %======================================================================
> > \section{Unicode Changes}
> >
> > Python's Unicode support has been enhanced a bit in 2.2.  Unicode
> > strings are usually stored as UCS-2, as 16-bit unsigned integers.
> > Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned
> > integers, as its internal encoding by supplying
> > \longprogramopt{enable-unicode=ucs4} to the configure script.  When
> > built to use UCS-4, in theory Python could handle Unicode characters
> > from U-00000000 to U-7FFFFFFF.
> 
> I think the Unicode folks use U+, not U-, 

True.

> and the largest Unicode
> character is "only" U+10FFFF.  (Never mind that the data type can
> handle larger values.)

I wouldn't count on that...  (note that Andrew wrote "could" ;-)
 
> > Being able to use UCS-4 internally is
> > a necessary step to do that, but it's not the only step, and in Python
> > 2.2alpha1 the work isn't complete yet.  For example, the
> > \function{unichr()} function still only accepts values from 0 to
> > 65535,
> 
> Untrue: it supports range(0x110000) (in UCS-2 mode this returns a
> surrogate pair).  Now, maybe that's not what it *should* do...

It should definitely not, unless you want to break code which assumes
that chr() and unichr() always return a single byte/code unit ! 

This was part of the UCS-4 checkins which I hadn't had time yet to
review. Should I remove the surrogate part for narrow builds ?
 
> > and there's no \code{\e U} notation for embedding characters
> > greater than 65535 in a Unicode string literal.
> 
> Not true either -- correct \U has been part of Python since 2.0.  It
> does the same thing as unichr() described above.

Right.

Note that in this case, the handling of surrogates is needed
to make the unicode-escape encoding roundtrip safe.
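
The round-trip property in question can be illustrated on a current
Python 3, where it holds for characters outside the BMP.  This shows the
property only; it says nothing about how a narrow 2.2 build implements it,
and roundtrips is an illustrative name:

```python
def roundtrips(text):
    """True if text survives a unicode-escape encode/decode cycle."""
    escaped = text.encode("unicode-escape")
    return escaped.decode("unicode-escape") == text

sample = "\U00010000"
print(sample.encode("unicode-escape"))
print(roundtrips(sample))
```

If the decoder could not turn the \U escape back into whatever the encoder
started from, the second line would report a failed round trip.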
 
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From akuchlin@mems-exchange.org  Thu Jul 19 14:52:20 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 19 Jul 2001 09:52:20 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <200107191210.IAA07020@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Jul 19, 2001 at 08:10:08AM -0400
References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com>
Message-ID: <20010719095220.A7282@ute.cnri.reston.va.us>

On Thu, Jul 19, 2001 at 08:10:08AM -0400, Guido van Rossum wrote:
>Untrue: it supports range(0x110000) (in UCS-2 mode this returns a
>surrogate pair).  Now, maybe that's not what it *should* do...

I formed the impression that all of the UCS-4 and surrogate work was
for the goal of supporting ISO 10646 (or whatever the number is -- you
know, the 31-bit character set), so everything is written with that
assumption.  Presumably that's wrong.  Is ISO 10646 on the roadmap at
this point, or is it completely irrelevant?  

Your other corrections will get applied; thanks!

--amk


From skip@pobox.com (Skip Montanaro)  Thu Jul 19 15:06:03 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 19 Jul 2001 09:06:03 -0500
Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEHAKPAA.tim.one@home.com>
References: <15190.23163.302172.511620@cj42289-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCCEHAKPAA.tim.one@home.com>
Message-ID: <15190.59723.520322.609204@beluga.mojam.com>

    Tim> Feel free to check things into the trunk.  This close to the
    Tim> release, though, I strongly advise checking anything into the
    Tim> maintenance branch, unless you're Thomas Wouters, or one of the
    Tim> PythonLabs guys (Guido is able to make us stay up until Friday to
    Tim> fix anything we screw up <0.9 wink>).

I only checked webbrowser.py into the maintenance branch 'cuz Fred said to.
Would someone please flog him for me? 

;-)

Skip



From guido@digicool.com  Thu Jul 19 15:09:33 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 19 Jul 2001 10:09:33 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: Your message of "Thu, 19 Jul 2001 15:05:55 +0200."
 <3B56DB33.71C9161B@lemburg.com>
References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com>
 <3B56DB33.71C9161B@lemburg.com>
Message-ID: <200107191409.KAA07785@cj20424-a.reston1.va.home.com>

> > Untrue: it supports range(0x110000) (in UCS-2 mode this returns a
> > surrogate pair).  Now, maybe that's not what it *should* do...
> 
> It should definitely not, unless you want to break code which assumes
> that chr() and unichr() always return a single byte/code unit !

Reasonable people can disagree about this.

> This was part of the UCS-4 checkins which I hadn't had time yet to
> review. Should I remove the surrogate part for narrow builds ?

Well, this snuck into the 2.2a1, so hopefully we'll get some comments
("love it" / "hate it") from the field to guide our decision.

> > > and there's no \code{\e U} notation for embedding characters
> > > greater than 65535 in a Unicode string literal.
> > 
> > Not true either -- correct \U has been part of Python since 2.0.  It
> > does the same thing as unichr() described above.
> 
> Right.
> 
> Note that in this case, the handling of surrogates is needed
> to make the unicode-escape encoding roundtrip safe.

I don't understand what this means.  Can you give an example?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Thu Jul 19 15:14:15 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 19 Jul 2001 10:14:15 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src Makefile.pre.in,1.35.2.1,1.35.2.2
In-Reply-To: Your message of "Thu, 19 Jul 2001 06:21:08 PDT."
 <E15NDjo-0005s9-00@usw-pr-cvs1.sourceforge.net>
References: <E15NDjo-0005s9-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <200107191414.KAA07830@cj20424-a.reston1.va.home.com>

> Revert the previous two changes, unsetting PYTHONHOME breaks the build
> procedure on some platforms. Better safe than sorry!
[...]
> *** Makefile.pre.in	2001/07/19 09:28:24	1.35.2.1
> --- Makefile.pre.in	2001/07/19 13:21:05	1.35.2.2
> ***************
> *** 283,288 ****
>   # Build the shared modules
>   sharedmods: $(PYTHON)
> ! 	PYTHONPATH= PYTHONHOME= PYTHONSTARTUP= \
> ! 		./$(PYTHON) $(srcdir)/setup.py build
>   
>   # buildno should really depend on something like LIBRARY_SRC
> --- 283,287 ----
>   # Build the shared modules
>   sharedmods: $(PYTHON)
> ! 	PYTHONPATH= ./$(PYTHON) $(srcdir)/setup.py build
>   
>   # buildno should really depend on something like LIBRARY_SRC

It suddenly occurred to me that in the future (like 2.2) perhaps we
ought to have a command line option that means "ignore all
$PYTHON... environment variables".
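
Current Python interpreters have a -E switch along these lines.  The
sketch below assumes a current Python 3 and uses sys.flags to confirm that
a child interpreter started with -E reports that it is ignoring the
environment; child_ignores_environment is an illustrative name:

```python
import subprocess
import sys

def child_ignores_environment(extra_args):
    """Run a child interpreter and report its sys.flags.ignore_environment."""
    code = "import sys; print(int(sys.flags.ignore_environment))"
    command = [sys.executable] + list(extra_args) + ["-c", code]
    output = subprocess.check_output(command)
    return output.strip() == b"1"

print(child_ignores_environment(["-E"]))
```

A build rule like the one quoted above could then pass the switch instead
of blanking each variable by hand.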

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Thu Jul 19 15:14:05 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 19 Jul 2001 10:14:05 -0400 (EDT)
Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1
In-Reply-To: <15190.59723.520322.609204@beluga.mojam.com>
References: <15190.23163.302172.511620@cj42289-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCCEHAKPAA.tim.one@home.com>
 <15190.59723.520322.609204@beluga.mojam.com>
Message-ID: <15190.60205.105809.805536@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > I only checked webbrowser.py into the maintenance branch 'cuz Fred said to.
 > Would someone please flog him for me? 
 > 
 > ;-)

  Hey, I know that smile... it's the one you use when you're serious
but don't want it too obvious.  Now I'll have to cower under my desk
in fear of the arrival of the rest of the PythonLabs crew, 'cuz I know
they'll do as you ask!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From simon@netthink.co.uk  Thu Jul 19 15:15:49 2001
From: simon@netthink.co.uk (Simon Cozens)
Date: Thu, 19 Jul 2001 10:15:49 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <200107191409.KAA07785@cj20424-a.reston1.va.home.com>
Message-ID: <20010719101549.B31796@netthink.co.uk>

On Thu, Jul 19, 2001 at 10:09:33AM -0400, Guido van Rossum wrote:
> > > Untrue: it supports range(0x110000) (in UCS-2 mode this returns a
> > > surrogate pair).  Now, maybe that's not what it *should* do...
> > 
> > It should definitely not, unless you want to break code which assumes
> > that chr() and unichr() always return a single byte/code unit !
> 
> Reasonable people can disagree about this.

It certainly should not, if by UCS-2 you actually mean UCS-2.
UCS-2 can't access characters outside the Basic Multilingual Plane,
and so shouldn't be using surrogates.

If by UCS-2 you actually mean UTF-16, then using surrogates is the
right approach. :)

Simon
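
As a rough illustration of the UCS-2 versus UTF-16 distinction above (not
part of the original message): UTF-16 reaches code points beyond the Basic
Multilingual Plane by splitting them into two 16-bit surrogate code units.
The sketch below is written for a modern Python 3; the function name
to_surrogate_pair is hypothetical, chosen only for this example.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point into UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP need surrogates")
    offset = code_point - 0x10000      # a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits select the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits select the low surrogate
    return high, low


high, low = to_surrogate_pair(0x10000)
print(hex(high), hex(low))  # 0xd800 0xdc00
```

A strict UCS-2 application would have no use for such a function, since it
never represents anything above U+FFFF.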


From guido@digicool.com  Thu Jul 19 15:18:15 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 19 Jul 2001 10:18:15 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: Your message of "Thu, 19 Jul 2001 09:52:20 EDT."
 <20010719095220.A7282@ute.cnri.reston.va.us>
References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com>
 <20010719095220.A7282@ute.cnri.reston.va.us>
Message-ID: <200107191418.KAA07872@cj20424-a.reston1.va.home.com>

> I formed the impression that all of the UCS-4 and surrogate work was
> for the goal of supporting ISO 10646 (or whatever the number is -- you
> know, the 31-bit character set), so everything is written with that
> assumption.  Presumably that's wrong.  Is ISO 10646 on the roadmap at
> this point, or is it completely irrelevant?  

The impression I got from the discussion around this was that ISO
10646 now *also* promises to limit itself to 0x110000 characters
forever.  MvL or MAL can corroborate.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com (Skip Montanaro)  Thu Jul 19 15:19:47 2001
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 19 Jul 2001 09:19:47 -0500
Subject: [Python-Dev] Please have a look at proposed doc changes for time
 epoch
In-Reply-To: <3B56A262.C461CB@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEGDKPAA.tim.one@home.com>
 <3B56A262.C461CB@lemburg.com>
Message-ID: <15190.60547.254115.394049@beluga.mojam.com>

    mal> Tim Peters wrote:
    >> 
    >> [Skip Montanaro about deficiencies in the time module]

    mal> Why don't you use mxDateTime ? It provides a platform independent
    mal> layer on top of all the C lib confusion underneath.

    mal> Also, the representable time range is 

    mal>        -5851455-01-01 00:00:00.00 - 5867440-12-31 00:00:00.00

    mal> ... should cover most people's needs ;-)

I think we're getting a bit far removed from the original context here.  I'm
quite well aware of mx.DateTime and use it in my own code.  I was assigned a
bug report about the calendar module:

    http://sourceforge.net/tracker/?func=detail&aid=434143&group_id=5470&atid=105470

The tail end of the traceback is a ValueError generated by time.mktime
whose message suggests that it accepts years in the range 00-99 and 1900+.
I don't think it's reasonable to try and make time.mktime "work", so I
propose that we make the documentation and exception messages more
forthcoming about its platform-dependence.

Personally, I think adding mx.DateTime to the core wouldn't be a bad idea.
Python's date manipulation code is in need of some more cojones.  2.2 is
probably too near, but that's ultimately for the PythonLabs folks to decide.

Skip
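
One way to live with the platform dependence described above, sketched for
a modern Python 3 rather than the 2.x of this thread: treat time.mktime as
something that may refuse a date, and surface that as an explicit "not
representable here" result.  The helper name safe_mktime is hypothetical;
which years are accepted depends on the underlying C library, so no
particular range is assumed.

```python
import time


def safe_mktime(year, month, day):
    """Return seconds since the epoch for local midnight, or None.

    None means the platform's mktime() could not represent the date.
    """
    # Fields: year, month, day, hour, minute, second, wday, yday, isdst.
    # isdst=-1 asks the C library to work out daylight saving itself.
    time_tuple = (year, month, day, 0, 0, 0, 0, 1, -1)
    try:
        return time.mktime(time_tuple)
    except (OverflowError, ValueError):
        # Which exception is raised, and for which dates, varies by platform.
        return None


print(safe_mktime(2001, 7, 19) is not None)
```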



From skip@pobox.com (Skip Montanaro)  Thu Jul 19 15:28:15 2001
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 19 Jul 2001 09:28:15 -0500
Subject: [Python-Dev] webbrowser.py broken on Windows; also in 2.1.1
In-Reply-To: <15190.60205.105809.805536@cj42289-a.reston1.va.home.com>
References: <15190.23163.302172.511620@cj42289-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCCEHAKPAA.tim.one@home.com>
 <15190.59723.520322.609204@beluga.mojam.com>
 <15190.60205.105809.805536@cj42289-a.reston1.va.home.com>
Message-ID: <15190.61055.368392.280668@beluga.mojam.com>

    Fred> Now I'll have to cower under my desk in fear of the arrival of the
    Fred> rest of the PythonLabs crew, 'cuz I know they'll do as you ask!

Worse yet, I bcc'd that message to the PS


From guido@digicool.com  Thu Jul 19 15:44:18 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 19 Jul 2001 10:44:18 -0400
Subject: [Python-Dev] Please have a look at proposed doc changes for time epoch
In-Reply-To: Your message of "Thu, 19 Jul 2001 09:19:47 CDT."
 <15190.60547.254115.394049@beluga.mojam.com>
References: <LNBBLJKPBEHFEDALKOLCEEGDKPAA.tim.one@home.com> <3B56A262.C461CB@lemburg.com>
 <15190.60547.254115.394049@beluga.mojam.com>
Message-ID: <200107191444.f6JEiIk12637@odiug.digicool.com>

> Personally, I think adding mx.DateTime to the core wouldn't be a bad idea.

I've heard this endorsement before.  It looks a bit too unwieldy to
me, but it might be a good starting point for something truly
Pythonic.

> Python's date manipulation code is in need of some more cojones.  2.2 is
> probably too near, but that's ultimately for the PythonLabs folks to decide.

No, there's plenty of time to add this to 2.2.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Thu Jul 19 15:46:45 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 19 Jul 2001 10:46:45 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: Your message of "Thu, 19 Jul 2001 10:15:49 EDT."
 <20010719101549.B31796@netthink.co.uk>
References: <20010719101549.B31796@netthink.co.uk>
Message-ID: <200107191446.f6JEkjc12663@odiug.digicool.com>

> If by UCS-2 you actually mean UTF-16, then using surrogates is the
> right approach. :)

But isn't the whole point of UTF-16 to fool code that believes it's
manipulating UCS-2 into a false sense of security? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From simon@netthink.co.uk  Thu Jul 19 15:50:27 2001
From: simon@netthink.co.uk (Simon Cozens)
Date: Thu, 19 Jul 2001 10:50:27 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <200107191446.f6JEkjc12663@odiug.digicool.com>
Message-ID: <20010719105027.A32172@netthink.co.uk>

On Thu, Jul 19, 2001 at 10:46:45AM -0400, Guido van Rossum wrote:
> But isn't the whole point of UTF-16 to fool code that believes it's
> manipulating UCS-2 into a false sense of security? :-)

Well, sort of. More like fooling into a true sense of insecurity. :)

Anyway, the Standard sez that a conforming UCS-2 application will
not use characters in the surrogates area. Future versions of ISO10646
and the Unicode Standard will probably require UTF-16 instead of UCS-2.

Simon


From guido@digicool.com  Thu Jul 19 15:58:23 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 19 Jul 2001 10:58:23 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: Your message of "Thu, 19 Jul 2001 10:50:27 EDT."
 <20010719105027.A32172@netthink.co.uk>
References: <20010719105027.A32172@netthink.co.uk>
Message-ID: <200107191458.f6JEwNA12824@odiug.digicool.com>

> > But isn't the whole point of UTF-16 to fool code that believes it's
> > manipulating UCS-2 into a false sense of security? :-)
> 
> Well, sort of. More like fooling into a true sense of insecurity. :)

Same difference. :-)

> Anyway, the Standard sez that a conforming UCS-2 application will
> not use characters in the surrogates area. Future versions of ISO10646
> and the Unicode Standard will probably require UTF-16 instead of UCS-2.

So the proper way to code *libraries* that use 16-bit data would be
not to commit on the issue: don't generate surrogates on your own
account, but also don't actively reject them, instead passing them
through transparently.  This should conform to both UCS-2 and UTF-16.

--Guido van Rossum (home page: http://www.python.org/~guido/)
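
A minimal sketch of the "don't generate, don't reject, pass through"
advice above, assuming text is held as a list of 16-bit code units.  The
function drop_nul_units and its task (removing U+0000 units) are invented
for illustration and do not come from the thread.

```python
def drop_nul_units(units):
    """Remove U+0000 from a sequence of 16-bit code units.

    Surrogate code units (0xD800-0xDFFF) are neither created nor rejected:
    they are copied through unchanged, in their original order, whether or
    not they form well-formed pairs.
    """
    result = []
    for unit in units:
        if not 0 <= unit <= 0xFFFF:
            raise ValueError("not a 16-bit code unit: %r" % (unit,))
        if unit != 0x0000:
            result.append(unit)
    return result


print(drop_nul_units([0x41, 0x0000, 0xD800, 0xDC00, 0x42]))
```

Because the function never inspects whether a unit is a surrogate, its
output is acceptable to a UCS-2 consumer whenever its input was, and
likewise for UTF-16.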


From akuchlin@mems-exchange.org  Thu Jul 19 15:57:37 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 19 Jul 2001 10:57:37 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <20010719101549.B31796@netthink.co.uk>; from simon@netthink.co.uk on Thu, Jul 19, 2001 at 10:15:49AM -0400
References: <200107191409.KAA07785@cj20424-a.reston1.va.home.com> <20010719101549.B31796@netthink.co.uk>
Message-ID: <20010719105737.D7282@ute.cnri.reston.va.us>

On Thu, Jul 19, 2001 at 10:15:49AM -0400, Simon Cozens wrote:
>If by UCS-2 you actually mean UTF-16, then using surrogates is the
>right approach. :)

<head explodes> If a narrow Python uses UTF-16 (and it does seem to,
according to PEP 100), then the configure script's
--enable-unicode=ucs2 option should be changed, because it's
misleading.

Here's another pass:

%======================================================================
\section{Unicode Changes}

Python's Unicode support has been enhanced a bit in 2.2.  Unicode
strings are usually stored as UTF-16, as 16-bit unsigned integers.
Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned
integers, as its internal encoding by supplying
\longprogramopt{enable-unicode=ucs4} to the configure script.  When
built to use UCS-4 (a ``wide Python''), the interpreter can natively
handle Unicode characters from U+000000 to U+10FFFF.  The range of
legal values for the \function{unichr()} function has been expanded;
it used to only accept values up to 65535, but in 2.2 will accept
values from 0 to 0x10ffff.  Using a ``narrow Python'', an interpreter
compiled to use UTF-16, values greater than 65535 will result in
\function{unichr()} returning a string of length 2:

\begin{verbatim}
>>> s = unichr(65536)
>>> s
u'\ud800\udc00'
>>> len(s)
2
\end{verbatim}

This possibly-confusing behaviour, breaking the intuitive invariant
that \function{chr()} and \function{unichr()} always return strings of
length 1, may be changed later in 2.2, depending on public reaction.

All this is the province of the still-unimplemented PEP 261, ``Support
for `wide' Unicode characters''; consult it for further details, and
please offer comments and suggestions on the proposal it describes.

--amk
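
For comparison with the narrow-build session in the draft above, here is a
hedged sketch of the same check on a modern interpreter.  It assumes
CPython 3.3 or later, where the narrow/wide build distinction no longer
exists and chr() plays the role of unichr(); it says nothing about what
2.2 itself prints.

```python
import sys

s = chr(0x10000)                                  # first code point beyond the BMP
utf16_units = len(s.encode("utf-16-be")) // 2     # two bytes per UTF-16 code unit

print(len(s), utf16_units, hex(sys.maxunicode))   # 1 2 0x10ffff on CPython 3.3+
```

The string has length 1 even though its UTF-16 form needs two code units,
which is the invariant the draft worries about losing on narrow builds.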


From loewis@informatik.hu-berlin.de  Thu Jul 19 16:37:42 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Thu, 19 Jul 2001 17:37:42 +0200 (MEST)
Subject: [Python-Dev] 2.2 Unicode questions
Message-ID: <200107191537.RAA29684@pandora.informatik.hu-berlin.de>

> The impression I got from the discussion around this was that ISO
> 10646 now *also* promises to limit itself to 0x110000 characters
> forever.  MvL or MAL can corroborate.

It appears that the state is still the one of resolution M38.6, as
reported in

http://209.109.201.97/unicode/reports/tr19/tr19-7.html

# WG2 accepts the proposal in document N2175 towards removing the
# provision for Private Use Groups and Planes beyond Plane 16 in
# ISO/IEC 10646, to ensure internal consistency in the standard
# between UCS-4, UTF-8 and UTF-16 encoding formats, and instructs its
# project editor [to] prepare suitable text for processing as a future
# Technical Corrigendum or an Amendment to 10646-1:2000."

The original proposal can be found in

http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n2175.htm

It appears that the promised amendment is PDAM 1 to ISO 10646-1:2000,
in

http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n2308.pdf

which, in 9.1, reserves planes 11 to FF in group 0, and all other
groups, for future use, and removes the private use planes E0 to plane
FF of group 0, as well as the private use groups 60-7F. In addition,
it adds the note

# To ensure continued interoperability between the UTF-16 form and
# other coded representations of the UCS, it is intended that no other
# characters will ever be allocated to code positions above 0010FFFF.

However, this amendment is still in the draft stage, with comments
in

http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n2355.pdf

Since voting in ISO usually takes a while, there may be some more
months until ISO 10646 is officially restricted to 17 planes - but it
is unlikely that this won't happen.

Regards,
Martin
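
The 17-plane limit and the 0x110000 figure quoted earlier in the thread
are the same number seen two ways; a small worked check (plain arithmetic,
with variable names chosen for this example only):

```python
PLANES = 17                      # plane 0 (the BMP) plus planes 1 to 16
CODE_POINTS_PER_PLANE = 0x10000  # 65536 code points in each plane

total = PLANES * CODE_POINTS_PER_PLANE
highest = total - 1

# UTF-16 reaches the BMP directly, plus 1024 * 1024 surrogate combinations.
utf16_reach = 0x10000 + 0x400 * 0x400

print(hex(total), hex(highest), utf16_reach == total)  # 0x110000 0x10ffff True
```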


From mal@lemburg.com  Thu Jul 19 18:41:47 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 19 Jul 2001 19:41:47 +0200
Subject: [Python-Dev] 2.2 Unicode questions
References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com>
 <3B56DB33.71C9161B@lemburg.com> <200107191409.KAA07785@cj20424-a.reston1.va.home.com>
Message-ID: <3B571BDB.9FB77EF4@lemburg.com>

Guido van Rossum wrote:
> 
> > > Untrue: it supports range(0x110000) (in UCS-2 mode this returns a
> > > surrogate pair).  Now, maybe that's not what it *should* do...
> >
> > It should definitely not, unless you want to break code which assumes
> > that chr() and unichr() always return a single byte/code unit !
> 
> Reasonable people can disagree about this.
> 
> > This was part of the UCS-4 checkins which I hadn't had time yet to
> > review. Should I remove the surrogate part for narrow builds ?
> 
> Well, this snuck into the 2.2a1, so hopefully we'll get some comments
> ("love it" / "hate it") from the field to guide our decision.

Waiting for comments from the field :-) 
 
> > > > and there's no \code{\e U} notation for embedding characters
> > > > greater than 65535 in a Unicode string literal.
> > >
> > > Not true either -- correct \U has been part of Python since 2.0.  It
> > > does the same thing as unichr() described above.
> >
> > Right.
> >
> > Note that in this case, the handling of surrogates is needed
> > to make the unicode-escape encoding roundtrip safe.
> 
> I don't understand what this means.  Can you give an example?

It means that the roundtrip Unicode -> encoding -> Unicode is a
1-1 mapping for all Unicode code points. Other examples for 
roundtrip safe encodings are UTF-8 and UTF-16.

Looking at the code, I found that the unicode-escape encoder
does not convert Unicode surrogates to \UXXXXXXXX escapes.
I'll fix that.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/
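
A small sketch of what "roundtrip safe" means in the message above,
written for a modern Python 3 (so it does not reproduce the 2.2 encoder
bug being discussed).  The helper name roundtrips and the sample code
points are chosen for illustration; lone surrogates are left out because
Python 3's UTF-8 codec rejects them by default.

```python
def roundtrips(text, encoding):
    """True if text -> bytes -> text gives back exactly the same string."""
    return text.encode(encoding).decode(encoding) == text


samples = [chr(cp) for cp in (0x41, 0xE9, 0x20AC, 0x10000, 0x10FFFF)]

for encoding in ("unicode_escape", "utf-8", "utf-16"):
    print(encoding, all(roundtrips(ch, encoding) for ch in samples))

# A code point beyond the BMP is written with the 8-digit \U form.
print(chr(0x10000).encode("unicode_escape"))  # b'\\U00010000'
```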


From guido@digicool.com  Thu Jul 19 19:36:58 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 19 Jul 2001 14:36:58 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: Your message of "Thu, 19 Jul 2001 19:41:47 +0200."
 <3B571BDB.9FB77EF4@lemburg.com>
References: <20010718215546.A16539@ludwig.cnri.reston.va.us> <200107191210.IAA07020@cj20424-a.reston1.va.home.com> <3B56DB33.71C9161B@lemburg.com> <200107191409.KAA07785@cj20424-a.reston1.va.home.com>
 <3B571BDB.9FB77EF4@lemburg.com>
Message-ID: <200107191836.f6JIawF16908@odiug.digicool.com>

> > > Note that in this case, the handling of surrogates is needed
> > > to make the unicode-escape encoding roundtrip safe.
> > 
> > I don't understand what this means.  Can you give an example?
> 
> It means that the roundtrip Unicode -> encoding -> Unicode is a
> 1-1 mapping for all Unicode code points. Other examples for 
> roundtrip safe encodings are UTF-8 and UTF-16.
> 
> Looking at the code, I found that the unicode-escape encoder
> does not convert Unicode surrogates to \UXXXXXXXX escapes.
> I'll fix that.

Ah.  I had missed the fact that this was a roundtrip for a specific
encoding, the unicode-escape encoding.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim@digicool.com  Thu Jul 19 20:01:04 2001
From: tim@digicool.com (Tim Peters)
Date: Thu, 19 Jul 2001 15:01:04 -0400
Subject: [Python-Dev] getaddrinfo.c:  warnings on Windows
Message-ID: <BIEJKCLHCIOIHAGOKOLHEECKCDAA.tim@digicool.com>

If you're mucking with Windows specifically (as the latest patch here was),
and you don't have the MS Windows compiler, please upload a patch to SF
instead.  "NO WARNINGS" is a rule on Windows.

C:\Code\python\dist\src\Modules\getaddrinfo.c(418) : warning C4090:
'function' : different 'const' qualifiers
C:\Code\python\dist\src\Modules\getaddrinfo.c(418) : warning C4024:
'inet_pton' : different types for formal and actual parameter 2
C:\Code\python\dist\src\Modules\getaddrinfo.c(420) : warning C4101: 'pfx' :
unreferenced local variable
C:\Code\python\dist\src\Modules\getaddrinfo.c(495) : warning C4101:
'h_error' : unreferenced local variable
C:\Code\python\dist\src\Modules\getnameinfo.c(101) : warning C4101: 'pfx' :
unreferenced local variable
C:\Code\python\dist\src\Modules\getaddrinfo.c(346) : warning C4761: integral
size mismatch in argument; conversion supplied



From andymac@bullseye.apana.org.au  Thu Jul 19 14:40:55 2001
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Thu, 19 Jul 2001 23:40:55 +1000 (EST)
Subject: [Python-Dev] experiments with PYMALLOC (long)
Message-ID: <Pine.OS2.4.32.0107192215470.4658-200000@central>

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.
  Send mail to mime@docserver.cac.washington.edu for more info.

---888574987-20871-995550055=:4658
Content-Type: TEXT/PLAIN; charset=US-ASCII

[this post is primarily for informational purposes, although I would
 welcome serious suggestions on possible options for dealing with
 either the longexp issue or the PYMALLOC performance issue - AIM]

In my port of Python to OS/2 using the EMX suite, I encountered the
situation of not being able to pass the longexp test in the test suite.

The test is simply:
>>> NUMREPS = 65580
>>> eval('[' + '2,' * NUMREPS + ']')

With the advent of PYMALLOC in 2.1 I hoped that this issue could be dealt
with, however defining WITH_PYMALLOC achieved nothing other than to cause
Numeric to fail on import (I am led to believe that this is now fixed in
Numeric 20.1).

Revisiting my earlier diagnostic results reinforced the fact that the
longexp test is really a stress test of the parser.  In this test, the
parser ends up creating humongous numbers of nodes.  Each of these nodes
is only 20 bytes (+1 for insurance) for which the EMX malloc() returns a
chunk 64 bytes long - and there appears to be a minimum of 13 such nodes
+ a handful of 2+1 byte allocations occupying 12 bytes each for each
element in the list being parsed.

Not a happy situation, as it is sufficient to exhaust my dev system's
swap space, and OS/2 stops dead.
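
A back-of-the-envelope check of the figures above, counting only the 13
parser nodes per list element and ignoring the smaller 12-byte
allocations and everything else the interpreter holds, so it is a lower
bound rather than a measurement:

```python
NUMREPS = 65580           # list elements in the longexp test
NODES_PER_ELEMENT = 13    # minimum parser nodes per element, as reported
REQUESTED_BYTES = 20 + 1  # node size plus the "+1 for insurance"
CHUNK_BYTES = 64          # what the EMX malloc() is reported to hand back

requested_total = NUMREPS * NODES_PER_ELEMENT * REQUESTED_BYTES
allocated_total = NUMREPS * NODES_PER_ELEMENT * CHUNK_BYTES

MIB = 1024 * 1024
print("requested: %.1f MiB" % (requested_total / MIB))   # requested: 17.1 MiB
print("allocated: %.1f MiB" % (allocated_total / MIB))   # allocated: 52.0 MiB
```

On these assumptions the nodes alone take roughly three times the memory
actually requested, which is a large share of a 64M machine before any
other allocation is counted.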

I then thought of doctoring Python to use PYMALLOC for _all_ interpreter
memory management (the attached patch is all it took, against 2.1).

And with the exception of the socket test, which fails the first time with
a "no memory" error but succeeds the second time when the .pycs don't need
to be recompiled, the completely PYMALLOC managed interpreter passes the
regression test _including_ the longexp test.

I was starting to think in terms of releasing the (yet to be) 2.1.1 port
configured this way.  But then I decided to benchmark the two interpreter
configurations using the regression test as the benchmark.....

On my dev system, the average results (of 3 runs) are:
             no .pyc      w/.pyc
std malloc    3m 41s      3m 25s    (test_longexp skipped)
PYMALLOC      6m 12s      5m 25s    (test_socket fails in "no .pyc" case)

[the skipped longexp test, run standalone on the PYMALLOC interpreter,
takes <5s total, so it's not a significant factor in the times]

:-( :-(  I think the OS/2 port is going to have to continue to risk
failure on the longexp test on many systems as such a performance hit is
hard to justify.
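
To put a number on "such a performance hit", the averages in the table
above can be turned into slowdown ratios.  This is plain arithmetic on the
reported times; the dictionary layout is only for this example.

```python
def seconds(minutes, secs):
    """Convert a 'Xm Ys' timing into seconds."""
    return minutes * 60 + secs


std_malloc = {"no .pyc": seconds(3, 41), "w/.pyc": seconds(3, 25)}
pymalloc = {"no .pyc": seconds(6, 12), "w/.pyc": seconds(5, 25)}

for case in ("no .pyc", "w/.pyc"):
    ratio = pymalloc[case] / std_malloc[case]
    print("%-8s %.2fx slower" % (case, ratio))
# no .pyc  1.68x slower
# w/.pyc   1.59x slower
```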

Environment:
System=  AMD K6/2-300, 64M RAM, DMA IDE drive (pre UDMA33)
         40MB preallocated swap space, that can expand to 140MB
S/ware=  OS/2 v4, FP12
         EMX 0.9d fix 03, gcc 2.8.1
         compile options "-O2 -fomit-frame-pointer"
         NDEBUG _not_ defined, so all assert()s still active

[PS: please cc any replies to me as I'm not subscribed to this list]

--
Andrew I MacIntyre                    "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au     | Snail: PO Box 370
        andymac@pcug.org.au               |        Belconnen ACT 2616
Web:    http://www.andymac.org/           |        Australia

---888574987-20871-995550055=:4658
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="pymalloc_all.patch"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.OS2.4.32.0107192340550.4658@central>
Content-Description: pymalloc_all.patch
Content-Disposition: attachment; filename="pymalloc_all.patch"

KioqIEluY2x1ZGVccHltZW0uaC5vcmlnCVNhdCBTZXAgIDIgMDk6Mjk6MjYg
MjAwMA0KLS0tIEluY2x1ZGVccHltZW0uaAlTdW4gSnVsIDE1IDE3OjQ0OjM4
IDIwMDENCioqKioqKioqKioqKioqKg0KKioqIDI1LDM2ICoqKioNCi0tLSAy
NSw0NyAtLS0tDQogICAgIFNlZSB0aGUgY29tbWVudCBibG9jayBhdCB0aGUg
ZW5kIG9mIHRoaXMgZmlsZSBmb3IgdHdvIHNjZW5hcmlvcw0KICAgICBzaG93
aW5nIGhvdyB0byB1c2UgdGhpcyB0byB1c2UgYSBkaWZmZXJlbnQgYWxsb2Nh
dG9yLiAqLw0KICANCisgI2lmZGVmCVBZTUFMTE9DX0FMTA0KKyAjaWZuZGVm
IFB5Q29yZV9NQUxMT0NfRlVOQw0KKyAjdW5kZWYgUHlDb3JlX1JFQUxMT0Nf
RlVOQw0KKyAjdW5kZWYgUHlDb3JlX0ZSRUVfRlVOQw0KKyAjZGVmaW5lIFB5
Q29yZV9NQUxMT0NfRlVOQyAgICAgIF9QeUNvcmVfT2JqZWN0TWFsbG9jDQor
ICNkZWZpbmUgUHlDb3JlX1JFQUxMT0NfRlVOQyAgICAgX1B5Q29yZV9PYmpl
Y3RSZWFsbG9jDQorICNkZWZpbmUgUHlDb3JlX0ZSRUVfRlVOQyAgICAgICAg
X1B5Q29yZV9PYmplY3RGcmVlDQorICNkZWZpbmUgTkVFRF9UT19ERUNMQVJF
X01BTExPQ19BTkRfRlJJRU5ECTENCisgI2VuZGlmDQorICNlbHNlDQogICNp
Zm5kZWYgUHlDb3JlX01BTExPQ19GVU5DDQogICN1bmRlZiBQeUNvcmVfUkVB
TExPQ19GVU5DDQogICN1bmRlZiBQeUNvcmVfRlJFRV9GVU5DDQogICNkZWZp
bmUgUHlDb3JlX01BTExPQ19GVU5DICAgICAgbWFsbG9jDQogICNkZWZpbmUg
UHlDb3JlX1JFQUxMT0NfRlVOQyAgICAgcmVhbGxvYw0KICAjZGVmaW5lIFB5
Q29yZV9GUkVFX0ZVTkMgICAgICAgIGZyZWUNCisgI2VuZGlmDQogICNlbmRp
Zg0KICANCiAgI2lmbmRlZiBQeUNvcmVfTUFMTE9DX1BST1RPDQoqKiogT2Jq
ZWN0c1xvYm1hbGxvYy5jLm9yaWcJTW9uIE1hciAxMiAwNTozNjoxMiAyMDAx
DQotLS0gT2JqZWN0c1xvYm1hbGxvYy5jCVRodSBKdWwgMTkgMjM6MjQ6MjQg
MjAwMQ0KKioqKioqKioqKioqKioqDQoqKiogNzMsODIgKioqKg0KLS0tIDcz
LDg5IC0tLS0NCiAgICogYWxsb2NhdG9yIHdoaWNoIGV4cG9ydHMgZnVuY3Rp
b25zIHdpdGggbmFtZXMgX290aGVyXyB0aGFuIHRoZSBzdGFuZGFyZA0KICAg
KiBtYWxsb2MsIGNhbGxvYywgcmVhbGxvYywgZnJlZS4NCiAgICovDQorICNp
ZmRlZglQWU1BTExPQ19BTEwNCisgI2RlZmluZSBfU1lTVEVNX01BTExPQwkJ
bWFsbG9jDQorICNkZWZpbmUgX1NZU1RFTV9DQUxMT0MJCS8qIHVudXNlZCAq
Lw0KKyAjZGVmaW5lIF9TWVNURU1fUkVBTExPQwkJcmVhbGxvYw0KKyAjZGVm
aW5lIF9TWVNURU1fRlJFRQkJZnJlZQ0KKyAjZWxzZQ0KICAjZGVmaW5lIF9T
WVNURU1fTUFMTE9DCQlQeUNvcmVfTUFMTE9DX0ZVTkMNCiAgI2RlZmluZSBf
U1lTVEVNX0NBTExPQwkJLyogdW51c2VkICovDQogICNkZWZpbmUgX1NZU1RF
TV9SRUFMTE9DCQlQeUNvcmVfUkVBTExPQ19GVU5DDQogICNkZWZpbmUgX1NZ
U1RFTV9GUkVFCQlQeUNvcmVfRlJFRV9GVU5DDQorICNlbmRpZg0KICANCiAg
LyoNCiAgICogSWYgbWFsbG9jIGhvb2tzIGFyZSBuZWVkZWQsIG5hbWVzIG9m
IHRoZSBob29rcycgc2V0ICYgZmV0Y2gNCg==
---888574987-20871-995550055=:4658--


From klm@digicool.com  Thu Jul 19 23:43:45 2001
From: klm@digicool.com (Ken Manheimer)
Date: Thu, 19 Jul 2001 18:43:45 -0400 (EDT)
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <20010719105737.D7282@ute.cnri.reston.va.us>
Message-ID: <Pine.LNX.4.21.0107191833220.20828-100000@serenade.digicool.com>

On Thu, 19 Jul 2001, Andrew Kuchling wrote:

> On Thu, Jul 19, 2001 at 10:15:49AM -0400, Simon Cozens wrote:
> >If by UCS-2 you actually mean UTF-16, then using surrogates is the
> >right approach. :)
> 
> <head explodes> If a narrow Python uses UTF-16 (and it does seem to,
> according to PEP 100), then the configure script's
> --enable-unicode=ucs2 option should be changed, because it's
> misleading.

(-: I am becoming convinced that Unicode is a multi-national plot to take
over the minds of our most gifted (and/or most obsessive) programmers, in
pursuit of an elusive, unresolvable, and ultimately, undefinable goal.

To what point?  

To divert those of merit, and enable the emergence of the mediocritocricy
- a modest plot to elevate the overshadowed to positions of remotely
impressive power.

Now that i'm nearly convinced of this conspiracy, perhaps i should be
nearly committed...

Ken:-)



From tim.one@home.com  Fri Jul 20 00:05:21 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 19 Jul 2001 19:05:21 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <Pine.LNX.4.21.0107191833220.20828-100000@serenade.digicool.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEJKKPAA.tim.one@home.com>

[Ken Manheimer]
> (-: I am becoming convinced that Unicode is a multi-national plot to
> take over the minds of our most gifted (and/or most obsessive)
> programmers, in pursuit of an elusive, unresolvable, and ultimately,
> undefinable goal.
>
> To what point?

I'm afraid the universal adoption of the IEEE-754 floating-point standard
took the committee by surprise, and they had to start some other unboundedly
detailed yet inherently futile project lest they find themselves in need of
real jobs.

stare-at-a-zero-width-non-breaking-space-hard-enough-and-you'll-
    find-kahan-staring-right-back-ly y'rs  - tim



From barry@digicool.com  Fri Jul 20 00:36:53 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 19 Jul 2001 19:36:53 -0400
Subject: [Python-Dev] 2.2 Unicode questions
References: <20010719105737.D7282@ute.cnri.reston.va.us>
 <Pine.LNX.4.21.0107191833220.20828-100000@serenade.digicool.com>
Message-ID: <15191.28437.315103.405941@anthem.wooz.org>

>>>>> "KM" == Ken Manheimer <klm@digicool.com> writes:

    KM> (-: I am becoming convinced that Unicode is a multi-national
    KM> plot to take over the minds of our most gifted (and/or most
    KM> obsessive) programmers, in pursuit of an elusive,
    KM> unresolvable, and ultimately, undefinable goal.

From Andrew's (hilarious and wonderful) quotes page:

    http://www.amk.ca/quotations/python-quotes/page-7.html

Unicode: everyone wants it, until they get it.
Barry Warsaw, 16 May 2000


From DavidA@ActiveState.com  Fri Jul 20 01:07:35 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Thu, 19 Jul 2001 17:07:35 -0700
Subject: [Python-Dev] 2.2 Unicode questions
References: <Pine.LNX.4.21.0107191833220.20828-100000@serenade.digicool.com>
Message-ID: <3B577647.8F95721A@ActiveState.com>

Ken Manheimer wrote:

> (-: I am becoming convinced that Unicode is a multi-national plot to take
> over the minds of our most gifted (and/or most obsessive) programmers, in
> pursuit of an elusive, unresolvable, and ultimately, undefinable goal.

Amen brother.

Unicode is the first technology I have to deal with which makes me hope
I die before I really _really_ *really* need to understand it fully.

--david


From alex_c@MIT.EDU  Fri Jul 20 04:04:14 2001
From: alex_c@MIT.EDU (Alex Coventry)
Date: Thu, 19 Jul 2001 23:04:14 -0400
Subject: [Python-Dev] Pointers to python-dev threads pertaining to Patch #441791?
Message-ID: <200107200304.XAA15088@opus.mit.edu>

Hi, I posted a patch recently, #441791, causing "import foo.bar" to set
"sys.modules['foo'].bar = sys.modules['foo.bar']" even if an error is
raised during the importing of bar.  With this patch, import commands
like "import foo.bar; reload(foo.bar)" work in a fashion more consistent
with the way "import unpackaged_module; reload(unpackaged_module)"
works.  

Thomas Wouters posted a reply saying that this has been discussed on
python-dev before.  I've searched the archives for the keywords
"import.c", "import_submodule" (the function I modify,) and "package
import" but didn't turn up anything relevant.  Could someone point me at
a thread which discusses this?  This patch has proved very useful to me,
as I tend to carry around a lot of data in long-running Python
processes and rely on being able to reload submodules of a package.
It'd be nice if the patch got into Python itself, so that I can
retain my development habits without having to keep an eye on
import.c. :)

Alex.
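
For readers unfamiliar with the binding being discussed, a short sketch of
the successful-import case on a modern Python 3, where reload() lives in
importlib.  It uses the standard library's xml.dom purely as a convenient
package/submodule pair and does not exercise the failing-import case that
the patch addresses.

```python
import importlib
import sys

import xml.dom

# After a successful "import xml.dom", the submodule is reachable both
# through sys.modules and as an attribute of its parent package.
print(sys.modules["xml"].dom is sys.modules["xml.dom"])

# That attribute is what lets "reload(foo.bar)" name the module at all.
reloaded = importlib.reload(xml.dom)
print(reloaded is sys.modules["xml.dom"])
```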


From akuchlin@mems-exchange.org  Fri Jul 20 04:10:26 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 19 Jul 2001 23:10:26 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <3B577647.8F95721A@ActiveState.com>; from DavidA@activestate.com on Thu, Jul 19, 2001 at 05:07:35PM -0700
References: <Pine.LNX.4.21.0107191833220.20828-100000@serenade.digicool.com> <3B577647.8F95721A@ActiveState.com>
Message-ID: <20010719231026.A500@ute.cnri.reston.va.us>

On Thu, Jul 19, 2001 at 05:07:35PM -0700, David Ascher wrote:
>Unicode is the first technology I have to deal with which makes me hope
>I die before I really _really_ *really* need to understand it fully.

Welcome to the quote file (again), David!  And Barry, thanks for
posting your quote; if you hadn't posted it, I would have.  We mustn't
forget the third Unicode reference:

I never realized it before, but having looked that over I'm certain
I'd rather have my eyes burned out by zombies with flaming dung sticks
than work on a conscientious Unicode regex engine.
    -- Tim Peters, 3 Dec 1998

Doesn't anyone have *anything* nice to say about Unicode? 

--amk


From simon@netthink.co.uk  Fri Jul 20 04:23:33 2001
From: simon@netthink.co.uk (Simon Cozens)
Date: Thu, 19 Jul 2001 23:23:33 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <20010719231026.A500@ute.cnri.reston.va.us>
Message-ID: <20010719232333.A815@netthink.co.uk>

On Thu, Jul 19, 2001 at 11:10:26PM -0400, Andrew Kuchling wrote:
> Doesn't anyone have *anything* nice to say about Unicode? 

Sure: having had to deal with three different Japanese encodings, (at least)
two different Japanese character repertoires, one huge chunk of special-casing
code and *no* idea what's going on, I'll take the zombies with flaming sticks
any day. Thank you.

Simon


From paulp@ActiveState.com  Fri Jul 20 05:22:54 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Thu, 19 Jul 2001 21:22:54 -0700
Subject: [Python-Dev] 2.2 Unicode questions
References: <Pine.LNX.4.21.0107191833220.20828-100000@serenade.digicool.com>
Message-ID: <3B57B21E.AB922A9@ActiveState.com>

Ken Manheimer wrote:
> 
>...
> 
> (-: I am becoming convinced that Unicode is a multi-national plot to take
> over the minds of our most gifted (and/or most obsessive) programmers, in
> pursuit of an elusive, unresolvable, and ultimately, undefinable goal.

I know that you are half-kidding but if you think that
internationalization is hard now you should have seen it before Unicode.
Unicode is the *simplification*.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From esr@thyrsus.com  Fri Jul 20 05:48:04 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Fri, 20 Jul 2001 00:48:04 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <3B57B21E.AB922A9@ActiveState.com>; from paulp@ActiveState.com on Thu, Jul 19, 2001 at 09:22:54PM -0700
References: <Pine.LNX.4.21.0107191833220.20828-100000@serenade.digicool.com> <3B57B21E.AB922A9@ActiveState.com>
Message-ID: <20010720004804.H6164@thyrsus.com>

Paul Prescod <paulp@ActiveState.com>:
> I know that you are half-kidding but if you think that
> internationalization is hard now you should have seen it before Unicode.
> Unicode is the *simplification*.

Quite.  If Unicode is a horde of zombies with flaming dung sticks, the
hideous intricacies of JIS, Chinese Big-5, Chinese Traditional, KOI-8,
et cetera are at least an army of ogres with salt and flensing knives.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"The best we can hope for concerning the people at large is that they be
properly armed."
        -- Alexander Hamilton, The Federalist Papers at 184-188


From tim.one@home.com  Fri Jul 20 06:22:00 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 20 Jul 2001 01:22:00 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <20010719231026.A500@ute.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEKDKPAA.tim.one@home.com>

[Andrew Kuchling]
> Doesn't anyone have *anything* nice to say about Unicode?

Europeans do.  Unfortunately, the Japanese seem to have little use for it,
while the Anglo-Europeans keep repeating how it saves them from the
nightmare of dealing with Japanese <0.9 wink>.

just-so-long-as-we-don't-have-to-deal-with-the-french-ly y'rs  - tim



From DavidA@ActiveState.com  Fri Jul 20 06:49:52 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Thu, 19 Jul 2001 22:49:52 -0700
Subject: [Python-Dev] 2.2 Unicode questions
References: <LNBBLJKPBEHFEDALKOLCKEKDKPAA.tim.one@home.com>
Message-ID: <3B57C680.8271C765@ActiveState.com>

Tim Peters wrote:

> just-so-long-as-we-don't-have-to-deal-with-the-french-ly y'rs  - tim

I stopped using accents and cedillas in my french writings in 1986 when
I was stuck in a foreign land with IBM 3278 terminals on an EBCDIC
system.  Haven't missed 'em since.  My greek graduate student buddy
wrote greek in ASCII using a completely made-up transliteration system
known only to greek expatriates on the internet.  The Newton vs.
Graffiti debate has shown for the Nth time that people are more adaptive
than computers.  

7 bits is enough for anything worth saying.   Anything else consists of
error-correcting bits. =)

--david


From thomas@xs4all.net  Fri Jul 20 10:47:52 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 20 Jul 2001 11:47:52 +0200
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEKDKPAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCKEKDKPAA.tim.one@home.com>
Message-ID: <20010720114752.K2054@xs4all.nl>

On Fri, Jul 20, 2001 at 01:22:00AM -0400, Tim Peters wrote:
> [Andrew Kuchling]
> > Doesn't anyone have *anything* nice to say about Unicode?
> 
> Europeans do.

Nonsense (or should I say, 'hypergeneralization' ? :) Most Europeans can
deal with ISO8859-1 just fine... I can honestly say I don't care a burning
dung stick about unicode ;)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tim.one@home.com  Fri Jul 20 10:53:57 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 20 Jul 2001 05:53:57 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <20010720114752.K2054@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKOKPAA.tim.one@home.com>

[Andrew Kuchling]
>>> Doesn't anyone have *anything* nice to say about Unicode?

[Tim, severely cut]
>> Europeans do.

[Thomas Wouters, less severely cut]
> Nonsense (or should I say, 'hypergeneralization' ? :)

If Paul Prescod and Simon Cozens aren't Europeans, then I suppose I'm not
either.  QED.

grab-some-coffee-and-rediscover-your-sense-of-humor<wink>-ly y'rs  - tim



From moshez@zadka.site.co.il  Fri Jul 20 11:06:10 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 20 Jul 2001 13:06:10 +0300
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <20010720114752.K2054@xs4all.nl>
References: <20010720114752.K2054@xs4all.nl>, <LNBBLJKPBEHFEDALKOLCKEKDKPAA.tim.one@home.com>
Message-ID: <E15NXAg-0004UV-00@darjeeling>

On Fri, 20 Jul 2001 11:47:52 +0200, Thomas Wouters <thomas@xs4all.net> wrote:
 
> Nonsense (or should I say, 'hypergeneralization' ? :) Most Europeans can
> deal with ISO8859-1 just fine... I can honestly say I don't care a burning
> dung stick about unicode ;)

*West* Europeans!
East European languages are in -2, so East Europeans have a problem too.
And, since in the last European Python Meeting I was officially declared
to be a European too ;-), I must say that Unicode is a boon as far as Hebrew
is concerned.
-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE


From jack@oratrix.nl  Fri Jul 20 11:22:40 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Fri, 20 Jul 2001 12:22:40 +0200
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: Message by Thomas Wouters <thomas@xs4all.net> ,
 Fri, 20 Jul 2001 11:47:52 +0200 , <20010720114752.K2054@xs4all.nl>
Message-ID: <20010720102240.96711303181@snelboot.oratrix.nl>

> On Fri, Jul 20, 2001 at 01:22:00AM -0400, Tim Peters wrote:
> > [Andrew Kuchling]
> > > Doesn't anyone have *anything* nice to say about Unicode?
> > 
> > Europeans do.
> 
> Nonsense (or should I say, 'hypergeneralization' ? :) Most Europeans can
> deal with ISO8859-1 just fine...

... until you start supporting software for non-europeans. The various 8-bit 
macintosh codesets are hell to deal with, especially because you can't really 
test how things work if you speak a seven-bit-clean language with your 
computer as well as your loved ones. I assume the same problems apply to the 
windows codepage blabla.

At least unicode gives the whole world a common ground. Or let me rephrase 
that as "... should give...".
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 




From guido@digicool.com  Fri Jul 20 14:41:45 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 20 Jul 2001 09:41:45 -0400
Subject: [Python-Dev] Pointers to python-dev threads pertaining to Patch #441791?
In-Reply-To: Your message of "Thu, 19 Jul 2001 23:04:14 EDT."
 <200107200304.XAA15088@opus.mit.edu>
References: <200107200304.XAA15088@opus.mit.edu>
Message-ID: <200107201341.JAA09907@cj20424-a.reston1.va.home.com>

> Hi, I posted a patch recently, #441791, causing "import foo.bar" to set
> "sys.modules['foo'].bar = sys.modules['foo.bar']" even if an error is
> raised during the importing of bar.  With this patch, import commands
> like "import foo.bar; reload(foo.bar)" work in a fashion more consistent
> with the way "import unpackaged_module; reload(unpackaged_module)"
> works.  
> 
> Thomas Wouters posted a reply saying that this has been discussed on
> python-dev before.  I've searched the archives for the keywords
> "import.c", "import_submodule" (the function I modify,) and "package
> import" but didn't turn up anything relevant.  Could someone point me at
> a thread which discusses this?  This patch has proved very useful to me,
> as I tend to carry around a lot of data in long-running python process,
> and being able to reload submodules of a package has been very useful to
> me.  It'd be nice if the patch got into python itself, so that I can
> retain my development habits without having to keep an eye on
> import.c. :)

I hardly recall such a discussion, and I don't think that much light
was shed on the situation.

In any case, I agree it would be nice if this was fixed.  But I'm too
busy to look into this myself -- sorry.

Maybe Thomas was thinking of a different issue, where some people want
the sys.modules[name] entry to be *removed* when an import fails.  I
am not for that change, but I haven't recovered the reason (I know I
had a good one when I implemented things this way).
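
For readers who want to see the binding the patch is concerned with, here
is a minimal sketch of the successful case only, written for present-day
Python 3 rather than the 2.x import.c under discussion.  The package name
demo_pkg is a placeholder standing in for "foo"; the failure case that the
patch addresses is not reproduced here.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk; demo_pkg and bar are placeholder
# names standing in for "foo" and "bar" in the quoted message.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
for filename in ("__init__.py", "bar.py"):
    with open(os.path.join(pkg_dir, filename), "w") as handle:
        handle.write("")

sys.path.insert(0, root)
importlib.invalidate_caches()
importlib.import_module("demo_pkg.bar")

# After a successful import the submodule is bound on its parent package,
# which is the relationship sys.modules['foo'].bar = sys.modules['foo.bar'].
parent = sys.modules["demo_pkg"]
child = sys.modules["demo_pkg.bar"]
print(parent.bar is child)   # True
```

With that attribute in place, reloading the submodule through the parent
(importlib.reload(parent.bar) in Python 3) has something to work on.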

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com  Fri Jul 20 15:21:04 2001
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 20 Jul 2001 10:21:04 -0400
Subject: [Python-Dev] Pointers to python-dev threads pertaining to Patch #441791?
In-Reply-To: <200107201341.JAA09907@cj20424-a.reston1.va.home.com>
References: Your message of "Thu, 19 Jul 2001 23:04:14 EDT."             <200107200304.XAA15088@opus.mit.edu>
Message-ID: <3B580610.30772.ECBE36E@localhost>

[Alex Coventry]
> > Hi, I posted a patch recently, #441791, causing "import
> > foo.bar" to set "sys.modules['foo'].bar =
> > sys.modules['foo.bar']" even if an error is raised during the
> > importing of bar.  With this patch, import commands like
> > "import foo.bar; reload(foo.bar)" work in a fashion more
> > consistent with the way "import unpackaged_module;
> > reload(unpackaged_module)" works.  

[Guido] 

> In any case, I agree it would be nice if this was fixed.  

Import issues are subtle, but this looks good to me.

> Maybe Thomas was thinking of a different issue, where some people
> want the sys.modules[name] entry to be *removed* when an import
> fails.  I am not for that change, but I haven't recovered the
> reason (I know I had a good one when I implemented things this
> way).

Perhaps one could construct a situation with circular imports 
in which one module ends up with a name (and no error, 
because the name is being imported) that later turns into an 
error?

There's also the issue of failed relative imports that succeed 
as absolute imports - you don't want every module in the 
package hunting around for package.sys.

- Gordon


From mal@lemburg.com  Fri Jul 20 16:08:09 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 20 Jul 2001 17:08:09 +0200
Subject: [Python-Dev] Python 2.2a1 nits
Message-ID: <3B584959.8AABCC9E@lemburg.com>

Here's a summary of nits I found compiling Python 2.2a1:

* configure options: the options should follow a single
  methodology (either all use --with(out)-<feature> or all
  use --enable-<feature>/--disable-<feature>) and the
  defaults should be clearly indicated...

  --without-gcc                   never use gcc
  --with-cxx=<compiler>           enable C++ support
  --with-suffix=.exe              set executable suffix
  --with-pydebug                  build with Py_DEBUG defined
  --enable-ipv6                   Enable ipv6 (with ipv4) support
  --disable-ipv6                  Disable ipv6 support
  --with-libs='lib1 ...'          link against additional libs
  --with-signal-module            disable/enable signal module
  --with-dec-threads              use DEC Alpha/OSF1 thread-safe libraries
  --with(out)-threads[=DIRECTORY] disable/enable thread support
  --with(out)-thread[=DIRECTORY]  deprecated; use --with(out)-threads
  --with-pth                      use GNU pth threading libraries
  --with(out)-cycle-gc            disable/enable garbage collection
  --with(out)-pymalloc            disable/enable specialized mallocs
  --with-wctype-functions         use wctype.h functions
  --with-sgi-dl=DIRECTORY         IRIX 4 dynamic linking
  --with-dl-dld=DL_DIR,DLD_DIR    GNU dynamic linking
  --with-fpectl                   enable SIGFPE catching
  --with-libm=STRING              math library
  --with-libc=STRING              C library
  --enable-unicode[=ucs2,ucs4]    Enable Unicode strings (default is yes)

  I'd suggest going with --with(out)-<feature> since this
  seems to be the most often used one.

* warnings:

In file included from ./Modules/_sre.c:54:
Modules/sre.h:24: warning: `SRE_CODE' redefined
Modules/sre.h:19: warning: this is the location of the previous definition

libpython2.2.a(posixmodule.o): In function `posix_tmpnam':
/home/lemburg/orig/Python-2.2a1/./Modules/posixmodule.c:4262: the use of `tmpnam_r' is dangerous, better use `mkstemp'

libpython2.2.a(posixmodule.o): In function `posix_tempnam':
/home/lemburg/orig/Python-2.2a1/./Modules/posixmodule.c:4217: the use of `tempnam' is dangerous, better use `mkstemp'

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From barry@digicool.com  Fri Jul 20 16:13:25 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Fri, 20 Jul 2001 11:13:25 -0400
Subject: [Python-Dev] 2.2 Unicode questions
References: <20010719231026.A500@ute.cnri.reston.va.us>
 <LNBBLJKPBEHFEDALKOLCKEKDKPAA.tim.one@home.com>
Message-ID: <15192.19093.861241.438673@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim.one@home.com> writes:

    TP> Europeans do.  Unfortunately, the Japanese seem to have little
    TP> use for it, while the Anglo-Europeans keep repeating how it
    TP> saves them from the nightmare of dealing with Japanese <0.9
    TP> wink>.

Heck, it's even more regional than that.  Us Marylanders have some
/serious/ concerns about the characters down in Virginia.

-Barry


From mal@lemburg.com  Fri Jul 20 16:56:17 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 20 Jul 2001 17:56:17 +0200
Subject: [Python-Dev] mail.python.org black listed ?!
Message-ID: <3B5854A1.214EFF19@lemburg.com>

Since yesterday evening I haven't received any email from
mail.python.org. A look in my mail delivery log file showed
that sendmail is rejecting mails from mail.python.org 
(63.102.49.29) with the following message:

"""
5.3.0 Mail from 63.102.49.29 rejected - open relay;see http://www.orbs.org
"""

Checking www.orbs.org I find:

"""
Due to circumstances beyond our control, the ORBS website is no longer 
available. 
"""

I've disabled the orbs.org spam filter in sendmail for now,
but this could be a problem for others as well... 

Perhaps someone could find out how to remove mail.python.org 
from the ORBS black list since this is likely going to affect
more people who use sendmail.

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From mal@lemburg.com  Fri Jul 20 17:03:35 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 20 Jul 2001 18:03:35 +0200
Subject: [Python-Dev] mail.python.org black listed ?!
References: <3B5854A1.214EFF19@lemburg.com>
Message-ID: <3B585657.AEB8821E@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Since yesterday evening I haven't received any email from
> mail.python.org. A look in my mail delivery log file showed
> that sendmail is rejecting mails from mail.python.org
> (63.102.49.29) with the following message:
> 
> """
> 5.3.0 Mail from 63.102.49.29 rejected - open relay;see http://www.orbs.org
> """

FYI, starship.python.net [63.102.49.32] has the same problem.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From akuchlin@mems-exchange.org  Fri Jul 20 17:10:20 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 20 Jul 2001 12:10:20 -0400
Subject: [Python-Dev] mail.python.org black listed ?!
In-Reply-To: <3B5854A1.214EFF19@lemburg.com>; from mal@lemburg.com on Fri, Jul 20, 2001 at 05:56:17PM +0200
References: <3B5854A1.214EFF19@lemburg.com>
Message-ID: <20010720121020.B1470@ute.cnri.reston.va.us>

On Fri, Jul 20, 2001 at 05:56:17PM +0200, M.-A. Lemburg wrote:
>Since yesterday evening I haven't received any email from
>mail.python.org. A look in my mail delivery log file showed
>that sendmail is rejecting mails from mail.python.org 
>(63.102.49.29) with the following message:

See http://www.uwsg.iu.edu/hypermail/linux/kernel/0107.1/0929.html .

Short explanation: ORBS will, one eleventh of the time, report that any IP
address is an open relay.

Fix: your sysadmins have to stop using ORBS and switch to some other
open relay detection service.
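
As background on what such a check looks up: DNS-based block lists are
conventionally queried by reversing the octets of the connecting IP
address and appending the list's zone.  A minimal sketch, assuming that
convention; the zone "blocklist.example" is a placeholder, not a real
service, and no network lookup is performed.

```python
def dnsbl_query_name(ipv4, zone):
    """Build the DNS name a block-list client would look up for ipv4."""
    octets = ipv4.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    # Octets are reversed, as in in-addr.arpa names, then the zone is added.
    return ".".join(reversed(octets)) + "." + zone

print(dnsbl_query_name("63.102.49.29", "blocklist.example"))
# 29.49.102.63.blocklist.example
```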

--amk



From mal@lemburg.com  Fri Jul 20 17:23:46 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 20 Jul 2001 18:23:46 +0200
Subject: [Python-Dev] mail.python.org black listed ?!
References: <3B5854A1.214EFF19@lemburg.com> <20010720121020.B1470@ute.cnri.reston.va.us>
Message-ID: <3B585B12.57B409A6@lemburg.com>

Andrew Kuchling wrote:
> 
> On Fri, Jul 20, 2001 at 05:56:17PM +0200, M.-A. Lemburg wrote:
> >Since yesterday evening I haven't received any email from
> >mail.python.org. A look in my mail delivery log file showed
> >that sendmail is rejecting mails from mail.python.org
> >(63.102.49.29) with the following message:
> 
> See http://www.uwsg.iu.edu/hypermail/linux/kernel/0107.1/0929.html .
> 
> Short explanation: ORBS will, one eleventh of the time, report that any IP
> address is an open relay.
> 
> Fix: your sysadmins have to stop using ORBS and switch to some other
> open relay detection service.

Thanks for the pointer. I've switched off ORBS... and just
saw that I missed the nice Unicode thread yesterday ;-(

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From mal@lemburg.com  Fri Jul 20 17:39:30 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 20 Jul 2001 18:39:30 +0200
Subject: [Python-Dev] 2.2 Unicode questions
Message-ID: <3B585EC2.DF9225D@lemburg.com>

>From Andrew's new pass:

"""
Python's Unicode support has been enhanced a bit in 2.2.  Unicode
strings are usually stored as UTF-16, as 16-bit unsigned integers.
"""

Please replace UTF-16 with UCS-2. Python's Unicode implementation
does not support UTF-16 in a surrogate aware way, only some
of the codecs do this.

As a result, the internal storage format of Python is more
precisely described with UCS-2.

"""
Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned
integers, as its internal encoding by supplying
\longprogramopt{enable-unicode=ucs4} to the configure script.  When
built to use UCS-4 (a ``wide Python''), the interpreter can natively
handle Unicode characters from U+000000 to U+110000.  The range of
legal values for the \function{unichr()} function has been expanded;
it used to only accept values up to 65535, but in 2.2 will accept
values from 0 to 0x110000.  Using a ``narrow Python'', an interpreter
compiled to use UTF-16, values greater than 65535 will result in
\function{unichr()} returning a string of length 2:

\begin{verbatim}
>>> s = unichr(65536)
>>> s
u'\ud800\udc00'
>>> len(s)
2
\end{verbatim}

"""

Same here: UTF-16 -> UCS-2. Note that I very much favour
removing the surrogate generation in unichr() for UCS2-builds.

If I don't hear strong opposition, I'll disable this feature
which was added as part of the UCS-4 patches. unichr()
will then raise an exception as it did in version 2.1.

"""
This possibly-confusing behaviour, breaking the intuitive invariant
that \function{chr()} and \function{unichr()} always return strings of
length 1, may be changed later in 2.2, depending on public reaction.
"""

Right.
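
For reference, the length-2 result in the quoted verbatim example follows
from the standard UTF-16 surrogate arithmetic.  A minimal sketch, written
for present-day Python 3 rather than the 2.2 interpreter; the helper name
surrogate_pair is made up for illustration and is not part of any Python
API.

```python
def surrogate_pair(code_point):
    """Split a code point above U+FFFF into UTF-16 high/low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the supplementary range")
    offset = code_point - 0x10000      # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low

print([hex(unit) for unit in surrogate_pair(65536)])
# ['0xd800', '0xdc00']
```

The two code units match the u'\ud800\udc00' shown for unichr(65536) on a
narrow build.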

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From akuchlin@mems-exchange.org  Fri Jul 20 17:42:47 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 20 Jul 2001 12:42:47 -0400
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <3B585EC2.DF9225D@lemburg.com>; from mal@lemburg.com on Fri, Jul 20, 2001 at 06:39:30PM +0200
References: <3B585EC2.DF9225D@lemburg.com>
Message-ID: <20010720124247.A1769@ute.cnri.reston.va.us>

On Fri, Jul 20, 2001 at 06:39:30PM +0200, M.-A. Lemburg wrote:
>Same here: UTF-16 -> UCS-2. Note that I very much favour
>removing the surrogate generation in unichr() for UCS2-builds.

Do I understand the new behavior you intend to implement?
  * Narrow Python: unichr() accepts values from 0 .. 65535.  len(unichr(x))
    is always 1.
  * Wide Python: unichr() accepts values from 0 .. 0x110000.  len(unichr(x))
    is also always 1.
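
A minimal sketch of the two rules as stated, assuming Python 3's chr() as
a stand-in for unichr() and treating 0x110000 as an exclusive upper bound
(U+10FFFF is the highest code point).  The function name, the "wide" flag
and the error text are illustrative only.

```python
NARROW_LIMIT = 0x10000    # narrow build: 0 .. 65535
WIDE_LIMIT = 0x110000     # wide build: exclusive bound in this sketch

def proposed_unichr(value, wide):
    """Return a length-1 string or raise, never a surrogate pair."""
    limit = WIDE_LIMIT if wide else NARROW_LIMIT
    if not 0 <= value < limit:
        raise ValueError("unichr() arg not in range(0x%x)" % limit)
    return chr(value)

print(len(proposed_unichr(65535, wide=False)))   # 1
print(len(proposed_unichr(65536, wide=True)))    # 1
```

Under these rules proposed_unichr(65536, wide=False) raises ValueError
instead of returning a two-element string.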
 
--amk


From mal@lemburg.com  Fri Jul 20 17:49:04 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 20 Jul 2001 18:49:04 +0200
Subject: [Python-Dev] 2.2 Unicode questions
References: <3B585EC2.DF9225D@lemburg.com> <20010720124247.A1769@ute.cnri.reston.va.us>
Message-ID: <3B586100.B645A08C@lemburg.com>

Andrew Kuchling wrote:
> 
> On Fri, Jul 20, 2001 at 06:39:30PM +0200, M.-A. Lemburg wrote:
> >Same here: UTF-16 -> UCS-2. Note that I very much favour
> >removing the surrogate generation in unichr() for UCS2-builds.
> 
> Do I understand the new behavior you intend to implement?
>   * Narrow Python: unichr() accepts values from 0 .. 65535.  len(unichr(x))
>     is always 1.
>   * Wide Python: unichr() accepts values from 0 .. 0x110000.  len(unichr(x))
>     is also always 1.

Right.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@digicool.com  Fri Jul 20 18:03:32 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 20 Jul 2001 13:03:32 -0400
Subject: [Python-Dev] RELEASED: Python 2.1.1
Message-ID: <200107201703.NAA16871@cj20424-a.reston1.va.home.com>

I've released Python 2.1.1 today.  This is the final version of this
bugfix release for Python 2.1, and should be fully compatible with
Python 2.1.  There should be *no* reason to use 2.1 any more.  Many
thanks to Thomas Wouters for being the release manager!  Pick up your
copy here:

    http://www.python.org/2.1.1/

Python 2.1.1 is GPL-compatible.  This means that it is okay to
distribute Python binaries linked with GPL-licensed software; Python
itself is not released under the GPL but under a less restrictive
license which is Open Source compliant.

PS: I've noticed some disarray of mail sent through python.org; this
seems to have to do with the dysfunctional ORBS "spam-checker".  See
http://mail.python.org/pipermail/python-dev/2001-July/016151.html for
details.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com (Skip Montanaro)  Fri Jul 20 19:11:43 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Fri, 20 Jul 2001 13:11:43 -0500
Subject: [Python-Dev] mail.python.org black listed ?!
In-Reply-To: <20010720121020.B1470@ute.cnri.reston.va.us>
References: <3B5854A1.214EFF19@lemburg.com>
 <20010720121020.B1470@ute.cnri.reston.va.us>
Message-ID: <15192.29791.789277.909128@beluga.mojam.com>

    mal> A look in my mail delivery log file showed that sendmail is rejecting
    mal> mails from mail.python.org (63.102.49.29) with the following message:

    amk> See http://www.uwsg.iu.edu/hypermail/linux/kernel/0107.1/0929.html .

    amk> Short explanation: ORBS will, one eleventh of the time, report that
    amk> any IP address is an open relay.

More telling I think is that Ron Guilmette, the author of the note amk
referenced, felt that he couldn't recommend any of the other black list
"services".  The server that I run a few mailman lists on occasionally has
messages rejected by MAPS RBL.  Whenever I go there to see what's what, it
always tells me my server has never been on their black list.  I think the
whole black list exercise has been a net waste of time and certainly has
done little to stem the flow of email spam.

as-opposed-to-python-spam-ly y'rs,

Skip


From gward@python.net  Fri Jul 20 19:32:05 2001
From: gward@python.net (Greg Ward)
Date: Fri, 20 Jul 2001 14:32:05 -0400
Subject: [Python-Dev] Python 2.2a1 nits
In-Reply-To: <3B584959.8AABCC9E@lemburg.com>; from mal@lemburg.com on Fri, Jul 20, 2001 at 05:08:09PM +0200
References: <3B584959.8AABCC9E@lemburg.com>
Message-ID: <20010720143205.A2209@gerg.ca>

On 20 July 2001, M.-A. Lemburg said:
> * configure options: the options should follow a single
>   methodology (either all use --with(out)-<feature> or all
>   use --enable-<feature>/--disable-<feature>) and the
>   defaults should be clearly indicated...

Actually, the Autoconf docs say there is a difference between
"with/without" and "enable/disable":

  Some packages pay attention to `--enable-FEATURE' options to
  `configure', where FEATURE indicates an optional part of the package.
  They may also pay attention to `--with-PACKAGE' options, where PACKAGE
  is something like `gnu-as' or `x' (for the X Window System).  The
  `README' should mention any `--enable-' and `--with-' options that the
  package recognizes.

IOW, --enable is for internal features, --with is for interfaces to
external features/programs/libraries (as I read it).  So I think most of
the --with/--enable options are right, but:

>   --with-pydebug                  build with Py_DEBUG defined

should be --enable-pydebug

>   --with-signal-module            disable/enable signal module

should be --enable-signal-module (I think)

>   --with(out)-cycle-gc            disable/enable garbage collection
>   --with(out)-pymalloc            disable/enable specialized mallocs

should be --enable-cycle-gc and --enable-pymalloc

        Greg
-- 
Greg Ward - Linux weenie                                gward@python.net
http://starship.python.net/~gward/
"What do you mean -- a European or an African swallow?"


From guido@digicool.com  Fri Jul 20 19:31:45 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 20 Jul 2001 14:31:45 -0400
Subject: [Python-Dev] mail.python.org black listed ?!
In-Reply-To: Your message of "Fri, 20 Jul 2001 13:11:43 CDT."
 <15192.29791.789277.909128@beluga.mojam.com>
References: <3B5854A1.214EFF19@lemburg.com> <20010720121020.B1470@ute.cnri.reston.va.us>
 <15192.29791.789277.909128@beluga.mojam.com>
Message-ID: <200107201831.OAA22144@cj20424-a.reston1.va.home.com>

> I think the whole black list exercise has been a net waste of time
> and certainly has done little to stem the flow of email spam.

Amen.  My thoughts exactly.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Fri Jul 20 19:45:45 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 20 Jul 2001 14:45:45 -0400 (EDT)
Subject: [Python-Dev] [Q] patches in maintenace branch
Message-ID: <15192.31833.148845.560021@cj42289-a.reston1.va.home.com>

Thomas,
  It came out the other day that you'd rather not have anyone making
checkins on the maintenance branch, but would like to migrate patches
yourself.  Can you add a discussion about this to the bugfix-release
PEP?
  The discussion should include rationale and any information needed
to make branch management easier (cookbook-style instructions for
multiple merges, for example!), and guidelines so that the managers
for bugfix releases won't wait too long to integrate patches.
  One issue that might need to be addressed is that I'd like to be
able to keep a fairly up-to-date version of the patched documentation
available at http://python.sourceforge.net/maint-docs/, so I'd like to
discourage long periods between integrations (unless of course there's
nothing to merge in).
  I'd also like to thank you -- you did a great job as the release
manager for 2.1.1!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From guido@digicool.com  Fri Jul 20 20:48:23 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 20 Jul 2001 15:48:23 -0400
Subject: [Python-Dev] Pointers to python-dev threads pertaining to Patch #441791?
In-Reply-To: Your message of "Fri, 20 Jul 2001 10:21:04 EDT."
 <3B580610.30772.ECBE36E@localhost>
References: Your message of "Thu, 19 Jul 2001 23:04:14 EDT." <200107200304.XAA15088@opus.mit.edu>
 <3B580610.30772.ECBE36E@localhost>
Message-ID: <200107201948.PAA26281@cj20424-a.reston1.va.home.com>

http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=441791

> > In any case, I agree it would be nice if this was fixed.  
> 
> Import issues are subtle, but this looks good to me.

I've added my own version to the patch, which does the right thing if
the module has a SyntaxError, and also conforms to the style guide
(PEP 7).

> > Maybe Thomas was thinking of a different issue, where some people
> > want the sys.modules[name] entry to be *removed* when an import
> > fails.  I am not for that change, but I haven't recovered the
> > reason (I know I had a good one when I implemented things this
> > way).
> 
> Perhaps one could construct a situation with circular imports 
> in which one module ends up with a name (and no error, 
> because the name is being imported) that later turns into an 
> error?

Yes, that's the one (Moshe remembered this too in private mail).

> There's also the issue of failed relative imports that succeed 
> as absolute imports - you don't want every module in the 
> package hunting around for package.sys.

That's a different issue; those create None entries in sys.modules,
and obviously those None entries shouldn't be removed on failure.
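
A None entry can still be observed in present-day Python 3, although the
implicit relative imports that created such entries in 2001 are gone.
There, a None value in sys.modules marks a name as known to be absent and
makes the import fail at once.  A minimal sketch under that assumption;
the module name is a placeholder.

```python
import sys

name = "hypothetical_missing_module"   # placeholder, not a real module
sys.modules[name] = None               # record "known to be absent"
try:
    __import__(name)
    blocked = False
except ImportError:
    blocked = True
finally:
    del sys.modules[name]              # leave sys.modules as we found it

print(blocked)   # True
```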

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Fri Jul 20 22:02:42 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Fri, 20 Jul 2001 17:02:42 -0400 (EDT)
Subject: [Python-Dev] [development doc updates]
Message-ID: <20010720210242.487322892E@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Some additional information for extension writers.



From thomas@xs4all.net  Sat Jul 21 00:05:33 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 21 Jul 2001 01:05:33 +0200
Subject: [Python-Dev] Re: [Q] patches in maintenace branch
In-Reply-To: <15192.31833.148845.560021@cj42289-a.reston1.va.home.com>
Message-ID: <20010721010533.B2025@xs4all.nl>

On Fri, Jul 20, 2001 at 02:45:45PM -0400, Fred L. Drake, Jr. wrote:

>   It came out the other day that you'd rather not have anyone making
> checkins on the maintenance branch, but would like to migrate patches
> yourself.  Can you add a discussion about this to the bugfix-release
> PEP?

Yeah. I plan to update it with a bunch of practical info.

>   One issue that might need to be addressed is that I'd like to be
> able to keep a fairly up-to-date version of the patched documentation
> available at http://python.sourceforge.net/maint-docs/, so I'd like to
> discourage long periods between integrations (unless of course there's
> nothing to merge in).

You were implicitly allowed to do anything you want in the documentation
tree. Documentation doesn't create broken code, and I didn't expect you to
check in documentation that was wrong or for features that don't exist in
the maintenance branch. (But I kept an eye on what you checked in none the
less :)

>   I'd also like to thank you -- you did a great job as the release
> manager for 2.1.1!

Pfah, no, I didn't :) It started out good, but I lost energy and focus in
the end. This also has a lot to do with lousy planning on my side... My
girlfriend was scheduled to go on vacation three weeks ago; instead, she's
leaving tomorrow, and she decided to use tonight to rent a truck and move
all of our leftover stuff from the old house (which we still lease for free)
to the new one (which we've been living in for months :P) That's why I was
offline for most of the US afternoon, today.

Anyway, better luck next time :)

Mental-note-to-self--need-to-add-sourceforge-address-to-'alternates'-ly y'rs,
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Sat Jul 21 00:21:32 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 21 Jul 2001 01:21:32 +0200
Subject: [Python-Dev] mail.python.org black listed ?!
In-Reply-To: <200107201831.OAA22144@cj20424-a.reston1.va.home.com>
Message-ID: <20010721012132.A9882@xs4all.nl>

On Fri, Jul 20, 2001 at 02:31:45PM -0400, Guido van Rossum wrote:
> > I think the whole black list exercise has been a net waste of time
> > and certainly has done little to stem the flow of email spam.

> Amen.  My thoughts exactly.

I disagree, but for a very simple reason: I don't block based on blacklists,
I flag :) Blocking on the SMTP server is a bad idea, IMO too, though I can
understand that people running a single, small SMTP server want to block
spam at the earliest moment. But by flagging it, I can save it to a
different folder, or send auto-replies, or just colour it in my mail client
(mutt).

We have a basic procmailrc which does a whole boatload of spamchecks (ORBS,
RBL, DUL, RSS, various header-checks for illegal IP addresses, well known spam
software, well known addresses like friend@public.com, buffer-overflow
attempts, etc) and simply adds a header to my emails, which I then give a
scoring and color based on what spamtests it triggered. No single spam test
is 100% accurate, but I haven't seen false positives on something that has
ORBS or RBL *and* DUL, RSS, or one or more of the others.
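
The combination rule in that last sentence can be sketched as follows.
This is an illustration only, not the procmailrc or Perl script mentioned
in this message; the function name and the test labels are assumptions,
and a hit on both ORBS and RBL with nothing else is treated as not
corroborated.

```python
PRIMARY = {"ORBS", "RBL"}

def looks_like_spam(hits):
    """Flag a message only when a block-list hit is corroborated."""
    hits = set(hits)
    primary = hits & PRIMARY      # block-list hits
    secondary = hits - PRIMARY    # DUL, RSS, header checks, and so on
    return bool(primary) and bool(secondary)

print(looks_like_spam({"ORBS"}))          # False
print(looks_like_spam({"ORBS", "DUL"}))   # True
print(looks_like_spam({"DUL", "RSS"}))    # False
```

Flagging on the combined result, rather than rejecting on any single
test, keeps one unreliable list from discarding legitimate mail.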

Sadly, ORBS is gone, and MAPS is turning into a commercial service. We're
still debating, at work, whether to pay for it or not :P

If people are interested in the procmailrc, and the Perl script (sorry) it
uses, let me know and I'll see if we can distribute it. It's already
available to XS4ALL customers, together with a simple script to report spam
to Spamcop, which is especially easy to use from inside mutt :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tim.one@home.com  Sat Jul 21 01:31:17 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 20 Jul 2001 20:31:17 -0400
Subject: [Python-Dev] Re: [Q] patches in maintenace branch
In-Reply-To: <20010721010533.B2025@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEODKPAA.tim.one@home.com>

[Fred, to Thomas]
>   I'd also like to thank you -- you did a great job as the release
> manager for 2.1.1!

[Thomas Wouters]
> Pfah, no, I didn't :) It started out good, but I lost energy and focus
> in the end.

Welcome to the club.  No matter how much you may want to beat yourself up,
you did an infinitely better job than the non-existent 2.1.1 release manager
you didn't replace.  Since the release is out, and on schedule, the only
objective thing to be said is that your efforts met with more success than
97.31% of all industry releases!  If you're very nice to Guido, he may even
let you do it again <wink>.



From aahz@rahul.net  Sat Jul 21 04:54:40 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Fri, 20 Jul 2001 20:54:40 -0700 (PDT)
Subject: [Python-Dev] Re: [Q] patches in maintenace branch
In-Reply-To: <20010721010533.B2025@xs4all.nl> from "Thomas Wouters" at Jul 21, 2001 01:05:33 AM
Message-ID: <20010721035440.E659399C83@waltz.rahul.net>

Thomas Wouters wrote:
> On Fri, Jul 20, 2001 at 02:45:45PM -0400, Fred L. Drake, Jr. wrote:
>>
>>   I'd also like to thank you -- you did a great job as the release
>> manager for 2.1.1!
> 
> Pfah, no, I didn't :) 

As the author of PEP 6, I want to publicly echo Fred and Tim: the whole
exercise of 2.0.1 and 2.1.1 turned out better than I actually hoped
for.  I did not anticipate such a wholesale migration of minor fixes.

Quite frankly (and I know Tim will agree with me here ;-), I figured a
likely outcome was that the "put up or shut up" of PEP 6 would result in
no action.  I think it's wonderful that we've already had two
maintenance releases go so smoothly!
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From mal@lemburg.com  Sat Jul 21 11:28:39 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 21 Jul 2001 12:28:39 +0200
Subject: [Python-Dev] mail.python.org black listed ?!
References: <20010721012132.A9882@xs4all.nl>
Message-ID: <3B595957.1F4D85F5@lemburg.com>

Thomas Wouters wrote:
> 
> On Fri, Jul 20, 2001 at 02:31:45PM -0400, Guido van Rossum wrote:
> > > I think the whole black list exercise has been a net waste of time
> > > and certainly has done little to stem the flow of email spam.
> 
> > Amen.  My thoughts exactly.
> 
> I disagree, but for a very simple reason: I don't block based on blacklists,
> I flag :) Blocking on the SMTP server is a bad idea, IMO too, though I can
> understand that people running a single, small SMTP server want to block
> spam at the earliest moment. But by flagging it, I can save it to a
> different folder, or send auto-replies, or just colour it in my mail client
> (mutt).
> 
> We have a basic procmailrc which does a whole boatload of spamchecks (ORBS,
> RBL, DUL, RSS, various header-checks for illegal IP addresses, well known spam
> software, well known addresses like friend@public.com, buffer-overflow
> attempts, etc) and simply adds a header to my emails, which I then give a
> scoring and color based on what spamtests it triggered. No single spam test
> is 100% accurate, but I haven't seen false positives on something that has
> ORBS or RBL *and* DUL, RSS, or one or more of the others.
> 
> Sadly, ORBS is gone, and MAPS is turning into a commercial service. We're
> still debating, at work, whether to pay for it or not :P
> 
> If people are interested in the procmailrc, and the Perl script (sorry) it
> uses, let me know and I'll see if we can distribute it. It's already
> available to XS4ALL customers, together with a simple script to report spam
> to Spamcop, which is especially easy to use from inside mutt :)

Perhaps we should start a small project for such a tool written in
Python (to bring the subject back on topic ;-) and place it on
the web somewhere ?!

If we separate out the engine from the rest we could also have
different backends, e.g. one which hooks into .forward as filter,
a daemon style backend which does on-server flagging based on
imap, a Mailman filter backend which does the same for mailing
lists etc.

Would be cool to have python-list mark non-python spam using a 
special header automagically ;-)
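
A minimal sketch of what such an engine could look like in Python, assuming
it is nothing more than a list of named, weighted checks whose hits get
recorded in headers. The check functions, the weights, and the X-Spam-Tests
and X-Spam-Score header names are all hypothetical; none of them come from
the XS4ALL procmailrc:

```python
from email import message_from_string

# Hypothetical checks: each takes a parsed message and returns True on a hit.
def check_known_address(msg):
    return "friend@public.com" in msg.get("From", "")

def check_missing_date(msg):
    return "Date" not in msg

# (name, test function, weight) -- names and weights are made up for the demo.
CHECKS = [
    ("known-spam-address", check_known_address, 5),
    ("no-date", check_missing_date, 1),
]

def flag(msg, checks=CHECKS):
    """Flag rather than block: record which checks fired and a total score."""
    hits = [(name, weight) for name, test, weight in checks if test(msg)]
    if hits:
        msg["X-Spam-Tests"] = ", ".join(name for name, _ in hits)
        msg["X-Spam-Score"] = str(sum(weight for _, weight in hits))
    return msg

raw = "From: friend@public.com\nSubject: hi\n\nbody\n"
flagged = flag(message_from_string(raw))
print(flagged["X-Spam-Tests"], flagged["X-Spam-Score"])
```

Because flag() only needs a parsed message, a .forward filter, an IMAP
daemon or a Mailman hook could each wrap the same function.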

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@digicool.com  Sat Jul 21 14:40:50 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 21 Jul 2001 09:40:50 -0400
Subject: [Python-Dev] Re: [Q] patches in maintenace branch
In-Reply-To: Your message of "Fri, 20 Jul 2001 20:54:40 PDT."
 <20010721035440.E659399C83@waltz.rahul.net>
References: <20010721035440.E659399C83@waltz.rahul.net>
Message-ID: <200107211340.JAA29547@cj20424-a.reston1.va.home.com>

> Quite frankly (and I know Tim will agree with me here ;-), I figured a
> likely outcome was that the "put up or shut up" of PEP 6 would result in
> no action.

You understand the PEP process all too well...  :-)

But in this case I genuinely thought that we needed to do this.

> I think it's wonderful that we've already had two
> maintenance releases go so smoothly!

It may look smooth to *you*...  Behind the scenes, 2.1.1 final was a
bit of a last-minute mess. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@rahul.net  Sat Jul 21 15:22:59 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Sat, 21 Jul 2001 07:22:59 -0700 (PDT)
Subject: [Python-Dev] Re: [Q] patches in maintenace branch
In-Reply-To: <200107211340.JAA29547@cj20424-a.reston1.va.home.com> from "Guido van Rossum" at Jul 21, 2001 09:40:50 AM
Message-ID: <20010721142259.E06A799C83@waltz.rahul.net>

Guido van Rossum wrote:
>Aahz:
>>
>> I think it's wonderful that we've already had two
>> maintenance releases go so smoothly!
> 
> It may look smooth to *you*...  Behind the scenes, 2.1.1 final was a
> bit of a last-minute mess. :-)

My "smooth" was referring less to the actual build process than to the
lack of wrangling over what went in, despite the large numbers of
patches.  If in time we get no complaints about the inclusion of any
patch and y'all didn't miss any critical patches, I'll consider this a
complete, absolute, and unqualified success.

I mean, when I wrote the PEP, you'll recall that I originally tried to
stipulate a separate mailing list because I was worried that the traffic
would overwhelm python-dev.  Hasn't been an issue at *all*, in the end.
Having been on the inside of major political fights over what should go
into a maintenance release (and seeing some of the feature fights here
on python-dev), I truly do find the smoothness of 2.0.1 and 2.1.1
absolutely amazing.

I think kudos go all around to everyone who participated, and most
especially to Moshe and Thomas for orchestrating them.

And now I'm off to OSCON.  I probably won't be back for a week.  Hope to
see many of you in person!
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From thomas@xs4all.net  Sat Jul 21 17:02:02 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 21 Jul 2001 18:02:02 +0200
Subject: [Python-Dev] Pointers to python-dev threads pertaining to Patch #441791?
In-Reply-To: <200107201341.JAA09907@cj20424-a.reston1.va.home.com>
References: <200107200304.XAA15088@opus.mit.edu> <200107201341.JAA09907@cj20424-a.reston1.va.home.com>
Message-ID: <20010721180201.A619@xs4all.nl>

On Fri, Jul 20, 2001 at 09:41:45AM -0400, Guido van Rossum wrote:

> Maybe Thomas was thinking of a different issue, where some people want
> the sys.modules[name] entry to be *removed* when an import fails.

I was.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Sat Jul 21 22:17:13 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 21 Jul 2001 23:17:13 +0200
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
Message-ID: <3B59F159.919DF4C@lemburg.com>

I've started playing with making mxDateTime types subclassable
and have run into a few problems which the PEP does not seem
to have answers to:

1. Are tp_new et al. inherited by subclassed types ?

This is important when implementing the slot methods, since
they may then see types other than the one for which they
are defined (e.g. keeping a free list around will only
work for the original types, not subclassed ones).

2. In which order are the allocation/deallocation methods
of subclass and base class called (if at all) and how
does this depend on whether they are implemented or inherited ?

3. How can I make attributes visible in subclassed types ?

Even though I found out that I need to use the generic APIs
PyObject_GenericGet|SetAttr() for the tp_get|setattro to
make methods visible, attributes cannot be accessed (and this
even though dir(instance) displays them).

In any case, the new feature looks very promising !

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@digicool.com  Sat Jul 21 23:29:22 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 21 Jul 2001 18:29:22 -0400
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
In-Reply-To: Your message of "Sat, 21 Jul 2001 23:17:13 +0200."
 <3B59F159.919DF4C@lemburg.com>
References: <3B59F159.919DF4C@lemburg.com>
Message-ID: <200107212229.SAA00894@cj20424-a.reston1.va.home.com>

> I've started playing with making mxDateTime types subclassable

Cool!!!

> and have run into a few problems which the PEP does not seem
> to have answers to:
> 
> 1. Are tp_new et al. inherited by subclassed types ?

My apologies that this stuff is so underdocumented -- there's just so
*much* to be documented...  in typeobject.c, in inherit_slots(),
there's a call to COPYSLOT(tp_new), so the answer is yes.

> This is important when implementing the slot methods, since
> they may then see types other than the one for which they
> are defined (e.g. keeping a free list around will only
> work for the original types, not subclassed ones).

Yes, I've worked out a scheme to make this work, but I don't think
I've written it down anywhere yet.  If your tp_new calls tp_alloc, and
your tp_dealloc calls tp_free, then a subtype can override tp_alloc
*and* tp_free and the right thing will happen.  A subtype can also
*extend* tp_new and tp_dealloc.  (tp_new and tp_dealloc are sort-of
each other's companions, and ditto for tp_alloc and tp_free.)

> 2. In which order are the allocation/deallocation methods
> of subclass and base class called (if at all) and how
> does this depend on whether they are implemented or inherited ?

Here's the scheme.  A subtype's tp_new should call the base type's
tp_new, passing the subtype.  The base class will call tp_alloc, which
is the subtype's version.  Similar for deallocation: the subtype's
tp_dealloc calls the base type's tp_dealloc which calls tp_free which
is the subtype's version.
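
The creation half of this scheme can be sketched at the Python level, where
__new__ is roughly the counterpart of tp_new and object.__new__(cls) stands
in for the allocation step. This is only an analogue under those
assumptions, not the C code; Base, Sub and the calls list are hypothetical
names used for illustration:

```python
calls = []  # records (function, type passed in) so the call order is visible

class Base(object):
    def __new__(cls, *args):
        # cls is the type being instantiated, which may be a subtype.
        calls.append(("Base.__new__", cls.__name__))
        # Allocate an instance of cls, not of Base.
        return object.__new__(cls)

class Sub(Base):
    def __new__(cls, *args):
        calls.append(("Sub.__new__", cls.__name__))
        # Extend rather than replace: pass the subtype on to the base type.
        return Base.__new__(cls, *args)

obj = Sub()
print(type(obj).__name__, calls)
```

The base constructor never hard-codes its own type, which is what lets the
subtype's allocation be used.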

> 3. How can I make attributes visible in subclassed types ?
> 
> Even though I found out that I need to use the generic APIs
> PyObject_GenericGet|SetAttr() for the tp_get|setattro to
> make methods visible, attributes cannot be accessed (and this
> even though dir(instance) displays them).

Strange.  This should work.  Probably something's subtly wrong in your
setup.  Compare your code to xxsubtype.c.

> In any case, the new feature looks very promising !

Thanks!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@scottb.demon.co.uk  Sun Jul 22 01:45:19 2001
From: barry@scottb.demon.co.uk (Barry Scott)
Date: Sun, 22 Jul 2001 01:45:19 +0100
Subject: [Python-Dev] Leading with XML-RPC
In-Reply-To: <018501c109e0$c345a450$4ffa42d5@hagrid>
Message-ID: <000001c11247$94f7f2a0$060210ac@private>

> (fwiw, my current thinking is that SOAP is a flawed idea, and that the
> need for SOAP will go away when people get better XML/Schema tools,
> but that's another story.  and don't get me started on SOAP BDG...)

	Do you mean that the main claim to fame of SOAP
	is its standard encoding and that's just a schema?

	Don't we need that standard encoding schema?

		BArry
P.S.
	Any date on your 0.92 SOAP lib?



From guido@digicool.com  Sun Jul 22 05:36:38 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 22 Jul 2001 00:36:38 -0400
Subject: [Python-Dev] Future division patch available (PEP 238)
Message-ID: <200107220436.AAA05323@cj20424-a.reston1.va.home.com>

For those interested in the future of the division operator a la PEP
238, I've produced a reasonably complete patch (relative to the CVS
trunk, but it probably also works for the descr-branch or the 2.2a1
release).

Get it here:

  http://sourceforge.net/tracker/index.php?func=detail&aid=443474&group_id=5470&atid=305470

It works as follows:

- unconditionally, there's a new operator // that will always do int
  division (and an in-place companion //=).

- by default, / is unchanged (and so is /=).

- after "from __future__ import division", / is changed to return a
  float result from int or long operands (and so is /=).

Read the patch description for more details; the implementations of int
and float division are semi-lame.

There's no warning yet for int division returning a truncated result;
I'm not sure if I want such a warning to be part of 2.2 (maybe if it's
off by default).

I'm cc'ing Bruce Sherwood and Davin Scherer, because they asked for
this and used a similar implementation in VPython.  When this patch
(or something not entirely unlike it) is accepted into Python 2.2,
they will no longer have to maintain their own hacked Python.  (We've
already added 10**-15 returning a float to 2.2a1, also specifically
for them; that was easier because it used to be an error, so no
backwards compatibility code or future statement is necessary there.)

I thought again about the merits of the '//' operator vs. 'div'
(either as a function or as a keyword binary operator), and figured
that '//' is the best choice: it doesn't introduce a new keyword
(which would cause more pain), and it works as an augmented assignment
(//=) as well.
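
A short sketch of the intended behaviour, assuming an interpreter that
already has the // operator and the division future flag. The true_div
helper is hypothetical; it compiles an expression with the future flag
switched on, so the demo does not depend on a "from __future__ import
division" line being the first statement of the file:

```python
import __future__

def true_div(expr):
    # Compile expr as if "from __future__ import division" were in effect.
    code = compile(expr, "<demo>", "eval", __future__.division.compiler_flag)
    return eval(code)

floor_result = 7 // 2          # always int division for int operands
floor_negative = -7 // 2       # floors, i.e. rounds toward negative infinity
floor_float = 7.0 // 2         # floored, but still a float
true_result = true_div("7 / 2")  # / returns a float under the future flag

x = 7
x //= 2                        # the in-place companion

print(floor_result, floor_negative, floor_float, true_result, x)
```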

--Guido van Rossum (home page: http://www.python.org/~guido/)


From moshez@zadka.site.co.il  Sun Jul 22 06:14:42 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Sun, 22 Jul 2001 08:14:42 +0300
Subject: [Python-Dev] Future division patch available (PEP 238)
In-Reply-To: <200107220436.AAA05323@cj20424-a.reston1.va.home.com>
References: <200107220436.AAA05323@cj20424-a.reston1.va.home.com>
Message-ID: <E15OBZi-0001pj-00@darjeeling>

On Sun, 22 Jul 2001 00:36:38 -0400, Guido van Rossum <guido@digicool.com> wrote:

> For those interested in the future of the division operator a la PEP
> 238, I've produced a reasonably complete patch (relative to the CVS
> trunk, but it probably also works for the descr-branch or the 2.2a1
> release).

Do you want me to update PEP-0238 to reflect the new realities as
to "open issues"? (I saw you already added a link to the patch in the
PEP. Great!)

-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE


From mal@lemburg.com  Sun Jul 22 12:49:45 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 22 Jul 2001 13:49:45 +0200
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com>
Message-ID: <3B5ABDD9.1A73D7E4@lemburg.com>

Guido van Rossum wrote:
> 
> > I've started playing with making mxDateTime types subclassable
> 
> Cool!!!

A few people keep asking me for new features on those types, so
I guess enabling this for Python 2.2 would be a real advantage for 
them.

I still haven't found out how to solve the construction problem
though (the base type is hard coded into various factory functions
and methods)... the factory methods could use self.__class__
to solve this, but the factory functions would need some different
tweaking.
 
> > and have run into a few problems which the PEP does not seem
> > to have answers to:
> >
> > 1. Are tp_new et al. inherited by subclassed types ?
> 
> My apologies that this stuff is so underdocumented -- there's just so
> *much* to be documented...  in typeobject.c, in inherit_slots(),
> there's a call to COPYSLOT(tp_new), so the answer is yes.

Ok.
 
> > This is important when implementing the slot methods, since
> > they may then see types other than the one for which they
> > are defined (e.g. keeping a free list around will only
> > work for the original types, not subclassed ones).
> 
> Yes, I've worked out a scheme to make this work, but I don't think
> I've written it down anywhere yet.  If your tp_new calls tp_alloc, and
> your tp_dealloc calls tp_free, then a subtype can override tp_alloc
> *and* tp_free and the right thing will happen.  A subtype can also
> *extend* tp_new and tp_dealloc.  (tp_new and tp_dealloc are sort-of
> each other's companions, and ditto for tp_alloc and tp_free.)

So I will have to implement tp_free as well ?! Currently I have
tp_new (which calls tp_alloc), tp_alloc, tp_init for the creation
procedure and tp_dealloc (which does not call tp_free) for the
finalization.

I wonder whether it'd be a good idea to have a tp_del in there
as well (the __del__ at C level) which is then called instead
of tp_dealloc if set and which must call tp_dealloc if the
instance is going to be deleted for good.
 
> > 2. In which order are the allocation/deallocation methods
> > of subclass and base class called (if at all) and how
> > does this depend on whether they are implemented or inherited ?
> 
> Here's the scheme.  A subtype's tp_new should call the base type's
> tp_new, passing the subtype.  The base class will call tp_alloc, which
> is the subtype's version.  Similar for deallocation: the subtype's
> tp_dealloc calls the base type's tp_dealloc which calls tp_free which
> is the subtype's version.

Like this... ?

         subtype                  basetype
----------------------------------------------------
Creation

         tp_new(subtype) 
                               -> tp_new(subtype)    # calls tp_alloc & tp_init

         tp_alloc(subtype)     <-
                               -> tp_alloc(subtype)

         tp_init(instance)     <-
                               -> tp_init(instance)

Finalization

        (
         tp_delete(instance)
                               -> tp_delete(instance) # calls tp_dealloc if
                                                      # the instance should
                                                      # be deleted
        )
         tp_dealloc(instance)
                               -> tp_dealloc(instance) # calls tp_free

         tp_free(instance)     <-
                               -> tp_free(instance)

> > 3. How can I make attributes visible in subclassed types ?
> >
> > Even though I found out that I need to use the generic APIs
> > PyObject_GenericGet|SetAttr() for the tp_get|setattro to
> > make methods visible, attributes cannot be accessed (and this
> > even though dir(instance) displays them).
> 
> Strange.  This should work.  Probably something's subtly wrong in your
> setup.  Compare your code to xxsubtype.c.

The xxsubtype doesn't define any attributes and neither do lists
or dictionaries so there seems to be no precedent.

In mxDateTime under Python 2.1, the tp_getattr slot takes care of
processing attribute lookup. Now to enable the dynamic goodies in
Python 2.2, I have to provide the tp_getattro slot (and set it to
the generic APIs mentioned above). 

Since tp_getattro overrides the tp_getattr slot, I have to rely
on the generic APIs calling back to the tp_getattr slot to process 
the attributes which are not dynamically set by the user or a 
subclass. However, the new generic lookup APIs do not call the
tp_getattr slot at all and thus the attributes which were "defined"
by the tp_getattr in Python 2.1 are no longer visible.

- How do I have to implement attribute lookup in Python 2.2
  for TP_BASETYPEs (methods are now magically handled by the tp_methods
  slot, there doesn't seem to be a corresponding feature for attributes
  though) ?

- Could the generic APIs perhaps fall back to tp_getattr to make
  the transition from classic types to base types a little easier ?

Thanks, 
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Sun Jul 22 15:14:18 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 22 Jul 2001 16:14:18 +0200
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com>
Message-ID: <3B5ADFBA.A058A0D7@lemburg.com>

A suggestion after looking at the typeobject.c implementation:
wouldn't PyType_InitDict() better be named something like
PyType_InitType() ?! -- the API does so much more than only
init the tp_dict dictionary...

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@digicool.com  Sun Jul 22 16:14:34 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 22 Jul 2001 11:14:34 -0400
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
In-Reply-To: Your message of "Sun, 22 Jul 2001 16:14:18 +0200."
 <3B5ADFBA.A058A0D7@lemburg.com>
References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com>
 <3B5ADFBA.A058A0D7@lemburg.com>
Message-ID: <200107221514.LAA11364@cj20424-a.reston1.va.home.com>

> A suggestion after looking at the typeobject.c implementation:
> wouldn't PyType_InitDict() better be named something like
> PyType_InitType() ?! -- the API does so much more than only
> init the tp_dict dictionary...

Yes, absolutely.  I just haven't gotten around to it yet...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Sun Jul 22 16:49:55 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 22 Jul 2001 11:49:55 -0400
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
In-Reply-To: Your message of "Sun, 22 Jul 2001 13:49:45 +0200."
 <3B5ABDD9.1A73D7E4@lemburg.com>
References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com>
 <3B5ABDD9.1A73D7E4@lemburg.com>
Message-ID: <200107221549.LAA11503@cj20424-a.reston1.va.home.com>

> A few people keep asking me for new features on those types, so
> I guess enabling this for Python 2.2 would be a real advantage for 
> them.
> 
> I still haven't found out how to solve the construction problem
> though (the base type is hard coded into various factory functions
> and methods)... the factory methods could use self.__class__
> to solve this, but the factory functions would need some different
> tweaking.

Using the new "classmethod" feature you can make the factory functions
class methods.
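
A minimal sketch of the idea, using a hypothetical Date class rather than
the real mxDateTime types. The from_iso factory and the BusinessDate
subclass are made up for illustration, and the decorator spelling is the
later one (in Python 2.2 this was written from_iso = classmethod(from_iso)):

```python
class Date(object):
    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day

    @classmethod
    def from_iso(cls, text):
        # cls is whatever class the factory was called on, so the base type
        # is no longer hard-coded into the factory.
        year, month, day = (int(part) for part in text.split("-"))
        return cls(year, month, day)

class BusinessDate(Date):
    pass

d = BusinessDate.from_iso("2001-01-16")
print(type(d).__name__, d.year, d.month, d.day)
```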

> > Yes, I've worked out a scheme to make this work, but I don't think
> > I've written it down anywhere yet.  If your tp_new calls tp_alloc, and
> > your tp_dealloc calls tp_free, then a subtype can override tp_alloc
> > *and* tp_free and the right thing will happen.  A subtype can also
> > *extend* tp_new and tp_dealloc.  (tp_new and tp_dealloc are sort-of
> > each other's companions, and ditto for tp_alloc and tp_free.)
> 
> So I will have to implement tp_free as well ?! Currently I have
> tp_new (which calls tp_alloc), tp_alloc, tp_init for the creation
> procedure and tp_dealloc (which does not call tp_free) for the
> finalization.

Yes, if your tp_new calls tp_alloc, your tp_dealloc should call
tp_free.  Otherwise the user can override tp_alloc to use a different
heap, and tp_dealloc would mess up.

> I wonder whether it'd be a good idea to have a tp_del in there
> as well (the __del__ at C level) which is then called instead
> of tp_dealloc if set and which must call tp_dealloc if the
> instance is going to be deleted for good.

I've been thinking about this.  I don't think that's quite the right
protocol; I don't want to complicate the DECREF macro any more.  I
think that tp_dealloc must call tp_del and then decide whether to
proceed depending on the refcount.

> > > 2. In which order are the allocation/deallocation methods
> > > of subclass and base class called (if at all) and how
> > > does this depend on whether they are implemented or inherited ?
> > 
> > Here's the scheme.  A subtype's tp_new should call the base type's
> > tp_new, passing the subtype.  The base class will call tp_alloc, which
> > is the subtype's version.  Similar for deallocation: the subtype's
> > tp_dealloc calls the base type's tp_dealloc which calls tp_free which
> > is the subtype's version.
> 
> Like this... ?
> 
>          subtype                  basetype
> ----------------------------------------------------
> Creation
> 
>          tp_new(subtype) 
>                                -> tp_new(subtype)    # calls tp_alloc & tp_init
> 
>          tp_alloc(subtype)     <-
>                                -> tp_alloc(subtype)

Typically, the derived type's tp_alloc shouldn't call the base type's
tp_alloc -- tp_alloc is supposed to allocate memory for the actual
type, zero it, set the type pointer and reference count, and register
it with GC.  Any other initializations that can't be left to tp_init
(which is optional) are tp_new's responsibility.

>          tp_init(instance)     <-
>                                -> tp_init(instance)
> 
> Finalization
> 
>         (
>          tp_delete(instance)
>                                -> tp_delete(instance) # calls tp_dealloc if
>                                                       # the instance should
>                                                       # be deleted
>         )
>          tp_dealloc(instance)
>                                -> tp_dealloc(instance) # calls tp_free
> 
>          tp_free(instance)     <-
>                                -> tp_free(instance)

Likewise, tp_free needn't call the base tp_free.

> > > 3. How can I make attributes visible in subclassed types ?
> > >
> > > Even though I found out that I need to use the generic APIs
> > > PyObject_GenericGet|SetAttr() for the tp_get|setattro to
> > > make methods visible, attributes cannot be accessed (and this
> > > even though dir(instance) displays them).
> > 
> > Strange.  This should work.  Probably something's subtly wrong in your
> > setup.  Compare your code to xxsubtype.c.
> 
> The xxsubtype doesn't define any attributes and neither do lists
> or dictionaries so there seems to be no precedent.
> 
> In mxDateTime under Python 2.1, the tp_getattr slot takes care of
> processing attribute lookup. Now to enable the dynamic goodies in
> Python 2.2, I have to provide the tp_getattro slot (and set it to
> the generic APIs mentioned above). 
> 
> Since tp_getattro overrides the tp_getattr slot, I have to rely
> on the generic APIs calling back to the tp_getattr slot to process 
> the attributes which are not dynamically set by the user or a 
> subclass. However, the new generic lookup APIs do not call the
> tp_getattr slot at all and thus the attributes which were "defined"
> by the tp_getattr in Python 2.1 are no longer visible.
> 
> - How do I have to implement attribute lookup in Python 2.2
>   for TP_BASETYPEs (methods are now magically handled by the tp_methods
>   slot, there doesn't seem to be a corresponding feature for attributes
>   though) ?

Ah, now I see the question.  There's a tp_members slot, similar to the
tp_methods slot.  The tp_members slot is a pointer to a
NULL-terminated array of the same form that you would pass to
PyMember_Get().  If your attributes require custom computation,
there's also a tp_getset slot which points to a NULL-terminated array
of 'struct getsetlist' items, which specify a name, a getter C
function, a setter C function, and a context void *.  This means you
have to write a pair of (very simple) functions for each writable
attribute, or a single function per read-only attribute.  (The context
pointer gives you a chance to share function implementations, but
I haven't found the need for this yet.)

Examples of all of these can be found in typeobject.c, look for
type_getsets and type_members.
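
For comparison, here is a rough Python-level analogue of the getter/setter
pairs, assuming that property objects play a role similar to tp_getset
entries. It is a sketch only, not the C API; Point, Point3 and their
attributes are hypothetical names:

```python
class Point(object):
    def __init__(self, x):
        self._x = x

    def _get_x(self):
        return self._x

    def _set_x(self, value):
        self._x = float(value)

    # Writable attribute: a getter and a setter.
    x = property(_get_x, _set_x)
    # Read-only attribute: a single getter function.
    doubled = property(lambda self: 2 * self._x)

class Point3(Point):
    pass

p = Point3(1.5)
p.x = 4                      # goes through _set_x
print(p.x, p.doubled, "x" in Point.__dict__)
```

Both attributes live in the type's __dict__, so they stay visible in the
subclass and can be discovered by inspecting the type.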

> - Could the generic APIs perhaps fall back to tp_getattr to make
>   the transition from classic types to base types a little easier ?

I'd rather not: that would prevent discovery of attributes supported
by the classic tp_getattr.  The beauty of the new scheme is that *all*
attributes (methods and data) are listed in the type's __dict__.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Sun Jul 22 17:28:54 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 22 Jul 2001 18:28:54 +0200
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com>
 <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com>
Message-ID: <3B5AFF46.7BBCE245@lemburg.com>

Guido van Rossum wrote:
> 
> > A few people keep asking me for new features on those types, so
> > I guess enabling this for Python 2.2 would be a real advantage for
> > them.
> >
> > I still haven't found out how to solve the construction problem
> > though (the base type is hard coded into various factory functions
> > and methods)... the factory methods could use self.__class__
> > to solve this, but the factory functions would need some different
> > tweaking.
> 
> Using the new "classmethod" feature you can make the factory functions
> class methods.

Hmm, I don't like these class methods, but it would probably
help with the problem...

from mx.DateTime import DateTime

dt1 = DateTime(2001,1,16)
dt2 = DateTime.From("16. Januar 2001")

Still looks silly to me... (I don't like these class methods).
 
> > > Yes, I've worked out a scheme to make this work, but I don't think
> > > I've written it down anywhere yet.  If your tp_new calls tp_alloc, and
> > > your tp_dealloc calls tp_free, then a subtype can override tp_alloc
> > > *and* tp_free and the right thing will happen.  A subtype can also
> > > *extend* tp_new and tp_dealloc.  (tp_new and tp_dealloc are sort-of
> > > each other's companions, and ditto for tp_alloc and tp_free.)
> >
> > So I will have to implement tp_free as well ?! Currently I have
> > tp_new (which calls tp_alloc), tp_alloc, tp_init for the creation
> > procedure and tp_dealloc (which does not call tp_free) for the
> > finalization.
> 
> Yes, if your tp_new calls tp_alloc, your tp_dealloc should call
> tp_free.  Otherwise the user can override tp_alloc to use a different
> heap, and tp_dealloc would mess up.

Ok.
 
> > I wonder whether it'd be a good idea to have a tp_del in there
> > as well (the __del__ at C level) which is then called instead
> > of tp_dealloc if set and which must call tp_dealloc if the
> > instance is going to be deleted for good.
> 
> I've been thinking about this.  I don't think that's quite the right
> protocol; I don't want to complicate the DECREF macro any more.  I
> think that tp_dealloc must call tp_del and then decide whether to
> proceed depending on the refcount.

Have you tried to move the decref action into a separate function
(which is only called in case the refcount reaches 0) ? I think
that this could in fact enhance the overall performance since 
the compiler can then decide whether or not to inline the relevant
code.

I wonder what the impact would be...
 
> > > > 2. In which order are the allocation/deallocation methods
> > > > of subclass and base class called (if at all) and how
> > > > does this depend on whether they are implemented or inherited ?
> > >
> > > Here's the scheme.  A subtype's tp_new should call the base type's
> > > tp_new, passing the subtype.  The base class will call tp_alloc, which
> > > is the subtype's version.  Similar for deallocation: the subtype's
> > > tp_dealloc calls the base type's tp_dealloc which calls tp_free which
> > > is the subtype's version.
> >
> > Like this... ?
> >
> >          subtype                  basetype
> > ----------------------------------------------------
> > Creation
> >
> >          tp_new(subtype)
> >                                -> tp_new(subtype)    # calls tp_alloc & tp_init
> >
> >          tp_alloc(subtype)     <-
> >                                -> tp_alloc(subtype)
> 
> Typically, the derived type's tp_alloc shouldn't call the base type's
> tp_alloc -- tp_alloc is supposed to allocate memory for the actual
> type, zero it, set the type pointer and reference count, and register
> it with GC.  Any other initializations that can't be left to tp_init
> (which is optional) are tp_new's responsibility.

Good, so overriding the tp_alloc/free slots is generally not
a wise thing to do, I guess.
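
The call order quoted above has a rough analogue at the Python level,
sketched here under the assumption that __new__ plays the role of tp_new and
object.__new__ the role of tp_alloc. Base and Sub are hypothetical names.
The subtype's __new__ calls the base's __new__ and passes the subtype along,
so the allocation at the bottom of the chain produces an instance of the
subtype.

```python
calls = []  # records the order in which the constructors run


class Base:
    def __new__(cls, *args):
        calls.append(("Base.__new__", cls.__name__))
        # object.__new__ stands in for tp_alloc: it allocates an instance
        # of cls, which is the subtype when called through Sub.
        return super().__new__(cls)

    def __init__(self, *args):
        calls.append(("Base.__init__", type(self).__name__))


class Sub(Base):
    def __new__(cls, *args):
        calls.append(("Sub.__new__", cls.__name__))
        # Pass the subtype down so the base allocates the right type.
        return super().__new__(cls, *args)


obj = Sub(1, 2)
```

Every entry recorded in calls names Sub as the type being created, even the
ones made from inside Base.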
 
> >          tp_init(instance)     <-
> >                                -> tp_init(instance)
> >
> > Finalization
> >
> >         (
> >          tp_delete(instance)
> >                                -> tp_delete(instance) # calls tp_dealloc if
> >                                                       # the instance should
> >                                                       # be deleted
> >         )
> >          tp_dealloc(instance)
> >                                -> tp_dealloc(instance) # calls tp_free
> >
> >          tp_free(instance)     <-
> >                                -> tp_free(instance)
> 
> Likewise, tp_free needn't call the base tp_free.
> 
> > > > 3. How can I make attributes visible in subclassed types ?
> > > >
> > > > Even though I found out that I need to use the generic APIs
> > > > PyObject_GenericGet|SetAttr() for the tp_get|setattro to
> > > > make methods visible, attributes cannot be accessed (and this
> > > > even though dir(instance) displays them).
> > >
> > > Strange.  This should work.  Probably something's subtly wrong in your
> > > setup.  Compare your code to xxsubtype.c.
> >
> > The xxsubtype doesn't define any attributes and neither do lists
> > or dictionaries so there seems to be no precedent.
> >
> > In mxDateTime under Python 2.1, the tp_getattr slot takes care of
> > processing attribute lookup. Now to enable the dynamic goodies in
> > Python 2.2, I have to provide the tp_getattro slot (and set it to
> > the generic APIs mentioned above).
> >
> > Since tp_getattro overrides the tp_getattr slot, I have to rely
> > on the generic APIs calling back to the tp_getattr slots to process
> > the attributes which are not dynamically set by the user or a
> > subclass. However, the new generic lookup APIs do not call the
> > tp_getattr slot at all and thus the attributes which were "defined"
> > by the tp_getattr in Python 2.1 are no longer visible.
> >
> > - How do I have to implement attribute lookup in Python 2.2
> >   for TP_BASETYPEs (methods are now magically handled by the tp_methods
> >   slot, there doesn't seem to be a corresponding feature for attributes
> >   though) ?
> 
> Ah, now I see the question.  There's a tp_members slot, similar to the
> tp_methods slot.  The tp_members slot is a pointer to a
> NULL-terminated array of the same form that you would pass to
> PyMember_Get().  If your attributes require custom computation,
> there's also a tp_getset slot which points to a NULL-terminated array
> of 'struct getsetlist' items, which specify a name, a getter C
> function, a setter C function, and a context void *.  This means you
> have to write a pair of (very simple) functions for each writable
> attribute, or a single function per read-only attribute.  (The context
> pointer gives you a chance to share function implementations, but
> I haven't found the need for this yet.)
> 
> Examples of all of these can be found in typeobject.c, look for
> type_getsets and type_members.

Thanks. I'll take a look at the implementation ...
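
At the Python level, the closest analogue of a tp_getset entry is a
property: a getter (and optionally a setter) registered under a name in the
type's __dict__. The sketch below assumes that analogy and is not the C
implementation; Stamp and its attributes are hypothetical. It shows a
writable attribute, which needs a getter/setter pair, and a read-only
computed one, which needs only a getter. Both are discoverable through the
type.

```python
class Stamp:
    def __init__(self, seconds):
        self._seconds = seconds

    @property
    def seconds(self):
        # Writable attribute: needs a getter and a setter.
        return self._seconds

    @seconds.setter
    def seconds(self, value):
        self._seconds = int(value)

    @property
    def minutes(self):
        # Read-only computed attribute: a single getter is enough.
        return self._seconds / 60


s = Stamp(90)
s.seconds = 120
# Both attributes are listed in the type's __dict__.
listed = sorted(name for name in vars(Stamp) if not name.startswith("_"))
```

Assigning to s.minutes raises AttributeError because no setter was supplied.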
 
> > - Could the generic APIs perhaps fall back to tp_getattr to make
> >   the transition from classic types to base types a little easier ?
> 
> I'd rather not: that would prevent discovery of attributes supported
> by the classic tp_getattr.  The beauty of the new scheme is that *all*
> attributes (methods and data) are listed in the type's __dict__.

Uhm, I think you misunderstood me: tp_getattr is not used anymore
once the Python interpreter finds a tp_getattro slot 
implementation, so there's nothing to prevent ;-):

PyObject_GetAttr() does not use tp_getattr if tp_getattro is 
defined, while PyObject_GetAttrString() prefers tp_getattr over
tp_getattro -- something is not symmetric here !

As a result, dir() finds the __members__ attribute which lists
the attributes (it uses PyObject_GetAttrString()), but 
instance.attribute does not work because it uses PyObject_GetAttr().

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From guido@digicool.com  Sun Jul 22 17:42:49 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 22 Jul 2001 12:42:49 -0400
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
In-Reply-To: Your message of "Sun, 22 Jul 2001 18:28:54 +0200."
 <3B5AFF46.7BBCE245@lemburg.com>
References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com>
 <3B5AFF46.7BBCE245@lemburg.com>
Message-ID: <200107221642.MAA11705@cj20424-a.reston1.va.home.com>

> Hmm, I don't like these class methods, but it would probably
> help with the problem...
> 
> from mx.DateTime import DateTime
> 
> dt1 = DateTime(2001,1,16)
> dt2 = DateTime.From("16. Januar 2001")
> 
> Still looks silly to me... (I don't like these class methods).

Maybe it's time you started to like them. :-)

> > I've been thinking about this.  I don't think that's quite the right
> > protocol; I don't want to complicate the DECREF macro any more.  I
> > think that tp_dealloc must call tp_del and then decide whether to
> > proceed depending on the refcount.
> 
> Have you tried to move the decref action into a separate function
> (which is only called in case the refcount reaches 0) ? I think
> that this could in fact enhance the overall performance since 
> the compiler can then decide whether or not to inline the relevant
> code.
> 
> I wonder what the impact would be...

My gut tells me that the compiler will usually *not* inline it, and
then it will slow deallocation down by one extra function call.  And
if the compiler *does* inline it, it's code bloat.  So either way you
lose, my gut tells me.  (The dealloc functions for most common types
are very fast and I would hate to see them slow down.)

> Good, so overriding the tp_alloc/free slots is generally not
> a wise thing to do, I guess.

If the base type has a custom free list (like the int type does), you
*have* to override it if the instances of the subtype are larger than
the base type.  Currently int doesn't allow subtyping yet because I
haven't refactored its code in this area yet.

> > > - Could the generic APIs perhaps fall back to tp_getattr to make
> > >   the transition from classic types to base types a little easier ?
> > 
> > I'd rather not: that would prevent discovery of attributes supported
> > by the classic tp_getattr.  The beauty of the new scheme is that *all*
> > attributes (methods and data) are listed in the type's __dict__.
> 
> Uhm, I think you misunderstood me: tp_getattr is not used anymore
> once the Python interpreter finds a tp_getattro slot 
> implementation, so there's nothing to prevent ;-):
> 
> PyObject_GetAttr() does not use tp_getattr if tp_getattro is 
> defined, while PyObject_GetAttrString() prefers tp_getattr over
> tp_getattro -- something is not symmetric here !
> 
> As a result, dir() finds the __members__ attribute which lists
> the attributes (it uses PyObject_GetAttrString()), but 
> instance.attribute does not work because it uses PyObject_GetAttr().

The simplified rule is that a type should only provide *either*
tp_getattr *or* tp_getattro, and likewise for set.  The complete rule
is that if you insist on having both tp_getattr and tp_getattro, they
should implement the same semantics -- tp_getattr should be faster
when PyObject_GetAttrString() is called, and tp_getattro should be
faster when PyObject_GetAttr() is called.

Apparently you left your tp_getattr implementation in place but added
PyObject_GenericGetAttr to the tp_getattro slot -- this simply doesn't
follow the rules.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Sun Jul 22 18:41:56 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 22 Jul 2001 19:41:56 +0200
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com>
 <3B5AFF46.7BBCE245@lemburg.com> <200107221642.MAA11705@cj20424-a.reston1.va.home.com>
Message-ID: <3B5B1064.5882398A@lemburg.com>

Guido van Rossum wrote:
> 
> > Hmm, I don't like these class methods, but it would probably
> > help with the problem...
> >
> > from mx.DateTime import DateTime
> >
> > dt1 = DateTime(2001,1,16)
> > dt2 = DateTime.From("16. Januar 2001")
> >
> > Still looks silly to me... (I don't like these class methods).
> 
> Maybe it's time you started to like them. :-)

I'll have a hard time finding my way through all these extra
dots in the names ;-)
 
> > Good, so overriding the tp_alloc/free slots is generally not
> > a wise thing to do, I guess.
> 
> If the base type has a custom free list (like the int type does), you
> *have* to override it if the instances of the subtype are larger than
> the base type.  Currently int doesn't allow subtyping yet because I
> haven't refactored its code in this area yet.

What I did was to enhance the base class' tp_alloc and tp_dealloc 
APIs to only use the free list in case the type being passed to the
APIs is a base type; in all other cases, standard processing takes
place.

Perhaps ints could do the same ?
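
The idea can be modelled in a few lines of Python. This is only a sketch
under stated assumptions: Base, Sub, the _free_list attribute, and the
release() method are hypothetical, and release() stands in for what
tp_dealloc would do in C. Instances are recycled through the free list only
when their type is exactly the base type; subtypes fall through to standard
allocation.

```python
class Base:
    _free_list = []  # recycled instances whose type is exactly Base

    def __new__(cls):
        # Use the free list only for the exact base type.
        if cls is Base and Base._free_list:
            return Base._free_list.pop()
        # Standard processing for subtypes or when the list is empty.
        return super().__new__(cls)

    def release(self):
        # Stand-in for tp_dealloc: recycle only exact Base instances.
        if type(self) is Base:
            Base._free_list.append(self)


class Sub(Base):
    pass


a = Base()
a.release()
b = Base()      # comes back from the free list

s = Sub()
s.release()     # not recycled: Sub instances may be larger than Base
```

Here b is the same object as a, while releasing s leaves the free list empty.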

> > > > - Could the generic APIs perhaps fall back to tp_getattr to make
> > > >   the transition from classic types to base types a little easier ?
> > >
> > > I'd rather not: that would prevent discovery of attributes supported
> > > by the classic tp_getattr.  The beauty of the new scheme is that *all*
> > > attributes (methods and data) are listed in the type's __dict__.
> >
> > Uhm, I think you misunderstood me: tp_getattr is not used anymore
> > once the Python interpreter finds a tp_getattro slot
> > implementation, so there's nothing to prevent ;-):
> >
> > PyObject_GetAttr() does not use tp_getattr if tp_getattro is
> > defined, while PyObject_GetAttrString() prefers tp_getattr over
> > tp_getattro -- something is not symmetric here !
> >
> > As a result, dir() finds the __members__ attribute which lists
> > the attributes (it uses PyObject_GetAttrString()), but
> > instance.attribute does not work because it uses PyObject_GetAttr().
> 
> The simplified rule is that a type should only provide *either*
> tp_getattr *or* tp_getattro, and likewise for set.  The complete rule
> is that if you insist on having both tp_getattr and tp_getattro, they
> should implement the same semantics -- tp_getattr should be faster
> when PyObject_GetAttrString() is called, and tp_getattro should be
> faster when PyObject_GetAttr() is called.

Ah, ok, didn't know that rule.
 
> Apparently you left your tp_getattr implementation in place but added
> PyObject_GenericGetAttr to the tp_getattro slot -- this simply doesn't
> follow the rules.

Yep. That's what I did.

I'll move to the new scheme for 2.2 then and leave the old tp_getattr
around for backward compatibility.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From guido@digicool.com  Mon Jul 23 03:40:27 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 22 Jul 2001 22:40:27 -0400
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
In-Reply-To: Your message of "Sun, 22 Jul 2001 19:41:56 +0200."
 <3B5B1064.5882398A@lemburg.com>
References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com> <3B5AFF46.7BBCE245@lemburg.com> <200107221642.MAA11705@cj20424-a.reston1.va.home.com>
 <3B5B1064.5882398A@lemburg.com>
Message-ID: <200107230240.WAA14644@cj20424-a.reston1.va.home.com>

> What I did was to enhance the base class' tp_alloc and tp_dealloc 
> APIs to only use the free list in case the type being passed to the
> APIs is a base type; in all other cases, standard processing takes
> place.
> 
> Perhaps ints could do the same ?

Yes, that's what I was planning to do.

> > The simplified rule is that a type should only provide *either*
> > tp_getattr *or* tp_getattro, and likewise for set.  The complete rule
> > is that if you insist on having both tp_getattr and tp_getattro, they
> > should implement the same semantics -- tp_getattr should be faster
> > when PyObject_GetAttrString() is called, and tp_getattro should be
> > faster when PyObject_GetAttr() is called.
> 
> Ah, ok, didn't know that rule.

Well, I just made it up today. :-)

But it's a sensible rule, if you want predictable results.

> I'll move to the new scheme for 2.2 then and leave the old tp_getattr
> around for backward compatibility.

You should #ifdef on the Python version, unless you make your
tp_getattr do everything that tp_getattro does (possibly by calling on
the latter).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From MarkH@ActiveState.com  Tue Jul 24 00:02:41 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Mon, 23 Jul 2001 16:02:41 -0700
Subject: [Python-Dev] 2.2 Unicode questions
In-Reply-To: <3B56DB33.71C9161B@lemburg.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPMEMEEDAA.MarkH@ActiveState.com>

> Guido van Rossum wrote:
> >
> > > First, a short one, Mark Hammond's patch for supporting MBCS on
> > > Windows.  I trust everyone can handle a little bit of TeX markup?
> > >
> > >   % XXX is this explanation correct?
> > >   \item When presented with a Unicode filename on Windows, Python will
> > >   now correctly convert it to a string using the MBCS encoding.
> > >   Filenames on Windows are a case where Python's choice of ASCII as
> > >   the default encoding turns out to be an annoyance.
> > >
> > >   This patch also adds \samp{et} as a format sequence to
> > >   \cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and
> > >   an encoding name, and converts it to the given encoding if the
> > >   parameter turns out to be a Unicode string, or leaves it alone if
> > >   it's an 8-bit string, assuming it to already be in the desired
> > >   encoding.  (This differs from the \samp{es} format character, which
> > >   assumes that 8-bit strings are in Python's default ASCII encoding
> > >   and converts them to the specified new encoding.)
> > >
> > >   (Contributed by Mark Hammond with assistance from Marc-Andr\'e
> > >   Lemburg.)
> >
> > I learned something here, so I hope this is correct. :-)
>
> The last part is... the rest is for Mark to comment on.

Sorry for the delay - I hope this response is not too late.  The description
is technically correct, but may be better phrased as:

\item When presented with a Unicode filename on Windows, Python will
now convert it to an MBCS encoded string, as used by the Microsoft
file APIs.  As MBCS is explicitly used by the file APIs,
the default Python encoding (be it ASCII or any other encoding
explicitly set) is generally not appropriate for these conversions.

Mark.



From fredrik@pythonware.com  Mon Jul 23 08:45:35 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 23 Jul 2001 09:45:35 +0200
Subject: [Python-Dev] 2.2 Unicode questions
References: <3B585EC2.DF9225D@lemburg.com>
Message-ID: <01ae01c1134b$76ca7820$4ffa42d5@hagrid>

mal wrote:
> Same here: UTF-16 -> UCS-2. Note that I very much favour
> removing the surrogate generation in unichr() for UCS2-builds.
> 
> If I don't hear strong opposition, I'll disable this feature
> which was added as part of the UCS-4 patches. unichr()
> will then raise an exception as it did in version 2.1.

the rationale behind this change was that unichr() should
behave like the \U escape.

(they both take a 32-bit character code, and turn it into
a unicode string; see GvR's mails in the ucs4 thread for more
on this topic).

don't change one of them without considering if the other
one really does the right thing.

</F>



From mal@lemburg.com  Mon Jul 23 09:52:18 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 23 Jul 2001 10:52:18 +0200
Subject: [Python-Dev] 2.2 Unicode questions
References: <3B585EC2.DF9225D@lemburg.com> <01ae01c1134b$76ca7820$4ffa42d5@hagrid>
Message-ID: <3B5BE5C2.878EB6A2@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> > Same here: UTF-16 -> UCS-2. Note that I very much favour
> > removing the surrogate generation in unichr() for UCS2-builds.
> >
> > If I don't hear strong opposition, I'll disable this feature
> > which was added as part of the UCS-4 patches. unichr()
> > will then raise an exception as it did in version 2.1.
> 
> the rationale behind this change was that unichr() should
> behave like the \U escape.

Please note that unichr() is a low-level API which is part
of the Unicode implementation. The implementation itself
does not handle surrogates in any special way, only the codecs
do (and after my last checkin unicode-escape and UTF-16 do
handle surrogates correctly).

To simplify the picture: the implementation itself only sees
UCS-2 or UCS-4 depending on the compile time option and these
do not treat surrogates in any special way except reserve
code points for their usage. Accordingly, unichr() should not
create UTF-16 but UCS-2 for narrow builds and UCS-4 on wide
builds (unichr() is a constructor for code units, not code 
points).

If an application needs a UTF-16-generating API, then it can 
easily implement one using the UCS-2 generating
unichr() API to create Unicode code units representing 
isolated surrogates.
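
A minimal sketch of such a helper, written in modern Python and returning
integers rather than calling unichr(). The function name utf16_code_units is
an assumption made for this example. It applies the standard UTF-16 rule: a
code point above U+FFFF has 0x10000 subtracted, and the remaining 20 bits are
split across a high and a low surrogate.

```python
def utf16_code_units(code_point):
    """Return the UTF-16 code units (as ints) for one Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of range")
    if code_point < 0x10000:
        return [code_point]          # fits in a single 16-bit unit
    offset = code_point - 0x10000    # 20 bits remain
    high = 0xD800 + (offset >> 10)   # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
    return [high, low]


pair = utf16_code_units(0x10000)
```

On a narrow build, each of the two returned values could then be turned into
a one-unit string separately.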

> (they both take a 32-bit character code, and turn it into
> a unicode string; see GvR's mails in the ucs4 thread for more
> on this topic).
> 
> don't change one of them without considering if the other
> one really does the right thing.

<plug> 

For those of you who are not too much into all these
code unit vs. code point vs. character discussions, a look at
the slides of the talk I gave at the European Python Meeting 
in Bordeaux may provide some insights:

	http://www.lemburg.com/python/Unicode-Talk.pdf

</plug>

Cheers,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From fredrik@pythonware.com  Mon Jul 23 11:00:16 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 23 Jul 2001 12:00:16 +0200
Subject: [Python-Dev] 2.2 Unicode questions
References: <3B585EC2.DF9225D@lemburg.com> <01ae01c1134b$76ca7820$4ffa42d5@hagrid> <3B5BE5C2.878EB6A2@lemburg.com>
Message-ID: <032801c1135e$490c4900$4ffa42d5@hagrid>

MAL wrote:
> Please note that unichr() is a low-level API which is part
> of the Unicode implementation.

well, I thought unichr() was a built-in Python function...

> To simplify the picture: the implementation itself only sees
> UCS-2 or UCS-4 depending on the compile time option and these
> do not treat surrogates in any special way except reserve
> code points for their usage. Accordingly, unichr() should not
> create UTF-16 but UCS-2 for narrow builds and UCS-4 on wide
> builds

you didn't answer my question: is there any reason why
unichr(0xXXXXXXXX) shouldn't return exactly the same
thing as "\UXXXXXXXX" ?

in 2.0 and 2.1, it doesn't.  in 2.2, it does.

> (unichr() is a constructor for code units, not code points).

really?  according to the documentation, it creates unicode
*characters*.  so does \U, according to the documentation.

imo, it makes more sense to let "characters" mean code points
than code units, but that's me.  the important thing here is to
figure out if \U and unichr are the same thing, and fix the code
and the documentation to do/say what we mean.

</F>



From mal@lemburg.com  Mon Jul 23 11:36:38 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 23 Jul 2001 12:36:38 +0200
Subject: [Python-Dev] 2.2 Unicode questions
References: <3B585EC2.DF9225D@lemburg.com> <01ae01c1134b$76ca7820$4ffa42d5@hagrid> <3B5BE5C2.878EB6A2@lemburg.com> <032801c1135e$490c4900$4ffa42d5@hagrid>
Message-ID: <3B5BFE36.B43F33A5@lemburg.com>

Fredrik Lundh wrote:
> 
> MAL wrote:
> > To simplify the picture: the implementation itself only sees
> > UCS-2 or UCS-4 depending on the compile time option and these
> > do not treat surrogates in any special way except reserve
> > code points for their usage. Accordingly, unichr() should not
> > create UTF-16 but UCS-2 for narrow builds and UCS-4 on wide
> > builds
> 
> you didn't answer my question: is there any reason why
> unichr(0xXXXXXXXX) shouldn't return exactly the same
> thing as "\UXXXXXXXX" ?
> 
> in 2.0 and 2.1, it doesn't.  in 2.2, it does.
>
> > (unichr() is a constructor for code units, not code points).

Doesn't this answer your question ? The point I wanted to
make was that unichr() is a constructor for a single code unit
just like chr() is a constructor for a single code unit -- in
that sense the storage format used by the implementation defines
the outcome: for UCS-2 builds, it can only create UCS-2 values,
for UCS-4 builds, UCS-4 values are possible as well.
 
The question of u"\UXXXXXXXX" creating surrogates on UCS-2
builds is different: \UXXXXXXXX is an encoding of a Unicode
code point, so the codec has to decide whether or not to
map this to two code units or an exception on UCS-2 builds.

> really?  according to the documentation, it creates unicode
> *characters*.  so does \U, according to the documentation.
> 
> imo, it makes more sense to let "characters" mean code points
> than code units, but that's me. 

The term "character" is vastly overloaded. There are three
different forms of interpretation: graphemes (this is what
a user usually sees as a character on her display), code points
(this is what Unicode encodes) and code units (this is what
the implementation uses as an atom for storing code points).
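
The three interpretations can give three different counts for the same text.
The sketch below assumes a current Python 3, where len() on a str counts
code points; the helper utf16_units is a name made up for this example and
counts 16-bit code units by encoding to UTF-16.

```python
import unicodedata

accented = "e\u0301"     # "e" plus COMBINING ACUTE ACCENT: one grapheme
astral = "\U0001F600"    # a single code point outside the BMP


def utf16_units(s):
    # Each UTF-16 code unit occupies two bytes in the encoded form.
    return len(s.encode("utf-16-le")) // 2


accented_points = len(accented)        # code points
accented_units = utf16_units(accented) # code units
astral_points = len(astral)
astral_units = utf16_units(astral)
composed = unicodedata.normalize("NFC", accented)
```

The accented letter is one grapheme but two code points, and normalizing to
NFC folds it into one. The astral character is one code point but two UTF-16
code units.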

Since Python exposes code units (u[0] gives you direct access
to the implementation defined storage area) and makes no
assumption about surrogates, it would not be a good idea to
suddenly introduce a break in the meaning of the outcome of
indexing into a Unicode string (u[0]) and len(unichr()).

I know that the name unichr() does not help in this situation,
the correct name would be unicodeunit().

> the important thing here is to
> figure out if \U and unichr are the same thing, and fix the code
> and the documentation to do/say what we mean.

Right.

Note that apart from agreeing on a common meaning, we should
also think about the consequences of breaking len(unichr())==1,
e.g. when creating a Unicode string using unichr() you'd expect
to find the generated code unit at the position you appended
it to the Unicode object.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From thomas@xs4all.net  Mon Jul 23 12:04:54 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 23 Jul 2001 13:04:54 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Mac/Distributions/(vise) Python 2.1.vct,1.12,1.12.4.1
In-Reply-To: <E15ORR6-0005Fw-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010723130453.A569@xs4all.nl>

On Sun, Jul 22, 2001 at 03:10:52PM -0700, Jack Jansen wrote:
> Update of /cvsroot/python/python/dist/src/Mac/Distributions/(vise)
> In directory usw-pr-cvs1:/tmp/cvs-serv20074/Python/Mac/Distributions/(vise)

> Modified Files:
>       Tag: release21-maint
> 	Python 2.1.vct 
> Log Message:
> Files used for 2.1.1c1 distribution.

We really should use different tags for the regular vs. the Mac release (or any
other release, for that matter) next time :P

> ***** Bogus filespec: Python

Note to self: fix this bug in syncmail (and yes, we can fix this bug in
syncmail.)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From martin@strakt.com  Mon Jul 23 13:21:57 2001
From: martin@strakt.com (Martin Sjögren)
Date: Mon, 23 Jul 2001 14:21:57 +0200
Subject: [Python-Dev] BEGIN_ALLOW_THREADS
Message-ID: <20010723142157.B16665@strakt.com>

Hello

Is there a reason the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS
don't allow an argument specifying what variable to save the state to? I
needed this myself so I wrote the following:

#ifdef WITH_THREAD
#  define MY_BEGIN_ALLOW_THREADS(st)    \
    { st = PyEval_SaveThread(); }
#  define MY_END_ALLOW_THREADS(st)      \
    { PyEval_RestoreThread(st); st = NULL; }
#else
#  define MY_BEGIN_ALLOW_THREADS(st)
#  define MY_END_ALLOW_THREADS(st)      { st = NULL; }
#endif

It works just fine but has one drawback: Whenever Py_BEGIN_ALLOW_THREADS
changes, I have to change my macros too.

Wouldn't it be reasonable to supply two sets of macros, one that allows
exactly this, and one that does what Py_BEGIN_ALLOW_THREADS currently
does?

Martin Sjögren

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 405242        Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html


From m.favas@per.dem.csiro.au  Mon Jul 23 20:58:04 2001
From: m.favas@per.dem.csiro.au (Mark Favas)
Date: Tue, 24 Jul 2001 03:58:04 +0800
Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c
Message-ID: <3B5C81CC.E0F6D3CF@per.dem.csiro.au>

In the current CVS of 2.2, a call to snprintf now occurs in
socketmodule.c, breaking builds on those systems without such a library
call (such as Tru64 Unix, and older Solarises).

-- 
Mark Favas  -   m.favas@per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA


From barry@zope.com  Mon Jul 23 21:39:32 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 23 Jul 2001 16:39:32 -0400
Subject: [Python-Dev] mail.python.org black listed ?!
References: <20010721012132.A9882@xs4all.nl>
 <3B595957.1F4D85F5@lemburg.com>
Message-ID: <15196.35716.121063.991831@anthem.wooz.org>

>>>>> "M" == M  <mal@lemburg.com> writes:

    M> Perhaps we should start a small project for such a tool written
    M> in Python (to bring the subject back on topic ;-) and place it
    M> on the web somewhere ?!

I think that's an excellent idea!

    M> If we separate out the engine from the rest we could also have
    M> different backends, e.g. one which hooks into .forward as
    M> filter, a daemon style backend which does on-server flagging
    M> based on imap, a Mailman filter backend which does the same for
    M> mailing lists etc.

    M> Would be cool to have python-list mark non-python spam using a
    M> special header automagically ;-)

We could go one better in MM2.1.  There's now a "topics filter"
feature in the alpha codebase (sponsored by Control.com -- thanks
guys!)  and I can easily see how it might be extended to something
like:

- The filter marks the message with a % confidence of being spam
  (e.g. X-Spam: 75%)

- Each Mailman recipient could specify the threshold above which they
  do not want to receive the message (e.g. don't send me anything
  that's spam with a more than 70% confidence level).  This only works
  for regular delivery.
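
A rough sketch of the per-recipient check, not Mailman code. The function
names are hypothetical, and the header format is assumed to be exactly the
"X-Spam: 75%" shape from the example above. Messages without the header are
treated as deliverable.

```python
def spam_confidence(headers):
    """Parse a header such as 'X-Spam: 75%' into a number; None if absent."""
    value = headers.get("X-Spam")
    if value is None:
        return None
    return float(value.strip().rstrip("%"))


def should_deliver(headers, threshold):
    """Deliver unless the filter's confidence exceeds the user's threshold."""
    confidence = spam_confidence(headers)
    return confidence is None or confidence <= threshold


blocked = not should_deliver({"X-Spam": "75%"}, 70)
```

With a recipient threshold of 70, a message marked 75% is held back, while
one marked 50% or carrying no mark at all goes through.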

-Barry


From barry@zope.com  Mon Jul 23 21:56:46 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 23 Jul 2001 16:56:46 -0400
Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c
References: <3B5C81CC.E0F6D3CF@per.dem.csiro.au>
Message-ID: <15196.36750.517951.674031@anthem.wooz.org>

>>>>> "MF" == Mark Favas <m.favas@per.dem.csiro.au> writes:

    MF> In the current CVS of 2.2, a call to snprintf now occurs in
    MF> socketmodule.c, breaking builds on those systems without such
    MF> a library call (such as Tru64 Unix, and older Solarises).

I have a GPL'd version of vsnprintf() -- taken from GNU screen -- in
Mailman for systems that don't have native support.  That's not
appropriate for Python, but I seem to remember a few other LGPL or
MIT-ish licensed versions floating around when I did a search a couple
of years ago.  Maybe it's time to add our own which would only be
linked in if the platform didn't have native support?

-Barry



From m.favas@per.dem.csiro.au  Mon Jul 23 22:21:05 2001
From: m.favas@per.dem.csiro.au (Mark Favas)
Date: Tue, 24 Jul 2001 05:21:05 +0800
Subject: [Python-Dev] Warning on use of "unset VARIABLE_NOT_SET" in Makefiles on FreeBSD 4.3
Message-ID: <3B5C9541.B69285D9@per.dem.csiro.au>

It seems that FreeBSD 4.3-RELEASE considers that 
"unset VARIABLE_NOT_ALREADY_SET" should be an error and sets the shell
return code to 1. This causes "make" to exit with an error when
executing (for example)

unset PYTHONPATH PYTHONHOME PYTHONSTARTUP; \
	./$(PYTHON) $(srcdir)/setup.py build

The "unset" is no longer in the CVS version, but is in 2.2a1...

uname -a
FreeBSD teche 4.3-RELEASE FreeBSD 4.3-RELEASE
sh
$ unset GGGG
$ echo $?
1
$ GGGG=42
$ unset GGGG
$ echo $?
0

-- 
Mark Favas  -   m.favas@per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA


From m.favas@per.dem.csiro.au  Mon Jul 23 22:26:28 2001
From: m.favas@per.dem.csiro.au (Mark Favas)
Date: Tue, 24 Jul 2001 05:26:28 +0800
Subject: [Python-Dev] CVS build breakage: snprintf finds its way into
 socketmodule.c
References: <3B5C81CC.E0F6D3CF@per.dem.csiro.au> <15196.36750.517951.674031@anthem.wooz.org>
Message-ID: <3B5C9684.BC43661B@per.dem.csiro.au>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MF" == Mark Favas <m.favas@per.dem.csiro.au> writes:
> 
>     MF> In the current CVS of 2.2, a call to snprintf now occurs in
>     MF> socketmodule.c, breaking builds on those systems without such
>     MF> a library call (such as Tru64 Unix, and older Solarises).
> 
> I have a GPL'd version of vsnprintf() -- taken from GNU screen -- in
> Mailman for systems that don't have native support.  That's not
> appropriate for Python, but I seem to remember a few other LGPL or
> MIT-ish licensed versions floating around when I did a search a couple
> of years ago.  Maybe it's time to add our own which would only be
> linked in if the platform didn't have native support?
> 
> -Barry

How about the one at http://www.ijs.si/software/snprintf/ ?

Quoting from the URL:

"""
Author

Mark Martinec <mark.martinec@ijs.si>, April 1999, June 2000
Copyright © 1999, Mark Martinec

Terms and conditions ...

This program is free software; you can redistribute it and/or modify it
under the terms of the Frontier Artistic License which comes with this
Kit.

Features

    careful adherence to specs regarding flags, field width and
    precision;

    good performance for large string handling (large format, large
    argument or large paddings). Performance is similar to system's
    sprintf and in several cases significantly better (make sure you
    compile with optimizations turned on, tell the compiler the code is
    strict ANSI if necessary to give it more freedom for optimizations);

    return value semantics per ISO/IEC 9899:1999 ("ISO C99");

    written in standard ISO/ANSI C - requires an ANSI C compiler.
"""

-- 
Mark Favas  -   m.favas@per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA


From fdrake@acm.org  Mon Jul 23 22:42:30 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 23 Jul 2001 17:42:30 -0400 (EDT)
Subject: [Python-Dev] Warning on use of "unset VARIABLE_NOT_SET" in Makefiles on FreeBSD 4.3
In-Reply-To: <3B5C9541.B69285D9@per.dem.csiro.au>
References: <3B5C9541.B69285D9@per.dem.csiro.au>
Message-ID: <15196.39494.845671.730890@cj42289-a.reston1.va.home.com>

Mark Favas writes:
 > The "unset" is no longer in the CVS version, but is in 2.2a1...

  And don't expect it to return; Neil's implementation of the -E
option means we don't have to worry about this any more.
  (Thanks, Neil!)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From skip@pobox.com (Skip Montanaro)  Mon Jul 23 23:11:22 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 23 Jul 2001 17:11:22 -0500
Subject: [Python-Dev] mail.python.org black listed ?!
In-Reply-To: <15196.35716.121063.991831@anthem.wooz.org>
References: <20010721012132.A9882@xs4all.nl>
 <3B595957.1F4D85F5@lemburg.com>
 <15196.35716.121063.991831@anthem.wooz.org>
Message-ID: <15196.41226.977676.237807@beluga.mojam.com>

    BAW> - The filter marks the message with a % confidence of being spam
    BAW>   (e.g. X-Spam: 75%)

    BAW> - Each Mailman recipient could specify the threshold above which
    BAW>   they do not want to receive the message (e.g. don't send me
    BAW>   anything that's spam with a more than 70% confidence level).
    BAW>   This only works for regular delivery.

One thing to consider is that many mail filters probably only have crude
numeric comparison capability.  In procmail I have to filter using regular
expressions.  (Most of) my mail comes through pobox.com, which modifies the
subject header to stuff like

    Subject: [spam score 9.00/10.0 -pobox] remove me

While I'm sure I could create a regular expression that would allow me to
classify pobox.com's spam score numerically (or call out to a Python script
to do it for me), I'm lazy enough that I simply toss everything with
spam.*-pobox in the Subject into the spam-hole (I think 5.0/10.0 is their
minimum criterion for subject mangling).  I assume other mail software
systems' filtering capabilities are similarly limited.

I would therefore suggest that the X-Spam header be simply a three-digit
number in the range 000 to 100.  (No percent sign, always with any necessary
leading zeroes.)  It might even be better to create an X-Spam-Value header
in unary notation, e.g. make a slightly smaller range (say 0 to 50) and
include a header like:

    X-Spam-Value: sssssssssssssssssssssssssssssssssss

to indicate a 70% likelihood (35 "s"s).  You could then match it with

    X-Spam-Value: s{25,50}

in procmail to spam-categorize anything with a probability of spamhood >=
50%.  You could include a readable X-Spam header like:

    X-Spam: rated 75% probability of being spam by "Spam Pie v. 0.1"
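
A minimal Python sketch of the run-of-"s" header described above, assuming
the 0-to-50 scale from the example.  The function name spam_value_header
and the anchored form of the pattern are illustrative assumptions, not part
of any existing filter; Python's re module stands in for the mail filter's
own regular expression engine.

```python
import re

def spam_value_header(percent, scale=50):
    """Render a spam percentage (0-100) as a run of 's' marks.

    Hypothetical helper: 70% on a 0-to-50 scale becomes 35 marks.
    """
    count = percent * scale // 100
    return "X-Spam-Value: " + "s" * count

# At least 25 of 50 marks, i.e. a spam probability of 50% or more.
AT_LEAST_HALF = re.compile(r"^X-Spam-Value: s{25,50}$")

def is_probably_spam(header):
    """True if the header carries 25 or more marks."""
    return AT_LEAST_HALF.match(header) is not None
```

The appeal of this form is that a threshold test needs no numeric
comparison at all, only a repetition count in the pattern.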

Skip



From skip@pobox.com (Skip Montanaro)  Mon Jul 23 23:42:33 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 23 Jul 2001 17:42:33 -0500
Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together?
Message-ID: <15196.43097.529737.173915@beluga.mojam.com>

There are several active or could-be-active PEPs related to Python's numeric
behavior:

     S   211  pep-0211.txt  Adding A New Outer Product Operator    Wilson
     S   228  pep-0228.txt  Reworking Python's Numeric Model       Zadka
     S   237  pep-0237.txt  Unifying Long Integers and Integers    Zadka
     S   238  pep-0238.txt  Non-integer Division                   Zadka
     S   239  pep-0239.txt  Adding a Rational Type to Python       Zadka
     S   240  pep-0240.txt  Adding a Rational Literal to Python    Zadka
     S   242  pep-0242.txt  Numeric Kinds                          Dubois

Instead of implementing them piecemeal, shouldn't we be considering them as
a related group?  For example, implementing any or all of PEPs 237, 239 and
240 might well have an effect on what needs to be done for PEP 238.  With
slight modifications, the proposals in PEP 242 might well subsume PEP 238's
functionality in a different way.

If the semantics of arithmetic are going to change, I think they should
change in the context of expanded capability in the language.

-- 
Skip Montanaro (skip@pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/


From fdrake@acm.org  Mon Jul 23 23:04:50 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Mon, 23 Jul 2001 18:04:50 -0400 (EDT)
Subject: [Python-Dev] [development doc updates]
Message-ID: <20010723220450.33D4428932@beowolf.digicool.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Various minor updates.



From martin@loewis.home.cs.tu-berlin.de  Tue Jul 24 07:38:25 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 24 Jul 2001 08:38:25 +0200
Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c
Message-ID: <200107240638.f6O6cPn03340@mira.informatik.hu-berlin.de>

> In the current CVS of 2.2, a call to snprintf now occurs in
> socketmodule.c, breaking builds on those systems without such a
> library call (such as Tru64 Unix, and older Solarises).

Following itojun's proposal, I have now added an autoconf test for
snprintf, and use sprintf if it is not available.

Regards,
Martin



From mal@lemburg.com  Tue Jul 24 11:15:43 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 24 Jul 2001 12:15:43 +0200
Subject: [Python-Dev] shouldn't we be considering all pending numeric
 proposals together?
References: <15196.43097.529737.173915@beluga.mojam.com>
Message-ID: <3B5D4ACF.F79CD179@lemburg.com>

Skip Montanaro wrote:
> 
> There are several active or could-be-active PEPs related to Python's numeric
> behavior:
> 
>      S   211  pep-0211.txt  Adding A New Outer Product Operator    Wilson
>      S   228  pep-0228.txt  Reworking Python's Numeric Model       Zadka
>      S   237  pep-0237.txt  Unifying Long Integers and Integers    Zadka
>      S   238  pep-0238.txt  Non-integer Division                   Zadka
>      S   239  pep-0239.txt  Adding a Rational Type to Python       Zadka
>      S   240  pep-0240.txt  Adding a Rational Literal to Python    Zadka
>      S   242  pep-0242.txt  Numeric Kinds                          Dubois
> 
> Instead of implementing them piecemeal, shouldn't we be considering them as
> a related group?  For example, implementing any or all of PEPs 237, 239 and
> 240 might well have an effect on what needs to be done for PEP 238.  With
> slight modifications, the proposals in PEP 242 might well subsume PEP 238's
> functionality in a different way.
> 
> If the semantics of arithmetic are going to change, I think they should
> change in the context of expanded capability in the language.

May I suggest that these rather controversial changes be carried
out on a separate branch of the Python source tree before adding 
them to the trunk ?!

The reasoning here is that numerics are so low-level that porting
applications to a new release implementing these changes will
cause a lot of work (mostly due to the dynamic nature of Python).

Another suggestion I would like to make is that the new semantics
are first implemented using alternative subclassed numeric 
objects (e.g. newint()) which can then live side-by-side with the
old semantics types for a few releases until they replace the
old types.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From mal@lemburg.com  Tue Jul 24 11:08:34 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 24 Jul 2001 12:08:34 +0200
Subject: [Python-Dev] Spam flagging filter (mail.python.org black listed ?!)
References: <20010721012132.A9882@xs4all.nl>
 <3B595957.1F4D85F5@lemburg.com> <15196.35716.121063.991831@anthem.wooz.org>
Message-ID: <3B5D4922.6FBD8743@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "M" == M  <mal@lemburg.com> writes:
> 
>     M> Perhaps we should start a small project for such a tool written
>     M> in Python (to bring the subject back on topic ;-) and place it
>     M> on the web somewhere ?!
> 
> I think that's an excellent idea!
> 
>     M> If we separate out the engine from the rest we could also have
>     M> different backends, e.g. one which hooks into .forward as
>     M> filter, a daemon style backend which does on-server flagging
>     M> based on imap, a Mailman filter backend which does the same for
>     M> mailing lists etc.
> 
>     M> Would be cool to have python-list mark non-python spam using a
>     M> special header automagically ;-)
> 
> We could go one better in MM2.1.  There's now a "topics filter"
> feature in the alpha codebase (sponsored by Control.com -- thanks
> guys!)  and I can easily see how it might be extended to something
> like:
> 
> - The filter marks the message with a % confidence of being spam
>   (e.g. X-Spam: 75%)

I think we ought to consider a format which allows easy mail
filtering. Like Skip mentioned, mail filters are usually not
very smart about parsing the headers, e.g. Netscape only allows
you to do substring matching.

Ideal would be a format like:

X-SpamLevel: 0123456789x (100%)
X-SpamLevel: 0123456789 (90%)
X-SpamLevel: 0123456 (60%)
X-SpamLevel: 0 (0%)

A substring filter for e.g. "012" would then move all messages
with a spam level of >=20% to Trash.
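
The format above can be sketched in Python as follows.  This is only an
illustration: the name spam_level_header is made up here, and rounding a
percentage down to the nearest ten is an assumption the examples above do
not spell out.

```python
def spam_level_header(percent):
    """Build an X-SpamLevel header whose digit run grows with the score.

    Hypothetical helper: 0% gives "0", 60% gives "0123456",
    and 100% gives "0123456789x".
    """
    tens = percent // 10
    digits = "0123456789x"[: tens + 1]
    return "X-SpamLevel: %s (%d%%)" % (digits, percent)

def substring_filter(header, needle="012"):
    """Emulate a substring-only mail filter: "012" means level >= 20%."""
    return needle in header
```

Because each level's digit run is a prefix of every higher level's run, a
plain substring test behaves like a "greater than or equal" comparison.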

> - Each Mailman recipient could specify the threshold above which they
>   do not want to receive the message (e.g. don't send me anything
>   that's spam with a more than 70% confidence level).  This only works
>   for regular delivery.

Cool (even though I think that client side filtering is more
flexible).

Could you send me the filter source code, so that I can look into
splitting out the engine for use by e.g. procmail ?!

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From jepler@inetnebr.com  Tue Jul 24 13:49:16 2001
From: jepler@inetnebr.com (Jeff Epler)
Date: Tue, 24 Jul 2001 07:49:16 -0500
Subject: [Python-Dev] cygwin "test_pwd" failure on win98
Message-ID: <20010724074916.52922@bald.inetnebr.com>

As described in an ancient cygnus mailing list message,
	http://www.cygwin.com/ml/cygwin-announce/2000/msg00042.html
"getpwnam/getpwuid functions report NULL instead of a fallback entry
*if /etc/passwd exist* and the user name or uid is not found in the file"
(my emphasis)

Once I generated a password file, using 
	mkpasswd.exe > /etc/passwd
test_pwd.py seems to succeed as expected.

Strangely, the Cygnus install seems to have automatically generated the passwd
file into /etc/group at install time, rather than into /etc/passwd.  In any
case, this test failure is due to incorrect cygwin32 setup, not Python.

Jeff
PS Tim, sorry about reporting those two known test failures yesterday.
I hope this message is more helpful.  In any case, when can I expect to
receive my punishment from the PS
-- 
\/ http://www.slashdot.org/                      Jeff Epler jepler@inetnebr.com
"One Architecture, One OS" also translates as "One Egg, One Basket".


From jason@tishler.net  Tue Jul 24 15:05:42 2001
From: jason@tishler.net (Jason Tishler)
Date: Tue, 24 Jul 2001 10:05:42 -0400
Subject: [Python-Dev] cygwin "test_pwd" failure on win98
In-Reply-To: <20010724074916.52922@bald.inetnebr.com>
Message-ID: <20010724100542.A328@dothill.com>

Jeff,

On Tue, Jul 24, 2001 at 07:49:16AM -0500, Jeff Epler wrote:
> Once I generated a password file, using 
> 	mkpasswd.exe > /etc/passwd
> test_pwd.py seems to succeed as expected.

Thanks for tracking down the above.  When I release the next Cygwin
Python distribution, I will update the README with this new information.

However, I'm surprised that mkpasswd works under Windows 9x/Me.  IIRC,
it only works under Windows NT/2000, which implies that anyone who wanted
a passwd file on 9x/Me had to create it by hand.  I do not have
a 9x/Me machine handy, so I cannot verify the current behavior.

> Strangely, the Cygnus install seems to have automatically generated the passwd
> file into /etc/group at install time, rather than into /etc/passwd.  In any
> case, this test failure is due to incorrect cygwin32 setup, not Python.

If the above is really true, then this is a bug in the current Cygwin
installer (i.e., setup.exe 2.78.2.3) and you should report this to the
Cygwin mailing list.  Note I just reran the current setup.exe under NT
and it generated valid passwd and group files.

Please look in /etc/postinstall.  Do you see a file called
passwd-grp.bat.done?  If so, then examining its content will determine
the commands that were automatically run during the install.  I would be
very interested in your findings.  My passwd-grp.bat.done contains the
following:

    bin\mkpasswd -l > etc\passwd
    bin\mkgroup -l > etc\group

BTW, the cygwin-apps or cygwin mailing list is a more appropriate
forum for a Cygwin Python discussion of this nature.

Thanks,
Jason

-- 
Jason Tishler
Director, Software Engineering       Phone: 732.264.8770 x235
Dot Hill Systems Corp.               Fax:   732.264.8798
82 Bethany Road, Suite 7             Email: Jason.Tishler@dothill.com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com


From mal@lemburg.com  Tue Jul 24 15:07:35 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 24 Jul 2001 16:07:35 +0200
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com> <3B5AFF46.7BBCE245@lemburg.com> <200107221642.MAA11705@cj20424-a.reston1.va.home.com>
 <3B5B1064.5882398A@lemburg.com> <200107230240.WAA14644@cj20424-a.reston1.va.home.com>
Message-ID: <3B5D8127.8BAE1BD@lemburg.com>

Guido van Rossum wrote:
> 
> > > The simplified rule is that a type should only provide *either*
> > > tp_getattr *or* tp_getattro, and likewise for set.  The complete rule
> > > is that if you insist on having both tp_getattr and tp_getattro, they
> > > should implement the same semantics -- tp_getattr should be faster
> > > when PyObject_GetAttrString() is called, and tp_getattro should be
> > > faster when PyObject_GetAttr() is called.
> >
> > Ah, ok, didn't know that rule.
> 
> Well, I just made it up today. :-)
> 
> But it's a sensible rule, if you want predictable results.

I'll implement it.
 
> > I'll move to the new scheme for 2.2 then and leave the old tp_getattr
> > around for backward compatibility.
> 
> You should #ifdef on the Python version, unless you make your
> tp_getattr do everything that tp_getattro does (possibly by calling on
> the latter).

Sure; that was my plan. I have to maintain 1.5.2 compatibility
for the packages, that's why I'm trying to keep code redundancy 
minimal in the code base. So far, that has worked rather well (except
for the attribute lookup part).

About the typeobject.h struct names: could you tell me the Py-prefixed
names of the getset et al. lists ? I'd rather not use the current
non-prefixed names. 

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From skip@pobox.com (Skip Montanaro)  Tue Jul 24 18:21:42 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 24 Jul 2001 12:21:42 -0500
Subject: [Python-Dev] number-sig anyone?
Message-ID: <15197.44710.656892.910976@beluga.mojam.com>

Dev-ers,

I have been guilty of generating as much heat as light the past few days on
the subject of integer division (though not quite as much heat as Stephen
Horne!).  For that I apologize.

There are several active PEPs related to various aspects of Python's concept
of numbers.  Yesterday I found these:

     S   211  pep-0211.txt  Adding A New Outer Product Operator    Wilson
     S   228  pep-0228.txt  Reworking Python's Numeric Model       Zadka
     S   237  pep-0237.txt  Unifying Long Integers and Integers    Zadka
     S   238  pep-0238.txt  Non-integer Division                   Zadka
     S   239  pep-0239.txt  Adding a Rational Type to Python       Zadka
     S   240  pep-0240.txt  Adding a Rational Literal to Python    Zadka
     S   242  pep-0242.txt  Numeric Kinds                          Dubois

Today I took a look at http://mail.python.org/mailman/listinfo and could
find no math-sig or number-sig mailing list.  If Python's number system is
going to change in one or more backwards-incompatible ways, I think there
may only be one chance to get it right.  I think a number-sig mailing list would be a
worthwhile forum to discuss these issues.

If there's already a group specific to this topic I missed it.  Point me and
I will start reading archives.

Skip


From fdrake@acm.org  Tue Jul 24 18:27:20 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 24 Jul 2001 13:27:20 -0400 (EDT)
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: <15197.44710.656892.910976@beluga.mojam.com>
References: <15197.44710.656892.910976@beluga.mojam.com>
Message-ID: <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > Today I took a look at http://mail.python.org/mailman/listinfo and could
 > find no math-sig or number-sig mailing list.  If Python's number system is
 > going to change in one or more backwards-incompatible ways, I think there may only
 > be one chance to get it right.  I think a number-sig mailing list would be a
 > worthwhile forum to discuss these issues.

  There is the python-numerics mailing list on SourceForge; find it
from the Python project page there:

	http://sourceforge.net/projects/python/


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From skip@pobox.com (Skip Montanaro)  Tue Jul 24 18:57:59 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 24 Jul 2001 12:57:59 -0500
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
References: <15197.44710.656892.910976@beluga.mojam.com>
 <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
Message-ID: <15197.46887.496488.246536@beluga.mojam.com>

    Fred> There is the python-numerics mailing list on SourceForge; find it
    Fred> from the Python project page there:

    Fred>       http://sourceforge.net/projects/python/

I don't suppose there's any chance those three sourceforge-hosted mailing
lists could be mentioned on mail.python.org, could they?  Seems to me that
those three mailing lists are sponsored by the same organization as those
hosted on mail.python.org.

Skip




From skip@pobox.com (Skip Montanaro)  Tue Jul 24 19:01:23 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 24 Jul 2001 13:01:23 -0500
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
References: <15197.44710.656892.910976@beluga.mojam.com>
 <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
Message-ID: <15197.47091.783426.407294@beluga.mojam.com>

Damn...  Are the archives of the python-numerics list available somewhere as
a single mbox file or something?  Looks like geocrawler is going to make me
wade through the archives message-by-message on their website.

Skip




From guido@digicool.com  Tue Jul 24 19:26:33 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 24 Jul 2001 14:26:33 -0400
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: Your message of "Tue, 24 Jul 2001 12:57:59 CDT."
 <15197.46887.496488.246536@beluga.mojam.com>
References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
 <15197.46887.496488.246536@beluga.mojam.com>
Message-ID: <200107241826.OAA07299@cj20424-a.reston1.va.home.com>

>     Fred> There is the python-numerics mailing list on SourceForge; find it
>     Fred> from the Python project page there:
> 
>     Fred>       http://sourceforge.net/projects/python/
> 
> I don't suppose there's any chance those three sourceforge-hosted mailing
> lists could be mentioned on mail.python.org, could they?  Seems to me that
> those three mailing lists are sponsored by the same organization as those
> hosted on mail.python.org.

I think they should be mentioned there.  Fred, can you edit the
MailingLists.ht file?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jul 24 19:33:35 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 24 Jul 2001 14:33:35 -0400
Subject: [Python-Dev] PEP 253: Subtyping Built-in Types
In-Reply-To: Your message of "Tue, 24 Jul 2001 16:07:35 +0200."
 <3B5D8127.8BAE1BD@lemburg.com>
References: <3B59F159.919DF4C@lemburg.com> <200107212229.SAA00894@cj20424-a.reston1.va.home.com> <3B5ABDD9.1A73D7E4@lemburg.com> <200107221549.LAA11503@cj20424-a.reston1.va.home.com> <3B5AFF46.7BBCE245@lemburg.com> <200107221642.MAA11705@cj20424-a.reston1.va.home.com> <3B5B1064.5882398A@lemburg.com> <200107230240.WAA14644@cj20424-a.reston1.va.home.com>
 <3B5D8127.8BAE1BD@lemburg.com>
Message-ID: <200107241833.OAA07373@cj20424-a.reston1.va.home.com>

> About the typeobject.h struct names: could you tell me the Py-prefixed
> names of the getset et al. lists ? I'd rather not use the current
> non-prefixed names. 

Argh, there aren't any Py-prefixed names for these yet!  Nor for
structmember.  Since these are just structure names, they aren't
visible to the linker, so there shouldn't be any conflicts with 3rd
party libraries.  But for consistency, and for compile-time as opposed
to link-time conflict avoidance, they really should use Py-prefixes.
I've added a bug report for myself so I won't forget.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From alex_c@MIT.EDU  Tue Jul 24 20:04:26 2001
From: alex_c@MIT.EDU (Alex Coventry)
Date: 24 Jul 2001 15:04:26 -0400
Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c
In-Reply-To: "Martin v. Loewis"'s message of "Tue, 24 Jul 2001 08:38:25 +0200"
References: <200107240638.f6O6cPn03340@mira.informatik.hu-berlin.de>
Message-ID: <etdvgkino79.fsf@opus.mit.edu>

> Following itojun's proposal, I have now added an autoconf test for
> snprintf, and use sprintf if it is not available.

In PySocket_getaddrinfo, would it make sense to increase the allocation
of pbuf from 10 characters to, say, 30 characters, in case 

sprintf(pbuf, "%ld", PyInt_AsLong(pobj));

gets run on a 64-bit machine?

Alex.


From fdrake@acm.org  Tue Jul 24 20:20:47 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 24 Jul 2001 15:20:47 -0400 (EDT)
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: <15197.46887.496488.246536@beluga.mojam.com>
References: <15197.44710.656892.910976@beluga.mojam.com>
 <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
 <15197.46887.496488.246536@beluga.mojam.com>
 <200107241826.OAA07299@cj20424-a.reston1.va.home.com>
Message-ID: <15197.51855.237526.101524@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > I don't suppose there's any chance those three sourceforge-hosted mailing
 > lists could be mentioned on mail.python.org, could they?  Seems to me that
 > those three mailing lists are sponsored by the same organization as those
 > hosted on mail.python.org.

  I don't know any way to do that.  If we could more easily create new
lists on python.org (i.e., not have to wait for Barry), they never
would have been created on SourceForge.

Guido van Rossum writes:
 > I think they should be mentioned there.  Fred, can you edit the
 > MailingLists.ht file?

  Done.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From guido@digicool.com  Tue Jul 24 20:29:12 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 24 Jul 2001 15:29:12 -0400
Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together?
In-Reply-To: Your message of "Tue, 24 Jul 2001 12:15:43 +0200."
 <3B5D4ACF.F79CD179@lemburg.com>
References: <15196.43097.529737.173915@beluga.mojam.com>
 <3B5D4ACF.F79CD179@lemburg.com>
Message-ID: <200107241929.PAA07684@cj20424-a.reston1.va.home.com>

> Skip Montanaro wrote:
> > 
> > There are several active or could-be-active PEPs related to Python's numeric
> > behavior:
> > 
> >      S   211  pep-0211.txt  Adding A New Outer Product Operator    Wilson
> >      S   228  pep-0228.txt  Reworking Python's Numeric Model       Zadka
> >      S   237  pep-0237.txt  Unifying Long Integers and Integers    Zadka
> >      S   238  pep-0238.txt  Non-integer Division                   Zadka
> >      S   239  pep-0239.txt  Adding a Rational Type to Python       Zadka
> >      S   240  pep-0240.txt  Adding a Rational Literal to Python    Zadka
> >      S   242  pep-0242.txt  Numeric Kinds                          Dubois
> > 
> > Instead of implementing them piecemeal, shouldn't we be
> > considering them as a related group?  For example, implementing
> > any or all of PEPs 237, 239 and 240 might well have an effect on
> > what needs to be done for PEP 238.  With slight modifications, the
> > proposals in PEP 242 might well subsume PEP 238's functionality in
> > a different way.
> > 
> > If the semantics of arithmetic are going to change, I think they should
> > change in the context of expanded capability in the language.

I think PEP 211 and PEP 242 don't belong in this list.  PEP 211
doesn't affect Python's number system at all, and PEP 242 proposes
a set of storage choices, not choices in semantics.  PEP 242 is valid
regardless of what we decide about int division.

The others, however, indeed are connected.  In fact the one that's
currently generating so much heat, PEP 238, is an essential
prerequisite for PEP 228, and so is PEP 237: if the different numeric
types are to be made fully interchangeable, as PEP 228 requires, a
different answer for 1/2 and 1.0/2.0 is impossible, and likewise, 1L
will be treated the same as 1 (in fact, the 'L' suffix will probably
be ignored eventually, and the representation choice is made solely
based on the numerical value).

But it's different the other way around: PEP 238 can easily stand on
its own.  It addresses a problem that exists even without a unified
numeric model.

Conversely, if PEP 238 is unacceptable, PEP 228 also has no hope, and
PEP 239 is much less attractive.  Since PEP 238 is the only one that
cannot avoid breaking existing code, I want to introduce it as soon as
I can, since the others can only be introduced after the full
compatibility waiting period for PEP 238, at least two years.

The relationship between PEP 238 and PEP 239 is interesting.  PEP 238
currently proposes to let int division return a float, because that's
the only available type.  But I believe that if we decide down the
road that int division should return a rational number instead, this
will break little or no code, as long as we embed the rationals in the
floats.  That is, the coercion rules would use this ordering:

  int -> long -> rational -> float -> complex

This is in spite of the fact that floating point numbers are actually
representable exactly as rationals!  (Using unbounded precision, which
Python rationals should definitely have.)  When I add a float to a
rational number, I want the result to be a float, not a rational,
because the float (most likely) represents an approximated value, and
turning it into an exact rational seems a mistake in that case.

The property which current division lacks, and which I think is an
important step towards PEP 228, is the following:

    In an expression involving only numeric variables and operators,
    the *mathematical value* of the result (except for accuracy issues
    due to the fallibility of floating point hardware) should only
    depend on the mathematical value of the inputs.  The *type* of the
    result should be the first type in the above coercion list that
    does not come before any of the input types, and that can
    represent the mathematical value of the result.

With "mathematical value" I mean the abstraction of numbers generally
used in mathematics, where the integers are embedded in the rationals
which are embedded in the reals, etc.  Mathematicians may talk about
the type of a variable ("let i be an integer, etc.")  but they never
talk about the type of a *value*: integer literals are used without
prejudice in formulas yielding real results.

If we introduce rationals, and we redefine int division as returning a
rational instead of a float, this will not affect the mathematical
value.
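
The coercion ordering and the "mathematical value" property can be tried
out in a short sketch.  It assumes a Python that ships a fractions module,
whose Fraction type stands in here for the rational type proposed in PEP
239; it also assumes int and long have already been unified, so the "long"
rung does not appear.

```python
from fractions import Fraction

# A float converts to a rational exactly; nothing is lost.
exact = Fraction(0.5)                          # Fraction(1, 2)

# Mixed arithmetic moves rightwards along
#   int -> rational -> float -> complex
int_plus_rational = 1 + Fraction(1, 2)         # stays exact
rational_plus_float = Fraction(1, 2) + 0.25    # becomes a float
float_plus_complex = 0.75 + 1j                 # becomes a complex

# The same mathematical value, whichever type carries it.
same_value = Fraction(1, 2) == 0.5 == 1.0 / 2.0
```

Adding a float to a rational yields a float, matching the argument that an
approximate operand should not be silently promoted to an exact result.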

(BTW, float is a misnomer.  I should have called it real, but alas, I
was a little *too* much under the influence of C at the time.  This is
not worth fixing.)

[MAL]
> May I suggest that these rather controversial changes be carried
> out on a separate branch of the Python source tree before adding 
> them to the trunk ?!

Definitely.  I am currently maintaining the PEP 238 implementation as
a patch; I don't want to start any new branches before we've merged
the descr-branch into the trunk.

> The reasoning here is that numerics are so low-level that porting
> applications to a new release implementing these changes will
> cause a lot of work (mostly due to the dynamic nature of Python).

I am aware of the amount of work; that's why I want to allow a very
generous waiting period before making it law.

> Another suggestion I would like to make is that the new semantics
> are first implemented using alternative subclassed numeric 
> objects (e.g. newint()) which can then live side-by-side with the
> old semantics types for a few releases until they replace the
> old types.

Hm, I don't think that that will be very useful.  A new-division-aware
module could create integer values of the new type, but in order to
protect itself against old-style integers passed in as arguments, it
would have to force a conversion of all arguments -- in which case the
code becomes even uglier than if we added explicit float() coercions
to all arguments.

Have you looked at my PEP-238 patch at all?  It solves the
side-by-side problem with a future statement and three division
behaviors: one that forces int results, for //; one that forces float
results, for / under the influence of the appropriate future
statement; and one that implements the old behavior, for / without the
future statement.
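
A short sketch of how the operators behave, assuming a modern Python 3
interpreter, where / always acts as if the future statement were in
effect.  The variable names are illustrative only.

```python
# In Python 3 the "from __future__ import division" behavior is the default.
true_quotient = 7 / 2       # float result, fraction kept
floor_quotient = 7 // 2     # int result, rounded down
negative_floor = -7 // 2    # rounds toward negative infinity, not toward zero
float_floor = 7.0 // 2      # float operand: floored value, but still a float

print(true_quotient, floor_quotient, negative_floor, float_floor)
```

Note that // floors rather than truncates, which only shows up for
negative operands.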

--Guido van Rossum (home page: http://www.python.org/~guido/)


From alex_c@MIT.EDU  Tue Jul 24 21:09:36 2001
From: alex_c@MIT.EDU (Alex Coventry)
Date: 24 Jul 2001 16:09:36 -0400
Subject: [Python-Dev] CVS build breakage: snprintf finds its way into socketmodule.c
In-Reply-To: Alex Coventry's message of "24 Jul 2001 15:04:26 -0400"
References: <etdvgkino79.fsf@opus.mit.edu>
Message-ID: <etdsnfmnl6n.fsf@opus.mit.edu>

> In PySocket_getaddrinfo, would it make sense to increase the
> allocation of pbuf from 10 characters to, say, 30 characters

Or even to "char pbuf[sizeof(long)*3];" so no one has to think about it
anymore.
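
A quick arithmetic check of that rule of thumb, written in Python
rather than C.  It assumes pbuf only ever holds the decimal rendering
of a signed long plus a terminating NUL; chars_needed is a hypothetical
helper, not anything in socketmodule.c.

```python
def chars_needed(nbytes):
    """Characters to print the most negative nbytes-byte signed integer, plus a NUL."""
    most_negative = -(2 ** (8 * nbytes - 1))
    return len(str(most_negative)) + 1  # digits and '-' sign, plus terminating NUL

# Compare the requirement with the proposed sizeof(long)*3 allocation.
for nbytes in (4, 8, 16):
    print(nbytes, chars_needed(nbytes), 3 * nbytes)
```

For a 4-byte long the fit is exact, and for wider types there is slack.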

Alex.


From martin@loewis.home.cs.tu-berlin.de  Tue Jul 24 21:48:39 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 24 Jul 2001 22:48:39 +0200
Subject: [Python-Dev] IPv6 committed
Message-ID: <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de>

Hi itojun,

As you may have noticed, I just committed the last chunk of your IPv6
patch. Thanks a lot for your contributions, I think you've provided a
highly valuable contribution to Python 2.2. We still have to figure
out a way to provide documentation, but I expect that we can complete
that before 2.2a2.

As with all new code, there may occur some problems; I hope you'll be
around for the coming weeks and give the professional advice that
you've provided throughout the integration of the code.

People finding problems in the IPv6 code (i.e. with the current socket
applications) are encouraged to use the SF bug-reporting procedure as
they do for all other problems in the Python libraries; you can assign
all such bugs to me.

Kind regards,
Martin


From fdrake@acm.org  Tue Jul 24 21:52:33 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 24 Jul 2001 16:52:33 -0400 (EDT)
Subject: [Python-Dev] IPv6 committed
In-Reply-To: <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de>
References: <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de>
Message-ID: <15197.57361.306046.539086@cj42289-a.reston1.va.home.com>

Martin v. Loewis writes:
 >                                             We still have to figure
 > out a way to provide documentation, but I expect that we can complete
 > that before 2.2a2.

  I'll warn you now that I know next to nothing about IPv6, and my
attempts to spend enough time reading about it to be useful have been
thwarted more than once.  I'm afraid I'll be able to provide no more
than editorial & markup assistance for the IPv6 documentation.  ;-(


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From guido@digicool.com  Tue Jul 24 22:02:30 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 24 Jul 2001 17:02:30 -0400
Subject: [Python-Dev] IPv6 committed
In-Reply-To: Your message of "Tue, 24 Jul 2001 22:48:39 +0200."
 <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de>
References: <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de>
Message-ID: <200107242102.RAA07970@cj20424-a.reston1.va.home.com>

Martin and Itojun,

I would like to thank you both for adding IPv6 support to Python.
It's a big boon for Python as well as for IPv6!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@zope.com  Tue Jul 24 22:24:13 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 24 Jul 2001 17:24:13 -0400
Subject: [Python-Dev] number-sig anyone?
References: <15197.44710.656892.910976@beluga.mojam.com>
Message-ID: <15197.59261.754548.28233@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    SM> Today I took a look at http://mail.python.org/mailman/listinfo
    SM> and could find no math-sig or number-sig mailing list.  If
    SM> Python's number system is going to change in one or more
    SM> backwards-incompatible ways, I think there may only be one chance to
    SM> get it right.  I think a number-sig mailing list would be a
    SM> worthwhile forum to discuss these issues.

+1.  If others agree, I'll create the sig.

-Barry


From barry@zope.com  Tue Jul 24 22:26:40 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 24 Jul 2001 17:26:40 -0400
Subject: [Python-Dev] number-sig anyone?
References: <15197.44710.656892.910976@beluga.mojam.com>
 <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
Message-ID: <15197.59408.52053.335198@anthem.wooz.org>

>>>>> "Fred" == Fred L Drake, Jr <fdrake@acm.org> writes:

    Fred>   There is the python-numerics mailing list on SourceForge;
    Fred> find it from the Python project page there:

    Fred> 	http://sourceforge.net/projects/python/

Ah.  Shouldn't this page

    http://www.python.org/sigs/

point to this page

    http://sourceforge.net/mail/?group_id=5470

???

-Barry


From guido@digicool.com  Tue Jul 24 22:30:52 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 24 Jul 2001 17:30:52 -0400
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: Your message of "Tue, 24 Jul 2001 17:24:13 EDT."
 <15197.59261.754548.28233@anthem.wooz.org>
References: <15197.44710.656892.910976@beluga.mojam.com>
 <15197.59261.754548.28233@anthem.wooz.org>
Message-ID: <200107242130.RAA08105@cj20424-a.reston1.va.home.com>

> >>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:
> 
>     SM> Today I took a look at http://mail.python.org/mailman/listinfo
>     SM> and could find no math-sig or number-sig mailing list.  If
>     SM> Python's number system is going to change in one or more
>     SM> backwards-incompatible ways, I think there may only be one chance to
>     SM> get it right.  I think a number-sig mailing list would be a
>     SM> worthwhile forum to discuss these issues.
> 
> +1.  If others agree, I'll create the sig.
> 
> -Barry

Sounds like a good plan, but please wait until we have a SIG
owner/moderator and a charter.  Without both of these a SIG will be a
failure.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Tue Jul 24 22:41:37 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 24 Jul 2001 17:41:37 -0400
Subject: [Python-Dev] number-sig anyone?
References: <15197.44710.656892.910976@beluga.mojam.com>
 <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
 <15197.46887.496488.246536@beluga.mojam.com>
 <200107241826.OAA07299@cj20424-a.reston1.va.home.com>
 <15197.51855.237526.101524@cj42289-a.reston1.va.home.com>
Message-ID: <15197.60305.896122.475136@anthem.wooz.org>

>>>>> "Fred" == Fred L Drake, Jr <fdrake@acm.org> writes:

    Fred>   I don't know any way to do that.  If we could more easily
    Fred> create new lists on python.org (i.e., not have to wait for
    Fred> Barry), they never would have been created on SourceForge.

Actually, creating the lists is no problem, takes just seconds.  It's
updating the sigs page that's the PITA.

Note that in Mailman 2.1, you'll be able to create new lists
thru-the-web, and we can delegate that responsibility to a "list
creator" password which can be shared by a small group of trusted
folks.  Updating the sigs page is still a separate process.

-Barry


From klm@digicool.com  Tue Jul 24 22:46:56 2001
From: klm@digicool.com (Ken Manheimer)
Date: Tue, 24 Jul 2001 17:46:56 -0400 (EDT)
Subject: [Python-Dev] mail.python.org black listed ?!
In-Reply-To: <15196.41226.977676.237807@beluga.mojam.com>
Message-ID: <Pine.LNX.4.21.0107241738080.30847-100000@serenade.digicool.com>

On Mon, 23 Jul 2001, Skip Montanaro wrote:

>     BAW> - The filter marks the message with a % confidence of being spam
>     BAW>   (e.g. X-Spam: 75%)
> 
>     BAW> - Each Mailman recipient could specify the threshold above which
>     BAW>   they do not want to receive the message (e.g. don't send me
>     BAW>   anything that's spam with a more than 70% confidence level).
>     BAW>   This only works for regular delivery.
>
> [Could use re's to match] 
>
> I would therefore suggest that the X-Spam header be simply a three-digit
> number in the range 000 to 100.  (No percent sign, always with any necessary
> leading zeroes.)  It might even be better to create an X-Spam-Value header
> in one-bit arithmetic, e.g. make a slightly smaller range (say 0 to 50) and
> include a header like:
> 
>     X-Spam-Value: sssssssssssssssssssssssssssssssssss
> 
> to indicate a 70% likelihood (35 "s"s).  You could then match it with
> 
>     X-Spam-Value: s{25,50}
> 
> in procmail to spam-categorize anything with a probability of spamhood >=
> 50%.  You could include a readable X-Spam header like:
> 
>     X-Spam: rated 75% probability of being spam by "Spam Pie v. 0.1"

Um, yick!-)  The idea of using a bar-like representation of the assessment
strikes me like suggesting presentation of the info in a graph, and then
screen-scraping to evaluate the graph.  Aieee!

How about a spam-estimate of 0-9?  Pretty darn easy to match.  I wouldn't
imagine the lack of precision is going to be a problem, in this domain...

Or is this all too off-topic?

Ken
klm@digicool.com



From skip@pobox.com (Skip Montanaro)  Tue Jul 24 22:54:54 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 24 Jul 2001 16:54:54 -0500
Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together?
In-Reply-To: <200107241929.PAA07684@cj20424-a.reston1.va.home.com>
References: <15196.43097.529737.173915@beluga.mojam.com>
 <3B5D4ACF.F79CD179@lemburg.com>
 <200107241929.PAA07684@cj20424-a.reston1.va.home.com>
Message-ID: <15197.61102.391599.162359@beluga.mojam.com>

>>>>> "Guido" == Guido van Rossum <guido@digicool.com> writes:

    >> Skip Montanaro wrote:
    >> > 
    >> > There are several active or could-be-active PEPs related to Python's numeric
    >> > behavior:
    >> > 
    >> >      S   211  pep-0211.txt  Adding A New Outer Product Operator    Wilson
    >> >      S   228  pep-0228.txt  Reworking Python's Numeric Model       Zadka
    >> >      S   237  pep-0237.txt  Unifying Long Integers and Integers    Zadka
    >> >      S   238  pep-0238.txt  Non-integer Division                   Zadka
    >> >      S   239  pep-0239.txt  Adding a Rational Type to Python       Zadka
    >> >      S   240  pep-0240.txt  Adding a Rational Literal to Python    Zadka
    >> >      S   242  pep-0242.txt  Numeric Kinds                          Dubois

    Guido> I think PEP 211 and PEP 242 don't belong in this list.  PEP 211
    Guido> doesn't affect Python's number system at all, and PEP 242
    Guido> proposes a set of storage choices, not choices in semantics.  PEP
    Guido> 242 is valid regardless of what we decide about int division.

The inclusion of PEP 211 in this message was an oversight.  I pasted this
list from another message.  I included PEP 242 on purpose however.  I think
Paul gives you a language for perhaps defining other sorts of numeric
properties besides numeric precision (which is what my reading led me to
believe it was focused on).

    ...

    Guido> But it's different the other way around: PEP 238 can easily stand
    Guido> on its own.  It addresses a problem that exists even without a
    Guido> unified numeric model.

    Guido> Conversely, if PEP 238 is unacceptable, PEP 228 also has no hope,
    Guido> and PEP 239 is much less attractive.  Since PEP 238 is the only
    Guido> one that cannot avoid breaking existing code, I want to introduce
    Guido> it as soon as I can, since the others can only be introduced
    Guido> after the full compatibility waiting period for PEP 238, at least
    Guido> two years.

    ...

    Guido> If we introduce rationals, and we redefine int division as
    Guido> returning a rational instead of a float, this will not affect the
    Guido> mathematical value.

    ...

    Guido> I am currently maintaining the PEP 238 implementation as a patch;
    Guido> I don't want to start any new branches before we've merged the
    Guido> descr-branch into the trunk.

I elided a bunch of valuable information, stuff I was previously unaware of.
The acceptability or not of PEP 238 in the broader Python community appears
to be based on people only looking back.  As far as I know most people
aren't aware of the long-term motivation.  (It may have been there in one of
Guido's or Tim's messages, but if so, I missed it.)  I certainly wasn't
aware of the motivation, and I just read the above PEPs in the past day or
two.  Connecting all that together (a "meta PEP"?)  probably belongs in PEP
228.

Here's what I propose.  Once the descr-branch has been merged, create a new
branch, call it mouse-branch.  Add the PEP 238 and other changes there and
update PEP 228 (last change: 4 Nov 2000) to include the rationale I deleted
from Guido's message.  Then urge anyone with an interest in any of these
topics to check out the mouse from CVS and play with it.  (Just don't squish
it, that's the Python's job!)  Initially, it will just have the one change
that has stirred up such a hornet's nest.  Still, even that will be
instructive to play with, and in concert with a stronger motivation for the
change in PEP 228 (and perhaps PEP 238) should help soften the blow caused
by the change.  As I mentioned in a previous message, I think you have one
chance to make this change.  If people perceive that "hey, he's going
somewhere interesting with this stuff", I think they will be more open to
the discomfort of individual changes.

Then, once you're ready (I don't know if 2.2 is far enough out), have the
Python eat the mouse and start a rat-branch that incorporates all the
rational stuff (having never used a programming language that supported
rational numbers, I find the prospect both a bit daunting and exciting).
That branch will live for a fairly long time, probably at least until 2.4,
when the int division change is complete, at which point the Python can eat
the rat.

    Guido> Have you looked at my PEP-238 patch at all?  

Not yet.  Should it be applied to the head branch or the descr-branch?

Skip



From skip@pobox.com (Skip Montanaro)  Tue Jul 24 23:20:59 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 24 Jul 2001 17:20:59 -0500
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: <15197.59408.52053.335198@anthem.wooz.org>
References: <15197.44710.656892.910976@beluga.mojam.com>
 <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
 <15197.59408.52053.335198@anthem.wooz.org>
Message-ID: <15197.62667.379233.918619@beluga.mojam.com>

    Fred> There is the python-numerics mailing list on SourceForge; find it
    Fred> from the Python project page there:

    Fred> http://sourceforge.net/projects/python/

    BAW> Ah.  Shouldn't this page

    BAW>     http://www.python.org/sigs/

    BAW> point to this page

    BAW>     http://sourceforge.net/mail/?group_id=5470

    BAW> ???

Looks like we have at least three pages that list related info:

    http://www.python.org/sigs/
    http://mail.python.org/
    http://sourceforge.net/mail/?group_id=5470

Can they be unified?

Skip


From skip@pobox.com (Skip Montanaro)  Tue Jul 24 23:38:59 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 24 Jul 2001 17:38:59 -0500
Subject: [Python-Dev] mail.python.org black listed ?!
In-Reply-To: <Pine.LNX.4.21.0107241738080.30847-100000@serenade.digicool.com>
References: <15196.41226.977676.237807@beluga.mojam.com>
 <Pine.LNX.4.21.0107241738080.30847-100000@serenade.digicool.com>
Message-ID: <15197.63747.743439.630607@beluga.mojam.com>

    >> X-Spam-Value: sssssssssssssssssssssssssssssssssss

    Ken> Um, yick!-) The idea of using a bar-like representation of the
    Ken> assessment strikes me like suggesting presentation of the info in a
    Ken> graph, and then screen-scraping to evaluate the graph.  Aieee!

Well, that's not what I had in mind, but if that floats your boat.  The
fundamental problem I'm trying to solve is that many mail filter programs can't do
numeric comparisons at all.  I'm used to procmail which allows me to easily
use regular expressions.  Somebody else (Barry?  Marc-Andre?) suggested

    X-Spam-Value: 0123456789

(90% probability) or

    X-Spam-Value: 012345

(50% probability)

which can be matched by feeble filters like Netscape's that only support
substring matches ("X-Spam-Value: 0123456" would match anything of 60%
probability or higher).
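
A minimal sketch of the cumulative-digit scheme, assuming the rating is
an integer percentage bucketed by tens.  The function names are
hypothetical; only the header layout comes from the examples above.

```python
DIGITS = "0123456789"
PREFIX = "X-Spam-Value: "

def spam_value_header(percent):
    """Build the cumulative-digit header for a spam rating given in percent (0-100)."""
    tens = min(percent // 10, 9)
    return PREFIX + DIGITS[: tens + 1]

def at_least(header, percent):
    """Plain substring test, as a regex-free filter would apply it."""
    needle = PREFIX + DIGITS[: min(percent // 10, 9) + 1]
    return needle in header

print(spam_value_header(90))
print(at_least(spam_value_header(75), 60))
```

Because each header contains every digit up to its own bucket, a single
substring rule covers "this bucket or higher".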

    Ken> How about a spam-estimate of 0-9?  Pretty darn easy to match.  I
    Ken> wouldn't imagine the lack of precision is going to be a problem, in
    Ken> this domain...

You're unfortunately back to trying to make numeric comparisons or using
fairly complex regular expressions to perform the comparisons.  The poor
saps using Netscape would have to have four rules to match a 60% or higher
spam probability.

This discussion almost certainly doesn't belong on python-dev.  Is there a
more appropriate Python-related list already in existence in which to hatch
these ideas?

Skip


From tim@digicool.com  Tue Jul 24 23:49:50 2001
From: tim@digicool.com (Tim Peters)
Date: Tue, 24 Jul 2001 18:49:50 -0400
Subject: [Python-Dev] mail.python.org black listed ?!
In-Reply-To: <15197.63747.743439.630607@beluga.mojam.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEKHCDAA.tim@digicool.com>

[Ken Manheimer]
> How about a spam-estimate of 0-9?  Pretty darn easy to match.

[Skip Montanaro]
> You're unfortunately back to trying to make numeric comparisons or
> using fairly complex regular expressions to perform the comparisons.
> The poor saps using Netscape would have to have four rules to match a
> 60% or higher spam probability.

I don't use Netscape or know which flavor of regexps it supports, but I
don't know of any regexp pkg that lacks character-class support.  That is,
in Python regexp syntax,

    r"X-Spam-Whatever:\s+[6-9]"  # match >= 60%

    r"X-Spam-Whatever:\s+[0-5]" # match < 60%

    r"X-Spam-Whatever:\s+[2357]" # match int(spamprob/10) is prime
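
A small sketch of the character-class approach with the re module,
assuming a header that carries a single rating digit 0-9 under the
placeholder name X-Spam-Whatever.  is_probable_spam is a hypothetical
helper name.

```python
import re

HIGH = re.compile(r"X-Spam-Whatever:\s+[6-9]")  # rating digit 6-9, i.e. >= 60%
LOW = re.compile(r"X-Spam-Whatever:\s+[0-5]")   # rating digit 0-5, i.e. < 60%

def is_probable_spam(header_line):
    """True when the header's rating digit falls in the 6-9 class."""
    return HIGH.match(header_line) is not None

print(is_probable_spam("X-Spam-Whatever: 7"))
print(is_probable_spam("X-Spam-Whatever: 3"))
```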

> This discussion almost certainly doesn't belong on python-dev.  Is
> there a more appropriate Python-related list already in existence in
> which to hatch these ideas?

Only one that comes to mind is the numerics list, since this *is* about
numeric comparisons, and has the potential to become ugly <wink>>



From mwh@python.net  Tue Jul 24 23:56:32 2001
From: mwh@python.net (Michael Hudson)
Date: 24 Jul 2001 18:56:32 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib BaseHTTPServer.py,1.15,1.16 SocketServer.py,1.25,1.26 ftplib.py,1.53,1.54 httplib.py,1.35,1.36 poplib.py,1.14,1.15 smtplib.py,1.36,1.37 telnetlib.py,1.11,1.12
References: <E15P8sc-00037H-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <2mvgkiuean.fsf@starship.python.net>

"Martin v. Löwis" <loewis@users.sourceforge.net> writes:

> Update of /cvsroot/python/python/dist/src/Lib
> In directory usw-pr-cvs1:/tmp/cvs-serv11791
> 
> Modified Files:
> 	BaseHTTPServer.py SocketServer.py ftplib.py httplib.py 
> 	poplib.py smtplib.py telnetlib.py 
> Log Message:
> Patch #401196: Use getaddrinfo and AF_INET6 in TCP servers and clients.

> ! 	for res in socket.getaddrinfo(self.host, self.port, 0, socket.SOCK_STREAM):
> ! 	    af, socktype, proto, canonname, sa = res
> ! 	    try:
> ! 		self.sock = socket.socket(af, socktype, proto)
> ! 		self.sock.connect(sa)
> ! 	    except socket.error, msg:
> ! 		self.sock.close()
> ! 		self.sock = None
> ! 		continue
> ! 	    break
> ! 	if not self.sock:
> ! 	    raise socket.error, msg

> ! 	for res in socket.getaddrinfo(None, 0, self.af, socket.SOCK_STREAM, 0, socket.AI_PASSIVE):
> ! 	    af, socktype, proto, canonname, sa = res
> ! 	    try:
> ! 		sock = socket.socket(af, socktype, proto)
> ! 		sock.bind(sa)
> ! 	    except socket.error, msg:
> ! 		sock.close()
> ! 		sock = None
> ! 		continue
> ! 	    break
> ! 	if not sock:
> ! 	    raise socket.error, msg

> !  	for res in socket.getaddrinfo(self.host, self.port, 0, socket.SOCK_STREAM):
> !  	    af, socktype, proto, canonname, sa = res
> !  	    try:
> !  		self.sock = socket.socket(af, socktype, proto)
> ! 		if self.debuglevel > 0:
> ! 		    print "connect: (%s, %s)" % (self.host, self.port)
> ! 		self.sock.connect(sa)
> ! 	    except socket.error, msg:
> ! 		if self.debuglevel > 0:
> ! 		    print 'connect fail:', (self.host, self.port)
> ! 		self.sock.close()
> ! 		self.sock = None
> ! 		continue
> ! 	    break
> ! 	if not self.sock:
> ! 	    raise socket.error, msg

> ! 	for res in socket.getaddrinfo(self.host, self.port, 0, socket.SOCK_STREAM):
> ! 	    af, socktype, proto, canonname, sa = res
> ! 	    try:
> ! 		self.sock = socket.socket(af, socktype, proto)
> ! 		self.sock.connect(sa)
> ! 	    except socket.error, msg:
> ! 		self.sock.close()
> ! 		self.sock = None
> ! 		continue
> ! 	    break
> ! 	if not self.sock:
> ! 	    raise socket.error, msg

> !  	for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
> !  	    af, socktype, proto, canonname, sa = res
> !  	    try:
> !  		self.sock = socket.socket(af, socktype, proto)
> !  		if self.debuglevel > 0: print 'connect:', (host, port)
> !  		self.sock.connect(sa)
> !  	    except socket.error, msg:
> !  		if self.debuglevel > 0: print 'connect fail:', (host, port)
> !  		self.sock.close()
> !  		self.sock = None
> !  		continue
> !  	    break
> ! 	if not self.sock:
> !  	    raise socket.error, msg

> ! 	for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
> ! 	    af, socktype, proto, canonname, sa = res
> ! 	    try:
> ! 		self.sock = socket.socket(af, socktype, proto)
> ! 		self.sock.connect(sa)
> ! 	    except socket.error, msg:
> ! 		self.sock.close()
> ! 		self.sock = None
> ! 		continue
> ! 	    break
> !         if not self.sock:
> ! 	    raise socket.error, msg

Excuse my ignorance, but: A case for refactoring?
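
One possible shape for such a refactoring, as a sketch only: a shared
helper that the client modules could call instead of repeating the
loop.  It is written in present-day Python 3 syntax, and connect_first
is a hypothetical name, not something in the patch.

```python
import socket

def connect_first(host, port, socktype=socket.SOCK_STREAM):
    """Try each address from getaddrinfo in order; return the first connected socket."""
    last_error = None
    for af, kind, proto, canonname, sa in socket.getaddrinfo(host, port, 0, socktype):
        sock = None
        try:
            sock = socket.socket(af, kind, proto)
            sock.connect(sa)
            return sock
        except OSError as exc:
            last_error = exc
            # Close only if socket() itself succeeded.
            if sock is not None:
                sock.close()
    if last_error is None:
        last_error = OSError("getaddrinfo returned no addresses")
    raise last_error
```

A caller would then write self.sock = connect_first(self.host, self.port)
and add its own debug output around the call.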

Also this patch introduced some hard tabs, but I guess Tim'll beat
these to death with reindent.py at some point...

Cheers,
M.



From guido@digicool.com  Wed Jul 25 00:02:06 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 24 Jul 2001 19:02:06 -0400
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: Your message of "Tue, 24 Jul 2001 17:20:59 CDT."
 <15197.62667.379233.918619@beluga.mojam.com>
References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> <15197.59408.52053.335198@anthem.wooz.org>
 <15197.62667.379233.918619@beluga.mojam.com>
Message-ID: <200107242302.TAA08477@cj20424-a.reston1.va.home.com>

> Looks like we have at least three pages that list related info:
> 
>     http://www.python.org/sigs/
>     http://mail.python.org/
>     http://sourceforge.net/mail/?group_id=5470
> 
> Can they be unified?

I don't see how.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com (Skip Montanaro)  Wed Jul 25 00:21:43 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 24 Jul 2001 18:21:43 -0500
Subject: [Python-Dev] mail.python.org black listed ?!
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHCEKHCDAA.tim@digicool.com>
References: <15197.63747.743439.630607@beluga.mojam.com>
 <BIEJKCLHCIOIHAGOKOLHCEKHCDAA.tim@digicool.com>
Message-ID: <15198.775.101437.858823@beluga.mojam.com>

    Tim> [Skip Montanaro]
    >> You're unfortunately back to trying to make numeric comparisons or
    >> using fairly complex regular expressions to perform the comparisons.
    >> The poor saps using Netscape would have to have four rules to match a
    >> 60% or higher spam probability.

    Tim> I don't use Netscape or know which flavor of regexps it supports,
    Tim> but I don't know of any regexp pkg that lacks character-class
    Tim> support.  That is, in Python regexp syntax,

Yeah, but Netscape apparently doesn't support regexs in its mail filters at
all, just substring matches.

    >> This discussion almost certainly doesn't belong on python-dev.  Is
    >> there a more appropriate Python-related list already in existence in
    >> which to hatch these ideas?

    Tim> Only one that comes to mind is the numerics list, since this *is*
    Tim> about numeric comparisons, and has the potential to become ugly
    Tim> <wink>>

Hey, not a bad idea.  I just subscribed and the archives suggest it's been
idle.  Perhaps I can hijack it... ;-)

S



From skip@pobox.com (Skip Montanaro)  Wed Jul 25 00:24:33 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 24 Jul 2001 18:24:33 -0500
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: <200107242302.TAA08477@cj20424-a.reston1.va.home.com>
References: <15197.44710.656892.910976@beluga.mojam.com>
 <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
 <15197.59408.52053.335198@anthem.wooz.org>
 <15197.62667.379233.918619@beluga.mojam.com>
 <200107242302.TAA08477@cj20424-a.reston1.va.home.com>
Message-ID: <15198.945.330390.874100@beluga.mojam.com>

    >> Looks like we have at least three pages that list related info:
    >> 
    >> http://www.python.org/sigs/
    >> http://mail.python.org/
    >> http://sourceforge.net/mail/?group_id=5470
    >> 
    >> Can they be unified?

    Guido> I don't see how.

Perhaps they can at least be made to point incestuously to one another
(though I imagine the sf page is completely out of our control and thus
immune to such incest)...

Skip



From itojun@iijlab.net  Wed Jul 25 00:43:53 2001
From: itojun@iijlab.net (itojun@iijlab.net)
Date: Wed, 25 Jul 2001 08:43:53 +0900
Subject: [Python-Dev] IPv6 committed
In-Reply-To: guido's message of Tue, 24 Jul 2001 17:02:30 -0400.
 <200107242102.RAA07970@cj20424-a.reston1.va.home.com>
Message-ID: <2069.996018233@itojun.org>

>Martin and Itojun,
>I would like to thank you both for adding IPv6 support to Python.
>It's a big boon for Python as well as for IPv6!

	no, thank you!  and actually the last one mile was all done by Martin,
	i would really like to thank Martin.

itojun


From itojun@iijlab.net  Wed Jul 25 00:47:17 2001
From: itojun@iijlab.net (itojun@iijlab.net)
Date: Wed, 25 Jul 2001 08:47:17 +0900
Subject: [Python-Dev] Re: IPv6 committed
In-Reply-To: martin's message of Tue, 24 Jul 2001 22:48:39 +0200.
 <200107242048.f6OKmdq01942@mira.informatik.hu-berlin.de>
Message-ID: <2086.996018437@itojun.org>

>Hi itojun,
>
>As you may have noticed, I just committed the last chunk of your IPv6
>patch. Thanks a lot for your contributions, I think you've provided a
>highly valuable contribution to Python 2.2. We still have to figure
>out a way to provide documentation, but I expect that we can complete
>that before 2.2a2.

	sorry that i'm delayed about documentation (specifically socket module).
	i have no TeX environment now (i had before) and having trouble
	checking if i'm typesetting right.  do you mind if i send you just
	plaintext?

>As with all new code, there may occur some problems; I hope you'll be
>around for the coming weeks and give the professional advice that
>you've provided throughout the integration of the code.

	as for Lib/*.py changes, there shouldn't be many changes unless you have
	faulty IPv6 connectivity - the code will try to connect to IPv6
	destination then IPv4 against FQDN hostnames, so if IPv6 connectivity
	is faulty you will see more delays.

	with Lib/ftplib.py people are most likely to see something happening
	as it will try protocol-independent FTP commands (EPSV/EPRT) first.

	anyway... if possible drop me notes.  i don't check SF too frequently
	(i cannot adapt to the SF UI).  i'll subscribe to python-dev.

itojun


From guido@digicool.com  Wed Jul 25 01:09:18 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 24 Jul 2001 20:09:18 -0400
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: Your message of "Tue, 24 Jul 2001 18:24:33 CDT."
 <15198.945.330390.874100@beluga.mojam.com>
References: <15197.44710.656892.910976@beluga.mojam.com> <15197.45048.841653.553164@cj42289-a.reston1.va.home.com> <15197.59408.52053.335198@anthem.wooz.org> <15197.62667.379233.918619@beluga.mojam.com> <200107242302.TAA08477@cj20424-a.reston1.va.home.com>
 <15198.945.330390.874100@beluga.mojam.com>
Message-ID: <200107250009.UAA08631@cj20424-a.reston1.va.home.com>

>     >> Looks like we have at least three pages that list related info:
>     >> 
>     >> http://www.python.org/sigs/
>     >> http://mail.python.org/
>     >> http://sourceforge.net/mail/?group_id=5470
>     >> 
>     >> Can they be unified?
> 
>     Guido> I don't see how.
> 
> Perhaps they can at least be made to point incestuously to one another
> (though I imagine the sf page is completely out of our control and thus
> immune to such incest)...
> 
> Skip

And the mailman page is also auto-generated.  This leaves the sigs
page, which AFAIK already points to the others (incestuously or
otherwise :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@zope.com  Wed Jul 25 03:44:58 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 24 Jul 2001 22:44:58 -0400
Subject: [Python-Dev] number-sig anyone?
References: <15197.44710.656892.910976@beluga.mojam.com>
 <15197.45048.841653.553164@cj42289-a.reston1.va.home.com>
 <15197.59408.52053.335198@anthem.wooz.org>
 <15197.62667.379233.918619@beluga.mojam.com>
 <200107242302.TAA08477@cj20424-a.reston1.va.home.com>
 <15198.945.330390.874100@beluga.mojam.com>
Message-ID: <15198.12970.318099.891202@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    >> Looks like we have at least three pages that list related info:
    >> http://www.python.org/sigs/ http://mail.python.org/
    >> http://sourceforge.net/mail/?group_id=5470 Can they be unified?

    Guido> I don't see how.

    SM> Perhaps they can at least be made to point incestuously to one
    SM> another (though I imagine the sf page is completely out of our
    SM> control and thus immune to such incest)...

Only the /sigs/ page is statically generated, so only it is easy to
change.

-Barry


From guido@digicool.com  Wed Jul 25 04:52:26 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 24 Jul 2001 23:52:26 -0400
Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together?
In-Reply-To: Your message of "Tue, 24 Jul 2001 16:54:54 CDT."
 <15197.61102.391599.162359@beluga.mojam.com>
References: <15196.43097.529737.173915@beluga.mojam.com> <3B5D4ACF.F79CD179@lemburg.com> <200107241929.PAA07684@cj20424-a.reston1.va.home.com>
 <15197.61102.391599.162359@beluga.mojam.com>
Message-ID: <200107250352.XAA01001@cj20424-a.reston1.va.home.com>

>     Guido> I think PEP 211 and PEP 242 don't belong in this list.  PEP 211
>     Guido> doesn't affect Python's number system at all, and PEP 242
>     Guido> proposes a set of storage choices, not choices in semantics.  PEP
>     Guido> 242 is valid regardless of what we decide about int division.
> 
> The inclusion of PEP 211 in this message was an oversight.  I pasted this
> list from another message.  I included PEP 242 on purpose however.  I think
> Paul gives you a language for perhaps defining other sorts of numeric
> properties besides numeric precision (which is what my reading led me to
> believe it was focused on).

Maybe, but I don't think a review of PEP 242 is necessary in order to
decide on the others.

>     ...
> 
>     Guido> But it's different the other way around: PEP 238 can easily stand
>     Guido> on its own.  It addresses a problem that exists even without a
>     Guido> unified numeric model.
> 
>     Guido> Conversely, if PEP 238 is unacceptable, PEP 228 also has no hope,
>     Guido> and PEP 239 is much less attractive.  Since PEP 238 is the only
>     Guido> one that cannot avoid breaking existing code, I want to introduce
>     Guido> it as soon as I can, since the others can only be introduced
>     Guido> after the full compatibility waiting period for PEP 238, at least
>     Guido> two years.
> 
>     ...
> 
>     Guido> If we introduce rationals, and we redefine int division as
>     Guido> returning a rational instead of a float, this will not affect the
>     Guido> mathematical value.
> 
>     ...
> 
>     Guido> I am currently maintaining the PEP 238 implementation as a patch;
>     Guido> I don't want to start any new branches before we've merged the
>     Guido> descr-branch into the trunk.
> 
> I elided a bunch of valuable information, stuff I was previously unaware of.
> The acceptability or not of PEP 238 in the broader Python community appears
> to be based on people only looking back.  As far as I know most people
> aren't aware of the long-term motivation.  (It may have been there in one of
> Guido's or Tim's messages, but if so, I missed it.)  I certainly wasn't
> aware of the motivation, and I just read the above PEPs in the past day or
> two.  Connecting all that together (a "meta PEP"?)  probably belongs in PEP
> 228.

I have Moshe's permission to co-author PEP 238, which I'll do as soon
as I'm done with my remote keynote at the O'Reilly conference (due to
circumstances beyond my control I'm not in San Diego), sometime
tomorrow.

> Here's what I propose.  Once the descr-branch has been merged, create a new
> branch, call it mouse-branch.  Add the PEP 238 and other changes there and
> update PEP 228 (last change: 4 Nov 2000) to include the rationale I deleted
> from Guido's message.  Then urge anyone with an interest in any of these
> topics to check out the mouse from CVS and play with it.  (Just don't squish
> it, that's the Python's job!)  Initially, it will just have the one change
> that has stirred up such a hornet's nest.  Still, even that will be
> instructive to play with, and in concert with a stronger motivation for the
> change in PEP 228 (and perhaps PEP 238) should help soften the blow caused
> by the change.  As I mentioned in a previous message, I think you have one
> chance to make this change.  If people perceive that "hey, he's going
> somewhere interesting with this stuff", I think they will be more open to
> the discomfort of individual changes.

That's one suggestion.  I've noticed that very few people check out
branches unless you force them.  The PEP-238 changes are localized
enough that I can maintain them as a patch in the SF patch manager;
that's easier to use for most people.

> Then, once you're ready (I don't know if 2.2 is far enough out), have the
> Python eat the mouse and start a rat-branch that incorporates all the
> rational stuff (having never used a programming language that supported
> rational numbers, I find the prospect both a bit daunting and exciting).
> That branch will live for a fairly long time, probably at least until 2.4,
> when the int division change is complete, at which point the Python can eat
> the rat.

I don't have time for rationals yet; but I do want to put phase 1 of
PEP 238 in the 2.2 release, and preferably sooner (e.g. 2.2a2) rather
than later.  Phase 1 breaks no code; all it does is add the //
operator and the future division statement.  I also plan command line
options to (1) add warnings for old-style / with int or long args; and
(2) make new-style / the default.  Both are tools (though not the only
ones) for future-proofing code.  (One goal is to make the whole
library robust under any combination of command line options; this
will require a branch or checking things in on the trunk, as it will
affect a large number of files.)
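
A minimal sketch of what phase 1 looks like from the user's side, assuming
an interpreter that already has the // operator and the future statement
(on Python 3 the statement is accepted but changes nothing, because / is
already true division there).  This is not code from the patch; the module
source is kept in a string only so that the future statement sits at the
top of its own compilation unit, and all names are illustrative.

```python
# Source of a hypothetical module that opts in to the new semantics.
SOURCE = """\
from __future__ import division
true_quotient = 7 / 2      # new-style /: always the mathematical quotient
floor_quotient = 7 // 2    # the new operator: floor division
negative_floor = -7 // 2   # floors toward negative infinity
"""

namespace = {}
exec(compile(SOURCE, "<phase1-demo>", "exec"), namespace)

print(namespace["true_quotient"])    # 3.5
print(namespace["floor_quotient"])   # 3
print(namespace["negative_floor"])   # -4
```

Code that spells truncating division as // and opts in per module should
behave the same whichever default the interpreter is started with.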

>     Guido> Have you looked at my PEP-238 patch at all?  
> 
> Not yet.  Should it be applied to the head branch or the descr-branch?

It works with either.  Also with the 2.2a1 release, I expect.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Wed Jul 25 06:01:26 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 25 Jul 2001 01:01:26 -0400
Subject: [Python-Dev] shouldn't we be considering all pending numeric  proposals together?
In-Reply-To: <3B5D4ACF.F79CD179@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEPDLAAA.tim.one@home.com>

[MAL]
> May I suggest that these rather controversial changes be carried
> out on a separate branch of the Python source tree before adding
> them to the trunk ?!

Sure, provided you're volunteering to keep the branch in synch with the
trunk:  branches are both expensive and risky, unless the intent is never to
merge in either direction.

Much as I hate the obfuscating effects of #ifdefs, these changes are
localized enough that it would be a clear net win to use them rather than
branches, if Guido gets weary of maintaining a patch.



From skip@pobox.com (Skip Montanaro)  Wed Jul 25 06:08:13 2001
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 25 Jul 2001 00:08:13 -0500
Subject: [Python-Dev] post mortem after threading deadlock?
Message-ID: <15198.21565.83325.86255@beluga.mojam.com>

Is there any possibility of getting some post-mortem info out of a
multi-threaded system whose threads are deadlocked?  I am trying to
multi-thread my xmlrpc server methods.  It's working okay "almost all the
time" and I see a wonderful overall throughput boost because slow operations
tend to no longer impede fast ones.

I got a deadlock today, however, and am wondering how I am going to go about
figuring out what happened the next time this happens.  The main thread
never deadlocks.  It just spins off a thread to handle the current
request and goes back to listening for new requests.  For the purposes of
inspecting the deadlocked threads I plan to add a method to my server that
roots around for interesting info without attempting to lock anything (and
thus possibly joining the deadlock party).  Can one thread get at any state
from other threads?  I can inspect the values of the various shared locks
and semaphores I'm using, but I was hoping to get at perhaps the current
frame of each of the deadlocked threads.  Any chance of that?  Failing that,
what about locks and semaphores that can time out and raise exceptions?  I
think they'd be useful for debugging if they could be implemented.
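
A hedged sketch of both ideas, assuming a much later Python than the one
under discussion: sys._current_frames() (a semi-private CPython helper) and
the timeout argument to Lock.acquire() were added in later releases and are
not available in 2.1.  The helper names dump_thread_stacks and
acquire_or_raise are made up for illustration.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return {thread id: formatted stack} without acquiring any application lock."""
    stacks = {}
    for thread_id, frame in sys._current_frames().items():
        stacks[thread_id] = "".join(traceback.format_stack(frame))
    return stacks


def acquire_or_raise(lock, seconds):
    """Acquire `lock`, raising RuntimeError instead of blocking forever."""
    if not lock.acquire(timeout=seconds):
        raise RuntimeError("lock not acquired within %.1f s" % seconds)


# Demonstration: a worker blocks on a lock that the main thread holds.
held = threading.Lock()
started = threading.Event()


def worker():
    started.set()
    with held:          # blocks until the main thread releases the lock
        pass


held.acquire()
thread = threading.Thread(target=worker)
thread.start()
started.wait()

stacks = dump_thread_stacks()       # inspect the worker without touching `held`
try:
    acquire_or_raise(held, 0.1)     # a second acquire cannot succeed here
    timed_out = False
except RuntimeError:
    timed_out = True

held.release()                      # let the worker finish
thread.join()
print(stacks[thread.ident])
```

A server method built along these lines could return the formatted stacks
over XML-RPC, since it never waits on the locks the stuck threads hold.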

Thx,

Skip




From m@moshez.org  Wed Jul 25 06:14:01 2001
From: m@moshez.org (m@moshez.org)
Date: Wed, 25 Jul 2001 08:14:01 +0300
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: <200107242130.RAA08105@cj20424-a.reston1.va.home.com>
References: <200107242130.RAA08105@cj20424-a.reston1.va.home.com>, <15197.44710.656892.910976@beluga.mojam.com>
 <15197.59261.754548.28233@anthem.wooz.org>
Message-ID: <E15PGzh-0004pu-00@darjeeling>

On Tue, 24 Jul 2001 17:30:52 -0400, Guido van Rossum <guido@digicool.com> wrote:

> Sounds like a good plan, but please wait until we have a SIG
> owner/moderator and a charter.  Without both of these a SIG will be a
> failure.

If we do have a number-sig, I suppose python-numerics@sf.net should
die and merge into that, right?
I am all for that, but I won't be volunteering to be the champion...
I've learned my lesson about spreading myself too thin.
-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE
Learn Python! http://www.ibiblio.org/obp/thinkCSpy


From martin@loewis.home.cs.tu-berlin.de  Wed Jul 25 07:31:53 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 25 Jul 2001 08:31:53 +0200
Subject: [Python-Dev] Re: IPv6 committed
In-Reply-To: <2086.996018437@itojun.org>
References: <2086.996018437@itojun.org>
Message-ID: <200107250631.f6P6Vr602375@mira.informatik.hu-berlin.de>

> 	sorry that i'm delayed about documentation (specifically socket module).
> 	i have no TeX environment now (i had before) and having trouble
> 	checking if i'm typesetting right.  do you mind if i send you just
> 	plaintext?

That's fine; I'll then put it into the python-tex format - in the end,
Fred will compile it into HTML and publish it on SF (devel-docs), so
that we can check whether it came out right.

> 	anyway... if possible drop me notes.

I'll keep you informed, no problem.

Regards,
Martin


From martin@strakt.com  Wed Jul 25 09:28:05 2001
From: martin@strakt.com (Martin Sjögren)
Date: Wed, 25 Jul 2001 10:28:05 +0200
Subject: [Python-Dev] Memory leaks?
Message-ID: <20010725102805.A24723@strakt.com>

I'm a bit curious about the memory handling of the Py_InitModule4...

When adding methods, if PyCFunction_New fails, NULL is returned without
the module object being DECREF'd, and similarly, if
PyDict_SetItemString fails, NULL is returned without either the module
object or the function object being DECREF'd.

Is this a problem, or is this taken care of somewhere else?

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 405242        Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html


From tanzer@swing.co.at  Wed Jul 25 09:37:04 2001
From: tanzer@swing.co.at (Christian Tanzer)
Date: Wed, 25 Jul 2001 10:37:04 +0200
Subject: [Python-Dev] Re: Future division patch available (PEP 238)
In-Reply-To: Your message of "Sun, 22 Jul 2001 00:36:38 EDT."
 <200107220436.AAA05323@cj20424-a.reston1.va.home.com>
Message-ID: <m15PKAC-000wcBC@swing.co.at>

Guido,

I deeply respect your language design skills and I appreciate your
ongoing improvements. I'm impatiently looking forward to using many of
the features new in 2.2.

The future division patch is a different issue, though. While I agree
that the proposed changes are a definite improvement over the current
semantics, the issue of backwards compatibility is a huge problem
difficult to solve. I see the following problems:

- What's going to happen to code released into the wild (i.e., I can
  change all my code but what about code I gave to others)?

- How can one write readable code working correctly in both old and
  new Python versions?

- It takes a potentially huge effort to change all the existing code.

- In some cases, warnings might not be seen (e.g., they might land in
  /dev/null or in some log files nobody looks at).

- If an application jumps versions (e.g., from 2.1 to 2.6), no warnings
  might be generated at all.

- Upgrading to a new version of an application might break user scripts or
  databases.

I have no idea how to tackle all these issues but I'll offer some
ideas nevertheless.

- If `int` always truncated (instead of truncating or rounding,
  depending on how the C compiler does it), one could write reasonably
  readable version independent code for truncating integer division.

  Compare `int (a/b)` to `divmod (a, b) [0]` or
  `int (math.floor (a/b))`.

- Just a wild idea: the problem you want to solve is that the existing
  division operator mixes two totally different meanings and thus
  leads to nasty surprises.

  What if `/` applied to two integer values returned neither an
  integer nor a float but an object carrying the float result but
  behaving like an integer if used in an integer context?

  For instance:

      >>> x = 1/2
      >>> type(x)
      <type 'Ratio'>
      >>> print "%d %f %s" % (x, x, x)
      0 0.5 0.5
      >>> 2 * x
      0
      >>> 2. * x
      1.0

   The difficult issue here is how `integer context` is defined.
   Should multiplication by an integer be considered an integer
   context? Pro: would preserve correctness of existing code like
   `(size / 8) * 8`. Con: is incompatible with Rationals which might
   be added in the future.

- Command line options are not a good way of handling this -- in many
  cases, different modules might need different settings. Even worse,
  looking at the code of a module won't tell you what option to use.

- If there is a possibility of specifying division semantics on a
  per module case (via a directive or the file extension), it should
  also be possible to specify the semantics for thingies like
  `compile`, `execfile`, `exec`, and `eval`.

  This only works if absence of a semantics indicator means old-style
  division.

  I think this would go the farthest to alleviate compatibility
  problems. I understand your desire to avoid dragging the past around
  with you wherever you go and I like Python for its cleanliness. But
  in this case, it might be worthwhile to carry the ballast.
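
As a rough illustration of the first idea above, the three spellings can be
compared directly.  This sketch assumes / is true division (Python 3, or a
module using the future statement); the function name is made up.

```python
import math


def spellings(a, b):
    """Evaluate three candidate spellings of integer division under true-division /."""
    return {
        "int(a/b)": int(a / b),                          # truncates toward zero
        "divmod(a, b)[0]": divmod(a, b)[0],              # floors
        "int(math.floor(a/b))": int(math.floor(a / b)),  # floors
    }


for a, b in [(7, 2), (-7, 2)]:
    print(a, b, spellings(a, b))
```

For positive operands all three agree (3).  For -7 and 2 the first gives -3
while the other two give -4.  Under old-style division a/b on two ints
already floors, so int(a/b) would yield -4 there: that spelling is only
version independent when the operands have the same sign.  The two
float-based spellings can also lose precision for very large integers.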

Let me outline the problems faced by my current customer TTTech (I'm
working as a consultant for them). [This is going to be long -- sorry.]

TTTech provides design and on-line tools for embedded distributed
real-time systems in safety critical application domains (e.g.,
aerospace, automotive, ...). TTTech sells software tools (programmed
in Python) to customers worldwide.

Currently, there is a major release once a year. Due to various
reasons, the shipped tools normally don't use the most recent version
of Python. The current release is still based on 1.5.2. We hope to use
Python 2.1 for the release planned for the end of the year.
Internally, we try to use the most recent Python version. Therefore,
our Python code must be compatible to several Python versions.

The division change affects:

- Python programs
- Python scripts
- user scripts
- design databases

Python programs
---------------

I just used Skip's div-finder (thanks, Skip) to check the code of
three of our applications. It finds 391 uses of division. I know that
many of those are meant to be truncating divisions, while many others
are meant to be floating divisions. Somebody will have to look at each
and every one and fix it -- automatic conversion won't be possible.

Unfortunately, the applications also contain lots of code inside of
strings fed to eval or exec during run-time. I don't know
how many divisions are in those, but somebody will have to look at
them as well. This is one area frequently overlooked when the effect
of changes is discussed and conversion tools proposed on c.l.py.

As these tools are frozen, they don't depend on what Python version
the user has installed.

Python scripts
--------------

Internally, TTTech uses quite a number of Python scripts.
Unfortunately, different users have different Python versions
installed. Currently, 1.5.2, 2.0, and 2.1 are installed (there might
still be the odd 1.5.1 around somewhere, too). As the scripts are
taken from a central file server whereas Python is installed locally,
the scripts and the library modules used must be compatible to all the
Python versions deployed. That makes migration difficult if the same
symbol means grossly different things in different versions.

User scripts
------------

Our tools are user scriptable. These scripts are written in Python and
executed in the application's context via execfile (they don't work as
standalone scripts).

Such scripts are written and maintained by unknown customers who may
or may not be programmers and who may or may not have Python
experience. (One of the nice features of Python is that even a
computer naive user can start writing scripts with little knowledge
about Python by modifying examples). Quite often, important scripts
have been implemented by people who since changed jobs.

Various customers use such scripts for interfacing to other tools,
creating designs, checking designs for conformance, writing test
cases, implementing test frameworks, generating design reports, ...

The delivery of a new tool version ***must not break*** such scripts.
TTTech simply cannot tell their customers that they have to review all
their scripts and change some but not all occurrences of the division
operator. We don't want to get stuck with an outdated Python version,
either.

OTOH, we cannot assume old style semantics in the scripts either as
new users might never have heard about how division used to work in
warty versions of Python.

Design databases
-----------------

Design databases store the design of an embedded distributed
real-time system as specified by the user. Such databases must stay
alive for a looooong time (think of 10+ years for some application
domains).

Our tools allow the specification of symbolic expressions by the user.
Such expressions are fed through eval at the right time (i.e., late)
to get a numeric value. The symbolic expressions are stored in the
database as entered by the user. Reading an old database with a new
tool version ***must not change*** the semantics.
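
One way such a requirement could be met, sketched under loud assumptions:
this is not how the TTTech tools work.  It supposes the database records
which division semantics an expression was written under (the
legacy_division flag below is hypothetical) and evaluates a small
arithmetic subset by walking the syntax tree instead of calling eval.  It
needs Python 3.8 or later for ast.Constant.

```python
import ast
import operator

_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate(expression, names, legacy_division):
    """Evaluate a stored arithmetic expression with an explicit meaning for /.

    legacy_division=True:  / on two ints floors (old-style behaviour).
    legacy_division=False: / is always true division.
    """
    def divide(left, right):
        if legacy_division and isinstance(left, int) and isinstance(right, int):
            return left // right
        return left / right

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return names[node.id]
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        if isinstance(node, ast.BinOp):
            if isinstance(node.op, ast.Div):
                return divide(walk(node.left), walk(node.right))
            if type(node.op) in _BINARY:
                return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression element: %r" % (node,))

    return walk(ast.parse(expression, mode="eval"))


print(evaluate("(size / 8) * 8", {"size": 20}, legacy_division=True))   # 16
print(evaluate("(size / 8) * 8", {"size": 20}, legacy_division=False))  # 20.0
```

The same stored text yields 16 or 20.0 depending only on the recorded flag,
not on which interpreter version happens to read the database.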

To be honest, for TTTech design databases the change in division
probably doesn't pose any problems. Due to user demand, the tools
coerced divisions to floating point for a long time. Other companies
might be bitten in this way, though.

-- 
Christian Tanzer                                         tanzer@swing.co.at
Glasauergasse 32                                       Tel: +43 1 876 62 36
A-1130 Vienna, Austria                                 Fax: +43 1 877 66 92



From mal@lemburg.com  Wed Jul 25 10:42:06 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 25 Jul 2001 11:42:06 +0200
Subject: [Python-Dev] shouldn't we be considering all pending numeric
 proposals together?
References: <LNBBLJKPBEHFEDALKOLCOEPDLAAA.tim.one@home.com>
Message-ID: <3B5E946E.1105C79B@lemburg.com>

Tim Peters wrote:
> 
> [MAL]
> > May I suggest that these rather controversial changes be carried
> > out on a separate branch of the Python source tree before adding
> > them to the trunk ?!
> 
> Sure, provided you're volunteering to keep the branch in synch with the
> trunk:  branches are both expensive and risky, unless the intent is never to
> merge in either direction.

As you may have guessed: I'm not particularly interested in any
change to the status quo w/r to Python's treatment of integer
division, since I know that I have used the current C-like
behaviour in code I've written in the past few years and that 
finding this code will be a nightmare.

PEP 238 doesn't help with this either since it still changes
the semantics of '/' instead of keeping them and adding the
new semantics using a new operator '//' which wouldn't break
anything and still make people happy.

Also, I think that the warning framework will not help much for 
moving to PEP 238:  if you generate a warning for every source 
code occurrence of '/' where integer division takes place, this 
will render at least some programs unusable: either due to the 
slow-down of having to branch through the warning machinery only 
to find that the user doesn't want to see the warning or by 
producing stderr messages in quantities which will keep any user 
out there from using the program.

OTOH, I wouldn't mind if we add a per-module directive which then
tells the compiler to generate new style semantics integer
division opcodes. Guido's patch already implements this, except
that it uses the magic __future__ import which will be phased
out eventually... how about a "from __semantics__ import 
non_integer_division" which does not have a timeout attached 
to it ?!

> Much as I hate the obfuscating effects of #ifdefs, these changes are
> localized enough that it would be a clear net win to use them rather than
> branches, if Guido gets weary of maintaining a patch.

If that's feasible, sure...

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From mal@lemburg.com  Wed Jul 25 11:44:55 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 25 Jul 2001 12:44:55 +0200
Subject: [Python-Dev] Daily CVS snapshots
Message-ID: <3B5EA327.F63D37EF@lemburg.com>

Looks like the daily CVS snapshots are not working anymore:

	http://python.sourceforge.net/snapshots/

"""
Daily snapshots from Python CVS repository

     python-20010501 tar.gz .zip 
     python-20010430 tar.gz .zip 
     python-20010429 tar.gz .zip 
     ...
"""

Also, the .zip link points to a .gzip file ?!

Could someone please check this ? 

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From fdrake@acm.org  Wed Jul 25 12:48:05 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Jul 2001 07:48:05 -0400 (EDT)
Subject: [Python-Dev] Daily CVS snapshots
In-Reply-To: <3B5EA327.F63D37EF@lemburg.com>
References: <3B5EA327.F63D37EF@lemburg.com>
Message-ID: <15198.45557.572298.868990@cj42289-a.reston1.va.home.com>

M.-A. Lemburg writes:
 > Looks like the daily CVS snapshots are not working anymore:
...
 >      python-20010501 tar.gz .zip 

  I suspect this date is tied to a furniture move at the PythonLabs
office; Jeremy's workstation was unplugged without his assistance, and
so things may not have come back up correctly.
  I don't think any of us know how he has this set up off-hand.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From loewis@informatik.hu-berlin.de  Wed Jul 25 13:03:48 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Wed, 25 Jul 2001 14:03:48 +0200 (MEST)
Subject: [Python-Dev] Opening sockets in protocol-independent manner (was: BaseHTTPServer.py etc)
Message-ID: <200107251203.OAA27691@pandora.informatik.hu-berlin.de>

> Excuse my ignorance, but: A case for refactoring?

Certainly, but it is debatable what exactly the best refactorization
is. Abstractly, these fall into two cases

- open a stream connection to some address (aka "client socket")
- open a server socket to wait for incoming clients

Either of these may find that it opens AF_INET, AF_INET6, or AF_UNIX
sockets, depending on the values of host and port, and depending on
what name lookup returns. Also, similar procedures are required for
opening datagram sockets.
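
For the client case, the loop in question might look roughly like this in
Python.  It is a sketch, not a proposed library API: open_client_socket is
a made-up name, and only socket.getaddrinfo(), socket.socket() and
connect() are assumed.

```python
import socket


def open_client_socket(host, port):
    """Try each address getaddrinfo() returns until one of them connects."""
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:       # address family not supported locally
            last_error = exc
            continue
        try:
            sock.connect(sockaddr)
        except OSError as exc:       # this address failed; try the next one
            sock.close()
            last_error = exc
            continue
        return sock                  # AF_INET or AF_INET6, whichever worked
    raise last_error or OSError("getaddrinfo returned no usable address")
```

The caller never names an address family; it gets back whichever kind of
socket connected first.  Later Python releases ship a similar helper as
socket.create_connection().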

In Ruby, this loop is done completely in C code, and there is a number
of wrapper classes to access the various options:

- TCPSocket opens a client stream socket, i.e. does getaddrinfo(),
  socket(), connect()
- UDPSocket opens a client datagram socket. Same as TCPSocket, only
  that it uses SOCK_DGRAM
- TCPServer opens a server stream socket, i.e. does getaddrinfo,
  socket, bind, and listen(5)
- UDPServer likewise
- UNIXSocket opens a client AF_UNIX socket
- UNIXServer opens a server AF_UNIX socket
- Socket does socket() only, allowing for subsequent other low-level
  calls

There are some base classes: IPSocket is the base for all
{TCP,UDP}{Socket,Server}; BasicSocket is base for UNIX{Socket,Server},
IPSocket, and Socket.

I cannot say that I particularly like this API, but I could not easily
find other/better generalizations. Therefore, no API is defined,
yet. Please note that refactorizing "for internal use only" is not an
acceptable solution: This is the Python library, so any function that
gets defined has to be supported for quite some time.

Any new API probably needs to take the existing SocketServer into
account, also.

Regards,
Martin


From mal@lemburg.com  Wed Jul 25 13:11:20 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 25 Jul 2001 14:11:20 +0200
Subject: [Python-Dev] Daily CVS snapshots
References: <3B5EA327.F63D37EF@lemburg.com> <15198.45557.572298.868990@cj42289-a.reston1.va.home.com>
Message-ID: <3B5EB768.9EF49C94@lemburg.com>

"Fred L. Drake, Jr." wrote:
> 
> M.-A. Lemburg writes:
>  > Looks like the daily CVS snapshots are not working anymore:
> ...
>  >      python-20010501 tar.gz .zip
> 
>   I suspect this date is tied to a furniture move at the PythonLabs
> office; Jeremy's workstation was unplugged without his assistance, and
> so things may not have come back up correctly.
>   I don't think any of us know how he has this set up off-hand.

Wouldn't it be possible to set up a CRON job on SF which takes
care of these snapshots ? I have no idea how to do this myself
(and probably don't have the necessary permissions), but since the
pep2html.py tool also uploads into the SF web-area, I suppose that
this is possible.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From fdrake@acm.org  Wed Jul 25 14:41:22 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Jul 2001 09:41:22 -0400 (EDT)
Subject: [Python-Dev] Daily CVS snapshots
In-Reply-To: <3B5EB768.9EF49C94@lemburg.com>
References: <3B5EA327.F63D37EF@lemburg.com>
 <15198.45557.572298.868990@cj42289-a.reston1.va.home.com>
 <3B5EB768.9EF49C94@lemburg.com>
Message-ID: <15198.52354.271165.285519@cj42289-a.reston1.va.home.com>

M.-A. Lemburg writes:
 > Wouldn't it be possible to set up a CRON job on SF which takes
 > care of these snapshots ? I have no idea how to do this myself

  Sure, it could be done.  But we expect Jeremy to be back soon, and
fixing this is probably a 5-min operation, whereas anyone else would
need to create a new script to do the work and test it.
  I don't think it's worth worrying about for just a few days of
snapshots.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From mal@lemburg.com  Wed Jul 25 14:55:51 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 25 Jul 2001 15:55:51 +0200
Subject: [Python-Dev] Daily CVS snapshots
References: <3B5EA327.F63D37EF@lemburg.com>
 <15198.45557.572298.868990@cj42289-a.reston1.va.home.com>
 <3B5EB768.9EF49C94@lemburg.com> <15198.52354.271165.285519@cj42289-a.reston1.va.home.com>
Message-ID: <3B5ECFE7.3807915E@lemburg.com>

"Fred L. Drake, Jr." wrote:
> 
> M.-A. Lemburg writes:
>  > Wouldn't it be possible to set up a CRON job on SF which takes
>  > care of these snapshots ? I have no idea how to do this myself
> 
>   Sure, it could be done.  But we expect Jeremy to be back soon, and
> fixing this is probably a 5-min operation, whereas anyone else would
> need to create a new script to do the work and test it.
>   I don't think it's worth worrying about for just a few days of
> snapshots.

Right on all accounts :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From mal@lemburg.com  Wed Jul 25 15:11:39 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 25 Jul 2001 16:11:39 +0200
Subject: [Python-Dev] shouldn't we be considering all pending numeric
 proposals together?
References: <LNBBLJKPBEHFEDALKOLCOEPDLAAA.tim.one@home.com> <3B5E946E.1105C79B@lemburg.com>
Message-ID: <3B5ED39B.7A48C0D@lemburg.com>

I just "discovered" the loooong threads on c.l.p about PEP 238 --
looks like Guido is getting flamed badly here and I certainly don't
want to add to this, so just to summarize my previous post: the
only issue I have with PEP 238 (and all other PEPs trying to change
basic numeric properties in wild ways ;-) is backwards compatibility.

IMHO, these are all great features to have in a nice language, it's
just that the path to these features should be carefully laid
out and this is probably *much* harder to get right than the
features themselves.

BTW, I intend to make the mxNumber types subclassable once the
dust has settled over the PEP 253 (subclassing builtin types) 
et al. features. 

I believe that this should provide a nice base for experimenting
with rationals, long integers, etc. For example, it might turn
out that having int / int create a rational number would
solve most of the problems mentioned on the various threads about
PEP 238 since rationals don't lose precision and simply defer the
conversion to either integers or floats to the point where one of
the two interpretations is actually needed by the code, e.g.
an "i" parser marker will invoke truncation to an integer while
float(result) will apply the conversion to a floating point 
number. If we make rationals a subtype of integers we wouldn't
even have PyInt_Check() problems at C level.... hmm, I'm getting
carried away.
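
A small sketch of the deferred-conversion idea, using the fractions module
from later Python releases as a stand-in for a rational type (an
assumption; this is neither mxNumber nor anything proposed in the PEPs).
int_divide is a hypothetical replacement for int / int.

```python
from fractions import Fraction


def int_divide(a, b):
    """Hypothetical int / int: return an exact rational, not an int or a float."""
    return Fraction(a, b)


x = int_divide(1, 2)
print(x)                            # 1/2, no precision lost yet
print(int(x))                       # 0: truncation happens only when an int is asked for
print(float(x))                     # 0.5: conversion happens only when a float is asked for
print(int_divide(1, 3) * 3 == 1)    # True: the arithmetic stays exact
```

The caller decides which interpretation it wants at the point of use,
which is the deferral described above.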

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From mclay@nist.gov  Wed Jul 25 15:17:47 2001
From: mclay@nist.gov (Michael McLay)
Date: Wed, 25 Jul 2001 10:17:47 -0400
Subject: [meta-sig] Re: [Python-Dev] number-sig anyone?
In-Reply-To: <15197.62568.918195.19877@beluga.mojam.com>
References: <15197.44710.656892.910976@beluga.mojam.com> <15197.59261.754548.28233@anthem.wooz.org> <15197.62568.918195.19877@beluga.mojam.com>
Message-ID: <0107251017470G.02438@fermi.eeel.nist.gov>

On Tuesday 24 July 2001 06:19 pm, Skip Montanaro wrote:
>     SM> Today I took a look at http://mail.python.org/mailman/listinfo and
>     SM> could find no math-sig or number-sig mailing list.
>
>     BAW> +1.  If others agree, I'll create the sig.
>
> In light of the other responses to my mail, perhaps the python-numerics
> list on Sourceforge is as good a place to carry this conversation as a new
> SIG.

How about just holding the conversation on python-numerics?  The members 
of that list will probably be interested in any proposed changes to the 
Python numeric model.



From guido@digicool.com  Wed Jul 25 15:32:49 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 10:32:49 -0400
Subject: [Python-Dev] Daily CVS snapshots
In-Reply-To: Your message of "Wed, 25 Jul 2001 14:11:20 +0200."
 <3B5EB768.9EF49C94@lemburg.com>
References: <3B5EA327.F63D37EF@lemburg.com> <15198.45557.572298.868990@cj42289-a.reston1.va.home.com>
 <3B5EB768.9EF49C94@lemburg.com>
Message-ID: <200107251432.KAA02123@cj20424-a.reston1.va.home.com>

> "Fred L. Drake, Jr." wrote:
> > 
> > M.-A. Lemburg writes:
> >  > Looks like the daily CVS snapshots are not working anymore:
> > ...
> >  >      python-20010501 tar.gz .zip
> > 
> >   I suspect this date is tied to a furniture move at the PythonLabs
> > office; Jeremy's workstation was unplugged without his assistance, and
> > so things may not have come back up correctly.
> >   I don't think any of us know how he has this set up off-hand.

[MAL]
> Wouldn't it be possible to set up a CRON job on SF which takes
> care of these snapshots ? I have no idea how to do this myself
> (and probably don't have the necessary permissions), but since the
> pep2html.py tool also uploads into the SF web-area, I suppose that
> this is possible.

SF makes the latest snapshot available to project administrators.  I
could give you the URL but I don't think you can see them.  So there's
no need to run anything on SF, I think.

BTW, I don't think the date (May 1st) correlated to our furniture
move.  I dunno *what* happened on May 2nd, nor who makes the tar
copies (I thought it was Barry?).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Wed Jul 25 15:34:41 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Jul 2001 10:34:41 -0400 (EDT)
Subject: [Python-Dev] Daily CVS snapshots
In-Reply-To: <200107251432.KAA02123@cj20424-a.reston1.va.home.com>
References: <3B5EA327.F63D37EF@lemburg.com>
 <15198.45557.572298.868990@cj42289-a.reston1.va.home.com>
 <3B5EB768.9EF49C94@lemburg.com>
 <200107251432.KAA02123@cj20424-a.reston1.va.home.com>
Message-ID: <15198.55553.597831.427828@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > SF makes the latest snapshot available to project administrators.  I
 > could give you the URL but I don't think you can see them.  So there's
 > no need to run anything on SF, I think.

  No; SF makes tarballs of the repository available, but not snapshots
of the current state of things.

 > BTW, I don't think the date (May 1st) correlated to our furniture
 > move.  I dunno *what* happened on May 2nd, nor who makes the tar
 > copies (I thought it was Barry?).

  I'm pretty sure Barry just pushes the repository backups to tape,
but that Jeremy handles the snapshots.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From guido@digicool.com  Wed Jul 25 15:38:29 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 10:38:29 -0400
Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together?
In-Reply-To: Your message of "Wed, 25 Jul 2001 16:11:39 +0200."
 <3B5ED39B.7A48C0D@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCOEPDLAAA.tim.one@home.com> <3B5E946E.1105C79B@lemburg.com>
 <3B5ED39B.7A48C0D@lemburg.com>
Message-ID: <200107251438.KAA02162@cj20424-a.reston1.va.home.com>

> I just "discovered" the loooong threads on c.l.p about PEP 238 --
> looks like Guido is getting flamed badly here and I certainly don't
> want to add to this, so just to summarize my previous post: the
> only issue I have with PEP 238 (and all other PEPs trying to change
> basic numeric properties in wild ways ;-) is backwards compatibility.
> 
> IMHO, these are all great features to have in a nice language, it's
> just that the path to these features should be carefully laid
> out and this is probably *much* harder to get right than the
> features themselves.

Yup, and that's what I'm focusing on in my responses.  I plan to lay
out a very careful compatibility track and discuss *that* with the
community in earnest.

> BTW, I intend to make the mxNumber types subclassable once the
> dust has settled over the PEP 253 (subclassing builtin types) 
> et al. features. 

Very cool.  All extension types should be subclassable!  (Also all
built-in types.  But that's my job. :-)

> I believe that this should provide a nice base for experimenting
> with rationals, long integers, etc. For example, it might turn
> out that having int / int create a rational number would
> solve most of the problems mentioned on the various threads about
> PEP 238 since rationals don't lose precision and simply defer the
> conversion to either integers or floats to the point where one of
> the two interpretations is actually needed by the code, e.g.
> an "i" parser marker will invoke truncation to an integer while
> float(result) will apply the conversion to a floating point 
> number. If we make rationals a subtype of integers we wouldn't
> even have PyInt_Check() problems at C level.... hmm, I'm getting
> carried away.

For the folks concerned about code breakage, it doesn't make much of a
difference whether 1/2 returns a float or a rational -- in both cases
the integer division property that they want is broken.

I actually expect that most conversion jobs will be easy -- all those
folks who suffer from "Extreme Fear of Floating Point" (as Tim calls
it) can simply change every / into a // in their program (using a tool
that properly tokenizes) and they should be done, since most likely
their code never uses floating point. :-)
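
A minimal sketch of what such a tokenizing conversion tool could look like,
assuming present-day Python 3 and its standard tokenize module. The helper
name slash_to_floordiv is made up for illustration and is not the tool
referred to in this thread. Because it works on tokens, slashes inside
string literals and comments are left alone:

```python
import io
import tokenize


def slash_to_floordiv(source):
    """Rewrite every '/' and '/=' operator token as '//' and '//='."""
    lines = source.splitlines(keepends=True)
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only real operator tokens qualify; strings and comments are
        # separate token types and never match here.
        if tok.type == tokenize.OP and tok.string in ("/", "/="):
            hits.append(tok.start)  # (row, col); row is 1-based
    # Edit from the end backwards so earlier column offsets stay valid.
    for row, col in sorted(hits, reverse=True):
        line = lines[row - 1]
        lines[row - 1] = line[:col] + "/" + line[col:]
    return "".join(lines)


converted = slash_to_floordiv('x = a / b  # a/b\ns = "1/2"\nx /= 2\n')
```

As the follow-ups note, a blanket rewrite like this is only safe for code
that never divides floats.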

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jul 25 15:55:51 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 10:55:51 -0400
Subject: [Python-Dev] Memory leaks?
In-Reply-To: Your message of "Wed, 25 Jul 2001 10:28:05 +0200."
 <20010725102805.A24723@strakt.com>
References: <20010725102805.A24723@strakt.com>
Message-ID: <200107251455.KAA02295@cj20424-a.reston1.va.home.com>

[Martin Sjögren]
> I'm a bit curious about the memory handling of the Py_InitModule4...
> 
> When adding methods, if PyCFunction_New fails, NULL is returned
> without the module object being DECREF'd, and similarly, if the
> PyDict_SetItemString fails, NULL is returned without either module
> object nor function object being DECREF'd.
> 
> Is this a problem, or is this taken care of somewhere else?

The first one is not a problem.  The module 'm' is received from
PyImport_AddModule(), which (in a comment in the source) emphasizes
that the return value does not have its reference count incremented.
Instead, the module is kept alive because it is stored in sys.modules.

The second one should really DECREF v when PyDict_SetItemString()
fails.  Ditto for the docstring.

I've added a low-priority bug report, because this is very unlikely to
happen.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From perry@stsci.edu  Wed Jul 25 16:02:21 2001
From: perry@stsci.edu (Perry Greenfield)
Date: Wed, 25 Jul 2001 11:02:21 -0400
Subject: [Python-Dev] A future division proposal
Message-ID: <JFEGLNDJEDNOMPPHDEJFGELADJAA.perry@stsci.edu>

Clearly the issue of changing the semantics of division is a very
[ahem] divisive one in the Python community. It seems to me that
providing a foolproof way of providing backwards compatibility 
would go a long way to reducing the ire of those with a lot of
code to inspect and change.

I'm not particularly thrilled with the suggestions made so far.
Suggesting that people continue to use an older version of Python
if they have a problem with it is especially unsatisfying.
Eventually that older version will have to be updated in some
manner (resulting in a fork of Python) or they will have to
make the necessary changes to their code (albeit over a longer
time).

Providing a new division operator (//) that handles integer division
doesn't really solve the issue of inspecting the code either.
There isn't any automatic way of telling when / or // should be used
in old code.

Command line switches or other mechanisms to indicate that 
division should have different behavior will confuse those
trying to understand source code ("is this '/' a new or
old division?").

Why not provide yet another division operator for backwards
compatibility purpose? This operator would have exactly the 
same semantics as the current division operator. If this were
available, it should be a relatively simple matter to provide
a tool to convert all uses of the / operator to the new form 
for old code. With this solution, the code never has to be
manually inspected to work with the new version, instead, it
just has to be mechanically translated. The fact that the
operator has different semantics will be evident in the 
translated code.

I don't know what the best name or symbol would be (olddiv, ///?)
and admittedly it is ugly to have 3 division operators. But
the alternatives seem far, far uglier. Isn't this a case
where practicality beats purity (for keywords or operators)?

Perry Greenfield


From barry@zope.com  Wed Jul 25 16:21:56 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 25 Jul 2001 11:21:56 -0400
Subject: [Python-Dev] Daily CVS snapshots
References: <3B5EA327.F63D37EF@lemburg.com>
Message-ID: <15198.58388.537611.447486@anthem.wooz.org>

Looks okay to me:

% tar ztvf python-20010501.tar.gz | head
drwxr-sr-x jhylton/python    0 2001-05-01 08:08:51 Python-20010501/
drwxr-sr-x jhylton/python    0 2001-05-01 08:08:50 Python-20010501/BeOS/
drwxr-sr-x jhylton/python    0 2001-05-01 08:08:50 Python-20010501/BeOS/ar-1.1/
drwxr-sr-x jhylton/python    0 2001-05-01 08:08:50 Python-20010501/BeOS/ar-1.1/docs/
drwxr-sr-x jhylton/python    0 2001-05-01 08:08:50 Python-20010501/Demo/
drwxr-sr-x jhylton/python    0 2001-05-01 08:08:50 Python-20010501/Demo/classes/
-rwxr-xr-x jhylton/python 7816 1997-12-09 14:38:39 Python-20010501/Demo/classes/Complex.py
-rwxr-xr-x jhylton/python 7728 1998-09-14 11:34:45 Python-20010501/Demo/classes/Dates.py
-rwxr-xr-x jhylton/python 1249 1993-12-17 09:23:52 Python-20010501/Demo/classes/Dbm.py
-rw-r--r-- jhylton/python  597 1993-12-17 09:23:52 Python-20010501/Demo/classes/README

Also, I run a nightly script to grab the CVS repository, and that
looks fine to me too.

-Barry


From barry@zope.com  Wed Jul 25 16:30:18 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 25 Jul 2001 11:30:18 -0400
Subject: [Python-Dev] Daily CVS snapshots
References: <3B5EA327.F63D37EF@lemburg.com>
 <15198.45557.572298.868990@cj42289-a.reston1.va.home.com>
 <3B5EB768.9EF49C94@lemburg.com>
 <200107251432.KAA02123@cj20424-a.reston1.va.home.com>
Message-ID: <15198.58890.633459.435982@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@digicool.com> writes:

    GvR> BTW, I don't think the date (May 1st) correlated to our
    GvR> furniture move.  I dunno *what* happened on May 2nd, nor who
    GvR> makes the tar copies (I thought it was Barry?).

Nope, I pull down the CVS repository snapshots from

    http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz

(I grab a bunch of tarballs, including for Jython, Mailman, and
mimelib).  These are different than what's on the other page
mentioned; those are just CVS working directory snapshots.

-Barry


From barry@zope.com  Wed Jul 25 16:32:07 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 25 Jul 2001 11:32:07 -0400
Subject: [Python-Dev] Daily CVS snapshots
References: <3B5EA327.F63D37EF@lemburg.com>
 <15198.45557.572298.868990@cj42289-a.reston1.va.home.com>
 <3B5EB768.9EF49C94@lemburg.com>
 <200107251432.KAA02123@cj20424-a.reston1.va.home.com>
 <15198.55553.597831.427828@cj42289-a.reston1.va.home.com>
Message-ID: <15198.58999.861192.236578@anthem.wooz.org>

>>>>> "Fred" == Fred L Drake, Jr <fdrake@acm.org> writes:

    Fred>   I'm pretty sure Barry just pushes the repository backups
    Fred> to tape, but that Jeremy handles the snapshots.

Yup.


From Paul.Moore@atosorigin.com  Wed Jul 25 16:53:45 2001
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 25 Jul 2001 16:53:45 +0100
Subject: [Python-Dev] A future division proposal
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5AF31@UKRUX002.rundc.uk.origin-it.com>

> I don't know what the best name or symbol would be
> (olddiv, ///?) and admittedly it is ugly to have 3
> division operators. But the alternatives seem far,
> far uglier. Isn't this a case where practicality
> beats purity (for keywords or operators)?

You could probably write a function to do this. There's no need for anything
built into Python.

Actually, when I tried, I got into a bit of a mess getting the type checks
(which you need) right -

    def olddiv(n,m):
        if type(n) == type(m) == type(0):
            return n//m
        else:
            return n/m

But this needs the checks expanded to take longs into account. Which is
where it gets messy.

But:
a) It can be done, and
b) The fact that it's messy probably exposes what's wrong with the old
semantics quite well :-)

Paul.


From skip@pobox.com (Skip Montanaro)  Wed Jul 25 17:30:19 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Wed, 25 Jul 2001 11:30:19 -0500
Subject: [Python-Dev] find method for lists
Message-ID: <15198.62491.532831.221942@beluga.mojam.com>

This has probably been discussed before, but why doesn't the list object
support a find method?  Seems like if a non-exception-raising index method
is good enough for strings, it should be good enough for lists as well.  I
realize I can use "l.count(x) and l.index(x)" to avoid the possible
ValueError.  (Or maybe it's strings that shouldn't have find, but can't be
deleted now for code breakage reasons?)
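
A small sketch of a non-raising lookup, assuming present-day Python 3. The
helper name list_find and its default of -1 (chosen to mirror str.find) are
illustrative only. It also shows that the "count and index" idiom cannot
tell a missing item from one sitting at index 0, since both give 0:

```python
def list_find(items, x, default=-1):
    """Return the index of x in items, or default if x is absent."""
    try:
        return items.index(x)
    except ValueError:
        return default


letters = ["a", "b", "c"]
found = list_find(letters, "b")
missing = list_find(letters, "z")

# The idiom from the message above: ambiguous when the item is first.
idiom_first = letters.count("a") and letters.index("a")
idiom_missing = letters.count("z") and letters.index("z")
```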

I'm mostly just curious.  Am I missing something?

Skip


From fdrake@acm.org  Wed Jul 25 17:29:41 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Jul 2001 12:29:41 -0400 (EDT)
Subject: [Python-Dev] find method for lists
In-Reply-To: <15198.62491.532831.221942@beluga.mojam.com>
References: <15198.62491.532831.221942@beluga.mojam.com>
Message-ID: <15198.62453.125734.953636@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > This has probably been discussed before, but why doesn't the list object
 > support a find method?  Seems like if a non-exception-raising index method
 > is good enough for strings, it should be good enough for lists as well.  I

  I've seen this brought up, but I'm not sure how important it is.  It
certainly seems like this would be handy.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From guido@digicool.com  Wed Jul 25 17:37:09 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 12:37:09 -0400
Subject: [Python-Dev] find method for lists
In-Reply-To: Your message of "Wed, 25 Jul 2001 11:30:19 CDT."
 <15198.62491.532831.221942@beluga.mojam.com>
References: <15198.62491.532831.221942@beluga.mojam.com>
Message-ID: <200107251637.MAA07861@cj20424-a.reston1.va.home.com>

> This has probably been discussed before, but why doesn't the list object
> support a find method?  Seems like if a non-exception-raising index method
> is good enough for strings, it should be good enough for lists as well.  I
> realize I can use "l.count(x) and l.index(x)" to avoid the possible
> ValueError.  (Or maybe it's strings that shouldn't have find, but can't be
> deleted not for code breakage reasons?)
> 
> I'm mostly just curious.  Am I missing something?

List searching is much less common, and the string functions (both
index() and find()) have different semantics: they look for
substrings, while list.index() only searches for a particular item.

With lists, if you need this, you're probably using the wrong
datastructure.  With strings, substring matching is a standard
pattern.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From SBrunning@trisystems.co.uk  Wed Jul 25 18:00:55 2001
From: SBrunning@trisystems.co.uk (Simon Brunning)
Date: Wed, 25 Jul 2001 18:00:55 +0100
Subject: [Python-Dev] Small feature request - optional argument for string.strip()
Message-ID: <31575A892FF6D1118F5800600846864D78BEFD@intrepid>

Is it OK to post small feature requests directly to this list, or is there
some other mechanism for them? What I have in mind certainly isn't worth a
PEP.

The .split method on strings splits at whitespace by default, but takes an
optional argument allowing splitting by other strings. The .strip method
(and its siblings) always strip whitespace - on more than one occasion I
would have found it useful if these methods also took an optional argument
allowing other strings to be stripped. For example, to strip, say, asterisks
from a file you could do:

>>> fred = '**word**word**'
>>> fred.strip('*')
'word**word'
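
For reference, in present-day Python 3 the strip(), lstrip() and rstrip()
string methods do accept an optional argument, though it is read as a set of
characters to remove rather than as a substring. A short sketch (the variable
names are illustrative):

```python
fred = "**word**word**"

both_ends = fred.strip("*")    # asterisks removed from both ends only
left_only = fred.lstrip("*")   # leading asterisks removed
right_only = fred.rstrip("*")  # trailing asterisks removed

# The argument is a set of characters, not a substring to match.
as_char_set = "xyhelloyx".strip("xy")
```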

Does this sound sensible/useful?

Cheers,
Simon Brunning.






From mal@lemburg.com  Wed Jul 25 18:04:41 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 25 Jul 2001 19:04:41 +0200
Subject: [Python-Dev] shouldn't we be considering all pending numeric
 proposals together?
References: <LNBBLJKPBEHFEDALKOLCOEPDLAAA.tim.one@home.com> <3B5E946E.1105C79B@lemburg.com>
 <3B5ED39B.7A48C0D@lemburg.com> <200107251438.KAA02162@cj20424-a.reston1.va.home.com>
Message-ID: <3B5EFC29.B4B30CC2@lemburg.com>

Guido van Rossum wrote:
> ...
> I actually expect that most conversion jobs will be easy -- all those
> folks who suffer from "Extreme Fear of Floating Point" (as Tim calls
> it) can simply change every / into a // in their program (using a tool
> that properly tokenizes) and they should be done, since most likely
> their code never uses floating point. :-)

Well, that would break floating points then... unless float // float
works like float / float does now. Perhaps you should simply 
add a nb_altdivide slot to the numeric set of slots which is then
called for a // b. Floats would then reuse their nb_divide 
for //.

BTW, my idea about rationals turns out not to work too well:
1/6 + 5/6 would give 6/6 == 1 while the current semantics 
return 0 in this case.
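
The mismatch can be reproduced in present-day Python 3, assuming the standard
fractions module stands in for the rationals discussed here and // stands in
for the old integer division:

```python
from fractions import Fraction

# Old integer semantics: each quotient is truncated before the sum.
old_style = 1 // 6 + 5 // 6

# Rational semantics: nothing is lost, so the sum is exactly one.
rational = Fraction(1, 6) + Fraction(5, 6)
```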

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From guido@digicool.com  Wed Jul 25 18:22:53 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 13:22:53 -0400
Subject: [Python-Dev] Small feature request - optional argument for string.strip()
In-Reply-To: Your message of "Wed, 25 Jul 2001 18:00:55 BST."
 <31575A892FF6D1118F5800600846864D78BEFD@intrepid>
References: <31575A892FF6D1118F5800600846864D78BEFD@intrepid>
Message-ID: <200107251722.NAA08073@cj20424-a.reston1.va.home.com>

> Is it OK to post small feature requests directly to this list, or is
> there some other mechanism for them? What I have in mind certainly
> isn't worth a PEP.

It's better to use the SF feature request tracker:
http://sourceforge.net/tracker/?atid=355470&group_id=5470&func=browse

> The .split method on strings splits at whitespace by default, but
> takes an optional argument allowing splitting by other strings. The
> .strip method (and its siblings) always strip whitespace - on more
> than one occasion I would have found it useful if these methods also
> took an optional argument allowing other strings to be stripped. For
> example, to strip, say, asterisks from a file you could do:
> 
> >>>fred = '**word**word**'
> >>>fred.strip('*')
> word**word
> 
> Does this sound sensible/useful?

Marginally.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jul 25 18:26:28 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 13:26:28 -0400
Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together?
In-Reply-To: Your message of "Wed, 25 Jul 2001 19:04:41 +0200."
 <3B5EFC29.B4B30CC2@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCOEPDLAAA.tim.one@home.com> <3B5E946E.1105C79B@lemburg.com> <3B5ED39B.7A48C0D@lemburg.com> <200107251438.KAA02162@cj20424-a.reston1.va.home.com>
 <3B5EFC29.B4B30CC2@lemburg.com>
Message-ID: <200107251726.NAA08088@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> > ...
> > I actually expect that most conversion jobs will be easy -- all those
> > folks who suffer from "Extreme Fear of Floating Point" (as Tim calls
> > it) can simply change every / into a // in their program (using a tool
> > that properly tokenizes) and they should be done, since most likely
> > their code never uses floating point. :-)
> 
> Well, that would break floating points then...

Not under the assumption that they will never use floating point.

> unless float // float works like float / float does now.

No, that would be a bad idea.  float//float should either raise an
exception or return a rounded-towards-minus-infinity result.
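
For reference, present-day Python 3 behaves like the second option: // on
floats returns the quotient rounded towards minus infinity, still as a float.
A short sketch (variable names are illustrative):

```python
positive = 7.0 // 2.0    # floor of 3.5
negative = -7.0 // 2.0   # floor of -3.5, i.e. towards minus infinity
true_div = 7.0 / 2.0     # plain division keeps the fraction
```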

> Perhaps you should simply 
> add a nb_altdivide slot to the numeric set of slots which is then
> called for a // b. Floats would then reuse their nb_divide 
> for //.

Something like this is part of the implementation plan (not yet part of
the patch).

> BTW, my idea about rationals turns out not to work too well:
> 1/6 + 5/6 would give 6/6 == 1 while the current semantics 
> return 0 in this case.

Indeed, rationals can't ease the pain of PEP 238 -- but PEP 238 is
required before rationals can make sense.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com (Skip Montanaro)  Wed Jul 25 18:48:51 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Wed, 25 Jul 2001 12:48:51 -0500
Subject: [Python-Dev] A future division proposal
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AF31@UKRUX002.rundc.uk.origin-it.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5AF31@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <15199.1667.970769.894705@beluga.mojam.com>

    Paul> Actually, when I tried, I got into a bit of a mess getting the
    Paul> type checks (which you need) right -

    Paul>     def olddiv(n,m):
    Paul>         if type(n) == type(m) == type(0):
    Paul>             return n//m
    Paul>         else:
    Paul>             return n/m

    Paul> But this needs the checks expanded to take longs into
    Paul> account. Which is where it gets messy.

Wouldn't this work for ints and longs?

    def olddiv(n,m):
        ints = [type(0), type(0L)]
        if type(n) in ints and type(m) in ints:
            return n//m
        else:
            return n/m
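
A sketch of the same idea for present-day Python 3, where int and long are a
single type and the 0L literal no longer exists. This is an assumption about
how one might port the helper, not code from the thread:

```python
def olddiv(n, m):
    """Emulate classic division: floor for two ints, true division otherwise."""
    if isinstance(n, int) and isinstance(m, int):
        return n // m
    return n / m


int_case = olddiv(7, 2)
negative_case = olddiv(-7, 2)
float_case = olddiv(7.0, 2)
big_case = olddiv(10**30, 10**10)  # arbitrary-size ints need no special case
```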

-- 
Skip Montanaro (skip@pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/


From perry@stsci.edu  Wed Jul 25 18:48:10 2001
From: perry@stsci.edu (Perry Greenfield)
Date: Wed, 25 Jul 2001 13:48:10 -0400
Subject: [Python-Dev] A future division proposal
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5AF31@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <JFEGLNDJEDNOMPPHDEJFCELBDJAA.perry@stsci.edu>

[Paul Moore]
> You could probably write a function to do this. There's no need 
> for anything
> built into Python.
>
Sure, a functional form would be just as feasible and not require
another operator. On the other hand there are perhaps a couple 
reasons not to do it this way:

1) It can make a mess of the expressions (if automatically translated)
   and make the code far less readable. Some may object to this.
2) If I recall some objected to a functional version on the basis
   of speed, but I'm not sure about that.

Perry


From mal@lemburg.com  Wed Jul 25 19:14:15 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 25 Jul 2001 20:14:15 +0200
Subject: [Python-Dev] shouldn't we be considering all pending numeric
 proposals together?
References: <LNBBLJKPBEHFEDALKOLCOEPDLAAA.tim.one@home.com> <3B5E946E.1105C79B@lemburg.com> <3B5ED39B.7A48C0D@lemburg.com> <200107251438.KAA02162@cj20424-a.reston1.va.home.com>
 <3B5EFC29.B4B30CC2@lemburg.com> <200107251726.NAA08088@cj20424-a.reston1.va.home.com>
Message-ID: <3B5F0C77.DEE608F8@lemburg.com>

Guido van Rossum wrote:
> 
> > Guido van Rossum wrote:
> > > ...
> > > I actually expect that most conversion jobs will be easy -- all those
> > > folks who suffer from "Extreme Fear of Floating Point" (as Tim calls
> > > it) can simply change every / into a // in their program (using a tool
> > > that properly tokenizes) and they should be done, since most likely
> > > their code never uses floating point. :-)
> >
> > Well, that would break floating points then...
> 
> Not under the assumption that they will never use floating point.

Verifying such an assumption will be just as hard as auditing the
code itself, I'm afraid.

> > unless float // float works like float / float does now.
> 
> No, that would be a bad idea.  float//float should either raise an
> exception or return a rounded-towards-minus-infinity result.

Hmm, it would assure that your tool doesn't accidentally
break floating point code.
 
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@digicool.com  Wed Jul 25 20:05:47 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 15:05:47 -0400
Subject: [Python-Dev] A future division proposal
In-Reply-To: Your message of "Wed, 25 Jul 2001 13:48:10 EDT."
 <JFEGLNDJEDNOMPPHDEJFCELBDJAA.perry@stsci.edu>
References: <JFEGLNDJEDNOMPPHDEJFCELBDJAA.perry@stsci.edu>
Message-ID: <200107251905.PAA08305@cj20424-a.reston1.va.home.com>

Can we please keep this discussion out of python-dev?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jul 25 20:11:06 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 15:11:06 -0400
Subject: [Python-Dev] shouldn't we be considering all pending numeric proposals together?
In-Reply-To: Your message of "Wed, 25 Jul 2001 20:14:15 +0200."
 <3B5F0C77.DEE608F8@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCOEPDLAAA.tim.one@home.com> <3B5E946E.1105C79B@lemburg.com> <3B5ED39B.7A48C0D@lemburg.com> <200107251438.KAA02162@cj20424-a.reston1.va.home.com> <3B5EFC29.B4B30CC2@lemburg.com> <200107251726.NAA08088@cj20424-a.reston1.va.home.com>
 <3B5F0C77.DEE608F8@lemburg.com>
Message-ID: <200107251911.PAA08352@cj20424-a.reston1.va.home.com>

> > Not under the assumption that they will never use floating point.
> 
> Verifying such an assumption will be just as hard as auditing the
> code itself, I'm afraid.

Not for the biggest cry-babies -- I've seen several claims from folks
who say that they never use floating point, and I believe them.

> > > unless float // float works like float / float does now.
> > 
> > No, that would be a bad idea.  float//float should either raise an
> > exception or return a rounded-towards-minus-infinity result.
> 
> Hmm, it would assure that your tool doesn't accidentally
> break floating point code.

A better idea then would be to make float//float raise an exception.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward@python.net  Wed Jul 25 20:15:57 2001
From: gward@python.net (Greg Ward)
Date: Wed, 25 Jul 2001 15:15:57 -0400
Subject: [Python-Dev] Small feature request - optional argument for string.strip()
In-Reply-To: <31575A892FF6D1118F5800600846864D78BEFD@intrepid>; from SBrunning@trisystems.co.uk on Wed, Jul 25, 2001 at 06:00:55PM +0100
References: <31575A892FF6D1118F5800600846864D78BEFD@intrepid>
Message-ID: <20010725151557.A2013@gerg.ca>

On 25 July 2001, Simon Brunning said:
> >>>fred = '**word**word**'
> >>>fred.strip('*')
> word**word
> 
> Does this sound sensible/useful?

Not really.  I can't recall ever having a need for such a feature in any
programming language I've ever used.

        Greg
-- 
Greg Ward - programmer-at-big                           gward@python.net
http://starship.python.net/~gward/
I haven't lost my mind; I know exactly where I left it.


From gward@python.net  Wed Jul 25 21:23:05 2001
From: gward@python.net (Greg Ward)
Date: Wed, 25 Jul 2001 16:23:05 -0400
Subject: [Python-Dev] Branches here, branches there, branches everywherte
Message-ID: <20010725162305.A2390@gerg.ca>

I've finally started reviewing the changes made to the Distutils during
my extended leave-of-absence, and making a few minor commits.  So far
I've just made these commits on the trunk, because I don't really
understand anything else.  (Yes, yes, I've read the CVS docs many many
times.  It just takes a while to sink in.)  Am I right in doing this?
Ie. will 2.2a2 be released from the trunk?  Or should I be doing commits
that I want in 2.2a2 on the 22a1 branch?

        Greg
-- 
Greg Ward - nerd                                        gward@python.net
http://starship.python.net/~gward/
All of science is either physics or stamp collecting.


From guido@digicool.com  Wed Jul 25 21:44:39 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 16:44:39 -0400
Subject: [Python-Dev] Branches here, branches there, branches everywherte
In-Reply-To: Your message of "Wed, 25 Jul 2001 16:23:05 EDT."
 <20010725162305.A2390@gerg.ca>
References: <20010725162305.A2390@gerg.ca>
Message-ID: <200107252044.QAA08895@cj20424-a.reston1.va.home.com>

> I've finally started reviewing the changes made to the Distutils during
> my extended leave-of-absence, and making a few minor commits.

Great!

> So far I've just made these commits on the trunk, because I don't
> really understand anything else.  (Yes, yes, I've read the CVS docs
> many many times.  It just takes a while to sink in.)  Am I right in
> doing this?  Ie. will 2.2a2 be released from the trunk?  Or should I
> be doing commits that I want in 2.2a2 on the 22a1 branch?

You needn't worry about the branches at all.  Everything checked in on
the trunk will be merged into the branch.  In fact, if you check it in
on the trunk *and* on the branch, you'd end up creating more pain for
the bot doing the merges.

Checking in on the branch should only be done if the change
specifically applies to the branch only.  For example, only type/class
unification changes should be checked in on the descr-branch.

And yes, I plan to merge the descr-branch back into the trunk,
hopefully (but not yet certainly) before 2.2a2 is due.

Branches are a necessary evil.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Wed Jul 25 21:52:04 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 25 Jul 2001 16:52:04 -0400
Subject: [Python-Dev] Branches here, branches there, branches everywherte
In-Reply-To: <20010725162305.A2390@gerg.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEDGLBAA.tim.one@home.com>

[Greg Ward]
> I've finally started reviewing the changes made to the Distutils during
> my extended leave-of-absence, and making a few minor commits.

Welcome back!  I hope you've recovered from Canadianness <wink>.

> So far I've just made these commits on the trunk,

Good!

> because I don't really understand anything else.

That's the way Guido likes it <wink>.

> (Yes, yes, I've read the CVS docs many many times.  It just takes a
> while to sink in.)  Am I right in doing this?

Yes.

> Ie. will 2.2a2 be released from the trunk?

Unknown at this time.

> Or should I be doing commits that I want in 2.2a2 on the 22a1 branch?

Definitely not.  Trunk.  There is no 22a1 branch, BTW, 22a1 is just a tag
applied at the time of the 2.2a1 release.  2.2a1 was released from the
descr-branch.  It's my job to magically merge trunk checkins into
descr-branch while you sleep.  This approach will become a nightmare if
people check stuff into descr-branch themselves (except for Guido and Fred
and me, who are doing some work *specific* to descr-branch).

The best thing you can do to help is look for massive clumps of merge
checkins (usually early AM EDT on Saturdays), then run whatever distutils
tests you have *from* a descr-branch checkout.  I don't believe the ongoing
distutils merges on descr-branch get tested at all now, and that's not good.



From martin@loewis.home.cs.tu-berlin.de  Wed Jul 25 22:06:03 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 25 Jul 2001 23:06:03 +0200
Subject: [Python-Dev] post mortem after threading deadlock?
Message-ID: <200107252106.f6PL63301643@mira.informatik.hu-berlin.de>

> Is there any possibility of getting some post-mortem info out of a
> multi-threaded system whose threads are deadlocked?

You could attach to the process using a C debugger (e.g. gdb), and
have a look at the C stacks of each thread. Then, you can look into
the variables of the eval_code invocations to get a clue of what the
Python stack is.

Regards,
Martin

P.S. Isn't this off-topic for python-dev, and rather a question to
python-list or python-tutor?


From bckfnn@worldonline.dk  Wed Jul 25 22:27:37 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Wed, 25 Jul 2001 21:27:37 GMT
Subject: [Python-Dev] zipfiles on sys.path
Message-ID: <3b5f2b11.50733180@mail.wanadoo.dk>

Hi,

We have recently added support for .zip files on sys.path to Jython.
Now, after the fact, I wondered what prior art exists for such a feature
and the semantic that is used. We came up with a solution where:

- It is the name (as a string) of the zipfile that can be added to
  sys.path.

- The zipfile is opened on the next import that checks this sys.path
  entry and kept open until all references to the zipfile are gone
  (including references from packages).

- A side effect of the implementation is that the identity of a string
  on sys.path or __path__ might change during import. The value of the
  string stays the same.

- The __path__ variable in a package 'foo.bar' loaded from zipfile.zip
  will have the value ['zipfile.zip!foo/bar'] and this same syntax can
  also be used when adding entries to sys.path and __path__.

I hope it doesn't conflict too much with the solutions that already
exist or the solution (if any) that CPython might choose to adopt.
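
For comparison, here is a minimal sketch of the first point as it works in
later CPython releases, which gained their own zip import support (the
zipimport module). The archive and module names are made up for the
illustration, and the Jython-specific '!' syntax for __path__ is not shown:

```python
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive containing one module (hypothetical names).
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "demo_archive.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipdemo_mod.py", "VALUE = 42\n")

# The sys.path entry is simply the archive's file name, as a string.
sys.path.insert(0, archive)

import zipdemo_mod

print(zipdemo_mod.VALUE)
print(zipdemo_mod.__file__)   # path that points inside the archive
```

The archive is only opened when an import actually consults that sys.path
entry, which matches the lazy opening described in the second point.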

regards,
finn


From skip@pobox.com (Skip Montanaro)  Wed Jul 25 22:57:03 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Wed, 25 Jul 2001 16:57:03 -0500
Subject: [Python-Dev] Re: post mortem after threading deadlock?
In-Reply-To: <200107252106.f6PL63301643@mira.informatik.hu-berlin.de>
References: <200107252106.f6PL63301643@mira.informatik.hu-berlin.de>
Message-ID: <15199.16559.543625.870439@beluga.mojam.com>

    [suggestions elided - thanks, I will look into gdb's thread debugging
    capabilities]

    Martin> P.S. Isn't this off-topic for python-dev, and rather a question
    Martin> to python-list or python-tutor?

Well sort of.  However, if you read my problem as a thinly veiled
enhancement request, the people most likely to be able to implement such a
thing are on this list.  I sort of suspect that from the Python level about
all I can do today is what I'm already doing - poking around the various
locks and semaphores that the threads all share.
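
As a rough illustration of what such an enhancement can look like, later
CPython releases provide sys._current_frames(), which did not exist when
this was written. The sketch below assumes that function plus the standard
traceback module; dump_thread_stacks and stuck_worker are hypothetical
names, and the "deadlock" is simulated with an Event that is never set
until the report has been taken:

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return {thread name: formatted Python stack} for every thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    report = {}
    for ident, frame in sys._current_frames().items():
        name = names.get(ident, "unknown-%s" % ident)
        report[name] = "".join(traceback.format_stack(frame))
    return report


def stuck_worker(started, release):
    # Stand-in for a thread blocked on something that never arrives.
    started.set()
    release.wait()


started = threading.Event()
release = threading.Event()
worker = threading.Thread(target=stuck_worker, args=(started, release),
                          name="stuck")
worker.start()
started.wait()            # the worker is now inside stuck_worker

stacks = dump_thread_stacks()
print(stacks["stuck"])

release.set()             # let the demo thread exit
worker.join()
```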

Skip


From jack@oratrix.nl  Wed Jul 25 22:58:25 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Wed, 25 Jul 2001 23:58:25 +0200
Subject: [Python-Dev] zipfiles on sys.path
In-Reply-To: Message by bckfnn@worldonline.dk (Finn Bock) ,
 Wed, 25 Jul 2001 21:27:37 GMT , <3b5f2b11.50733180@mail.wanadoo.dk>
Message-ID: <20010725215830.2F49D14A25D@oratrix.oratrix.nl>

Recently, bckfnn@worldonline.dk (Finn Bock) said:
> Hi,
> 
> We have recently added support for .zip files on sys.path to Jython.
> Now, after the fact, I wondered what prior art exists for such a feature
> and the semantic that is used.

MacPython uses a similar scheme, but slightly different. If there is a
file on sys.path it will be inspected for "PYC " resources with the
module name. (The main use for this feature is that you can put the
application itself in sys.path, compile all your modules into PYC
resources and you have a frozen Python program without having used a C
compiler. A boon on a platform where all C compilers cost money or are
arcane).

I'll go thru the issues one by one:

> We came up with a solution where:
> 
> - It is the name (as a string) of the zipfile that can be added to
>   sys.path.

Same.

> - The zipfile is opened on the next import that checks this sys.path
>   entry and kept open until all references to the zipfile are gone
>   (including references from packages).

Different: MacPython opens it every time, with the exception of the
application itself, which is already open (and this is checked).

What MacPython does do, and what speeds up imports immensely, is that
it interns all sys.path strings, and keeps a cache of the sys.path
entries that are known to be files, not directories. This forestalls
the import code testing many non-existing paths for existence
(/path/to/myfile.zip/mod.py, /path/to/myfile.zip/mod.pyc, etc).

> - A side effect of the implementation is that the identity of a string
>   on sys.path or __path__ might change during import. The value of the
>   string stays the same.
> 
> - The __path__ variable in a package 'foo.bar' loaded from zipfile.zip
>   will have the value ['zipfile.zip!foo/bar'] and this same syntax can
>   also be used when adding entries to sys.path and __path__.

__path__ is set to the package name. I'm not sure of the exact
rationale for this (Just did the package support) but it seems to work
fine. 
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From tim.one@home.com  Wed Jul 25 23:11:16 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 25 Jul 2001 18:11:16 -0400
Subject: [Python-Dev] Re: post mortem after threading deadlock?
In-Reply-To: <15199.16559.543625.870439@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEBLBAA.tim.one@home.com>

[Skip Montanaro]
> ...
> However, if you read my problem as a thinly veiled enhancement request,
> the people most likely to be able to implement such a thing are on this
> list.  I sort of suspect that from the Python level about all I can do
> today is what I'm already doing - poking around the various locks
> and semaphores that the threads all share.

I've got better advice <wink>:  Never use semaphores for anything.  Never
use locks except for dirt-simple one- or two-line critical sections.  For
everything but the latter, always use condition variables.  They're the only
synch protocol I've seen that non-specialist thread programmers can use
without routinely screwing themselves.  The genius of the condvar protocol
is that, used correctly, you *always* run-time test your crucial assumptions
about non-local state (and automatically do so under the protection of a
critical section), and *always* loop back to try again if your hopes or
assumptions turn out not to be true.  This saves you from a universe of
possible problems with non-local state changing in unanticipated ways.
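
A minimal sketch of that protocol, written in present-day Python syntax and
assuming threading.Condition; the names items, consumer and producer are
invented for the example:

```python
import threading

items = []                      # shared, non-local state
cond = threading.Condition()    # a lock plus wait/notify in one object


def consumer(results):
    with cond:                  # critical section
        while not items:        # always re-test the assumption...
            cond.wait()         # ...and loop back if it does not hold yet
        results.append(items.pop(0))


def producer(value):
    with cond:
        items.append(value)
        cond.notify()           # wake one waiter so it can re-test


results = []
t = threading.Thread(target=consumer, args=(results,))
t.start()
producer("hello")
t.join()
print(results)
```

The while loop is the point: the consumer never trusts a wakeup, it checks
the shared state again while holding the lock.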

if-you-had-used-condvars-you-wouldn't-be-debugging-now-ly y'rs  - tim



From just@letterror.com  Wed Jul 25 23:11:06 2001
From: just@letterror.com (Just van Rossum)
Date: Thu, 26 Jul 2001 00:11:06 +0200
Subject: [Python-Dev] zipfiles on sys.path
In-Reply-To: <20010725215830.2F49D14A25D@oratrix.oratrix.nl>
Message-ID: <20010726001112-r01010700-776380a4-0910-010c@213.84.27.177>

Jack Jansen wrote:

> > - The __path__ variable in a package 'foo.bar' loaded from zipfile.zip
> >   will have the value ['zipfile.zip!foo/bar'] and this same syntax can
> >   also be used when adding entries to sys.path and __path__.
> 
> __path__ is set to the package name. I'm not sure of the exact
> rationale for this (Just did the package support) but it seems to work
> fine. 

I don't know the rationale either (or at least: not anymore ;-), I just copied
the behavior of frozen packages (as in freeze.py) from import.c.
PyImport_ImportFrozenModule() contains this snippet:

    if (ispackage) {
        /* Set __path__ to the package name */
        ...


Just


From guido@zope.com  Wed Jul 25 23:16:45 2001
From: guido@zope.com (Guido van Rossum)
Date: Wed, 25 Jul 2001 18:16:45 -0400
Subject: [Python-Dev] Re: post mortem after threading deadlock?
In-Reply-To: Your message of "Wed, 25 Jul 2001 18:11:16 EDT."
 <LNBBLJKPBEHFEDALKOLCIEEBLBAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCIEEBLBAA.tim.one@home.com>
Message-ID: <200107252216.SAA09493@cj20424-a.reston1.va.home.com>

> I've got better advice <wink>:  Never use semaphores for anything.  Never
> use locks except for dirt-simple one- or two-line critical sections.  For
> everything but the latter, always use condition variables.  They're the only
> synch protocol I've seen that non-specialist thread programmers can use
> without routinely screwing themselves.  The genius of the condvar protocol
> is that, used correctly, you *always* run-time test your crucial assumptions
> about non-local state (and automatically do so under the protection of a
> critical section), and *always* loop back to try again if your hopes or
> assumptions turn out not to be true.  This saves you from a universe of
> possible problems with non-local state changing in unanticipated ways.

I believe that Aahz, in his thread tutorial, has even more radical
advice: use the Queue module for all inter-thread communication.  It
is even higher level than semaphores, and has the same nice
properties.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@home.com  Wed Jul 25 23:31:01 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 25 Jul 2001 18:31:01 -0400
Subject: [Python-Dev] Re: post mortem after threading deadlock?
In-Reply-To: <200107252216.SAA09493@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEEDLBAA.tim.one@home.com>

[Guido van Rossum]
> I believe that Aahz, in his thread tutorial, has even more radical
> advice: use the Queue module for all inter-thread communication.  It
> is even higher level than semaphores, and has the same nice
> properties.

If they're *flexible* enough for Skip, I endorse Queues too.  Else condvars
are the bee's second-prettiest knees.



From barry@scottb.demon.co.uk  Wed Jul 25 23:59:32 2001
From: barry@scottb.demon.co.uk (Barry Scott)
Date: Wed, 25 Jul 2001 23:59:32 +0100
Subject: [Python-Dev] Please have a look at proposed doc changes for time epoch
In-Reply-To: <15189.61907.883300.127987@beluga.mojam.com>
Message-ID: <001c01c1155d$77c64970$060210ac@private>

If you use the POSIX.1 functions the base time is always 1 Jan 1970
even if the OS you are running on has a different epoch.

	Barry



From skip@pobox.com (Skip Montanaro)  Thu Jul 26 00:04:11 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Wed, 25 Jul 2001 18:04:11 -0500
Subject: [Python-Dev] Re: post mortem after threading deadlock?
In-Reply-To: <200107252216.SAA09493@cj20424-a.reston1.va.home.com>
References: <LNBBLJKPBEHFEDALKOLCIEEBLBAA.tim.one@home.com>
 <200107252216.SAA09493@cj20424-a.reston1.va.home.com>
Message-ID: <15199.20587.527593.965607@beluga.mojam.com>

    Tim> I've got better advice <wink>: Never use semaphores for anything.
    Tim> Never use locks except for dirt-simple one- or two-line critical
    Tim> sections.

I didn't find either particularly difficult to work with.  Guess I was
fooling myself. ;-)

    Guido> I believe that Aahz, in his thread tutorial, has even more
    Guido> radical advice: use the Queue module for all inter-thread
    Guido> communication.  It is even higher level than semaphores, and has
    Guido> the same nice properties.

Ah, thanks!  I saw the mention of Queues in his slides and thought he was
talking about a queue class that he wrote as an add-on.  It never occurred
to me that it would be a core library module.  doh!  A queue is really what
I want anyway.  I'm sharing a limited pool of database connections between a
(potentially large) set of threads.
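
Roughly what I have in mind, as a sketch only. It is written in present-day
Python, where the module is spelled queue rather than Queue, and
FakeConnection is a hypothetical stand-in for a real database connection:

```python
import queue
import threading


class FakeConnection:
    """Hypothetical stand-in for a real database connection."""
    def __init__(self, number):
        self.number = number


POOL_SIZE = 2
pool = queue.Queue()
for n in range(POOL_SIZE):
    pool.put(FakeConnection(n))

used = []
used_lock = threading.Lock()


def worker():
    conn = pool.get()           # blocks until a connection is free
    try:
        with used_lock:         # dirt-simple one-line critical section
            used.append(conn.number)
    finally:
        pool.put(conn)          # always hand the connection back


threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(used), pool.qsize())
```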

one-regulation-head-slap-has-been-administered-sir!-ly y'rs,

Skip


From paulp@ActiveState.com  Thu Jul 26 00:29:32 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Wed, 25 Jul 2001 16:29:32 -0700
Subject: [Python-Dev] Re: Static method and class method comments
References: <9jn382$lqf$1@license1.unx.sas.com>
Message-ID: <3B5F565C.5F452AC5@ActiveState.com>

>Kevin Smith wrote:
> 
> I am very glad to see the new features of Python 2.2, but I do have a minor
> gripe about the implementation of static and class methods.  My issue stems
> from the fact that when glancing over Python code that uses static or class
> methods, you cannot tell that a method is a static or class method by looking
> at the point where it is defined.  
> ...

Agree strongly. This will also be a problem for documentation generation
tools, type extraction tools and class browsers. I believe it would be
easy to add a contextual keyword 

> class C:
>         def static foo(x, y):
>             print "classmethod", x, y

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From greg@cosc.canterbury.ac.nz  Thu Jul 26 00:31:53 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 26 Jul 2001 11:31:53 +1200 (NZST)
Subject: Python 3 (Re: [Python-Dev] shouldn't we be considering all pending numeric proposals together?)
In-Reply-To: <3B5E946E.1105C79B@lemburg.com>
Message-ID: <200107252331.LAA04164@s454.cosc.canterbury.ac.nz>

"M.-A. Lemburg" <mal@lemburg.com>:

> how about a "from __semantics__ import 
> non_integer_division" which does not have a timeout attached 
> to it ?!

If we're to have some form of version declaration in
perpetuity, I hope we can find a MUCH nicer syntax for
it than that!

I suggest simply putting

  python 3

at the top of the module (and calling the first release which
supports it 3.0).

This would completely eliminate all backwards-compatibility
objections at a stroke; there wouldn't even be any need for
warnings. And we wouldn't necessarily be committing to
"dragging the past around forever", since there's always
the possibility of dropping support for older versions
in some release suitably far in the future.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From paulp@ActiveState.com  Thu Jul 26 00:38:01 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Wed, 25 Jul 2001 16:38:01 -0700
Subject: [Python-Dev] number-sig anyone?
References: <15197.44710.656892.910976@beluga.mojam.com>
Message-ID: <3B5F5859.EBDFDC84@ActiveState.com>

Skip Montanaro wrote:
> 
> Dev-ers,
> 
>...
> 
> Today I took a look at http://mail.python.org/mailman/listinfo and could
> find no math-sig or number-sig mailing list.  If Python's number system is
> going to change in one or more backwards-incompatible I think there may only
> be one chance to get it right.  

That implies there is a "right". There isn't. There are just a bunch of
opinions. And I can't imagine that a SIG would lead to a convergence of
opinions because people come from such radically different backgrounds.
I would rather see a rational-sig, float-division-sig, decimal-sig and
so forth. Each could come up with a "locally coherent" plan and Guido
could pick and choose. Otherwise it might as well be called
numeric-flame-flame-flame-sig.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From skip@pobox.com (Skip Montanaro)  Thu Jul 26 00:54:05 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Wed, 25 Jul 2001 18:54:05 -0500
Subject: [Python-Dev] Re: Static method and class method comments
In-Reply-To: <3B5F565C.5F452AC5@ActiveState.com>
References: <9jn382$lqf$1@license1.unx.sas.com>
 <3B5F565C.5F452AC5@ActiveState.com>
Message-ID: <15199.23581.653505.336350@beluga.mojam.com>

    Paul> Agree strongly. This will also be a problem for documentation
    Paul> generation tools, type extraction tools and class browsers. I
    Paul> believe it would be easy to add a contextual keyword

    >> class C:
    >>     def static foo(x, y):
    >>         print "classmethod", x, y

Even better yet, why not simply reuse the class keyword in this context:

    class C:
        def class foo(x, y):
            print "classmethod", x, y

-- 
Skip Montanaro (skip@pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/


From skip@pobox.com (Skip Montanaro)  Thu Jul 26 05:05:42 2001
From: skip@pobox.com (Skip Montanaro) (Skip Montanaro)
Date: Wed, 25 Jul 2001 23:05:42 -0500
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: <3B5F5859.EBDFDC84@ActiveState.com>
References: <15197.44710.656892.910976@beluga.mojam.com>
 <3B5F5859.EBDFDC84@ActiveState.com>
Message-ID: <15199.38678.456223.981364@beluga.mojam.com>

    Skip> Today I took a look at http://mail.python.org/mailman/listinfo and
    Skip> could find no math-sig or number-sig mailing list.  If Python's
    Skip> number system is going to change in one or more backwards-
    Skip> incompatible [ways] I think there may only be one chance to get it
    Skip> right.

    Paul> That implies there is a "right". There isn't. There are just a
    Paul> bunch of opinions. And I can't imagine that a SIG would lead to a
    Paul> convergence of opinions because people come from such radically
    Paul> different backgrounds.  I would rather see a rational-sig,
    Paul> float-division-sig, decimal-sig and so forth. Each could come up
    Paul> with a "locally coherent" plan and Guido could pick and choose.

Paul,

My operational definition of "right" in this context is perhaps different
than yours.  I realize there is no obviously right numeric model.  If there
was, most programming languages would use it and we wouldn't need bots like
Tim to help guide us through minefields like IEEE 754.

By "right" I mean that we can arrive at a long-term stable numeric model
that will be accepted by both the Python community as a whole *and* by the
decision makers who will vote thumbs up or down on adopting Python in their
organizations.  One of the most vocal opponents to PEP 238 (I won't mention
his name, but his initials are S.H. ;-) lamented loudly that he'd be a
laughing stock in his company because of that "division thing".  He
mentioned something about being a "right arse" I think.

By having a well-considered overall plan for Python's numeric behavior, if
you have to make an incompatible change today, another next year and a third
two years after that, you can point to the plan that shows people where
you're headed, how you plan to get there, and how they can write their
programs in the meantime so as to be as resilient as possible.  Without such
a plan -- or with several potentially competing plans as you proposed --
every change proposed or made will simply fuel the fires of those people who
dismiss Python because "it's unstable".  The funny thing is, Python's
semantics changed so little for so long that by comparison the rate of
change does seem pretty high, but it's still much better than many
applications or application libraries (such as the relatively recent glibc
upheaval or the API changes Gtk is undergoing now).  And let's not even
mention the folks in Redmond...

Skip





From tim.one@home.com  Thu Jul 26 05:41:02 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 26 Jul 2001 00:41:02 -0400
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: <15199.38678.456223.981364@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEFBLBAA.tim.one@home.com>

Briefly:

[Skip Montanaro]
> ...
> I realize there is no obviously right numeric model.

There are many that are reasonable, though -- and that's the other half of
the problem.

> If there was, most programming languages would use it and we wouldn't
> need bots like Tim to help guide us through minefields like IEEE 754.

You've never seen a language that supports 754 properly (== as the committee
intended).  Certainly not Python, C or Java.  It's far less a minefield when
properly supported, and was designed to be much saner than previous binary
f.p. systems.  One problem is that languages only support the *corner* of
754 that intersects with 1950's Fortran; the other is that very few chips
other than Pentium support the 754 80-bit extended format that's key to
making binary f.p. much safer for non-experts.  OTOH, the "proper 754
support" in the C99 Annex is a minefield of its own.

> By "right" I mean that we can arrive at a long-term stable numeric
> model that will be accepted by both the Python community as a whole
> *and* by the decision makers who will vote thumbs up or down on
> adopting Python in their organizations.

The danger I see here is that Scheme's "numeric tower" is almost obviously a
reasonable numeric model, but in practice is so vague that you can't really
count on anything beyond simple small-int arithmetic working the same way
across Scheme implementations.  Guido appears to have come to an
appreciation of that model in the abstract, but hoping that there's not much
difference between floats and rationals in practice "because they represent
the same mathematical values" just isn't going to pan out (IMO).  1/49*49
equals 1 or it doesn't; it doesn't using IEEE doubles, it does using
rationals, and the difference will be significant to programs.  Certainly
better to switch from floats to rationals someday than to move in the other
direction, though.
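
A small sketch of that difference, assuming a platform whose Python floats
are IEEE doubles and using the fractions module from later Python releases
as the stand-in for rationals:

```python
from fractions import Fraction

as_float = 1.0 / 49 * 49            # binary floating point
as_rational = Fraction(1, 49) * 49  # exact rational arithmetic

print(as_float == 1.0)
print(as_rational == 1)
```

The first comparison is false and the second is true, so any program that
tests such a result for equality behaves differently under the two models.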

I've come to suspect the issues *may* be complicated <wink>.



From mal@lemburg.com  Thu Jul 26 09:32:28 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 26 Jul 2001 10:32:28 +0200
Subject: [Python-Dev] Re: Static method and class method comments
References: <9jn382$lqf$1@license1.unx.sas.com>
 <3B5F565C.5F452AC5@ActiveState.com> <15199.23581.653505.336350@beluga.mojam.com>
Message-ID: <3B5FD59C.5FA52611@lemburg.com>

Skip Montanaro wrote:
> 
>     Paul> Agree strongly. This will also be a problem for documentation
>     Paul> generation tools, type extraction tools and class browsers. I
>     Paul> believe it would be easy to add a contextual keyword
> 
>     >> class C:
>     >>     def static foo(x, y):
>     >>         print "classmethod", x, y
> 
> Even better yet, why not simply reuse the class keyword in this context:
> 
>     class C:
>         def class foo(x, y):
>             print "classmethod", x, y

AFAIK, the only way to add classmethods to a class is by doing
so after creation of the class object. In that sense you don't have
a problem with parsing doc-extraction tools at all: they don't
have a chance of finding the class methods anyway ;-) 

Importing doc-extraction tools won't have a problem with these though 
and neither will human doc-extraction tools, since these will note
that the class methods are special :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From thomas.heller@ion-tof.com  Thu Jul 26 10:22:59 2001
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 26 Jul 2001 11:22:59 +0200
Subject: [Python-Dev] Re: Static method and class method comments
References: <9jn382$lqf$1@license1.unx.sas.com>        <3B5F565C.5F452AC5@ActiveState.com> <15199.23581.653505.336350@beluga.mojam.com> <3B5FD59C.5FA52611@lemburg.com>
Message-ID: <017d01c115b4$8ff5ed00$e000a8c0@thomasnotebook>

From: "M.-A. Lemburg" <mal@lemburg.com>
> AFAIK, the only way to add classmethods to a class is by doing
> so after creation of the class object.
Wrong IMO:

C:\>c:\python22\python.exe
Python 2.2a1 (#21, Jul 18 2001, 04:25:46) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> class X:
...   def foo(*args): return args
...   goo = classmethod(foo)
...   global x
...   x = (foo, goo)
...
>>> print x
(<function foo at 007B786C>, <classmethod object at 007C7190>)
>>> print X.foo, X.goo
<unbound method X.foo> <bound method class.foo of <class __main__.X at 007B6664>>

The classmethod is created before the class is done,
it is converted into a method bound to the class
when you access it.
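
The same point in a form that can be checked directly, as a sketch in
present-day Python syntax (the class and attribute names follow the session
above):

```python
class X:
    def foo(*args):
        return args
    goo = classmethod(foo)


# Inside the class namespace the wrapper object is stored as-is...
raw = X.__dict__["goo"]
print(type(raw).__name__)

# ...and only attribute access turns it into a method bound to the class.
print(X.goo() == (X,))
print(X.goo(1, 2) == (X, 1, 2))
```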

Thomas



From mal@lemburg.com  Thu Jul 26 11:10:36 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 26 Jul 2001 12:10:36 +0200
Subject: [Python-Dev] Re: Static method and class method comments
References: <9jn382$lqf$1@license1.unx.sas.com>        <3B5F565C.5F452AC5@ActiveState.com> <15199.23581.653505.336350@beluga.mojam.com> <3B5FD59C.5FA52611@lemburg.com> <017d01c115b4$8ff5ed00$e000a8c0@thomasnotebook>
Message-ID: <3B5FEC9C.4248B663@lemburg.com>

Thomas Heller wrote:
>
> From: "M.-A. Lemburg" <mal@lemburg.com>
> > AFAIK, the only way to add classmethods to a class is by doing
> > so after creation of the class object.
> Wrong IMO:
>
> C:\>c:\python22\python.exe
> Python 2.2a1 (#21, Jul 18 2001, 04:25:46) [MSC 32 bit (Intel)] on win32
> Type "copyright", "credits" or "license" for more information.
> >>> class X:
> ...   def foo(*args): return args
> ...   goo = classmethod(foo)
> ...   global x
> ...   x = (foo, goo)
> ...
> >>> print x
> (<function foo at 007B786C>, <classmethod object at 007C7190>)
> >>> print X.foo, X.goo
> <unbound method X.foo> <bound method class.foo of <class __main__.X at 007B6664>>
>
> The classmethod is created before the class is done,
> it is converted into a method bound to the class
> when you access it.

Touché :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From martin@strakt.com  Thu Jul 26 12:12:23 2001
From: martin@strakt.com (Martin Sjögren)
Date: Thu, 26 Jul 2001 13:12:23 +0200
Subject: [Python-Dev] Import hassle
Message-ID: <20010726131222.A30459@strakt.com>

Hello

I've been writing quite a few mails lately, all concerning import
problems. I thought I'd write a little longer mail to explain what I'm
doing and what I find strange here.

Basically all (at least the 10-20 ones I've checked) the C modules in the
distribution have one thing in common: if something in their initFoo()
function fails, they return without freeing any memory. I.e. they return
an incomplete module.

The only way I can think of that one of the standard modules could fail is
when you're out of memory, and that's kinda hard to simulate, so I put in
a faked failure, i.e. I raised an exception and returned prematurely (in
one of my own C modules, not one in the distribution!).

The code looks like this:
    PyErr_SetString(PyExc_ImportError, "foo");
    return;
    /* do other things here, this "fails" */

>>> import Foo
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: foo
>>> import Foo
>>> dir()
['Foo', '__builtins__', '__doc__', '__name__']

Huh?! How did this happen? What is Foo doing there?

Even more interesting, say that I create a submodule and throw in a bunch
of PyCFunctions in it (I stole the code from InitModule since I don't know
how to fake submodules in a C module in another way, is there a way?). I
create the module, fail on inserting it into the dictionary and DECREF it.
Now, that ought to free the darn submodule, doesn't it? Anyway, I wrote a
simple "mean" script to test this:

try: import Foo
except: import Foo
while 1:
  try: reload(Foo)
  except: pass

And this leaks memory like I-don't-know-what!
What memory doesn't get freed?


Now to my questions: What exactly SHOULD I do when loading my module fails
halfway through? Common sense says I should free the memory I've used and
the module object ought to be unusable.

Why-oh-why can I import Foo, catch the exception, import it again and it
shows up in the dictionary? What's the purpose of this?

How do I work with submodules in a C module?


I find the import semantics really weird here, something is not quite
right...

Regards,
Martin Sjögren

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 405242        Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html


From Donald Beaudry <donb@abinitio.com>  Thu Jul 26 15:00:46 2001
From: Donald Beaudry <donb@abinitio.com> (Donald Beaudry)
Date: Thu, 26 Jul 2001 10:00:46 -0400
Subject: [Python-Dev] Re: Static method and class method comments
References: <9jn382$lqf$1@license1.unx.sas.com> <3B5F565C.5F452AC5@ActiveState.com>
Message-ID: <200107261400.KAA28632@localhost.localdomain>

Paul Prescod <paulp@ActiveState.com> wrote,
> >Kevin Smith wrote:
> > 
> > I am very glad to see the new features of Python 2.2, but I do have a minor
> > gripe about the implementation of static and class methods.  My issue stems
> > from the fact that when glancing over Python code that uses static or class
> > methods, you cannot tell that a method is a static or class method by looking
> > at the point where it is defined.  
> > ...
> 
> Agree strongly. This will also be a problem for documentation generation
> tools, type extraction tools and class browsers. I believe it would be
> easy to add a contextual keyword 
> 
> > class C:
> >         def static foo(x, y):
> >             print "classmethod", x, y

My favorite way to spell this is:

    class C:
        class __class__:
            def foo(c, x, y):
                print "class method", x, y


Or in words, class methods defined in their own name space, inside the
class __class__.

As for the distinction between "static methods" and "class methods", I
haven't been able to convince myself that it's useful.

--
Donald Beaudry                                     Ab Initio Software Corp.
                                                   201 Spring Street
donb@abinito.com                                   Lexington, MA 02421
                  ...So much code, so little time...



From guido@zope.com  Thu Jul 26 16:25:51 2001
From: guido@zope.com (Guido van Rossum)
Date: Thu, 26 Jul 2001 11:25:51 -0400
Subject: [Python-Dev] Import hassle
In-Reply-To: Your message of "Thu, 26 Jul 2001 13:12:23 +0200."
 <20010726131222.A30459@strakt.com>
References: <20010726131222.A30459@strakt.com>
Message-ID: <200107261525.LAA11553@cj20424-a.reston1.va.home.com>

> I've been writing quite a few mails lately, all concerning import
> problems. I thought I'd write a little longer mail to explain what I'm
> doing and what I find strange here.

Martin,

Why does this interest you?  This never happens in reality unless your
memory allocator is broken, and then you have worse problems than
"leaks".

Also, why are you posting to python-dev?

> Basically all (at least the 10-20 ones I've checked) the C modules in the
> distribution have one thing in common: if something in their initFoo()
> function fails, they return without freeing any memory. I.e. they return
> an incomplete module.
> 
> The only way I can think of that one of the standard modules could
> fail is when you're out of memory, and that's kinda hard to
> simulate, so I put in a faked failure, i.e. I raised an exception
> and returned prematurely (in one of my own C modules, not one in the
> distribution!).
> 
> The code looks like this:
>     PyErr_SetString(PyExc_ImportError, "foo");
>     return;
>     /* do other things here, this "fails" */
> 
> >>> import Foo
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ImportError: foo
> >>> import Foo
> >>> dir()
> ['Foo', '__builtins__', '__doc__', '__name__']
> 
> Huh?! How did this happen? What is Foo doing there?

In general, when import fails after a certain point, the module has
already been created in sys.modules.  There is a reason for this,
having to do with recursive imports.

> Even more interesting, say that I create a submodule and throw in a
> bunch of PyCFunctions in it (I stole the code from InitModule since
> I don't know how to fake submodules in a C module in another way, is
> there a way?). I create the module, fail on inserting it into the
> dictionary and DECREF it.  Now, that ought to free the darn
> submodule, doesn't it? Anyway, I wrote a simple "mean" script to
> test this:
> 
> try: import Foo
> except: import Foo
> while 1:
>   try: reload(Foo)
>   except: pass
> 
> And this leaks memory like I-don't-know-what!
> What memory doesn't get freed?

Memory leaks are hard to find.  I prefer to focus on memory leaks that
occur in real situations, rather than theoretical leaks.

> Now to my questions: What exactly SHOULD I do when loading my module fails
> halfway through? Common sense says I should free the memory I've used and
> the module object ought to be unusable.

You should free the memory if you care.  "Disabling" the module is
unnecessary -- in practice, the program usually quits when an import
fails anyway.

> Why-oh-why can I import Foo, catch the exception, import it again and it
> shows up in the dictionary? What's the purpose of this?
> 
> How do I work with submodules in a C module?
> 
> I find the import semantics really weird here, something is not quite
> right...

Consider two modules, A and B, where A imports B and B imports A.
This is perfectly legal, and works fine as long as B's module
initialization doesn't use names defined in A.

In order to make this work, sys.modules['A'] is initialized to an empty
module and filled with names during A's initialization; ditto for
sys.modules['B'].
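
A small sketch of the successful case, using two throwaway modules with
hypothetical names (circ_a and circ_b) written to a temporary directory. It
only demonstrates that the partly initialized module is already visible in
sys.modules while the other one is being loaded:

```python
import os
import sys
import tempfile

# Two hypothetical modules that import each other.
SOURCES = {
    "circ_a.py": (
        "import circ_b\n"
        "late_name = 'defined after importing circ_b'\n"
    ),
    "circ_b.py": (
        "import sys\n"
        "import circ_a\n"
        "# Recorded while circ_a is still only partly initialized.\n"
        "a_in_sys_modules = 'circ_a' in sys.modules\n"
        "a_had_late_name = hasattr(circ_a, 'late_name')\n"
    ),
}

workdir = tempfile.mkdtemp()
for filename, text in SOURCES.items():
    with open(os.path.join(workdir, filename), "w") as f:
        f.write(text)
sys.path.insert(0, workdir)

import circ_a
import circ_b

print(circ_b.a_in_sys_modules)       # circ_a was registered early
print(circ_b.a_had_late_name)        # but not yet filled with names
print(circ_b.circ_a is circ_a)       # both refer to the same module object
print(hasattr(circ_a, "late_name"))  # filled in once circ_a finished
```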

Now suppose A triggers an exception after it has successfully loaded
and imported B.  B already has a reference to A.  A is not completely
initialized, but it's not empty either.  Should we delete B's
reference to A?  No -- that's interference with B's namespace, and we
don't know whether B might have stored references to A elsewhere, so
we don't know if this would be effective.  Should we delete
sys.modules['A']?  I don't think so.  If we delete sys.modules['A'],
and later someone attempts to import A again, the following will
happen: when A imports B, it finds sys.modules['B'], so it doesn't
reload B; it will use the existing B.  But now B has a reference to
the *old* A, not the new one.

There are now two possibilities: either the second import of A somehow
succeeds (this could only happen if somehow the problem that caused it
to trigger an exception was repaired before the second attempted
import), or the second import of A fails again.  If it succeeds, the
situation is still broken, because B references the old, incomplete
A.  If it fails, we may end up in an infinite loop, attempting to
reimport A, failing, and catching the exception forever.  Neither is
good.
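
The circular case described above can be reproduced with a short
script.  This is only a sketch, assuming a current Python 3 interpreter
rather than the 2.x of this thread; the module names circ_a and circ_b
are made up for the illustration.  It shows B finding a registered but
half-filled A in sys.modules while A is still being initialized.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module sources, invented for this illustration only.
A_SOURCE = """
early = 'defined before importing B'
import circ_b
late = 'defined after importing B'
"""

B_SOURCE = """
import sys
import circ_a  # found in sys.modules, although A is not finished yet
a_was_registered = 'circ_a' in sys.modules
a_had_early = hasattr(circ_a, 'early')
a_had_late = hasattr(circ_a, 'late')
"""


def demo():
    """Import the circular pair and report what B observed about A."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in (("circ_a", A_SOURCE), ("circ_b", B_SOURCE)):
            with open(os.path.join(tmp, name + ".py"), "w") as handle:
                handle.write(source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            import circ_a
            import circ_b
            return (circ_b.a_was_registered, circ_b.a_had_early,
                    circ_b.a_had_late, hasattr(circ_a, "late"))
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("circ_a", None)
            sys.modules.pop("circ_b", None)
```

The first three items of the returned tuple describe A as B saw it
during B's own initialization; the last one describes A after both
imports completed.  The sketch covers only the successful circular
import, not the failure case discussed above.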

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@zope.com  Thu Jul 26 16:38:18 2001
From: guido@zope.com (Guido van Rossum)
Date: Thu, 26 Jul 2001 11:38:18 -0400
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: Your message of "Thu, 26 Jul 2001 00:41:02 EDT."
 <LNBBLJKPBEHFEDALKOLCIEFBLBAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCIEFBLBAA.tim.one@home.com>
Message-ID: <200107261538.LAA11658@cj20424-a.reston1.va.home.com>

> The danger I see here is that Scheme's "numeric tower" is almost obviously a
> reasonable numeric model, but in practice is so vague that you can't really
> count on anything beyond simple small-int arithmetic working the same way
> across Scheme implementations.

I certainly expect that we'll be able to do better than Scheme in our
cross-implementation semantics -- Scheme is infamous for this.

> Guido appears to have come to an
> appreciation of that model in the abstract, but hoping that there's not much
> difference between floats and rationals in practice "because they represent
> the same mathematical values" just isn't going to pan out (IMO).  1/49*49
> equals 1 or it doesn't; it doesn't using IEEE doubles, it does using
> rationals, and the difference will be significant to programs.  Certainly
> better to switch from floats to rationals someday than to move in the other
> direction, though.

Indeed, my only assumption is that switching from floats to rationals
shouldn't be very disruptive.  In my ideal numeric model, rationals
auto-convert to floats but not the other way around, and str() and
repr() of rationals would yield a decimal floating point
representation similar to that of floats.  (This is more or less what
ABC did, except that for floats it added an annoying "~" as
inexactness indicator.)  To get a rational to print as x/y, you'd have
to extract the numerator and denominator explicitly, or use some
standard method.
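
The 1/49*49 example can be checked directly.  This is a minimal sketch
that assumes the fractions module of later Python versions as a
stand-in for the rational type under discussion; it is not the model
proposed in this thread.

```python
from fractions import Fraction

# IEEE double arithmetic: the rounding error in 1/49 survives the multiply.
float_result = 1.0 / 49.0 * 49.0

# Exact rational arithmetic: the product is exactly 1.
rational_result = Fraction(1, 49) * 49

# Mixing a rational with a float converts toward float, not the other way.
mixed = Fraction(1, 2) + 0.5

# The numerator and denominator have to be extracted explicitly.
half = Fraction(3, 6)
parts = (half.numerator, half.denominator)
```

In the fractions module the conversion runs from rational to float, the
same direction as in the model described above.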

> I've come to suspect the issues *may* be complicated <wink>.

Sure.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Thu Jul 26 17:01:07 2001
From: guido@zope.com (Guido van Rossum)
Date: Thu, 26 Jul 2001 12:01:07 -0400
Subject: [Python-Dev] number-sig anyone?
In-Reply-To: Your message of "Wed, 25 Jul 2001 23:05:42 CDT."
 <15199.38678.456223.981364@beluga.mojam.com>
References: <15197.44710.656892.910976@beluga.mojam.com> <3B5F5859.EBDFDC84@ActiveState.com>
 <15199.38678.456223.981364@beluga.mojam.com>
Message-ID: <200107261601.MAA11741@cj20424-a.reston1.va.home.com>

> By "right" I mean that we can arrive at a long-term stable numeric
> model that will be accepted by both the Python community as a whole
> *and* by the decision makers who will vote thumbs up or down on
> adopting Python in their organizations.  One of the most vocal
> opponents to PEP 238 (I won't mention his name, but his initials are
> S.H. ;-) lamented loudly that he'd be a laughing stock in his
> company because of that "division thing".  He mentioned something
> about being a "right arse" I think.

I'm not so worried.  While many of the opponents tried to explain
their position by arguing that int division was "right", their real
worry was backwards compatibility.  PEP 238 represents the *only*
serious backwards incompatibility in the transition to a new numeric
model that I can imagine.  The transition plan that I hope to be
checking into PEP 238 deals with the fears of the opponents by putting
the sea change off until Python 3.0.

> By having a well-considered overall plan for Python's numeric
> behavior, if you have to make an incompatible change today, another
> next year and a third two years after that, you can point to the
> plan that shows people where you're headed, how you plan to get
> there, and how they can write their programs in the meantime so as
> to be as resilient as possible.  Without such a plan -- or with
> several potentially competing plans as you proposed -- every change
> proposed or made will simply fuel the fires of those people who
> dismiss Python because "it's unstable".  The funny thing is,
> Python's semantics changed so little for so long that by comparison
> the rate of change does seem pretty high, but it's still much better
> than many applications or application libraries (such as the
> relatively recent glibc upheaval or the API changes Gtk is
> undergoing now).  And let's not even mention the folks in Redmond...

Any additional serious incompatibilities will also be put off till
Python 3.0.  But I repeat, I don't expect any.  Let me review:

Int/long unification:

    - This "breaks" code that counts on the OverflowError.

    - This changes the meaning of left shift (the only operation that
      silently throws away bits rather than raising OverflowError).

    - This changes the meaning of octal and hex constants that set the
      sign bit in short integers -- 0xffffffff is currently a fancy
      way of writing -1 on a 32-bit machine, but after unification it
      will be the same as 0xffffffffL (i.e., 2**32-1).

    None of these is a big deal I think.
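
The three points can be seen side by side in an interpreter whose
integers are already unified.  The sketch below assumes such an
interpreter (Python 3); as_signed32 is a hypothetical helper, not an
existing function, that recovers the old 32-bit reading.

```python
def as_signed32(value):
    """Reinterpret the low 32 bits of value as a two's-complement int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# After unification the hex constant is simply a large positive number.
unified = 0xffffffff

# Left shift no longer throws bits away silently.
shifted = 1 << 40

# Arithmetic past the old 32-bit maximum grows instead of raising
# OverflowError.
past_maxint = (2 ** 31 - 1) + 1

# The old 32-bit meanings, recovered explicitly where code depends on them.
old_constant = as_signed32(0xffffffff)
old_shift = as_signed32(1 << 32)
```

Code that relied on the old behavior would have to ask for the
truncation explicitly, as the helper does.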

Rationals:

    - The introduction of a new rational type in itself doesn't break
      anything.

    - Making 1/2 return a rational instead of a float could break
      some things but not at the scale of PEP 238.

    - Making 0.5 be a rational instead of a float will break more;
      we'll have to discuss this.

    I should note that the inclusion of rationals in the new numeric
    model is far from certain.  There are potential problems with
    rationals that may require them to remain a separate type forever.

This is about the extent of the changes to the numeric model that I'm
contemplating; I don't think that the new numeric model should change
much else. (I don't care much for some of the details of PEP 228, but
I have to think more about it.)

In other words, while the plan isn't spelled out yet, the only
disruption is PEP 238.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From cgw@alum.mit.edu  Thu Jul 26 20:40:23 2001
From: cgw@alum.mit.edu (Charles G Waldman)
Date: Thu, 26 Jul 2001 14:40:23 -0500
Subject: [Python-Dev] Problems building info documentation
Message-ID: <15200.29223.607996.271764@nyx.dyndns.org>

I did a CVS update and am trying to rebuild the info docs, and am
getting the following errors.  Any suggestions (other than the obvious
one of rewriting html2texi.pl as html2texi.py)?



cd /home/cgw/Python/python/dist/src/Doc/info/
make -k 
../tools/mkinfo ../html/api/api.html
perl -I/home/cgw/Python/python/dist/src/Doc/tools /home/cgw/Python/python/dist/src/Doc/tools/html2texi.pl /home/cgw/Python/python/dist/src/Doc/html/api/api.html
/usr/lib/perl5/site_perl/5.6.1/HTML/Element.pm:2091: function main::collect_if_text expected 3 arguments, got 5: Front Matter 1 1 HTML::Element=HASH(0x81bf70c) 0
make: *** [python-api.info] Error 255
../tools/mkinfo ../html/ext/ext.html
perl -I/home/cgw/Python/python/dist/src/Doc/tools /home/cgw/Python/python/dist/src/Doc/tools/html2texi.pl /home/cgw/Python/python/dist/src/Doc/html/ext/ext.html
/usr/lib/perl5/site_perl/5.6.1/HTML/Element.pm:2091: function main::collect_if_text expected 3 arguments, got 5: Front Matter 1 1 HTML::Element=HASH(0x81bf6b8) 0
make: *** [python-ext.info] Error 255
../tools/mkinfo ../html/lib/lib.html
perl -I/home/cgw/Python/python/dist/src/Doc/tools /home/cgw/Python/python/dist/src/Doc/tools/html2texi.pl /home/cgw/Python/python/dist/src/Doc/html/lib/lib.html
/usr/lib/perl5/site_perl/5.6.1/HTML/Element.pm:2091: function main::collect_if_text expected 3 arguments, got 5: Front Matter 1 1 HTML::Element=HASH(0x81bf76c) 0
...


From paulp@ActiveState.com  Thu Jul 26 20:52:29 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Thu, 26 Jul 2001 12:52:29 -0700
Subject: [Python-Dev] number-sig anyone?
References: <15197.44710.656892.910976@beluga.mojam.com>
 <3B5F5859.EBDFDC84@ActiveState.com> <15199.38678.456223.981364@beluga.mojam.com>
Message-ID: <3B6074FD.3822B771@ActiveState.com>

Skip Montanaro wrote:
> 
>...
> 
> By "right" I mean that we can arrive at a long-term stable numeric model
> that will be accepted by both the Python community as a whole *and* by the
> decision makers who will vote thumbs up or down on adopting Python in their
> organizations. 

But I don't think that there is any numeric model that will be accepted
by the whole Python community. Some will like any change and some will
dislike it. Likely there will appear to be equal numbers on either side
of any issue because each flame is answered by one or more
counter-flames.

>...
> By having a well-considered overall plan for Python's numeric behavior, if
> you have to make an incompatible change today, another next year and a third
> two years after that, you can point to the plan that shows people where
> you're headed, how you plan to get there, and how they can write their
> programs in the meantime so as to be as resilient as possible. 

I'm not against such a plan but I don't think it can be designed in a
committee. It would be largely the vision of one or two like-minded
people with the same weighting of factors such as performance, ease of
use, backwards compatibility and so forth. If you put a
representation-obsessed engineer in the same committee with a
purity-obsessed mathematician you'll find that the only thing they can
agree on is to disagree.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From mclay@nist.gov  Thu Jul 26 08:40:36 2001
From: mclay@nist.gov (Michael McLay)
Date: Thu, 26 Jul 2001 03:40:36 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
Message-ID: <01072603403600.02216@fermi.eeel.nist.gov>

PEP: XXX
Title: Adding a Decimal type to Python
Version: $Revision:$
Author: mclay@nist.gov <mclay@nist.gov>
Status: Draft
Type: ??
Created: 25-Jul-2001
Python-Version: 2.2


Abstract

    Several PEPs have been written about fixing Python's numerical
    types.  The proposed changes raise issues about breaking backwards
    compatibility in the process. Changing the existing numerical types
    can be avoided by introducing a decimal number type. This change
    will also enhance the utility of Python for several key markets.

    A decimal type is also a natural super-type of both integers and
    floating point numbers.  This makes it an important root type for an
    inheritance tree of numerical types.

    This PEP suggests adding the decimal number type to Python in such
    a way that the existing number types will be the default type for
    .py files and the python command and the new decimal number type
    will be used for .dp files and the dpython command.


Rationale

    Conflicts surface in the discussion of language design when
    programming goals differ.  One example of this is found when
    selecting the best method for interpreting numerical values.  The
    correct answer is dependent on the application domain of the
    software. While Python is very good at providing a simple
    generalized language, it is not an ideal language in all
    cases. 

    For developers of scientific applications, binary numbers are
    often important for performance reasons.  The developers of
    financial applications need to use decimal numbers in
    order to control roundoff errors.  Decimal numbers are also best
    for newbie users because decimal numbers have simpler rules and
    fewer surprises.

    The current implementation of numbers in Python is limited to a
    binary floating point type (both imaginary and real) and two types
    of integers.  This makes the language suitable for scientific
    programming.  Python is also suitable for domains which do not
    make use of numerical types. 

    Changing the existing python implementation to use decimal numbers
    as the default type for literals is likely to irritate scientific
    programmers.  Having to use special notation for decimal
    literals will make financial application developers second-class
    citizens.  Both groups can coexist and share compiled modules by
    making the parser of Python sensitive to the context of the
    syntax.  This can be done by adding a new decimal type and then
    selectively changing the definition of default literals (that is a
    literal without a type suffix). In the proposed implementation the
    .py files and the python command would continue to parse numerical
    literals as they currently are interpreted.  The new decimal
    type would be used for number literals for .dp files and the
    dpython command. 


Proposal

    A new decimal type will be added to Python.  The new type
    will be based on the ANSI standard for decimal numbers.  The
    proposal will also add two new literals for representing numbers.
    A decimal literal will have a 'd' appended to the number
    and a float literal or an integer literal will have a 'f' appended
    to the number. The current '.py' file and the use of the python
    command will continue to use the existing float and integer types
    for the number literals without a suffix.  

    The proposed change will add support for a second file type
    with a '.dp' suffix. There will also be an alternative command
    name, 'dpython', for the Python executable.  The decimal number
    will be used for the interpretation of numerical literals in a '.dp'
    file and when using the 'dpython' command. The following examples
    illustrate the two commands.

    $ ./dpython
    Python 2.2a1 (#87, Jul 26 2001, 11:07:58)
    [GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> type(21.2)
    <type 'decimal'>
    >>> type(21.2f)
    <type 'float'>
    >>> type(21f)
    <type 'int'>
    >>> 21.2f
    21.199999999999999
    >>> 21.2
    21.2
    >>> 1f/2f
    0
    >>> 1/2
    0.5
    >>>
    $ ./python
    Python 2.2a1 (#87, Jul 26 2001, 11:07:58)
    [GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> type(21.2)
    <type 'float'>
    >>> type(21.2f)
    <type 'float'>
    >>> type(21.2d)
    <type 'decimal'>
    >>> 1/2
    0
    >>> 21.2
    21.199999999999999
    >>> 21.2d
    21.2
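
    The contrast the two sessions are meant to show can be approximated
    today.  This is a sketch only: it assumes the decimal module that
    later versions of Python ship in the standard library, not the
    prototype type described in this PEP, and it spells the decimal
    values with a constructor because the 'd' suffix does not exist.

```python
from decimal import Decimal

binary = 21.2
decimal_value = Decimal("21.2")

# 17 significant digits expose the binary approximation of 21.2.
binary_17 = format(binary, ".17g")

# The decimal value keeps exactly the digits that were written.
decimal_text = str(decimal_value)

# Converting the float exactly shows it is not the same number.
same_number = Decimal(binary) == decimal_value

# Division: decimal 1/2 versus integer floor division.
decimal_half = Decimal(1) / Decimal(2)
int_half = 1 // 2
```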

    The new decimal type is a "super-type" of float, integer, and
    long, so when decimal math is used there are only decimal
    numbers, regardless of whether it is an integer or a floating
    point number. Newbies and developers of financial applications
    would use the dpython command and the '.dp' suffix for modules.  
    The language will remain unchanged for existing programs.

    The addition of a decimal type that can be sub-classed may
    eliminate the need to add inheritance to float or integer types.
    Inheritance from the float and integer types is likely to be
    challenging. How will the inheritance from the float or integer
    type work?  The definition and implementation of these types are
    dependent on the C compiler used to compile the interpreter.

    By contrast, a new decimal type could be designed to be highly
    customizable.  The implementation could be implemented like class
    instances with a dictionary that starts out with three members, a
    sign, a coefficient, and an exponent. This basic type could be
    extended with flags that set the type of rounding to be used, or 
    by adding a member that sets the precision of the numbers, or
    perhaps a minimum and maximum value member could be added. 
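
    The three-member layout can be sketched in a few lines.  Everything
    here is hypothetical: the class name SketchDecimal and its methods
    are invented for illustration and are not taken from the prototype,
    and exact values are computed with the fractions module of later
    Python versions purely to keep the sketch short.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class SketchDecimal:
    sign: int         # 0 for non-negative, 1 for negative
    coefficient: int  # non-negative integer holding all the digits
    exponent: int     # power of ten applied to the coefficient

    @classmethod
    def from_string(cls, text):
        """Parse plain literals such as '21.2' or '-0.05' (no 'e' form)."""
        text = text.strip()
        sign = 1 if text.startswith("-") else 0
        digits = text.lstrip("+-")
        whole, _, frac = digits.partition(".")
        return cls(sign, int(whole + frac), -len(frac))

    def value(self):
        """Return the exact value as a Fraction."""
        magnitude = Fraction(self.coefficient) * Fraction(10) ** self.exponent
        return -magnitude if self.sign else magnitude

    def __neg__(self):
        # Negation only flips the sign member.
        return SketchDecimal(1 - self.sign, self.coefficient, self.exponent)
```

    A subclass could add a rounding mode or a precision field without
    touching the three basic members.  For comparison, the standard
    library's later Decimal("21.2").as_tuple() reports sign 0, digits
    (2, 1, 2) and exponent -1.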

    Adding the new file type is also an opportunity to fix some other
    ugliness in Python.  The tab character could be eliminated
    from block indentation.  The default character type could be set to
    Unicode. (In dpython a 'b' would be added to the front of strings
    that are sequences of bytes.)  Using Unicode as the default has
    one important downside.  The change would limit the viewing of
    the '.dp' files to display devices that are Unicode enabled. This
    may have been a problem five years ago. Would it be today? 

    --- need to add other improvements that could be done in dpython ---

Backwards Compatibility

    The proposed change is backward compatible with the existing
    syntax when the python command is used. The new dpython command
    would be used to take advantage of the new language syntax.  The
    python command will have access to the decimal number type and the
    dpython command will have access to the traditional float and
    integer types. Both versions of the language could be used to
    write exactly the same programs that generate exactly the same
    byte code output.  The only difference will be a few syntax
    improvements in the dpython language.


Prototype Implementation

    An implementation of this PEP has been started, but has not been
    completed.  The parsing works as described, and a partial
    implementation of a decimal type has been started.  The prototype
    implementation of the decimal object is sufficient for testing the
    approach of mingling dpython and python.  The design of the
    current implementation does not support sub-classing. This minimal
    implementation of a decimal type could be completed with a day's
    work. The development of an extendable type, as was described
    above, could take place in a later release.

    The interpretation of a number literal that does not have a suffix
    is determined in the parsetok() function.  The function adds a
    'd' or 'f' flag to any numerical literal that does not already
    have a number type suffix. The suffix attached to the numerical
    literal is based on the command used to invoke the parser or the
    suffix of the filename.  The parsenumber() function in the compile.c
    file was modified to key off the number type suffix.  This type
    indicator is used in a switch statement for compiling the text of
    the literal into the correct type of number.
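
    The two steps just described, tagging and then dispatching on the
    tag, can be modelled in Python.  This is a sketch, not the C code
    of the prototype: the function names are hypothetical, the decimal
    module of later Python versions stands in for the new type, and
    letting the filename take precedence over the command name is an
    assumption, since the text above does not say which wins.

```python
import os
from decimal import Decimal


def default_suffix(filename=None, program="python"):
    """Choose the suffix given to literals that carry none."""
    if filename is not None:
        # Assumption: a file's extension overrides the command name.
        return "d" if os.path.splitext(filename)[1] == ".dp" else "f"
    return "d" if os.path.basename(program) == "dpython" else "f"


def tag_literal(text, suffix):
    """Append the default suffix unless the literal already has one.

    Only plain decimal-digit literals are handled; hex, long ('L') and
    imaginary ('j') literals are outside this sketch.
    """
    return text if text[-1] in "df" else text + suffix


def parse_number(tagged):
    """Build the value selected by the trailing type suffix."""
    body, kind = tagged[:-1], tagged[-1]
    if kind == "d":
        return Decimal(body)
    if kind == "f":
        is_float = "." in body or "e" in body.lower()
        return float(body) if is_float else int(body)
    raise ValueError("literal has no type suffix: %r" % tagged)
```

    With these helpers, the literal 21.2 in a file named mod.dp becomes
    "21.2d" and then a decimal value, while the same text typed at the
    python command becomes "21.2f" and then a float.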

    The implementation of the decimal type was created by copying the
    complexobject.[hc] files and then doing a global replace of the
    word complex with the word decimal.  The PyDecimal_FromString
    method in decimalobject.c interprets the string encoding of a
    decimal number correctly and populates the data structure that
    contains the sign, coefficient, and exponent values of a decimal
    number. A minimal printing of the decimal number has been enabled.
    It is hard-coded to just print out a scientific notation of the
    number.  The only operator that works properly at this time is the
    negation operator defined in decimal_neg(). The d_sum() and d_prod()
    functions have been started, but they are very broken.  No work has
    been done on implementing the d_quot() function.  The example that
    shows integer division working properly above was done by editing
    the output.  The format of the echoed decimal number was also edited.

    When a directory in the path contains a '.dp' module and a '.py'
    module with the same module name the '.dp' module is used.

    The prototype implementation is available at http://www.gencam.org/python
    The implementation has only been tested on Mandrake Linux 8.0.

Known Problems and Questions

    The parsetok.c file was duplicated and renamed to parsetok2.c
    because the pgen program could not resolve the Py_GetProgramName()
    function.  

    The dpython repr() function should probably return a number with a
    suffix of 'd' for decimal types if the module is a '.py' module or
    if the python command is used. Should the repr() function add the
    'f' suffix to float and integer values when accessed from a '.dp'
    module or the dpython command is used? 

Common Objections
 
    Adding a new type results in more rules to remember regarding the
    use of numbers in Python. 

    Response:  

    In general the rules for using the decimal number type will
    be simpler than the rules governing the current set of numerical
    types.  This should make it easier for newbies to learn the
    dpython language.

    The benefits to the users who need a decimal type are significant
    and the added rules will primarily impact these users.  The
    decimal numbers are more precise, which is essential for some
    application domains. The decimal number rules will tend to
    simplify the use of python for these applications.

    The types used in an application will most likely be selected to
    match the user's requirements. Crossover between the new decimal
    types and the classic types will be infrequent. For cases where
    types must be mixed the language will be explicit. There will be 
    no automatic coercion between the types.  Exceptions will be
    raised if an explicit conversion isn't used. 
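
    The "no automatic coercion" rule can be observed in the decimal
    module that later versions of Python provide.  That module is used
    here only as a stand-in for the proposed type; its rules are not
    necessarily the ones this PEP has in mind (for instance, it does
    allow a decimal to be mixed with an int).

```python
from decimal import Decimal

price = Decimal("19.99")
rate = 0.07  # a binary float


def mixing_raises(decimal_value, float_value):
    """Return True if adding a float to a Decimal raises TypeError."""
    try:
        decimal_value + float_value
    except TypeError:
        return True
    return False


# An explicit conversion makes the mixed expression legal.
explicit_sum = price + Decimal(str(rate))
```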

    Having two languages will confuse users.

    Response:

    This is unlikely to be a problem because there will rarely be a
    python module that requires both types of numbers.  If number
    types must be mixed in a module the proposed syntax provides an
    easy method to visually distinguish between the different number
    types. When types are mixed the choice between python and dpython
    will probably be dictated by the domain of the application developer.

    The distinction between python and dpython disappears once the
    language syntax has been compiled.  The only problem that might
    occur is in recognizing which language version is being used when
    editing a module.  An IDE can minimize the chances of confusion by
    using different background colors or highlighting schemes to
    distinguish between the versions of the language.  Anyone still
    using vi on a black and white monitor will just have to remember
    the name of the file being edited. (Which is probably how they
    think it should be:-)

    Shouldn't the root numerical type be a rational type?

    Response:

    ???


References

    [1] ANSI standard X3.274-1996.  
        (See http://www2.hursley.ibm.com/decimal/deccode.html)

Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:


From tim@digicool.com  Thu Jul 26 21:52:18 2001
From: tim@digicool.com (Tim Peters)
Date: Thu, 26 Jul 2001 16:52:18 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: <01072603403600.02216@fermi.eeel.nist.gov>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOENBCDAA.tim@digicool.com>

>     [1] ANSI standard X3.274-1996.
>         (See http://www2.hursley.ibm.com/decimal/deccode.html)

Michael, this is merely a standard for *encoding* decimal numbers; it
doesn't say anything about semantics, or exceptions, or anything else
visible to users.

Are you aware that Aahz is implementing "the real" spec for Python, a level
up at

    http://www2.hursley.ibm.com/decimal/

under "Base specification"?  There are so few people working on the decimal
idea that I hate to see it fragmented already.



From barry@zope.com  Thu Jul 26 22:06:19 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 26 Jul 2001 17:06:19 -0400
Subject: [Python-Dev] Breakage in CVS on SF?
Message-ID: <15200.34379.61043.569809@yyz.digicool.com>

Has anybody noticed this problem?

% cvs -q up -P -d
P Mac/Lib/findertools.py
cvs [update aborted]: cannot open .new.findertoo: Permission denied
write stdout: Broken pipe


From mclay@nist.gov  Thu Jul 26 22:38:27 2001
From: mclay@nist.gov (Michael McLay)
Date: Thu, 26 Jul 2001 17:38:27 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
Message-ID: <01072617382702.02216@fermi.eeel.nist.gov>

On Thursday 26 July 2001 04:52 pm, Tim Peters wrote:
> >     [1] ANSI standard X3.274-1996.
> >         (See http://www2.hursley.ibm.com/decimal/deccode.html)
>
> Michael, this is merely a standard for *encoding* decimal numbers; it
> doesn't say anything about semantics, or exceptions, or anything else
> visible to users.

This was a proposal for a mechanism for mingling types safely.  It was not
intended as a definition of how decimal numbers should be implemented.  My
implementation tests the interaction of the current number types with the
decimal type and I only completed enough of the decimal type implementation
to support this testing.  I was not expecting to discuss how decimal types
should work.  That has been discussed already. I was primarily interested in
testing the effects of adding a new number type as I described in the PEP.

What did you think of the idea of adding a new command and file format?

> Are you aware that Aahz is implementing "the real" spec for Python, a level
> up at
>
>     http://www2.hursley.ibm.com/decimal/
>
> under "Base specification"?  There are so few people working on the decimal
> idea that I hate to see it fragmented already.

Yes I have played with the Decimal.py module. I developed decimalobject.c so
I could test the impact of introducing an additional command and file format
to Python.  I expect this code to be replaced.  As I said in the PEP I also
think the decimal number implementation will evolve into a type that supports
inheritance.


From tim@digicool.com  Thu Jul 26 22:43:31 2001
From: tim@digicool.com (Tim Peters)
Date: Thu, 26 Jul 2001 17:43:31 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: <01072617382702.02216@fermi.eeel.nist.gov>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOENFCDAA.tim@digicool.com>

[Michael McLay]
> ...
> What did you think of the idea of adding a new command and file format?

I haven't gotten that far yet <wink> -- really, I just skimmed the top and
the bottom so far.  Too much to do; will read later, though.



From guido@zope.com  Thu Jul 26 23:21:31 2001
From: guido@zope.com (Guido van Rossum)
Date: Thu, 26 Jul 2001 18:21:31 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: Your message of "Thu, 26 Jul 2001 17:38:27 EDT."
 <01072617382702.02216@fermi.eeel.nist.gov>
References: <01072617382702.02216@fermi.eeel.nist.gov>
Message-ID: <200107262221.SAA21517@cj20424-a.reston1.va.home.com>

[Michael]
> This was a proposal for a mechanism for mingling types safely.  It
> was not intended as a definition of how decimal numbers should be
> implemented.  My implementation tests the interaction of the current
> number types with the decimal type and I only completed enough of
> the decimal type implementation to support this testing.  I was not
> expecting to discuss how decimal types should work.  That has been
> discussed already. I was primarily interested in testing the effects
> of adding a new number type as I described in the PEP.

Can you summarize the rules you used for mixed arithmetic?  I forget
what your PEP said would happen when you add a decimal 1 to a binary
1.  Is the result decimal 2 or binary 2?  Why?

> What did you think of the idea of adding a new command and file format?

I don't think that would be necessary.  I'd prefer the 'd' and 'f' (or
maybe 'b'?) suffixes to be explicit, perhaps combined with an optional
per-module directive to set the default.  This would be more robust
than keying on the filename extension.  If you have to change the
default globally, I'd prefer a command line option.  After all it's
only the scanner that needs to know about the different defaults,
right?

I wonder about the effectiveness of the default though.  If you write
a module for decimal arithmetic, how do you prevent a caller from passing
in a binary number?

> > Are you aware that Aahz is implementing "the real" spec for
> > Python, a level up at
> >
> >     http://www2.hursley.ibm.com/decimal/
> >
> > under "Base specification"?  There are so few people working on
> > the decimal idea that I hate to see it fragmented already.
> 
> Yes I have played with the Decimal.py module. I developed
> decimalobject.c so I could test the impact of introducing an
> additional command and file format to Python.  I expect this code to
> be replaced.  As I said in the PEP I also think the decimal number
> implementation will evolve into a type that supports inheritance.

Please, please, please, unify all these efforts.  A decimal PEP would
be a good one, but there should be only one.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg@cosc.canterbury.ac.nz  Fri Jul 27 02:51:54 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 27 Jul 2001 13:51:54 +1200 (NZST)
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: <01072617382702.02216@fermi.eeel.nist.gov>
Message-ID: <200107270151.NAA04533@s454.cosc.canterbury.ac.nz>

Michael McLay <mclay@nist.gov> (by way of himself):

> This was a proposal for a mechanism for mingling types safely.  It was not
> intended as a definition of how decimal numbers should be
> implemented.

Perhaps you could make it clearer in the introduction that
this is the part of the problem your PEP is addressing.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@home.com  Fri Jul 27 06:27:27 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 27 Jul 2001 01:27:27 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: <200107262221.SAA21517@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEJLLBAA.tim.one@home.com>

[Guido]
> ...
> Please, please, please, unify all these efforts.  A decimal PEP would
> be a good one, but there should be only one.

I elect Michael <wink>.  Note there is *no* decimal PEP now -- not even a
decimal PEP number assigned.  Aahz isn't going to write one, either.  I was
hoping to write one instead if time allowed, but that looks increasingly
unlikely by the hour.



From tim.one@home.com  Fri Jul 27 06:35:08 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 27 Jul 2001 01:35:08 -0400
Subject: [Python-Dev] Breakage in CVS on SF?
In-Reply-To: <15200.34379.61043.569809@yyz.digicool.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJLLBAA.tim.one@home.com>

[Barry A. Warsaw]
> Has anybody noticed this problem?
>
> % cvs -q up -P -d
> P Mac/Lib/findertools.py
> cvs [update aborted]: cannot open .new.findertoo: Permission denied
> write stdout: Broken pipe

I have not.  Do you still see it?

if-so-get-a-real-os-with-a-real-cvs<wink>-ly y'rs  - tim



From mclay@erols.com  Fri Jul 27 06:44:33 2001
From: mclay@erols.com (Michael McLay)
Date: Fri, 27 Jul 2001 01:44:33 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
Message-ID: <01072701443301.05085@localhost.localdomain>

The PEP I posted yesterday, which currently doesn't have a number, addresses 
the syntactic issues of adding a decimal number type to Python and it 
investigates how to safely introduce the new type in a language with a large 
base of legacy code.  The PEP does not address the definition of how decimal 
numbers should be implemented in Python.  This topic has been the subject of 
other PEPs.  

The PEP also proposes the definition of a new language dialect that makes 
some small improvements on the syntax of the classic Python language.  The 
changes to the numerical model are tailored to make the language attractive to 
two very important markets.  Many of the users attracted from these markets 
may initially have little or no interest in classic Python.  They may not 
even know that the Python language exists.  They will happily use a language 
called dpython that works very well for their profession.  The interesting 
thing about the proposed language is how little effort will be required to 
create and maintain it.  The additions to Python were straightforward and the 
total patch was only a few hundred lines.

The prototype implementation uses the following rules when interpreting the 
type to be created from a number literal.

  literal	'.py'		'.dp'		interactive	interactive
  value	file		file		python		dpython

  2.2b	float		float		float		float
    2b	int		int		int		int

   2.2	float		decimal		float		decimal
     2	int		decimal		int		decimal

  2.2d	decimal		decimal		decimal		decimal
    2d	decimal		decimal		decimal		decimal

Based on a comment from Guido I've decided to change the 'f' to 'b' in the 
next version of dpython.  That will be more descriptive of the distinction 
between the types.
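
As a rough illustration of the table above (this is not the prototype's
code; the names DEFAULT_BY_SUFFIX, default_for_file and literal_type are
made up for the sketch), the rule can be written as a lookup on the
literal's suffix plus the default implied by the file extension:

```python
import os

# Hypothetical names; the table above is the only source for these rules.
DEFAULT_BY_SUFFIX = {".py": "binary", ".dp": "decimal"}


def default_for_file(filename):
    """Return the default literal family implied by a module's extension."""
    return DEFAULT_BY_SUFFIX[os.path.splitext(filename)[1]]


def literal_type(literal, default):
    """Return 'int', 'float' or 'decimal' for a number literal."""
    if literal.endswith("d"):
        return "decimal"              # explicit decimal suffix always wins
    if literal.endswith("b"):
        body = literal[:-1]           # explicit binary suffix always wins
    elif default == "decimal":
        return "decimal"              # unsuffixed literal in a '.dp' context
    else:
        body = literal                # unsuffixed literal in a '.py' context
    # Binary literals split into int and float as in classic Python.
    return "float" if any(ch in body for ch in ".eE") else "int"
```

The interactive python and dpython commands would pass "binary" and
"decimal" respectively as the default.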

[Michael]
>> This was a proposal for a mechanism for mingling types safely.  It
>> was not intended as a definition of how decimal numbers should be
>> implemented.  My implementation tests the interaction of the current
>> number types with the decimal type and I only completed enough of
>> the decimal type implementation to support this testing.  I was not
>> expecting to discuss how decimal types should work.  That has been
>> discussed already. I was primarily interested in testing the effects
>> of adding a new number type as I described in the PEP.

> Can you summarize the rules you used for mixed arithmetic?  I forget
> what your PEP said would happen when you add a decimal 1 to a binary
> 1.  Is the result decimal 2 or binary 2?  Why?

The rule is very simple.  You can't mix the types.  You must explicitly cast 
a binary to a decimal or a decimal to a binary.  This introduces the least 
chance of error.  This pedantic behavior is very important in fields like 
accounting.  I want accountants to think of the proposed dpython language as 
the COBOL version of Python:-)  This approach is also the correct one to take 
for newbies.  They will get a nice clean exception if they mix the types.  
This error will be something they can easily look up in the documentation.  
An unexpected answer, like 1/2=0,  will just leave them scratching their head.

This proposal tries to be consistent with what I like about Python and what I 
think makes Python a great language.  The implementation maintains complete 
backwards compatibility and it requires that programmers explicitly state 
that they want to do something rather than have bad things happen 
unexpectedly.  Mixing different types of numbers can lead to bugs that are 
very difficult to identify.  The nature of the errors that would occur 
when binary numbers are used instead of decimals would be particularly 
difficult to detect.  The answers would always be very close, and sometimes 
they would be correct.  Without the use of an explicit cast these errors 
would be silent.  The price paid for being pedantic will be the occasional 
need to add an int() or float() around a decimal number or a decimal() 
around a float or int.
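
A minimal sketch of that no-mixing rule, assuming a toy class named
StrictDecimal (hypothetical; it is not the prototype's decimalobject.c,
and it borrows the standard-library decimal module, which postdates this
thread, purely for storage):

```python
from decimal import Decimal as _Storage  # stand-in storage only


class StrictDecimal:
    """Toy decimal that refuses to mix with int or float (hypothetical)."""

    def __init__(self, value):
        # Go through str() so a float such as 1.1 is taken at face value.
        self._value = _Storage(str(value))

    def __add__(self, other):
        if not isinstance(other, StrictDecimal):
            raise TypeError("cannot mix decimal and binary numbers; "
                            "use an explicit cast")
        return StrictDecimal(self._value + other._value)

    __radd__ = __add__  # 1.0 + StrictDecimal(1) fails the same way

    def __float__(self):
        return float(self._value)

    def __str__(self):
        return str(self._value)


total = StrictDecimal("1.10") + StrictDecimal("2.20")   # decimal + decimal
mixed_ok = float(StrictDecimal("1.5")) + 1.0            # explicit cast first
```

Adding StrictDecimal(1) to a plain 1.0 raises TypeError in either operand
order, which is the "nice clean exception" described above.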

>> What did you think of the idea of adding a new command and file format?

> I don't think that would be necessary. I'd prefer the 'd' and 'f' (or
> maybe 'b'?) suffixes to be explicit, perhaps combined with an optional
> per-module directive to set the default.  This would be more robust
> than keying on the filename extension.  

Why do you think a directive statement would be more robust than using a file 
suffix or command name as the directive? I'll try to explain why I think the 
opposite is true.

Take the example of teaching a newbie to program. They must be told some 
basic things.  For instance, they will have been told only to use a specific 
suffix for the file name in order to create a new module. So how do you make 
sure that the newbie always uses decimal numbers? If a directive statement is 
required then the newbie must remember to always add this statement at the 
top of a file.  If they forget they will not get the expected results.  With 
the file suffix based approach they will have to use a '.dp' suffix for the 
file name of a new module.   If they are told to use a '.dp' suffix from the 
outset then they are very unlikely to type '.py' instead of '.dp' by 
accident, whereas forgetting to add a directive would be a silent error that 
they might easily make.

Your request to have an explicit 'd' and 'f' is already implemented. The 
prototype implementation allows an explicit 'd' or 'f' to be used at any time. 
The rules on the interpretation of the values that have no suffix were 
defined earlier. The prototype implementation simply uses the suffix of the 
module file and the name of the command as the directive.  This approach 
provides a very natural language experience to someone working in a 
profession that normally uses decimal numbers.  They are not treated as 
second class citizens who must endure the clutter of a magic directive 
statement at the top of every module they create.  They just use their 
special command and the file extension.  

> If you have to change the
> default globally, I'd prefer a command line option.  After all it's
> only the scanner that needs to know about the different defaults,
> right?

I think there would be a problem with only using a command line option.  It 
would work for files that are named on the command line and for code being 
interpreted in an interactive session. However, for imported modules the 
meaning of a number literal must be based on the author's intentions when the 
module was created. This means that the interpreter must recognize the type 
of file so it can determine how to compile the literals defined in the module. 
If the command line option determines how a scanner is to convert the number 
literals then a module source file could incorrectly be converted if the 
wrong command line option were used.

> I wonder about the effectiveness of the default though.  If you write
> a module for decimal arithmetic, how do you prevent a caller to pass
> in a binary number?

Since the module is written with decimal numbers an exception would be raised 
if a binary number was used where a decimal number was required.  For 
instance:

---------------
#File spam.py
a = 1.0

---------------
#File eggs.dp
import spam
c = a + 1.0

---------------

The name 'a' was compiled into a float type object when the spam.py file was 
scanned.  So when the expression being assigned to 'c'  is executed it would 
result in a TypeError being raised because a float was being added to a 
decimal.  

>> decimalobject.c so I could test the impact of introducing an
>> additional command and file format to Python.  I expect this code to
>> be replaced.  As I said in the PEP I also think the decimal number
>> implementation will evolve into a type that supports inheritance.

> Please, please, please, unify all these efforts.  A decimal PEP would
> be a good one, but there should be only one.

Absolutely.  The PEP process is supposed to formalize the capture of ideas so 
they can be referenced. This PEP is mostly orthogonal to Aahz's proposal.  
They can be merged, or we can reference each other's PEPs.  I'm probably not 
the best choice for implementing the decimal number semantics, so I'd 
be happy to work with Aahz.


From mclay@erols.com  Fri Jul 27 06:52:29 2001
From: mclay@erols.com (Michael McLay)
Date: Fri, 27 Jul 2001 01:52:29 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: <01072701443301.05085@localhost.localdomain>
References: <01072701443301.05085@localhost.localdomain>
Message-ID: <01072701522902.05085@localhost.localdomain>

Oops

On Friday 27 July 2001 01:44 am, you wrote:
>
> ---------------
> #File spam.py
> a = 1.0
>
> ---------------
> #File eggs.dp
> import spam
> c = a + 1.0

that should have been 

c = spam.a + 1.0



time for bed:-)


From fdrake@acm.org  Fri Jul 27 00:17:07 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 26 Jul 2001 19:17:07 -0400 (EDT)
Subject: [Python-Dev] Import hassle
In-Reply-To: <20010726131222.A30459@strakt.com>
References: <20010726131222.A30459@strakt.com>
Message-ID: <15200.42227.309819.315626@cj42289-a.reston1.va.home.com>

Martin Sjögren writes:
 > Even more interesting, say that I create a submodule and throw in a bunch
 > of PyCFunctions in it (I stole the code from InitModule since I don't know
 > how to fake submodules in a C module in another way, is there a way?). I

Martin,
  You can take a look at the code in pyexpat.c's init function; this
creates a couple of module objects to hold constants in segregated
namespaces.  I'm not sure that it does everything it needs to to
properly build up all the namespaces, but it should do reasonably
well.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From fdrake@acm.org  Fri Jul 27 00:09:05 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 26 Jul 2001 19:09:05 -0400 (EDT)
Subject: [Python-Dev] Problems building info documentation
In-Reply-To: <15200.29223.607996.271764@nyx.dyndns.org>
References: <15200.29223.607996.271764@nyx.dyndns.org>
Message-ID: <15200.41745.113986.278767@cj42289-a.reston1.va.home.com>

Charles G Waldman writes:
 > I did a CVS update and am trying to rebuild the info docs, and am
 > getting the following errors.  Any suggestions (other than the obvious
 > one of rewriting html2texi.pl as html2texi.py)?

  This has been reported before, but I don't know of a fix.  I don't
think anyone has spent any time on it.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From martin@strakt.com  Fri Jul 27 09:34:13 2001
From: martin@strakt.com (Martin Sjögren)
Date: Fri, 27 Jul 2001 10:34:13 +0200
Subject: [Python-Dev] Import hassle
In-Reply-To: <200107261525.LAA11553@cj20424-a.reston1.va.home.com>
References: <20010726131222.A30459@strakt.com> <200107261525.LAA11553@cj20424-a.reston1.va.home.com>
Message-ID: <20010727103413.A10266@strakt.com>

On Thu, Jul 26, 2001 at 11:25:51AM -0400, Guido van Rossum wrote:
> > I've been writing quite a few mails lately, all concerning import
> > problems. I thought I'd write a little longer mail to explain what I'm
> > doing and what I find strange here.
> 
> Martin,
> 
> Why does this interest you?  This never happens in reality unless your
> memory allocator is broken, and then you have worse problems than
> "leaks".

Short answer: I want to do it Right<tm>
Long answer: I'm curious about how it works, and I found the import
statement very odd, what with importing "broken" modules and reloading
them and so on.  I could easily get python to leak lots and lots of memory
by catching the import and then catching the reload() in an infinite loop.
Basically, I like Python but Python could be better ;)

> Also, why are you posting to python-dev?

Good question.  I never seemed to get the answers I wanted from
python-list, so my first thought was to mail you personally seeing as "he
created the thing, surely he knows" but then I thought that it'd be a bit
rude so I thought "who else would know a lot about this?" and I came up
with the answer python-dev.  If I shouldn't have done this, I apologize,
but after fooling around with import and trying to figure out where and
what I should free when the init failed, I was mildly confuddled.

I posted this to python-dev too since I started out there, making my error
worse I guess.  Feel free to flame me :-)

[snip]

> > Even more interesting, say that I create a submodule and throw in a
> > bunch of PyCFunctions in it (I stole the code from InitModule since
> > I don't know how to fake submodules in a C module in another way, is
> > there a way?). I create the module, fail on inserting it into the
> > dictionary and DECREF it.  Now, that ought to free the darn
> > submodule, doesn't it? Anyway, I wrote a simple "mean" script to
> > test this:
> > 
> > try: import Foo
> > except: import Foo
> > while 1:
> >   try: reload(Foo)
> >   except: pass
> > 
> > And this leaks memory like I-don't-know-what!
> > What memory doesn't get freed?
> 
> Memory leaks are hard to find.  I prefer to focus on memory leaks that
> occur in real situations, rather than theoretical leaks.

Agreed, though it's nice to do it Right<tm>, especially when I get asked
on the code review at work "shouldn't you free memory here?" and the only
thing I can reply is "nobody else does", and my boss says "just because
nobody else does it Right<tm>, there's no reason you shouldn't"

But, what IS the Right<tm> way to do this anyway?

> > Now to my questions: What exactly SHOULD I do when loading my module fails
> > halfway through? Common sense says I should free the memory I've used and
> > the module object ought to be unusable.
> 
> You should free the memory if you care.  "Disabling" the module is
> unnecessary -- in practice, the program usually quits when an import
> fails anyway.

Okay, so how about the situation where an import fails halfway through but
the things you need are initialized "before" that.  Say that you catch the
exception on import and check whether the things you need are there.  If
they are, fine.  If they aren't, fail.  I don't see this situation as
something that'd show up all the time, but it certainly is possible, isn't
it?  In that situation it would be nice if there were no memory leaks...

Then again, maybe I'm just foolish.

> > Why-oh-why can I import Foo, catch the exception, import it again and it
> > shows up in the dictionary? What's the purpose of this?
> > 
> > How do I work with submodules in a C module?
> > 
> > I find the import semantics really weird here, something is not quite
> > right...

> Consider two modules, A and B, where A imports B and B imports A.
> This is perfectly legal, and works fine as long as B's module
> initialization doesn't use names defined in A.
> 
> In order to make this work, sys.modules['A'] is initialized to an empty
> module and filled with names during A's initialization; ditto for
> sys.modules['B'].
> 
> Now suppose A triggers an exception after it has successfully loaded
> and imported B.  B already has a reference to A.  A is not completely
> initialized, but it's not empty either.  Should we delete B's
> reference to A?  No -- that's interference with B's namespace, and we
> don't know whether B might have stored references to A elsewhere, so
> we don't know if this would be effective.  Should we delete
> sys.modules['A']?  I don't think so.  If we delete sys.modules['A'],
> and later someone attempts to import A again, the following will
> happen: when A imports B, it finds sys.modules['B'], so it doesn't
> reload B; it will use the existing B.  But now B has a reference to
> the *old* A, not the new one.
> 
> There are now two possibilities: either the second import of A somehow
> succeeds (this could only happen if somehow the problem that caused it
> to trigger an exception was repaired before the second attempted
> import), or the second import of A fails again.  If it succeeds, the
> situation is still broken, because B references the old, incomplete
> A.  If it fails, we may end up in an infinite loop, attempting to
> reimport A, failing, and catching the exception forever.  Neither is
> good.

Ah-hah.  Now I get it, thank you!
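
For the record, a small sketch of the A/B scenario quoted above.  The
module names circ_a and circ_b and the demo() helper are invented for
this illustration; the two modules are written to a temporary directory
so that the snippet is self-contained.

```python
import importlib
import os
import sys
import tempfile


def demo():
    """Import a module that fails halfway through a circular import."""
    with tempfile.TemporaryDirectory() as tmp:
        # circ_a defines one name, imports circ_b, then fails before 'late'.
        with open(os.path.join(tmp, "circ_a.py"), "w") as f:
            f.write("early = 1\n"
                    "import circ_b\n"
                    "raise RuntimeError('init failed')\n"
                    "late = 2\n")
        # circ_b imports circ_a back and so keeps a reference to it.
        with open(os.path.join(tmp, "circ_b.py"), "w") as f:
            f.write("import circ_a\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            try:
                import circ_a  # noqa: F401
            except RuntimeError:
                pass
            partial = sys.modules["circ_b"].circ_a
            return {
                "a_registered": "circ_a" in sys.modules,
                "b_registered": "circ_b" in sys.modules,
                "b_sees_early": hasattr(partial, "early"),
                "b_sees_late": hasattr(partial, "late"),
            }
        finally:
            sys.path.remove(tmp)


result = demo()
```

circ_b stays registered and holds the half-initialized circ_a (it has
'early' but not 'late').  On a current CPython 3, as far as I can tell,
the failed circ_a itself is dropped from sys.modules, unlike the
interpreter discussed in this thread, so a later retry would leave
circ_b pointing at the old object, as described above.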

Martin

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 405242        Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html


From mal@lemburg.com  Fri Jul 27 11:13:02 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 27 Jul 2001 12:13:02 +0200
Subject: [Python-Dev] PEP for adding a decimal type to Python
References: <01072701443301.05085@localhost.localdomain>
Message-ID: <3B613EAE.D1F4AEC4@lemburg.com>

Just a suggestion which might also open the door for other numeric
type extensions to play along nicely:

Would it make sense to have an extensible registry of constructors 
for numeric types which maps number literal modifiers to constructors ?

I am thinking of

123L -> long("123")
123i -> int("123")
123.45f -> float("123.45")

The registry would map 'L' to long(), 'i' to int(), 'f' to float()
and be extensible in the sense, that e.g. an extension like
mxNumber could register its own mappings which would make 
the types defined in these extensions much more accessible
without having to patch the interpreter. mxNumber for example could
then register 'r' to map to mx.Number.Rational() and a user could
then write 1/2r, which would map to 1 / mx.Number.Rational("2") and
generate a Rational number object for 1/2.

The registry would have to be made smart enough to separate
integer notations from floating point ones and use two separate
default mappings for these, e.g. '<int>' -> int() and '<float>' ->
float().
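
A rough sketch of how such a registry might look, written in plain
Python rather than as a compiler hook.  The names registry, register
and convert are assumptions made for this example, fractions.Fraction
stands in for mx.Number.Rational, and 'L' maps to int because current
Python has no separate long type:

```python
import re
from fractions import Fraction  # stand-in for mx.Number.Rational

# A number (integer, float, or exponent form) plus an optional
# one-letter modifier.
_LITERAL = re.compile(
    r"^(?P<number>(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)(?P<suffix>[A-Za-z]?)$"
)

# Hypothetical registry: literal modifier -> constructor taking a string.
registry = {
    "L": int,
    "i": int,
    "f": float,
    "<int>": int,        # default for unsuffixed integer notation
    "<float>": float,    # default for unsuffixed floating point notation
}


def register(suffix, constructor):
    """Add or replace the constructor used for a literal modifier."""
    registry[suffix] = constructor


def convert(literal):
    """Turn the text of a number literal into an object via the registry."""
    match = _LITERAL.match(literal)
    if match is None:
        raise ValueError("not a number literal: %r" % literal)
    number, suffix = match.group("number"), match.group("suffix")
    if not suffix:
        is_float = any(ch in number for ch in ".eE")
        suffix = "<float>" if is_float else "<int>"
    if suffix not in registry:
        raise ValueError("no constructor registered for %r" % suffix)
    return registry[suffix](number)


register("r", Fraction)
half = 1 / convert("2r")   # the 1/2r example from above
```

Rebinding '<int>' or '<float>' with register() is what would change the
meaning of unsuffixed literals.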

The advantage of such a mechanism would be that a user could
easily change the literal semantics to his/her taste.

Note that I don't think that we really need a separate interpreter
just to add decimals or rationals to the core. All that is needed
is some easy way to construct these number objects without too
much programming overhead (i.e. number of keys to hit ;-).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From guido@zope.com  Fri Jul 27 14:08:51 2001
From: guido@zope.com (Guido van Rossum)
Date: Fri, 27 Jul 2001 09:08:51 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: Your message of "Fri, 27 Jul 2001 12:13:02 +0200."
 <3B613EAE.D1F4AEC4@lemburg.com>
References: <01072701443301.05085@localhost.localdomain>
 <3B613EAE.D1F4AEC4@lemburg.com>
Message-ID: <200107271308.JAA23972@cj20424-a.reston1.va.home.com>

> Just a suggestion which might also open the door for other numeric
> type extensions to play along nicely:
> 
> Would it make sense to have an extensible registry of constructors 
> for numeric types which maps number literal modifiers to constructors ?
> 
> I am thinking of
> 
> 123L -> long("123")
> 123i -> int("123")
> 123.45f -> float("123.45")
> 
> The registry would map 'L' to long(), 'i' to int(), 'f' to float()
> and be extensible in the sense, that e.g. an extension like
> mxNumber could register its own mappings which would make 
> the types defined in these extensions much more accessible
> without having to patch the interpreter. mxNumber for example could
> then register 'r' to map to mx.Number.Rational() and a user could
> then write 1/2r, which would map to 1 / mx.Number.Rational("2") and
> generate a Rational number object for 1/2.
> 
> The registry would have to be made smart enough to separate
> integer notations from floating point ones and use two separate
> default mappings for these, e.g. '<int>' -> int() and '<float>' ->
> float().
> 
> The advantage of such a mechanism would be that a user could
> easily change the literal semantics at his/her taste.
> 
> Note that I don't think that we really need a separate interpreter
> just to add decimals or rationals to the core. All that is needed
> is some easy way to construct these number objects without too
> much programming overhead (i.e. number of keys to hit ;-).

Funny, I had a similar idea today in the shower (always the best place
to think :-).  I'm not sure exactly how it would work yet --
currently, literals are converted to values at compile-time, so the
registry would have to be available to the compiler, but the concept
seems to make more sense if it is available and changeable at runtime.

Nevertheless, we should keep this in mind.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mclay@nist.gov  Fri Jul 27 14:21:01 2001
From: mclay@nist.gov (Michael McLay)
Date: Fri, 27 Jul 2001 09:21:01 -0400
Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python
In-Reply-To: <3B613EAE.D1F4AEC4@lemburg.com>
References: <01072701443301.05085@localhost.localdomain> <3B613EAE.D1F4AEC4@lemburg.com>
Message-ID: <01072708200303.02216@fermi.eeel.nist.gov>

On Friday 27 July 2001 06:13 am, M.-A. Lemburg wrote:
> Just a suggestion which might also open the door for other numeric
> type extensions to play along nicely:
>
> Would it make sense to have an extensible registry of constructors
> for numeric types which maps number literal modifiers to constructors ?
>
> I am thinking of
>
> 123L -> long("123")
> 123i -> int("123")
> 123.45f -> float("123.45")

With the changes made in the prototype this would be relatively easy to 
implement. 

Using an 'i' suffix could be confused with an imaginary number.  It would be 
very easy for someone to mistakenly type 12i instead of 12j and get an 
integer instead of an imaginary number.

The next implementation of my PEP will change the 'f' to a 'b', as in binary 
number.  The same suffix is used for both integer and float because they work 
together as a binary number implementation of numbers.  With the decimal 
number implementation there is only one type for both integer and float.

    123b -> int("123") 
    123.45b -> float("123.45")
>
> The registry would map 'L' to long(), 'i' to int(), 'f' to float()
> and be extensible in the sense, that e.g. an extension like
> mxNumber could register its own mappings which would make
> the types defined in these extensions much more accessible
> without having to patch the interpreter. mxNumber for example could
> then register 'r' to map to mx.Number.Rational() and a user could
> then write 1/2r, which would map to 1 / mx.Number.Rational("2") and
> generate a Rational number object for 1/2.
>
> The registry would have to be made smart enough to separate
> integer notations from floating point ones and use two separate
> default mappings for these, e.g. '<int>' -> int() and '<float>' ->
> float().

The tokenizer just passes a number with a suffix as a string to a function in 
compiler.c.  The number in the string could be any valid number, e.g. 123, 
123.45, 123.45e-3, or .123. The function processing the string then 
determines what type of number object to create based on the suffix.  It 
would be the responsibility of the function that processes the 'r' suffix to 
accept or reject the number encoded in the string. 

> The advantage of such a mechanism would be that a user could
> easily change the literal semantics at his/her taste.

>
> Note that I don't think that we really need a separate interpreter
> just to add decimals or rationals to the core. All that is needed
> is some easy way to construct these number objects without too
> much programming overhead (i.e. number of keys to hit ;-).

I wasn't suggesting creating a separate interpreter, I was suggesting adding 
a simple mechanism for allowing a new dialect of Python to be added to the 
existing interpreter.  This new dialect would be easier to use for certain 
types of programming activities.  The use of a decimal number type as the 
default type in this new language dialect is only one change that was 
proposed.  Another would be to use Unicode as the default character set.  
This would allow Unicode characters to be in strings without needing to 
escape them.  The proposal also suggests removing the tab character from 
indentation of blocks.  The goal is to create a language that would clean up 
some of the warts in the Python syntax and take advantage of the capabilities 
of modern IDE environments.

The idea of adding a new language on top of the existing infrastructure isn't 
that unusual. The gcc compiler can process many languages to produce a common 
machine-dependent object code.  I can envision taking my simple changes a few 
steps further and turning the entire tokenizer into a replaceable unit.  This 
approach would allow projects to build other languages on top of the Python 
byte code interpreter.  Imagine having Javascript, VBasic, or sh tokenizer 
frontends generating Python bytecodes.  Think of it as the pyNET 
architecture:-)  This change probably belongs in Python4k.

Perhaps the PEP should be split into two parts.  The first PEP would be to 
add decimal numbers with a 'd' suffix and also allow suffix characters to 
be added to the default float and integer types.  I think everyone agrees 
that this change is needed.

The second PEP will cover the proposed creation of the dpython dialect.  This 
PEP would be a container for proposed changes to the Python syntax that would 
make the language easier to teach to newbies and easier to use in a financial 
application.

Your suggestion to allow additional numerical types to be added by users 
would be included in the first PEP if the BDFL thinks this is a good idea.



From juergen.erhard@gmx.net  Fri Jul 27 10:46:42 2001
From: juergen.erhard@gmx.net ("Jürgen A. Erhard")
Date: Fri, 27 Jul 2001 11:46:42 +0200
Subject: [Python-Dev] Re: Future division patch available (PEP 238)
In-Reply-To: <m15PKAC-000wcBC@swing.co.at> (tanzer@swing.co.at)
References: <m15PKAC-000wcBC@swing.co.at>
Message-ID: <27072001.2@wanderer.local.jae.dyndns.org>

>>>>> "Christian" == Christian Tanzer <tanzer@swing.co.at> writes:

[snipperoonio... lots of interesting stuff about real-life and
seemingly highly dynamic Python deployments]

    Christian> To be honest, for TTTech design databases the change in
    Christian> division probably doesn't pose any problems. Due to
    Christian> user demand, the tools coerced divisions [in
    Christian> customer-written code] to floating point for a long
    Christian> time.

"Due to customer demand"... well, seems to me you have given great
support to PEP 238 with this.  ;-)

Bye, J

PS: No, I'm not seeing you in the (raving) anti-PEP-238 camp,
Christian.  Your post was much too level-headed for this confusion to
happen. ;-)

-- 
 Jürgen A. Erhard  (juergen.erhard@gmx.net, jae@users.sourceforge.net)
          My WebHome: http://members.tripod.com/Juergen_Erhard
    "Those who would give up essential Liberty, to purchase a little
 temporary Safety, deserve neither Liberty nor Safety." -- B. Franklin



From gward@python.net  Fri Jul 27 15:14:00 2001
From: gward@python.net (Greg Ward)
Date: Fri, 27 Jul 2001 10:14:00 -0400
Subject: [Python-Dev] Advice in stat.py
Message-ID: <20010727101400.A1016@gerg.ca>

stat.py in 2.2a1 starts with the following sage advice:

"""Constants/functions for interpreting results of os.stat() and os.lstat().

Suggested usage: from stat import *
"""

Is this still the suggested usage?

        Greg
-- 
Greg Ward - geek                                        gward@python.net
http://starship.python.net/~gward/
A man without religion is like a fish without a bicycle.


From fdrake@acm.org  Fri Jul 27 15:27:54 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 27 Jul 2001 10:27:54 -0400 (EDT)
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <20010727101400.A1016@gerg.ca>
References: <20010727101400.A1016@gerg.ca>
Message-ID: <15201.31338.499533.710253@cj42289-a.reston1.va.home.com>

Greg Ward writes:
 > Suggested usage: from stat import *
 > """
 > 
 > Is this still the suggested usage?

  Well, I would never suggest that, but I didn't write the stat
module.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From mal@lemburg.com  Fri Jul 27 15:27:43 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 27 Jul 2001 16:27:43 +0200
Subject: [Python-Dev] PEP for adding a decimal type to Python
References: <01072701443301.05085@localhost.localdomain>
 <3B613EAE.D1F4AEC4@lemburg.com> <200107271308.JAA23972@cj20424-a.reston1.va.home.com>
Message-ID: <3B617A5F.E193B2CC@lemburg.com>

Guido van Rossum wrote:
> 
> > Just a suggestion which might also open the door for other numeric
> > type extensions to play along nicely:
> >
> > Would it make sense to have an extensible registry of constructors
> > for numeric types which maps number literal modifiers to constructors ?
> >
> > I am thinking of
> >
> > 123L -> long("123")
> > 123i -> int("123")
> > 123.45f -> float("123.45")
> >
> > The registry would map 'L' to long(), 'i' to int(), 'f' to float()
> > and be extensible in the sense, that e.g. an extension like
> > mxNumber could register its own mappings which would make
> > the types defined in these extensions much more accessible
> > without having to patch the interpreter. mxNumber for example could
> > then register 'r' to map to mx.Number.Rational() and a user could
> > then write 1/2r, which would map to 1 / mx.Number.Rational("2") and
> > generate a Rational number object for 1/2.
> >
> > The registry would have to be made smart enough to separate
> > integer notations from floating point ones and use two separate
> > default mappings for these, e.g. '<int>' -> int() and '<float>' ->
> > float().
> >
> > The advantage of such a mechanism would be that a user could
> > easily change the literal semantics at his/her taste.
> >
> > Note that I don't think that we really need a separate interpreter
> > just to add decimals or rationals to the core. All that is needed
> > is some easy way to construct these number objects without too
> > much programming overhead (i.e. number of keys to hit ;-).
> 
> Funny, I had a similar idea today in the shower (always the best place
> to think :-).  I'm not sure exactly how it would work yet --
> currently, literals are converted to values at compile-time, so the
> registry would have to be available to the compiler, but the concept
> seems to make more sense if it is available and changeable at runtime.

True, but deferring the conversion to runtime (by e.g. using
literal descriptors ;-) would cause a significant slowdown.

So, I believe that the compiler would have to be told before starting
the compile process or within the process by looking at some magical
constant/comment in the source code (I think that this ought to be
a per-file overrideable setting, since some code may simply fail
to work if it suddenly starts to work with different types).
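
To make the per-file idea concrete, here is a sketch of how a compiler
front end might look for such a magic comment.  The comment syntax
"# number-literals: <mode>", the two-line limit and the function name
literal_mode are all invented for this example; nothing like this
exists in the interpreter:

```python
import re

# Hypothetical directive, e.g. "# number-literals: decimal".
_DIRECTIVE = re.compile(r"^#\s*number-literals:\s*(\w+)\s*$")


def literal_mode(source, default="binary", max_lines=2):
    """Return the literal mode declared near the top of a source text."""
    for line in source.splitlines()[:max_lines]:
        match = _DIRECTIVE.match(line.strip())
        if match:
            return match.group(1)
    # No directive: keep the classic behaviour, so existing files
    # continue to compile exactly as before.
    return default


mode = literal_mode("# number-literals: decimal\nprice = 19.99\n")
```

A file without the comment keeps the default, which is the per-file
override behaviour argued for above.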
 
> Nevertheless, we should keep this in mind.

I could reformat the above into a PEP or Michael could simply add
the idea as a section to his PEP.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From paul@pfdubois.com  Fri Jul 27 15:56:25 2001
From: paul@pfdubois.com (Paul F. Dubois)
Date: Fri, 27 Jul 2001 07:56:25 -0700
Subject: [Python-Dev] PEP for adding a decimal type to Python
Message-ID: <ADEOIFHFONCLEEPKCACCMEGJCLAA.paul@pfdubois.com>

In dpython, what is 2.0j? Is the "standard" way of writing complex numbers,
3.0 + 2.0j, valid?



From guido@zope.com  Fri Jul 27 16:47:13 2001
From: guido@zope.com (Guido van Rossum)
Date: Fri, 27 Jul 2001 11:47:13 -0400
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: Your message of "Fri, 27 Jul 2001 10:14:00 EDT."
 <20010727101400.A1016@gerg.ca>
References: <20010727101400.A1016@gerg.ca>
Message-ID: <200107271547.LAA24634@cj20424-a.reston1.va.home.com>

> stat.py in 2.2a1 starts with the following sage advice:
> 
> """Constants/functions for interpreting results of os.stat() and os.lstat().
> 
> Suggested usage: from stat import *
> """
> 
> Is this still the suggested usage?

I don't see why not.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Fri Jul 27 16:50:15 2001
From: guido@zope.com (Guido van Rossum)
Date: Fri, 27 Jul 2001 11:50:15 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: Your message of "Fri, 27 Jul 2001 16:27:43 +0200."
 <3B617A5F.E193B2CC@lemburg.com>
References: <01072701443301.05085@localhost.localdomain> <3B613EAE.D1F4AEC4@lemburg.com> <200107271308.JAA23972@cj20424-a.reston1.va.home.com>
 <3B617A5F.E193B2CC@lemburg.com>
Message-ID: <200107271550.LAA24662@cj20424-a.reston1.va.home.com>

> > Funny, I had a similar idea today in the shower (always the best place
> > to think :-).  I'm not sure exactly how it would work yet --
> > currently, literals are converted to values at compile-time, so the
> > registry would have to be available to the compiler, but the concept
> > seems to make more sense if it is available and changeable at runtime.
> 
> True, but deferring the conversion to runtime (by e.g. using
> literal descriptors ;-) would cause a significant slowdown.
> 
> So, I believe that the compiler would have to be told before starting
> the compile process or within the process by looking at some magical
> constant/comment in the source code (I think that this ought to be
> a per-file overrideable setting, since some code may simply fail
> to work if it suddenly starts to work with different types).

This may be the first place where a 'directive' statement actually
makes sense to me.

> > Nevertheless, we should keep this in mind.
> 
> I could reformat the above into a PEP or Michael could simply add
> the idea as a section to his PEP.

I'm not optimistic about Michael's PEP.  He seems to insist on a total
separation between decimal and binary numbers that I don't believe can
work.  I haven't replied to him yet because I can't explain it well
enough yet -- but I don't believe there's much of a future in his
particular idea.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Fri Jul 27 17:35:34 2001
From: guido@zope.com (Guido van Rossum)
Date: Fri, 27 Jul 2001 12:35:34 -0400
Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python
In-Reply-To: Your message of "Fri, 27 Jul 2001 09:21:01 EDT."
 <01072708200303.02216@fermi.eeel.nist.gov>
References: <01072701443301.05085@localhost.localdomain> <3B613EAE.D1F4AEC4@lemburg.com>
 <01072708200303.02216@fermi.eeel.nist.gov>
Message-ID: <200107271635.MAA25690@cj20424-a.reston1.va.home.com>

[me]
> > Note that I don't think that we really need a separate interpreter
> > just to add decimals or rationals to the core. All that is needed
> > is some easy way to construct these number objects without too
> > much programming overhead (i.e. number of keys to hit ;-).

[Michael]
> I wasn't suggesting creating a separate interpreter, I was
> suggesting adding a simple mechanism for allowing a new dialect of
> Python to be added to the existing interpreter.

Understood.  I see no big difference in having two binaries or one
binary with a command line option; the two binaries effectively
contain the same functionality, just with a different default.  I
would vote for one binary; if you really think it's too much for your
users to say "python -d" instead of "dpython", give them a script.  (I
know that the -d option currently means something else.  That's a
detail to worry about later.)

> This new dialect would be easier to use for certain types of
> programming activities.  The use of a decimal number type as the
> default type in this new language dialect is only one change that
> was proposed.

I'm not very fond of having multiple dialects.  There are lots of
contexts where the dialect in use is not explicitly mentioned
(e.g. when people discuss fragments of Python code).

> Another would be to use Unicode as the default character set.  This
> would allow Unicode characters to be in strings without needing to
> escape them.

That's not a dialect, that's a different input encoding.  MAL already
has a PEP for that.

> The proposal also suggests removing the tab character from
> indentation of blocks.  The goal is to create a language that would
> clean up some of the warts in the Python syntax and take advantage
> of the capabilities of modern IDE environments.

What does removing tab characters have to do with decimal numbers?
One topic per PEP, please!

> The idea of adding a new language on top of the existing
> infrastructure isn't that unusual. The gcc compiler can process many
> languages to produce a common machine-dependent object code.  I can
> envision taking my simple changes a few steps further and turning
> the entire tokenizer into a replaceable unit.  This approach would
> allow projects to build other languages on top of the Python byte
> code interpreter.  Imagine having Javascript, VBasic, or sh
> tokenizer frontends generating Python bytecodes.  Think of it as the
> pyNET architecture:-) This change probably belongs in Python4k.

Or in Python .NET.  Decoupling the various parts of the parse+compile
pipeline is something I've considered.

But again this has nothing to do with decimal numbers: your proposal
allows the mixing of decimal and binary numbers (as long as one of
them uses an explicit base indicator) so you don't really need two
parsers -- you need one tokenizer plus a way to specify the default
numeric base for literals.

> Perhaps the PEP should be split into two parts.  The first PEP would
> be to add decimal characters with a 'd' suffix and also allow suffix
> characters to be added to the default float and integer types.  I
> think everyone agrees that this change is needed.

It's needed *if* we agree that we need a decimal data type.

> The second PEP will cover the proposed creation of the dpython
> dialect.  This PEP would be a container for proposed changes to the
> Python syntax that would make the language easier to teach to
> newbies and easier to use in a financial application.

I'll have to go back to your defense of the two dialect approach, but
I think it's neither sufficient nor necessary.

> Your suggestion to allow additional numerical types to be added by
> users would be included in the first PEP if the BDFL thinks this is
> a good idea.

Well, sometimes more generality than you need hurts.  I'm not
convinced that we need an open-ended set of numeric literals.  But in
the light of the unified numeric model, we may need ways to make
exactness or inexactness explicit, and/or we may need a way to specify
rational numbers.  If we can fit all of these in the
number-with-letter-suffix mold, that would be nice for the lexer, I
suppose.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Fri Jul 27 17:38:33 2001
From: guido@zope.com (Guido van Rossum)
Date: Fri, 27 Jul 2001 12:38:33 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: Your message of "Fri, 27 Jul 2001 01:27:27 EDT."
 <LNBBLJKPBEHFEDALKOLCEEJLLBAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCEEJLLBAA.tim.one@home.com>
Message-ID: <200107271638.MAA25779@cj20424-a.reston1.va.home.com>

> I elect Michael <wink>.  Note there is *no* decimal PEP now -- not even a
> decimal PEP number assigned.

I thought all PEPs had decimal numbers? :)

> Aahz isn't going to write one, either.  I was hoping to write one
> instead if time allowed, but that looks increasingly unlikely by the
> hour.

I'm not sure that Michael would write the PEP we want.  You may be
"it", after all, if you want this done right.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward@python.net  Fri Jul 27 17:51:20 2001
From: gward@python.net (Greg Ward)
Date: Fri, 27 Jul 2001 12:51:20 -0400
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <200107271547.LAA24634@cj20424-a.reston1.va.home.com>; from guido@zope.com on Fri, Jul 27, 2001 at 11:47:13AM -0400
References: <20010727101400.A1016@gerg.ca> <200107271547.LAA24634@cj20424-a.reston1.va.home.com>
Message-ID: <20010727125120.O677@gerg.ca>

On 27 July 2001, Guido van Rossum said:
> > stat.py in 2.2a1 starts with the following sage advice:
> > 
> > """Constants/functions for interpreting results of os.stat() and os.lstat().
> > 
> > Suggested usage: from stat import *
> > """
> > 
> > Is this still the suggested usage?
> 
> I don't see why not.

My understanding was that it's generally considered Bad Form to do this
at module level, while doing it at function level is tricky (or a
performance hit? whatever...) because of nested scopes.

        Greg
-- 
Greg Ward - Unix geek                                   gward@python.net
http://starship.python.net/~gward/
No animals were harmed in transmitting this message.


From guido@zope.com  Fri Jul 27 17:57:00 2001
From: guido@zope.com (Guido van Rossum)
Date: Fri, 27 Jul 2001 12:57:00 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: Your message of "Thu, 26 Jul 2001 17:38:27 EDT."
 <01072617382702.02216@fermi.eeel.nist.gov>
References: <01072617382702.02216@fermi.eeel.nist.gov>
Message-ID: <200107271657.MAA26156@cj20424-a.reston1.va.home.com>

Michael,

Your PEP doesn't spell out what happens when a binary and a decimal
number are the input for a numerical operator.  I believe you said
that this would be an unconditional error.

But I foresee serious problems.  Most standard library modules use
numbers.  Most of the modules using numbers occasionally use a literal
(e.g. 0 or 1).  According to your PEP, literals in module files ending
with .py default to binary.  This means that almost any use of a
standard library module from your "dpython" will fail as soon as a
literal is used.

I can't believe that this will work satisfactorily.

Another example of the kind of problem your approach runs into: what
should the type of len("abc") be?  3d or 3b?  Should it depend on the
default mode?

I suppose sequence indexing has to accept decimal as well as binary
integers as indexes -- certainly in a decimal program you will want to
be able to use decimal integers for indexes.
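
To make the concern concrete, here is a small sketch using the decimal
module that was added to the standard library later (in Python 2.4).
It is not the type proposed in the PEP under discussion, only an
illustration of where a strict no-mixing rule bites in ordinary code;
the helper try_op is hypothetical:

```python
from decimal import Decimal

def try_op(func):
    """Return the result of func(), or the name of the exception it raised."""
    try:
        return func()
    except Exception as exc:
        return type(exc).__name__

# Decimal refuses to mix with binary floats in arithmetic ...
mixed_float = try_op(lambda: Decimal("1.5") + 1.5)
# ... and cannot be used as a sequence index ...
decimal_index = try_op(lambda: "abc"[Decimal(1)])
# ... but it does interoperate with plain ints such as len("abc").
mixed_int = Decimal("1.5") + len("abc")
```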

The whole thing seems screwed.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Fri Jul 27 18:14:36 2001
From: guido@zope.com (Guido van Rossum)
Date: Fri, 27 Jul 2001 13:14:36 -0400
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: Your message of "Fri, 27 Jul 2001 12:51:20 EDT."
 <20010727125120.O677@gerg.ca>
References: <20010727101400.A1016@gerg.ca> <200107271547.LAA24634@cj20424-a.reston1.va.home.com>
 <20010727125120.O677@gerg.ca>
Message-ID: <200107271714.NAA26383@cj20424-a.reston1.va.home.com>

> > > Suggested usage: from stat import *
> > > 
> > > Is this still the suggested usage?
> > 
> > I don't see why not.
> 
> My understanding was that it's generally considered Bad Form to do this
> at module level, while doing it at function level is tricky (or a
> performance hit? whatever...) because of nested scopes.

Generally yes, but there's an explicit disclaimer "unless the module
is written for this".  And stat.py is (hence the recommendation in the
docstring).

Inside a function, from ... import * is always bad form.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@zope.com  Fri Jul 27 18:12:54 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 27 Jul 2001 13:12:54 -0400
Subject: [Python-Dev] Advice in stat.py
References: <20010727101400.A1016@gerg.ca>
 <200107271547.LAA24634@cj20424-a.reston1.va.home.com>
 <20010727125120.O677@gerg.ca>
Message-ID: <15201.41238.543442.486438@anthem.wooz.org>

>>>>> "GW" == Greg Ward <gward@python.net> writes:

    GW> My understanding was that it's generally considered Bad Form
    GW> to do this at module level, while doing it at function level
    GW> is tricky (or a performance hit? whatever...) because of
    GW> nested scopes.

Yes, but some modules are designed for from-import-* so they're less
evil.  types, stat, and Tkinter are the three most common ones for
me.  Usually though if I'm importing fewer than about 3 symbols, I'll
import them explicitly.
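
For reference, the explicit-import style looks roughly like this (a
small sketch; the star-import form from the docstring would make the
same names available without listing them):

```python
import os
from stat import S_ISDIR, S_ISREG

# Inspect the mode bits of the current directory.
mode = os.stat(os.curdir).st_mode
is_dir = S_ISDIR(mode)   # the current directory is a directory
is_reg = S_ISREG(mode)   # and therefore not a regular file
```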

-Barry



From jepler@inetnebr.com  Fri Jul 27 19:59:16 2001
From: jepler@inetnebr.com (Jeff Epler)
Date: Fri, 27 Jul 2001 13:59:16 -0500
Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python
In-Reply-To: <200107271635.MAA25690@cj20424-a.reston1.va.home.com>; from guido@zope.com on Fri, Jul 27, 2001 at 12:35:34PM -0400
References: <01072701443301.05085@localhost.localdomain> <3B613EAE.D1F4AEC4@lemburg.com> <01072708200303.02216@fermi.eeel.nist.gov> <200107271635.MAA25690@cj20424-a.reston1.va.home.com>
Message-ID: <20010727135914.B18280@inetnebr.com>

On Fri, Jul 27, 2001 at 12:35:34PM -0400, Guido van Rossum wrote:
> But again this has nothing to do with decimal numbers: your proposal
> allows the mixing of decimal and binary numbers (as long as one of
> them uses an explicit base indicator) so you don't really need two
> parsers -- you need one tokenizer plus a way to specify the default
> numeric base for literals.

If this were possible, then could it be a per-module decision what "1/2"
produces, depending whether unadorned whole-number literals correspond
to ClassicInt or NewInt?

That sounds miles better than writing "1//2" to me.

Jeff


From mclay@nist.gov  Fri Jul 27 20:51:38 2001
From: mclay@nist.gov (Michael McLay)
Date: Fri, 27 Jul 2001 15:51:38 -0400
Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python
In-Reply-To: <200107271635.MAA25690@cj20424-a.reston1.va.home.com>
References: <01072701443301.05085@localhost.localdomain> <01072708200303.02216@fermi.eeel.nist.gov> <200107271635.MAA25690@cj20424-a.reston1.va.home.com>
Message-ID: <01072715513805.02216@fermi.eeel.nist.gov>

On Friday 27 July 2001 12:35 pm, Guido van Rossum wrote:
> [me]
>
> > I wasn't suggesting creating a separate interpreter, I was
> > suggesting adding a simple mechanism for allowing a new dialect of
> > Python to be added to the existing interpreter.
>
> Understood.  I see no big difference in having two binaries or one
> binary with a command line option; the two binaries effectively
> contain the same functionality, just with a different default.  I
> would vote for one binary; if you really think it's too much for your
> users to say "python -d" instead of "dpython", give them a script.  (I
> know that the -d option currently means something else.  That's a
> detail to worry about later.)

I decided to use a symbolic link to a different command name to set the
default encoding of numerical literals. I did this because referring to the
'dpython' command is more concise than "python -d".  The executable could also
have command options to select between python and dpython modes.

> I'm not very fond of having multiple dialects.  There are lots of
> contexts where the dialect in use is not explicitly mentioned
> (e.g. when people discuss fragments of Python code).

I'm not fond of dialects when they don't serve a significant purpose.  
However, I believe it would be useful to at least discuss creating a special 
purpose "safe" mode for the Python lexer.  This mode would be attractive to 
newbies and financial programmers.  Calling this a new dialect is an 
overstatement.  It is more like defining a subset of the language that uses a 
special vocabulary for working with decimal types.

> > Another would be to use Unicode as the default character set.  This
> > would allow Unicode characters to be in strings without needing to
> > escape them.
>
> That's not a dialect, that's a different input encoding.  MAL already
> has a PEP for that.

I know about the PEP.  I was referring to making it the default string type
for a '.dp' file.  There would be no prefix 'u' required.

I'll remove this and the other unrelated items from the decimal type PEP.

If you don't agree with the idea of adding a dpython lexer mode then there is
no point in discussing the features that would be in that mode.

> > The idea of adding a new language on top of the existing
> > infrastructure isn't that unusual. The gcc compiler can process many
> > languages to produce a common machine-dependent object code.  I can
> > envision taking my simple changes a few steps further and turning
> > the entire tokenizer into a replaceable unit.  This approach would
> > allow projects to build other languages on top of the Python byte
> > code interpreter.  Imagine having Javascript, VBasic, or sh
> > tokenizer frontends generating Python bytecodes.  Think of it as the
> > pyNET architecture:-) This change probably belongs in Python4k.
>
> Or in Python .NET.  Decoupling the various parts of the parse+compile
> pipeline is something I've considered.

Did you decide against it, or has it just not been a high enough priority?

> But again this has nothing to do with decimal numbers: your proposal
> allows the mixing of decimal and binary numbers (as long as one of
> them uses an explicit base indicator) so you don't really need two
> parsers -- you need one tokenizer plus a way to specify the default
> numeric base for literals.

That is exactly what I implemented.  The dpython command and the '.dp'
extension cause the Py_USE_DECIMAL_AS_DEFAULT[1] flag to be set.  When this
flag is set, decimal numbers are used for literals.

>
> I'll have to go back to your defense of the two dialect approach, but
> I think it's neither sufficient nor necessary.

I have mixed too many ideas into a PEP.  I'll rework the PEP to remove the
cruft and focus on the addition of decimal numbers.  I'll move the other ideas
into a separate PEP.

> Well, sometimes more generality than you need hurts.  I'm not
> convinced that we need an open-ended set of numeric literals.  But in
> the light of the unified numeric model, we may need ways to make
> exactness or inexactness explicit, and/or we may need a way to specify
> rational numbers.  If we can fit all of these in the
> number-with-letter-suffix mold, that would be nice for the lexer, I
> suppose.

I worry about a "unified numerical model" getting overly complex.  I think
decimal numbers help because they are a better choice than binary numbers for
a significant percentage of all software applications.  I know that rational
numbers are important in some applications.  Am I overlooking some huge class
of applications that use rationals?  While Tim and some of the other
Pythoneers can probably think of dozens of specialized numerical types, I
would venture to guess that binary types and a decimal type probably cover
90% of all users' requirements.



[1] I'll be renaming the flag to this in the next version.  The flag is
currently called Py_NEW_PARSER.  I named it that because at one time I was
creating a new parser.  I trimmed the changes down to just a few edits of the
tokenizer and compile.c.



From guido@zope.com  Fri Jul 27 20:49:57 2001
From: guido@zope.com (Guido van Rossum)
Date: Fri, 27 Jul 2001 15:49:57 -0400
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
Message-ID: <200107271949.PAA27171@cj20424-a.reston1.va.home.com>

Here's a new revision of PEP 238.  I've incorporated clarifications of
issues that were brought up during the discussion of rev 1.10 -- from
typos via rewording of ambiguous phrasing to the addition of new open
issues.  I've decided not to go for the "quotient and ratio"
terminology -- my rationale is in the PEP.

I'm posting this also to c.l.py and c.l.py.a, to make sure enough
people see it.  Feel free to discuss it either in c.l.py or here in
python-dev, but please don't change the subject.

--Guido van Rossum (home page: http://www.python.org/~guido/)

PEP: 238
Title: Changing the Division Operator
Version: $Revision: 1.12 $
Author: pep@zadka.site.co.il (Moshe Zadka), guido@python.org (Guido van Rossum)
Status: Draft
Type: Standards Track
Created: 11-Mar-2001
Python-Version: 2.2
Post-History: 16-Mar-2001, 26-Jul-2001, 27-Jul-2001


Abstract

    The current division (/) operator has an ambiguous meaning for
    numerical arguments: it returns the floor of the mathematical
    result of division if the arguments are ints or longs, but it
    returns a reasonable approximation of the division result if the
    arguments are floats or complex.  This makes expressions expecting
    float or complex results error-prone when integers are not
    expected but possible as inputs.

    We propose to fix this by introducing different operators for
    different operations: x/y to return a reasonable approximation of
    the mathematical result of the division ("true division"), x//y to
    return the floor ("floor division").  We call the current, mixed
    meaning of x/y "classic division".

    Because of severe backwards compatibility issues, not to mention a
    major flamewar on c.l.py, we propose the following transitional
    measures (starting with Python 2.2):

    - Classic division will remain the default in the Python 2.x
      series; true division will be standard in Python 3.0.

    - The // operator will be available to request floor division
      unambiguously.

    - The future division statement, spelled "from __future__ import
      division", will change the / operator to mean true division
      throughout the module.

    - A command line option will enable run-time warnings for classic
      division applied to int or long arguments; another command line
      option will make true division the default.

    - The standard library will use the future division statement and
      the // operator when appropriate, so as to completely avoid
      classic division.
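
    The two operators can be sketched as follows.  This is an
    illustration only: the function name mean is an assumption, and
    the code is written so that it also runs on an interpreter where
    true division is already the default.

```python
# On Python 2.2+ the first statement of the module would need to be:
#     from __future__ import division
# On Python 3 true division is already the default for /.

true_quotient = 7 / 2     # true division: a reasonable approximation
floor_quotient = 7 // 2   # floor division: requested unambiguously

def mean(values):
    """Average that gives the same answer for int and float inputs."""
    return sum(values) / len(values)
```

    Under classic division, mean([1, 2]) would silently return 1
    while mean([1.0, 2.0]) returns 1.5.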


Motivation

    The classic division operator makes it hard to write numerical
    expressions that are supposed to give correct results from
    arbitrary numerical inputs.  For all other operators, one can
    write down a formula such as x*y**2 + z, and the calculated result
    will be close to the mathematical result (within the limits of
    numerical accuracy, of course) for any numerical input type (int,
    long, float, or complex).  But division poses a problem: if the
    expressions for both arguments happen to have an integral type, it
    implements floor division rather than true division.

    The problem is unique to dynamically typed languages: in a
    statically typed language like C, the inputs, typically function
    arguments, would be declared as double or float, and when a call
    passes an integer argument, it is converted to double or float at
    the time of the call.  Python doesn't have argument type
    declarations, so integer arguments can easily find their way into
    an expression.

    The problem is particularly pernicious since ints are perfect
    substitutes for floats in all other circumstances: math.sqrt(2)
    returns the same value as math.sqrt(2.0), 3.14*100 and 3.14*100.0
    return the same value, and so on.  Thus, the author of a numerical
    routine may only use floating point numbers to test his code, and
    believe that it works correctly, and a user may accidentally pass
    in an integer input value and get incorrect results.

    Another way to look at this is that classic division makes it
    difficult to write polymorphic functions that work well with
    either float or int arguments; all other operators already do the
    right thing.  No algorithm that works for both ints and floats has
    a need for truncating division in one case and true division in
    the other.

    The correct work-around is subtle: casting an argument to float()
    is wrong if it could be a complex number; adding 0.0 to an
    argument doesn't preserve the sign of the argument if it was minus
    zero.  The only solution without either downside is multiplying an
    argument (typically the first) by 1.0.  This leaves the value and
    sign unchanged for float and complex, and turns int and long into
    a float with the corresponding value.
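
    The three candidate work-arounds can be compared directly.  A
    minimal sketch, in which the helper name to_inexact is
    hypothetical:

```python
import math

def to_inexact(x):
    """Multiply by 1.0: ints become floats, floats and complex are unchanged."""
    return x * 1.0

neg_zero = -0.0

# Adding 0.0 loses the sign of minus zero; multiplying by 1.0 keeps it.
sign_after_add = math.copysign(1.0, neg_zero + 0.0)
sign_after_mul = math.copysign(1.0, to_inexact(neg_zero))

def float_rejects_complex():
    """True if float() refuses a complex argument."""
    try:
        float(1 + 2j)
    except TypeError:
        return True
    return False
```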

    It is the opinion of the authors that this is a real design bug in
    Python, and that it should be fixed sooner rather than later.
    Assuming Python usage will continue to grow, the cost of leaving
    this bug in the language will eventually outweigh the cost of
    fixing old code -- there is an upper bound to the amount of code
    to be fixed, but the amount of code that might be affected by the
    bug in the future is unbounded.

    Another reason for this change is the desire to ultimately unify
    Python's numeric model.  This is the subject of PEP 228[0] (which
    is currently incomplete).  A unified numeric model removes most of
    the user's need to be aware of different numerical types.  This is
    good for beginners, but also takes away concerns about different
    numeric behavior for advanced programmers.  (Of course, it won't
    remove concerns about numerical stability and accuracy.)

    In a unified numeric model, the different types (int, long, float,
    complex, and possibly others, such as a new rational type) serve
    mostly as storage optimizations, and to some extent to indicate
    orthogonal properties such as inexactness or complexity.  In a
    unified model, the integer 1 should be indistinguishable from the
    floating point number 1.0 (except for its inexactness), and both
    should behave the same in all numeric contexts.  Clearly, in a
    unified numeric model, if a==b and c==d, a/c should equal b/d
    (taking some liberties due to rounding for inexact numbers), and
    since everybody agrees that 1.0/2.0 equals 0.5, 1/2 should also
    equal 0.5.  Likewise, since 1//2 equals zero, 1.0//2.0 should also
    equal zero.


Variations

    Aesthetically, x//y doesn't please everyone, and hence several
    variations have been proposed: x div y, or div(x, y), sometimes in
    combination with x mod y or mod(x, y) as an alternative spelling
    for x%y.

    We consider these solutions inferior, on the following grounds.

    - Using x div y would introduce a new keyword.  Since div is a
      popular identifier, this would break a fair amount of existing
      code, unless the new keyword was only recognized under a future
      division statement.  Since it is expected that the majority of
      code that needs to be converted is dividing integers, this would
      greatly increase the need for the future division statement.
      Even with a future statement, the general sentiment against
      adding new keywords unless absolutely necessary argues against
      this.

    - Using div(x, y) makes the conversion of old code much harder.
      Replacing x/y with x//y or x div y can be done with a simple
      query replace; in most cases the programmer can easily verify
      that a particular module only works with integers so all
      occurrences of x/y can be replaced.  (The query replace is still
      needed to weed out slashes occurring in comments or string
      literals.)  Replacing x/y with div(x, y) would require a much
      more intelligent tool, since the extent of the expressions to
      the left and right of the / must be analyzed before the
      placement of the "div(" and ")" part can be decided.


Alternatives

    In order to reduce the amount of old code that needs to be
    converted, several alternative proposals have been put forth.
    Here is a brief discussion of each proposal (or category of
    proposals).  If you know of an alternative that was discussed on
    c.l.py that isn't mentioned here, please mail the second author.

    - Let / keep its classic semantics; introduce // for true
      division.  This still leaves a broken operator in the language,
      and invites use of the broken behavior.  It also shuts off the
      road to a unified numeric model a la PEP 228[0].

    - Let int division return a special "portmanteau" type that
      behaves as an integer in integer context, but like a float in a
      float context.  The problem with this is that after a few
      operations, the int and the float value could be miles apart,
      it's unclear which value should be used in comparisons, and of
      course many contexts (like conversion to string) don't have a
      clear integer or float context.

    - Use a directive to use specific division semantics in a module,
      rather than a future statement.  This retains classic division
      as a permanent wart in the language, requiring future
      generations of Python programmers to be aware of the problem and
      the remedies.

    - Use "from __past__ import division" to use classic division
      semantics in a module.  This also retains the classic division
      as a permanent wart, or at least for a long time (eventually the
      past division statement could raise an ImportError).

    - Use a directive (or some other way) to specify the Python
      version for which a specific piece of code was developed.  This
      requires future Python interpreters to be able to emulate
      *exactly* several previous versions of Python, and moreover to
      do so for multiple versions within the same interpreter.  This
      is way too much work.  A much simpler solution is to keep
      multiple interpreters installed.


API Changes

    During the transitional phase, we have to support *three* division
    operators within the same program: classic division (for / in
    modules without a future division statement), true division (for /
    in modules with a future division statement), and floor division
    (for //).  Each operator comes in two flavors: regular, and as an
    augmented assignment operator (/= or //=).

    The names associated with these variations are:

    - Overloaded operator methods:

      __div__(), __floordiv__(), __truediv__();

      __idiv__(), __ifloordiv__(), __itruediv__().

    - Abstract API C functions:

      PyNumber_Divide(), PyNumber_FloorDivide(),
      PyNumber_TrueDivide();

      PyNumber_InPlaceDivide(), PyNumber_InPlaceFloorDivide(),
      PyNumber_InPlaceTrueDivide().

    - Byte code opcodes:

      BINARY_DIVIDE, BINARY_FLOOR_DIVIDE, BINARY_TRUE_DIVIDE;

      INPLACE_DIVIDE, INPLACE_FLOOR_DIVIDE, INPLACE_TRUE_DIVIDE.

    - PyNumberMethod slots:

      nb_divide, nb_floor_divide, nb_true_divide,

      nb_inplace_divide, nb_inplace_floor_divide,
      nb_inplace_true_divide.

    The added PyNumberMethod slots require an additional flag in
    tp_flags; this flag will be named Py_TPFLAGS_HAVE_NEWDIVIDE and
    will be included in Py_TPFLAGS_DEFAULT.

    The true and floor division APIs will look for the corresponding
    slots and call that; when that slot is NULL, they will raise an
    exception.  There is no fallback to the classic divide slot.

    In Python 3.0, the classic division semantics will be removed; the
    classic division APIs will become synonymous with true division.
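
    A user-defined class would hook into the new operators roughly as
    follows.  This is a sketch with a hypothetical Meters class; it
    defines only the true and floor division methods, so it assumes an
    interpreter where / means true division (__div__ would be needed
    as well for modules still using classic division).

```python
class Meters:
    """Toy length type overloading the true and floor division operators."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):       # called for x / y
        return Meters(self.value / other)

    def __floordiv__(self, other):      # called for x // y
        return Meters(self.value // other)

    def __ifloordiv__(self, other):     # called for x //= y
        self.value //= other
        return self

half = Meters(7) / 2
whole = Meters(7) // 2
length = Meters(9)
length //= 2
```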


Command Line Option

    The -D command line option takes a string argument that can take
    three values: "old", "warn", or "new".  The default is "old" in
    Python 2.2 but will change to "warn" in later 2.x versions.  The
    "old" value means the classic division operator acts as described.
    The "warn" value means the classic division operator issues a
    warning (a DeprecationWarning using the standard warning
    framework) when applied to ints or longs.  The "new" value changes
    the default globally so that the / operator is always interpreted
    as true division.  The "new" option is only intended for use in
    certain educational environments, where true division is required,
    but asking the students to include the future division statement
    in all their code would be a problem.
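
    The three values can be modelled in plain Python.  The sketch
    below is only an emulation: slash() and its mode argument are
    hypothetical names, and it is written for an interpreter where /
    already means true division, so the classic behaviour for ints is
    spelled with //.

```python
import warnings


def slash(a, b, mode="old"):
    """Emulate what `a / b` would do under -Dold, -Dwarn or -Dnew.

    Hypothetical helper for illustration only; the real switch lives in
    the interpreter, not in a function argument.
    """
    if mode not in ("old", "warn", "new"):
        raise ValueError("mode must be 'old', 'warn' or 'new'")
    both_ints = isinstance(a, int) and isinstance(b, int)
    if mode == "new" or not both_ints:
        return a / b            # true division
    if mode == "warn":
        warnings.warn("classic int division", DeprecationWarning,
                      stacklevel=2)
    return a // b               # classic division of ints floors
```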

    This option will not be supported in Python 3.0; Python 3.0 will
    always interpret / as true division.

    (Other names have been proposed, like -Dclassic, -Dclassic-warn,
    -Dtrue, or -Dold_division etc.; these seem more verbose to me
    without much advantage.  After all the term classic division is
    not used in the language at all (only in the PEP), and the term
    true division is rarely used in the language -- only in
    __truediv__.)


Semantics of Floor Division

    Floor division will be implemented in all the Python numeric
    types, and will have the semantics of

        a // b == floor(a/b)

    except that the result type will be the common type into which a
    and b are coerced before the operation.

    Specifically, if a and b are of the same type, a//b will be of
    that type too.  If the inputs are of different types, they are
    first coerced to a common type using the same rules used for all
    other arithmetic operators.

    In particular, if a and b are both ints or longs, the result has
    the same type and value as for classic division on these types
    (including the case of mixed input types; int//long and long//int
    will both return a long).

    For floating point inputs, the result is a float.  For example:

      3.5//2.0 == 1.0

    For complex numbers, // raises an exception, since float() of a
    complex number is not allowed.

    For user-defined classes and extension types, all semantics are up
    to the implementation of the class or type.
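
    The rules above can be checked with a few assertions.  This is a
    sketch for a Python in which // is available and int and long are
    a single type; nothing in it is specific to this PEP's
    implementation.

```python
import math

# ints: same type in, same type out, rounding toward negative infinity
assert 7 // 2 == 3 and isinstance(7 // 2, int)
assert -7 // 2 == -4 == math.floor(-7 / 2.0)

# floats: the value is floored but the result stays a float
assert 3.5 // 2.0 == 1.0 and isinstance(3.5 // 2.0, float)

# mixed int/float operands are coerced to float first
assert isinstance(7 // 2.0, float)

# complex operands are rejected
try:
    (1 + 2j) // 1
    complex_ok = True
except TypeError:
    complex_ok = False
```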


Semantics of True Division

    True division for ints and longs will convert the arguments to
    float and then apply a float division.  That is, even 2/1 will
    return a float (2.0), not an int.  For floats and complex, it will
    be the same as classic division.

    Note that for long arguments, true division may lose information;
    this is in the nature of true division (as long as rationals are
    not in the language).  Algorithms that consciously use longs
    should consider using //.

    If and when a rational type is added to Python (see PEP 239[2]),
    true division for ints and longs should probably return a
    rational.  This avoids the problem with true division of longs
    losing information.  But until then, for consistency, float is the
    only choice for true division.
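
    A short sketch of the information loss, assuming a Python where /
    on ints is already true division.  The Fraction class used for
    contrast comes from the standard library of later Python versions;
    it is not the rational type of PEP 239 and stands in for it here
    only as an assumption.

```python
from fractions import Fraction

assert 2 / 1 == 2.0 and isinstance(2 / 1, float)

n = 2**53 + 1                 # not exactly representable as a float
assert int(n / 1) != n        # true division rounded the value
assert n // 1 == n            # floor division stays exact

# A rational result would keep the information
half = Fraction(1, 2)
assert Fraction(n, 1) == n
```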


The Future Division Statement

    If "from __future__ import division" is present in a module, or if
    -Dnew is used, the / and /= operators are translated to true
    division opcodes; otherwise they are translated to classic
    division (until Python 3.0 comes along, where they are always
    translated to true division).

    The future division statement has no effect on the recognition or
    translation of // and //=.

    See PEP 236[4] for the general rules for future statements.

    (It has been proposed to use a longer phrase, like "true_division"
    or "modern_division".  These don't seem to add much information.)


Open Issues

    - It has been proposed to call // the quotient operator, and the /
      operator the ratio operator.  I'm not sure about this -- for
      some people quotient is just a synonym for division, and ratio
      suggests rational numbers, which is wrong.  I prefer the
      terminology to be slightly awkward if that avoids ambiguity.
      Also, for some folks "quotient" suggests truncation towards
      zero, not towards negative infinity as "floor division" says
      explicitly.

    - It has been argued that a command line option to change the
      default is evil.  It can certainly be dangerous in the wrong
      hands: for example, it would be impossible to combine a 3rd
      party library package that requires -Dnew with another one that
      requires -Dold.  But I believe that the VPython folks need a way
      to enable true division by default, and other educators might
      need the same.  These usually have enough control over the
      library packages available in their environment.

    - For very large long integers, the definition of true division as
      returning a float causes problems, since the range of Python
      longs is much larger than that of Python floats.  This problem
      will disappear if and when rational numbers are supported.  In
      the interim, maybe the long-to-float conversion could be made to
      raise OverflowError if the long is out of range.
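
    For reference, a sketch of the out-of-range case on a Python where
    / is true division and the conversion does raise OverflowError, as
    the last item suggests.  The variable names are illustrative only.

```python
huge = 10**400                # far beyond the largest float (about 1.8e308)

try:
    huge / 1
    overflowed = False
except OverflowError:
    overflowed = True

exact = huge // 7             # floor division never leaves the integers
```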


FAQ

    Q. Why isn't true division called float division?

    A. Because I want to keep the door open to *possibly* introducing
       rationals and making 1/2 return a rational rather than a
       float.  See PEP 239[2].

    Q. Why is there a need for __truediv__ and __itruediv__?

    A. We don't want to make user-defined classes second-class
       citizens.  Certainly not with the type/class unification going
       on.

    Q. How do I write code that works under the classic rules as well
       as under the new rules without using // or a future division
       statement?

    A. Use x*1.0/y for true division, divmod(x, y)[0] for int
       division.  Especially the latter is best hidden inside a
       function.  You may also write float(x)/y for true division if
       you are sure that you don't expect complex numbers.  If you
       know your integers are never negative, you can use int(x/y) --
       while the documentation of int() says that int() can round or
       truncate depending on the C implementation, we know of no C
       implementation that doesn't truncate, and we're going to change
       the spec for int() to promise truncation.  Note that for
       negative ints, classic division (and floor division) round
       towards negative infinity, while int() rounds towards zero.
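
       The advice above can be wrapped in two small helpers.  The
       names true_div and floor_div are made up for this sketch; the
       bodies use only the expressions given in the answer.

```python
def true_div(x, y):
    """True division without relying on what / does for ints."""
    return x * 1.0 / y


def floor_div(x, y):
    """Floor division without the // operator."""
    return divmod(x, y)[0]


assert true_div(1, 2) == 0.5
assert floor_div(7, 2) == 3
assert floor_div(-7, 2) == -4          # rounds toward negative infinity
assert int(true_div(-7, 2)) == -3      # int() truncates toward zero
```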

    Q. How do I specify the division semantics for input(), compile(),
       execfile(), eval() and exec?

    A. They inherit the choice from the invoking module.  PEP 236[4]
       lists this as a partially resolved problem.
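
       One way to pin the semantics explicitly rather than inherit
       them, assuming a Python whose compile() accepts the flags and
       dont_inherit arguments and whose __future__ module exposes
       compiler_flag on each feature.  The PEP text itself does not
       describe this; treat it as a sketch.

```python
import __future__

flag = __future__.division.compiler_flag   # the CO_FUTURE_DIVISION bit

# dont_inherit=True ignores future statements of the calling module, so
# only the flags passed explicitly apply to the compiled source.
code = compile("7 / 2", "<example>", "eval", flag, True)
result = eval(code)
```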

    Q. What about code compiled by the codeop module?

    A. Alas, this will always use the default semantics (set by the -D
       command line option).  This is a general problem with the
       future statement; PEP 236[4] lists it as an unresolved
       problem.  You could have your own clone of codeop.py that
       includes a future division statement, but that's not a general
       solution.

    Q. Will there be conversion tools or aids?

    A. Certainly, but these are outside the scope of the PEP.

    Q. Why is my question not answered here?

    A. Because we weren't aware of it.  If it's been discussed on
       c.l.py and you believe the answer is of general interest,
       please notify the second author.  (We don't have the time or
       inclination to answer every question sent in private email,
       hence the requirement that it be discussed on c.l.py first.)


Implementation

    A very early implementation (not yet following the above spec, but
    supporting // and the future division statement) is available from
    the SourceForge patch manager[5].


References

    [0] PEP 228, Reworking Python's Numeric Model
        http://www.python.org/peps/pep-0228.html

    [1] PEP 237, Unifying Long Integers and Integers, Zadka,
        http://www.python.org/peps/pep-0237.html

    [2] PEP 239, Adding a Rational Type to Python, Zadka,
        http://www.python.org/peps/pep-0239.html

    [3] PEP 240, Adding a Rational Literal to Python, Zadka,
        http://www.python.org/peps/pep-0240.html

    [4] PEP 236, Back to the __future__, Peters,
        http://www.python.org/peps/pep-0236.html

    [5] Patch 443474, from __future__ import division
        http://sourceforge.net/tracker/index.php?func=detail&aid=443474&group_id=5470&atid=305470


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:


From mclay@nist.gov  Fri Jul 27 21:30:56 2001
From: mclay@nist.gov (Michael McLay)
Date: Fri, 27 Jul 2001 16:30:56 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
Message-ID: <01072716305607.02216@fermi.eeel.nist.gov>

On Friday 27 July 2001 12:57 pm, Guido van Rossum wrote:
> Michael,
>
> Your PEP doesn't spell out what happens when a binary and a decimal
> number are the input for a numerical operator.  I believe you said
> that this would be an unconditional error.
>
> But I foresee serious problems.  Most standard library modules use
> numbers.  Most of the modules using numbers occasionally use a literal
> (e.g. 0 or 1).  According to your PEP, literals in module files ending
> with .py default to binary.  This means that almost any use of a
> standard library module from your "dpython" will fail as soon as a
> literal is used.

No, because the '.py' file will generate bytecodes for number literals as
binary numbers when the module is compiled.  If a '.dp' file imports the
contents of a '.py' file the binary numbers will be imported as binary
numbers.  If the '.dp' file needs to use a binary number in a
calculation with a decimal number, the binary number will have to be cast
to a decimal number.

---------------------
#gui.py
BLUE = 155
x_axis = 1024
y_axis = 768

--------------------
#calculator.dp
import gui
ytd_interest = 0.04
# ytd_interest is now a decimal number
win = gui.open_window(gui.bg, x_size=gui.x_axis, y_size=gui.y_axis)
app = win.dialog("Bank Balance", bankbalance_callback)
bb  = app.get_bankbalance()
# bb now contains a string
newbalance = decimal(bb) *ytd_interest
# now update the display
app.set_bankbalance(str(newbalance))

-------------------

In the example the numbers from the gui module were used in the calculator
module, but they were always handled as binary numbers.  The parser did not
convert them to decimal numbers because they had been parsed into a gui.pyc
file prior to being loaded into calculator.dp.

> I can't believe that this will work satisfactorily.

I think it will.  There will be some cases where it might be necessary to add
modules of convenience functions to make it easier to use applications
that cross boundaries, but I think these cases will be rare.

Immediately following the introduction of the decimal number types all binary
modules will work as they work today.  There will be no additional pain to
continue using those modules.  There will be no decimal modules, so there is
no problem with making them work with the binary modules.  As decimal module
users start developing applications they will develop techniques for working
with the binary modules.  Initially it may require a significant effort, but
eventually boundaries will be created and the two domains will coexist.

> Another example of the kind of problem your approach runs into: what
> should the type of len("abc") be?  3d or 3b?  Should it depend on the
> default mode?

That is an interesting question.  With my current proposal the following
would be required:

stlen = decimal(len("abc"))

A dlen() function could be added, or perhaps allowing the automatic promotion
of int to a decimal would be a reasonable exception.  That is one case where
there is no chance of data loss.  I'm not opposed to automatic conversions if
there is no danger of errors being introduced.

> I suppose sequence indexing has to accept decimal as well as binary
> integers as indexes -- certainly in a decimal program you will want to
> be able to use decimal integers for indexes.

That is how I would expect it to work.


From mclay@nist.gov  Fri Jul 27 21:32:42 2001
From: mclay@nist.gov (Michael McLay)
Date: Fri, 27 Jul 2001 16:32:42 -0400
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: <200107271550.LAA24662@cj20424-a.reston1.va.home.com>
References: <01072701443301.05085@localhost.localdomain> <3B617A5F.E193B2CC@lemburg.com> <200107271550.LAA24662@cj20424-a.reston1.va.home.com>
Message-ID: <01072716324208.02216@fermi.eeel.nist.gov>

On Friday 27 July 2001 11:50 am, Guido van Rossum wrote:
>
> I'm not optimistic about Michael's PEP.  He seems to insist on a total
> separation between decimal and binary numbers that I don't believe can
> work. 

I'm not insisting on total separation.  I propose that we start with a 
requirement that an explicit call be made to a conversion function.  These 
functions would allow a decimal type to be converted to a float or to an int. 
There would also be conversion functions going from a float or an int to a
decimal type.

What I would like to avoid is creating a decimal type in Python that enables 
silent errors that are difficult to recognize.  Allowing automatic coercion
between the binary and decimal types will open the door to errors that would
be detected if a conversion is required.  If at some point in the future it
becomes apparent that a particular form of coercion is safe and useful it
could be added.  I'd like to move slowly on opening up this potential trouble 
spot.
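
To make the explicit-conversion rule concrete, here is a sketch using the
decimal.Decimal class from the standard library of later Python versions.
That class is not the decimal type proposed in this thread and is used only
as a stand-in; it happens to refuse implicit mixing with binary floats while
accepting ints.  The variable names are illustrative.

```python
from decimal import Decimal

rate = Decimal("0.04")
balance = 1250.50                          # a binary float

try:
    rate * balance                         # implicit mixing is refused
    mixed_ok = True
except TypeError:
    mixed_ok = False

interest = rate * Decimal(str(balance))    # explicit conversion
count = rate * 3                           # ints mix without conversion
```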

>  I haven't replied to him yet because I can't explain it well
> enough yet -- but I don't believe there's much of a future in his
> particular idea.

I guess I'm not understanding something about the direction you are taking 
Python.   As I understood the goals of the CP4E project you were attempting 
to make Python appealing to a wider audience and make it possible for 
everyone to learn to write programs.  And then there are occasional 
references to a Python 3k which will fix some Python warts.  My proposal 
moves Python towards these goals, while retaining full backwards compatibility.
I am not trying to create a new interpreter.  I'm trying to make the current 
interpreter useful to a wider market.  What is it you are trying to 
accomplish in the process of "unifying the numerical types" in Python?



From mclay@nist.gov  Fri Jul 27 21:40:38 2001
From: mclay@nist.gov (Michael McLay)
Date: Fri, 27 Jul 2001 16:40:38 -0400
Subject: [Python-Dev] dpython interaction with imaginary types
Message-ID: <01072716403809.02216@fermi.eeel.nist.gov>

Paul F. Dubois  writes:
> In dpython, what is 2.0j? Is the "standard" way of writing complex numbers,
> 3.0 + 2.0j, valid?

If this expression is placed in a module with a  '.py' extension it will work 
exactly as it does today.  

An exception will be raised if the expression is in a '.dp' module because 
the 3.0 would be a decimal number and the complex type is expecting a binary 
number for the real portion.


From guido@zope.com  Fri Jul 27 22:06:37 2001
From: guido@zope.com (Guido van Rossum)
Date: Fri, 27 Jul 2001 17:06:37 -0400
Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python
In-Reply-To: Your message of "Fri, 27 Jul 2001 15:51:38 EDT."
 <01072715513805.02216@fermi.eeel.nist.gov>
References: <01072701443301.05085@localhost.localdomain> <01072708200303.02216@fermi.eeel.nist.gov> <200107271635.MAA25690@cj20424-a.reston1.va.home.com>
 <01072715513805.02216@fermi.eeel.nist.gov>
Message-ID: <200107272106.RAA27755@cj20424-a.reston1.va.home.com>

[Michael]
> I'm not fond of dialects when they don't serve a significant
> purpose.  However, I believe it would be useful to at least discuss
> creating a special purpose "safe" mode for the Python lexer.  This
> mode would be attractive to newbies and financial programmers.
> Calling this a new dialect is an overstatement.  It is more like
> defining a subset of the language that uses a special vocabulary for
> working with decimal types.

Sounds like a dialect to me.  But alright, I'll take your word for
it. :-)

[Michael]
> > > Another would be to use Unicode as the default character set.  This
> > > would allow Unicode characters to be in strings without needing to
> > > escape them.

[Guido]
> > That's not a dialect, that's a different input encoding.  MAL already
> > has a PEP for that.

[Michael]
> I know about the PEP.  I was referring to making it the default string type
> for a '.dp' file.  There would be no prefix 'u' required.  

Have you thought this through?  What would be the input encoding?
How do you expect your programmers to edit their Unicode files?

Otherwise, the only effect of making all string literals Unicode
strings is to break most of the standard library.  You can get this
effect with "python -U" today.  It's not pretty.  (That option exists
to see how much progress has been made with Python's Unicodification,
not for anything very practical.)

> I'll remove this and the other unrelated items from the decimal type PEP

It would indeed be better to focus on one idea at a time.

> If you don't agree with the idea of adding dpython lexer mode then
> there is no point in discussing the features that would be in that
> mode.

Maybe you can rewrite the PEP to explain the idea better.  It wasn't
very clear the first time.

> > Or in Python .NET.  Decoupling the various parts of the parse+compile
> > pipeline is something I've considered.
> 
> Did you decide against it, or has it just not been a high enough priority?

It's one of those many "would-be-nice" things that I never get to...

> > But again this has nothing to do with decimal numbers: your proposal
> > allows the mixing of decimal and binary numbers (as long as one of
> > them uses an explicit base indicator) so you don't really need two
> > parsers -- you need one tokenizer plus a way to specify the default
> > numeric base for literals.
> 
> That is exactly what I implemented.  The dpython command and the
> '.dp' cause the Py_USE_DECIMAL_AS_DEFAULT[1] flag to be set.  When
> this flag is set decimal numbers are used for literals.

Where is this flag set?  Is it a global variable?  If my main program
has the .dp extension, does the flag remain set for all other modules
that it imports?

> > I'll have to go back to your defense of the two dialect approach, but
> > I think it's neither sufficient nor necessary.
> 
> I have mixed too many ideas into a PEP.  I'll rework the PEP to remove the 
> cruft and focus on the addition of decimal numbers.  I'll move the other
> ideas into a separate PEP.

Posterity will be grateful.

> > Well, sometimes more generality than you need hurts.  I'm not
> > convinced that we need an open-ended set of numeric literals.  But in
> > the light of the unified numeric model, we may need ways to make
> > exactness or inexactness explicit, and/or we may need a way to specify
> > rational numbers.  If we can fit all of these in the
> > number-with-letter-suffix mold, that would be nice for the lexer, I
> > suppose.
> 
> I worry about a "unified numerical model" getting overly complex.

Funny.  I think that a unified numeric model will take away some
complexity from the current model; for example the programmer would no
longer have to be aware of the limit on int values, so nobody would
have to learn about long any more.

> I think decimal numbers help because they are a better choice than
> binary numbers for a significant percentage of all software
> applications.

(Just not for most of the apps that are likely to be written in Python
today. :-)

> I know that rational numbers are important in some applications.  Am
> I overlooking some huge class of applications that use rationals?

I doubt it -- if I was allowed to add exactly *one* numeric type to
Python, and I had to choose between decimal and rational, I'd choose
decimal.  Practicality beats purity.

> While Tim and some of the other Pythoneers can probably think of
> dozens of specialized numerical types, I would venture to guess that
> binary types and a decimal type probably cover 90% of all the user's
> requirements.

Add rational, and I'd agree.

> [1] I'll be renaming the flag to this in the next version.  The flag
> is currently called Py_NEW_PARSER.  I named it that because at one
> time I was creating a new parser.  I trimmed the changes down to
> just a few edits of the tokenizer and compile.c

Why does a flag variable have an UPPER_CASE name?  That normally means
the name is a preprocessor symbol.

[Next message]

[Guido]
> > But I foresee serious problems.  Most standard library modules use
> > numbers.  Most of the modules using numbers occasionally use a
> > literal (e.g. 0 or 1).  According to your PEP, literals in module
> > files ending with .py default to binary.  This means that almost
> > any use of a standard library module from your "dpython" will fail
> > as soon as a literal is used.

[Michael]
> No, because the '.py' file will generate bytecodes for number
> literals as binary numbers when the module is compiled.  If a '.dp'
> file imports the contents of a '.py' file the binary numbers will be
> imported as binary numbers.  If the '.dp' file needs to use a
> binary number in a calculation with a decimal number, the binary
> number will have to be cast to a decimal number.

I understood all that.  But what if the decimal module wants to pass
some numbers into a binary module?  Then it has to make sure all the
arguments it passes are binary.

> ---------------------
> #gui.py
> BLUE = 155
> x_axis = 1024
> y_axis = 768
> 
> --------------------
> #calculator.dp
> import gui
> ytd_interest = 0.04
> # ytd_interest is now a decimal number
> win = gui.open_window(gui.bg, x_size=gui.x_axis, y_size=gui.y_axis)
> app = win.dialog("Bank Balance", bankbalance_callback)
> bb  = app.get_bankbalance()
> # bb now contains a string
> newbalance = decimal(bb) *ytd_interest
> # now update the display
> app.set_bankbalance(str(newbalance))
> 
> -------------------
> 
> In the example the numbers from the gui module were used in the
> calculator module, but they were always handled as binary numbers.  The
> parser did not convert them to decimal numbers because they had been
> parsed into a gui.pyc file prior to being loaded into calculator.dp.

Blech.  That means that whenever you use a library module that does
something useful with your data, you have to convert all your data
explicitly to binary, even if it's just integers.  Yuck.  Bah.  (Need
I say more?  OK, one more then.  Argh! :-)

> > I can't believe that this will work satisfactorily.
> 
> I think it will.  There will be some cases where it might be
> necessary to add modules of convenience functions to make it easier
> to use applications that cross boundaries, but I think these
> cases will be rare.

I would be much more comfortable if there was just one integer type,
or if at least binary ints would mix freely with decimal ints.  I see
a lot of use for decimal *floating point* (more predictable
arithmetic, calculator style), and also a lot of use for decimal
*fixed point* (money calculations), but I don't see the need for
distinguishing the radix of integers.

> Immediately following the introduction of the decimal number types
> all binary modules will work as they work today.  There will be no
> additional pain to continue using those modules.  There will be no
> decimal modules, so there is no problem with making them work with
> the binary modules.  As decimal module users start developing
> applications they will develop techniques for working with the
> binary modules.  Initially it may require a significant effort, but
> eventually boundaries will be created and the two domains will
> coexist.

You make it sound as if most of the standard library would not be
useful for decimal users.  I doubt that.  Decimal users also need to
parse XML, do bisection on lists, use database files, and so on.

> > Another example of the kind of problem your approach runs into: what
> > should the type of len("abc") be?  3d or 3b?  Should it depend on the
> > default mode?
> 
> That is an interesting question.  With my current proposal the following
> would be required:
> 
> stlen = decimal(len("abc"))
> 
> A dlen() function could be added, or perhaps allowing the automatic
> promotion of int to a decimal would be a reasonable exception.  That
> is one case where there is no chance of data loss.  I'm not opposed
> to automatic conversions if there is no danger of errors being
> introduced.

OK, then we agree.  Let's freely allow mixing decimal and binary
integers.  That makes much more sense.

> > I suppose sequence indexing has to accept decimal as well as
> > binary integers as indexes -- certainly in a decimal program you
> > will want to be able to use decimal integers for indexes.
> 
> That is how I would expect it to work.

But it contradicts your original assertion that decimal and binary
numbers were two incompatible types.  Glad we sorted that out.

[Next message]

[Guido]
> > I'm not optimistic about Michael's PEP.  He seems to insist on a
> > total separation between decimal and binary numbers that I don't
> > believe can work.

[Michael]
> I'm not insisting on total separation.  I propose that we start with
> a requirement that an explicit call be made to a conversion
> function.  These functions would allow a decimal type to be
> converted to a float or to an int.  There would also be conversion
> functions going from a float or an int to a decimal type.

(Except for ints, we have now established.)

> What I would like to avoid is creating a decimal type in Python that
> enables silent errors that are difficult to recognize.  Allowing
> automatic coercion between the binary and decimal types will open
> the door to errors that would be detected if a conversion is
> required.  If at some point in the future it becomes apparent that a
> particular form of coercion is safe and useful it could be added.
> I'd like to move slowly on opening up this potential trouble spot.

I recommend that you make a more complete analysis of what errors you
want to avoid.  Every binary can be represented in decimal if you
allow enough digits.

On the other hand, if you are thinking of decimal floating point, some
decimal calculations will also lose precision.  If you never want to
lose precision, the radix of the numbers is a red herring, and you
might as well use rationals under the covers.  If you allow the kind
of precision loss that decimal floating point can cause, I would like
to understand more about what it *is* that you are trying to avoid
with your Draconian separation rule.  Floating point decimal
arithmetic cannot avoid loss of precision for division (e.g. 1d/3d
cannot be represented exactly with a finite number of decimal
digits).  Fixed point decimal arithmetic isn't any better.
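
A small sketch of both points, using decimal.Decimal and fractions.Fraction
from the standard library of later Python versions as stand-ins (an
assumption: neither is the type under discussion in this thread).  It shows
decimal division rounding, a rational staying exact, and a binary float
converting to decimal without loss once enough digits are allowed.

```python
from decimal import Decimal
from fractions import Fraction

third = Decimal(1) / Decimal(3)        # rounded to the context precision
assert third * 3 != Decimal(1)         # 0.999... rather than 1

exact_third = Fraction(1, 3)
assert exact_third * 3 == 1            # rationals lose nothing

# A binary float converts to decimal exactly, given enough digits
tenth = Decimal(0.1)
assert tenth != Decimal("0.1")
assert float(tenth) == 0.1
```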

> >  I haven't replied to him yet because I can't explain it well
> > enough yet -- but I don't believe there's much of a future in his
> > particular idea.
> 
> I guess I'm not understanding something about the direction you are
> taking Python.  As I understood the goals of the CP4E project you
> were attempting to make Python appealing to a wider audience and
> make it possible for everyone to learn to write programs.  And then
> there are occasional references to a Python 3k which will fix some
> Python warts.  My proposal moves Python towards these goals, while
> retaining full backwards compatibility.  I am not trying to create a
> new interpreter.

I think you haven't completely thought through the rules you are
proposing, and you haven't stated your underlying goals very clearly.
I believe the rules that you *claim* to propose won't further your
goals, but it seems that you aren't sure of the rules you propose and
maybe you aren't sure of your goal either.  Under these adverse
circumstances I'm trying to tease out a set of rules that might
further the kind of goal I *think* you want to obtain, but it's hard
because you have overspecified your "solution".

> I'm trying to make the current interpreter useful to a wider market.

Adding an Oracle module to the standard library would probably do more
to further that goal than any wrangling with the numeric model that we
can carry out here... :-)

> What is it you are trying to accomplish in the process of "unifying
> the numerical types" in Python?

Removing specific warts of the current numeric system that require the
programmer to be aware of more details than necessary.  We will never
be able to remove the need for careful numerical analysis of
algorithms involving floating point (be it binary or decimal).  But we
can certainly remove the need to be aware of the number of bits in a
machine word (long/int unification, PEP 237) or the need to explicitly
promote ints to floats in certain cases (PEP 238).
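
As a rough picture of what removing those two warts looks like, here is a
sketch that assumes an interpreter where both PEPs are already in effect.
The variable names are illustrative.

```python
import sys

n = sys.maxsize        # largest machine-word-sized index on this platform
m = n + 1              # no overflow, and no separate long type to learn
assert type(m) is type(n) and m > n

assert 1 / 2 == 0.5    # no need to write float(1) / 2
```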

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ping@lfw.org  Sat Jul 28 00:37:08 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Fri, 27 Jul 2001 16:37:08 -0700 (PDT)
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
In-Reply-To: <200107271949.PAA27171@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.32.0107271612480.474-100000@ziggy.localdomain.fake>

This all looks pretty good!  Nice work, Guido -- especially given the
minefield of compatibility issues you have been tiptoeing through.

On Fri, 27 Jul 2001, Guido van Rossum wrote:
>     - Overloaded operator methods:
>
>       __div__(), __floordiv__(), __truediv__();

I'm concerned about this.  Does this mean that a/b will call __truediv__?
So code that today expects a/b to call __div__ will be permanently broken?

I think you might want to provide a little table in the PEP.  Here is my
stab at describing the current proposal, so you can correct it:

                  in Python 2.1     in Python 2.2         in Python 3.0 [*]

 / on numbers     classic division  classic division      true division
 // on numbers    nothing           floor division        floor division

 / on instances   __div__           __div__               __truediv__?
 // on instances  nothing           __floordiv__?         __floordiv__

 / API call       PyNumber_Divide   PyNumber_Divide       PyNumber_TrueDivide?
 // API call      nothing           PyNumber_FloorDivide  PyNumber_FloorDivide

 / AsNumber slot  nb_divide         nb_divide             nb_true_divide?
 // AsNumber slot nothing           nb_floor_divide       nb_floor_divide

 / opcode         BINARY_DIVIDE     BINARY_DIVIDE         BINARY_TRUE_DIVIDE
 // opcode        nothing           BINARY_FLOOR_DIVIDE   BINARY_FLOOR_DIVIDE

 [*] or in Python >= 2.2 with "from __future__ import division"


I'm thinking that nb_true_divide and __truediv__ should be replaced with
just nb_divide and __div__ in the above table.


> Semantics of Floor Division
[...]
>     For complex numbers, // raises an exception, since float() of a
>     complex number is not allowed.

I assume you meant "floor()" here rather than "float()".


-- ?!ng











From fdrake@acm.org  Sat Jul 28 04:44:31 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 27 Jul 2001 23:44:31 -0400 (EDT)
Subject: [Python-Dev] [OT] Expat 1.95.2 released
Message-ID: <15202.13599.964445.917473@cj42289-a.reston1.va.home.com>

Slightly off-topic, but this may be interesting to a few of you...

  In case anyone is interested, Expat 1.95.2 has been released, with
both a source archive for Unix users and a handy installer for Windows
victims (thanks to Tim Peters for getting me started!).  This release
fixes some small bugs and improves the portability of the build
process (and there is one for Windows this time).
  You can pick up the 1.95.2 release at:

	http://sourceforge.net/projects/expat/


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation





From michel@digicool.com  Sat Jul 28 08:41:21 2001
From: michel@digicool.com (Michel Pelletier)
Date: Sat, 28 Jul 2001 00:41:21 -0700
Subject: [Python-Dev] Splitting the PEP for adding a decimal type to Python
References: <01072701443301.05085@localhost.localdomain> <01072708200303.02216@fermi.eeel.nist.gov> <200107271635.MAA25690@cj20424-a.reston1.va.home.com> <01072715513805.02216@fermi.eeel.nist.gov>
Message-ID: <3B626CA1.617A5121@digicool.com>

Michael McLay wrote:
> 
> On Friday 27 July 2001 12:35 pm, Guido van Rossum wrote:
> 
> > I'm not very fond of having multiple dialects.  There are lots of
> > contexts where the dialect in use is not explicitly mentioned
> > (e.g. when people discuss fragments of Python code).
> 
> I'm not fond of dialects when they don't serve a significant purpose.
> However, I believe it would be useful to at least discuss creating a special
> purpose "safe" mode for the Python lexer.  This mode would be attractive to
> newbies and financial programmers.  Calling this a new dialect is an
> overstatement.  It is more like defining a subset of the language that uses a
> special vocabulary for working with decimal types.

I don't know nothin about no number theory, but I did use a similar
dialect technique to implement a PEP 245 prototype using mobius.  Like
what I've read so far about dpython, its objects from *.pyi files (a
superset of python) could be easily intermingled with objects from *.py
files.

I'm all for no dialects at large, but some people may find need to
implement new languages on top of python's run time engine.  Especially
people embedding python into specialized applications.  Mobius was a way
to control the python language using the language itself; it would be
cool to have this kind of thing stock in python.

-Michel


From tim.one@home.com  Sat Jul 28 09:12:47 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 28 Jul 2001 04:12:47 -0400
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <200107271714.NAA26383@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOILBAA.tim.one@home.com>

[Guido]
> ...
> Inside a function, from ... import * is always bad form.

Worse, according to the Reference Manual,

    The "from" form with "*" may only occur in a module scope.
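
As a sketch of what enforcing that sentence looks like: present-day Python 3 rejects the construct at compile time, which can be checked without running the function. The helper name below is invented for illustration.

```python
def star_import_in_function_is_rejected():
    """Return True if compiling 'from ... import *' inside a def fails."""
    source = (
        "def spam():\n"
        "    from stat import *\n"
    )
    try:
        # compile() only parses and analyses the source; nothing is executed.
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return True
    return False
```

The same source with the import moved to module level compiles without complaint.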



From mwh@python.net  Sat Jul 28 10:35:01 2001
From: mwh@python.net (Michael Hudson)
Date: 28 Jul 2001 05:35:01 -0400
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
In-Reply-To: Guido van Rossum's message of "Fri, 27 Jul 2001 15:49:57 -0400"
References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com>
Message-ID: <2mn15pwg56.fsf@starship.python.net>

Not directly relevant to the PEP, but...

Guido van Rossum <guido@zope.com> writes:

>     Q. What about code compiled by the codeop module?
> 
>     A. Alas, this will always use the default semantics (set by the -D
>        command line option).  This is a general problem with the
>        future statement; PEP 236[4] lists it as an unresolved
>        problem.  You could have your own clone of codeop.py that
>        includes a future division statement, but that's not a general
>        solution.

Did you look at my Nasty Hack(tm) to bodge around this?  It's at 

    http://starship.python.net/crew/mwh/hacks/codeop-hack.diff

if you haven't.  I'm not sure it will work with what you're planning
for division, but it works for generators (and worked for nested
scopes when that was relevant).

There are a host of saner ways round this, of course - like adding an
optional "flags" argument to compile, for instance.
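
A sketch of the "flags" idea, written against the compile() built-in as it exists in present-day Python 3, which accepts flags and dont_inherit arguments. Caveat: in Python 3 true division is already the default, so the division flag changes nothing here; the example only shows the shape of the call.

```python
import __future__

# Each feature in the __future__ module carries the compiler flag that
# switches it on for a single compile() call.
flag = __future__.division.compiler_flag

# dont_inherit=True keeps the caller's own future statements out of the picture.
code = compile("7 / 2", "<demo>", "eval", flags=flag, dont_inherit=True)
result = eval(code)
```

With something along these lines, a codeop-style front end could remember the flags of earlier input and pass them on when compiling later lines.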

Cheers,
M.

-- 
  ARTHUR:  Why should a rock hum?
    FORD:  Maybe it feels good about being a rock.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 8


From guido@zope.com  Sat Jul 28 14:54:21 2001
From: guido@zope.com (Guido van Rossum)
Date: Sat, 28 Jul 2001 09:54:21 -0400
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: Your message of "Sat, 28 Jul 2001 04:12:47 EDT."
 <LNBBLJKPBEHFEDALKOLCCEOILBAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCCEOILBAA.tim.one@home.com>
Message-ID: <200107281354.JAA30808@cj20424-a.reston1.va.home.com>

> Worse, according to the Reference Manual,
> 
>     The "from" form with "*" may only occur in a module scope.

I don't know when that snuck in, but it's not enforced.  If we're
serious, we should at least add a warning!

I'll add a bug report.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Sat Jul 28 14:57:28 2001
From: guido@zope.com (Guido van Rossum)
Date: Sat, 28 Jul 2001 09:57:28 -0400
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
In-Reply-To: Your message of "28 Jul 2001 05:35:01 EDT."
 <2mn15pwg56.fsf@starship.python.net>
References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com>
 <2mn15pwg56.fsf@starship.python.net>
Message-ID: <200107281357.JAA30859@cj20424-a.reston1.va.home.com>

> Not directly relevant to the PEP, but...
> 
> Guido van Rossum <guido@zope.com> writes:
> 
> >     Q. What about code compiled by the codeop module?
> > 
> >     A. Alas, this will always use the default semantics (set by the -D
> >        command line option).  This is a general problem with the
> >        future statement; PEP 236[4] lists it as an unresolved
> >        problem.  You could have your own clone of codeop.py that
> >        includes a future division statement, but that's not a general
> >        solution.
> 
> Did you look at my Nasty Hack(tm) to bodge around this?  It's at 
> 
>     http://starship.python.net/crew/mwh/hacks/codeop-hack.diff
> 
> if you haven't.  I'm not sure it will work with what you're planning
> for division, but it works for generators (and worked for nested
> scopes when that was relevant).

Ouch.  Nasty.  Hat off to you for thinking of this!

> There are a host of saner ways round this, of course - like adding an
> optional "flags" argument to compile, for instance.

We'll have to keep that in mind.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Sat Jul 28 16:28:55 2001
From: guido@zope.com (Guido van Rossum)
Date: Sat, 28 Jul 2001 11:28:55 -0400
Subject: [Python-Dev] Ready to merge descr-branch back into the trunk?
Message-ID: <200107281528.LAA31137@cj20424-a.reston1.va.home.com>

Is it time to merge the descr-branch (from which 2.2a1 was built) back
into the trunk?  In the fray over PEP 238 I haven't seen too much
feedback on the alpha release, but there have been plenty of
downloads.  Telling from the bug reports, a few people have clearly
been kicking the tires quite a bit.

I don't think we'll have to withdraw the type/class unification, and
I'd like to fire Tim from his branch-merge duties. :)

I'll post a query about this on c.l.py too.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paulp@ActiveState.com  Sat Jul 28 17:40:37 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sat, 28 Jul 2001 09:40:37 -0700
Subject: [Python-Dev] pep-discuss
Message-ID: <3B62EB05.396DF4D7@ActiveState.com>

We've talked about having a mailing list for general PEP-related
discussions. Two things make me think that revisiting this would be a
good idea right now.

First, the recent loosening up of the python-dev rules threatens the
quality of discussion about bread and butter issues such as patch
discussions and process issues.

Second, the flamewar on python-list basically drowned out the usual
newbie questions and would give a person coming new to Python a very
negative opinion about the language's future and the friendliness of the
community. I would rather redirect as much as possible of that to a list
that only interested participants would have to endure.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From paulp@ActiveState.com  Sat Jul 28 18:03:37 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sat, 28 Jul 2001 10:03:37 -0700
Subject: [Python-Dev] ActiveCobolScript?
Message-ID: <3B62F069.D0CB002E@ActiveState.com>

http://www.cobolscript.com/

Too bad this up and coming language already has a corporate benefactor.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From paulp@ActiveState.com  Sat Jul 28 18:24:42 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sat, 28 Jul 2001 10:24:42 -0700
Subject: [Python-Dev] ActiveCobolScript?
References: <3B62F069.D0CB002E@ActiveState.com>
Message-ID: <3B62F55A.28EE3866@ActiveState.com>

Sorry guys, I meant to send this to ActiveState's internal lists where
we plot the takeover of the world...my brain hasn't totally recovered
from my trip to Tijuana (er, I mean San Diego).
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From paulp@ActiveState.com  Sat Jul 28 20:08:17 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sat, 28 Jul 2001 12:08:17 -0700
Subject: [Python-Dev] Ready to merge descr-branch back into the trunk?
References: <200107281528.LAA31137@cj20424-a.reston1.va.home.com>
Message-ID: <3B630DA1.6523B177@ActiveState.com>

Guido van Rossum wrote:
> 
> Is it time to merge the descr-branch (from which 2.2a1 was built) back
> into the trunk?  In the fray over PEP 238 I haven't seen too much
> feedback on the alpha release, but there have been plenty of
> downloads.  Telling from the bug reports, a few people have clearly
> been kicking the tires quite a bit.

I'm not deeply concerned from a backwards compatibility standpoint but I
would like to see more documentation and more widespread understanding
of the feature before we say "yes, this is the right way." I wonder how
many people truly understand all of the changes. 

When you added metaclasses you labelled the feature experimental so you
could change it once people got a sense of it. I propose you do the same
thing in this case. I could even imagine using the warnings framework to
tell people that they are playing with stuff that may change.

As smart as you are, you are only one person, with experience with a
certain set of problems. Wider understanding, experimentation and
discussion might help you to improve the design...but they may break
some of the features you have already added.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From tim.one@home.com  Sat Jul 28 21:13:53 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 28 Jul 2001 16:13:53 -0400
Subject: [Python-Dev] Picking on platform fmod
Message-ID: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com>

Here's your chance to prove your favorite platform isn't a worthless pile of
monkey crap <wink>.  Please run the attached.  If it prints anything other
than

0 failures in 10000 tries

it will probably print a lot.  In that case I'd like to know which flavor of
C+libc+libm you're using, and the OS; a few of the failures it prints may be
helpful too.  If it only prints one or two failures, it's probably a bug in
*my* code, so I especially want to know about that.  If it dies with an
assertion error, that would be mondo interesting to know, and then the
flavor of HW chip may also be relevant.

I already know MSVC 6 under Win98SE passes ("0 failures") on two distinct
boxes <wink>, so no need for more about that one.

What you're looking for:  Given finite doubles x>0 and y>0, there's a unique
integer N and unique real number R such that

    x = N*y + R

exactly and 0 <= R < y.  N may not be *representable* as an integer (or
long, or long long, or double) on the machine, but it's A Theorem that the
infinitely-precise value of R is exactly representable as a machine double.
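
The statement above can be checked independently with exact rational arithmetic. This is a minimal sketch in present-day Python 3 using the standard fractions module; it is slower than the bit-by-bit approach in the program below and is not that program's method. The name exact_remainder is invented here.

```python
import math
from fractions import Fraction


def exact_remainder(x, y):
    """Return (N, R) with x = N*y + R and 0 <= R < y, for finite x > 0, y > 0."""
    fx, fy = Fraction(x), Fraction(y)   # every finite double is an exact rational
    n = fx // fy                        # N as an arbitrary-precision int
    r = fx - n * fy                     # exact rational remainder
    # By the theorem, R is representable as a double, so float() loses nothing.
    return n, float(r)


n, r = exact_remainder(5.5, 2.0)        # 5.5 = 2*2.0 + 1.5
```

On a platform whose fmod is exact, math.fmod(x, y) should agree with the second element of the result for any such x and y.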

The C stds have never (IMO) been clear about whether fmod(x, y) must return
this exact R, although the C99 *rationale* is clear that this is the intent
(why committees don't fold these subtleties into the bodies of their stds is
beyond me).

The program below generates nasty test cases at random, computes the exact R
in a clever (read "on the edge of not working but pretty fast") way using
Python, and compares it to the platform fmod result.  It takes about 6
seconds to run on a 866MHz box using current CVS Python, so it shouldn't be
a burden to try.

If you get no failure, of course I'd like to hear about that too.

it's-not-like-you-had-anything-fun-to-do-this-weekend-ly y'rs  - tim


from math import frexp as _frexp, ldexp as _ldexp

# ffmod is a Pythonic variant of fmod, returning a remainder with the
# same sign as y.  Excepting infs and NaNs, the result is exact.

def ffmod(x, y):
    if y == 0:
        raise ZeroDivisionError("ffmod: divide by 0")
    remainder = abs(x)
    yabs = abs(y)
    if remainder >= yabs:
        dexp = _frexp(remainder)[1] - _frexp(yabs)[1]
        assert dexp >= 0
        yshifted = _ldexp(yabs, dexp)  # exact
        for i in xrange(dexp + 1):
            # compute one bit of the quotient (but not materialized;
            # we only care about the remainder at the end)
            if remainder >= yshifted:
                assert remainder < yshifted * 2.0
                remainder -= yshifted  # exact
            yshifted *= 0.5            # exact
        assert yshifted * 2.0 == yabs
    assert remainder < yabs
    if y < 0 and remainder > 0:
        remainder -= yabs              # exact
        assert remainder < 0
    return remainder

# ffmod and C99 fmod should agree whenever x>0 and y>0.  Try one, and
# return 1 iff they don't agree.

def _tryone(x, y, dump=0):
    n = math.fmod(x, y)
    e = ffmod(x, y)
    if dump:
        print "fmod" + `x, y`
        print "native:", `n`
        print " exact:", `e`
    return n != e

# Test n random inputs, in the sense of random mantissas and random
# exponents in range(-300, 301).  The hardest cases have x much larger
# than y, and this will generate lots of those.

def _test(n, showresults=0):
    from random import random, randrange
    nfail = 0
    for i in xrange(n):
        x = _ldexp(random(), randrange(-300, 301))
        y = _ldexp(random(), randrange(-300, 301))
        if x < y:
            x, y = y, x
        if _tryone(x, y, showresults):
            nfail += 1
            _tryone(x, y, 1)
    print nfail, "failures in", n, "tries"

if __name__ == "__main__":
    import math
    _test(10000, 0)



From tim.one@home.com  Sun Jul 29 12:10:45 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 29 Jul 2001 07:10:45 -0400
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <200107281354.JAA30808@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEBELCAA.tim.one@home.com>

[Tim]
> Worse, according to the Reference Manual,
>
>     The "from" form with "*" may only occur in a module scope.

[Guido]
> I don't know when that snuck in,

On Friday, 14 Aug 1992:  it's in rev 1.1 of ref6.tex, and was there in the
0.98 release.  This proved tedious to trace backwards, because you and Fred
went through an amazing variety of ways to mark up "*" <wink>.

> but it's not enforced.  If we're serious, we should at least add a
> warning!

I thought we had agreed to do this back when the nested-scopes warnings were
being added; guess not.

> I'll add a bug report.

aka-the-retroactive-todo-list-ly y'rs  - tim



From thomas@xs4all.net  Sun Jul 29 16:33:43 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sun, 29 Jul 2001 17:33:43 +0200
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <200107281354.JAA30808@cj20424-a.reston1.va.home.com>
Message-ID: <20010729173343.H21770@xs4all.nl>

On Sat, Jul 28, 2001 at 09:54:21AM -0400, Guido van Rossum wrote:
> > Worse, according to the Reference Manual,

> >     The "from" form with "*" may only occur in a module scope.

> I don't know when that snuck in, but it's not enforced.  If we're
> serious, we should at least add a warning!

Eh, last I looked, you and Jeremy were most serious about this :) It came up
during the nested-scopes change in 2.1, where it was first made illegal, and
later just illegal in the presence of a nested scope:

(without future statement)
>>> def spam(x):
...      from stat import *
...      def eggs():
...          print x
... 
<stdin>:1: SyntaxWarning: local name 'x' in 'spam' shadows use of 'x' as
global in nested scope 'eggs'
<stdin>:1: SyntaxWarning: import * is not allowed in function 'spam' because
it contains a nested function with free variables

(with future statement)
>>> def spam(x):
...       from stat import *
...       def eggs():
...           print x
... 
  File "<stdin>", line 2
SyntaxError: import * is not allowed in function 'spam' because it contains
a nested function with free variables

> I'll add a bug report.

Should we warn about exec (without 'in' clause) in functions as well ?

(without future statement)
>>> def spam(x,y):
...     exec y
...     def eggs():
...         print x
... 
<stdin>:1: SyntaxWarning: local name 'x' in 'spam' shadows use of 'x' as
global in nested scope 'eggs'
<stdin>:1: SyntaxWarning: unqualified exec is not allowed in function 'spam'
it contains a nested function with free variables

(with future statement)
>>> def spam(x,y):
...      exec y
...      def eggs():
...          print x
... 
  File "<stdin>", line 2
SyntaxError: unqualified exec is not allowed in function 'spam' it contains
a nested function with free variables

The warnings *only* occur in the presence of a nested scope, though.
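
For comparison, the qualified form (exec with an 'in' clause) survives in present-day Python 3 as the extra arguments to the exec() function. A minimal sketch, with run_snippet as an invented helper name:

```python
def run_snippet(source, x):
    """Execute source in its own namespace instead of the function's scope."""
    namespace = {"x": x}
    # Names bound by the executed code land in `namespace`, so the compiler
    # can still resolve every local of run_snippet statically.
    exec(source, namespace)
    return namespace


ns = run_snippet("y = x * 2", 21)
```

Because the executed code never touches the enclosing function's variables, a nested function with free variables causes no ambiguity.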

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Sun Jul 29 17:13:59 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sun, 29 Jul 2001 18:13:59 +0200
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com>
Message-ID: <20010729181359.I21770@xs4all.nl>

On Sat, Jul 28, 2001 at 04:13:53PM -0400, Tim Peters wrote:

> Here's your chance to prove your favorite platform isn't a worthless pile of
> monkey crap <wink>.  Please run the attached.  If it prints anything other
> than
> 0 failures in 10000 tries

> it will probably print a lot. 

Worked fine on BSDI, FreeBSD and Linux on Intel hardware here, as well as
Solaris on SPARC hardware and Linux on (IBM) PPC and (Compaq) Alpha
hardware, at least the ones in the SourceForge compilefarm :)

However, on this sourceforge compilefarm machine:

Linux usf-cf-sparc-linux-1 2.2.18pre21 #1 SMP Wed Nov 22 17:27:17 EST 2000 sparc64 unknown

(Linux on SPARC hardware, somewhat different hardware than the Solaris
compilefarm machine, though both are UltraSparcs (sun4u), so I assume they
have the same FPU hardware too.)

cpu             : TI UltraSparc II  (BlackBird)
fpu             : UltraSparc II integrated FPU
promlib         : Version 3 Revision 17
prom            : 3.17.0
type            : sun4u
ncpus probed    : 2
ncpus active    : 2

It fails like so:

Python-2.1.1/SPARC-linux/python fmodtest.py
Traceback (most recent call last):
  File "fmodtest.py", line 60, in ?
    _test(10000, 0)
  File "fmodtest.py", line 53, in _test
    if _tryone(x, y, showresults):
  File "fmodtest.py", line 33, in _tryone
    n = math.fmod(x, y)
OverflowError: math range error

This is most often on the first time through _tryone, or otherwise on the
second time through. Since it didn't print the oodles of info you wanted,
here are some values that cause this:

x: 1.9855727039972493e-39 y: 3.3665190124762732e-65
x: 9.5227191085185764e+47 y: 4.2603743746337035e-20
x: 5.9222419270524289e+19 y: 1.1515096079336105e-17
x: 1.0095372277815077e+37 y: 5.1347483313106109e-23
x: 7612675.5666046143 y: 16016.095533924272
x: 1.9710117673387707e+27 y: 3.8974792352555581e-75
x: 7.2481762337961828e-72 y: 6.9275805608109076e-91
x: 444606.5185310659 y: 0.040252210139028341


Here are some that it didn't break on:
x: 6.4925064019277635e+82 y: 3.3863081542612738e+39
x: 1.5537102518838885e+28 y: 7.4706363056326852e+21
x: 6.0545201466539534e+82 y: 1.0632674821830584e+22
x: 2.4744658600351291e+51 y: 3.8431582369146088e+39
x: 5.019729019166613e+56 y: 1.18286034219559e+48

Notice how none of these have a '-' in them...

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From cgw@alum.mit.edu  Sun Jul 29 17:24:31 2001
From: cgw@alum.mit.edu (Charles G Waldman)
Date: Sun, 29 Jul 2001 11:24:31 -0500
Subject: [Python-Dev] Picking on platform fmod
Message-ID: <15204.14527.823870.226422@nyx.dyndns.org>

 > If you get no failure, of course I'd like to hear about that too.

Got "no failure" on:

  SunOS 5.8 Generic_108529-08 
  Linux 2.2.18 / glibc 2.1.3




From aahz@rahul.net  Sun Jul 29 17:44:28 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Sun, 29 Jul 2001 09:44:28 -0700 (PDT)
Subject: [Python-Dev] Re: post mortem after threading deadlock?
In-Reply-To: <200107252216.SAA09493@cj20424-a.reston1.va.home.com> from "Guido van Rossum" at Jul 25, 2001 06:16:45 PM
Message-ID: <20010729164429.4FA5D99C90@waltz.rahul.net>

Guido van Rossum wrote:
> 
> I believe that Aahz, in his thread tutorial, has even more radical
> advice: use the Queue module for all inter-thread communication.  It
> is even higher level than semaphores, and has the same nice
> properties.

Not only that, Queue.Queue has the especially nice property of handling
both data protection (mutexes) and synchronization.
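
A minimal sketch of that style, written for present-day Python 3 (where the module is spelled queue rather than Queue). The worker function and the None sentinel are illustrative choices, not anything taken from the tutorial.

```python
import queue
import threading


def worker(inbox, outbox):
    """Square every item from inbox until the None sentinel arrives."""
    while True:
        item = inbox.get()      # blocks until something is available
        if item is None:
            break
        outbox.put(item * item)


inbox, outbox = queue.Queue(), queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()

for n in range(5):
    inbox.put(n)
inbox.put(None)                 # tell the worker to stop
thread.join()

results = [outbox.get() for _ in range(5)]
```

No explicit lock appears anywhere: the queues handle both the locking and the waiting.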

I'm following up primarily to announce that I've just uploaded my OSCON
slides (new and improved!) to http://starship.python.net/crew/aahz/
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From guido@zope.com  Sun Jul 29 18:00:59 2001
From: guido@zope.com (Guido van Rossum)
Date: Sun, 29 Jul 2001 13:00:59 -0400
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: Your message of "Sun, 29 Jul 2001 17:33:43 +0200."
 <20010729173343.H21770@xs4all.nl>
References: <20010729173343.H21770@xs4all.nl>
Message-ID: <200107291700.NAA07488@cj20424-a.reston1.va.home.com>

> On Sat, Jul 28, 2001 at 09:54:21AM -0400, Guido van Rossum wrote:
> > > Worse, according to the Reference Manual,
> 
> > >     The "from" form with "*" may only occur in a module scope.
> 
> > I don't know when that snuck in, but it's not enforced.  If we're
> > serious, we should at least add a warning!
> 
> Eh, last I looked, you and Jeremy were most serious about this :) It came up
> during the nested-scopes change in 2.1, where it was first made illegal, and
> later just illegal in the presence of a nested scope:
> 
> (without future statement)
> >>> def spam(x):
> ...      from stat import *
> ...      def eggs():
> ...          print x
> ... 
> <stdin>:1: SyntaxWarning: local name 'x' in 'spam' shadows use of 'x' as
> global in nested scope 'eggs'
> <stdin>:1: SyntaxWarning: import * is not allowed in function 'spam' because
> it contains a nested function with free variables
> 
> (with future statement)
> >>> def spam(x):
> ...       from stat import *
> ...       def eggs():
> ...           print x
> ... 
>   File "<stdin>", line 2
> SyntaxError: import * is not allowed in function 'spam' because it contains
> a nested function with free variables
> 
> > I'll add a bug report.

Hm.  I'm curious why it was not made a warning without a nested
function.  Perhaps because too much 3rd party code would trigger the
warning?  (I have a feeling that lots of amateur programmers are a lot
fonder of import * than they should be :-( ).

> Should we warn about exec (without 'in' clause) in functions as well ?
> 
> (without future statement)
> >>> def spam(x,y):
> ...     exec y
> ...     def eggs():
> ...         print x
> ... 
> <stdin>:1: SyntaxWarning: local name 'x' in 'spam' shadows use of 'x' as
> global in nested scope 'eggs'
> <stdin>:1: SyntaxWarning: unqualified exec is not allowed in function 'spam'
> it contains a nested function with free variables
> 
> (with future statement)
> >>> def spam(x,y):
> ...      exec y
> ...      def eggs():
> ...          print x
> ... 
>   File "<stdin>", line 2
> SyntaxError: unqualified exec is not allowed in function 'spam' it contains
> a nested function with free variables
> 
> The warnings *only* occur in the presence of a nested scope, though.

That one is just fine I think.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@rahul.net  Sun Jul 29 18:19:27 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Sun, 29 Jul 2001 10:19:27 -0700 (PDT)
Subject: [Python-Dev] Small feature request - optional argument for string.strip()
In-Reply-To: <31575A892FF6D1118F5800600846864D78BEFD@intrepid> from "Simon Brunning" at Jul 25, 2001 06:00:55 PM
Message-ID: <20010729171927.9516699C90@waltz.rahul.net>

Simon Brunning wrote:
> 
> The .split method on strings splits at whitespace by default, but takes an
> optional argument allowing splitting by other strings. The .strip method
> (and its siblings) always strip whitespace - on more than one occasion I
> would have found it useful if these methods also took an optional argument
> allowing other strings to be stripped. For example, to strip, say, asterisks
> from a file you could do:
> 
> >>>fred = '**word**word**'
> >>>fred.strip('*')
> word**word
> 
> Does this sound sensible/useful?

I've never seen a case where this was wanted except to delete *all* such
characters.  string.translate() does that, but in an awkward way.
Perhaps a wrapper for string.translate() might make sense, called
something like string.delete().
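
To make the two behaviours concrete, here is a sketch in present-day Python 3, where str.strip() does accept an optional argument. The delete_chars wrapper is a hypothetical name standing in for the string.delete() idea; it is not a standard library function.

```python
def delete_chars(text, unwanted):
    """Remove every occurrence of each character in `unwanted` from text."""
    # The third argument to str.maketrans lists characters to delete.
    return text.translate(str.maketrans("", "", unwanted))


fred = "**word**word**"
stripped = fred.strip("*")          # only leading and trailing asterisks go
deleted = delete_chars(fred, "*")   # all asterisks go
```

Note that the argument to strip() is treated as a set of characters, not as a substring.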
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From thomas@xs4all.net  Sun Jul 29 19:23:37 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sun, 29 Jul 2001 20:23:37 +0200
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <200107291700.NAA07488@cj20424-a.reston1.va.home.com>
References: <20010729173343.H21770@xs4all.nl> <200107291700.NAA07488@cj20424-a.reston1.va.home.com>
Message-ID: <20010729202337.M570@xs4all.nl>

On Sun, Jul 29, 2001 at 01:00:59PM -0400, Guido van Rossum wrote:

> > (with future statement)
> > >>> def spam(x):
> > ...       from stat import *
> > ...       def eggs():
> > ...           print x
> > ... 
> >   File "<stdin>", line 2
> > SyntaxError: import * is not allowed in function 'spam' because it contains
> > a nested function with free variables

> Hm.  I'm curious why it was not made a warning without a nested
> function.  Perhaps because too much 3rd party code would trigger the
> warning? 

Yes.

> (I have a feeling that lots of amateur programmers are a lot
> fonder of import * than they should be :-( ).

Oh yeah. If ActiveState's mailinglist statistics were extended to show
how many of my posts preach against using 'import *', I'd be top dog in the
python-list stats :-) I also still owe Fred a tutorial chapter on why not to
use import * :)

> > >>> def spam(x,y):
> > ...      exec y
> > ...      def eggs():
> > ...          print x

> That one is just fine I think.

Why is 'import *' inside a function fine, but a bare exec isn't ? Weren't
you going to deprecate bare exec's altogether ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From gball@cfa.harvard.edu  Sun Jul 29 22:25:37 2001
From: gball@cfa.harvard.edu (Greg Ball)
Date: Sun, 29 Jul 2001 17:25:37 -0400 (EDT)
Subject: [Python-Dev] Picking on platform fmod
Message-ID: <Pine.OSF.4.10.10107291652460.10240-100000@cfata6.harvard.edu>

I got no failure on 

OSF1 cfata6.harvard.edu V4.0 878 alpha
SunOS cfa0 5.8 Generic_108528-06 sun4u sparc SUNW,Ultra-Enterprise

building with or without gcc.


--Greg Ball




From guido@zope.com  Sun Jul 29 23:40:22 2001
From: guido@zope.com (Guido van Rossum)
Date: Sun, 29 Jul 2001 18:40:22 -0400
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: Your message of "Sun, 29 Jul 2001 20:23:37 +0200."
 <20010729202337.M570@xs4all.nl>
References: <20010729173343.H21770@xs4all.nl> <200107291700.NAA07488@cj20424-a.reston1.va.home.com>
 <20010729202337.M570@xs4all.nl>
Message-ID: <200107292240.SAA08051@cj20424-a.reston1.va.home.com>

> > > >>> def spam(x,y):
> > > ...      exec y
> > > ...      def eggs():
> > > ...          print x
> 
> > That one is just fine I think.
> 
> Why is 'import *' inside a function fine, but a bare exec isn't ? Weren't
> you going to deprecate bare exec's altogether ?

You mean the other way around don't you?  I proposed a warning for
import * but not for bare exec.  I guess for me the difference is that
import * is just stupid (potentially lots of work going on every time
you call the function) while the main problem with bare exec is that
it gets in the way of optimizers and the like.  Since we don't have an
optimizer (yet) I don't care so much (yet).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim@zope.com  Sun Jul 29 23:55:19 2001
From: tim@zope.com (Tim Peters)
Date: Sun, 29 Jul 2001 18:55:19 -0400
Subject: [Python-Dev] Python Windows installer: good news!
Message-ID: <LNBBLJKPBEHFEDALKOLCMECILCAA.tim@zope.com>

Wise Solutions generously offered PythonLabs use of their InstallerMaster
8.1 system.  Every PythonLabs Windows installer produced to date used Wise
5.0a, and while that's done a great job for us over the years, some of you
have noticed <wink> that it was starting to show its age.

I've completed upgrading our Windows installation procedures to
InstallerMaster 8.1, and we'll release the next alpha of Windows Python 2.2
using it.  Even if you have no interest in *testing* 2.2a2 at that time, if
you're running on a Windows system please download the installer (when it's
released) just to be sure it works for you!  As always, we have direct
access to only a few Windows boxes, so we rely on cheerful volunteers to
uncover surprises.

Some things to note:

+ The installer it produces is a 32-bit program, so this should be
  the end of "failure in 16-bit subsystem" deaths some people see
  on Win2K (at least 4 reports of that, and no real handle on why).

+ The uninstaller has a new "repair" option.  The install.log saves
  away file fingerprints at installation time, and so long as you
  still have the original installer .exe, the repair option can
  detect installed files that changed since installation, and
  (optionally) restore them from the original .exe.

+ Aborting an installation in midstream no longer (necessarily) leaves
  a bunch of crap sitting around.  Instead you get a new dialog box
  offering to roll back the changes made so far.  This even works if
  you hit the "Cancel" button on the final "installation finished"
  screen.

+ A Backup directory is created under the root of the Python
  installation, where the installer stores files it changes or
  replaces.  We don't do much of that, but it *does* allow the
  uninstaller to restore Start Menu entries too -- nice for alpha
  and beta testers (before, whatever pre-existing Start Menu
  entries they had were simply wiped out by an uninstall).

+ Since IDLE is an essential part of the Windows Experience for
  most PythonLabs users, I folded the old Tcl/Tk component into
  the main Python interpreter component -- one less checkbox to
  worry about.  Also removed the time-wasting "Welcome!" dialog,
  and made a few cosmetic improvements.  Other than those, the
  look and feel are much the same, it just runs better!

It's slick -- I think you'll like it.

and-if-you-don't-write-a-pep<wink>-ly y'rs  - tim



From aahz@rahul.net  Mon Jul 30 00:32:38 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Sun, 29 Jul 2001 16:32:38 -0700 (PDT)
Subject: [Python-Dev] PEP for adding a decimal type to Python
In-Reply-To: <01072701443301.05085@localhost.localdomain> from "Michael McLay" at Jul 27, 2001 01:44:33 AM
Message-ID: <20010729233239.56D6999C85@waltz.rahul.net>

Michael McLay wrote:
> 
> Absolutely.  The PEP process is supposed to formalize the capture of
> ideas so they can be referenced. This PEP is mostly orthogonal to
> Aahz's proposal.  They can be merged, or we can reference each other's
> PEP.  I'm probably not the best choice for doing the implementation of
> the decimal number semantics, so I'd be happy to work with Aahz.

Note that I am unwilling to discuss this in the context of any PEP
until/unless I finish my implementation.  There is already a spec for
what I'm doing (Cowlishaw), and I see no point in talking until code is
ready for use.  If someone wants to take over my work, I won't complain;
I've already done the easy work.  ;-)
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From paulp@ActiveState.com  Mon Jul 30 00:43:00 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 29 Jul 2001 16:43:00 -0700
Subject: [Python-Dev] Python on Playstation 2
Message-ID: <3B649F84.5D241B38@ActiveState.com>

It isn't publicly available but Python has been ported to the
Playstation 2 video game console. The only weakness is that a binary
distribution wouldn't be useful because the format of Playstation CDs
isn't portable. Jason Asbahr told me about it. Perhaps at the Python
conference he can slip us some bootleg disks with a raw interpreter
prompt "game". It might be somewhat tedious programming with a gamepad
thingee but it would nevertheless be cool to be able to. The real goal
of the port is to use Python as a scripting language for a game with a
silly-sounding name that I can't remember.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From greg@cosc.canterbury.ac.nz  Mon Jul 30 00:47:20 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 30 Jul 2001 11:47:20 +1200 (NZST)
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <200107271547.LAA24634@cj20424-a.reston1.va.home.com>
Message-ID: <200107292347.LAA00409@s454.cosc.canterbury.ac.nz>

Guido:

> > Suggested usage: from stat import *
> > """
> > 
> > Is this still the suggested usage?
> 
> I don't see why not.

Because it flies in the face of the usual advice, which is never to
use import *.

How are we supposed to convince impressionable newbies to stay away
from the evil drug of import * if the docs for one of the standard
modules are brazenly advocating its use?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Mon Jul 30 01:21:33 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 30 Jul 2001 12:21:33 +1200 (NZST)
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com>
Message-ID: <200107300021.MAA00429@s454.cosc.canterbury.ac.nz>

Tim:

> Please run the attached.

SunOS s454 5.7 Generic_106541-10 sun4m sparc SUNW,SPARCstation-4:

0 failures in 10000 tries

SunOS pc200 5.8 Generic_108529-03 i86pc i386 i86pc:

0 failures in 10000 tries


Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From aahz@rahul.net  Mon Jul 30 03:51:47 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Sun, 29 Jul 2001 19:51:47 -0700 (PDT)
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <200107271547.LAA24634@cj20424-a.reston1.va.home.com> from "Guido van Rossum" at Jul 27, 2001 11:47:13 AM
Message-ID: <20010730025148.3279199C85@waltz.rahul.net>

Guido van Rossum wrote:
>Greg Ward:
>> 
>> Suggested usage: from stat import *
>> 
>> Is this still the suggested usage?
> 
> I don't see why not.

Here's why not:

from stat import *
from threading import *   # Look at the docs if you don't believe me
from Tkinter import *
from types import *

If you have a single module that imports all four of these (and I don't
think that's particularly bizarre), tracing back any random symbol to
its source becomes an annoying trek through *five* modules.  There are
probably a few other modules I don't know about that are declared "safe"
for import *.  IMO, this quickly leads to disaster, particularly when
trying to debug someone else's code (and I've wasted more time than I'd
like over this).  

It just plain goes against "explicit is better than implicit".  I think
we should declare a universal policy of NEVER recommending import *,
except for interactive use.
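
For comparison, a minimal sketch of the explicit style argued for above,
written for a current Python 3 interpreter.  Every name carries its module
prefix, so there is nothing to trace back; the variable names are
illustrative only.

```python
import os
import stat

# Each helper is visibly qualified by the module it comes from.
mode = os.stat(os.curdir).st_mode
is_dir = stat.S_ISDIR(mode)        # is the current directory a directory?
permissions = stat.S_IMODE(mode)   # permission bits only

print(is_dir, oct(permissions))
```
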
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From barry@zope.com  Mon Jul 30 04:21:03 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Sun, 29 Jul 2001 23:21:03 -0400
Subject: [Python-Dev] Advice in stat.py
References: <200107271547.LAA24634@cj20424-a.reston1.va.home.com>
 <20010730025148.3279199C85@waltz.rahul.net>
Message-ID: <15204.53919.982508.201595@anthem.wooz.org>

>>>>> "AM" == Aahz Maruch <aahz@rahul.net> writes:

    AM> If you have a single module that imports all four of these
    AM> (and I don't think that's particularly bizarre), tracing back
    AM> any random symbol to its source becomes an annoying trek
    AM> through *five* modules.  There are probably a few other
    AM> modules I don't know about that are declared "safe" for import
    AM> *.  IMO, this quickly leads to disaster, particularly when
    AM> trying to debug someone else's code (and I've wasted more time
    AM> than I'd like over this).

    AM> It just plain goes against "explicit is better than implicit".
    AM> I think we should declare a universal policy of NEVER
    AM> recommending import *, except for interactive use.

Just because you can doesn't mean you should. :)

I think it's a good thing that those modules you mention are declared
safe for import-* but certainly in the situation you describe it isn't
a good idea to use them that way.  I don't remember a situation where
I've ever import-*'d more than a couple of modules in any single file.

There are often good reasons to use import-* at the module global
level.  Mailman has two places where this is used effectively.  The
more interesting place is in a configuration file called mm_cfg.py.
This file is where users are supposed to put all their customizations
overriding out-of-the-box defaults.  At the top of the file there's a
line like

    from Defaults import *

Which brings all the symbols from the out-of-the-box default file
(i.e. Defaults.py) into mm_cfg.py.  Overrides go after this import
line.

Mailman modules always import mm_cfg and never import Defaults, so it
makes for a very convenient way to arrange things so users only have
to care about overriding specific variables, and never have to worry
about the installation procedure overwriting their defaults ("make
install" may write a new Defaults.py but never a mm_cfg.py).
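
The arrangement described above can be sketched in a self-contained way by
building the two modules in memory.  This assumes a current Python 3
interpreter; the setting names SMTP_HOST and SMTP_PORT are hypothetical
stand-ins, not Mailman's actual variables.

```python
import sys
import types

# Stand-in for the shipped Defaults.py (hypothetical settings).
defaults = types.ModuleType("Defaults")
exec("SMTP_HOST = 'localhost'\nSMTP_PORT = 25\n", defaults.__dict__)
sys.modules["Defaults"] = defaults

# Stand-in for the user's mm_cfg.py: pull in every default, then override.
mm_cfg_source = "from Defaults import *\nSMTP_PORT = 2525\n"
mm_cfg = types.ModuleType("mm_cfg")
exec(mm_cfg_source, mm_cfg.__dict__)
sys.modules["mm_cfg"] = mm_cfg

# Application code only ever imports mm_cfg.
import mm_cfg as cfg

print(cfg.SMTP_HOST, cfg.SMTP_PORT)
```

The override lives only in mm_cfg; the Defaults module itself still holds the
shipped value, which is why reinstalling Defaults.py does not disturb local
customizations.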

import-* is often good for creating this kind of transparent aliasing
of one module's namespace into a second.  I know <wink> no one's
talking about outlawing from-import-*.  It needs to be used
judiciously, but it definitely has its uses.

-Barry


From tim.one@home.com  Mon Jul 30 05:46:23 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 30 Jul 2001 00:46:23 -0400
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: <200107300021.MAA00429@s454.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEDCLCAA.tim.one@home.com>

Thanks all for torturing your boxes with fmod()!  Would still like to hear
about Platforms from Mars (Macs <wink>, Tru64, HP-UX), but it got a clean
bill of health on all these:

Win98SE + MSVC 6

SuSE 7.2 system
Linux mira 2.4.4-4GB #1 Wed May 16 00:37:55 GMT 2001 i686 unknown
glibc-2.2.2-38
gcc-2.95.3-52
Python 2.2a0 (#383, Jul 24 2001, 09:26:51)

Solaris 8
SunOS pandora 5.8 Generic_108528-06 sun4u sparc SUNW,Ultra-Enterprise
Python 2.1.1 (#1, Jul 21 2001, 20:59:12)
[GCC 2.95.2 19991024 (release)] on sunos5

Reliant
ReliantUNIX-Y deukalion 5.45 B0032 RM600 4/512 R4000
Python 2.0 (#6, Apr 10 2001, 13:20:15) [C] on reliantunix-y5

Linux
% uname -a
Linux anthem 2.2.18 #21 SMP Mon Jan 8 00:33:29 EST 2001 i686 unknown
% rpm -q libc
libc-5.3.12-31
% gcc --version
egcs-2.91.66

Linux-Mandrake 7.2
Linux kernel 2.2.17
GNU libc 2.1.3 (includes libm), and GCC 2.95.3.

SunOS 5.8 Generic_108529-08

Linux 2.2.18 / glibc 2.1.3

OSF1 cfata6.harvard.edu V4.0 878 alpha
SunOS cfa0 5.8 Generic_108528-06 sun4u sparc SUNW,Ultra-Enterprise
building with or without gcc

SunOS s454 5.7 Generic_106541-10 sun4m sparc SUNW,SPARCstation-4

SunOS pc200 5.8 Generic_108529-03 i86pc i386 i86pc

BSDI, FreeBSD and Linux on Intel hardware

Solaris on SPARC hardware

Linux on (IBM) PPC (SourceForge?)

Linux on (Compaq) Alpha (SourceForge)


Only one failure report, from Thomas Wouters on:

Linux usf-cf-sparc-linux-1 2.2.18pre21 #1 SMP Wed Nov 22 17:27:17 EST 2000
sparc64 unknown (SourceForge)

dying with OverflowError in math.fmod().  Looks like a *badly* buggy fmod()
to me!

If we had tried this a decade ago, *most* platforms would have gotten a
wrong answer on almost every try.  Assuming we don't get a failure report on
Mac, the major platforms do this correctly now, so we can save Python from
growing another few hundred lines of excruciating workaround code:

    http://www.netlib.org/fdlibm/e_fmod.c
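
The script attached to the original request is not reproduced in this
archive.  As a rough idea of what such a check can look like, here is a
sketch for a current Python 3 interpreter that compares math.fmod() with an
exact reference computed in rational arithmetic.  The function names, the
operand ranges and the fixed seed are all assumptions made for illustration.

```python
import math
import random
from fractions import Fraction

def exact_fmod(x, y):
    """C-style fmod computed exactly: result has the sign of x, |r| < |y|."""
    ax, ay = abs(Fraction(x)), abs(Fraction(y))
    quotient = ax // ay              # truncated quotient of the magnitudes
    remainder = float(ax - quotient * ay)
    return remainder if x >= 0 else -remainder

def count_failures(tries=10000, seed=0):
    """Count operand pairs where the platform fmod disagrees with exact_fmod."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(tries):
        # Operands with widely separated exponents stress the reduction loop.
        x = math.ldexp(rng.uniform(0.5, 1.0), rng.randint(-200, 200))
        y = math.ldexp(rng.uniform(0.5, 1.0), rng.randint(-200, 200))
        if math.fmod(x, y) != exact_fmod(x, y):
            failures += 1
    return failures

print("%d failures in %d tries" % (count_failures(1000), 1000))
```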

when-even-windows-gets-it-right-there's-no-excuse-ly y'rs  - tim



From fdrake@acm.org  Mon Jul 30 06:12:17 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 30 Jul 2001 01:12:17 -0400 (EDT)
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <200107292347.LAA00409@s454.cosc.canterbury.ac.nz>
References: <200107271547.LAA24634@cj20424-a.reston1.va.home.com>
 <200107292347.LAA00409@s454.cosc.canterbury.ac.nz>
Message-ID: <15204.60593.503331.453900@cj42289-a.reston1.va.home.com>

Greg Ewing writes:
 > How are we supposed to convince impressionable newbies to stay away
 > from the evil drug of import * if the docs for one of the standard
 > modules are brazenly advocating its use?

  If Guido will back off on saying that it's acceptable to use it that
way, I can assure you the docs will be corrected.  But if he's still
advocating that usage (silly Dutchman), I don't think I should touch
it.
  Even if it is silly.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From fdrake@acm.org  Mon Jul 30 06:23:39 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 30 Jul 2001 01:23:39 -0400 (EDT)
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <20010730025148.3279199C85@waltz.rahul.net>
References: <200107271547.LAA24634@cj20424-a.reston1.va.home.com>
 <20010730025148.3279199C85@waltz.rahul.net>
 <15204.53919.982508.201595@anthem.wooz.org>
Message-ID: <15204.61275.24195.153670@cj42289-a.reston1.va.home.com>

Aahz Maruch writes:
 > If you have a single module that imports all four of these (and I don't
 > think that's particularly bizarre), tracing back any random symbol to

  Only if you import-* them all, and that *is* a pathological case.
Any time you import-* *two* modules, you have a pathological case on
your hands, just ready to explode.

 > It just plain goes against "explicit is better than implicit".  I think
 > we should declare a universal policy of NEVER recommending import *,
 > except for interactive use.

  I'd be willing to give it up even there, esp. now that we have
import-as.

Barry sez:
 > There are often good reasons to use import-* at the module global
 > level.  Mailman has two places where this is used effectively.  The
 > more interesting place is in a configuration file called mm_cfg.py.
 > This file is where users are supposed to put all their customizations
 > overriding out-of-the-box defaults.  At the top of the file there's a

  This is about the only kind of thing I've ever found it useful for:
re-implementing a module's interface, but when I only want to change a
few things.  Needing to do this never feels like a good solution, and
probably indicates that some object needs to accept a parameter that
offers the module's interface instead of finding it by name.
  So yeah, I'd give up import-*.
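
  One way to read the alternative suggested above (pass in an object that
offers the interface, rather than finding a module by name) is sketched
below for a current Python 3 interpreter.  The setting names and both
helper functions are hypothetical.

```python
from types import SimpleNamespace

# Hypothetical out-of-the-box settings.
DEFAULTS = {"smtp_host": "localhost", "smtp_port": 25}

def make_config(**overrides):
    """Return a settings object: the defaults with selected values replaced."""
    settings = dict(DEFAULTS)
    settings.update(overrides)
    return SimpleNamespace(**settings)

def describe_mailer(config):
    """Consumers receive the settings object as a parameter."""
    return "%s:%d" % (config.smtp_host, config.smtp_port)

site_config = make_config(smtp_port=2525)
print(describe_mailer(site_config))
```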


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From fdrake@acm.org  Mon Jul 30 06:26:43 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 30 Jul 2001 01:26:43 -0400 (EDT)
Subject: [Python-Dev] Python on Playstation 2
In-Reply-To: <3B649F84.5D241B38@ActiveState.com>
References: <3B649F84.5D241B38@ActiveState.com>
Message-ID: <15204.61459.319425.84281@cj42289-a.reston1.va.home.com>

Paul Prescod writes:
 > It isn't publicly available but Python has been ported to the
 > Playstation 2 video game console. The only weakness is that a binary
 > distribution wouldn't be useful because the format of Playstation CDs
 > isn't portable. Jason Asbahr told me about it. Perhaps at the Python

  I'm curious:  Is it the filesystem format or the lower-level
tracking format?  If it's only the former, a prepared image should be
useful.
  (And no, I don't have a PS2 waiting to boot up Python!)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From barry@zope.com  Mon Jul 30 06:38:33 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 30 Jul 2001 01:38:33 -0400
Subject: [Python-Dev] Advice in stat.py
References: <200107271547.LAA24634@cj20424-a.reston1.va.home.com>
 <20010730025148.3279199C85@waltz.rahul.net>
 <15204.53919.982508.201595@anthem.wooz.org>
 <15204.61275.24195.153670@cj42289-a.reston1.va.home.com>
Message-ID: <15204.62169.600623.580141@anthem.wooz.org>

>>>>> "Fred" == Fred L Drake, Jr <fdrake@acm.org> writes:

    Fred>   This is about the only kind of thing I've ever found it
    Fred> useful for: re-implementing a module's interface, but when I
    Fred> only want to change a few things.

I call it "aliasing" a module (i.e. aliasing module A's symbols
exported through module B).
    
    Fred> Needing to do this never feels like a good solution, and
    Fred> probably indicates that some object needs to accept a
    Fred> parameter that offers the module's interface instead of
    Fred> finding it by name.

Um, sure, but it can be pretty inconvenient to export 193 symbols
this way :).

>>> len(dir(Defaults))
193

I've often thought that it would be nice to have better delegation
support in Python, and no, __getattr__() doesn't really hack it.  I'm
encouraged that some of the Py2.2 descr-branch stuff might actually
make this valid programming technique more useful and then even I
could see (eventually) giving up on import-*.
    
-Barry


From tim.one@home.com  Mon Jul 30 06:43:36 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 30 Jul 2001 01:43:36 -0400
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <15204.60593.503331.453900@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDGLCAA.tim.one@home.com>

[Fred L. Drake, Jr., on import *]
>   If Guido will back off on saying that it's acceptable to use it that
> way, I can assure you the docs will be corrected.  But if he's still
> advocating that usage (silly Dutchman), I don't think I should touch
> it.
>   Even if it is silly.

Guido never advocates it, but you're not going to get him to say it's Evil
either.  The thing is, an intelligent adult can use import-* profitably and
safely, in the handful of cases an intelligent adult realizes it's
profitable and safe to do so <wink>.

When Jeremy played w/ "import *"-inside-functions warnings, most popped up in
Guido's code.  This was most often

    from Tkinter import *

in a module's _test() function, where doing so was handy and harmless.
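
For readers on Python 3: there, "import *" inside a function is no longer a
warning but is rejected at compile time.  A small sketch that demonstrates
this without importing anything (the math module is used in place of Tkinter
purely for illustration):

```python
# Source text of a function that does "import *" in its body.
source = (
    "def _test():\n"
    "    from math import *\n"
    "    return pi\n"
)

try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError:
    # Python 3 only allows "import *" at module level.
    rejected = True

print(rejected)
```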

I never use it myself -- but then I never write Tkinter code, and "stat" is
some Unix abomination <wink>.

It couldn't hurt to add an "intelligent adult" warning to the docs!  I
suggest an icon showing a refined English lady trying hard not to notice a
West Virginian next to her picking his nose.

perfect-images-are-too-rare-to-pass-up-ly y'rs  - tim



From shang@cc.jyu.fi  Mon Jul 30 07:56:09 2001
From: shang@cc.jyu.fi (Sami Hangaslammi)
Date: Mon, 30 Jul 2001 09:56:09 +0300 (EET DST)
Subject: [Python-Dev] Iterator addition?
Message-ID: <Pine.GSO.4.21.0107300946430.16836-100000@tukki>

Since iterator objects work like sequences in several contexts, maybe they
could support sequence-like operations such as addition. This would let
you write

  for x in iter1 + iter2:
      do_something(x)

instead of

  for x in iter1:
      do_something(x)

  for x in iter2:
      do_something(x)

or the slightly better

  for i in iter1,iter2:
      for x in i:
          do_something(x)


-- Sami Hangaslammi --



From m@moshez.org  Mon Jul 30 09:38:45 2001
From: m@moshez.org (Moshe Zadka)
Date: Mon, 30 Jul 2001 11:38:45 +0300
Subject: [Python-Dev] Iterator addition?
In-Reply-To: <Pine.GSO.4.21.0107300946430.16836-100000@tukki>
References: <Pine.GSO.4.21.0107300946430.16836-100000@tukki>
Message-ID: <E15R8ZZ-0004CO-00@darjeeling>

On Mon, 30 Jul 2001, Sami Hangaslammi <shang@cc.jyu.fi> wrote:

> Since iterator objects work like sequences in several contexts, maybe they
> could support sequence-like operations such as addition. This would let
> you write
> 
>   for x in iter1 + iter2:
>       do_something(x)
> 
> instead of
> 
>   for x in iter1:
>       do_something(x)
> 
>   for x in iter2:
>       do_something(x)
> 
> or the slightly better
> 
>   for i in iter1,iter2:
>       for x in i:
>           do_something(x)

No, instead of:


class concat:

    def __init__(self, *iterators):
        self.iterators = list(iterators)

    def __iter__(self): return self

    def next(self):
        while self.iterators:
            try:
                return self.iterators[0].next() 
            except StopIteration:
                del self.iterators[0]
        else:
            raise StopIteration


for x in concat(iter1, iter2):
    do_something(x)

(Note that the first n-2 lines can be refactored. Wasn't there talk
about having an iterator module with useful stuff like that?)

-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE
Learn Python! http://www.ibiblio.org/obp/thinkCSpy


From just@letterror.com  Mon Jul 30 09:53:35 2001
From: just@letterror.com (Just van Rossum)
Date: Mon, 30 Jul 2001 10:53:35 +0200
Subject: [Python-Dev] Iterator addition?
In-Reply-To: <E15R8ZZ-0004CO-00@darjeeling>
Message-ID: <20010730105341-r01010700-3955aee6-0910-010c@10.0.0.2>

Moshe Zadka wrote:

> No, instead of:
> 
> 
> class concat:
> 
>     def __init__(self, *iterators):
>         self.iterators = list(iterators)
> 
>     def __iter__(self): return self
> 
>     def next(self):
>         while self.iterators:
>             try:
>                 return self.iterators[0].next() 
>             except StopIteration:
>                 del self.iterators[0]
>         else:
>             raise StopIteration
> 
> 
> for x in concat(iter1, iter2):
>     do_something(x)

Or:

from __future__ import generators

def concat(*iterators):
    for i in iterators:
        for x in i:
            yield x

for x in concat(iter1, iter2):
    do_something(x)


Just


From shang@cc.jyu.fi  Mon Jul 30 10:20:54 2001
From: shang@cc.jyu.fi (Sami Hangaslammi)
Date: Mon, 30 Jul 2001 12:20:54 +0300 (EET DST)
Subject: [Python-Dev] Iterator addition?
In-Reply-To: <20010730105341-r01010700-3955aee6-0910-010c@10.0.0.2>
Message-ID: <Pine.GSO.4.21.0107301210410.16836-100000@tukki>

Just van Rossum wrote:

> from __future__ import generators
> 
> def concat(*iterators):
>     for i in iterators:
>         for x in i:
>             yield x
> 
> for x in concat(iter1, iter2):
>     do_something(x)

Yes, this is the solution that I eventually ended up with too. However,
the real point I was trying to raise was, whether iterators should look
like sequences regarding addition, since the two are already exchangeable
in so many places (e.g. tuple unpacking).


Moshe Zadka wrote:

> Wasn't there talk about having an iterator module with useful stuff
> like that?

This would be a great idea. I've ended up with a sizeable bunch of small
utility functions when playing around with generators/iterators in 2.2a1.
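
For readers on later Pythons: the standard library did eventually grow such
a module.  itertools (added in Python 2.3) provides chain(), which does the
concatenation discussed in this thread.  A minimal sketch, with iter1 and
iter2 as placeholder iterators:

```python
from itertools import chain

iter1 = iter([1, 2])
iter2 = iter("ab")

# chain() walks each argument in turn without building an intermediate list.
result = list(chain(iter1, iter2))
print(result)
```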

-- Sami Hangaslammi --



From m@moshez.org  Mon Jul 30 12:05:58 2001
From: m@moshez.org (Moshe Zadka)
Date: Mon, 30 Jul 2001 14:05:58 +0300
Subject: [Python-Dev] Nostalgic Versions
Message-ID: <E15RAs2-0004iE-00@darjeeling>

Where can I find *really* old Python versions? I managed to find
1.2, but I want to get my hands on <1.0 versions if at all possible...

Thanks.
-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE
Learn Python! http://www.ibiblio.org/obp/thinkCSpy


From guido@zope.com  Mon Jul 30 13:42:49 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 08:42:49 -0400
Subject: [Python-Dev] Iterator addition?
In-Reply-To: Your message of "Mon, 30 Jul 2001 12:20:54 +0300."
 <Pine.GSO.4.21.0107301210410.16836-100000@tukki>
References: <Pine.GSO.4.21.0107301210410.16836-100000@tukki>
Message-ID: <200107301242.IAA09350@cj20424-a.reston1.va.home.com>

> the real point I was trying to raise was, whether iterators should look
> like sequences regarding addition, since the two are already exchangeable
> in so many places (e.g. tuple unpacking).

No.

Adding a + operator would conflict in the case an iterator is also a
user-defined object.  Adding a * operator can't work because an
iterator cannot be restarted (you have to extract the iterator afresh
from the original object).  Adding any other sequence operation
(slicing, indexing, len() etc.) flies in the face of the
"forward-only" nature of iterators.

The *only* thing that iterators and sequences have in common is that
they can be iterated over.  So they are substitutable in all contexts
where that's all you do -- including sequence (not tuple!) unpacking.
And not in any other contexts.
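
The distinction can be seen in a few lines on a current Python 3
interpreter.  This is only an illustrative sketch; the variable names are
made up.

```python
seq = [1, 2, 3]
it = iter(seq)

# Sequence unpacking only iterates, so an iterator works here.
a, b, c = it

# The iterator is forward-only: after one pass nothing is left...
leftover = list(it)

# ...whereas the original sequence can hand out a fresh iterator any time.
fresh = list(iter(seq))

# Operations that need more than iteration are not available.
try:
    len(it)
    has_len = True
except TypeError:
    has_len = False

print((a, b, c), leftover, fresh, has_len)
```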

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Mon Jul 30 13:46:39 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 08:46:39 -0400
Subject: [Python-Dev] Nostalgic Versions
In-Reply-To: Your message of "Mon, 30 Jul 2001 14:05:58 +0300."
 <E15RAs2-0004iE-00@darjeeling>
References: <E15RAs2-0004iE-00@darjeeling>
Message-ID: <200107301246.IAA09379@cj20424-a.reston1.va.home.com>

> Where can I find *really* old Python versions? I managed to find
> 1.2, but I want to get my hands on <1.0 versions if at all possible...

You can try to check out by symbolic release tag from the CVS.  I
think it goes back to 0.9.8 and maybe even before.  Building may be
problematic: I think the oldest Makefiles got lost, and some files
were renamed -- CVS logs leave no trails of renaming.

For what purpose, may I ask?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From loewis@informatik.hu-berlin.de  Mon Jul 30 14:10:23 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 30 Jul 2001 15:10:23 +0200 (MEST)
Subject: [Python-Dev] Nostalgic Versions
Message-ID: <200107301310.PAA18619@pandora.informatik.hu-berlin.de>

> Where can I find *really* old Python versions?

Here's what I found:

ftp://ftp.enst.fr/pub/unix/lang/python/src/python1.1.tar.gz
ftp://ftp.warwick.ac.uk/pub/misc/python1.0.1.tar.gz

Regards,
Martin


From m@moshez.org  Mon Jul 30 14:24:56 2001
From: m@moshez.org (Moshe Zadka)
Date: Mon, 30 Jul 2001 16:24:56 +0300
Subject: [Python-Dev] Nostalgic Versions
In-Reply-To: <200107301310.PAA18619@pandora.informatik.hu-berlin.de>
References: <200107301310.PAA18619@pandora.informatik.hu-berlin.de>
Message-ID: <E15RD2W-0005JI-00@darjeeling>

On Mon, 30 Jul 2001, Martin von Loewis <loewis@informatik.hu-berlin.de> wrote:

> Here's what I found:
> 
> ftp://ftp.enst.fr/pub/unix/lang/python/src/python1.1.tar.gz
> ftp://ftp.warwick.ac.uk/pub/misc/python1.0.1.tar.gz

Thanks a lot.

-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE
Learn Python! http://www.ibiblio.org/obp/thinkCSpy


From mal@lemburg.com  Mon Jul 30 13:56:32 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 30 Jul 2001 14:56:32 +0200
Subject: [Python-Dev] Python API version & optional features
Message-ID: <3B655980.948BCDEF@lemburg.com>

Martin has uploaded a patch which modifies the Python API level
number depending on the setting of the compile time option
for internal Unicode width (UCS-2/UCS-4):

https://sourceforge.net/tracker/?func=detail&aid=445717&group_id=5470&atid=305470

I am not sure whether this is the right way to approach this
problem, though, since it affects all extensions -- not only
ones using Unicode.

If at all possible, I'd prefer some other means to 
handle this situation (extension developers are certainly not
going to start shipping binaries for narrow and wide Python
versions if their extension does not happen to use Unicode).

Any ideas ?

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From thomas@xs4all.net  Mon Jul 30 14:39:09 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 30 Jul 2001 15:39:09 +0200
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <15204.61275.24195.153670@cj42289-a.reston1.va.home.com>
Message-ID: <20010730153909.A20676@xs4all.nl>

On Mon, Jul 30, 2001 at 01:23:39AM -0400, Fred L. Drake, Jr. wrote:

> Aahz Maruch writes:
>  > If you have a single module that imports all four of these (and I don't
>  > think that's particularly bizarre), tracing back any random symbol to
> 
>   Only if you import-* them all, and that *is* a pathological case.
> Any time you import-* *two* modules, you have a pathological case on
> your hands, just ready to explode.

We could generate a warning if the compiler detects two or more import *'s
in the same codeblock ;)
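
A rough sketch of such a check, done outside the compiler with the standard
ast module. The function name count_star_imports and the threshold of two
are assumptions made for illustration, not anything the compiler does:

```python
import ast


def count_star_imports(source):
    """Return the number of 'from X import *' statements in source."""
    tree = ast.parse(source)
    count = 0
    for node in ast.walk(tree):
        # 'from X import *' parses as ImportFrom with a single alias named '*'.
        if isinstance(node, ast.ImportFrom):
            if any(alias.name == "*" for alias in node.names):
                count += 1
    return count


sample = "from os import *\nfrom stat import *\nimport sys\n"
found = count_star_imports(sample)
if found >= 2:
    print("warning: %d 'import *' statements in one module" % found)
```

This only counts statements; it makes no attempt to work out which names
actually collide.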

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From fdrake@acm.org  Mon Jul 30 14:40:25 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 30 Jul 2001 09:40:25 -0400 (EDT)
Subject: [Python-Dev] Python API version & optional features
In-Reply-To: <3B655980.948BCDEF@lemburg.com>
References: <3B655980.948BCDEF@lemburg.com>
Message-ID: <15205.25545.353887.299167@cj42289-a.reston1.va.home.com>

M.-A. Lemburg writes:
 > I am not sure whether this is the right way to approach this
 > problem, though, since it affects all extensions -- not only
 > ones using Unicode.

  Given that unicodeobject.h defines many macros and size-sensitive
types in the public API, I don't see any way around this.  If the API
always used UCS4 (including in the macros), or defined both UCS2 and
UCS4 versions of everything affected, then we could get around it.
  That seems like a high price to pay.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From thomas@xs4all.net  Mon Jul 30 14:52:26 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 30 Jul 2001 15:52:26 +0200
Subject: [Python-Dev] Python on Playstation 2
In-Reply-To: <15204.61459.319425.84281@cj42289-a.reston1.va.home.com>
Message-ID: <20010730155226.B20676@xs4all.nl>

On Mon, Jul 30, 2001 at 01:26:43AM -0400, Fred L. Drake, Jr. wrote:

> Paul Prescod writes:
>  > It isn't publicly available but Python has been ported to the
>  > Playstation 2 video game console. The only weakness is that a binary
>  > distribution wouldn't be useful because the format of Playstation CDs
>  > isn't portable. Jason Asbahr told me about. Perhaps at the python

I'd *love* a copy of that :)

>   I'm curious:  Is it the filesystem format or the lower-level
> tracking format?  If it's only the former, a prepared image should be
> useful.

A prepared image won't help. Playstation CDs are copy-protected using a
gimmick that makes images useless (unless you perform or get someone to
perform a probably illegal operation on your console ;)

>   (And no, I don't have a PS2 waiting to boot up Python!)

I do.. It's not exactly doing nothing right now, but it can sure use some
Python :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Mon Jul 30 14:56:51 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 30 Jul 2001 15:56:51 +0200
Subject: [Python-Dev] Python API version & optional features
References: <3B655980.948BCDEF@lemburg.com> <15205.25545.353887.299167@cj42289-a.reston1.va.home.com>
Message-ID: <3B6567A3.E386EAB9@lemburg.com>

"Fred L. Drake, Jr." wrote:
> 
> M.-A. Lemburg writes:
>  > I am not sure whether this is the right way to approach this
>  > problem, though, since it affects all extensions -- not only
>  > ones using Unicode.
> 
>   Given that unicodeobject.h defines many macros and size-sensitive
> types in the public API, I don't see any way around this.  If the API
> always used UCS4 (including in the macros), or defined both UCS2 and
> UCS4 versions of everything affected, then we could get around it.
>   That seems like a high price to pay.

I think Guido suggested using macros to turn the Unicode APIs
into e.g. PyUnicodeUCS4_Encode() vs. PyUnicodeUCS2_Encode().

That would prevent loading of non-compatible extensions using Unicode
APIs (it doesn't catch the argument parser usage, though, e.g. 
"u").

Perhaps that's the way to go ?!

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From gward@python.net  Mon Jul 30 15:08:20 2001
From: gward@python.net (Greg Ward)
Date: Mon, 30 Jul 2001 10:08:20 -0400
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com>; from tim.one@home.com on Sat, Jul 28, 2001 at 04:13:53PM -0400
References: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com>
Message-ID: <20010730100815.A1031@gerg.ca>

On 28 July 2001, Tim Peters said:
> Here's your chance to prove your favorite platform isn't a worthless pile of
> monkey crap <wink>.  Please run the attached.  If it prints anything other
> than

I tried it on a 64-bit SGI box running IRIX 6.5.  It dumped core.  But
there seem to be a *lot* of problems with 2.2a1 on this platform; it died
pretty early in the test suite.  Guess I'll go file some bug reports...

        Greg
-- 
Greg Ward - Unix weenie                                 gward@python.net
http://starship.python.net/~gward/
Life is too short for ordinary music.


From guido@zope.com  Mon Jul 30 15:27:32 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 10:27:32 -0400
Subject: [Python-Dev] Python API version & optional features
In-Reply-To: Your message of "Mon, 30 Jul 2001 15:56:51 +0200."
 <3B6567A3.E386EAB9@lemburg.com>
References: <3B655980.948BCDEF@lemburg.com> <15205.25545.353887.299167@cj42289-a.reston1.va.home.com>
 <3B6567A3.E386EAB9@lemburg.com>
Message-ID: <200107301427.f6UERW802779@odiug.digicool.com>

> >  > I am not sure whether this is the right way to approach this
> >  > problem, though, since it affects all extensions -- not only
> >  > ones using Unicode.
> > 
> >   Given that unicodeobject.h defines many macros and size-sensitive
> > types in the public API, I don't see any way around this.  If the API
> > always used UCS4 (including in the macros), or defined both UCS2 and
> > UCS4 versions of everything affected, then we could get around it.
> >   That seems like a high price to pay.
> 
> I think Guido suggested using macros to turn the Unicode APIs
> into e.g. PyUnicodeUCS4_Encode() vs. PyUnicodeUCS2_Encode().
> 
> That would prevent loading of non-compatible extensions using Unicode
> APIs (it doesn't catch the argument parser usage, though, e.g. 
> "u").
> 
> Perhaps that's the way to go ?!

Hm, the "u" argument parser is a nasty one to catch.  How likely is
this to be the *only* reference to Unicode in a particular extension?

I'm trying to convince myself that the magic number patch is okay, and
here's what I come up with.  If someone builds a Python with a
non-standard Unicode width and accidentally uses a directory full of
extensions built for the standard Unicode width on his platform, he
deserves a warning.  Since most extensions come with source anyway,
people who want to experiment with UCS4 will have to be more
adventurous and build all the extensions they need from source.  The
warnings will remind them.  If there's a particular extension that
they can only get in binary *and* that extension doesn't use Unicode,
they can train themselves to ignore that warning.

These warnings should use the warnings framework, by the way, to make
it easier to ignore a specific warning.  Currently it's a hard write
to stderr.
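
To illustrate what the warnings framework would buy here, a minimal sketch
follows. The message text and the load_extension() helper are hypothetical
stand-ins (the real check lives in C and its wording may differ); only the
warnings calls themselves are the standard library's:

```python
import warnings

# Hypothetical message text; the real wording in modsupport.c may differ.
API_MISMATCH = "Python C API version mismatch for module %s"


def load_extension(name):
    """Stand-in for the import machinery: emit the mismatch warning."""
    warnings.warn(API_MISMATCH % name, RuntimeWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Silence the warning for one known-good binary extension only.
    warnings.filterwarnings(
        "ignore",
        message=r"Python C API version mismatch for module legacy_ext$",
        category=RuntimeWarning,
    )
    load_extension("legacy_ext")   # suppressed by the filter above
    load_extension("other_ext")    # still reported

reported = [str(w.message) for w in caught]
print(reported)
```

With a hard write to stderr, neither selective filtering nor capturing the
message like this is possible.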

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Mon Jul 30 15:51:13 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 30 Jul 2001 16:51:13 +0200
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: <20010730100815.A1031@gerg.ca>
References: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com> <20010730100815.A1031@gerg.ca>
Message-ID: <20010730165113.C20676@xs4all.nl>

On Mon, Jul 30, 2001 at 10:08:20AM -0400, Greg Ward wrote:
> On 28 July 2001, Tim Peters said:
> > Here's your chance to prove your favorite platform isn't a worthless pile of
> > monkey crap <wink>.  Please run the attached.  If it prints anything other
> > than

> I tried it on a 64-bit SGI box running IRIX 6.5.  It dumped core.  But
> there seem to be a *lot* of problems with 2.2a1 on this platform; it died
> pretty early in the test suite.  Guess I'll go file some bug reports...

Note that I didn't see Tim asking for a test on 2.2a1, and I didn't test it
on 2.2 on anything but my own Linux box. Instead, I used 2.1.1 (since I already
had binaries for all SourceForge compilefarm machines and all of our
production machines :) and Tim hasn't complained yet.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Mon Jul 30 15:59:38 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 30 Jul 2001 16:59:38 +0200
Subject: [Python-Dev] Python API version & optional features
References: <3B655980.948BCDEF@lemburg.com> <15205.25545.353887.299167@cj42289-a.reston1.va.home.com>
 <3B6567A3.E386EAB9@lemburg.com> <200107301427.f6UERW802779@odiug.digicool.com>
Message-ID: <3B65765A.9706A4A2@lemburg.com>

Guido van Rossum wrote:
> 
> > >  > I am not sure whether this is the right way to approach this
> > >  > problem, though, since it affects all extensions -- not only
> > >  > ones using Unicode.
> > >
> > >   Given that unicodeobject.h defines many macros and size-sensitive
> > > types in the public API, I don't see any way around this.  If the API
> > > always used UCS4 (including in the macros), or defined both UCS2 and
> > > UCS4 versions of everything affected, then we could get around it.
> > >   That seems like a high price to pay.
> >
> > I think Guido suggested using macros to turn the Unicode APIs
> > into e.g. PyUnicodeUCS4_Encode() vs. PyUnicodeUCS2_Encode().
> >
> > That would prevent loading of non-compatible extensions using Unicode
> > APIs (it doesn't catch the argument parser usage, though, e.g.
> > "u").
> >
> > Perhaps that's the way to go ?!
> 
> Hm, the "u" argument parser is a nasty one to catch.  How likely is
> this to be the *only* reference to Unicode in a particular extension?

It is not very likely but IMHO possible for e.g. extensions
which rely on the fact that wchar_t == Py_UNICODE and then do
direct interfacing to some other third party code.

I guess one could argue that extension writers should check
for narrow/wide builds in their extensions before using Unicode.

Since the number of Unicode extension writers is much smaller 
than the number of users, I think that this approach would be 
reasonable, provided that we document the problem clearly in the 
NEWS file.
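
At the Python level, a narrow/wide check could look roughly like the sketch
below. sys.maxunicode is a real attribute; the helper names and the
ImportError policy are assumptions for illustration, and a C extension would
need the equivalent test at compile time. On Python 3.3 and later the
distinction is gone and sys.maxunicode is always 0x10FFFF:

```python
import sys


def is_wide_build():
    """True when code points above U+FFFF fit in a single code unit."""
    return sys.maxunicode > 0xFFFF


def require_unicode_width(expected_wide):
    """Raise ImportError if the running interpreter's width differs."""
    if is_wide_build() != expected_wide:
        raise ImportError(
            "extension built for a %s Unicode build"
            % ("wide" if expected_wide else "narrow")
        )


# Passes by construction; a real extension would hard-code the width
# it was compiled for instead of asking the interpreter.
require_unicode_width(is_wide_build())
```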

> I'm trying to convince myself that the magic number patch is okay, and
> here's what I come up with.  If someone builds a Python with a
> non-standard Unicode width and accidentally uses a directory full of
> extensions built for the standard Unicode width on his platform, he
> deserves a warning.  Since most extensions come with source anyway,
> people who want to experiment with UCS4 will have to be more
> adventurous and build all the extensions they need from source.  The
> warnings will remind them.  If there's a particular extension that
> they can only get in binary *and* that extension doesn't use Unicode,
> they can train themselves to ignore that warning.

Hmm, that would probably not make UCS-4 builds very popular ;-)
 
> These warnings should use the warnings framework, by the way, to make
> it easier to ignore a specific warning.  Currently it's a hard write
> to stderr.

Using the warnings framework would indeed be a good idea (many older
extensions work just fine even with later API levels; the warnings
are annoying, though) !

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From paulp@ActiveState.com  Mon Jul 30 16:09:04 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 30 Jul 2001 08:09:04 -0700
Subject: [Python-Dev] Python on Playstation 2
References: <3B649F84.5D241B38@ActiveState.com> <15204.61459.319425.84281@cj42289-a.reston1.va.home.com>
Message-ID: <3B657890.98ED00CB@ActiveState.com>

"Fred L. Drake, Jr." wrote:
> 
> Paul Prescod writes:
>  > It isn't publicly available but Python has been ported to the
>  > Playstation 2 video game console. The only weakness is that a binary
>  > distribution wouldn't be useful because the format of Playstation CDs
>  > isn't portable. Jason Asbahr told me about. Perhaps at the python
> 
>   I'm curious:  Is it the filesystem format or the lower-level
> tracking format?  If it's only the former, a prepared image should be
> useful.
>   (And no, I don't have a PS2 waiting to boot up Python!)

Most likely those "in the know" aren't even allowed to tell us that
much. For all I know the filesystem is encrypted. Remember that these
game manufacturers do NOT want an independent third party market to
arise. They want you to come to them for the specs.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From gward@python.net  Mon Jul 30 16:46:06 2001
From: gward@python.net (Greg Ward)
Date: Mon, 30 Jul 2001 11:46:06 -0400
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: <20010730100815.A1031@gerg.ca>; from gward@python.net on Mon, Jul 30, 2001 at 10:08:20AM -0400
References: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com> <20010730100815.A1031@gerg.ca>
Message-ID: <20010730114606.B1031@gerg.ca>

On 30 July 2001, I said:
> I tried it on a 64-bit SGI box running IRIX 6.5.  It dumped core.

OK, it worked this time.  I guess I improved my py-karma by building
seventeen different ways and submitting bug reports for all the problems
I had building on this platform.  Details:

$ uname -a
IRIX64 mouldy 6.5 10181058 IP27

$ hinv
4 180 MHZ IP27 Processors
CPU: MIPS R10000 Processor Chip Revision: 2.6
FPU: MIPS R10010 Floating Point Chip Revision: 0.0
[...]

$ time ./python ~/ffmod.py
0 failures in 10000 tries
51.671u 0.083s 0:51.99 99.5% 0+0k 0+0io 0pf+0w

        Greg
-- 
Greg Ward - nerd                                        gward@python.net
http://starship.python.net/~gward/
There are no stupid questions -- only stupid people.


From gward@python.net  Mon Jul 30 16:48:06 2001
From: gward@python.net (Greg Ward)
Date: Mon, 30 Jul 2001 11:48:06 -0400
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com>; from tim.one@home.com on Sat, Jul 28, 2001 at 04:13:53PM -0400
References: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com>
Message-ID: <20010730114806.C1031@gerg.ca>

On 28 July 2001, Tim Peters said:
> Here's your chance to prove your favorite platform isn't a worthless pile of
> monkey crap <wink>.  Please run the attached.  If it prints anything other
> than

Oops, another data point: I didn't see an AMD Athlon or Linux 2.4 in
your list of successes, so here's one:

$ uname -a
Linux cthulhu 2.4.2 #1 Thu May 3 14:30:48 EST 2001 i686 unknown

$ cat /proc/cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 4
model name      : AMD Athlon(tm) Processor
stepping        : 2
cpu MHz         : 807.197
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips        : 1608.90

$ time python ffmod.py
0 failures in 10000 tries
python ffmod.py  5.68s user 0.01s system 92% cpu 6.153 total

        Greg
-- 
Greg Ward - Unix weenie                                 gward@python.net
http://starship.python.net/~gward/
God is real, unless declared integer.


From guido@zope.com  Mon Jul 30 16:47:43 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 11:47:43 -0400
Subject: [Python-Dev] Python API version & optional features
In-Reply-To: Your message of "Mon, 30 Jul 2001 16:59:38 +0200."
 <3B65765A.9706A4A2@lemburg.com>
References: <3B655980.948BCDEF@lemburg.com> <15205.25545.353887.299167@cj42289-a.reston1.va.home.com> <3B6567A3.E386EAB9@lemburg.com> <200107301427.f6UERW802779@odiug.digicool.com>
 <3B65765A.9706A4A2@lemburg.com>
Message-ID: <200107301547.f6UFlhB02991@odiug.digicool.com>

> > Hm, the "u" argument parser is a nasty one to catch.  How likely is
> > this to be the *only* reference to Unicode in a particular extension?
> 
> It is not very likely but IMHO possible for e.g. extensions
> which rely on the fact that wchar_t == Py_UNICODE and then do
> direct interfacing to some other third party code.
> 
> I guess one could argue that extension writers should check
> for narrow/wide builds in their extensions before using Unicode.
> 
> Since the number of Unicode extension writers is much smaller 
> than the number of users, I think that this approach would be 
> reasonable, provided that we document the problem clearly in the 
> NEWS file.

OK.  I approve.

> Hmm, that would probably not make UCS-4 builds very popular ;-)

Do you have any reason to assume that it would be popular otherwise?
:-) :-) :-)

> > These warnings should use the warnings framework, by the way, to make
> > it easier to ignore a specific warning.  Currently it's a hard write
> > to stderr.
> 
> Using the warnings framework would indeed be a good idea (many older
> extensions work just fine even with later API levels; the warnings
> are annoying, though) !

Exactly.

I'm not going to make the change, but it should be a two-liner in
Python/modsupport.c:Py_InitModule4().

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@rahul.net  Mon Jul 30 16:49:36 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Mon, 30 Jul 2001 08:49:36 -0700 (PDT)
Subject: [Python-Dev] pep-discuss
In-Reply-To: <3B62EB05.396DF4D7@ActiveState.com> from "Paul Prescod" at Jul 28, 2001 09:40:37 AM
Message-ID: <20010730154936.AE36899C94@waltz.rahul.net>

Paul Prescod wrote:
> 
> We've talked about having a mailing list for general PEP-related
> discussions. Two things make me think that revisiting this would be a
> good idea right now.
> 
> First, the recent loosening up of the python-dev rules threatens the
> quality of discussion about bread and butter issues such as patch
> discussions and process issues.
> 
> Second, the flamewar on python-list basically drowned out the usual
> newbie questions and would give a person coming new to Python a very
> negative opinion about the language's future and the friendliness of the
> community. I would rather redirect as much as possible of that to a list
> that only interested participants would have to endure.

While what you say makes sense, overall, there are a lot of people (me
included) who prefer discussion on newsgroups, and I can't quite see
creating a newsgroup for PEP discussions yet.  Call me -0.25 for kicking
discussion off c.l.py and +0.25 for getting it off python-dev.
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From esr@thyrsus.com  Mon Jul 30 04:56:44 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sun, 29 Jul 2001 23:56:44 -0400
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: <20010730114806.C1031@gerg.ca>; from gward@python.net on Mon, Jul 30, 2001 at 11:48:06AM -0400
References: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com> <20010730114806.C1031@gerg.ca>
Message-ID: <20010729235644.A13628@thyrsus.com>

Greg Ward <gward@python.net>:
> On 28 July 2001, Tim Peters said:
> > Here's your chance to prove your favorite platform isn't a worthless pile of
> > monkey crap <wink>.  Please run the attached.  If it prints anything other
> > than
> 
> Oops, another data point: I didn't see an AMD Athlon or Linux 2.4 in
> your list of successes, so here's one:

My RH 7.1 success report should have been more specific.  Also 2.4.2,
running on a Dual Pentium II box.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Before a standing army can rule, the people must be disarmed, as they
are in almost every kingdom in Europe. The supreme power in America
cannot enforce unjust laws by the sword, because the people are armed,
and constitute a force superior to any band of regular troops.
	-- Noah Webster


From aahz@rahul.net  Mon Jul 30 17:20:48 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Mon, 30 Jul 2001 09:20:48 -0700 (PDT)
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com> from "Tim Peters" at Jul 28, 2001 04:13:53 PM
Message-ID: <20010730162048.E922999C9F@waltz.rahul.net>

Tim Peters wrote:
> 
> Here's your chance to prove your favorite platform isn't a worthless pile of
> monkey crap <wink>.  Please run the attached.  If it prints anything other
> than
> 
> 0 failures in 10000 tries
> 
> it will probably print a lot.  In that case I'd like to know which flavor of
> C+libc+libm you're using, and the OS; a few of the failures it prints may be
> helpful too.  

Successful with Python 2.0 on NetBSD (unknown CPU) and Win98SE with
Athlon.
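
Tim's attached script is not reproduced in this thread. For readers who want
to try something similar, here is an independent sketch (not Tim's code) that
compares math.fmod against an exact rational computation. The operand ranges,
seed, and number of tries are arbitrary assumptions:

```python
import math
import random
from fractions import Fraction


def exact_fmod(x, y):
    """Exact x - n*y with n = trunc(x/y), computed in rational arithmetic."""
    fx, fy = Fraction(x), Fraction(y)
    n = int(fx / fy)  # int() truncates toward zero
    return fx - n * fy


def count_fmod_failures(tries=1000, seed=0):
    """Count operand pairs where the platform fmod is not exact."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(tries):
        # Spread magnitudes over a wide range of binary exponents.
        x = math.ldexp(rng.uniform(-1.0, 1.0), rng.randint(-200, 200))
        y = math.ldexp(rng.uniform(0.5, 1.0), rng.randint(-200, 200))
        if Fraction(math.fmod(x, y)) != exact_fmod(x, y):
            failures += 1
    return failures


print("%d failures in %d tries" % (count_fmod_failures(), 1000))
```

The exact result of fmod is always representable as a double, so a correct
platform fmod should produce no failures here.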
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From mclay@nist.gov  Mon Jul 30 16:06:52 2001
From: mclay@nist.gov (Michael McLay)
Date: Mon, 30 Jul 2001 11:06:52 -0400
Subject: [Python-Dev] Revised decimal type PEP
Message-ID: <0107301106520A.02216@fermi.eeel.nist.gov>

PEP: 2XX
Title: Adding a Decimal type to Python
Version: $Revision:$
Author: mclay@nist.gov <mclay@nist.gov>
Status: Draft
Type: ??
Created: 25-Jul-2001
Python-Version: 2.2


Introduction

    This PEP describes the addition of a decimal number type to Python.


Rationale

    The original Python numerical model included int, float, and long.
    By popular request the complex type was added to improve support
    for engineering and scientific applications.  The addition of a
    decimal number type to Python will improve support for business
    applications as well as improve the utility of Python as a teaching
    language.

    The number types currently used in Python are encoded as base two
    binary numbers.  The base 2 arithmetic used by binary numbers closely
    approximates the decimal number system and for many applications the
    differences in the calculations are unimportant.  The decimal number
    type encodes numbers as decimal digits and uses base 10 arithmetic.
    This is the number system taught to the general public and it is the
    system used by businesses when making financial calculations.

    For financial and accounting applications the difference between
    binary and decimal types is significant. Consequently the computer
    languages used for business application development, such as COBOL,
    use decimal types.  

    The decimal number type meets the expectations of non-computer
    scientists when making calculations.  For these users the rounding
    errors that occur when using binary numbers are a source of confusion
    and irritation.
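
    The kind of surprise meant here can be shown with a small sketch.
    It uses the decimal module from today's standard library as a
    stand-in for the proposed type; the 'd' literal syntax described
    in this PEP is not assumed to exist:

```python
from decimal import Decimal

# Binary floats cannot represent 0.1, 0.2 or 0.3 exactly, so the
# sum picks up a rounding error.
binary_ok = (0.1 + 0.2 == 0.3)

# Decimal values built from strings hold the written digits exactly.
decimal_ok = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(binary_ok, decimal_ok)
```

    The binary comparison fails while the decimal one succeeds, which
    is the behavior a non-specialist expects from the decimal case.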


Implementation

    The tokenizer will be modified to recognize number literals with
    a 'd' suffix, and a decimal() function will be added to __builtin__.
    A decimal number can represent both integers and floating point
    numbers, and can also be written in scientific notation.  Examples
    of decimal numbers include:

        1234d
        -1234d
        1234.56d
        -1234.56d
        1234.56e2d
        -1234.56e-2d

    The type returned by either a decimal floating point or a decimal
    integer is the same: 

    >>> type(12.2d)
    <type 'decimal'>
    >>> type(12d)
    <type 'decimal'>
    >>> type(-12d+12d)
    <type 'decimal'>
    >>> type(12d+12.0d)
    <type 'decimal'>

    This proposal will also add an optional  'b' suffix to the
    representation of binary float type literals and binary int type
    literals. 

    >>> float(12b)
    12.0
    >>> type(12.2b)
    <type 'float'>
    >>> type(float(12b))
    <type 'float'>
    >>> type(12b)
    <type 'int'>

    The decimal() conversion function added to __builtin__ will support
    conversions of strings and binary types to decimal.

    >>> type(decimal("12d"))
    <type 'decimal'>
    >>> type(decimal("12"))
    <type 'decimal'>
    >>> type(decimal(12b))
    <type 'decimal'>
    >>> type(decimal(12.0b))
    <type 'decimal'>
    >>> type(decimal(123456789123L))
    <type 'decimal'>

    The conversion functions int() and float() in the __builtin__ module
    will support conversion of decimal numbers to the binary number
    types. 

    >>> type(int(12d))
    <type 'int'>
    >>> type(float(12.0d))
    <type 'float'>

    Expressions that mix integers with decimals will automatically convert
    the integer to decimal and the result will be a decimal number.

    >>> type(12d + 4b)
    <type 'decimal'>
    >>> type(12b + 4d)
    <type 'decimal'>
    >>> type(12d + len('abc'))
    <type 'decimal'>
    >>> 3d/4b
    0.75

    Expressions that mix binary floats with decimals introduce the
    possibility of unexpected results because the two number types use
    different internal representations for the same numerical value.  The
    severity of this problem is dependent on the application domain.  For
    applications that normally use binary numbers the error may not be
    important and the conversion should be done silently.  For newbie
    programmers a warning should be issued so the newbie will be able to
    locate the source of a discrepancy between the expected results and
    the results that were achieved.  For financial applications the mixing
    of binary floats with decimal numbers should raise an exception.

    To accommodate the three possible usage models the python interpreter
    command line options will be used to set the level for warning and 
    error messages. The three levels are:   

    promiscuous mode    -f or --promiscuous
    safe mode           -s or --safe
    pedantic mode       -p or --pedantic

    The default setting will be set to the safe setting. In safe mode
    mixing decimal and binary floats in a calculation will trigger a warning
    message. 

    >>> type(12.3d + 12.2b)
    Warning: the calculation mixes decimal numbers with binary floats
    <type 'decimal'>

    In promiscuous mode warnings will be turned off.

    >>> type(12.3d + 12.2b)
    <type 'decimal'>

    In pedantic mode the warnings from safe mode will be turned into exceptions.

    >>> type(12.3d + 12.2b)
    Traceback (innermost last):
      File "<stdin>", line 1, in ?
    TypeError: the calculation mixes decimal numbers with binary floats
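
    The three modes can be sketched in ordinary Python as follows.
    This is an illustration only: mixed_add() and its mode argument
    are hypothetical, the proposed command line options are not
    implemented anywhere, and the decimal module from today's standard
    library stands in for the proposed type:

```python
import warnings
from decimal import Decimal

MIX_MESSAGE = "the calculation mixes decimal numbers with binary floats"


def mixed_add(dec, flt, mode="safe"):
    """Add a Decimal and a float under one of the three proposed modes."""
    if mode == "pedantic":
        raise TypeError(MIX_MESSAGE)
    if mode == "safe":
        warnings.warn(MIX_MESSAGE, RuntimeWarning, stacklevel=2)
    elif mode != "promiscuous":
        raise ValueError("unknown mode: %r" % (mode,))
    # Decimal(float) converts the stored binary value exactly.
    return dec + Decimal(flt)


print(mixed_add(Decimal("12.3"), 12.5, mode="promiscuous"))
```

    For comparison, the decimal module that later shipped with Python
    behaves like pedantic mode for arithmetic: adding a Decimal and a
    float raises TypeError.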


Semantics of Decimal Numbers

    ??


From skip@pobox.com (Skip Montanaro)  Mon Jul 30 17:53:44 2001
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 30 Jul 2001 11:53:44 -0500
Subject: [Python-Dev] Nostalgic Versions
In-Reply-To: <200107301246.IAA09379@cj20424-a.reston1.va.home.com>
References: <E15RAs2-0004iE-00@darjeeling>
 <200107301246.IAA09379@cj20424-a.reston1.va.home.com>
Message-ID: <15205.37144.824975.214559@beluga.mojam.com>

    >> Where can I find *really* old Python versions? I managed to find
    >> 1.2, but I want to get my hands on <1.0 versions if at all possible...

    Guido> You can try to check out ...

    Guido> For what purpose, may I ask?

I'll wager Moshe is planning on "fixing" division. ;-)

Skip



From fdrake@acm.org  Mon Jul 30 17:51:57 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 30 Jul 2001 12:51:57 -0400 (EDT)
Subject: [Python-Dev] Advice in stat.py
In-Reply-To: <15204.62169.600623.580141@anthem.wooz.org>
References: <200107271547.LAA24634@cj20424-a.reston1.va.home.com>
 <20010730025148.3279199C85@waltz.rahul.net>
 <15204.53919.982508.201595@anthem.wooz.org>
 <15204.61275.24195.153670@cj42289-a.reston1.va.home.com>
 <15204.62169.600623.580141@anthem.wooz.org>
Message-ID: <15205.37037.725140.801916@cj42289-a.reston1.va.home.com>

Barry A. Warsaw writes:
 > Um, sure, but it can be pretty inconvenient to export 193 symbols
 > this way :).

  Yeah, that's a lot.  ;-)

 > I've often thought that it would be nice to have better delegation
 > support in Python, and no __getattr__() doesn't really hack it.  I'm
 > encouraged that some of the Py2.2 descr-branch stuff might actually
 > make this valid programming technique more useful and then even I
 > could see (eventually) giving up on import-*.

  Definitely; it would be good to have nicer delegation support.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From guido@zope.com  Mon Jul 30 18:40:06 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 13:40:06 -0400
Subject: [Python-Dev] pep-discuss
In-Reply-To: Your message of "Mon, 30 Jul 2001 08:49:36 PDT."
 <20010730154936.AE36899C94@waltz.rahul.net>
References: <20010730154936.AE36899C94@waltz.rahul.net>
Message-ID: <200107301740.f6UHe6K03226@odiug.digicool.com>

> Paul Prescod wrote:
> > 
> > We've talked about having a mailing list for general PEP-related
> > discussions. Two things make me think that revisiting this would be a
> > good idea right now.
> > 
> > First, the recent loosening up of the python-dev rules threatens the
> > quality of discussion about bread and butter issues such as patch
> > discussions and process issues.
> > 
> > Second, the flamewar on python-list basically drowned out the usual
> > newbie questions and would give a person coming new to Python a very
> > negative opinion about the language's future and the friendliness of the
> > community. I would rather redirect as much as possible of that to a list
> > that only interested participants would have to endure.
> 
> While what you say makes sense, overall, there are a lot of people (me
> included) who prefer discussion on newsgroups, and I can't quite see
> creating a newsgroup for PEP discussions yet.  Call me -0.25 for kicking
> discussion off c.l.py and +0.25 for getting it off python-dev.

For me personally, it would just be another list to follow, no matter
where it happens, so consider me -0.  I won't object if a majority on
python-dev wants this though.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Mon Jul 30 06:48:59 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 30 Jul 2001 01:48:59 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
Message-ID: <20010730014859.A15971@thyrsus.com>

The 2001 O'Reilly Open Source Convention was, as usual, a very stimulating
event and a forum for a lot of valuable high-level conversations between 
the principal developers of many open-source projects.

Many of you know that I maintain friendly and relatively close
relations with a number of senior Perl hackers, including both Larry
Wall himself and others like Chip Salzenberg, Randal Schwartz, Tom
Christiansen, Adam Turoff, and more recently Simon Cozens (who lurks on
this list these days).  At OSCon I believe I got a pretty good picture
of what the leaders of the Perl community are thinking and planning
these days.  They have definitely come out of the slump they were in a
year ago -- there's a much-renewed sense of energy over there.

I think their plans offer both the Perl and Python communities some
large strategic opportunities.  Specifically, I'm urging the Python
community's leadership to seriously explore the possibility of helping
make the Parrot hoax into a working reality.  I have discussed this
with Guido by phone, and though he is skeptical about such an
implementation being actually possible, he also thinks the idea has
tremendous potential and says he is willing to support it in public.

The Perl people have blocked out an architecture for Perl 6 that
envisages a new bytecode level, designed and implemented from
scratch.  They're very serious about this; I participated in some
discussions of the bytecode design (and, incidentally, argued that
the bytecode should emulate a stack rather than a register machine
because the cost/speed disparities that justify register architectures
in hardware don't exist in a software VM).
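
To make the stack-machine point concrete, here is a small sketch that uses the `dis` module of a later CPython to list the instructions for `a + b`. Exact opcode names vary between Python versions, so treat the output as illustrative only. The two operands are pushed by separate loads and the add instruction takes them from the stack, whereas a register design would encode source and destination registers in the instruction itself.

```python
import dis

# Compile a single expression and collect the names of its bytecode
# instructions, in execution order.
code = compile("a + b", "<demo>", "eval")
opnames = [instr.opname for instr in dis.get_instructions(code)]

for name in opnames:
    print(name)
```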

The Perl people are receptive to -- indeed, some of them are actively pushing --
the idea that their new bytecode should not be Perl-specific.  Dan Sugalski,
the current lead for the bytecode interpreter project, has named it Parrot.
At the Perl 6 talk I attended, Chip Salzenberg speculated in public about 
possibly supporting a common runtime for Perl, Python, Ruby, and Intercal(!).

One of the things that makes this an unprecedented opportunity is that
the design of Perl 6 is not yet set in stone -- and Larry has already
shown a willingness to move it in a Pythonward direction.  Syntactically,
Perl 5's -> syntax is going away to be replaced by a Python-like dot
with implicit dereferencing (and Larry said in public this was
Python's influence, not Java's).  The languages have of course converged
in other ways recently -- Python's new lexical scoping actually brings
it closer to Perl "use strict" semantics.

I believe the way is open for Python's leading technical people to be
involved as co-equals in the design and implementation of the Parrot
bytecode interpreter.  I have even detected some willingness to use
Python's existing bytecode as a design model for Parrot, and perhaps
even portions of the Python interpreter codebase!

One bold but possibly workable proposal would be to offer Dan and the
Parrot project the Python bytecode interpreter as a base for the Parrot
code, and then be willing to incorporate whatever (probably relatively
minor) extensions are needed to support Perl 6's primitives.

Following my conversation with Guido, I've put doing an architectural
comparison of the existing Python and Perl bytecodes at the top of my
priority list.  I'm flying to Taipei tomorrow and will have a lot of
hours on airplanes with my laptop to do this.

Committing to a common runtime with Perl would mean relinquishing
exclusive design control of our bytecode level, but the Perl people
themselves seem willing to give up *their* exclusive control to make
this work.  It is rather remarkable how respectful of Python they have
become, and I can't emphasize enough that I think they are truly ready
for us to come to the project as equal partners.

(One important place where I think everybody understands the Python
side of the force would clearly win out in a final Parrot design is in
the extension-and-embedding facilities.  Perl XS is acknowledged to be
a nasty mess.  My guess is the Perl guys would drop it like a hot rock
for our stuff -- that would be as clear a win for them as co-opting
Perl-style regexps was for us.)

I think the benefits of a successful unification at the bytecode
level, together with Larry's willingness to Pythonify the upper level
of Perl 6 a bit, could be vast -- both for the Python community in
particular and for scripting-language users in general.

1. Mixed-language programming in Perl and Python could become almost
seamless, with all that implies for both languages getting the use of
each others' libraries.

2. The prospects for getting decent Python compilation to native code would
improve if both the Python and Perl communities were strongly motivated 
to solve the bytecode-compilation problem.

3. More generally, the fact remains that Perl's user/developer base is
still much larger than ours.  Successful unification would co-opt a
lot of that energy for Python.  Because the brain drain between Perl
and Python is pretty much unidirectional in Python's favor (a fact
even the top Perl hackers ruefully acknowledge), I don't think we need
worry about being subsumed in that larger community either.

I think there is a wonderful opportunity here for the Python and Perl
developers to lead the open-source world.  If we can do a good Parrot
design together, I think it will be hard for the smaller scripting
language communities to resist its pull.  Ultimately, the common
Parrot runtime could become the open-source community's answer -- a
very competitive answer -- to the common runtime Microsoft is
pushing for .NET.

I think trying to make Parrot work would be worth some serious effort.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The Bible is not my book, and Christianity is not my religion.  I could never
give assent to the long, complicated statements of Christian dogma.
	-- Abraham Lincoln


From mwh@python.net  Mon Jul 30 19:16:26 2001
From: mwh@python.net (Michael Hudson)
Date: 30 Jul 2001 14:16:26 -0400
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
In-Reply-To: Guido van Rossum's message of "Sat, 28 Jul 2001 09:57:28 -0400"
References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> <200107281357.JAA30859@cj20424-a.reston1.va.home.com>
Message-ID: <2m3d7emged.fsf@starship.python.net>

Guido van Rossum <guido@zope.com> writes:

> > Not directly relevant to the PEP, but...
> > 
> > Guido van Rossum <guido@zope.com> writes:
> > 
> > >     Q. What about code compiled by the codeop module?
> > > 
> > >     A. Alas, this will always use the default semantics (set by the -D
> > >        command line option).  This is a general problem with the
> > >        future statement; PEP 236[4] lists it as an unresolved
> > >        problem.  You could have your own clone of codeop.py that
> > >        includes a future division statement, but that's not a general
> > >        solution.
> > 
> > Did you look at my Nasty Hack(tm) to bodge around this?  It's at 
> > 
> >     http://starship.python.net/crew/mwh/hacks/codeop-hack.diff
> > 
> > if you haven't.  I'm not sure it will work with what you're planning
> > for division, but it works for generators (and worked for nested
> > scopes when that was relevant).
> 
> Ouch.  Nasty.  Hat off to you for thinking of this!

I'll choose to take this as a positive remark :-)

> > There are a host of saner ways round this, of course - like adding an
> > optional "flags" argument to compile, for instance.
> 
> We'll have to keep that in mind.

Here's a fairly short pre-PEP on the issue.  If I haven't made any
gross editorial blunders, can Barry give it a number and check the
sucker in?

PEP: XXXX
Title: Supporting __future__ statements in simulated shells
Version: $Version:$
Author: Michael Hudson <mwh@python.net>
Status: Draft
Type: Standards Track
Requires: 0236
Created: 30-Jul-2001
Python-Version: 2.2
Post-History: 

Abstract

    As noted in PEP 236, there is no clear way for "simulated
    interactive shells" to simulate the behaviour of __future__
    statements in "real" interactive shells, i.e. have __future__
    statements' effects last the life of the shell.

    This short PEP proposes to make this possible by adding an
    optional fourth argument to the builtin function "compile" and
    adding machinery to the standard library modules "codeop" and
    "code" to make the construction of such shells easy.

Specification

    I propose adding a fourth, optional, "flags" argument to the
    builtin "compile" function.  If this argument is omitted, there
    will be no change in behaviour from that of Python 2.1.

    If it is present it is expected to be an integer, representing
    various possible compile time options as a bitfield.  The
    bitfields will have the same values as the PyCF_* flags #defined
    in Include/pythonrun.h (at the time of writing there are only two
    - PyCF_NESTED_SCOPES and PyCF_GENERATORS).  These are currently
    not exposed to Python, so I propose adding them to codeop.py
    (because it's already here, basically).

    XXX Should the supplied flags be or-ed with the flags of the
    calling frame, or do we override them?  I'm for the former,
    slightly.
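
The open question above comes down to a one-line difference, which can be sketched as follows. The bit values are placeholders chosen for illustration (the real PyCF_* constants are defined in Include/pythonrun.h and are not assumed here), and `effective_flags` is a hypothetical helper, not part of the proposal.

```python
# Illustrative bit values only; not the real PyCF_* constants.
PyCF_NESTED_SCOPES = 0x01
PyCF_GENERATORS = 0x02


def effective_flags(supplied, inherited, override=False):
    """Combine flags passed to compile() with those of the calling frame.

    With override=False the two sets are or-ed together (the option the
    draft slightly prefers); with override=True the supplied flags win.
    """
    if override:
        return supplied
    return supplied | inherited


print(effective_flags(PyCF_GENERATORS, PyCF_NESTED_SCOPES))
print(effective_flags(PyCF_GENERATORS, PyCF_NESTED_SCOPES, override=True))
```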

    I also propose adding a pair of classes to the standard library
    module codeop.

    One - probably called Compile - will sport a __call__ method which
    will act much like the builtin "compile" of 2.1 with the
    difference that after it has compiled a __future__ statement, it
    "remembers" it and compiles all subsequent code with the
    __future__ options in effect.

    It will do this by examining the co_flags field of any code object
    it returns, which in turn means writing and maintaining a Python
    version of the function PyEval_MergeCompilerFlags found in
    Python/ceval.c.

    Objects of the other class added to codeop - probably called
    CommandCompiler or somesuch - will do the job of the existing
    codeop.compile_command function, but in a __future__-aware way.

    Finally, I propose to modify the class InteractiveInterpreter in
    the standard library module code to use a CommandCompiler to
    emulate still more closely the behaviour of the default Python
    shell.

Backward Compatibility

    Should be very few or none; the changes to compile will make no
    difference to existing code, nor will adding new functions or
    classes to codeop.  Existing code using
    code.InteractiveInterpreter may change in behaviour, but only for
    the better in that the "real" Python shell will be being better
    impersonated.

Forward Compatibility

    codeop will require very mild tweaking as each new __future__
    statement is added.  Such events will hopefully be very rare, so
    such a burden is unlikely to cause significant pain.

Implementation

    None yet; none of the above should be at all hard.  If this draft
    is well received, I'll upload a patch to sf "soon" and point to it
    here.

Copyright

    This document has been placed in the public domain.


-- 
  ARTHUR:  The ravenous bugblatter beast of Traal ... is it safe?
    FORD:  Oh yes, it's perfectly safe ... it's just us who are in 
           trouble.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 6


From paulp@ActiveState.com  Mon Jul 30 19:53:24 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 30 Jul 2001 11:53:24 -0700
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> <200107281357.JAA30859@cj20424-a.reston1.va.home.com> <2m3d7emged.fsf@starship.python.net>
Message-ID: <3B65AD24.84DD88A2@ActiveState.com>

Michael Hudson wrote:
> 
>...
>     I propose adding a fourth, optional, "flags" argument to the
>     builtin "compile" function.  If this argument is omitted, there
>     will be no change in behaviour from that of Python 2.1.
>
>     If it is present it is expected to be an integer, representing
>     various possible compile time options as a bitfield.  

Nit: What is the virtue of using a C-style bitfield? The efficiency
isn't much of an issue. I'd prefer either keyword arguments or a list of
strings.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From Samuele Pedroni <pedroni@inf.ethz.ch>  Mon Jul 30 20:00:12 2001
From: Samuele Pedroni <pedroni@inf.ethz.ch> (Samuele Pedroni)
Date: Mon, 30 Jul 2001 21:00:12 +0200 (MET DST)
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
Message-ID: <200107301900.VAA09388@core.inf.ethz.ch>

Hi.

[Michael Hudson]
>     One - probably called Compile - will sport a __call__ method which
>     will act much like the builtin "compile" of 2.1 with the
>     difference that after it has compiled a __future__ statement, it
>     "remembers" it and compiles all subsequent code with the
>     __future__ options in effect.
> 
>     It will do this by examining the co_flags field of any code object
>     it returns, which in turn means writing and maintaining a Python
>     version of the function PyEval_MergeCompilerFlags found in
>     Python/ceval.c.
FYI, in Jython (internally) we have a series of compile_flags functions
that take an "opaque" object CompilerFlags that is passed to the function
and compilation actually changes the object in order to reflect future
statements encountered during compilation...
Not elegant but avoids code duplication.

Of course we can change that.

Samuele Pedroni.



From mwh@python.net  Mon Jul 30 20:11:16 2001
From: mwh@python.net (Michael Hudson)
Date: 30 Jul 2001 15:11:16 -0400
Subject: [Python-Dev] Simulating shells (was Re: Changing the Division Operator -- PEP 238, rev 1.12)
In-Reply-To: Paul Prescod's message of "Mon, 30 Jul 2001 11:53:24 -0700"
References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> <200107281357.JAA30859@cj20424-a.reston1.va.home.com> <2m3d7emged.fsf@starship.python.net> <3B65AD24.84DD88A2@ActiveState.com>
Message-ID: <2mitgafd0r.fsf_-_@starship.python.net>

Paul Prescod <paulp@ActiveState.com> writes:

> Michael Hudson wrote:
> > 
> >...
> >     I propose adding a fourth, optional, "flags" argument to the
> >     builtin "compile" function.  If this argument is omitted, there
> >     will be no change in behaviour from that of Python 2.1.
> >
> >     If it is present it is expected to be an integer, representing
> >     various possible compile time options as a bitfield.  
> 
> Nit: What is the virtue of using a C-style bitfield? The efficiency
> isn't much of an issue. I'd prefer either keyword arguments or a list of
> strings.

Err, hadn't really occurred to me to do anything else, to be honest!

At one point I was going to use the same bits as are used in the
code.co_flags field, which was probably where the bitfield idea
originated.

By "keyword arguments" do you mean e.g:

   compile(source, file, start_symbol, generators=1, division=0)

?  I think that would be mildly painful for the one use I had in mind
(the additions to codeop), and also mildly painful to implement.

   compile(source, file, start_symbol,{'generators':1, 'division':0})

would be better from my point of view.  I think this is a bit of a
propeller-heads-only feature, to be honest, so I'm not that inclined
to worry about the API.

Cheers,
M.

-- 
3. Syntactic sugar causes cancer of the semicolon.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html


From guido@zope.com  Mon Jul 30 20:11:17 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 15:11:17 -0400
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
In-Reply-To: Your message of "Mon, 30 Jul 2001 21:00:12 +0200."
 <200107301900.VAA09388@core.inf.ethz.ch>
References: <200107301900.VAA09388@core.inf.ethz.ch>
Message-ID: <200107301911.f6UJBHQ03472@odiug.digicool.com>

> [Michael Hudson]
> >     One - probably called Compile - will sport a __call__ method which
> >     will act much like the builtin "compile" of 2.1 with the
> >     difference that after it has compiled a __future__ statement, it
> >     "remembers" it and compiles all subsequent code with the
> >     __future__ options in effect.
> > 
> >     It will do this by examining the co_flags field of any code object
> >     it returns, which in turn means writing and maintaining a Python
> >     version of the function PyEval_MergeCompilerFlags found in
> >     Python/ceval.c.

> FYI, in Jython (internally) we have a series of compile_flags functions
> that take an "opaque" object CompilerFlags that is passed to the function
> and compilation actually changes the object in order to reflect future
> statements encountered during compilation...
> Not elegant but avoids code duplication.
> 
> Of course we can change that.

Does codeop currently work in Jython?  The solution should continue to
work in Jython then.  Does Jython support the same flag bit values as
CPython?  If not, Paul Prescod's suggestion to use keyword arguments
becomes very relevant.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From bckfnn@worldonline.dk  Mon Jul 30 20:16:36 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Mon, 30 Jul 2001 19:16:36 GMT
Subject: [Python-Dev] zipfiles on sys.path
In-Reply-To: <20010725215830.2F49D14A25D@oratrix.oratrix.nl>
References: <20010725215830.2F49D14A25D@oratrix.oratrix.nl>
Message-ID: <3b65932d.2748051@mail.wanadoo.dk>

Thanks for the feedback.

>> - The __path__ vrbl in a package 'foo.bar' loaded from zipfile.zip
>>   will have the value ['zipfile.zip!foo/bar'] and this same syntax can
>>   also be used when adding entries to sys.path and __path__.
>
>__path__ is set to the package name. I'm not sure of the exact
>rationale for this (Just did the package support) but it seems to work
>fine. 

I think the result of the Mac implementation is that the package
hierarchy and the folder structure in the archive must match. Normally
this is the case, but changes to __path__ can cause sub-modules to be loaded
from somewhere else. I'm guessing such changes to __path__ aren't
considered on Mac when importing from an archive.

[Just]

>I don't know the rationale either (or at least: not anymore ;-), I just copied
>the behavior of frozen packages (as in freeze.py) from import.c.
>PyImport_ImportFrozenModule() contains this snippet:

Dynamic changes to __path__ are probably not needed for frozen packages.

They may not even be needed for imports from a zipfile. My first attempt at
adding this feature did not support changes to __path__.
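
For reference, here is a small sketch of how a package imported from an archive on sys.path reports its __path__ in a later CPython with built-in zip import support. This is not the Jython or Mac implementation under discussion, and the 'zipfile.zip!foo/bar' syntax proposed above is not assumed. The archive and package names are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive containing a one-module package.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "demo_archive.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipdemo_pkg/__init__.py", "")
    zf.writestr("zipdemo_pkg/mod.py", "VALUE = 42\n")

# Put the archive itself on sys.path and import from it.
sys.path.insert(0, archive)
importlib.invalidate_caches()
package = importlib.import_module("zipdemo_pkg")
submodule = importlib.import_module("zipdemo_pkg.mod")

# __path__ points inside the archive, so the package hierarchy mirrors
# the folder structure stored in the zip file.
print(package.__path__)
print(submodule.VALUE)
```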

regards,
finn


From guido@zope.com  Mon Jul 30 20:18:55 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 15:18:55 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Your message of "Mon, 30 Jul 2001 01:48:59 EDT."
 <20010730014859.A15971@thyrsus.com>
References: <20010730014859.A15971@thyrsus.com>
Message-ID: <200107301918.f6UJIt003517@odiug.digicool.com>

Obviously, just as the new design is aiming at Perl 6, it would be
aiming at Python 3.  Nothing's impossible these days, so I am keeping
an open mind.  I expect that in addition to the bytecode, the entire
runtime architecture would have to be shared though for this to make
sense, and I'm not sure how easy that would be, even if Perl is
willing to be flexible.  Most of Python's run-time semantics are very
carefully defined and shouldn't be changed in order to fit in the
common runtime.

I'm looking forward to Eric's comparison of the two run-time systems.
(Eric, be sure to use a copy of 2.2a1 or the descr-branch -- *don't*
use the CVS trunk.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paulp@ActiveState.com  Mon Jul 30 20:20:27 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 30 Jul 2001 12:20:27 -0700
Subject: [Python-Dev] Re: Simulating shells (was Re: Changing the Division Operator -- PEP
 238, rev 1.12)
References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> <200107281357.JAA30859@cj20424-a.reston1.va.home.com> <2m3d7emged.fsf@starship.python.net> <3B65AD24.84DD88A2@ActiveState.com> <2mitgafd0r.fsf_-_@starship.python.net>
Message-ID: <3B65B37B.E3E05945@ActiveState.com>

Michael Hudson wrote:
> 
>...
> 
> At one point I was going to use the same bits as are used in the
> code.co_flags field, which was probably where the bitfield idea
> originated.
> 
> By "keyword arguments" do you mean e.g:
> 
>    compile(source, file, start_symbol, generators=1, division=0)
> 
> ?  I think that would be mildly painful for the one use I had in mind
> (the additions to codeop), and also mildly painful to implement.

Sorry, could you elaborate on why this is painful to use and implement?
Considering the availability of **args, the code above looks to me like
syntactic sugar for the code below:

>    compile(source, file, start_symbol,{'generators':1, 'division':0})
> 
> would be better from my point of view.  I think this is a bit of a
> propeller-heads-only feature, to be honest, so I'm not that inclined
> to worry about the API.

I would just like to see an end to the convention of using bitfields in
Python everywhere. You're just my latest target. Python is not a really
great bit-manipulation language!
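
The equivalence between the keyword form and the dictionary form can be sketched as follows. `compile_with_features` is a hypothetical wrapper, not a proposed API. It assumes a later Python in which `compile()` accepts a flags bitfield and each `__future__` feature exposes a `compiler_flag`, and it uses the `annotations` feature only because its effect is easy to observe.

```python
import __future__


def compile_with_features(source, filename, symbol, **features):
    """Translate feature names given as keywords into a flags bitfield."""
    flags = 0
    for name, enabled in features.items():
        if enabled:
            # Unknown feature names raise AttributeError here.
            flags |= getattr(__future__, name).compiler_flag
    return compile(source, filename, symbol, flags, dont_inherit=True)


source = "def f(x: int): pass"
options = {"annotations": 1}

# Keywords spelled out at the call site ...
by_keyword = compile_with_features(source, "<demo>", "exec", annotations=1)
# ... and the same set of options passed as one dictionary via **.
by_dict = compile_with_features(source, "<demo>", "exec", **options)

print(by_keyword.co_flags == by_dict.co_flags)
```

Callers never touch the bits in this arrangement; the bitfield survives only as an implementation detail inside the wrapper.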
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From jeremy@zope.com  Mon Jul 30 20:23:27 2001
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 30 Jul 2001 15:23:27 -0400 (EDT)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730014859.A15971@thyrsus.com>
References: <20010730014859.A15971@thyrsus.com>
Message-ID: <15205.46127.377897.520922@slothrop.digicool.com>

>>>>> "ESR" == Eric S Raymond <esr@thyrsus.com> writes:

  ESR> Following my conversation with Guido, I've put doing an
  ESR> architectural comparison of the existing Python and Perl
  ESR> bytecodes at the top of my priority list.  I'm flying to Taipei
  ESR> tomorrow and will have a lot of hours on airplanes with my
  ESR> laptop to do this.

Eric,

This is a good project.  It's really difficult to evaluate the Parrot
proposal otherwise.  I know quite a bit about Python's VM and runtime,
but next to nothing about Perl's.

If you're feeling particularly energetic, you might look at some other
VM's -- Ocaml, Java, and Ruby come to mind.  It is probably a much
harder fit for the first two, because they are statically typed.  But
I'd be quite interested to see a survey of language VM techniques.

Jeremy



From Samuele Pedroni <pedroni@inf.ethz.ch>  Mon Jul 30 20:27:55 2001
From: Samuele Pedroni <pedroni@inf.ethz.ch> (Samuele Pedroni)
Date: Mon, 30 Jul 2001 21:27:55 +0200 (MET DST)
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
Message-ID: <200107301928.VAA10577@core.inf.ethz.ch>

[GvR]
> 
> > [Michael Hudson]
> > >     One - probably called Compile - will sport a __call__ method which
> > >     will act much like the builtin "compile" of 2.1 with the
> > >     difference that after it has compiled a __future__ statement, it
> > >     "remembers" it and compiles all subsequent code with the
> > >     __future__ options in effect.
> > > 
> > >     It will do this by examining the co_flags field of any code object
> > >     it returns, which in turn means writing and maintaining a Python
> > >     version of the function PyEval_MergeCompilerFlags found in
> > >     Python/ceval.c.
> 
> > FYI, in Jython (internally) we have a series of compile_flags functions
> > that take an "opaque" object CompilerFlags that is passed to the function
> > and compilation actually changes the object in order to reflect future
> > statements encountered during compilation...
> > Not elegant but avoids code duplication.
> > 
> > Of course we can change that.
> 
> Does codeop currently work in Jython?  The solution should continue to
> work in Jython then. 
We have our interface compatible version of codeop that works.

> Does Jython support the same flag bit values as
> CPython?  If not, Paul Prescod's suggestion to use keyword arguments
> becomes very relevant.
we support a subset of the co_flags, CO_NESTED e.g. is there with the same
value.

But the embedding API is very different: my implementation of nested
scopes does not define any Py_CF... flags; we have an internal CompilerFlags
object, but it is more similar to PyFutureFeatures ...

Samuele.



From esr@thyrsus.com  Mon Jul 30 08:35:17 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 30 Jul 2001 03:35:17 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107301918.f6UJIt003517@odiug.digicool.com>; from guido@zope.com on Mon, Jul 30, 2001 at 03:18:55PM -0400
References: <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com>
Message-ID: <20010730033517.A17356@thyrsus.com>

Guido van Rossum <guido@zope.com>:
> I'm looking forward to Eric's comparison of the two run-time systems.
> (Eric, be sure to use a copy of 2.2a1 or the descr-branch -- *don't*
> use the CVS trunk.)

What would the CVS magic invocation for that be?

And...um...why?  Has the bytecode changed significantly recently?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The spirit of resistance to government is so valuable on certain occasions, 
that I wish it always to be kept alive.  It will often be exercised when 
wrong, but better so than not to be exercised at all. I like a little 
rebellion now and then.	-- Thomas Jefferson, letter to Abigail Adams, 1787


From mwh@python.net  Mon Jul 30 20:35:11 2001
From: mwh@python.net (Michael Hudson)
Date: 30 Jul 2001 15:35:11 -0400
Subject: [Python-Dev] Re: Simulating shells (was Re: Changing the Division Operator -- PEP  238, rev 1.12)
In-Reply-To: Paul Prescod's message of "Mon, 30 Jul 2001 12:20:27 -0700"
References: <200107271949.PAA27171@cj20424-a.reston1.va.home.com> <2mn15pwg56.fsf@starship.python.net> <200107281357.JAA30859@cj20424-a.reston1.va.home.com> <2m3d7emged.fsf@starship.python.net> <3B65AD24.84DD88A2@ActiveState.com> <2mitgafd0r.fsf_-_@starship.python.net> <3B65B37B.E3E05945@ActiveState.com>
Message-ID: <2mpuaijjm8.fsf@starship.python.net>

Paul Prescod <paulp@ActiveState.com> writes:

> Michael Hudson wrote:
> > 
> >...
> > 
> > At one point I was going to use the same bits as are used in the
> > code.co_flags field, which was probably where the bitfield idea
> > originated.
> > 
> > By "keyword arguments" do you mean e.g:
> > 
> >    compile(source, file, start_symbol, generators=1, division=0)
> > 
> > ?  I think that would be mildly painful for the one use I had in mind
> > (the additions to codeop), and also mildly painful to implement.
> 
> Sorry, could you elaborate on why this is painful to use and implement?

Well, I don't know in detail how keyword arguments work from the C
side.  Your suggestion turns a roughly 4 line change I knew exactly
how to do into a 20-30 line change I'd have to work on.  I only said
"mildly painful".  The awkwardness of use would just mean using **,
yes.

> Considering the availability of **args, the code above looks to me like
> syntactic sugar for the code below:
> 
> >    compile(source, file, start_symbol, {'generators':1, 'division':0})

Well yes, but I think the latter is closer to what one means, which is
to say passing a (i.e. one) set of options.

> > would be better from my point of view.  I think this is a bit of a
> > propeller-heads-only feature, to be honest, so I'm not that inclined
> > to worry about the API.
> 
> I would just like to see an end to the convention of using bitfields in
> Python everywhere. You're just my latest target.

Fair enough.  I've probably been corrupted by C on this one.

> Python is not a really great bit-manipulation language!

<aside>Augmented assignment helps a *lot* here!</aside>

At any rate, the fact that I'd temporarily forgotten about the
existence of Jython is the more serious blunder...

Cheers,
M.

-- 
    . <- the point                                your article -> .
    |------------------------- a long way ------------------------|
                                        -- Christophe Rhodes, ucam.chat


From mwh@python.net  Mon Jul 30 20:46:10 2001
From: mwh@python.net (Michael Hudson)
Date: 30 Jul 2001 15:46:10 -0400
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
In-Reply-To: Samuele Pedroni's message of "Mon, 30 Jul 2001 21:27:55 +0200 (MET DST)"
References: <200107301928.VAA10577@core.inf.ethz.ch>
Message-ID: <2mn15mjj3x.fsf@starship.python.net>

Samuele Pedroni <pedroni@inf.ethz.ch> writes:

> [GvR]
> > 
> > > [Michael Hudson]
> > > >     One - probably called Compile - will sport a __call__ method which
> > > >     will act much like the builtin "compile" of 2.1 with the
> > > >     difference that after it has compiled a __future__ statement, it
> > > >     "remembers" it and compiles all subsequent code with the
> > > >     __future__ options in effect.
> > > > 
> > > >     It will do this by examining the co_flags field of any code object
> > > >     it returns, which in turn means writing and maintaining a Python
> > > >     version of the function PyEval_MergeCompilerFlags found in
> > > >     Python/ceval.c.
> > 
> > > FYI, in Jython (internally) we have a series of compile_flags functions
> > > that take an "opaque" object CompilerFlags that is passed to the function
> > > and compilation actually changes the object in order to reflect future
> > > statements encountered during compilation...
> > > Not elegant but avoids code duplication.
> > > 
> > > Of course we can change that.
> > 
> > Does codeop currently work in Jython?  The solution should continue to
> > work in Jython then. 
> We have our interface compatible version of codeop that works.

Would implementing the new interfaces I sketched out for codeop.py be
possible in Jython?  That's the bit I care about, not so much the
interface to __builtin__.compile.

> > Does Jython support the same flag bit values as
> > CPython?  If not, Paul Prescod's suggestion to use keyword arguments
> > becomes very relevant.
> we support a subset of the co_flags, CO_NESTED e.g. is there with the same
> value.
> 
> But the embedding API is very different, my implementation of nested
> scopes does not define any Py_CF... flags, we have an internal CompilerFlags
> object but is more similar to PyFutureFeatures ...

Is this object exposed to Python code at all?  One approach would be
PyObject-izing PyFutureFlags and making *that* the fourth argument to
compile...

class Compiler:
    def __init__(self):
        self.ff = ff.new() # or whatever
    def __call__(self, source, filename, start_symbol):
        code = compile(source, filename, start_symbol, self.ff)
        self.ff.merge(code.co_flags)
        return code

Cheers,
M.

-- 
  Like most people, I don't always agree with the BDFL (especially
  when he wants to change things I've just written about in very 
  large books), ... 
         -- Mark Lutz, http://python.oreilly.com/news/python_0501.html


From tim@digicool.com  Mon Jul 30 20:47:57 2001
From: tim@digicool.com (Tim Peters)
Date: Mon, 30 Jul 2001 15:47:57 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730014859.A15971@thyrsus.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEPKCDAA.tim@digicool.com>

[Eric S. Raymond]
> ...
> (and, incidentally, argued that the bytecode should emulate a stack
> rather than a register machine because the cost/speed disparities that
> justify register architectures in hardware don't exist in a software
> VM).

Don't get too married to that!  My bet is that if anyone had time for it,
we'd switch the Python VM today to a register model; Skip Montanaro's
Rattlesnake project was aiming at that, but fizzled out due to lack of time.

The per-opcode fetch-decode-dispatch overhead is very high in SW too, so a
register VM can win simply by cutting the number of opcodes needed to
accomplish a given bit of useful work.  Indeed, eliding SET_LINENO opcodes
is the primary reason Python -O runs faster, yet all it saves is one trip
around the eval loop per source-code line (the *body* of SET_LINENO is just
a test, branch, and store -- it's trivial compared to the overhead of
getting to it).
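
To get a feel for how many trips around the eval loop a line of source
costs, the standard dis module can list the opcodes.  A small sketch (add
and opcode_names are illustrative names; the exact opcodes differ between
CPython versions, so no output is shown):

```python
import dis


def add(a, b):
    c = a + b
    return c


def opcode_names(func):
    """Return the names of the bytecode instructions compiled for func."""
    return [instr.opname for instr in dis.get_instructions(func)]


names = opcode_names(add)
print(len(names), "opcodes:", " ".join(names))
```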

Variants of forth-like threading are alternatives to both.



From Samuele Pedroni <pedroni@inf.ethz.ch>  Mon Jul 30 20:59:35 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Mon, 30 Jul 2001 21:59:35 +0200 (MET DST)
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
Message-ID: <200107301959.VAA11733@core.inf.ethz.ch>


...
> > > 
> > > Does codeop currently work in Jython?  The solution should continue to
> > > work in Jython then. 
> > We have our interface compatible version of codeop that works.
> 
> Would implementing the new interfaces I sketched out for codeop.py be
> possible in Jython?  That's the bit I care about, not so much the
> interface to __builtin__.compile.
Yes, it's possible.


> > > Does Jython support the same flag bit values as
> > > CPython?  If not, Paul Prescod's suggestion to use keyword arguments
> > > becomes very relevant.
> > we support a subset of the co_flags, CO_NESTED e.g. is there with the same
> > value.
> > 
> > But the embedding API is very different, my implementation of nested
> > scopes does not define any Py_CF... flags, we have an internal CompilerFlags
> > object but is more similar to PyFutureFeatures ...
> 
> Is this object exposed to Python code at all?
Not publicly, but in Jython the dividing line is a bit different,
because public Java classes are always accessible from Jython,
even most of the internals. That does not mean that every use of them
is welcome or supported.

>  One approach would be
> PyObject-izing PyFutureFlags and making *that* the fourth argument to
> compile...
> 
> class Compiler:
>     def __init__(self):
>         self.ff = ff.new() # or whatever
>     def __call__(self, source, filename, start_symbol):
>         code = compile(source, filename, start_symbol, self.ff)
>         self.ff.merge(code.co_flags)
>         return code
I see; "internally" we already have a compiler_flags function
that does the same as:
>         code = compile(source, filename, start_symbol, self.ff)
>         self.ff.merge(code.co_flags)

where self.ff is a CompilerFlags object.

I can re-arrange things for any interface, I was only trying to explain
our approach and situation and a possible way to avoid duplicating some
internal code in Python.

Samuele.



From esr@thyrsus.com  Mon Jul 30 09:08:17 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 30 Jul 2001 04:08:17 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHCEPKCDAA.tim@digicool.com>; from tim@digicool.com on Mon, Jul 30, 2001 at 03:47:57PM -0400
References: <20010730014859.A15971@thyrsus.com> <BIEJKCLHCIOIHAGOKOLHCEPKCDAA.tim@digicool.com>
Message-ID: <20010730040817.A18034@thyrsus.com>

Tim Peters <tim@digicool.com>:
> The per-opcode fetch-decode-dispatch overhead is very high in SW too, so a
> register VM can win simply by cutting the number of opcodes needed to
> accomplish a given bit of useful work.

That's an interesting idea.  OK, so possibly I was wrong -- I hadn't considered
that stack-push/stack-pop operations might introduce overhead comparable
to the order-of-magnitude speed difference between registers and main memory
in hardware.  I'm still skeptical, but my mind is open.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

You know why there's a Second Amendment?  In case the government fails to
follow the first one.
         -- Rush Limbaugh, in a moment of unaccustomed profundity 17 Aug 1993


From guido@zope.com  Mon Jul 30 21:15:09 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 16:15:09 -0400
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
In-Reply-To: Your message of "Mon, 30 Jul 2001 21:27:55 +0200."
 <200107301928.VAA10577@core.inf.ethz.ch>
References: <200107301928.VAA10577@core.inf.ethz.ch>
Message-ID: <200107302015.f6UKF9j03661@odiug.digicool.com>

> > Does codeop currently work in Jython?  The solution should continue to
> > work in Jython then. 
> We have our interface compatible version of codeop that works.

Ah, good.

> > Does Jython support the same flag bit values as
> > CPython?  If not, Paul Prescod's suggestion to use keyword arguments
> > becomes very relevant.
> we support a subset of the co_flags, CO_NESTED e.g. is there with the same
> value.

Cool.

> But the embedding API is very different, my implementation of nested
> scopes does not define any Py_CF... flags, we have an internal CompilerFlags
> object but is more similar to PyFutureFeatures ...

That's fine.  We may end up rearchitecting that (rather baroque IMO)
part of the CPython compiler anyway -- if we can get away with
changing the C API.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Mon Jul 30 21:16:49 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 16:16:49 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Your message of "Mon, 30 Jul 2001 03:35:17 EDT."
 <20010730033517.A17356@thyrsus.com>
References: <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
Message-ID: <200107302016.f6UKGoG03676@odiug.digicool.com>

> What would the CVS magic invocation for that be?

cvs update -r descr-branch

or

cvs checkout -r descr-branch python/dist/src

Or just download 2.2a1.

> And...um...why?  Has the bytecode changed significantly recently?

Not the bytecode, but the rest of the runtime has changed
tremendously, and as I tried to explain over the phone, that has a big
impact on reusability of the runtime.  The bytecode engine cannot be
considered independent from the rest of the runtime.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin@mems-exchange.org  Mon Jul 30 21:29:01 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 30 Jul 2001 16:29:01 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107302016.f6UKGoG03676@odiug.digicool.com>; from guido@zope.com on Mon, Jul 30, 2001 at 04:16:49PM -0400
References: <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com> <20010730033517.A17356@thyrsus.com> <200107302016.f6UKGoG03676@odiug.digicool.com>
Message-ID: <20010730162901.F9578@ute.cnri.reston.va.us>

On Mon, Jul 30, 2001 at 04:16:49PM -0400, Guido van Rossum wrote:
>impact on reusability of the runtime.  The bytecode engine cannot be
>considered independent from the rest of the runtime.

If you must have a portable bytecode format, why not use the JVM?
Perhaps it's not optimal, but it works reasonably well, has a few
reasonably complete free implementations that are mostly strangling
due to lack of manpower, has some support in GCC 3.0, and is actually
deployed in browsers and on people's systems *right now*.  I fail to
see why we should run after some mythical Perl/Python bytecode that
would have to be 1) designed 2) implemented 3) debugged 4) actually
made available to users 5) actually downloaded by users.  (Much the
same objections apply to .NET for Unix.)

There's also the cultural difference between Python's "write it
clearly and then optimize it" and Perl's "let's write clever optimized
code right from the start".  Perhaps this can be bridged, perhaps not.

--amk


From guido@zope.com  Mon Jul 30 21:42:05 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 16:42:05 -0400
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: Your message of "Mon, 30 Jul 2001 11:06:52 EDT."
 <0107301106520A.02216@fermi.eeel.nist.gov>
References: <0107301106520A.02216@fermi.eeel.nist.gov>
Message-ID: <200107302042.f6UKg5H03826@odiug.digicool.com>

Michael's PEP touches upon the one difficult area of decimal
semantics: what to do when a decimal and a binary float meet?

We discussed this briefly over lunch here and Tim pointed out that the
default should probably be an error: code expecting to work with
exact decimals should not be allowed to continue after contamination with
an inexact binary float.

But in other contexts it would make more sense to turn mixed operands
into inexact, like what currently happens when int/long meets float.

In the IBM model that Aahz is implementing, decimal numbers are not
necessarily exact, but (if I understand correctly) you can set a
context flag that causes an exception to be raised when the result of
an operation on two exact inputs is inexact.  This can happen when
e.g. a multiplication result exceeds the number of significant digits
specified in the context -- then truncation is applied like for binary
floats.
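
Python's present-day decimal module follows the same IBM specification, so
the flag can be sketched with it (the module postdates this message, and
exact_product is an illustrative name):

```python
from decimal import Context, Decimal, Inexact


def exact_product(a, b, digits):
    """Multiply a and b in a context that refuses to round."""
    ctx = Context(prec=digits, traps=[Inexact])
    return ctx.multiply(Decimal(a), Decimal(b))


print(exact_product("1.5", "2.5", 3))      # 3.75 fits in three digits
try:
    exact_product("1.23", "4.56", 3)       # 5.6088 needs five digits
except Inexact:
    print("result would have been rounded")
```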

Could the numeric tower look like this?

int < long < decimal < rational < float < complex
*******************************   ***************
          exact                       inexact

A numeric context could contain a flag that decides what happens when
exact and inexact are mixed.
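
Python's present-day decimal module has a context flag of roughly this
kind, the FloatOperation trap.  A sketch, assuming that module (which
postdates this message); compare_with_float is an illustrative name:

```python
from decimal import Decimal, FloatOperation, localcontext


def compare_with_float(strict):
    """Compare an exact decimal with a binary float under the chosen policy."""
    with localcontext() as ctx:
        ctx.traps[FloatOperation] = strict
        try:
            return Decimal("1.1") < 1.25
        except FloatOperation:
            return "error"


print(compare_with_float(strict=False))   # mixing is tolerated
print(compare_with_float(strict=True))    # mixing is an error
```

In that module the trap governs constructors and ordering comparisons;
arithmetic such as Decimal + float raises TypeError regardless of the flag.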

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jack@oratrix.nl  Mon Jul 30 22:27:36 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Mon, 30 Jul 2001 23:27:36 +0200
Subject: [Python-Dev] Mac toolbox modules for MacOSX unix-Python
Message-ID: <20010730212741.D2F37162A2A@oratrix.oratrix.nl>

I now have a whole stack of modules that interface to MacOS toolboxes
that compile for unix-Python on MacOSX, but I'm a bit unsure about how
I should add these to the standard build.

So far what I've checked in (in configure) is only a bit of glue that
allows the toolbox modules to be loaded, but not yet the changes to
setup.py that will actually compile and link the modules.

I can do this in two ways:
1) Keep everything as-is and just check in the mods to setup.py.
2) Make the MacOS toolbox modules dependent on a configure switch. The
toolbox glue would then also become dependent on this switch.

The first option seems to be the standard nowadays: setup.py simply
builds everything it can find and for which the prerequisite
headers/libs are found.

The second option seems a bit more friendly to Pythoneers who view
MacOSX as simply unix-with-a-pretty-face and use Python only for
command-line scripts and cgi and such. Also, the toolbox modules will
be less stable than average modules for some time to come: as they're
shared between unix-Python and MacPython and generated on the latter
the repository version might not build for a few days while I get my
act together. On the other hand: a failing compile of an extension
module shouldn't bother them overmuch, and one can always comment out
the setup.py lines.

A problem with the second option is that I have absolutely no idea how
to test for configure flags in setup.py.
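
One possibility, sketched with today's sysconfig module: on Unix builds
the arguments given to configure are recorded in the CONFIG_ARGS build
variable.  The switch name below is hypothetical, as is the
configured_with helper.

```python
import shlex
import sysconfig


def configured_with(flag, config_args=None):
    """Report whether `flag` was passed to ./configure for this Python."""
    if config_args is None:
        # Unset on platforms that are not built via configure.
        config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    words = shlex.split(config_args)
    return any(w == flag or w.startswith(flag + "=") for w in words)


# Hypothetical switch name, used only for illustration.
if configured_with("--enable-toolbox-glue"):
    print("would add the toolbox extensions to the build")
```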

To complicate matters more I'm thinking of turning Python into a
framework, which would give OSX-Python a lot of the niceties that
MacPython users are used to (applets and building standalone applications
without a C compiler, to name two). In that case many users will
probably choose either to go the whole way (install Python as a
framework and include the toolbox modules) or forget about the macos
stuff altogether.

What do people think about this?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | ++++ see http://www.xs4all.nl/~tank/ ++++


From skip@pobox.com (Skip Montanaro)  Mon Jul 30 22:38:53 2001
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 30 Jul 2001 16:38:53 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730040817.A18034@thyrsus.com>
References: <20010730014859.A15971@thyrsus.com>
 <BIEJKCLHCIOIHAGOKOLHCEPKCDAA.tim@digicool.com>
 <20010730040817.A18034@thyrsus.com>
Message-ID: <15205.54253.976241.842131@beluga.mojam.com>

>>>>> "Eric" == Eric S Raymond <esr@thyrsus.com> writes:

    Eric> Tim Peters <tim@digicool.com>:

    >> The per-opcode fetch-decode-dispatch overhead is very high in SW too,
    >> so a register VM can win simply by cutting the number of opcodes
    >> needed to accomplish a given bit of useful work.

    Eric> That's an interesting idea.  OK, so possibly I was wrong -- I
    Eric> hadn't considered that stack-push/stack-pop operations might
    Eric> introduce overhead comparable to the order-of-magnitude speed
    Eric> difference between registers and main memory in hardware.  I'm
    Eric> still skeptical, but my mind is open.

Order of magnitude increases?  Maybe, maybe not.  Still, something like

    ADD a1,a2,a3

is going to be faster than

    PUSH a1
    PUSH a2
    ADD
    POP a3

My original aim in considering a register-based VM was that it is easier to
track data flow and thus optimize out or rearrange operations to reduce the
operation count.
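
The difference in dispatch count can be made concrete with two toy
interpreters.  This is only a sketch: run_stack and run_register are
invented for illustration and have nothing to do with Rattlesnake's actual
design.

```python
def run_stack(program, env):
    """Tiny stack machine: returns how many instructions were dispatched."""
    stack = []
    dispatched = 0
    for op, *args in program:
        dispatched += 1
        if op == "PUSH":
            stack.append(env[args[0]])
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "POP":
            env[args[0]] = stack.pop()
    return dispatched


def run_register(program, env):
    """Tiny register machine: operands are named in the instruction."""
    dispatched = 0
    for op, *args in program:
        dispatched += 1
        if op == "ADD":
            src1, src2, dest = args
            env[dest] = env[src1] + env[src2]
    return dispatched


stack_env = {"a1": 2, "a2": 3, "a3": None}
stack_steps = run_stack(
    [("PUSH", "a1"), ("PUSH", "a2"), ("ADD",), ("POP", "a3")], stack_env)

reg_env = {"a1": 2, "a2": 3, "a3": None}
reg_steps = run_register([("ADD", "a1", "a2", "a3")], reg_env)

print(stack_steps, reg_steps, stack_env["a3"], reg_env["a3"])
```

Both versions compute the same sum; the stack version goes around its loop
four times, the register version once.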

Translating Python's stack-oriented VM into a register-oriented one was
fairly straightforward (at least it was back when I was fiddling with it -
pre-1.5).  The main stumbling block was that pesky "from module import *"
statement.  It could push an unknown quantity of stuff onto the stack, thus
killing my attempts to track the location of objects on the stack at compile
time.

Skip



From jeremy@zope.com  Mon Jul 30 22:44:52 2001
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 30 Jul 2001 17:44:52 -0400 (EDT)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730162901.F9578@ute.cnri.reston.va.us>
References: <20010730014859.A15971@thyrsus.com>
 <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
 <200107302016.f6UKGoG03676@odiug.digicool.com>
 <20010730162901.F9578@ute.cnri.reston.va.us>
Message-ID: <15205.54612.424694.5559@slothrop.digicool.com>

>>>>> "AMK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:

  AMK> On Mon, Jul 30, 2001 at 04:16:49PM -0400, Guido van Rossum
  AMK> wrote:
  >> impact on reusability of the runtime.  The bytecode engine cannot
  >> be considered independent from the rest of the runtime.

  AMK> If you must have a portable bytecode format, why not use the
  AMK> JVM?  Perhaps it's not optimal, but it works reasonably well,
  AMK> has a few reasonably complete free implementations that are
  AMK> mostly strangling due to lack of manpower, has some support in
  AMK> GCC 3.0, and is actually deployed in browsers and on people's
  AMK> systems *right now*.

I'm not sure I understand the suggestion.  The JVM defines an
instruction set, but it also defines an entire runtime, right?  You've
got to live with the JVM's implementation of threads, garbage
collection, etc.  For the case of Python, that sounds a lot like
abandoning CPython and using JPython instead.

Or would you suggest using the instruction set but nothing else from
the JVM?  I'm not sure that there would be much advantage there.  If
we had a JVM implementation designed to support Python, there would be
no need to implement most of the opcodes.  We'd only need getstatic
and invokevirtual <0.2 wink>.  The typed opcodes (int, float, etc.)
would never be used.

The problem seems to be that the VM ties up a bunch of other issues
with the bytecode.  Python's VM is intimately tied up with:

    - reference counting: each opcode knows when to INCREF and when to
      DECREF

    - threads: the global interpreter lock is managed outside the
      bytecode by the "Do periodic things" code.

    - object model: BINARY_ADD knows how to special case ints and what
      method to call to dispatch on all other objects

Unless the bytecode is very low level, you buy a lot more than some
instructions when you buy an instruction set.
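
To illustrate the object-model point, here is a deliberately simplified
Python rendering of what an add opcode has to know.  binary_add is an
illustrative name, and the real rules have more cases (for instance,
priority for subclasses of the left operand's type).

```python
def binary_add(left, right):
    """Toy version of what an ADD opcode has to know about the object model."""
    # Special case: two plain ints are handled up front.
    if type(left) is int and type(right) is int:
        return int.__add__(left, right)

    # General case: ask the left operand's type, then the right operand's.
    result = NotImplemented
    add = getattr(type(left), "__add__", None)
    if add is not None:
        result = add(left, right)
    if result is NotImplemented:
        radd = getattr(type(right), "__radd__", None)
        if radd is not None:
            result = radd(right, left)
    if result is NotImplemented:
        raise TypeError("unsupported operand types for +")
    return result


print(binary_add(2, 3), binary_add(1, 2.5), binary_add("ab", "cd"))
```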

Jeremy



From greg@cosc.canterbury.ac.nz  Mon Jul 30 23:05:22 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2001 10:05:22 +1200 (NZST)
Subject: [Python-Dev] Iterator addition?
In-Reply-To: <200107301242.IAA09350@cj20424-a.reston1.va.home.com>
Message-ID: <200107302205.KAA00567@s454.cosc.canterbury.ac.nz>

Guido:

> The *only* thing that iterators and sequences have in common is that
> they can be iterated over.  So they are substitutable in all context
> where that's all you do -- including sequence (not tuple!) unpacking.
> And not in any other contexts.

I agree. The more special cases we add to try to make
iterators look like sequences, the harder it's going to
be to remember what you can and can't do with an iterator.
Let's keep it as simple as possible.
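
A short illustration of where the two are and are not interchangeable
(first_three is an illustrative name):

```python
def first_three(iterable):
    """Unpacking only iterates, so any iterable of length three will do."""
    a, b, c = iterable
    return a, b, c


print(first_three([1, 2, 3]))          # a sequence
print(first_three(iter([1, 2, 3])))    # a bare iterator

try:
    iter([1, 2, 3])[0]                 # indexing is a sequence operation
except TypeError:
    print("iterators cannot be indexed")
```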

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From esr@thyrsus.com  Mon Jul 30 10:18:31 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 30 Jul 2001 05:18:31 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107302016.f6UKGoG03676@odiug.digicool.com>; from guido@zope.com on Mon, Jul 30, 2001 at 04:16:49PM -0400
References: <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com> <20010730033517.A17356@thyrsus.com> <200107302016.f6UKGoG03676@odiug.digicool.com>
Message-ID: <20010730051831.B1122@thyrsus.com>

Guido van Rossum <guido@zope.com>:
> Or just download 2.2a1.

It's cool.  My local installation is from 2.2.a0.  I'll update.
 
> > And...um...why?  Has the bytecode changed significantly recently?
> 
> Not the bytecode, but the rest of the runtime has changed
> tremendously, and as I tried to explain over the phone, that has a big
> impact on reusability of the runtime.  The bytecode engine cannot be
> considered independent from the rest of the runtime.

OK, let's try to factor this design problem.  

Let's suppose, for the sake of the design discussion, that we can make
the type ontologies of the Perl and Python bytecode match up.  

(Note: making the type ontologies of the two bytecodes match is not
the same problem as making the type ontologies of the *languages*
match up.  It should be rather simpler because a lot of the
differences between, e.g., class semantics can probably be compiled
away.  Not a trivial problem, but humor me.)

Let's further suppose that we have a callout mechanism from the Parrot 
interpreter core to the Perl or Python runtime's C level that can pass out 
Python/Perl types and return them.

Given these two premises, what other problems are there?

I can see one: garbage collection.  What others are there?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

An armed society is a polite society.  Manners are good when one 
may have to back up his acts with his life.
        -- Robert A. Heinlein, "Beyond This Horizon", 1942


From martin@loewis.home.cs.tu-berlin.de  Mon Jul 30 23:22:05 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 31 Jul 2001 00:22:05 +0200
Subject: [Python-Dev] Python API version & optional features
Message-ID: <200107302222.f6UMM5105688@mira.informatik.hu-berlin.de>

>> I guess one could argue that extension writers should check
>> for narrow/wide builds in their extensions before using Unicode.
>> 
>> Since the number of Unicode extension writers is much smaller 
>> than the number of users, I think that this approach would be 
>> reasonable, provided that we document the problem clearly in the 
>> NEWS file.

> OK.  I approve.

I'm not sure I can follow. What did you approve? That extension
writers should check whether their Unicode build matches the one they
get at run-time? How are they going to do that?
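
One possible run-time check from the Python side, assuming the interpreter
reports its largest code point as sys.maxunicode (an assumption relative to
this thread; build_is_wide is an illustrative name):

```python
import sys


def build_is_wide():
    """True when the interpreter can store code points above U+FFFF directly."""
    return sys.maxunicode > 0xFFFF


print("wide" if build_is_wide() else "narrow", hex(sys.maxunicode))
```

On Python 3.3 and later the distinction no longer exists and the value is
always 0x10FFFF.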

Regards,
Martin


From gmcm@hypernet.com  Mon Jul 30 23:40:21 2001
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 30 Jul 2001 18:40:21 -0400
Subject: [Python-Dev] zipfiles on sys.path
In-Reply-To: <3b65932d.2748051@mail.wanadoo.dk>
References: <20010725215830.2F49D14A25D@oratrix.oratrix.nl>
Message-ID: <3B65AA15.27947.E9B214D@localhost>

Finn Bock wrote: 

[mac puts package name in __path__ when importing from 
elsewhere]

> Dynamic changes to __path__ is probably not needed for frozen
> packages.
> 
> It may not even be needed for imports from zipfile. My first
> attempt of adding this feature did not support changes to
> __path__.

I know of at least one package that requires an extensible 
__path__, even when frozen. It's a Mark Hammond Special, so 
you needn't worry about that one, but it's my observation that 
package authors are enamored of import hacks, so be wary.

- Gordon


From gmcm@hypernet.com  Mon Jul 30 23:40:21 2001
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 30 Jul 2001 18:40:21 -0400
Subject: [Python-Dev] zipfiles on sys.path
In-Reply-To: <3b5f2b11.50733180@mail.wanadoo.dk>
Message-ID: <3B65AA15.19868.E9B2049@localhost>

Finn Bock wrote:
> 
> We have recently added support for .zip files on sys.path to
> Jython. Now, after the fact, I wondered what prior art exists for
> such a feature and the semantic that is used. We came up with a
> solution where:

Prior art should include imputil.py (especially since it's at 
least partly blessed).

With imputil, an importer object is on sys.path. The default 
implementation will give you a __path__ consisting of the 
package name (I think), but you're free to override that in an 
importer subclass.

I believe Thomas Heller uses zipfiles with imputil in py2exe. I 
use archives (3 different formats), but not zipfiles.
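
For comparison, a sketch of the same feature as it works in current
CPython, where an archive path can be placed on sys.path directly.  This
uses the built-in zip import support, which postdates this thread, not
imputil; import_from_zip, bundle.zip and zipped_demo are illustrative
names.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write `source` into a fresh zip archive and import it from there."""
    directory = tempfile.mkdtemp()
    archive = os.path.join(directory, "bundle.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(module_name + ".py", source)
    sys.path.insert(0, archive)        # the archive itself goes on sys.path
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(archive)


demo = import_from_zip("zipped_demo", "GREETING = 'hello from a zip'\n")
print(demo.GREETING, demo.__file__)
```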
 



- Gordon


From greg@cosc.canterbury.ac.nz  Mon Jul 30 23:51:35 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2001 10:51:35 +1200 (NZST)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15205.54253.976241.842131@beluga.mojam.com>
Message-ID: <200107302251.KAA00585@s454.cosc.canterbury.ac.nz>

Skip Montanaro <skip@pobox.com>:

> The main stumbling block was that pesky "from module import *"
> statement.  It could push an unknown quantity of stuff onto the
> stack

Are you *sure* about that? I'm pretty certain it can't
be true, since the compiler has to know at all times
how much is on the stack, so it can decide how much
stack space is needed.
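
The figure the compiler works out is visible on the code object.  A
minimal check (sample is an illustrative function; the number itself varies
between CPython versions):

```python
def sample(a, b):
    return [a, b, a + b]


# Maximum evaluation-stack depth the compiler computed for sample().
print(sample.__code__.co_stacksize)
```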

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From jeremy@zope.com  Mon Jul 30 23:55:37 2001
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 30 Jul 2001 18:55:37 -0400 (EDT)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730051831.B1122@thyrsus.com>
References: <20010730014859.A15971@thyrsus.com>
 <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
 <200107302016.f6UKGoG03676@odiug.digicool.com>
 <20010730051831.B1122@thyrsus.com>
Message-ID: <15205.58857.375440.347263@slothrop.digicool.com>

>>>>> "ESR" == Eric S Raymond <esr@thyrsus.com> writes:

  ESR> Let's suppose, for the sake of the design discussion, that we
  ESR> can make the type ontologies of the Perl and Python bytecode
  ESR> match up.

What is a type ontology?  The definition of ontology I'm familiar with
is too broad to be useful in understanding what you're getting at.
I've never heard the technical term "type ontology".

  ESR> (Note: making the type ontologies of the two bytecodes match is
  ESR> not the same problem as making the type ontologies of the
  ESR> *languages* match up.  It should be rather simpler because a
  ESR> lot of the differences between, e.g., class semantics can
  ESR> probably be compiled away.  Not a trivial problem, but humor
  ESR> me.)

If I guess at what you mean-- a fuzzy notion that the underlying type
system can support both languages-- then I submit that most of the
hard problems are indeed here.

  ESR> Let's further suppose that we have a callout mechanism from the
  ESR> Parrot interpreter core to the Perl or Python runtime's C level
  ESR> that can pass out Python/Perl types and return them.

Not quite sure what you mean here.

  ESR> Given these two premises, what other problems are there?

  ESR> I can see one: garbage collection.  What others are there?

I think you mean memory management in general, not just GC.  Others:
thread model, interpreter management (such as creating embedded
interpreter objects).

Jeremy



From greg@cosc.canterbury.ac.nz  Mon Jul 30 23:58:59 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2001 10:58:59 +1200 (NZST)
Subject: [Python-Dev] Nostalgic Versions
In-Reply-To: <15205.37144.824975.214559@beluga.mojam.com>
Message-ID: <200107302258.KAA00589@s454.cosc.canterbury.ac.nz>

Skip Montanaro <skip@pobox.com>:

> I'll wager Moshe is planning on "fixing" division. ;-)

Guido, why don't you just lend Moshe your time machine?
Then he can go back and fix division in 0.1 and the whole
problem will disappear from the timeline.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From aahz@rahul.net  Tue Jul 31 00:01:06 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Mon, 30 Jul 2001 16:01:06 -0700 (PDT)
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <200107302042.f6UKg5H03826@odiug.digicool.com> from "Guido van Rossum" at Jul 30, 2001 04:42:05 PM
Message-ID: <20010730230106.0E7E799CA4@waltz.rahul.net>

Guido van Rossum wrote:
> 
> In the IBM model that Aahz is implementing, decimal numbers are not
> necessarily exact, but (if I understand correctly) you can set a
> context flag that causes an exception to be raised when the result of
> an operation on two exact inputs is inexact.  This can happen when
> e.g. a multiplication result exceeds the number of significant digits
> specified in the context -- then truncation is applied like for binary
> floats.

Rounding, actually (unless the specified context, which is not the
default, requests truncation), but yes.
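
The difference, sketched with Python's present-day decimal module (which
follows the same IBM specification but postdates this message; multiply is
an illustrative helper):

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_EVEN


def multiply(a, b, rounding):
    """Multiply to three significant digits under the given rounding mode."""
    ctx = Context(prec=3, rounding=rounding)
    return ctx.multiply(Decimal(a), Decimal(b))


rounded = multiply("1.23", "4.56", ROUND_HALF_EVEN)   # exact result is 5.6088
truncated = multiply("1.23", "4.56", ROUND_DOWN)
print(rounded, truncated)
```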
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From thomas@xs4all.net  Tue Jul 31 00:02:56 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 01:02:56 +0200
Subject: [Python-Dev] pep-discuss
In-Reply-To: <200107301740.f6UHe6K03226@odiug.digicool.com>
Message-ID: <20010731010256.F20676@xs4all.nl>

On Mon, Jul 30, 2001 at 01:40:06PM -0400, Guido van Rossum wrote:

[ Where to discuss PEPs ]

[Aahz]
> > While what you say makes sense, overall, there are a lot of people (me
> > included) who prefer discussion on newsgroups, and I can't quite see
> > creating a newsgroup for PEP discussions yet.  Call me -0.25 for kicking
> > discussion off c.l.py and +0.25 for getting it off python-dev.

> For me personally, it would just be another list to follow, no matter
> where it happens, so consider me -0.  I won't object if a majority on
> python-dev wants this though.

I'd like to second that, with one minor addition: no crossposting, *please*.
I got kind of fed up with the iterators discussions when I got in after a
long weekend, and had to read through three copies of several threads (the
iterators list, python-list and python-dev) and all were sufficiently long
(and only quickly skimmed by me in all cases) that I couldn't remember
whether I'd already seen that message... I ended up skipping whole
discussions, which defeated the point of being on the list.

If people keep up the crossposting, I'd rather have the discussions take
place on python-dev or python-list to start with.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From sdm7g@Virginia.EDU  Tue Jul 31 00:24:34 2001
From: sdm7g@Virginia.EDU (Steven D. Majewski)
Date: Mon, 30 Jul 2001 19:24:34 -0400 (EDT)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730051831.B1122@thyrsus.com>
Message-ID: <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>


Since python does so much by name (I think locals are the only
place where name lookup is compiled away), dictionary lookup
is pretty fundamental to the runtime, and is used by a number
of opcodes. No reason that dictionary lookup couldn't be part
of the common runtime, or else tucked behind an abstract 
interface common to Python and Perl lookups, but it's something
to keep in mind if the VM is going to operate on a similar 
level to the current one. 
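
The split between locals and everything else shows up in the bytecode.  A
small sketch using the standard dis module (measure is an illustrative
function; opcode spellings vary between CPython versions, so no output is
shown):

```python
import dis


def measure(items):
    # `items` is a local: the compiler resolves it to a slot index.
    # `len` is not: it is looked up by name in dictionaries at run time.
    return len(items)


for instr in dis.get_instructions(measure):
    if instr.opname.startswith("LOAD_"):
        print(instr.opname, instr.argrepr)
```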

But then, I've always thought that one of the problems with
trying to optimize Python was that the VM was too high level. 

If some sort of Forth-like extensible threaded code were used,
you could build the current opcodes from lower level primitives.
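
A toy in that spirit, where compound words are just lists of other words.
It is a sketch only: real threaded code uses lists of addresses rather than
dictionary lookups, and run, WORDS and the word names are invented for
illustration.

```python
def run(thread, stack, words):
    """Execute a list of word names; a word is a primitive or another list."""
    for name in thread:
        body = words[name]
        if callable(body):
            body(stack)                 # primitive: run it directly
        else:
            run(body, stack, words)     # compound: thread through its words


def dup(stack):
    stack.append(stack[-1])


def mul(stack):
    right = stack.pop()
    stack.append(stack.pop() * right)


WORDS = {"DUP": dup, "MUL": mul}
WORDS["SQUARE"] = ["DUP", "MUL"]        # built from lower-level primitives
WORDS["FOURTH"] = ["SQUARE", "SQUARE"]  # built from another compound word

stack = [3]
run(["FOURTH"], stack, WORDS)
print(stack)
```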

Re: stack vs. register machines:

Some Forth implementations cache some of the top of stack in
registers, but the more you try to cache, the hairier it gets.
( But you can figure out the bookkeeping once and automatically
  generate the code variations. ) 


You might take a look at Anton Ertl's VMGEN:

	<http://www.complang.tuwien.ac.at/anton/vmgen/>

| Vmgen generates much of the code for efficient virtual machine (VM)
| interpreters from simple descriptions of the VM instructions. 


[  One of the nice/useful features of the Forth VM is the PFA/CFA
  pairing: PFA is "parameter field address" and points to the code 
  ( VM or native ) to be executed. CFA is "code field address" and
  points to the code to interpret what's in the parameter field. 
  For threaded code, it points to the threaded code interpreter; 
  for native code, it points to the PFA -- i.e. native code is
  'self interpreting' . BUILDS/DOES in Forth creates a data type
  (BUILDS) and defines code to access the data type (DOES) that is
  pointed to by the CFA -- an early but very primitive object-orientation, 
  but with only one method (later Forths added QUADS and other methods
  to have separate ACCESS/UPDATE (read/write) methods). ]


Re: Other VM implementations:

I'm not very familiar with the internals of Squeak, but I suspect 
that it's worth looking at. They are, in any case, interested in
some of the same sort of things. ( There was a recent thread about
MIT's StarLogo -- which was originally written for the Mac using
(I think) Lisp, and then a portable version was done using Java, 
but they were disappointed in performance, and I think they are 
looking at using Squeak now. ) 


Scheme48 is probably considered the best portable byte-code Scheme
implementation. ( Don't know anything about its internals myself )


 A lot of other people who have tried using the Java VM for other
languages have had complaints about various things that are difficult
or impossible. ( Scheme folks couldn't have full call/cc, and there
were two different attempts to add generics to Java -- one involved
adding special bytecode support, and the other (Pizza -- now GJ --
Generic Java) tried to stick with a standard Java VM. ) 
 
-- Steve majewski




From thomas@xs4all.net  Tue Jul 31 00:24:32 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 01:24:32 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730051831.B1122@thyrsus.com>
Message-ID: <20010731012432.G20676@xs4all.nl>

On Mon, Jul 30, 2001 at 05:18:31AM -0400, Eric S. Raymond wrote:

> Let's suppose, for the sake of the design discussion, that we can make
> the type ontologies of the Perl and Python bytecode match up.  

I'm afraid I'll have to side with Jeremy when I say, "What?"

> Let's further suppose that we have a callout mechanism from the Parrot 
> interpreter core to the Perl or Python runtime's C level that can pass out 
> Python/Perl types and return them.

> Given these two premises, what other problems are there?

> I can see one: garbage collection.  What others are there?

As a midnight braindump:

What about Perl's 'dynamic' (or 'really JIT') compilation ? The incessant
weak typing -- would this be part of the Perl side of Parrot, or part of the
Parrot types ? The differences in the regex engine; in Python, regular
expressions are optional. Also, the Perl engine has some features SRE
hasn't, yet, and vice versa (last I checked, Perl's regexps didn't do
unicode or named groups.) And what about Perl's 'Taint' mode ? I don't see
how you can emulate that on top of the Parrot runtime, as it's a tag that
gets carried into operations. And I won't even start with Perl's more
archaic features, that change the whole working of the interpreter.

You mentioned regular expressions as an upside for Python, from this
'merger'. Why is that ? We have a good regex engine, and it's tuned to
Python's needs. Do we need 'regex literals' ? Why ? And why would we need a
merger with Perl for that, anyway -- I've seen some arbitrary-type-literals
suggestions come by in the last couple of days that would make it
possible :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From Mark.Favas@per.dem.csiro.au  Tue Jul 31 00:27:40 2001
From: Mark.Favas@per.dem.csiro.au (Favas, Mark (EM, Floreat))
Date: Tue, 31 Jul 2001 07:27:40 +0800
Subject: [Python-Dev] Picking on platform fmod
Message-ID: <51716131991ED5118CDE00B0D02351865F94@MOORT>

[Tim asks for platforms from Mars, unaccountably including Tru64 Unix in
there <wink>]

No problems on Tru64 (v4.0F), no problems on FreeBSD 4.3-RELEASE, no
problems on Solaris 8

More specifically, OK on:

OSF1 erebus V4.0 1229 alpha
FreeBSD teche 4.3-RELEASE FreeBSD 4.3-RELEASE (Intel)
SunOS asafoetida 5.8 Generic_108528-09 sun4u sparc SUNW,Ultra-60


Mark C Favas
CSIRO Exploration & Mining
Private Bag No. 5
Wembley, Western Australia 6913
Phone - +61 8 93336268



From thomas@xs4all.net  Tue Jul 31 00:34:48 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 01:34:48 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107302251.KAA00585@s454.cosc.canterbury.ac.nz>
References: <200107302251.KAA00585@s454.cosc.canterbury.ac.nz>
Message-ID: <20010731013448.H20676@xs4all.nl>

On Tue, Jul 31, 2001 at 10:51:35AM +1200, Greg Ewing wrote:
> Skip Montanaro <skip@pobox.com>:

> > The main stumbling block was that pesky "from module import *"
> > statement.  It could push an unknown quantity of stuff onto the
> > stack

> Are you *sure* about that? I'm pretty certain it can't
> be true, since the compiler has to know at all times
> how much is on the stack, so it can decide how much
> stack space is needed.

I think Skip meant it does an arbitrary number of 

load-onto-stack
store-into-namespace

operations. Skip, you'll be glad to know that's no longer true :) Since 2.0
(or when was it that we introduced 'import as' ?) import-* is not a special
case of 'IMPORT_FROM', but rather a separate opcode that doesn't touch the
stack. 'IMPORT_FROM' is now only used to push a given name from the module
at TOS onto the stack:

>>> def eggs():
...     from stat import a, b    
>>> dis.dis(eggs)
...
          9 IMPORT_NAME              0 (stat)
         12 IMPORT_FROM              1 (a)
         15 STORE_FAST               1 (a)
         18 IMPORT_FROM              2 (b)
         21 STORE_FAST               0 (b)
         24 POP_TOP
...

>>> def spam():
...     from stat import *
>>> dis.dis(spam)
...
          6 LOAD_CONST               1 (('*',))
          9 IMPORT_NAME              0 (stat)
         12 IMPORT_STAR
...

Bloody hell, what's that LOAD_CONST doing there ? I think I found a bug ;P
Sigh... Sleep first, fix later.
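
A hedged sketch for repeating the comparison on a current interpreter: modern Python rejects 'import *' inside a function, so this compiles module-level source instead. The helper name opnames is made up for the example, and the exact opcodes differ between Python versions (recent releases no longer have an opcode called IMPORT_STAR), so the output is not shown.

```python
import dis

def opnames(source):
    """Compile module-level source and list the opcode names it uses."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]

# Nothing is executed here; the source is only compiled and inspected.
named = opnames("from stat import S_ISDIR, S_ISREG")
star = opnames("from stat import *")

print(named)  # one IMPORT_FROM per imported name
print(star)   # no IMPORT_FROM at all
```

The point that survives the version differences is the one made above: the named form emits one IMPORT_FROM per name, and the star form emits none.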

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From sdm7g@Virginia.EDU  Tue Jul 31 00:39:47 2001
From: sdm7g@Virginia.EDU (Steven D. Majewski)
Date: Mon, 30 Jul 2001 19:39:47 -0400 (EDT)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730162901.F9578@ute.cnri.reston.va.us>
Message-ID: <Pine.NXT.4.21.0107301930450.258-100000@localhost.virginia.edu>


On Mon, 30 Jul 2001, Andrew Kuchling wrote:

> If you must have a portable bytecode format, why not use the JVM?
> Perhaps it's not optimal, but it works reasonably well, has a few
> reasonably complete free implementations that are mostly strangling
> due to lack of manpower, has some support in GCC 3.0, and is actually
> deployed in browsers and on people's systems *right now*.  I fail to
> see why we should run after some mythical Perl/Python bytecode that
> would have to be 1) designed 2) implemented 3) debugged 4) actually
> made available to users 5) actually downloaded by users.  (Much the
> same objections apply to .NET for Unix.)

Some of the folks who have done other languages on the JVM have 
complained about limitations of the Java VM when it comes to supporting
features of other languages. 

Supposedly, Microsoft considered some of those critiques when designing
the C# runtime & VM. 

If, in fact, they have done a better job of generic VM design, then
.NET may be worth looking at. ( Especially as there is now 
Miguel de Icaza's Mono project. ) 

( Of course, politically, that may be inviting a lot of arguments -- 
see the slashdot threads about whether Mono is a good idea, or is
just open source getting suckered by MS! ) 

-- Steve Majewski




From skip@pobox.com (Skip Montanaro)  Tue Jul 31 00:46:25 2001
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 30 Jul 2001 18:46:25 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15205.54612.424694.5559@slothrop.digicool.com>
References: <20010730014859.A15971@thyrsus.com>
 <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
 <200107302016.f6UKGoG03676@odiug.digicool.com>
 <20010730162901.F9578@ute.cnri.reston.va.us>
 <15205.54612.424694.5559@slothrop.digicool.com>
Message-ID: <15205.61905.569827.464400@beluga.mojam.com>

    Jeremy> If we had a JVM implementation designed to support Python, there
    Jeremy> would be no need to implement most of the opcodes.  We'd only
    Jeremy> need getstatic and invokevirtual <0.2 wink>.  The typed opcodes
    Jeremy> (int, float, etc.)  would never be used.

Perhaps Armin Rigo's Psyco stuff could make use of them if he chose the JVM
as his "other VM".  

Skip


From pedroni@inf.ethz.ch  Tue Jul 31 00:59:22 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Tue, 31 Jul 2001 01:59:22 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <20010730014859.A15971@thyrsus.com>       <200107301918.f6UJIt003517@odiug.digicool.com>       <20010730033517.A17356@thyrsus.com>       <200107302016.f6UKGoG03676@odiug.digicool.com>       <20010730162901.F9578@ute.cnri.reston.va.us>       <15205.54612.424694.5559@slothrop.digicool.com> <15205.61905.569827.464400@beluga.mojam.com>
Message-ID: <006b01c11953$b94d10a0$8a73fea9@newmexico>

Hi

[Skip Montanaro]
>
>     Jeremy> If we had a JVM implementation designed to support Python, there
>     Jeremy> would be no need to implement most of the opcodes.  We'd only
>     Jeremy> need getstatic and invokevirtual <0.2 wink>.  The typed opcodes
>     Jeremy> (int, float, etc.)  would never be used.
>
> Perhaps Armin Rigo's Psyco stuff could make use of them if he chose the JVM
> as his "other VM".
>
Yes, but feeding the JVM with bytecodes costs more than feeding a real CPU
or a VM written to deal quickly with little chunks of code.

JVM dynamic loading has a verification phase and accepts only full class
definitions; then you enter the interpretation/hotspot-collecting phase,
and then the dynamic compilation stuff...

Samuele.



From pedroni@inf.ethz.ch  Tue Jul 31 01:04:49 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Tue, 31 Jul 2001 02:04:49 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <Pine.NXT.4.21.0107301930450.258-100000@localhost.virginia.edu>
Message-ID: <007701c11954$6b0017c0$8a73fea9@newmexico>

> 
> Some of the folks who have done other languages on the JVM have 
> complained about limitations of the Java VM when it comes to supporting
> features of other languages. 
>
> Supposedly, Microsoft considered some of those critiques when designing
> the C# runtime & VM. 
> 
> If, in fact, they have done a better job of generic VM design, then
> .NET may be worth looking at. ( Especially as there is now 
> Miguel de Icaza's Mono project. ) 
A question: is there already any data on what the actual performance
of Python.NET vs. CPython would be?

> 
> ( Of course, politically, that may be inviting a lot of arguments -- 
> see the slashdot threads about whether Mono is a good idea, or is
> just open source getting suckered by MS! ) 
> 

Samuele Pedroni.



From akuchlin@mems-exchange.org  Tue Jul 31 01:56:57 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 30 Jul 2001 20:56:57 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731012432.G20676@xs4all.nl>; from thomas@xs4all.net on Tue, Jul 31, 2001 at 01:24:32AM +0200
References: <20010730051831.B1122@thyrsus.com> <20010731012432.G20676@xs4all.nl>
Message-ID: <20010730205657.A2298@ute.cnri.reston.va.us>

On Tue, Jul 31, 2001 at 01:24:32AM +0200, Thomas Wouters wrote:
>Parrot types ? The differences in the regex engine; in Python, regular
>expressions are optional. Also, the Perl engine has some features SRE

If regex opcodes form part of the basic VM, would the main loop end up
looking like the union of ceval.c and pypcre.c/_sre.c?  The thought is
too ghastly to contemplate, though a little part of me [*] would like
to see it.

--amk

[*] "Taking it in its deepest sense, the shadow is the invisible saurian
tail that man still drags behind him. Carefully amputated, it
becomes the healing serpent of the mysteries. Only monkeys parade
with it."  C.G. Jung, in _The Integration of the Personality_. (1939)



From esr@thyrsus.com  Mon Jul 30 14:06:24 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 30 Jul 2001 09:06:24 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15205.58857.375440.347263@slothrop.digicool.com>; from jeremy@zope.com on Mon, Jul 30, 2001 at 06:55:37PM -0400
References: <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com> <20010730033517.A17356@thyrsus.com> <200107302016.f6UKGoG03676@odiug.digicool.com> <20010730051831.B1122@thyrsus.com> <15205.58857.375440.347263@slothrop.digicool.com>
Message-ID: <20010730090624.A3944@thyrsus.com>

Jeremy Hylton <jeremy@zope.com>:
> What is a type ontology?  The definition of ontology I'm familiar with
> is too broad to be useful in understanding what you're getting at.
> I've never heard the technical term "type ontology".

I first heard it in connection with cross-language RPC.  The "type ontology"
of a language or protocol is its implicit theory of what kinds of things 
there are in the universe.  It's actually a pretty reasonable specialization
of the term "ontology" in philosophy.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A man who has nothing which he is willing to fight for, nothing 
which he cares about more than he does about his personal safety, 
is a miserable creature who has no chance of being free, unless made 
and kept so by the exertions of better men than himself. 
	-- John Stuart Mill, writing on the U.S. Civil War in 1862


From esr@thyrsus.com  Mon Jul 30 14:12:08 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 30 Jul 2001 09:12:08 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731012432.G20676@xs4all.nl>; from thomas@xs4all.net on Tue, Jul 31, 2001 at 01:24:32AM +0200
References: <20010730051831.B1122@thyrsus.com> <20010731012432.G20676@xs4all.nl>
Message-ID: <20010730091208.C3944@thyrsus.com>

Thomas Wouters <thomas@xs4all.net>:
> > Let's suppose, for the sake of the design discussion, that we can make
> > the type ontologies of the Perl and Python bytecode match up.  
> 
> I'm afraid I'll have to side with Jeremy when I say, "What?"

Explained in public reply to Jeremy.
 
> You mentioned regular expressions as an upside for Python, from this
> 'merger'. Why is that ?

No.  I was referring to the fact that we have *already* coopted Perl's
regexp design.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The end move in politics is always to pick up a gun.
	-- R. Buckminster Fuller


From guido@zope.com  Tue Jul 31 03:02:06 2001
From: guido@zope.com (Guido van Rossum)
Date: Mon, 30 Jul 2001 22:02:06 -0400
Subject: [Python-Dev] Python API version & optional features
In-Reply-To: Your message of "Tue, 31 Jul 2001 00:22:05 +0200."
 <200107302222.f6UMM5105688@mira.informatik.hu-berlin.de>
References: <200107302222.f6UMM5105688@mira.informatik.hu-berlin.de>
Message-ID: <200107310202.WAA10380@cj20424-a.reston1.va.home.com>

> >> I guess one could argue that extension writers should check
> >> for narrow/wide builds in their extensions before using Unicode.
> >> 
> >> Since the number of Unicode extension writers is much smaller 
> >> than the number of users, I think that this approach would be 
> >> reasonable, provided that we document the problem clearly in the 
> >> NEWS file.
> 
> > OK.  I approve.
> 
> I'm not sure I can follow. What did you approve? That extension
> writers should check whether their Unicode build matches the one they
> get at run-time? How are they going to do that?

With an explicit call.  They know their compile-time unicode width,
they can pass that to a function defined in the main Python/C API
which asserts that the argument is the same as *its* compile-time
unicode width.
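
A rough model of that check, written in Python only to show the shape of the call; the real thing would be a C function in the Python/C API. The names runtime_unicode_width and check_unicode_width are hypothetical, and sys.maxunicode stands in for the interpreter's compile-time width.

```python
import sys

def runtime_unicode_width():
    """Bytes per code unit for this interpreter (2 = narrow, 4 = wide)."""
    return 4 if sys.maxunicode > 0xFFFF else 2

def check_unicode_width(extension_width):
    """Raise ImportError if the extension was built for another width."""
    expected = runtime_unicode_width()
    if extension_width != expected:
        raise ImportError(
            "extension built for %d-byte Unicode, interpreter uses %d-byte"
            % (extension_width, expected)
        )

# An extension would pass its own compile-time constant here; this call
# passes the matching value, so it succeeds silently.
check_unicode_width(runtime_unicode_width())
```

Doing the check once, at extension initialisation, turns a silent memory-layout mismatch into an immediate import failure.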

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ping@lfw.org  Tue Jul 31 03:43:45 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 30 Jul 2001 19:43:45 -0700 (PDT)
Subject: [Python-Dev] cgitb.py for Python 2.2
Message-ID: <Pine.LNX.4.32.0107301905210.4535-100000@ziggy.localdomain.fake>

Hi guys.

Sorry i've been fairly quiet recently -- at least life isn't dull.
I wanted to put in a few words for cgitb.py for your consideration.

I think you all saw it at IPC 9 -- if you missed the presentation,
there are examples at http://www.lfw.org/python to check out.

What i'm proposing is that we toss cgitb.py into the standard library
(pretty small at about 100 lines, since all the heavy lifting is in
pydoc and inspect).  Then we can add this to site.py:

    if os.environ.has_key("GATEWAY_INTERFACE"):
        import sys, cgitb
        sys.excepthook = cgitb.excepthook

I think this is pretty safe, since GATEWAY_INTERFACE is guaranteed
to exist under the CGI specification and should never appear in any
other context.  cgitb.py is written in paranoid fashion -- if anything
goes wrong during generation of the HTML traceback, sys.stderr still
goes to the browser; and if for some reason the page gets dumped to
a shell somewhere, the original traceback is still visible in a comment
at the end of the page.
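
To make the mechanism concrete, here is a much-reduced sketch of such a hook. It is not cgitb itself: html_excepthook and install_if_cgi are names invented for the example, and it uses the current spelling ("in os.environ") rather than has_key. It only shows the two safeguards described above, an escaped traceback for the browser plus the plain traceback in a trailing comment.

```python
import html
import os
import sys
import traceback

def html_excepthook(exc_type, exc_value, exc_tb, out=None):
    """Write the traceback as a small HTML page instead of to stderr."""
    out = sys.stdout if out is None else out
    text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    out.write("Content-Type: text/html\n\n")
    out.write("<pre>" + html.escape(text) + "</pre>\n")
    # Keep the unescaped traceback readable if the page lands in a shell.
    # "--" may not appear inside an HTML comment, so it is broken up.
    out.write("<!--\n" + text.replace("--", "- -") + "\n-->\n")

def install_if_cgi(environ=None):
    """Install the hook only when running under CGI; report whether we did."""
    environ = os.environ if environ is None else environ
    if "GATEWAY_INTERFACE" in environ:
        sys.excepthook = html_excepthook
        return True
    return False
```

Taking the environment as a parameter keeps the detection testable without touching the real process environment.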

The upside is that we *automagically* get pretty tracebacks for all
the Python CGI scripts there, with zero effort from the CGI script
writers.  I think this is a really strong hook for people getting
started with Python.

No more "internal server error" messages followed by the annoying
task of inserting "print 'Content-Type: text/html\n\n<pre>'" into
all your scripts!  As for me, i've probably done this hundreds of
times now, and would love to stop doing it.

I anticipate a possible security concern (as this shows bits of your
source code to strangers when problems happen).  So i have tried to
address that by providing a SECRET flag in cgitb that causes the
tracebacks to get written to files instead of the Web browser.

Opinions and suggestions are welcomed!  (I'm looking at the good
stuff that the WebWare people have done with it, and i plan to
merge in their improvements.  For the HTML-heads out there in
particular, i'm looking for your thoughts on the reset() routine.)


-- ?!ng



From barry@zope.com  Tue Jul 31 04:04:03 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 30 Jul 2001 23:04:03 -0400
Subject: [Python-Dev] cgitb.py for Python 2.2
References: <Pine.LNX.4.32.0107301905210.4535-100000@ziggy.localdomain.fake>
Message-ID: <15206.8227.652539.471067@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee <ping@lfw.org> writes:

    KY> What i'm proposing is that we toss cgitb.py into the standard
    KY> library (pretty small at about 100 lines, since all the heavy
    KY> lifting is in pydoc and inspect).  Then we can add this to
    KY> site.py:

No time right now to look at it, but I remember it looked pretty cool
at IPC9.  I'd like to merge in some of the ideas I've developed in
Mailman's driver script, which prints out the environment and some
other sys information.  driver always prints to a log file and
optionally to stdout (it has a STEALTH_MODE variable that's probably
equivalent to your SECRET).

One thing I tried very hard to do was to make driver bulletproof, so
that it only imported a very minimal amount of stuff, and that /any/
exception along the way would get caught and not allowed to percolate
up out of the top frame (which would cause a non-zero exit status and
unhelpful message in the browser).  About the only thing that isn't
caught are exceptions importing sys, but if that happens you have
bigger problems! :)

I'll take a closer look at cgitb.py when I get a chance, but I'm
generally +1 on the idea.

-Barry


From gnat@oreilly.com  Tue Jul 31 04:08:47 2001
From: gnat@oreilly.com (Nathan Torkington)
Date: Mon, 30 Jul 2001 20:08:47 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730205657.A2298@ute.cnri.reston.va.us>
References: <20010730051831.B1122@thyrsus.com>
 <20010731012432.G20676@xs4all.nl>
 <20010730205657.A2298@ute.cnri.reston.va.us>
Message-ID: <15206.8511.147000.832644@gargle.gargle.HOWL>

Andrew Kuchling writes:
> If regex opcodes form part of the basic VM, would the main loop end up
> looking like the union of ceval.c and pypcre.c/_sre.c?  The thought is
> too ghastly to contemplate, though a little part of me [*] would like
> to see it.

(perl guy speaking alert) The plan for perl6 is to implement the
regular expression engine as opcodes.  We feel this would be cleaner
and faster than having the essentially separate module that we have
right now.  I think our current perl5 project manager was the one who
said that we have no idea how inefficient our current RE engine is,
because it's been "optimized" to the point where it's impossible to
read.

The core loop would just be the usual opcode dispatch loop ("call the
function for the current operation, which returns the next
operation").  The only difference is that some of the opcodes would be
specific to RE matches.  (I'm unclear on how much special logic RE
opcodes involve--it may be possible to implement REs with the
operations that regular language features like loops and tests
require).
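
A toy version of that loop, sketched in Python under heavy assumptions: each operation is a function that returns the index of the next operation (or None to stop), and one of the opcodes is regex-flavoured. The opcode names and the program layout are invented for the example, there is no backtracking, and none of this reflects the actual perl6 design.

```python
def op_char(state, pc, ch):
    """Match one literal character, then fall through to the next op."""
    pos = state["pos"]
    if pos < len(state["text"]) and state["text"][pos] == ch:
        state["pos"] = pos + 1
        return pc + 1
    return None  # mismatch: stop without setting "matched"

def op_match(state, pc, _arg):
    """Record success and stop."""
    state["matched"] = True
    return None

def run(program, text):
    """Dispatch loop: call the current op, which returns the next one."""
    state = {"text": text, "pos": 0, "matched": False}
    pc = 0
    while pc is not None:
        op, arg = program[pc]
        pc = op(state, pc, arg)
    return state["matched"]

# Program equivalent to matching the prefix "ab".
program = [(op_char, "a"), (op_char, "b"), (op_match, None)]
print(run(program, "abc"))  # prints True
print(run(program, "axc"))  # prints False
```

The loop itself knows nothing about regular expressions; all of the matching logic lives in the opcodes, which is the property being claimed for the shared core loop.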

Nat



From gnat@oreilly.com  Tue Jul 31 04:12:04 2001
From: gnat@oreilly.com (Nathan Torkington)
Date: Mon, 30 Jul 2001 20:12:04 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731012432.G20676@xs4all.nl>
References: <20010730051831.B1122@thyrsus.com>
 <20010731012432.G20676@xs4all.nl>
Message-ID: <15206.8708.811000.468489@gargle.gargle.HOWL>

Thomas Wouters writes:
> Also, the Perl engine has some features SRE hasn't, yet, and vice
> versa (last I checked, Perl's regexps didn't do unicode or named
> groups.)

Perl's REs now do Unicode.  Perl 6's REs will do named groups.

> And I won't even start with Perl's more archaic features, that
> change the whole working of the interpreter.

Those are going away.  Perl people hate them as much as you do--the
only time they're used now is to make deliberately hideous code, and
hardly anyone will seriously lament the passing of that ability.  No
more "change the starting position for subscripts", no more "change
all RE matches globally", and so on.

Nat



From gnat@oreilly.com  Tue Jul 31 04:15:34 2001
From: gnat@oreilly.com (Nathan Torkington)
Date: Mon, 30 Jul 2001 20:15:34 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730162901.F9578@ute.cnri.reston.va.us>
References: <20010730014859.A15971@thyrsus.com>
 <200107301918.f6UJIt003517@odiug.digicool.com>
 <20010730033517.A17356@thyrsus.com>
 <200107302016.f6UKGoG03676@odiug.digicool.com>
 <20010730162901.F9578@ute.cnri.reston.va.us>
Message-ID: <15206.8918.603000.448728@gargle.gargle.HOWL>

Andrew Kuchling writes:
> There's also the cultural difference between Python's "write it
> clearly and then optimize it" and Perl's "let's write clever optimized
> code right from the start".  Perhaps this can be bridged, perhaps not.

The people designing and implementing perl6 have already agreed on a
"do it clean, then make it faster" approach.  We can all see the
problems with the current Perl internals, and have no desire to repeat
the mistakes of the past.

There may or may not be an impedance mismatch between the two languages
(Perl's flexitypes might be one of the sticking points) but this won't
be one of them.

Nat



From esr@thyrsus.com  Mon Jul 30 16:51:03 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 30 Jul 2001 11:51:03 -0400
Subject: [Python-Dev] cgitb.py for Python 2.2
In-Reply-To: <Pine.LNX.4.32.0107301905210.4535-100000@ziggy.localdomain.fake>; from ping@lfw.org on Mon, Jul 30, 2001 at 07:43:45PM -0700
References: <Pine.LNX.4.32.0107301905210.4535-100000@ziggy.localdomain.fake>
Message-ID: <20010730115103.A2052@thyrsus.com>

Ka-Ping Yee <ping@lfw.org>:
> The upside is that we *automagically* get pretty tracebacks for all
> the Python CGI scripts there, with zero effort from the CGI script
> writers.  I think this is a really strong hook for people getting
> started with Python.

<boggle>I've been to look at the cgitb page.  My jaw dropped open.</boggle>

+1
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The abortion rights and gun control debates are twin aspects of a deeper
question --- does an individual ever have the right to make decisions
that are literally life-or-death?  And if not the individual, who does?


From paulp@ActiveState.com  Tue Jul 31 04:49:50 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 30 Jul 2001 20:49:50 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <20010730051831.B1122@thyrsus.com> <20010731012432.G20676@xs4all.nl> <20010730205657.A2298@ute.cnri.reston.va.us>
Message-ID: <3B662ADD.9E701795@ActiveState.com>

Andrew Kuchling wrote:
> 
>...
> 
> If regex opcodes form part of the basic VM, would the main loop end up
> looking like the union of ceval.c and pypcre.c/_sre.c?  The thought is
> too ghastly to contemplate, though a little part of me [*] would like
> to see it.

Welcome to Perl. :)

I don't really understand it but here are references that might help:

http://aspn.activestate.com/ASPN/Mail/Message/638953
http://aspn.activestate.com/ASPN/Mail/Message/639000
http://aspn.activestate.com/ASPN/Mail/Message/639048

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From paulp@ActiveState.com  Tue Jul 31 05:17:04 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 30 Jul 2001 21:17:04 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <Pine.NXT.4.21.0107301930450.258-100000@localhost.virginia.edu> <007701c11954$6b0017c0$8a73fea9@newmexico>
Message-ID: <3B663140.41CB9DD7@ActiveState.com>

Samuele Pedroni wrote:
> 
>...
> A question: is there already any data on what the actual performance
> of Python.NET vs. CPython would be?

I think it is safe to say that the current version of Python.NET is
slower than Jython. Now it hasn't been optimized as much as Jython so we
might be able to get it as fast as Jython. But I don't think that there
is anything in the .NET runtime that makes it a great deal better than
the JVM for dynamic languages. The only difference is that Microsoft
seems more aware of the problem and may move to correct it whereas I
have a feeling that explicit support for our languages would dilute
Sun's 100% Java marketing campaign. Also, the .NET CLR is standardized
at ECMA so we could (at least in theory!) go to the meetings and try to
influence version 2.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From m@moshez.org  Tue Jul 31 05:15:46 2001
From: m@moshez.org (Moshe Zadka)
Date: Tue, 31 Jul 2001 07:15:46 +0300
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010730051831.B1122@thyrsus.com>
References: <20010730051831.B1122@thyrsus.com>, <20010730014859.A15971@thyrsus.com> <200107301918.f6UJIt003517@odiug.digicool.com> <20010730033517.A17356@thyrsus.com> <200107302016.f6UKGoG03676@odiug.digicool.com>
Message-ID: <E15RQwc-00072Q-00@darjeeling>

On Mon, 30 Jul 2001, "Eric S. Raymond" <esr@thyrsus.com> wrote:

> Let's further suppose that we have a callout mechanism from the Parrot 
> interpreter core to the Perl or Python runtime's C level that can pass out 
> Python/Perl types and return them.
> 
> Given these two premises, what other problems are there?

This solution sounds like just taking two VM interpreters and forcing
them together by having the first byte of the instruction be "Python opcode"
or "Perl opcode". You get none of the wins you were aiming for.

> I can see one: garbage collection.

How is GC a problem? Python never promised a specific GC mechanism,
so as long as you have something which collects garbage, Python is
fine.
-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE
Learn Python! http://www.ibiblio.org/obp/thinkCSpy


From m@moshez.org  Tue Jul 31 05:18:10 2001
From: m@moshez.org (Moshe Zadka)
Date: Tue, 31 Jul 2001 07:18:10 +0300
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
References: <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
Message-ID: <E15RQyw-000730-00@darjeeling>

On Mon, 30 Jul 2001, "Steven D. Majewski" <sdm7g@Virginia.EDU> wrote:

> Scheme48 is probably considered the best portable byte-code Scheme
> implementation. ( Don't know anything about its internals myself )

Last I heard (admittedly, >1 yr. ago), it didn't support 64 bit
architectures.
-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE
Learn Python! http://www.ibiblio.org/obp/thinkCSpy


From greg@cosc.canterbury.ac.nz  Tue Jul 31 06:00:45 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 31 Jul 2001 17:00:45 +1200 (NZST)
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
Message-ID: <200107310500.RAA00648@s454.cosc.canterbury.ac.nz>

"Steven D. Majewski" <sdm7g@Virginia.EDU>:

> But then, I've always thought that one of the problems with
> trying to optimize Python was that the VM was too high level. 

No, the problem is that Python is just too darn dynamic!
This is a feature of the language, not just the VM.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@zope.com  Tue Jul 31 07:22:13 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 02:22:13 -0400
Subject: [Python-Dev] cgitb.py for Python 2.2
In-Reply-To: Your message of "Mon, 30 Jul 2001 19:43:45 PDT."
 <Pine.LNX.4.32.0107301905210.4535-100000@ziggy.localdomain.fake>
References: <Pine.LNX.4.32.0107301905210.4535-100000@ziggy.localdomain.fake>
Message-ID: <200107310622.CAA11742@cj20424-a.reston1.va.home.com>

> Sorry i've been fairly quiet recently -- at least life isn't dull.

You still have a few SF bugs and patches assigned!  How about
addressing those?!

> I wanted to put in a few words for cgitb.py for your consideration.
> 
> I think you all saw it at IPC 9 -- if you missed the presentation,
> there are examples at http://www.lfw.org/python to check out.

Yeah, it's cool.

> What i'm proposing is that we toss cgitb.py into the standard library
> (pretty small at about 100 lines, since all the heavy lifting is in
> pydoc and inspect).  Then we can add this to site.py:
> 
>     if os.environ.has_key("GATEWAY_INTERFACE"):
>         import sys, cgitb
>         sys.excepthook = cgitb.excepthook

Why not add this to cgi.py instead?  The site.py initialization is
accumulating a lot of cruft, and I don't like new additions that are
irrelevant for most apps (CGI is a tiny niche for Python IMO).  (I
also think all the stuff that's only for interactive mode should be
moved off to another module that is only run in interactive mode.)
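
A minimal sketch of the opt-in alternative, assuming the hook is
installed by the CGI code itself rather than by site.py.  The names
cgi_excepthook and install_if_cgi are hypothetical, and the plain-text
report merely stands in for the much richer page cgitb.py produces:

```python
import os
import sys
import traceback


def cgi_excepthook(exc_type, exc_value, exc_tb, out=None):
    """Write an uncaught exception as a minimal plain-text CGI response."""
    if out is None:
        out = sys.stdout
    # A CGI response needs a header block before any body text.
    out.write("Content-Type: text/plain\r\n\r\n")
    out.write("".join(traceback.format_exception(exc_type, exc_value, exc_tb)))


def install_if_cgi(environ=None):
    """Install the hook only when the process looks like a CGI script."""
    if environ is None:
        environ = os.environ
    if "GATEWAY_INTERFACE" in environ:
        sys.excepthook = cgi_excepthook
        return True
    return False
```

A script would call install_if_cgi() once near its top; programs that
never import the module pay nothing for it.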

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Tue Jul 31 07:29:36 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 02:29:36 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Your message of "Mon, 30 Jul 2001 21:17:04 PDT."
 <3B663140.41CB9DD7@ActiveState.com>
References: <Pine.NXT.4.21.0107301930450.258-100000@localhost.virginia.edu> <007701c11954$6b0017c0$8a73fea9@newmexico>
 <3B663140.41CB9DD7@ActiveState.com>
Message-ID: <200107310629.CAA11818@cj20424-a.reston1.va.home.com>

> Also, the .NET CLR is standardized at ECMA so we could (at least in
> theory!) go to the meetings and try to influence version 2.

Notice the addition "in theory".  In practice, this is BS.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Tue Jul 31 08:37:27 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 09:37:27 +0200
Subject: [Python-Dev] Python API version & optional features
References: <200107302222.f6UMM5105688@mira.informatik.hu-berlin.de>
Message-ID: <3B666037.6A813780@lemburg.com>

"Martin v. Loewis" wrote:
> 
> >> I guess one could argue that extension writers should check
> >> for narrow/wide builds in their extensions before using Unicode.
> >>
> >> Since the number of Unicode extension writers is much smaller
> >> than the number of users, I think that this approach would be
> >> reasonable, provided that we document the problem clearly in the
> >> NEWS file.
> 
> > OK.  I approve.
> 
> I'm not sure I can follow. What did you approve? 

To use macros in unicodeobject.h which then map all interface names
to either PyUnicodeUCS2_* or PyUnicodeUCS4_*. The linker will then
report the mismatch in interfaces.

> That extension
> writers should check whether their Unicode build matches the one they
> get at run-time? How are they going to do that?

They would have to use at least one of the PyUnicode_* APIs in
their code. 

I think it would also be a good idea to provide 
a non-mangled PyUnicode_UnicodeSize() API which would then return
the number of bytes occupied by Py_UNICODE of the Python build.
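
From the Python side the same question can be asked without any new
C API.  A rough sketch, assuming only sys.maxunicode and ctypes; the
function names are made up for illustration, and interpreters from
Python 3.3 on always report the wide value:

```python
import ctypes
import sys


def unicode_build():
    """Return "narrow" on a UCS-2 build and "wide" otherwise."""
    # A narrow build cannot represent code points above U+FFFF directly.
    return "narrow" if sys.maxunicode == 0xFFFF else "wide"


def wchar_size():
    """Size in bytes of the platform's wchar_t, for comparison."""
    return ctypes.sizeof(ctypes.c_wchar)
```

Comparing the two results shows whether the wchar_t == Py_UNICODE
assumption could hold on a given platform and build.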

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Tue Jul 31 09:14:53 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 10:14:53 +0200
Subject: [Python-Dev] Python API version & optional features
References: <3B655980.948BCDEF@lemburg.com> <15205.25545.353887.299167@cj42289-a.reston1.va.home.com> <3B6567A3.E386EAB9@lemburg.com> <200107301427.f6UERW802779@odiug.digicool.com>
 <3B65765A.9706A4A2@lemburg.com> <200107301547.f6UFlhB02991@odiug.digicool.com>
Message-ID: <3B6668FD.DA986A28@lemburg.com>

Guido van Rossum wrote:
> 
> > > Hm, the "u" argument parser is a nasty one to catch.  How likely is
> > > this to be the *only* reference to Unicode in a particular extension?
> >
> > It is not very likely but IMHO possible for e.g. extensions
> > which rely on the fact that wchar_t == Py_UNICODE and then do
> > direct interfacing to some other third party code.
> >
> > I guess one could argue that extension writers should check
> > for narrow/wide builds in their extensions before using Unicode.
> >
> > Since the number of Unicode extension writers is much smaller
> > than the number of users, I think that this approach would be
> > reasonable, provided that we document the problem clearly in the
> > NEWS file.
> 
> OK.  I approve.

Great ! I'll go ahead and fix unicodeobject.h.
 
> > Hmm, that would probably not make UCS-4 builds very popular ;-)
> 
> Do you have any reason to assume that it would be popular otherwise?
> :-) :-) :-)

Oh, I do hope that people try out the UCS-4 builds. They may not
be all that interesting yet, but I believe that for Asian users
they do have some advantages.
 
> > > These warnings should use the warnings framework, by the way, to make
> > > it easier to ignore a specific warning.  Currently it's a hard write
> > > to stderr.
> >
> > Using the warnings framework would indeed be a good idea (many older
> > extensions work just fine even with later API levels; the warnings
> > are annoying, though) !
> 
> Exactly.
> 
> I'm not going to make the change, but it should be a two-liner in
> Python/modsupport.c:Py_InitModule4().

I'll look into this as well.
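
To illustrate what the warnings framework buys the user, here is a
small Python-level sketch.  check_api_version is a hypothetical
stand-in for the check in Py_InitModule4(), and the message text is
only modelled on the existing one:

```python
import warnings


def check_api_version(module_name, module_api, interpreter_api):
    """Warn, instead of writing to stderr, when the API versions differ."""
    if module_api != interpreter_api:
        warnings.warn(
            "Python C API version mismatch for module %s: "
            "this Python has API version %d, module has version %d"
            % (module_name, interpreter_api, module_api),
            RuntimeWarning,
            stacklevel=2,
        )


def silence_mismatch_for(module_name):
    """Ignore the mismatch warning for one known-good extension only."""
    warnings.filterwarnings(
        "ignore",
        message="Python C API version mismatch for module %s" % module_name,
        category=RuntimeWarning,
    )
```

With this in place a user can silence the warning for one old but
working extension and still see it for every other module.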

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Tue Jul 31 09:30:20 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 10:30:20 +0200
Subject: [Python-Dev] Revised decimal type PEP
References: <0107301106520A.02216@fermi.eeel.nist.gov>
Message-ID: <3B666C9C.4400BD9C@lemburg.com>

Michael McLay wrote:
> 
> PEP: 2XX
> Title: Adding a Decimal type to Python
> Version: $Revision:$
> Author: mclay@nist.gov <mclay@nist.gov>
> Status: Draft
> Type: ??
> Created: 25-Jul-2001
> Python-Version: 2.2
> 
> Introduction
> 
>     This PEP describes the addition of a decimal number type to Python.
> 
>     ...
>
> Implementation
> 
>     The tokenizer will be modified to recognize number literals with
>     a 'd' suffix and a decimal() function will be added to __builtins__.

How will you be able to define the precision of decimals ? Implicit
by providing a decimal string with enough 0s to let the parser
deduce the precision ? Explicit like so: decimal(12, 5) ?

Also, what happens to the precision of the decimal object resulting
from numeric operations ?
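
For comparison only: the decimal module that Python's standard library
gained later (not the design in this draft) answers both questions
through a context object.  A short sketch, with hypothetical function
names:

```python
from decimal import Decimal, localcontext


def add_keeps_trailing_zeros():
    """Addition keeps the finer exponent of its two operands."""
    return Decimal("1.10") + Decimal("2.2")


def divide_with_precision(digits):
    """The active context, not the literal, bounds the significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        return Decimal(1) / Decimal(3)
```

Here the precision of a result is a property of the context in effect
when the operation runs, while the literal only fixes the exponent of
the operand.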

>     A decimal number can be used to represent integers and floating point
>     numbers and decimal numbers can also be displayed using scientific
>     notation. Examples of decimal numbers include:
>     
>     ...
>
>     This proposal will also add an optional  'b' suffix to the
>     representation of binary float type literals and binary int type
>     literals.

Hmm, I don't quite grasp the need for the 'b'... numbers without
any modifier will work the same way as they do now, right ?
 
>     ...
>
>     Expressions that mix binary floats with decimals introduce the
>     possibility of unexpected results because the two number types use
>     different internal representations for the same numerical value. 

I'd rather have this explicit in the sense that you define which
assumptions will be made and what issues arise (rounding, truncation,
loss of precision, etc.).
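
As an illustration of the issues, again using the decimal module of
later Python versions rather than anything in this PEP, the sketch
below shows the representation gap and one explicit conversion policy.
The helper names are hypothetical:

```python
from decimal import Decimal


def exact_value_of(float_value):
    """Decimal(float) converts the stored binary value exactly."""
    return Decimal(float_value)


def add_mixed(decimal_value, float_value):
    """Mixed arithmetic is refused; return None when it raises TypeError."""
    try:
        return decimal_value + float_value
    except TypeError:
        return None


def add_explicit(decimal_value, float_value):
    """Convert through repr(), the shortest string that round-trips."""
    return decimal_value + Decimal(repr(float_value))
```

The exact value of the float 0.1 differs from Decimal("0.1"), which is
the kind of assumption a PEP would have to spell out for any silent
conversion.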

>     The
>     severity of this problem is dependent on the application domain.  For
>     applications that normally use binary numbers the error may not be
>     important and the conversion should be done silently.  For newbie
>     programmers a warning should be issued so the newbie will be able to
>     locate the source of a discrepancy between the expected results and
>     the results that were achieved.  For financial applications the mixing
>     of decimal numbers with binary floats should raise an exception.
> 
>     To accommodate the three possible usage models the python interpreter
>     command line options will be used to set the level for warning and
>     error messages. The three levels are:
> 
>     promiscuous mode,   -f or  --promiscuous
>     safe mode           -s or --safe
>     pedantic mode       -p or --pedantic

How about a generic option:

	--numerics:[loose|safe|pedantic] or -n:[l|s|p]
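
The three levels map naturally onto warning filter actions.  A sketch
under stated assumptions: MixedNumericsWarning, set_numerics_mode and
mixed_add are invented for illustration, and pedantic mode here raises
the warning class itself rather than the TypeError shown in the PEP:

```python
import warnings
from decimal import Decimal


class MixedNumericsWarning(UserWarning):
    """Hypothetical category for decimal/binary float mixing."""


# One generic option selects one filter action.
_ACTIONS = {"loose": "ignore", "safe": "default", "pedantic": "error"}


def set_numerics_mode(mode):
    """Apply the loose, safe or pedantic policy to the warning category."""
    warnings.simplefilter(_ACTIONS[mode], category=MixedNumericsWarning)


def mixed_add(decimal_value, float_value):
    """Add a decimal and a binary float, reporting the mix as a warning."""
    warnings.warn(
        "the calculation mixes decimal numbers with binary floats",
        MixedNumericsWarning,
        stacklevel=2,
    )
    return decimal_value + Decimal(repr(float_value))
```

A single option then only has to pick a key of _ACTIONS, and programs
can still override the policy per module with ordinary filters.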

>     The default setting will be set to the safe setting. In safe mode
>     mixing decimal and binary floats in a calculation will trigger a warning
>     message.
> 
>     >>> type(12.3d + 12.2b)
>     Warning: the calculation mixes decimal numbers with binary floats
>     <type 'decimal'>
> 
>     In promiscuous mode warnings will be turned off.
> 
>     >>> type(12.3d + 12.2b)
>     <type 'decimal'>
> 
>     In pedantic mode warnings from safe mode will be turned into exceptions.
> 
>     >>> type(12.3d + 12.2b)
>     Traceback (innermost last):
>       File "<stdin>", line 1, in ?
>     TypeError: the calculation mixes decimal numbers with binary floats
> 
> Semantics of Decimal Numbers
> 
>     ??

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal@lemburg.com  Tue Jul 31 09:05:14 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 10:05:14 +0200
Subject: [Python-Dev] pep-discuss
References: <20010730154936.AE36899C94@waltz.rahul.net>
Message-ID: <3B6666BA.7F774C46@lemburg.com>

Aahz Maruch wrote:
> 
> Paul Prescod wrote:
> >
> > We've talked about having a mailing list for general PEP-related
> > discussions. Two things make me think that revisiting this would be a
> > good idea right now.
> >
> > First, the recent loosening up of the python-dev rules threatens the
> > quality of discussion about bread and butter issues such as patch
> > discussions and process issues.
> >
> > Second, the flamewar on python-list basically drowned out the usual
> > newbie questions and would give a person coming new to Python a very
> > negative opinion about the language's future and the friendliness of the
> > community. I would rather redirect as much as possible of that to a list
> > that only interested participants would have to endure.
> 
> While what you say makes sense, overall, there are a lot of people (me
> included) who prefer discussion on newsgroups, and I can't quite see
> creating a newsgroup for PEP discussions yet.  Call me -0.25 for kicking
> discussion off c.l.py and +0.25 for getting it off python-dev.

I don't really mind having PEP discussions on both c.l.p (to get
user feedback) and python-dev (for the purpose of reaching 
consensus). After all, python-dev is about developing Python,
so PEP discussion is very much on topic.

Note that a filter on "python-dev" in the List-ID field and
"PEP" in the subject should pretty much filter out all
PEP discussions from python-dev if you don't want to participate
in them.
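
Such a filter could look roughly like this, assuming the messages are
available as raw text; is_pep_thread is a hypothetical helper and the
exact List-ID value depends on the list server:

```python
import re
from email import message_from_string


def is_pep_thread(raw_message):
    """True for python-dev mail whose subject mentions a PEP."""
    msg = message_from_string(raw_message)
    # Header lookup is case-insensitive; a missing header yields "".
    list_id = msg.get("List-ID", "")
    subject = msg.get("Subject", "")
    on_list = "python-dev" in list_id.lower()
    # Match PEP as a whole word so unrelated words do not trigger it.
    return on_list and re.search(r"\bPEP\b", subject) is not None
```

A mail client rule doing the same two header tests would move these
threads into a separate folder.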

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From paulp@ActiveState.com  Tue Jul 31 09:47:03 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 31 Jul 2001 01:47:03 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <Pine.NXT.4.21.0107301930450.258-100000@localhost.virginia.edu> <007701c11954$6b0017c0$8a73fea9@newmexico>
 <3B663140.41CB9DD7@ActiveState.com> <200107310629.CAA11818@cj20424-a.reston1.va.home.com>
Message-ID: <3B667087.EBBE8938@ActiveState.com>

Guido van Rossum wrote:
> 
> > Also, the .NET CLR is standardized at ECMA so we could (at least in
> > theory!) go to the meetings and try to influence version 2.
> 
> Notice the addition "in theory".  In practice, this is BS.

It depends on the rules and politics of each particular standards group.
It is fundamentally a social activity. It also depends how much effort
you are willing to put into promoting your cause. Sam Ruby is chair of
the ECMA CLI group. He is a big scripting language fan. 

http://www2.hursley.ibm.com/tc39/

Also note the presence of Mike Cowlishaw of REXX fame and Dave Raggett
of the W3C.

Working within a standards body is a gamble. It can pay off big or it
can completely fail. We might find Microsoft our strongest ally -- they
have always been interested in having the scripting languages work well
on their platforms. They would hate to give programmers an
excuse to stick to Unix or the JVM.

I don't personally know enough about this particular circumstance to
know whether there is any possibility of significantly influencing
version 2 or not. Maybe the gamble isn't worth the effort. But I
wouldn't dismiss it out of hand.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From mwh@python.net  Tue Jul 31 10:23:48 2001
From: mwh@python.net (Michael Hudson)
Date: 31 Jul 2001 05:23:48 -0400
Subject: [Python-Dev] Changing the Division Operator -- PEP 238, rev 1.12
In-Reply-To: Samuele Pedroni's message of "Mon, 30 Jul 2001 21:59:35 +0200 (MET DST)"
References: <200107301959.VAA11733@core.inf.ethz.ch>
Message-ID: <2m4rrt78pn.fsf@starship.python.net>

Samuele Pedroni <pedroni@inf.ethz.ch> writes:

> ...
> > > > 
> > > > Does codeop currently work in Jython?  The solution should continue to
> > > > work in Jython then. 
> > > We have our interface compatible version of codeop that works.
> > 
> > Would implementing the new interfaces I sketched out for codeop.py be
> > possible in Jython?  That's the bit I care about, not so much the
> > interface to __builtin__.compile.
> Yes, it's possible.

Good; hopefully we can get somewhere then.

> > > > Does Jython support the same flag bit values as
> > > > CPython?  If not, Paul Prescod's suggestion to use keyword arguments
> > > > becomes very relevant.
> > > we support a subset of the co_flags, CO_NESTED e.g. is there with the same
> > > value.
> > > 
> > > But the embedding API is very different, my implementation of nested
> > > scopes does not define any Py_CF... flags, we have an internal CompilerFlags
> > > object but is more similar to PyFutureFeatures ...
> > 
> > Is this object exposed to Python code at all?
> Not publicly, but in Jython the separating line is a bit different,
> because public java classes are always accessible from jython,
> even most of the internals. That does not mean that every use of that
> is welcome and supported.

Ah, of course.  I'd forgotten how cool Jython was in some ways.

> >  One approach would be
> > PyObject-izing PyFutureFlags and making *that* the fourth argument to
> > compile...
> > 
> > class Compiler:
> >     def __init__(self):
> >         self.ff = ff.new() # or whatever
> >     def __call__(self, source, filename, start_symbol):
> >         code = compile(source, filename, start_symbol, self.ff)
> >         self.ff.merge(code.co_flags)
> >         return code
> I see, "internally" we already have a compiler_flags function
> that does the same as:
> >         code = compile(source, filename, start_symbol, self.ff)
> >         self.ff.merge(code.co_flags)
> 
> where self.ff is a CompilerFlags object.
> 
> I can re-arrange things for any interface, 

Well, I don't want to make more work for you - I imagine Guido's doing
enough of that for two!

> I was only trying to explain our approach and situation and a
> possible way to avoid duplicating some internal code in Python.

Can you point me to the code in CVS that implements this sort of
thing?  I don't really know Java but I can probably muddle through to
some extent.  We might as well have CPython copy Jython for once...

Cheers,
M.

-- 
  On the other hand, the following areas are subject to boycott
  in reaction to the rampant impurity of design or execution, as
  determined after a period of study, in no particular order:
    ...                              http://www.naggum.no/profile.html


From thomas@xs4all.net  Tue Jul 31 10:55:22 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 11:55:22 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15206.8708.811000.468489@gargle.gargle.HOWL>
References: <20010730051831.B1122@thyrsus.com> <20010731012432.G20676@xs4all.nl> <15206.8708.811000.468489@gargle.gargle.HOWL>
Message-ID: <20010731115521.I20676@xs4all.nl>

On Mon, Jul 30, 2001 at 08:12:04PM -0700, Nathan Torkington wrote:

> > And I won't even start with Perl's more archaic features, that
> > change the whole working of the interpreter.
> 
> Those are going away.

Yeah, I thought as much, which is why I wasn't going to start on them :)

> Perl people hate them as much as you do--the only time they're used now is
> to make deliberately hideous code, and hardly anyone will seriously lament
> the passing of that ability.  No more "change the starting position for
> subscripts", no more "change all RE matches globally", and so on.

I don't really hate the features, I just don't use them, and wouldn't want
them in Python :-) I do actually program Perl, and will do a lot more of it
in the next couple of months at least (I switched projects at work, to one
that will entail Perl programming roughly 80% of the time) -- I just like
Python a lot more.

Your comments do lead me to ask this question, though (and forgive me if it
comes over as the arrogant ranting of a Python bigot; it's definitely not
intended as such, even though I only have a Python implementor's point of
view.)

What's going to be the difference between Perl6 and Python ? The variable
typing-naming ($var, %var, etc) I presume, and the curly bracket vs.
indentation blocking issue. Regex-literals, 'unless', the '<expression>
if/unless/while <boolean exp>' shortcut, I guess ? Those are basically all
parser/compiler issues, so shouldn't be a real problem. The transmorphic
typing is trickier, as is taint mode and Perl's scoping rules.... Though the
latter could be done if we refactor the namespace-creation that is currently
done implicitly on function-creation, and allow it to be done explicitly.
The same goes for the variable-filling-assignment (which is quite different
from the name-binding assignment Python has.)

I don't really doubt that Perl and Python could use the same VM.... I'm not
entirely certain how much of the shared VM the two implementations would
actually be using. Is it worth it if the overlap is a mere, say, 25% ? (I
think it's more, but it depends entirely on how much different Perl6 is from
Perl5, and how much Python is willing to change.... Lurkers here know I'm
aggressively against gratuitous breakage :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From paulp@ActiveState.com  Tue Jul 31 11:18:48 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 31 Jul 2001 03:18:48 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <20010730051831.B1122@thyrsus.com> <20010731012432.G20676@xs4all.nl> <15206.8708.811000.468489@gargle.gargle.HOWL> <20010731115521.I20676@xs4all.nl>
Message-ID: <3B668608.F68B5953@ActiveState.com>

One of the things I picked up from the Perl conference is that Perl
users *seem* (to me) to have a higher tolerance for code breakage than
Python users. (and Python users have a higher tolerance than (let's say)
Java users) Even if we put aside Perl 6, Perlers talk pretty glibly
about ripping little used features out in Perl 5.8.0 and Perl 5.10 and
so forth. 

e.g. Damian said that Autoload is going away (or pseudo hashes or
something like that). Whether or not he was right, nobody in the room
threw tomatoes as I'm sure they would if Guido tried to kill
__getattr__.

Admittedly, I never know when I hear stuff like "tr///CU is dead"  or
"package; is dead" whether each was a feature that has been in for three
years or was added to an experimental release and removed from the next
experimental release.

I'm not criticizing the Perl community. Acceptance of change is a good
thing! But I think they should know how conservative the Python world
is. Last week there were storm troopers heading for Guido's house when he
announced that the division operator is going to change its behaviour
in two or three years. That means it would take a major PR effort to
convince the Python community that even minor language changes would be
worth the benefit of sharing a VM.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From sjoerd.mullender@oratrix.com  Tue Jul 31 11:23:27 2001
From: sjoerd.mullender@oratrix.com (Sjoerd Mullender)
Date: Tue, 31 Jul 2001 12:23:27 +0200
Subject: [Python-Dev] Picking on platform fmod
In-Reply-To: Your message of Sat, 28 Jul 2001 16:13:53 -0400.
 <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEABLCAA.tim.one@home.com>
Message-ID: <20010731102328.1260D301CF7@bireme.oratrix.nl>

Success on SGI O2 running IRIX6.5.12m with native compiler version
7.2.1.3m and compiled without -O.

On Sat, Jul 28 2001 "Tim Peters" wrote:

> Here's your chance to prove your favorite platform isn't a worthless pile of
> monkey crap <wink>.  Please run the attached.  If it prints anything other
> than
> 
> 0 failures in 10000 tries
> 
> it will probably print a lot.  In that case I'd like to know which flavor of
> C+libc+libm you're using, and the OS; a few of the failures it prints may be
> helpful too.

-- Sjoerd Mullender <sjoerd.mullender@oratrix.com>


From barry@zope.com  Tue Jul 31 11:54:54 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 31 Jul 2001 06:54:54 -0400
Subject: [Python-Dev] cgitb.py for Python 2.2
References: <Pine.LNX.4.32.0107301905210.4535-100000@ziggy.localdomain.fake>
 <200107310622.CAA11742@cj20424-a.reston1.va.home.com>
Message-ID: <15206.36478.421953.437702@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@zope.com> writes:

    >> What i'm proposing is that we toss cgitb.py into the standard
    >> library (pretty small at about 100 lines, since all the heavy
    >> lifting is in pydoc and inspect).  Then we can add this to
    >> site.py: if os.environ.has_key("GATEWAY_INTERFACE"): import
    >> sys, cgitb sys.excepthook = cgitb.excepthook

    GvR> Why not add this to cgi.py instead?  Th site.py
    GvR> initialization is accumulating a lot of cruft, and I don't
    GvR> like new additions that are irrelevant for most apps (CGI is
    GvR> a tiny niche for Python IMO).  (I also think all the stuff
    GvR> that's only for interactive mode should be moved off to
    GvR> another module that is only run in interactive mode.)

I'm at best +0 on adding it to site.py too.  E.g. for performance
reasons Mailman's cgi wrappers invoke Python with -S to avoid the
expensive overhead of importing site.py for each cgi hit.

-Barry


From barry@zope.com  Tue Jul 31 12:01:03 2001
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 31 Jul 2001 07:01:03 -0400
Subject: [Python-Dev] pep-discuss
References: <3B62EB05.396DF4D7@ActiveState.com>
Message-ID: <15206.36847.621663.568615@anthem.wooz.org>

>>>>> "PP" == Paul Prescod <paulp@ActiveState.com> writes:

    PP> We've talked about having a mailing list for general
    PP> PEP-related discussions. Two things make me think that
    PP> revisiting this would be a good idea right now.

    PP> First, the recent loosening up of the python-dev rules
    PP> threatens the quality of discussion about bread and butter
    PP> issues such as patch discussions and process issues.

I'm not worrying about that until it becomes a problem. :)

    PP> Second, the flamewar on python-list basically drowned out the
    PP> usual newbie questions and would give a person coming new to
    PP> Python a very negative opinion about the language's future and
    PP> the friendliness of the community. I would rather redirect as
    PP> much as possible of that to a list that only interested
    PP> participants would have to endure.

For me too, it'd be just another list to subscribe to and follow, so
I'm generally against a separate pep list too.

One thing I'll note: in Mailman 2.1 we will be able to define "topics"
and you will be able to filter on specific topics.  E.g. if we defined
a pep topic, you could filter out all pep messages, receive only pep
messages, or do mail client filtering on the X-Topics: header.  (This
only works for regular delivery, not digest delivery.)

just-dont-ask-when-MM2.1-will-be-ready-ly y'rs,
-Barry


From guido@zope.com  Tue Jul 31 12:31:21 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 07:31:21 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Your message of "Tue, 31 Jul 2001 01:47:03 PDT."
 <3B667087.EBBE8938@ActiveState.com>
References: <Pine.NXT.4.21.0107301930450.258-100000@localhost.virginia.edu> <007701c11954$6b0017c0$8a73fea9@newmexico> <3B663140.41CB9DD7@ActiveState.com> <200107310629.CAA11818@cj20424-a.reston1.va.home.com>
 <3B667087.EBBE8938@ActiveState.com>
Message-ID: <200107311131.HAA15851@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> > 
> > > Also, the .NET CLR is standardized at ECMA so we could (at least in
> > > theory!) go to the meetings and try to influence version 2.
> > 
> > Notice the addition "in theory".  In practice, this is BS.
> 
> It depends on the rules and politics of each particular standards group.
> It is fundamentally a social activity. It also depends how much effort
> you are willing to put into promoting your cause. Sam Ruby is chair of
> the ECMA CLI group. He is a big scripting language fan. 
> 
> http://www2.hursley.ibm.com/tc39/
> 
> Also note the presence of Mike Cowlishaw of REXX fame and Dave Raggett
> of the W3C.
> 
> Working within a standards body is a gamble. It can pay off big or it
> can completely fail. We might find Microsoft our strongest ally -- they
> have always been interested in having the scripting languages work well
> on their platforms. They would hate to give programmers an
> excuse to stick to Unix or the JVM.

So it boils down to us vs. MS.  Guess who wins whenever there's a
disagreement.  I still maintain that it's a waste of our time.

> I don't personally know enough about this particular circumstance to
> know whether there is any possibility of significantly influencing
> version 2 or not. Maybe the gamble isn't worth the effort. But I
> wouldn't dismiss it out of hand.

Well, your boss has a pact with MS, so AS might pull it off. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Tue Jul 31 12:52:58 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 31 Jul 2001 07:52:58 -0400
Subject: [Python-Dev] cgitb.py for Python 2.2
In-Reply-To: <15206.8227.652539.471067@anthem.wooz.org>; from barry@zope.com on Mon, Jul 30, 2001 at 11:04:03PM -0400
References: <Pine.LNX.4.32.0107301905210.4535-100000@ziggy.localdomain.fake> <15206.8227.652539.471067@anthem.wooz.org>
Message-ID: <20010731075258.A2757@ute.cnri.reston.va.us>

On Mon, Jul 30, 2001 at 11:04:03PM -0400, Barry A. Warsaw wrote:
>I'll take a closer look at cgitb.py when I get a chance, but I'm
>generally +1 on the idea.

+0 from me, though I also think it would be better in cgi.py and not
in site.py.  It would also be useful if it could mail tracebacks and
return a non-committal but secure error message to the browser; I'll
contribute that as a patch if cgitb.py goes in.  (Or should that be
cgi/tb.py?  Hmm...)

--amk


From akuchlin@mems-exchange.org  Tue Jul 31 13:01:28 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 31 Jul 2001 08:01:28 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15206.8511.147000.832644@gargle.gargle.HOWL>; from gnat@oreilly.com on Mon, Jul 30, 2001 at 08:08:47PM -0700
References: <20010730051831.B1122@thyrsus.com> <20010731012432.G20676@xs4all.nl> <20010730205657.A2298@ute.cnri.reston.va.us> <15206.8511.147000.832644@gargle.gargle.HOWL>
Message-ID: <20010731080128.B2757@ute.cnri.reston.va.us>

On Mon, Jul 30, 2001 at 08:08:47PM -0700, Nathan Torkington wrote:
>Andrew Kuchling writes:
>The core loop would just be the usual opcode dispatch loop ("call the
>function for the current operation, which returns the next
>operation").  The only difference is that some of the opcodes would be
>specific to RE matches.  (I'm unclear on how much special logic RE

The big difference I see between regex opcodes and language opcodes is
that regexes need to backtrack and language ones don't.  Unless the
idea is to compile a regex to actual VM code similar to that generated
by Python/Perl code, but then wouldn't that sacrifice efficiency?
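
To make the backtracking point concrete, here is a toy matcher that
supports only literals, "." and "*".  It is a sketch for illustration
and has nothing to do with how Python's or Perl's real engines are
built:

```python
def match_here(pattern, text):
    """Return True if pattern matches at the start of text."""
    if not pattern:
        return True
    if len(pattern) >= 2 and pattern[1] == "*":
        return match_star(pattern[0], pattern[2:], text)
    if text and pattern[0] in (".", text[0]):
        return match_here(pattern[1:], text[1:])
    return False


def match_star(char, rest, text):
    """Match char* followed by rest, greedily, giving characters back."""
    count = 0
    while count < len(text) and char in (".", text[count]):
        count += 1
    # Backtrack: retry the rest of the pattern with one repetition fewer.
    while count >= 0:
        if match_here(rest, text[count:]):
            return True
        count -= 1
    return False
```

The second loop in match_star is the part an ordinary bytecode
interpreter has no equivalent for: it must return to an earlier choice
point and try again with different state.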

--amk


From paulp@ActiveState.com  Tue Jul 31 13:39:53 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 31 Jul 2001 05:39:53 -0700
Subject: [Python-Dev] Frank Willison
Message-ID: <3B66A719.4252CAAC@ActiveState.com>

The Python world has lost a great friend in Frank Willison. Frank died
yesterday of a massive heart attack.

I've searched in vain for a biography of Frank for those that didn't
know him but perhaps he was too modest to put his biography on the Web.
Suffice to say that before there were 30 or 10 or 5 Python books, before
acquisitions editors started cold-calling Python programmers, Frank had
a sense that this little language could become something.

In Frank's words:

"This is my third Python Conference. At the first one, a loyal 70 or so
Python loyalists debated potential new features of the language. At the
second, 120 or so Python programmers split their time between a review
of language features and the discussion of interesting Python
applications. 

At this conference, the third, we moved onto a completely different
level. Presentations and demonstrations at this conference of nearly 250
attendees have covered applications built on Python. Companies are
demonstrating their Python-based products. There is venture capital
here. There are people here because they want to learn about Python.
This year, mark my words: Python is here to stay."

	http://www.oreilly.com/frank/pythonconf_0100.html

The O'Reilly books that Frank edited helped to give Python the
legitimacy it needed to get over the hump. I carefully put in the word
"helped" because Frank requires honesty and modesty:

"O'Reilly doesn't legitimize. If we did, lots of technology creators who
enjoy their status as bastards would shun us. We try to find the
technologies that are interesting and powerful, that solve the problems
people really have. Then we take pleasure in publishing an interesting
book on that subject. 

I'd like to put another issue to rest: the Camel book did not legitimize
Perl. It may have accelerated Perl's adoption by making information
about Perl more readily available. But the truth is that Perl would have
succeeded without an O'Reilly book (as would Python and Zope), and that
we're very pleased to have been smart enough to recognize Perl's
potential before other publishers did."

	http://www.oreilly.com/frank/legitimacy_1199.html

Frank was also a Perl guy. He was big enough for both worlds. To me he
was a Perl guy but *the* Python guy. Frank was the guy who got Python
books into print. He and his protege Laura Llewin were constantly on the
lookout for opportunities to write about Python.

Much more important than anything he did with or for Python: Frank was a
really great guy with an excellent sense of humor and a way of
connecting with people. I know all of that after only meeting him two or
three times because it was just so obvious what kind of person he was
that it didn't take you any time to figure it out.

You can find more of Frank's writings here:

	http://www.oreilly.com/frank/

 Paul Prescod


From mal@lemburg.com  Tue Jul 31 14:28:39 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 15:28:39 +0200
Subject: [Python-Dev] PyOS_snprintf() / PyOS_vsnprintf()
Message-ID: <3B66B287.5D319774@lemburg.com>

Just to let you know and to initiate some cross-platform
testing:

While working on the warning patch for modsupport.c,
I've added two new APIs which hopefully make it easier for Python
to switch to buffer overflow safe [v]snprintf() APIs for error
reporting et al. 

The two new APIs are PyOS_snprintf() and 
PyOS_vsnprintf() and work just like the standard ones in many
C libs. On platforms which have snprintf(), the native APIs are used;
on all others an emulation of snprintf() tries to do its best.

Please try them out on your platform. If all goes well, I think
we should replace all sprintf() (without the n in the name)
with these new safer APIs.

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From skip@pobox.com (Skip Montanaro)  Tue Jul 31 15:07:26 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 31 Jul 2001 09:07:26 -0500
Subject: [Python-Dev] zipfiles on sys.path
In-Reply-To: <3B65AA15.27947.E9B214D@localhost>
References: <20010725215830.2F49D14A25D@oratrix.oratrix.nl>
 <3B65AA15.27947.E9B214D@localhost>
Message-ID: <15206.48030.99097.902155@beluga.mojam.com>

    Gordon> ... but it's my observation that package authors are enamored of
    Gordon> import hacks, so be wary.

One for amk's quotes file? ;-)

Skip


From skip@pobox.com (Skip Montanaro)  Tue Jul 31 15:51:22 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 31 Jul 2001 09:51:22 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731013448.H20676@xs4all.nl>
References: <200107302251.KAA00585@s454.cosc.canterbury.ac.nz>
 <20010731013448.H20676@xs4all.nl>
Message-ID: <15206.50666.998086.720321@beluga.mojam.com>

    Skip> The main stumbling block was that pesky "from module import *"
    Skip> statement.  It could push an unknown quantity of stuff onto the
    Skip> stack

    Greg> Are you *sure* about that? I'm pretty certain it can't be true,
    Greg> since the compiler has to know at all times how much is on the
    Greg> stack, so it can decide how much stack space is needed.

    Thomas> I think Skip meant it does an arbitrary number of 

    Thomas> load-onto-stack
    Thomas> store-into-namespace

    Thomas> operations. Skip, you'll be glad to know that's no longer true
    Thomas> :) Since 2.0 (or when was it that we introduced 'import as' ?)
    Thomas> import-* is not a special case of 'IMPORT_FROM', but rather a
    Thomas> separate opcode that doesn't touch the stack.

I'm not sure what I meant any more.  (They say eyewitness testimony in a
courtroom is quite unreliable.)  I'm pretty sure Greg's analysis is at least
partly correct (in that that couldn't have been why I failed to implement a
converter for IMPORT_FROM).  I went back and looked briefly at my old code
last night (which was broken when I put it aside - don't *ever* do that!)
and could find nothing that would indicate why I didn't like
"from-import-*".  The instruction set converter would refuse to try
converting any code that contained these opcodes: {LOAD,STORE,DELETE}_NAME,
SETUP_{FINALLY,EXCEPT}, or IMPORT_FROM.  At this point in time I'm not sure
which of those six opcodes were just ones I hadn't gotten around to writing
converters for and which were showstoppers.
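
The refusal check described above can be sketched roughly as follows.  This
is only an illustration against the dis module of present-day CPython, not
the 2001 converter: the names REFUSED and blocking_opcodes are hypothetical,
and opcode names vary between CPython versions (SETUP_FINALLY and
SETUP_EXCEPT no longer exist in recent releases).

```python
import dis

# Opcodes the converter refused, as listed in the message above.
REFUSED = {
    "LOAD_NAME", "STORE_NAME", "DELETE_NAME",
    "SETUP_FINALLY", "SETUP_EXCEPT", "IMPORT_FROM",
}

def blocking_opcodes(source):
    """Return the refused opcode names found in the compiled module code.

    Only the top-level code object is scanned; nested function bodies
    are not visited in this sketch.
    """
    code = compile(source, "<example>", "exec")
    return {ins.opname for ins in dis.get_instructions(code)} & REFUSED
```

Calling blocking_opcodes("from os import path") should report IMPORT_FROM
along with STORE_NAME, since a module-level import binds a global name.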

wish-i-had-more-time-for-this-ly y'rs,

Skip


From skip@pobox.com (Skip Montanaro)  Tue Jul 31 16:02:25 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 31 Jul 2001 10:02:25 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
References: <20010730051831.B1122@thyrsus.com>
 <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
Message-ID: <15206.51329.561652.565480@beluga.mojam.com>

I was thinking a little about a Python/Perl VM merge.  One problem I imagine
would be difficult to reconcile is the subtle difference in semantics of
various basic types.  Consider the various bits of Python's (proposed)
number system that Perl might not have (or want): rationals, automatic
promotion from machine ints to longs, complex numbers.  These may not work
well with Perl's semantics.  What about exceptions?  Do Python and Perl have
similar notions of what exceptional conditions exist?

Skip


From pedroni@inf.ethz.ch  Tue Jul 31 16:06:57 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Tue, 31 Jul 2001 17:06:57 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <Pine.NXT.4.21.0107301930450.258-100000@localhost.virginia.edu> <007701c11954$6b0017c0$8a73fea9@newmexico> <3B663140.41CB9DD7@ActiveState.com>
Message-ID: <003801c119d2$724781c0$8a73fea9@newmexico>

Thanks for the answer.

> Samuele Pedroni wrote:
> >
> >...
> > A question: are there already some data about
> > what would be the actual performance of Python.NET vs. CPython ?
>
> I think it is safe to say that the current version of Python.NET is
> slower than Jython. Now it hasn't been optimized as much as Jython so we
> might be able to get it as fast as Jython.
This may surprise you, but Jython is not that heavily optimized; it's mostly
a straightforward OO design. But I think that's the only way to avoid
specializing for some particular development state of the JVMs.
For example, we have changed nothing, but it seems (it seems) that under
Java 1.4, asymptotically (meaning you need a long-running process to exploit
the HotSpot technology), Jython is a bit faster than CPython, at least for
non-I/O-intensive stuff. It seems they optimized reflection.

> But I don't think that there
> is anything in the .NET runtime that makes it a great deal better than
> the JVM for dynamic languages.
I have the same impression, unless one can do something really clever
with boxing/unboxing without losing too many cycles or getting in the
way of the compiler.

> The only difference is that Microsoft
> seems more aware of the problem and may move to correct it whereas I
> have a feeling that explicit support for our languages would dilute
> Sun's 100% Java marketing campaign.
But will Sun be such a passive actor, even if MS gains a market advantage
by specifically supporting scripting languages?

There is much hype in both camps, but Unix/C seems to show that you need a
good system language and the possibility of writing scripting languages on
top of it to have a good platform.

> Also, the .NET CLR is standardized
> at ECMA so we could (at least in theory!) go to the meetings and try to
> influence version 2.
I imagine you can go the same way by joining the JCP. The ASF is in, for example.

Samuele Pedroni.



From mclay@nist.gov  Tue Jul 31 04:11:52 2001
From: mclay@nist.gov (Michael McLay)
Date: Mon, 30 Jul 2001 23:11:52 -0400
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <3B666C9C.4400BD9C@lemburg.com>
References: <0107301106520A.02216@fermi.eeel.nist.gov> <3B666C9C.4400BD9C@lemburg.com>
Message-ID: <01073023115207.02466@fermi.eeel.nist.gov>

On Tuesday 31 July 2001 04:30 am, M.-A. Lemburg wrote:
> How will you be able to define the precision of decimals ? Implicit
> by providing a decimal string with enough 0s to let the parser
> deduce the precision ? Explicit like so: decimal(12, 5) ?

Would the following work?  For literal type definitions the precision would 
be implicit.  For values set using the decimal() function the definition 
would be implicit unless an explicit precision definition is set.  The 
following would all define the same value and precision.

   3.40d
   decimal("3.40")
   decimal(3.4, 2)

Those were easy.  How would the following be interpreted?

   decimal(3.404, 2)
   decimal(3.405, 2)
   decimal(3.39999, 2)
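
One way to see why these three cases are harder is to try them with the
decimal module that ships with present-day Python.  This is not the
decimal() function proposed in this thread: the helper to_places is
hypothetical, the values are passed as strings to avoid binary float input,
and the two rounding modes are only shown for comparison.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def to_places(text, places, rounding):
    """Round the decimal written in `text` to `places` fractional digits."""
    quantum = Decimal(1).scaleb(-places)   # places=2 gives Decimal("0.01")
    return Decimal(text).quantize(quantum, rounding=rounding)

for text in ("3.404", "3.405", "3.39999"):
    print(text,
          to_places(text, 2, ROUND_HALF_EVEN),
          to_places(text, 2, ROUND_HALF_UP))
```

Only the middle value depends on the rounding rule: 3.405 sits exactly
halfway, so it becomes 3.40 under round-half-even and 3.41 under
round-half-up.  Any proposal would have to pick one.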

> Also, what happens to the precision of the decimal object resulting
> from numeric operations ?

Good question.  I'm not the right person to answer this, but here is a
first stab at what I would expect.

For addition, subtraction, and multiplication the results would be exact with
no rounding of the results.  For calculations that include division, the number
of digits in a non-terminating result will have to be explicitly set.  Would it
make sense for this to be defined by the numbers used in the calculation?
Could this be set in the module or could it be global for the application?

What do you suggest?

>
> >     A decimal number can be used to represent integers and floating point
> >     numbers and decimal numbers can also be displayed using scientific
> >     notation. Examples of decimal numbers include:
> >
> >     ...
> >
> >     This proposal will also add an optional  'b' suffix to the
> >     representation of binary float type literals and binary int type
> >     literals.
>
> Hmm, I don't quite grasp the need for the 'b'... numbers without
> any modifier will work the same way as they do now, right ?

I made a change to the parsenumber() function in compile.c so that the type 
of the number is determined by the suffix attached to the number.  To retain 
backward compatibility the tokenizer automatically attaches the 'b' suffix to 
float and int types if they do not have a suffix in the literal definition.

My original PEP included the definition of a .dp and a dpython mode for the 
interpreter in which the default number type is decimal instead of binary.  
When the mode is switched, the language becomes easier to use for developing
applications that use decimal numbers.

> >     Expressions that mix binary floats with decimals introduce the
> >     possibility of unexpected results because the two number types use
> >     different internal representations for the same numerical value.
>
> I'd rather have this explicit in the sense that you define which
> assumptions will be made and what issues arise (rounding, truncation,
> loss of precision, etc.).

Can you give an example of how this might be implemented?

> >     To accommodate the three possible usage models the python interpreter
> >     command line options will be used to set the level for warning and
> >     error messages. The three levels are:
> >
> >     promiscuous mode,   -f or  --promiscuous
> >     safe mode           -s or --safe
> >     pedantic mode       -p or --pedantic
>
> How about a generic option:
>
> 	--numerics:[loose|safe|pedantic] or -n:[l|s|p]

Thanks for the suggestion. I'll change it.



From aahz@rahul.net  Tue Jul 31 17:37:02 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Tue, 31 Jul 2001 09:37:02 -0700 (PDT)
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <01073023115207.02466@fermi.eeel.nist.gov> from "Michael McLay" at Jul 30, 2001 11:11:52 PM
Message-ID: <20010731163703.2F86E99C85@waltz.rahul.net>

Michael McLay wrote:
> 
> Those were easy.  How would the following be interpreted?
> 
>    decimal(3.404, 2)
>    decimal(3.405, 2)
>    decimal(3.39999, 2)
> 
>  [...]
> 
> For addition, subtraction, and multiplication the results would be
> exact with no rounding of the results.  For calculations that include
> division, the number of digits in a non-terminating result will have to
> be explicitly set.  Would it make sense for this to be defined by the
> numbers used in the calculation?  Could this be set in the module or
> could it be global for the application?

This is why Cowlishaw et al require a full context for all operations.
At one point I tried implementing things with the context being
contained in the number rather than "global" (which actually means
thread-global, but I'm probably punting on *that* bit for the moment),
but Tim Peters persuaded me that sticking with the spec was the Right
Thing until *after* the spec was fully implemented.
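
As a rough illustration of what a context buys you, here is a sketch using
the decimal module in present-day Python, which follows Cowlishaw's
specification.  It is not the implementation discussed in this message; the
helper name divide is hypothetical and the operand values are arbitrary.

```python
from decimal import Decimal, localcontext

# Addition and multiplication of short operands are exact under the
# default context (28 significant digits).
total = Decimal("1.10") + Decimal("2.205")     # Decimal("3.305")
product = Decimal("1.5") * Decimal("2.25")     # Decimal("3.375")

def divide(a, b, digits):
    """Divide under a temporary context limited to `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        return Decimal(a) / Decimal(b)

third = divide("1", "3", 5)                    # Decimal("0.33333")
```

The precision of the non-terminating quotient comes from the context rather
than from the operands, which is the design the spec requires.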

After seeing the mess generated by PEP-238, I'm fervently in favor of
sticking with external specs whenever possible.
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From mal@lemburg.com  Tue Jul 31 17:36:28 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 18:36:28 +0200
Subject: [Python-Dev] Revised decimal type PEP
References: <0107301106520A.02216@fermi.eeel.nist.gov> <3B666C9C.4400BD9C@lemburg.com> <01073023115207.02466@fermi.eeel.nist.gov>
Message-ID: <3B66DE8C.C9C62012@lemburg.com>

Michael McLay wrote:
> 
> On Tuesday 31 July 2001 04:30 am, M.-A. Lemburg wrote:
> > How will you be able to define the precision of decimals ? Implicit
> > by providing a decimal string with enough 0s to let the parser
> > deduce the precision ? Explicit like so: decimal(12, 5) ?
> 
> Would the following work?  For literal type definitions the precision would
> be implicit.  For values set using the decimal() function the definition
> would be implicit unless an explicit precision definition is set.  The
> following would all define the same value and precision.
> 
>    3.40d
>    decimal("3.40")
>    decimal(3.4, 2)
> 
> Those were easy.  How would the following be interpreted?
> 
>    decimal(3.404, 2)
>    decimal(3.405, 2)
>    decimal(3.39999, 2)

I'd suggest following the rules for the SQL definitions
of DECIMAL(,).
 
> > Also, what happens to the precision of the decimal object resulting
> > from numeric operations ?
> 
> Good question.  I'm not the right person to answer this, but here is a
> first stab at what I would expect.
> 
> For addition, subtraction, and multiplication the results would be exact with
> no rounding of the results.  For calculations that include division, the number
> of digits in a non-terminating result will have to be explicitly set.  Would it
> make sense for this to be defined by the numbers used in the calculation?
> Could this be set in the module or could it be global for the application?
> 
> What do you suggest?

Well, there are several options. I suppose that the IBM paper
on decimal types has good hints as to what the type should do.
Again, SQL is probably a good source for inspiration too, since
it deals with decimals a lot.
 
> >
> > >     A decimal number can be used to represent integers and floating point
> > >     numbers and decimal numbers can also be displayed using scientific
> > >     notation. Examples of decimal numbers include:
> > >
> > >     ...
> > >
> > >     This proposal will also add an optional  'b' suffix to the
> > >     representation of binary float type literals and binary int type
> > >     literals.
> >
> > Hmm, I don't quite grasp the need for the 'b'... numbers without
> > any modifier will work the same way as they do now, right ?
> 
> I made a change to the parsenumber() function in compile.c so that the type
> of the number is determined by the suffix attached to the number.  To retain
> backward compatibility the tokenizer automatically attaches the 'b' suffix to
> float and int types if they do not have a suffix in the literal definition.
> 
> My original PEP included the definition of a .dp and a dpython mode for the
> interpreter in which the default number type is decimal instead of binary.
> When the mode is switched, the language becomes easier to use for developing
> applications that use decimal numbers.

I see, the small 'b' still looks funny to me though. Wouldn't
1.23f and 25i be more intuitive ?

> > >     Expressions that mix binary floats with decimals introduce the
> > >     possibility of unexpected results because the two number types use
> > >     different internal representations for the same numerical value.
> >
> > I'd rather have this explicit in the sense that you define which
> > assumptions will be made and what issues arise (rounding, truncation,
> > loss of precision, etc.).
> 
> Can you give an example of how this might be implemented?

You would typically first coerce the types to the "larger"
type, e.g. float + decimal -> float + float -> float, so
you'd only have to document how the conversion is done and
which accuracy to expect.
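
A small sketch of what that coercion costs, using the decimal module in
present-day Python as a stand-in for the proposed type.  The variable names
are illustrative only, and the refusal to mix types shown at the end is the
behaviour of today's module, not something decided in this thread.

```python
from decimal import Decimal

d = Decimal("0.1")
f = 0.2

# Explicit coercion to the binary type, as in float + decimal -> float.
as_float = float(d) + f            # carries binary rounding error

# Staying in decimal keeps the sum exact.
as_decimal = d + Decimal("0.2")

# Today's decimal module refuses to mix the two types implicitly.
try:
    d + f
    mixed_allowed = True
except TypeError:
    mixed_allowed = False
```

Here as_float is not equal to 0.3 while as_decimal equals Decimal("0.3"),
which is the kind of accuracy statement the documentation would need to make.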
 
> > >     To accommodate the three possible usage models the python interpreter
> > >     command line options will be used to set the level for warning and
> > >     error messages. The three levels are:
> > >
> > >     promiscuous mode,   -f or  --promiscuous
> > >     safe mode           -s or --safe
> > >     pedantic mode       -p or --pedantic
> >
> > How about a generic option:
> >
> >       --numerics:[loose|safe|pedantic] or -n:[l|s|p]
> 
> Thanks for the suggestion. I'll change it.

Great.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From guido@zope.com  Tue Jul 31 17:56:51 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 12:56:51 -0400
Subject: [Python-Dev] PyOS_snprintf() / PyOS_vsnprintf()
In-Reply-To: Your message of "Tue, 31 Jul 2001 15:28:39 +0200."
 <3B66B287.5D319774@lemburg.com>
References: <3B66B287.5D319774@lemburg.com>
Message-ID: <200107311656.MAA16366@cj20424-a.reston1.va.home.com>

> While working on the warning patch for modsupport.c,
> I've added two new APIs which hopefully make it easier for Python
> to switch to buffer overflow safe [v]snprintf() APIs for error
> reporting et al. 
> 
> The two new APIs are PyOS_snprintf() and 
> PyOS_vsnprintf() and work just like the standard ones in many
> C libs. On platforms which have snprintf(), the native APIs are used;
> on all others an emulation of snprintf() tries to do its best.
> 
> Please try them out on your platform. If all goes well, I think
> we should replace all sprintf() (without the n in the name)
> with these new safer APIs.

It would be easier to test out the fallback implementation if there
was a config option to enable it even on platforms that do have the
native version.

Or maybe (following the getopt example) we might consider always using
our own code -- so it gets the maximum testing.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@zope.com  Tue Jul 31 18:08:47 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 13:08:47 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Your message of "Tue, 31 Jul 2001 10:02:25 CDT."
 <15206.51329.561652.565480@beluga.mojam.com>
References: <20010730051831.B1122@thyrsus.com> <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
 <15206.51329.561652.565480@beluga.mojam.com>
Message-ID: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>

> I was thinking a little about a Python/Perl VM merge.  One problem I imagine
> would be difficult to reconcile is the subtle difference in semantics of
> various basic types.  Consider the various bits of Python's (proposed)
> number system that Perl might not have (or want): rationals, automatic
> promotion from machine ints to longs, complex numbers.  These may not work
> well with Perl's semantics.  What about exceptions?  Do Python and Perl have
> similar notions of what exceptional conditions exist?

Actually, this may not be as big a deal as I thought before.  The PVM
doesn't have a lot of knowledge about types built into its instruction
set.  It knows a bit about classes, lists, dicts, but not e.g. about
ints and strings.  The opcodes are mostly very abstract: BINARY_ADD etc.
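
To make the point concrete, here is a small sketch using the dis module of
present-day CPython.  Note that the opcode is named BINARY_OP in recent
releases and BINARY_ADD in older ones; the function add is just an example.

```python
import dis

def add(a, b):
    return a + b

# Names of the opcodes in the compiled body of add().
opnames = [ins.opname for ins in dis.get_instructions(add)]

# One bytecode sequence, three operand types.
results = [add(1, 2), add("a", "b"), add([1], [2])]
```

The same instruction sequence serves ints, strings and lists; the type
knowledge lives in the objects the interpreter dispatches to, not in the
opcode itself.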

--Guido van Rossum (home page: http://www.python.org/~guido/)



From DavidA@ActiveState.com  Tue Jul 31 18:21:59 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 31 Jul 2001 10:21:59 -0700
Subject: [Python-Dev] Frank Willison
Message-ID: <3B66E937.D2390F90@ActiveState.com>

As Paul mentioned on python-list, Frank Willison died of a heart attack
yesterday.  I'm sad.

--david


From mal@lemburg.com  Tue Jul 31 18:22:52 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 31 Jul 2001 19:22:52 +0200
Subject: [Python-Dev] PyOS_snprintf() / PyOS_vsnprintf()
References: <3B66B287.5D319774@lemburg.com> <200107311656.MAA16366@cj20424-a.reston1.va.home.com>
Message-ID: <3B66E96C.FBAB8A62@lemburg.com>

Guido van Rossum wrote:
> 
> > While working on the warning patch for modsupport.c,
> > I've added two new APIs which hopefully make it easier for Python
> > to switch to buffer overflow safe [v]snprintf() APIs for error
> > reporting et al.
> >
> > The two new APIs are PyOS_snprintf() and
> > PyOS_vsnprintf() and work just like the standard ones in many
> > C libs. On platforms which have snprintf(), the native APIs are used;
> > on all others an emulation of snprintf() tries to do its best.
> >
> > Please try them out on your platform. If all goes well, I think
> > we should replace all sprintf() (without the n in the name)
> > with these new safer APIs.
> 
> It would be easier to test out the fallback implementation if there
> was a config option to enable it even on platforms that do have the
> native version.
>
> Or maybe (following the getopt example) we might consider always using
> our own code -- so it gets the maximum testing.

How about always enabling our version in the alpha cycle and then
reverting back to the native one in the betas ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From DavidA@ActiveState.com  Tue Jul 31 18:40:16 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 31 Jul 2001 10:40:16 -0700
Subject: [Python-Dev] pep-discuss
References: <3B62EB05.396DF4D7@ActiveState.com> <15206.36847.621663.568615@anthem.wooz.org>
Message-ID: <3B66ED80.61B7E4C6@ActiveState.com>

"Barry A. Warsaw" wrote:

>     PP> Second, the flamewar on python-list basically drowned out the
>     PP> usual newbie questions and would give a person coming new to
>     PP> Python a very negative opinion about the language's future and
>     PP> the friendliness of the community. I would rather redirect as
>     PP> much as possible of that to a list that only interested
>     PP> participants would have to endure.
> 
> For me too, it'd be just another list to subscribe to and follow, so
> I'm generally against a separate pep list too.
> 
> One thing I'll note: in Mailman 2.1 we will be able to define "topics"
> and you will be able to filter on specific topics.  E.g. if we defined
> a pep topic, you could filter out all pep messages, receive only pep
> messages, or do mail client filtering on the X-Topics: header.  (This
> only works for regular delivery, not digest delivery.)
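
For readers wondering what client-side filtering on that header might look
like, here is a minimal sketch with Python's standard email package.  The
header name X-Topics comes from the message above; the assumption that it
holds a comma-separated list of topic names, the helper has_topic, and the
SAMPLE message are all hypothetical.

```python
from email import message_from_string

def has_topic(raw_message, topic):
    """Return True if the message's X-Topics header lists `topic`."""
    msg = message_from_string(raw_message)
    header = msg.get("X-Topics", "")
    # Assumed format: topic names separated by commas.
    topics = [t.strip().lower() for t in header.split(",")]
    return topic.lower() in topics

SAMPLE = "Subject: [Python-Dev] example\nX-Topics: pep\n\nbody text\n"
```

A mail client rule would do the equivalent of has_topic(SAMPLE, "pep") and
file or discard the message accordingly.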

But that doesn't really solve the problem for newbies who aren't going
to set up filters just for this Python list they just got onto.

IMO, having 100 or so people add a new list is cheaper than having 10's
of 1000's of people setting up filters.

But whatever. =)

--david


From skip@pobox.com (Skip Montanaro)  Tue Jul 31 18:48:32 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 31 Jul 2001 12:48:32 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
References: <20010730051831.B1122@thyrsus.com>
 <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
 <15206.51329.561652.565480@beluga.mojam.com>
 <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
Message-ID: <15206.61296.958360.72700@beluga.mojam.com>

    Guido> The PVM doesn't have a lot of knowledge about types built into
    Guido> its instruction set....  The opcodes are mostly very abstract:
    Guido> BINARY_ADD etc.

Yeah, but the runtime behind the virtual machine knows a hell of a lot about
the types.  A stream of opcodes doesn't mean anything without the semantics
of the functions the interpreter loop calls to do its work.  I thought the
aim of Eric's Parrot idea was that Perl and Python might be able to share a
virtual machine.  If both can generate something like today's BINARY_ADD
opcode, the underlying types of both Python and Perl better have the same
semantics.

Skip



From nascheme@mems-exchange.org  Tue Jul 31 18:54:57 2001
From: nascheme@mems-exchange.org (Neil Schemenauer)
Date: Tue, 31 Jul 2001 13:54:57 -0400
Subject: [Python-Dev] Good news about ExtensionClass and Python 2.2a1
Message-ID: <20010731135457.A15139@mems-exchange.org>

After a few tweaks to ExtensionClass and a few small fixes to some of
our introspection code I'm happy to say that Python 2.2a1 passes our
unit test suite.  This is significant since there are about 45000 lines
of code (counted by "wc -l") tested by 3569 test cases.  Since we use
ZODB, ExtensionClasses are quite widely used.  Merging descr_branch into
HEAD sounds like a good idea to me.  Well done Guido.

I'm going to spend a bit of time trying to rewrite the ZODB Persistent
class as a type.  Attached is a diff of the changes I made to
ExtensionClass.

  Neil

--- ExtensionClass.h.dist	Tue Jul 31 11:50:39 2001
+++ ExtensionClass.h	Tue Jul 31 12:15:21 2001
@@ -143,12 +143,48 @@
 	reprfunc tp_str;
 	getattrofunc tp_getattro;
 	setattrofunc tp_setattro;
-	/* Space for future expansion */
-	long tp_xxx3;
-	long tp_xxx4;
+
+	/* Functions to access object as input/output buffer */
+	PyBufferProcs *tp_as_buffer;
+
+	/* Flags to define presence of optional/expanded features */
+	long tp_flags;
 
 	char *tp_doc; /* Documentation string */
 
+	/* call function for all accessible objects */
+	traverseproc tp_traverse;
+	
+	/* delete references to contained objects */
+	inquiry tp_clear;
+
+	/* rich comparisons */
+	richcmpfunc tp_richcompare;
+
+	/* weak reference enabler */
+	long tp_weaklistoffset;
+
+	/* Iterators */
+	getiterfunc tp_iter;
+	iternextfunc tp_iternext;
+
+	/* Attribute descriptor and subclassing stuff */
+	struct PyMethodDef *tp_methods;
+	struct memberlist *tp_members;
+	struct getsetlist *tp_getset;
+	struct _typeobject *tp_base;
+	PyObject *tp_dict;
+	descrgetfunc tp_descr_get;
+	descrsetfunc tp_descr_set;
+	long tp_dictoffset;
+	initproc tp_init;
+	allocfunc tp_alloc;
+	newfunc tp_new;
+	destructor tp_free; /* Low-level free-memory routine */
+	PyObject *tp_bases;
+	PyObject *tp_mro; /* method resolution order */
+	PyObject *tp_defined;
+
 #ifdef COUNT_ALLOCS
 	/* these must be last */
 	int tp_alloc;
@@ -302,7 +338,9 @@
    { PyExtensionClassCAPI->Export(D,N,&T); }
 
 /* Convert a method list to a method chain.  */
-#define METHOD_CHAIN(DEF) { DEF, NULL }
+#define METHOD_CHAIN(DEF) \
+	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, \
+	{ DEF, NULL }
 
 /* The following macro checks whether a type is an extension class: */
 #define PyExtensionClass_Check(TYPE) \
@@ -336,7 +374,9 @@
 #define PURE_MIXIN_CLASS(NAME,DOC,METHODS) \
 static PyExtensionClass NAME ## Type = { PyObject_HEAD_INIT(NULL) \
 	0, # NAME, sizeof(PyPureMixinObject), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
-	0, 0, 0, 0, 0, 0, 0, DOC, {METHODS, NULL}, \
+	0, 0, 0, 0, 0, 0, 0, DOC, \
+	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, \
+	{METHODS, NULL}, \
         EXTENSIONCLASS_BASICNEW_FLAG}
 
 /* The following macros provide limited access to extension-class
--- ExtensionClass.c.dist	Tue Jul 31 11:01:20 2001
+++ ExtensionClass.c	Tue Jul 31 12:15:24 2001
@@ -119,7 +119,7 @@
 static PyObject *subclass_watcher=0;  /* Object that subclass events */
 
 static void
-init_py_names()
+init_py_names(void)
 {
 #define INIT_PY_NAME(N) py ## N = PyString_FromString(#N)
   INIT_PY_NAME(__add__);
@@ -1800,8 +1800,8 @@
 
   if (PyFunction_Check(r) || NeedsToBeBound(r))
     ASSIGN(r,newPMethod(self,r));
-  else if (PyMethod_Check(r) && ! PyMethod_Self(r))
-    ASSIGN(r,newPMethod(self, PyMethod_Function(r)));
+  else if (PyMethod_Check(r) && ! PyMethod_GET_SELF(r))
+    ASSIGN(r,newPMethod(self, PyMethod_GET_FUNCTION(r)));
 
   return r;
 }
@@ -3527,7 +3527,7 @@
 };
 
 void
-initExtensionClass()
+initExtensionClass(void)
 {
   PyObject *m, *d;
   char *rev="$Revision: 1.1 $";


From DavidA@ActiveState.com  Tue Jul 31 18:57:35 2001
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 31 Jul 2001 10:57:35 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <20010730051831.B1122@thyrsus.com>
 <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
 <15206.51329.561652.565480@beluga.mojam.com>
 <200107311708.NAA16497@cj20424-a.reston1.va.home.com> <15206.61296.958360.72700@beluga.mojam.com>
Message-ID: <3B66F18F.3EE81628@ActiveState.com>

Skip Montanaro wrote:
> 
>     Guido> The PVM doesn't have a lot of knowledge about types built into
>     Guido> its instruction set....  The opcodes are mostly very abstract:
>     Guido> BINARY_ADD etc.
> 
> Yeah, but the runtime behind the virtual machine knows a hell of a lot about
> the types.  A stream of opcodes doesn't mean anything without the semantics
> of the functions the interpreter loop calls to do its work.  I thought the
> aim of Eric's Parrot idea was that Perl and Python might be able to share a
> virtual machine.  If both can generate something like today's BINARY_ADD
> opcode, the underlying types of both Python and Perl better have the same
> semantics.

I don't think that needs to be true _in toto_.  In other words, some
opcodes can be used by both languages, some can be language-specific. 
The implementation of the VM for a given opcode can be shared per
language, or even just partially shared.  BINARY_ADD can do the same
thing in most languages for 'native' types, and defer to per-language
codepaths for objects, for example.

One problem with a hybrid approach might be that optimizations become
really hard to do if you can't assume much about the semantics, or if
you can only assume the union of the various semantics.  But the idea is
intriguing anyway =).
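
A toy sketch of that hybrid, assuming a two-opcode stack machine invented
purely for illustration.  None of these names (run, SEMANTICS, python_add,
perl_add) come from a real VM, and the Perl-style conversion is simplified
to handle only numeric-looking strings.

```python
def _num(x):
    # Simplified stand-in for Perl's string-to-number conversion.
    if isinstance(x, str):
        return float(x) if "." in x else int(x)
    return x

def python_add(a, b):
    # Python-style: defer to the operands' own + behaviour.
    return a + b

def perl_add(a, b):
    # Perl-style: + is numeric, so strings are converted first.
    return _num(a) + _num(b)

SEMANTICS = {"python": python_add, "perl": perl_add}

def run(program, language):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    add = SEMANTICS[language]
    stack = []
    for opcode, arg in program:
        if opcode == "LOAD_CONST":
            stack.append(arg)
        elif opcode == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            # Native ints share one path; everything else is per-language.
            if type(left) is int and type(right) is int:
                stack.append(left + right)
            else:
                stack.append(add(left, right))
        else:
            raise ValueError("unknown opcode: %r" % (opcode,))
    return stack.pop()

PROGRAM = [("LOAD_CONST", "1"), ("LOAD_CONST", "2"), ("BINARY_ADD", None)]
```

Running PROGRAM gives "12" with Python semantics and 3 with the Perl-style
ones, while two native ints take the shared path in either language.  The
optimization worry is visible even here: the VM cannot assume what
BINARY_ADD means without knowing which language produced the code.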

--david


From guido@zope.com  Tue Jul 31 20:00:01 2001
From: guido@zope.com (Guido van Rossum)
Date: Tue, 31 Jul 2001 15:00:01 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Your message of "Tue, 31 Jul 2001 12:48:32 CDT."
 <15206.61296.958360.72700@beluga.mojam.com>
References: <20010730051831.B1122@thyrsus.com> <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu> <15206.51329.561652.565480@beluga.mojam.com> <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
 <15206.61296.958360.72700@beluga.mojam.com>
Message-ID: <200107311900.PAA17062@cj20424-a.reston1.va.home.com>

>     Guido> The PVM doesn't have a lot of knowledge about types built into
>     Guido> its instruction set....  The opcodes are mostly very abstract:
>     Guido> BINARY_ADD etc.
> 
> Yeah, but the runtime behind the virtual machine knows a hell of a lot about
> the types.  A stream of opcodes doesn't mean anything without the semantics
> of the functions the interpreter loop calls to do its work.  I thought the
> aim of Eric's Parrot idea was that Perl and Python might be able to share a
> virtual machine.  If both can generate something like today's BINARY_ADD
> opcode, the underlying types of both Python and Perl better have the same
> semantics.

Yeah, but the runtime could offer a choice of data types -- for Python
code the constants table would contain Python ints and strings etc., for
Perl code it would contain Perl string-number objects.  Maybe.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Tue Jul 31 20:11:13 2001
From: mwh@python.net (Michael Hudson)
Date: 31 Jul 2001 15:11:13 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: Guido van Rossum's message of "Tue, 31 Jul 2001 15:00:01 -0400"
References: <20010730051831.B1122@thyrsus.com> <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu> <15206.51329.561652.565480@beluga.mojam.com> <200107311708.NAA16497@cj20424-a.reston1.va.home.com> <15206.61296.958360.72700@beluga.mojam.com> <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
Message-ID: <2mr8uwsylq.fsf@starship.python.net>

Guido van Rossum <guido@zope.com> writes:

> >     Guido> The PVM doesn't have a lot of knowledge about types built into
> >     Guido> its instruction set....  The opcodes are mostly very abstract:
> >     Guido> BINARY_ADD etc.
> > 
> > Yeah, but the runtime behind the virtual machine knows a hell of a lot about
> > the types.  A stream of opcodes doesn't mean anything without the semantics
> > of the functions the interpreter loop calls to do its work.  I thought the
> > aim of Eric's Parrot idea was that Perl and Python might be able to share a
> > virtual machine.  If both can generate something like today's BINARY_ADD
> > opcode, the underlying types of both Python and Perl better have the same
> > semantics.
> 
> Yeah, but the runtime could offer a choice of data types -- for Python
> code the constants table would contain Python ints and strings etc., for
> Perl code it would contain Perl string-number objects.  Maybe.

And the point of this would be?  I don't see much more benefit than
just arranging for the numbers in Include/opcode.h to match perl's
equivalents (i.e. none), but I may be missing something...

Cheers,
M.

-- 
  I've even been known to get Marmite *near* my mouth -- but never
  actually in it yet.  Vegamite is right out.
 UnicodeError: ASCII unpalatable error: vegamite found, ham expected
                                       -- Tim Peters, comp.lang.python


From skip@pobox.com (Skip Montanaro)  Tue Jul 31 20:22:25 2001
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 31 Jul 2001 14:22:25 -0500
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
References: <20010730051831.B1122@thyrsus.com>
 <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
 <15206.51329.561652.565480@beluga.mojam.com>
 <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
 <15206.61296.958360.72700@beluga.mojam.com>
 <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
Message-ID: <15207.1393.232974.785433@beluga.mojam.com>

    Guido> Yeah, but the runtime could offer a choice of data types -- for
    Guido> Python code the constants table would contain Python ints and
    Guido> strings etc., for Perl code it would contain Perl string-number
    Guido> objects.  Maybe.

So I could give a code object generated by the Python compiler to the Perl
runtime and get different results than if it was executed by the Python
environment?

Perhaps it's time for Eric to chime in again and tell us what he really has
in mind.  I can't see the utility in having the same set of opcodes for the
two languages if the semantics of running them under either environment
aren't going to be the same.  It seems like it would artificially constrain
people working on the internals of both languages.

Skip



From gnat@oreilly.com  Tue Jul 31 20:31:01 2001
From: gnat@oreilly.com (Nathan Torkington)
Date: Tue, 31 Jul 2001 13:31:01 -0600
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
References: <20010730051831.B1122@thyrsus.com>
 <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
 <15206.51329.561652.565480@beluga.mojam.com>
 <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
 <15206.61296.958360.72700@beluga.mojam.com>
 <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
Message-ID: <15207.1909.395000.123189@gargle.gargle.HOWL>

Guido van Rossum writes:
> Yeah, but the runtime could offer a choice of data types -- for Python
> code the constants table would contain Python ints and strings etc., for
> Perl code it would contain Perl string-number objects.  Maybe.

A perl6 value has a vtable, essentially an array of function pointers
which comprises the standard operations on that value.  I talked to
Dan (the perl6 internals guy, dan@sidhe.org) about an impedance
mismatch between Perl and Python data types, and he pointed out that
you can have Perl values and Python values, each with their own
semantics, simply by having separate vtables (and thus separate
functions to implement the behaviour of those types).  Code can work
with either type because the type carries around (in its vtable) the
knowledge of how it should behave.
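
The separate-vtables idea can be sketched in a few lines of Python.  This
is only an illustration under stated assumptions: the names Value,
PYTHON_STR_VTABLE, PERL_SCALAR_VTABLE and binary_add are made up here and
are not taken from Perl 6, Parrot, or CPython.

```python
# Hypothetical sketch: a "vtable" is a table of function pointers that a
# value carries around.  All names below are invented for illustration.

class Value:
    def __init__(self, payload, vtable):
        self.payload = payload
        self.vtable = vtable


def python_str_add(a, b):
    # Python semantics: "+" on strings concatenates.
    return Value(a.payload + b.payload, PYTHON_STR_VTABLE)


def perl_scalar_add(a, b):
    # Perl semantics: "+" is numeric, so operands are converted first.
    return Value(str(int(a.payload) + int(b.payload)), PERL_SCALAR_VTABLE)


PYTHON_STR_VTABLE = {"add": python_str_add}
PERL_SCALAR_VTABLE = {"add": perl_scalar_add}


def binary_add(a, b):
    # One generic operation; the behaviour comes from the operand's vtable.
    return a.vtable["add"](a, b)


py_result = binary_add(Value("1", PYTHON_STR_VTABLE), Value("2", PYTHON_STR_VTABLE))
pl_result = binary_add(Value("1", PERL_SCALAR_VTABLE), Value("2", PERL_SCALAR_VTABLE))
print(py_result.payload)  # 12
print(pl_result.payload)  # 3
```

Dispatching on the left operand only is a simplification; what to do when
a Perl value meets a Python value is exactly the open question.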

Feel free to grill Dan about these things if you want.

Nat




From esr@thyrsus.com  Tue Jul 31 09:14:43 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 31 Jul 2001 04:14:43 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15207.1393.232974.785433@beluga.mojam.com>; from skip@pobox.com on Tue, Jul 31, 2001 at 02:22:25PM -0500
References: <20010730051831.B1122@thyrsus.com> <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu> <15206.51329.561652.565480@beluga.mojam.com> <200107311708.NAA16497@cj20424-a.reston1.va.home.com> <15206.61296.958360.72700@beluga.mojam.com> <200107311900.PAA17062@cj20424-a.reston1.va.home.com> <15207.1393.232974.785433@beluga.mojam.com>
Message-ID: <20010731041443.A26075@thyrsus.com>

Skip Montanaro <skip@pobox.com>:
> 
>     Guido> Yeah, but the runtime could offer a choice of data types -- for
>     Guido> Python code the constants table would contain Python ints and
>     Guido> strings etc., for Perl code it would contain Perl string-number
>     Guido> objects.  Maybe.
> 
> So I could give a code object generated by the Python compiler to the Perl
> runtime and get different results than if it was executed by the Python
> environment?

No, I don't think that's what Guido is saying.  He and I are both imagining
a *single* runtime, but with some type-specific opcodes that are generated
only by Perl and some only generated by Python.

> Perhaps it's time for Eric to chime in again and tell us what he really has
> in mind.  I can't see the utility in having the same set of opcodes for the
> two languages if the semantics of running them under either environment
> aren't going to be the same.  It seems like it would artificially constrain
> people working on the internals of both languages.

You're right.

What I have in mind starts with a common opcode interpreter, perhaps
based on the Python VM but with extended opcodes where Perl type
semantics don't match, and a common callout mechanism to C-level
runtime libraries linked to the opcode interpreter.

In the conservative version of this vision, Perl and Python have
different runtimes dynamically linked to an instance of the same
opcode interpreter.  Memory allocation/GC and scheduling/threading are
handled inside the opcode interpreter but the OS and environment
binding is (mostly) in the libraries.

Things Python would bring to this party: our serious-cool GC, our 
C extension/embedding system (*much* nicer than XS).  Things Perl would
bring: blazingly fast regexps, taint, flexitypes, references. 

In the radical version, the Perl and Python runtimes merge and the 
differences in semantics are implemented by compiling different wrapper
sequences of opcodes around the library callouts.  At this point we're
doing something competitive with Microsoft's CLR.

My proposed work plan is:

1. Separate the Python VM from the Python compiler.  Initially it's
   OK if they still communicate by hard linkage but that will change
   later.

2. Build the Parrot VM out from the Python VM by adding the minimum
   number of Perliferous opcodes.

3. Start building the Perl runtime on top of that, re-using as much
   of the Python runtime as possible to save effort.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Every election is a sort of advance auction sale of stolen goods. 
	-- H.L. Mencken 


From m@moshez.org  Tue Jul 31 21:10:50 2001
From: m@moshez.org (Moshe Zadka)
Date: Tue, 31 Jul 2001 23:10:50 +0300
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
References: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>, <20010730051831.B1122@thyrsus.com> <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
 <15206.51329.561652.565480@beluga.mojam.com>
Message-ID: <E15Rfqs-0000vf-00@darjeeling>

On Tue, 31 Jul 2001, Guido van Rossum <guido@zope.com> wrote:

> Actually, this may not be as big a deal as I thought before.  The PVM
> doesn't have a lot of knowledge about types built into its instruction
> set.  It knows a bit about classes, lists, dicts, but not e.g. about
> ints and strings.  The opcodes are mostly very abstract: BINARY_ADD etc.

PUSH "1"
PUSH "2"
BINARY_ADD

In Python that gives "12". In Perl that gives 3.
Unless you suggest a PERL_BINARY_ADD and a PYTHON_BINARY_ADD, I 
don't see how you can get around these things.

-- 
gpg --keyserver keyserver.pgp.com --recv-keys 46D01BD6 54C4E1FE
Secure (inaccessible): 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6
Insecure (accessible): C5A5 A8FA CA39 AB03 10B8  F116 1713 1BCF 54C4 E1FE
Learn Python! http://www.ibiblio.org/obp/thinkCSpy


From mclay@nist.gov  Tue Jul 31 08:27:11 2001
From: mclay@nist.gov (Michael McLay)
Date: Tue, 31 Jul 2001 03:27:11 -0400
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <3B66DE8C.C9C62012@lemburg.com>
References: <0107301106520A.02216@fermi.eeel.nist.gov> <01073023115207.02466@fermi.eeel.nist.gov> <3B66DE8C.C9C62012@lemburg.com>
Message-ID: <01073103271101.02004@fermi.eeel.nist.gov>

On Tuesday 31 July 2001 12:36 pm, M.-A. Lemburg wrote:

> I'd suggest to follow the rules for the SQL definitions
> of DECIMAL(,).

> Well, there are several options. I suppose that the IBM paper
> on decimal types has good hints as to what the type should do.
> Again, SQL is probably a good source for inspiration too, since
> it deals with decimals a lot.

Ok, I know about the IBM paper.  Is there an online document on the SQL 
semantics that can be referenced in the PEP?

> I see, the small 'b' still looks funny to me though. Wouldn't
> 1.23f and 25i be more intuitive ?

I originally used 'f' for both the integer and float.  The use of 'b' was 
suggested by Guido. There were two reasons not to use 'i' for integers.  The 
first has to do with how the tokenizer works.  It doesn't distinguish 
between float and int when the token string is passed to parsenumber().  Both 
float and int are processed by the same function.  I could have got around 
this problem by having the switch statement in parsenumber() recognize both 
'i' and 'f', but there is another problem with using 'i'.  The 25i would be 
confusing for someone who was trying to use imaginary numbers.  If they 
accidentally typed 25i instead of 25j they would get an integer instead of an 
imaginary number.  The error might not be detected since 3.0 + 4i would 
evaluate properly.
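
A small sketch of why the typo would go unnoticed.  Python has no 'i'
suffix, so the second half only simulates the proposed behaviour by
writing the plain integer that "4i" would have produced; that reading of
the proposal is an assumption.

```python
# What the user meant: an imaginary literal.
z = 3.0 + 4j
print(type(z).__name__, abs(z))  # complex 5.0

# Hypothetical: if "4i" were accepted as an integer literal, the typo would
# be equivalent to writing a plain int, and the expression still evaluates.
typo = 3.0 + 4
print(type(typo).__name__, typo)  # float 7.0
```

No exception is raised in either case, which is the silent failure being
described.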

> > > I'd rather have this explicit in the sense that you define which
> > > assumptions will be made and what issues arise (rounding, truncation,
> > > loss of precision, etc.).
> >
> > Can you give an example of how this might be implemented?
>
> You would typically first coerce the types to the "larger"
> type, e.g. float + decimal -> float + float -> float, so
> you'd only have to document how the conversion is done and
> which accuracy to expect.

I would be concerned about the float + decimal automatically generating a 
float.  Would it generate an error message if the pedantic flag was set?  
Would it generate a warning in safe mode?

Also, why do you consider a float to be a "larger" value type than decimal?  
Do you mean that a float is less precise?
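
To make the concern concrete, here is a sketch using the decimal module
that was later added to the standard library (it did not exist when this
thread was written, so it stands in for the type under discussion).  It
shows what is lost when a decimal is coerced to float, and that this
module chose to refuse mixed arithmetic rather than coerce.

```python
from decimal import Decimal

# Decimal arithmetic keeps the value exact.
exact = Decimal("0.1") + Decimal("0.2")

# Coercing to float first, as float + decimal -> float would do.
lossy = float(Decimal("0.1")) + 0.2

print(exact)  # 0.3
print(lossy)  # 0.30000000000000004

# The standard-library decimal module raises instead of coercing silently.
mixed_error = None
try:
    Decimal("0.1") + 0.2
except TypeError as exc:
    mixed_error = exc
print(type(mixed_error).__name__)  # TypeError
```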


From gmcm@hypernet.com  Tue Jul 31 21:27:29 2001
From: gmcm@hypernet.com (Gordon McMillan)
Date: Tue, 31 Jul 2001 16:27:29 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <E15Rfqs-0000vf-00@darjeeling>
References: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
Message-ID: <3B66DC71.25881.1347D90B@localhost>

Moshe Zadka wrote:

> PUSH "1"
> PUSH "2"
> BINARY_ADD

But you get a pair of LOAD_CONSTs and a BINARY_ADD. 
Presumably a Perl "1" is a different object than a Python "1".
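
The real instruction stream can be inspected with the dis module.  A
sketch, with two caveats: opcode names differ between CPython versions
(BINARY_ADD in older releases, BINARY_OP in newer ones), and newer
compilers may fold an expression of two literals at compile time, so the
example adds two arguments instead.

```python
import dis


def add(a, b):
    return a + b


# Collect the opcode names the compiler generated for add().
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# The add opcode itself is type-agnostic; the operands decide the result.
print(add("1", "2"))  # 12
print(add(1, 2))      # 3
```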

- Gordon


From thomas@xs4all.net  Tue Jul 31 21:32:59 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 22:32:59 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <E15Rfqs-0000vf-00@darjeeling>
Message-ID: <20010731223259.A626@xs4all.nl>

On Tue, Jul 31, 2001 at 11:10:50PM +0300, Moshe Zadka wrote:
> On Tue, 31 Jul 2001, Guido van Rossum <guido@zope.com> wrote:

> > Actually, this may not be as big a deal as I thought before.  The PVM
> > doesn't have a lot of knowledge about types built into its instruction
> > set.  It knows a bit about classes, lists, dicts, but not e.g. about
> > ints and strings.  The opcodes are mostly very abstract: BINARY_ADD etc.

> PUSH "1"
> PUSH "2"
> BINARY_ADD

> In Python that gives "12". In Perl that gives 3.
> Unless you suggest a PERL_BINARY_ADD and a PYTHON_BINARY_ADD, I 
> don't see how you can get around these things.

The Perl version of the compiled code could of course be

PUSH "1"
COERCE_INT
PUSH "2"
COERCE_INT
BINARY_ADD

for Perl's 

"1" + "2"

and 

PUSH "1"
PUSH "2"
BINARY_ADD

for its 

"1" . "2"

(or, in the case of variables instead of literals, an explicit
'COERCE_STRING' or whatever.)
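
The two sequences above can be run on a toy stack machine.  This is a
minimal sketch: PUSH and COERCE_INT are the made-up opcodes from this
message, not real CPython opcodes, and run() is a hypothetical
interpreter loop.

```python
def run(program):
    """Execute a list of (opcode, *args) tuples on a tiny stack machine."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "COERCE_INT":
            # Replace the top of the stack with its integer value.
            stack.append(int(stack.pop()))
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            # Generic add: the operand types pick the behaviour.
            stack.append(left + right)
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    return stack.pop()


# Perl's  "1" + "2"
perl_plus = [("PUSH", "1"), ("COERCE_INT",),
             ("PUSH", "2"), ("COERCE_INT",),
             ("BINARY_ADD",)]
# Perl's  "1" . "2"
perl_dot = [("PUSH", "1"), ("PUSH", "2"), ("BINARY_ADD",)]

print(run(perl_plus))  # 3
print(run(perl_dot))   # 12
```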

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mclay@nist.gov  Tue Jul 31 08:40:21 2001
From: mclay@nist.gov (Michael McLay)
Date: Tue, 31 Jul 2001 03:40:21 -0400
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <20010731163703.2F86E99C85@waltz.rahul.net>
References: <20010731163703.2F86E99C85@waltz.rahul.net>
Message-ID: <01073103402102.02004@fermi.eeel.nist.gov>

On Tuesday 31 July 2001 12:37 pm, Aahz Maruch wrote:
> Michael McLay wrote:

> > For addition, subtraction, and multiplication the results would be
> > exact with no rounding of the results.  For calculations that include
> > division, the number of digits in a non-terminating result will have to
> > be explicitly set.  Would it make sense for this to be defined by the
> > numbers used in the calculation?  Could this be set in the module or
> > could it be global for the application?
>
> This is why Cowlishaw et al require a full context for all operations.
> At one point I tried implementing things with the context being
> contained in the number rather than "global" (which actually means
> thread-global, but I'm probably punting on *that* bit for the moment),
> but Tim Peters persuaded me that sticking with the spec was the Right
> Thing until *after* the spec was fully implemented.
>
> After seeing the mess generated by PEP-238, I'm fervently in favor of
> sticking with external specs whenever possible.

I had originally expected the context for decimal calculations to be the 
module in which a statement is defined.  If a function defined in another 
module is called the rules of that other module would be applied to that part 
of the calculation.  My expectations of how Python would work with decimal 
numbers don't seem to match what Guido said about his conversation with 
Tim, and what you said in this message.  

How can the rules for using decimals be stated so that a newbie can 
understand what they should expect to happen?  We could set a default 
precision of 17 digits and all calculations that were not exact would be 
rounded to 17 digits.  This would match how their calculator works.  I would 
think this would be the model with the least surprises.  For someone needing 
to be more precise, or less precise, how would this rule be modified?
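
As an illustration of the 17-digit rule, here is a sketch using the
decimal module that was later added to the standard library (an
assumption for this thread, which predates it).  Setting prec on the
current context is the "global for the application" option.

```python
from decimal import Decimal, getcontext

# Calculator-style default: round inexact results to 17 significant digits.
getcontext().prec = 17

total = Decimal("1.10") + Decimal("2.20")  # exact, no rounding needed
product = Decimal("1.5") * Decimal("2.5")  # exact, no rounding needed
third = Decimal(1) / Decimal(3)            # non-terminating, gets rounded

print(total)    # 3.30
print(product)  # 3.75
print(third)    # 0.33333333333333333
```

Someone who needs more or fewer digits would change prec before doing the
division.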



From paulp@ActiveState.com  Tue Jul 31 21:48:45 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 31 Jul 2001 13:48:45 -0700
Subject: [Python-Dev] Parrot -- should life imitate satire?
References: <200107311708.NAA16497@cj20424-a.reston1.va.home.com>, <20010730051831.B1122@thyrsus.com> <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
 <15206.51329.561652.565480@beluga.mojam.com> <E15Rfqs-0000vf-00@darjeeling>
Message-ID: <3B6719AD.EAC715FA@ActiveState.com>

Moshe Zadka wrote:
> 
>...
> 
> PUSH "1"
> PUSH "2"
> BINARY_ADD
> 
> In Python that gives "12". In Perl that gives 3.
> Unless you suggest a PERL_BINARY_ADD and a PYTHON_BINARY_ADD, I
> don't see how you can get around these things.

I'm not endorsing the approach but I think the answer is:

PUSH PyString("1")
PUSH PyString("2")
BINARY_ADD

versus

PUSH PlString("1")
PUSH PlString("2")
BINARY_ADD

i.e. the operators are generic but the operand types vary across
languages. So you can completely unify the bytecodes or the types, but
trying to unify both seems impossible without changing the semantics of
one language or the other quite a bit.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook


From dan@sidhe.org  Tue Jul 31 21:51:30 2001
From: dan@sidhe.org (Dan Sugalski)
Date: Tue, 31 Jul 2001 16:51:30 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731041443.A26075@thyrsus.com>
References: <15207.1393.232974.785433@beluga.mojam.com>
 <20010730051831.B1122@thyrsus.com>
 <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu>
 <15206.51329.561652.565480@beluga.mojam.com>
 <200107311708.NAA16497@cj20424-a.reston1.va.home.com>
 <15206.61296.958360.72700@beluga.mojam.com>
 <200107311900.PAA17062@cj20424-a.reston1.va.home.com>
 <15207.1393.232974.785433@beluga.mojam.com>
Message-ID: <5.1.0.14.0.20010731161946.02753210@24.8.96.48>

[Eric, could you forward this to python-dev if it doesn't show of its own 
accord? I'm not yet subscribed, so I don't know if it'll make it]

I should start with an apology for not being on python-dev when this 
started. Do please Cc me on anything, as I've not gotten on yet. (My 
subscription's caught in the mail, I guess... :)

At 04:14 AM 7/31/2001 -0400, Eric S. Raymond wrote:
>Skip Montanaro <skip@pobox.com>:
> >
> >     Guido> Yeah, but the runtime could offer a choice of data types -- for
> >     Guido> Python code the constants table would contain Python ints and
> >     Guido> strings etc., for Perl code it would contain Perl string-number
> >     Guido> objects.  Maybe.
> >
> > So I could give a code object generated by the Python compiler to the Perl
> > runtime and get different results than if it was executed by the Python
> > environment?
>
>No, I don't think that's what Guido is saying.  He and I are both imagining
>a *single* runtime, but with some type-specific opcodes that are generated
>only by Perl and some only generated by Python.

Odds are there won't even be a different set of opcodes. (Barring the 
possibility of the optimizer being able to *know* that an operation is 
guaranteed to be integer or float, and thus using special-purpose opcodes. 
And that's really an optimization, not a set of language-specific opcodes) 
The behaviour of data is governed by the data itself, so Python variables 
would have Python vtables attached to them guaranteeing Python behaviour, 
while perl ones would have perl vtables guaranteeing perl behaviour.

This was covered, more or less, by the chunks of the internals talk I 
didn't get to. Slides, for the interested, are at 
http://dev.perl.org/perl6/talks/. I'm not sure if there's enough info on 
the slides themselves to be clear--they were written to be talked around.

> > Perhaps it's time for Eric to chime in again and tell us what he really has
> > in mind.  I can't see the utility in having the same set of opcodes for the
> > two languages if the semantics of running them under either environment
> > aren't going to be the same.  It seems like it would artificially constrain
> > people working on the internals of both languages.
>
>You're right.
>
>What I have in mind starts with a common opcode interpreter, perhaps
>based on the Python VM but with extended opcodes where Perl type
>semantics don't match, and a common callout mechanism to C-level
>runtime libraries linked to the opcode interpreter.

I've snipped the rest here.

I don't think Parrot will be built off the Python interpreter. This isn't 
out of any NIH feelings or anything--I'm obligated to make it work for 
Perl, as that's the primary point. If we can make Python a primary point 
too that's keen, and something I *want*, but I do need to keep focused on perl.

Having said that, what I'm doing is stepping back from perl and trying, 
wherever possible, to make the runtime generic. If there's no reason to be 
perl specific I'm not, and so far that's not been a problem. (It actually 
makes life easier in a lot of ways, since we can then delegate the decision 
on how things are done to the variables involved, providing a default set 
of behaviours which the parser will end up determining anyway)

On some things I think I'm being a bit more vicious than, say, Python is by 
default. (For example, if extension code wants to hold on to a variable 
across a GC boundary it had darned well better register that fact with the 
interpreter, or it's going to find itself with trash) I'm not sure about 
the extension mechanism in general--I've not had a chance to look too 
closely at what Python does now, but I don't doubt that, at the C level, 
the differences between the languages will be pretty trivial and easily 
abstractable. Seeing what you folks have is on the list 'o things to do--I 
may well steal from it wholesale. :)

I expect there's a bunch of stuff I'm missing here, so if anyone wants to 
peg me with questions, go for it. (Cc me if they're going to the dev list 
please, at least until I'm sure I'm on) I really would like to see Parrot 
as a viable back end for Python--I think the joint development resources we 
could muster (possibly with the Ruby folks as well) could get us a VM for 
dynamically typed languages to rival the JVM/.NET for statically typed ones.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk



From thomas@xs4all.net  Tue Jul 31 21:54:45 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 31 Jul 2001 22:54:45 +0200
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <15207.1909.395000.123189@gargle.gargle.HOWL>
References: <20010730051831.B1122@thyrsus.com> <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu> <15206.51329.561652.565480@beluga.mojam.com> <200107311708.NAA16497@cj20424-a.reston1.va.home.com> <15206.61296.958360.72700@beluga.mojam.com> <200107311900.PAA17062@cj20424-a.reston1.va.home.com> <15207.1909.395000.123189@gargle.gargle.HOWL>
Message-ID: <20010731225445.B626@xs4all.nl>

On Tue, Jul 31, 2001 at 01:31:01PM -0600, Nathan Torkington wrote:
> Guido van Rossum writes:
> > Yeah, but the runtime could offer a choice of data types -- for Python
> > code the constants table would contain Python ints and strings etc., for
> > Perl code it would contain Perl string-number objects.  Maybe.

> A perl6 value has a vtable, essentially an array of function pointers
> which comprises the standard operations on that value.  I talked to
> Dan (the perl6 internals guy, dan@sidhe.org) about an impedance
> mismatch between Perl and Python data types, and he pointed out that
> you can have Perl values and Python values, each with their own
> semantics, simply by having separate vtables (and thus separate
> functions to implement the behaviour of those types).  Code can work
> with either type because the type carries around (in its vtable) the
> knowledge of how it should behave.

Python objects all have vtables too (though they're structs, not arrays...
I'm not sure why you'd use arrays; check the way Python uses them, you can
do just about anything you want with them, including growing them without
breaking binary compatibility, due to the fact Python never memmoves/copies)
but that wouldn't solve the problem. The problem isn't that the VM wouldn't
know what to do with the various types -- it's absolutely no problem to make a
Python object that behaves like a Perl scalar, or a Perl hash, including the
auto-converting bits...
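
A minimal sketch of such an object, written in Python rather than C for
brevity.  The class name Scalar, the helper _num and the concat() method
are hypothetical, and unlike Perl this sketch raises on non-numeric
strings instead of treating them as 0.

```python
def _num(x):
    """Return the numeric value of a Scalar or a plain Python value."""
    value = x.value if isinstance(x, Scalar) else x
    if isinstance(value, (int, float)):
        return value
    try:
        return int(value)
    except ValueError:
        return float(value)


class Scalar:
    """Hypothetical Perl-like scalar: converts itself to fit the operation."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # "+" is always numeric, as in Perl.
        return Scalar(_num(self) + _num(other))

    __radd__ = __add__

    def concat(self, other):
        # Stand-in for Perl's "." operator: always a string operation.
        return Scalar(str(self) + str(other))

    def __str__(self):
        return str(self.value)


print(Scalar("1") + Scalar("2"))        # 3
print(Scalar("1").concat(Scalar("2")))  # 12
print(Scalar("1.5") + 2)                # 3.5
```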

The problem is that we'd end up with two different sets of types...
Dicts/hashes could be merged, though Perl6 will have to decide if it still
wants to auto-stringify the keys (Python dicts can hold any hashable object
as key) and arrays could possibly be too, but scalars are a different type.
You basically lose the interchangeability benefit if Perl6 code all works
with the 'Scalar' type, but Python code just uses the distinct
int/string/class-instance...

But now that I think about it, this might not be a big problem after all. I
assume Perl6 will always convert to fit the operation, like Perl5 does.
It'll just have to learn to handle a few more objects, and most notably
user-defined types and extension types. Python C code already does things
like 'PyObject_ToInt' to convert a Python value to a C value it can work
with, or just uses the PyObject_<operation> (or PyMapping_<operation>, etc)
API to manipulate objects. Python code wouldn't notice the difference unless
it did type checks, and the Perl6 types could be made siblings of the Python
types to make it pass those, too. We already have the 8-bit and 16-bit
strings.

About the only *real* problem I see with that is getting the whole farm of
mexican jumping beans to figure-skate in unison... It'll be an interesting
experience, with a lot of slippery falls and just-in-time recovering... not
to mention quite a bit of ego-massaging :-) But I think it's just a matter
of typing code and taking the time, and forget about optimizing the code the
first couple of years.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From aahz@rahul.net  Tue Jul 31 22:07:02 2001
From: aahz@rahul.net (Aahz Maruch)
Date: Tue, 31 Jul 2001 14:07:02 -0700 (PDT)
Subject: [Python-Dev] Revised decimal type PEP
In-Reply-To: <01073103402102.02004@fermi.eeel.nist.gov> from "Michael McLay" at Jul 31, 2001 03:40:21 AM
Message-ID: <20010731210702.A778D99C82@waltz.rahul.net>

Michael McLay wrote:
> 
> I had originally expected the context for decimal calculations to be  
> the module in which a statement is defined.  If a function defined    
> in another module is called the rules of that other module would be   
> applied to that part of the calculation.  My expectations of how      
> Python would work with decimal numbers don't seem to match what
> Guido said about his conversation with Tim, and what you said in this 
> message.                                                              
>
> How can the rules for using decimals be stated so that a newbie can
> understand what they should expect to happen?  We could set a default
> precision of 17 digits and all calculations that were not exact would
> be rounded to 17 digits.  This would match how their calculator works.
> I would think this would be the model with the least surprises.  For
> someone needing to be more precise, or less precise, how would this
> rule be modified?

I intend to have more discussions with Cowlishaw once I finish
implementing his spec, but I suspect his answer will be that whoever
calls the module should set the precision.
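
A sketch of "the caller sets the precision", using the decimal module
that was later added to the standard library (an assumption here; it is
not the implementation being written in this thread).  The function
ratio() is a hypothetical piece of library code.

```python
from decimal import Decimal, localcontext


def ratio(a, b):
    # Library code: divides under whatever context the caller has active.
    return Decimal(a) / Decimal(b)


# One caller wants 5 significant digits.
with localcontext() as ctx:
    ctx.prec = 5
    short = ratio(1, 3)

# Another caller wants 30, without touching the library.
with localcontext() as ctx:
    ctx.prec = 30
    long_ = ratio(1, 3)

print(short)  # 0.33333
print(long_)  # 0.333333333333333333333333333333
```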
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.


From niemeyer@conectiva.com  Tue Jul 31 22:09:54 2001
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 31 Jul 2001 18:09:54 -0300
Subject: [Python-Dev] Info documentation
Message-ID: <20010731180954.J19610@tux.distro.conectiva>

Hello!

I've taken the info files somebody has sent to the python-list and
included in Conectiva Linux' python package. People found it very
practical to use the documentation in this format. Would it be
possible to have this format built just like the others for
version 2.2?

Thanks!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


From esr@thyrsus.com  Tue Jul 31 10:18:08 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 31 Jul 2001 05:18:08 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <20010731225445.B626@xs4all.nl>; from thomas@xs4all.net on Tue, Jul 31, 2001 at 10:54:45PM +0200
References: <20010730051831.B1122@thyrsus.com> <Pine.NXT.4.21.0107301836260.258-100000@localhost.virginia.edu> <15206.51329.561652.565480@beluga.mojam.com> <200107311708.NAA16497@cj20424-a.reston1.va.home.com> <15206.61296.958360.72700@beluga.mojam.com> <200107311900.PAA17062@cj20424-a.reston1.va.home.com> <15207.1909.395000.123189@gargle.gargle.HOWL> <20010731225445.B626@xs4all.nl>
Message-ID: <20010731051808.A27187@thyrsus.com>

Thomas Wouters <thomas@xs4all.net>:
> About the only *real* problem I see with that is getting the whole farm of
> mexican jumping beans to figure-skate in unison... It'll be an interesting
> experience, with a lot of slippery falls and just-in-time recovering... not
> to mention quite a bit of ego-massaging :-) But I think it's just a matter
> of typing code and taking the time, and forget about optimizing the code the
> first couple of years.

This is just about exactly how I see it, too.  The big problem isn't
any of the technical challenges -- the discussion so far indicates
these are surmountable, and in fact may be less daunting than many
of us originally assumed.  The big problem will be summoning the
political will to make the right commitments and the right
compromises.

Making this work is going to take strong leadership from Larry and
Guido.  We're laying some of the technical groundwork now.  More will
have to be done.  But I think the key moment, if it happens, will be
the one at which Guido and Larry, each flanked by their three or four
chief lieutenants, shake hands for the cameras and issue a joint ukase
to their tribes.

Tim, hosting that meeting will be your job, of course :-).
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Those who make peaceful revolution impossible 
will make violent revolution inevitable."
	-- John F. Kennedy


From tim.one@home.com  Tue Jul 31 23:19:39 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 31 Jul 2001 18:19:39 -0400
Subject: [Python-Dev] Plan to merge descr-branch into trunk
Message-ID: <LNBBLJKPBEHFEDALKOLCAEJNLCAA.tim.one@home.com>

Unless somebody raises a killer objection over the next ~24 hours, I plan to
merge the descr-branch back into the trunk Wednesday PM (EDT), thus ending
descr-branch as a distinct line of Python development.

Since it would be intractably hard to roll back the code changes, this
represents a solid commitment to Guido's type/class work for 2.2 final.
There may be objections on those grounds.  If so, good luck selling them to
Guido <wink>.

I don't have any worries about the mechanics of the merge, so you shouldn't
either.  We've been very conscientious over the last month+ about merging
trunk changes into descr-branch frequently, and of course I'll do that one
last time before going the other direction.

all's-well-that-ends-ly y'rs  - tim



From esr@thyrsus.com  Tue Jul 31 14:59:41 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 31 Jul 2001 09:59:41 -0400
Subject: [Python-Dev] Parrot -- should life imitate satire?
In-Reply-To: <Pine.LNX.4.32.0107311809160.19357-100000@ziggy.localdomain.fake>; from ping@lfw.org on Tue, Jul 31, 2001 at 06:12:55PM -0700
References: <20010731041443.A26075@thyrsus.com> <Pine.LNX.4.32.0107311809160.19357-100000@ziggy.localdomain.fake>
Message-ID: <20010731095941.E1708@thyrsus.com>

Ka-Ping Yee <ping@lfw.org>:
> On Tue, 31 Jul 2001, Eric S. Raymond wrote:
> > Things Python would bring to this party: our serious-cool GC, our
> > C extension/embedding system (*much* nicer than XS).  Things Perl would
> > bring: blazingly fast regexps, taint, flexitypes, references.
> 
> I don't really understand the motivation.  Do we want any of those things?

No, but we want to be able to interoperate with Perl and, if possible,
have just one back end on which efforts to do things like native code
compilation can be concentrated.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The common argument that crime is caused by poverty is a kind of
slander on the poor.
	-- H. L. Mencken


From tim.one at home.com  Sun Jul  1 03:58:29 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 30 Jun 2001 21:58:29 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3E4487.40054EAE@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEKLKLAA.tim.one@home.com>

[Paul Prescod]
> "The Energy is the mass of the object times the speed of light times
> two."

[David Ascher]
> Actually, it's "squared", not times two.  At least in my universe =)

This is something for Guido to Pronounce on, then.  Who's going to write the
PEP?  The threat of nuclear war seems almost laughable in Paul's universe,
so it's certainly got attractions.  OTOH, it's got to be a lot colder too.

energy-will-do-what-guido-tells-it-to-do-ly y'rs  - tim




From paulp at ActiveState.com  Sun Jul  1 05:59:02 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sat, 30 Jun 2001 20:59:02 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com>
Message-ID: <3B3EA006.14882609@ActiveState.com>

David Ascher wrote:
> 
> > "The Energy is the mass of the object times the speed of light times
> > two."
> 
> Actually, it's "squared", not times two.  At least in my universe =)

Pedant. Next you're going to claim that these silly equations effect my
life somehow.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From paulp at ActiveState.com  Sun Jul  1 06:04:49 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sat, 30 Jun 2001 21:04:49 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com>
Message-ID: <3B3EA161.1375F74C@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> 
> The term "character" in Python should really only be used for
> the 8-bit strings. 

Are we going to change chr() and unichr() to one_element_string() and
unicode_one_element_string()?

u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
character. No Python user will find that confusing no matter how Unicode
knuckle-dragging, mouth-breathing, wife-by-hair-dragging they are.

> In Unicode a "character" can mean any of:

Mark Davis said that "people" can use the word to mean any of those
things. He did not say that it was imprecisely defined in Unicode.
Nevertheless I'm not using the Unicode definition any more than our
standard library uses an ancient Greek definition of integer. Python has
a concept of integer and a concept of character.

> >     It has been proposed that there should be a module for working
> >     with UTF-16 strings in narrow Python builds through some sort of
> >     abstraction that handles surrogates for you. If someone wants
> >     to implement that, it will be another PEP.
> 
> Uhm, narrow builds don't support UTF-16... it's UCS-2 which
> is supported (basically: store everything in range(0x10000));
> the codecs can map code points to surrogates, but it is solely
> their responsibility and the responsibility of the application
> using them to take care of dealing with surrogates.

The user can view the data as UCS-2, UTF-16, Base64, ROT-13, XML, ....
Just as we have a base64 module, we could have a UTF-16 module that
interprets the data in the string as UTF-16 and does surrogate
manipulation for you.

Anyhow, if any of those is the "real" encoding of the data, it is
UTF-16. After all, if the codec reads in four non-BMP characters in,
let's say, UTF-8, we represent them as 8 narrow-build Python characters.
That's the definition of UTF-16! But it's easy enough for me to take
that word out so I will.
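
The arithmetic can be checked with a minimal sketch.  It assumes a
modern Python 3, where a str holds code points and the "utf-16-le"
codec exposes the 16-bit code units a narrow build would store;
utf16_code_units is a name made up for illustration.

```python
def utf16_code_units(text):
    """Count the 16-bit code units needed to store text as UTF-16."""
    # Each UTF-16 code unit is two bytes; "-le" avoids a byte-order mark.
    return len(text.encode("utf-16-le")) // 2


# Four arbitrary characters from outside the Basic Multilingual Plane.
non_bmp = "\U00010000\U0001D11E\U00020000\U0010FFFF"

utf8_bytes = non_bmp.encode("utf-8")   # what the codec reads in
decoded = utf8_bytes.decode("utf-8")   # four code points

print(len(utf8_bytes))             # bytes of UTF-8 input
print(len(decoded))                # code points
print(utf16_code_units(decoded))   # 16-bit code units (surrogate pairs)
```

Each of the four characters becomes one surrogate pair, so the count
of 16-bit units is twice the count of code points.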

>...
> Also, the module will be useful for both narrow and wide builds,
> since the notion of an encoded character can involve multiple code
> points. In that sense Unicode is always a variable length
> encoding for characters and that's the application field of
> this module.

I wouldn't advise that you do all different types of normalization in a
single module but I'll wait for your PEP.

> Here's the adjusted text:
> 
>      It has been proposed that there should be a module for working
>      with Unicode objects using character-, word- and line- based
>      indexing. The details of the implementation are left to
>      another PEP.
 
     It has been proposed that there should be a module that handles
     surrogates in narrow Python builds for programmers. If someone 
     wants to implement that, it will be another PEP. It might also be 
     combined with features that allow other kinds of character-, 
     word- and line- based indexing.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From DavidA at ActiveState.com  Sun Jul  1 08:09:40 2001
From: DavidA at ActiveState.com (David Ascher)
Date: Sat, 30 Jun 2001 23:09:40 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com>
Message-ID: <3B3EBEA4.3EC84EAF@ActiveState.com>

Paul Prescod wrote:
> 
> David Ascher wrote:
> >
> > > "The Energy is the mass of the object times the speed of light times
> > > two."
> >
> > Actually, it's "squared", not times two.  At least in my universe =)
> 
> Pedant. Next you're going to claim that these silly equations effect my
> life somehow.

Although one stretch the argument to say that the equations _effect_
your life, I'd limit the claim to stating that they _affect_ your life. 

pedantly y'rs,

--dr david



From paulp at ActiveState.com  Sun Jul  1 08:15:46 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sat, 30 Jun 2001 23:15:46 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com>
Message-ID: <3B3EC012.A3A05E64@ActiveState.com>

David Ascher wrote:
> 
> Paul Prescod wrote:
> >
> > David Ascher wrote:
> > >
> > > > "The Energy is the mass of the object times the speed of light times
> > > > two."
> > >
> > > Actually, it's "squared", not times two.  At least in my universe =)
> >
> > Pedant. Next you're going to claim that these silly equations effect my
> > life somehow.
> 
> Although one stretch the argument to say that the equations _effect_
              ^               
might    -----

> your life, I'd limit the claim to stating that they _affect_ your life.

And you just bought such a shiny, new glass, house. Pity.

-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From nhodgson at bigpond.net.au  Sun Jul  1 15:00:15 2001
From: nhodgson at bigpond.net.au (Neil Hodgson)
Date: Sun, 1 Jul 2001 23:00:15 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>
Message-ID: <00dd01c1022d$c61e4160$0acc8490@neil>

Paul Prescod:
<PEP: 261>

   The problem I have with this PEP is that it is a compile time option
which makes it hard to work with both 32 bit and 16 bit strings in one
program. Can not the 32 bit string type be introduced as an additional type?

> Are we going to change chr() and unichr() to one_element_string() and
> unicode_one_element_string()
>
> u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> character.

   This wasn't usefully true in the past for DBCS strings and is not the
right way to think of either narrow or wide strings now. The idea that
strings are arrays of characters gets in the way of dealing with many
encodings and is the primary difficulty in localising software for Japanese.
Iteration through the code units in a string is a problem waiting to bite
you and string APIs should encourage behaviour which is correct when faced
with variable width characters, both DBCS and UTF style. Iteration over
variable width characters should be performed in a way that preserves the
integrity of the characters. M.-A. Lemburg's proposed set of iterators could
be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to
provide for the various intended string uses such as "for c in
s.inVisualOrder()" reversing the receipt of right-to-left substrings.
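
   A minimal sketch of iteration that keeps surrogate pairs intact,
assuming the string is available as a sequence of 16-bit code units.
The name iter_code_points is hypothetical; it is not one of the
proposed iterators, whose API is not spelled out in this thread.

```python
def iter_code_points(units):
    """Yield code points from a sequence of 16-bit code units.

    A high surrogate followed by a low surrogate is combined into one
    code point; an unpaired surrogate is passed through unchanged.
    """
    i = 0
    n = len(units)
    while i < n:
        unit = units[i]
        is_pair = (
            0xD800 <= unit <= 0xDBFF
            and i + 1 < n
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            high = unit - 0xD800
            low = units[i + 1] - 0xDC00
            yield 0x10000 + (high << 10) + low
            i += 2
        else:
            yield unit
            i += 1


# "A", U+1D11E as a surrogate pair, then "B".
units = [0x41, 0xD834, 0xDD1E, 0x42]
print([hex(cp) for cp in iter_code_points(units)])
```

   Four code units go in and three code points come out, which is
the difference between iterating over code units and over characters.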

   Neil





From guido at digicool.com  Sun Jul  1 15:44:29 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sun, 01 Jul 2001 09:44:29 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Sun, 01 Jul 2001 23:00:15 +1000."
             <00dd01c1022d$c61e4160$0acc8490@neil> 
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>  
            <00dd01c1022d$c61e4160$0acc8490@neil> 
Message-ID: <200107011344.f61DiTM03548@odiug.digicool.com>

> <PEP: 261>
> 
>    The problem I have with this PEP is that it is a compile time option
> which makes it hard to work with both 32 bit and 16 bit strings in one
> program. Can not the 32 bit string type be introduced as an additional type?

Not without an outrageous amount of additional coding (every place in
the code that currently uses PyUnicode_Check() would have to be
bifurcated in a 16-bit and a 32-bit variant).

I doubt that the desire to work with both 16- and 32-bit characters in
one program is typical for folks using Unicode -- that's mostly
limited to folks writing conversion tools.  Python will offer the
necessary codecs so you shouldn't have this need very often.

You can use the array module to manipulate 16- and 32-bit arrays, and
you can use the various Unicode encodings to do the necessary
encodings.
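
A minimal sketch of that combination, assuming a modern Python 3 (the
array methods frombytes/tobytes and the "utf-32" codecs postdate this
thread).  The helpers typecode_for and code_units are made up for
illustration.

```python
import sys
from array import array

# Encode in native byte order so the array items read as plain integers.
SUFFIX = "le" if sys.byteorder == "little" else "be"


def typecode_for(size):
    """Find an unsigned array typecode whose items are `size` bytes wide."""
    for code in "HIL":
        if array(code).itemsize == size:
            return code
    raise ValueError("no unsigned typecode with itemsize %d" % size)


def code_units(text, bits):
    """Return text as an array of 16- or 32-bit code units."""
    units = array(typecode_for(bits // 8))
    units.frombytes(text.encode("utf-%d-%s" % (bits, SUFFIX)))
    return units


text = "A\U0001D11E"
narrow = code_units(text, 16)   # surrogate pair for the non-BMP character
wide = code_units(text, 32)     # one item per code point

print([hex(u) for u in narrow])
print([hex(u) for u in wide])

# The codecs also take the arrays back to a string.
assert narrow.tobytes().decode("utf-16-" + SUFFIX) == text
assert wide.tobytes().decode("utf-32-" + SUFFIX) == text
```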

> > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> > character.
> 
>    This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea that
> strings are arrays of characters gets in the way of dealing with many
> encodings and is the primary difficulty in localising software for Japanese.

Can you explain the kind of problems encountered in some more detail?

> Iteration through the code units in a string is a problem waiting to bite
> you and string APIs should encourage behaviour which is correct when faced
> with variable width characters, both DBCS and UTF style.

But this is not the Unicode philosophy.  All the variable-length
character manipulation is supposed to be taken care of by the codecs,
and then the application can deal in arrays of characters.
Alternatively, the application can deal in opaque objects representing
variable-length encodings, but then it should be very careful with
concatenation and even more so with slicing.

> Iteration over
> variable width characters should be performed in a way that preserves the
> integrity of the characters. M.-A. Lemburg's proposed set of iterators could
> be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to
> provide for the various intended string uses such as "for c in
> s.inVisualOrder()" reversing the receipt of right-to-left substrings.

I think it's a good idea to provide a set of higher-level tools as
well.  However nobody seems to know what these higher-level tools
should do yet.  PEP 261 is specifically focused on getting the
lower-level foundations right (i.e. the objects that represent arrays
of code units), so that the authors of higher level tools will have a
solid base.  If you want to help author a PEP for such higher-level
tools, you're welcome!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From loewis at informatik.hu-berlin.de  Sun Jul  1 15:52:58 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 1 Jul 2001 15:52:58 +0200 (MEST)
Subject: [Python-Dev] Support for "wide" Unicode characters
Message-ID: <200107011352.PAA27645@pandora.informatik.hu-berlin.de>

> The problem I have with this PEP is that it is a compile time option
> which makes it hard to work with both 32 bit and 16 bit strings in
> one program.

Can you elaborate why you think this is a problem?

> Can not the 32 bit string type be introduced as an additional type?

Yes, but not just "like that". You'd have to define an API for
creating values of this type, you'd have to teach all functions which
ought to accept it to process it, you'd have to define conversion
operations and all that: In short, you'd have to go through all the
trouble that introduction of the Unicode type gave us once again.
Also, I cannot see any advantages in introducing yet another type.

Implementing this PEP is straightforward, and has almost no visible
effect on Python programs.

People have suggested making it a run-time decision, having the
internal representation switch on demand, but that would give an API
nightmare for C code that has to access such values.

> u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> character.

>  This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea
> that strings are arrays of characters gets in the way of dealing
> with many encodings and is the primary difficulty in localising
> software for Japanese.

While I don't know much about localising software for Japanese (*), I
agree that 'u[i] is a character' isn't useful to say in many cases. If
this is the old Python string type, I'd much prefer calling u[i] a
'byte'.

Regards,
Martin

(*) Methinks that the primary difficulty still is translating all the
documentation, and messages. Actually, keeping the translations
up-to-date is even more challenging.



From aahz at rahul.net  Sun Jul  1 16:19:41 2001
From: aahz at rahul.net (Aahz Maruch)
Date: Sun, 1 Jul 2001 07:19:41 -0700 (PDT)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3EC012.A3A05E64@ActiveState.com> from "Paul Prescod" at Jun 30, 2001 11:15:46 PM
Message-ID: <20010701141941.A323099C80@waltz.rahul.net>

Paul Prescod wrote:
> David Ascher wrote:
>> Paul Prescod wrote:
>>> David Ascher wrote:
>>>>>
>>>>> "The Energy is the mass of the object times the speed of light times
>>>>> two."
>>>>
>>>> Actually, it's "squared", not times two.  At least in my universe =)
>>>
>>> Pedant. Next you're going to claim that these silly equations effect my
>>> life somehow.
>> 
>> Although one stretch the argument to say that the equations _effect_
>               ^               
> might    -----
> 
>> your life, I'd limit the claim to stating that they _affect_ your life.
> 
> And you just bought such a shiny, new glass, house. Pity.

All speeling falmes contain at least one erorr.
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.



From just at letterror.com  Sun Jul  1 16:43:08 2001
From: just at letterror.com (Just van Rossum)
Date: Sun,  1 Jul 2001 16:43:08 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <200107011344.f61DiTM03548@odiug.digicool.com>
Message-ID: <20010701164315-r01010600-c2d5b07d@213.84.27.177>

Guido van Rossum wrote:

> > <PEP: 261>
> > 
> >    The problem I have with this PEP is that it is a compile time option
> > which makes it hard to work with both 32 bit and 16 bit strings in one
> > program. Can not the 32 bit string type be introduced as an additional type?
> 
> Not without an outrageous amount of additional coding (every place in
> the code that currently uses PyUnicode_Check() would have to be
> bifurcated in a 16-bit and a 32-bit variant).

Alternatively, a Unicode object could *internally* be either 8, 16 or 32 bits
wide (to be clear: not per character, but per string). Also a lot of work, but
it'll be a lot less wasteful.
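
A rough sketch of the per-string idea, assuming the width is picked
from the largest code point in the string.  The names storage_width
and fixed_width_size are hypothetical, and this only estimates
character storage, not any real object layout.

```python
def storage_width(text):
    """Bytes per character needed to store text at one fixed width."""
    largest = max(map(ord, text), default=0)
    if largest < 0x100:
        return 1
    if largest < 0x10000:
        return 2
    return 4


def fixed_width_size(text, width=None):
    """Character storage in bytes, at `width` or the narrowest that fits."""
    if width is None:
        width = storage_width(text)
    return width * len(text)


strings = ["plain ascii", "caf\u00e9", "\u20ac 10", "clef \U0001D11E"]

always_32 = sum(fixed_width_size(s, 4) for s in strings)
per_string = sum(fixed_width_size(s) for s in strings)

print(always_32, per_string)
```

Only the last string needs 4 bytes per character; the others fit in
1 or 2.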

> I doubt that the desire to work with both 16- and 32-bit characters in
> one program is typical for folks using Unicode -- that's mostly
> limited to folks writing conversion tools.  Python will offer the
> necessary codecs so you shouldn't have this need very often.

Not a lot of people will want to work with 16 or 32 bit chars directly, but I
think a less wasteful solution to the surrogate pair problem *will* be desired
by people. Why use 32 bits for all strings in a program when only a tiny
percentage actually *needs* more than 16? (Or even 8...)

> > Iteration through the code units in a string is a problem waiting to bite
> > you and string APIs should encourage behaviour which is correct when faced
> > with variable width characters, both DBCS and UTF style.
> 
> But this is not the Unicode philosophy.  All the variable-length
> character manipulation is supposed to be taken care of by the codecs,
> and then the application can deal in arrays of characters.

Right: this is the way it should be.

My difficulty with PEP 261 is that I'm afraid few people will actually enable
32-bit support (*what*?! all unicode strings become 32 bits wide? no way!),
therefore making programs non-portable in very subtle ways.

Just



From DavidA at ActiveState.com  Sun Jul  1 19:13:30 2001
From: DavidA at ActiveState.com (David Ascher)
Date: Sun, 01 Jul 2001 10:13:30 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com> <3B3EC012.A3A05E64@ActiveState.com>
Message-ID: <3B3F5A3A.A88B54B2@ActiveState.com>

Paul: 
> And you just bought such a shiny, new glass, house. Pity.

What kind of comma placement is that?

--david



From paulp at ActiveState.com  Sun Jul  1 20:08:10 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 11:08:10 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> <00dd01c1022d$c61e4160$0acc8490@neil>
Message-ID: <3B3F670A.B5396D61@ActiveState.com>

Neil Hodgson wrote:
> 
> Paul Prescod:
> <PEP: 261>
> 
>    The problem I have with this PEP is that it is a compile time option
> which makes it hard to work with both 32 bit and 16 bit strings in one
> program. Can not the 32 bit string type be introduced as an additional type?

The two solutions are not mutually exclusive. If you (or someone)
supplies a 32-bit type and Guido accepts it, then the compile option
might fall into disuse. But this solution was chosen because it is much
less work. Really though, I think that having 16-bit and 32-bit types is
extra confusion for very little gain. I would much rather have a single
space-efficient type that hid the details of its implementation. But
nobody has volunteered to code it and Guido might not accept it even if
someone did.

>...
>    This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea that
> strings are arrays of characters gets in the way of dealing with many
> encodings and is the primary difficulty in localising software for Japanese.

The whole benefit of moving to 32-bit character strings is to allow
people to think of strings as arrays of characters. Forcing them to
consider variable-length encodings is precisely what we are trying to
avoid.

> Iteration through the code units in a string is a problem waiting to bite
> you and string APIs should encourage behaviour which is correct when faced
> with variable width characters, both DBCS and UTF style. Iteration over
> variable width characters should be performed in a way that preserves the
> integrity of the characters. 

On wide Python builds there is no such thing as variable width Unicode
characters. It doesn't make sense to combine two 32-bit characters to
get a 64-bit one. On narrow Python builds you might want to treat a
surrogate pair as a single character but I would strongly advise against
it. If you want wide characters, move to a wide build. Even if a narrow
build is more space efficient, you'll lose a ton of performance
emulating wide characters in Python code.

> ... M.-A. Lemburg's proposed set of iterators could
> be extended to indicate encoding "for c in s.asCharacters('utf-8')" and to
> provide for the various intended string uses such as "for c in
> s.inVisualOrder()" reversing the receipt of right-to-left substrings.

A floor wax and a dessert topping. <0.5 wink>

I don't think that the average Python programmer would want
s.asCharacters('utf-8') when they already have s.decode('utf-8'). We
decided a long time ago that the model for standard users would be
fixed-length (1!), abstract characters. That's the way Python's Unicode
subsystem has always worked.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From paulp at ActiveState.com  Sun Jul  1 20:19:17 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 11:19:17 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010701164315-r01010600-c2d5b07d@213.84.27.177>
Message-ID: <3B3F69A5.D7CE539D@ActiveState.com>

Just van Rossum wrote:
> 
> Guido van Rossum wrote:
> 
> > > <PEP: 261>
> > >
> > >    The problem I have with this PEP is that it is a compile time option
> > > which makes it hard to work with both 32 bit and 16 bit strings in one
> > > program. Can not the 32 bit string type be introduced as an additional type?
> >
> > Not without an outrageous amount of additional coding (every place in
> > the code that currently uses PyUnicode_Check() would have to be
> > bifurcated in a 16-bit and a 32-bit variant).
> 
> Alternatively, a Unicode object could *internally* be either 8, 16 or 32 bits
> wide (to be clear: not per character, but per string). Also a lot of work, but
> it'll be a lot less wasteful.

I hope this is where we end up one day. But the compile-time option is
better than where we are today. Even though PEP 261 is not my favorite
solution, it buys us a couple of years of wait-and-see time.

Consider that computer memory is growing much faster than textual data.
People's text processing techniques get more and more "wasteful" because
it is now almost always possible to load the entire "text" into memory
at once. I remember how some text editors used to boast that they only
loaded your text "on demand". 

Maybe so much data will be passed to us from UCS-4 APIs that trying to
"compress it" will actually be inefficient.

Maybe two years from now Guido will make UCS-4 the default and only a
tiny minority will notice or care.

> ...
> My difficulty with PEP 261 is that I'm afraid few people will actually enable
> 32-bit support (*what*?! all unicode strings become 32 bits wide? no way!),
> therefore making programs non-portable in very subtle ways.

It really depends on what the default build option is.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From paulp at ActiveState.com  Sun Jul  1 20:22:01 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 11:22:01 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010630141524.E029999C80@waltz.rahul.net> <3B3E23D3.69D591DD@ActiveState.com> <3B3E4487.40054EAE@ActiveState.com> <3B3EA006.14882609@ActiveState.com> <3B3EBEA4.3EC84EAF@ActiveState.com> <3B3EC012.A3A05E64@ActiveState.com> <3B3F5A3A.A88B54B2@ActiveState.com>
Message-ID: <3B3F6A49.6E82B7DE@ActiveState.com>

David Ascher wrote:
> 
> Paul:
> > And you just bought such a shiny, new glass, house. Pity.
> 
> What kind of comma placement is that?

I had to leave you something to complain about;
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From guido at digicool.com  Sun Jul  1 20:37:48 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sun, 01 Jul 2001 14:37:48 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Sun, 01 Jul 2001 16:43:08 +0200."
             <20010701164315-r01010600-c2d5b07d@213.84.27.177> 
References: <20010701164315-r01010600-c2d5b07d@213.84.27.177> 
Message-ID: <200107011837.f61IbmZ03645@odiug.digicool.com>

> Alternatively, a Unicode object could *internally* be either 8, 16
> or 32 bits wide (to be clear: not per character, but per
> string). Also a lot of work, but it'll be a lot less wasteful.

Depending on what you prefer to waste: developers' time or computer
resources.  I bet that if you try to measure the wasted space you'll
find that it wastes very little compared to all the other overheads
in a typical Python program: CPU time compared to writing your code in
C, memory overhead for integers, etc.

It so happened that the Unicode support was written to make it very
easy to change the compile-time code unit size; but making this a
per-string (or even global) run-time variable is much harder without
touching almost every place that uses Unicode (not to mention slowing
down the common case).

Nobody was enthusiastic about fixing this, so our choice was really
between staying with 16 bits or making 32 bits an option for those who
need it.

> Not a lot of people will want to work with 16 or 32 bit chars
> directly,

How do you know?  There are more Chinese than Americans and Europeans
together, and they will soon all have computers. :-)

> but I think a less wasteful solution to the surrogate pair
> problem *will* be desired by people. Why use 32 bits for all strings
> in a program when only a tiny percentage actually *needs* more than
> 16? (Or even 8...)

So work in UTF-8 -- a lot of work can be done in UTF-8.

> > But this is not the Unicode philosophy.  All the variable-length
> > character manipulation is supposed to be taken care of by the codecs,
> > and then the application can deal in arrays of characters.
> 
> Right: this is the way it should be.
> 
> My difficulty with PEP 261 is that I'm afraid few people will
> actually enable 32-bit support (*what*?! all unicode strings become
> 32 bits wide? no way!), therefore making programs non-portable in
> very subtle ways.

My hope and expectation is that those folks who need 32-bit support
will enable it.  If this solution is not sufficient, we may have to
provide something else in the future, but given that the
implementation effort for PEP 261 was very minimal (certainly less
than the time expended in discussing it) I am very happy with it.

It will take quite a while until lots of folks will need the 32-bit
support (there aren't that many characters defined outside the basic
plane yet).  In the meantime, those that need 32-bit support
should be happy that we allow them to rebuild Python with 32-bit
support.  In the next 5-10 years, the 32-bit support requirement will
become more common -- as will be the memory upgrades to make it
painless.

It's not like Python is making this decision in a vacuum either: Linux
already has 32-bit wchar_t.  32-bit characters will eventually be
common (even in Windows, which probably has the largest investment in
16-bit Unicode at the moment of any system).  Like IPv6, we're trying
to enable uncommon uses of Python without breaking things for the
not-so-early adopters.

Again, don't see PEP 261 as the ultimate answer to all your 32-bit
Unicode questions.  Just consider that realistically we have two
choices: stick with 16-bit support only or make 32-bit support an
option.  Other approaches (more surrogate support, run-time choices,
transparent variable-length encodings) simply aren't realistic --
no-one has the time to code them.

It should be easy to write portable Python programs that work
correctly with 16-bit Unicode characters on a "narrow" interpreter and
also work correctly with 21-bit Unicode on a "wide" interpreter:
just avoid using surrogates.  If you *need* to work with surrogates,
try to limit yourself to very simple operations like concatenations of
valid strings, and splitting strings at known delimiters only.
There's a lot you can do with this.
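
A small sketch of why those simple operations are safe, assuming
strings are modelled as lists of 16-bit code units the way a narrow
build stores them.  The helpers to_units, is_well_formed and
split_units are made up for illustration.

```python
def to_units(text):
    """Return text as a list of UTF-16 code units (narrow-build view)."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def is_well_formed(units):
    """True if every surrogate is part of a proper high/low pair."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate must be followed by a low surrogate.
            if i + 1 >= len(units) or not 0xDC00 <= units[i + 1] <= 0xDFFF:
                return False
            i += 2
        elif 0xDC00 <= unit <= 0xDFFF:
            return False  # low surrogate with no preceding high one
        else:
            i += 1
    return True


def split_units(units, delimiter):
    """Split at every occurrence of a single non-surrogate delimiter."""
    pieces, current = [], []
    for unit in units:
        if unit == delimiter:
            pieces.append(current)
            current = []
        else:
            current.append(unit)
    pieces.append(current)
    return pieces


left = to_units("key=\U0001D11E")
right = to_units(",\U00010000")

joined = left + right                      # concatenation of valid strings
pieces = split_units(joined, ord(","))     # split at a known delimiter
arbitrary = joined[:5]                     # slice through a surrogate pair

print(is_well_formed(joined))
print([is_well_formed(p) for p in pieces])
print(is_well_formed(arbitrary))
```

Concatenating valid strings and splitting at the comma keep every
pair whole; the slice at an arbitrary index leaves a lone high
surrogate behind.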

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Sun Jul  1 20:52:36 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 1 Jul 2001 14:52:36 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3F69A5.D7CE539D@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEMBKLAA.tim.one@home.com>

[Paul Prescod]
> ...
> Consider that computer memory is growing much faster than textual data.
> People's text processing techniques get more and more "wasteful" because
> it is now almost always possible to load the entire "text" into memory
> at once.

Indeed, the entire text of the Bible fits in a corner of my year-old box's
RAM, even at 32 bits per character.

> I remember how some text editors used to boast that they only loaded
> your text "on demand".

Well, they still do -- fancy editors use fancy data structures, so that,
e.g., inserting characters at the start of the file doesn't cause a 50Mb
memmove each time.  Response time is still important, but I'd wager
relatively insensitive to basic character size (you need tricks that cut
factors of 1000s off potential worst cases to give the appearance of
instantaneous results; a factor of 2 or 4 is in the noise compared to what's
needed regardless).




From aahz at rahul.net  Sun Jul  1 21:21:26 2001
From: aahz at rahul.net (Aahz Maruch)
Date: Sun, 1 Jul 2001 12:21:26 -0700 (PDT)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3F670A.B5396D61@ActiveState.com> from "Paul Prescod" at Jul 01, 2001 11:08:10 AM
Message-ID: <20010701192126.9EB8299C80@waltz.rahul.net>

Paul Prescod wrote:
> 
> On wide Python builds there is no such thing as variable width Unicode
> characters. It doesn't make sense to combine two 32-bit characters to
> get a 64-bit one. On narrow Python builds you might want to treat a
> surrogate pair as a single character but I would strongly advise against
> it. If you want wide characters, move to a wide build. Even if a narrow
> build is more space efficient, you'll lose a ton of performance
> emulating wide characters in Python code.

This needn't go into the PEP, I think, but I'd like you to say something
about what you expect the end result of this PEP to look like under
Windows, where "rebuild" isn't really a valid option for most Python
users.  Are we simply committing to make two builds available?  If so,
what happens the next time we run into a situation like this?
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

I don't really mind a person having the last whine, but I do mind someone 
else having the last self-righteous whine.



From paulp at ActiveState.com  Sun Jul  1 21:21:09 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 12:21:09 -0700
Subject: [Python-Dev] Text editors
References: <LNBBLJKPBEHFEDALKOLCIEMBKLAA.tim.one@home.com>
Message-ID: <3B3F7825.CA3D1B5B@ActiveState.com>

Tim Peters wrote:
> 
>...
> 
> > I remember how some text editors used to boast that they only loaded
> > your text "on demand".
> 
> Well, they still do -- fancy editors use fancy data structures, so that,
> e.g., inserting characters at the start of the file doesn't cause a 50Mb
> memmove each time.  

Yes, but most modern text editors take O(n) time to open the file. There
was a time when the more advanced ones did not. Or maybe that was just
SGML editors...I can't remember.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From guido at digicool.com  Sun Jul  1 21:32:52 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sun, 01 Jul 2001 15:32:52 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Sun, 01 Jul 2001 12:21:26 PDT."
             <20010701192126.9EB8299C80@waltz.rahul.net> 
References: <20010701192126.9EB8299C80@waltz.rahul.net> 
Message-ID: <200107011932.f61JWq803843@odiug.digicool.com>

> This needn't go into the PEP, I think, but I'd like you to say something
> about what you expect the end result of this PEP to look like under
> Windows, where "rebuild" isn't really a valid option for most Python
> users.  Are we simply committing to make two builds available?  If so,
> what happens the next time we run into a situation like this?

I imagine that we will pick a choice (I expect it'll be UCS2) and
make only that build available, until there are loud enough cries from
folks who have a reasonable excuse not to have a copy of VCC around.

Given that the rest of Windows uses 16-bit Unicode, I think we'll be
able to get away with this for quite a while.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From paulp at ActiveState.com  Sun Jul  1 21:33:20 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 12:33:20 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010701192126.9EB8299C80@waltz.rahul.net>
Message-ID: <3B3F7B00.29D6832@ActiveState.com>

Aahz Maruch wrote:
> 
>...
> 
> This needn't go into the PEP, I think, but I'd like you to say something
> about what you expect the end result of this PEP to look like under
> Windows, where "rebuild" isn't really a valid option for most Python
> users.  Are we simply committing to make two builds available?  If so,
> what happens the next time we run into a situation like this?

Windows itself is strongly biased towards 16-bit characters. Therefore I
expect that to be the default for a while. Then I expect Guido to
announce that 32-bit characters are the new default with version 3000
(perhaps right after Windows 3000 ships) and we'll all change. So most
Windows users will not be able to work with 32-bit characters for a
while. But since Windows itself doesn't like those characters, they
probably won't run into them much.

I strongly doubt that we'll ever make two builds available because it
would cause a mess of extension module incompatibilities.
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From paulp at ActiveState.com  Sun Jul  1 21:57:09 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 01 Jul 2001 12:57:09 -0700
Subject: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" Unicode characters
Message-ID: <3B3F8095.8D58631D@ActiveState.com>

PEP: 261
Title: Support for "wide" Unicode characters
Version: $Revision: 1.3 $
Author: paulp at activestate.com (Paul Prescod)
Status: Draft
Type: Standards Track
Created: 27-Jun-2001
Python-Version: 2.2
Post-History: 27-Jun-2001


Abstract

    Python 2.1 unicode characters can have ordinals only up to
    2**16 - 1.
    This range corresponds to a range in Unicode known as the Basic
    Multilingual Plane. There are now characters in Unicode that live
    on other "planes". The largest addressable character in Unicode
    has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we
    will call this TOPCHAR and call characters in this range "wide 
    characters".
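
    The arithmetic in the abstract can be checked directly. This is a
    small sketch; the names BMP_LIMIT and is_wide are introduced here
    for illustration only, and is_wide assumes that "wide" means an
    ordinal above the Basic Multilingual Plane and at most TOPCHAR.

```python
# Unicode has 17 planes of 2**16 code points each.
BMP_LIMIT = 2**16            # first ordinal outside the Basic Multilingual Plane
TOPCHAR = 17 * 2**16 - 1     # largest addressable Unicode ordinal


def is_wide(ordinal):
    """Return True if the ordinal is above the BMP but still within Unicode."""
    return BMP_LIMIT <= ordinal <= TOPCHAR


print(hex(TOPCHAR))          # 0x10ffff
```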


Glossary

    Character 
        
        Used by itself, means the addressable units of a Python 
        Unicode string.

    Code point

        A code point is an integer between 0 and TOPCHAR.
        If you imagine Unicode as a mapping from integers to
        characters, each integer is a code point. But the 
        integers between 0 and TOPCHAR that do not map to
        characters are also code points. Some will someday 
        be used for characters. Some are guaranteed never 
        to be used for characters.

    Codec

        A set of functions for translating between physical
        encodings (e.g. on disk or coming in from a network)
        and logical Python objects.

    Encoding

        Mechanism for representing abstract characters in terms of
        physical bits and bytes. Encodings allow us to store
        Unicode characters on disk and transmit them over networks
        in a manner that is compatible with other Unicode software.

    Surrogate pair

        Two physical characters that represent a single logical
        character. Part of a convention for representing 32-bit
        code points in terms of two 16-bit code points.

    Unicode string

          A Python type representing a sequence of code points with
          "string semantics" (e.g. case conversions, regular
          expression compatibility, etc.) Constructed with the 
          unicode() function.


Proposed Solution

    One solution would be to merely increase the maximum ordinal 
    to a larger value. Unfortunately the only straightforward
    implementation of this idea is to use 4 bytes per character.
    This has the effect of doubling the size of most Unicode 
    strings. In order to avoid imposing this cost on every
    user, Python 2.2 will allow the 4-byte implementation as a
    build-time option. Users can choose whether they care about
    wide characters or prefer to preserve memory.

    The 4-byte option is called "wide Py_UNICODE". The 2-byte option
    is called "narrow Py_UNICODE".

    Most things will behave identically in the wide and narrow worlds.

    * unichr(i) for 0 <= i < 2**16 (0x10000) always returns a
      length-one string.

    * unichr(i) for 2**16 <= i <= TOPCHAR will return a
      length-one string on wide Python builds. On narrow builds it will 
      raise ValueError.

        ISSUE 

            Python currently allows \U literals that cannot be
            represented as a single Python character. It generates two
            Python characters known as a "surrogate pair". Should this
            be disallowed on future narrow Python builds?

        Pro:

            Python already allows the construction of a surrogate pair
            for a large unicode literal character escape sequence.
            This is basically designed as a simple way to construct
            "wide characters" even in a narrow Python build. It is also
            somewhat logical considering that the Unicode-literal syntax
            is basically a short-form way of invoking the unicode-escape
            codec.

        Con:

            Surrogates could be easily created this way but the user
            still needs to be careful about slicing, indexing, printing 
            etc. Therefore some have suggested that Unicode
            literals should not support surrogates.


        ISSUE 

            Should Python allow the construction of characters that do
            not correspond to Unicode code points?  Unassigned Unicode 
            code points should obviously be legal (because they could 
            be assigned at any time). But code points above TOPCHAR are 
            guaranteed never to be used by Unicode. Should we allow
            access to them anyhow?

        Pro:

            If a Python user thinks they know what they're doing why
            should we try to prevent them from violating the Unicode
            spec? After all, we don't stop 8-bit strings from
            containing non-ASCII characters.

        Con:

            Codecs and other Unicode-consuming code will have to be
            careful of these characters which are disallowed by the
            Unicode specification.

    * ord() is always the inverse of unichr()

    * There is an integer value in the sys module that describes the
      largest ordinal for a character in a Unicode string on the current
      interpreter. sys.maxunicode is 2**16-1 (0xffff) on narrow builds
      of Python and TOPCHAR on wide builds.

        ISSUE: Should there be distinct constants for accessing
               TOPCHAR and the real upper bound for the domain of 
               unichr (if they differ)? There has also been a
               suggestion of sys.unicodewidth which can take the 
               values 'wide' and 'narrow'.

    * every Python Unicode character represents exactly one Unicode code 
      point (i.e. Python Unicode Character = Abstract Unicode
      character).

    * codecs will be upgraded to support "wide characters"
      (represented directly in UCS-4, and as variable-length sequences
      in UTF-8 and UTF-16). This is the main part of the implementation 
      left to be done.

    * There is a convention in the Unicode world for encoding a 32-bit
      code point in terms of two 16-bit code points. These are known
      as "surrogate pairs". Python's codecs will adopt this convention
      and encode 32-bit code points as surrogate pairs on narrow Python
      builds. 

        ISSUE 

            Should there be a way to tell codecs not to generate
            surrogates and instead treat wide characters as 
            errors?

        Pro:

            I might want to write code that works only with
            fixed-width characters and does not have to worry about
            surrogates.


        Con:

            No clear proposal of how to communicate this to codecs.

    * there are no restrictions on constructing strings that use 
      code points "reserved for surrogates" improperly. These are
      called "isolated surrogates". The codecs should disallow reading
      these from files, but you could construct them using string 
      literals or unichr().
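
    The surrogate-pair convention mentioned in the list above is the
    UTF-16 one: subtract 0x10000 from the code point and split the
    remaining 20 bits into two 10-bit halves. The following is a
    sketch written for a current Python 3, where every string behaves
    like a wide build; the function names are made up for the example
    and are not part of this proposal.

```python
HIGH_START = 0xD800    # first high (lead) surrogate
LOW_START = 0xDC00     # first low (trail) surrogate
TOPCHAR = 0x10FFFF


def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into (high, low) 16-bit code units."""
    if not 0x10000 <= code_point <= TOPCHAR:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000            # a 20-bit value
    high = HIGH_START + (offset >> 10)       # top 10 bits
    low = LOW_START + (offset & 0x3FF)       # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


def code_units(text):
    """Return the 16-bit code units a narrow build would hold for text."""
    units = []
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            units.extend(to_surrogate_pair(cp))
        else:
            units.append(cp)
    return units


print([hex(u) for u in code_units("a\U0001D11E")])
```

    On a narrow build, len() of such a string counts code units, so a
    single character above 0xFFFF contributes two to the length.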


Implementation

    There is a new (experimental) define:

        #define PY_UNICODE_SIZE 2

    There is a new configure option:

        --enable-unicode=ucs2 configures a narrow Py_UNICODE, and uses
                              wchar_t if it fits
        --enable-unicode=ucs4 configures a wide Py_UNICODE, and uses
                              wchar_t if it fits
        --enable-unicode      same as "=ucs2"

    The intention is that --disable-unicode, or --enable-unicode=no
    removes the Unicode type altogether; this is not yet implemented.

    It is also proposed that one day --enable-unicode will just
    default to the width of your platform's wchar_t.

    Windows builds will be narrow for a while based on the fact that
    there have been few requests for wide characters, those requests
    are mostly from hard-core programmers with the ability to buy
    their own Python and Windows itself is strongly biased towards
    16-bit characters.


Notes

    This PEP does NOT imply that people using Unicode need to use a
    4-byte encoding for their files on disk or sent over the network. 
    It only allows them to do so. For example, ASCII is still a 
    legitimate (7-bit) Unicode-encoding.

    It has been proposed that there should be a module that handles
    surrogates in narrow Python builds for programmers. If someone 
    wants to implement that, it will be another PEP. It might also be 
    combined with features that allow other kinds of character-, 
    word- and line- based indexing.


Rejected Suggestions

    More or less the status-quo

        We could officially say that Python characters are 16-bit and
        require programmers to implement wide characters in their
        application logic by combining surrogate pairs. This is a heavy 
        burden because emulating 32-bit characters is likely to be
        very inefficient if it is coded entirely in Python. Plus these
        abstracted pseudo-strings would not be legal as input to the
        regular expression engine.

    "Space-efficient Unicode" type

        Another class of solution is to use some efficient storage
        internally but present an abstraction of wide characters to
        the programmer. Any of these would require a much more complex
        implementation than the accepted solution. For instance consider
        the impact on the regular expression engine. In theory, we could
        move to this implementation in the future without breaking
        Python code. A future Python could "emulate" wide Python semantics on
        narrow Python. Guido is not willing to undertake the
        implementation right now.

    Two types

        We could introduce a 32-bit Unicode type alongside the 16-bit
        type. There is a lot of code that expects there to be only a 
        single Unicode type.

    This PEP represents the least-effort solution. Over the next
    several years, 32-bit Unicode characters will become more common
    and that may either convince us that we need a more sophisticated 
    solution or (on the other hand) convince us that simply 
    mandating wide Unicode characters is an appropriate solution.
    Right now the two options on the table are do nothing or do
    this.


References

    Unicode Glossary: http://www.unicode.org/glossary/


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From thomas at xs4all.net  Mon Jul  2 00:12:48 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 2 Jul 2001 00:12:48 +0200
Subject: [Python-Dev] Python 2.1.1 release 'schedule'
Message-ID: <20010702001248.H8098@xs4all.nl>

This is just a heads-up to everyone. I plan to release Python 2.1.1c1
(release candidate 1) somewhere on Friday the 13th (of July) and, barring
any serious problems, the full release the Friday following that, July 20.

The Python 2.1.1 CVS branch (tagged 'release21-maint') should be stable, and
should contain most bugfixes that will be in 2.1.1. If you care about
2.1.1's stability and portability, or you found bugs in 2.1 and aren't sure
they are fixed, and you can check things out of CVS, please give the CVS
branch a try: just 'checkout' python with

cvs co -rrelease21-maint python

(with the -d option from the SourceForge CVS page that applies to you) and
follow the normal compile procedure. Binaries for Windows as well as source
tarballs will be provided for the release candidate and the final release
(obviously) but the more bugs people point out before the final release, the
more bugs will be fixed in 2.1.1 :-)

Python 2.1.1 (as well as the CVS branch) will fall under the new
GPL-compatible PSF licence, just like Python 2.0.1. The only notable thing
missing from the CVS branch is an updated NEWS file -- I'm working on it.
I'm also not done searching the open bugs for ones that might need to be
addressed in 2.1.1, but feel free to point me to bugs you think are
important!

2.1.1-Patch-Czar-ly y'rs,
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From greg at cosc.canterbury.ac.nz  Mon Jul  2 04:06:50 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:06:50 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3EBEA4.3EC84EAF@ActiveState.com>
Message-ID: <200107020206.OAA00427@s454.cosc.canterbury.ac.nz>

David Ascher <DavidA at ActiveState.com>:

> I'd limit the claim to stating that they _affect_ your life.

If matter didn't have any rest energy, everything
would fly about at the speed of light, which would
make life very hectic.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Mon Jul  2 04:36:39 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:36:39 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <20010701164315-r01010600-c2d5b07d@213.84.27.177>
Message-ID: <200107020236.OAA00432@s454.cosc.canterbury.ac.nz>

Just van Rossum <just at letterror.com>:

> My difficulty with PEP 261 is that I'm afraid few people will actually enable
> 32-bit support (*what*?! all unicode strings become 32 bits wide? no way!),
> therefore making programs non-portable in very subtle ways.

I agree. This can only be a stopgap measure. Ultimately the
Unicode type needs to be made smarter.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Mon Jul  2 04:42:12 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:42:12 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3F5A3A.A88B54B2@ActiveState.com>
Message-ID: <200107020242.OAA00436@s454.cosc.canterbury.ac.nz>

David Ascher <DavidA at ActiveState.com>:
> > And you just bought such a shiny, new glass, house. Pity.
>
> What kind of comma placement is that?

Obviously it's only the glass that is new, not the
whole house. :-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From nhodgson at bigpond.net.au  Mon Jul  2 04:42:11 2001
From: nhodgson at bigpond.net.au (Neil Hodgson)
Date: Mon, 2 Jul 2001 12:42:11 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107011352.PAA27645@pandora.informatik.hu-berlin.de>
Message-ID: <01d601c102a0$98671580$0acc8490@neil>

Martin von Loewis:


> > The problem I have with this PEP is that it is a compile time option
> > which makes it hard to work with both 32 bit and 16 bit strings in
> > one program.
>
> Can you elaborate why you think this is a problem?

   A common role for Python is to act as glue between various modules. If
Paul produces some interesting code that depends on 32 bit strings and I
want to use that in conjunction with some Win32 specific or COM dependent
code that wants 16 bit strings then it may not be possible or may require
difficult workarounds.

> (*) Methinks that the primary difficulty still is translating all the
> documentation, and messages. Actually, keeping the translations
> up-to-date is even more challenging.

   Translation of documentation and strings can be performed by almost
anyone who writes both languages ("even managers") and can be budgeted by
working out the amount of text and applying a conversion rate. Code requires
careful thought and can lead to the typical buggy software schedule
blowouts.

   Neil





From greg at cosc.canterbury.ac.nz  Mon Jul  2 04:49:56 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Jul 2001 14:49:56 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <200107011837.f61IbmZ03645@odiug.digicool.com>
Message-ID: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>

> It so happened that the Unicode support was written to make it very
> easy to change the compile-time code unit size

What about extension modules that deal with Unicode strings?
Will they have to be recompiled too? If so, is there anything
to detect an attempt to import an extension module with an
incompatible Unicode character width?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From nhodgson at bigpond.net.au  Mon Jul  2 04:52:45 2001
From: nhodgson at bigpond.net.au (Neil Hodgson)
Date: Mon, 2 Jul 2001 12:52:45 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>              <00dd01c1022d$c61e4160$0acc8490@neil>  <200107011344.f61DiTM03548@odiug.digicool.com>
Message-ID: <01ea01c102a2$128491c0$0acc8490@neil>

Guido van Rossum:

> >    This wasn't usefully true in the past for DBCS strings and is
> > not the right way to think of either narrow or wide strings
> > now. The idea that strings are arrays of characters gets in
> > the way of dealing with many encodings and is the primary
> > difficulty in localising software for Japanese.
>
> Can you explain the kind of problems encountered in some more detail?

   Programmers used to working with character == indexable code unit will
often split double wide characters when performing an action. For example
searching for a particular double byte character "bc" may match "abcd"
incorrectly where "ab" and "cd" are the characters. DBCS is not normally
self synchronising although UTF-8 is. Another common problem is counting
characters, for example when filling a line, hitting the line width and
forcing half a character onto the next line.
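
The false match can be reproduced with plain bytes. This is a toy sketch, not a real DBCS codec: it assumes every character is exactly two bytes wide (real DBCS encodings mix one- and two-byte characters), and the function names are made up for the example.

```python
def naive_find(buffer, target):
    """Byte-oriented search that ignores character boundaries."""
    return buffer.find(target)


def boundary_find(buffer, target, width=2):
    """Search that only accepts matches starting on a character boundary.

    Assumes a toy encoding in which every character is `width` bytes.
    """
    start = buffer.find(target)
    while start != -1:
        if start % width == 0:
            return start
        start = buffer.find(target, start + 1)
    return -1


text = b"abcd"       # two toy characters: "ab" and "cd"
needle = b"bc"       # a third toy character that does not occur in text

print(naive_find(text, needle))      # 1: matches across the "ab"/"cd" boundary
print(boundary_find(text, needle))   # -1: no character-aligned match
```

With a real DBCS encoding the boundaries cannot be found by arithmetic on the offset; the buffer has to be scanned from a known boundary, which is what "not self synchronising" means in practice.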

> I think it's a good idea to provide a set of higher-level tools as
> well.  However nobody seems to know what these higher-level tools
> should do yet.  PEP 261 is specifically focused on getting the
> lower-level foundations right (i.e. the objects that represent arrays
> of code units), so that the authors of higher level tools will have a
> solid base.  If you want to help author a PEP for such higher-level
> tools, you're welcome!

   It's more likely I'll publish some of the low level pieces of
Scintilla/SinkWorld as a Python extension providing some of these facilities
in an editable-text class. Then we can see if anyone else finds the code
worthwhile.

   Neil





From nhodgson at bigpond.net.au  Mon Jul  2 05:00:41 2001
From: nhodgson at bigpond.net.au (Neil Hodgson)
Date: Mon, 2 Jul 2001 13:00:41 +1000
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <LNBBLJKPBEHFEDALKOLCIEMBKLAA.tim.one@home.com>
Message-ID: <020b01c102a3$2dd23440$0acc8490@neil>

Tim Peters:

> Well, they still do -- fancy editors use fancy data structures, so that,
> e.g., inserting characters at the start of the file doesn't cause a 50Mb
> memmove each time.  Response time is still important, but I'd wager
> relatively insensitive to basic character size (you need tricks that cut
> factors of 1000s off potential worst cases to give the appearance of
> instantaneous results; a factor of 2 or 4 is in the noise compared to
> what's needed regardless).

   I actually have some numbers here. Early versions of some new editor
buffer code used UCS-2 on .NET and the JVM. Moving to an 8 bit buffer saved
10-20% of execution time on the insert string, delete string and global
replace benchmarks using strings that fit into ASCII. These buffers did have
some other overhead for line management and other features but I expect
these did not affect the proportions much.

   Neil






From tim.one at home.com  Mon Jul  2 06:36:20 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 2 Jul 2001 00:36:20 -0400
Subject: [Python-Dev] RE: Python 2.1.1 release 'schedule'
In-Reply-To: <20010702001248.H8098@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENEKLAA.tim.one@home.com>

Woo hoo!

[Thomas Wouters]
> ...
> Binaries for Windows as well as source tarballs will be provided ...

Building a Windows installer isn't straightforward, so you'd better let us
do that part (e.g., you need the Wise installer program, Fred needs to
supply appropriate HTML docs for the Windows installer to zip up, Tcl/Tk has
to get unpacked and rearranged, etc).  I just checked in 2.1.1c1 changes to
the Windows part of the release21-maint tree, but the rest of it isn't in
CVS.




From thomas at xs4all.net  Mon Jul  2 08:27:24 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 2 Jul 2001 08:27:24 +0200
Subject: [Python-Dev] Re: Python 2.1.1 release 'schedule'
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEENEKLAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCEENEKLAA.tim.one@home.com>
Message-ID: <20010702082724.K32419@xs4all.nl>

On Mon, Jul 02, 2001 at 12:36:20AM -0400, Tim Peters wrote:

> [Thomas Wouters]
> > ...
> > Binaries for Windows as well as source tarballs will be provided ...

> Building a Windows installer isn't straightforward, so you'd better let us
> do that part (e.g., you need the Wise installer program, Fred needs to
> supply appropriate HTML docs for the Windows installer to zip up, Tcl/Tk has
> to get unpacked and rearranged, etc).  I just checked in 2.1.1c1 changes to
> the Windows part of the release21-maint tree, but the rest of it isn't in
> CVS.

Oh yeah, I was entirely going to let you guys do it, or at least find
another set of wintendows-weenies to do it :) That's part of why I posted
the tentative release dates.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From loewis at informatik.hu-berlin.de  Mon Jul  2 09:25:18 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 2 Jul 2001 09:25:18 +0200 (MEST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <01d601c102a0$98671580$0acc8490@neil> (nhodgson@bigpond.net.au)
References: <200107011352.PAA27645@pandora.informatik.hu-berlin.de> <01d601c102a0$98671580$0acc8490@neil>
Message-ID: <200107020725.JAA25925@pandora.informatik.hu-berlin.de>

> > > The problem I have with this PEP is that it is a compile time option
> > > which makes it hard to work with both 32 bit and 16 bit strings in
> > > one program.
> >
> > Can you elaborate why you think this is a problem?
> 
>    A common role for Python is to act as glue between various modules. If
> Paul produces some interesting code that depends on 32 bit strings and I
> want to use that in conjunction with some Win32 specific or COM dependent
> code that wants 16 bit strings then it may not be possible or may require
> difficult workarounds.

Neither. All it will require is that you recompile your Python
installation to use wide Unicode.

On Win32 APIs, this will mean that you cannot directly interpret
PyUnicode object representations as WCHAR_T pointers. This is no
problem, as you can transparently copy unicode objects into wchar_t
strings; it's a matter of coming up with a good C API for doing so
conveniently.

Regards,
Martin



From fredrik at pythonware.com  Mon Jul  2 10:20:09 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 10:20:09 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020236.OAA00432@s454.cosc.canterbury.ac.nz>
Message-ID: <03b301c102cf$e0e3dd00$0900a8c0@spiff>

greg wrote:

> I agree. This can only be a stopgap measure. Ultimately the
> Unicode type needs to be made smarter.

PIL uses 8 bits per pixel to store bilevel images, and 32 bits
per pixel to store 16- and 24-bit images.

back in 1995, some people claimed that the image type had
to be made smarter to be usable.  these days, nobody ever
notices...

</F>





From fredrik at pythonware.com  Mon Jul  2 10:08:10 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 10:08:10 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com> <00dd01c1022d$c61e4160$0acc8490@neil>
Message-ID: <03b201c102cf$e0dab540$0900a8c0@spiff>

Neil Hodgson wrote:
> > u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> > character.
>
>    This wasn't usefully true in the past for DBCS strings and is not the
> right way to think of either narrow or wide strings now. The idea that
> strings are arrays of characters gets in the way

if you stop confusing binary buffers with text strings, all such
problems will go away.

</F>





From mal at egenix.com  Mon Jul  2 11:39:55 2001
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 11:39:55 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>
Message-ID: <3B40416B.6438D1F7@egenix.com>

Greg Ewing wrote:
> 
> > It so happened that the Unicode support was written to make it very
> > easy to change the compile-time code unit size
> 
> What about extension modules that deal with Unicode strings?
> Will they have to be recompiled too? If so, is there anything
> to detect an attempt to import an extension module with an
> incompatible Unicode character width?

That's a good question ! 

The answer is: yes, extensions which use Unicode will have to
be recompiled for narrow and wide builds of Python. The question
is however, how to detect cases where the user imports an
extension built for narrow Python into a wide build and
vice versa.

The standard way of looking at the API level won't help. We'd
need some form of introspection API at the C level... hmm,
perhaps looking at the sys module will do the trick for us ?!

In any case, this is certainly going to cause trouble one
of these days...
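
A check along the "look at the sys module" line might look roughly like the sketch below. It assumes the sys.maxunicode value described in PEP 261; the built_for tag and both function names are hypothetical, since nothing in this thread defines how an extension would record the width it was compiled for. On current Python 3 interpreters sys.maxunicode is always 0x10FFFF, so the narrow case is exercised by passing the value explicitly.

```python
import sys

NARROW_MAX = 0xFFFF       # sys.maxunicode on a narrow build, per PEP 261
WIDE_MAX = 0x10FFFF       # sys.maxunicode on a wide build (TOPCHAR)


def unicode_width(maxunicode=None):
    """Classify an interpreter as 'narrow' or 'wide' from sys.maxunicode."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    if maxunicode == NARROW_MAX:
        return "narrow"
    if maxunicode == WIDE_MAX:
        return "wide"
    raise ValueError("unexpected sys.maxunicode: %#x" % maxunicode)


def check_extension(built_for, maxunicode=None):
    """Refuse an extension whose (hypothetical) width tag does not match."""
    running = unicode_width(maxunicode)
    if built_for != running:
        raise ImportError(
            "extension built for %s Unicode, interpreter is %s"
            % (built_for, running)
        )
    return running


print(unicode_width())
```

This only helps extensions that cooperate by calling such a check at import time; it does nothing for a binary that was compiled against the wrong Py_UNICODE size and never asks.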

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jul  2 12:13:59 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:13:59 +0200
Subject: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" Unicode 
 characters
References: <3B3F8095.8D58631D@ActiveState.com>
Message-ID: <3B404967.14FE180F@lemburg.com>

Paul Prescod wrote:
> 
> PEP: 261
> Title: Support for "wide" Unicode characters
> Version: $Revision: 1.3 $
> Author: paulp at activestate.com (Paul Prescod)
> Status: Draft
> Type: Standards Track
> Created: 27-Jun-2001
> Python-Version: 2.2
> Post-History: 27-Jun-2001
> 
> Abstract
> 
>     Python 2.1 unicode characters can have ordinals only up to
>     2**16 - 1.
>     This range corresponds to a range in Unicode known as the Basic
>     Multilingual Plane. There are now characters in Unicode that live
>     on other "planes". The largest addressable character in Unicode
>     has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we
>     will call this TOPCHAR and call characters in this range "wide
>     characters".
> 
> Glossary
> 
>     Character
> 
>         Used by itself, means the addressable units of a Python
>         Unicode string.

Please add: also known as "code unit".
 
>     Code point
> 
>         A code point is an integer between 0 and TOPCHAR.
>         If you imagine Unicode as a mapping from integers to
>         characters, each integer is a code point. But the
>         integers between 0 and TOPCHAR that do not map to
>         characters are also code points. Some will someday
>         be used for characters. Some are guaranteed never
>         to be used for characters.
> 
>     Codec
> 
>         A set of functions for translating between physical
>         encodings (e.g. on disk or coming in from a network)
>         into logical Python objects.
> 
>     Encoding
> 
>         Mechanism for representing abstract characters in terms of
>         physical bits and bytes. Encodings allow us to store
>         Unicode characters on disk and transmit them over networks
>         in a manner that is compatible with other Unicode software.
> 
>     Surrogate pair
> 
>         Two physical characters that represent a single logical

Eeek... two code units (or have you ever seen a physical character
walking around ;-)

>         character. Part of a convention for representing 32-bit
>         code points in terms of two 16-bit code points.
> 
>     Unicode string
> 
>           A Python type representing a sequence of code points with
>           "string semantics" (e.g. case conversions, regular
>           expression compatibility, etc.) Constructed with the
>           unicode() function.
> 
> Proposed Solution
> 
>     One solution would be to merely increase the maximum ordinal
>     to a larger value. Unfortunately the only straightforward
>     implementation of this idea is to use 4 bytes per character.
>     This has the effect of doubling the size of most Unicode
>     strings. In order to avoid imposing this cost on every
>     user, Python 2.2 will allow the 4-byte implementation as a
>     build-time option. Users can choose whether they care about
>     wide characters or prefer to preserve memory.
> 
>     The 4-byte option is called "wide Py_UNICODE". The 2-byte option
>     is called "narrow Py_UNICODE".
> 
>     Most things will behave identically in the wide and narrow worlds.
> 
>     * unichr(i) for 0 <= i < 2**16 (0x10000) always returns a
>       length-one string.
> 
>     * unichr(i) for 2**16 <= i <= TOPCHAR will return a
>       length-one string on wide Python builds. On narrow builds it will
>       raise ValueError.
> 
>         ISSUE
> 
>             Python currently allows \U literals that cannot be
>             represented as a single Python character. It generates two
>             Python characters known as a "surrogate pair". Should this
>             be disallowed on future narrow Python builds?
> 
>         Pro:
> 
>             Python already allows the construction of a surrogate pair
>             for a large unicode literal character escape sequence.
>             This is basically designed as a simple way to construct
>             "wide characters" even in a narrow Python build. It is also
>             somewhat logical considering that the Unicode-literal syntax
>             is basically a short-form way of invoking the unicode-escape
>             codec.
> 
>         Con:
> 
>             Surrogates could be easily created this way but the user
>             still needs to be careful about slicing, indexing, printing
>             etc. Therefore some have suggested that Unicode
>             literals should not support surrogates.
> 
>         ISSUE
> 
>             Should Python allow the construction of characters that do
>             not correspond to Unicode code points?  Unassigned Unicode
>             code points should obviously be legal (because they could
>             be assigned at any time). But code points above TOPCHAR are
>             guaranteed never to be used by Unicode. Should we allow
>             access to them anyhow?
> 
>         Pro:
> 
>             If a Python user thinks they know what they're doing why
>             should we try to prevent them from violating the Unicode
>             spec? After all, we don't stop 8-bit strings from
>             containing non-ASCII characters.
> 
>         Con:
> 
>             Codecs and other Unicode-consuming code will have to be
>             careful of these characters which are disallowed by the
>             Unicode specification.
> 
>     * ord() is always the inverse of unichr()
> 
>     * There is an integer value in the sys module that describes the
>       largest ordinal for a character in a Unicode string on the current
>       interpreter. sys.maxunicode is 2**16-1 (0xffff) on narrow builds
>       of Python and TOPCHAR on wide builds.
> 
>         ISSUE: Should there be distinct constants for accessing
>                TOPCHAR and the real upper bound for the domain of
>                unichr (if they differ)? There has also been a
>                suggestion of sys.unicodewidth which can take the
>                values 'wide' and 'narrow'.
> 
>     * every Python Unicode character represents exactly one Unicode code
>       point (i.e. Python Unicode Character = Abstract Unicode
>       character).
> 
>     * codecs will be upgraded to support "wide characters"
>       (represented directly in UCS-4, and as variable-length sequences
>       in UTF-8 and UTF-16). This is the main part of the implementation
>       left to be done.
> 
>     * There is a convention in the Unicode world for encoding a 32-bit
>       code point in terms of two 16-bit code points. These are known
>       as "surrogate pairs". Python's codecs will adopt this convention
>       and encode 32-bit code points as surrogate pairs on narrow Python
>       builds.
> 
>         ISSUE
> 
>             Should there be a way to tell codecs not to generate
>             surrogates and instead treat wide characters as
>             errors?
> 
>         Pro:
> 
>             I might want to write code that works only with
>             fixed-width characters and does not have to worry about
>             surrogates.
> 
>         Con:
> 
>             No clear proposal of how to communicate this to codecs.

No need to pass this information to the codec: simply write
a new one and give it a clear name, e.g. "ucs-2" will generate
errors while "utf-16-le" converts them to surrogates.
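
A rough sketch of that distinction, written for a modern Python 3
interpreter (an assumption; the thread is about Python 2.2). The function
encode_ucs2_le is hypothetical and is a plain function, not a registered
codec; "utf-16-le" is the existing standard codec.

```python
def encode_ucs2_le(text):
    """Encode text as little-endian UCS-2.

    Hypothetical helper: code points above 0xFFFF are treated as errors
    instead of being turned into surrogate pairs.
    """
    for index, char in enumerate(text):
        if ord(char) > 0xFFFF:
            raise UnicodeEncodeError(
                "ucs-2-le", text, index, index + 1,
                "code point outside the Basic Multilingual Plane")
    # For BMP-only text, UCS-2 and UTF-16 produce the same bytes.
    return text.encode("utf-16-le")


wide = "\U00010000"
print(wide.encode("utf-16-le"))   # UTF-16 emits a surrogate pair (4 bytes)
try:
    encode_ucs2_le(wide)
except UnicodeEncodeError as exc:
    print("ucs-2 refused:", exc.reason)
```

Choosing the behaviour by codec name keeps the decision out of the codec
call signature, which is the point made above.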
 
>     * there are no restrictions on constructing strings that use
>       code points "reserved for surrogates" improperly. These are
>       called "isolated surrogates". The codecs should disallow reading
>       these from files, but you could construct them using string
>       literals or unichr().
> 
> Implementation
> 
>     There is a new (experimental) define:
> 
>         #define PY_UNICODE_SIZE 2
> 
>     There is a new configure option:
> 
>         --enable-unicode=ucs2 configures a narrow Py_UNICODE, and uses
>                               wchar_t if it fits
>         --enable-unicode=ucs4 configures a wide Py_UNICODE, and uses
>                               wchar_t if it fits
>         --enable-unicode      same as "=ucs2"
> 
>     The intention is that --disable-unicode, or --enable-unicode=no
>     removes the Unicode type altogether; this is not yet implemented.
> 
>     It is also proposed that one day --enable-unicode will just
>     default to the width of your platform's wchar_t.
> 
>     Windows builds will be narrow for a while based on the fact that
>     there have been few requests for wide characters, those requests
>     are mostly from hard-core programmers with the ability to buy
>     their own Python and Windows itself is strongly biased towards
>     16-bit characters.
> 
> Notes
> 
>     This PEP does NOT imply that people using Unicode need to use a
>     4-byte encoding for their files on disk or sent over the network.
>     It only allows them to do so. For example, ASCII is still a
>     legitimate (7-bit) Unicode-encoding.
> 
>     It has been proposed that there should be a module that handles
>     surrogates in narrow Python builds for programmers. If someone
>     wants to implement that, it will be another PEP. It might also be
>     combined with features that allow other kinds of character-,
>     word- and line- based indexing.
> 
> Rejected Suggestions
> 
>     More or less the status-quo
> 
>         We could officially say that Python characters are 16-bit and
>         require programmers to implement wide characters in their
>         application logic by combining surrogate pairs. This is a heavy
>         burden because emulating 32-bit characters is likely to be
>         very inefficient if it is coded entirely in Python. Plus these
>         abstracted pseudo-strings would not be legal as input to the
>         regular expression engine.
> 
>     "Space-efficient Unicode" type
> 
>         Another class of solution is to use some efficient storage
>         internally but present an abstraction of wide characters to
>         the programmer. Any of these would require a much more complex
>         implementation than the accepted solution. For instance consider
>         the impact on the regular expression engine. In theory, we could
>         move to this implementation in the future without breaking
>         Python code. A future Python could "emulate" wide Python
>         semantics on narrow Python. Guido is not willing to undertake
>         the implementation right now.
> 
>     Two types
> 
>         We could introduce a 32-bit Unicode type alongside the 16-bit
>         type. There is a lot of code that expects there to be only a
>         single Unicode type.
> 
>     This PEP represents the least-effort solution. Over the next
>     several years, 32-bit Unicode characters will become more common
>     and that may either convince us that we need a more sophisticated
>     solution or (on the other hand) convince us that simply
>     mandating wide Unicode characters is an appropriate solution.
>     Right now the two options on the table are do nothing or do
>     this.
> 
> References
> 
>     Unicode Glossary: http://www.unicode.org/glossary/

Plus perhaps the Mark Davis paper at:

http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/
 
> Copyright
> 
>     This document has been placed in the public domain.

Good work, Paul !

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal at lemburg.com  Mon Jul  2 12:08:53 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:08:53 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <3B3BEF21.63411C4C@ActiveState.com> <3B3C95D8.518E5175@egenix.com> <3B3D2869.5C1DDCF1@ActiveState.com> <3B3DBD86.81F80D06@egenix.com> <3B3EA161.1375F74C@ActiveState.com>
Message-ID: <3B404835.4CE77C60@lemburg.com>

Paul Prescod wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> >...
> >
> > The term "character" in Python should really only be used for
> > the 8-bit strings.
> 
> Are we going to change chr() and unichr() to one_element_string() and
> unicode_one_element_string()

No. I am just suggesting to make use of the crispy clear
definitions which the Unicode Consortium has developed for us.
 
> u[i] is a character. If u is Unicode, then u[i] is a Python Unicode
> character. No Python user will find that confusing no matter how Unicode
> knuckle-dragging, mouth-breathing, wife-by-hair-dragging they are.

Except that u[i] maps to a code unit which may or may not be
a code point. Whether a code point matches a grapheme (this
is what users tend to regard as character) is yet another
story due to combining code points.
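
To make the three levels concrete, here is a small sketch for a modern
Python 3 interpreter (an assumption: there u[i] indexes code points, not
code units as on the narrow builds discussed here). The helper
utf16_code_units is hypothetical.

```python
import unicodedata


def utf16_code_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


decomposed = "e\u0301"   # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # single code point U+00E9
astral = "\U00010000"    # one code point outside the BMP

# One user-visible grapheme, but two code points until it is normalized.
print(len(decomposed), len(composed))

# One code point, but two code units in a 16-bit representation.
print(len(astral), utf16_code_units(astral))
```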

> > In Unicode a "character" can mean any of:
> 
> Mark Davis said that "people" can use the word to mean any of those
> things. He did not say that it was imprecisely defined in Unicode.
> Nevertheless I'm not using the Unicode definition anymore than our
> standard library uses an ancient Greek definition of integer. Python has
> a concept of integer and a concept of character.

Ok, I'll stop whining. Just as final remark, let me say that
our little discussion is a perfect example of how people can
misunderstand each other by using the terms in different ways
(Kant tried to solve this for Philosophy and did not succeed;
so I guess the Unicode Consortium doesn't stand a chance 
either ;-)
 
> > >     It has been proposed that there should be a module for working
> > >     with UTF-16 strings in narrow Python builds through some sort of
> > >     abstraction that handles surrogates for you. If someone wants
> > >     to implement that, it will be another PEP.
> >
> > Uhm, narrow builds don't support UTF-16... it's UCS-2 which
> > is supported (basically: store everything in range(0x10000));
> > the codecs can map code points to surrogates, but it is solely
> > their responsibility and the responsibility of the application
> > using them to take care of dealing with surrogates.
> 
> The user can view the data as UCS-2, UTF-16, Base64, ROT-13, XML, ....
> Just as we have a base64 module, we could have a UTF-16 module that
> interprets the data in the string as UTF-16 and does surrogate
> manipulation for you.
> 
> Anyhow, if any of those is the "real" encoding of the data, it is
> UTF-16. After all, if the codec reads in four non-BMP characters in,
> let's say, UTF-8, we represent them as 8 narrow-build Python characters.
> That's the definition of UTF-16! But it's easy enough for me to take
> that word out so I will.

u[i] gives you a code unit and whether this maps to a code point
or not is dependent on the implementation which in turn depends
on the narrow/wide choice.

In UCS-2, I believe, surrogates are regarded as two code points;
in UTF-16 they always have to come in pairs. There's a semantic
difference here which is for the codecs and these additional
tools to be aware of -- not the Unicode type implementation.
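
As a sketch of the check such a codec or tool would need under the
UTF-16 reading, the following hypothetical function works on a plain
sequence of 16-bit integers and reports whether every surrogate is
properly paired. Under the UCS-2 reading no such check applies.

```python
def is_well_formed_utf16(units):
    """Return True if every surrogate in a sequence of 16-bit code units
    is part of a high/low pair (hypothetical helper)."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: must be followed by a low surrogate.
            if i + 1 >= len(units) or not (0xDC00 <= units[i + 1] <= 0xDFFF):
                return False
            i += 2
        elif 0xDC00 <= unit <= 0xDFFF:
            # Low surrogate with no preceding high surrogate.
            return False
        else:
            i += 1
    return True


print(is_well_formed_utf16([0x41, 0xD800, 0xDC00]))  # paired
print(is_well_formed_utf16([0x41, 0xD800]))          # isolated surrogate
```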

> >...
> > Also, the module will be useful for both narrow and wide builds,
> > since the notion of an encoded character can involve multiple code
> > points. In that sense Unicode is always a variable length
> > encoding for characters and that's the application field of
> > this module.
> 
> I wouldn't advise that you do all different types of normalization in a
> single module but I'll wait for your PEP.

I'll see if I find some time at the Bordeaux Python Meeting
next week.
 
> > Here's the adjusted text:
> >
> >      It has been proposed that there should be a module for working
> >      with Unicode objects using character-, word- and line- based
> >      indexing. The details of the implementation are left to
> >      another PEP.
> 
>      It has been proposed that there should be a module that handles
>      surrogates in narrow Python builds for programmers. If someone
>      wants to implement that, it will be another PEP. It might also be
>      combined with features that allow other kinds of character-,
>      word- and line- based indexing.

Hmm, I liked my version better, but what the heck ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/





From mal at lemburg.com  Mon Jul  2 12:43:38 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:43:38 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com>  
	            <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>
Message-ID: <3B40505A.2F03EEC4@lemburg.com>

Guido van Rossum wrote:
> 
> Hi Marc-Andre,
> 
> I'm dropping the i18n-sig from the distribution list.
> 
> I hear you:
> 
> > You didn't get my point. I feel responsible for the Unicode
> > implementation design and would like to see it become a continued
> > success.
> 
> I'm sure we all share this goal!
> 
> > In that sense and taking into account that I am the
> > maintainer of all this stuff, I think it is very reasonable to
> > ask me before making any significant changes to the implementation
> > and also respect any comments I put forward.
> 
> I understand you feel that we've rushed this in without waiting for
> your comments.
> 
> Given how close your implementation was, I still feel that the changes
> weren't that significant, but I understand that you get nervous.  If
> Christian were to check in his speed hack  changes to the guts of
> ceval.c I would be nervous too!  (Heck, I got nervous when Eric
> checked in his library-wide string method changes without asking.)
> 
> Next time I'll try to be more sensitive to situations that require
> your review before going forward.

Good.
 
> > Currently, I have to watch the checkins list very closely
> > to find out who changed what in the implementation and then to
> > take actions only after the fact. Since I'm not supporting Unicode
> > as my full-time job this is simply impossible. We have the SF manager
> > and there is really no need to rush anything around here.
> 
> Hm, apart from the fact that you ought to be left in charge, I think
> that in this case the live checkins were a big win over the usual SF
> process.  At least two people were making changes, sometimes to each
> other's code, and many others on at least three continents were
> checking out the changes on many different platforms and immediately
> reporting problems.  We would definitely not have a patch as solid as
> the code that's now checked in, after two days of using SF!  (We
> could've used a branch, but I've found that getting people to actually
> check out the branch is not easy.)

True, but I was thinking of the concept and design questions
which should be resolved *before* taking the direct checkin 
approach.
 
> So I think that the net result was favorable.  Sometimes you just have
> to let people work in the spur of the moment to get the results of
> their best thinking, otherwise they lose interest or their train of
> thought.

Understood, but then I'd like to at least receive a summary
of the changes in some way, so that I continue to understand
how the implementation works after the checkins and which
corners to keep in mind for future additions, changes, etc.
 
> > If I am offline or too busy with other things for a day or two,
> > then I want to see patches on SF and not find new versions of
> > the implementation already checked in.
> 
> That's still the general rule, but in our enthusiasm (and mine was
> definitely part of this!) we didn't want to wait.  Also, I have to
> admit that I mistook your silence for consent -- I didn't think the
> main proposed changes (making the size of Py_UNICODE a config choice)
> were controversial at all, so I didn't realize you would have a problem
> with it.

I don't have a problem with it; I was just seeing things
slip through my fingers and getting worried about this.
 
> > This has worked just fine during the last year, so I can only explain
> > the latest actions in this direction with an urge to bypass my comments
> > and any discussion this might cause.
> 
> I think you're projecting your own stuff here. 

Not really. I have processed many patches on SF, gave comments
etc. and did the final checkin. This has worked great over
the last months and I intend to keep working this way since
it is by far the best way to both manage and document the
issues and questions which arise during the process.

E.g. I'm currently processing a patch by Walter Dörwald
which adds support for callback error handlers. He has done
some great work there which was the result of many lively
discussions. Working like this is fun while staying
manageable at the same time... and again, there's really no
need to rush things !

> I honestly didn't
> think there was much disagreement on your part and thought we were
> doing you a favor by implementing the consensus.  IMO, Martin and
> Fredrik are familiar enough with both the code and the issues to do a
> good job.

Well, the above was my interpretation of how things went. 
I may have been wrong (and honestly do hope that I am wrong),
but my gut feeling simply said: hey, what are these guys doing
there... is this some kind of 
 
> > Needless to say that
> > quality control is not possible anymore.
> 
> Unclear.  Lots of other people looked over the changes in your
> absence.  And CVS makes code review after it's checked in easy enough.
> (Hey, in many other open source projects that's the normal procedure
> once the rough characteristics of a feature have been agreed upon:
> check in first and review later!)

That was not my point: quality control also includes checking
the design approach. This is something which should normally
be done in design/implementation/design/... phases -- just like 
I worked with you on the Unicode implementation late in 1999.
 
> > Conclusion:
> > I am not going to continue this work if this does not change.
> 
> That would be sad, and I hope you will stay with us.  We certainly
> don't plan to ignore your comments!
> 
> > Another problem for me is the continued hostility I feel on i18n
> > against parts of the design and some of my decisions. I am
> > not talking about your feedback and the feedback from many other
> > people on the list which was excellent and to high standards.
> > But reading the postings of the last few months you will
> > find notices of what I am referring to here (no, I don't want
> > to be specific).
> 
> I don't know what to say about this, and obviously nobody has the time
> to go back and read the archives.  I'm sure it's not you as a person
> that was attacked.  If the design isn't perfect -- and hey, since
> Python is the 80 percent language, few things in it are quite perfect!
> -- then (positive) criticism is an attempt to help, to move it closer
> to perfection.
> 
> If people have at times said "the Unicode support sucks", well, that
> may hurt.  You can't always stay friends with everybody.  I get flames
> occasionally for features in Python that folks don't like.  I get used
> to them, and it doesn't affect my confidence any more.  Be the same!

I'll try.
 
> But sometimes, after saying "it sucks", people make specific
> suggestions for improvements, and it's important to be open for those
> even from sources that use offending language.  (Within reason, of
> course.  I don't ask you to listen to somebody who is persistently
> hostile to you as a person.)

Ok.
 
> > If people don't respect my comments or decision, then how can
> > I defend the design and how can I stop endless discussions which
> > simply don't lead anywhere ? So either I am missing something
> > or there is a need for a clear statement from you about
> > my status in all this.
> 
> Do you really *want* to be the Unicode BDFL?  Being something's BDFL is a
> full-time job, and you've indicated you're too busy.  (Or is that
> temporary?)

I am currently doing a lot of consulting work, so things sometimes
tighten up and are less work intense at other times. Given
this setup, I think that I will be able to play the BD (without
the FL) for Unicode for some time. I will certainly pass on the
flag to someone else if I find myself not spending enough
time on it.

The only thing I'm asking for, is some more professional
work mentality at times. If people make it hard for me to follow
the development, then I cannot manage this task in a satisfying
way.

> I see you as the original coder, which means that you know that
> section of the code better than anyone, and whenever there's a
> question that others can't answer about its design, implementation, or
> restrictions, I refer to you.  But given that you've said you wouldn't
> be able to work much on it, I welcome contributions by others as long
> as they seem knowledgeable.

Same here.
 
> > If I don't have the right to comment on proposals and patches,
> > possibly even rejecting them, then I simply don't see any
> > ground for keeping the implementation in a state which I can
> > maintain.
> 
> Nobody said you couldn't comment, and you know that.

If I don't get a chance to comment on a summary of changes
(be it before or after a batch of checkins), how am I
supposed to follow up on them ? Keeping a close eye
on the checkin mailing list doesn't help: it simply doesn't
always give you the big picture.

We are all professional quality programmers and I respect
Fredrik and Martin for their coding quality and ideas. What
I am asking for is some more teamwork.

> When it comes to rejecting or accepting, I feel that I am still the
> final arbiter, even for Unicode, until I get hit by a bus.  Since I
> don't always understand the implementation or the issues, I'll of
> course defer to you in cases where I think I can't make the decision,
> but I do reserve the right to be convinced by others to override your
> judgement, occasionally, if there's a good reason.  And when you're
> not responsive, I may try to channel you.  (I'll try to be more
> explicit about that.)

That's perfectly OK (and indeed can be very useful at times).
 
> > And last but not least: The fun-factor has faded which was
> > the main motor driving me into working on Unicode in the first
> > place. Nothing much you can do about this, though :-/
> 
> Yes, that happens to all of us at times.  The fun factor goes up and
> down, and sometimes we must look for fun elsewhere for a while.  Then
> the fun may come back where it appeared lost.  Go on vacation, read a
> book, tackle a new project in a totally different area!  Then come
> back and see if you can find some fun in the old stuff again.

I'll visit the Bordeaux Python conference later this week. That should
give me some time to breathe (and hopefully to write some more
PEPs :=).
 
> > > Paul Prescod offered to write a PEP on this issue.  My cynical half
> > > believes that we'll never hear from him again, but my optimistic half
> > > hopes that he'll actually write one, so that we'll be able to discuss
> > > the various issues for the users with the users.  I encourage you to
> > > co-author the PEP, since you have a lot of background knowledge about
> > > the issues.
> >
> > I guess your optimistic half won :-) I think Paul already did all the
> > work, so I'll simply comment on what he wrote.
> 
> Your suggestions were very valuable.  My opinion of Paul also went up
> a notch!
> 
> > > BTW, I think that Misc/unicode.txt should be converted to a PEP, for
> > > the historic record.  It was very much a PEP before the PEP process
> > > was invented.  Barry, how much work would this be?  No editing needed,
> > > just formatting, and assignment of a PEP number (the lower the better).
> >
> > Thanks for converting the text to PEP format, Barry.
> >
> > Thanks for reading this far,
> 
> You're welcome, and likewise.
> 
> Just one more thing, Marc-Andre.  Please know that I respect your work
> very much even if we don't always agree.  We would get by without you,
> but Python would be hurt if you turned your back on us.

Thanks. Be assured that I'll stay around for quite some time --
you won't get by that easily ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/




From mal at lemburg.com  Mon Jul  2 12:56:00 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 12:56:00 +0200
Subject: [Python-Dev] Bordeaux Python Meeting 04.07.-07.07.
Message-ID: <3B405340.31C5AA11@lemburg.com>

Hi everybody,

I think nobody has posted an announcement for the conference
yet, so I'll at least provide a pointer:

	http://www.lsm.abul.org/program/topic19/

Marc Poinot, who also organized the "First Python Day" in France,
is chair of this subtopic at the "Debian One" conference in
Bordeaux:

	http://www.lsm.abul.org/

Cheers,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/



From fredrik at pythonware.com  Mon Jul  2 13:41:51 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 13:41:51 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com>              <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com> <3B40505A.2F03EEC4@lemburg.com>
Message-ID: <001e01c102eb$fe4995d0$4ffa42d5@hagrid>

mal wrote:

> The only thing I'm asking for, is some more professional
> work mentality at times.

for the record, your recent posts under this subject don't strike
me as very professional.

think about it.

</F>




From paulp at ActiveState.com  Mon Jul  2 16:25:55 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 02 Jul 2001 07:25:55 -0700
Subject: [I18n-sig] Re: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide" 
 Unicodecharacters
References: <3B3F8095.8D58631D@ActiveState.com> <3B404967.14FE180F@lemburg.com>
Message-ID: <3B408473.77AB6C8@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> >     Character
> >
> >         Used by itself, means the addressable units of a Python
> >         Unicode string.
> 
> Please add: also known as "code unit".

I'm not entirely comfortable with that. As you yourself pointed out, the
same Python Unicode object can be interpreted as either a series of
single-width code points *or* as a UTF-16 string where the characters
are code units. You could also interpret it as a BASE64'd region or an
XML document... It all depends on how you look at it.

> ....
> >     Surrogate pair
> >
> >         Two physical characters that represent a single logical
> 
> Eeek... two code units (or have you ever seen a physical character
> walking around ;-)

No, that's sort of my point. The user can decide to adopt the convention
of looking at the two characters as code units or they can ignore that
interpretation and look at them as two code points. It's all relative,
man. Dig it? That's why I use the word "convention" below:

> >         character. Part of a convention for representing 32-bit
> >         code points in terms of two 16-bit code points.

"Surrogates are all in your head. Python doesn't know or care about
them!"

I'll change this to:

    Surrogate pair

        Two Python Unicode characters that represent a single logical
        Unicode code point. Part of a convention for representing
        32-bit code points in terms of two 16-bit code points. Python
        has limited support for reading, writing and constructing
        strings that use this convention (described below). Otherwise
        Python ignores the convention.
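
The convention itself is simple arithmetic. A minimal sketch follows,
using the standard UTF-16 offsets (0x10000, 0xD800, 0xDC00); the function
names are hypothetical and not part of the PEP.

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into (high, low) 16-bit surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000            # 20 significant bits
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def from_surrogate_pair(high, low):
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x10000)
print(hex(high), hex(low))                   # 0xd800 0xdc00
print(hex(from_surrogate_pair(high, low)))   # 0x10000
```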

> No need to pass this information to the codec: simply write
> a new one and give it a clear name, e.g. "ucs-2" will generate
> errors while "utf-16-le" converts them to surrogates.

That's a good point, but what if I want a UTF-8 codec that doesn't
generate surrogates? Or even a UCS4 one?

> Plus perhaps the Mark Davis paper at:
> 
> http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/

Okay.

> > Copyright
> >
> >     This document has been placed in the public domain.
> 
> Good work, Paul !

Thanks for your help. You did help me to clarify many things even though
I argued with you as I was doing it. 
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From guido at digicool.com  Mon Jul  2 17:23:56 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 02 Jul 2001 11:23:56 -0400
Subject: [Python-Dev] Unicode Maintenance
In-Reply-To: Your message of "Mon, 02 Jul 2001 12:43:38 +0200."
             <3B40505A.2F03EEC4@lemburg.com> 
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>  
            <3B40505A.2F03EEC4@lemburg.com> 
Message-ID: <200107021523.f62FNun01807@odiug.digicool.com>

Thanks for your response, Marc-Andre.  I'd like to close this topic
now.  I'm not sure how to get you a "summary of changes", but I think
you can ask Fredrik directly (Martin announced he's away on vacation).

One thing you can do is pipe the output of "cvs log" through
tools/scripts/logmerge.py -- this gives you the checkin messages in
(reverse?) chronological order.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jul  2 17:29:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 02 Jul 2001 11:29:39 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: Your message of "Mon, 02 Jul 2001 11:39:55 +0200."
             <3B40416B.6438D1F7@egenix.com> 
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>  
            <3B40416B.6438D1F7@egenix.com> 
Message-ID: <200107021529.f62FTdx01823@odiug.digicool.com>

> Greg Ewing wrote:
> > 
> > > It so happened that the Unicode support was written to make it very
> > > easy to change the compile-time code unit size
> > 
> > What about extension modules that deal with Unicode strings?
> > Will they have to be recompiled too? If so, is there anything
> > to detect an attempt to import an extension module with an
> > incompatible Unicode character width?
> 
> That's a good question ! 
> 
> The answer is: yes, extensions which use Unicode will have to
> be recompiled for narrow and wide builds of Python. The question
> is however, how to detect cases where the user imports an
> extension built for narrow Python into a wide build and
> vice versa.
> 
> The standard way of looking at the API level won't help. We'd
> need some form of introspection API at the C level... hmm,
> perhaps looking at the sys module will do the trick for us ?!
> 
> In any case, this is certainly going to cause trouble one
> of these days...

Here are some alternative ways to deal with this:

(1) Use the preprocessor to rename all the Unicode APIs to get "Wide"
    appended to their name in wide mode.  This makes any use of a
    Unicode API in an extension compiled for the wrong Py_UNICODE_SIZE
    fail with a link-time error.  (Which should cause an ImportError
    for shared libraries.)

(2) Ditto but only rename the PyModule_Init function.  This is much
    less work but more coarse: a module that doesn't use any Unicode
    APIs (and I expect these will be a large majority) still would not
    be accepted.

(3) Change the interpretation of PYTHON_API_VERSION so that a low bit
    of '1' means wide Unicode.  Then you only get a warning (followed
    by a core dump when actually trying to use Unicode).

I mentioned (1) and (3) in an earlier post.
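
A minimal sketch of the idea behind option (3), written in Python for
brevity rather than in C.  The constants and both function names are
invented for illustration; they are not real CPython values or APIs.

```python
# Hypothetical placeholder, not a real PYTHON_API_VERSION value.
BASE_API_VERSION = 1010
# Assumption for this sketch: the lowest bit flags a wide Unicode build.
WIDE_UNICODE_BIT = 1


def api_version(wide_unicode):
    """Return the version number an interpreter or extension would carry."""
    flag = WIDE_UNICODE_BIT if wide_unicode else 0
    return (BASE_API_VERSION & ~WIDE_UNICODE_BIT) | flag


def check_extension(ext_version, interp_version):
    """Return a warning string on mismatch, or None if the two agree."""
    if (ext_version ^ interp_version) & WIDE_UNICODE_BIT:
        return "warning: extension built for a different Unicode width"
    if ext_version != interp_version:
        return "warning: API version mismatch"
    return None


print(check_extension(api_version(False), api_version(True)))
```

As noted for option (3), a check like this can only warn; it does not stop
the mismatched extension from being loaded.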

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at beowolf.digicool.com  Mon Jul  2 17:37:45 2001
From: fdrake at beowolf.digicool.com (Fred Drake)
Date: Mon,  2 Jul 2001 11:37:45 -0400 (EDT)
Subject: [Python-Dev] [maintenance doc updates]
Message-ID: <20010702153745.B304B28929@beowolf.digicool.com>

The development version of the documentation has been updated:

	http://python.sourceforge.net/maint-docs/


Updated to reflect the current state of the Python 2.1.1 maintenance
release branch.




From mal at lemburg.com  Mon Jul  2 18:51:58 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 18:51:58 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>  
	            <3B40416B.6438D1F7@egenix.com> <200107021529.f62FTdx01823@odiug.digicool.com>
Message-ID: <3B40A6AE.EDE30857@lemburg.com>

Guido van Rossum wrote:
> 
> > Greg Ewing wrote:
> > >
> > > > It so happened that the Unicode support was written to make it very
> > > > easy to change the compile-time code unit size
> > >
> > > What about extension modules that deal with Unicode strings?
> > > Will they have to be recompiled too? If so, is there anything
> > > to detect an attempt to import an extension module with an
> > > incompatible Unicode character width?
> >
> > That's a good question !
> >
> > The answer is: yes, extensions which use Unicode will have to
> > be recompiled for narrow and wide builds of Python. The question
> > is however, how to detect cases where the user imports an
> > extension built for narrow Python into a wide build and
> > vice versa.
> >
> > The standard way of looking at the API level won't help. We'd
> > need some form of introspection API at the C level... hmm,
> > perhaps looking at the sys module will do the trick for us ?!
> >
> > In any case, this is certainly going to cause trouble one
> > of these days...
> 
> Here are some alternative ways to deal with this:
> 
> (1) Use the preprocessor to rename all the Unicode APIs to get "Wide"
>     appended to their name in wide mode.  This makes any use of a
>     Unicode API in an extension compiled for the wrong Py_UNICODE_SIZE
>     fail with a link-time error.  (Which should cause an ImportError
>     for shared libraries.)
>
> (2) Ditto but only rename the PyModule_Init function.  This is much
>     less work but more coarse: a module that doesn't use any Unicode
>     APIs (and I expect these will be a large majority) still would not
>     be accepted.
> 
> (3) Change the interpretation of PYTHON_API_VERSION so that a low bit
>     of '1' means wide Unicode.  Then you only get a warning (followed
>     by a core dump when actually trying to use Unicode).
>
> I mentioned (1) and (3) in an earlier post.

(4) Add a feature flag to PyModule_Init() which then looks up the
    features in the sys module and uses this as basis for
    processing the import request.

In this case, I think that (1) would be the best solution,
since old code will notice the change in width too.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/



From paulp at ActiveState.com  Mon Jul  2 20:15:41 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 02 Jul 2001 11:15:41 -0700
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <200107020249.OAA00439@s454.cosc.canterbury.ac.nz>  
		            <3B40416B.6438D1F7@egenix.com> <200107021529.f62FTdx01823@odiug.digicool.com> <3B40A6AE.EDE30857@lemburg.com>
Message-ID: <3B40BA4D.9C85A202@ActiveState.com>

"M.-A. Lemburg" wrote:
> 
>...
> 
> (4) Add a feature flag to PyModule_Init() which then looks up the
>     features in the sys module and uses this as basis for
>     processing the import request.

Could an extension be carefully written so that a single binary could be
compatible with both types of Python build? I'm thinking that it would
pass data buffers with the "right width" based on checking a runtime
flag...
-- 
Take a recipe. Leave a recipe.  
Python Cookbook!  http://www.ActiveState.com/pythoncookbook



From just at letterror.com  Mon Jul  2 20:20:38 2001
From: just at letterror.com (Just van Rossum)
Date: Mon,  2 Jul 2001 20:20:38 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B40BA4D.9C85A202@ActiveState.com>
Message-ID: <20010702202041-r01010600-d5c62b95@213.84.27.177>

Paul Prescod wrote:

> Could an extension be carefully written so that a single binary could be
> compatible with both types of Python build? I'm thinking that it would
> pass data buffers with the "right width" based on checking a runtime
> flag...

But then it would also be compatible with a unicode object using different
internal storage units per string, so I'm sure this is a dead end ;-)

Just



From mal at lemburg.com  Mon Jul  2 20:59:06 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 20:59:06 +0200
Subject: [Python-Dev] Support for "wide" Unicode characters
References: <20010702202041-r01010600-d5c62b95@213.84.27.177>
Message-ID: <3B40C47A.94317663@lemburg.com>

Just van Rossum wrote:
> 
> Paul Prescod wrote:
> 
> > Could an extension be carefully written so that a single binary could be
> > compatible with both types of Python build? I'm thinking that it would
> > pass data buffers with the "right width" based on checking a runtime
> > flag...
> 
> But then it would also be compatible with a unicode object using different
> internal storage units per string, so I'm sure this is a dead end ;-)

Agreed :-)

Extension writers will have to provide two versions of the binary.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jul  2 21:12:45 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 02 Jul 2001 21:12:45 +0200
Subject: [I18n-sig] Re: [Python-Dev] PEP 261, Rev 1.3 - Support for "wide"
 Unicode characters
References: <3B3F8095.8D58631D@ActiveState.com> <3B404967.14FE180F@lemburg.com> <3B408473.77AB6C8@ActiveState.com>
Message-ID: <3B40C7AD.F2646D56@lemburg.com>

Paul Prescod wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> >...
> > >     Character
> > >
> > >         Used by itself, means the addressable units of a Python
> > >         Unicode string.
> >
> > Please add: also known as "code unit".
> 
> I'm not entirely comfortable with that. As you yourself pointed out, the
> same Python Unicode object can be interpreted as either a series of
> single-width code points *or* as a UTF-16 string where the characters
> are code units. You could also interpret it as a BASE64'd region or an
> XML document... It all depends on how you look at it.

Well, that's what code unit tries to capture too: it's the basic storage
unit used by the implementation for storing characters. Never mind, it's
just a detail...
 
> > ....
> > >     Surrogate pair
> > >
> > >         Two physical characters that represent a single logical
> >
> > Eeek... two code units (or have you ever seen a physical character
> > walking around ;-)
> 
> No, that's sort of my point. The user can decide to adopt the convention
> of looking at the two characters as code units or they can ignore that
> interpretation and look at them as two code points. It's all relative,
> man. Dig it? That's why I use the word "convention" below:

Ok.
 
> > >         character. Part of a convention for representing 32-bit
> > >         code points in terms of two 16-bit code points.
> 
> "Surrogates are all in your head. Python doesn't know or care about
> them!"
> 
> I'll change this to:
> 
>     Surrogate pair
> 
>         Two Python Unicode characters that represent a single logical
>         Unicode code point. Part of a convention for representing
>         32-bit code points in terms of two 16-bit code points. Python
>         has limited support for reading, writing and constructing
>         strings that use this convention (described below). Otherwise
>         Python ignores the convention.

Good.
 
> > No need to pass this information to the codec: simply write
> > a new one and give it a clear name, e.g. "ucs-2" will generate
> > errors while "utf-16-le" converts them to surrogates.
> 
> That's a good point, but what if I want a UTF-8 codec that doesn't
> generate surrogates? Or even a UCS4 one?

With Walter's patch for callback error handlers, you should be able to
provide handlers which implement whatever you see fit. 
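
For illustration only: a callback mechanism of this kind exists in current
Python as codecs.register_error.  The sketch below assumes Python 3 and uses
a made-up handler name; it shows a handler deciding what to emit for a
character the target encoding cannot represent.  It is not necessarily how
the patch under discussion worked.

```python
import codecs


def hex_escape(exc):
    """Replace each unencodable character with a <U+XXXX> marker."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("<U+%04X>" % ord(ch) for ch in bad)
    # Return the replacement text and the position where encoding resumes.
    return replacement, exc.end


# "hex-escape-sketch" is a hypothetical name chosen for this example.
codecs.register_error("hex-escape-sketch", hex_escape)

encoded = "a\U00010000b".encode("ascii", "hex-escape-sketch")
print(encoded)
```

A handler could just as well raise an error or substitute something else,
which is the kind of per-application policy asked about above.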
 
I think that codecs should work the same on all platforms and always
apply the needed conversion for the platform in question; could be wrong
though... it's really only a minor issue.

> > Plus perhaps the Mark Davis paper at:
> >
> > http://www-106.ibm.com/developerworks/unicode/library/utfencodingforms/
> 
> Okay.
> 
> > > Copyright
> > >
> > >     This document has been placed in the public domain.
> >
> > Good work, Paul !
> 
> Thanks for your help. You did help me to clarify many things even though
> I argued with you as I was doing it.

Thank you for taking the suggestions into account.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/



From fredrik at pythonware.com  Mon Jul  2 21:41:33 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 2 Jul 2001 21:41:33 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com>
Message-ID: <013101c1032f$022770d0$4ffa42d5@hagrid>

guido wrote: 
> I'm not sure how to get you a "summary of changes", but I think you
> can ask Fredrik directly (Martin announced he's away on vacation).

summary:

- portability: made unicode object behave properly also if
  sizeof(Py_UNICODE) > 2 and >= sizeof(long) (FL)
- same for unicode codecs and the unicode database (MvL)
- base unicode feature selection on unicode defines, not platform (FL)
- wrap surrogate handling in #ifdef Py_UNICODE_WIDE (MvL, FL)
- tweaked unit tests to work with wide unicode, by replacing explicit
  surrogates with \U escapes (MvL)
- configure options for narrow/wide unicode (MvL)
- removed bogus const and register from some scalars (GvR, FL)
- default unicode configuration for PC (Tim, FL)
- default unicode configuration for Mac (Jack)
- added sys.maxunicode (MvL)
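
As a sketch of how Python code could use the new attribute to tell the two
builds apart (the function name is hypothetical; on Python 3.3 and later
every build reports 0x10FFFF, so only the "wide" branch is taken there):

```python
import sys


def unicode_build_width():
    """Return "wide" if this interpreter stores code points above U+FFFF
    in a single string item, otherwise "narrow"."""
    return "wide" if sys.maxunicode > 0xFFFF else "narrow"


print(unicode_build_width(), hex(sys.maxunicode))
```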

most changes were really trivial (e.g. ~0xFC00 => 0x3FF). martin's
big patch was reviewed and tested by both me and him before checkin
(tim managed to check out and build before I'd gotten around to checking
in my windows tweaks, but that's what makes distributed egoless
development so fun ;-)
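
One plausible reading of why a change like ~0xFC00 => 0x3FF matters once
Py_UNICODE can be wider than 16 bits, sketched in Python by simulating C's
fixed-width complement with explicit masks (the helper name is invented for
this example):

```python
def c_complement(value, bits):
    """Simulate C's ~value for an unsigned integer of the given width."""
    return ~value & ((1 << bits) - 1)


narrow_mask = c_complement(0xFC00, 16)  # only the low 10 bits survive
wide_mask = c_complement(0xFC00, 32)    # bits above bit 15 survive as well

ch = 0x1DC37  # a value that does not fit in 16 bits
print(hex(ch & narrow_mask), hex(ch & wide_mask), hex(ch & 0x3FF))
```

With a 16-bit unit the two spellings select the same bits; with a wider unit
only the explicit 0x3FF still extracts just the low 10 bits.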

</F>




From greg at cosc.canterbury.ac.nz  Tue Jul  3 02:20:37 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 03 Jul 2001 12:20:37 +1200 (NZST)
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <03b301c102cf$e0e3dd00$0900a8c0@spiff>
Message-ID: <200107030020.MAA00584@s454.cosc.canterbury.ac.nz>

Fredrik Lundh <fredrik at pythonware.com>:

> back in 1995, some people claimed that the image type had
> to be made smarter to be usable.

But at least you can use more than one depth of
image in the same program...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From mal at lemburg.com  Tue Jul  3 10:31:50 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 03 Jul 2001 10:31:50 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid>
Message-ID: <3B4182F6.DAC4C1@lemburg.com>

Fredrik Lundh wrote:
> 
> guido wrote:
> > I'm not sure how to get you a "summary of changes", but I think you
> > can ask Fredrik directly (Martin announced he's away on vacation).
> 
> summary:
> 
> - portability: made unicode object behave properly also if
>   sizeof(Py_UNICODE) > 2 and >= sizeof(long) (FL)
> - same for unicode codecs and the unicode database (MvL)
> - base unicode feature selection on unicode defines, not platform (FL)
> - wrap surrogate handling in #ifdef Py_UNICODE_WIDE (MvL, FL)
> - tweaked unit tests to work with wide unicode, by replacing explicit
>   surrogates with \U escapes (MvL)
> - configure options for narrow/wide unicode (MvL)
> - removed bogus const and register from some scalars (GvR, FL)
> - default unicode configuration for PC (Tim, FL)
> - default unicode configuration for Mac (Jack)
> - added sys.maxunicode (MvL)

Thank you for the summary. 

Please let me suggest that for the next coding party you prepare a patch 
which spans all party checkins and upload that patch with a summary
like the above to SF. That way we can keep the documentation of the overall
changes in one place and make the process more transparent for everybody.

Now let's get on with business...

Thanks,
-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/





From fredrik at pythonware.com  Tue Jul  3 12:21:27 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 3 Jul 2001 12:21:27 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid> <3B4182F6.DAC4C1@lemburg.com>
Message-ID: <05aa01c103a9$ec29e710$0900a8c0@spiff>

mal wrote:

> Please let me suggest that for the next coding party you prepare a patch
> which spans all party checkins and upload that patch with a summary
> like the above to SF. That way we can keep the documentation of the overall
> changes in one place and make the process more transparent for everybody.

Sorry, but as long as Guido wants an open development approach
based on collective code ownership (aka "egoless programming"),
that's what he gets.

The current environment provides several tools to track changes
to the code base.  The python-checkins list provides instant info
on every single change to the code base; the investment to track
that list is a few minutes per day.  The CVS history is also easy to
access; you can reach it via the viewcvs interface, or from the
command line.

Using both CVS and SF's patch manager to track development history
is a waste of time.  A development project manned by volunteers
doesn't need bureaucrats; the version control system provides
all the accountability we'll ever need.

(commercial development projects don't need bureaucrats
either, and usually don't have them, but that's another story).

I'd also argue that using many incremental checkins improves
quality -- the smaller a change is, the easier it is to understand,
and the more likely it is that also non-experts will notice simple
mistakes or portability issues.  (I regularly comment on checkin
messages that look suspicious codewise, even if I don't know
anything about the problem area.  I'm even right, sometimes).
Reviewing big patches on SF is really hard, even for experts.

And every hour a patch sits on sourceforge instead of in the code
repository is ten hours less burn-in in a heterogeneous testing
environment.  That's worth a lot.

Finally, my experience from this and other projects is that the
"visible heartbeat" you get from a continuous flow of checkin
messages improves team productivity and team morale.  Nothing
is more inspiring than seeing others working for a common
goal.  It's the final product that matters, not who's in charge of
what part of it.  The end user couldn't care less.

I'd prefer if you didn't feel the need to play miniboss on the Python
project (I'm sure you have plenty of 'mx' projects that you can use
that approach, if you have to).  And I'd rather see you at the next
party than out there whining over how you missed the last one.

Cheers /F





From mal at lemburg.com  Tue Jul  3 13:30:05 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 03 Jul 2001 13:30:05 +0200
Subject: [Python-Dev] Unicode Maintenance
References: <3B39CD51.406C28F0@lemburg.com> <200106271611.f5RGBn819631@odiug.digicool.com> <3B3AF307.6496AFB4@lemburg.com> <200106281225.f5SCPIr20874@odiug.digicool.com>              <3B40505A.2F03EEC4@lemburg.com>  <200107021523.f62FNun01807@odiug.digicool.com> <013101c1032f$022770d0$4ffa42d5@hagrid> <3B4182F6.DAC4C1@lemburg.com> <05aa01c103a9$ec29e710$0900a8c0@spiff>
Message-ID: <3B41ACBD.9FA8FB25@lemburg.com>

Fredrik Lundh wrote:
> 
> > Please let me suggest that for the next coding party you prepare a patch
> > which spans all party checkins and upload that patch with a summary
> > like the above to SF. That way we can keep the documentation of the overall
> > changes in one place and make the process more transparent for everybody.
> 
> Sorry, but as long as Guido wants an open development approach
> based on collective code ownership (aka "egoless programming"),
> that's what he gets.
> 
> The current environment provides several tools to track changes
> to the code base.  The python-checkins list provides instant info
> on every single change to the code base; the investment to track
> that list is a few minutes per day.  The CVS history is also easy to
> access; you can reach it via the viewcvs interface, or from the
> command line.

I think you misunderstood my suggestion: I didn't say you can't have
a coding party with lots of small checkins, I just suggested that *after*
the party someone does a diff before-and-after-the-party.diff and
uploads this diff to SF with a description of the overall changes.

You simply don't get the big picture from looking at various small 
checkin messages which are sometimes spread across multiple files/checkins.
 
> Using both CVS and SF's patch manager to track development history
> is a waste of time.  A development project manned by volunteers
> doesn't need bureaucrats; the version control system provides
> all the accountability we'll ever need.
> 
> (commercial development projects don't need bureaucrats
> either, and usually don't have them, but that's another story).

Wasn't talking about bureaucrats... 
 
> I'd also argue that using many incremental checkins improves
> quality -- the smaller a change is, the easier it is to understand,
> and the more likely it is that also non-experts will notice simple
> mistakes or portability issues.  (I regularly comment on checkin
> messages that look suspicious codewise, even if I don't know
> anything about the problem area.  I'm even right, sometimes).
> Reviewing big patches on SF is really hard, even for experts.

It's just for keeping a combined record of changes. Following up on
dozens of checkins spanning another dozen files using CVS is 
harder, IMHO, than looking at one single before/after diff.
 
> And every hour a patch sits on sourceforge instead of in the code
> repository is ten hours less burn-in in a heterogeneous testing
> environment.  That's worth a lot.

Agreed.
 
> Finally, my experience from this and other projects is that the
> "visible heartbeat" you get from a continuous flow of checkin
> messages improves team productivity and team morale.  Nothing
> is more inspiring than seeing others working for a common
> goal.  It's the final product that matters, not who's in charge of
> what part of it.  The end user couldn't care less.
> 
> I'd prefer if you didn't feel the need to play miniboss on the Python
> project (I'm sure you have plenty of 'mx' projects that you can use
> that approach, if you have to). 

I have no intention of playing "miniboss" (I have enough of that being
the boss of a small company), I'm just trying to keep the task of a code
maintainer manageable; that's all. 'nuff said.

> And I'd rather see you at the next
> party than out there whining over how you missed the last one.

Perhaps you can send around invitations first, before starting the party 
next time ?!

BTW, do you have plans to update the Unicode database to the 3.1
version ? If not, I'll look into this next week.

-- 
Marc-Andre Lemburg
________________________________________________________________________
Business:                                        http://www.lemburg.com/
Python Pages:                             http://www.lemburg.com/python/




From thomas at xs4all.net  Tue Jul  3 13:41:51 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 3 Jul 2001 13:41:51 +0200
Subject: [Python-Dev] CVS
Message-ID: <20010703134151.P8098@xs4all.nl>

Slightly off-topic, but I've depleted all my other sources :) I'm trying to
get CVS to give me all logentries for all checkins in a specific branch (the
2.1.1 branch) so I can pipe it through logmerge. It seems the one thing I'm
missing now is a branchpoint tag (which should translate to a revision with
an even number of dots, apparently) but 'release21' and 'release21-maint'
both don't qualify. Even the usage logmerge suggests (cvs log -rrelease21)
doesn't work, gives me a bunch of "no revision elease21' in <file>"
warnings and just all logentries for those files.

Am I missing something simple, here, or should I hack logmerge to parse the
symbolic names, figure out the even-dotted revision for each file from the
uneven-dotted branch-tag, and filter out stuff outside that range ? :P

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From gregor at mediasupervision.de  Tue Jul  3 14:09:51 2001
From: gregor at mediasupervision.de (Gregor Hoffleit)
Date: Tue, 3 Jul 2001 14:09:51 +0200
Subject: [Python-Dev] PEP 250, site-python, site-packages
Message-ID: <20010703140951.A27647@mediasupervision.de>

PEP 250 talks about adopting site-packages for Windows systems. I'd like
to discuss the sitedirs as a whole.

Currently, site.py appends the following sitedirs to sys.path:

    * <prefix>/lib/python<version>/site-packages
    * <prefix>/lib/site-python

If exec-prefix is different from prefix, then also

    * <exec-prefix>/lib/python<version>/site-packages
    * <exec-prefix>/lib/site-python



From jepler at mail.inetnebr.com  Tue Jul  3 14:38:00 2001
From: jepler at mail.inetnebr.com (Jeff Epler)
Date: Tue, 3 Jul 2001 07:38:00 -0500
Subject: [Python-Dev] PEP 250, site-python, site-packages
In-Reply-To: <20010703140951.A27647@mediasupervision.de>; from gregor@mediasupervision.de on Tue, Jul 03, 2001 at 02:09:51PM +0200
References: <20010703140951.A27647@mediasupervision.de>
Message-ID: <20010703073759.A4972@localhost.localdomain>

On Tue, Jul 03, 2001 at 02:09:51PM +0200, Gregor Hoffleit wrote:
> Due to Python's good tradition of compatibility, this is the vast
> majority of packages; only packages with binary modules necessarily need
> to be recompiled anyway for each major new <version>.

Aren't there bytecode changes in 1.6, 2.0, and 2.1, compared to 1.5.2?  If
so, this either means that each version of Python does need a separate copy
(for the .pyc/.pyo file), or if all versions are compatible with 1.5.2
bytecodes (and I don't know that they are) then all packages would need to
be bytecompiled with 1.5.2.

For instance, it appears that between 1.5.2 and 2.1, the UNPACK_LIST
and UNPACK_TUPLE bytecode instructions were removed and replaced with
a single UNPACK_SEQUENCE opcode.

Information gathered by executing:
	python -c 'import dis
	for name in dis.opname:
	    if name[0] != "<": print name' | sort -u > opcodes-1.5.2
and similarly for python2.
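
A rough Python 3 counterpart of that one-liner, for anyone repeating the
comparison with a current interpreter (a sketch; the function name is
invented here, and the set of opcode names differs from version to version):

```python
import dis


def opcode_names():
    """Return the sorted, de-duplicated opcode names of this interpreter."""
    # dis.opname uses placeholders such as "<0>" for unassigned opcode numbers.
    return sorted({name for name in dis.opname if not name.startswith("<")})


if __name__ == "__main__":
    for name in opcode_names():
        print(name)
```

Saving the output from two interpreters to separate files and diffing them
gives the same kind of comparison as described above.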

Jeff



From tim.one at home.com  Sun Jul  1 03:58:29 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 30 Jun 2001 21:58:29 -0400
Subject: [Python-Dev] Support for "wide" Unicode characters
In-Reply-To: <3B3E4487.40054EAE@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEKLKLAA.tim.one@home.com>

[Paul Prescod]
> "The Energy is the mass of the object times the speed of light times
> two."

[David Ascher]
> Actually, it's "squared", not times two.  At least in my universe =)

This is something for Guido to Pronounce on, then.  Who's going to write the
PEP?  The threat of nuclear war seems almost laughable in Paul's universe,
so it's certainly got attractions.  OTOH, it's got to be a lot colder too.

energy-will-do-what-guido-tells-it-to-do-ly y'rs  - tim