From tim.one@home.com  Fri Feb  1 00:21:47 2002
From: tim.one@home.com (Tim Peters)
Date: Thu, 31 Jan 2002 19:21:47 -0500
Subject: [Python-Dev] distutils & stderr
In-Reply-To: <15449.47153.267248.439654@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com>

[Skip]
> Sorry about the missing link.  PyInline uses distutils to compile the C
> code.  How PyInline does its thing doesn't really matter to me, so I'm
> not going to be interested in distutils' messages.

If distutils output isn't interesting to PyInline users, shouldn't PyInline
be changed to run setup.py with its -q/--quiet option?



From gmcm@hypernet.com  Fri Feb  1 00:51:50 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 31 Jan 2002 19:51:50 -0500
Subject: [Python-Dev] Re: opcode performance measurements
In-Reply-To: <15449.9882.393698.701265@gondolin.digicool.com>
References: <00c801c1aa96$2b521320$6d94fea9@newmexico>
Message-ID: <3C59A056.18632.5A29CDD7@localhost>

On 31 Jan 2002 at 6:12, Jeremy Hylton wrote:

> import mod.sub
> creates a binding for "mod" in the global namespace
> 
> The compiler can detect that the import statement is
> a package import -- and mark "mod.sub" as a
> candidate for optimization.  A use of "mod.sub.attr"
> in function should be treated just as "mod.attr". 

How can the compiler tell it's a package import?

It's bad practice, but people write "import mod.attr" all the time. Heck,
Marc-Andre tricks import so that pkg.mod is really pkg.attr where the attr
turns into a mod when accessed. No problem, since it's only import that
cares what it is. By the time it's used it's always global.attr.attr....

-- Gordon
http://www.mcmillan-inc.com/



From jeremy@alum.mit.edu  Fri Feb  1 01:13:16 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 31 Jan 2002 20:13:16 -0500
Subject: [Python-Dev] Re: opcode performance measurements
In-Reply-To: <3C59A056.18632.5A29CDD7@localhost>
References: <00c801c1aa96$2b521320$6d94fea9@newmexico>
 <3C59A056.18632.5A29CDD7@localhost>
Message-ID: <15449.60332.741353.525674@gondolin.digicool.com>

>>>>> "GM" == Gordon McMillan <gmcm@hypernet.com> writes:

  GM> On 31 Jan 2002 at 6:12, Jeremy Hylton wrote:
  >> import mod.sub creates a binding for "mod" in the global
  >> namespace
  >>
  >> The compiler can detect that the import statement is a package
  >> import -- and mark "mod.sub" as a candidate for optimization.  A
  >> use of "mod.sub.attr" in function should be treated just as
  >> "mod.attr".

  GM> How can the compiler tell it's a package import?

I'm assuming it can guess based on import statements and that a
runtime check in LOAD_GLOBAL_ATTR (or whatever it's called) can verify
this assumption.  I haven't thought this part through fully, because
I'm not aware of the full perversity of what people do with import
hooks. 

  GM> It's bad practice, but people write "import mod.attr" all the
  GM> time.

I write it all the time when attr is a module in a package.  And I
know I can't do it for an actual attr of module.

  GM>       Heck, Marc-Andre tricks import so that pkg.mod is really
  GM> pkg.attr where the attr turns into a mod when accessed. No
  GM> problem, since it's only import that cares what it is. By the
  GM> time it's used it's always global.attr.attr....

Not sure I understand what Marc-Andre is doing.  (That's probably true
in general <wink>.)  A client of his code types "import foo.bar."
foo is a module?  a package?  When the "bar" attribute is loaded
(LOAD_ATTR) it turns into another module?

Jeremy



From gward@python.net  Fri Feb  1 03:06:35 2002
From: gward@python.net (Greg Ward)
Date: Thu, 31 Jan 2002 22:06:35 -0500
Subject: [Python-Dev] distutils & stderr
In-Reply-To: <15449.45367.46625.175691@beluga.mojam.com>
References: <15449.45367.46625.175691@beluga.mojam.com>
Message-ID: <20020201030635.GB9864@gerg.ca>

On 31 January 2002, Skip Montanaro said:
> If I could "cvs up" I would submit a patch, but in the meantime, is there
> any good reason that distutils shouldn't write its output to stderr?  I'm
> using PyInline to execute a little bit of C code that returns some
> information about the system to the calling Python code.  This code then
> sends some output to stdout.

Because stderr is for error messages.  Most of the noise generated by
the Distutils is optional, here's-what-I'm-doing-now stuff -- ie. *not*
errors.  If there are Distutils messages that are not silenced with -q,
that's a bug (and probably pretty easy to fix, too).

        Greg
-- 
Greg Ward - programmer-at-large                         gward@python.net
http://starship.python.net/~gward/
All of science is either physics or stamp collecting.


From gward@python.net  Fri Feb  1 03:11:43 2002
From: gward@python.net (Greg Ward)
Date: Thu, 31 Jan 2002 22:11:43 -0500
Subject: [Python-Dev] distutils & stderr
In-Reply-To: <15449.22538.192214.765110@gondolin.digicool.com>
References: <15449.47153.267248.439654@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com> <15449.22538.192214.765110@gondolin.digicool.com>
Message-ID: <20020201031143.GC9864@gerg.ca>

On 31 January 2002, Jeremy Hylton said:
> I started a thread on similar issues on the distutils-sig mailing list
> a week or two ago.  There's agreement that output is a problem.

The amount of output, or the binary nature of control (total silence
vs. total verbosity)?  I knew that was a minor problem when I wrote that
code initially, but had bigger fish to fry.

FWIW, my current thinking is that code that wants to be chatty should do
something like this:

  log(1, "installing foo.bar package")
  ...
  log(2, "copying foo/bar/baz.py to /usr/local/lib/python2.1/site-packages/foo/bar")

The first number is the logging threshold, compared against a global
verbosity level.
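
A minimal sketch of that threshold scheme, assuming a module-level
VERBOSITY setting and a log() helper. Both names are hypothetical and
are not part of the Distutils; the optional stream argument is only
there to make the helper easy to exercise.

```python
import sys

# Hypothetical global verbosity level: 0 is quiet, higher is chattier.
VERBOSITY = 1


def log(threshold, msg, stream=None):
    """Write msg only if the global verbosity reaches threshold.

    Returns True if the message was written, False if it was suppressed.
    """
    if VERBOSITY < threshold:
        return False
    if stream is None:
        stream = sys.stdout
    stream.write(msg + "\n")
    return True


# Written, because VERBOSITY >= 1.
log(1, "installing foo.bar package")
# Suppressed, because VERBOSITY < 2.
log(2, "copying foo/bar/baz.py to /usr/local/lib/python2.1/site-packages/foo/bar")
```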

In a strongly OO system like the Distutils, that should probably be
spelled

  log(N, msg)

where the logging threshold is carried around in each object (or in some
global object).

This shouldn't be too hard to bolt onto the existing code -- ISTR that
the verbose flag is readily available to every object in the system;
just change it from a boolean to an integer and ensure that every log
message goes through self.log().

Oh wait: most of the low-level worker code in the Distutils falls
outside the main class hierarchy, so the verbose flag isn't *quite* so
readily available; it gets passed in to a heck of a lot of functions.
Crap.

        Greg
-- 
Greg Ward - programmer-at-big                           gward@python.net
http://starship.python.net/~gward/
"He's dead, Jim.  You get his tricorder and I'll grab his wallet."


From gmcm@hypernet.com  Fri Feb  1 03:30:23 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 31 Jan 2002 22:30:23 -0500
Subject: [Python-Dev] Re: opcode performance measurements
In-Reply-To: <15449.60332.741353.525674@gondolin.digicool.com>
References: <3C59A056.18632.5A29CDD7@localhost>
Message-ID: <3C59C57F.7192.5ABAF61C@localhost>

On 31 Jan 2002 at 20:13, Jeremy Hylton wrote:


>   GM> How can the compiler tell it's a package import?
> 
> I'm assuming it can guess based on import
> statements and that a runtime check in
> LOAD_GLOBAL_ATTR (or whatever it's called) can
> verify this assumption.  I haven't thought this part
> through fully, because I'm not aware of the full
> perversity of what people do with import hooks. 

Import hooks are effectively dead. People play namespace games almost
exclusively now.
 
>   GM> It's bad practice, but people write "import mod.attr" all the
>   GM> time.
> 
> I write it all the time when attr is a module in a
> package.  And I know I can't do it for an actual
> attr of module. 

import os.path works even though there's no module named path.
import pkg.attr always works.
 
>   GM>       Heck, Marc-Andre tricks import so that pkg.mod is really
>   GM> pkg.attr where the attr turns into a mod when accessed. No
>   GM> problem, since it's only import that cares what it is. By the
>   GM> time it's used it's always global.attr.attr....
> 
> Not sure I understand what Marc-Andre is doing.
> (That's probably true in general <wink>.)  A client of
> his code types "import foo.bar." foo is a module?  a
> package?  When the "bar" attribute is loaded
> (LOAD_ATTR) it turns into another module?

foo is a package. The __init__.py creates an instance of LazyModule named
bar. Doing anything with foo.bar triggers an import, and replacement of the
name "bar" in foo with module bar.

That one's clean. Now turn your eye on the shenanigans in PyXML.

-- Gordon
http://www.mcmillan-inc.com/



From mal@lemburg.com  Fri Feb  1 10:21:14 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 01 Feb 2002 11:21:14 +0100
Subject: [Python-Dev] Re: opcode performance measurements
References: <00c801c1aa96$2b521320$6d94fea9@newmexico>
 <3C59A056.18632.5A29CDD7@localhost> <15449.60332.741353.525674@gondolin.digicool.com>
Message-ID: <3C5A6C1A.BEED7152@lemburg.com>

Jeremy Hylton wrote:
> 
> >>>>> "GM" == Gordon McMillan <gmcm@hypernet.com> writes:
>   GM>       Heck, Marc-Andre tricks import so that pkg.mod is really
>   GM> pkg.attr where the attr turns into a mod when accessed. No
>   GM> problem, since it's only import that cares what it is. By the
>   GM> time it's used it's always global.attr.attr....
> 
> Not sure I understand what Marc-Andre is doing.  (That's probably true
> in general <wink>.)  A client of his code types "import foo.bar."
> foo is a module?  a package?  When the "bar" attribute is loaded
> (LOAD_ATTR) it turns into another module?

Take a look at e.g. mx.DateTime.__init__ and the included
LazyModule module for more background.

I don't really use that approach myself, but sometimes it can be
handy to be able to reference modules in packages without
requiring an import of them, e.g.

import mx.DateTime
date = mx.DateTime.Parser.DateTimeFromString('2002-02-01')
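
The lazy-loading trick under discussion can be sketched roughly as
follows. This is an illustration only and not the mx LazyModule code:
the class, its constructor arguments, and the use of a plain dict as
the package namespace are all assumptions, and the stdlib json module
stands in for a real subpackage.

```python
import importlib


class LazyModule:
    """Placeholder that imports the real module on first attribute access.

    Hypothetical illustration only; not the mx LazyModule implementation.
    """

    def __init__(self, name, namespace, attr):
        self._name = name            # module to import later
        self._namespace = namespace  # dict that holds this placeholder
        self._attr = attr            # key under which it is bound

    def __getattr__(self, item):
        # Called only when normal lookup fails, i.e. for module attributes.
        module = importlib.import_module(self._name)
        # Rebind the name so later lookups reach the real module directly.
        self._namespace[self._attr] = module
        return getattr(module, item)


# A dict standing in for a package's namespace.
namespace = {}
namespace["json"] = LazyModule("json", namespace, "json")
```

After the first attribute access through namespace["json"], the entry
holds the real module object rather than the placeholder.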

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mwh@python.net  Fri Feb  1 11:11:44 2002
From: mwh@python.net (Michael Hudson)
Date: 01 Feb 2002 11:11:44 +0000
Subject: [Python-Dev] distutils & stderr
In-Reply-To: Greg Ward's message of "Thu, 31 Jan 2002 22:11:43 -0500"
References: <15449.47153.267248.439654@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca>
Message-ID: <2m665h1mof.fsf@starship.python.net>

Greg Ward <gward@python.net> writes:

> On 31 January 2002, Jeremy Hylton said:
> > I started a thread on similar issues on the distutils-sig mailing list
> > a week or two ago.  There's agreement that output is a problem.
> 
> The amount of output, or the binary nature of control (total silence
> vs. total verbosity)?  I knew that was a minor problem when I wrote that
> code initially, but had bigger fish to fry.

I'm thinking that verbose should range from about -2 (no output at
all, even from commands if we can suppress it) to about 2 (stupid
amounts of output) with the default being 0, where we take our guide
from what make outputs by default.

-v and -q would then be additive on the command line, so

python setup.py -q -v -v -q -q 

would be an odd way of specifying "verbose==-1".
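
A small sketch of the additive counting, assuming a hypothetical helper
that scans the argument list itself (the Distutils option parser is not
used here, and no clamping to the -2..2 range is applied):

```python
def verbosity_from_args(args, default=0):
    """Each -v adds one to the level and each -q subtracts one."""
    level = default
    for arg in args:
        if arg in ("-v", "--verbose"):
            level += 1
        elif arg in ("-q", "--quiet"):
            level -= 1
    return level


# The command line from the message above.
level = verbosity_from_args(["-q", "-v", "-v", "-q", "-q"])
```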

> FWIW, my current thinking is that code that wants to be chatty should do
> something like this:
> 
>   log(1, "installing foo.bar package")
>   ...
>   log(2, "copying foo/bar/baz.py to /usr/local/lib/python2.1/site-packages/foo/bar")
> 
> The first number is the logging threshold, compared against a global
> verbosity level.

This sounds good.

> In a strongly OO system like the Distutils, that should probably be
> spelled
> 
>   log(N, msg)
> 
> where the logging threshold is carried around in each object (or in some
> global object).
> 
> This shouldn't be too hard to bolt onto the existing code -- ISTR that
> the verbose flag is readily available to every object in the system;
> just change it from a boolean to an integer and ensure that every log
> message goes through self.log().
> 
> Oh wait: most of the low-level worker code in the Distutils falls
> outside the main class hierarchy, so the verbose flag isn't *quite* so
> readily available; it gets passed in to a heck of a lot of functions.
> Crap.

There are a lot of calls in distutils that go

    func(...,...,verbose=self.verbose, dry_run=self.dry_run);

Would it really be so bad to have a global "verbose" variable in, say,
core?  (same for dry_run, too).

Of course, what I would like is CL-style special variables, but ne'er
mind that...

-- 
     ARTHUR:  Why are there three of you?
  LINTILLAS:  Why is there only one of you?
     ARTHUR:  Er... Could I have notice of that question?
                   -- The Hitch-Hikers Guide to the Galaxy, Episode 11


From mal@lemburg.com  Fri Feb  1 11:58:33 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 01 Feb 2002 12:58:33 +0100
Subject: [Python-Dev] distutils & stderr
References: <15449.47153.267248.439654@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net>
Message-ID: <3C5A82E9.7367EA03@lemburg.com>

Michael Hudson wrote:
> 
> Greg Ward <gward@python.net> writes:
> 
> > On 31 January 2002, Jeremy Hylton said:
> > > I started a thread on similar issues on the distutils-sig mailing list
> > > a week or two ago.  There's agreement that output is a problem.
> >
> > The amount of output, or the binary nature of control (total silence
> > vs. total verbosity)?  I knew that was a minor problem when I wrote that
> > code initially, but had bigger fish to fry.
> 
> I'm thinking that verbose should range from about -2 (no output at
> all, even from commands if we can suppress it) to about 2 (stupid
> amounts of output) with the default being 0, where we take our guide
> from what make outputs by default.
> 
> -v and -q would then be additive on the command line, so
> 
> python setup.py -q -v -v -q -q
> 
> would be an odd way of specifying "verbose==-1".

That looks like line noise :-)
 
> > FWIW, my current thinking is that code that wants to be chatty should do
> > something like this:
> >
> >   log(1, "installing foo.bar package")
> >   ...
> >   log(2, "copying foo/bar/baz.py to /usr/local/lib/python2.1/site-packages/foo/bar")
> >
> > The first number is the logging threshold, compared against a global
> > verbosity level.
> 
> This sounds good.

Hmm, that's very close to what I have implemented in mx.Log
(see the egenix-mx-base package). 
 
> > In a strongly OO system like the Distutils, that should probably be
> > spelled
> >
> >   log(N, msg)
> >
> > where the logging threshold is carried around in each object (or in some
> > global object).
> >
> > This shouldn't be too hard to bolt onto the existing code -- ISTR that
> > the verbose flag is readily available to every object in the system;
> > just change it from a boolean to an integer and ensure that every log
> > message goes through self.log().
> >
> > Oh wait: most of the low-level worker code in the Distutils falls
> > outside the main class hierarchy, so the verbose flag isn't *quite* so
> > readily available; it gets passed in to a heck of a lot of functions.
> > Crap.
> 
> There are a lot of calls in distutils that go
> 
>     func(...,...,verbose=self.verbose, dry_run=self.dry_run);
> 
> Would it really be so bad to have a global "verbose" variable in, say,
> core?  (same for dry_run, too).
> 
> Of course, what I would like is CL-style special variables, but ne'er
> mind that...

FYI, I usually use a package/module scope global logging object
for this kind of thing (rather than a function which then looks
somewhere for the debug level). Works great.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Fri Feb  1 12:31:21 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 01 Feb 2002 13:31:21 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.352,1.353
References: <LNBBLJKPBEHFEDALKOLCCECFNJAA.tim@zope.com>
Message-ID: <3C5A8A99.F2D7DCDE@lemburg.com>

Tim Peters wrote:
> 
> [MAL]
> > Wouldn't it be better to use Win32 APIs for this ? That way,
> > other compilers on Windows will have a chance to use the
> > same code.
> 
> I have no reason to believe that other compilers on Windows don't follow MS
> in this respect; they usually seem to ape the same functions, sometimes with
> or without leading underscores, or other trivial name changes.  If you want
> to wrestle with the Win32 API, be my guest <wink>.

I'm no Win32 expert, just thought that the code in the win32process
module (which is part of win32all) probably already provides code in 
this area.

Another candidate for Windows emulation would be os.kill().
win32process has TerminateProcess() which could probably be used
for this (no idea however, how you get from a PID to a process handle
on Windows).

Anyway, just a thought...
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From skip@pobox.com  Fri Feb  1 14:38:48 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 1 Feb 2002 08:38:48 -0600
Subject: [Python-Dev] distutils & stderr
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com>
References: <15449.47153.267248.439654@beluga.mojam.com>
 <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com>
Message-ID: <15450.43128.601760.548974@12-248-41-177.client.attbi.com>

    Tim> If distutils output isn't interesting to PyInline users, shouldn't
    Tim> PyInline be changed to run setup.py with its -q/--quiet option?

Probably so, but not all prints are guarded by "if verbose:".

Skip



From skip@pobox.com  Fri Feb  1 14:51:25 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 1 Feb 2002 08:51:25 -0600
Subject: [Python-Dev] Re: opcode performance measurements
In-Reply-To: <3C59C57F.7192.5ABAF61C@localhost>
References: <3C59A056.18632.5A29CDD7@localhost>
 <3C59C57F.7192.5ABAF61C@localhost>
Message-ID: <15450.43885.275862.430781@12-248-41-177.client.attbi.com>

It just occurred to me that my LOAD_GLOBAL/LOAD_ATTR eliding scheme can't
work, since LOAD_ATTR calls PyObject_GetAttr, which can wind up calling
__getattr__, which is free to inflict all sorts of side effects on the
attribute lookup.  PEP 267 doesn't appear to be similarly affected, assuming
it can conclude that LOAD_GLOBAL is actually loading a module object.  (Can
it?)  LOAD_GLOBAL alone shouldn't be a problem, since all that does is call
PyDict_GetItem for globals and builtins.

damn...

Skip


From jack@oratrix.com  Fri Feb  1 14:52:11 2002
From: jack@oratrix.com (Jack Jansen)
Date: Fri, 1 Feb 2002 15:52:11 +0100
Subject: [Python-Dev] next vs darwin
In-Reply-To: <m34rl2i0cc.fsf@mira.informatik.hu-berlin.de>
Message-ID: <459BBDA3-1723-11D6-B4B0-0030655234CE@oratrix.com>

On Friday, February 1, 2002, at 12:09 , Martin v. Loewis wrote:

> Jack Jansen <Jack.Jansen@oratrix.nl> writes:
>
>> With the define on it loads all extension modules into the application
>> namespace. Some people want this (despite the problems sketched above)
>> because they have modules that refer to external symbols defined in
>> modules that have been loaded earlier (and I assume there's magic that
>> ensures their modules are loaded in the right order).
>
> On Unix, this is a runtime option via sys.setdlopenflags (RTLD_GLOBAL
> turns on import into application namespace). Do you think you could
> emulate this API?

Shouldn't be a problem. I had never heard of sys.setdlopenflags(), 
otherwise I would have done so already.
>> I prefer the new (OSX 10.1) preferred Apple way of linking plugins
>> (which is also the common way to do so on all other non-unix
>> platforms) where the plugin has to be linked against the application
>> and dynamic libraries it is going to be plugged into, so none of
>> this dynamic behaviour goes on.
>
> I'm not sure linking with a libpython.so is desirable, I'm quite fond
> of the approach to let the executable export symbols to the
> extensions. If that is possible on OS X, I'd encourage you to follow
> such a strategy (in unix gcc/ld, this is enabled through
> -Wl,--export-dynamic).

Indeed, you link against the embedder (be it .so, framework or
application) in a special way that says "this is going to be the host
application".
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From jeremy@alum.mit.edu  Fri Feb  1 15:10:58 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Feb 2002 10:10:58 -0500
Subject: [Python-Dev] Re: opcode performance measurements
In-Reply-To: <15450.43885.275862.430781@12-248-41-177.client.attbi.com>
Message-ID: <BCEJJGNAEAKMACBPLMEPKEBGCAAA.jeremy@alum.mit.edu>

> It just occurred to me that my LOAD_GLOBAL/LOAD_ATTR eliding scheme can't
> work, since LOAD_ATTR calls PyObject_GetAttr, which can wind up calling
> __getattr__, which is free to inflict all sorts of side effects on the
> attribute lookup.  PEP 267 doesn't appear to be similarly affected, assuming
> it can conclude that LOAD_GLOBAL is actually loading a module object.  (Can
> it?)  LOAD_GLOBAL alone shouldn't be a problem, since all that does is call
> PyDict_GetItem for globals and builtins.

The approach I'm working on would have to check that the object is a module
on each use, but that's relatively cheap compared to the layers of function
calls we have now.  It's a pretty safe assumption because it would only be
made for objects bound by an import statement.

I also wanted to answer Samuele's question briefly, because I'm going to be
busy with other things most of today.  The basic idea, which I need to flesh
out by next week, is that the internal binding for "mod.attr" that a module
keeps is just a hint.  The compiler notices that function f() uses
"mod.attr" and that mod is imported at the module level.  The "mod.attr"
binding must include a pointer to the location where mod is stored and the
pointer it found when the "mod.attr" binding was updated.  When "mod.attr"
is used, the interpreter must check that mod is still bound to the same
object.  If so, the "mod.attr" binding is still valid.  Note that the
"mod.attr" binding is a PyObject ** -- a pointer to the location where
"attr" is bound in "mod".
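
A pure-Python analogy of the hint, under loud assumptions: Python has no
PyObject **, so a lookup in the module's __dict__ stands in for
dereferencing the cached slot, and the AttrHint class and its fields are
hypothetical names, not anything from the PEP.

```python
import types


class AttrHint:
    """Cached "mod.attr" binding, trusted only while mod is unchanged."""

    def __init__(self, namespace, mod_name, attr_name):
        self.namespace = namespace   # where "mod" is stored (module globals)
        self.mod_name = mod_name
        self.attr_name = attr_name
        self.cached_mod = None       # object "mod" was bound to at last update

    def load(self):
        mod = self.namespace[self.mod_name]
        if mod is not self.cached_mod:
            # The hint is stale or unset: re-check the type and refresh it.
            if not isinstance(mod, types.ModuleType):
                self.cached_mod = None
                return getattr(mod, self.attr_name)   # ordinary slow path
            self.cached_mod = mod
        # The module's __dict__ stands in for the PyObject ** slot here.
        return self.cached_mod.__dict__[self.attr_name]
```

Rebinding "mod" in the namespace invalidates the hint on the next use,
which is the identity check described above.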

Jeremy




From guido@python.org  Fri Feb  1 15:48:22 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 01 Feb 2002 10:48:22 -0500
Subject: [Python-Dev] distutils & stderr
In-Reply-To: Your message of "01 Feb 2002 11:11:44 GMT."
 <2m665h1mof.fsf@starship.python.net>
References: <15449.47153.267248.439654@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca>
 <2m665h1mof.fsf@starship.python.net>
Message-ID: <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net>

> I'm thinking that verbose should range from about -2 (no output at
> all, even from commands if we can suppress it) to about 2 (stupid
> amounts of output) with the default being 0, where we take our guide
> from what make outputs by default.

I think the point is that Make has a more useful definition of what
should be printed in the default case and what shouldn't, and that's
the real problem -- not that there aren't enough levels.  Fewer levels
is actually better, since there are fewer ways to screw up. :-)

The specific problem is that by default you don't want it to blab
about all the things it doesn't have to do because they're already
done.  Make got this right and distutils got it wrong.

I could see three levels at most:

- verbose, tells you about everything it could do

- default, only tells you about things it does and not about things it
  skips

- quiet, only tells you about errors
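
One way to picture the three levels, as a sketch only: the Reporter
class, the level constants, and the sample messages are all made up for
illustration and do not come from the Distutils.

```python
QUIET, DEFAULT, VERBOSE = 0, 1, 2   # hypothetical level names


class Reporter:
    """Collects and prints messages according to a three-level setting."""

    def __init__(self, level=DEFAULT):
        self.level = level
        self.lines = []

    def _emit(self, msg):
        self.lines.append(msg)
        print(msg)

    def skipped(self, msg):
        # Work that was already done: only worth mentioning when verbose.
        if self.level >= VERBOSE:
            self._emit(msg)

    def did(self, msg):
        # Work actually performed: shown by default, hidden when quiet.
        if self.level >= DEFAULT:
            self._emit(msg)

    def error(self, msg):
        # Errors are reported at every level.
        self._emit(msg)


def demo(level):
    reporter = Reporter(level)
    reporter.skipped("not copying foo.py (output up-to-date)")
    reporter.did("copying bar.py -> build/lib")
    reporter.error("error: baz.c failed to compile")
    return reporter.lines
```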

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jack@oratrix.com  Fri Feb  1 16:04:05 2002
From: jack@oratrix.com (Jack Jansen)
Date: Fri, 1 Feb 2002 17:04:05 +0100
Subject: [Python-Dev] next vs darwin
In-Reply-To: <459BBDA3-1723-11D6-B4B0-0030655234CE@oratrix.com>
Message-ID: <50CB70C6-172D-11D6-B4B0-0030655234CE@oratrix.com>

On Friday, February 1, 2002, at 03:52 , Jack Jansen wrote:

>> On Unix, this is a runtime option via sys.setdlopenflags (RTLD_GLOBAL
>> turns on import into application namespace). Do you think you could
>> emulate this API?
>
> Shouldn't be a problem. I had never heard of sys.setdlopenflags(), 
> otherwise I would have done so already.

Hmm. I had a look at the setdlopenflags() and accompanying
infrastructure, and it seems you can set many flags to dlopen() through
this call, is that right?

If it is, is it a good idea to call the OSX-specific routine
setdlopenflags() too, even though it will only support the "use global
namespace" flag? Or is that the only flag you can reasonably pass to
dlopen() anyway?
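
For reference, a sketch of how the existing Unix call is typically
used, assuming a Unix build of a current Python where os.RTLD_GLOBAL is
available. The flags are a plain integer, so several dlopen() flags can
be OR-ed together; the context manager name is hypothetical.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def global_symbol_imports():
    """Temporarily add RTLD_GLOBAL to the flags used for extension imports."""
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Restore whatever flags were in effect before.
        sys.setdlopenflags(old_flags)
```

Extension modules imported inside the with block would then be loaded
into the application namespace.
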
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From gward@python.net  Fri Feb  1 17:01:47 2002
From: gward@python.net (Greg Ward)
Date: Fri, 1 Feb 2002 12:01:47 -0500
Subject: [Python-Dev] urllib2 bug
Message-ID: <20020201170147.GA11551@gerg.ca>

I've just discovered a bug in urllib2: it drops caller-supplied headers
when processing HTTP redirects.  See
  http://sourceforge.net/tracker/index.php?func=detail&aid=511786&group_id=5470&atid=105470
for details.

The fix (to HTTPRedirectHandler.http_error_302()), as near as I can
tell, is trivial:

--- Lib/urllib2.py      2001/11/09 16:46:51     1.24
+++ Lib/urllib2.py      2002/02/01 17:00:05
@@ -416,7 +416,7 @@
         # XXX Probably want to forget about the state of the current
         # request, although that might interact poorly with other
         # handlers that also use handler-specific request attributes
-        new = Request(newurl, req.get_data())
+        new = Request(newurl, req.get_data(), req.headers)
         new.error_302_dict = {}
         if hasattr(req, 'error_302_dict'):
             if len(req.error_302_dict)>10 or \

I'll check this in (2.2.1 candidate) and close the bug unless anyone
howls.
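
The effect of the one-line change can be illustrated without touching
the network. This sketch uses urllib.request.Request from current
Python (the successor to urllib2) rather than the 2002 code, and the
URLs and header name are made up.

```python
from urllib.request import Request

original = Request("http://example.com/old", headers={"X-Token": "abc"})

# Before the fix: the redirected request is built from URL and data only.
without_headers = Request("http://example.com/new", original.data)

# After the fix: the caller's headers are passed along as well.
with_headers = Request("http://example.com/new", original.data,
                       original.headers)

# Request stores header names via str.capitalize(), hence "X-token".
kept = with_headers.get_header("X-token")
```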

        Greg
-- 
Greg Ward - Unix bigot                                  gward@python.net
http://starship.python.net/~gward/
This message transmitted with 100% recycled electrons.


From trentm@ActiveState.com  Fri Feb  1 18:20:50 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Fri, 1 Feb 2002 10:20:50 -0800
Subject: [Python-Dev] distutils & stderr
In-Reply-To: <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Feb 01, 2002 at 10:48:22AM -0500
References: <15449.47153.267248.439654@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020201102050.B31242@ActiveState.com>

On Fri, Feb 01, 2002 at 10:48:22AM -0500, Guido van Rossum wrote:
> I could see three levels at most:
> 
> - verbose, tells you about everything it could do
> - default, only tells you about things it does and not about things it
>   skips
> - quiet, only tells you about errors


FYI,

The log4j (j==Java) system uses five levels:
    1. debug
    2. info
    3. warn
    4. error
    5. fatal

Application code uses the system something like this (simplified Python
translation):

    # Go through some steps to get a "Logger" singleton object.
    import log4py
    log = log4py.getLogger("distutils")

    # Then you call methods on the 'log' object for the level of message to
    # write.
    log.debug("Distutils *could* do BAR.")
    ...
    log.info("Distutils is now doing FOO")
    ...
    log.warn("Beware, SPAM may not be what you expect.")
    ...
    log.error("This is just wrong.")
    ...
    log.fatal("This is really bad. Aborting EGGS.")
    
    # The 'log' object knows if, say, log.debug() calls should actually
    # result in any output (because the setup.py option processing sets the
    # level to print). So, if I use 'python setup.py -q' the print level is
    # set to "WARN" (or perhaps "ERROR") and only .warn(), .error(), and
    # .fatal() calls get printed.

That is just an idea of how it could be done. You could reduce the logging
levels down to three, as Guido suggested.
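
The same shape can be tried with Python's standard logging module,
which was added to the standard library after this thread and has five
comparable levels (its name for "fatal" is "critical"). The logger name
and the choice of WARNING for a quiet run are assumptions made for this
sketch.

```python
import io
import logging

stream = io.StringIO()
log = logging.getLogger("distutils.sketch")   # hypothetical logger name
log.propagate = False
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
log.addHandler(handler)

# Roughly what "python setup.py -q" might select in this scheme.
log.setLevel(logging.WARNING)

log.debug("Distutils *could* do BAR.")      # filtered out
log.info("Distutils is now doing FOO")      # filtered out
log.warning("Beware, SPAM may not be what you expect.")
log.error("This is just wrong.")
log.critical("This is really bad. Aborting EGGS.")

output = stream.getvalue()
```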

c.f. http://jakarta.apache.org/log4j/docs/index.html


Cheers,
Trent

-- 
Trent Mick
TrentM@ActiveState.com


From gward@python.net  Fri Feb  1 18:21:23 2002
From: gward@python.net (Greg Ward)
Date: Fri, 1 Feb 2002 13:21:23 -0500
Subject: [Python-Dev] distutils & stderr
In-Reply-To: <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net>
References: <15449.47153.267248.439654@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020201182123.GA12019@gerg.ca>

On 01 February 2002, Guido van Rossum said:
> I could see three levels at most:
> 
> - verbose, tells you about everything it could do
> 
> - default, only tells you about things it does and not about things it
>   skips
> 
> - quiet, only tells you about errors

+1 from me.

The Distutils' current level of verbosity stems from my desire to see
exactly what was happening in the code at the development/debugging
stage.  That's obsolete and should be fixed.  I like Guido's idea.

        Greg
-- 
Greg Ward - Linux weenie                                gward@python.net
http://starship.python.net/~gward/
Jesus Saves -- and you can too, by redeeming these valuable coupons!


From thomas.heller@ion-tof.com  Fri Feb  1 18:33:07 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 1 Feb 2002 19:33:07 +0100
Subject: [Python-Dev] distutils & stderr
References: <15449.47153.267248.439654@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net> <20020201102050.B31242@ActiveState.com>
Message-ID: <07c301c1ab4e$e4ad0e70$e000a8c0@thomasnotebook>

From: "Trent Mick" <trentm@ActiveState.com>
> FYI,
> 
> The log4j (j==Java) system uses five levels:
>     1. debug
>     2. info
>     3. warn
>     4. error
>     5. fatal
> 

I'm also a very happy user of log4* (although * = C at the moment for me).

IMO: The debug and info levels are for the programmer,
only warn, error, and fatal are for the user.

Thomas



From pedroni@inf.ethz.ch  Fri Feb  1 18:30:28 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Fri, 1 Feb 2002 19:30:28 +0100
Subject: [Python-Dev] Re: opcode performance measurements
References: <BCEJJGNAEAKMACBPLMEPKEBGCAAA.jeremy@alum.mit.edu>
Message-ID: <021901c1ab4e$867fb640$6d94fea9@newmexico>

First, thanks for the answer :).
Here is my input on the topic 
[Obviously I won't be present at the developer day]

From: Jeremy Hylton <jeremy@alum.mit.edu>
> The approach I'm working on would have to check that the object is a module
> on each use, but that's relatively cheap compared to the layers of function
> calls we have now.  It's a pretty safe assumption because it would only be
> made for objects bound by an import statement.
> 
> I also wanted to answer Samuele's question briefly, because I'm going to be
> busy with other things most of today.  The basic idea, which I need to flesh
> out by next week, is that the internal binding for "mod.attr" that a module
> keeps is just a hint.  The compiler notices that function f() uses
> "mod.attr" and that mod is imported at the module level.  The "mod.attr"
> binding must include a pointer to the location where mod is stored and the
> pointer it found when the "mod.attr" binding was updated.  When "mod.attr"
> is used, the interpreter must check that mod is still bound to the same
> object.  If so, the "mod.attr" binding is still valid.  Note that the
> "mod.attr" binding is a PyObject ** -- a pointer to the location where
> "attr" is bound in "mod".
> 

I see. Btw, I asked primarily because the PEP as it stands is vague, not
because I believed the idea cannot fly [for Jython the issue
is more complicated, PyObject ** is not something easily
expressed in Java <wink>]

I think it is worth pointing out that what you propose is a special/
ad-hoc version of what other Smalltalk-like dynamic languages typically
do together with jitting (the approach itself is orthogonal to jitting),
namely:

for every send site they have a send-site cache:

  if send-site-cache-still-applies: # (1)
    dispatch based on site-cache contents # (2)
  else:
    normal send lookup and update send-site-cache

In Python more or less the same could be applied
to load_* instead of sends.
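
A minimal sketch of such a site cache applied to a name load, in Python.
Everything here (Namespace, SiteCache, cached_load, the version counter
used as the validity check) is hypothetical and only mirrors the
pseudocode above; it is not how the interpreter works.

```python
class Namespace:
    """Dict-backed namespace that bumps a version number on every store."""
    def __init__(self):
        self._names = {}
        self.version = 0

    def store(self, name, value):
        self._names[name] = value
        self.version += 1

    def lookup(self, name):
        return self._names[name]

class SiteCache:
    """One cache per load site; a site always loads the same name."""
    def __init__(self):
        self.namespace = None
        self.version = None
        self.value = None
        self.hits = 0
        self.misses = 0

def cached_load(site, namespace, name):
    # (1) does the site cache still apply?
    if site.namespace is namespace and site.version == namespace.version:
        site.hits += 1
        return site.value            # (2) use the cached contents
    # otherwise: normal lookup, then update the site cache
    site.misses += 1
    value = namespace.lookup(name)
    site.namespace, site.version, site.value = namespace, namespace.version, value
    return value

mod = Namespace()
mod.store("attr", 1)
site = SiteCache()
results = [cached_load(site, mod, "attr") for _ in range(3)]
mod.store("attr", 2)                 # invalidates the cached entry
results.append(cached_load(site, mod, "attr"))
```

In this toy the check in (1) costs two comparisons; whether that beats a
dict lookup in a real implementation is exactly the open question.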

Your approach deals with a part of those. These need
(only) module-level caches.

The extended/general approach could work too and
give some benefit.

But it is clear that the complexity and overhead of (1) and (2),
and the space demand for the caches, depend
on how homogeneous the system's object layouts
and behaviors are.

And Python with modules, data-objects, class/instances,
types etc is quite a zoo :(.

Pushing the class/type unification further, this is an aspect
to consider IMHO.

If those things were already all known, sorry for the
boring post.

regards.
 




From martin@v.loewis.de  Fri Feb  1 19:19:46 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 01 Feb 2002 20:19:46 +0100
Subject: [Python-Dev] next vs darwin
In-Reply-To: <50CB70C6-172D-11D6-B4B0-0030655234CE@oratrix.com>
References: <50CB70C6-172D-11D6-B4B0-0030655234CE@oratrix.com>
Message-ID: <m3g04lko19.fsf@mira.informatik.hu-berlin.de>

Jack Jansen <jack@oratrix.com> writes:

> Hmm. I had a look at the setdlopenflags() and accompanying
> infrastructure, and it seems you can
> set many flags to dlopen() through this call, is that right?

Correct. It is admittedly very Unixish at the moment.

> If it is, is it a good idea to call the OSX-specific routine
> setdlopenflags() too, even though it will only support the "use
> global namespace" flag? Or is that the only flag you can reasonably
> pass to dlopen() anyway?

Effectively, yes. There is also a symbol RTLD_LOCAL, which is 0 on
most systems, and it may be reasonable to add RTLD_LAZY (defer
resolution of function symbols until they are called the first time).

Anyway, my main point is that this should be a run-time option. If the
APIs can merge, that might be a good thing (even if it means to
deprecate setdlopenflags); if that is not feasible, I'd at least
recommend that you put the control over extension loading also into
sys.
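
A minimal sketch of what the run-time option looks like from Python code
on a Unix-like system.  sys.getdlopenflags(), sys.setdlopenflags() and
os.RTLD_GLOBAL are real in current Python on Unix; the helper name
call_with_global_symbols() and its fallback behaviour are assumptions
made for the example.

```python
import os
import sys

def call_with_global_symbols(import_func):
    """Call import_func() with RTLD_GLOBAL temporarily added to dlopen flags."""
    if not hasattr(sys, "setdlopenflags") or not hasattr(os, "RTLD_GLOBAL"):
        # Platform without dlopen flag control: just run the import.
        return import_func()
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        return import_func()
    finally:
        sys.setdlopenflags(old_flags)   # always restore the previous flags
```

A caller would pass something like `lambda: __import__("some_extension")`,
where some_extension stands for whatever extension module needs to export
its symbols.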

Regards,
Martin



From skip@pobox.com  Fri Feb  1 19:59:46 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 1 Feb 2002 13:59:46 -0600
Subject: [Python-Dev] distutils & stderr
In-Reply-To: <20020201182123.GA12019@gerg.ca>
References: <15449.47153.267248.439654@beluga.mojam.com>
 <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com>
 <15449.22538.192214.765110@gondolin.digicool.com>
 <20020201031143.GC9864@gerg.ca>
 <2m665h1mof.fsf@starship.python.net>
 <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net>
 <20020201182123.GA12019@gerg.ca>
Message-ID: <15450.62386.306227.518014@12-248-41-177.client.attbi.com>

On 01 February 2002, Guido van Rossum said:

    >> - quiet, only tells you about errors

And only to stderr, assuming stderr is available.  (Can this be detected on
Windows?)  If you log messages to stdout, scripts that use distutils can't
be used as filters.

Skip


From guido@python.org  Fri Feb  1 20:09:16 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 01 Feb 2002 15:09:16 -0500
Subject: [Python-Dev] distutils & stderr
In-Reply-To: Your message of "Fri, 01 Feb 2002 13:59:46 CST."
 <15450.62386.306227.518014@12-248-41-177.client.attbi.com>
References: <15449.47153.267248.439654@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net> <20020201182123.GA12019@gerg.ca>
 <15450.62386.306227.518014@12-248-41-177.client.attbi.com>
Message-ID: <200202012009.g11K9G704961@pcp742651pcs.reston01.va.comcast.net>

>     >> - quiet, only tells you about errors
> 
> And only to stderr, assuming stderr is available.  (Can this be
> detected on Windows?)

Depends on what you call available.  sys.stderr should always exist.

> If you log messages to stdout, scripts that use distutils can't
> be used as filters.

IMO it would be better if there was a way to give distutils a file
where to send output.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Fri Feb  1 20:10:49 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 1 Feb 2002 21:10:49 +0100
Subject: [Python-Dev] distutils & stderr
References: <15449.47153.267248.439654@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCKEAANJAA.tim.one@home.com> <15449.22538.192214.765110@gondolin.digicool.com> <20020201031143.GC9864@gerg.ca> <2m665h1mof.fsf@starship.python.net> <200202011548.g11FmMb03517@pcp742651pcs.reston01.va.comcast.net> <20020201182123.GA12019@gerg.ca>              <15450.62386.306227.518014@12-248-41-177.client.attbi.com>  <200202012009.g11K9G704961@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <098801c1ab5c$8a853090$e000a8c0@thomasnotebook>

From: "Guido van Rossum" <guido@python.org>
> >     >> - quiet, only tells you about errors
> > 
> > And only to stderr, assuming stderr is available.  (Can this be
> > detected on Windows?)
> 
> Depends on what you call available.  sys.stderr should always exist.
> 
> > If you log messages to stdout, scripts that use distutils can't
> > be used as filters.
> 
> IMO it would be better if there was a way to give distutils a file
> where to send output.

One additional annoyance under windows is that MSVC (when compiling)
always prints messages to the console (stderr, stdout? not sure)
which cannot be suppressed (at least I haven't found a way).

Thomas



From tim.one@home.com  Fri Feb  1 21:29:38 2002
From: tim.one@home.com (Tim Peters)
Date: Fri, 1 Feb 2002 16:29:38 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.352,1.353
In-Reply-To: <3C5A8A99.F2D7DCDE@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEEENJAA.tim.one@home.com>

[MAL]
> I'm no Win32 expert, just thought that the code in the win32process
> module (which is part of win32all) probably already provides code in
> this area.

With a Win32 flavor, which isn't what I need here.  There is no distinct
"wait for process" function in Win32, it's just another application of the
very cool WaitFor{Single,Multiple}Object(s)[Ex] APIs (which can "wait" for
sets of "handles" to "do something":  kinda like Unix select(), except not
braindead <wink>).

That's fine, but what I specifically needed (for a Zope Corp project) was a
Unixish waitpid() workalike.  MS already did most of the work for that in
their _cwait function, so it would be silly not to reuse it.  BTW, a google
search suggested Borland also supports a cwait function, but I have neither
a Borland compiler nor time to worry about that platform.  You didn't worry
much about the Cray T3E when implementing Unicode either <wink>.
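
For readers who have not used it, a minimal sketch of the waitpid()
pattern from Python.  It assumes Python 3.9 or later for
os.waitstatus_to_exitcode(); the helper name run_and_wait() is made up.
On Windows os.spawnv() returns a process handle rather than a pid, and
os.waitpid() accepts that handle.

```python
import os
import sys

def run_and_wait(code):
    """Spawn a child Python running `code` and return its exit code."""
    # P_NOWAIT returns immediately with the child's pid (a handle on Windows).
    pid = os.spawnv(os.P_NOWAIT, sys.executable,
                    [sys.executable, "-c", code])
    _, status = os.waitpid(pid, 0)      # block until that child finishes
    # The raw status is platform-specific; this converts it to an exit code.
    return os.waitstatus_to_exitcode(status)

exit_code = run_and_wait("import sys; sys.exit(3)")
```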

> Another candidate for Windows emulation would be os.kill().
> win32process has TerminateProcess() which could probably be used
> for this (no idea however, how you get from a PID to a process handle
> on Windows).

I'm not looking for random functions to implement; if I *need* an
os.kill()-alike, I'll do one, but I don't expect the need.
TerminateProcess() is a dangerous function on Windows (read the docs).  If
you want to risk it, you go from process pid to process handle via the Win32
OpenProcess() function.



From mal@lemburg.com  Fri Feb  1 21:58:34 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 01 Feb 2002 22:58:34 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc
 NEWS,1.352,1.353
References: <LNBBLJKPBEHFEDALKOLCAEEENJAA.tim.one@home.com>
Message-ID: <3C5B0F8A.E1998EB9@lemburg.com>

Tim Peters wrote:
>
> [MAL]
> > I'm no Win32 expert, just thought that the code in the win32process
> > module (which is part of win32all) probably already provides code in
> > this area.
>
> With a Win32 flavor, which isn't what I need here.  There is no distinct
> "wait for process" function in Win32, it's just another application of the
> very cool WaitFor{Single,Multiple}Object(s)[Ex] APIs (which can "wait" for
> sets of "handles" to "do something":  kinda like Unix select(), except not
> braindead <wink>).
>
> That's fine, but what I specifically needed (for a Zope Corp project) was a
> Unixish waitpid() workalike.  MS already did most of the work for that in
> their _cwait function, so it would be silly not to reuse it.  BTW, a google
> search suggested Borland also supports a cwait function, but I have neither
> a Borland compiler nor time to worry about that platform.  You didn't worry
> much about the Cray T3E when implementing Unicode either <wink>.

Touché :-)

> > Another candidate for Windows emulation would be os.kill().
> > win32process has TerminateProcess() which could probably be used
> > for this (no idea however, how you get from a PID to a process handle
> > on Windows).
>
> I'm not looking for random functions to implement; if I *need* an
> os.kill()-alike, I'll do one, but I don't expect the need.
> TerminateProcess() is a dangerous function on Windows (read the docs).  If
> you want to risk it, you go from process pid to process handle via the Win32
> OpenProcess() function.

Too bad, because I will have a need for porting a multi-process
application to Windows sometime soon :-)

Here's an article I found on the topic:
    http://www.wdj.com/articles/1999/9907/9907c/9907c.htm

What a hack... now I know why you don't want to use Win32 APIs ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jeremy@alum.mit.edu  Fri Feb  1 12:59:52 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Feb 2002 07:59:52 -0500
Subject: [Python-Dev] Re: opcode performance measurements
In-Reply-To: <021901c1ab4e$867fb640$6d94fea9@newmexico>
References: <BCEJJGNAEAKMACBPLMEPKEBGCAAA.jeremy@alum.mit.edu>
 <021901c1ab4e$867fb640$6d94fea9@newmexico>
Message-ID: <15450.37192.51717.328419@gondolin.digicool.com>

>>>>> "SP" == Samuele Pedroni <pedronis@bluewin.ch> writes:

  SP> But it is clear that the complexity and overhead of (1) and (2),
  SP> and the space-demand for the caches depend on how much
  SP> homogeneous are system object layouts and behaviors.

Good point!  It's important to try to extract the general principles
at work and see how they can be applied systematically.  The general
notion I have is that dictionaries are not an efficient way to
implement namespaces.  The way most namespaces are used -- in
particular that most names are known statically -- allows a more
efficient implementation.

  SP> And Python with modules, data-objects, class/instances, types
  SP> etc is quite a zoo :(.

And, again, this is a problem.  The same sorts of techniques apply to
all namespaces.  It would be good to try to make the approach
general, but some namespaces are more dynamic than others.  Python's
classes, lack of declarations, and separate compilation of modules
mean class/instance namespaces are hard to do right.  Need to defer a
lot of final decisions to runtime and keep an extra dictionary around
just in case.

  SP> Pushing the class/type unification further, this is an aspect to
  SP> consider IMHO.

  SP> If those things were already all known sorry for the boring
  SP> post.

Thanks for good questions and suggestions.  Too bad you can't come to
dev day.  I'll try to post slides before or after the talk -- and
update the PEP.

Jeremy
 






From Jack.Jansen@oratrix.nl  Fri Feb  1 23:34:51 2002
From: Jack.Jansen@oratrix.nl (Jack Jansen)
Date: Sat, 2 Feb 2002 00:34:51 +0100
Subject: [Python-Dev] Patch to enable sys.setdlopenflags() on MacOSX
Message-ID: <49FC9AC9-176C-11D6-9B87-003065517236@oratrix.nl>

I put a patch on sourceforge, #511962, which enables 
sys.setdlopenflags() on MacOSX. The only values you can pass are 
0 (the default, dynamic modules are each loaded into their own 
private symbol namespace) and 0x100 (modules are loaded into the 
process' global symbol namespace, so they can refer to each
other's symbols).

As the API is compatible with Linux, I hope this solves the
problem for the people who want a global namespace.  Could they
please apply the patch and see whether it works?

As I myself think this is a hack upon a hack, could people with 
Strong Opinions (you know who you are:-) please tell (a) me to 
commit the patch, (b) me to change the patch, or (c) the people 
addressed in the previous paragraph to not do what they're 
doing:-)
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From pedroni@inf.ethz.ch  Sat Feb  2 00:58:37 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Sat, 2 Feb 2002 01:58:37 +0100
Subject: [Python-Dev] Re: opcode performance measurements
References: <BCEJJGNAEAKMACBPLMEPKEBGCAAA.jeremy@alum.mit.edu><021901c1ab4e$867fb640$6d94fea9@newmexico> <15450.37192.51717.328419@gondolin.digicool.com>
Message-ID: <047c01c1ab84$bf7170c0$6d94fea9@newmexico>

[Jeremy Hylton]
> Thanks for good questions and suggestions.  Too bad you can't come to
> dev day.  I'll try to post slides before or after the talk -- and
> update the PEP.

Here are some more wild ideas, probably more thought provoking
than useful, but this is really an area where only the profiler knows
the truth <wink>.

>   SP> And Python with modules, data-objects, class/instances, types
>   SP> etc is quite a zoo :(.
>
> And, again, this is a problem.  The same sorts of techniques apply to
> all namespaces.  It would be good to try to make the approach
> general, but some namespaces are more dynamic than others.  Python's
> classes, lack of declarations, and separate compilation of modules
> means class/instance namespaces are hard to do right.  Need to defer a
> lot of final decisions to runtime and keep an extra dictionary around
> just in case.
>

* instance namespaces

As I said, what eventually happens with class/type unification plays
a role here.

1. __slots__ are obviously a good thing here :)
2. old-style instances  and in general instances with a dict:

one can try to guess the slots of a class by looking for the "self.attr"
pattern at compile time in a more or less clever way.
The set of compile-time guessed attrs will be passed to MAKE_CLASS,
which will construct the runtime guess using the union of the
superclasses' guesses and the compile-time guess for the class.
This information can be used to lay out a dlict.
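
A rough sketch of the compile-time guess, written with the present-day
ast module.  The function names guess_self_attrs() and runtime_guess()
are invented, only assignments of the form "self.name = ..." are
counted, and the sorted order of the result is arbitrary rather than a
real dlict layout.

```python
import ast

def guess_self_attrs(source):
    """Return {class name: set of names assigned as self.<name>}."""
    guesses = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            attrs = set()
            for sub in ast.walk(node):
                # Match an attribute being stored on a variable named "self".
                if (isinstance(sub, ast.Attribute)
                        and isinstance(sub.ctx, ast.Store)
                        and isinstance(sub.value, ast.Name)
                        and sub.value.id == "self"):
                    attrs.add(sub.attr)
            guesses[node.name] = attrs
    return guesses

def runtime_guess(own_guess, *base_guesses):
    """Union of a class's own guess with the guesses of its superclasses."""
    return sorted(set(own_guess).union(*base_guesses))

SOURCE = '''
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Point3D(Point):
    def __init__(self, x, y, z):
        Point.__init__(self, x, y)
        self.z = z
'''

guesses = guess_self_attrs(SOURCE)
layout = runtime_guess(guesses["Point3D"], guesses["Point"])
```

Attributes set through setattr() or added from outside the class are
invisible to such a guess, so a fallback dict would still be needed.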

* zoo problem
[yes, as I said, this whole inline cache thing is supposed
to trade memory for speed. And the fact that Python's
internal objects are so inhomogeneous/polymorphic <wink>
does not help to keep the amount small; for example,
having only new-style classes would help]

ideally one can assign to each bytecode in a code object
whose behavior depends/dispatches on the concrete object "type"
a "cache line" (or many; polymorphic inline caches
in modern Smalltalk implementations do that in the context of the JIT).
(As long as the GIL is there we do not need per-thread
versions of the caches.)

the first entries in the "cache-line" could contain the PyObject type
and then a function pointer, so then we would have a common
logic like:

  if PyObjectType(obj) == cache_line.type:
     cache_line.onType()
  else:
     ...

then the per-type code could use the rest of the space in cache-line
polymorphically to contain type-specific cached "dispatch" info.
E.g. the index of a dict entry for the load_attr/set_attr logic on an instance
...

Abstractly one can think about a cache-line for a bytecode as
the streamlined version, in terms of values and/or code pointers, of the
most recently taken path for that bytecode, plus values to check whether
the very same path still makes sense.

1. in practice these ideas can perform very poorly
2. this tries to address things/internals as they are
3. Yup, anything on the object layout/behavior
   side that simplifies this picture is probably a step
   in the right direction.

regards, Samuele.






From jeremy@alum.mit.edu  Sat Feb  2 01:15:21 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Feb 2002 20:15:21 -0500
Subject: [Python-Dev] Re: opcode performance measurements
In-Reply-To: <047c01c1ab84$bf7170c0$6d94fea9@newmexico>
References: <BCEJJGNAEAKMACBPLMEPKEBGCAAA.jeremy@alum.mit.edu>
 <021901c1ab4e$867fb640$6d94fea9@newmexico>
 <15450.37192.51717.328419@gondolin.digicool.com>
 <047c01c1ab84$bf7170c0$6d94fea9@newmexico>
Message-ID: <15451.15785.174046.282855@gondolin.digicool.com>

>>>>> "SP" == Samuele Pedroni <pedronis@bluewin.ch> writes:

  SP> * instance namespaces

  SP> As I said but what eventually will happen with class/type
  SP> unification plays a role.

  SP> 1. __slots__ are obviously a good thing here :)
  SP> 2. old-style instances and in general instances with a dict:

  SP> one can try to guess the slots of a class looking for the
  SP> "self.attr" pattern at compile time in a more or less clever
  SP> way.  The set of compile-time guessed attrs will be passed to
  SP> MAKE_CLASS which will construct the runtime guess using the
  SP> union of the super-classes guesses and the compile time guess
  SP> for the class.  This information can be used to layout a dlict.

Right!  There's another step necessary to take advantage though.  When
you execute a method you don't know the receiver type
(self.__class__).  So you need to specialize the bytecode to a
particular receiver the first time the method is called.  Since this
could be relatively expensive and you don't know how often the method
will be executed, you need to decide dynamically when to do it.  Just
like HotSpot.

We probably have to worry about a class or instance being modified in
a way that invalidates the dlict offsets computed.  (Not sure here,
but I think that's the case.)  If so, we probably need a different
object -- call it a template -- that represents the concrete layout
and is tied to unmodified concrete class.  When objects or classes are
modified in dangerous ways, we'd need to invalidate the template
pointer for the affected instances.

Jeremy



From pedroni@inf.ethz.ch  Sat Feb  2 02:25:05 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Sat, 2 Feb 2002 03:25:05 +0100
Subject: [Python-Dev] Re: opcode performance measurements
References: <BCEJJGNAEAKMACBPLMEPKEBGCAAA.jeremy@alum.mit.edu><021901c1ab4e$867fb640$6d94fea9@newmexico><15450.37192.51717.328419@gondolin.digicool.com><047c01c1ab84$bf7170c0$6d94fea9@newmexico> <15451.15785.174046.282855@gondolin.digicool.com>
Message-ID: <004201c1ab90$d3c18540$9d97bac3@newmexico>

From: Jeremy Hylton <jeremy@zope.com>
> >>>>> "SP" == Samuele Pedroni <pedronis@bluewin.ch> writes:
...
>   SP> one can try to guess the slots of a class looking for the
>   SP> "self.attr" pattern at compile time in a more or less clever
>   SP> way.  The set of compile-time guessed attrs will be passed to
>   SP> MAKE_CLASS which will construct the runtime guess using the
>   SP> union of the super-classes guesses and the compile time guess
>   SP> for the class.  This information can be used to layout a dlict.
> 
> Right!  There's another step necessary to take advantage though.  When
> you execute a method you don't know the receiver type
> (self.__class__).  So you need to specialize the bytecode to a
> particular receiver the first time the method is called.  Since this
> could be relatively expensive and you don't know how often the method
> will be executed, you need to decide dynamically when to do it.  Just
> like HotSpot.

Right, because with multiple inheritance you cannot make the layout
of a subclass compatible with that of *all* superclasses, so simple
monomorphic inline caches will not work :(.
OTOH you can use polymorphic inline caches, which means
a bunch of class->index lines for each bytecode, or
not specialize the bytecode but (insane idea) choose
on method entry a different bunch of cache-lines based on self's class.

> We probably have to worry about a class or instance being modified in
> a way that invalidates the dlict offsets computed.  (Not sure here,
> but I think that's the case.)  If so, we probably need a different
> object -- call it a template -- that represents the concrete layout
> and is tied to unmodified concrete class.  When objects or classes are
> modified in dangerous ways, we'd need to invalidate the template
> pointer for the affected instances.

This would be similar to the Self VM map concept (although Python
is type/class based, because of the very dynamic nature of instances it
has problems similar to those of prototype-based languages).

I don't know if we need that or if it can be implemented effectively;
I considered that too during my brainstorming.

AFAIK caching/memoization plays an important role in all
high perf dynamic object languages impls. Abstractly it seems
effective for Python too, but it is unclear if the complexity
of the internal models will render it ineffective.

With caching you can probably simply timestamp classes:
when a class is changed structurally you increment its
timestamp and that of all direct and indirect subclasses;
you don't touch instances. Then you compare
the cached timestamp with that of the instance's class to
check if the entry is valid.

The tricky part is that in Python an instance attribute
can be added at any point that shadows a class
attribute. I don't know if there are open issues,
but an approach would be in that case to increment
the timestamp of the instance's class too.
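
A toy sketch of the timestamp check in Python.  All names (bump,
set_class_attr, AttrCache, the _stamp attribute) are hypothetical, a
global counter stands in for "increment", and the cache only handles
class attributes, so the shadowing case is deliberately left out.

```python
import itertools

_next_stamp = itertools.count(1)     # source of fresh, never-reused stamps

def bump(cls):
    """Give cls and all direct and indirect subclasses a new timestamp."""
    cls._stamp = next(_next_stamp)
    for sub in cls.__subclasses__():
        bump(sub)

def set_class_attr(cls, name, value):
    """Structural change to a class: store the attribute, then bump."""
    setattr(cls, name, value)
    bump(cls)

class AttrCache:
    """Cache for one attribute load site; instances are never touched."""
    def __init__(self, name):
        self.name = name
        self.cls = None
        self.stamp = None
        self.value = None
        self.hits = 0

    def load(self, obj):
        cls = type(obj)
        stamp = getattr(cls, "_stamp", 0)
        if cls is self.cls and stamp == self.stamp:
            self.hits += 1
            return self.value              # entry still valid
        self.value = getattr(cls, self.name)   # normal lookup
        self.cls, self.stamp = cls, stamp
        return self.value

class A:
    greeting = "hello"

class B(A):
    pass

cache = AttrCache("greeting")
b = B()
first = cache.load(b)                 # miss, fills the cache
second = cache.load(b)                # hit
set_class_attr(A, "greeting", "bye")  # bumps A and its subclass B
third = cache.load(b)                 # stale stamp, so looked up again
```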

The problem is that there are so many cases and
situations; that's why the multi-staged cache-lines
approach makes some sense in theory but could
still be totally ineffective in practice <wink>.

These are all interesting topics, although from these
more or less informal discussions to results there are
a lot of details and code :(.

But already improving the global lookup thing
would be a good step.

Hope this makes some kind of sense. Samuele.










From neal@metaslash.com  Sat Feb  2 14:38:33 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sat, 02 Feb 2002 09:38:33 -0500
Subject: [Python-Dev] Re: opcode performance measurements
References: <BCEJJGNAEAKMACBPLMEPKEBGCAAA.jeremy@alum.mit.edu><021901c1ab4e$867fb640$6d94fea9@newmexico><15450.37192.51717.328419@gondolin.digicool.com><047c01c1ab84$bf7170c0$6d94fea9@newmexico> <15451.15785.174046.282855@gondolin.digicool.com> <004201c1ab90$d3c18540$9d97bac3@newmexico>
Message-ID: <3C5BF9E9.1EFEA2FE@metaslash.com>

Samuele Pedroni wrote:
> 
> From: Jeremy Hylton <jeremy@zope.com>
> > >>>>> "SP" == Samuele Pedroni <pedronis@bluewin.ch> writes:
> ...
> >   SP> one can try to guess the slots of a class looking for the
> >   SP> "self.attr" pattern at compile time in a more or less clever
> >   SP> way.  The set of compile-time guessed attrs will be passed to
> >   SP> MAKE_CLASS which will construct the runtime guess using the
> >   SP> union of the super-classes guesses and the compile time guess
> >   SP> for the class.  This information can be used to layout a dlict.
> >
> > Right!  There's another step necessary to take advantage though.  When
> > you execute a method you don't know the receiver type
> > (self.__class__).  So you need to specialize the bytecode to a
> > particular receiver the first time the method is called.  Since this
> > could be relatively expensive and you don't know how often the method
> > will be executed, you need to decide dynamically when to do it.  Just
> > like HotSpot.

Why not assume the general case is the most common, ie, that the
object is an instance of this class or one of its subclasses?
That way you could do the specialization at compile time.  And
for the (presumably) few times that this isn't true fall back to another
technique, perhaps like HotSpot.

Also, doesn't calling a base class method as:
	Base.method(self)	# in particular __init__()
vs.
	self.method()

create problems if you specialize for a specific class?  Or does
specialization necessarily mean for a subclass and all its base classes?

> Right, because with multiple inheritance you cannot make the layout
> of a subclass compatible with that of *all* superclasses, so simple
> monomorphic inline caches will not work :(.

ISTM that it would be best to handle single inheritance first.
Multiple inheritance could perhaps be handled for the class with
the most commonly referenced attribute (assuming 2+ classes don't
define the same attr).  And use a fallback technique for all other cases.

> > We probably have to worry about a class or instance being modified in
> > a way that invalidates the dlict offsets computed.  (Not sure here,
> > but I think that's the case.)  If so, we probably need a different

Right, if an attr is deleted, methods added/removed dynamically, etc.

> > object -- call it a template -- that represents the concrete layout
> > and is tied to unmodified concrete class.  When objects or classes are
> > modified in dangerous ways, we'd need to invalidate the template
> > pointer for the affected instances.

By using a template, doesn't that become a dict lookup again?

> These are all interesting topics, although from these
> more or less informal discussions to results there is
> a lot of details and code :(.

I agree.  Since we can't know what will be optimal, it seems safer
to keep the existing functionality as a fallback case and try 
to improve things with small steps (eg, single inheritance, first).

> But already improving the global lookup thing
> would be a good step.

Definitely.

Neal


From pedroni@inf.ethz.ch  Sat Feb  2 15:44:44 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Sat, 2 Feb 2002 16:44:44 +0100
Subject: [Python-Dev] Re: opcode performance measurements
References: <BCEJJGNAEAKMACBPLMEPKEBGCAAA.jeremy@alum.mit.edu><021901c1ab4e$867fb640$6d94fea9@newmexico><15450.37192.51717.328419@gondolin.digicool.com><047c01c1ab84$bf7170c0$6d94fea9@newmexico> <15451.15785.174046.282855@gondolin.digicool.com> <004201c1ab90$d3c18540$9d97bac3@newmexico> <3C5BF9E9.1EFEA2FE@metaslash.com>
Message-ID: <00b601c1ac00$89bc96e0$6d94fea9@newmexico>

From: Neal Norwitz <neal@metaslash.com>
> Why not assume the general case is the most common, ie, that the
> object is an instance of this class or one of its subclasses?
> That way you could do the specialization at compile time.  And
> for the (presumably) few times that this isn't true fallback to another
> technique, perhaps like HotSpot.
>
> Also, doesn't calling a base class method as:
> Base.method(self) # in particular __init__()
> vs.
> self.method()
>
> create problems if you specialize for a specific class?  Or does
> specialization necessarily mean for a subclass and all its base classes?

Puzzled. In Python you could specialize at MAKE_CLASS time,
which means rewriting all the direct and indirect superclasses' methods
and the class's methods under the assumption that self is of the built class.
Doing so is probably too expensive. Typically specializing only
when a given method is actually called makes more sense.
Btw typical systems specialize and native-compile at the same time;
if you subtract the native-compile part your cost equation changes
a lot. Given that people can rebind self (although nobody does),
you would need data/control flow analysis, which is too bad:

def a(self):
   self = 3
   return self+1

Also:

  def __add__(self,o):
    ...

You cannot do anything special for o :(.


> [Samuele] > Right, because with multiple inheritance you cannot make the layout
> > of a subclass compatible with that of *all* superclasses, so simple
> > monomorphic inline caches will not work :(.
>
> ISTM that it would be best to handle single inheritance first.
> Multiple inheritance could perhaps be handled for the class with
> the most commonly referenced attribute (assuming 2+ classes don't
> define the same attr).  And use a fallback technique for all other cases.
>

How do you decide which are the most commonly referenced attributes?
<wink>

> > > We probably have to worry about a class or instance being modified in
> > > a way that invalidates the dlict offsets computed.  (Not sure here,
> > > but I think that's the case.)  If so, we probably need a different
>
> Right, if an attr is deleted, methods added/removed dynamically, etc.

It really depends on implementation details.

> > > object -- call it a template -- that represents the concrete layout
> > > and is tied to unmodified concrete class.  When objects or classes are
> > > modified in dangerous ways, we'd need to invalidate the template
> > > pointer for the affected instances.
>
> By using a template, doesn't that become a dict lookup again?

The good thing about templates as an idea is that they
could solve the zoo issue.
You're right about the lookup, but to see the utility you should
bring per-bytecode-instruction caches into the picture:

  if obj.template == cache_line.template:
    use cache_line.cached_lookup_result
  else:
    lookup and update cache_line

[The Self VM used maps (read templates) in such a way]
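
A small Python sketch of that template check.  Template, Obj, CacheLine
and load_attr are all made-up names, and storing values in a list
indexed by slot number is an assumption made for the example, not a
description of the Self VM or of any Python implementation.

```python
class Template:
    """Shared layout: attribute name -> slot index."""
    def __init__(self, names):
        self.index = {name: i for i, name in enumerate(names)}

class Obj:
    """Instance whose values live in slots described by its template."""
    def __init__(self, template, values):
        self.template = template
        self.slots = list(values)

class CacheLine:
    """Cache for one load-attribute instruction."""
    def __init__(self, name):
        self.name = name
        self.template = None
        self.slot = None
        self.hits = 0

def load_attr(line, obj):
    if obj.template is line.template:
        line.hits += 1
        return obj.slots[line.slot]        # fast path: no dict lookup
    # Slow path: one dict lookup on the template, then update the cache.
    line.slot = obj.template.index[line.name]
    line.template = obj.template
    return obj.slots[line.slot]

point = Template(["x", "y"])
a = Obj(point, [1, 2])
b = Obj(point, [3, 4])
line = CacheLine("y")
values = [load_attr(line, a), load_attr(line, b)]
```

The dict lookup happens once per template seen at a site, not once per
object, which is where the saving would come from.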

There is really a huge hack/implementation space to play with.

These comments are mainly informal; if the interest
remains after the conference I will be pleased to participate
in more focused and into-the-details discussions.

regards, Samuele.




From tim.one@home.com  Sat Feb  2 21:51:37 2002
From: tim.one@home.com (Tim Peters)
Date: Sat, 2 Feb 2002 16:51:37 -0500
Subject: [Python-Dev] distutils & stderr
In-Reply-To: <15450.43128.601760.548974@12-248-41-177.client.attbi.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHKNJAA.tim.one@home.com>

[Tim]
> If distutils output isn't interesting to PyInline users,
> shouldn't PyInline be changed to run setup.py with its -q/--quiet
> option?

[Skip]
> Probably so, but not all prints are guarded by "if verbose:".

Have you tried it in the case you complained about at the start of this?
These days I routinely build elaborate pieces of Zope using -q, and the only
msgs I ever see then are things like

"""
MultiMapping.c
   Creating library build\temp.win32-2.3\Release\MultiMapping.lib and object
build\temp.win32-2.3\Release\MultiMapping.exp
"""

I believe those are generated by Microsoft's compiler (the case-sensitive
string "Creating" appears nowhere in the distutils source; and yes, these go
to stdout too), and if so there's nothing distutils can do about that.  I
don't see any messages that look like they come from distutils.
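
A minimal sketch of how a wrapper might drive a build quietly.  The
function names are hypothetical and this is not PyInline's actual code;
the only thing taken from the thread is that -q/--quiet is a global
setup.py option, so it goes before the command name.

```python
import subprocess
import sys


def quiet_build_command(setup_script="setup.py", command="build"):
    """Return the argv for a quiet run of a setup script."""
    return [sys.executable, setup_script, "-q", command]


def run_quiet_build(setup_script="setup.py"):
    """Run the build and capture whatever the compiler itself still prints."""
    proc = subprocess.run(
        quiet_build_command(setup_script),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Capturing both streams keeps compiler chatter out of the caller's own
stdout, which matters for scripts meant to behave like filters.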

just-because-you-don't-understand-the-code-doesn't-mean-it-doesn't-
    do-what-you-want<wink>-ly y'rs  - tim



From skip@pobox.com  Sun Feb  3 03:43:43 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 2 Feb 2002 21:43:43 -0600
Subject: [Python-Dev] distutils & stderr
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEHKNJAA.tim.one@home.com>
References: <15450.43128.601760.548974@12-248-41-177.client.attbi.com>
 <LNBBLJKPBEHFEDALKOLCOEHKNJAA.tim.one@home.com>
Message-ID: <15452.45551.652825.195467@12-248-41-177.client.attbi.com>

    Tim> [Skip]
    >> Probably so, but not all prints are guarded by "if verbose:".

    Tim> Have you tried it in the case you complained about at the start of
    Tim> this?

Yes, and it seems to shut things up just fine.  I made that comment after
having modified my source to dump all prints to stderr.

    Tim> MultiMapping.c
    Tim>    Creating library build\temp.win32-2.3\Release\MultiMapping.lib and object
    Tim> build\temp.win32-2.3\Release\MultiMapping.exp

    Tim> I believe those are generated by Microsoft's compiler (the
    Tim> case-sensitive string "Creating" appears nowhere in the distutils
    Tim> source; and yes, these go to stdout too), and if so there's nothing
    Tim> distutils can do about that.  I don't see any messages that look
    Tim> like they come from distutils.

Windows matters little to me for most applications, and not at all when I
write scripts that I want to work like Unix filters, which is what my
original complaint was about.  I will suggest to Ken Simpson that PyInline
use the -q flag.

Thx,

Skip


From skip@pobox.com  Sun Feb  3 15:56:17 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 3 Feb 2002 09:56:17 -0600
Subject: [Python-Dev] network access from conference?
Message-ID: <15453.23969.582214.695481@12-248-41-177.client.attbi.com>

Any hope of network access from the conference?  I have ethernet and
wireless cards (no modem - one of those stinkin' winmodems came with my
laptop).

Thx,

Skip


From guido@python.org  Sun Feb  3 18:16:43 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 03 Feb 2002 13:16:43 -0500
Subject: [Python-Dev] network access from conference?
In-Reply-To: Your message of "Sun, 03 Feb 2002 09:56:17 CST."
 <15453.23969.582214.695481@12-248-41-177.client.attbi.com>
References: <15453.23969.582214.695481@12-248-41-177.client.attbi.com>
Message-ID: <200202031816.g13IGiJ15881@pcp742651pcs.reston01.va.comcast.net>

> Any hope of network access from the conference?  I have ethernet and
> wireless cards (no modem - one of those stinkin' winmodems came with my
> laptop).

I think there was network access last year so I'm counting on it
myself.  Wireless might be a possibility too, but bring your ethernet
card too.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun Feb  3 20:39:55 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 03 Feb 2002 15:39:55 -0500
Subject: [Python-Dev] Re: Network access from conference?
Message-ID: <200202032039.g13KdtY16193@pcp742651pcs.reston01.va.comcast.net>

Thought this might be good to know!

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Sun, 03 Feb 2002 14:35:44 -0500
From:    Kevin Jacobs <jacobs@penguin.theopalgroup.com>
To:      <guido@python.org>
Subject: FYI: Re: Network access from conference?

I am bringing a wireless access point, so we can create a CAN (Conference
Area Network) if you want.  Feel free to send out an e-mail to python-dev or
the Python10 list and let people know that they can bring their WiFi cards.
The only caveat is that I won't arrive until Monday evening, so the first
day participants are somewhat out of luck.

-Kevin

-- 
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com


------- End of Forwarded Message



From guido@python.org  Mon Feb  4 02:35:25 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 03 Feb 2002 21:35:25 -0500
Subject: [Python-Dev] Want to co-design and implement a logging module?
Message-ID: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net>

I'd like to see a logging module in the standard Python library.  Is
anybody interested in helping spec out requirements and work on an
implementation?  Some ideas from Zope's zLOG module should probably go
into it (it should eventually be a replacement for that), and some
from log4j (http://jakarta.apache.org/log4j/docs/).

Any takers?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@rahul.net  Mon Feb  4 06:36:07 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Sun, 3 Feb 2002 22:36:07 -0800 (PST)
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> from "Guido van Rossum" at Feb 03, 2002 09:35:25 PM
Message-ID: <20020204063607.EA948E8C6@waltz.rahul.net>

Guido van Rossum wrote:
> 
> I'd like to see a logging module in the standard Python library.  Is
> anybody interested in helping spec out requirements and work on an
> implementation?  Some ideas from Zope's zLOG module should probably go
> into it (it should eventually be a replacement for that), and some
> from log4j (http://jakarta.apache.org/log4j/docs/).
> 
> Any takers?

I'm not sure I'm a "taker", but I did a bit of research and found log4p,
http://log4p.sourceforge.net/

Have you looked at it, and if yes, what's a short reason why it wouldn't
be suitable?  (One of the things I disliked about Zlogger (I believe
that's the correct name) is that it seems to require an error tuple,
based on what I'm reading in
http://www.zope.org/Documentation/Misc/LOGGING.txt
I believe that loggers should be more generic.)
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From martin@v.loewis.de  Mon Feb  4 07:19:40 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 04 Feb 2002 08:19:40 +0100
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: <20020204063607.EA948E8C6@waltz.rahul.net>
References: <20020204063607.EA948E8C6@waltz.rahul.net>
Message-ID: <m3665dk92r.fsf@mira.informatik.hu-berlin.de>

aahz@rahul.net (Aahz Maruch) writes:

> I'm not sure I'm a "taker", but I did a bit of research and found log4p,
> http://log4p.sourceforge.net/
> 
> Have you looked at it, and if yes, what's a short reason why it wouldn't
> be suitable?

The thing I dislike about log4p is that it looks much too Java-ish.

import java.util.DateFormat;

is not something I would like to do when using the standard Python
library.

Regards,
Martin


From mal@lemburg.com  Mon Feb  4 09:39:27 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 04 Feb 2002 10:39:27 +0100
Subject: [Python-Dev] Want to co-design and implement a logging module?
References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C5E56CF.F2641490@lemburg.com>

Guido van Rossum wrote:
> 
> I'd like to see a logging module in the standard Python library.  Is
> anybody interested in helping spec out requirements and work on an
> implementation?  Some ideas from Zope's zLOG module should probably go
> into it (it should eventually be a replacement for that), and some
> from log4j (http://jakarta.apache.org/log4j/docs/).
> 
> Any takers?

You might want to have a look at mx.Log which is part of the
egenix-mx-base distribution. It is undocumented, but reading the
source should give some insights.

The basic idea is that you have logging objects which are
usually created as singletons; these can then log various 
information depending on a fine grained verbosity level to a 
log file, stdout or stderr.
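
As a rough illustration of that idea (this is not the mx.Log API, which
is undocumented; the class and its signature are made up for the
example), a verbosity-filtered logging object could look like this:

```python
import io
import sys


class Log:
    """Hypothetical logger: messages above the verbosity level are dropped."""
    def __init__(self, stream=None, verbosity=1):
        self.stream = stream if stream is not None else sys.stderr
        self.verbosity = verbosity

    def __call__(self, level, message):
        if level > self.verbosity:
            return False                      # too detailed for this setting
        self.stream.write("[%d] %s\n" % (level, message))
        return True


# One shared instance, created once and imported everywhere else.
buf = io.StringIO()                           # stands in for a file or stderr
log = Log(stream=buf, verbosity=2)
log(1, "starting up")
log(3, "very detailed trace")                 # filtered out
```

Swapping the stream argument selects between a log file, stdout and
stderr without touching the call sites.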

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From aahz@rahul.net  Mon Feb  4 14:40:20 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Mon, 4 Feb 2002 06:40:20 -0800 (PST)
Subject: [Python-Dev] Tuples vs. lists
In-Reply-To: <200201282126.QAA30702@pcp742651pcs.reston01.va.comcast.net> from "Guido van Rossum" at Jan 28, 2002 04:26:23 PM
Message-ID: <20020204144021.56495E8C3@waltz.rahul.net>

Guido van Rossum wrote:
> Aahz:
>>
>> It's a constant.  The BCD module is Binary Coded Decimal; instances are
>> intended to be as immutable as strings and numbers (well, it *is* a
>> number type).  Modifying an instance is guaranteed to produce a new
>> instance.  To a large extent, I guess I feel that if a class is intended
>> to be immutable, each of its underlying data attributes should also be
>> immutable.
> 
> Or you could assign it to a private variable.

And in private e-mail, Guido writes:
>
> I hate to continue harping on this tiny item in public, but what would
> you do if you needed a constant dictionary?

<grin> <throw up hands>  All right, I guess it's time for me to just
follow the Python motto: "There's only one way, and that way is Guido's."
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From trentm@ActiveState.com  Mon Feb  4 17:41:49 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Mon, 4 Feb 2002 09:41:49 -0800
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net>; from guido@python.org on Sun, Feb 03, 2002 at 09:35:25PM -0500
References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020204094149.C31089@ActiveState.com>

On Sun, Feb 03, 2002 at 09:35:25PM -0500, Guido van Rossum wrote:
> I'd like to see a logging module in the standard Python library.  Is
> anybody interested in helping spec out requirements and work on an
> implementation?  Some ideas from Zope's zLOG module should probably go
> into it (it should eventually be a replacement for that), and some
> from log4j (http://jakarta.apache.org/log4j/docs/).
> 
> Any takers?

I'll take it. I have been (slowly) working on a log4j translation, trying to
stay as close to log4j's API as possible. I'll take a look at zLOG.
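
For readers unfamiliar with the log4j model, here is a sketch of its
main concepts (named hierarchical loggers, levels, appenders, layouts).
It uses the logging module that later entered the standard library and
follows this design; it is not the translation described above, and the
logger names "app" and "app.db" are invented for the example.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)       # log4j calls this an appender
handler.setFormatter(                         # ... and this a layout
    logging.Formatter("%(name)s %(levelname)s %(message)s"))

parent = logging.getLogger("app")
parent.setLevel(logging.WARNING)
parent.addHandler(handler)
parent.propagate = False                      # keep output out of the root logger

child = logging.getLogger("app.db")           # inherits level and handler from "app"
child.info("connection opened")               # below WARNING: dropped
child.error("query failed")                   # emitted through the parent's handler

output = stream.getvalue()
```

The dotted names form the hierarchy: configuring "app" once is enough
for every logger below it.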


[Aahz said]
> I'm not sure I'm a "taker", but I did a bit of research and found log4p,
> http://log4p.sourceforge.net/

That one has not seen any development for ages and I don't believe it is even
functional.

There *is* a log4py out there.
    http://www.its4you.at/log4py.php
    http://sourceforge.net/project/showfiles.php?group_id=36216
I took a quick look at it a while ago and thought it was pretty limited.
Perhaps not though -- I may have been suffering from a bout of "Not invented
here."


[MAL said:]
> You might want to have a look at mx.Log which is part of the
> egenix-mx-base distribution. It is undocumented, but reading the
> source should give some insights.
> 
> The basic idea is that you have logging objects which are
> usually created as singletons; these can then log various
> information depending on a fine grained verbosity level to a
> log file, stdout or stderr.

Sounds very similar to log4j. I'll take a look at that too.



Note that the log4j manual that is currently up
(http://jakarta.apache.org/log4j/docs/manual.html) is for the current release
version. They have an alpha version that cleans up the naming a little bit
mainly, I think, to try to make log4j look a little bit more like the
java.util.logging API.

Actually, log4j's site *used* to have a bunch of other pages up there that
included links to contributed packages and ports of log4j to other languages
(C, C++, Perl, Python, etc).


How about I try to have a PEP together within a week or two, and perhaps a
working base implementation?

Trent

-- 
Trent Mick
TrentM@ActiveState.com


From barry@zope.com  Mon Feb  4 19:51:30 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 4 Feb 2002 14:51:30 -0500
Subject: [Python-Dev] Want to co-design and implement a logging module?
References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net>
 <20020204094149.C31089@ActiveState.com>
Message-ID: <15454.58946.378611.495989@anthem.wooz.org>

>>>>> "TM" == Trent Mick <trentm@ActiveState.com> writes:

    TM> How about I try to have a PEP together within a week or two,
    TM> and perhaps a working base implementation?

+1

-Barry


From tim.one@home.com  Mon Feb  4 20:36:52 2002
From: tim.one@home.com (Tim Peters)
Date: Mon, 4 Feb 2002 15:36:52 -0500
Subject: [Python-Dev] Tuples vs. lists
In-Reply-To: <20020204144021.56495E8C3@waltz.rahul.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEELHNJAA.tim.one@home.com>

[Guido]
> I hate to continue harping on this tiny item in public, but what would
> you do if you needed a constant dictionary?

[Aahz]
> <grin> <throw up hands>  All right, I guess it's time for me to just
> follow the Python motto: "There's only one way, and that way is Guido's."

Well, the other one way is to agitate for, e.g., accepting the new digraphs

   {?
   ?}

as delimiting a constant dict <wink>.

If I were Aahz, I'd keep using tuples:  a serious BCD user can have
gazillions of these objects sitting around, and tuples also allow
significant memory savings over lists.  If you have to, think of the digits
'3' and '7' of being different types, so that you can fool Guido into
believing it's not a homogeneous collection (he doesn't read the fine print
in math-related code <wink>).
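
The memory claim is easy to check on CPython with sys.getsizeof, which
reports the size of the container itself (not of the integers it refers
to).  A small sketch; the digit values are arbitrary and the exact byte
counts vary by platform and version, so none are shown:

```python
import sys

digits_tuple = (3, 1, 4, 1, 5, 9, 2, 6)
digits_list = list(digits_tuple)

tuple_size = sys.getsizeof(digits_tuple)
list_size = sys.getsizeof(digits_list)

# Per-object saving; multiply by the number of live BCD instances.
saving = list_size - tuple_size
```

A tuple stores its items inline in one block, while a list carries a
separate, possibly over-allocated item array plus bookkeeping fields.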

practicality-beats-purity-ly y'rs  - tim



From jeremy@zope.com  Tue Feb  5 04:33:14 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 4 Feb 2002 23:33:14 -0500
Subject: [Python-Dev] Tuples vs. lists
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEELHNJAA.tim.one@home.com>
References: <20020204144021.56495E8C3@waltz.rahul.net>
 <LNBBLJKPBEHFEDALKOLCEELHNJAA.tim.one@home.com>
Message-ID: <15455.24714.128907.199785@gondolin.digicool.com>

Hey, should I change all the tuples in code objects to be lists, too?
A code object has got things like co_names and co_consts.  They're
currently implemented as tuples, but they're just homogenous,
variable-length sequences.  <wink>

'course if people modified the lists, they'd cause Python to dump
core.
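
The attributes mentioned above can be inspected directly.  A short
sketch, assuming a current Python where a function's code object is
reached through __code__; the function area is made up for the example:

```python
def area(radius):
    return 3.14159 * radius ** 2 + len(str(radius))


code = area.__code__
names = code.co_names      # names looked up at run time, such as len and str
consts = code.co_consts    # literal constants used by the function body

try:
    names[0] = "oops"      # tuples reject item assignment
    mutated = True
except TypeError:
    mutated = False
```

Because both sequences are tuples, the interpreter can index into them
without worrying that Python code has changed them underneath it.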

Jeremy





From gh_pythonlist@gmx.de  Tue Feb  5 04:52:10 2002
From: gh_pythonlist@gmx.de (Gerhard =?iso-8859-15?Q?H=E4ring?=)
Date: Tue, 5 Feb 2002 05:52:10 +0100
Subject: [Python-Dev] Should Python compile as C++?
Message-ID: <20020205045209.GB1181@lilith.hqd-internal>

I'm currently doing a native mingw32 port of Python, and I've hit the
ugly "initializer is not a constant" problem mentioned in the FAQ. Hmm,
looks like I have three options:

1 Fix the Python sources in the Object/ directory and initialize the
  structs in a separate init_objects function
2 compile Python with a C++ compiler
3 fix the mingw32 compiler

When trying option 2, I recognized that a lot of Python's source is not
valid ANSI C++. There are even variable names like "class" and "new".
There are of course less obvious issues when trying to make the source
compile as C++, in particular a lot more casts are needed. If it's just
that Python is supposed to compile as C++ but it hasn't been tested for
a while, I could do the necessary fixes and submit a patch. But if
that's a new idea, I don't know if fixing it now makes sense.

Because I plan to submit the required changes as a patch when the port
is ready, I'd like to know if you'd accept a patch for option #1.

Gerhard
-- 
This sig powered by Python!
Outside temperature in München: 6.1 °C      Wind: 4.0 m/s


From guido@python.org  Tue Feb  5 06:34:19 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 05 Feb 2002 01:34:19 -0500
Subject: [Python-Dev] Should Python compile as C++?
In-Reply-To: Your message of "Tue, 05 Feb 2002 05:52:10 +0100."
 <20020205045209.GB1181@lilith.hqd-internal>
References: <20020205045209.GB1181@lilith.hqd-internal>
Message-ID: <200202050634.g156YJf19271@pcp742651pcs.reston01.va.comcast.net>

> I'm currently doing a native mingw32 port of Python, and I've hit the
> ugly "initializer is not a constant" problem mentioned in the FAQ. Hmm,
> looks like I have three options:
> 
> 1 Fix the Python sources in the Object/ directory and initialize the
>   structs in a separate init_objects function
> 2 compile Python with a C++ compiler
> 3 fix the mingw32 compiler
> 
> When trying option 2, I recognized that a lot of Python's source is not
> valid ANSI C++. There are even variable names like "class" and "new".
> There are of course less obvious issues when trying to make the source
> compile as C++, in particular a lot more casts are needed. If it's just
> that Python is supposed to compile as C++ but it hasn't been tested for
> a while, I could do the necessary fixes and submit a patch. But if
> that's a new idea, I don't know if fixing it now makes sense.
> 
> Because I plan to submit the required changes as a patch when the port
> is ready, I'd like to know if you'd accept a patch for option #1.

Sounds to me like the Mingw32 compiler is not ANSI compatible.  I
don't want to have to change the source to accommodate a broken
compiler that a very small minority of users want to use.  So I am
against #1.

We never said that our .c files would be valid C++ (.h files is a
different story) so I think #2 is not an option.

I vote for #3 -- if enough software can't be compiled with mingw32 the
compiler will be fixed, as it should, and I'm happy to help encourage
this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gh_pythonlist@gmx.de  Tue Feb  5 09:30:55 2002
From: gh_pythonlist@gmx.de (Gerhard =?iso-8859-15?Q?H=E4ring?=)
Date: Tue, 5 Feb 2002 10:30:55 +0100
Subject: [Python-Dev] Should Python compile as C++?
In-Reply-To: <200202050634.g156YJf19271@pcp742651pcs.reston01.va.comcast.net>
References: <20020205045209.GB1181@lilith.hqd-internal> <200202050634.g156YJf19271@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020205093054.GA7547@lilith.hqd-internal>

On 05/02/02 at 01:34, Guido van Rossum wrote:
> > I'm currently doing a native mingw32 port of Python, and I've hit the
> > ugly "initializer is not a constant" problem mentioned in the FAQ. Hmm,
> > looks like I have three options:
> > 
> > 1 Fix the Python sources in the Object/ directory and initialize the
> >   structs in a separate init_objects function
> > 2 compile Python with a C++ compiler
> > 3 fix the mingw32 compiler
> > 
> > [Python doesn't compile with C++ compiler]
> > 
> > Because I plan to submit the required changes as a patch when the port
> > is ready, I'd like to know if you'd accept a patch for option #1.
> 
> Sounds to me like the Mingw32 compiler is not ANSI compatible.  I
> don't want to have to change the source to accommodate a broken
> compiler that a very small minority of users want to use.  So I am
> against #1.

I now found the reason for the compiler message. I forgot to set
USE_DL_EXPORT when compiling the Python core. Doh! Sorry for the noise.
Everything works reasonably fine now.

> We never said that our .c files would be valid C++ (.h files is a
> different story) [...]

Ok. I must have mistaken Python with a different project.

> I vote for #3 -- if enough software can't be compiled with mingw32 the
> compiler will be fixed, as it should, and I'm happy to help encourage
> this.

I'm not quite sure if it was really a bug in mingw32, but the fact that the
compiler accepts the code when compiled as C++ is at least inconsistent.

Gerhard
-- 
This sig powered by Python!
Outside temperature in München: 10.2 °C      Wind: 3.6 m/s


From mal@lemburg.com  Tue Feb  5 10:46:37 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 05 Feb 2002 11:46:37 +0100
Subject: [Python-Dev] Tuples vs. lists
References: <20020204144021.56495E8C3@waltz.rahul.net>
 <LNBBLJKPBEHFEDALKOLCEELHNJAA.tim.one@home.com> <15455.24714.128907.199785@gondolin.digicool.com>
Message-ID: <3C5FB80D.A2931407@lemburg.com>

I haven't really followed this thread, but what's all this talk
about lists vs. tuples?

Tuples have a smaller memory footprint, provide faster element 
access, can be cached and are generally a good data type for 
constant data structures. Lists, OTOH, provide more flexibility 
when the size of the object isn't known in advance. They use up 
more memory, are not cacheable and slower on access.

For the BCD stuff Aahz was talking about, I'd suggest to have
a look at either arrays or cStringIO buffers.

Jeremy Hylton wrote:
> 
> Hey, should I change all the tuples in code objects to be lists, too?
> A code object has got things like co_names and co_consts.  They're
> currently implemented as tuples, but they're just homogenous,
> variable-length sequences.  <wink>
> 
> 'course if people modified the lists, they'd caused Python to dump
> core.

I hope I read the <wink> correctly :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mhammond@skippinet.com.au  Tue Feb  5 11:29:20 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 5 Feb 2002 22:29:20 +1100
Subject: [Python-Dev] Should Python compile as C++?
In-Reply-To: <20020205093054.GA7547@lilith.hqd-internal>
Message-ID: <LCEPIIGDJPKCOIHOBJEPCEEFFCAA.mhammond@skippinet.com.au>

> > I vote for #3 -- if enough software can't be compiled with mingw32 the
> > compiler will be fixed, as it should, and I'm happy to help encourage
> > this.
>
> I'm not quite sure if it was really a bug in mingw32, but the fact that the
> compiler accepts the code when compiled as C++ is at least inconsistent.

IIRC, msvc has the exact same problem, and it turns out that the error
is actually correct.  <jeez I hate saying anything like this when I know Tim
is listening :)>  I believe the problem is that C does not guarantee the
initialization order of static objects across object modules.  The Python
idiom of taking the address of a global variable in one module to initialize
another global variable in another module is not guaranteed to do what you
expect.  OTOH, C++ does make such a guarantee.

The good news is that if msvc has the same problem, wherever the blame lies,
you can be fairly sure that something will be done so msvc works (and has
indeed been done for a few modules).  Therefore you get mingw for free :)

Or-something-like-that ly,

Mark.



From mal@lemburg.com  Tue Feb  5 12:22:48 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 05 Feb 2002 13:22:48 +0100
Subject: [Python-Dev] Should Python compile as C++?
References: <LCEPIIGDJPKCOIHOBJEPCEEFFCAA.mhammond@skippinet.com.au>
Message-ID: <3C5FCE98.81E22FB7@lemburg.com>

Mark Hammond wrote:
> 
> > > I vote for #3 -- if enough software can't be compiled with mingw32 the
> > > compiler will be fixed, as it should, and I'm happy to help encourage
> > > this.
> >
> > I'm not quite sure if it was really a bug in mingw32, but the fact that the
> > compiler accepts the code when compiled as C++ is at least inconsistent.
> 
> IIRC, msvc has the exact same problem, and it turns out that the error
> is actually correct.  <jeez I hate saying anything like this when I know Tim
> is listening :)>  I believe the problem is that C does not guarantee the
> initialization order of static objects across object modules.  The Python
> idiom of taking the address of a global variable in one module to initialize
> another global variable in another module is not guaranteed to do what you
> expect.  OTOH, C++ does make such a guarantee.
> 
> The good news is that if msvc has the same problem, wherever the blame lies,
> you can be fairly sure that something will be done so msvc works (and has
> indeed been done for a few modules).  Therefore you get mingw for free :)

If the initialization of type objects is all that needs fixing to
get Python to compile on MinGW32, why not simply fix it? MSVC has had
the same problem for years. What's strange is that in some cases,
MSVC does seem to get it right where in others it fails with an
error -- probably a DLL vs. EXE thing.

Or am I missing something here :-?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mhammond@skippinet.com.au  Tue Feb  5 12:28:50 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 5 Feb 2002 23:28:50 +1100
Subject: [Python-Dev] Should Python compile as C++?
In-Reply-To: <3C5FCE98.81E22FB7@lemburg.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPMEEGFCAA.mhammond@skippinet.com.au>

> If the initialization of type objects is all that needs fixing to
> get Python to compile on MinGW32, why not simply fix it?

Gerhard indicated all is working now.

> MSVC has had
> the same problem for years. What's strange is that in some cases,
> MSVC does seem to get it right where in others it fails with an
> error -- probably a DLL vs. EXE thing.

It is a problem for extension modules.  Object files in the core DLL have no
problem, but object modules in separate extension DLLs that reference the
global in pythonxx.dll generate the error.

Thus, we see the error as modules are split out of the core - eg, _socket,
_winreg, etc.

Mark.



From tim.one@comcast.net  Tue Feb  5 12:28:27 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 05 Feb 2002 07:28:27 -0500
Subject: [Python-Dev] Should Python compile as C++?
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPCEEFFCAA.mhammond@skippinet.com.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEODNJAA.tim.one@comcast.net>

Static initializers in C++ are much more liberal than in C, without the
latter's "constant expression" limitations.  This follows from the fact that
you couldn't declare a static object of an arbitrary class otherwise:  C++ has
to be prepared to execute any code whatsoever in order to run user-coded
constructors.

OTOH, because C++ is much more liberal in this respect, order of module
initialization is a much worse problem in C++, and C++ doesn't define that any
more than C does.  Some of the worst debugging problems I ever had in C++
were tracking down quiet assumptions about initialization order that didn't
hold x-platform.  There are a number of well-known hacks in the C++ world
for worming around this, some of which explain why starting a large C++
program can give your disk a major workout.

As to making Python source compilable under C++, I quietly nudge it in that
direction.  If I explained why, it wouldn't be quiet anymore <wink>.



From tim.one@comcast.net  Tue Feb  5 12:31:16 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 05 Feb 2002 07:31:16 -0500
Subject: [Python-Dev] Should Python compile as C++?
In-Reply-To: <3C5FCE98.81E22FB7@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOENJAA.tim.one@comcast.net>

> ...
> MSVC has had the same problem for years. What's strange is that in some
> cases, MSVC does seem to get it right where in others it fails with an
> error -- probably a DLL vs. EXE thing.

MS C can't handle cross-DLL references in initializers, because they're
truly not "constant" in the way C requires (but C doesn't say anything about
DLLs!).  C++'s initialization model is much more liberal (and
correspondingly more elaborate and expensive), and C++ can handle cross-DLL
references in initializers.  MSVC plays both according to reasonable
readings of the respective languages' rules.



From jack@oratrix.com  Tue Feb  5 15:37:06 2002
From: jack@oratrix.com (Jack Jansen)
Date: Tue, 5 Feb 2002 16:37:06 +0100
Subject: [Python-Dev] Should Python compile as C++?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEOENJAA.tim.one@comcast.net>
Message-ID: <35AB4C18-1A4E-11D6-ABBF-0030655234CE@oratrix.com>

On Tuesday, February 5, 2002, at 01:31 , Tim Peters wrote:

>> ...
>> MSVC has had the same problem for years. What's strange is that in some
>> cases, MSVC does seem to get it right where in others it fails with an
>> error -- probably a DLL vs. EXE thing.
>
> MS C can't handle cross-DLL references in initializers, because they're
> truly not "constant" in the way C requires (but C doesn't say anything 
> about
> DLLs!).

I've always understood that the problem here was that Microsoft's object 
file format allows for patching up references in the text segment but 
not in the data segment. And C++ doesn't have the problem, because it 
can do initializers in code anyway, so it doesn't need a data segment 
reference to the symbol from the DLL.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From aahz@rahul.net  Tue Feb  5 16:56:21 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Tue, 5 Feb 2002 08:56:21 -0800 (PST)
Subject: [Python-Dev] Tuples vs. lists
In-Reply-To: <3C5FB80D.A2931407@lemburg.com> from "M.-A. Lemburg" at Feb 05, 2002 11:46:37 AM
Message-ID: <20020205165621.D60DBE8C3@waltz.rahul.net>

M.-A. Lemburg wrote:
> 
> For the BCD stuff Aahz was talking about, I'd suggest to have
> a look at either arrays or cStringIO buffers.

cStringIO wouldn't work because I want to store ints.  Arrays might
work, but I think I'll stick with tuples because they're a bit more
familiar to most Pythonistas.  I'm not too concerned with raw speed and
efficiency before I convert the code to C; remember Knuth.
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From mal@lemburg.com  Tue Feb  5 17:25:16 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 05 Feb 2002 18:25:16 +0100
Subject: [Python-Dev] Tuples vs. lists
References: <20020205165621.D60DBE8C3@waltz.rahul.net>
Message-ID: <3C60157C.D498A4F@lemburg.com>

Aahz Maruch wrote:
> 
> M.-A. Lemburg wrote:
> >
> > For the BCD stuff Aahz was talking about, I'd suggest to have
> > a look at either arrays or cStringIO buffers.
> 
> cStringIO wouldn't work because I want to store ints.  

I was thinking of storing integers as chr(value) in these.
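
To make the alternatives concrete, here is a sketch of the same digit
sequence held three ways.  It assumes a current Python, where a bytes
object plays the role of a string built from chr(value); the digit
values are arbitrary.

```python
from array import array

digits = (1, 2, 3, 4, 5)             # tuple of ints, as in the BCD module

as_array = array("B", digits)        # typecode "B": one unsigned byte per digit
as_bytes = bytes(digits)             # one byte per digit, immutable like a tuple

from_array = tuple(as_array)         # converting back yields plain ints again
from_bytes = tuple(as_bytes)
```

Both compact forms need a conversion step whenever the digits are used
as ordinary integers, which is the cost being weighed against the
memory saved.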

> Arrays might
> work, but I think I'll stick with tuples because they're a bit more
> familiar to most Pythonistas.  I'm not too concerned with raw speed and
> efficiency before I convert the code to C; remember Knuth.

If you plan to convert this to C, why not have a look at mxNumber
first ? It's a wrapper around GMP and provides high performance
implementations for many numeric operations, e.g. it should be
easy to create a BCD type using the GMP (arbitrary length) 
longs and an additional C long for the decimal point position.
In fact, there's a GMP extension MPFR which tries to do just 
this.
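
The representation suggested here, an arbitrary-length integer plus a
decimal point position, can be sketched in pure Python.  This is not
the mxNumber or GMP API: FixedDecimal is a hypothetical name, Python's
own unbounded ints stand in for GMP longs, and only addition and
printing are shown.

```python
from collections import namedtuple


class FixedDecimal(namedtuple("FixedDecimal", "coefficient scale")):
    """Hypothetical immutable type: value = coefficient / 10**scale."""
    __slots__ = ()

    def __add__(self, other):
        # Align both operands to the larger scale, then add the integers.
        scale = max(self.scale, other.scale)
        a = self.coefficient * 10 ** (scale - self.scale)
        b = other.coefficient * 10 ** (scale - other.scale)
        return FixedDecimal(a + b, scale)

    def __str__(self):
        sign = "-" if self.coefficient < 0 else ""
        digits = str(abs(self.coefficient)).rjust(self.scale + 1, "0")
        if self.scale == 0:
            return sign + digits
        return sign + digits[:-self.scale] + "." + digits[-self.scale:]


total = FixedDecimal(125, 2) + FixedDecimal(5, 1)    # 1.25 + 0.5
text = str(total)
small = str(FixedDecimal(5, 3))
```

Building on namedtuple gives immutability for free: every operation
returns a new instance, matching the behaviour described earlier in the
thread for BCD objects.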

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From aahz@rahul.net  Tue Feb  5 18:03:13 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Tue, 5 Feb 2002 10:03:13 -0800 (PST)
Subject: [Python-Dev] Tuples vs. lists
In-Reply-To: <3C60157C.D498A4F@lemburg.com> from "M.-A. Lemburg" at Feb 05, 2002 06:25:16 PM
Message-ID: <20020205180313.EDBD8E8C3@waltz.rahul.net>

M.-A. Lemburg wrote:
> Aahz Maruch wrote:
>> M.-A. Lemburg wrote:
>>>
>>> For the BCD stuff Aahz was talking about, I'd suggest to have
>>> a look at either arrays or cStringIO buffers.
>> 
>> cStringIO wouldn't work because I want to store ints.  
> 
> I was thinking of storing integers as chr(value) in these.

Why spend the conversion time?

>> Arrays might
>> work, but I think I'll stick with tuples because they're a bit more
>> familiar to most Pythonistas.  I'm not too concerned with raw speed and
>> efficiency before I convert the code to C; remember Knuth.
> 
> If you plan to convert this to C, why not have a look at mxNumber
> first ? It's a wrapper around GMP and provides high performance
> implementations for many numeric operations, e.g. it should be easy
> to create a BCD type using the GMP (arbitrary length) longs and an
> additional C long for the decimal point position.  In fact, there's a
> GMP extension MPFR which tries to do just this.

I'm specifically implementing the ANSI BCD spec.  If you want to argue
the theory of this, poke the Timbot; I think it's simpler to ensure that
I'm following the spec if I implement everything by hand.  Once I really
understand what I'm doing, *then* it's time to optimize.

Note that one reason for using BCD over GMP longs (which are presumably
similar to Python longs) is speed of I/O conversion.
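
A rough sketch of the I/O point, assuming non-negative values and decimal output only (both function names are made up for illustration): a digit tuple turns into text in one pass, while a binary integer needs a division per digit.

```python
def bcd_to_text(digits):
    """Digits stored one per element, most significant first: one pass, no division."""
    return "".join(chr(ord("0") + d) for d in digits)


def binary_to_text(n):
    """A binary integer needs a divmod by 10 for every output digit."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 10)
        out.append(chr(ord("0") + r))
    return "".join(reversed(out))


print(bcd_to_text((1, 2, 3)), binary_to_text(123))
```

For long numbers each divmod itself touches the whole integer, which is where the conversion cost of binary longs comes from; the digit tuple avoids that at the price of slower arithmetic.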
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From mal@lemburg.com  Wed Feb  6 08:45:51 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 06 Feb 2002 09:45:51 +0100
Subject: [Python-Dev] Tuples vs. lists
References: <20020205180313.EDBD8E8C3@waltz.rahul.net>
Message-ID: <3C60ED3F.72341290@lemburg.com>

Aahz Maruch wrote:
> 
> >> Arrays might
> >> work, but I think I'll stick with tuples because they're a bit more
> >> familiar to most Pythonistas.  I'm not too concerned with raw speed and
> >> efficiency before I convert the code to C; remember Knuth.
> >
> > If you plan to convert this to C, why not have a look at mxNumber
> > first ? It's a wrapper around GMP and provides high performance
> > implementations for many numeric operations, e.g. it should be easy
> > to create a BCD type using the GMP (arbitrary length) longs and an
> > additional C long for the decimal point position.  In fact, there's a
> > GMP extension MPFR which tries to do just this.
> 
> I'm specifically implementing the ANSI BCD spec.  If you want to argue
> the theory of this, poke the Timbot; I think it's simpler to ensure that
> I'm following the spec if I implement everything by hand.  Once I really
> understand what I'm doing, *then* it's time to optimize.

Just thought you might want to take a look at what other people
have done in this area. MPFR is specifically aimed at dealing
with the problems of rounding; MPFI which implements interval
arithmetics based on MPFR takes a slightly different approach:
rounding issues are handled using intervals (these are also very
handy in optimization).
 
Pointers:
	http://www.loria.fr/projets/mpfr/
	http://www.ens-lyon.fr/~nrevol/nr_software.html

> Note that one reason for using BCD over GMP longs (which are presumably
> similar to Python longs) is speed of I/O conversion.

Depends on which base you use for that conversion ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From thomas.heller@ion-tof.com  Wed Feb  6 12:56:30 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 6 Feb 2002 13:56:30 +0100
Subject: [Python-Dev] Extending types in C - help needed
References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com>              <060801c1a052$93d5a860$e000a8c0@thomasnotebook>  <200201200053.TAA30250@cj20424-a.reston1.va.home.com>
Message-ID: <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook>

From: "Guido van Rossum" <guido@python.org>
> > Does this mean this is the wrong route, or is it absolutely impossible
> > to create a subtype of PyType_Type in C with additional slots?
> 
> I wish I had time to explain this, but I don't.  For now, you'll have
> to read how types are initialized in typeobject.c -- maybe there's a
> way, maybe there isn't.
> 
> > Any tips about the route to take?
> 
> It can be done easily dynamically.
> 

I'm still struggling with this. How can it be done dynamically?

My idea would be to realloc() the object after creation, adding
a few bytes at the end. The problem is that I don't know how to
find out about the object size without knowledge about the internals.
The formula given in PEP 253
  type->tp_basicsize  +  nitems * type->tp_itemsize
seems not to be valid any more (at least with CYCLE GC).
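
The mismatch can be observed from Python as well. This is only a sketch and assumes a current CPython 3 rather than the 2.2 code under discussion; `formula_size` is a hypothetical helper, and the exact numbers depend on the build. The formula still describes the object proper, while `sys.getsizeof` may additionally count the GC header that is allocated in front of it.

```python
import sys


def formula_size(obj):
    """PEP 253 formula: tp_basicsize + nitems * tp_itemsize."""
    tp = type(obj)
    return tp.__basicsize__ + len(obj) * tp.__itemsize__


t = (1, 2, 3)
# Object size by the formula, by object.__sizeof__, and as reported by
# sys.getsizeof (which can include the GC header on CPython).
print(formula_size(t), t.__sizeof__(), sys.getsizeof(t))
```

So a realloc() based on the formula alone would not account for whatever the allocator placed before the object pointer.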

Thomas



From mwh@python.net  Wed Feb  6 13:54:14 2002
From: mwh@python.net (Michael Hudson)
Date: 06 Feb 2002 13:54:14 +0000
Subject: [Python-Dev] Mixing memory management APIs
In-Reply-To: Neal Norwitz's message of "Wed, 30 Jan 2002 19:13:58 -0500"
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com>
Message-ID: <2m4rkuyaux.fsf@starship.python.net>

Neal Norwitz <neal@metaslash.com> writes:

> Because of Michael Hudson's request, I tried running Purify 
> --with-pymalloc enabled.  The results were a bit surprising: 13664 errors!
> 
> All the errors were in unicodeobject.c.  There were 3 types of errors:
> Free Memory Reads, Array Bounds Reads, and Uninitialized Memory Reads.
> The line #s were in strange places (e.g., in a function declaration
> and accessing self->length in an if clause, after it was accessed w/o error).
> The line #s are primarily:  unicodeobject.c:2875, and unicodeobject.c:2214.

Might this have something to do with

bug [ #495401 ] Build troubles: --with-pymalloc

http://sourceforge.net/tracker/?func=detail&atid=105470&aid=495401&group_id=5470

?

Is there a reason one of the fixes for this problem hasn't been
checked in yet?

Cheers,
M.

-- 
    . <- the point                                your article -> .
    |------------------------- a long way ------------------------|
                                        -- Cristophe Rhodes, ucam.chat


From guido@python.org  Wed Feb  6 14:36:27 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 06 Feb 2002 09:36:27 -0500
Subject: [Python-Dev] Extending types in C - help needed
In-Reply-To: Your message of "Wed, 06 Feb 2002 13:56:30 +0100."
 <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook>
References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com>
 <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook>
Message-ID: <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net>

> > I wish I had time to explain this, but I don't.  For now, you'll have
> > to read how types are initialized in typeobject.c -- maybe there's a
> > way, maybe there isn't.
> > 
> > > Any tips about the route to take?
> > 
> > It can be done easily dynamically.
> 
> I'm still struggling with this. How can it be done dynamically?
> 
> My idea would be to realloc() the object after creation, adding
> a few bytes at the end. The problem is that I don't know how to
> find out about the object size without knowledge about the internals.
> The formula given in PEP 253
>   type->tp_basicsize  +  nitems * type->tp_itemsize
> seems not to be valid any more (at least with CYCLE GC).

I have thought about this a little more and come to the conclusion
that you cannot define a metaclass that creates type objects that have
more C slots than the standard type object lay-out.  It would be the
same as trying to add a C slot to the instances of a string subtype:
there's variable-length data at the end, and you cannot place anything
*before* that variable-length data because all the C code that works
with the base type knows where the variable-length data starts; you
cannot place anything *after* that variable-length data because there's
no way to address it from C.

The only way around this would be to duplicate *all* the code of
type_new(), which I don't recommend because it will probably have to
be changed for each Python version (even for bugfix releases).

A better solution is to store additional information in the __dict__.
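
The string-subtype analogy can be tried at the Python level. A minimal sketch, assuming a current CPython 3 where `bytes` plays the role of the variable-length string type; the class and function names are hypothetical.

```python
def try_slot_on_bytes():
    """Return the TypeError message raised when a bytes subtype asks for a slot."""
    try:
        class Tagged(bytes):
            # A fixed slot would have to live next to variable-length data.
            __slots__ = ("tag",)
    except TypeError as exc:
        return str(exc)
    return None


class TaggedViaDict(bytes):
    """No __slots__: extra attributes go into the instance __dict__ instead."""


print(try_slot_on_bytes())

obj = TaggedViaDict(b"abc")
obj.tag = "extra"          # stored in obj.__dict__, not in a C slot
print(obj, obj.__dict__)
```

The first class is rejected at creation time, while the second keeps the base layout untouched and carries the extra data in its dictionary.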

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jepler@inetnebr.com  Wed Feb  6 16:24:31 2002
From: jepler@inetnebr.com (Jeff Epler)
Date: Wed, 6 Feb 2002 10:24:31 -0600
Subject: Half-baked idea (was Re: [Python-Dev] Extending types in C - help needed)
In-Reply-To: <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net>
References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020206102426.A20584@unpythonic.dhs.org>

On Wed, Feb 06, 2002 at 09:36:27AM -0500, Guido van Rossum wrote:
> I have thought about this a little more and come to the conclusion
> that you cannot define a metaclass that creates type objects that have
> more C slots than the standard type object lay-out.  It would be the
> same as trying to add a C slot to the instances of a string subtype:
> there's variable-length data at the end, and you cannot place anything
> *before* that variable-length data because all the C code that works
> with the base type knows where the variable-length data starts; you
> cannot place anything *after* that variable-length data because there's
> no way to address it from C.

I had a half-baked idea when I read this.  Is there something unworkable
about the scheme, aside from being very different from the way Python
currently operates?  Has anybody written a system that works this way?
Is it just plain gross?

Jeff Epler
jepler@inetnebr.com


Half-Baked Idea
---------------

The problem is that we have variable-length types.  For example,

    struct S {
	int nelem;
	int elem[0];
    };

you can allocate a new one by
    struct S *new_S(int nelem) {
	struct S *ret = malloc(sizeof(struct S) + nelem * sizeof(int));
	ret->nelem = nelem;
	return ret;
    }

Normally, we "subclass" structures by appending fields to the end:
    struct BASE {
	int x, y;
    };

    struct DERIVED { /* from struct BASE */
	int x, y;
	int flag;
    };

but this doesn't work with a dynamic-length object.

So, with the caveat that you can only have dynamic-length behavior in the base
class, why not place the new fields *BEFORE* the fields of base struct:

    struct S2 {
	int flag;
	int nelem;
	int elem[0];
    };

now, whenever you are going to pass S2 to a function on S, you simply pass in
    (struct S*)((char*)s2 + offsetof(struct S2, nelem))
and if you're faced with an instance of S that turns out to be an S2, you
can get the pointer to the start of the S2 with 
    (struct S2*)((char*)s - offsetof(struct S2, nelem))
Note that neither of these is an additional level of indirection, it's just
an offset calculation, one that your compiler may be able to combine with
subsequent field accesses through the -> operator.

But how do you free an instance of S-or-subclass, without knowing all the
subclasses?  Well, you could store a pointer to the real start of the
structure, or an offset back to it, in the structure.  You'd use that
pointer only in a few occasions, usually using the "add const to pointer"
in functions which are for a particular subclass of S:

    struct S {
	void *real_head;
	int nelem;
	int elem[0];
    };

    struct S1 { /* derived from S */
	int flag;
	void *real_head;
	int nelem;
	int elem[0];
    };

    struct S1_1 { /* derived from S1 */
	int new_flag;
	int flag;
	void *real_head;
	int nelem;
	int elem[0];
    };
	
now, you can allocate a version of an S subclass by
    struct S *new_S(int nelem, int pre_size) {
	char *mem = malloc(pre_size + sizeof(struct S) + nelem * sizeof(int));
	struct S *ret = (struct S *)(mem + pre_size);
	ret->real_head = mem;	/* free_S() releases the block through this */
	ret->nelem = nelem;
	return ret;
    }
and free it by
    void free_S(struct S* s) {
	free(s->real_head);
    }

I don't know how this will interact with a garbage collector, but it
does maintain a pointer to the head of the allocated block, though that
pointer is only accessible through a pointer to the inside of a block.


From mal@lemburg.com  Wed Feb  6 16:49:24 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 06 Feb 2002 17:49:24 +0100
Subject: [Python-Dev] Mixing memory management APIs
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net>
Message-ID: <3C615E94.AF637093@lemburg.com>

Michael Hudson wrote:
> 
> Neal Norwitz <neal@metaslash.com> writes:
> 
> > Because of Michael Hudson's request, I tried running Purify
> > --with-pymalloc enabled.  The results were a bit surprising: 13664 errors!
> >
> > All the errors were in unicodeobject.c.  There were 3 types of errors:
> > Free Memory Reads, Array Bounds Reads, and Uninitialized Memory Reads.
> > The line #s were in strange places (e.g., in a function declaration
> > and accessing self->length in an if clause, after it was accessed w/o error).
> > The line #s are primarily:  unicodeobject.c:2875, and unicodeobject.c:2214.
> 
> Might this have something to do with
> 
> bug [ #495401 ] Build troubles: --with-pymalloc
> 
> http://sourceforge.net/tracker/?func=detail&atid=105470&aid=495401&group_id=5470
> 
> ?
> 
> Is there a reason one of the fixes for this problem hasn't been
> checked in yet?

It is currently assigned to Martin.

Perhaps I should just take the Unicode patch and check it in (the
first one, not the second one for the reasons stated in the 
bug-tracker) ?!

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Wed Feb  6 18:13:41 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 06 Feb 2002 19:13:41 +0100
Subject: [Python-Dev] Mixing memory management APIs
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com>
Message-ID: <3C617255.6D4E5278@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Michael Hudson wrote:
> >
> > Neal Norwitz <neal@metaslash.com> writes:
> >
> > > Because of Michael Hudson's request, I tried running Purify
> > > --with-pymalloc enabled.  The results were a bit surprising: 13664 errors!
> > >
> > > All the errors were in unicodeobject.c.  There were 3 types of errors:
> > > Free Memory Reads, Array Bounds Reads, and Uninitialized Memory Reads.
> > > The line #s were in strange places (e.g., in a function declaration
> > > and accessing self->length in an if clause, after it was accessed w/o error).
> > > The line #s are primarily:  unicodeobject.c:2875, and unicodeobject.c:2214.
> >
> > Might this have something to do with
> >
> > bug [ #495401 ] Build troubles: --with-pymalloc
> >
> > http://sourceforge.net/tracker/?func=detail&atid=105470&aid=495401&group_id=5470
> >
> > ?
> >
> > Is there a reason one of the fixes for this problem hasn't been
> > checked in yet?
> 
> It is currently assigned to Martin.
> 
> Perhaps I should just take the Unicode patch and check it in (the
> first one, not the second one for the reasons stated in the
> bug-tracker) ?!

I've checked in a patch for the UTF-8 codec problem. Could you
try Purify against the CVS version ?

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From niemeyer@conectiva.com  Wed Feb  6 18:31:26 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Wed, 6 Feb 2002 16:31:26 -0200
Subject: [Python-Dev] Python optmizations
Message-ID: <20020206163126.B4071@ibook.distro.conectiva>

Hello Skip!

I've been reading some books and papers about stack virtual machines
optimization, and playing around with Python's bytecode and inner loop
organization. As always, I found some interesting results and some
frustrating ones.

Recently, I found your paper about peephole optimization, and your other
attempts in the same area. Well, basically I discovered that I'm
not original, and repeated most of your ideas and mistakes. :-) But
that's ok. It gave me a good idea of paths to follow if I want to keep
playing with this.

One thing I thought of, and also found a reference to in your paper, is
that some instruction sequences could be turned into a single opcode. To
understand how this would affect the code, I have disassembled the
whole Python standard library, and the whole Zope library. After that
I ran a script to count repeated opcode pairs (excluding SET_LINENO).

Here are the most frequent pairs:

23632   LOAD_FAST, LOAD_ATTR
15382   LOAD_CONST, LOAD_CONST
12842   JUMP_IF_FALSE, POP_TOP
12397   CALL_FUNCTION, POP_TOP
12121   LOAD_FAST, LOAD_FAST
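
A count like this can be reproduced along the following lines. This is a sketch rather than the script used for the numbers above: it assumes a current Python 3 with the `dis` module, where the opcode set differs from the one listed here (there is no SET_LINENO any more), and `opcode_pairs` is a hypothetical helper name.

```python
import dis
import types
from collections import Counter


def opcode_pairs(code):
    """Count adjacent opcode pairs in a code object and in its nested code objects."""
    names = [ins.opname for ins in dis.get_instructions(code)]
    pairs = Counter(zip(names, names[1:]))
    for const in code.co_consts:
        if isinstance(const, types.CodeType):  # nested functions, classes, lambdas
            pairs.update(opcode_pairs(const))
    return pairs


source = "def f(self, x):\n    return self.a + self.b + x\n"
pairs = opcode_pairs(compile(source, "<example>", "exec"))
for (first, second), count in pairs.most_common(5):
    print(count, first, second)
```

Running it over every module of a library and summing the counters gives a table in the same shape as the one above.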

Not by coincidence, I found in your paper references to a LOAD_FAST_ATTR
opcode. Since you have probably mentioned this to others, I wouldn't
like to bother everyone again by asking why it was not implemented. Could
you please explain to me the reasons this was left behind?

If you have the time, I'd also like to understand what trouble is
involved in getting a peephole optimizer into the Python compiler itself.
Is it just about compile-time performance?  I don't remember reading
about this in your paper, but you probably thought about that as well.

Thank you!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


From thomas.heller@ion-tof.com  Wed Feb  6 20:53:08 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 6 Feb 2002 21:53:08 +0100
Subject: [Python-Dev] Extending types in C - help needed
References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com>              <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook>  <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <104601c1af50$4888bd90$e000a8c0@thomasnotebook>

> I have thought about this a little more and come to the conclusion
> that you cannot define a metaclass that creates type objects that have
> more C slots than the standard type object lay-out.  It would be the
> same as trying to add a C slot to the instances of a string subtype:
> there's variable-length data at the end, and you cannot place anything
> *before* that variable-length data because all the C code that works
> with the base type knows where the variable-length data starts; you
> cannot place anything *after* that variable-length data because there's
> no way to address it from C.
> 


It's a pity, isn't it?

> A better solution is to store additional information in the __dict__.

You lose nice features, for example accessing these (new) slots from Python
by providing tp_members entries for them.

Are you planning to address this issue in the future?

Thanks,

Thomas



From neal@metaslash.com  Wed Feb  6 21:37:28 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Wed, 06 Feb 2002 16:37:28 -0500
Subject: [Python-Dev] Mixing memory management APIs
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com>
Message-ID: <3C61A218.CD3AB779@metaslash.com>

"M.-A. Lemburg" wrote:

> I've checked in a patch for the UTF-8 codec problem. Could you
> try Purify against the CVS version ?

with-pymalloc or without or both?

Neal


From mal@lemburg.com  Wed Feb  6 22:41:52 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 06 Feb 2002 23:41:52 +0100
Subject: [Python-Dev] Mixing memory management APIs
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com>
Message-ID: <3C61B130.3040609@lemburg.com>

Neal Norwitz wrote:

> "M.-A. Lemburg" wrote:
> 
> 
>>I've checked in a patch for the UTF-8 codec problem. Could you
>>try Purify against the CVS version ?
>>
> 
> with-pymalloc or without or both?


Both if possible -- the leakage showed up with pymalloc AFAIR :-)

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/



From neal@metaslash.com  Wed Feb  6 23:34:24 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Wed, 06 Feb 2002 18:34:24 -0500
Subject: [Python-Dev] Mixing memory management APIs
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com>
Message-ID: <3C61BD80.5F11C66B@metaslash.com>

"M.-A. Lemburg" wrote:
> 
> Neal Norwitz wrote:
> 
> > "M.-A. Lemburg" wrote:
> >
> >
> >>I've checked in a patch for the UTF-8 codec problem. Could you
> >>try Purify against the CVS version ?
> >>
> >
> > with-pymalloc or without or both?
> 
> Both if possible -- the leakage showed up with pymalloc AFAIR :-)

There is a lot of data and it's very hard to follow, 
but I'm trying to provide as much info as I can.  
Let me know how I can make this info easier to use.

Here is a summary:

    * I'm using gcc version 2.95.3, on Solaris 8, Purify 2002.

    * The new patches don't fix all the problems, but they may
	reduce the problems (I'm not sure).  I think there were
	13k errors on build before; it's 5.5k now.

    * test_unicodedata fails:
		*** mismatch between line 3 of expected output and 
			line 3 of actual output:
		- Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18
		+ Methods: 3051e6d4d117133c3e36a5c22b3a1ae362474321

    * Purify now has 2 UMRs w/o pymalloc, but they are in
	fwrite() and contain no usable stack trace.

    * It's probably best to try using Electric Fence and/or dbmalloc.
	This may give better results than Purify.

    * There is a warning from sre.h that may be significant:
	Modules/sre.h:24: warning: `SRE_CODE' redefined
	Modules/sre.h:19: warning: this is the location 
			  of the previous definition

I'll try some more things to see if I can get better info.

Neal
--

bash-2.03$ ./configure --with-pymalloc --enable-unicode=ucs4
bash-2.03$ make PURIFY=purify

--->  5542 errors
	Free Memory Read, Array Bounds Read, and Uninit Memory Read errors
		at lines unicodeobject.c:2214 & 2875
		(both are bogus lines)

	2214 is in:  PyUnicode_TranslateCharmap()
	2875 is in:  split_char()

bash-2.03$ ./python -E -tt Lib/test/regrtest.py test_unicode.py \ 
			test_unicode_file.py test_unicodedata.py
test_unicode
test test_unicode crashed -- exceptions.UnicodeError: UTF-8 decoding error: illegal encoding
test_unicode_file
test_unicodedata
test test_unicodedata produced unexpected output:
**********************************************************************
*** mismatch between line 3 of expected output and line 3 of actual output:
- Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18
+ Methods: 3051e6d4d117133c3e36a5c22b3a1ae362474321
**********************************************************************
1 test OK.
2 tests failed:
    test_unicode test_unicodedata

--------------------------------------------------------------------

Without Purify, test_unicode completed successfully, but test_unicodedata
failed with the same mismatch.

Under Purify, these 3 tests produced 99745 errors,
in the same places as for the build step.

--------------------------------------------------------------------

bash-2.03$ make clean
bash-2.03$ ./configure --enable-unicode=ucs4  
bash-2.03$ make PURIFY=purify 

bash-2.03$ ./python -E -tt Lib/test/regrtest.py test_unicode.py \ 
			test_unicode_file.py test_unicodedata.py
test test_unicodedata produced unexpected output:
**********************************************************************
*** mismatch between line 3 of expected output and line 3 of actual output:
- Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18
+ Methods: 84b72943b1d4320bc1e64a4888f7cdf62eea219a
**********************************************************************
2 tests OK.
1 test failed:
    test_unicodedata

--------------------------------------------------------------------

Purify did have 2 UMRs, but both contain almost no information:

      UMR: Uninitialized memory read
      This is occurring while in:
            _write         [libc.so.1]
            _xflsbuf       [libc.so.1]
            _fflush_u      [libc.so.1]
            fseek          [libc.so.1]
            *unknown func* [pc=0xe417c]
            *unknown func* [pc=0xe4db4]
            *unknown func* [pc=0xe64c4]
            *unknown func* [pc=0xe5cf0]
            *unknown func* [pc=0xe5524]
            *unknown func* [pc=0xe58a0]
            *unknown func* [pc=0x160464]
            *unknown func* [pc=0x159b64]
      Reading 3609 bytes from 0x6a2fcc in the heap (4 bytes at 0x6a3706 uninit).
      Address 0x6a2fcc is 4 bytes into a malloc'd block at 0x6a2fc8 of 8200 bytes.
      This block was allocated from:
            do_mkvalue     [modsupport.c:243]
            _findbuf       [libc.so.1]
            _wrtchk        [libc.so.1]
            _flsbuf        [libc.so.1]
            putc           [libc.so.1]
            *unknown func* [pc=0xe8b9c]
            *unknown func* [pc=0xed794]
            *unknown func* [pc=0xe4104]
            *unknown func* [pc=0xe4db4]
            *unknown func* [pc=0xe64c4]
            *unknown func* [pc=0xe5cf0]
            *unknown func* [pc=0xe5524]

--------------------------------------------------------------------

      UMR: Uninitialized memory read
      This is occurring while in:
            _write         [libc.so.1]
            _xflsbuf       [libc.so.1]
            _fwrite_unlocked [libc.so.1]
            fwrite         [libc.so.1]
            *unknown func* [pc=0xeaa50]
            *unknown func* [pc=0xeadf4]
            *unknown func* [pc=0xeb3c8]
            *unknown func* [pc=0xed7e8]
            *unknown func* [pc=0xe411c]
            *unknown func* [pc=0xe4db4]
            *unknown func* [pc=0xe64c4]
            *unknown func* [pc=0xe5cf0]
      Reading 8192 bytes from 0x79d88c in the heap (4 bytes at 0x79de8d uninit).
      Address 0x79d88c is 4 bytes into a malloc'd block at 0x79d888 of 8200 bytes.
      This block was allocated from:
            do_mkvalue     [modsupport.c:243]
            _findbuf       [libc.so.1]
            _wrtchk        [libc.so.1]
            _flsbuf        [libc.so.1]
            putc           [libc.so.1]
            *unknown func* [pc=0xe8b9c]
            *unknown func* [pc=0xed794]
            *unknown func* [pc=0xe4104]
            *unknown func* [pc=0xe4db4]
            *unknown func* [pc=0xe64c4]
            *unknown func* [pc=0xe5cf0]
            *unknown func* [pc=0xe5524]


From neal@metaslash.com  Wed Feb  6 23:36:19 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Wed, 06 Feb 2002 18:36:19 -0500
Subject: [Python-Dev] Mixing memory management APIs
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com>
Message-ID: <3C61BDF3.848F363C@metaslash.com>

"M.-A. Lemburg" wrote:
> 
> Neal Norwitz wrote:
> 
> > "M.-A. Lemburg" wrote:
> >
> >
> >>I've checked in a patch for the UTF-8 codec problem. Could you
> >>try Purify against the CVS version ?
> >>
> >
> > with-pymalloc or without or both?
> 
> Both if possible -- the leakage showed up with pymalloc AFAIR :-)

I forgot to mention that purify reports no memory leaks either 
with or without pymalloc.

Neal


From mal@lemburg.com  Thu Feb  7 08:49:52 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 07 Feb 2002 09:49:52 +0100
Subject: [Python-Dev] Mixing memory management APIs
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> <3C61BDF3.848F363C@metaslash.com>
Message-ID: <3C623FB0.4C401A0@lemburg.com>

Neal Norwitz wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> > Neal Norwitz wrote:
> >
> > > "M.-A. Lemburg" wrote:
> > >
> > >
> > >>I've checked in a patch for the UTF-8 codec problem. Could you
> > >>try Purify against the CVS version ?
> > >>
> > >
> > > with-pymalloc or without or both?
> >
> > Both if possible -- the leakage showed up with pymalloc AFAIR :-)
> 
> I forgot to mention that purify reports no memory leaks either
> with or without pymalloc.

So that bug seems to be fixed now.

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Thu Feb  7 08:55:11 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 07 Feb 2002 09:55:11 +0100
Subject: [Python-Dev] Mixing memory management APIs
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> <3C61BD80.5F11C66B@metaslash.com>
Message-ID: <3C6240EF.4FE8E9C@lemburg.com>

Neal Norwitz wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> > Neal Norwitz wrote:
> >
> > > "M.-A. Lemburg" wrote:
> > >
> > >
> > >>I've checked in a patch for the UTF-8 codec problem. Could you
> > >>try Purify against the CVS version ?
> > >>
> > >
> > > with-pymalloc or without or both?
> >
> > Both if possible -- the leakage showed up with pymalloc AFAIR :-)
> 
> There is a lot of data and it's very hard to follow,
> but I'm trying to provide as much info as I can.
> Let me know how I can make this info easier to use.
> 
> Here is a summary:
> 
>     * I'm using gcc version 2.95.3, on Solaris 8, Purify 2002.
> 
>     * The new patches don't fix all the problems, but they may
>         reduce the problems (I'm not sure).  I think there were
>         13k errors on build before; it's 5.5k now.
> 
>     * test_unicodedata fails:
>                 *** mismatch between line 3 of expected output and
>                         line 3 of actual output:
>                 - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18
>                 + Methods: 3051e6d4d117133c3e36a5c22b3a1ae362474321

Hmm, I did run test_unicode, but forgot test_unicodedata. Looking
at test_unicodedata.py, it produces loads of unpaired Unicode
surrogates and then tries to encode them using UTF-8. Since the
UTF-8 codec previously produced wrong results for these, I guess I'll have
to regenerate the expected test output.
 
>     * Purify now has 2 UMRs w/o pymalloc, but they are in
>         fwrite() and contain no usable stack trace.
> 
>     * It's probably best to try using Electric Fence and/or dbmalloc.
>         This may give better results than Purify.
> 
>     * There is a warning from sre.h that may be significant:
>         Modules/sre.h:24: warning: `SRE_CODE' redefined
>         Modules/sre.h:19: warning: this is the location
>                           of the previous definition
> 
> I'll try some more things to see if I can get better info.
> 
> Neal
> --
> 
> bash-2.03$ ./configure --with-pymalloc --enable-unicode=ucs4
> bash-2.03$ make PURIFY=purify
> 
> --->  5542 errors
>         Free Memory Read, Array Bounds Read, and Uninit Memory Read errors
>                 at lines unicodeobject.c:2214 & 2875
>                 (both are bogus lines)
> 
>         2214 is in:  PyUnicode_TranslateCharmap()
>         2875 is in:  split_char()

Hmm, I'll have to look at this one...
 
> bash-2.03$ ./python -E -tt Lib/test/regrtest.py test_unicode.py \
>                         test_unicode_file.py test_unicodedata.py
> test_unicode
> test test_unicode crashed -- exceptions.UnicodeError: UTF-8 decoding error: illegal encoding

That's strange, because at least on my machine, test_unicode runs
through just fine. Could you run the test by hand, so that the error
location can be localized ?

> test_unicode_file
> test_unicodedata
> test test_unicodedata produced unexpected output:
> **********************************************************************
> *** mismatch between line 3 of expected output and line 3 of actual output:
> - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18
> + Methods: 3051e6d4d117133c3e36a5c22b3a1ae362474321
> **********************************************************************

See above.

> 1 test OK.
> 2 tests failed:
>     test_unicode test_unicodedata
> 
> --------------------------------------------------------------------
> 
> Without purify, test_unicode completed successfully, but unicodedata
> produced the same results.
> 
> The errors produced in purify for these 3 tests were 99745.
> The errors were in the same places as for the build step.
> 
> --------------------------------------------------------------------
> 
> bash-2.03$ make clean
> bash-2.03$ ./configure --enable-unicode=ucs4
> bash-2.03$ make PURIFY=purify
> 
> bash-2.03$ ./python -E -tt Lib/test/regrtest.py test_unicode.py \
>                         test_unicode_file.py test_unicodedata.py
> test test_unicodedata produced unexpected output:
> **********************************************************************
> *** mismatch between line 3 of expected output and line 3 of actual output:
> - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18
> + Methods: 84b72943b1d4320bc1e64a4888f7cdf62eea219a
> **********************************************************************
> 2 tests OK.
> 1 test failed:
>     test_unicodedata
> 
> --------------------------------------------------------------------
> 
> Purify did have 2 UMRs, but both contain almost no information:
> 
>       UMR: Uninitialized memory read
>       This is occurring while in:
>             _write         [libc.so.1]
>             _xflsbuf       [libc.so.1]
>             _fflush_u      [libc.so.1]
>             fseek          [libc.so.1]
>             *unknown func* [pc=0xe417c]
>             *unknown func* [pc=0xe4db4]
>             *unknown func* [pc=0xe64c4]
>             *unknown func* [pc=0xe5cf0]
>             *unknown func* [pc=0xe5524]
>             *unknown func* [pc=0xe58a0]
>             *unknown func* [pc=0x160464]
>             *unknown func* [pc=0x159b64]
>       Reading 3609 bytes from 0x6a2fcc in the heap (4 bytes at 0x6a3706 uninit).
>       Address 0x6a2fcc is 4 bytes into a malloc'd block at 0x6a2fc8 of 8200 bytes.
>       This block was allocated from:
>             do_mkvalue     [modsupport.c:243]
>             _findbuf       [libc.so.1]
>             _wrtchk        [libc.so.1]
>             _flsbuf        [libc.so.1]
>             putc           [libc.so.1]
>             *unknown func* [pc=0xe8b9c]
>             *unknown func* [pc=0xed794]
>             *unknown func* [pc=0xe4104]
>             *unknown func* [pc=0xe4db4]
>             *unknown func* [pc=0xe64c4]
>             *unknown func* [pc=0xe5cf0]
>             *unknown func* [pc=0xe5524]
> 
> --------------------------------------------------------------------
> 
>       UMR: Uninitialized memory read
>       This is occurring while in:
>             _write         [libc.so.1]
>             _xflsbuf       [libc.so.1]
>             _fwrite_unlocked [libc.so.1]
>             fwrite         [libc.so.1]
>             *unknown func* [pc=0xeaa50]
>             *unknown func* [pc=0xeadf4]
>             *unknown func* [pc=0xeb3c8]
>             *unknown func* [pc=0xed7e8]
>             *unknown func* [pc=0xe411c]
>             *unknown func* [pc=0xe4db4]
>             *unknown func* [pc=0xe64c4]
>             *unknown func* [pc=0xe5cf0]
>       Reading 8192 bytes from 0x79d88c in the heap (4 bytes at 0x79de8d uninit).
>       Address 0x79d88c is 4 bytes into a malloc'd block at 0x79d888 of 8200 bytes.
>       This block was allocated from:
>             do_mkvalue     [modsupport.c:243]
>             _findbuf       [libc.so.1]
>             _wrtchk        [libc.so.1]
>             _flsbuf        [libc.so.1]
>             putc           [libc.so.1]
>             *unknown func* [pc=0xe8b9c]
>             *unknown func* [pc=0xed794]
>             *unknown func* [pc=0xe4104]
>             *unknown func* [pc=0xe4db4]
>             *unknown func* [pc=0xe64c4]
>             *unknown func* [pc=0xe5cf0]
>             *unknown func* [pc=0xe5524]

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mwh@python.net  Thu Feb  7 10:40:42 2002
From: mwh@python.net (Michael Hudson)
Date: 07 Feb 2002 10:40:42 +0000
Subject: [Python-Dev] Mixing memory management APIs
In-Reply-To: "M.-A. Lemburg"'s message of "Wed, 06 Feb 2002 23:41:52 +0100"
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com>
Message-ID: <2m8za5mv6d.fsf@starship.python.net>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> Neal Norwitz wrote:
> 
> > "M.-A. Lemburg" wrote:
> > 
> > 
> >>I've checked in a patch for the UTF-8 codec problem. Could you
> >>try Purify against the CVS version ?
> >>
> > 
> > with-pymalloc or without or both?
> 
> 
> Both if possible -- the leakage showed up with pymalloc AFAIR :-)

I thought we were chasing memory stomping, not leaking, this time
around...

Cheers,
M.

-- 
  /* I'd just like to take this moment to point out that C has all
     the expressive power of two dixie cups and a string.
   */                       -- Jamie Zawinski from the xkeycaps source


From mal@lemburg.com  Thu Feb  7 11:42:23 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 07 Feb 2002 12:42:23 +0100
Subject: [Python-Dev] Mixing memory management APIs
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> <3C61BD80.5F11C66B@metaslash.com> <3C6240EF.4FE8E9C@lemburg.com>
Message-ID: <3C62681F.D27932B8@lemburg.com>

I've just checked in a set of fixes for the UTF-8 encoder and 
decoder and also updated the test output of test_unicodedata.

You should now no longer get the test failures you were seeing
(test_unicode failure was due to the old marshal format using
illegal UTF-8 sequences, test_unicodedata was due to the same
UTF-8 problem but shows up in a different hash value).

Hope I got it right this time around :-/

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Thu Feb  7 11:48:11 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 07 Feb 2002 12:48:11 +0100
Subject: [Python-Dev] PYC Magic
Message-ID: <3C62697B.3555501@lemburg.com>

FYI, I've bumped the PYC magic in a non-standard way (the old 
standard broke on 2002-01-01); please review:

import.c:
"""
/* New way to come up with the low 16 bits of the magic number:
      (YEAR-1995) * 10000 +  MONTH * 100 + DAY
   where MONTH and DAY are 1-based.
   XXX Whatever the "old way" may have been isn't documented.
   XXX This scheme breaks in 2002, as (2002-1995)*10000 = 70000 doesn't
       fit in 16 bits.
   XXX Later, sometimes 1 gets added to MAGIC in order to record that
       the Unicode -U option is in use.  IMO (Tim's), that's a Bad Idea
       (quite apart from that the -U option doesn't work so isn't used
       anyway).

   XXX MAL, 2002-02-07: I had to modify the MAGIC due to a fix of the
       UTF-8 encoder (it previously produced invalid UTF-8 for unpaired
       high surrogates), so I simply bumped the month value to 20
       (invalid month) and set the day to 1.  This should be recognizable
       by any algorithm relying on the above scheme. Perhaps we should simply
       start counting in increments of 10 from now on ?!

   Known values:
       Python 1.5:   20121
       Python 1.5.1: 20121
       Python 1.5.2: 20121
       Python 2.0:   50823
       Python 2.0.1: 50823
       Python 2.1:   60202
       Python 2.1.1: 60202
       Python 2.1.2: 60202
       Python 2.2:   60717
       Python 2.3a0: 62001
*/
#define MAGIC (62001 | ((long)'\r'<<16) | ((long)'\n'<<24))
"""
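
The arithmetic in the comment above can be checked with a few lines of
present-day Python. This is only an illustration, not code from import.c;
the helper name date_magic is made up for this note, and the dates fed to
it are simply the ones that reproduce the listed values under the formula.

```python
import struct


def date_magic(year, month, day):
    """Low 16 bits of the magic number under the date-based scheme."""
    return (year - 1995) * 10000 + month * 100 + day


# The value listed for Python 2.2 is what the formula gives for 2001-07-17.
assert date_magic(2001, 7, 17) == 60717

# Any date in 2002 is too large for 16 bits (maximum 65535).
assert date_magic(2002, 1, 1) == 70101
assert date_magic(2002, 1, 1) > 0xFFFF

# The bumped value keeps the 2001 prefix but uses month 20 and day 1.
assert date_magic(2001, 20, 1) == 62001

# The full MAGIC as defined above: low 16 bits, then '\r' and '\n'.
MAGIC = 62001 | (ord("\r") << 16) | (ord("\n") << 24)
magic_bytes = struct.pack("<L", MAGIC)  # four bytes, little-endian
```

Packing the value little-endian shows why the '\r' and '\n' are shifted
into the upper bytes: they end up as the third and fourth byte.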

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Thu Feb  7 11:52:38 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 07 Feb 2002 12:52:38 +0100
Subject: [Python-Dev] Mixing memory management APIs
References: <LNBBLJKPBEHFEDALKOLCMELBNIAA.tim.one@home.com> <3C588C46.2BF27BBE@metaslash.com> <2m4rkuyaux.fsf@starship.python.net> <3C615E94.AF637093@lemburg.com> <3C617255.6D4E5278@lemburg.com> <3C61A218.CD3AB779@metaslash.com> <3C61B130.3040609@lemburg.com> <2m8za5mv6d.fsf@starship.python.net>
Message-ID: <3C626A86.58B12055@lemburg.com>

Michael Hudson wrote:
> 
> > >>I've checked in a patch for the UTF-8 codec problem. Could you
> > >>try Purify against the CVS version ?
> > >>
> > >
> > > with-pymalloc or without or both?
> >
> >
> > Both if possible -- the leakage showed up with pymalloc AFAIR :-)
> 
> I thought we were chasing memory stomping, not leaking, this time
> around...

Both, I guess: pymalloc doesn't behave well with overallocation
and codecs use this technique a lot. I reduced the overallocation 
in the UTF-8 encoder down from 3*size to 2*size which should
cover the most common cases better than before.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From thomas.heller@ion-tof.com  Thu Feb  7 21:10:18 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 7 Feb 2002 22:10:18 +0100
Subject: [Python-Dev] Extending types in C - help needed
References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com>              <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook>  <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net> <104601c1af50$4888bd90$e000a8c0@thomasnotebook>
Message-ID: <065001c1b01b$d8bcc520$e000a8c0@thomasnotebook>

[Guido]
> > A better solution is to store additional information in the __dict__.
> 
[Thomas]
> You lose nice features: access these (new) slots from Python
> by providing tp_members entries for them (for example).

This thread is IMO closed; just for completeness I want to mention
that the same effect can be accomplished easily with tp_getset.

Thomas



From Jack.Jansen@oratrix.nl  Thu Feb  7 22:43:46 2002
From: Jack.Jansen@oratrix.nl (Jack Jansen)
Date: Thu, 7 Feb 2002 23:43:46 +0100
Subject: [Python-Dev] Extending types in C - help needed
In-Reply-To: <104601c1af50$4888bd90$e000a8c0@thomasnotebook>
Message-ID: <25807C26-1C1C-11D6-B5C4-003065517236@oratrix.nl>

On Wednesday, February 6, 2002, at 09:53  PM, Thomas Heller wrote:
>> A better solution is to store additional information in the __dict__.
>
> You lose nice features: access these (new) slots from Python
> by providing tp_members entries for them (for example).

Martin pointed at a way to solve this. And I think that with my 
proposed API (... where is it..., ah yes, found it)
void PyType_SetAnnotation(PyTypeObject *tp, char *name,
                          void *unique, void *);
void *PyType_GetAnnotation(PyTypeObject *tp, char *name, void *unique);

it would be almost as easy to use as a tp_ slot. The only thing 
needed to make it 100% safe is a registry for name/descr pairs. 
(Actually the API is changed a little since I understand how the 
second arg works)

For the benefit of whoever missed the previous thread: name is 
used as the key into the dictionary, and unique is a pointer 
stored with the entry, which assures that this entry hasn't been 
used for something else accidentally.

So instead of a new slot tp_foo what you would need to do is 
come up with a name ("tp_foo" comes to mind) and a global 
variable whose address can be used for unique.
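
To make the name/unique pairing concrete, here is a small Python model of
the two calls. It is only a sketch: the proposal itself is a C API, the
function and variable names below are invented for illustration, and
raising KeyError when the unique token does not match is an assumption
about what "assures" would mean in practice.

```python
_annotations = {}  # maps a type to {name: (unique, value)}


def set_annotation(tp, name, unique, value):
    """Model of PyType_SetAnnotation: store value under name, tagged by unique."""
    slots = _annotations.setdefault(tp, {})
    if name in slots and slots[name][0] is not unique:
        # Assumed behaviour: refuse to overwrite somebody else's entry.
        raise KeyError("annotation %r is owned by someone else" % name)
    slots[name] = (unique, value)


def get_annotation(tp, name, unique):
    """Model of PyType_GetAnnotation: return the value, or None if unset."""
    entry = _annotations.get(tp, {}).get(name)
    if entry is None:
        return None
    owner, value = entry
    if owner is not unique:
        raise KeyError("annotation %r is owned by someone else" % name)
    return value


# A module-level object plays the role of "a global variable whose
# address can be used for unique".
TP_FOO_KEY = object()


class Spam(object):
    pass


set_annotation(Spam, "tp_foo", TP_FOO_KEY, 42)
result = get_annotation(Spam, "tp_foo", TP_FOO_KEY)
```

A second extension that happens to pick the name "tp_foo" as well would
pass a different token and get an error rather than the wrong data.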
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From guido@python.org  Fri Feb  8 03:03:14 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 07 Feb 2002 22:03:14 -0500
Subject: [Python-Dev] Extending types in C - help needed
In-Reply-To: Your message of "Wed, 06 Feb 2002 21:53:08 +0100."
 <104601c1af50$4888bd90$e000a8c0@thomasnotebook>
References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net>
 <104601c1af50$4888bd90$e000a8c0@thomasnotebook>
Message-ID: <200202080303.g1833EO23629@pcp742651pcs.reston01.va.comcast.net>

> > A better solution is to store additional information in the __dict__.
> 
> You lose nice features: access these (new) slots from Python
> by providing tp_members entries for them (for example).

I'm not sure I understand what you mean.  Why would you need a
tp_members entry for something that's in __dict__?

> Are you planning to address this issue in the future?

David Abrahams (of Boost++ fame) is also interested in a solution for
this problem, so I may have to.  Not in 2.2.1, though -- this will
have to be rearchitected so it's a 2.3 issue.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Feb  8 03:04:47 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 07 Feb 2002 22:04:47 -0500
Subject: [Python-Dev] PYC Magic
In-Reply-To: Your message of "Thu, 07 Feb 2002 12:48:11 +0100."
 <3C62697B.3555501@lemburg.com>
References: <3C62697B.3555501@lemburg.com>
Message-ID: <200202080304.g1834l423644@pcp742651pcs.reston01.va.comcast.net>

> FYI, I've bumped the PYC magic in a non-standard way (the old 
> standard broke on 2002-01-01); please review:

This is fine.  I never intended the algorithm as reversible, just as a
way to come up with unique magic numbers.  There is no requirement
that from the magic number one can calculate the date it was assigned.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Fri Feb  8 05:34:10 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 7 Feb 2002 23:34:10 -0600
Subject: [Python-Dev] Re: Python optmizations
In-Reply-To: <20020206163126.B4071@ibook.distro.conectiva>
References: <20020206163126.B4071@ibook.distro.conectiva>
Message-ID: <15459.25426.546160.663165@12-248-41-177.client.attbi.com>

Gustavo,

Thanks for the note.  Funny coincidence the timing of your note like a bolt
out of the blue and my attendance at IPC10 where Jeremy Hylton and I led a
session this afternoon on optimization issues.

    Gustavo> Recently, I have found your paper about peephole optimization,
    Gustavo> ...

I should probably check my peephole optimizer into the nondist sandbox on
SF.  It shouldn't take very much effort.  Unlike Rattlesnake, it still
pretty much works.

    Gustavo> ... discovered that I'm not original, and repeated most of your
    Gustavo> ideas and mistakes.

I was not original either.  I'm sure I repeated the mistakes of many other
people as well.

    Gustavo> One thing I thought and also found a reference in your paper is
    Gustavo> about some instructions that should be turned into a single
    Gustavo> opcode. To understand how this would affect the code, I have
    Gustavo> disassembled the whole Python standard library, and the whole
    Gustavo> Zope library. After that I've run a script to detect opcode
    Gustavo> repeatings (excluding SET_LINENO).

It sounds like your measurements were made on the static bytecode.  I
suspect you might find the dynamic opcode pair frequencies interesting as
well.  That's what most people look at when deciding what really follows
what.  (For example, did you consider the basic block structure of the
bytecode or just adjacency in the disassembler output?)  You can get this
information by defining the two macros DXPAIRS and DYNAMIC_EXECUTION_PROFILE
when compiling Python.  I think all you need to recompile is ceval.c and
sysmodule.c.  Once you've done this, run your scripts as usual, then before
exit (you might just want to run your subject script with the -i flag), call
sys.getdxp().  That will return you a 257x256 list (well, a list containing
257 other lists, each of which contains 256 elements).  The value at
location [i][j] corresponds to the frequency of opcode j being executed
immediately after opcode i (or the other way around - a quick peek at the
code will tell you which is which).
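
A table of that shape can be boiled down to the most frequent pairs with a
few lines of Python. This is a sketch under assumptions: top_pairs is a
made-up helper, it reads [i][j] as "j executed right after i" (the
orientation still needs checking against the source, as noted above), and
it ignores the extra 257th row. sys.getdxp() only exists in an interpreter
compiled with the macros mentioned above, so the call is guarded.

```python
import dis
import sys


def top_pairs(table, count=5):
    """Return the `count` most frequent (first, second, frequency) triples.

    `table` is a list of rows with 256 counters each; only the first
    256 rows are read.
    """
    pairs = []
    for i, row in enumerate(table[:256]):
        for j, freq in enumerate(row):
            if freq:
                pairs.append((freq, i, j))
    pairs.sort(reverse=True)  # highest frequency first
    return [(dis.opname[i], dis.opname[j], freq)
            for freq, i, j in pairs[:count]]


# Absent unless the interpreter was built with the profiling macros.
getdxp = getattr(sys, "getdxp", None)
if getdxp is not None:
    for first, second, freq in top_pairs(getdxp()):
        print("%8d  %s, %s" % (freq, first, second))
```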

At one point I had an xmlrpc server running on manatee.mojam.com to which
people could submit such opcode frequency arrays, however nobody submitted
anything to it, so I eventually turned it off.  I would be happy to crank it
up again.  With the atexit module and xmlrpclib in the core library, it's
dead easy to instrument a program so it automatically dumps the data to my
server upon program exit.

    Gustavo> 23632   LOAD_FAST, LOAD_ATTR

This is not all that surprising and supports Jeremy's belief (which I agree
with) that self.attr is a very common construct in the language.

    Gustavo> 15382   LOAD_CONST, LOAD_CONST

Now, this is interesting.  If those constants are numbers and the next
opcode is a BINARY_*, my peephole optimizer can elide that operation and
create a new constant, so something like

    LOAD_CONST 60
    LOAD_CONST 60
    BINARY_MULTIPLY

would get converted to simply 

    LOAD_CONST 3600
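
The folding step can be sketched in a few lines of Python. This is not the
optimizer discussed here, just a toy over a list of (opname, argument)
pairs: fold_constants and FOLDABLE are illustrative names, the LOAD_CONST
argument is the constant itself rather than an index into co_consts, and
jump targets are ignored, which a real peephole pass could not do.

```python
import operator

# Binary opcodes this sketch knows how to fold (illustrative, not complete).
FOLDABLE = {
    "BINARY_ADD": operator.add,
    "BINARY_SUBTRACT": operator.sub,
    "BINARY_MULTIPLY": operator.mul,
}


def fold_constants(code):
    """Replace LOAD_CONST a, LOAD_CONST b, BINARY_op by one LOAD_CONST."""
    out = []
    for op, arg in code:
        if (op in FOLDABLE and len(out) >= 2
                and out[-1][0] == "LOAD_CONST"
                and out[-2][0] == "LOAD_CONST"
                and isinstance(out[-1][1], (int, float))
                and isinstance(out[-2][1], (int, float))):
            right = out.pop()[1]
            left = out.pop()[1]
            out.append(("LOAD_CONST", FOLDABLE[op](left, right)))
        else:
            out.append((op, arg))
    return out


folded = fold_constants([
    ("LOAD_CONST", 60),
    ("LOAD_CONST", 60),
    ("BINARY_MULTIPLY", None),
])
```

Because the folded constant is pushed back onto the output list, a chain
such as 60 * 60 * 24 collapses in the same pass.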

    Gustavo> 12842   JUMP_IF_FALSE, POP_TOP
    Gustavo> 12397   CALL_FUNCTION, POP_TOP

I don't think these can be avoided.

    Gustavo> 12121   LOAD_FAST, LOAD_FAST

While this pair occurs frequently, they are very cheap instructions.  All
you'd be saving is a trip around the opcode dispatch loop.

    Gustavo> Not by coincidence, I found in your paper references to a
    Gustavo> LOAD_FAST_ATTR opcode. Since you probably have mentioned this
    Gustavo> to others, I wouldn't like to bother everyone again asking why
    Gustavo> it was not implemented. Could you please explain me the reasons
    Gustavo> that left this behind?

LOAD_ATTR is a *very* expensive opcode (perhaps second only to CALL_FUNCTION
on a per-instruction basis). Jeremy measured minimums of around 500 clock
cycles and means of around 1200 clock cycles for this opcode.  In contrast,
it appears that a single trip around the opcode dispatch loop is on the
order of 50 clock cycles, so merging a LOAD_FAST/LOAD_ATTR pair into one
instruction only saves about 50 cycles.  What you want to eliminate is the
500+ cycles from the LOAD_ATTR instruction.  Jeremy and I both have ideas
about how to accomplish some of that, but it's not a trivial task.

I believe in most cases I got about a 5% speedup with peephole optimization.
That's nothing to sneeze at I suppose, but there were some barriers to
adoption.  First and foremost, generating that instruction requires running
my optimizer, which isn't blindingly fast.  (Probably fast enough for a
"compileall" step that you execute once at install time, but maybe too slow
to use regularly on-the-fly.)  It also predates the compiler Jeremy
implemented in Python.  It would probably be fairly easy to hang my
optimizer off the back end of his compiler as an optional pass.  It looks
like Guido would like to see a little work put into regaining some of the
performance that was lost between 1.5.2 and 2.2, so now would probably be a
good time to dust off my optimizer.

    Gustavo> If you have the time, I'd also like to understand what's the
    Gustavo> trouble involved in getting a peephole optimizer in the python
    Gustavo> compiler itself.  Is it just about compiling performance?  I
    Gustavo> don't remember to have read about this in your paper, but you
    Gustavo> probably thought about that as well.

Mostly just time.  Tick tick tick...

Skip


From guido@python.org  Fri Feb  8 16:50:31 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 11:50:31 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
Message-ID: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>

Inspired by talks by Jeremy and Skip on DevDay, here's a different
idea for speeding up access to globals.  It retains semantics but (like
Jeremy's proposal) changes the type of a module's __dict__.

- Let a cell be a really simple PyObject, containing a PyObject
  pointer and a cell pointer.  Both pointers may be NULL.  (It may
  have to be called PyGlobalCell since I believe there's already a
  PyCell object.)  (Maybe it doesn't even have to be an object -- it
  could just be a tiny struct.)

- Let a celldict be a mapping that is implemented using a dict of
  cells.  When you use its getitem method, the PyObject * in the cell
  is dereferenced, and if a NULL is found, getitem raises KeyError
  even if the cell exists.  Using setitem to add a new value creates a
  new cell and stores the value there; using setitem to change the
  value for an existing key stores the value in the existing cell for
  that key.  There's a separate API to access the cells.

- We change the module implementation to use a celldict for its
  __dict__.  The module's getattr and setattr operations now map to
  getitem and setitem on the celldict.  I think the type of
  <module>.__dict__ and globals() is the only backwards
  incompatibility.

- When a module is initialized, it gets its __builtins__ from the
  __builtin__ module, which is itself a celldict.  For each cell in
  __builtins__, the new module's __dict__ adds a cell with a NULL
  PyObject pointer, whose cell pointer points to the corresponding
  cell of __builtins__.

- The compiler generates LOAD_GLOBAL_CELL <i> (and STORE_GLOBAL_CELL
  <i> etc.) opcodes for references to globals, where <i> is a small
  index with meaning only within one code object like the const index
  in LOAD_CONST.  The code object has a new tuple, co_globals, giving
  the names of the globals referenced by the code indexed by <i>.  I
  think no new analysis is required to be able to do this.

- When a function object is created from a code object and a celldict,
  the function object creates an array of cell pointers by asking the
  celldict for cells corresponding to the names in the code object's
  co_globals.  If the celldict doesn't already have a cell for a
  particular name, it creates an empty one.  This array of cell
  pointers is stored on the function object as func_cells.  When a
  function object is created from a regular dict instead of a
  celldict, func_cells is a NULL pointer.

- When the VM executes a LOAD_GLOBAL_CELL <i> instruction, it gets
  cell number <i> from func_cells.  It then looks in the cell's
  PyObject pointer, and if not NULL, that's the global value.  If it
  is NULL, it follows the cell's cell pointer to the next cell, if it
  is not NULL, and looks in the PyObject pointer in that cell.  If
  that's also NULL, or if there is no second cell, NameError is
  raised.  (It could follow the chain of cell pointers until a NULL
  cell pointer is found; but I have no use for this.)  Similar for
  STORE_GLOBAL_CELL <i>, except it doesn't follow the cell pointer
  chain -- it always stores in the first cell.

- There are fallbacks in the VM for the case where the function's
  globals aren't a celldict, and hence func_cells is NULL.  In that
  case, the code object's co_globals is indexed with <i> to find the
  name of the corresponding global and this name is used to index the
  function's globals dict.


I believe that's it.  I think this faithfully implements the current
semantics (where a global can shadow a builtin), but without the need
for any dict lookups when accessing globals, except in cases where an
explicit dict is passed to exec or eval().

Compare this to Jeremy's scheme using dlicts:

http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobals

- My approach doesn't require global agreement on the numbering of the
  globals; each code object has its own numbering.  This avoids the
  need for more global analysis, and allows adding code to a module
  using exec that introduces new globals without having to fall back
  on a less efficient scheme.

- Jeremy's approach might be a teensy bit faster because it may have
  to do less work in the LOAD_GLOBAL; but I'm not convinced.


Here's an implementation sketch for cell and celldict.  Note in
particular that keys() only returns the keys for which the cell's
objptr is not NULL.


NULL = object() # used as a token


class cell(object):

    def __init__(self):
        self.objptr = NULL
        self.cellptr = NULL


class celldict(object):

    def __init__(self):
        self.__dict = {} # dict of cells

    def getcell(self, key):
        c = self.__dict.get(key)
        if c is None:
            c = cell()
            self.__dict[key] = c
        return c

    def __getitem__(self, key):
        c = self.__dict.get(key)
        if c is None:
            raise KeyError, key
        value = c.objptr
        if value is NULL:
            raise KeyError, key
        else:
            return value

    def __setitem__(self, key, value):
        c = self.__dict.get(key)
        if c is None:
            c = cell()
            self.__dict[key] = c
        c.objptr = value

    def __delitem__(self, key):
        c = self.__dict.get(key)
        if c is None or c.objptr is NULL:
            raise KeyError, key
        c.objptr = NULL

    def keys(self):
        return [k for k, c in self.__dict.items() if c.objptr is not NULL]

    def clear(self):
        for c in self.__dict.values():
            c.objptr = NULL

    # Etc.
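
The sketch above covers the dict-like interface only. The lookup that
LOAD_GLOBAL_CELL and STORE_GLOBAL_CELL would perform, as described in the
bullet list earlier, can be modelled the same way. This is a hedged
illustration written for a current Python 3; Cell, load_global_cell and
store_global_cell are names chosen here, and the real thing would live in
the C virtual machine.

```python
NULL = object()  # stands in for a C NULL pointer, as in the sketch above


class Cell(object):
    def __init__(self, objptr=NULL, cellptr=NULL):
        self.objptr = objptr    # the value, or NULL
        self.cellptr = cellptr  # the builtin's cell, or NULL


def load_global_cell(func_cells, i, names):
    """Model of LOAD_GLOBAL_CELL <i>: inspects at most two cells."""
    c = func_cells[i]
    if c.objptr is not NULL:
        return c.objptr            # the module defines the name
    if c.cellptr is not NULL and c.cellptr.objptr is not NULL:
        return c.cellptr.objptr    # fall back to the builtin's cell
    raise NameError(names[i])


def store_global_cell(func_cells, i, value):
    """Model of STORE_GLOBAL_CELL <i>: always writes the first cell."""
    func_cells[i].objptr = value


builtin_len = Cell(objptr=len)              # a cell of __builtins__
co_globals = ("len", "spam")                # names referenced by the code
func_cells = [Cell(cellptr=builtin_len), Cell()]

assert load_global_cell(func_cells, 0, co_globals) is len
store_global_cell(func_cells, 0, "shadow")  # a global now shadows the builtin
assert load_global_cell(func_cells, 0, co_globals) == "shadow"
assert builtin_len.objptr is len            # the builtin itself is untouched
```

No dict is consulted on either path; the name tuple is only used to build
the NameError message.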


--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Fri Feb  8 18:04:45 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 8 Feb 2002 19:04:45 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
Message-ID: <029a01c1b0cb$195cb400$ced241d5@hagrid>

I propose adding a basic time type (or time base type ;-) to the standard
library, which can be subclassed by more elaborate date/time/timestamp
implementations, such as mxDateTime, custom types provided by DB-API
drivers, etc.

The goal is to make it easy to extract the year, month, day, hour, minute,
and second from any given time object.

Or to put it another way, I want the following to work for any time object,
including mxDateTime objects, any date/timestamp returned by a DB-API
driver, and weird date/time-like types I've developed myself:

    if isinstance(t, basetime):
        # yay! it's a timestamp
        print t.timetuple()

The goal is not to standardize any behaviour beyond this; anything else
should be provided by subtypes.
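
A minimal sketch of how such a base type and one subtype could look, in
present-day Python. Everything here is an assumption made for
illustration: the proposal only names basetime and timetuple(), while the
EpochTime subtype, the describe helper, and the choice of the
time.gmtime() tuple layout are invented for this example.

```python
import time


class basetime(object):
    """Hypothetical base type: subtypes only promise timetuple()."""

    def timetuple(self):
        raise NotImplementedError


class EpochTime(basetime):
    """Example subtype storing seconds since the Unix epoch (UTC)."""

    def __init__(self, seconds):
        self.seconds = seconds

    def timetuple(self):
        # Assumed layout: the nine-item tuple used by the time module.
        return tuple(time.gmtime(self.seconds))


def describe(t):
    """Format any basetime subtype without knowing its concrete class."""
    if isinstance(t, basetime):
        return "%04d-%02d-%02d %02d:%02d:%02d" % t.timetuple()[:6]
    raise TypeError("not a time object")


stamp = describe(EpochTime(0))
```

Code such as describe never needs to know which library produced the
object, which is the point of the isinstance test above.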

More details here:

    http://effbot.org/ideas/time-type.htm

I can produce PEP and patch if necessary.

</F>



From guido@python.org  Fri Feb  8 18:09:27 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 13:09:27 -0500
Subject: [Python-Dev] Speeding up instance attribute access
Message-ID: <200202081809.g18I9SD02879@pcp742651pcs.reston01.va.comcast.net>

Inspired by the second half of Jeremy's talk on DevDay, here's my
alternative approach for speeding up instance attribute access.  Like
my idea for globals, it uses double indirection rather than
recompilation.


- We only care about attributes of 'self' (which is identified as the
  first argument of a method, not by name).  We can exclude functions
  from our analysis that make any assignment to self -- this is
  extremely rare and would throw off our analysis.  We should also
  exclude static methods and class methods, since their first argument
  doesn't have the same role.

- Static analysis of the source code of a class (without access to the
  base class) can determine attributes of the class, and to some extent
  instance variables.  Without also analyzing the base classes, this
  analysis cannot reliably distinguish between instance variables and
  methods inherited from a base class; it can distinguish between
  instance variables and methods defined in the current class.

- We can guess the status of un-assigned-to inherited attributes by
  seeing whether they are called or not.  This is not 100% accurate,
  so we need things to work (if slower) even when we guess wrong.

- For instance variable references and stores of the form self.<name>,
  the bytecode compiler emits opcodes LOAD_SELF_IVAR <i> and
  STORE_SELF_IVAR <i>, where <i> is a small int identifying the
  instance variable (ivar).  A particular ivar is identified by the
  same <i> throughout all methods defined in the same class statement,
  but there is no attempt to coordinate this across different classes
  related by inheritance.

- It would be nice if we also had a single-opcode way to express a
  method call on self, e.g. CALL_SELF_METHOD <i>, <n>, <k> where <i>
  identifies the method like above, and <n> and <k> are the number of
  positional and keyword arguments.  Or maybe we should just have
  LOAD_SELF_METHOD <i> which may be able to skip looking in the
  instance dict.

- Some data structure describing the mapping from <i> to attribute
  name, and whether it's an ivar or a method, is produced by the
  compiler and stored in the class __dict__.  The function objects
  representing methods also contain a pointer to this data structure.
  (Or the code objects?  But it needs to be shared.  Details, details.)

- When a class object is created (at run-time), another data structure
  is created that accumulates the <i>-to-name mappings from that class
  and all its base classes.


From guido@python.org  Fri Feb  8 18:11:06 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 13:11:06 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Fri, 08 Feb 2002 19:04:45 +0100."
 <029a01c1b0cb$195cb400$ced241d5@hagrid>
References: <029a01c1b0cb$195cb400$ced241d5@hagrid>
Message-ID: <200202081811.g18IB6a02905@pcp742651pcs.reston01.va.comcast.net>


From guido@python.org  Fri Feb  8 18:14:34 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 13:14:34 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Fri, 08 Feb 2002 19:04:45 +0100."
 <029a01c1b0cb$195cb400$ced241d5@hagrid>
References: <029a01c1b0cb$195cb400$ced241d5@hagrid>
Message-ID: <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net>

>     http://effbot.org/ideas/time-type.htm
> 
> I can produce PEP and patch if necessary.

Yes, a PEP, please!  Jim Fulton has been asking for this for a long
time too.  His main requirement is that timestamp objects are small,
both in memory and as pickles, because Zope keeps a lot of these
around.  They are currently represented either as long ints (with a
little under 64 bits) or as 8-byte strings.  A dedicated timestamp
object could be smaller than that.

Your idea of a base type (which presumably standardizes at least one
form of representation) sounds like a breakthrough that can help
satisfy different other needs.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Feb  8 18:16:25 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 13:16:25 -0500
Subject: [Python-Dev] Speeding up instance attribute access
In-Reply-To: Your message of "Fri, 08 Feb 2002 13:09:27 EST."
 <200202081809.g18I9SD02879@pcp742651pcs.reston01.va.comcast.net>
References: <200202081809.g18I9SD02879@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <200202081816.g18IGPq02967@pcp742651pcs.reston01.va.comcast.net>

Forget that.  I hit "send" accidentally; my fingers seem jittery after
the conference.  I'll send the real proposal in a while.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Fri Feb  8 19:05:03 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 08 Feb 2002 20:05:03 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <029a01c1b0cb$195cb400$ced241d5@hagrid> <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C64215F.AF31BB96@lemburg.com>

Guido van Rossum wrote:
> 
> >     http://effbot.org/ideas/time-type.htm
> >
> > I can produce PEP and patch if necessary.
> 
> Yes, a PEP, please!  Jim Fulton has been asking for this for a long
> time too.  His main requirement is that timestamp objects are small,
> both in memory and as pickles, because Zope keeps a lot of these
> around.  They are currently represented either as long ints (with a
> little under 64 bits) or as 8-byte strings.  A dedicated timestamp
> object could be smaller than that.
> 
> Your idea of a base type (which presumably standardizes at least one
> form of representation) sounds like a breakthrough that can help
> satisfy different other needs.

Sounds like a plan :-) 

In order to make mxDateTime subtypes of this new type we'd need to
make sure that the datetime type uses a true struct subset of what
I have in DateTime objects now:

typedef struct {
    PyObject_HEAD

    /* Representation used to do calculations */
    long absdate;               /* number of days since 31.12. in the year 1 BC
                                   calculated in the Gregorian calendar. */
    double abstime;             /* seconds since 0:00:00.00 (midnight)
                                   on the day pointed to by absdate */

 ...lots of broken down values needed to assure roundtrip safety...

}

Depending on the size of PyObject_HEAD, this should meet Jim
Fulton's requirements (the base type would of course not implement
the "..." part :-).
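
A rough sketch of what the absdate/abstime pair encodes, in modern
Python rather than C.  It assumes the struct comment means that day 1
is 0001-01-01 in the proleptic Gregorian calendar, which is the same
convention date.toordinal() uses; from_abs() and to_abs() are
hypothetical helper names, not part of mxDateTime.

```python
from datetime import date, datetime, time, timedelta


def from_abs(absdate, abstime):
    """Turn an (absdate, abstime) pair into a datetime.

    Assumption: absdate counts days since 31.12. of 1 BC, so
    absdate == 1 is 0001-01-01, matching date.toordinal().
    """
    day = date.fromordinal(absdate)
    # abstime is seconds since midnight on that day
    return datetime.combine(day, time()) + timedelta(seconds=abstime)


def to_abs(dt):
    """Inverse of from_abs(): split a datetime into (absdate, abstime)."""
    midnight = datetime.combine(dt.date(), time())
    return dt.toordinal(), (dt - midnight).total_seconds()
```

Only these two numbers are needed to reconstruct the full date and
time; the broken-down values in the "..." part are derived data.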

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Fri Feb  8 19:08:43 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 08 Feb 2002 20:08:43 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <029a01c1b0cb$195cb400$ced241d5@hagrid> <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net> <3C64215F.AF31BB96@lemburg.com>
Message-ID: <3C64223B.F345E4AE@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> In order to make mxDateTime subtypes of this new type we'd need to
> make sure that the datetime type uses a true struct subset of what
> I have in DateTime objects now:
> 
> typedef struct {
>     PyObject_HEAD
> 
>     /* Representation used to do calculations */
>     long absdate;               /* number of days since 31.12. in the year 1 BC
>                                    calculated in the Gregorian calendar. */
>     double abstime;             /* seconds since 0:00:00.00 (midnight)
>                                    on the day pointed to by absdate */
> 
>  ...lots of broken down values needed to assure roundtrip safety...
> 
> }
> 
> Depending on the size of PyObject_HEAD, this should meet Jim
> Fulton's requirements (the base type would of course not implement
> the "..." part :-).

I forgot to mention that there is another object type in mxDateTime
too: DateTimeDelta. That's the type needed to represent the time 
difference between two DateTime instances, or what people usually
call "time" :-)

It has the following type "signature":

typedef struct {
    PyObject_HEAD
    double seconds;             /* number of delta seconds */

...some broken down values needed to assure roundtrip safety...

}

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From guido@python.org  Fri Feb  8 19:18:20 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 14:18:20 -0500
Subject: [Python-Dev] Speeding up instance attribute access
Message-ID: <200202081918.g18JIKb03242@pcp742651pcs.reston01.va.comcast.net>

(By mistake I sent an incomplete version of this earlier.  Please
ignore that and read this instead.)

Inspired by the second half of Jeremy's talk on DevDay, here's my
alternative approach for speeding up instance attribute access.  Like
my idea for globals, it uses double indirection rather than
recompilation.


- We only care about attributes of 'self' (which is identified as the
  first argument of a method, not by name).  We can exclude functions
  from our analysis that make any assignment to self -- this is
  extremely rare and would throw off our analysis.  We should also
  exclude static methods and class methods, since their first argument
  doesn't have the same role.

- Static analysis of the source code of a class (without access to the
  base class) can determine attributes of the class, and to some extent
  instance variables.  Without also analyzing the base classes, this
  analysis cannot reliably distinguish between instance variables and
  methods inherited from a base class; it can distinguish between
  instance variables and methods defined in the current class.

- We can guess the status of un-assigned-to inherited attributes by
  seeing whether they are called or not.  This is not 100% accurate,
  so we need things to work (if slower) even when we guess wrong.

- For instance variable references and stores of the form self.<name>,
  the bytecode compiler emits opcodes LOAD_SELF_IVAR <i> and
  STORE_SELF_IVAR <i>, where <i> is a small int identifying the
  instance variable (ivar).  A particular ivar is identified by the
  same <i> throughout all methods defined in the same class statement,
  but there is no attempt to coordinate this across different classes
  related by inheritance.

- Some data structure describing the mapping from <i> to attribute
  name, and whether it's an ivar or a method, is produced by the
  compiler and stored in the class, in a way that a user can't change
  it.  The function objects representing methods also contain a
  pointer to this data structure.  (Or the code objects?  But it needs
  to be shared.  Details, details.)

- At *run time*, when a class object is created, another data
  structure is created that accumulates the <i>-to-name mappings from
  that class and all its base classes.  This has more accurate
  information because it can collect information from the bases
  (though it can still be fooled by dynamic manipulations of classes).
  In particular, it has the canonical set of known instance variables
  for instances of this class.  This is stored in the class object, in
  a way that a user can't change it.

- When an instance is created, the run-time data structure stored in
  the class is consulted to know how many instance variables to
  allocate.  An array is allocated at the end of the instance (or in a
  separately allocated block?) with one PyObject pointer for each
  known instance variable.  There's also a pointer to the instance
  __dict__, to hold instance variables that the parser didn't spot,
  but this pointer starts off as NULL -- a dictionary is created for
  it only when needed.

- There is no requirement that the layout of the ivar array for
  instances of a subclass is an extension of the layout of the ivar
  array for its base classes.  But there *is* a requirement that all
  instances of the same class that are not instances of a subclass
  (i.e., all x and y where x.__class__ is y.__class__) have the same
  layout, and this layout is determined by the run-time data structure
  stored in the class.

- Now all we need is an efficient way to map LOAD_SELF_IVAR <i> to an
  index in the array of ivars.  Two different classes play a role
  here: the class of self (the run-time class) and the class that
  defined the method containing the LOAD_SELF_IVAR <i> opcode (the
  compile-time class).  We assume the run-time class is a subclass of
  the compile-time class.  The correct mapping can easily be
  calculated from the data structures left behind by the compiler in
  the compile-time class and by class construction at run time in the
  run-time class.  Since this mapping doesn't change, it can be
  calculated once and then cached.  The cache can be held in the
  run-time class; it can be a dictionary whose keys are compile-time
  classes.  This means a single dict lookup that must be done once per
  method call (and only when LOAD_SELF_IVAR <i> or STORE_SELF_IVAR <i>
  is used).  We could save even that dict lookup in most cases by
  caching the run-time class and the outcome with the method.

- We need fallbacks for various exceptional cases:

  * If the compile-time class uses LOAD_SELF_IVAR <i> but the run-time
    class doesn't think that is an instance variable, LOAD_SELF_IVAR
    must fall back to look in the instance dict (if non-NULL) and then
    down the run-time class and its base classes.

  * If the ivar slot in the instance corresponding to <i> exists but
    is NULL, LOAD_SELF_IVAR <i> must fall back to searching the
    run-time class and its base classes.

  For both cases, the mapping from <i> to attribute names must be
  available.  The language and the current code generation guarantee
  that 'self' is the first local variable.

- The instance __dict__ must become a proxy that knows whether a given
  name is stored in the array of ivars or in the overflow dict; this
  is much like Jeremy's DLict.

- Note that there are two sources of savings: the major savings
  (probably) comes from avoiding a dict lookup for every ivar access;
  an additional minor savings comes from collapsing two opcodes:

      LOAD_FAST       0 (self)
      LOAD_ATTR       1 (foo)

  into one:

      LOAD_SELF_IVAR  0 (self.foo)

- I don't know if we should try to generate LOAD_SELF_IVAR <i> only
  for things that really are (likely) ivars, or for all attributes.
  Maybe it would be nice if we also had a single-opcode way to express
  a method call on self, e.g. CALL_SELF_METHOD <i>, <n>, <k> where <i>
  identifies the method, and <n> and <k> are the number of positional
  and keyword arguments.  Or maybe we should just have
  LOAD_SELF_METHOD <i> which looks in the instance overflow dict (but
  only if non-NULL) and in the class and bases, but can avoid looking
  in the ivars array if <i> does not describe a known ivar (again this
  information can be cached).

- The required global analysis is a bit hairy, and not something we
  already do.  I believe that Jeremy thinks PyChecker already does
  this; I'm not sure if that means we can borrow code or just ideas.
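
To make the double indirection concrete, here is a minimal sketch in
Python of the data structures described above.  Everything in it is
hypothetical (CompiledClassInfo, RuntimeLayout, Instance, the helper
functions); a real implementation would live in C inside the
interpreter, and the class_lookup callback merely stands in for the
search through the run-time class and its bases.

```python
NULL = object()   # stands in for a C NULL pointer in an ivar slot


class CompiledClassInfo:
    """Left behind by the compiler: maps index <i> to an attribute name."""
    def __init__(self, names):
        self.names = tuple(names)


class RuntimeLayout:
    """Built when a class is created: one canonical slot per known ivar."""
    def __init__(self, infos):
        # infos: compile-time info for the class and all of its bases
        self.slot_of = {}
        for info in infos:
            for name in info.names:
                self.slot_of.setdefault(name, len(self.slot_of))
        self.cache = {}   # compile-time info -> tuple of slot numbers

    def mapping_for(self, info):
        """Translate <i> to a slot; calculated once, then cached."""
        if info not in self.cache:
            self.cache[info] = tuple(self.slot_of.get(n) for n in info.names)
        return self.cache[info]


class Instance:
    def __init__(self, layout):
        self.layout = layout
        self.ivars = [NULL] * len(layout.slot_of)
        self.overflow = None   # overflow dict, created only when needed


def store_self_ivar(inst, info, i, value):
    slot = inst.layout.mapping_for(info)[i]
    if slot is not None:
        inst.ivars[slot] = value
    else:
        if inst.overflow is None:
            inst.overflow = {}
        inst.overflow[info.names[i]] = value


def load_self_ivar(inst, info, i, class_lookup):
    slot = inst.layout.mapping_for(info)[i]
    if slot is not None and inst.ivars[slot] is not NULL:
        return inst.ivars[slot]          # fast path: no dict lookup
    name = info.names[i]
    if slot is None and inst.overflow is not None and name in inst.overflow:
        return inst.overflow[name]       # fallback 1: not a known ivar
    return class_lookup(name)            # fallback 2: class and bases
```

The sketch does the cache lookup on every access for simplicity; the
proposal would do it once per method call.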

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Feb  8 19:51:02 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 14:51:02 -0500
Subject: Half-baked idea (was Re: [Python-Dev] Extending types in C - help needed)
In-Reply-To: Your message of "Wed, 06 Feb 2002 10:24:31 CST."
 <20020206102426.A20584@unpythonic.dhs.org>
References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net>
 <20020206102426.A20584@unpythonic.dhs.org>
Message-ID: <200202081951.g18Jp2g03461@68.49.146.65>

[Idea about extending variable-length structures at the front instead
of at the back]

The problem with applying this idea to Python objects, IMO, is that
Python requires the object header to be at the start.  Anything
operating on a PyObject * expects that it can use the Py_INCREF and
Py_DECREF macros, and those expect the refcount to be the first field
and the type pointer to be the second.

So our objects are already constrained at the front.

Also, the GC implementation already uses this trick: it adds three
fields in front of the structure.  But then it assumes you can use
fixed address calculations to translate between the object and the GC
header.  Adding something in front of the GC header would be too
painful.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Feb  8 19:54:16 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 14:54:16 -0500
Subject: Half-baked idea (was Re: [Python-Dev] Extending types in C - help needed)
In-Reply-To: Your message of "Wed, 06 Feb 2002 10:24:31 CST."
 <20020206102426.A20584@unpythonic.dhs.org>
References: <43ABBA6E-0AD4-11D6-A4BB-003065517236@oratrix.nl> <08c501c19f8c$72631b20$e000a8c0@thomasnotebook> <200201171951.OAA00909@cj20424-a.reston1.va.home.com> <060801c1a052$93d5a860$e000a8c0@thomasnotebook> <200201200053.TAA30250@cj20424-a.reston1.va.home.com> <0ac001c1af0d$b2e51dc0$e000a8c0@thomasnotebook> <200202061436.g16EaRK21446@pcp742651pcs.reston01.va.comcast.net>
 <20020206102426.A20584@unpythonic.dhs.org>
Message-ID: <200202081954.g18JsG003558@pcp742651pcs.reston01.va.comcast.net>

[Idea about extending variable-length structures at the front instead
of at the back]

The problem with applying this idea to Python objects, IMO, is that
Python requires the object header to be at the start.  Anything
operating on a PyObject * expects that it can use the Py_INCREF and
Py_DECREF macros, and those expect the refcount to be the first field
and the type pointer to be the second.

So our objects are already constrained at the front.

Also, the GC implementation already uses this trick: it adds three
fields in front of the structure.  But then it assumes you can use
fixed address calculations to translate between the object and the GC
header.  Adding something in front of the GC header would be too
painful.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Fri Feb  8 20:16:33 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 08 Feb 2002 15:16:33 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net>

[/F]
>     http://effbot.org/ideas/time-type.htm
>
> I can produce PEP and patch if necessary.

[Guido]
> Yes, a PEP, please!  Jim Fulton has been asking for this for a long
> time too.  His main requirement is that timestamp objects are small
> both in memory and as pickles, because Zope keeps a lot of these
> around.  They are currently represented either as long ints (with a
> little under 64 bits) or as 8-byte strings.  A dedicated timestamp
> object could be smaller than that.

Are you sure Jim is looking to replace the TimeStamp object?  All the
complaints I've seen aren't about the relatively tiny TimeStamp object, but
about Zope's relatively huge DateTime class (note that you won't have source
for that if you're looking at a StandaloneZODB checkout -- DateTime is used
at higher Zope levels), which is a Python class with a couple dozen(!)
instance attributes.  See, e.g.,

    http://dev.zope.org/Wikis/DevSite/Proposals/ReplacingDateTime

It seems clear from the source code that TimeStamp is exactly what Jim
intended it to be <wink>.

> Your idea of a base type (which presumably standardizes at least one
> form of representation) sounds like a breakthrough that can help
> satisfy different other needs.

Best I can make out, /F is only proposing what Jim would call an Interface:
the existence of two methods, timetuple() and utctimetuple().  In a comment
on his page, /F calls it an "abstract" base class, which is more C++-ish
terminology, and the sample implementation makes clear it's a "pure"
abstract base class, so same thing as a Jim Interface in the end.
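
A minimal sketch of such a "pure" abstract base class, written with
today's abc module.  Only the two method names come from the
description above; BaseTime and EpochTime are hypothetical names, and
this is not /F's sample implementation.

```python
import time
from abc import ABC, abstractmethod


class BaseTime(ABC):
    """Interface only: no storage, no representation."""

    @abstractmethod
    def timetuple(self):
        """Return the time as a local-time struct/tuple."""

    @abstractmethod
    def utctimetuple(self):
        """Return the time as a UTC struct/tuple."""


class EpochTime(BaseTime):
    """One possible concrete type: seconds since the Unix epoch."""

    def __init__(self, seconds):
        self.seconds = seconds

    def timetuple(self):
        return time.localtime(self.seconds)

    def utctimetuple(self):
        return time.gmtime(self.seconds)
```

Any type that supplies both methods satisfies the interface, whatever
it stores internally.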



From guido@python.org  Fri Feb  8 20:27:16 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 15:27:16 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Fri, 08 Feb 2002 15:16:33 EST."
 <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net>
Message-ID: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>

[Tim]
> Are you sure Jim is looking to replace the TimeStamp object?  All the
> complaints I've seen aren't about the relatively tiny TimeStamp object, but
> about Zope's relatively huge DateTime class (note that you won't have source
> for that if you're looking at a StandaloneZODB checkout -- DateTime is used
> at higher Zope levels), which is a Python class with a couple dozen(!)
> instance attributes.  See, e.g.,
> 
>     http://dev.zope.org/Wikis/DevSite/Proposals/ReplacingDateTime
> 
> It seems clear from the source code that TimeStamp is exactly what Jim
> intended it to be <wink>.

I'm notoriously bad at channeling Jim.  Nevertheless, I do recall him
saying he wanted a lightweight time object.  I think the mistake of
DateTime is that it stores the broken-out info, rather than computing
it on request.

> > Your idea of a base type (which presumably standardizes at least one
> > form of representation) sounds like a breakthrough that can help
> > satisfy different other needs.
> 
> Best I can make out, /F is only proposing what Jim would call an Interface:
> the existence of two methods, timetuple() and utctimetuple().  In a comment
> on his page, /F calls it an "abstract" base class, which is more C++-ish
> terminology, and the sample implementation makes clear it's a "pure"
> abstract base class, so same thing as a Jim Interface in the end.

I'll show the PEP to Jim when it appears.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Fri Feb  8 20:47:28 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 08 Feb 2002 15:47:28 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net>

[Guido]
> I'm notoriously bad at channeling Jim.  Nevertheless, I do recall him
> saying he wanted a lightweight time object.

Given that most mallocs align to 8-byte boundaries these days (also true of
pymalloc), it's impossible in reality to define a smaller object than
TimeStamp, provided it needs at least one byte of info beyond PyObject_HEAD.
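
The arithmetic behind that claim, as a small sketch.  The 8-byte
header (4-byte refcount plus 4-byte type pointer on a 32-bit build)
and the 8-byte TimeStamp payload are assumptions for illustration;
allocated_size() is a made-up helper.

```python
def allocated_size(header_bytes, payload_bytes, align=8):
    """Bytes actually consumed once the allocator rounds up to `align`."""
    raw = header_bytes + payload_bytes
    return -(-raw // align) * align   # ceiling division, then scale back


HEADER = 8   # assumed size of PyObject_HEAD on a 32-bit build

timestamp_like = allocated_size(HEADER, 8)   # 8 bytes of payload
one_byte_only = allocated_size(HEADER, 1)    # a single byte of payload
```

Both come out at 16 bytes: any object carrying at least one byte
beyond the header is rounded up to the same size as an 8-byte payload.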

> I think the mistake of DateTime is that it stores the broken-out info,
> rather than computing it on request.

Possibly, but hard to say, since speed of display is also an issue, and I
imagine also speed of range searches.  At least 2.2 makes it easy to define
computed attributes, any of which could choose to cache their ultimate
value, but none of which would need to be stored in pickles.
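
A sketch of that idea using properties (shown in modern syntax).
LazyDateTime and its attribute names are hypothetical; the point is
only that the broken-out fields are computed on request, cached, and
left out of the pickled state.

```python
import time


class LazyDateTime:
    def __init__(self, seconds):
        self._seconds = seconds   # the only real state
        self._tuple = None        # cache, filled on first use

    def _broken_out(self):
        if self._tuple is None:
            self._tuple = time.gmtime(self._seconds)
        return self._tuple

    @property
    def year(self):
        return self._broken_out().tm_year

    @property
    def month(self):
        return self._broken_out().tm_mon

    def __getstate__(self):
        # A one-element tuple, so the state is never a false value
        # (pickle skips __setstate__ when the state is false).
        return (self._seconds,)

    def __setstate__(self, state):
        (self._seconds,) = state
        self._tuple = None        # rebuild the cache lazily after loading
```

Display code that reads .year repeatedly pays for the conversion once
per object, while the pickle holds a single number.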



From fdrake@acm.org  Fri Feb  8 20:50:03 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 8 Feb 2002 15:50:03 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>
References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15460.14843.719197.239058@grendel.zope.com>

Guido van Rossum writes:
 >     def keys(self):
 >         return [c.objptr for c in self.__dict.keys() if c.objptr is not NULL]

I presume you meant values() here rather than keys()?  The keys()
method could simply delegate to self.__dict.  I imagine most of us can
fill in any additional dictionary methods, though.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From jeremy@alum.mit.edu  Fri Feb  8 01:04:23 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 7 Feb 2002 20:04:23 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net>
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net>
Message-ID: <15459.9239.83647.334632@gondolin.digicool.com>

>>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:

  TP> [Guido]
  >> I'm notoriously bad at channeling Jim.  Nevertheless, I do recall
  >> him saying he wanted a lightweight time object.

  TP> Given that most mallocs align to 8-byte boundaries these days
  TP> (also true of pymalloc), it's impossible in reality to define a
  TP> smaller object than TimeStamp, provided it needs at least one
  TP> byte of info beyond PyObject_HEAD.

Also, it may not be necessary to have a TimeStamp object in ZODB 4.
There are three uses for the timestamp: tracking how recently an
object was used for cache eviction, providing a last modified time to
users, and as a simple version number.  

In ZODB 4, the cache eviction may be done quite differently.  The
version number may be a simple int.  The last mod time will not be
provided for each object; instead, users will need to define this
themselves if they care about it.  If they define it themselves,
they'd probably use a DateTime object, but we'd care much less about
how small it is.

Jeremy



From niemeyer@conectiva.com  Fri Feb  8 20:56:35 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Fri, 8 Feb 2002 18:56:35 -0200
Subject: [Python-Dev] Re: Python optmizations
In-Reply-To: <15459.25426.546160.663165@12-248-41-177.client.attbi.com>
References: <20020206163126.B4071@ibook.distro.conectiva> <15459.25426.546160.663165@12-248-41-177.client.attbi.com>
Message-ID: <20020208185635.B4607@ibook.distro.conectiva>

Hi Skip!

> Thanks for the note.  Funny coincidence the timing of your note like a bolt
> out of the blue and my attendance at IPC10 where Jeremy Hylton and I led a
> session this afternoon on optimization issues.

You have a powerful mind... ;-)

[...]
> I should probably check my peephole optimizer into the nondist sandbox on
> SF.  It shouldn't take very much effort.  Unlike Rattlesnake, it still
> pretty much works.

Please, do that. I'd like to have a look at it.

[...]
> It sounds like your measurements were made on the static bytecode.  I

Indeed.

> suspect you might find the dynamic opcode pair frequencies interesting as
> well.  That's what most people look at when deciding what really follows
> what.  (For example, did you consider the basic block structure of the
> bytecode or just adjacency in the disassembler output?)  You can get this

Yes, I have customized the disassembler a little bit.

> information by defining the two macros DXPAIRS and DYNAMIC_EXECUTION_PROFILE
> when compiling Python.  I think all you need to recompile is ceval.c and
> sysmodule.c.  Once you've done this, run your scripts as usual, then before
> exit (you might just want to run your subject script with the -i flag), call
> sys.getdxp().  That will return you a 257x256 list (well, a list containing
> 257 other lists, each of which contains 256 elements).  The value at
> location [i][j] corresponds to the frequency of opcode j being executed
> immediately after opcode i (or the other way around - a quick peek at the
> code will tell you which is which).

I was aware of this because of the code just after the dispatch_opcode
label. Again, I'm not original. :-) On the other hand, when running an
application you have data about that specific application, and the
behavior of that run (next time it may follow other paths). While this
is good, because you know what opcodes are being repeated most often,
measuring static data may give you a wider view of repeating opcodes.
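
For reference, a static count of adjacent opcode pairs can be sketched
in a few lines with the dis module of a current Python (this is not the
customized disassembler mentioned above, and static_pair_counts is a
made-up name).  It looks only at adjacency in the disassembly, so it
ignores basic block boundaries, and the opcode names it reports depend
on the interpreter version.

```python
import dis
from collections import Counter


def static_pair_counts(code):
    """Count adjacent opcode-name pairs in a code object.

    Nested code objects (functions, classes) found in co_consts are
    counted too, each on its own.
    """
    counts = Counter()
    names = [ins.opname for ins in dis.get_instructions(code)]
    counts.update(zip(names, names[1:]))
    for const in code.co_consts:
        if hasattr(const, "co_code"):   # a nested code object
            counts.update(static_pair_counts(const))
    return counts
```

Calling static_pair_counts(compile(source, "<src>", "exec")) and then
.most_common(10) on the result gives a table of the same shape as the
one posted earlier.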

> At one point I had an xmlrpc server running on manatee.mojam.com to which
> people could submit such opcode frequency arrays, however nobody submitted
> anything to it, so I eventually turned it off.  I would be happy to crank it
> up again.  With the atexit module and xmlrpclib in the core library, it's
> dead easy to instrument a program so it automatically dumps the data to my
> server upon program exit.

Now, *that* is something interesting. If you're really going to put the
system up, you can count on my help if you need any.

>     Gustavo> 15382   LOAD_CONST, LOAD_CONST
> 
> Now, this is interesting.  If those constants are numbers and the next
> opcode is a BINARY_*, my peephole optimizer can elide that operation and
> create a new constant, so something like
> 
>     LOAD_CONST 60
>     LOAD_CONST 60
>     BINARY_MULTIPLY
> 
> would get converted to simply 
> 
>     LOAD_CONST 3600

Good point!!

>>> def f():  
...   return 2+1*5
... 
>>> dis.dis(f)
          0 SET_LINENO               1

          3 SET_LINENO               2
          6 LOAD_CONST               1 (2)
          9 LOAD_CONST               2 (1)
         12 LOAD_CONST               3 (5)
         15 BINARY_MULTIPLY     
         16 BINARY_ADD          
         17 RETURN_VALUE        
         18 LOAD_CONST               0 (None)
         21 RETURN_VALUE        

That's something we shouldn't leave behind.
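
A toy version of that folding step, operating on a plain list of
(opname, argument) tuples rather than real bytecode.  This is only a
sketch of the technique, not Skip's optimizer: fold_constants and
BINARY_OPS are hypothetical names, LOAD_CONST carries the value itself
instead of an index, and jump targets are ignored.

```python
import operator

BINARY_OPS = {
    "BINARY_ADD": operator.add,
    "BINARY_SUBTRACT": operator.sub,
    "BINARY_MULTIPLY": operator.mul,
}


def fold_constants(instructions):
    """Replace LOAD_CONST a, LOAD_CONST b, BINARY_x with one LOAD_CONST."""
    out = []
    for op, arg in instructions:
        if (op in BINARY_OPS and len(out) >= 2
                and out[-1][0] == "LOAD_CONST"
                and out[-2][0] == "LOAD_CONST"
                and isinstance(out[-1][1], (int, float))
                and isinstance(out[-2][1], (int, float))):
            right = out.pop()[1]
            left = out.pop()[1]
            out.append(("LOAD_CONST", BINARY_OPS[op](left, right)))
        else:
            out.append((op, arg))
    return out


# The body of f() above: return 2+1*5
body = [
    ("LOAD_CONST", 2),
    ("LOAD_CONST", 1),
    ("LOAD_CONST", 5),
    ("BINARY_MULTIPLY", None),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
folded = fold_constants(body)
```

Because each folded result is pushed back as a LOAD_CONST, a chain like
this one collapses in a single pass.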

[...]
>     Gustavo> 12121   LOAD_FAST, LOAD_FAST
> 
> While this pair occurs frequently, they are very cheap instructions.  All
> you'd be saving is a trip around the opcode dispatch loop.

I see..

>     Gustavo> Not by coincidence, I found in your paper references to a
>     Gustavo> LOAD_FAST_ATTR opcode. Since you probably have mentioned this
>     Gustavo> to others, I wouldn't like to bother everyone again asking why
>     Gustavo> it was not implemented. Could you please explain to me the
>     Gustavo> reasons that left this behind?
> 
> LOAD_ATTR is a *very* expensive opcode (perhaps only second to CALL_FUNCTION
> on a per-instruction basis). Jeremy measured minimums of around 500 clock
> cycles and means of around 1200 clock cycles for this opcode.  In contrast,
> it appears that a single trip around the opcode dispatch loop is on the
> order of 50 clock cycles, so merging a LOAD_FAST/LOAD_ATTR pair into one
> instruction only saves about 50 cycles.  What you want to eliminate is the
> 500+ cycles from the LOAD_ATTR instruction.  Jeremy and I both have ideas
> about how to accomplish some of that, but it's not a trivial task.

Hummmm... pretty interesting! Thanks for the explanation.

> I believe in most cases I got about a 5% speedup with peephole optimization.
> That's nothing to sneeze at I suppose, but there were some barriers to
> adoption.  First and foremost, generating that instruction requires running
> my optimizer, which isn't blindingly fast.  (Probably fast enough for a
> "compileall" step that you execute once at install time, but maybe too slow
> to use regularly on-the-fly.)  It also predates the compiler Jeremy
> implemented in Python.  It would probably be fairly easy to hang my
> optimizer off the back end of his compiler as an optional pass.  It looks

I understand. That's something to be implemented in C, once we know
the efforts are worthwhile. Maybe 5% is not that much, but optimization
is something we do once, and benefit forever. A good peephole interface,
with pluggable passes, will also motivate new optimizations, in the
peephole itself and around it.

> like Guido would like to see a little work put into regaining some of the
> performance that was lost between 1.5.2 and 2.2, so now would probably be a
> good time to dust off my optimizer.

No doubt.

[...]
> Mostly just time.  Tick tick tick...

I don't have much of this thing lately.. :-) But I'll try to use
some of it helping wherever possible.

Thank you!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


From guido@python.org  Fri Feb  8 21:01:16 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 16:01:16 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Fri, 08 Feb 2002 15:50:03 EST."
 <15460.14843.719197.239058@grendel.zope.com>
References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>
 <15460.14843.719197.239058@grendel.zope.com>
Message-ID: <200202082101.g18L1GA04598@pcp742651pcs.reston01.va.comcast.net>

> Guido van Rossum writes:
>  >     def keys(self):
>  >         return [c.objptr for c in self.__dict.keys() if c.objptr is not NULL]
> 
> I presume you meant values() here rather than keys()?  The keys()
> method could simply delegate to self.__dict.  I imagine most of us can
> fill in any additional dictionary methods, though.

Oops, I was indeed confused.  I think I meant this:

    def keys(self):
        return [k for k, c in self.__dict.iteritems() if c.objptr is not NULL]

And indeed I expected that you could extrapolate to the other
methods. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Feb  8 21:03:33 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 16:03:33 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Thu, 07 Feb 2002 20:04:23 EST."
 <15459.9239.83647.334632@gondolin.digicool.com>
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net>
 <15459.9239.83647.334632@gondolin.digicool.com>
Message-ID: <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net>

> In ZODB 4, the cache eviction may be done quite differently.  The
> version number may be a simple int.  The last mod time will not be
> provided for each object; instead, users will need to define this
> themselves if they care about it.  If they define it themselves,
> they'd probably use a DateTime object, but we'd care much less about
> how small it is.

In that case, I take back everything I've said about Jim Fulton's
requirements.  I'm quite sure that in the past he said he needed a
very lightweight date/time object, but from what you say it appears
this need has disappeared.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake@acm.org  Fri Feb  8 21:04:28 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 8 Feb 2002 16:04:28 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202082101.g18L1GA04598@pcp742651pcs.reston01.va.comcast.net>
References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>
 <15460.14843.719197.239058@grendel.zope.com>
 <200202082101.g18L1GA04598@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15460.15708.873799.157131@grendel.zope.com>

Guido van Rossum writes:
 > Oops, I was indeed confused.  I think I meant this:
 > 
 >     def keys(self):
 >         return [k for k, c in self.__dict.iteritems() if c.objptr is not NULL]

Was I not clear, or am I missing something entirely?  keys() needs
*no* special treatment, but items() and values() do:

class celldict(object):
    ...

    def keys(self):
        return self.__dict.keys()

    def items(self):
        return [(k, c.objptr) for k, c in self.__dict.iteritems()
                if c.objptr is not NULL]

    def values(self):
        return [c.objptr for c in self.__dict.itervalues()
                if c.objptr is not NULL]


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From fdrake@acm.org  Fri Feb  8 21:07:04 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 8 Feb 2002 16:07:04 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net>
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net>
 <15459.9239.83647.334632@gondolin.digicool.com>
 <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15460.15864.266226.241495@grendel.zope.com>

Guido van Rossum writes:
 > In that case, I take back everything I've said about Jim Fulton's
 > requirements.  I'm quite sure that in the past he said he needed a
 > very lightweight date/time object, but from what you say it appears
 > this need has disappeared.

He wanted this for the catalog, and I suspect he still does.  Both
size and performance (of comparisons) were important, not rendering
time.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From neal@metaslash.com  Fri Feb  8 21:10:25 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 08 Feb 2002 16:10:25 -0500
Subject: [Python-Dev] Speeding up instance attribute access
References: <200202081918.g18JIKb03242@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C643EC1.A5D6B094@metaslash.com>

Guido van Rossum wrote:

> - The required global analysis is a bit hairy, and not something we
>   already do.  I believe that Jeremy thinks PyChecker already does
>   this; I'm not sure if that means we can borrow code or just ideas.

The algorithm pychecker uses is pretty simple.  
I think something like this should work:

	for each method:
	    self = method.co_varnames[0]
	    for each byte code in method:
		if op == STORE_FAST and oparg == self:
		    break # we don't know self anymore

		if (op == STORE_ATTR or op == LOAD_ATTR) and selfOnTop:
		    # we have an attribute store it off
		selfOnTop = (LOAD_FAST and oparg == self)

Note that storing the attributes could be done during the compilation step.
This means that it should be simple housekeeping in the compiler to store
the info and not require another pass (as in pychecker).
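
The loop above can be approximated in present-day Python with the standard
dis module.  This is only a sketch of the idea, not pychecker's actual code:
the function name self_attributes and the Point class are made up for
illustration, and the handling of fused LOAD_FAST/STORE_FAST instructions is
a conservative guess that depends on the CPython version.

```python
import dis


def self_attributes(method):
    """Return the attribute names read or written directly on the first argument."""
    code = method.__code__
    self_name = code.co_varnames[0]  # assumes the method takes at least one argument
    attrs = set()
    self_on_top = False
    for ins in dis.get_instructions(code):
        # Newer CPython versions fuse adjacent loads/stores into one instruction
        # whose argval is a tuple of names; normalise to a tuple either way.
        names = ins.argval if isinstance(ins.argval, tuple) else (ins.argval,)
        if ins.opname.startswith("STORE_FAST") and self_name in names:
            break  # self may have been rebound; we don't know it anymore
        if ins.opname in ("LOAD_ATTR", "STORE_ATTR", "LOAD_METHOD") and self_on_top:
            attrs.add(ins.argval)  # we have an attribute, store it off
        # self is on top of the stack right after it was the last name loaded
        self_on_top = ins.opname.startswith("LOAD_FAST") and names[-1] == self_name
    return attrs


class Point:
    def move(self, dx):
        self.x = self.x + dx
        return self.y

    def rebind(self):
        self = None
        return self.z


print(sorted(self_attributes(Point.move)))
```

Doing the same bookkeeping inside the compiler would avoid the separate
pass over the bytecode that this sketch makes.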

Neal


From guido@python.org  Fri Feb  8 21:19:54 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 16:19:54 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Fri, 08 Feb 2002 16:04:28 EST."
 <15460.15708.873799.157131@grendel.zope.com>
References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net> <15460.14843.719197.239058@grendel.zope.com> <200202082101.g18L1GA04598@pcp742651pcs.reston01.va.comcast.net>
 <15460.15708.873799.157131@grendel.zope.com>
Message-ID: <200202082119.g18LJs405546@pcp742651pcs.reston01.va.comcast.net>

> Guido van Rossum writes:
>  > Oops, I was indeed confused.  I think I meant this:
>  > 
>  >     def keys(self):
>  >         return [k for k, c in self.__dict.iteritems() if c.objptr is not NULL]
> 
> Was I not clear, or am I missing something entirely?

I'm guessing both. ;-)

> keys() needs
> *no* special treatment, but items() and values() do:
> 
> class celldict(object):
>     ...
> 
>     def keys(self):
>         return self.__dict.keys()

Wrong.  keys() *does* need special treatment.  If c.objptr is NULL,
the cell exists, but keys() should not return the corresponding key.
This is so that len(x.keys()) == len(x.values()), amongst other
reasons!

>     def items(self):
>         return [(k, c.objptr) for k, c in self.__dict.iteritems()
>                 if c.objptr is not NULL]
> 
>     def values(self):
>         return [c.objptr for c in self.__dict.itervalues()
>                 if c.objptr is not NULL]

Yes, these are correct.
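
To make the invariant concrete, here is a minimal sketch in present-day
Python 3 (so items() stands in for iteritems()).  The Cell class, the NULL
sentinel, the _cells attribute and the getcell() helper are assumptions made
up for illustration; they are not code from the proposal.

```python
NULL = object()  # stand-in for a C NULL pointer (assumption of this sketch)


class Cell:
    """Hypothetical cell: an object pointer plus a pointer to another cell."""

    def __init__(self, objptr=NULL, cellptr=NULL):
        self.objptr = objptr
        self.cellptr = cellptr


class CellDict:
    """Mapping implemented as a plain dict of key -> Cell."""

    def __init__(self):
        self._cells = {}

    def getcell(self, key):
        # Separate API to reach the cells; creates an empty cell on demand.
        return self._cells.setdefault(key, Cell())

    def __setitem__(self, key, value):
        # Store into the existing cell if there is one, else into a new cell.
        self.getcell(key).objptr = value

    def __getitem__(self, key):
        cell = self._cells[key]  # KeyError if there is no cell at all
        if cell.objptr is NULL:
            raise KeyError(key)  # the cell exists but holds no value
        return cell.objptr

    def keys(self):
        # Skip keys whose cell is empty, so keys() and values() stay in step.
        return [k for k, c in self._cells.items() if c.objptr is not NULL]

    def values(self):
        return [c.objptr for c in self._cells.values() if c.objptr is not NULL]

    def items(self):
        return [(k, c.objptr) for k, c in self._cells.items()
                if c.objptr is not NULL]


d = CellDict()
d["spam"] = 1
d.getcell("eggs")  # an empty cell, e.g. reserved for a not-yet-assigned global
print(d.keys(), d.values())
```

With the unfiltered keys() the last line would report two keys but only one
value, which is exactly the mismatch described above.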

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Fri Feb  8 21:17:28 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 8 Feb 2002 16:17:28 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202082119.g18LJs405546@pcp742651pcs.reston01.va.comcast.net>
References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>
 <15460.14843.719197.239058@grendel.zope.com>
 <200202082101.g18L1GA04598@pcp742651pcs.reston01.va.comcast.net>
 <15460.15708.873799.157131@grendel.zope.com>
 <200202082119.g18LJs405546@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15460.16488.20690.943775@grendel.zope.com>

Guido van Rossum writes:
 > I'm guessing both. ;-)
...
 > Wrong.  keys() *does* need special treatment.  If c.objptr is NULL,
 > the cell exists, but keys() should not return the corresponding key.
 > This is so that len(x.keys()) == len(x.values()), amongst other
 > reasons!

Ow!  Bad Fred!  I should know better than to speak up on a Friday!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From guido@python.org  Fri Feb  8 21:22:24 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Feb 2002 16:22:24 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Fri, 08 Feb 2002 16:07:04 EST."
 <15460.15864.266226.241495@grendel.zope.com>
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net> <15459.9239.83647.334632@gondolin.digicool.com> <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net>
 <15460.15864.266226.241495@grendel.zope.com>
Message-ID: <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net>

> Guido van Rossum writes:
>  > In that case, I take back everything I've said about Jim Fulton's
>  > requirements.  I'm quite sure that in the past he said he needed a
>  > very lightweight date/time object, but from what you say it appears
>  > this need has disappeared.
> 
> He wanted this for the catalog, and I suspect he still does.  Both
> size and performance (of comparisons) were important, not rendering
> time.

Is comparison the same thing Tim mentioned as range searches?  I guess
a representation like current Zope timestamps or what time.time()
returns is fine for that -- it is monotonic even if not necessarily
continuous.  I guess a broken-out time tuple is much harder to compare.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Fri Feb  8 21:31:09 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 8 Feb 2002 16:31:09 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net>
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net>
 <15459.9239.83647.334632@gondolin.digicool.com>
 <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net>
 <15460.15864.266226.241495@grendel.zope.com>
 <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15460.17309.905113.103005@grendel.zope.com>

Guido van Rossum writes:
 > Is comparison the same thing Tim mentioned as range searches?  I guess
 > a representation like current Zope timestamps or what time.time()
 > returns is fine for that -- it is monotonic even if not necessarily
 > continuous.  I guess a broken-out time tuple is much harder to compare.

Yes; as long as ordering is easy to check, we're fine with a long int
or some such thing.  The range search is indeed the specific
application Jim has in mind.
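
As a rough illustration of why cheap ordering is all a range search needs,
here is a sketch that assumes the timestamps are kept as plain integers
(seconds since the epoch) in an ascending list.  The function name and the
sample values are invented; the catalog's real index structure is not
described in this thread.

```python
import bisect


def range_search(sorted_stamps, start, stop):
    """Return the stamps t with start <= t < stop from an ascending list."""
    lo = bisect.bisect_left(sorted_stamps, start)
    hi = bisect.bisect_left(sorted_stamps, stop)
    return sorted_stamps[lo:hi]


# Sample data only: four integer timestamps in ascending order.
stamps = [1013202076, 1013202213, 1013202268, 1013202384]
hits = range_search(stamps, 1013202100, 1013202300)
print(hits)
```

Each probe is a handful of integer comparisons, which is the property a
broken-out time tuple would make more expensive.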


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From mal@lemburg.com  Fri Feb  8 22:17:31 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 08 Feb 2002 23:17:31 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net>
 <15459.9239.83647.334632@gondolin.digicool.com>
 <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net>
 <15460.15864.266226.241495@grendel.zope.com>
 <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> <15460.17309.905113.103005@grendel.zope.com>
Message-ID: <3C644E7B.F69AF73C@lemburg.com>

"Fred L. Drake, Jr." wrote:
> 
> Guido van Rossum writes:
>  > Is comparison the same thing Tim mentioned as range searches?  I guess
>  > a representation like current Zope timestamps or what time.time()
>  > returns is fine for that -- it is monotonic even if not necessarily
>  > continuous.  I guess a broken-out time tuple is much harder to compare.
> 
> Yes; as long as ordering is easy to check, we're fine with a long int
> or some such thing.  The range search is indeed the specific
> application Jim has in mind.

Uhm... I think this thread is heading in the wrong direction. 

Fredrik wasn't proposing a solution to Jim's particular
problem (whatever it was ;-), but instead opting for a solution
for a large number of Python users out there.

While mxDateTime probably works for most of them (and is used by
pretty much all major database modules out there), some may feel
that they don't want to rely on external libs for their software
to run.

I would be willing to make the mxDateTime types subtypes of
whatever Fredrik comes up with. The only requirement I have is
that the binary footprint of the types needs to match today's
layout of mxDateTime types since I need to maintain binary
compatibility.

The other possibility would be adding a set of new types
to mxDateTime which focus on low memory requirements rather
than data roundtrip safety and speed.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From neal@metaslash.com  Fri Feb  8 22:54:51 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 08 Feb 2002 17:54:51 -0500
Subject: [Python-Dev] Python 2.2 group missing on SF Patches
Message-ID: <3C64573B.3412540F@metaslash.com>

There is no 2.2 (or 2.2.1) choice under Group when submitting a patch
on SourceForge.

Neal


From tim.one@comcast.net  Fri Feb  8 23:04:49 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 08 Feb 2002 18:04:49 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOELENKAA.tim.one@comcast.net>

I'm not looking for point-by-point answers here, I'm just pointing out
things that were hard to follow so that they may get addressed in a
revision.

[Guido]
> Inspired by talks by Jeremy and Skip on DevDay, here's a different
> idea for speeding up access to globals.  It retains semantics but (like
> Jeremy's proposal) changes the type of a module's __dict__.
>
> - Let a cell be a really simple PyObject, containing a PyObject
>   pointer and a cell pointer.

Meaning a pointer to a cell, I bet.  Note that in the pseudo-code at the
end, the cellptr member of cell objects is never referenced, so it's hard to
be sure.

>   Both pointers may be NULL.  (It may have to be called PyGlobalCell
>   since I believe there's already a PyCell object.)

There is a PyCellObject already.

>   (Maybe it doesn't even have to be an object -- it could just be a tiny
>    struct.)

Would probably make it much harder to use the existing dict code (which maps
PyObjects to PyObjects).

> - Let a celldict be a mapping that is implemented using a dict of
>   cells.

Presumably this is a mapping *to* cells, and from ...?  String objects?

>   When you use its getitem method, the PyObject * in the cell is
>   dereferenced, and if a NULL is found, getitem raises KeyError
>   even if the cell exists.

Had a hard time with this:

1. Letting p be "the PyObject* in the cell", are you saying p==NULL or
   *p==NULL is the KeyError trigger?  "dereference" suggests the latter,
   but the former seems to make more sense.

2. Presumably the first "the cell" in this sentence refers to a
   different cell than the second "the cell" intends.

>   Using setitem to add a new value creates a new cell and stores the
>   value there;

Presumably in the PyObject* member of the new cell.  To what is the cellptr
member of the new cell set?  I think NULL.

>   using setitem to change the value for an existing key stores the
>   value in the existing cell for that key.  There's a separate API to
>   access the cells.

delitem is missing, but presumably straightforward.

> - We change the module implementation to use a celldict for its
>   __dict__.  The module's getattr and setattr operations now map to
>   getitem and setitem on the celldict.  I think the type of
>   <module>.__dict__ and globals() is the only backwards
>   incompatibility.
>
> - When a module is initialized, it gets its __builtins__ from the
>   __builtin__ module, which is itself a celldict.

Surely the __builtin__ module isn't a celldict, but rather has a __dict__
that is a celldict.

>   For each cell in __builtins__, the new module's __dict__ adds a cell
>   with a NULL  PyObject pointer, whose cell pointer points to the
>   corresponding cell of __builtins__.
>
> - The compiler generates LOAD_GLOBAL_CELL <i> (and STORE_GLOBAL_CELL
>   <i> etc.) opcodes for references to globals, where <i> is a small
>   index with meaning only within one code object like the const index
>   in LOAD_CONST.  The code object has a new tuple, co_globals, giving
>   the names of the globals referenced by the code indexed by <i>.  I
>   think no new analysis is required to be able to do this.

Me too.

> - When a function object is created from a code object and a celldict,
>   the function object creates an array of cell pointers by asking the
>   celldict for cells corresponding to the names in the code object's
>   co_globals.  If the celldict doesn't already have a cell for a
>   particular name, it creates an empty one.  This array of cell
>   pointers is stored on the function object as func_cells.

I expect that the more we use these guys (cells), the more valuable to make
them PyObjects in their own right (for uniformity, ease of introspection,
etc).

>   When a function object is created from a regular dict instead of a
>   celldict, func_cells is a NULL pointer.

This part is regrettable, since it's Yet Another NULL check at the *top* of
code using this stuff (meaning it slows the normal case, assuming that it's
unusual not to get a celldict).  I'm not clear on how code ends up getting
created from a regular dict instead of a celldict -- is this because of
stuff like "exec whatever in mydict"?

> - When the VM executes a LOAD_GLOBAL_CELL <i> instruction, it gets
>   cell number <i> from func_cells.  It then looks in the cell's
>   PyObject pointer, and if not NULL, that's the global value.  If it
>   is NULL, it follows the cell's cell pointer to the next cell, if it
>   is not NULL, and looks in the PyObject pointer in that cell.  If
>   that's also NULL, or if there is no second cell, NameError is
>   raised.  (It could follow the chain of cell pointers until a NULL
>   cell pointer is found; but I have no use for this.)  Similar for
>   STORE_GLOBAL_CELL <i>, except it doesn't follow the cell pointer
>   chain -- it always stores in the first cell.

If I'm reading this right, then in the normal case of resolving "len" in

def mylen(s):
    return len(s)

1. We test func_cells for NULL and find out it isn't.
2. A pointer to a cell object is read out of func_cells at a fixed (wrt
   this function) offset.  This points to len's cell object in the
   module's celldict.
3. The cell object's PyObject* pointer is tested and found to be NULL.
4. The cell object's cellptr pointer is tested and found not to be NULL.
   This points to len's cell object in __builtin__'s celldict.
5. The cell object's cellptr's PyObject* is tested and found not to be
   NULL.
6. The cell object's cellptr's PyObject* is returned.
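
The six steps can be sketched in Python as follows.  This is only a model of
the lookup described in the proposal, not VM code: the Cell class, the NULL
sentinel and the load_global_cell function are names made up here, and
step 1 (the func_cells NULL test) is left to the caller.

```python
NULL = object()  # stand-in for a C NULL pointer (assumption of this sketch)


class Cell:
    """Hypothetical cell with the two pointers described in the proposal."""

    def __init__(self, objptr=NULL, cellptr=NULL):
        self.objptr = objptr    # the value, or NULL
        self.cellptr = cellptr  # cell in the builtins celldict, or NULL


def load_global_cell(func_cells, i, name):
    cell = func_cells[i]                     # step 2: fixed offset into func_cells
    if cell.objptr is not NULL:              # step 3: module-level value?
        return cell.objptr
    if cell.cellptr is not NULL:             # step 4: is there a builtins cell?
        if cell.cellptr.objptr is not NULL:  # step 5: does it hold a value?
            return cell.cellptr.objptr       # step 6
    raise NameError(name)


# Model "len" as seen from a module that does not shadow it.
builtin_len = Cell(objptr=len)
module_len = Cell(cellptr=builtin_len)
func_cells = [module_len]
print(load_global_cell(func_cells, 0, "len")("abc"))
```

Storing into module_len.objptr afterwards shadows the builtin without
touching func_cells, which is why the array can be built once per function.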

> - There are fallbacks in the VM for the case where the function's
>   globals aren't a celldict, and hence func_cells is NULL.  In that
>   case, the code object's co_globals is indexed with <i> to find the
>   name of the corresponding global and this name is used to index the
>   function's globals dict.

Which may not succeed, so we also need another level to back off to the
builtins.  I'd like to pursue getting rid of the func_cells==NULL special
case, even if it means constructing a celldict out of a regular dict for the
duration, and feeding mutations back in to the regular dict afterwards.

> I believe that's it.  I think this faithfully implements the current
> semantics (where a global can shadow a builtin), but without the need
> for any dict lookups when accessing globals, except in cases where an
> explicit dict is passed to exec or eval().

I think <wink> I agree.

Note that a chain of 4 test+branches against NULL in "the usual case" for
builtins may not be faster on average than inlining the first few useful
lines of lookdict_string twice (the expected path in this routine became
fat-free for 2.2):

	i = hash;
	ep = &ep0[i];
	if (ep->me_key == NULL || ep->me_key == key)
		return ep;

Win or lose, that's usually the end of a dict lookup.  That is, I'm certain
we're paying significantly more for layers of C-level function call overhead
today than for what the dict implementation actually does now (in the usual
cases).

> Compare this to Jeremy's scheme using dlicts:
>
> http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobals
>
> - My approach doesn't require global agreement on the numbering of the
>   globals; each code object has its own numbering.  This avoids the
>   need for more global analysis,

Don't really care about that.

>   and allows adding code to a module using exec that introduces new
>   globals without having to fall back on a less efficient scheme.

That is indeed lovely.

> - Jeremy's approach might be a teensy bit faster because it may have
>   to do less work in the LOAD_GLOBAL; but I'm not convinced.

LOAD_GLOBAL is executed much more often than STORE_GLOBAL, so whichever
scheme wins for LOAD_GLOBAL will enjoy a multiplier effect when measuring
overall performance.

[and skipping the code]



From tim.one@comcast.net  Sat Feb  9 00:31:06 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 08 Feb 2002 19:31:06 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <3C644E7B.F69AF73C@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCELNNKAA.tim.one@comcast.net>

[M.-A. Lemburg]
> Uhm... I think this thread is heading in the wrong direction.

Maybe from your POV, but from our POV the only way we can get time to work
on Python is talk all of you into doing Zope work for Jim <wink>.

> Fredrik wasn't proposing a solution to Jim's particular
> problem (whatever it was ;-), but instead opting for a solution
> for a large number of Python users out there.

I believe all /F is asking for is that all datetime types supply two
specific methods, so that he can get the year etc out of anybody's datetime
object via a uniform spelling.  It's a fine idea.

> ...
> I would be willing to make the mxDateTime types subtypes of
> whatever Fredrik comes up with. The only requirement I have is
> that the binary footprint of the types needs to match today's
> layout of mxDateTime types since I need to maintain binary
> compatibility.

If /F is asking more than that datetime types implement a specific
interface, he's got some major rewriting to do <wink>.  Python doesn't have
a good way to spell "interface" now, so think of it as a do-nothing base
class, inheriting from which means absolutely nothing except that (a) you
promise to supply the methods /F specified, and (b) /F can use isinstance to
determine whether or not a given object supports this interface.
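
A sketch of that do-nothing base class, in present-day Python.  The names
BaseTime and EpochTime are invented, and timetuple()/utctimetuple() are only
assumed here as the two methods; nothing in this message names them.

```python
import time


class BaseTime:
    """Marker base class: inheriting promises the two methods below exist."""


class EpochTime(BaseTime):
    """Hypothetical datetime type storing seconds since the epoch."""

    def __init__(self, seconds):
        self.seconds = seconds

    def timetuple(self):
        return time.localtime(self.seconds)

    def utctimetuple(self):
        return time.gmtime(self.seconds)


def utc_year(obj):
    # (b) isinstance tells us whether the object supports the interface.
    if not isinstance(obj, BaseTime):
        raise TypeError("object does not support the time interface")
    # (a) so the promised method can be called with a uniform spelling.
    return obj.utctimetuple()[0]


print(utc_year(EpochTime(0)))
```

Any other datetime implementation could opt in the same way without
changing its internal layout.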

> The other possibility would be adding a set of new types
> to mxDateTime which focus on low memory requirements rather
> than data roundtrip safety and speed.

That's getting back to what Jim wants.  Maybe someone should ask him what
that is <wink>.



From jason@jorendorff.com  Sat Feb  9 04:09:59 2002
From: jason@jorendorff.com (Jason Orendorff)
Date: Fri, 8 Feb 2002 22:09:59 -0600
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <HFEKILOLEFEFMKAECNDLKEFLDBAA.jason@jorendorff.com>

Guido van Rossum wrote:
> - Let a cell be a really simple PyObject, containing a PyObject
>   pointer and a cell pointer.  Both pointers may be NULL.  (It may
>   have to be called PyGlobalCell since I believe there's already a
>   PyCell object.)  (Maybe it doesn't even have to be an object -- it
>   could just be a tiny struct.)
> 
> - Let a celldict be a mapping that is implemented using a dict of
>   cells.  When you use its getitem method, the PyObject * in the cell
>   is dereferenced, and if a NULL is found, getitem raises KeyError
>   even if the cell exists.  Using setitem to add a new value creates a
>   new cell and stores the value there; using setitem to change the
>   value for an existing key stores the value in the existing cell for
>   that key.  There's a separate API to access the cells.

The following is totally unimportant, but I feel compelled to share:

I implemented this once, long ago, for Python 1.5-ish, I believe.  I got
it to the point where it was only 15% slower than ordinary Python, then
abandoned it.  ;)  In my implementation, "cells" were real first-class
objects, and "celldict" was a copy-and-hack version of dictionary.  I
forget how the rest worked.

Anyway, this is all very exciting to me.  :)

## Jason Orendorff    http://www.jorendorff.com/


From tim.one@comcast.net  Sat Feb  9 04:35:22 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 08 Feb 2002 23:35:22 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <HFEKILOLEFEFMKAECNDLKEFLDBAA.jason@jorendorff.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMENKAA.tim.one@comcast.net>

[Jason Orendorff]
> The following is totally unimportant, but I feel compelled to share:
>
> I implemented this once, long ago, for Python 1.5-ish, I believe.  I got
> it to the point where it was only 15% slower than ordinary Python, then
> abandoned it.  ;)  In my implementation, "cells" were real first-class
> objects,

That shouldn't matter to speed via any first-order effect, unless you also
used accessor functions instead of direct reference to get at the data
members.

> and "celldict" was a copy-and-hack version of dictionary.

Hmm.

> I forget how the rest worked.
>
> Anyway, this is all very exciting to me.  :)

Don't worry -- it will run much faster if Guido codes it.  One key
difference is that Guido will run each cell in its own thread <wink>.



From tim.one@comcast.net  Sat Feb  9 05:02:28 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 09 Feb 2002 00:02:28 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <15459.9239.83647.334632@gondolin.digicool.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMHNKAA.tim.one@comcast.net>

[Jeremy Hylton]
> Also, it may not be necessary to have a TimeStamp object in ZODB 4.
> There are three uses for the timestamp: tracking how recently an
> object was used for cache eviction, providing a last modified time to
> users, and as a simple version number.
>
> In ZODB 4, the cache eviction may be done quite differently.  The
> version number may be a simple int.

WRT RAM usage, a Python int is no smaller than a TimeStamp object.  An int
pickle is likely much smaller, though.

> The last mod time will not be provided for each object; instead, users
> will need to define this themselves if they care about it.  If they
> define it themselves, they'd probably use a DateTime object, but we'd
> care much less about how small it is.

Unclear that we'd care less, if the catalog remains full of DateTime
objects, and Fred is channeling more faithfully than the rest of us <wink>.



From tim.one@comcast.net  Sat Feb  9 05:30:34 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 09 Feb 2002 00:30:34 -0500
Subject: [Python-Dev] PYC Magic
In-Reply-To: <3C62697B.3555501@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEMINKAA.tim.one@comcast.net>

[M.-A. Lemburg]
> FYI, I've bumped the PYC magic in a non-standard way (the old
> standard broke on 2002-01-01); please review:

Fine by me, except you should also check in a NEWS blurb about it.  The
current NEWS file says:

"""
- Because Python's magic number scheme broke on January 1st, we decided
  to stop Python development.  Thanks for all the fish!
"""

That's why PythonLabs hasn't done much of anything on Python since 2.2 was
released <wink>.

>        algorithm relying on the above scheme. Perhaps we should simply
>        start counting in increments of 10 from now on ?!

Why 10?  I'd rather see it incremented by 1.  If you respond that you want
to make room for more hacks akin to -U, my response would be that's exactly
what I want to prevent by blessing 1 <0.4 wink>.



From srichter@cbu.edu  Sat Feb  9 07:22:37 2002
From: srichter@cbu.edu (Stephan Richter)
Date: Sat, 09 Feb 2002 01:22:37 -0600
Subject: [Python-Dev] proposal: add basic time type to the standard
 library
Message-ID: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu>

--=====================_112555085==_.ALT
Content-Type: text/plain; charset="us-ascii"; format=flowed

Hello everyone,

what a coincidence. I just was discussing this issue with Jason O. today. 
Here is my original mail:

Hey Jason,

I also want to start to think about a DateTime module. PostgreSQL has a nice 
discussion of their implementation: 
http://www.ca.postgresql.org/users-lounge/docs/7.1/user/datatype-datetime.html

Here is Java's stuff on it:
http://www3.cbu.edu/sciences/jdk1.3/docs/api/java/sql/Date.html


Low Level Data Types:

Date
Time
DateTime
TimeStamp - Timestamps are always in UTC.

* Intervals can be added or subtracted from themselves and the types above.
DateInterval
TimeInterval
DateTimeInterval
TimeStampInterval


Notes:

- The basic data type must be as small as possible, so that in applications 
that save these types often (e.g. ZODB/Persistent) it does not 
increase the amount of data by much.

- We should then have high-level classes that put in all the functionality, 
such as lower-level in/output. I think high-level in/output should be 
handled by functions inside the module, such as getDateTimeFromString(str, 
someI18NspecificInfo=None).

- We need flexible i18n support!!! This is very important, especially for 
Zope. By default the system should come with a gettext implementation, but 
I would like to have the module generic enough that we can define other 
types of translation and localization mechanisms. Mh, the more I think of 
it, the more I think we will end up building our own stuff and then 
exposing that via an API.

- The parsing of Date, Time and DateTimes as well as their Intervals 
(PostgreSQL has some very nice ways for that) should be tremendously 
flexible. I am thinking here about a plugin-type architecture, where you 
can create your own plugins for parsing. For example, while the "." 
notation was reserved for the European Date Formats until now, more and 
more American companies (which are totally ignorant that there might be 
another country besides the US in the world) use this notation to write the 
American Date Format this way too. Therefore we need to have a mechanism to 
switch between the two.
I thought of some sort of a list of regex expressions which try to resolve 
a string. Oh yeah, we need internationalization here as well of course, 
even though the parser should be generic enough.

- The tough part will be time zones. I am almost thinking that we need our 
own object for handling that. Timezones are horribly complex, but we need 
to handle them well. I know Zope's current DateTime implementation has a 
good handle on that, even though I think the code is horrible (sorry Jim).

- A professor just mentioned that we should also handle daylight saving. 
This is not even that trivial, but I agree with him; there needs to be 
support for that, even though most apps handle that via the time zone, 
which is ok for the numeric version, but not if you say "CST" for example.
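
One possible shape for the plugin-style parsing mentioned in the notes above,
as a sketch only.  It uses the date type from the present-day standard
library, and all function names and the two regular expressions are invented
for illustration.

```python
import re
from datetime import date

_DOTTED = re.compile(r"^(\d{1,2})\.(\d{1,2})\.(\d{4})$")
_ISO = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")


def _build(year, month, day):
    # Return None instead of raising, so the next parser gets a chance.
    try:
        return date(year, month, day)
    except ValueError:
        return None


def parse_iso(text):
    m = _ISO.match(text)
    return _build(*map(int, m.groups())) if m else None


def parse_european_dotted(text):
    m = _DOTTED.match(text)  # day.month.year
    if m is None:
        return None
    day, month, year = map(int, m.groups())
    return _build(year, month, day)


def parse_american_dotted(text):
    m = _DOTTED.match(text)  # month.day.year
    if m is None:
        return None
    month, day, year = map(int, m.groups())
    return _build(year, month, day)


def parse_date(text, parsers):
    """Try each parser plugin in order and return the first result."""
    for parser in parsers:
        result = parser(text)
        if result is not None:
            return result
    raise ValueError("no parser accepted %r" % (text,))


european = [parse_iso, parse_european_dotted]
american = [parse_iso, parse_american_dotted]
print(parse_date("8.2.2002", european), parse_date("8.2.2002", american))
```

Switching between the two dotted conventions is then a matter of which
parser list the caller passes in.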


PS: Jim, I cc'ed you so that you might be able to comment on some of the 
points I made. FYI, Jason and I think about implementing a DateTime module 
for Python in general, which is small and sweet. We are shooting for our 
calendar system only.

Regards,
Stephan

--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management
--=====================_112555085==_.ALT
Content-Type: text/html; charset="us-ascii"

<html>
- A professor just mentioned that we should also handle daylight saving.
This is not even that trivial, but I agree with him; there needs to be
support for that, even though most apps handle that via the time zone,
which is ok for the numeric version, but not if you say &quot;CST&quot;
for example. <br><br>
<br>
PS: Jim, I cc'ed you so that you might be able to comment in some of the
points I made. FYI, Jason and I think about implementing a DateTime
module for Python in general, which is small and sweet. We are shooting
for our calendar system only.<br><br>
Regards,<br>
Stephan<br>
<x-sigsep><p></x-sigsep>
--<br>
Stephan Richter<br>
CBU - Physics and Chemistry Student<br>
Web2k - Web Design/Development &amp; Technical Project Management</html>

--=====================_112555085==_.ALT--



From fredrik@pythonware.com  Sat Feb  9 11:21:00 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 9 Feb 2002 12:21:00 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <029a01c1b0cb$195cb400$ced241d5@hagrid> <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net> <3C64215F.AF31BB96@lemburg.com>
Message-ID: <00ac01c1b15b$e26b7d00$ced241d5@hagrid>

mal wrote:

> In order to make mxDateTime subtypes of this new type we'd need to
> make sure that the datetime type uses a true struct subset of what
> I have in DateTime objects now:
> 
> typedef struct {
>     PyObject_HEAD
> 
>     /* Representation used to do calculations */
>     long absdate;               /* number of days since 31.12. in the year 1 BC
>                                    calculated in the Gregorian calendar. */
>     double abstime;             /* seconds since 0:00:00.00 (midnight)
>                                    on the day pointed to by absdate */
> 
>  ...lots of broken down values needed to assure roundtrip safety...
> 
> }

as Tim has pointed out, what I have in mind is:

    typedef struct {
        PyObject_HEAD
        /* nothing here: subtypes should implement timetuple
           and, if possible, utctimetuple */
    } basetimeObject;

    /* maybe: */

    PyObject*
    basetime_timetuple(PyObject* self, PyObject* args)
    {
        PyErr_SetString(PyExc_NotImplementedError, "must override");
        return NULL;
    }

(to adapt mxDateTime, all you should have to do is to inherit from
baseObject, and add an alias for your "tuple" method)

:::

since it's really easy to do, we should probably also add a simpletime type
to the standard library, which wraps the standard time_t:

    typedef struct {
        PyObject_HEAD
        time_t time;
        /* maybe: int timezone; */
    } simpletimeObject;

:::

What I'm looking for is "decoupling", and making it easier for people
to experiment with different implementations.

Things like xmlrpclib, the logging system, database adapters, etc
can look for basetime instances, and use the standard protocol to
extract time information from any time object implementation.
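
A minimal Python sketch of that decoupling follows. The names basetime and
timetuple/utctimetuple come from the proposal above; the simpletime
constructor argument and the year_of consumer are made up for illustration,
and nothing here is an agreed design.

```python
import time


class basetime:
    """Abstract base: subtypes implement timetuple() and, if possible, utctimetuple()."""

    def timetuple(self):
        raise NotImplementedError("must override")

    def utctimetuple(self):
        raise NotImplementedError("must override")


class simpletime(basetime):
    """Hypothetical subtype wrapping a time_t-style count of seconds since the epoch."""

    def __init__(self, seconds):
        self.seconds = seconds

    def timetuple(self):
        return tuple(time.localtime(self.seconds))

    def utctimetuple(self):
        return tuple(time.gmtime(self.seconds))


def year_of(t):
    """A consumer (think xmlrpclib or a logger) that relies only on the protocol."""
    if not isinstance(t, basetime):
        raise TypeError("not a time object")
    return t.utctimetuple()[0]


print(year_of(simpletime(0)))  # 1970
```

The consumer never needs to know which implementation it was handed; any
other subtype that overrides the two methods would work the same way.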

(I can imagine similar "abstract" basetypes for money/decimal data
-- a basetype plus standardized behaviour for __int__, __float__,
__str__ -- and possibly some other data types: baseimage,
basesound, basedomnode, ...)

Hopefully, such base types can be converted to "interfaces" whenever
we get that.  But I don't want to wait for a datetime working
group to solve everything that MAL has already solved in mxDateTime,
and then everything he hasn't addressed.  Nor do I want to
wait for an interface working group to sort that thing out.

Let's do something really simple instead.

</F>



From mal@lemburg.com  Sat Feb  9 11:31:38 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 09 Feb 2002 12:31:38 +0100
Subject: [Python-Dev] PYC Magic
References: <LNBBLJKPBEHFEDALKOLCAEMINKAA.tim.one@comcast.net>
Message-ID: <3C65089A.F2607E59@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > FYI, I've bumped the PYC magic in a non-standard way (the old
> > standard broke on 2002-01-01); please review:
> 
> Fine by me, except you should also check in a NEWS blurb about it.  The
> current NEWS file says:
> 
> """
> - Because Python's magic number scheme broke on January 1st, we decided
>   to stop Python development.  Thanks for all the fish!
> """
>
> That's why PythonLabs hasn't done much of anything on Python since 2.2 was
> released <wink>.

Done.
 
> >        algorithm relying on the above scheme. Perhaps we should simply
> >        start counting in increments of 10 from now on ?!
> 
> Why 10?  I'd rather see it incremented by 1.  If you respond that you want
> to make room for more hacks akin to -U, my response would be that's exactly
> what I want to prevent by blessing 1 <0.4 wink>.

The reason is that I don't want to break the -U scheme. I know
it's a hack, but until someone comes up with a better way to
add flags to store PYC compile options, we'll have to stick with
it (-U changes the semantics of the language in a pretty nasty
way ... nothing works anymore ;-).
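
To make the collision concrete, here is a small sketch. It assumes that -U is
recorded by using the base magic plus one; that encoding is an assumption for
illustration and is not spelled out in this thread. The base value is made
up.

```python
def magics(base, step, count):
    """Return (plain, with_U) magic pairs for `count` successive format bumps.

    Assumption: -U is flagged by using magic + 1 (hypothetical encoding).
    """
    return [(base + i * step, base + i * step + 1) for i in range(count)]


def has_collision(pairs):
    """True if any -U magic equals the plain magic of another format version."""
    plain = {p for p, _ in pairs}
    return any(u in plain for _, u in pairs)


BASE = 60000  # illustrative value only, not a real magic number

print(has_collision(magics(BASE, 1, 3)))   # True: the -U value clashes with the next bump
print(has_collision(magics(BASE, 10, 3)))  # False: room is left for the flag
```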

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Sat Feb  9 11:40:27 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 09 Feb 2002 12:40:27 +0100
Subject: [Python-Dev] proposal: add basic time type to the standardlibrary
References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu>
Message-ID: <3C650AAB.878CC336@lemburg.com>

[You should never post HTML email to mailing lists...]

Stephan Richter wrote:
> 
> Hello everyone,
> 
> what a coincidence. I just was discussing this issue with Jason O.
> today. Here is my original mail:
> 
> Hey Jason,
> 
> I also want to start to think about a DateTime module. PostGres has a
> nice discussion of their implementation:
> http://www.ca.postgresql.org/users-lounge/docs/7.1/user/datatype-datetime.html
> 
> Here is Java's stuff on it:
> http://www3.cbu.edu/sciences/jdk1.3/docs/api/java/sql/Date.html
> 
> Low Level Data Types:
> 
> Date
> Time
> DateTime
> TimeStamp - Timestamps are always in UTC.

See below... you don't need that many types.
 
> * Intervals can be added or subtracted from themselves and the types
> above.
> DateInterval
> TimeInterval
> DateTimeInterval
> TimeStampInterval

Intervals are a bad idea. 

You really only need two types: one referencing fixed points in 
time and another one for storing the delta between two such 
fixed points. Everything else can be modeled on top of those
two.
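
A short sketch of the two-type model. It uses the standard library's datetime
and timedelta as stand-ins for the point and delta types; mxDateTime's own
API is not shown, and the sample values are arbitrary.

```python
from datetime import datetime, timedelta

start = datetime(2002, 2, 9, 12, 21, 0)   # a fixed point in time
length = timedelta(hours=1, minutes=30)   # a delta between two points

# An "interval" needs no type of its own: it is a point plus a delta.
interval = (start, length)
end = interval[0] + interval[1]

# Point - point gives a delta; delta + delta gives a delta.
assert end - start == length
print(end.isoformat())                    # 2002-02-09T13:51:00
print((length + length).total_seconds())  # 10800.0
```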

Please have a look at mxDateTime. It has these two types and
much of what you described in your notes.

BTW, you wouldn't believe how complicated dealing with date
and time really is... ah, yes, and don't even think of ever
getting DST to work properly :-/ 

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Sat Feb  9 11:57:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 09 Feb 2002 12:57:05 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <029a01c1b0cb$195cb400$ced241d5@hagrid> <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net> <3C64215F.AF31BB96@lemburg.com> <00ac01c1b15b$e26b7d00$ced241d5@hagrid>
Message-ID: <3C650E91.529D9D29@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> 
> > In order to make mxDateTime subtypes of this new type we'd need to
> > make sure that the datetime type uses a true struct subset of what
> > I have in DateTime objects now:
> >
> > typedef struct {
> >     PyObject_HEAD
> >
> >     /* Representation used to do calculations */
> >     long absdate;               /* number of days since 31.12. in the year 1 BC
> >                                    calculated in the Gregorian calendar. */
> >     double abstime;             /* seconds since 0:00:00.00 (midnight)
> >                                    on the day pointed to by absdate */
> >
> >  ...lots of broken down values needed to assure roundtrip safety...
> >
> > }
> 
> as Tim has pointed out, what I have in mind is:
> 
>     typedef struct {
>         PyObject_HEAD
>         /* nothing here: subtypes should implement timetuple
>            and, if possible, utctimetuple */
>     } basetimeObject;
> 
>     /* maybe: */
> 
>     PyObject*
>     basetime_timetuple(PyObject* self, PyObject* args)
>     {
>         PyErr_SetString(PyExc_NotImplementedError, "must override");
>         return NULL;
>     }
> 
> (to adapt mxDateTime, all you should have to do is to inherit from
> baseObject, and add an alias for your "tuple" method)

Ok.

Sounds like you are inventing something like a set of abstract 
types here.

I'm very much +1 on that idea, provided the interfaces we define
for these types are simple enough (I think the DB SIG has shown
that simple interfaces can go a looong way).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From srichter@cbu.edu  Sat Feb  9 12:20:13 2002
From: srichter@cbu.edu (Stephan Richter)
Date: Sat, 09 Feb 2002 06:20:13 -0600
Subject: [Python-Dev] proposal: add basic time type to the
 standardlibrary
In-Reply-To: <3C650AAB.878CC336@lemburg.com>
References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu>
Message-ID: <5.1.0.14.2.20020209061810.02ce9dd0@mercury-1.cbu.edu>

At 12:40 PM 2/9/2002 +0100, M.-A. Lemburg wrote:
>[You should never post HTML email to mailing lists...]

I know. I noticed it only after I had seen the archive entry. Could you guys still read it? If not, I will resend it.

Sorry!

Regards,
Stephan

--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management



From guido@python.org  Sat Feb  9 13:55:07 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 09 Feb 2002 08:55:07 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Sat, 09 Feb 2002 00:02:28 EST."
 <LNBBLJKPBEHFEDALKOLCCEMHNKAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCCEMHNKAA.tim.one@comcast.net>
Message-ID: <200202091355.g19Dt7H07636@pcp742651pcs.reston01.va.comcast.net>

> WRT RAM usage, a Python int is no smaller than a TimeStamp object.

Wrong, unless TimeStamps also use a custom allocator.  The custom
allocator uses 12 bytes per int (on a 32-bit machine) and incurs
malloc overhead + 8 bytes of additional overhead for every 82 ints.
That's about 12.2 bytes per int object; using malloc it would probably
be 24 bytes.  (PyMalloc would probably do a little better, except it
would still round up to 16 bytes.)

If TimeStamp objects were to use a similar allocation scheme, they
could be pushed down to 16.2 bytes.
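
The arithmetic behind those figures, as a sketch. The 82 objects per block
and the 8 extra bytes are from the text above; the 8 bytes of malloc
bookkeeping per block is an assumed, platform-dependent value, and the
16-byte case assumes the same number of objects per block.

```python
OBJECTS_PER_BLOCK = 82   # objects carved out of one malloc'ed block (from the text)
BLOCK_EXTRA = 8          # additional per-block overhead in bytes (from the text)
MALLOC_OVERHEAD = 8      # assumed malloc bookkeeping per block; platform dependent


def bytes_per_object(object_size):
    """Amortized bytes per object when objects are allocated a block at a time."""
    per_block = MALLOC_OVERHEAD + BLOCK_EXTRA
    return object_size + per_block / OBJECTS_PER_BLOCK


print(round(bytes_per_object(12), 1))  # 12.2 for a 12-byte int object
print(round(bytes_per_object(16), 1))  # 16.2 for a 16-byte TimeStamp-sized object
```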

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sat Feb  9 14:23:01 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 09 Feb 2002 09:23:01 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Fri, 08 Feb 2002 18:04:49 EST."
 <LNBBLJKPBEHFEDALKOLCOELENKAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOELENKAA.tim.one@comcast.net>
Message-ID: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net>

> I'm not looking for point-by-point answers here, I'm just pointing out
> things that were hard to follow so that they may get addressed in a
> revision.

Do you think it's PEP time yet?

> >   When you use its getitem method, the PyObject * in the cell is
> >   dereferenced, and if a NULL is found, getitem raises KeyError
> >   even if the cell exists.
> 
> Had a hard time with this:
> 
[...]
> 
> 2. Presumably the first "the cell" in this sentence refers to a
>    different cell than the second "the cell" intends.

No, they are the same.  See __getitem__ pseudo code.

> delitem is missing, but presumably straightforward.

I left it out intentionally because it adds nothing new.  Maybe that
was wrong -- it's important that deleting a global stores NULL in its
cell.objptr but does not delete the cell from the celldict.
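
A toy Python model of that behaviour, for illustration only. It is
reconstructed from the description in this thread, not copied from the
original pseudo code; the NULL sentinel and the class names are made up.

```python
NULL = object()  # stands in for a C NULL pointer


class Cell:
    """One slot of a celldict: objptr holds the value, or NULL when unbound."""

    def __init__(self, objptr=NULL, cellptr=None):
        self.objptr = objptr
        self.cellptr = cellptr  # optional link to a builtin cell; unused here


class CellDict:
    """Toy mapping whose cells survive deletion of the name."""

    def __init__(self):
        self.cells = {}

    def __getitem__(self, key):
        cell = self.cells[key]       # KeyError if there is no cell at all
        if cell.objptr is NULL:
            raise KeyError(key)      # the cell exists but holds NULL
        return cell.objptr

    def __setitem__(self, key, value):
        self.cells.setdefault(key, Cell()).objptr = value

    def __delitem__(self, key):
        cell = self.cells[key]
        if cell.objptr is NULL:
            raise KeyError(key)
        cell.objptr = NULL           # store NULL; keep the cell itself


d = CellDict()
d["x"] = 1
cached = d.cells["x"]        # what a function would keep in func_cells
del d["x"]
assert "x" in d.cells        # the cell is still there
d["x"] = 2
assert cached.objptr == 2    # the cached cell sees the rebinding
```

Keeping the cell alive is what lets a function hold on to a direct pointer to
it across a delete and a later rebind.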

> >   When a function object is created from a regular dict instead of a
> >   celldict, func_cells is a NULL pointer.
> 
> This part is regrettable, since it's Yet Another NULL check at the
> *top* of code using this stuff (meaning it slows the normal case,
> assuming that it's unusual not to get a celldict).  I'm not clear on
> how code ends up getting created from a regular dict instead of a
> celldict -- is this because of stuff like "exec whatever in mydict"?

Yes, I don't want to break such code because that's been the
politically correct way for ages.  We do have to deprecate it to
encourage people to use celldicts here.

To avoid the NULL check at the top, we could stuff func_cells with
empty cells and do the special-case check at the end (just before we
would raise NameError).  Then there still needs to be a check for
STORE and DELETE, because we don't want to store into the dummy
cells.  Sounds like a hack to assess separately later.

(Another hack probably not worth it right now is to make the module's
cell.cellptr point to itself if it's not shadowing a builtin cell --
then the first NULL check for cell.cellptr can be avoided in the case
of finding a builtin name successfully.)

> > - There are fallbacks in the VM for the case where the function's
> >   globals aren't a celldict, and hence func_cells is NULL.  In that
> >   case, the code object's co_globals is indexed with <i> to find the
> >   name of the corresponding global and this name is used to index the
> >   function's globals dict.
> 
> Which may not succeed, so we also need another level to back off to
> the builtins.  I'd like to pursue getting rid of the
> func_cells==NULL special case, even if it means constructing a
> celldict out of a regular dict for the duration, and feeding
> mutations back in to the regular dict afterwards.

The problem is that *during* the execution accessing the dict doesn't
give the right results.  I don't care about this case being fast
(after all it's exec and if people want it faster they can switch to
using a celldict).  I do care about not changing corners of the
semantics.

> Note that a chain of 4 test+branches against NULL in "the usual case" for
> builtins may not be faster on average than inlining the first few useful
> lines of lookdict_string twice (the expected path in this routine became
> fat-free for 2.2):
> 
> 	i = hash;
> 	ep = &ep0[i];
> 	if (ep->me_key == NULL || ep->me_key == key)
> 		return ep;
> 
> Win or lose, that's usually the end of a dict lookup.  That is, I'm certain
> we're paying significantly more for layers of C-level function call overhead
> today than for what the dict implementation actually does now (in the usual
> cases).

This should be tried!!!

> > Compare this to Jeremy's scheme using dlicts:
> >
> > http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobals
> >
> > - My approach doesn't require global agreement on the numbering of the
> >   globals; each code object has its own numbering.  This avoids the
> >   need for more global analysis,
> 
> Don't really care about that.

I do.  The C code in compiler.c is already at a level of complexity
that nobody understands it in its entirety!  (I don't understand what
Jeremy added, and Jeremy has to ask me about the original code. :-( )

Switching to the compiler.py package is unrealistic for 2.3; there's a
bootstrap problem, plus it's much slower.  I know that we cache the
bytecode, but there are enough situations where we can't and the
slowdown would kill us (imagine starting Zope for the first time from
a fresh CVS checkout).

> >   and allows adding code to a module using exec that introduces new
> >   globals without having to fall back on a less efficient scheme.
> 
> That is indeed lovely.

Forgot a <wink> there?  It seems a pretty minor advantage to me.

I would like to be able to compare the two schemes more before
committing to any implementation.  Unfortunately there's no
description of Jeremy's scheme that we can compare easily (though I'm
glad to see he put up his slides on the web:
http://www.python.org/~jeremy/talks/spam10/PEP-267-1.html).

I guess there's so much handwaving in Jeremy's proposal about how to
deal with exceptional cases that I'm uncomfortable with it.  But that
could be fixed.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Jeff Graysmith<ec00067@cci-29palms.com>  Sat Feb  9 18:17:08 2002
From: Jeff Graysmith<ec00067@cci-29palms.com> (Jeff Graysmith)
Date: 09 Feb 2002 10:17:08 -0800
Subject: [Python-Dev] You know your email is vulnerable to SPAM Robots?
Message-ID: <E16Zc4L-0001mw-00@mail.python.org>

------=_QO8SlNmY_Hqe7FiRQ_MA
Content-Type: text/plain
Content-Transfer-Encoding: 8bit

Hello,

Please pardon the intrusion, but I saw that the email address python-
dev@python.org is in plain text on your site at 
http://python.sourceforge.net/peps/pep-0226.html making it vulnerable to 
be harvested by SPAM robots. Check this out there's a way to hide your 
email from robots, but still have it visible to human users.
http://www.email-cloak.net

Sincerely,
Jeff Graysmith


------=_QO8SlNmY_Hqe7FiRQ_MA--



From aahz@rahul.net  Sat Feb  9 20:16:42 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Sat, 9 Feb 2002 12:16:42 -0800 (PST)
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <029a01c1b0cb$195cb400$ced241d5@hagrid> from "Fredrik Lundh" at Feb 08, 2002 07:04:45 PM
Message-ID: <20020209201642.0FECEE8C4@waltz.rahul.net>

Fredrik Lundh wrote:
> 
> Or to put it another way, I want the following to work for any time object,
> including mxDateTime objects, any date/timestamp returned by a DB-API
> driver, and weird date/time-like types I've developed myself:
> 
>     if isinstance(t, basetime):
>         # yay! it's a timestamp
>         print t.timetuple()

Looks good!  I'd prefer None to -1, though, for the last three items of
the tuple.  Also, the raise on utctime() should be NotImplementedError,
maybe?
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From tim.one@comcast.net  Sat Feb  9 20:48:21 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 09 Feb 2002 15:48:21 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202091355.g19Dt7H07636@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENKNKAA.tim.one@comcast.net>

[Tim]
> WRT RAM usage, a Python int is no smaller than a TimeStamp object.

[Guido]
> Wrong, unless TimeStamps also use a custom allocator.

Good point, and it doesn't (it uses PyObject_NEW).

I don't think counting fractions of bytes is of great interest here, though,
since I (still) believe it's the massive Zope DateTime type that's the focus
of complaints.

> The custom allocator uses 12 bytes per int (on a 32-bit machine) and
> incurs malloc overhead + 8 bytes of additional overhead for every 82 ints.
> That's about 12.2 bytes per int object; using malloc it would probably
> be 24 bytes.  (PyMalloc would probably do a little better, except it
> would still round up to 16 bytes.)

pymalloc overhead is a few percent; would work out to 16+f bytes per int
object, for some f < 1.0.  A difference is that "total memory dedicated to
ints" never shrinks using the custom allocator, but can get reused for other
objects under pymalloc.



From mal@lemburg.com  Sat Feb  9 21:55:02 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 09 Feb 2002 22:55:02 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <20020209201642.0FECEE8C4@waltz.rahul.net>
Message-ID: <3C659AB6.73DB91A2@lemburg.com>

Aahz Maruch wrote:
> 
> Fredrik Lundh wrote:
> >
> > Or to put it another way, I want the following to work for any time object,
> > including mxDateTime objects, any date/timestamp returned by a DB-API
> > driver, and weird date/time-like types I've developed myself:
> >
> >     if isinstance(t, basetime):
> >         # yay! it's a timestamp
> >         print t.timetuple()
> 
> Looks good!  I'd prefer None to -1, though, for the last three items of
> the tuple. 

None would be better from an interface design POV, but for historic
reasons (compatibility with localtime()) -1 is better.

> Also, the raise on utctime() should be NotImplementedError,
> maybe?

In the DB API we let the implementors decide: if the functionality
cannot be provided per design, then it should not be implemented;
if it can be implemented, but only works under certain conditions,
a DB API NotSupportedError is raised instead.

For mxDateTime I would implement both methods since mxDateTime
does not store a timezone with the value but instead defines
methods (and other operations) based on assumptions about the
value. Time zones are on the plate, though, and the parser
already knows about them.

The C lib only provides APIs for local time and UTC;
if you've ever tried to convert a non-local time value into UTC,
you'll know that this is not easy at all (mostly because of the
troubles caused by DST and sometimes also due to leap seconds
getting in the way).

About the proposed interface:

I'd rename the type to datetimebase and the methods to 
.tuple() and .gmtuple(). 

y,m,d = datetime.tuple()[:3]
h,m,s = datetime.gmtuple()[3:6]

IMHO, it looks better :-)

One thing I'm missing is a definition for a constructor (type 
objects are callable, so it'll have to do something, I guess...)
and there should also be a datetimedeltabase type (this one is 
needed for dealing with the difference between two datetime 
values).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From tim.one@comcast.net  Sat Feb  9 22:48:29 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 09 Feb 2002 17:48:29 -0500
Subject: [Python-Dev] PYC Magic
In-Reply-To: <3C65089A.F2607E59@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEOANKAA.tim.one@comcast.net>

[M.-A. Lemburg]
> The reason is that I don't want to break the -U scheme.

But -U doesn't work anyway:

> (-U changes the semantics of the language in a pretty nasty
> way ... nothing works anymore ;-).

The only things -U has bought us are a bizarre definition of "magic
number", code complication, and complaints from people who see -U in the
"python -h" blurb and want to know why everything breaks when they try it.
It may be a hack you want to use for internal testing, but stuff that's been
broken since the day it was introduced, and makes no progress towards
working, doesn't belong in the general release.

> I know it's a hack, but until someone comes up with a better way to
> add flags to store PYC compile options, we'll have to stick with
> it.

But there is no need to store info about PYC compile options:  -U is its
only use now, and -U has never worked.  Since it's worse than useless,
better to throw it out, then dream up a rational way to store PYC compile
options if and when (and only if and when) there's an actual need for such.

What would we lose if we tossed the -U support code?  I can see what we'd
gain.



From tim.one@comcast.net  Sat Feb  9 23:57:26 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 09 Feb 2002 18:57:26 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEOBNKAA.tim.one@comcast.net>

[Guido]
> Do you think it's PEP time yet?

If the ideas aren't written down in an editable form while they're fresh on
at least one person's mind, I'm afraid they'll just get lost.  If it's a
PEP, at least there will be a nagging reminder that someone once had a
promising idea <wink>.

>> 2. Presumably the first "the cell" in this sentence refers to a
>>    different cell than the second "the cell" intends.

> No, they are the same.  See __getitem__ pseudo code.

I persist in my delusion.  Original text:

    When you use its getitem method, the PyObject * in the cell is
    dereferenced, and if a NULL is found, getitem raises KeyError
    even if the cell exists.

Since we're doing something with "the PyObject* in the cell", surely "the
cell" *must* exist.  So what is the "even if the cell exists" trying to say?
I believe it means to say

    even if the cell's cellptr is not NULL

and "the cell's cellptr is not NULL" is quite different from "the cell
exists".

> ...
> To avoid the NULL check at the top, we could stuff func_cells with
> empty cells and do the special-case check at the end (just before we
> would raise NameError).

That would be better -- getting cycles out of the most-frequent paths is my
only goal here.

> Then there still needs to be a check for STORE and DELETE, because we
> don't want to store into the dummy cells.  Sounds like a hack to assess
> separately later.

Another idea:  a celldict could contain a "real dict" pointer, normally
NULL, and pointing to a plain dict when a real dict is given.  The celldict
constructor would populate the cells from the realdict's contents when not
NULL.  Then getitem wouldn't have to do anything special (realdict==NULL and
realdict!=NULL would be the same to it).  setitem and delitem would
propagate mutations immediately into the realdict too when non-NULL.  Since
mutations are almost certainly much rarer than accesses, this makes the
rarer operations pay.  The eval loop would always see a celldict.
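
A rough sketch of that write-through idea in Python. All names here
(WriteThroughCellDict, realdict, the NULL sentinel) are made up for
illustration, and a real version would live in C.

```python
NULL = object()  # stands in for a C NULL pointer


class Cell:
    def __init__(self, objptr=NULL):
        self.objptr = objptr


class WriteThroughCellDict:
    """Toy celldict that mirrors mutations into an optional plain dict."""

    def __init__(self, realdict=None):
        self.realdict = realdict
        self.cells = {}
        if realdict is not None:
            # Populate the cells from the plain dict's contents.
            for key, value in realdict.items():
                self.cells[key] = Cell(value)

    def __getitem__(self, key):
        # Same code path whether or not a realdict is attached.
        cell = self.cells[key]
        if cell.objptr is NULL:
            raise KeyError(key)
        return cell.objptr

    def __setitem__(self, key, value):
        self.cells.setdefault(key, Cell()).objptr = value
        if self.realdict is not None:
            self.realdict[key] = value   # the rarer operation pays

    def __delitem__(self, key):
        cell = self.cells[key]
        if cell.objptr is NULL:
            raise KeyError(key)
        cell.objptr = NULL
        if self.realdict is not None:
            del self.realdict[key]


mydict = {"a": 1}
ns = WriteThroughCellDict(mydict)
ns["b"] = 2
del ns["a"]
print(mydict)  # {'b': 2}
```

Writes made directly to mydict behind the celldict's back are not picked up
in this sketch; only mutations that go through the celldict are mirrored.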

> (Another hack probably not worth it right now is to make the module's
> cell.cellptr point to itself if it's not shadowing a builtin cell --
> then the first NULL check for cell.cellptr can be avoided in the case
> of finding a builtin name successfully.)

I don't think I followed this.  If, e.g., a module's "len" cell is normally

    {NULL, pointer to __builtin__'s "len" cell}

under the original scheme, how would that change?

    {NULL, pointer to this very cell}

wouldn't make sense.

    {builtin len, pointer to this very cell}

would make sense, but then the pointer to self is useless -- except as a
hint that we copied the value up from the builtins?  But then a change to
__builtin__.len wouldn't be visible to the module.
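
The difference between the two layouts can be seen in a toy model. The Cell
class, the NULL sentinel and the load helper below are hypothetical stand-ins
for the C structures under discussion.

```python
NULL = object()  # stands in for a C NULL pointer


class Cell:
    def __init__(self, objptr=NULL, cellptr=None):
        self.objptr = objptr
        self.cellptr = cellptr


def load(cell, name):
    """Resolve a global through its cell, falling back to the builtin cell."""
    if cell.objptr is not NULL:
        return cell.objptr
    if cell.cellptr is not None and cell.cellptr.objptr is not NULL:
        return cell.cellptr.objptr
    raise NameError(name)


builtin_len = Cell(objptr=len)            # "len" cell in __builtin__'s celldict
module_len = Cell(cellptr=builtin_len)    # {NULL, pointer to builtin cell}
copied_len = Cell(objptr=len)             # value copied up into the module cell
copied_len.cellptr = copied_len           # {builtin len, pointer to this very cell}

builtin_len.objptr = max                  # rebind __builtin__.len

print(load(module_len, "len") is max)     # True: the change is visible
print(load(copied_len, "len") is max)     # False: the copy is stale
```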

> ...
> The problem is that *during* the execution accessing the dict doesn't
> give the right results.  I don't care about this case being fast
> (after all it's exec and if people want it faster they can switch to
> using a celldict).  I do care about not changing corners of the
> semantics.

I expect that a write-through realdict (see above) attached to a celldict in
such cases would address this, keeping the referencing code uniform and
fast, and moving the special-casing into the celldict implementation for
mutating operations.

>> 	i = hash;
>> 	ep = &ep0[i];
>> 	if (ep->me_key == NULL || ep->me_key == key)
>> 		return ep;
>>
>> Win or lose, that's usually the end of a dict lookup.  That is,
>> I'm certain we're paying significantly more for layers of C-level
>> function call overhead today than for what the dict implementation
>> actually does now (in the usual cases).

> This should be tried!!!

It's less promising after more thought.  The chief snag is that "usually the
end" relies on that we're usually looking for things that are there.  But
when looking for a builtin, it's usually not in the module's dict, where we
look first.  In that case, about half the time we'll find an occupied
irrelevant slot in the module's dict, and then we need the rest of
lookdict_string to do a (usually brief, but there's no getting away from the
loop because we can't know how brief in advance) futile chase down the
collision chain.
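
A toy model of the hit, the cheap miss and the expensive miss. It uses linear
probing and small integer keys so the slots are predictable; that is a
simplification, not the probe sequence lookdict_string really uses, and the
helper names are made up.

```python
def lookup(table, key):
    """Return (slot_index, probes) for key in an open-addressed table.

    Toy model with linear probing. Empty slots are None, and the table must
    keep at least one empty slot or a miss would loop forever.
    """
    mask = len(table) - 1
    i = hash(key) & mask
    probes = 1
    # Fast path: the first slot is empty (miss) or holds the key (hit).
    while table[i] is not None and table[i] != key:
        i = (i + 1) & mask   # chase the collision chain
        probes += 1
    return i, probes


def insert(table, key):
    i, _ = lookup(table, key)
    table[i] = key


table = [None] * 8
for key in (1, 2, 3):
    insert(table, key)

print(lookup(table, 2))    # (2, 1): hit on the first probe
print(lookup(table, 12))   # (4, 1): miss that lands on an empty slot
print(lookup(table, 9))    # (4, 4): miss that lands on an occupied slot
```

The last case is the one that hurts for builtins: the name is absent from the
module's dict, the first slot is occupied by something else, and the loop has
to run before the miss is known.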

>>> This avoids the need for more global analysis,

>> Don't really care about that.

> I do.  The C code in compiler.c is already at a level of complexity
> that nobody understands it in its entirety!  (I don't understand what
> Jeremy added, and Jeremy has to ask me about the original code. :-( )

I don't care because I care about something else <wink>:  it would add to
the pressure to refactor this code mercilessly, and that would be a Good
Thing over the long term.  The current complexity isn't inherent, it's an
artifact of outgrowing the original concrete-syntax-tree direct-to-bytecode
one-pass design.  Now we've got multiple passes crawling over a now-
inappropriate program representation, glued together more by "reliable
accidents" <wink> than sensible design.  That's all curable, and the
pressures *to* cure it will continue to multiply over time (e.g., it would
take a certain insanity to even think about folding pychecker-like checks
into the current architecture).

> Switching to the compiler.py package is unrealistic for 2.3; there's a
> bootstrap problem, plus it's much slower.  I know that we cache the
> bytecode, but there are enough situations where we can't and the
> slowdown would kill us (imagine starting Zope for the first time from
> a fresh CVS checkout).

I'm a fan of fast compilation.  Heck, I was upset in 1982 when Cray's
compiler dropped below 100,000 lines/minute for the first time <wink>.

>>>   and allows adding code to a module using exec that introduces new
>>>   globals without having to fall back on a less efficient scheme.

>> That is indeed lovely.

> Forgot a <wink> there?  It seems a pretty minor advantage to me.

No, it's lovely, not major.  It's simply a good sign when the worst semantic
nightmares "just work".  It's also a lovely sign.  Most flowers aren't
terribly important either, but they are lovely.

> I would like to be able to compare the two schemes more before
> committing to any implementation.  Unfortunately there's no
> description of Jeremy's scheme that we can compare easily (though I'm
> glad to see he put up his slides on the web:
> http://www.python.org/~jeremy/talks/spam10/PEP-267-1.html).
>
> I guess there's so much handwaving in Jeremy's proposal about how to
> deal with exceptional cases that I'm uncomfortable with it.  But that
> could be fixed.

I agree it needs more detail, but at the start I'm more interested in the
normal cases.  I'll reattach my no-holds-barred description of resolving
normal-case "len" in this scheme.  Perhaps Jeremy could do the same for his.
Jeremy is also aiming at speeding things like math.pi (global.attribute) as
a whole (not just speeding the "math" part of it).

Regurgitatia:

"""
If I'm reading this right, then in the normal case of resolving "len" in

def mylen(s):
    return len(s)

1. We test func_cells for NULL and find out it isn't.
2. A pointer to a cell object is read out of func_cells at a fixed (wrt
   this function) offset.  This points to len's cell object in the
   module's celldict.
3. The cell object's PyObject* pointer is tested and found to be NULL.
4. The cell object's cellptr pointer is tested and found not to be NULL.
   This points to len's cell object in __builtin__'s celldict.
5. The cell object's cellptr's PyObject* is tested and found not to be
   NULL.
6. The cell object's cellptr's PyObject* is returned.
"""

For a module global, the same description applies, but the outcome of #3 is
not-NULL and it ends there then.

For global.attr, step #3 yields the global, and then attr lookup is the same
as today.
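
A rough Python model of steps 2 through 6, for readers who prefer code.
It is a sketch only: the class name Cell, the attribute names objptr and
cellptr, and the function load_name are illustrative assumptions, None
stands in for a C NULL, and step 1 (the func_cells NULL test) is left out.

```python
class Cell:
    """Hypothetical two-field cell.

    objptr  -- the value bound to the name, or None for "NULL"
    cellptr -- the cell for the same name in the builtins' celldict,
               or None for "NULL"
    """
    def __init__(self, objptr=None, cellptr=None):
        self.objptr = objptr
        self.cellptr = cellptr

def load_name(func_cells, index):
    cell = func_cells[index]                  # step 2: fixed offset
    if cell.objptr is not None:               # step 3: module global set?
        return cell.objptr                    # (module-global case ends here)
    if cell.cellptr is not None:              # step 4: is there a builtin cell?
        if cell.cellptr.objptr is not None:   # step 5: builtin bound?
            return cell.cellptr.objptr        # step 6
    raise NameError("name is not defined")

# "len" as seen from a module that does not define its own len.
builtin_len_cell = Cell(objptr=len)
module_len_cell = Cell(objptr=None, cellptr=builtin_len_cell)
func_cells = [module_len_cell]

resolved = load_name(func_cells, 0)           # falls through to the builtin
```

Binding a module-level len later only means setting module_len_cell.objptr,
after which the lookup stops at step 3.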

Jeremy, can you do the same level of detail for your scheme?  Skip?



From andreas@andreas-jung.com  Sat Feb  9 00:50:30 2002
From: andreas@andreas-jung.com (Andreas Jung)
Date: Fri, 8 Feb 2002 19:50:30 -0500
Subject: [Python-Dev] Re: [Zope3-dev] Strange behaviour of python2.2 -U with Zope 3
References: <00ee01c1b103$72040990$02010a0a@suxlap>
Message-ID: <00f701c1b103$cbc5a330$02010a0a@suxlap>

Another followup:

the module import seems to be completely broken when using the -U option.

Andreas
----- Original Message ----- 
From: "Andreas Jung" <andreas@andreas-jung.com>
To: <zope3-dev@zope.org>
Sent: Friday, February 08, 2002 19:48
Subject: [Zope3-dev] Strange behaviour of python2.2 -U with Zope 3


> python2.2 utilities/unittestgui.py Zope.Testing.allZopeTests
> start the GUI for the Zope3 unittests. So far so good.
> 
> I tried to run all tests with unicode as default string type:
> 
>  python2.2 -U utilities/unittestgui.py Zope.Testing.allZopeTests
> 
> This fails with the following traceback:
> 
> yetix@/develop/DC/sandboxes/3x(57)% python2.2 -U utilities/unittestgui.py
> Zope.Testing.allZopeTests
> Traceback (most recent call last):
>   File "utilities/unittestgui.py", line 30, in ?
>     import linecache
> ImportError: No module linecache
> 
> Also "python2.2 -U -c "import linecache" " fails
> 
> Any ideas ?
> 
> Andreas
> 
> 
> 
> _______________________________________________
> Zope3-dev mailing list
> Zope3-dev@zope.org
> http://lists.zope.org/mailman/listinfo/zope3-dev
> 



From tim@zope.com  Sun Feb 10 01:02:02 2002
From: tim@zope.com (Tim Peters)
Date: Sat, 9 Feb 2002 20:02:02 -0500
Subject: [Python-Dev] RE: [Zope3-dev] Strange behaviour of python2.2 -U with Zope 3
In-Reply-To: <00f701c1b103$cbc5a330$02010a0a@suxlap>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEOENKAA.tim@zope.com>

[Andreas Jung]
> Another followup:
>
> the module import seems to be completely broken when using the -U option.

See my last email on zope3-dev:  leave -U alone.  It doesn't work and isn't
supported.  It shouldn't even exist (IMO).



From mal@lemburg.com  Sun Feb 10 13:29:36 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 10 Feb 2002 14:29:36 +0100
Subject: [Python-Dev] PYC Magic
References: <LNBBLJKPBEHFEDALKOLCGEOANKAA.tim.one@comcast.net>
Message-ID: <3C6675C0.BE7B42FE@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > The reason is that I don't want to break the -U scheme.
> 
> But -U doesn't work anyway:
> 
> > (-U changes the semantics of the language in a pretty nasty
> > way ... nothing works anymore ;-).
> 
> The only things -U have bought us are a bizarre definition of "magic
> number", code complication, and complaints from people who see -U in the
> "python -h" blurb and want to know why everything breaks when they try it.
> It may be a hack you want to use for internal testing, but stuff that's been
> broken since the day it was introduced, and makes no progress towards
> working, doesn't belong in the general release.

Wait... the -U option was added in order to be able to see how well
the 8-bit string / Unicode integration works. It's a known fact that
the Python standard lib is not Unicode compatible yet and that's
exactly what the -U option allows you to test (in a very simple
way).

In the long run, Python's std lib should move into the direction 
of being Unicode compatible, so I don't really see the need for
removing -U altogether. To reduce the noise about Python failing
to run with the option set, it may be a good idea to remove the
mentioning from the -h blurb, though.
 
> > I know it's a hack, but until someone comes up with a better way to
> > add flags to store PYC compile options, we'll have to stick with
> > it.
> 
> But there is no need to store info about PYC compile options:  -U is its
> only use now, and -U has never worked.  Since it's worse than useless,
> better to throw it out, then dream up a rational way to store PYC compile
> options if and when (and only if and when) there's an actual need for such.

The -U option is currently the only application of such a flag.
We will definitely have a need for these options in the future
to make the runtime aware of certain assumptions which have been
made in the compiled byte code, e.g. byte code using special
opcodes, byte code compiled for a different Python virtual
machine (once we get pluggable Python compiler / VM combos),
byte code which was compiled using special literal 
interpretations (such as in the -U case or when compiling
the source code with a different source code encoding 
assumption).

I would be more than happy to get rid of the current PYC magic hack 
for -U and have it replaced with a better and extensible alternative,
e.g. a combination of PYC version number and marshalled option 
dictionary.
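
As a minimal sketch of what "version number plus marshalled option
dictionary" could look like: the layout below (a 2-byte version, a 4-byte
length, then the marshalled dict) and the names pack_header and
unpack_header are assumptions made for illustration, not the actual .pyc
format or any agreed design.

```python
import marshal
import struct

# Hypothetical header layout: little-endian unsigned short (version),
# unsigned int (payload length), then the marshalled options dict.
_PREFIX = "<HI"

def pack_header(version, options):
    """Serialize a version number and an options dict into bytes."""
    payload = marshal.dumps(options)
    return struct.pack(_PREFIX, version, len(payload)) + payload

def unpack_header(data):
    """Return (version, options) from bytes produced by pack_header."""
    version, length = struct.unpack_from(_PREFIX, data, 0)
    start = struct.calcsize(_PREFIX)
    options = marshal.loads(data[start:start + length])
    return version, options

# The option key is made up; it stands for "compiled under -U".
header = pack_header(1, {"unicode_literals": True})
version, options = unpack_header(header)
```

New compile options then become new dictionary keys, and the version number
only has to change when the bytecode itself changes.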

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From srichter@cbu.edu  Sun Feb 10 16:07:50 2002
From: srichter@cbu.edu (Stephan Richter)
Date: Sun, 10 Feb 2002 10:07:50 -0600
Subject: [Python-Dev] proposal: add basic time type to the
 standardlibrary
In-Reply-To: <3C650AAB.878CC336@lemburg.com>
References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu>
Message-ID: <5.1.0.14.2.20020210091537.02026d38@mercury-1.cbu.edu>

> > * Intervals can be added or subtracted from themselves and the types
> > above.
> > DateInterval
> > TimeInterval
> > DateTimeInterval
> > TimeStampInterval
>
>Intervals are a bad idea.

Why? They are the same as your Deltas. Interval is the more common term I 
think, therefore I chose it. Maybe having a Time/Date/DateTime{Interval} is 
too much and they should be really one. So you would have DateTimeInterval 
and TimeStampInterval for the same reasons I describe below.

On the other hand Java does not seem to implement intervals at all, which I 
think is a bad idea, since RDBs support it.

 >>> import DateTime
 >>> DateTime.parseInterval('6 mins 3 secs') # DateTime.DateTimeInterval is the default
6 minutes 3 seconds
 >>> DateTime.parseInterval('50 secs 3 millis', type=DateTime.TimeStampInterval) # returns ticks
50.003

I still think that many types are a good thing; it leaves the developer 
with choice. However the module should be smart and hide some of the choice 
from you, if you are a beginner. For example I imagine this to work:

 >>> import DateTime
 >>> date = DateTime.parseDateTime('2.1.2001')
 >>> type(date).__name__
Date
 >>> time = DateTime.parseDateTime('12:00:00')
 >>> type(time).__name__
Time
 >>> datetime = DateTime.parseDateTime('2.1.2001 12:00:00')
 >>> type(datetime).__name__
DateTime

>You really only need two types: one referencing fixed points in
>time and another one for storing the delta between two such
>fixed points. Everything else can be modeled on top of those
>two.

Well yes, but this is a reason why I have such a hard time to get 
mxDateTime into Zope. Your module is well suited for certain tasks, but not 
everybody wants to use mxDateTime for Date/Time manipulation. So, saving 
components of a date is for some uses much better than saving ticks and 
vice versa. I also talked with Jim Fulton about it, and he agrees that 
there is a need for more than one Date/Time type. However it should be easy 
of course to convert between both, the Timestamp and the DateTime type.

Here are some more examples:

 >>> import DateTime
 >>> date = DateTime.parseDateTime('2.1.2001')
 >>> type(date).__name__
Date
 >>> stamp = DateTime.TimeStamp(date)
 >>> type(stamp).__name__
TimeStamp

BTW, something I do not want to support is:

 >>> import DateTime
 >>> date = DateTime.DateTime('2.1.2001')

Since putting parsing into the object itself is a big mess, as we noticed 
in the Zope 2.x DateTime implementation. I think there should be only two 
ways to initialize a DateTime object, one of which I showed above, which is 
responsible for converting TimeStamps to DateTimes (mmh, maybe that should 
be a module function as well). The other one is:

 >>> import DateTime
 >>> DateTime.DateTime(2001, 2, 3)
February 3, 2001
 >>> DateTime.DateTime('2001', '02', '03') # Of course it also supports strings here
February 3, 2001
 >>> DateTime.DateTime(2001, 2, 3, 12, 0)
February 3, 2001 12:00:00
 >>> DateTime.DateTime(2001, hour=12) # missing pieces will be replaced by 1 or 0
January 1, 2001 12:00:00
 >>> DateTime.DateTime(year=2001, month=2, day=3, hour=1,
                 minute=2, second=3, millisecond=4, timezone=-6) # max amount of arguments
February 3, 2001 01:02:03.004 -06:00

>Please have a look at mxDateTime. It has these two types and
>much of what you described in your notes.

I know mxDateTime very well and have even suggested before to make it the 
Zope DateTime module and even put it in the standard Python distribution. 
Here is the mail from the Zope-Coders list: 
http://lists.zope.org/pipermail/zope-coders/2001-October/000100.html. You 
can follow the thread to see some responses.
Also, the list of notes was made from my experience working with 
mxDateTime, Zope DateTime and PostGreSQL Dates/Times. I know it was not 
complete, but it had some of the hotspots in it.


>BTW, you wouldn't believe how complicated dealing with date
>and time really is... ah, yes, and don't even think of ever
>getting DST to work properly :-/

Oh, I have seen and fixed the Zope DateTime implementation plenty and I 
have thought of the problem for 2.5 years now. The problem is that the US 
starts to use the German "." notation (as mentioned in my original mail) 
and other issues, which make it much harder. That is the reason why I want 
to build an ultra-flexible parsing engine. So you can do things like:

 >>> import DateTime
 >>> DateTime.parseDateTime('03/02/01', format=DateTime.ISO)
February 1, 2003
 >>> DateTime.parseDateTime('03/02/01', format=DateTime.US)
March 2, 2001
 >>> DateTime.parseDateTime('03.02.01', format=DateTime.US)
March 2, 2001
 >>> DateTime.parseDateTime('03/02/01', format=DateTime.GERMAN) # just in case Europe/Germany goes insane as well.
February 3, 2001


But by default:

 >>> DateTime.parseDateTime('03/02/01')
March 2, 2001
 >>> DateTime.parseDateTime('03.02.01')
February 3, 2001

Regards,
Stephan

--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management



From guido@python.org  Sun Feb 10 16:20:30 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 10 Feb 2002 11:20:30 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Sat, 09 Feb 2002 18:57:26 EST."
 <LNBBLJKPBEHFEDALKOLCMEOBNKAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEOBNKAA.tim.one@comcast.net>
Message-ID: <200202101620.g1AGKUm17544@pcp742651pcs.reston01.va.comcast.net>

> I persist in my delusion.  Original text:
> 
>     When you use its getitem method, the PyObject * in the cell is
>     dereferenced, and if a NULL is found, getitem raises KeyError
>     even if the cell exists.
> 
> Since we're doing something with "the PyObject* in the cell", surely "the
> cell" *must* exist.  So what is the "even if the cell exists" trying to say?

It is trying to say "despite the cell's existence".  See the sample code.

> I believe it means to say
> 
>     even if the cell's cellptr is not NULL
> 
> and "the cell's cellptr is not NULL" is quite different from "the cell
> exists".

No, it doesn't try to say that.  But you're right that it's useful to
add that the cell's cellptr is irrelevant to getitem.

> Another idea: a celldict could contain a "real dict" pointer,
> normally NULL, and pointing to a plain dict when a real dict is
> given.  The celldict constructor would populate the cells from the
> realdict's contents when not NULL.  Then getitem wouldn't have to do
> anything special (realdict==NULL and realdict!=NULL would be the
> same to it).  setitem and delitem would propagate mutations
> immediately into the realdict too when non-NULL.  Since mutations
> are almost certainly much rarer than accesses, this makes the rarer
> operations pay.  The eval loop would always see a celldict.

This works for propagating changes from the celldict to the real dict,
but not the other way around.  Example:

  d = {'x': 10}
  def set_x(x):
      d['x'] = x
  exec "...some code that calls set_x()..." in d
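
The one-way propagation can be made concrete with a small model.  This is
a sketch under assumptions: CellDict and its attributes are hypothetical
names, and a plain dict stands in for the cells.

```python
class CellDict:
    """Hypothetical celldict holding an optional reference to a real dict."""

    def __init__(self, realdict=None):
        self.realdict = realdict
        # Populate the cells from the real dict's contents, once.
        self.cells = dict(realdict) if realdict is not None else {}

    def __getitem__(self, key):
        return self.cells[key]

    def __setitem__(self, key, value):
        self.cells[key] = value
        if self.realdict is not None:
            self.realdict[key] = value    # celldict -> real dict: fine

d = {'x': 10}
cd = CellDict(d)

cd['x'] = 11    # goes through the celldict, so d sees it
d['x'] = 12     # what set_x() does: mutates d behind the celldict's back
```

After the last line d['x'] is 12 while cd['x'] is still 11, which is the
stale read that code running "in d" would get.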

> > (Another hack probably not worth it right now is to make the module's
> > cell.cellptr point to itself if it's not shadowing a builtin cell --
> > then the first NULL check for cell.cellptr can be avoided in the case
> > of finding a builtin name successful.)
> 
> I don't think I followed this.  If, e.g., a module's "len" cell is normally
> 
>     {NULL, pointer to __builtin__'s "len" cell}
> 
> under the original scheme, how would that change?
> 
>     {NULL, pointer to this very cell}
> 
> wouldn't make sense.
> 
>     {builtin len, pointer to this very cell}
> 
> would make sense, but then the pointer to self is useless -- except as a
> hint that we copied the value up from the builtins?  But then a change to
> __builtin__.len wouldn't be visible to the module.

I meant that for "len" it would not change, i.e. it would be

    {NULL, pointer to __builtin__'s "len" cell}

but for a global "foo" it would change to

    {value of foo or NULL if foo is undefined, pointer to this very cell}

Then if foo is defined, the code would find the value of foo in the
first cell it tries, and if foo is undefined, it would find a NULL in
the cell and in the cell it points to.
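
A sketch of that self-linking variant in Python.  The names LinkedCell and
load_linked are illustrative assumptions, and None again stands in for NULL.

```python
class LinkedCell:
    """Hypothetical cell whose cellptr is never NULL.

    A cell that does not shadow a builtin points at itself, so the
    lookup code can follow cellptr without testing it first.
    """
    def __init__(self, objptr=None, cellptr=None):
        self.objptr = objptr
        self.cellptr = cellptr if cellptr is not None else self

def load_linked(cell):
    if cell.objptr is not None:       # defined global: first cell wins
        return cell.objptr
    value = cell.cellptr.objptr       # no NULL test on cellptr needed
    if value is None:                 # NULL here too: name is undefined
        raise NameError("name is not defined")
    return value

builtin_len = LinkedCell(objptr=len)              # builtin cell, self-linked
module_len = LinkedCell(cellptr=builtin_len)      # {NULL, builtin's cell}
module_foo = LinkedCell(objptr=42)                # {value, this very cell}
module_bar = LinkedCell()                         # {NULL, this very cell}
```

For an undefined name such as module_bar the second read simply lands on
the same cell again and finds the same NULL.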

> > I do.  The C code in compiler.c is already at a level of
> > complexity that nobody understands it in its entirety!  (I don't
> > understand what Jeremy added, and Jeremy has to ask me about the
> > original code. :-( )
> 
> I don't care because I care about something else <wink>: it would
> add to the pressure to refactor this code mercilessly, and that
> would be a Good Thing over the long term.  The current complexity
> isn't inherent, it's an artifact of outgrowing the original
> concrete-syntax-tree direct-to bytecode one-pass design.  Now we've
> got multiple passes crawling over a now- inappropriate program
> representation, glued together more by "reliable accidents" <wink>
> than sensible design.  That's all curable, and the pressures *to*
> cure it will continue to multiply over time (e.g., it would take a
> certain insanity to even think about folding pychecker-like checks
> into the current architecture).

Actually, the concrete syntax tree was never a very good
representation; it was convenient for the parser to generate that, and
it was "okay" (or "good enough") to generate code from and to do
anything else from.

I agree that it's a good idea to start thinking about changing the
parse tree representation to a proper abstract syntax tree.  Maybe the
normalization that the compiler.py package uses would be a good start?
Except that I've never quite grasped the visitor architecture there. :-(

> I agree it needs more detail, but at the start I'm more interested
> in the normal cases.  I'll reattach my no-holds-barred description
> of resolving normal-case "len" in this scheme.  Perhaps Jeremy could
> do the same for his.  Jeremy is also aiming at speeding things like
> math.pi (global.attribute) as a whole (not just speeding the "math"
> part of it).

One problem with that is that it's hard to know when <global> in
<global>.<attribute> is a module, and when it's something else.  I
guess global analysis could help -- if it's imported ("import math")
it's likely a module, if it's assigned from an expression ("L = []")
or a locally defined function or class, it's likely not a module.  But
"from X import Y" creates a mystery -- X could be a package containing
a module Y, or it could be a module containing a function or class Y.
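
The ambiguity is easy to reproduce with the standard library.  The
following is just an illustration using modules that ship with Python; the
helper name kinds is made up.

```python
import inspect

from os import path       # Y is a module (os.path)
from os import getcwd     # Y is a function defined in module os
from xml import dom       # X is a package, Y is a subpackage (xml.dom)

def kinds():
    """Report which of the imported names turned out to be modules."""
    return {
        "path": inspect.ismodule(path),
        "getcwd": inspect.ismodule(getcwd),
        "dom": inspect.ismodule(dom),
    }

result = kinds()
```

All three statements have the same shape in the source, so only the
runtime objects reveal which names are modules.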

> Regurgitatia:
> 
> """
> If I'm reading this right, then in the normal case of resolving "len" in
> 
> def mylen(s):
>     return len(s)
> 
> 1. We test func_cells for NULL and find out it isn't.

This step could be avoided using my trick of an array of dummy cells
or using your trick of a celldict containing an optional reference to
a real dict, so let's skip it.

> 2. A pointer to a cell object is read out of func_cells at a fixed (wrt
>    this function) offset.  This points to len's cell object in the
>    module's celldict.
> 3. The cell object's PyObject* pointer is tested and found to be NULL.
> 4. The cell object's cellptr pointer is tested and found not to be NULL.

This NULL test shouldn't be needed given my trick of linking cells
that do not shadow builtins to themselves.

>    This points to len's cell object in __builtin__'s celldict.
> 5. The cell object's cellptr's PyObject* is tested and found not to be
>    NULL.
> 6. The cell object's cellptr's PyObject* is returned.
> """
> 
> For a module global, the same description applies, but the outcome of #3 is
> not-NULL and it ends there then.
> 
> For global.attr, step #3 yields the global, and then attr lookup is the same
> as today.
> 
> Jeremy, can you do the same level of detail for your scheme?  Skip?

Jeremy is probably still recovering with his family from the
conference.  I know I got sick there and am now stuck with a horrible
cold (the umpteenth one this season).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Sun Feb 10 18:16:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 10 Feb 2002 19:16:05 +0100
Subject: [Python-Dev] proposal: add basic time type to thestandardlibrary
References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu> <5.1.0.14.2.20020210091537.02026d38@mercury-1.cbu.edu>
Message-ID: <3C66B8E5.DC2DD1CA@lemburg.com>

Stephan Richter wrote:
> 
> > > * Intervals can be added or subtracted from themselves and the types
> > > above.
> > > DateInterval
> > > TimeInterval
> > > DateTimeInterval
> > > TimeStampInterval
> >
> >Intervals are a bad idea.
> 
> Why? They are the same as your Deltas. Interval is the more common term I
> think, therefore I chose it. Maybe having a Time/Date/DateTime{Interval} is
> too much and they should be really one. So you would have DateTimeInterval
> and TimeStampInterval for the same reasons I describe below.

As I explained in my reply, most of these intervals are not needed 
as *base types*. You can easily model them on top of the two
types I have in mxDateTime. Some may not like this model because it
comes from a more mathematical point of view, but in reality it
works quite nicely and simplifies the API structure significantly.

A time interval is basically just an amount of seconds, nothing more.
There's no need to have 4 different types to wrap a single 
double ;-)
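
To illustrate the point, here is a minimal sketch of an interval stored as
a single float, with the broken-down components computed on demand.  The
class name Delta and its methods are hypothetical and are not mxDateTime's
API.

```python
class Delta:
    """Hypothetical interval type wrapping one float of seconds."""

    def __init__(self, seconds):
        self.seconds = float(seconds)

    def components(self):
        """Return (days, hours, minutes, seconds) for a non-negative delta."""
        minutes, secs = divmod(self.seconds, 60)
        hours, minutes = divmod(minutes, 60)
        days, hours = divmod(hours, 24)
        return int(days), int(hours), int(minutes), secs

    def __add__(self, other):
        # Arithmetic is trivial because there is only one number inside.
        return Delta(self.seconds + other.seconds)

six_three = Delta(6 * 60 + 3)           # "6 mins 3 secs"
total = six_three + Delta(86400)        # add one day
```

Date-only, time-only and combined intervals are then just different ways
of displaying the same value.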
 
> On the other hand Java does not seem to implement intervals at all, which I
> think is a bad idea, since RDBs support it.
>
>  >>> import DateTime
>  >>> DateTime.parseInterval('6 mins 3 secs') # DateTime.DateTimeInterval is
> the default
> 6 minutes 3 seconds
>  >>> DateTime.parseInterval('50 secs 3 millis',
> type=DateTime.TimeStampInterval) # returns ticks
> 50.003
> 
> I still think that many types are a good thing; it leaves the developer
> with choice. However the module should be smart and hide some of the choice
> from you, if you are a beginner. For example I imagine this to work:
> 
>  >>> import DateTime
>  >>> date = DateTime.parseDateTime('2.1.2001')
>  >>> type(date).__name__
> Date
>  >>> time = DateTime.parseDateTime('12:00:00')
>  >>> type(time).__name__
> Time
>  >>> datetime = DateTime.parseDateTime('2.1.2001 12:00:00')
>  >>> type(datetime).__name__
> DateTime

Just think of all the possible combinations you have in operations
like '+', '-' and comparisons. You don't want to go down this 
road...
 
> >You really only need two types: one referencing fixed points in
> >time and another one for storing the delta between two such
> >fixed points. Everything else can be modeled on top of those
> >two.
> 
> Well yes, but this is a reason why I have such a hard time to get
> mxDateTime into Zope. Your module is well suited for certain tasks, but not
> everybody wants to use mxDateTime for Date/Time manipulation. 

Uhm, where did you get the impression that I want all the world
to use mxDateTime :-? I wrote it for use in mxODBC since at the time
there was no DateTime type around which could handle dates prior
to 1970. As a result, mxDateTime was written to provide everything
you need for database interfacing. That's also the reason why there
is no time zone support in mxDateTime's base types: databases
don't have time zone support built into their date/time types
either (and for a good reason: time zones are better handled at
application level).

> So, saving
> components of a date is for some uses much better than saving ticks and
> vice versa. I also talked with Jim Fulton about it, and he agrees that
> there is a need for more than one Date/Time type. However it should be easy
> of course to convert between both, the Timestamp and the DateTime type.

That's why mxDateTime provides so many interfaces to other forms
of storing and reading date/time values, e.g. COMDate, ticks, 
doubles, tuples, strings, various scientific formats, in two 
different calendars etc.
 
> Here are some more examples:
> 
>  >>> import DateTime
>  >>> date = DateTime.parseDateTime('2.1.2001')
>  >>> type(date).__name__
> Date
>  >>> stamp = DateTime.TimeStamp(date)
>  >>> type(stamp).__name__
> TimeStamp
> 
> BTW, something I do not want to support is:
> 
>  >>> import DateTime
>  >>> date = DateTime.DateTime('2.1.2001')
> 
> Since putting parsing into the object itself is a big mess, as we noticed
> in the Zope 2.x DateTime implementation. I think there should be only two
> ways to initialize a DateTime object, one of which I showed above, which is
> responsible of converting TimeStamps to DateTimes (mmh, maybe that should
> be a module function as well). The other one is:
> 
>  >>> import DateTime
>  >>> DateTime.DateTime(2001, 2, 3)
> February 3, 2001
>  >>> DateTime.DateTime('2001', '02', '03') # Of course it also supports
> strings here
> February 3, 2001
>  >>> DateTime.DateTime(2001, 2, 3, 12, 0)
> February 3, 2001 12:00:00
>  >>> DateTime.DateTime(2001, hour=12) # missing pieces will be replaced by
> 1 or 0
> January 1, 2001 12:00:00
>  >>> DateTime.DateTime(year=2001, month=2, day=3, hour=1,
>                  minute=2, second=3, millisecond=4, timezone=-6) # max
> amount of arguments
> February 3, 2001 01:02:03.004 -06:00

You really just want to support one way for the type constructor
(broken down numbers). All other possibilities can be had via 
factory functions.
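
A sketch of that split, with one numeric constructor and everything else
as module-level factory functions.  The names DateTime, from_ticks and
from_iso_string are assumptions for illustration only and do not describe
mxDateTime's or Zope's actual interfaces.

```python
import time

class DateTime:
    """Hypothetical type with a single, numbers-only constructor."""

    def __init__(self, year, month=1, day=1, hour=0, minute=0, second=0):
        self.year, self.month, self.day = year, month, day
        self.hour, self.minute, self.second = hour, minute, second

    def as_tuple(self):
        return (self.year, self.month, self.day,
                self.hour, self.minute, self.second)

def from_ticks(ticks):
    """Factory: build a DateTime from seconds since the epoch (UTC)."""
    t = time.gmtime(ticks)
    return DateTime(t.tm_year, t.tm_mon, t.tm_mday,
                    t.tm_hour, t.tm_min, t.tm_sec)

def from_iso_string(text):
    """Factory: build a DateTime from a 'YYYY-MM-DD' string."""
    year, month, day = (int(part) for part in text.split("-"))
    return DateTime(year, month, day)

epoch = from_ticks(0)
feb3 = from_iso_string("2001-02-03")
```

Parsing rules can then grow or change in the factories without touching
the type itself.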
 
> >Please have a look at mxDateTime. It has these two types and
> >much of what you described in your notes.
> 
> I know mxDateTime very well and have even suggested before to make it the
> Zope DateTime module and even put it in the standard Python distribution.
> Here is the mail from the Zope-Coders list:
> http://lists.zope.org/pipermail/zope-coders/2001-October/000100.html. You
> can follow the thread to see some responses.
> Also, the list of notes was made from my experience working with
> mxDateTime, Zope DateTime and PostGreSQL Dates/Times. I know it was not
> complete, but it had some of the hotspots in it.
> 
> >BTW, you wouldn't believe how complicated dealing with date
> >and time really is... ah, yes, and don't even think of ever
> >getting DST to work properly :-/
> 
> Oh, I have seen and fixed the Zope DateTime implementation plenty and I
> have thought of the problem for 2.5 years now. The problem is that the US
> starts to use the German "." notation (as mentioned in my original mail)
> and other issues, which make it much harder. That is the reason why I want
> to build an ultra-flexible parsing engine. So you can do things like:
> 
>  >>> import DateTime
>  >>> DateTime.parseDateTime('03/02/01', format=DateTime.ISO)
> February 1, 2003
>  >>> DateTime.parseDateTime('03/02/01', format=DateTime.US)
> March 2, 2001
>  >>> DateTime.parseDateTime('03.02.01', format=DateTime.US)
> March 2, 2001
>  >>> DateTime.parseDateTime('03/02/01', format=DateTime.GERMAN) # just in
> case Europe/Germany goes insane as well.
> February 3, 2001
> 
> But by default:
> 
>  >>> DateTime.parseDateTime('03/02/01')
> March 2, 2001
>  >>> DateTime.parseDateTime('03.02.01')
> February 3, 2001

You can do all this with Parser module in mxDateTime. It allows
you to specify a list of parsers to try and in which order
to try them. Chuck Esterbrook has kept me working on it for 
quite some time, so it should be very complete by now :-) 

For more specific (and strict) formats, there are two other 
modules ISO and ARPA which can handle the respective 
formats used in Internet standards.
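
The "list of parsers tried in order" idea can be sketched with nothing but
the standard library.  This is not the mxDateTime Parser module's API; the
format constants and the function parse_date are made up, and
time.strptime stands in for the individual parsers.

```python
import time

# Hypothetical named formats, expressed as strptime patterns.
US = "%m/%d/%y"
GERMAN = "%d.%m.%y"

def parse_date(text, formats):
    """Try each format in order; return (year, month, day) for the first match."""
    for fmt in formats:
        try:
            t = time.strptime(text, fmt)
        except ValueError:
            continue                  # this parser rejected the text
        return t.tm_year, t.tm_mon, t.tm_mday
    raise ValueError("no parser accepted %r" % (text,))

slashes = parse_date("03/02/01", [US, GERMAN])   # matched by US
dots = parse_date("03.02.01", [US, GERMAN])      # US fails, GERMAN matches
```

The order of the list is what decides how an ambiguous string is read.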

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From srichter@cbu.edu  Sun Feb 10 19:08:09 2002
From: srichter@cbu.edu (Stephan Richter)
Date: Sun, 10 Feb 2002 13:08:09 -0600
Subject: [Python-Dev] proposal: add basic time type to
 thestandardlibrary
In-Reply-To: <3C66B8E5.DC2DD1CA@lemburg.com>
References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu>
 <5.1.0.14.2.20020210091537.02026d38@mercury-1.cbu.edu>
Message-ID: <5.1.0.14.2.20020210124257.01eb3820@mercury-1.cbu.edu>

>As I explained in my reply, most of these intervals are not needed
>as *base types*. You can easily model them on top of the two
>types I have in mxDateTime. Some may not like this model because it
>comes from a more mathematical point of view, but in reality it
>works quite nicely and simplifies the API structure significantly.
>
>A time interval is basically just an amount of seconds, nothing more.
>There's no need to have 4 different types to wrap a single
>double ;-)

Well, this is an okay representation, if you want to do a lot of math and 
use it mainly for this reason. On the other hand it might be fairly 
expensive, if I always want to extract components. In fact, only 10% of my 
usage requires mathematical operations. Most of the time I get the interval 
out of the database and want to simply display it (localized), such as in 
calendars.

> >  >>> import DateTime
> >  >>> date = DateTime.parseDateTime('2.1.2001')
> >  >>> type(date).__name__
> > Date
> >  >>> time = DateTime.parseDateTime('12:00:00')
> >  >>> type(time).__name__
> > Time
> >  >>> datetime = DateTime.parseDateTime('2.1.2001 12:00:00')
> >  >>> type(datetime).__name__
> > DateTime
>
>Just think of all the possible combinations you have in operations
>like '+', '-' and comparisons. You don't want to go down this
>road...

Well, but again, 90% of the time I do not need to do any manipulation 
whatsoever. For this reason you would have time stamps or you know (because 
you used this type) that it will be less efficient to do '+' and '-' with 
DateTime objects, since it does need some more conversions.

>Uhm, where did you get the impression that I want all the world
>to use mxDateTime :-? I wrote it for use in mxODBC since at the time
>there was no DateTime type around which could handle dates prior
>to 1970. As a result, mxDateTime was written to provide everything
>you need for database interfacing. That's also the reason why there
>is no time zone support in mxDateTime's base types: databases
>don't have time zone support built into their date/time types
>either (and for a good reason: time zones are better handled at
>application level).

Well, back then (when I wrote the mail) I thought so. But now I see the 
limitations and have a better idea what people need; hence this proposal. 
For the same reason you say mxDateTime is not good for everything, we need a 
solution that works for more situations.

>You really just want to support one way for the type constructor
>(broken down numbers). All other possibilities can be had via
>factory functions.

Probably so.  I will have to think about it some more and look at some 
applications.

> > But by default:
> >
> >  >>> DateTime.parseDateTime('03/02/01')
> > March 2, 2001
> >  >>> DateTime.parseDateTime('03.02.01')
> > February 3, 2001
>
>You can do all this with Parser module in mxDateTime. It allows
>you to specify a list of parsers to try and in which order
>to try them. Chuck Esterbrook has kept me working on it for
>quite some time, so it should be very complete by now :-)
>
>For more specific (and strict) formats, there are two other
>modules ISO and ARPA which can handle the respective
>formats used in Internet standards.

Right. And I am not saying that we will not reuse some of the mxDateTime or 
the Zope DateTime code. I certainly do not want to reimplement stuff that 
already works very well. Also, we need to support I18N, which means the 
module needs to understand things like "February", but also "Februar" if 
the German locale was requested.

I have no desire to compete with the mxDateTime implementation. I want to 
look at some of the solutions out there and take the best from everyone and 
provide a module that will suit 95-100% of the people. For several reasons, 
which I tried to point out in my mails, mxDateTime and Zope's DateTime in 
their current state are not suitable.

Regards,
Stephan

--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management



From jeremy@alum.mit.edu  Sun Feb 10 00:04:19 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Sat, 9 Feb 2002 19:04:19 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEOBNKAA.tim.one@comcast.net>
References: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCMEOBNKAA.tim.one@comcast.net>
Message-ID: <15461.47363.911259.672824@gondolin.digicool.com>

Here's a brief review of the example function.

def mylen(s):
    return len(s)

LOAD_BUILTIN       0 (len)
LOAD_FAST          0 (s)
CALL_FUNCTION      1
RETURN_VALUE

The interpreter has a dlict for all the builtins.  The details don't
matter here.  Let's say that len is at index 4.

The function mylen has an array:
func_builtin_index = [4]  # an index for each builtin used in mylen

The entry at index 0 of func_builtin_index is the index of len in the
interpreter's builtin dlict.  It is either initialized when the
function is created or on first use of len.  (It doesn't matter for
the mechanism and there's no need to decide which is better yet.)

The module has an md_globals_dirty flag.  If it is true, then a
global was introduced dynamically, i.e. a name binding op occurred
that the compiler did not detect statically.

The code object has a co_builtin_names that is like co_names except
that it only contains the names of builtins used by LOAD_BUILTIN.
It's there to get the correct behavior when shadowing of a builtin by
a local occurs at runtime.

The frame grows a bunch of pointers -- 

    f_module from the function (which stores it instead of func_globals)
    f_builtin_names from the code object
    f_builtins from the interpreter

The implementation of LOAD_BUILTIN 0 is straightforward -- in pidgin C:

case LOAD_BUILTIN:
    if (f->f_module->md_globals_dirty) {
        PyObject *w = PyTuple_GET_ITEM(f->f_builtin_names, oparg);
        ... /* rest is just like current LOAD_GLOBAL 
               except that it uses PyDLict_GetItem()
             */
    } else {
        int builtin_index = f->f_builtin_index[oparg];
        PyObject *x = f->f_builtins[builtin_index];
        if (x == NULL)  
           raise NameError
        Py_INCREF(x);
        PUSH(x);
    }

The LOAD_GLOBAL opcode ends up looking basically the same, except that
it doesn't need to check md_globals_dirty.

case LOAD_GLOBAL:
    int global_index = f->f_global_index[oparg];
    PyObject *x = f->f_module->md_globals[global_index];
    if (x == NULL) {
       check for dynamically introduced builtin
    }
    Py_INCREF(x);
    PUSH(x);

In the x == NULL case above, we need to take extra care for a builtin
that the compiler didn't expect.  It's an odd case.  There is a
global for the module named spam that hasn't yet been assigned to in
the module and there's also a builtin named spam that will be hidden
once spam is bound in the module.
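
The two paths of LOAD_BUILTIN can be modelled in a few lines of Python.
This is only a sketch under stated assumptions: the Module and Function
classes, BUILTIN_NAMES and BUILTIN_SLOTS are hypothetical stand-ins for
the module object, the function object and the interpreter's builtin
dlict, and a plain dict plays the role of the slow-path lookup.

```python
# Toy model of the indexed-builtin lookup. All class and variable names
# here are illustrative; only the field names echo the message.

BUILTIN_NAMES = ["abs", "chr", "int", "str", "len"]   # len sits at index 4
BUILTIN_SLOTS = [abs, chr, int, str, len]             # values of the builtin dlict


class Module:
    def __init__(self):
        self.md_globals_dirty = False   # set when a global appears dynamically
        self.globals = {}               # namespace used by the slow path


class Function:
    def __init__(self, module, builtin_names):
        self.module = module
        self.co_builtin_names = tuple(builtin_names)
        # one slot index per builtin used by the function, e.g. [4] for mylen
        self.func_builtin_index = [BUILTIN_NAMES.index(n) for n in builtin_names]


def load_builtin(func, oparg):
    """Model of LOAD_BUILTIN <oparg>."""
    if func.module.md_globals_dirty:
        # Slow path: look the name up, module globals first, then builtins.
        name = func.co_builtin_names[oparg]
        if name in func.module.globals:
            return func.module.globals[name]
        if name in BUILTIN_NAMES:
            return BUILTIN_SLOTS[BUILTIN_NAMES.index(name)]
        raise NameError(name)
    # Fast path: two array indexings, no hashing.
    x = BUILTIN_SLOTS[func.func_builtin_index[oparg]]
    if x is None:
        raise NameError(func.co_builtin_names[oparg])
    return x


mod = Module()
mylen = Function(mod, ["len"])
fast = load_builtin(mylen, 0)          # fast path returns the real len

mod.globals["len"] = lambda s: -1      # shadowing the compiler did not see
mod.md_globals_dirty = True
slow = load_builtin(mylen, 0)          # slow path finds the shadowing global
```

The point of the dirty flag in this model is that the fast path never
hashes a name; it pays for name lookup only after something unexpected
has been bound in the module.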

Jeremy



From skip@pobox.com  Sun Feb 10 20:34:08 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 10 Feb 2002 14:34:08 -0600
Subject: [Python-Dev] -U flag
In-Reply-To: <3C6675C0.BE7B42FE@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCGEOANKAA.tim.one@comcast.net>
 <3C6675C0.BE7B42FE@lemburg.com>
Message-ID: <15462.55616.207319.531285@12-248-41-177.client.attbi.com>

(I think we've had this discussion before...)

    MAL> Wait... the -U option was added in order to be able to see how well
    MAL> the 8-bit string / Unicode integration works. It's a known fact that
    MAL> the Python standard lib is not Unicode compatible yet and that's
    MAL> exactly what the -U option allows you to test (in a very simple
    MAL> way).

If -U is really just a "test" flag, I don't think it should show up in
"python -h" output.

Skip


From jeremy@alum.mit.edu  Sun Feb 10 01:13:53 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Sat, 9 Feb 2002 20:13:53 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEOBNKAA.tim.one@comcast.net>
References: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCMEOBNKAA.tim.one@comcast.net>
Message-ID: <15461.51537.381585.439205@gondolin.digicool.com>

Let's try an attribute of a module.  

import math

def mysin(x):
    return math.sin(x)

There are two variants of support for this that differ in the way they
handle math being rebound.  Say another function is:

    def yikes():
        global math
        import string as math

We can either check on each use of math.attr to see if math is
rebound, or we can require that STORE_GLOBAL marks all the math.attr
entries as invalid.  I'm not sure which is better, so I'll try to
describe both.

Case #1: Binding operation responsible for invalidating cache.

The module has a dlict for globals that contains three entries:
[math, mysin, yikes].  Each is a PyObject *.

The module also has a global attrs cache, where each entry is
struct {
    int ce_initialized; /* just a flag */
    PyObject **ce_ref;
} cache_entry;

In the case we're considering, ce_module points to math and
ce_module_index is math's index in the globals dlict.  It's assigned
to when the module object is created and never changes.

There is one entry in the global attrs cache, for math.sin.  There's
only one entry because the compiler only found one attribute access of
a global bound by an import statement.

The function mysin(x) uses 
    LOAD_GLOBAL_ATTR  0 (math.sin).

case LOAD_GLOBAL_ATTR:
    cache_entry *e = f->f_module->md_cache[oparg];
    if (!e->ce_initialized) {
        /* look up module and find its sin attr.
           store pointer to module dlict entry in ce_ref.
           NB: cache shared by all functions.

           if the thing we expected to be a module isn't actually
           a module, handle that case here and leave initialized set to
           false.
         */
    }
    if (*e->ce_ref == NULL) {
        /* raise NameError if global module isn't bound yet.
           raise AttributeError if module is bound, but doesn't have
           attr.
         */
    }
    Py_INCREF(*e->ce_ref);
    PUSH(*e->ce_ref);

To support invalidation of cache entries, we need to arrange the cache
entries in a particular order and add an auxiliary data structure that
maps from module globals to the cache entries that must be invalidated.

For example, say a module uses math.sin, math.cos, and math.tan.  The
three cache entries for the math module should be stored contiguously
in the cache.

cache_entry *cache[] = { math.sin entry,
                         math.cos entry,
                         math.tan entry,
                       }

struct {
    int index;   /* first attr of this module in cache */
    int length;  /* number of attrs for this module in cache */
} invalidation_info;

There is one invalidation_info for each module that has cached
attributes.  (And only for things that the compiler determines to be
modules.)  The invalidation_info for math would be {0, 3}.  If a
STORE_GLOBAL rebinds math, it must walk through the cache and set
ce_initialized to false for each cache entry.
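
A rough Python model of Case #1 follows.  It is a sketch, not the
proposed implementation: the Module and CacheEntry classes and the
store_global/load_global_attr method names are made up for
illustration, and the model caches the attribute value itself where
the C version would cache a pointer to the slot in the imported
module's dlict.

```python
import math
import string


class CacheEntry:
    def __init__(self):
        self.ce_initialized = False   # "just a flag"
        self.ce_ref = None            # stands in for the PyObject ** of the message


class Module:
    def __init__(self, attr_names):
        # attr_names lists (global name, attribute) pairs in cache order;
        # all attributes of one global are assumed to be contiguous.
        self.globals = {}
        self.attr_names = attr_names
        self.cache = [CacheEntry() for _ in attr_names]
        self.invalidation_info = {}   # global name -> (index, length)
        for i, (global_name, _) in enumerate(attr_names):
            index, length = self.invalidation_info.get(global_name, (i, 0))
            self.invalidation_info[global_name] = (index, length + 1)

    def store_global(self, name, value):
        """Model of STORE_GLOBAL: rebinding invalidates dependent entries."""
        self.globals[name] = value
        index, length = self.invalidation_info.get(name, (0, 0))
        for entry in self.cache[index:index + length]:
            entry.ce_initialized = False
            entry.ce_ref = None

    def load_global_attr(self, oparg):
        """Model of LOAD_GLOBAL_ATTR <oparg>."""
        entry = self.cache[oparg]
        if not entry.ce_initialized:
            global_name, attr = self.attr_names[oparg]
            if global_name not in self.globals:
                raise NameError(global_name)
            # getattr may raise AttributeError, leaving the entry uninitialized
            entry.ce_ref = getattr(self.globals[global_name], attr)
            entry.ce_initialized = True
        return entry.ce_ref


m = Module([("math", "sin"), ("math", "cos"), ("math", "tan")])
m.store_global("math", math)
sin_first = m.load_global_attr(0)       # fills the cache entry
sin_again = m.load_global_attr(0)       # served from the cache
m.store_global("math", string)          # what yikes() does
stale = [e.ce_initialized for e in m.cache]
```

After the rebinding, the next load_global_attr(0) in this model goes
back through the slow path and raises AttributeError, because the
string module has no sin attribute.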

This isn't exactly the scheme I described in the slides, where I
suggested that the LOAD_GLOBAL_ATTR would check if the module binding
was still valid on each use.  A question from Ping pushed me back in
favor of the approach that I just described.

No time this weekend to describe that check-on-each-use scheme.

Jeremy



From tim.one@comcast.net  Sun Feb 10 21:13:07 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 10 Feb 2002 16:13:07 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202101620.g1AGKUm17544@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEAANLAA.tim.one@comcast.net>

>>>     When you use its getitem method, the PyObject * in the cell is
>>>     dereferenced, and if a NULL is found, getitem raises KeyError
>>>     even if the cell exists.

>> Since we're doing something with "the PyObject* in the cell",
>> surely "the cell" *must* exist.  So what is the "even if the cell
>> exists" trying to say?

> It is trying to say "despite the cell's existence".

Then s/even if/even though/, and that's what it will say <wink>.

>> I believe it means to say
>>
>>     even if the cell's cellptr is not NULL
>>
>> and "the cell's cellptr is not NULL" is quite different from "the cell
>> exists".

> No, it doesn't try to say that.  But you're right that it's useful to
> add that the cell's cellptr is irrelevant to getitem.

So

    even though the cell exists, and even if its cellptr is non-NULL.

[on a write-through dict]
> This works for propagating changes from the celldict to the real dict,
> but not the other way around.

Fudge, that's right.  Make it a 2-way write-through dict pair <wink>.

>  Example:
>
>   d = {'x': 10}
>   def set_x(x):
>       d['x'] = x
>   exec "...some code that calls set_x()..." in d

Understood.  What about this?

import cheater
def f():
    cheater.cheat()
    return pachinko()
print f()

where cheater.py contains:

def cheat():
    import __builtin__
    __builtin__.pachinko = lambda: 666

That works fine today, and prints 666.  Under the proposed scheme, I
believe:

1. The main module's globals don't get a cell for pachinko at module
   creation time, because pachinko isn't in the builtins at that point.

2. When the function object for f() is created, the main module's globals
   grow a {NULL, NULL} cell for pachinko.

3. When cheat() is called, __builtin__'s celldict grows a
   {lambda: 666, NULL}
   slot for pachinko.

4. But the main module's globals neither sees that nor has a
   pointer to the new cell.

5. The reference to pachinko in f's return finds {NULL, NULL} in the
   globals, and so raises NameError.
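
For reference, the behaviour that has to be preserved can be checked
directly.  The sketch below restates the cheater example in a single
file and in Python 3 spelling, which is an assumption on my part: the
thread's __builtin__ module is called builtins there.  The cleanup
line at the end is not part of the original example.

```python
import builtins


def cheat():
    # Stand-in for cheater.cheat(): inject a new builtin at run time.
    builtins.pachinko = lambda: 666


def f():
    cheat()
    # pachinko is not a module global; it is found in the builtins
    # when the call is actually executed.
    return pachinko()


result = f()
del builtins.pachinko   # undo the injection so the namespace stays clean
```

Any faster lookup scheme that wants to keep current semantics has to
return the injected function from f() here rather than raise
NameError.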


I'm thinking more radically now, about using module dicts mapping to pairs

   PyObject* (the actual value)
   a "I got this PyObject* from the builtins" flag (the "shadow flag")

Invariants:
   1. value==NULL implies flag is false.
   2. flag is true implies value is the value in the builtin dict

Suppose that, at module creation time, the module's globals still got an
entry for every then-existing builtin, but rather than point to the
builtin's cell, copied the ultimate PyObject* into one of these pairs (and
set the flag).  Most of the other machinery in the proposal stays the same.
Accessing a global (builtin or not) via LOAD_GLOBAL_CELL<i> is then very
simple:  the pair does or doesn't contain NULL, and that's the end of it
either way.  This makes the most frequent operation as fast as I can imagine
it becoming (short of plugging ultimate PyObject* values directly into
func_cells -- still thinking about that).

How can this get out of synch?

1. Mutations of the builtin dict (creation of a new builtin, rebinding
   an existing builtin, del'ing an existing builtin).  This has got to
   be exceedingly rare, and never happens in most programs:  it doesn't
   matter how expensive this is.  Each dict serving as a builtin dict could,
   e.g., maintain a list of weakrefs to the modules that were initialized
   from it.   Then mutations of the dict could reach into the relevant
   module dicts and adjust them accordingly.  The shadow flags in the
   module dicts' pairs let the fixup code know whether a given entry
   really refers to the builtin.  This makes mutations of the builtins
   very expensive, but I'd be surprised to find a real program where it's
   a measurable expense.  Note:  It may be helpful to view this as akin
   to propagating changes in new-style base classes down to their
   descendants.

2. del'ing a module global may uncover a builtin of the same name.  While
   not as exceedingly rare as mutations of the builtins, it's still a
   rare thing.  Seems like it would be reasonably cheap anyway:

   Module delitem:
       raise exception if no such key
       # key exists
       raise exception if it came from the builtins (flag is set)
       # key exists and is a module global, not a builtin; flag is false
       set value to NULL
       if the builtins have this name as a key:
           copy the current builtin value
           set the flag

3. Module setitem:
       if key exists:
           overwrite the value and clear the flag
       else:
           add new {value, false} pair

4. Module getitem:
       if name isn't a key or flag is set or value is NULL:
           raise exception
       else:
           return value

That's for non-builtin module timdicts.  I expect the same code would work
for the builtin module's timdict too provided it were given an empty dict as
the source from which to initialize itself (then the flag component of all
its pairs would start out, and remain, false, and the code above would "do
the right thing" by magic, reading "the builtins" as "the timdict from which
I got initialized").
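
The four rules above are small enough to model in Python.  This is a
toy sketch under assumptions of my own: the TimDict class, its
load_global and _propagate methods and the NULL sentinel are all
hypothetical names, pairs are two-element lists, and a plain list of
dependents replaces the weakrefs suggested above.

```python
NULL = object()   # sentinel standing in for a C NULL pointer


class TimDict:
    """Toy model of a dict mapping names to [value, shadow_flag] pairs."""

    def __init__(self, source=None):
        self.pairs = {}
        self.source = source     # the timdict this one was initialized from
        self.dependents = []     # the message suggests weakrefs; a list here
        if source is not None:
            source.dependents.append(self)
            for name, pair in source.pairs.items():
                if pair[0] is not NULL:
                    self.pairs[name] = [pair[0], True]

    def load_global(self, name):
        """Eval-loop access: the pair holds NULL or it does not."""
        pair = self.pairs.get(name)
        if pair is None or pair[0] is NULL:
            raise NameError(name)
        return pair[0]

    def __getitem__(self, name):
        pair = self.pairs.get(name)
        if pair is None or pair[1] or pair[0] is NULL:
            raise KeyError(name)
        return pair[0]

    def __setitem__(self, name, value):
        pair = self.pairs.setdefault(name, [NULL, False])
        pair[0], pair[1] = value, False
        self._propagate(name, value)

    def __delitem__(self, name):
        pair = self.pairs.get(name)
        if pair is None or pair[1] or pair[0] is NULL:
            raise KeyError(name)
        pair[0] = NULL
        if self.source is not None:
            below = self.source.pairs.get(name)
            if below is not None and below[0] is not NULL:
                pair[0], pair[1] = below[0], True   # uncover the builtin
        self._propagate(name, NULL)

    def _propagate(self, name, value):
        """Expensive path: push a builtin mutation into dependent modules."""
        for dep in self.dependents:
            pair = dep.pairs.setdefault(name, [NULL, False])
            if pair[1] or pair[0] is NULL:          # entry is not a real global
                pair[0], pair[1] = value, value is not NULL


builtin_dict = TimDict()                 # initialized from nothing
builtin_dict["len"] = len
module_dict = TimDict(builtin_dict)

from_builtin = module_dict.load_global("len")   # copied in, flag set
module_dict["len"] = "shadow"
shadowed = module_dict.load_global("len")
del module_dict["len"]                          # uncovers the builtin again
uncovered = module_dict.load_global("len")
builtin_dict["pachinko"] = 666                  # the cheater case
late = module_dict.load_global("pachinko")
```

In this model all the cost lands in _propagate, which runs only when a
timdict with dependents is mutated; load_global does one dict-free
NULL test once the pair has been located.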

> ...
> I meant that for "len" it would not change, i.e. it would be
>
>     {NULL, pointer to __builtin__'s "len" cell}
>
> but for a global "foo" it would change to
>
>     {value of foo or NULL if foo is undefined, pointer to this very cell}
>
> Then if foo is defined, the code would find the value of foo in the
> first cell it tries, and if foo is undefined, it would find a NULL in
> the cell and in the cell it points to.

Whereas the original scheme stored

    {value of foo or NULL if foo is undefined, NULL}

in this case.  So that's just as quick if foo is defined, but if it isn't
defined as a module global, has to do an extra NULL check on the cellptr.
Gotcha.  OTOH, if "foo" later *becomes* defined in the builtins too, the
module dict won't know that it must change its foo's cellptr.

[on the compiler's architecture]
> Actually, the concrete syntax tree was never a very good
> representation; it was convenient for the parser to generate that, and
> it was "okay" (or "good enough") to generate code from and to do
> anything else from.

I think it worked great until the local-variable optimization got added.
Unfortunately, that happened shortly after the dawn of time.

> I agree that it's a good idea to start thinking about changing the
> parse tree representation to a proper abstract syntax tree.  Maybe the
> normalization that the compiler.py package uses would be a good start?
> Except that I've never quite grasped the visitor architecture there. :-(

This msg is too long already <0.7 wink>.

...
[on Jeremy's global.attr scheme]
> One problem with that is that it's hard to know when <global> in
> <global>.<attribute> is a module, and when it's something else.  I
> guess global analysis could help -- if it's imported ("import math")
> it's likely a module, if it's assigned from an expression ("L = []")
> or a locally defined function or class, it's likely not a module.  But
> "from X import Y" creates a mystery -- X could be a package containing
> a module Y, or it could be a module containing a function or class Y.

Jeremy is aware of all this (I've heard him ponder these specific points),
but I don't think he has a fully fleshed out approach to all of it yet.

[a rework of the "len resolution" example, incorporating Guido's
 comments]
"""
If I'm reading this right, then in the normal case of resolving "len" in

def mylen(s):
    return len(s)

1. A pointer to a cell object is read out of func_cells at a fixed (wrt
   this function) offset.  This points to len's cell object in the
   module's celldict.

2. The cell object's PyObject* pointer is tested and found to be NULL.

[3'. The cell object's cellptr pointer is tested and found not to be NULL.
     This points to len's cell object in __builtin__'s celldict.

>     This NULL test shouldn't be needed given my trick of linking cells
>     that do not shadow globals to themselves.

 As above, in the presence of mutations to builtins, a global that didn't
 shadow a builtin at first may end up shadowing one later.  Perhaps you
 want to punt on preserving current behavior in such cases.  The variant I
 sketched above is intended to preserve all current behavior, while running
 faster in non-pathological cases.
]

3. The cell object's cellptr's PyObject* is tested and found not to be
   NULL.

4. The cell object's cellptr's PyObject* is returned.
"""

In the {PyObject*, flag} variant:

1. A pointer to a pair is read out of func_cells at a fixed (wrt
   this function) offset.  This points to len's pair in the
   module's timdict.

2. The pair's PyObject* pointer is tested and found to be non-NULL.

3. The pair's PyObject* is returned.

> ...
> I know I got sick there and am now stuck with a horrible cold (the
> umpteenth one this season).

In empathy, I'll refrain from painting a word-picture of my neck <wink>.



From skip@pobox.com  Sun Feb 10 21:39:28 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 10 Feb 2002 15:39:28 -0600
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOELENKAA.tim.one@comcast.net>
References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCOELENKAA.tim.one@comcast.net>
Message-ID: <15462.59536.874232.817836@12-248-41-177.client.attbi.com>

    >> When a function object is created from a regular dict instead of a
    >> celldict, func_cells is a NULL pointer.

    Tim> This part is regrettable, since it's Yet Another NULL check at the
    Tim> *top* of code using this stuff (meaning it slows the normal case,
    Tim> assuming that it's unusual not to get a celldict).  I'm not clear
    Tim> on how code ends up getting created from a regular dict instead of
    Tim> a celldict -- is this because of stuff like "exec whatever in
    Tim> mydict"?

I'm still working my way through this thread, so forgive me if this has been
hashed out already.  It seems to me that the correct thing to do is to
convert plain dicts to celldicts when creating functions.  Besides, where
are functions going to get created that are outside of your (PythonLabs)
control?

Skip


From skip@pobox.com  Sun Feb 10 21:53:07 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 10 Feb 2002 15:53:07 -0600
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEOBNKAA.tim.one@comcast.net>
References: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCMEOBNKAA.tim.one@comcast.net>
Message-ID: <15462.60355.540553.195176@12-248-41-177.client.attbi.com>

    Tim> If I'm reading this right, then in the normal case of resolving
    Tim> "len" in

    Tim> def mylen(s):
    Tim>     return len(s)

    ...

    Tim> Jeremy, can you do the same level of detail for your scheme?  Skip?

Yeah, it's

    TRACK_GLOBAL        'len'
    LOAD_FAST           <len>
    LOAD_FAST           <s>
    CALL_FUNCTION       1
    UNTRACK_GLOBAL      'len'
    RETURN_VALUE

or something similar.  (Stuff in <...> represents array indexes.)

My scheme makes update of my local copy of __builtins__.len the
responsibility of the guy who changes the global copy.  Most of the time
this never changes, so as the number of accesses to len increases, the
average time per lookup approaches that of a simple LOAD_FAST.
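
A minimal Python sketch of that division of labour, assuming
hypothetical Frame and TrackedNamespace classes (neither name comes
from my implementation):

```python
class Frame:
    def __init__(self, nslots):
        self.fastlocals = [None] * nslots


class TrackedNamespace:
    def __init__(self, **initial):
        self.values = dict(initial)
        self.watchers = {}    # name -> list of (frame, slot) registrations

    def track(self, name, frame, slot):
        """Model of TRACK_GLOBAL: register and take a local copy."""
        self.watchers.setdefault(name, []).append((frame, slot))
        frame.fastlocals[slot] = self.values[name]

    def untrack(self, name, frame, slot):
        """Model of UNTRACK_GLOBAL: stop receiving updates."""
        self.watchers[name].remove((frame, slot))

    def store(self, name, value):
        """The writer pays: every registered local copy is refreshed."""
        self.values[name] = value
        for frame, slot in self.watchers.get(name, []):
            frame.fastlocals[slot] = value


namespace = TrackedNamespace(len=len)
frame = Frame(2)
namespace.track("len", frame, 0)
before = frame.fastlocals[0]        # the local copy of len
namespace.store("len", max)         # someone rebinds the global
during = frame.fastlocals[0]        # the copy was updated by the writer
namespace.untrack("len", frame, 0)
namespace.store("len", min)
after = frame.fastlocals[0]         # no longer tracked, so left alone
```

Reads between track and untrack are plain list indexing in this model,
which is what makes them comparable to LOAD_FAST.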

Skip


From tim.one@comcast.net  Sun Feb 10 21:51:59 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 10 Feb 2002 16:51:59 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <15462.59536.874232.817836@12-248-41-177.client.attbi.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEADNLAA.tim.one@comcast.net>

[Skip Montanaro]
> I'm still working my way through this thread, so forgive me if
> this has been hashed out already.  It seems to me that the correct
> thing to do is to convert plain dicts to celldicts when creating
> functions.

There's the problem of object identity:  it's possible for exec'ed code to
mutate the original dict while the exec'ed code is running, and Guido gave
an example where that can matter.  I had originally suggested building a
celldict that *contained* the original dict, reflecting mutations from the
former to the latter as they happened.  Mutations in the other direction go
unnoticed, though.  If the binary layouts are compatible enough, it may
suffice to replace the dict's type pointer for the duration.  Even then, the
exec'ed code may get tripped up via testing (directly or indirectly) the
type of the original dict (I suppose it could lie about its type ...).

> Besides, where are functions going to get created that are outside
> of your (PythonLabs) control?

They aren't, but eval and exec and execfile allow users to pass in plain
dicts to be used for locals and/or globals.



From skip@pobox.com  Sun Feb 10 22:29:53 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 10 Feb 2002 16:29:53 -0600
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>
References: <200202081650.g18GoVu02559@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15462.62561.295245.440138@12-248-41-177.client.attbi.com>

    Guido> Inspired by talks by Jeremy and Skip on DevDay, here's a
    Guido> different idea for speeding up access to globals.  It retains
    Guido> semantics but (like Jeremy's proposal) changes the type of a
    Guido> module's __dict__.

Just to see if I have a correct mental model of what Guido proposed, I drew
a picture:

    http://manatee.mojam.com/~skip/python/celldict.png

The cells are the small blank boxes.  I guess the celldict would be the
stuff I labelled "module dict".  The "func cells" would be an array like
fastlocals, but would refer to cells in the module's dict.  I'm not clear
where/how builtins are accessed though.  Is that what the extra indirection
is, or are builtins incorporated into the module dict somehow?

If anyone wants to correct my picture, the Dia diagram is at

    http://manatee.mojam.com/~skip/python/celldict

Skip


From skip@pobox.com  Sun Feb 10 22:35:49 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 10 Feb 2002 16:35:49 -0600
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEADNLAA.tim.one@comcast.net>
References: <15462.59536.874232.817836@12-248-41-177.client.attbi.com>
 <LNBBLJKPBEHFEDALKOLCAEADNLAA.tim.one@comcast.net>
Message-ID: <15462.62917.505362.521566@12-248-41-177.client.attbi.com>


    Tim> There's the problem of object identity: it's possible for exec'ed
    Tim> code to mutate the original dict while the exec'ed code is running,
    Tim> and Guido gave an example where that can matter.  

In the face of exec statements or calls to execfile can't the compiler just
generate the usual LOAD_NAME fallback instead of the new-fangled LOAD_GLOBAL
opcode?

Skip


From tim.one@comcast.net  Sun Feb 10 22:45:34 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 10 Feb 2002 17:45:34 -0500
Subject: [Python-Dev] PYC Magic
In-Reply-To: <3C6675C0.BE7B42FE@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAGNLAA.tim.one@comcast.net>

[MAL]
> Wait... the -U option was added in order to be able to see how well
> the 8-bit string / Unicode integration works. It's a known fact that
> the Python standard lib is not Unicode compatible yet and that's
> exactly what the -U option allows you to test (in a very simple
> way).

I don't object to testing hacks provided they don't trip up the innocent; it
would help to remove -U from the user-visible docs (which I'll do).  Note
that, by coincidence, Andreas Jung (at Zope Corp) pissed away time worrying
about -U breakage yesterday independent of our thread here:  it's doing
harm.

If you're the only one who tries -U on purpose (anyone?  it's clear that I
don't ...), it would be better done via a preprocessor define.  How often is
this used even by you?  If it's once per release just to make sure it's
still broken <wink>, a variant build wouldn't be a real burden.

> ...
> The -U option is currently the only application of such a flag.
> We will definitely have a need for these options in the future
> to make the runtime aware of certain assumptions which have been
> made in the compiled byte code, e.g. byte code using special
> opcodes, byte code compiled for a different Python virtual
> machine (once we get pluggable Python compiler / VM combos),
> byte code which was compiled using special literal
> interpretations (such as in the -U case or when compiling
> the source code with a different source code encoding
> assumption).

There remains no current use for any of these things.  When a real use
appears, "magic number" abuse won't be appropriate:  imp.get_magic() doesn't
return a vector; we're not doing the Unixish /etc/magic database any favors
by *ever* changing it; and needing to register umpteen distinct magic
numbers per release for Linux binfmt would make Python even more irritating
to live with there.

> I would be more than happy to get rid of the current PYC magic hack
> for -U and have it replaced with a better and extensible alternative,
> e.g. a combination of PYC version number and marshalled option
> dictionary.

I agree, except that I still think having -U now is a net loss.



From tim.one@comcast.net  Sun Feb 10 22:51:19 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 10 Feb 2002 17:51:19 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <15462.62917.505362.521566@12-248-41-177.client.attbi.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEAHNLAA.tim.one@comcast.net>

[Skip]
> In the face of exec statements or calls to execfile can't the
> compiler just generate the usual LOAD_NAME fallback instead of the new-
> fangled LOAD_GLOBAL opcode?

Note that exec doesn't have to be passed a string:  you can pass it a
compiled code object just as well.  The compiler can't guess how a code
object will be used at the time it's compiled.  In theory there would be
nothing to stop exec from rewriting the bytecode in a compiled code object
passed to it, but I doubt we could get Guido to buy that trick until he
first buys rewriting bytecode to set debugger breakpoints <wink>.
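
A small illustration of the point, written in Python 3 spelling (exec
as a function rather than a statement, which is an assumption relative
to the interpreter under discussion).  The same code object is
compiled once and later run against two unrelated plain dicts:

```python
# Compiled once, with no knowledge of where it will run.
code = compile("result = len(data)", "<example>", "exec")

# First use: a plain dict as globals, len comes from the builtins.
ns1 = {"data": [1, 2, 3]}
exec(code, ns1)

# Second use: a different plain dict that shadows len.
ns2 = {"data": "ab", "len": lambda s: 99}
exec(code, ns2)

first = ns1["result"]
second = ns2["result"]
```

Nothing at compile() time says which of these namespaces, if any, the
code object will eventually see.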



From tim.one@comcast.net  Sun Feb 10 23:27:09 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 10 Feb 2002 18:27:09 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <15462.62561.295245.440138@12-248-41-177.client.attbi.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEAJNLAA.tim.one@comcast.net>

[Skip Montanaro]
> Just to see if I have a correct mental model of what Guido
> proposed, I drew a picture:
>
>     http://manatee.mojam.com/~skip/python/celldict.png
>
> The cells are the small blank boxes.  I guess the celldict would be the
> stuff I labelled "module dict".  The "func cells" would be an array like
> fastlocals, but would refer to cells in the module's dict.

Yup, except that fastlocals are part of a frame, not part of a function
object.  Guido didn't make a big deal about this, but it's key to
efficiency:  the expense of setting up func_cells is *not* incurred on a
per-call basis, it's done once when a function object is created
(MAKE_FUNCTION), then reused across all calls to that function object.

> I'm not clear where/how builtins are accessed though.

__builtin__ is just another module, and also has a celldict for a __dict__.
The empty squares in your diagram (the "bottom half" of your cells)
sometimes point to cells in __builtin__'s celldict.  They remain empty
(NULL) in __builtin__'s celldict, though.

> Is that what the extra indirection is, or are builtins incorporated into
> the module dict somehow?

In Guido's proposal, module celldicts sometimes point to builtin's cells.
It's set up so that *all* names of builtins get an entry in the module's
dict, even names that aren't referenced in the module (this avoids global
analysis).  Their initial entries look like:

   "len":  {NULL, pointer to the "len" cell in the builtins}

Setting "len" as a module global (if you ever do that) overwrites the NULL.
Then later del'ing "len" again (if you ever do that) restores the NULL.  For
*most* purposes, a cell with a NULL first pointer acts as if it didn't
exist.  It's only the eval loop that understands the "deep structure".
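
Here is a toy Python model of that cell structure, as I read the
proposal.  The Cell class and load_global_cell function are
hypothetical names, None stands in for NULL (so None itself cannot be
stored as a value in this sketch), and plain dicts replace the
celldicts.

```python
NULL = None   # stands in for a C NULL pointer in this toy model


class Cell:
    def __init__(self, objptr=NULL, cellptr=NULL):
        self.objptr = objptr     # the value, or NULL
        self.cellptr = cellptr   # cell in the builtins' celldict, or NULL


def load_global_cell(cell, name):
    """What the eval loop does with a cell taken from func_cells."""
    if cell.objptr is not NULL:
        return cell.objptr
    if cell.cellptr is not NULL and cell.cellptr.objptr is not NULL:
        return cell.cellptr.objptr
    raise NameError(name)


# Every builtin gets an entry in the module's dict at creation time.
builtin_cells = {"len": Cell(len)}
module_cells = {name: Cell(NULL, cell) for name, cell in builtin_cells.items()}

len_cell = module_cells["len"]
via_builtin = load_global_cell(len_cell, "len")   # follows cellptr
len_cell.objptr = "shadow"                        # module binds len
via_global = load_global_cell(len_cell, "len")
len_cell.objptr = NULL                            # del len restores the NULL
restored = load_global_cell(len_cell, "len")

# A builtin that appears only after the module's cell was created.
module_cells["pachinko"] = Cell()                 # {NULL, NULL}
builtin_cells["pachinko"] = Cell(lambda: 666)     # added later
try:
    load_global_cell(module_cells["pachinko"], "pachinko")
    late_builtin_seen = True
except NameError:
    late_builtin_seen = False
```

The last few lines reproduce, in this model only, the case where the
module's cell never learns about a builtin created afterwards.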


In the variant I sketched today, there are no cross-dict pointers, and the
initial entries look like

   "len": {the actual value of "len" from builtins, true}

instead.  Then mutating the builtins requires reaching back into modules and
updating their timdicts.  In return, access code is simpler+faster, and
there aren't semantic changes (compared to today) if the builtins mutate
*after* a module's dict is initially populated (Guido's scheme appears
vulnerable here in at least two ways).



From guido@python.org  Mon Feb 11 00:09:29 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 10 Feb 2002 19:09:29 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Sun, 10 Feb 2002 16:35:49 CST."
 <15462.62917.505362.521566@12-248-41-177.client.attbi.com>
References: <15462.59536.874232.817836@12-248-41-177.client.attbi.com> <LNBBLJKPBEHFEDALKOLCAEADNLAA.tim.one@comcast.net>
 <15462.62917.505362.521566@12-248-41-177.client.attbi.com>
Message-ID: <200202110009.g1B09TV18217@pcp742651pcs.reston01.va.comcast.net>

[Skip]
> In the face of exec statements or calls to execfile can't the
> compiler just generate the usual LOAD_NAME fallback instead of the
> new-fangled LOAD_GLOBAL opcode?

Very good!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Feb 11 00:10:53 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 10 Feb 2002 19:10:53 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Sun, 10 Feb 2002 17:51:19 EST."
 <LNBBLJKPBEHFEDALKOLCCEAHNLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCCEAHNLAA.tim.one@comcast.net>
Message-ID: <200202110010.g1B0Asb18230@pcp742651pcs.reston01.va.comcast.net>

> [Skip]
> > In the face of exec statements or calls to execfile can't the
> > compiler just generate the usual LOAD_NAME fallback instead of the new-
> > fangled LOAD_GLOBAL opcode?

[Tim]
> Note that exec doesn't have to be passed a string: you can pass it a
> compiled code object just as well.  The compiler can't guess how a
> code object will be used at the time it's compiled.  In theory there
> would be nothing to stop exec from rewriting the bytecode in a
> compiled code object passed to it, but I doubt we could get Guido to
> buy that trick until he first buys rewriting bytecode to set
> debugger breakpoints <wink>.

Arg.  So much for that idea.  (Although I think the mutable bytecode
idea *is* the right idea for setting breakpoints after all.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Sun Feb 10 04:21:01 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Sat, 9 Feb 2002 23:21:01 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <15461.51537.381585.439205@gondolin.digicool.com>
References: <200202091423.g19EN1807684@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCMEOBNKAA.tim.one@comcast.net>
 <15461.51537.381585.439205@gondolin.digicool.com>
Message-ID: <15461.62765.301509.19821@gondolin.digicool.com>

>>>>> "JH" == Jeremy Hylton <jeremy@alum.mit.edu> writes:

  JH> Case #1: Binding operation responsible for invalidating cache.

  JH> The module has a dlict for globals that contains three entries:
  JH> [math, mysin, yikes].  Each is a PyObject *.

  JH> The module also has a global attrs cache, where each entry is
  JH> struct {
  JH>     int ce_initialized; /* just a flag */ PyObject **ce_ref;
  JH> } cache_entry;

  JH> In the case we're considering, ce_module points to math and
  JH> ce_module_index is math's index in the globals dlict.  It's
  JH> assigned to when the module object is created and never changes.

Just pretend I didn't write this paragraph :-(.  I was going to
describe the other case first, then changed my mind.  The previous
paragraph describes Case #2.  

The text before and after this paragraph looks clear to me.  Does
anyone else agree?  I didn't think I had done any hand waving on
globals and module attributes in the slides; so I expect that I'm not
a good judge of what is hand waving and what is high-level
description.

Jeremy



From aahz@rahul.net  Mon Feb 11 00:41:34 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Sun, 10 Feb 2002 16:41:34 -0800 (PST)
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202110010.g1B0Asb18230@pcp742651pcs.reston01.va.comcast.net> from "Guido van Rossum" at Feb 10, 2002 07:10:53 PM
Message-ID: <20020211004134.B799FE8C3@waltz.rahul.net>

Guido van Rossum wrote:
> 
>> [Skip]
>>> In the face of exec statements or calls to execfile can't the
>>> compiler just generate the usual LOAD_NAME fallback instead of the new-
>>> fangled LOAD_GLOBAL opcode?
> 
> [Tim]
>> Note that exec doesn't have to be passed a string: you can pass it a
>> compiled code object just as well.  The compiler can't guess how a
>> code object will be used at the time it's compiled.  In theory there
>> would be nothing to stop exec from rewriting the bytecode in a
>> compiled code object passed to it, but I doubt we could get Guido to
>> buy that trick until he first buys rewriting bytecode to set
>> debugger breakpoints <wink>.
> 
> Arg.  So much for that idea.  (Although I think the mutable bytecode
> idea *is* the right idea for setting breakpoints after all.)

Let me play stupid for a sec: how does a compiled code object get
created?  Is Tim saying that one can pass foo.bar to exec, where bar()
is a function in module foo?  If not, why can't we force compile() to
generate the slower code?  Alternatively, can we change the semantics of
exec to require the use of compile() to generate code objects?
(compile() on an existing code object would do an explicit rewrite of
the bytecode.)
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From jeremy@alum.mit.edu  Sun Feb 10 05:01:31 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Sun, 10 Feb 2002 00:01:31 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <20020211004134.B799FE8C3@waltz.rahul.net>
References: <200202110010.g1B0Asb18230@pcp742651pcs.reston01.va.comcast.net>
 <20020211004134.B799FE8C3@waltz.rahul.net>
Message-ID: <15461.65195.711574.143348@gondolin.digicool.com>

You can exec a code object and specify the environment to use for
names.

Jeremy

>>> def f():
...	print x + y
... 
>>> x = 1
>>> y = 3
>>> f()
4
>>> exec f.func_code in {'x':0, 'y':-3}, {}
-3
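
The session above is Python 2. A rough present-day equivalent, as a
sketch assuming Python 3 (where func_code is spelled __code__ and
exec/eval are functions), could look like the following. Returning the
sum instead of printing it is a change made here so the result can be
inspected; eval() is used because it hands back the value the code
object returns.

```python
def f():
    # x and y are not locals, so they are looked up in whatever
    # globals the code object happens to be run with.
    return x + y

x = 1
y = 3

# Ordinary call: uses the globals of the module that defined f.
normal = f()

# Run the raw code object against a different globals dict.
custom = eval(f.__code__, {'x': 0, 'y': -3}, {})

print(normal, custom)
```

The same code object yields different results depending only on the
environment supplied at run time, which is why the compiler cannot know
in advance how the names will be resolved.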



From tim.one@comcast.net  Mon Feb 11 01:14:07 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 10 Feb 2002 20:14:07 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <20020211004134.B799FE8C3@waltz.rahul.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEAONLAA.tim.one@comcast.net>

[Aahz]
> Let me play stupid for a sec: how does a compiled code object get
> created?

Explicitly via passing strings to compile/exec/eval or via execfile, or
implicitly due to the normal operation of class, def and lambda statements,
and the interactive prompt.

> Is Tim saying that one can pass foo.bar to exec, where bar() is a function
> in module foo?

No, foo.bar is a function object, meaning basically that it's a code object
bound to a specific name and a specific bag of globals, and whose default
argument values (if any) have been computed and frozen based on those
globals.  You can pass foo.bar.func_code to exec, though (that's the raw
code object).  Note that marshal can't handle function objects, but can
handle code objects, and some people make extremely heavy use of extracting
code objects for marshaling, then later unmarshaling and exec'ing them.
When I'm tempted to exec, I'm more likely to use compile() in a separate
step (to get better control over errors) and exec the resulting code object.
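
As a minimal sketch of the marshal round trip described above, assuming
Python 3 spellings (__code__ rather than func_code): the function name
greet and the global name who are invented for illustration, and
types.FunctionType is used here in place of a bare exec so that the
return value is easy to get at.

```python
import marshal
import types

def greet():
    # 'who' is a global, resolved only when the code actually runs.
    return "hello from " + who

# marshal refuses the function object itself...
try:
    marshal.dumps(greet)
    function_ok = True
except ValueError:
    function_ok = False

# ...but the underlying code object round-trips.
payload = marshal.dumps(greet.__code__)
code = marshal.loads(payload)

# Bind the unmarshaled code object to a fresh bag of globals and call it.
rebuilt = types.FunctionType(code, {"who": "marshal"})
result = rebuilt()

print(function_ok, result)
```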

> ...
> Alternatively, can we change the semantics of exec to require the use
> of compile() to generate code objects?
> (compile() on an existing code object would do an explicit rewrite of
> the bytecode.)

I didn't follow this, but am not sure it would help if I did <wink>.



From tim.one@comcast.net  Mon Feb 11 08:13:46 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 11 Feb 2002 03:13:46 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <15462.60355.540553.195176@12-248-41-177.client.attbi.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEBNNLAA.tim.one@comcast.net>

[Skip Montanaro, on
    def mylen(s):
        return len(s)
]
> Yeah, it's
>
>     TRACK_GLOBAL        'len'
>     LOAD_FAST           <len>
>     LOAD_FAST           <s>
>     CALL_FUNCTION       1
>     UNTRACK_GLOBAL      'len'
>     RETURN_VALUE
>
> or something similar.  (Stuff in <...> represent array indexes.)
>
> My scheme makes update of my local copy of __builtins__.len

Who is the "me" in "my"?  That is, is "my local copy" attached to the frame,
or to the function object, or to the module globals, or ...?  Since it's
accessed via LOAD_FAST, I'm assuming it's attached to the frame object.

> the responsibility of the guy who changes the global copy.

Also in my variant of Guido's proposal (and the value of len is cached in
the module dict there, which tracks all changes to the builtins as they
occur).

> Most of the time this never changes,

Right.

> so as the number of accesses to len increase, the average time per
> lookup approaches that of a simple LOAD_FAST.

You mean number of accesses to len per function call, I think.  If I do

    for i in xrange(1000000):
        print mylen("abc")

I'm going to do a TRACK_GLOBAL and UNTRACK_GLOBAL thingie too for each
LOAD_FAST of len, and then the average time per len lookup really has to
count the average time for those guys too.



From tim.one@comcast.net  Mon Feb 11 09:27:27 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 11 Feb 2002 04:27:27 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <15461.47363.911259.672824@gondolin.digicool.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEBPNLAA.tim.one@comcast.net>

[Jeremy Hylton]
> Here's a brief review of the example function.
>
> def mylen(s):
>     return len(s)
>
> LOAD_BUILTIN       0 (len)
> LOAD_FAST          0 (s)
> CALL_FUNCTION      1
> RETURN_VALUE
>
> The interpreter has a dlict for all the builtins.  The details don't
> matter here.

Actually, the details are everything here <wink>.

> Let's say that len is at index 4.
>
> The function mylen has an array:
> func_builtin_index = [4]  # an index for each builtin used in mylen
>
> The entry at index 0 of func_builtin_index is the index of len in the
> interpreter's builtin dlict.  It is either initialized when the
> function is created or on first use of len.

All clear except for the referent of "It" (the subject of the preceding
sentence is "The entry at index 0", but that doesn't seem to make much sense
as a referent).

> (It doesn't matter for the mechanism and there's no need to decide which
> is better yet.)
>
> The module has an md_globals_dirty flag.  If it is true, then a
> global was introduced dynamically, i.e. a name binding op occurred
> that the compiler did not detect statically.

Once it becomes true, can md_globals_dirty ever become false again?

> The code object has a co_builtin_names that is like co_names except
> that it only contains the names of builtins used by LOAD_BUILTIN.
> It's there to get the correct behavior when shadowing of a builtin by
> a local occurs at runtime.
    ^^^^^

Can that happen?  Or did you mean when shadowing of a builtin by a global
occurs at runtime?  The LOAD_BUILTIN code below seems most consistent with
the "global" rewording.

> The frame grows a bunch of pointers --
>
>     f_module from the function (which stores it instead of func_globals)
>     f_builtin_names from the code object
>     f_builtins from the interpreter
>
> The implementation of LOAD_BUILTIN 0 is straightforward -- in pidgin C:
>
> case LOAD_BUILTIN:
>     if (f->f_module->md_globals_dirty) {
>         PyObject *w = PyTuple_GET_ITEM(f->f_builtin_names);

Presumably this is missing an ", oparg" argument.

>         ... /* rest is just like current LOAD_GLOBAL
>                except that it uses PyDLict_GetItem()
>              */
>     } else {
>         int builtin_index = f->f_builtin_index[oparg];
>         PyObject *x = f->f_builtins[builtin_index];
>         if (x == NULL)
>            raise NameError
>         Py_INCREF(x);
>         PUSH(x);
>     }

OK, that's the gritty detail I was looking for.  When it comes time to code,
note that it's better to negate the test and swap the "if" branches (a
not-taken branch is usually quicker than a taken branch, and you want to
favor the expected case).

Question:  couldn't the LOAD_BUILTIN opcode use builtin_index directly as
its argument (and so skip one level of indirection)?  We know which builtins
the interpreter supplies, and the compiler could be taught a fixed
correspondence between builtin names and little integers.  There are only
<snort> 114 keys in __builtin__.__dict__ today, so there's plenty of room in
an instruction to hold the index.  A tuple of std builtin names could also
be a C extern shared by everyone, eliminating the need for f_builtin_names.
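
To make the question concrete, here is a toy model in Python of a fixed
name-to-index table for the builtins. Everything in it (BUILTIN_NAMES,
BUILTIN_INDEX, builtin_slots, load_builtin) is hypothetical and only
illustrates the idea of the opcode argument being the index itself; it
is not how any interpreter is known to do it.

```python
import builtins

# Hypothetical fixed table shared by compiler and interpreter:
# every standard builtin name gets a small integer.
BUILTIN_NAMES = tuple(sorted(vars(builtins)))
BUILTIN_INDEX = {name: i for i, name in enumerate(BUILTIN_NAMES)}

# Interpreter-side array, filled in the same order as the table.
builtin_slots = [getattr(builtins, name) for name in BUILTIN_NAMES]

def load_builtin(index):
    """Stand-in for LOAD_BUILTIN <index>: one array access, no per-function
    index array and no name lookup on the fast path."""
    return builtin_slots[index]

# "Compile time": the compiler bakes the index into the instruction.
len_index = BUILTIN_INDEX["len"]

# "Run time": the instruction indexes the array directly.
found = load_builtin(len_index)
print(found("abc"))
```

The shared BUILTIN_NAMES tuple plays the role of the C extern mentioned
above, so a slow path could still map an index back to a name.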

> The LOAD_GLOBAL opcode ends up looking basically the same, except that
> it doesn't need to check md_globals_dirty.
>
> case LOAD_GLOBAL:
>     int global_index = f->f_global_index[oparg];
>     PyObject *x = f->f_module->md_globals[global_index];
>     if (x == NULL) {
>        check for dynamically introduced builtin
>     }
>     Py_INCREF(x);
>     PUSH(x);

f_global_index wasn't mentioned before its appearance in this code block.  I
can guess what it is.  Again I wonder whether it's possible to snip a layer
of indirection (for a fixed function and fixed oparg, can
f->f_global_index[oparg] change across invocations of LOAD_GLOBAL?  I'm
guessing "no", in which case a third of the normal-case code is burning
cycles without real need).

> In the x == NULL case above, we need to take extra care for a builtin
> that the compiler didn't expect.  It's an odd case.  There is a
> global for the module named spam

The module is named spam, or the global is named spam?  I think the latter
was intended.

> that hasn't yet been assigned to in the module and there's also a
> builtin named spam that will be hidden once spam is bound in the module.

And can also be revealed again if someone reaches into the module and del's
spam again, right?

This looks fast, provided it works <wink>, and is along the lines of what I
had in mind when I first tortured Guido with the idea of dlicts way back
when.  One major correction:  you pronounce it "dee-likt".  That's a
travesty.  I picked the name dlict because it's unpronounceable in any human
language -- as befits an unthinkable idea <wink>.



From mal@lemburg.com  Mon Feb 11 11:12:47 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Feb 2002 12:12:47 +0100
Subject: [Python-Dev] Accessing globals without dict lookup
References: <LNBBLJKPBEHFEDALKOLCIEBPNLAA.tim.one@comcast.net>
Message-ID: <3C67A72F.B313E47@lemburg.com>

Just a few quick questions before I go back into lurking mode:

Will it still be possible to:
a) install new builtins in the __builtin__ namespace and have them
   available in all already loaded modules right away ?
b) override builtins (e.g. open()) with my own copies 
   (e.g. to increase security) in a way that makes these new
   copies override the previous ones in all modules ?

Also, how does the new scheme get along with the restricted 
execution model ? (I have a feeling that this model needs some
auditing since so many new ways of accessing variables and
attributes were introduced since the days of 1.5.2)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From ping@lfw.org  Mon Feb 11 13:14:09 2002
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 11 Feb 2002 07:14:09 -0600 (CST)
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEAJNLAA.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.33.0202110646310.32170-100000@server1.lfw.org>

All right -- i have attempted to diagram a slightly more interesting
example, using my interpretation of Guido's scheme.

    http://lfw.org/repo/cells.gif

    http://lfw.org/repo/cells-big.gif for a bigger image

    http://lfw.org/repo/cells.ai for the source file

The diagram is supposed to represent the state of things after
"import spam", where spam.py contains

    import eggs

    i = -2
    max = 3

    def foo(n):
        y = abs(i) + max
        return eggs.ham(y + n)

How does it look?  Guido, is it anything like what you have in mind?


A couple of observations so far:

    1.  There are going to be lots of global-cell objects.
        Perhaps they should get their own allocator and free list.

    2.  Maybe we don't have to change the module dict type.
        We could just use regular dictionaries, with the special
        case that if retrieving the value yields a cell object,
        we then do the objptr/cellptr dance to find the value.
        (The cell objects have to live outside the dictionaries
        anyway, since we don't want to lose them on a rehashing.)

    3.  Could we change the name, please?  It would really suck
        to have two kinds of things called "cell objects" in
        the Python core.

    4.  I recall Tim asked something about the cellptr-points-to-itself
        trick.  Here's what i make of it -- it saves a branch: instead of

            PyObject* cell_get(PyGlobalCell* c)
            {
                if (c->cell_objptr) return c->cell_objptr;
                if (c->cell_cellptr) return c->cell_cellptr->cell_objptr;
            }

        it's

            PyObject* cell_get(PyGlobalCell* c)
            {
                if (c->cell_objptr) return c->cell_objptr;
                return c->cell_cellptr->cell_objptr;
            }

        This makes no difference when c->cell_objptr is filled,
        but it saves one check when c->cell_objptr is NULL in
        a non-shadowed variable (e.g. after "del x").  I believe
        that's the only case in which it matters, and it seems
        fairly rare to me that a module function will attempt to
        access a variable that's been deleted from the module.

        Because the module can't know what new variables might
        be introduced into __builtin__ after the module has been
        loaded, a failed lookup must finally fall back to a lookup
        in __builtin__.  Given that, it seems like a good idea to
        set c->cell_cellptr = c when c->cell_objptr is set (for
        both shadowed and non-shadowed variables).  In my picture,
        this would change the cell that spam.max points to, so
        that it points to itself instead of __builtin__.max's cell.
        That is:

            void cell_set(PyGlobalCell* c, PyObject* v)
            {
                c->cell_objptr = v;
                c->cell_cellptr = c;
            }

        This simplifies things further:

            PyObject* cell_get(PyGlobalCell* c)
            {
                return c->cell_cellptr->cell_objptr;
            }

        This buys us no branches, which might be a really good
        thing on today's speculative execution styles.
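
The loopback idea in item 4 can be modelled in a few lines of Python.
This is only a sketch: the class name GlobalCell and the fallback
argument are invented here, None stands in for a C NULL (so None itself
cannot be stored in this toy), and restoring cellptr after a del is not
covered.

```python
class GlobalCell:
    """Toy two-pointer cell; field names follow the C sketch above."""
    def __init__(self, fallback=None):
        self.objptr = None
        # An unbound module cell points at the builtin's cell;
        # a cell with no fallback points at itself.
        self.cellptr = fallback if fallback is not None else self

def cell_set(c, v):
    c.objptr = v
    c.cellptr = c        # loop back once a value is bound

def cell_get(c):
    return c.cellptr.objptr   # always two hops, never a branch

builtin_max = GlobalCell()
cell_set(builtin_max, max)

spam_max = GlobalCell(fallback=builtin_max)
before = cell_get(spam_max)   # not yet bound in spam: sees the builtin

cell_set(spam_max, 3)         # "max = 3" at module level
after = cell_get(spam_max)    # now shadowed by the module global

print(before, after)
```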

I know i'm a few messages behind on the discussion -- i'll do
some reading to catch up before i say any more.  But i hope
the diagram is somewhat helpful, anyway.


-- ?!ng



From ping@lfw.org  Mon Feb 11 13:22:27 2002
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 11 Feb 2002 07:22:27 -0600 (CST)
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <Pine.LNX.4.33.0202110646310.32170-100000@server1.lfw.org>
Message-ID: <Pine.LNX.4.33.0202110716230.32321-100000@server1.lfw.org>

On Mon, 11 Feb 2002, Ka-Ping Yee wrote:
>         This simplifies things further:
>
>             PyObject* cell_get(PyGlobalCell* c)
>             {
>                 return c->cell_cellptr->cell_objptr;
>             }

I forgot to mention that this would also add loopback cellptrs for
the two cells pointed to by __builtin__.abs and __builtin__.max.

But hey... in that case the cellptr is always two steps away from
the object.  So why not just use PyObject**s instead of cells?

    dict -> ptr -> ptr -> object

(Or, if we want to maintain backward compatibility with existing
dictionaries, let a cell be an object, so we can check its type,
and have it contain just one pointer instead of two?)

Am i out to lunch?


-- ?!ng



From martin@v.loewis.de  Mon Feb 11 12:15:59 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 11 Feb 2002 13:15:59 +0100
Subject: [Python-Dev] Speeding up instance attribute access
In-Reply-To: <200202081918.g18JIKb03242@pcp742651pcs.reston01.va.comcast.net>
References: <200202081918.g18JIKb03242@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <m37kpkgqo0.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> - We need fallbacks for various exceptional cases:

I think assignment to __class__ also needs to be
considered. Therefore, it may be best if the member array is a
separate block (not allocated with the instances).

It might also be worthwhile to incorporate __slots__ access into that
scheme, to avoid having to find the member descriptor in the class
dictionary.

Regards,
Martin


From skip@pobox.com  Mon Feb 11 14:16:32 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 11 Feb 2002 08:16:32 -0600
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEBNNLAA.tim.one@comcast.net>
References: <15462.60355.540553.195176@12-248-41-177.client.attbi.com>
 <LNBBLJKPBEHFEDALKOLCAEBNNLAA.tim.one@comcast.net>
Message-ID: <15463.53824.600024.850814@12-248-41-177.client.attbi.com>

    Tim> [Skip Montanaro, on
    Tim>     def mylen(s):
    Tim>         return len(s)
    Tim> ]
    >> Yeah, it's
    >> 
    >> TRACK_GLOBAL        'len'
    >> LOAD_FAST           <len>
    >> LOAD_FAST           <s>
    >> CALL_FUNCTION       1
    >> UNTRACK_GLOBAL      'len'
    >> RETURN_VALUE
    >> 
    >> or something similar.  (Stuff in <...> represent array indexes.)
    >> 
    >> My scheme makes update of my local copy of __builtins__.len

    Tim> Who is the "me" in "my"?

Sorry, should have been "the" instead of "my".  TRACK_GLOBAL is responsible
for making the original copy.  I should have added another argument to it:

    TRACK_GLOBAL        'len', <len>
    LOAD_FAST           <len>
    LOAD_FAST           <s>
    CALL_FUNCTION       1
    UNTRACK_GLOBAL      'len', <len>
    RETURN_VALUE

    Tim> You mean number of accesses to len per function call, I think.  

Yes.

    Tim> If I do

    Tim>     for i in xrange(1000000):
    Tim>         print mylen("abc")

    Tim> I'm going to do a TRACK_GLOBAL and UNTRACK_GLOBAL thingie too for
    Tim> each LOAD_FAST of len, and then the average time per len lookup
    Tim> really has to count the average time for those guys too.

Actually, no.  I originally meant to say "Ignoring the fact that my
optimizer would leave this example untouched...", but deleted it while
editing the message as more detail than you were asking for.  Your example:

    def mylen(s):
        return len(s)

doesn't access len in a loop, so it would be ignored.  On the other hand:

    for i in xrange(1000000):
        print mylen("abc")

would track mylen (but not xrange).

Skip


From skip@pobox.com  Mon Feb 11 14:26:46 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 11 Feb 2002 08:26:46 -0600
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <Pine.LNX.4.33.0202110716230.32321-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0202110646310.32170-100000@server1.lfw.org>
 <Pine.LNX.4.33.0202110716230.32321-100000@server1.lfw.org>
Message-ID: <15463.54438.240250.933946@12-248-41-177.client.attbi.com>

    Ping> But hey... in that case the cellptr is always two steps away from
    Ping> the object.  So why not just use PyObject**s instead of cells?

I think it's because they aren't objects.  You need to make the indirection
explicit so that when some code does the equivalent of module.abs it
realizes it needs to follow the chain.

Thanks for the great diagram, btw.  I knew if I did something feeble it
would get rewritten correctly.

Skip


From guido@python.org  Mon Feb 11 14:31:32 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 09:31:32 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Mon, 11 Feb 2002 12:12:47 +0100."
 <3C67A72F.B313E47@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCIEBPNLAA.tim.one@comcast.net>
 <3C67A72F.B313E47@lemburg.com>
Message-ID: <200202111431.g1BEVWJ19544@pcp742651pcs.reston01.va.comcast.net>

> Just a few quick questions before I go back into lurking mode:

Note that I've moved my design to a new PEP, PEP 280.  Tim has added
his approach there too.  Please read it!!!

> Will it still be possible to:
> a) install new builtins in the __builtin__ namespace and have them
>    available in all already loaded modules right away ?
> b) override builtins (e.g. open()) with my own copies 
>    (e.g. to increase security) in a way that makes these new
>    copies override the previous ones in all modules ?

Yes, this is the whole point of this design.  In the original
approach, when LOAD_GLOBAL_CELL finds a NULL in the second cell, it
should go back to see if the __builtins__ dict has been modified (the
pseudo code doesn't have this yet).  Tim's alternative also takes care
of this.

> Also, how does the new scheme get along with the restricted 
> execution model ?

Yes, again.

> (I have a feeling that this model needs some auditing since so many
> new ways of accessing variables and attributes were introduced since
> the days of 1.5.2)

You may be right.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Mon Feb 11 14:39:35 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 11 Feb 2002 08:39:35 -0600
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <Pine.LNX.4.33.0202110646310.32170-100000@server1.lfw.org>
References: <LNBBLJKPBEHFEDALKOLCGEAJNLAA.tim.one@comcast.net>
 <Pine.LNX.4.33.0202110646310.32170-100000@server1.lfw.org>
Message-ID: <15463.55207.217946.237969@12-248-41-177.client.attbi.com>

    Ping> All right -- i have attempted to diagram a slightly more
    Ping> interesting example, using my interpretation of Guido's scheme.

Very nice.  One case I would like to see covered is that of a global that is
deleted.  Something like:


    import eggs

    i = -2
    max = 3
    j = 4

    def foo(n):
        y = abs(i) + max
        return eggs.ham(y + n)

    del j

I presume there would still be an entry in spam's module dict with a NULL
objptr.

The whole thing makes sense to me if it avoids the possible two
PyDict_GetItem calls in the LOAD_GLOBAL opcode.  As I understand it, if
accessed inside a function, LOAD_GLOBAL could be implemented something like
this:

    case LOAD_GLOBAL:
        cell = func_cells[oparg];
        if (cell->objptr) x = cell->objptr;
        else x = cell->cellptr->objptr;
        if (x == NULL) {
            ... error recovery ...
            break;
        }
        Py_INCREF(x);
        continue;

This looks a lot better to me (no complex function calls).

What happens in the module's top-level code where there is presumably no
func_cells array?  Do we simply have two different opcodes, one for use at
the global level and one for use in functions?

Skip


From guido@python.org  Mon Feb 11 14:42:54 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 09:42:54 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Mon, 11 Feb 2002 07:22:27 CST."
 <Pine.LNX.4.33.0202110716230.32321-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0202110716230.32321-100000@server1.lfw.org>
Message-ID: <200202111442.g1BEgsZ19628@pcp742651pcs.reston01.va.comcast.net>

> On Mon, 11 Feb 2002, Ka-Ping Yee wrote:
> >         This simplifies things further:
> >
> >             PyObject* cell_get(PyGlobalCell* c)
> >             {
> >                 return c->cell_cellptr->cell_objptr;
> >             }
> 
> I forgot to mention that this would also add loopback cellptrs for
> the two cells pointed to by __builtin__.abs and __builtin__.max.
> 
> But hey... in that case the cellptr is always two steps away from
> the object.  So why not just use PyObject**s instead of cells?
> 
>     dict -> ptr -> ptr -> object
> 
> (Or, if we want to maintain backward compatibility with existing
> dictionaries, let a cell be an object, so we can check its type,
> and have it contain just one pointer instead of two?)
> 
> Am i out to lunch?

I think so.  Think of max in the example used for your diagram (thanks
for that BTW!).  The first cell for it contains 3; the second cell for
it contains the built-in function 'max'.  A double dereference would
get the wrong value.

Or did I misread your suggestion?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Feb 11 15:02:38 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 10:02:38 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Mon, 11 Feb 2002 08:39:35 CST."
 <15463.55207.217946.237969@12-248-41-177.client.attbi.com>
References: <LNBBLJKPBEHFEDALKOLCGEAJNLAA.tim.one@comcast.net> <Pine.LNX.4.33.0202110646310.32170-100000@server1.lfw.org>
 <15463.55207.217946.237969@12-248-41-177.client.attbi.com>
Message-ID: <200202111502.g1BF2cD19707@pcp742651pcs.reston01.va.comcast.net>

> Very nice.  One case I would like to see covered is that of a global
> that is deleted.  Something like:
> 
> 
>     import eggs
> 
>     i = -2
>     max = 3
>     j = 4
> 
>     def foo(n):
>         y = abs(i) + max
>         return eggs.ham(y + n)
> 
>     del j
> 
> I presume there would still be an entry in spam's module dict with a
> NULL objptr.

Yes.

> The whole thing makes sense to me if it avoids the possible two
> PyDict_GetItem calls in the LOAD_GLOBAL opcode.  As I understand it, if
> accessed inside a function, LOAD_GLOBAL could be implemented something like
> this:
> 
>     case LOAD_GLOBAL:

Surely you meant LOAD_GLOBAL_CELL.

>         cell = func_cells[oparg];
>         if (cell->objptr) x = cell->objptr;
>         else x = cell->cellptr->objptr;
>         if (x == NULL) {
>             ... error recovery ...
>             break;
>         }
>         Py_INCREF(x);
>         continue;
> 
> This looks a lot better to me (no complex function calls).

Here's my version:

       case LOAD_GLOBAL_CELL:
           cell = func_cells[oparg];
           x = cell->objptr;
           if (x == NULL) {
               x = cell->cellptr->objptr;
               if (x == NULL) {
                   ... error recovery ...
                   break;
               }
           }
           Py_INCREF(x);
           continue;
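
For anyone who wants to poke at the lookup order, the pseudo-C above can
be mimicked in Python roughly as follows. This is a sketch under
assumptions: the Cell class, the _NULL sentinel and the names tuple are
invented for illustration, and the check for a missing cellptr is an
addition (the pseudo-C dereferences it unconditionally). The cells are
set up to match the spam.py example with the extra global j.

```python
_NULL = object()   # sentinel standing in for a C NULL pointer

class Cell:
    def __init__(self, objptr=_NULL, cellptr=None):
        self.objptr = objptr     # the module-level value, if bound
        self.cellptr = cellptr   # the builtin's cell, if there is one

def load_global_cell(func_cells, oparg, names):
    cell = func_cells[oparg]
    x = cell.objptr
    if x is _NULL:
        # Not bound in the module: try the builtin's cell.
        x = cell.cellptr.objptr if cell.cellptr is not None else _NULL
        if x is _NULL:
            raise NameError(names[oparg])
    return x

names = ("abs", "max", "j")
func_cells = [
    Cell(cellptr=Cell(abs)),   # abs: only the builtin exists
    Cell(3, Cell(max)),        # max = 3 shadows the builtin
    Cell(4),                   # j = 4, no builtin of that name
]

seen_abs = load_global_cell(func_cells, 0, names)
seen_max = load_global_cell(func_cells, 1, names)

func_cells[1].objptr = _NULL           # del max: builtin shows through
unshadowed = load_global_cell(func_cells, 1, names)

func_cells[2].objptr = _NULL           # del j: entry stays, objptr is NULL
try:
    load_global_cell(func_cells, 2, names)
    j_error = None
except NameError as exc:
    j_error = str(exc)

print(seen_abs, seen_max, unshadowed, j_error)
```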

> What happens in the module's top-level code where there is
> presumably no func_cells array?  Do we simply have two different
> opcodes, one for use at the global level and one for use in
> functions?

It could use LOAD_GLOBAL which should use PyMapping_GetItem on the
globals dict.  Or maybe even LOAD_NAME which should do the same.
But we could also somehow create a func_cells array (hm, it would have
to be called differently then I suppose).

(I've added these to the FAQs in PEP 280 too.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward@python.net  Mon Feb 11 15:03:38 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 10:03:38 -0500
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: <20020204094149.C31089@ActiveState.com>
References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com>
Message-ID: <20020211150338.GA20372@gerg.ca>

On 04 February 2002, Trent Mick volunteered to "do something"
about a standard logging module for Python:
> How about I try to have a PEP together within a week or two, and perhaps a
> working base implementation?

Well, it's been exactly 7 days.  Trent, are you halfway done yet?  ;-)

(Yes, I've been thinking that the solution to the Distutils' verbosity
problem lies somewhere down this road.)

        Greg
-- 
Greg Ward - Unix geek                                   gward@python.net
http://starship.python.net/~gward/
Laziness, Impatience, Hubris.


From guido@python.org  Mon Feb 11 15:31:33 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 10:31:33 -0500
Subject: [Python-Dev] Speeding up instance attribute access
In-Reply-To: Your message of "11 Feb 2002 13:15:59 +0100."
 <m37kpkgqo0.fsf@mira.informatik.hu-berlin.de>
References: <200202081918.g18JIKb03242@pcp742651pcs.reston01.va.comcast.net>
 <m37kpkgqo0.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200202111531.g1BFVXO19857@pcp742651pcs.reston01.va.comcast.net>

> Guido van Rossum <guido@python.org> writes:
> 
> > - We need fallbacks for various exceptional cases:
> 
> I think assignment to __class__ also needs to be
> considered. Therefore, it may be best if the member array is a
> separate block (not allocated with the instances).

Good point.  When __class__ is assigned and the new __class__ has a
different layout of the member array than the old, all instance
variables must be moved from the member array into a temporary dict,
and then from the temporary dict redistributed over the new member
array.  If you're lucky, the temporary dict is empty after that;
otherwise, it becomes the overflow dict.

> It might also be worthwhile to incorporate __slots__ access into that
> scheme, to avoid having to find the member descriptor in the class
> dictionary.

__slots__ are really allocated in the object, not in a separate memory
block.  But I agree it would be nice if they could somehow be integrated.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Mon Feb 11 15:31:53 2002
From: mwh@python.net (Michael Hudson)
Date: 11 Feb 2002 15:31:53 +0000
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: Greg Ward's message of "Mon, 11 Feb 2002 10:03:38 -0500"
References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com> <20020211150338.GA20372@gerg.ca>
Message-ID: <2mheoom3va.fsf@starship.python.net>

Greg Ward <gward@python.net> writes:

> On 04 February 2002, Trent Mick volunteered to "do something"
> about a standard logging module for Python:
> > How about I try to have a PEP together within a week or two, and perhaps a
> > working base implementation?
> 
> Well, it's been exactly 7 days.  Trent, are you halfway done yet?  ;-)
> 
> (Yes, I've been thinking that the solution to the Distutils' verbosity
> problem lies somewhere down this road.)

But I believe that 1.5.2 compatibility is still relevant for
distutils, so a logging module in 2.3 is not especially helpful,
unless one can come up with some scheme whereby the standalone
distutils packages can use a bundled logger and the 2.3 distutils use
the library one.

I had a go at implementing a very KISS approach to distutils logging
this morning and found what I was doing conflicted horribly with
distutils' current practice, so I stopped.

Cheers,
M.

-- 
  Also, remember to put the galaxy back when you've finished, or an
  angry mob of astronomers will come round and kneecap you with a
  small telescope for littering. 
       -- Simon Tatham, ucam.chat, from Owen Dunn's review of the year


From gward@python.net  Mon Feb 11 15:32:01 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 10:32:01 -0500
Subject: [Python-Dev] Proposed standard module: Optik
Message-ID: <20020211153201.GA20417@gerg.ca>

Hi all --

I would like to propose adding my Optik module to the standard library.
Optik is the all-singing, all-dancing, featureful, extensible,
well-documented option-parsing module that I have always wanted.  Now I
have it, and I like it -- so much that I'd like it to just always be
there whenever I fire up Python (2.3 or greater, of course).

Please take a look at http://optik.sourceforge.net/ for the whole story,
including all the documentation and code via CVS.

Note that Optik is currently distributed as a package with three
modules; it's a hair over 1000 lines of text (563 lines of code),
though, so could easily be munged into a single file if that's preferred
for the standard library.

Two good arguments for adding Optik to the standard library:

  * David Goodger wants to use it for the standard doc-processing tools

  * I'd like to use it in the Distutils (and ditch
    distutils.fancy_getopt)

        Greg
-- 
Greg Ward - Unix bigot                                  gward@python.net
http://starship.python.net/~gward/
No man is an island, but some of us are long peninsulas.


From guido@python.org  Mon Feb 11 15:42:34 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 10:42:34 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: Your message of "Mon, 11 Feb 2002 10:32:01 EST."
 <20020211153201.GA20417@gerg.ca>
References: <20020211153201.GA20417@gerg.ca>
Message-ID: <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net>

> I would like to propose adding my Optik module to the standard library.

No immediate objection, although there are some other fancy options
packages around, and IMO you have to explain why Optik is better.

Can we change the name?  Optik is nice for a standalone 3rd party
module/package but a bit too fanciful for a standard library module.
It could be a new function in getopt:

    from getopt import OptionParser
    [...]
    parser = OptionParser()

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon Feb 11 15:41:35 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Feb 2002 16:41:35 +0100
Subject: [Python-Dev] Accessing globals without dict lookup
References: <LNBBLJKPBEHFEDALKOLCIEBPNLAA.tim.one@comcast.net>
 <3C67A72F.B313E47@lemburg.com> <200202111431.g1BEVWJ19544@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C67E62F.5A0AD72F@lemburg.com>

Guido van Rossum wrote:
> 
> > Just a few quick questions before I go back into lurking mode:
> 
> Note that I've moved my design to a new PEP, PEP 280.  Tim has added
> his approach there too.  Please read it!!!

Thanks for the answers; looks like I can safely go back into
lurking mode :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From gward@python.net  Mon Feb 11 16:10:25 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 11:10:25 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net>
References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020211161025.GA20794@gerg.ca>

On 11 February 2002, Guido van Rossum said:
> No immediate objection, although there are some other fancy options
> packages around, and IMO you have to explain why Optik is better.

Well, here's what I like about Optik:

  * it ties short options and long options together, so once you
    define your options you never have to worry about the fact that
    -f and --file are the same

  * it's strongly typed: if you say option --foo expects an int,
    then Optik makes sure the user supplied a string that can be
    int()'ified, and supplies that int to you

  * it automatically generates full help based on snippets of
    help text you supply with each option

  * it has a wide range of "actions" -- ie. what to do with the
    value supplied with each option.  Eg. you can store that value
    in a variable, append it to a list, pass it to an arbitrary
    callback function, etc.

  * you can add new types and actions by subclassing -- how to
    do this is documented and tested

  * it's dead easy to implement simple, straightforward, GNU/POSIX-
    style command-line options, but using callbacks you can be as
    insanely flexible as you like

  * provides lots of mechanism and only a tiny bit of policy (namely,
    the --help and (optionally) --version options -- and you can
    trash that convention if you're determined to be anti-social)

Anyways, read the docs at optik.sourceforge.net for the whole deal.
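
The first four points above can be sketched roughly as follows.  This is
a minimal illustration, not Optik's own example: it assumes the
standard-library optparse module (the later stdlib descendant of Optik)
exposes the same OptionParser/add_option interface, and the option names
(--file, --count, --include) are made up for the demonstration.

```python
from optparse import OptionParser  # assumption: stdlib descendant of Optik


def build_parser():
    """Build a parser with hypothetical options to show the features listed."""
    parser = OptionParser(usage="%prog [options] [args]")
    # Short and long forms are declared together and share one destination.
    parser.add_option("-f", "--file", dest="filename", metavar="FILE",
                      help="read data from FILE")
    # Typed option: the supplied string is converted to int for us.
    parser.add_option("-n", "--count", type="int", dest="count", default=1,
                      help="repeat COUNT times")
    # A different action: collect repeated values into a list.
    parser.add_option("-I", "--include", action="append", dest="includes",
                      help="add DIR to the search path")
    return parser


parser = build_parser()
# An explicit argument list is passed so the sketch does not read sys.argv.
options, args = parser.parse_args(
    ["-f", "in.txt", "--count", "3", "-I", "a", "--include=b", "rest"])
print(options.filename, options.count, options.includes, args)
```

Running the script with --help would print help text assembled from the
help= snippets given to each add_option() call.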

> Can we change the name?  Optik is nice for a standalone 3rd party
> module/package but a bit too fanciful for a standard library module.

Sure, no problem.

> It could be a new function in getopt:
> 
>     from getopt import OptionParser
>     [...]
>     parser = OptionParser()

I guess that's OK if we're agreed that Optik is the be-all, end-all
option-parsing tool.  (I happen to think so, but I'd like to get
opinions from a few other python-dev'ers before I let this go to my
head.)  I'm pretty cool to names like "super_getopt" or "fancy_getopt",
despite having perpetrated precisely the latter in the Distutils.  ;-(

        Greg
-- 
Greg Ward - geek-at-large                               gward@python.net
http://starship.python.net/~gward/
Know thyself.  If you need help, call the CIA.


From gward@python.net  Mon Feb 11 16:13:54 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 11:13:54 -0500
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: <2mheoom3va.fsf@starship.python.net>
References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com> <20020211150338.GA20372@gerg.ca> <2mheoom3va.fsf@starship.python.net>
Message-ID: <20020211161354.GB20794@gerg.ca>

On 11 February 2002, Michael Hudson said:
> But I believe that 1.5.2 compatibility is still relevant for
> distutils

I'm still catching up on distutils-sig traffic from the past year, so I
don't want to overcommit myself here... but I've been thinking that we
(I) should do one last Distutils release that is 1.5.2 compatible, and
then we can decide if future Distutils releases will stick to
2.0-compatibility, or are allowed to require the version of Python that
they go with.

However, please *don't* everyone jump in and start a thread about this
now.  I'll take it up on distutils-sig when I've caught up.

> I had a go at implementing a very KISS approach to distutils logging
> this morning and found what I was doing conflicted horribly with
> distutils' current practice, so I stopped.

Probably because the Distutils current practice is an ill-thought-out
mishmash.  That'll have to be fixed first, I suspect.  Sorry.  ;-(

        Greg
-- 
Greg Ward - Unix bigot                                  gward@python.net
http://starship.python.net/~gward/
NOBODY expects the Spanish Inquisition!


From paul@prescod.net  Mon Feb 11 16:16:57 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 11 Feb 2002 08:16:57 -0800
Subject: [Python-Dev] Proposed standard module: Optik
References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C67EE79.9C39087F@prescod.net>

Guido van Rossum wrote:
> 
> > I would like to propose adding my Optik module to the standard library.
> 
> No immediate objection, although there are some other fancy options
> packages around, and IMO you have to explain why Optik is better.

Maybe we should turn the Optik documentation into a PEP (or at least
make a PEP with a pointer to it) so that people with competitive
solutions can either suggest improvements or claim that their solution
is a better starting point.

 Paul Prescod


From martin@v.loewis.de  Mon Feb 11 16:26:00 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 11 Feb 2002 17:26:00 +0100
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: <20020211161025.GA20794@gerg.ca>
References: <20020211153201.GA20417@gerg.ca>
 <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net>
 <20020211161025.GA20794@gerg.ca>
Message-ID: <m3vgd4c7dz.fsf@mira.informatik.hu-berlin.de>

Greg Ward <gward@python.net> writes:

> > It could be a new function in getopt:
> > 
> >     from getopt import OptionParser
> >     [...]
> >     parser = OptionParser()
> 
> I guess that's OK if we're agreed that Optik is the be-all, end-all
> option-parsing tool.  (I happen to think so, but I'd like to get
> opinions from a few other python-dev'ers before I let this go to my
> head.)

I'd also be in favour of providing option parsing through getopt
only. If getopt is not enough, extend it (in moderate ways, adding
customization mechanisms rather than alternatives, etc). If that
involves incorporating code from Optik, fine. However, I don't think
the standard library should have two modules that do essentially the
same thing; such scenarios will raise the question of whether one is
better than the other and which of them is maintained.

Regards,
Martin


From guido@python.org  Mon Feb 11 16:28:59 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 11:28:59 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Mon, 11 Feb 2002 07:14:09 CST."
 <Pine.LNX.4.33.0202110646310.32170-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0202110646310.32170-100000@server1.lfw.org>
Message-ID: <200202111628.g1BGSxF20159@pcp742651pcs.reston01.va.comcast.net>

> All right -- i have attempted to diagram a slightly more interesting
> example, using my interpretation of Guido's scheme.
[...]
> How does it look?  Guido, is it anything like what you have in mind?

Yes, exactly.  I've added pointers to your images to PEP 280.  Maybe
you can also create a diagram for Tim's "more aggressive" scheme?

> A couple of observations so far:
> 
>     1.  There are going to be lots of global-cell objects.
>         Perhaps they should get their own allocator and free list.

Yes.

>     2.  Maybe we don't have to change the module dict type.
>         We could just use regular dictionaries, with the special
>         case that if retrieving the value yields a cell object,
>         we then do the objptr/cellptr dance to find the value.
>         (The cell objects have to live outside the dictionaries
>         anyway, since we don't want to lose them on a rehashing.)

And who would do the special dance?  If PyDict_GetItem, it would add
an extra test to code whose speed is critical in lots of other cases
(plus it would be impossible to create a dictionary containing cells
without having unwanted special magic).  If in a wrapper, then
<module>.__dict__[<key>] would return a surprise cell instead of a
value.

>     3.  Could we change the name, please?  It would really suck
>         to have two kinds of things called "cell objects" in
>         the Python core.

Agreed.  Or we could add a cellptr to the existing cell objects; or
maybe a scheme could be devised that wouldn't need a cell to have a
cellptr, and then we could use the existing cell objects unchanged.

>     4.  I recall Tim asked something about the cellptr-points-to-itself
>         trick.  Here's what i make of it -- it saves a branch: instead of
> 
>             PyObject* cell_get(PyGlobalCell* c)
>             {
>                 if (c->cell_objptr) return c->cell_objptr;
>                 if (c->cell_cellptr) return c->cell_cellptr->cell_objptr;
>                 return NULL;
>             }
> 
>         it's
> 
>             PyObject* cell_get(PyGlobalCell* c)
>             {
>                 if (c->cell_objptr) return c->cell_objptr;
>                 return c->cell_cellptr->cell_objptr;
>             }

That's what my second "additional idea" in PEP 280 proposes:

|     - Make c.cellptr equal to c when a cell is created, so that
|       LOAD_GLOBAL_CELL can always dereference c.cellptr without a NULL
|       check.

>         This makes no difference when c->cell_objptr is filled,
>         but it saves one check when c->cell_objptr is NULL in
>         a non-shadowed variable (e.g. after "del x").  I believe
>         that's the only case in which it matters, and it seems
>         fairly rare to me that a module function will attempt to
>         access a variable that's been deleted from the module.

Agreed.  When x is not defined, it doesn't matter how much extra code
we execute as long as we don't dereference NULL. :-)

>         Because the module can't know what new variables might
>         be introduced into __builtin__ after the module has been
>         loaded, a failed lookup must finally fall back to a lookup
>         in __builtin__.  Given that, it seems like a good idea to
>         set c->cell_cellptr = c when c->cell_objptr is set (for
>         both shadowed and non-shadowed variables).  In my picture,
>         this would change the cell that spam.max points to, so
>         that it points to itself instead of __builtin__.max's cell.
>         That is:
> 
>             void cell_set(PyGlobalCell* c, PyObject* v)
>             {
>                 c->cell_objptr = v;
>                 c->cell_cellptr = c;
>             }

But now you'd have to work harder when you delete the global again
(i.e. in cell_delete()); the shadowed built-in must be restored.

>         This simplifies things further:
> 
>             PyObject* cell_get(PyGlobalCell* c)
>             {
>                 return c->cell_cellptr->cell_objptr;
>             }
> 
>         This buys us no branches, which might be a really good
>         thing on today's speculative execution styles.

Good idea!  (And before I *did* misread your followup, because I
hadn't fully digested this msg.  I think you're right that we might be
able to use just a PyObject **; but I haven't fully digested Tim's
more aggressive idea.)
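
The get/set/delete interplay discussed above can be modelled in a few
lines of Python.  This is only a toy model under stated assumptions, not
the PEP 280 implementation: the class name GlobalCell, the _NULL
sentinel and the extra builtin_cell field (used to restore the shadowed
built-in on delete) are hypothetical and were chosen for illustration.

```python
_NULL = object()  # stands in for a C NULL pointer


class GlobalCell:
    """Toy model of a global cell with an objptr and a cellptr."""

    def __init__(self, builtin_cell=None):
        self.objptr = _NULL
        # Hypothetical field: remembers the shadowed built-in's cell so
        # that cell_delete() can restore the link later.
        self.builtin_cell = builtin_cell
        # Point at the built-in's cell if there is one, else at ourselves,
        # so cellptr can always be dereferenced without a NULL check.
        self.cellptr = builtin_cell if builtin_cell is not None else self


def cell_get(cell):
    # One unconditional dereference; the NULL test is only the error path.
    value = cell.cellptr.objptr
    if value is _NULL:
        raise NameError("name is not defined")
    return value


def cell_set(cell, value):
    cell.objptr = value
    cell.cellptr = cell  # the module-level binding now shadows any built-in


def cell_delete(cell):
    # The extra work mentioned above: restore the shadowed built-in.
    cell.objptr = _NULL
    cell.cellptr = cell.builtin_cell if cell.builtin_cell is not None else cell


builtin_max = GlobalCell()
cell_set(builtin_max, max)
spam_max = GlobalCell(builtin_max)   # spam.max initially defers to __builtin__
print(cell_get(spam_max) is max)
cell_set(spam_max, "shadow")
print(cell_get(spam_max))
cell_delete(spam_max)
print(cell_get(spam_max) is max)
```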

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Feb 11 16:35:00 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 11:35:00 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: Your message of "Mon, 11 Feb 2002 08:16:57 PST."
 <3C67EE79.9C39087F@prescod.net>
References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net>
 <3C67EE79.9C39087F@prescod.net>
Message-ID: <200202111635.g1BGZ0R20230@pcp742651pcs.reston01.va.comcast.net>

> Maybe we should turn the Optik documentation into a PEP (or at least
> make a PEP with a pointer to it) so that people with competitive
> solutions can either suggest improvements or claim that their solution
> is a better starting point.

IMO we don't need a PEP, but we do need to solicit feedback from
people with competitive solutions.  Can you post something to c.l.py
and c.l.py.announce?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Feb 11 16:40:35 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 11:40:35 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: Your message of "11 Feb 2002 17:26:00 +0100."
 <m3vgd4c7dz.fsf@mira.informatik.hu-berlin.de>
References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <20020211161025.GA20794@gerg.ca>
 <m3vgd4c7dz.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200202111640.g1BGeZT20267@pcp742651pcs.reston01.va.comcast.net>

> I'd also be in favour of providing option parsing through getopt
> only. If getopt is not enough, extend it (in moderate ways, adding
> customization mechanisms rather than alternatives, etc). If that
> involves incorporating code from Optik, fine. However, I don't think
> the standard library should have two modules that do essentially the
> same thing; such scenarios will raise the question of whether one is
> better than the other and which of them is maintained.

I think Optik provides one key idea that makes it better: an options
parser object that can be invoked multiple times and each time returns
a new options object whose attributes are variables corresponding to
various options.

I'd be happy to say that the old getopt.getopt() interface will be
deprecated.
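
A small sketch of that key idea, assuming the standard-library optparse
module (which descends from Optik) behaves the way Optik does here; the
option names are invented for the example.  One parser object is built
once and invoked twice, and each call hands back an independent options
object.

```python
from optparse import OptionParser  # assumption: same interface as Optik

parser = OptionParser()
parser.add_option("-v", "--verbose", action="store_true", dest="verbose",
                  default=False)
parser.add_option("-o", "--output", dest="output", default="a.out")

# Each parse_args() call starts from the declared defaults and returns a
# fresh options object; the first result is not modified by the second call.
first, _ = parser.parse_args(["-v"])
second, _ = parser.parse_args(["-o", "b.out"])

print(first.verbose, first.output)
print(second.verbose, second.output)
```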

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paul@prescod.net  Mon Feb 11 16:36:31 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 11 Feb 2002 08:36:31 -0800
Subject: [Python-Dev] Proposed standard module: Optik
References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net>
 <3C67EE79.9C39087F@prescod.net> <200202111635.g1BGZ0R20230@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C67F30F.76C4A67A@prescod.net>

Guido van Rossum wrote:
> 
> > Maybe we should turn the Optik documentation into a PEP (or at least
> > make a PEP with a pointer to it) so that people with competitive
> > solutions can either suggest improvements or claim that their solution
> > is a better starting point.
> 
> IMO we don't need a PEP, but we do need to solicit feedback from
> people with competitive solutions.  Can you post something to c.l.py
> and c.l.py.announce?

I am happy to do an announcement but I feel like there needs to be a
place to redirect conversation. Should we set up a mailing list? Or do
all interested people want to join comp.lang.python and perhaps use a
subject prefix for filtering? "OPT: ..."

 Paul Prescod


From fdrake@acm.org  Mon Feb 11 16:41:51 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 11 Feb 2002 11:41:51 -0500
Subject: [Python-Dev] Python 2.2 group missing on SF Patches
In-Reply-To: <3C64573B.3412540F@metaslash.com>
References: <3C64573B.3412540F@metaslash.com>
Message-ID: <15463.62543.247745.715347@grendel.zope.com>

Neal Norwitz writes:
 > There is no 2.2 (or 2.2.1) choice under Group when submitting a patch
 > on Source Forge.

I've added 2.2.x for the release22-maint branch.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From guido@python.org  Mon Feb 11 16:47:07 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 11:47:07 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: Your message of "Mon, 11 Feb 2002 08:36:31 PST."
 <3C67F30F.76C4A67A@prescod.net>
References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <3C67EE79.9C39087F@prescod.net> <200202111635.g1BGZ0R20230@pcp742651pcs.reston01.va.comcast.net>
 <3C67F30F.76C4A67A@prescod.net>
Message-ID: <200202111647.g1BGl7C20309@pcp742651pcs.reston01.va.comcast.net>

> I am happy to do an announcement but I feel like there needs to be a
> place to redirect conversation. Should we set up a mailing list? Or do
> all interested people want to join comp.lang.python and perhaps use a
> subject prefix for filtering? "OPT: ..."

They can post to python-dev (and even subscribe -- it's open these
days) or someone can summarize.  I don't expect a huge discussion.
Please make it clear in the announcement that followups in c.l.py will
be ignored -- they must send at least one email to python-dev to make
us aware that they're competing.  Ideally, it should compare their
solution to Greg's list of key features of Optik, which he posted
here:

http://mail.python.org/pipermail/python-dev/2002-February/019937.html

Please include that link in your announcement.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Mon Feb 11 17:28:11 2002
From: mwh@python.net (Michael Hudson)
Date: 11 Feb 2002 17:28:11 +0000
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: Greg Ward's message of "Mon, 11 Feb 2002 11:13:54 -0500"
References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com> <20020211150338.GA20372@gerg.ca> <2mheoom3va.fsf@starship.python.net> <20020211161354.GB20794@gerg.ca>
Message-ID: <2mlme0j5ck.fsf@starship.python.net>

Greg Ward <gward@python.net> writes:

> On 11 February 2002, Michael Hudson said:
> > But I believe that 1.5.2 compatibility is still relevant for
> > distutils
> 
> I'm still catching up on distutils-sig traffic from the past year, so I
> don't want to overcommit myself here... but I've been thinking that we
> (I) should do one last Distutils release that is 1.5.2 compatible, and
> then we can decide if future Distutils releases will stick to
> 2.0-compatibility, or are allowed to require the version of Python that
> they go with.

I'm not sure that idea will get widespread support.

> However, please *don't* everyone jump in and start a thread about this
> now.  I'll take it up on distutils-sig when I've caught up.

But I'll wait until you get caught up.

> > I had a go at implementing a very KISS approach to distutils logging
> > this morning and found what I was doing conflicted horribly with
> > distutils' current practice, so I stopped.
> 
> Probably because the Distutils current practice is an ill-thought-out
> mishmash.  That'll have to be fixed first, I suspect.  Sorry.  ;-(

It was more to do with options processing (the fact that basically
speaking all options translate to attributes on some object) than
logging.  I suspect I could have used Optik more easily...

I'm also not sure how politic it would be to take an axe to the
interfaces of the various *util modules.

Cheers,
M.

-- 
  You sound surprised.  We're talking about a government department
  here - they have procedures, not intelligence.
                                            -- Ben Hutchings, cam.misc


From trentm@ActiveState.com  Mon Feb 11 17:54:41 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Mon, 11 Feb 2002 09:54:41 -0800
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: <20020211150338.GA20372@gerg.ca>; from gward@python.net on Mon, Feb 11, 2002 at 10:03:38AM -0500
References: <200202040235.g142ZPG16987@pcp742651pcs.reston01.va.comcast.net> <20020204094149.C31089@ActiveState.com> <20020211150338.GA20372@gerg.ca>
Message-ID: <20020211095441.B3536@ActiveState.com>

On Mon, Feb 11, 2002 at 10:03:38AM -0500, Greg Ward wrote:
> On 04 February 2002, Trent Mick volunteered to "do something"
> about a standard logging module for Python:
> > How about I try to have a PEP together within a week or two, and perhaps a
> > working base implementation?
> 
> Well, it's been exactly 7 days.  Trent, are you halfway done yet?  ;-)

I have my thoughts together. I'll write up and post tonight. Meanwhile I have
to get some work done for my employer. :)

Regarding Distutils for Python 1.5.2 usage: the potential logging support
*could* be back ported and included in the distutils package that gets put
together for Python 1.5.2.

Trent

-- 
Trent Mick
TrentM@ActiveState.com


From mal@lemburg.com  Mon Feb 11 18:08:32 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Feb 2002 19:08:32 +0100
Subject: [Python-Dev] proposal: add basic time type tothestandardlibrary
References: <5.1.0.14.2.20020209011904.02cc7540@mercury-1.cbu.edu>
 <5.1.0.14.2.20020210091537.02026d38@mercury-1.cbu.edu> <5.1.0.14.2.20020210124257.01eb3820@mercury-1.cbu.edu>
Message-ID: <3C6808A0.9DCA4CA3@lemburg.com>

> I have no desire to compete with the mxDateTime implementation. I want to
> look at some of the solutions out there and take the best from everyone and
> provide a module that will suit 95-100% of the people. For several reasons,
> which I tried to point out in my mails, mxDateTime or Zope's Datetime in
> its current states is not suitable.

That's a strange conclusion since both of these modules have been
around for quite some time (mxDateTime was started in Dec. 1997) 
and obviously *are* quite suitable for a large share of Python's 
users :-)

BTW, mxDateTime can do quite a bit in terms of i18n:

>>> from mx.DateTime import *
>>> DateTimeFrom('11. Februar 2002')
<DateTime object for '2002-02-11 00:00:00.00' at 816cc48>
>>> DateTimeFrom('February, 11 2002')
<DateTime object for '2002-02-11 00:00:00.00' at 81307c8>

>>> from mx.DateTime import Locale
>>> Locale.French.str(now())
'lundi 11 f\xe9vrier 2002 19:07:12'
>>> Locale.Spanish.str(now())
'lunes 11 febrero 2002 19:07:19'
>>> Locale.German.str(now())
'Montag 11 Februar 2002 19:07:25'

(hmm, I ought to insert some extra punctuation...)

Nevermind,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From Gerson.Kurz@t-online.de  Mon Feb 11 18:07:55 2002
From: Gerson.Kurz@t-online.de (Gerson Kurz)
Date: Mon, 11 Feb 2002 19:07:55 +0100
Subject: [Python-Dev] RE: RFC: Option Parsing Libraries
Message-ID: <CLEFLIBFLLLHJNOKPNGFKEDNCDAA.Gerson.Kurz@t-online.de>

I will get beat for this, but: can it be optionally non-case-sensitive? I
know, I know, in time-honoured unix-tradition a commandline should be
dangerous and unforgiving in use, but still, please?

Also, I've just scanned the specs and didn't find some
"rest-of-the-commandline-whatever-that-is" option. As in:

filename options file1 file2 ... filen

Other than that, it looks pretty neat.





From mal@lemburg.com  Mon Feb 11 18:29:04 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Feb 2002 19:29:04 +0100
Subject: [Python-Dev] -U flag
References: <LNBBLJKPBEHFEDALKOLCGEOANKAA.tim.one@comcast.net>
 <3C6675C0.BE7B42FE@lemburg.com> <15462.55616.207319.531285@12-248-41-177.client.attbi.com>
Message-ID: <3C680D70.D30EB38D@lemburg.com>

Skip Montanaro wrote:
> 
> (I think we've had this discussion before...)
> 
>     MAL> Wait... the -U option was added in order to be able to see how well
>     MAL> the 8-bit string / Unicode integration works. It's a known fact that
>     MAL> the Python standard lib is not Unicode compatible yet and that's
>     MAL> exactly what the -U option allows you to test (in a very simple
>     MAL> way).
> 
> If -U is really just a "test" flag, I don't think it should show up in
> "python -h" output.

If no one objects, I'll remove the flag from the -h output. OK?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From guido@python.org  Mon Feb 11 18:40:51 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 13:40:51 -0500
Subject: [Python-Dev] -U flag
In-Reply-To: Your message of "Mon, 11 Feb 2002 19:29:04 +0100."
 <3C680D70.D30EB38D@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCGEOANKAA.tim.one@comcast.net> <3C6675C0.BE7B42FE@lemburg.com> <15462.55616.207319.531285@12-248-41-177.client.attbi.com>
 <3C680D70.D30EB38D@lemburg.com>
Message-ID: <200202111840.g1BIeq721259@pcp742651pcs.reston01.va.comcast.net>

> If no one objects, I'll remove the flag from the -h output. OK?

+1

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon Feb 11 18:48:21 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Feb 2002 19:48:21 +0100
Subject: [Python-Dev] -U flag
References: <LNBBLJKPBEHFEDALKOLCGEOANKAA.tim.one@comcast.net> <3C6675C0.BE7B42FE@lemburg.com> <15462.55616.207319.531285@12-248-41-177.client.attbi.com>
 <3C680D70.D30EB38D@lemburg.com> <200202111840.g1BIeq721259@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C6811F5.93E1607B@lemburg.com>

Guido van Rossum wrote:
> 
> > If no one objects, I'll remove the flag from the -h output. OK?
> 
> +1

Done.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From marklists@mceahern.com  Mon Feb 11 19:16:52 2002
From: marklists@mceahern.com (Mark McEahern)
Date: Mon, 11 Feb 2002 11:16:52 -0800
Subject: [Python-Dev] RE: Option Parsing Libraries
In-Reply-To: <3C67F75D.D1EF37DA@prescod.net>
Message-ID: <NCBBLFCOHHDIKCAFGCFBIEDHKCAA.marklists@mceahern.com>

[Paul Prescod]
> If you have a competitive library, or suggestions for changes to Optik,
> please forward your comments to python-dev mailing list
> (python-dev@python.org).

I love optik.  We use it for all of our option parsing.

I have one feature request related to error handling.  I've attached sample
code below that shows a common thing I end up doing:  raising an error if a
required option is missing.  I guess it's not really an option, then, is it?
<wink>  Anyway, I searched optik's documentation for some way to "inspect"
the options collection itself for the original information used when
creating the option.  In the code below, you'll notice the requiredVar
method takes a description parameter.  It would be nice to be able to do
something like this instead:

	if var is None:
		parser.error("Missing: %s" % options.var.description)

In fact, it would seem that this itself is so common it would be "built-in"
to optik itself.  So that all I have to do is declare an option as required
when I add it with add_option?

Thanks,

// mark

#! /usr/bin/env python
# testError.py

from optik import OptionParser

def requiredVar(parser, options, var, description):
    """Raise a parser error if var is None."""
    if var is None:
        # Here's where it'd be nice to have access to the attributes of the
        # options; at the very least, so I could say which option is missing
        # without having to pass in the description.
        parser.error("Missing: %s" % description)

def parseCommandLine():
    """Parse the command line options and return (options, args)."""

    usage = """usage: %prog [options]
Testing optik's error handling.
"""
    parser = OptionParser(usage)
    parser.add_option("-f", "--file", type="string", dest="filename",
                      metavar="FILE", help="read data from FILE",
                      default=None)

    options, args = parser.parse_args()

    requiredVar(parser, options, options.filename, "filename")

    # Return the parsed values, as the docstring promises and main() expects.
    return options, args

def main():
    options, args = parseCommandLine()

if __name__=="__main__":
    main()
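
One way the requested check might look, sketched against the
standard-library optparse module rather than Optik itself (an
assumption: the attribute names option_list, dest and get_opt_string()
are optparse's, and Optik may expose option metadata differently).  The
helper check_required and its required_dests argument are hypothetical;
neither library is claimed to provide them.

```python
from optparse import OptionParser  # assumption: stand-in for optik.OptionParser


def check_required(parser, options, required_dests):
    """Call parser.error() for the first required option left unset.

    Walks the parser's own option list, so the error message can name the
    option string without the caller passing a description.
    """
    for option in parser.option_list:
        if option.dest in required_dests and getattr(options, option.dest) is None:
            # parser.error() prints the usage message and exits with status 2.
            parser.error("Missing required option: %s" % option.get_opt_string())


parser = OptionParser(usage="%prog [options]")
parser.add_option("-f", "--file", type="string", dest="filename",
                  metavar="FILE", help="read data from FILE")

# With the option supplied, the check passes silently.
options, args = parser.parse_args(["--file", "data.txt"])
check_required(parser, options, ["filename"])
print(options.filename)
```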



From gward@python.net  Mon Feb 11 19:29:04 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 14:29:04 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: <200202111647.g1BGl7C20309@pcp742651pcs.reston01.va.comcast.net>
References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <3C67EE79.9C39087F@prescod.net> <200202111635.g1BGZ0R20230@pcp742651pcs.reston01.va.comcast.net> <3C67F30F.76C4A67A@prescod.net> <200202111647.g1BGl7C20309@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020211192904.GA22667@gerg.ca>

On 11 February 2002, Guido van Rossum said:
> They can post to python-dev (and even subscribe -- it's open these
> days) or someone can summarize.  I don't expect a huge discussion.
> Please make it clear in the announcement that followups in c.l.py will
> be ignored -- they must send at least one email to python-dev to make
> us aware that they're competing.

A good starting point for modules that compete with Optik can be found
in the "User Interfaces" section of the Vaults of Parnassus:

  http://www.vex.net/parnassus/apyllo.py/808292924

The contenders are:

  Cmdline
  Getargs
  GetPotPython
  Optik
  Options
  pypopt

...wow!  I must confess, it didn't occur to me to check Parnassus before
writing Optik; I just arrogantly assumed that I would get it right.

It would be interesting to hear from the authors *and users* of the
above modules.  If you're curious what Optik's users have said, see the
optik-users list archive:

  http://www.geocrawler.com/redir-sf.php3?list=optik-users

        Greg
-- 
Greg Ward - nerd                                        gward@python.net
http://starship.python.net/~gward/
If you're not part of the solution, you're part of the precipitate.


From gward@python.net  Mon Feb 11 19:40:12 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 14:40:12 -0500
Subject: [Python-Dev] RE: RFC: Option Parsing Libraries
In-Reply-To: <CLEFLIBFLLLHJNOKPNGFKEDNCDAA.Gerson.Kurz@t-online.de>
References: <CLEFLIBFLLLHJNOKPNGFKEDNCDAA.Gerson.Kurz@t-online.de>
Message-ID: <20020211194012.GB22667@gerg.ca>

On 11 February 2002, Gerson Kurz said:
> I will get beat for this, but: can it be optionally non-case-sensitive? I
> know, I know, in time-honoured unix-tradition a commandline should be
> dangerous and unforgiving in use, but still, please?

Interesting idea; should be trivial given a case-insensitive dictionary.
And hasn't such a beast been bandied about as an example of subclassing
built-in types with Python 2.2?
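
A minimal sketch of such a case-insensitive dictionary, written as a
dict subclass.  The class name and the choice to fold keys to lower case
are illustrative assumptions, not anything taken from Optik:

```python
class CaseInsensitiveDict(dict):
    """dict subclass that folds string keys to lower case (hypothetical helper)."""

    @staticmethod
    def _fold(key):
        # Only strings are folded; other hashable keys pass through unchanged.
        return key.lower() if isinstance(key, str) else key

    def __setitem__(self, key, value):
        super().__setitem__(self._fold(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self._fold(key))

    def __contains__(self, key):
        return super().__contains__(self._fold(key))

    def get(self, key, default=None):
        return super().get(self._fold(key), default)


# Start empty and fill through __setitem__: the dict constructor and
# update() bypass the overridden method, so they would not fold keys.
opts = CaseInsensitiveDict()
opts["--File"] = "filename"
```

An option parser that kept its option table in a mapping like this would
treat "--file", "--File" and "--FILE" as the same option.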

Anyways, that's an Optik feature request, and belongs on
optik-users@lists.sourceforge.net.  If you're serious, take it up there.

> Also, I've just scanned the specs and didn't find some
> "rest-of-the-commandline-whatever-that-is" option. As in:
> 
> filename options file1 file2 ... filen

When you do this:

  parser = OptionParser(...)
  (options, args) = parser.parse_args()

then args is the list of positional arguments left over after parsing
options.
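
A small runnable illustration, using optparse (the standard-library
descendant of Optik) so that it needs no extra install.  The option
names and the argv list are made up for the example:

```python
from optparse import OptionParser

parser = OptionParser(usage="usage: %prog [options] file1 file2 ...")
parser.add_option("-v", "--verbose", action="store_true",
                  dest="verbose", default=False)
parser.add_option("-o", "--output", dest="output", metavar="FILE")

# Pass an explicit argument list instead of relying on sys.argv[1:].
argv = ["-v", "-o", "out.txt", "file1", "file2", "file3"]
options, args = parser.parse_args(argv)

# options carries the parsed option values; args is whatever is left over.
print(options.verbose, options.output, args)
```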

But again, that's a question about Optik, and belongs (for now) on the
optik-users list.

        Greg
-- 
Greg Ward - just another Python hacker                  gward@python.net
http://starship.python.net/~gward/
Paranoia is simply an optimistic outlook on life.


From oren-py-d@hishome.net  Mon Feb 11 20:09:54 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 11 Feb 2002 22:09:54 +0200
Subject: [Python-Dev] patch: speed up name access by up to 80%
Message-ID: <20020211220954.A12061@hishome.net>

Problem: Python name lookup in dictionaries is relatively slow.

Possible solutions:

1. Find ways to bypass dictionary lookup.
2. Optimize the hell out of dictionary lookup.

Examples of approach #1 are the existing fastlocals mechanism, PEP 266,
PEP 267 and the recent proposal by GvR for speeding up instance attribute
access. These proposals all face major difficulties resulting from the
dynamic nature of Python's namespace and require tricky code-analysis
techniques to check for assignments that may change the visible namespace.

I have chosen to try #2. The biggest difficulty with this approach is that
dictionaries are already very well optimized :-) But I have found that
Python dictionaries are mostly optimized for general-purpose use, not for
use as namespace for running code. There are some characteristics unique to 
namespace access which have not been used for optimization:

* Lookup keys are always interned strings.
* The local/global/builtin fallback means that most accesses fail.

The patch adds the function PyDict_GetItem_Fast to dictobject. This
function is equivalent to PyDict_GetItem but is much faster for lookup
using interned string keys and for lookups with a negative result.
LOAD_NAME and LOAD_GLOBAL have been converted to use this function.

Here are the timings for 200,000,000 accesses to fastlocals, locals,
globals and builtins for Python 2.2 with and without the fastnames patch:

                2.2       2.2+fastnames
---------------------------------------
builtin.py      68.540s   37.900s
global.py       54.570s   34.020s
local.py        41.210s   32.780s
fastlocal.py    24.530s   24.540s

Fastlocals are still significantly faster than locals, but not by such a 
wide margin. You can notice that the speed differences between locals, 
globals and builtins are almost gone.
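
As a rough cross-check of the "up to 80%" in the subject line, the
table can be turned into relative speedups.  This assumes "80% faster"
means the ratio of old time to new time minus one; the helper name is
made up for the example:

```python
# (seconds without patch, seconds with patch), copied from the table above
timings = {
    "builtin.py":   (68.540, 37.900),
    "global.py":    (54.570, 34.020),
    "local.py":     (41.210, 32.780),
    "fastlocal.py": (24.530, 24.540),
}


def speedup_percent(before, after):
    """How much more work per second the patched interpreter does, in percent."""
    return (before / after - 1.0) * 100.0


for name, (before, after) in timings.items():
    print("%-14s %6.1f%%" % (name, speedup_percent(before, after)))
```

On that reading the builtin case comes out at a little over 80%, the
global case at about 60%, the local case at about 26%, and fastlocals
are unchanged to within measurement noise.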

The machine is a Pentium III at 866MHz running linux. Both interpreters 
were compiled with "./configure ; make". The test code is at the bottom of
this message. You will not see any effect on the results of pybench
because it uses only fastlocals inside the test loops. Real code that uses
a lot of globals and builtins should see a significant improvement.

Get the patch:

http://www.tothink.com/python/fastnames/fastnames.patch

The optimization techniques used:

* Inline code
PyDict_GetItem_Fast is actually an inline macro. If the item is stored at
the first hash location it will be returned without any function calls. It
also requires that the key used for the entry is the same interned string
as the key. This fast inline version is possible because the first
argument is known to be a valid dictionary and the second argument is
known to be a valid interned string with a valid cached hash.

* Key comparison by pointer
If the inline macro fails PyDict_GetItem_Fast2 is called. This function
searches for an entry with a key identical to the requested key. This
search is faster than lookdict or lookdict_string because there are no 
expensive calls to external compare-by-value functions. Another small
speedup is gained by not checking for free slots since this function is
never used for setting items. If this search fails, the dictionary's 
ma_lookup function is called.

* Interning of entry keys
One reason the quick search could fail is because the entry was set using
direct access to __dict__ instead of standard name assignment and
therefore the entry key is not interned. In this case the entry key is
replaced with the interned lookup key. The next fast search for the same
key will succeed. There is a very good chance that it will be handled by
the inline macro.

* Negative entries
In name lookup most accesses fail. In order to speed them up negative
entries can mark a name as "positively not there", usually detected by the
macro without requiring any function calls. Negative entries have the
interned key as their me_key and me_value is NULL. Negative entries occupy
real space in the hash table and cannot be reused as empty slots. This
optimization technique is not practical for general purpose dictionaries
because some types of code would quickly overload the dictionary with
many negative entries. For name lookup the number of negative entries is
bound by the number of global and builtin names referenced by the code
that uses the dictionary as a namespace.

This new type of slot has a surprisingly small impact on the rest of
dictobject.c. Only one assertion had to be removed to accommodate it. All
other code treats it as either an active entry (key !=NULL, !=dummy) or a 
deleted entry (value == NULL, key != NULL) and just happens to do the Right 
Thing for each case.

If an entry with the same key as a negative entry is subsequently inserted 
into the dictionary it will overwrite the negative entry and be reflected 
immediately in the namespace. There is no caching and therefore no cache 
coherency issues.
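
The idea of negative entries can be modelled in a few lines of Python.
This is only a toy model of the behaviour described above, not the C
code in the patch: the Namespace class, the sentinels and the
slow_lookups counter are all invented for illustration.

```python
_MISSING = object()    # "no entry at all" in this toy table
_NEGATIVE = object()   # stands in for an entry whose me_value is NULL


class Namespace:
    """Toy namespace that remembers failed lookups as negative entries."""

    def __init__(self, **names):
        self.table = dict(names)
        self.slow_lookups = 0      # how often the full search had to run

    def lookup(self, name):
        """Return (found, value)."""
        entry = self.table.get(name, _MISSING)
        if entry is _NEGATIVE:
            return False, None     # fast miss: "positively not there"
        if entry is not _MISSING:
            return True, entry     # ordinary hit
        self.slow_lookups += 1     # full search failed; remember that
        self.table[name] = _NEGATIVE
        return False, None

    def store(self, name, value):
        # A real binding simply overwrites any negative entry.
        self.table[name] = value


def load_name(name, *namespaces):
    """Try each namespace in order, as the global/builtin fallback does."""
    for ns in namespaces:
        found, value = ns.lookup(name)
        if found:
            return value
    raise NameError(name)


globals_ns = Namespace()
builtins_ns = Namespace(hex=hex)

first = load_name("hex", globals_ns, builtins_ns)    # slow miss in globals
second = load_name("hex", globals_ns, builtins_ns)   # fast miss in globals
misses_after_two_loads = globals_ns.slow_lookups

globals_ns.store("hex", 5)                           # shadows the builtin
third = load_name("hex", globals_ns, builtins_ns)
```

Only the first load pays for the failed search in the globals table, and
the later store is visible immediately because it replaces the negative
entry in place.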

Known bugs:

Negative entries do not resize the table. If there is not enough free space 
in the table they are simply not inserted.

Assumes CACHE_HASH, INTERN_STRINGS without checking.

Future directions:

It should be possible to apply this to more than just LOAD_NAME and 
LOAD_GLOBAL: attributes, modules, setting items, etc.

The hit rate for the inline macro varies from 100% in most simple cases 
to 0% in cases where the first hash position in the table happens to be 
occupied by another entry. Even in these cases it is still very fast, but 
I want to get more consistent performance. I am starting to experiment 
with probabilistic techniques that shuffle entries in the hash table and 
try to ensure that the entries accessed most often are kept in the first 
hash positions as much as possible.

Test code:

* builtin.py
class f:
  for i in xrange(10000000):
    hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;
    hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;

* global.py
hex = 5
class f:
  for i in xrange(10000000):
    hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;
    hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;

* local.py
class f:
  hex = 5
  for i in xrange(10000000):
    hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;
    hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;

* fastlocal.py
def f():
  hex = 5
  for i in xrange(10000000):
    hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;
    hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ; hex ;

f()

	Oren


From gward@python.net  Mon Feb 11 20:37:16 2002
From: gward@python.net (Greg Ward)
Date: Mon, 11 Feb 2002 15:37:16 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: <20020211192904.GA22667@gerg.ca>
References: <20020211153201.GA20417@gerg.ca> <200202111542.g1BFgYS19924@pcp742651pcs.reston01.va.comcast.net> <3C67EE79.9C39087F@prescod.net> <200202111635.g1BGZ0R20230@pcp742651pcs.reston01.va.comcast.net> <3C67F30F.76C4A67A@prescod.net> <200202111647.g1BGl7C20309@pcp742651pcs.reston01.va.comcast.net> <20020211192904.GA22667@gerg.ca>
Message-ID: <20020211203716.GA22837@gerg.ca>

--TB36FDmn/VVEgNH/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On 11 February 2002, I said:
> A good starting point for modules that compete with Optik can be found
> in "User Interfaces" section of the Vaults of Parnassus:
> 
>   http://www.vex.net/parnassus/apyllo.py/808292924

OK, I've looked at all the option-parsing packages listed in Parnassus.
I've read the docs for all of them, and flipped through the source for
some of them.  Here's the executive summary:

  * only one of them, arglist.py by Ben Wolfson, has a nice OO
    design similar to Optik

  * the one feature that several of the competition offer but Optik
    does not (yet) is the ability to specify an option that *may*
    take a value, but doesn't necessarily *have to* take a value.
    Ironically, this is one of my requirements for the Distutils,
    motivated by the --home option to the "install" command.
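
For illustration only, here is what such a "value is optional" option
looks like in argparse, a later standard-library module.  It is not
Optik, and the "~" fallback used when --home is given without a value
is an assumption made up for the example:

```python
import argparse

parser = argparse.ArgumentParser(prog="install")
# nargs="?" means zero or one value: 'default' applies when the option
# is absent, 'const' when it is present without a value.
parser.add_argument("--home", nargs="?", const="~", default=None,
                    metavar="DIR")

absent = parser.parse_args([]).home
bare = parser.parse_args(["--home"]).home
explicit = parser.parse_args(["--home", "/opt/py"]).home
```

The three calls distinguish "option not given", "option given alone" and
"option given with a value", which is the distinction the --home case
needs.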

I think arglist.py is the only serious contender here.  Based on my
cursory inspection, all of the others have rather deep flaws.  (Eg. they
implement a non-standard syntax, or they do all their work at import
time rather than providing a class to instantiate and do option-parsing
work, or they have painful/awkward/hairy programming interface.)

I'll attach my full notes.  Anyone else who feels like doing this should
start at the *bottom* of the list on Parnassus, since I devoted
progressively less time and energy to each package along the way.  ;-)

        Greg
-- 
Greg Ward - geek                                        gward@python.net
http://starship.python.net/~gward/
Save energy: be apathetic.

--TB36FDmn/VVEgNH/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="competition.txt"

THE COMPETITION
---------------

arglist.py (Feb 2002)
  author: Ben Wolfson <rumjuggler@cryptarchy.org>
  url:    http://home.uchicago.edu/~wolfson/Python/

  * fairly clean OO design, much like Optik: Option for each option,
    Argument for a collection of options

  * results of parsing command line (option values and leftover
    positional args) are accessible through Arguments object -- no
    separate "option values" object

  * handles short options much like Optik: "-ffoo" and "-f foo" seem to
    work, as does "-avx" where -a, -v, -x all value-less options

  * subtly different notion of "default value" from Optik -- if an
    option takes a value, and no default value is provided, the user
    must provide a value.  With Optik (<= 1.2), if an option takes a
    value the user must always provide a value; the default value is for
    when that option isn't present at all.

  * dependent on Python 2.2 -- even uses a metaclass! (not that
    it really *needs* to)

  * no strong typing, much weaker callback interface; but "behaviors"
    are like Optik's "actions" -- there just aren't as many of them

  * main advantage over Optik: it's possible to define an option
    that takes a value, but doesn't require a value

  * error-handling?  not sure -- think it raises an exception

  * long option abbreviations allowed? not sure


Cmdline (1.0)
  author: Daniel Gindikin <dan@netrics.com>
  url: http://members.home.com/gindikin/dev/python/cmdline/

  * weird API: just import the module and it does everything
    then 

  * slightly weird user interface: in addition to the standard
    "--foo=bar" and "--foo bar", "foo=bar" and "foo:bar" also
    work: yuck

  * very cool error-handling: prints out the command-line, underlining
    the option with errors -- nice!

  * rudimentary type-checking -- if you ask for an integer value, and
    user supplied a string, it bombs with a useful error message

  * not extensible -- everything's done at module-level, no classes
    or anything nice like that

  * long option abbreviations allowed? not sure


Getargs (1.3)
  author: ? (Ivan Van Laningham?)
  url: http://www.pauahtun.org/ftp.html

  * painful, clunky interface (eg. None specifies a boolean option,
    0j a "count" option, 0 an integer option, 0.0 a float option)

  * I don't see how to specify a plain old string option!

  * documentation is confusing and poorly written

  * "long options" are Tk-style, eg. "-file", rather than GNU-style
    "--file"

  * order of options is lost -- not clear what happens if user does
    -ffoo -fbar? what is the value of -f?

  * long options can be abbreviated

  * last updated 1999


GetPot Python
  author: Frank-Rene Schaefer
  url: http://getpot.sourceforge.net/

  * written in C++, so an extension is needed... or is it?  not clear

  * docs cover C++ version

  * LGPL'd

  * seems to define a mini-language for defining command-line options;
    not sure where you're supposed to put those .pot source files


Options
  author: Tim Colles <timc@dai.ed.ac.uk>
          Johan Vromans <jvromans@squirrel.nl>

  * port of Perl's Getopt::Long

  * not really OO or extensible, as near as I could tell

  * possible to specify option types and required-ness, but the
    syntax is hairy -- I think it's all done in one fell swoop
    (single call to GetOptions() does everything)

--TB36FDmn/VVEgNH/--


From tim.one@comcast.net  Mon Feb 11 20:45:56 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 11 Feb 2002 15:45:56 -0500
Subject: [Python-Dev] Proposed standard module: Optik
In-Reply-To: <20020211153201.GA20417@gerg.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFDNLAA.tim.one@comcast.net>

[Greg Ward]
> ...
> Two good arguments for adding Optik to the standard library:
>
>   * David Goodger wants to use it for the standard doc-processing tools

IIRC, David was the most recent person to have a massive getopt enhancement
patch rejected, so if you've got his backing both of the key players are
covered <wink>.



From skip@pobox.com  Mon Feb 11 20:48:36 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 11 Feb 2002 14:48:36 -0600
Subject: [Python-Dev] patch: speed up name access by up to 80%
In-Reply-To: <20020211220954.A12061@hishome.net>
References: <20020211220954.A12061@hishome.net>
Message-ID: <15464.11812.114930.543584@beluga.mojam.com>

    Oren> Problem: Python name lookup in dictionaries is relatively slow.

    ... [ lots of interesting stuff elided ] ...

This looks pretty much like a no-brainer to me, assuming it stands up to
close scrutiny.  The only thing I'd change is that if PyDict_GetItem_Fast is
a macro that only works for interned strings, I'd change its name to reflect
its use: PyDict_GET_ITEM_INTERNED or something similar.

As a further test, I suggest you give the pystone benchmark a whirl, not
because it's such a kickass benchmark, but because it occasionally fiddles
global variable values.  I imagine it will do just fine and probably run a
bit faster to boot.

Skip


From skip@pobox.com  Mon Feb 11 20:52:30 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 11 Feb 2002 14:52:30 -0600
Subject: [Python-Dev] patch: speed up name access by up to 80%
In-Reply-To: <20020211220954.A12061@hishome.net>
References: <20020211220954.A12061@hishome.net>
Message-ID: <15464.12046.67498.758229@beluga.mojam.com>

    The only thing I'd change is that if PyDict_GetItem_Fast is a macro that
    only works for interned strings, I'd change its name to reflect its use:
    PyDict_GET_ITEM_INTERNED or something similar.

One other naming change is to prefix PyDict_GetItem_Fast2 with an
underscore, since the comments about its use make it clear that it's an
internal function.

Skip


From oren-py-d@hishome.net  Mon Feb 11 21:29:32 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 11 Feb 2002 16:29:32 -0500
Subject: [Python-Dev] patch: speed up name access by up to 80%
In-Reply-To: <15464.12046.67498.758229@beluga.mojam.com>
References: <20020211220954.A12061@hishome.net> <15464.12046.67498.758229@beluga.mojam.com>
Message-ID: <20020211212932.GA82642@hishome.net>

On Mon, Feb 11, 2002 at 02:52:30PM -0600, Skip Montanaro wrote:
> 
>     The only thing I'd change is that if PyDict_GetItem_Fast is a macro that
>     only works for interned strings, I'd change its name to reflect its use:
>     PyDict_GET_ITEM_INTERNED or something similar.
> 
> One other naming change is to prefix PyDict_GetItem_Fast2 with an
> underscore, since the comments about its use make it clear that it's an
> internal function.

Done and up on the same URL.

PyDict_GetItem_Fast -> PyDict_GETITEM_INTERNED
PyDict_GetItem_Fast2 -> _PyDict_GetItem_Interned

	Oren


From tim.one@comcast.net  Mon Feb 11 21:57:03 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 11 Feb 2002 16:57:03 -0500
Subject: [Python-Dev] Want to co-design and implement a logging module?
In-Reply-To: <20020211095441.B3536@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFLNLAA.tim.one@comcast.net>

[Trent Mick]
> I have my thoughts together. I'll write up and post tonight. 
> Meanwhile I have to get some work done for my employer. :)

No problem:  Guido says this is your top priority <wink>.

everyone-works-for-guido!-ly y'rs  - tim


From neal@metaslash.com  Mon Feb 11 22:26:01 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 11 Feb 2002 17:26:01 -0500
Subject: [Python-Dev] patch: speed up name access by up to 80%
References: <20020211220954.A12061@hishome.net>
Message-ID: <3C6844F9.954E1DA2@metaslash.com>

Oren Tirosh wrote:
> 
> Problem: Python name lookup in dictionaries is relatively slow.

> http://www.tothink.com/python/fastnames/fastnames.patch

I tried this patch (*) by running the regression tests:

	make && time ./python -E -tt Lib/test/regrtest.py

All the expected tests passed and there were no failures, this is good.
The bad news is that it was slower.  It took 42 user seconds longer 
with the patch than without.

Before patch:

    real    3m1.031s
    user    1m19.480s
    sys     0m2.400s

After patch:

    real    3m38.071s
    user    1m51.760s
    sys     0m2.790s

The box is Linux 2.4, Athlon 650, 256 MB.

(*) Pretty sure this was patch #1, running sum yields:  53200 10.
But it shouldn't matter, since it was only a name change, right?

Neal


From oren-py-d@hishome.net  Mon Feb 11 22:55:38 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 11 Feb 2002 17:55:38 -0500
Subject: [Python-Dev] patch: speed up name access by up to 80%
In-Reply-To: <3C6844F9.954E1DA2@metaslash.com>
References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com>
Message-ID: <20020211225538.GA93506@hishome.net>

On Mon, Feb 11, 2002 at 05:26:01PM -0500, Neal Norwitz wrote:
> Oren Tirosh wrote:
> > 
> > Problem: Python name lookup in dictionaries is relatively slow.
> 
> > http://www.tothink.com/python/fastnames/fastnames.patch
> 
> I tried this patch (*) by running the regression tests:
> 
> 	make && time ./python -E -tt Lib/test/regrtest.py
> 
> All the expected tests passed and there were no failures, this is good.
> The bad news is that it was slower.  It took 42 user seconds longer 
> with the patch than without.

I have tried this and got the same results for the patched and unpatched
versions (+-1 second).  The regression tests spend most of their time on
things like threads, sockets, signals, etc that have a lot of variance
and are not really affected by name lookup speed.

I got strange results comparing to the python2.2 RPM package (some faster,
some slower).  I didn't start to get consistent results until I used two
freshly compiled interpreters.  I wonder with what options this package 
was compiled.

	Oren



From neal@metaslash.com  Mon Feb 11 23:20:01 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 11 Feb 2002 18:20:01 -0500
Subject: [Python-Dev] patch: speed up name access by up to 80%
References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com> <20020212004640.A20174@hishome.net>
Message-ID: <3C6851A1.83609558@metaslash.com>

Oren Tirosh wrote:
> 
> On Mon, Feb 11, 2002 at 05:26:01PM -0500, Neal Norwitz wrote:
> > I tried this patch (*) by running the regression tests:
> >
> >       make && time ./python -E -tt Lib/test/regrtest.py
> >
> > All the expected tests passed and there were no failures, this is good.
> > The bad news is that it was slower.  It took 42 user seconds longer
> > with the patch than without.
> 
> I tried this and the results were identical for both versions (+-1 second).
> The regression tests spend most of their time on things like threads, sockets,
> signals and stuff like that which is barely affected by this patch and
> has a lot of other sources of variance.  Some subset of the regression
> tests may be good as a benchmark, though.
> 
> I also got very strange results (some faster, some slower) with the
> python2.2 RPM package and didn't start to get consistent results until I
> used a freshly compiled interpreter for both the reference and DUT.  I want
> to see how the package was compiled and why it got such strange results.

I was surprised by the results.  I am using the latest version from CVS, 
plus I have some outstanding changes that would be unlikely 
to cause a conflict.  They deal with removing consecutive 
line numbers (there is a patch on SF) and speeding up conditionals.  
Nothing that is remotely close to dictionaries.

Here are all the options from the compile:

gcc -c -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fkeep-inline-functions -fno-inline -fprofile-arcs -ftest-coverage -I.
-I./Include -DHAVE_CONFIG_H

It's possible that no inlining is hurting, but I would still have expected
your patch to be faster than without.  I also wouldn't expect the test
coverage to be hurting (-fprofile-arcs -ftest-coverage), but that is
possible.  It would be nice if someone else could duplicate either of
our results.

Neal


From wolfson@uchicago.edu  Mon Feb 11 23:31:49 2002
From: wolfson@uchicago.edu (Ben Wolfson)
Date: Mon, 11 Feb 2002 17:31:49 -0600
Subject: [Python-Dev] Re: RFC: Option Parsing Libraries
References: <mailman.1013446699.14883.python-list@python.org>
Message-ID: <200202112333.g1BNX5Z24781@midway.uchicago.edu>

I have an option-parsing module at
http://home.uchicago.edu/~wolfson/Python/arglist.py ; from reading
Optik's docs it isn't as featureful but it does answer the list given in 
http://mail.python.org/pipermail/python-dev/2002-February/019937.html.
I haven't used Optik, but my module seems simpler to use.  It supports
arbitrary callbacks, can be strongly typed (insofar as passing, eg the
int function as a callback will generate an error if the value isn't
int-able), recognizes the identity of short and long forms, and has a
reasonable range of actions--if an option has no argument, by default it
records if it appeared and how often; if it does, it can append multiple
occurrences to a list or keep only the last.  Callbacks aren't as flexible
as Optik's (they are functions of one argument) but with nested scopes
that probably wouldn't be terribly problematic.

I'm currently in the process of re-writing arglist.py in my spare time
since the code is rather messy, so if it's seen as a contender I could
add features.

-- 
BTR    
BEN WOLFSON HAS RUINED ROCK MUSIC FOR A GENERATION
 -- Crgre Jvyyneq


From greg@cosc.canterbury.ac.nz  Mon Feb 11 23:39:35 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Feb 2002 12:39:35 +1300 (NZDT)
Subject: [Python-Dev] RE: Option Parsing Libraries
In-Reply-To: <NCBBLFCOHHDIKCAFGCFBIEDHKCAA.marklists@mceahern.com>
Message-ID: <200202112339.MAA20826@s454.cosc.canterbury.ac.nz>

Mark McEahern <marklists@mceahern.com>:

> raising an error if a required option is missing.  I guess it's not
> really an option, then, is it?

Maybe it should be called an argument parser instead
of an option parser.

Although "Argik" doesn't have quite the same ring to it. :-(

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From aahz@rahul.net  Mon Feb 11 23:44:54 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Mon, 11 Feb 2002 15:44:54 -0800 (PST)
Subject: [Python-Dev] RE: Option Parsing Libraries
In-Reply-To: <200202112339.MAA20826@s454.cosc.canterbury.ac.nz> from "Greg Ewing" at Feb 12, 2002 12:39:35 PM
Message-ID: <20020211234455.744A5E8C5@waltz.rahul.net>

Greg Ewing wrote:
> 
> Although "Argik" doesn't have quite the same ring to it. :-(

Would you like some roast turkey and mashed potatoes with that?



(Sorry, an old in-joke for people who've been on netnews for more than a
decade.)
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From neal@metaslash.com  Mon Feb 11 23:57:46 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 11 Feb 2002 18:57:46 -0500
Subject: [Python-Dev] patch: speed up name access by up to 80%
References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com> <20020211225538.GA93506@hishome.net>
Message-ID: <3C685A7A.4D39A35D@metaslash.com>

Oren Tirosh wrote:
> 
> On Mon, Feb 11, 2002 at 05:26:01PM -0500, Neal Norwitz wrote:
> > Oren Tirosh wrote:
> > >
> > > Problem: Python name lookup in dictionaries is relatively slow.
> >
> > > http://www.tothink.com/python/fastnames/fastnames.patch
> >
> > I tried this patch (*) by running the regression tests:
> >
> >       make && time ./python -E -tt Lib/test/regrtest.py
> >
> > All the expected tests passed and there were no failures, this is good.
> > The bad news is that it was slower.  It took 42 user seconds longer
> > with the patch than without.
> 
> I have tried this and got the same results for the patched and unpatched
> versions (+-1 second).  The regression tests spend most of their time on
> things like threads, sockets, signals, etc that have a lot of variance
> and are not really affected by name lookup speed.

I rebuilt everything from scratch and got results similar to Oren's,
ie, roughly the same.  This time I took off the test-coverage flags.
(Sorry, I must have had them off for stock, but on with Oren's patch).

Before patch:

real    2m57.416s
user    1m12.830s
sys     0m2.580s

After patch:

real    2m56.017s
user    1m14.960s
sys     0m2.380s

I still have inlines turned off.

Neal


From tim.one@comcast.net  Tue Feb 12 00:01:04 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 11 Feb 2002 19:01:04 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202111628.g1BGSxF20159@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEGNNLAA.tim.one@comcast.net>

[Guido]
> ...
> I think you're right that we might be able to use just a PyObject **; but
> I haven't fully digested Tim's more aggressive idea.)

The overwhelming thrust of Tim's variant is to reduce the (by far) most
frequent namespace operation to this:

       case LOAD_GLOBAL_CELL:
           cell = func_cells[oparg];
           x = cell->objptr;  /* note: not two levels, just one */
           if (x != NULL) {
               Py_INCREF(x);
               continue;
           }
           ... error recovery ...
           break;

*Everything* else follows from that; it's really a quite minor variant of
the original proposal, consisting mostly of changes to spelling details due
to having a different kind of cell.  The sole big change is requiring that
mutations to builtins propagate at once to their cached values in module
celldicts.
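
A toy Python model of that variant may make the trade-off easier to see.
Everything here is an assumption made for illustration (the class names,
the "shadowed" flag, and the use of None in place of C's NULL); it only
mirrors the one-level load and the eager propagation described above.

```python
class Cell:
    """One slot per global name; objptr is read directly by the fast path."""
    __slots__ = ("objptr", "shadowed")

    def __init__(self):
        self.objptr = None       # None stands in for NULL (so a global
        self.shadowed = False    # bound to None cannot be modelled here)


class Builtins:
    def __init__(self, **names):
        self.values = dict(names)
        self.modules = []        # modules caching builtin values in cells

    def set(self, name, value):
        self.values[name] = value
        for module in self.modules:      # propagate at once
            module.builtin_changed(name, value)


class Module:
    def __init__(self, builtins, names):
        self.cells = {}
        for name in names:
            cell = Cell()
            if name in builtins.values:  # cache the builtin's value
                cell.objptr = builtins.values[name]
            self.cells[name] = cell
        builtins.modules.append(self)

    def set_global(self, name, value):
        cell = self.cells[name]
        cell.objptr = value
        cell.shadowed = True             # module global now hides the builtin

    def builtin_changed(self, name, value):
        cell = self.cells.get(name)
        if cell is not None and not cell.shadowed:
            cell.objptr = value


def load_global_cell(func_cells, oparg):
    x = func_cells[oparg].objptr         # one level of indirection only
    if x is not None:
        return x
    raise NameError("global cell %d is unfilled" % oparg)


b = Builtins(len=len)
m = Module(b, ["len", "x"])
func_cells = [m.cells["len"], m.cells["x"]]

seen = [load_global_cell(func_cells, 0)]
b.set("len", "patched builtin")
seen.append(load_global_cell(func_cells, 0))
m.set_global("len", 42)
b.set("len", "ignored, module global shadows it")
seen.append(load_global_cell(func_cells, 0))
```

The load itself never consults the builtins; the cost moves to the rare
case of rebinding a builtin, which has to visit every module's cells.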

I believe Jeremy's scheme *could* do better than this for builtins, but not
the way it's currently set up (I don't see why we can't define a fixed
bijection between the standard builtin names and a range of contiguous
little integers, and use that fixed bijection everywhere; the compiler could
exploit this global (in the cross-module sense) bijection directly for
LOAD_BUILTIN offsets, eliminating all the indirection for standard builtins,
and eliminating the code-object-specific vectors of referenced builtin
names; note that I don't care about speeding access to builtins with
non-standard names -- fine by me if they're handled via LOAD_GLOBAL instead,
and fall into its "oops! it's not a module global after all" case).
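
A rough sketch of what such a fixed bijection could look like, modelled
in Python.  The names STANDARD_BUILTINS, compile_builtin_ref and
load_builtin are invented for the example, and the vector is a one-time
snapshot, so it deliberately ignores later rebinding of builtins:

```python
import builtins

# One global (cross-module) numbering of the standard builtin names.
STANDARD_BUILTINS = tuple(sorted(
    name for name in dir(builtins) if not name.startswith("_")))
BUILTIN_INDEX = {name: i for i, name in enumerate(STANDARD_BUILTINS)}

# Values stored at the same offsets, indexed directly by the opcode argument.
builtin_vector = [getattr(builtins, name) for name in STANDARD_BUILTINS]


def compile_builtin_ref(name):
    """Return a LOAD_BUILTIN offset, or None to fall back to LOAD_GLOBAL."""
    return BUILTIN_INDEX.get(name)


def load_builtin(oparg):
    return builtin_vector[oparg]     # no name lookup at run time


offset = compile_builtin_ref("hex")
```

Because the numbering is the same everywhere, no per-code-object table
of referenced builtin names is needed; a non-standard name simply gets
no offset and goes through the ordinary global path.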



From jeremy@alum.mit.edu  Tue Feb 12 00:14:25 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 11 Feb 2002 19:14:25 -0500
Subject: [Python-Dev] patch: speed up name access by up to 80%
In-Reply-To: <3C685A7A.4D39A35D@metaslash.com>
References: <20020211220954.A12061@hishome.net>
 <3C6844F9.954E1DA2@metaslash.com>
 <20020211225538.GA93506@hishome.net>
 <3C685A7A.4D39A35D@metaslash.com>
Message-ID: <15464.24161.581535.548441@gondolin.digicool.com>

So simple benchmark programs are a lot more interesting.

I'd pick pystone, test_hmac, and test_htmlparser.

Jeremy



From greg@cosc.canterbury.ac.nz  Tue Feb 12 00:03:17 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Feb 2002 13:03:17 +1300 (NZDT)
Subject: [Python-Dev] RE: Option Parsing Libraries
In-Reply-To: <20020211234455.744A5E8C5@waltz.rahul.net>
Message-ID: <200202120003.NAA20832@s454.cosc.canterbury.ac.nz>

aahz@rahul.net (Aahz Maruch):

> Greg Ewing wrote:
> > 
> > Although "Argik" doesn't have quite the same ring to it. :-(
> 
> Would you like some roast turkey and mashed potatoes with that?

No, thank you. I was just sneezing and hiccupping at
the same time. :-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From martin@v.loewis.de  Tue Feb 12 00:10:42 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: Tue, 12 Feb 2002 01:10:42 +0100
Subject: [Python-Dev] Incorporating Expat
Message-ID: <200202120010.g1C0AgM01277@mira.informatik.hu-berlin.de>

At IPC10, I discussed with Fred a strategy for incorporating Expat
1.95.2 into Python. I've now implemented most of this, consisting of:

- adding the lib/ directory of Expat as Modules/expat
- changing setup.py to build the included Expat library into the
  extension module
- likewise for PCbuild/pyexpat.dsp
- dropping support for older Expat versions from pyexpat (support for
  older Python versions is still maintained)

AFAICT, the only missing part is to add the relevant changes to
Modules/Setup.in, and to update various documentation files.

Please make sure to pick up new directories when updating your CVS
sandbox.

If you find any problems with that change, please let me know.

Regards,
Martin


From guido@python.org  Tue Feb 12 04:30:43 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Feb 2002 23:30:43 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: Your message of "Mon, 11 Feb 2002 19:01:04 EST."
 <LNBBLJKPBEHFEDALKOLCMEGNNLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEGNNLAA.tim.one@comcast.net>
Message-ID: <200202120430.g1C4UhV29560@pcp742651pcs.reston01.va.comcast.net>

[Ping, suggesting to always dereference the cell twice]
> But hey... in that case the cellptr is always two steps away from
> the object.  So why not just use PyObject**s instead of cells?

[Tim, sketching a scheme that always dereferences the cell once]
> The sole big change is requiring that mutations to builtins
> propagate at once to their cached values in module celldicts.

Let's combine these ideas.  Suppose there's a vector of pointers to
pointers to objects, indexed by some index calculated by the compiler.
Then the fast track in LOAD_GLOBAL_CELL could look like this:

    case LOAD_GLOBAL_CELL:
        x = *globals_vector[oparg];
        if (x != NULL) {
            Py_INCREF(x);
            continue;
        }
        ... handle uncommon cases and errors here ...

Here, globals_vector[i] is usually the address of the me_value slot of
a PyDictEntry in either the globals dict or the builtins dict.  These
are subclasses of dict that trap assignment, deletion, and rehashing.

There's a special C-global variable

    PyObject *unfilled = NULL;

whose contents is always NULL; when globals_vector is initialized,
every element of it is set to &unfilled.

The code to handle uncommon cases and errors does a "lookdict"
operation on the globals dict using the name of the global variable,
which it gets from (e.g.) globals_names[oparg].  This requires some
opening up of the dict implementation; lookdict is an internal routine
that returns a PyDictEntry *, call it e.  If e->me_value != NULL, we
set globals_vector[oparg] to &e->me_value, and we're done.  Otherwise,
we do another lookdict on the builtins dict, and again if e->me_value
!= NULL, set globals_vector[oparg] to &e->me_value.  Otherwise, we
raise a NameError.

Now we need to take care of a number of additional special cases that
could invalidate the pointers we're collecting in globals_vector.
The following things may invalidate those pointers:

- globals_vector[i] points to a builtin, and a global with the same
  name is created

- globals_vector[i] points to either a builtin or a global, and the
  dict into whose hashtable it points is rehashed (as the result of
  adding an item)

- globals_vector[i] points to a builtin or global, which is deleted

- globals_vector[i] points to a builtin, and the special global named
  __builtins__ is assigned to (switching to a different builtins dict)

To deal with all these cases, our dict subclass keeps a list of weak
references.  The builtins dict has weak references pointing to all
globals dicts that shadow this builtins dict (because of rexec there
can be multiple builtins dicts); each globals dict has weak references
pointing to all the globals_vector structures that reference it.

On a rehash, all entries in each affected globals_vector are reset to
&unfilled.  The uncommon case handling code will gradually populate
them again.  On assignment or deletion it might pay off to be a little
more careful and only invalidate the entry in the globals_vector
corresponding to the affected name.  (In particular, assignment to a
global that's already set, and deletion of a global that doesn't
shadow a built-in, should probably be handled somewhat efficiently.)

The globals_vector structure should contain a pointer to the
corresponding globals_names array, and it should also contain a
reference to the globals dict into whose hashtable it may point, to
keep it alive.  So it should probably be an object that contains a
vector of pointers to pointers in addition to some other stuff.

The globals_vector may be shared by all code objects compiled
together; this makes it similar to the dlict.  But the overflow
handling is quite different, and by pointing directly into the hash
table it is possible to handle all globals and builtins uniformly.

I expect that the implementation won't be particularly hard; the
lookdict operation already exists and we can easily subclass dict to
trap the setitem and delitem operations.  We will have to be careful
not to use PyDict_SetItem() and PyDict_DelItem() on these subclasses,
but that should be easy: I think that the only offenders here are
STORE_NAME and friends, and these are exactly the operations that
we're going to change anyway.  (STORE_NAME or STORE_GLOBAL will become
a bit slower, because of the check whether it needs to update any
globals_vector structures; but that's OK since we're speeding up the
corresponding LOAD operation quite a bit.)
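
To make the bookkeeping concrete, here is a rough model of the scheme
in Python 3 rather than C.  It is only a sketch under several
assumptions: Entry stands in for a PyDictEntry, None stands in for
NULL (so None cannot be stored as a value here), the namespaces keep
plain lists of vectors instead of weak references, and rehashing is
not modelled at all.  The names Entry, Namespace and GlobalsVector are
made up for this illustration.

```python
class Entry:
    """Stand-in for a PyDictEntry; .value plays the role of me_value."""
    def __init__(self, value=None):
        self.value = value


UNFILLED = Entry(None)   # plays the role of &unfilled: its value stays None


class Namespace:
    """Toy globals or builtins dict that traps assignment and deletion."""
    def __init__(self):
        self.entries = {}
        self.vectors = []    # the real scheme would hold weak references

    def set(self, name, value):
        entry = self.entries.get(name)
        if entry is not None and entry.value is not None:
            entry.value = value          # slot already exists: caches stay valid
            return
        self.entries[name] = Entry(value)
        for vec in self.vectors:         # a new name may shadow a builtin
            vec.invalidate(name)

    def delete(self, name):
        del self.entries[name]
        for vec in self.vectors:         # cached slot is now stale
            vec.invalidate(name)


class GlobalsVector:
    """Vector of references to Entry boxes, filled in lazily."""
    def __init__(self, names, globals_ns, builtins_ns):
        self.names = names
        self.globals_ns = globals_ns
        self.builtins_ns = builtins_ns
        self.slots = [UNFILLED] * len(names)
        globals_ns.vectors.append(self)
        builtins_ns.vectors.append(self)

    def invalidate(self, name):
        for i, n in enumerate(self.names):
            if n == name:
                self.slots[i] = UNFILLED

    def load(self, i):
        value = self.slots[i].value      # fast track: one double dereference
        if value is not None:
            return value
        name = self.names[i]             # uncommon case: look up and cache
        for ns in (self.globals_ns, self.builtins_ns):
            entry = ns.entries.get(name)
            if entry is not None and entry.value is not None:
                self.slots[i] = entry
                return entry.value
        raise NameError(name)
```

Registering each vector with both namespaces is a shortcut; in the
scheme described above the builtins dict would reach the vectors
through the globals dicts that shadow it.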

(I'd add this to PEP 280 but I'll wait for Tim to shoot holes in it
first. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Tue Feb 12 04:31:39 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 11 Feb 2002 23:31:39 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <15463.55207.217946.237969@12-248-41-177.client.attbi.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIANLAA.tim.one@comcast.net>

[Skip Montanaro]
> ...
> The whole thing makes sense to me if it avoids the possible two
> PyDict_GetItem calls in the LOAD_GLOBAL opcode.  As I understand it, if
> accessed inside a function, LOAD_GLOBAL could be implemented
> something like this:
>
>     case LOAD_GLOBAL:
>         cell = func_cells[oparg];
>         if (cell->objptr) x = cell->objptr;
>         else x = cell->cellptr->objptr;
>         if (x == NULL) {
>             ... error recovery ...
>             break;
>         }
>         Py_INCREF(x);
>         continue;
>
> This looks a lot better to me (no complex function calls).

Something much like that.  Guido added code to the PEP (280).  My suggested
modifications reduce it to:

       case LOAD_GLOBAL_CELL:
           cell = func_cells[oparg];
           x = cell->objptr;
           if (x != NULL) {
               Py_INCREF(x);
               continue;
           }
           ... error recovery ...
           break;

Another difference is hiding in the "... error recovery ..." elisions.  In
Guido's scheme, this must also include code to deal with the possibility
that a global went away and thereby uncovered a builtin that popped into
existence after the module globals were initialized.  Then it's still a
non-error case, but the cell->cellptr has gotten out of synch with reality.
In the variation, the caches are never allowed to get out of synch, so "...
error recovery ..." there should really be "... error reporting ...":  you
can't get there in the variant unless NameError is certain.

Hmm:  We *all* seem to be missing a PUSH(x), so all of our schemes are dead
wrong <wink>.

Speaking of which, why does LOAD_FAST waste time checking against NULL
twice?!

		case LOAD_FAST:
			x = GETLOCAL(oparg);
			if (x == NULL) {
				format_exc_check_arg(
					PyExc_UnboundLocalError,
					UNBOUNDLOCAL_ERROR_MSG,
					PyTuple_GetItem(co->co_varnames, oparg)
					);
				break;
			}
			Py_INCREF(x);
			PUSH(x);
			if (x != NULL) continue;
			break;

I'll fix that ...



From tim.one@comcast.net  Tue Feb 12 04:45:06 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 11 Feb 2002 23:45:06 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202111628.g1BGSxF20159@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEIBNLAA.tim.one@comcast.net>

[Ping]
>>     1.  There are going to be lots of global-cell objects.
>>         Perhaps they should get their own allocator and free list.

[Guido]
> Yes.

No no no no no, and 5*-1 beats a +1 even from the BDFL <wink>.

Vanilla pymalloc is perfect for this:  many small objects.  A custom free
list for cells is a waste of code, because a cell never goes away until the
module does:  cells will not "churn".  We'll get a lot of them, but most of
them will stay alive until the program ends, so the tiny performance gain
you may be able to get from a thoroughly specialized free list "in theory"
will never be realized in practice.



From barry@zope.com  Tue Feb 12 05:09:06 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 00:09:06 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
Message-ID: <15464.41842.664484.307330@anthem.wooz.org>

I have a bit of a dilemma when it comes to sys.path and the location
of the site-packages directory.

The problem comes when someone is using Mailman 2.1 with Python 2.2.
The latter comes with the email package, which is in the standard
library.  Through some contributions, my standalone email package now
supports multibyte character sets in RFC-compliant ways
(e.g. splitting long headers correctly).  The question is, how do I
get the updated package to Python 2.2 users?

The standalone email package is a simple distutils thingie with a
directory and a bunch of .py files.  distutils sticks this in
site-packages.  But an "import email" will always get the standard
library version instead of the site-packages version because site.py
/appends/ site-packages to sys.path instead of prepending it.

I can work around this by adding my own path-hacking code before any
import of email.* modules.  This is a bit ugly because now it means
that the proper functioning of the application depends on import
order, and that's nasty.
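
The effect of path order can be reproduced with a small Python 3
sketch.  It assumes nothing about the real email package: it creates
two throwaway directories that both contain a module called
demo_email (a made-up name) and asks the import machinery which file
it would pick for a given search path.  The helper names are
hypothetical too.

```python
import importlib.machinery
import os
import tempfile


def make_dir_with_module(root, dirname, modname, marker):
    """Create root/dirname/modname.py containing a marker string."""
    path = os.path.join(root, dirname)
    os.mkdir(path)
    with open(os.path.join(path, modname + ".py"), "w") as f:
        f.write("WHERE = %r\n" % marker)
    return path


def first_match(modname, search_path):
    """Return the file the path-based finder picks for modname."""
    spec = importlib.machinery.PathFinder.find_spec(modname, search_path)
    return spec.origin if spec else None


root = tempfile.mkdtemp()
stdlib_dir = make_dir_with_module(root, "lib", "demo_email", "standard library")
site_dir = make_dir_with_module(root, "site-packages", "demo_email", "site-packages")

# site-packages appended: the copy in the "standard library" directory wins
appended = first_match("demo_email", [stdlib_dir, site_dir])
# site-packages prepended: the add-on copy wins
prepended = first_match("demo_email", [site_dir, stdlib_dir])
print(appended)
print(prepended)
```

The first directory on the search path that contains the module wins,
which is why an appended site-packages can never override a standard
library module of the same name.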

So the question is: why does site.py append site-packages instead of
prepending it to sys.path?  If there's a valid reason, I don't
remember it, and I'm currently blind to any valuable use case.  If
there's no good reason, it would seem to me that the following use
case is better served by prepending:

- We want to provide an enhanced version, or a fixed version of a
  module or package.  Distribute it w/distutils and do a normal
  install.  As long as you don't start Python w/ -S, you'll always get
  the improved version.  Don't want the improved version?  Start
  Python w/ -S or just don't ever install the new package.

I'm mostly looking for rationale right now, before I try to decide
whether it's something worth debating and/or changing.

Thanks,
-Barry


From tim.one@comcast.net  Tue Feb 12 05:09:55 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 12 Feb 2002 00:09:55 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <200202120430.g1C4UhV29560@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEIDNLAA.tim.one@comcast.net>

[Guido]
> ...
> (I'd add this to PEP 280 but I'll wait for Tim to shoot holes in it
> first. :-)

At first sight this scheme made sense and was very appealing.  Alas, I've
been staring at the computer non-stop for 11 hours, my eyes are losing
focus, and I just remembered I forgot to eat today.  IOW, you'll have to
wait for Tuesday to get shot <wink>.

For now, I really liked that the indirection vector fills in adaptively, to
speed accesses that are actually getting made.  I don't know that that's an
objective advantage, but I loved the image.



From oren-py-d@hishome.net  Tue Feb 12 07:05:14 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 12 Feb 2002 09:05:14 +0200
Subject: [Python-Dev] patch: speed up name access by up to 80%
In-Reply-To: <15464.24161.581535.548441@gondolin.digicool.com>; from jeremy@zope.com on Mon, Feb 11, 2002 at 07:14:25PM -0500
References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com> <20020211225538.GA93506@hishome.net> <3C685A7A.4D39A35D@metaslash.com> <15464.24161.581535.548441@gondolin.digicool.com>
Message-ID: <20020212090514.A22361@hishome.net>

On Mon, Feb 11, 2002 at 07:14:25PM -0500, Jeremy Hylton wrote:
> So simple benchmark programs are a lot more interesting.
> 
> I'd pick pystone, test_hmac, and test_htmlparser.

test_htmlparser (x100):	0m29.950s	0m29.730s
test_hmac (x1000):	0m16.480s	0m15.720s  (lower is better)
pystone:		11261.3		11494.3	   (higher is better)

A small, but measurable improvement.  You can see below that most accesses 
are still to fastlocals and, of course, the code has some real work to do 
other than looking up names.  

test_htmlparser:
362331 fastlocal non-dictionary lookups
60106 inline dictionary lookups
10554 fast dictionary lookups
151 slow dictionary lookups

test_hmac:
13959 fastlocal non-dictionary lookups
9920 inline dictionary lookups
7548 fast dictionary lookups
240 slow dictionary lookups

pystone:
1447094 fastlocal non-dictionary lookups
502190 inline dictionary lookups
111549 fast dictionary lookups
111 slow dictionary lookups
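
For anyone who wants the relative numbers, the arithmetic can be
sketched in a few lines of Python 3.  The figures are copied from the
tables above; reading the first column as the unpatched interpreter
and the second as the patched one is an assumption, and the function
names are made up for this illustration.

```python
# (before, after) wall-clock times in seconds; lower is better
timings = {
    "test_htmlparser (x100)": (29.950, 29.730),
    "test_hmac (x1000)": (16.480, 15.720),
}
# (before, after) pystones per second; higher is better
pystone = (11261.3, 11494.3)
# (fastlocal, inline, fast, slow) lookup counts per benchmark
lookups = {
    "test_htmlparser": (362331, 60106, 10554, 151),
    "test_hmac": (13959, 9920, 7548, 240),
    "pystone": (1447094, 502190, 111549, 111),
}


def time_saved_pct(before, after):
    """Percentage of run time saved, relative to the 'before' time."""
    return 100.0 * (before - after) / before


def rate_gained_pct(before, after):
    """Percentage increase of a rate, relative to the 'before' rate."""
    return 100.0 * (after - before) / before


def fastlocal_share_pct(counts):
    """Share of all counted lookups that were fastlocal accesses."""
    return 100.0 * counts[0] / sum(counts)


for name, (before, after) in timings.items():
    print("%-24s %.2f%% less time" % (name, time_saved_pct(before, after)))
print("%-24s %.2f%% more pystones" % ("pystone", rate_gained_pct(*pystone)))
for name, counts in lookups.items():
    print("%-24s %.1f%% fastlocal" % (name, fastlocal_share_pct(counts)))
```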

Does anyone have an example of a program that relies on a lot of global and 
builtin name accesses?  Meanwhile I'm going to start working on LOAD_ATTR.

    if-the-evidence-doesn't-fit-the-theory...-ly yours,

	Oren



From ping@lfw.org  Tue Feb 12 07:45:30 2002
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 12 Feb 2002 01:45:30 -0600 (CST)
Subject: [Python-Dev] Zest
Message-ID: <Pine.LNX.4.33.0202120140180.3853-100000@server1.lfw.org>

A few people have expressed interest in the mail-archiving project
i mentioned on Developer's Day (see http://lfw.org/python/pydev-sample
for some sample output).  I've registered a SourceForge project named
"Zest" under which i'll be doing this work, and created a mailing list
for anyone who wants to discuss it.  Please direct any thoughts and
comments about Zest to zest-devel@lists.sf.net.

I don't expect to be blabbing a great deal on the list until i've got
a better prototype ready, but it's good to have a place for the project
to reside, so that ideas and conversations don't get lost.  (Thanks
for the prod, Barry.)  If you're interested, please join the list:

    http://lists.sf.net/lists/listinfo/zest-devel

Sorry for the diversion.


-- ?!ng



From rsc@plan9.bell-labs.com  Tue Feb 12 08:44:19 2002
From: rsc@plan9.bell-labs.com (Russ Cox)
Date: Tue, 12 Feb 2002 03:44:19 -0500
Subject: [Python-Dev] a different approach to argument parsing
Message-ID: <a3ed7e8ff9854129ba52c84e0d774286@plan9.bell-labs.com>

[Hi.  I'm responsible for the Plan 9 port of Python; I typically just
lurk here.]

Regarding the argument parsing discussion, it seems like many of the
"features" of the various argument parsing packages are aimed at the
fact that in C (whence this all originated) the original getopt
interface wasn't so great.  To use getopt, you end up specifying the
argument set twice: once to the parser and then once when processing
the list of returned results.  Packages like Optik make this a little
better by letting you wrap up the actual processing in some form and
hand that to the parser too.  Still, you have to wrap up your argument
parsing into little actions; the getopt style processing loop is
usually a bit clearer.  Ultimately, I find getopt unsatisfactory
because of the duplication; and I find Optik and other similar
packages unsatisfactory because of the contortions you have to go
through to invoke them.  I don't mean to pick on Optik, since many
others appear to behave in similar ways, but it seems to be the
yardstick.  For concreteness, I'd much rather write:

	if o=='-n' or o=='--num':
		ncopies = opt.optarg(opt.needtype(int))

than:

	parser.add_option("-n", "--num", action="store", type="int", dest="ncopies")

The second strikes me as clumsy at best.

The Plan 9 argument parser (for C) avoids these problems by making the
parser itself small enough to be a collection of preprocessor macros.
Although the implementation is ugly, the external interface that
programmers see is trivial.  A modified version of the example at
http://optik.sourceforge.net would be rendered:

	char *usagemessage = 
	"usage: example [-f FILE] [-h] [-q] who where\n"
	"\n"
	"    -h            show this help message\n"
	"    -f FILE       write report to FILE\n"
	"    -q            don't print status messages to stdout\n";

	void
	usage(void)
	{
		write(2, usagemessage, strlen(usagemessage));
		exits("usage");
	}

	void
	main(int argc, char **argv)
	{
		...
		ARGBEGIN{
		case 'f':
			report = EARGF(usage());
			break;
		case 'q':
			verbose = 0;
			break;
		case 'h':
		default:
			usage();
		}ARGEND
		if(argc != 2)
			usage();
		...

[This is documented at http://plan9.bell-labs.com/magic/man2html/2/ARGBEGIN,
for anyone who is curious.]

Notice that the argument parsing machinery only gets the argument
parameters in one place, and is kept so simple because it is driven by
what happens in the actions: if I run "example -frsc" and the f option
case doesn't call EARGF() to fetch the "rsc", the next iteration
through the loop will be for option 'r'; a priori there's no way to
tell.

Now that Python has generators, it is easy to do a similar sort of
thing, so that the argument parsing can be kept very simple.  The
running example would be written using the attached argument parser
as:

	usagemessage=\
	'''usage: example.py [-h] [-f FILE] [-n N] [-q] who where
	    -h, --help                  show this help message
	    -f FILE, --file=FILE        write report to FILE
	    -n N, --num=N               print N copies of the report
	    -q, --quiet                 don't print status messages to stdout
	'''
	
	def main():
		opt = OptionParser(usage=usagemessage)
		report = 'default.file'
		ncopies = 1
		verbose = 1
		for o in opt:
			if o=='-f' or o=='--file':
				report = opt.optarg()
			elif o=='-n' or o=='--num':
				ncopies = opt.optarg(opt.typecast(int, 'integer'))
			elif o=='-q' or o=='--quiet':
				verbose = 0
			else:
				opt.error('unknown option '+o)
		if len(opt.args()) != 2:
			opt.error('incorrect argument count')
		print 'report=%s, ncopies=%s verbose=%s' % (report, ncopies, verbose)
		print 'arguments: ', opt.args()

It's fairly clear what's going on, and the option parser itself is
very simple too.  While it may not have all the bells and whistles
that some packages do, I think its simplicity makes most of them
irrelevant.  It or something like it might be the right approach to
take to present a simpler interface.

The simplicity of the interface has the benefit that users (potentially
anyone who writes a Python program) don't have to learn a lot of
stuff to parse their command-line arguments.  Suppose I want to
write a program with an option that takes two arguments instead
of one.  Given the Optik-style example it's not at all clear how to
do this.  Given the above example, there's one obvious thing to try:
call opt.optarg() twice.  That sort of thing.
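
Here is roughly what that looks like, as a Python 3 sketch.  To keep
it self-contained it uses a stripped-down stand-in for the attached
parser (short options only, errors raised as ValueError instead of
printing usage); the class name MiniOpts and the -r WIDTH HEIGHT
option are invented for this example.

```python
class MiniOpts:
    """Short-options-only stand-in for the attached OptionParser."""
    def __init__(self, argv):
        self.argv = list(argv)
        self.curarg = ''
        self.curopt = None

    def __iter__(self):
        # stop at the first non-option word, at a lone '-', or after '--'
        while self.argv and self.argv[0] != '-' and self.argv[0].startswith('-'):
            a = self.argv.pop(0)
            if a == '--':
                break
            self.curarg = a[1:]
            while self.curarg:
                self.curopt = '-' + self.curarg[0]
                self.curarg = self.curarg[1:]
                yield self.curopt

    def optarg(self):
        # rest of the current word if any, otherwise the next word
        if self.curarg:
            ret, self.curarg = self.curarg, ''
            return ret
        if not self.argv:
            raise ValueError(self.curopt + ' requires argument')
        return self.argv.pop(0)


def parse(argv):
    opt = MiniOpts(argv)
    size = None
    verbose = 1
    for o in opt:
        if o == '-r':
            # two arguments: just fetch twice
            size = (int(opt.optarg()), int(opt.optarg()))
        elif o == '-q':
            verbose = 0
        else:
            raise ValueError('unknown option ' + o)
    return size, verbose, opt.argv


print(parse(['-q', '-r', '80', '24', 'who', 'where']))
```

Because the parser only learns about arguments when the action asks
for them, "-r80 24" and "-r 80 24" both work without any extra
declaration.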

Addressing the benchmark set by Optik:

[1]
>   * it ties short options and long options together, so once you
>     define your options you never have to worry about the fact that
>     -f and --file are the same

Here the code does that for you, and if you want to use some
other convention, you're not tied to anything.  (You do have to tie
-f and --file in the usage message too, see answer to [3].)

[2]
>   * it's strongly typed: if you say option --foo expects an int,
>     then Optik makes sure the user supplied a string that can be
>     int()'ified, and supplies that int to you

There are plenty of ways you could consider adding this.
The easiest is what I did in the example.  The optarg argument
fetcher takes a function to transform the argument before
returning.  Here, our function calls opt.error() if the argument
cannot be converted to an int.  The added bells and whistles
that Optik adds (choice sets, etc.) can be added in this manner
as well, as external functions that the parser doesn't care about,
or as internally-supplied helper functions that the user can
call if he wants.

[3]
>   * it automatically generates full help based on snippets of
>     help text you supply with each option

This is the one shortcoming: you have to write the usage message
yourself.  I feel that the benefit of having much clearer argument
parsing makes it worth bearing this burden.  Also, tools like Optik
have to work fairly hard to present the usage message in a reasonable
manner, and if it doesn't do what you want you either have to write
extension code or just write your own usage message anyway.
I'd rather give this up and get the rest of the benefits.

[4]
>   * it has a wide range of "actions" -- ie. what to do with the
>     value supplied with each option.  Eg. you can store that value
>     in a variable, append it to a list, pass it to an arbitrary
>     callback function, etc.

Here the code provides the widest possible range of actions: you run
arbitrary code for each option, and it's all in one place rather than
scattered.

[5]
>   * you can add new types and actions by subclassing -- how to
>     do this is documented and tested

The need for new actions is obviated by not having actions at all.

The need for new types could be addressed by the argument transformer,
although I'm not really happy with that and wouldn't mind seeing it go
away.  In particular,

	ncopies = opt.optarg(opt.typecast(int, 'integer'))

seems a bit more convoluted and slightly ad hoc compared to the
straightforward:

	try:
		ncopies = int(opt.optarg())
	except ValueError:
		opt.error(opt.curopt+' requires an integer argument')

especially when the requirements get complicated, like
the integer has to be prime.  Perhaps a hybrid is best, using a collection
of standard transformers for the common cases and falling back on
actual code for the tough ones.
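
As a sketch of the "tough" case, a transformer for the prime
requirement might look like the following (Python 3; prime_int is a
made-up name, not part of the attached code).  It signals every kind
of bad input with ValueError, which is the exception the attached
_typecast already catches.

```python
import math


def prime_int(text):
    """Convert text to an int and insist that it is prime."""
    n = int(text)                      # raises ValueError for non-integers
    if n < 2:
        raise ValueError('%r is not a prime integer' % (text,))
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            raise ValueError('%r is not a prime integer' % (text,))
    return n
```

With the attached parser this could then be used as
opt.optarg(opt.typecast(prime_int, 'prime integer')), so the hybrid
costs nothing extra in the parser itself.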

[6]
>   * it's dead easy to implement simple, straightforward, GNU/POSIX-
>     style command-line options, but using callbacks you can be as
>     insanely flexible as you like

Here, ditto, except you don't have to use callbacks in order to be as
insanely flexible as you like.

[7]
>   * provides lots of mechanism and only a tiny bit of policy (namely,
>     the --help and (optionally) --version options -- and you can
>     trash that convention if you're determined to be anti-social)

In this version there is very little mechanism (no need for lots), and
no policy.  It would be easy enough to add the --help and --version
hacks as a standard subclass.

Anyhow, there it is.  I've attached the code for the parser, which I 
just whipped up tonight.  If people think this is a promising thing to
explore and someone else wants to take over exploring, great.
If yes promising but no takers, I'm willing to keep at it.

Russ


--- opt.py
from __future__ import generators
import sys

class OptionError(Exception):
	pass

class OptionParser:
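	def __init__(self, argv=sys.argv, usage=None):
		self.argv0 = argv[0]
		self.argv = argv[1:]
		self.usage = usage
		# no option or inline argument is pending yet; without this,
		# optarg() fails when the first option is a long one without '='
		self.curopt = None
		self.curarg = ''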
			
	def __iter__(self):
		# this assumes the "
		while self.argv:
			if self.argv[0]=='-' or self.argv[0][0]!='-':
				break
			a = self.argv.pop(0)
			if a=='--':
				break
			if a[0:2]=='--':
				i = a.find('=')
				if i==-1:
					self.curopt = a
					yield self.curopt
					self.curopt = None
				else:
					self.curarg = a[i+1:]
					self.curopt = a[0:i]
					yield self.curopt
					if self.curarg:		# wasn't fetched with optarg
						self.error(self.curopt+' does not take an argument')
					self.curopt = None
				continue
			self.curarg = a[1:]
			while self.curarg:
				a = self.curarg[0:1]
				self.curarg = self.curarg[1:]
				self.curopt = '-'+a
				yield self.curopt
				self.curopt = None

	def optarg(self, fn=lambda x:x):
		if self.curarg:
			ret = self.curarg
			self.curarg=''
		else:
			try:
				ret = self.argv.pop(0)
			except IndexError:
				self.error(self.curopt+' requires argument')
		return fn(ret)

	def _typecast(self, t, x, desc=None):
		try:
			return t(x)
		except ValueError:
			d = desc
			if d == None:
				d = str(t)
			self.error(self.curopt+' requires '+d+' argument')

	def typecast(self, t, desc=None):
		return lambda x: self._typecast(t, x, desc)

	def args(self):
		return self.argv

	def error(self, msg):
		if self.usage != None:
			sys.stderr.write('option error: '+msg+'\n\n'+self.usage)
			sys.stderr.flush()
			sys.exit(1)		# nonzero status: this is an error exit
		else:
			raise OptionError(msg)

########

import sys

usagemessage=\
'''usage: example.py [-h] [-f FILE] [-n N] [-q] who where
    -h, --help                  show this help message
    -f FILE, --file=FILE        write report to FILE
    -n N, --num=N               print N copies of the report
    -q, --quiet                 don't print status messages to stdout
'''

def main():
	opt = OptionParser(usage=usagemessage)
	report = 'default.file'
	ncopies = 1
	verbose = 1
	for o in opt:
		if o=='-f' or o=='--file':
			report = opt.optarg()
		elif o=='-n' or o=='--num':
			ncopies = opt.optarg(opt.typecast(int, 'integer'))
		elif o=='-q' or o=='--quiet':
			verbose = 0
		else:
			opt.error('unknown option '+o)
	if len(opt.args()) != 2:
		opt.error('incorrect argument count')
	print 'report=%s, ncopies=%s verbose=%s' % (report, ncopies, verbose)
	print 'arguments: ', opt.args()

if __name__=='__main__':
	main()



From martin@v.loewis.de  Tue Feb 12 09:39:44 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 12 Feb 2002 10:39:44 +0100
Subject: [Python-Dev] Order that site-packages is added to sys.path
In-Reply-To: <15464.41842.664484.307330@anthem.wooz.org>
References: <15464.41842.664484.307330@anthem.wooz.org>
Message-ID: <m3665382e7.fsf@mira.informatik.hu-berlin.de>

barry@zope.com (Barry A. Warsaw) writes:

> So the question is: why does site.py append site-packages instead of
> prepending it to sys.path?

I think the rationale was that you are precisely not supposed to
override any of the standard modules. It was considered a good thing
that if you do "import string" in some version of Python, you know
exactly what you will get.

There is currently one exception to that rule, which is the xml module
and PyXML: the standard xml module allows being replaced by add-on
(later, better) packages. However, there have been complaints that
this is so: One of Paul Prescod's applications would break if PyXML
was installed, since PyXML performed some stricter argument checking
in certain cases. 

The same problem would occur more frequently if you have site-packages
in front of the path: The add-on package may behave worse than the
standard package in some cases (especially after installing a Python
bugfix release); this problem is hard to track.

In the specific case, I'd propose the following strategy:
- Get the fixes to the Email package into the 2.2 maintenance branch,
  in addition to getting them into the trunk. This assumes that the
  patches really do fix bugs and are suitable for the general public
  etc.

- If Python 2.2.1 is released before Mailman 2.1, you are done: Just
  tell your users that they need 2.2.1 or 2.1.2, but cannot use 2.2
  (or need to live with limitations in MIME processing).

- If this is not possible, rename the email package inside mailman
  (e.g. xemail).
  It then appears that the standard library package is not suitable
  for mailman, so just ignore its presence, and use your own (under a
  different name). 

- As a compromise, you might consider falling back to the email
  package if you determine it is good enough at installation time, by
  playing with xemail.__init__.__path__, or even replacing xemail with
  email in the same way that xml is replaced with _xmlplus.
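
A much simpler variant of the last idea, sketched in Python 3: check
at import time rather than at installation time, and take the first
candidate package that passes a "good enough" test.  The helper
pick_package and the predicate are hypothetical, and xemail is just
the example name from above; nothing here touches __path__.

```python
import importlib


def pick_package(candidates, good_enough):
    """Return the first importable candidate that passes good_enough()."""
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue                   # not installed, try the next one
        if good_enough(module):
            return module
    raise ImportError('no suitable package among %r' % (list(candidates),))


# Prefer the standard package when it has what we need, else the renamed copy.
email_pkg = pick_package(
    ['email', 'xemail'],
    lambda m: hasattr(m, 'message_from_string'),
)
print(email_pkg.__name__)
```

The application then refers to email_pkg everywhere, so the choice is
made in one place and does not depend on sys.path order.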

Regards,
Martin


From martin@v.loewis.de  Tue Feb 12 09:45:36 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 12 Feb 2002 10:45:36 +0100
Subject: [Python-Dev] a different approach to argument parsing
In-Reply-To: <a3ed7e8ff9854129ba52c84e0d774286@plan9.bell-labs.com>
References: <a3ed7e8ff9854129ba52c84e0d774286@plan9.bell-labs.com>
Message-ID: <m31yfr824f.fsf@mira.informatik.hu-berlin.de>

"Russ Cox" <rsc@plan9.bell-labs.com> writes:

> Anyhow, there it is.  I've attached the code for the parser, which I 
> just whipped up tonight.  If people think this is a promising thing to
> explore and someone else wants to take over exploring, great.
> If yes promising but no takers, I'm willing to keep at it.

I think it is quite promising; I share your views on option parsing,
and like an iterative interface myself very much.

Regards,
Martin


From paul@prescod.net  Tue Feb 12 10:23:47 2002
From: paul@prescod.net (Paul Prescod)
Date: Tue, 12 Feb 2002 02:23:47 -0800
Subject: [Python-Dev] a different approach to argument parsing
References: <a3ed7e8ff9854129ba52c84e0d774286@plan9.bell-labs.com>
Message-ID: <3C68ED33.9244EDA0@prescod.net>

I like the general direction but one thing makes me a little confused...

Russ Cox wrote:
> 
>...
> 
>    for o in opt:
>         if o=='-n' or o=='--num':
>                 ncopies = opt.optarg(opt.needtype(int))

How does "opt" know that I am looking for the arguments to the --num
command line argument and not the --file one? I guess I would expect an
interface more like:

for o, value in opt:
    if o=='-n' or o=='--num':
         ncopies = optparser.needtype(value, 'integer')

 Paul Prescod


From mal@lemburg.com  Tue Feb 12 10:36:00 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Feb 2002 11:36:00 +0100
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15464.41842.664484.307330@anthem.wooz.org>
Message-ID: <3C68F010.4736AE2F@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> I have a bit of a dilemma when it comes to sys.path and the location
> of the site-packages directory.
> 
> The problem comes when someone is using Mailman 2.1 with Python 2.2.
> The latter comes with the email package, which is in the standard
> library.  Through some contributions, my standalone email package now
> supports multibyte character sets in RFC-compliant ways
> (e.g. splitting long headers correctly).  The question is, how do I
> get the updated package to Python 2.2 users?
> 
> The standalone email package is a simple distutils thingie with a
> directory and a bunch of .py files.  distutils sticks this in
> site-packages.  But an "import email" will always get the standard
> library version instead of the site-packages version because site.py
> /appends/ site-packages to sys.path instead of prepending it.
> 
> I can work around this by adding my own path-hacking code before any
> import of email.* modules.  This is a bit ugly because now it means
> that the proper functioning of the application depends on import
> order, and that's nasty.

Why not put the updated email package into the Mailman 
package (is it a package?) ?

That way you can update whatever part you want from the 
Python lib or replace it with something else.
 
> So the question is: why does site.py append site-packages instead of
> prepending it to sys.path?  If there's a valid reason, I don't
> remember it, and I'm currently blind to any valuable use case.  

I guess this is done for the same reason that e.g. /usr/local
is last in PATH on Unix: system top level programs and libs
should always have top priority. Otherwise, a user could easily
override a system program/lib by placing a new version
into the local dir which then gets picked up by other system
programs.

I'd suggest being explicit about what you do and
putting the new code in the package (which is completely
under your control).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Tue Feb 12 10:52:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Feb 2002 11:52:05 +0100
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15464.41842.664484.307330@anthem.wooz.org> <m3665382e7.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C68F3D5.A50F1347@lemburg.com>

"Martin v. Loewis" wrote:
> 
> - As a compromise, you might consider falling back to the email
>   package if you determine it is good enough at installation time, by
>   playing with xemail.__init__.__path__, or even replacing xemail with
>   email in the same way that xml is replaced with _xmlplus.

Hacks of that kind should not be needed if Barry puts his own
email package inside the Mailman package. All local imports
will pick up his version automatically, though I'd still suggest
using explicit imports for it in the Mailman code to avoid magical
problems ;-)

Hacking __path__ should really only be the last resort... it
(usually) breaks installers, gives importers a hard time, 
etc. 

We should not consider this good practice even though it
may be needed sometimes (e.g. by PyXML).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Tue Feb 12 11:40:11 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Feb 2002 12:40:11 +0100
Subject: [Python-Dev] patch: speed up name access by up to 80%
References: <20020211220954.A12061@hishome.net>
Message-ID: <3C68FF1B.C3F449DB@lemburg.com>

Some other things you might want to try:

* Inline small dictionary tables in the PyObject struct and only
  revert to external tables for larger ones. (I have an old patch 
  for this one which you might want to update)

* Optimize for perfect hashing. Sometimes (hopefully most of the time)
  Python will generate a perfect hashing for a set of attributes.
  In that case, it could set a flag in the dictionary object to
  be able to use a faster lookup function.
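
A rough sketch of what "perfect hashing" could mean here, in Python
rather than C. The assumptions are mine: the table size is a power of
two, a key's first slot is hash(key) masked to the table size, and
"perfect" means no two keys share a first slot. The function name is
hypothetical and is not part of dictobject.c.

```python
def is_perfect_hash(keys, table_size):
    """Return True if every key lands in its own first slot."""
    if table_size <= 0 or table_size & (table_size - 1):
        raise ValueError("table_size must be a power of two")
    mask = table_size - 1
    seen = set()
    for key in keys:
        slot = hash(key) & mask
        if slot in seen:
            return False  # two keys collide on their first probe
        seen.add(slot)
    return True
```

A dictionary that passes such a check never needs a second probe for
its existing keys, which is the case a specialised lookup function
could exploit.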

BTW, could you run pybench against your patch ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From oren-py-d@hishome.net  Tue Feb 12 13:29:47 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 12 Feb 2002 08:29:47 -0500
Subject: [Python-Dev] patch: speed up name access by up to 80%
In-Reply-To: <3C68FF1B.C3F449DB@lemburg.com>
References: <20020211220954.A12061@hishome.net> <3C68FF1B.C3F449DB@lemburg.com>
Message-ID: <20020212132947.GA13065@hishome.net>

On Tue, Feb 12, 2002 at 12:40:11PM +0100, M.-A. Lemburg wrote:
> 
> * Inline small dictionary tables in the PyObject struct and only
>   revert to external tables for larger ones. (I have an old patch 
>   for this one which you might want to update)
> 
> * Optimize for perfect hashing. Sometimes (hopefully most of the time)
>   Python will generate a perfect hashing for a set of attributes.
>   In that case, it could set a flag in the dictionary object to
>   be able to use a faster lookup function.

Interesting, but I am exploring other directions now: attribute access, 
hints associated with negative entries that should speed up the next lookup 
in the chain, and getting the inline/fast ratio from 3:1 up to 10:1 or 
higher.

> BTW, could you run pybench against your patch ?

18331152 fastlocal non-dictionary lookups
416661 inline dictionary lookups
131509 fast dictionary lookups
200 slow dictionary lookups

With 97% of accesses using fastlocals it's not going to have any significant
effect.
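
The arithmetic behind the 97% figure can be checked with a few lines
of present-day Python. The counts are the four numbers quoted above;
the dictionary keys are just labels chosen for this sketch.

```python
# Lookup counts as reported for the pybench run above.
counts = {
    "fastlocal": 18331152,
    "inline dict": 416661,
    "fast dict": 131509,
    "slow dict": 200,
}

total = sum(counts.values())
fastlocal_share = counts["fastlocal"] / total
inline_to_fast = counts["inline dict"] / counts["fast dict"]

print("fastlocal share: %.1f%%" % (100 * fastlocal_share))
print("inline/fast ratio: %.2f:1" % inline_to_fast)
```

The share comes out at roughly 97%, and the inline/fast ratio at a
little over 3:1, matching the ratio mentioned earlier in the message.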

	Oren



From guido@python.org  Tue Feb 12 13:45:21 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 12 Feb 2002 08:45:21 -0500
Subject: [Python-Dev] a different approach to argument parsing
In-Reply-To: Your message of "Tue, 12 Feb 2002 02:23:47 PST."
 <3C68ED33.9244EDA0@prescod.net>
References: <a3ed7e8ff9854129ba52c84e0d774286@plan9.bell-labs.com>
 <3C68ED33.9244EDA0@prescod.net>
Message-ID: <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net>

OK, I was wrong when I expected there wouldn't be much traffic.  The
design of an option parser alternative does *not* belong on
python-dev.  Please get this discussion off the list NOW and move it
elsewhere.  You can come back here when you've agreed on a solution.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Feb 12 13:53:21 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 12 Feb 2002 08:53:21 -0500
Subject: [Python-Dev] patch: speed up name access by up to 80%
In-Reply-To: Your message of "Tue, 12 Feb 2002 12:40:11 +0100."
 <3C68FF1B.C3F449DB@lemburg.com>
References: <20020211220954.A12061@hishome.net>
 <3C68FF1B.C3F449DB@lemburg.com>
Message-ID: <200202121353.g1CDrLq30278@pcp742651pcs.reston01.va.comcast.net>

> * Inline small dictionary tables in the PyObject struct and only
>   revert to external tables for larger ones. (I have an old patch 
>   for this one which you might want to update)

I may be missing some context, but AFAIK we already do this.  See
dictobject.h: the last item in struct _dictobject is

	PyDictEntry ma_smalltable[PyDict_MINSIZE];

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Tue Feb 12 14:05:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Feb 2002 15:05:05 +0100
Subject: [Python-Dev] patch: speed up name access by up to 80%
References: <20020211220954.A12061@hishome.net>
 <3C68FF1B.C3F449DB@lemburg.com> <200202121353.g1CDrLq30278@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C692111.5C39D266@lemburg.com>

Guido van Rossum wrote:
> 
> > * Inline small dictionary tables in the PyObject struct and only
> >   revert to external tables for larger ones. (I have an old patch
> >   for this one which you might want to update)
> 
> I may be missing some context, but AFAIK we already do this.  See
> dictobject.h: the last item in struct _dictobject is
> 
>         PyDictEntry ma_smalltable[PyDict_MINSIZE];

Nice :-) I must have missed that addition (or simply forgotten
about it).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Tue Feb 12 14:15:12 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Feb 2002 15:15:12 +0100
Subject: [Python-Dev] SSL support in _socket
Message-ID: <3C692370.D21EF15D@lemburg.com>

I have a problem with SSL support in _socket and the way
setup.py does the autodetection: even though SSL may be
installed on the system, it seems that they changed the
exposed APIs between patch level releases. 

As a result, _socket compiles but the import fails on platforms 
which have the wrong OpenSSL version installed. setup.py then
simply removes _socket from the extension list and builds
Python without socket support which is a really Bad Thing 
since _socket without SSL support compiles just fine.

What can we do about this ?

Since auto-detection is happening rather early in setup.py
it doesn't seem possible to apply some fallback scheme
depending on extra knowledge for the various modules.

Perhaps we should simply let setup.py build two extensions:
_socket (without SSL) and _socketssl (with SSL) ?! If the
_socketssl build or import fails for some reason, Python 
could still pick up the _socket extension in socket.py.
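
The fallback could look roughly like the sketch below. It is only an
illustration in present-day Python: import_first_available is a
hypothetical helper, and _socketssl is the proposed module name from
the paragraph above, not something that exists today.

```python
import importlib


def import_first_available(names):
    """Import and return the first module in names that imports cleanly."""
    last_error = None
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Covers both a missing module and an extension that fails
            # to load, e.g. because of a mismatched shared library.
            last_error = exc
    raise ImportError("none of %r could be imported" % (list(names),)) from last_error
```

socket.py would then do something along the lines of
impl = import_first_available(["_socketssl", "_socket"]) and take
whatever it gets, so a broken SSL build costs only the SSL features.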

Comments ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From guido@python.org  Tue Feb 12 14:25:24 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 12 Feb 2002 09:25:24 -0500
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: Your message of "Tue, 12 Feb 2002 15:15:12 +0100."
 <3C692370.D21EF15D@lemburg.com>
References: <3C692370.D21EF15D@lemburg.com>
Message-ID: <200202121425.g1CEPO330489@pcp742651pcs.reston01.va.comcast.net>

> Perhaps we should simply let setup.py build two extensions:
> _socket (without SSL) and _socketssl (with SSL) ?! If the
> _socketssl build or import fails for some reason, Python 
> could still pick up the _socket extension in socket.py.

+1

--Guido van Rossum (home page: http://www.python.org/~guido/)


From david@boddie.net  Tue Feb 12 15:08:20 2002
From: david@boddie.net (David Boddie)
Date: Tue, 12 Feb 2002 15:08:20 +0000
Subject: [Python-Dev] Re: RFC: Option Parsing Libraries
Message-ID: <20020212150948.0E0492AA62@wireless-084-136.tele2.co.uk>

Paul Prescod wrote:

> Greg Ward has proposed to add his Optik module to the Python library.
> 
>  * http://mail.python.org/pipermail/python-dev/2002-February/019934.html
> 
> Optik is a parser for command line options with the features described
> here:
> 
>  * http://mail.python.org/pipermail/python-dev/2002-February/019937.html
> 
> If you have a competitive library, or suggestions for changes to Optik,
> please forward your comments to python-dev mailing list
> (python-dev@python.org). If a long-running conversation ensues, you may
> need to join the list to participate. Discussions in comp.lang.python
> will not be considered unless you at least forward a reference to
> python-dev. If you propose a competitor to Optik, please describe how it
> compares to the feature list described above.

I have been working on a library which I hope begins to unify the
presentation of arguments to the programmer with the syntax that is
presented to the user.

In the form of the list in the first reference above:

cmdsyntax.py (Feb 2002)
  Author: David Boddie <david@boddie.net>
  URL:    http://www-solar.mcs.st-and.ac.uk/~davidb/Software/Python/cmdsyntax/

  * An attempt at OO design, with limited functionality allowing you to:

    1. Set up a syntax object using a syntax definition.
    2. Supply arguments from sys.argv or a string and retrieve either:

    2a. A dictionary containing values corresponding to the required
        arguments.
    2b. A list of possible matches with the definition.

  * Arguments passed must conform to a familiar looking syntax definition.
    For example, the string "infile [-o outfile]" indicates that one argument
    is necessary and that another may be specified using the -o switch.

  * Short and long options are allowed.

    The short options cannot accept arguments as in the example "-ffoo",
    but lists of options such as "-avx" are supported and accept
    combinations of these options in any order.

    Long options support both the "--no-value" and "--name=value"
    variants.

  * Command arguments may be specified, which must be matched
    exactly by user input, as in the case

    "add"|"remove" value

    which requires that either the command "add" or "remove" be given
    with a following argument.

  * Arguments/options may be grouped using brackets.

  * No type information is specified, so arguments are not typed before
    being presented to the programmer.

  * Excessive method used to match arguments with the syntax definition:
    all possible definitions are generated then arguments are matched
    against each one.

  * No ability to catch remaining unspecified arguments.

  * No license.

  * Needs more testing.
    
I'm not proposing this as a competitor to Optik, but I'm happy to donate any
ideas and code to the effort.

David



From nas@python.ca  Tue Feb 12 15:13:37 2002
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 12 Feb 2002 07:13:37 -0800
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEIANLAA.tim.one@comcast.net>; from tim.one@comcast.net on Mon, Feb 11, 2002 at 11:31:39PM -0500
References: <15463.55207.217946.237969@12-248-41-177.client.attbi.com> <LNBBLJKPBEHFEDALKOLCIEIANLAA.tim.one@comcast.net>
Message-ID: <20020212071336.A32363@glacier.arctrix.com>

Tim Peters wrote:
> Speaking of which, why does LOAD_FAST waste time checking against NULL
> twice?!

If you would have approved my patch it would be fixed already.

  one-small-banana-left-ly y'rs Neil


From gward@python.net  Tue Feb 12 15:27:24 2002
From: gward@python.net (Greg Ward)
Date: Tue, 12 Feb 2002 10:27:24 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
In-Reply-To: <15464.41842.664484.307330@anthem.wooz.org>
References: <15464.41842.664484.307330@anthem.wooz.org>
Message-ID: <20020212152724.GA24891@gerg.ca>

On 12 February 2002, Barry A. Warsaw said:
> The standalone email package is a simple distutils thingie with a
> directory and a bunch of .py files.  distutils sticks this in
> site-packages.  But an "import email" will always get the standard
> library version instead of the site-packages version because site.py
> /appends/ site-packages to sys.path instead of prepending it.
> 
> I can work around this by adding my own path-hacking code before any
> import of email.* modules.  This is a bit ugly because now it means
> that the proper functioning of the application depends on import
> order, and that's nasty.

Looong ago, I tried to persuade Guido that giving the Distutils the
power to override standard library modules would, on rare occasions, be
a good and useful thing.  (Yet another idea stolen from Perl's
MakeMaker, which can do precisely that.  Sometimes, it's useful.)  Guess
who won?

        Greg
-- 
Greg Ward - just another Python hacker                  gward@python.net
http://starship.python.net/~gward/
A closed mouth gathers no foot.


From paul@prescod.net  Tue Feb 12 15:45:11 2002
From: paul@prescod.net (Paul Prescod)
Date: Tue, 12 Feb 2002 07:45:11 -0800
Subject: [Python-Dev] a different approach to argument parsing
References: <a3ed7e8ff9854129ba52c84e0d774286@plan9.bell-labs.com>
 <3C68ED33.9244EDA0@prescod.net> <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C693886.28EA8595@prescod.net>

Guido van Rossum wrote:
> 
> OK, I was wrong when I expected there wouldn't be much traffic.  The
> design of an option parser alternative does *not* belong on
> python-dev.  Please get this discussion off the list NOW and move it
> elsewhere.  You can come back here when you've agreed on a solution.

Hey, I just did the announcement. I'm not the ringleader. If someone
(e.g. at pythonlabs) can set up a mailman for us then I'll send out
another announcement telling people about it. But I don't intend to
become the point man for option parsing! I could also set up a
yahoogroups list but those are somewhat annoying in my experience.

 Paul Prescod


From gward@python.net  Tue Feb 12 15:50:54 2002
From: gward@python.net (Greg Ward)
Date: Tue, 12 Feb 2002 10:50:54 -0500
Subject: [Python-Dev] a different approach to argument parsing
In-Reply-To: <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net>
References: <a3ed7e8ff9854129ba52c84e0d774286@plan9.bell-labs.com> <3C68ED33.9244EDA0@prescod.net> <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020212155054.GB24891@gerg.ca>

On 12 February 2002, Guido van Rossum said:
> OK, I was wrong when I expected there wouldn't be much traffic.  The
> design of an option parser alternative does *not* belong on
> python-dev.  Please get this discussion off the list NOW and move it
> elsewhere.  You can come back here when you've agreed on a solution.

I'm about to (try to) create a list on Starship for this: how does
getopt-alternatives@python.net sound as a place to discuss this issue?

Everyone who has posted on this thread will receive an invitation to
join the list.  (Assuming I can get Mailman to do my bidding, that is.)

Glad I brought the whole thing up though: hopefully something good will
emerge!

        Greg
-- 
Greg Ward - just another Python hacker                  gward@python.net
http://starship.python.net/~gward/
All things are possible -- except skiing through a revolving door.


From barry@zope.com  Tue Feb 12 16:03:59 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 11:03:59 -0500
Subject: [Python-Dev] a different approach to argument parsing
References: <a3ed7e8ff9854129ba52c84e0d774286@plan9.bell-labs.com>
 <3C68ED33.9244EDA0@prescod.net>
 <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net>
 <20020212155054.GB24891@gerg.ca>
Message-ID: <15465.15599.363507.279197@anthem.wooz.org>

>>>>> "GW" == Greg Ward <gward@python.net> writes:

    GW> I'm about to (try to) create a list on Starship for this: how
    GW> does getopt-alternatives@python.net sound as a place to
    GW> discuss this issue?

I'm happy to create the list on python.org if you prefer.  I'd go for
full SIG status: getopt-sig@python.org.  Let its charter be short lived.

If Greg's willing to be the champion, I'll set this up.

-Barry


From fdrake@acm.org  Tue Feb 12 16:16:27 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 12 Feb 2002 11:16:27 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEIBNLAA.tim.one@comcast.net>
References: <200202111628.g1BGSxF20159@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCKEIBNLAA.tim.one@comcast.net>
Message-ID: <15465.16347.962182.714475@grendel.zope.com>

Tim Peters writes:
 > Vanilla pymalloc is perfect for this:  many small objects.  A custom free
 > list for cells is a waste of code, because a cell never goes away until the
 > module does:  cells will not "churn".  We'll get a lot of them, but most of
 > them will stay alive until the program ends, so the tiny performance gain
 > you may be able to get from a thoroughly specialized free list "in theory"
 > will never be realized in practice.

Have we become convinced that these cells need to be Python objects?
I must have missed that.  As long as we can keep them simple
structures, we should be able to avoid individual allocations for
them.  It seems we have a fixed number of cells for both module
objects and function objects (regardless of whether they are part of
the new celldict or the containing function or module), so they can be
allocated as an array rather than individually.

So, I must be missing something.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From barry@zope.com  Tue Feb 12 16:43:24 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 11:43:24 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15464.41842.664484.307330@anthem.wooz.org>
 <m3665382e7.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15465.17964.571257.835907@anthem.wooz.org>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    MvL> I think the rationale was that you are precisely not supposed
    MvL> to override any of the standard modules. It was considered a
    MvL> good thing that if you do "import string" in some version of
    MvL> Python, you know exactly what you will get.

Okay, I can see why that's useful.

Let's say there was a way to add stuff to the front of sys.path, such
that they could override the standard library.  This might work just
fine on a single user (or single application) system, but might be
very broken on a multiuser or multiapp system ("I know what I'm
installing in site-packages, so what's the problem?").

Hopefully, any overrides that were installed would be API compatible
with the standard.  Such overrides would probably be allowed to fix
bugs or add functionality, but not remove functionality.  This might
still get us into trouble and this path leads to module versioning,
etc.  I don't want to go there now.

I know how to handle my specific case (I've done it before), but just
to close the loop, I can't wait for Python 2.2.1 because some of the
features I'm depending on are new features, not just bug fixes.  I
think those will have to wait for Python 2.3 to be safe, so until
then, I must distribute a separate package.  That's fine, I can live
with that.

I think "python setup.py install --root blah" will do the trick for
me, along with some application specific path-hackery.

-Barry


From barry@zope.com  Tue Feb 12 16:46:26 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 11:46:26 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15464.41842.664484.307330@anthem.wooz.org>
 <3C68F010.4736AE2F@lemburg.com>
Message-ID: <15465.18146.700970.932676@anthem.wooz.org>

>>>>> "MAL" == M.-A. Lemburg <mal@lemburg.com> writes:

    MAL> I guess this is done for the same reason that e.g. /usr/local
    MAL> is last in PATH on Unix: system top level programs and libs
    MAL> should always have top priority. Otherwise, a user could
    MAL> easily override a system program/lib by placing a new version
    MAL> into the local dir which then gets picked up by other system
    MAL> programs.

Well, hopefully you'd control who can write into /usr/local so that
you could trust overrides being installed there.  On a single user
system, I usually do in fact put /usr/local/bin early in my path
specifically because I do want to override older, buggier, system
programs.

The analogy is similar in the Python situation.  When I'm the only
person using the system, and I'm in control of everything, being able
to override the standard library is a very useful thing to do.  When
there's less trust in the environment I'm running in, or more sharing
of common resources, it can be problematic.

-Barry


From barry@zope.com  Tue Feb 12 16:48:58 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 11:48:58 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15464.41842.664484.307330@anthem.wooz.org>
 <20020212152724.GA24891@gerg.ca>
Message-ID: <15465.18298.611257.213141@anthem.wooz.org>

>>>>> "GW" == Greg Ward <gward@python.net> writes:

    GW> Looong ago, I tried to persuade Guido that giving the
    GW> Distutils the power to override standard library modules
    GW> would, on rare occasions, be a good and useful thing.  (Yet
    GW> another idea stolen from Perl's MakeMaker, which can do
    GW> precisely that.  Sometimes, it's useful.)  Guess who won?

distutils's --root option could be used to specify a different
install directory than site-packages, right?  So conceivably site.py
could prepend some directory onto sys.path, and distutils could be
coaxed into installing there rather than site-packages.  This might
provide a principled way to override Python's standard library when
you're really sure that's what you want to do.

-Barry


From mwh@python.net  Tue Feb 12 17:05:02 2002
From: mwh@python.net (Michael Hudson)
Date: 12 Feb 2002 17:05:02 +0000
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
Message-ID: <2m6652my0x.fsf@starship.python.net>

Some time ago, Gareth McCaughan suggested a syntax for staticmethods.
You'd write

class C(object):
    def static(arg) [staticmethod]:
        return 1 + arg

C.static(2)
   => 3

The way this works is that the above becomes syntactic sugar for
roughly:

class C(object):
    def $temp(arg):
        return 1 + arg
    static = staticmethod($temp)

Anyway, I thought this was a reasonably pythonic idea, so I
implemented it, and thought I'd mention it here.  Patch at:

    http://starship.python.net/crew/mwh/hacks/meth-syntax-sugar.diff

Some other things that become possible:

>>> class D(object):
...     def x(self) [property]:
...         return "42"
...
hello!
>>> D().x
'42'

(the hello! is a debugging printf I haven't taken out yet...)

>>> def published(func):
...     func.publish = 1
...     return func
...
>>> def m() [published]:
...     print "hiya!"
...
hello!
>>> m.publish
1

>>> def hairy_constant() [apply]:
...     return math.cos(1 + math.log(34))
...
hello!
>>> hairy_constant
-0.18495734252481616

>>> def memoize(func):
...     cache = {}
...     def f(*args):
...         try:
...             return cache[args]
...         except KeyError:
...             return cache.setdefault(args, func(*args))
...     return f
...
>>> def fib(a) [memoize]:
...     if a < 2: return 1
...     return fib(a-1) + fib(a-2)
...
hello!
>>> fib(40)
165580141 # fairly quickly


I'm not sure all of these are Good Things (esp. the [apply] one...).
OTOH, I think the idea is worth discussion (or squashing by Guido :).
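
For comparison, here is a sketch of how two of the examples above are
spelled without the patch, using only the explicit rebinding that the
proposed syntax is sugar for. The except clause is narrowed to
KeyError, which is my choice rather than part of the original example.

```python
class C(object):
    def static(arg):
        return 1 + arg
    # Explicit form of the proposed "def static(arg) [staticmethod]:".
    static = staticmethod(static)


def memoize(func):
    cache = {}

    def wrapper(*args):
        try:
            return cache[args]
        except KeyError:
            return cache.setdefault(args, func(*args))
    return wrapper


def fib(a):
    if a < 2:
        return 1
    return fib(a - 1) + fib(a - 2)


# Explicit form of "def fib(a) [memoize]:".  Rebinding the global name
# means the recursive calls inside fib also go through the cache.
fib = memoize(fib)
```

The behaviour is the same; the proposed syntax only moves the wrapper
next to the def line instead of after the function body.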

Cheers,
M.

-- 
  For every complex problem, there is a solution that is simple,
  neat, and wrong.                                    -- H. L. Mencken


From jeremy@zope.com  Tue Feb 12 17:08:51 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Tue, 12 Feb 2002 12:08:51 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
In-Reply-To: <15465.18298.611257.213141@anthem.wooz.org>
References: <15464.41842.664484.307330@anthem.wooz.org>
 <20020212152724.GA24891@gerg.ca>
 <15465.18298.611257.213141@anthem.wooz.org>
Message-ID: <15465.19491.615551.709595@gondolin.digicool.com>

>>>>> "BAW" == Barry A Warsaw <barry@zope.com> writes:

  BAW> distutils's --root option could be used to specify a different
  BAW> install directory than site-packages, right?  So conceivably
  BAW> site.py could prepend some directory onto sys.path, and
  BAW> distutils could be coaxed into installing there rather than
  BAW> site-packages.  This might provide a principled way to override
  BAW> Python's standard library when you're really sure that's what
  BAW> you want to do.

Why don't you use "--root /usr/local/lib/python2.2" and *really*
override the standard library?  

It seems fragile to extend Python with yet more directories to search
in a special order so that the interpreter picks up the correct copy
of somemodule.py from among the four or five copies installed on the
system and on the path.

Jeremy



From gward@python.net  Tue Feb 12 17:17:52 2002
From: gward@python.net (Greg Ward)
Date: Tue, 12 Feb 2002 12:17:52 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
In-Reply-To: <15465.19491.615551.709595@gondolin.digicool.com>
References: <15464.41842.664484.307330@anthem.wooz.org> <20020212152724.GA24891@gerg.ca> <15465.18298.611257.213141@anthem.wooz.org> <15465.19491.615551.709595@gondolin.digicool.com>
Message-ID: <20020212171752.GA25558@gerg.ca>

On 12 February 2002, Jeremy Hylton said:
> Why don't you use "--root /usr/local/lib/python2.2" and *really*
> override the standard library?  

No: --root just lets you replace / with something else.  It's mainly so
you can build an RPM (eg.) without being superuser.  Your example
would install to
  /usr/local/lib/python2.2/usr/local/lib/python2.2/site-packages
...which is probably not what you meant.

The distutils install command *is* pretty flexible; if someone cares to
sit down and figure it out, I'm sure this is possible.  It's just not
documented or obvious.
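
A small sketch of the --root behaviour described above, in present-day
Python. apply_root is a hypothetical helper written for this
illustration, not the distutils implementation, and it assumes POSIX
paths and a default install directory of
/usr/local/lib/python2.2/site-packages.

```python
import posixpath


def apply_root(root, install_dir):
    """Re-anchor install_dir under root, the way --root replaces "/"."""
    # Dropping the leading slash keeps join() from discarding root.
    return posixpath.join(root, install_dir.lstrip("/"))


# Assumed default install location for a /usr/local Python 2.2.
default_dir = "/usr/local/lib/python2.2/site-packages"
doubled = apply_root("/usr/local/lib/python2.2", default_dir)
print(doubled)
```

The whole default path is kept and merely re-rooted, which is how the
doubled directory in the example comes about.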

        Greg
-- 
Greg Ward - Linux weenie                                gward@python.net
http://starship.python.net/~gward/
"He's dead, Jim.  You get his tricorder and I'll grab his wallet."


From jeremy@alum.mit.edu  Tue Feb 12 17:29:37 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Tue, 12 Feb 2002 12:29:37 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
In-Reply-To: <20020212171752.GA25558@gerg.ca>
References: <15464.41842.664484.307330@anthem.wooz.org>
 <20020212152724.GA24891@gerg.ca>
 <15465.18298.611257.213141@anthem.wooz.org>
 <15465.19491.615551.709595@gondolin.digicool.com>
 <20020212171752.GA25558@gerg.ca>
Message-ID: <15465.20737.379286.281107@gondolin.digicool.com>

Perhaps --home then?  I know there's some command I've used to install
Python packages in my Zope lib/python directory by spelling out its
full path.

Jeremy



From jason-dated-1014226523.fe612b@mastaler.com  Tue Feb 12 17:35:21 2002
From: jason-dated-1014226523.fe612b@mastaler.com (Jason R. Mastaler)
Date: Tue, 12 Feb 2002 10:35:21 -0700
Subject: [Python-Dev] Re: RFC: Option Parsing Libraries
In-Reply-To: <mailman.1013528702.31641.clpa-moderators@python.org> (Paul
 Prescod's message of "Mon, 11 Feb 2002 08:54:53 -0800")
References: <mailman.1013528702.31641.clpa-moderators@python.org>
Message-ID: <hhsn86iox2.fsf@nightshade.la.mastaler.com>

I haven't specifically looked at Optik, but if another option parser
is going to be added to the standard lib, there is one thing I'd like
it to have which the getopt module currently doesn't:

Support for optional arguments.  That is, the ability to specify that
an option *may* have an argument, and not just that it either must or
can't have an argument.

I find this limitation in getopt very frustrating.
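
As an illustration of the requested behaviour (not of Optik or getopt),
the argparse module found in present-day Python spells an optional
option argument with nargs="?". The program name, option name and the
"stderr" placeholder are made up for this sketch.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="demo")
    # nargs="?" makes the option's argument optional:
    #   option absent          -> default
    #   option given, no value -> const
    #   option given a value   -> that value
    parser.add_argument("--log", nargs="?", const="stderr", default=None)
    return parser


parser = build_parser()
absent = parser.parse_args([]).log
bare = parser.parse_args(["--log"]).log
valued = parser.parse_args(["--log", "out.txt"]).log
```

The three results distinguish "not given", "given without a value" and
"given with a value", which is exactly the distinction getopt cannot
express.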

-- 
http://tmda.sourceforge.net/


From mal@lemburg.com  Tue Feb 12 17:45:48 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Feb 2002 18:45:48 +0100
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15464.41842.664484.307330@anthem.wooz.org>
 <3C68F010.4736AE2F@lemburg.com> <15465.18146.700970.932676@anthem.wooz.org>
Message-ID: <3C6954CC.46B215D8@lemburg.com>

"Barry A. Warsaw" wrote:
> [distutils --root hackery]

Why not use the subpackage approach I suggested ? 

It keeps the std lib in a sane state (meaning that the std lib 
installation only depends on the Python installation and no other 
hacks on top of it).

Since you'll have to ship the complete package anyway, 
I don't see any win in installing over the std email 
package. If that's what you really want, I'd suggest providing 
the updated email package as a separate download and then testing
inside Mailman for the new version.
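
A sketch of such a version test, in present-day Python. It assumes the
package exposes a dotted __version__ string; the helper names and the
two stand-in modules are hypothetical and exist only for this example.

```python
import types


def version_tuple(text):
    """Turn a dotted version string such as "2.0.1" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_new_enough(module, minimum):
    """Return True if module.__version__ exists and is at least minimum."""
    found = getattr(module, "__version__", None)
    if found is None:
        return False
    try:
        return version_tuple(found) >= version_tuple(minimum)
    except ValueError:
        # Non-numeric parts such as "1.0b1" are treated as "too old".
        return False


# Stand-ins for an updated and a stock package (made-up versions).
updated = types.ModuleType("updated_email")
updated.__version__ = "2.0.1"
stock = types.ModuleType("stock_email")
stock.__version__ = "1.0"
unversioned = types.ModuleType("unversioned_email")
```

The application would run the check once at startup and refuse to run,
or fall back to a bundled copy, when the installed package is too old.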

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Tue Feb 12 17:55:23 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Feb 2002 18:55:23 +0100
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
Message-ID: <3C69570B.CD65B453@lemburg.com>

Michael Hudson wrote:
> 
> Some time ago, Gareth McCaughan suggested a syntax for staticmethods.
> You'd write
> 
> class C(object):
>     def static(arg) [staticmethod]:
>         return 1 + arg
> 
> C.static(2)
>    => 3
> 

Certainly looks nice.

I'd just use a shorter name for [staticmethod], e.g. [static].

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/



From barry@zope.com  Tue Feb 12 18:03:05 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 13:03:05 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15464.41842.664484.307330@anthem.wooz.org>
 <3C68F010.4736AE2F@lemburg.com>
 <15465.18146.700970.932676@anthem.wooz.org>
 <3C6954CC.46B215D8@lemburg.com>
Message-ID: <15465.22745.136540.172075@anthem.wooz.org>

>>>>> "MAL" == M.-A. Lemburg <mal@lemburg.com> writes:

    MAL> "Barry A. Warsaw" wrote:
    >> [distutils --root hackery]

    MAL> Why not use the subpackage approach I suggested ?

    MAL> It keeps the std lib in a sane state (meaning that the std
    MAL> lib installation only depends on the Python installation and
    MAL> no other hacks on top of it).

    MAL> Since you'll have to ship the complete package anyway, I
    MAL> don't see any win in installing over the std email
    MAL> package. If that's what you really want, I'd suggest to
    MAL> provide the updated email package as separate download and
    MAL> then test inside Mailman for the new version.

I'm fine with installing in a Mailman specific location, but I still
want to use as much of the distutils machinery as possible.

It looks like

    python setup.py install --home=/some/path

gets close enough.  This will install the email package into
/some/path/lib/python and I can easily arrange for that to be in the
right place on sys.path, at least for the mail program and the cgi
program.

The command line scripts are a bit trickier because you can't wheedle
your way into Python's startup machinery without 1) telling your users
to setenv PYTHONPATH (yuck) or 2) importing a path-hacking module
before any that require the override location.  Since I already have
to do #2 anyway, this isn't much of a problem, except that some
imports will have to be rearranged.  It also makes things a little
trickier when a user does eventually upgrade to Python 2.3, which will
obviate the need for the enhanced package (hopefully).

Like everyone else, I'm sure I'll eventually just end up shipping my
own complete Python distro to make sure it's got exactly what you
need. ;)

-Barry
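
A minimal sketch of the "path-hacking module" from option 2 above, assuming
a hypothetical helper name (prepend_override_dir) and using the
/some/path/lib/python location only because it is the example directory
mentioned in the message. It is not Mailman's actual module; it only shows
why such a module has to be imported before anything that needs the
override location.

```python
import sys


def prepend_override_dir(override_dir, path=None):
    """Move override_dir to the front of the import search path.

    `path` defaults to sys.path; passing a plain list makes the
    function easy to try out without touching the real search path.
    """
    if path is None:
        path = sys.path
    # Remove an existing entry first so the directory appears only once.
    if override_dir in path:
        path.remove(override_dir)
    path.insert(0, override_dir)
    return path


# Illustrative use on a stand-in list rather than the real sys.path.
search_path = ["/usr/lib/python2.2", "/some/path/lib/python"]
prepend_override_dir("/some/path/lib/python", search_path)
print(search_path)
```

Any "import email" executed after the call would find the copy under the
override directory first, which is why imports in the command line scripts
would need to be rearranged.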


From barry@zope.com  Tue Feb 12 18:03:41 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 13:03:41 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15464.41842.664484.307330@anthem.wooz.org>
 <20020212152724.GA24891@gerg.ca>
 <15465.18298.611257.213141@anthem.wooz.org>
 <15465.19491.615551.709595@gondolin.digicool.com>
 <20020212171752.GA25558@gerg.ca>
Message-ID: <15465.22781.118071.730570@anthem.wooz.org>

>>>>> "GW" == Greg Ward <gward@python.net> writes:

    GW> The distutils install command *is* pretty flexible; if someone
    GW> cares to sit down and figure it out, I'm sure this is
    GW> possible.  It's just not documented or obvious.

Any hope of actually documenting all this stuff? <wink>

-Barry


From barry@zope.com  Tue Feb 12 18:12:45 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 13:12:45 -0500
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
References: <2m6652my0x.fsf@starship.python.net>
Message-ID: <15465.23325.41291.966138@anthem.wooz.org>

>>>>> "MH" == Michael Hudson <mwh@python.net> writes:

    MH> Some time ago, Gareth McCaughan suggested a syntax for
    MH> staticmethods.  You'd write

    MH> class C(object):
    |     def static(arg) [staticmethod]:
    |         return 1 + arg

    | C.static(2)
    |    => 3

Very interesting!  Why the square brackets though?  Is that just for
visual offset or is there a grammar constraint that requires them?
I'd leave them out of the picture, unless you mean to imply that a
list is acceptable in that position <wink>.

salt-and-pep-per-ly y'rs,
-Barry


From skip@pobox.com  Tue Feb 12 18:29:37 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 12 Feb 2002 12:29:37 -0600
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
In-Reply-To: <15465.23325.41291.966138@anthem.wooz.org>
References: <2m6652my0x.fsf@starship.python.net>
 <15465.23325.41291.966138@anthem.wooz.org>
Message-ID: <15465.24337.8353.562658@beluga.mojam.com>

    MH> class C(object):
    MH>     def static(arg) [staticmethod]:
    MH>         return 1 + arg

    BAW> Why the square brackets though?

I believe Guido addressed this in his DevDay presentation.  The list
construct is to allow future extensions without requiring parser changes.

Skip



From barry@zope.com  Tue Feb 12 18:44:11 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 13:44:11 -0500
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
References: <2m6652my0x.fsf@starship.python.net>
 <15465.23325.41291.966138@anthem.wooz.org>
 <15465.24337.8353.562658@beluga.mojam.com>
Message-ID: <15465.25211.385898.880479@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    MH> class C(object):
    MH>     def static(arg) [staticmethod]:
    MH>         return 1 + arg

    BAW> Why the square brackets though?

    SM> I believe Guido addressed this in his DevDay presentation.
    SM> The list construct is to allow future extensions without
    SM> requiring parser changes.

Okie dokie.
-Barry


From guido@python.org  Tue Feb 12 19:11:15 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 12 Feb 2002 14:11:15 -0500
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
In-Reply-To: Your message of "Tue, 12 Feb 2002 12:29:37 CST."
 <15465.24337.8353.562658@beluga.mojam.com>
References: <2m6652my0x.fsf@starship.python.net> <15465.23325.41291.966138@anthem.wooz.org>
 <15465.24337.8353.562658@beluga.mojam.com>
Message-ID: <200202121911.g1CJBFr31684@pcp742651pcs.reston01.va.comcast.net>

>     MH> class C(object):
>     MH>     def static(arg) [staticmethod]:
>     MH>         return 1 + arg
> 
>     BAW> Why the square brackets though?
> 
> I believe Guido addressed this in his DevDay presentation.  The list
> construct is to allow future extensions without requiring parser changes.

It was only one of the many grammar options I proposed, semi-jokingly.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From DavidA@ActiveState.com  Tue Feb 12 19:10:25 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 12 Feb 2002 11:10:25 -0800
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
References: <2m6652my0x.fsf@starship.python.net>
Message-ID: <3C6968A1.1ED5158A@activestate.com>

Michael Hudson wrote:
> 
> Some time ago, Gareth McCaughan suggested a syntax for staticmethods.
> You'd write
> 
> class C(object):
>     def static(arg) [staticmethod]:
>         return 1 + arg
> 
> C.static(2)
>    => 3

Nice!

Note that this is quite similar to the [WebMethod] in C#, VB.Net, etc.,
and indeed we could have [webmethod] for some variation of a SOAP/RPC
interface.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpcondeclaringwebservice.asp


From gward@python.net  Tue Feb 12 19:47:46 2002
From: gward@python.net (Greg Ward)
Date: Tue, 12 Feb 2002 14:47:46 -0500
Subject: [Python-Dev] a different approach to argument parsing
In-Reply-To: <15465.15599.363507.279197@anthem.wooz.org>
References: <a3ed7e8ff9854129ba52c84e0d774286@plan9.bell-labs.com> <3C68ED33.9244EDA0@prescod.net> <200202121345.g1CDjLB30221@pcp742651pcs.reston01.va.comcast.net> <20020212155054.GB24891@gerg.ca> <15465.15599.363507.279197@anthem.wooz.org>
Message-ID: <20020212194745.GA27163@gerg.ca>

On 12 February 2002, Barry A. Warsaw said:
> I'm happy to create the list on python.org if you prefer.  I'd go for
> full SIG status: getopt-sig@python.org.  Let its charter be short lived.
> 
> If Greg's willing to be the champion, I'll set this up.

Thanks Barry.  getopt-alternatives@python.net is dead, long live
getopt-sig@python.org!  Join the list at

  http://mail.python.org/mailman/listinfo/getopt-sig

I'll announce this on c.l.py.announce shortly.

        Greg
-- 
Greg Ward - geek-at-large                               gward@python.net
http://starship.python.net/~gward/
Nostalgia just isn't what it used to be.


From tim.one@comcast.net  Tue Feb 12 20:17:50 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 12 Feb 2002 15:17:50 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <20020212071336.A32363@glacier.arctrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEKONLAA.tim.one@comcast.net>

[Tim]
> Speaking of which, why does LOAD_FAST waste time checking against NULL
> twice?!

[Neil Schemenauer]
> If you would have approved my patch it would be fixed already.

Heh.  If you had entered the patch at priority 9, I might have gotten to it
by this summer.  At priority 3, we're talking years <wink/sigh>.  I boosted
it to 6.  Note that the tiny patch I checked in also rearranged the code so
that the normal case became the fall-through case:

    if (normal)
        do normal stuff
    else do exceptional stuff

Most dumb compilers on platforms that care use a "forward branches probably
aren't taken, backward branches probably are" heuristic for setting
branch-prediction hints in the machine code; and on platforms that don't
care it's usually faster to fall through than to change the program counter
anyway.

>   one-small-banana-left-ly y'rs Neil

is-that-an-american-or-canadian-banana?-ly y'rs  - tim



From tim.one@comcast.net  Tue Feb 12 20:24:47 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 12 Feb 2002 15:24:47 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <15465.16347.962182.714475@grendel.zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEKPNLAA.tim.one@comcast.net>

[Fred]
> Have we become convinced that these cells need to be Python objects?

No, it's just easier that way.  The existing dict code maps PyObject* to
PyObject*, so we'd have to copy and fiddle *all* of the dict code if a
celldict wants to map to anything other than a PyObject*.

> I must have missed that.  As long as we can keep them simple
> structures, we should be able to avoid individual allocations for
> them.  It seems we have a fixed number of cells for both module
> objects and function objects (regardless of whether they are part of
> the new celldict or the containing function or module), so they can be
> allocated as an array rather than individually.
>
> So, I must be missing something.

cells don't live in function objects; a function object only has a vector of
pointers *to* cells, and that is indeed a contiguous, fixed-size array.

cells are the values in celldicts, and that's the only place they appear,
and celldicts can grow dynamically (import fred; fred.brandnew = 1).
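
A rough Python sketch of the relationship described above, under stated
assumptions: the class names (Cell, CellDict, FunctionSketch) and the
unbound sentinel are invented for illustration, and the real proposal
concerns C-level structures, not Python classes. It only shows that the
function keeps a fixed-size vector of references to cells while the
celldict itself can keep growing.

```python
_UNBOUND = object()  # sentinel meaning "this name currently has no value"


class Cell:
    """One slot holding the current value of a single global name."""

    def __init__(self):
        self.value = _UNBOUND


class CellDict:
    """Maps names to cells; may grow whenever a new name is assigned."""

    def __init__(self):
        self._cells = {}

    def cell_for(self, name):
        # Create the cell on first request so a function can hold a
        # reference to it even before the name is bound.
        if name not in self._cells:
            self._cells[name] = Cell()
        return self._cells[name]

    def __setitem__(self, name, value):
        self.cell_for(name).value = value

    def __getitem__(self, name):
        cell = self._cells.get(name)
        if cell is None or cell.value is _UNBOUND:
            raise KeyError(name)
        return cell.value


class FunctionSketch:
    """Holds a fixed-size vector of references *to* cells."""

    def __init__(self, celldict, names):
        self.cells = [celldict.cell_for(name) for name in names]

    def load_global(self, index):
        value = self.cells[index].value
        if value is _UNBOUND:
            raise NameError("global at index %d is unbound" % index)
        return value


fred = CellDict()
fred["x"] = 1
func = FunctionSketch(fred, ["x", "brandnew"])
fred["brandnew"] = 1        # the celldict changes; func.cells stays size 2
print(func.load_global(1))
```

Because the cells are ordinary values stored in a dict in this sketch, the
existing mapping machinery can be reused unchanged, which mirrors the
"it's just easier that way" point about mapping PyObject* to PyObject*.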



From mal@lemburg.com  Tue Feb 12 20:42:22 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Feb 2002 21:42:22 +0100
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15464.41842.664484.307330@anthem.wooz.org>
 <3C68F010.4736AE2F@lemburg.com>
 <15465.18146.700970.932676@anthem.wooz.org>
 <3C6954CC.46B215D8@lemburg.com> <15465.22745.136540.172075@anthem.wooz.org>
Message-ID: <3C697E2E.34174D26@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MAL" == M.-A. Lemburg <mal@lemburg.com> writes:
> 
>     MAL> "Barry A. Warsaw" wrote:
>     >> [distutils --root hackery]
> 
>     MAL> Why not use the subpackage approach I suggested ?
> 
>     MAL> It keeps the std lib in a sane state (meaning that the std
>     MAL> lib installation only depends on the Python installation and
>     MAL> no other hacks on top of it).
> 
>     MAL> Since you'll have to ship the complete package anyway, I
>     MAL> don't see any win in installing over the std email
>     MAL> package. If that's what you really want, I'd suggest to
>     MAL> provide the updated email package as separate download and
>     MAL> then test inside Mailman for the new version.
> 
> I'm fine with installing in a Mailman specific location, but I still
> want to use as much of the distutils machinery as possible.
> 
> It looks like
> 
>     python setup.py install --home=/some/path
> 
> gets close enough.  

No, no, no :-)

What I am suggesting is to put the email package *inside* the
Mailman package:

Mailman/__init__.py
        ...
        email/__init__.py
              ...

And then use "from Mailman import email" in Mailman source 
code. That's clean, doesn't interfere with the std lib
and it's all yours ;-) (meaning that you have complete
control over what email does in the Mailman context).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From barry@zope.com  Tue Feb 12 20:57:42 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 15:57:42 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15464.41842.664484.307330@anthem.wooz.org>
 <3C68F010.4736AE2F@lemburg.com>
 <15465.18146.700970.932676@anthem.wooz.org>
 <3C6954CC.46B215D8@lemburg.com>
 <15465.22745.136540.172075@anthem.wooz.org>
 <3C697E2E.34174D26@lemburg.com>
Message-ID: <15465.33222.548705.100220@anthem.wooz.org>

>>>>> "MAL" == M.-A. Lemburg <mal@lemburg.com> writes:

    >> install --home=/some/path gets close enough.

    MAL> No, no, no :-)

    MAL> What I am suggesting is to put the email package *inside* the
    MAL> Mailman package:

    MAL> Mailman/__init__.py
    |         ...
    |         email/__init__.py
    |               ...

What I didn't say was s|/some/path|/path/to/Mailman|

so we're saying (nearly) the same thing.

    MAL> And then use "from Mailman import email" in Mailman source 
    MAL> code. That's clean, doesn't interfere with the std lib
    MAL> and it's all yours ;-) (meaning that you have complete
    MAL> control over what email does in the Mailman context).

I could do this (and may) or I may use something like from
Mailman.pythonlib import email, which is my normal place to put
override modules.

It's moderately more appealing to put Mailman.pythonlib on sys.path
and just leave my "import email"'s alone.  I know there are arguments
against doing it that way, but I don't want to have to change dozens
of files.

-Barry


From aahz@rahul.net  Tue Feb 12 21:24:32 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Tue, 12 Feb 2002 13:24:32 -0800 (PST)
Subject: [Python-Dev] Order that site-packages is added to sys.path
In-Reply-To: <15465.33222.548705.100220@anthem.wooz.org> from "Barry A. Warsaw" at Feb 12, 2002 03:57:42 PM
Message-ID: <20020212212433.3E8A4E8CF@waltz.rahul.net>

Barry A. Warsaw wrote:
> 
> It's moderately more appealing to put Mailman.pythonlib on sys.path
> and just leave my "import email"'s alone.  I know there are arguments
> against doing it that way, but I don't want to have to change dozens
> of files.

For shame, Barry, isn't that what Python is for?  ;-)
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From nas@python.ca  Tue Feb 12 21:27:24 2002
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 12 Feb 2002 13:27:24 -0800
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEKONLAA.tim.one@comcast.net>; from tim.one@comcast.net on Tue, Feb 12, 2002 at 03:17:50PM -0500
References: <20020212071336.A32363@glacier.arctrix.com> <LNBBLJKPBEHFEDALKOLCEEKONLAA.tim.one@comcast.net>
Message-ID: <20020212132724.A1443@glacier.arctrix.com>

Tim Peters wrote:
>     if (normal)
>         do normal stuff
>     else do exceptional stuff
> 
> Most dumb compilers on platforms that care use a "forward branches probably
> aren't taken, backward branches probably are" heuristic for setting
> branch-prediction hints in the machine code; and on platforms that don't
> care it's usually faster to fall through than to change the program counter
> anyway.

I seem to remember someone saying that GCC generated better code for:

        if (exceptional) {
            do exceptional things
            break / return / goto
        }
        do normal things
   
Is GCC in the dumb category?  Also, the Linux kernel is starting to use this
set of macros more often:

    /* Somewhere in the middle of the GCC 2.96 development cycle, we
     * implemented a mechanism by which the user can annotate likely
     * branch directions and expect the blocks to be reordered
     * appropriately.  Define __builtin_expect to nothing for earlier
     * compilers.  */

    #if __GNUC__ == 2 && __GNUC_MINOR__ < 96
    #define __builtin_expect(x, expected_value) (x)
    #endif

    #define likely(x)       __builtin_expect((x),1)
    #define unlikely(x)     __builtin_expect((x),0)

For example:

    if (likely(normal))
        do normal stuff
    else do exceptional stuff

I don't have GCC >= 2.96 otherwise I would have tried sprinkling some of
those macros in ceval and testing the effect.

  Neil


From barry@zope.com  Tue Feb 12 21:26:16 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 12 Feb 2002 16:26:16 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
References: <15465.33222.548705.100220@anthem.wooz.org>
 <20020212212433.3E8A4E8CF@waltz.rahul.net>
Message-ID: <15465.34936.922554.202330@anthem.wooz.org>

>>>>> "AM" == Aahz Maruch <aahz@rahul.net> writes:

    AM> For shame, Barry, isn't that what Python is for?  ;-)

Naw, it's what elisp is for <wink>.

-Barry


From James_Althoff@i2.com  Tue Feb 12 21:38:24 2002
From: James_Althoff@i2.com (James_Althoff@i2.com)
Date: Tue, 12 Feb 2002 13:38:24 -0800
Subject: [Python-Dev] re: syntactic sugar idea for {static,class}methods
Message-ID: <OF8E8DEF49.E0B3ED98-ON88256B5E.00759C71@i2.com>

I would think that specifying a list (as in  [staticmethod]) would be very
desirable so that you could do a sequence of transformations, not just one.
Michael's examples seem to suggest elements of an Aspect-oriented approach
to things.  If you have several relevant "Aspect wrappers", then you might
want to apply each cascaded in sequence.  If, using previous examples, I
want a "static" method that is also "memoized" and "SOAPed" I could write:

        def mymethod(arg) [staticmethod,memoize,webmethod]:

or some such combination that is presumably well-defined.

Jim
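
A small sketch of what "apply each cascaded in sequence" could mean, with
several assumptions: the bracket syntax is only a proposal, so the list is
applied by an explicit helper (apply_transformations, a made-up name); the
memoize wrapper is hypothetical; and left-to-right application order is a
guess, since the thread does not settle it. staticmethod is placed last
here because a plain function wrapper around a staticmethod object would
not be callable on older Pythons.

```python
import functools


def apply_transformations(func, transformations):
    """Apply each wrapper in list order: [a, b] yields b(a(func))."""
    for transform in transformations:
        func = transform(func)
    return func


def memoize(func):
    """Hypothetical "memoize" wrapper: cache results by positional args."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


class C(object):
    def mymethod(arg):
        return 1 + arg
    # Stands in for the proposed "def mymethod(arg) [memoize, staticmethod]:".
    mymethod = apply_transformations(mymethod, [memoize, staticmethod])


print(C.mymethod(2))  # 3
```

Whatever order is finally chosen, the example suggests it needs to be
documented, because swapping the two entries changes whether the result
works at all.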



From Jack.Jansen@oratrix.nl  Tue Feb 12 22:11:44 2002
From: Jack.Jansen@oratrix.nl (Jack Jansen)
Date: Tue, 12 Feb 2002 23:11:44 +0100
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
In-Reply-To: <2m6652my0x.fsf@starship.python.net>
Message-ID: <7FFEC9B0-2005-11D6-A45A-003065517236@oratrix.nl>

On Tuesday, February 12, 2002, at 06:05  PM, Michael Hudson wrote:

> Some time ago, Gareth McCaughan suggested a syntax for staticmethods.
> You'd write
>
> class C(object):
>     def static(arg) [staticmethod]:
>         return 1 + arg
>
> C.static(2)
>    => 3

At some point in the past, when the actual implementation wasn't 
even finished, I suggested to Guido to use

class C(object):
	def static(class, arg):
		return 1 + arg

as the syntactic sugar. I think he wasn't against it at the 
time, but somehow found the actual implementation more 
important:-)
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From tim.one@comcast.net  Tue Feb 12 23:04:44 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 12 Feb 2002 18:04:44 -0500
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <20020212132724.A1443@glacier.arctrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMFNLAA.tim.one@comcast.net>

[Neil Schemenauer]
> I seem to remember someone saying that GCC generated better code for:
>
>         if (exceptional) {
>             do exceptional things
>             break / return / goto
>         }
>         do normal things
>
> Is GCC in the dumb category?

Yes, any compiler that doesn't do branch prediction based on *semantic*
analysis is dirt dumb.  A simple example of semantic prediction is
"comparing a pointer to NULL is probably going to yield false".  Ditto
comparing a number for equality with 0.

I'd like to see a reference for the pattern above; it goes against the very
common "forward branches usually aren't taken" heuristic.  Note that
Vladimir applied that gimmick to an extreme in obmalloc.c's malloc()
function.

> Also, the Linux kernel is starting to use this set of macros more often:
["likely" and "unlikely"]

They're late to the party.  Cray had a "percent true" directive 20 years
ago, allowing for 48 bits of precision in specifying how (un)likely <wink>.

> ...
> I don't have GCC >= 2.96 otherwise I would have tried sprinkling some of
> those macros in ceval and testing the effect.

Maybe more interesting:  One of the folks at Zope Corp reported getting
significant speedups by using some gcc option that can feed real-life branch
histories back into the compiler.  Less work and less error-prone than
guessing annotations.



From gward@python.net  Tue Feb 12 23:41:44 2002
From: gward@python.net (Greg Ward)
Date: Tue, 12 Feb 2002 18:41:44 -0500
Subject: [Python-Dev] SyntaxError tracebacks in 2.2
Message-ID: <20020212234144.GA28828@gerg.ca>

Has anyone else noticed that SyntaxError tracebacks no longer include
the name of the file where the error occurs?  Instead, they just say
"<string>".  Eg.

  $ python2.1 foo.py
    File "foo.py", line 1
      foo =
          ^
  SyntaxError: invalid syntax
  $ python2.2 foo.py
    File "<string>", line 1
      foo =
        ^
  SyntaxError: invalid syntax

This is annoying enough that I just filed SF bug #516712:
  http://sourceforge.net/tracker/index.php?func=detail&aid=516712&group_id=5470&atid=105470

        Greg
-- 
Greg Ward - Linux nerd                                  gward@python.net
http://starship.python.net/~gward/
There are no stupid questions -- only stupid people.
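
A short sketch of the behaviour being asked for, using the built-in
compile() instead of running a script: the filename handed to the compiler
is what should come back in the SyntaxError. The helper name
(syntax_error_location) is made up for illustration, and the check
reflects current CPython rather than the 2.2 release discussed here.

```python
def syntax_error_location(source, filename):
    """Return (filename, lineno) from the SyntaxError, or None if it compiles."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        return err.filename, err.lineno
    return None


# Same one-line file as in the report above.
print(syntax_error_location("foo =\n", "foo.py"))
```

A result naming "<string>" rather than "foo.py" would correspond to the
regression described in the bug report.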


From neal@metaslash.com  Tue Feb 12 23:50:23 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Tue, 12 Feb 2002 18:50:23 -0500
Subject: [Python-Dev] SyntaxError tracebacks in 2.2
References: <20020212234144.GA28828@gerg.ca>
Message-ID: <3C69AA3F.9E1932DD@metaslash.com>

Greg Ward wrote:
> 
> Has anyone else noticed that SyntaxError tracebacks no longer include
> the name of the file where the error occurs?  Instead, they just say
> "<string>".  Eg.
> 
>   $ python2.1 foo.py
>     File "foo.py", line 1
>       foo =
>           ^
>   SyntaxError: invalid syntax
>   $ python2.2 foo.py
>     File "<string>", line 1
>       foo =
>         ^
> SyntaxError: invalid syntax
> 
> This is annoying enough that I just filed SF bug #516712:
>   http://sourceforge.net/tracker/index.php?func=detail&aid=516712&group_id=5470&atid=105470

I believe Martin fixed this.  With the latest from CVS:

[neal@epoch src]$ ./python foo.py
  File "foo.py", line 1
    foo = 
         ^
SyntaxError: invalid syntax

Neal


From DavidA@ActiveState.com  Wed Feb 13 00:16:07 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 12 Feb 2002 16:16:07 -0800
Subject: [Python-Dev] Accessing globals without dict lookup
References: <LNBBLJKPBEHFEDALKOLCKEMFNLAA.tim.one@comcast.net>
Message-ID: <3C69B047.EDE6048D@activestate.com>

Tim Peters wrote:

> Maybe more interesting:  One of the folks at Zope Corp reported getting
> significant speedups by using some gcc option that can feed real-life branch
> histories back into the compiler.  Less work and less error-prone than
> guessing annotations.

Slightly OT: Has anyone tried compiling Python w/ the Intel C++
compiler? 

--david


From martin@v.loewis.de  Wed Feb 13 00:19:44 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Feb 2002 01:19:44 +0100
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: <3C692370.D21EF15D@lemburg.com>
References: <3C692370.D21EF15D@lemburg.com>
Message-ID: <m33d06xmfz.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> What can we do about this ?

The standard solution is to modify Modules/Setup at installation time,
to suit your local needs.

> Perhaps we should simply let setup.py build two extensions: _socket
> (without SSL) and _socketssl (with SSL) ?! If the _socketssl build
> or import fails for some reason, Python could still pick up the
> _socket extension in socket.py.

-1: Instead of avoiding the use of an existing OpenSSL installation, it
would be much better if the socket module were fixed to work with all
existing versions.

Of course, without a precise bug report, we cannot know whether this
is possible.

Regards,
Martin
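
For readers following the proposal under debate, here is a minimal sketch
of the fallback that socket.py would need, assuming the two-extension
split actually existed. The name _socketssl comes from the quoted
suggestion and is hypothetical in any released Python;
import_first_available is an invented helper.

```python
import importlib


def import_first_available(names):
    """Return the first module in `names` that imports successfully."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Build or import failed for this variant; try the next one.
            continue
    raise ImportError("none of %r could be imported" % (names,))


# Prefer the SSL-enabled build, fall back to the plain one.
impl = import_first_available(["_socketssl", "_socket"])
print(impl.__name__)
```

The sketch also shows the cost raised in the objection above: a broken SSL
build is silently skipped instead of being reported and fixed.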


From martin@v.loewis.de  Wed Feb 13 00:28:04 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Feb 2002 01:28:04 +0100
Subject: [Python-Dev] SyntaxError tracebacks in 2.2
In-Reply-To: <3C69AA3F.9E1932DD@metaslash.com>
References: <20020212234144.GA28828@gerg.ca> <3C69AA3F.9E1932DD@metaslash.com>
Message-ID: <m3y9hyw7hn.fsf@mira.informatik.hu-berlin.de>

Neal Norwitz <neal@metaslash.com> writes:

> I believe Martin fixed this.

Indeed; that's parsetok.c 2.29 and 2.28.8.1. I've closed the report as
a duplicate.

Regards,
Martin


From skip@pobox.com  Wed Feb 13 03:54:01 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 12 Feb 2002 21:54:01 -0600
Subject: [Python-Dev] Accessing globals without dict lookup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEMFNLAA.tim.one@comcast.net>
References: <20020212132724.A1443@glacier.arctrix.com>
 <LNBBLJKPBEHFEDALKOLCKEMFNLAA.tim.one@comcast.net>
Message-ID: <15465.58201.826252.981746@12-248-41-177.client.attbi.com>

    Tim> Maybe more interesting: One of the folks at Zope Corp reported
    Tim> getting significant speedups by using some gcc option that can feed
    Tim> real-life branch histories back into the compiler.  Less work and
    Tim> less error-prone than guessing annotations.

That would be -fprofile-arcs and -fbranch-probabilities:

    -fprofile-arcs also makes it possible to estimate branch probabilities,
    and to calculate basic block execution counts.  In general, basic block
    execution counts do not give enough information to estimate all branch
    probabilities.  When the compiled program exits, it saves the arc
    execution counts to a file called sourcename.da.  Use the compiler
    option -fbranch-probabilities when recompiling, to optimize using
    estimated branch probabilities.

I fiddled with a bunch of gcc options a few months ago.  I finally settled
on

    -O3 -minline-all-stringops -fomit-frame-pointer

Skip



From tim.one@comcast.net  Wed Feb 13 04:55:27 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 12 Feb 2002 23:55:27 -0500
Subject: [Python-Dev] Expat vs Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCKENINLAA.tim.one@comcast.net>

Anyone understand what's going on with expat?  I noticed pyexpat stopped
compiling on Windows a day or two ago, but didn't have time to look at it.

Today I see it compiles, but generates lots of linker warnings:

   Creating library ./pyexpat.lib and object ./pyexpat.exp
LINK : warning LNK4049:
    locally defined symbol "_XML_GetSpecifiedAttributeCount" imported
LINK : warning LNK4049:
    locally defined symbol "_XML_Parse" imported
LINK : warning LNK4049:
    locally defined symbol "_XML_ErrorString" imported

etc.

Are we trying to break away from the SourceForge expat project?  Seems a
dubious idea, if so.  In any case, I can make almost no time for repairing
this on Windows, so need someone to explain what we're trying to accomplish
here (btw, if someone already explained this on some mailing list, sorry,
I'm hundreds of msgs behind the times).



From tim.one@comcast.net  Wed Feb 13 05:11:46 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 13 Feb 2002 00:11:46 -0500
Subject: [Python-Dev] Expat vs Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKENINLAA.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCIENJNLAA.tim.one@comcast.net>

[Tim]
> Today I see it compiles, but generates lots of linker warnings:
> ...

Oops -- I didn't look far enough.  A different part still doesn't compile, at
least not in a debug build:

--------------------Configuration: pyexpat - Win32 Debug-------------------
Compiling...
xmlparse.c
C:\Code\python\Modules\expat\xmlparse.c(1329) : error C2143: syntax error : missing ';' before 'constant'
C:\Code\python\Modules\expat\xmlparse.c(1329) : error C2115: 'return' : incompatible types
Error executing cl.exe.

pyexpat_d.pyd - 2 error(s), 0 warning(s)


It's griping about this:

const XML_LChar *
XML_ExpatVersion(void) {
  return VERSION;
}



From martin@v.loewis.de  Wed Feb 13 07:53:58 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Feb 2002 08:53:58 +0100
Subject: [Python-Dev] Expat vs Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIENJNLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCIENJNLAA.tim.one@comcast.net>
Message-ID: <m3it913jhl.fsf@mira.informatik.hu-berlin.de>

Tim Peters <tim.one@comcast.net> writes:

> Oops -- I didn't look far enough.  A different part still doesn't compile, at
> least not in a debug build:

That's because the VERSION define in the debug build read

 /D VERSION="1.95.2"

whereas MSVC had wanted it as

 /D VERSION=\"1.95.2\"

I still fail to see the rationale for requiring the backslashes there,
or why I have to change every setting twice on Windows (which I forgot
in this case); in any case, I moved the VERSION setting into expat.h,
so this problem should be gone now.

Regards,
Martin



From tim.one@comcast.net  Wed Feb 13 08:00:29 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 13 Feb 2002 03:00:29 -0500
Subject: [Python-Dev] Order that site-packages is added to sys.path
In-Reply-To: <15464.41842.664484.307330@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEOFNLAA.tim.one@comcast.net>

Note that app developers eager to replace standard libraries, and an OS that
allowed them to do so, are the causes of the aptly named "DLL Hell" on
Windows.  It can work fine for a single app, but it's truly hell when
multiple apps resort to this, and end users don't have a prayer of sorting
out the inevitable, vicious problems.

if-you-need-your-own-xxx.py-you-know-where-to-shove-it<wink>-ly y'rs  - tim



From martin@v.loewis.de  Wed Feb 13 08:10:47 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Feb 2002 09:10:47 +0100
Subject: [Python-Dev] Expat vs Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKENINLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCKENINLAA.tim.one@comcast.net>
Message-ID: <m3ofitkdiw.fsf@mira.informatik.hu-berlin.de>

Tim Peters <tim.one@comcast.net> writes:

> Anyone understand what's going on with expat?  I noticed pyexpat stopped
> compiling on Windows a day or two ago, but didn't have time to look at it.
> 
> Today I see it compiles, but generates lots of linker warnings:
> 
>    Creating library ./pyexpat.lib and object ./pyexpat.exp
> LINK : warning LNK4049:
>     locally defined symbol "_XML_GetSpecifiedAttributeCount" imported

I cannot reproduce this on my MSVC 6 installation. What does that
warning mean? Does it indicate a problem of some sort?

> Are we trying to break away from the SourceForge expat project?  

No, Modules/expat is a literal copy of SF expat 1.95.2, lib/.

> (btw, if someone already explained this on some mailing list, sorry,
> I'm hundreds of msgs behind the times).

http://mail.python.org/pipermail/python-dev/2002-February/019974.html

[assuming you read this message before catching up with the rest of
python-dev]

Regards,
Martin


From tim.one@comcast.net  Wed Feb 13 08:15:08 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 13 Feb 2002 03:15:08 -0500
Subject: [Python-Dev] Expat vs Windows
In-Reply-To: <m3it913jhl.fsf@mira.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEOHNLAA.tim.one@comcast.net>

[Martin v. Loewis]
> That's because the VERSION define in the debug build read
>
>  /D VERSION="1.95.2"
>
> whereas MSVC had wanted it as
>
>  /D VERSION=\"1.95.2\"
>
> I still fail to see the rationale for requiring the backslashes there,

I expect it's the same as under most Unix shells:  the cmdline processor
chews up unescaped quotes, so if you want quotes to survive in what's passed
to argv, you have to escape them.
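
A small sketch of the Unix-shell half of that analogy, using Python's shlex module, which splits a string by POSIX shell quoting rules. It only models a POSIX shell and makes no claim about how MSVC or the Windows command processor parse their arguments; the two command strings are just the ones from this thread.

```python
import shlex

# The two spellings of the define discussed above, as plain strings.
unescaped = '/D VERSION="1.95.2"'
escaped = '/D VERSION=\\"1.95.2\\"'   # i.e. /D VERSION=\"1.95.2\"

# POSIX-style splitting consumes bare quotes ...
print(shlex.split(unescaped))   # ['/D', 'VERSION=1.95.2']

# ... but keeps quotes that were escaped with a backslash.
print(shlex.split(escaped))     # ['/D', 'VERSION="1.95.2"']
```

Only the escaped form delivers a quoted string to the program, which is what a C string macro needs.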

> or why I have to change every setting twice on Windows (which I forgot
> in this case);

You don't, if you first select "Multiple Configurations ... " from the
"Settings for:" dropdown list.  That controls which configuration(s) your
changes apply to, so if you leave it at, e.g., "Win32 Release", you're
explicitly instructing it to apply changes only to the Release build.

> in any case, I moved the VERSION setting into expat.h, so this problem
> should be gone now.

Thank you!  I won't get to try it until tomorrow night; maybe the linker
warnings will vanish by then too ...



From mal@lemburg.com  Wed Feb 13 09:22:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 13 Feb 2002 10:22:05 +0100
Subject: [Python-Dev] SSL support in _socket
References: <3C692370.D21EF15D@lemburg.com> <m33d06xmfz.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C6A303D.E0DDC16A@lemburg.com>

"Martin v. Loewis" wrote:
> 
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> > What can we do about this ?
> 
> The standard solution is to modify Modules/Setup at installation time,
> to suit your local needs.

I thought that Modules/Setup is deprecated and replaced by the
auto setup tests in setup.py ? In any case, setup.py will simply
remove _socket if it doesn't import correctly and so a casual
sys admin or user will lose big if his OpenSSL installation
happens to be out of sync with whatever we provide in _socket.
 
> > Perhaps we should simply let setup.py build two extensions: _socket
> > (without SSL) and _socketssl (with SSL) ?! If the _socketssl build
> > or import fails for some reason, Python could still pick up the
> > _socket extension in socket.py.
> 
> -1: Instead of avoiding to use an existing OpenSSL installation, it
> would be much better if the socket module was fixed to work with all
> existing versions.
> 
> Of course, without a precise bug report, we cannot know whether this
> was possible.

Some symbols starting with 'RAND_*' are apparently missing from
OpenSSL on my notebook. On other occasions (i.e. on RedHat) I found
that the system vendor had forgotten to provide a link to the
0.9 version of OpenSSL and instead used 1.0 as version number
(which is completely wrong since there is no 1.0 version
of OpenSSL). As a result, _socket built on a system with correctly
setup libs wouldn't run on this particular RedHat installation.

In summary: _socket is just too important to lose if something
in the OpenSSL support goes wrong. The two build model I suggested
fixes this problem elegantly and doesn't cost anything in
terms of adding tons of code -- all we need is an #ifdef for
the module name in _socketmodule.c

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mwh@python.net  Wed Feb 13 10:41:13 2002
From: mwh@python.net (Michael Hudson)
Date: 13 Feb 2002 10:41:13 +0000
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
In-Reply-To: barry@zope.com's message of "Tue, 12 Feb 2002 13:12:45 -0500"
References: <2m6652my0x.fsf@starship.python.net> <15465.23325.41291.966138@anthem.wooz.org>
Message-ID: <2mheolmzp2.fsf@starship.python.net>

barry@zope.com (Barry A. Warsaw) writes:

> >>>>> "MH" == Michael Hudson <mwh@python.net> writes:
> 
>     MH> Some time ago, Gareth McCaughan suggested a syntax for
>     MH> staticmethods.  You'd write
> 
>     MH> class C(object):
>     |     def static(arg) [staticmethod]:
>     |         return 1 + arg
> 
>     | C.static(2)
>     |    => 3
> 
> Very interesting!  Why the square brackets though?  Is that just for
> visual offset or is there a grammar constraint that requires them?

Um, no big reason; they were what Gareth suggested, so I implemented
that.  He may have got the idea from the slides from one of Guido's
presentations -- it was reading them that reminded me I'd done this
and wanted to mention it here.

Note, though, that my patch allows an arbitrary number of
*expressions* in the square brackets; in principle you can do things
like:

>>> def h() [apply, (lambda f:(lambda : f() + 1))]:
...  return 1
...

and have `h' be 2 (except that this causes an abort at the moment -- I
must have missed something in my symtable code).

Not sure whether this is a good idea, of course, but allowing
arbitrary expressions does actually make the compiling easier.
Allowing arbitrary expressions without delimiters sounds like a bad
idea, both for parsers and people.

> I'd leave them out of the picture, unless you mean to imply that a
> list is acceptable in that position <wink>.

Well, it is, at the moment...

Cheers,
M.

-- 
  : exploding like a turd
  Never had that happen to me, I have to admit.  They do that
  often in your world?              -- Eric The Read & Dave Brown, asr


From guido@python.org  Wed Feb 13 13:01:50 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 13 Feb 2002 08:01:50 -0500
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: Your message of "Wed, 13 Feb 2002 10:22:05 +0100."
 <3C6A303D.E0DDC16A@lemburg.com>
References: <3C692370.D21EF15D@lemburg.com> <m33d06xmfz.fsf@mira.informatik.hu-berlin.de>
 <3C6A303D.E0DDC16A@lemburg.com>
Message-ID: <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net>

> Some symbols starting with 'RAND_*' are apparently missing from
> OpenSSL on my notebook.

Yes, this has bitten me too.  It's apparently a relatively new API in
OpenSSL and the SSL code in socket.c was changed to require it almost
as soon as it appeared in OpenSSL.

> In summary: _socket is just too important to lose if something
> in the OpenSSL support goes wrong. The two build model I suggested
> fixes this problem elegantly and doesn't cost anything in
> terms of adding tons of code -- all we need is an #ifdef for
> the module name in _socketmodule.c

Since the SSL support mostly introduces new code that doesn't depend
on other socket code (not 100% sure if this is true), can't we make
the SSL support a separate module?  Then socket.py (which is also used
on Unix these days!!!) can glue them together.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Wed Feb 13 13:14:27 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 13 Feb 2002 14:14:27 +0100
Subject: [Python-Dev] SSL support in _socket
References: <3C692370.D21EF15D@lemburg.com> <m33d06xmfz.fsf@mira.informatik.hu-berlin.de>
 <3C6A303D.E0DDC16A@lemburg.com> <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C6A66B3.3C4AE597@lemburg.com>

Guido van Rossum wrote:
> 
> > Some symbols starting with 'RAND_*' are apparently missing from
> > OpenSSL on my notebook.
> 
> Yes, this has bitten me too.  It's apparently a relatively new API in
> OpenSSL and the SSL code in socket.c was changed to require it almost
> as soon as it appeared in OpenSSL.
> 
> > In summary: _socket is just too important to lose if something
> > in the OpenSSL support goes wrong. The two build model I suggested
> > fixes this problem elegantly and doesn't cost anything in
> > terms of adding tons of code -- all we need is an #ifdef for
> > the module name in _socketmodule.c
> 
> Since the SSL support mostly introduces new code that doesn't depend
> on other socket code (not 100% sure if this is true), can't we make
> the SSL support a separate module?  Then socket.py (which is also used
> on Unix these days!!!) can glue them together.

Good idea. 

Checking the code it should be easy to do. I'll look
into this later this week. 

Funny, BTW, that the source file is named socketmodule.c 
while the resulting DLL is called _socket... I suppose 
renaming socketmodule.c to _socket.c would be advisable.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From guido@python.org  Wed Feb 13 13:36:44 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 13 Feb 2002 08:36:44 -0500
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: Your message of "Wed, 13 Feb 2002 14:14:27 +0100."
 <3C6A66B3.3C4AE597@lemburg.com>
References: <3C692370.D21EF15D@lemburg.com> <m33d06xmfz.fsf@mira.informatik.hu-berlin.de> <3C6A303D.E0DDC16A@lemburg.com> <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net>
 <3C6A66B3.3C4AE597@lemburg.com>
Message-ID: <200202131336.g1DDaiV07604@pcp742651pcs.reston01.va.comcast.net>

> Checking the code it should be easy to do. I'll look
> into this later this week. 

Great!

> Funny, BTW, that the source file is named socketmodule.c 
> while the resulting DLL is called _socket... I suppose 
> renaming socketmodule.c to _socket.c would be advisable.

That requires asking the SF sysadmin a favor to move a file, or loses
all the CVS history.  So who cares.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de  Wed Feb 13 14:28:53 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Feb 2002 15:28:53 +0100
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net>
References: <3C692370.D21EF15D@lemburg.com>
 <m33d06xmfz.fsf@mira.informatik.hu-berlin.de>
 <3C6A303D.E0DDC16A@lemburg.com>
 <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <m3g045jw0q.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Since the SSL support mostly introduces new code that doesn't depend
> on other socket code (not 100% sure if this is true), can't we make
> the SSL support a separate module?  Then socket.py (which is also used
> on Unix these days!!!) can glue them together.

+1.

Martin


From martin@v.loewis.de  Wed Feb 13 14:34:26 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Feb 2002 15:34:26 +0100
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: <3C6A303D.E0DDC16A@lemburg.com>
References: <3C692370.D21EF15D@lemburg.com>
 <m33d06xmfz.fsf@mira.informatik.hu-berlin.de>
 <3C6A303D.E0DDC16A@lemburg.com>
Message-ID: <m3bsetjvrh.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> I thought that Modules/Setup is deprecated and replaced by the
> auto setup tests in setup.py ? 

Not at all. It is just used less frequently.

Personally, I think that is a pity. Python binary distributions, by
default, on Unix, should build as many extension libraries statically
into the interpreter as they can without dragging in too many
additional shared libraries. IOW, _socket should be compiled
statically into the interpreter, which you cannot do with distutils
(by nature).

The reason for linking them statically is efficiency: if used, the
interpreter won't have to locate them in sys.path, they don't need to
be compiled as PIC code, the dynamic linker does not need to bind that
many symbols, etc; if not used, they don't consume any additional
resources as they are demand-paged from the executable. Static linking
is also desirable for frozen applications.

For those reasons, I hope that Setup.dist continues to be maintained.


Regards,
Martin


From barry@zope.com  Wed Feb 13 14:47:12 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 13 Feb 2002 09:47:12 -0500
Subject: [Python-Dev] SSL support in _socket
References: <3C692370.D21EF15D@lemburg.com>
 <m33d06xmfz.fsf@mira.informatik.hu-berlin.de>
 <3C6A303D.E0DDC16A@lemburg.com>
Message-ID: <15466.31856.880945.17273@anthem.wooz.org>

>>>>> "MAL" == M.-A. Lemburg <mal@lemburg.com> writes:

    MAL> I thought that Modules/Setup is deprecated and replaced by
    MAL> the auto setup tests in setup.py ? In any case, setup.py will
    MAL> simply remove _socket if it doesn't import correctly and so a
    MAL> casual sys admin or user will lose big if his OpenSSL
    MAL> installation happens to be out of sync with whatever we
    MAL> provide in _socket.

This is a more general problem with the current setup.py stuff for the
standard library.  It took me /ages/ to figure out why BerkeleyDB
support was broken in Python 2.2 -- not just broken, but non-existent!
"import bsddb" simply failed because the .so wasn't there.  I couldn't
figure out why that was until I trolled through the build output and
realized that setup.py was deleting the .so because it got an import
error after building the .so.  Then I had to figure out how to build
the .so and keep it around so I could then learn that it had link
problems and from there, I realized why BerkeleyDB support in Python
2.2 is /really/ busted (it tries to be too smart about finding its
libraries).

It shouldn't have been this difficult to debug.  Surely there must be
some way to tell setup.py not to delete .so's it can't import so we
have a prayer of finding the real problems.

-Barry


From barry@zope.com  Wed Feb 13 14:49:59 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 13 Feb 2002 09:49:59 -0500
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
References: <2m6652my0x.fsf@starship.python.net>
 <15465.23325.41291.966138@anthem.wooz.org>
 <2mheolmzp2.fsf@starship.python.net>
Message-ID: <15466.32023.403192.162135@anthem.wooz.org>

>>>>> "MH" == Michael Hudson <mwh@python.net> writes:

    MH> Not sure whether this is a good idea, of course, but allowing
    MH> arbitrary expressions does actually make the compiling easier.
    MH> Allowing arbitrary expressions without delimiters sounds like
    MH> a bad idea, both for parsers and people.

    >> I'd leave them out of the picture, unless you mean to imply
    >> that a list is acceptable in that position <wink>.

    MH> Well, it is, at the moment...

Well, that's pretty neat!  Maybe FAST, but neat. :)

-Barry


From mwh@python.net  Wed Feb 13 14:57:27 2002
From: mwh@python.net (Michael Hudson)
Date: 13 Feb 2002 14:57:27 +0000
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
In-Reply-To: barry@zope.com's message of "Wed, 13 Feb 2002 09:49:59 -0500"
References: <2m6652my0x.fsf@starship.python.net> <15465.23325.41291.966138@anthem.wooz.org> <2mheolmzp2.fsf@starship.python.net> <15466.32023.403192.162135@anthem.wooz.org>
Message-ID: <2m8z9x5t0o.fsf@starship.python.net>

barry@zope.com (Barry A. Warsaw) writes:

> >>>>> "MH" == Michael Hudson <mwh@python.net> writes:
>     >> I'd leave them out of the picture, unless you mean to imply
>     >> that a list is acceptable in that position <wink>.
> 
>     MH> Well, it is, at the moment...
> 
> Well, that's pretty neat!  Maybe FAST, but neat. :)

No, you're going to have to explain that.  (Googling for "FAST" isn't
terribly enlightening...).

Cheers,
M.

-- 
  well, take it from an old hand: the only reason it would be easier
  to program in C is that you can't easily express complex problems
  in C, so you don't.                   -- Erik Naggum, comp.lang.lisp


From mwh@python.net  Wed Feb 13 14:59:27 2002
From: mwh@python.net (Michael Hudson)
Date: 13 Feb 2002 14:59:27 +0000
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: barry@zope.com's message of "Wed, 13 Feb 2002 09:47:12 -0500"
References: <3C692370.D21EF15D@lemburg.com> <m33d06xmfz.fsf@mira.informatik.hu-berlin.de> <3C6A303D.E0DDC16A@lemburg.com> <15466.31856.880945.17273@anthem.wooz.org>
Message-ID: <2m66515sxc.fsf@starship.python.net>

barry@zope.com (Barry A. Warsaw) writes:

> It shouldn't have been this difficult to debug.  Surely there must be
> some way to tell setup.py not to delete .so's it can't import so we
> have a prayer of finding the real problems.

Maybe it just shouldn't install the shared libs if they fail to
import?

Cheers,
M.

-- 
  It's actually a corruption of "starling".  They used to be carried.
  Since they weighed a full pound (hence the name), they had to be
  carried by two starlings in tandem, with a line between them.
                 -- Alan J Rosenthal explains "Pounds Sterling" on asr


From mal@lemburg.com  Wed Feb 13 16:33:45 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 13 Feb 2002 17:33:45 +0100
Subject: [Python-Dev] setup.py auto-conf (SSL support in _socket)
References: <3C692370.D21EF15D@lemburg.com> <m33d06xmfz.fsf@mira.informatik.hu-berlin.de> <3C6A303D.E0DDC16A@lemburg.com> <15466.31856.880945.17273@anthem.wooz.org> <2m66515sxc.fsf@starship.python.net>
Message-ID: <3C6A9569.B494377C@lemburg.com>

Michael Hudson wrote:
> 
> barry@zope.com (Barry A. Warsaw) writes:
> 
> > It shouldn't have been this difficult to debug.  Surely there must be
> > some way to tell setup.py not to delete .so's it can't import so we
> > have a prayer of finding the real problems.
> 
> Maybe it just shouldn't install the shared libs if they fail to
> import?

I'm not sure why setup.py is trying to be smart in the first 
place. A warning is certainly a good idea, but then setup.py
should let the user decide what to do about the problem, IMHO.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From gmccaughan@synaptics-uk.com  Wed Feb 13 17:11:57 2002
From: gmccaughan@synaptics-uk.com (Gareth McCaughan)
Date: Wed, 13 Feb 2002 17:11:57 +0000 (GMT)
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
Message-ID: <200202131712.RAA29416@synaptics-uk.com>

Michael Hudson wrote (replying to Barry Warsaw):

> > Very interesting!  Why the square brackets though?  Is that just for
> > visual offset or is there a grammar constraint that requires them?
> 
> Um, no big reason; they were what Gareth suggested, so I implemented
> that.  He may have got the idea from the slides from one of Guido's
> presentations -- it was reading them that reminded me I'd done this
> and wanted to mention it here.

Four reasons for the brackets.

1. Easier for the parser, I think.
2. Visually distinctive.
3. For me, it "reads" better than it would without the brackets.
4. Generalizes to a sequence of transformations.

#4 is much the most important of these in my mind.

One drawback of allowing an arbitrary list of transformations
is that it might not be completely clear what order they're done in.
I conjecture that most people will have the same intuition
as I do about this, namely that the first-listed transformation
is applied first. (It would be less obvious if the list came
before the name of the definiendum instead of after.)
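
To make the ordering question concrete, here is a sketch in ordinary Python of the "first-listed is applied first" reading. The helper apply_transforms and the toy transformations are hypothetical names for illustration only; this is not the patch's implementation and says nothing about what the patch actually does.

```python
def apply_transforms(func, transforms):
    """Apply each transformation in turn, first-listed first."""
    for transform in transforms:
        func = transform(func)
    return func


def add_one(f):
    # Wrap f so that the result is one larger.
    return lambda: f() + 1


def double(f):
    # Wrap f so that the result is doubled.
    return lambda: f() * 2


def one():
    return 1


h = apply_transforms(one, [add_one, double])
g = apply_transforms(one, [double, add_one])
print(h())   # (1 + 1) * 2 == 4
print(g())   # (1 * 2) + 1 == 3
```

The two results differ, so whichever order is chosen needs to be documented.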

Oh, and for the record: My suggestion was made long before I
ever saw Guido's slides. :-)

-- 
Gareth McCaughan



From paul@prescod.net  Wed Feb 13 17:33:56 2002
From: paul@prescod.net (Paul Prescod)
Date: Wed, 13 Feb 2002 09:33:56 -0800
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
References: <200202131712.RAA29416@synaptics-uk.com>
Message-ID: <3C6AA384.64D27FD0@prescod.net>

Gareth McCaughan wrote:
> 
>...
> 4. Generalizes to a sequence of transformations.

For me, this is crucial. A future version of Spark would probably put
its parse annotations in the brackets. Various type checking systems would
probably do the same:

    def t_whitespace(self, s)[
            grammar(r' \s+'),
            type(Node)]:
        pass

This is going to happen so we need to be confident that we like this use
of the syntax. I've been waiting for something like this for a while.

 Paul Prescod


From barry@zope.com  Wed Feb 13 19:32:12 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 13 Feb 2002 14:32:12 -0500
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
References: <2m6652my0x.fsf@starship.python.net>
 <15465.23325.41291.966138@anthem.wooz.org>
 <2mheolmzp2.fsf@starship.python.net>
 <15466.32023.403192.162135@anthem.wooz.org>
 <2m8z9x5t0o.fsf@starship.python.net>
Message-ID: <15466.48956.191807.871000@anthem.wooz.org>

>>>>> "MH" == Michael Hudson <mwh@python.net> writes:

    MH> barry@zope.com (Barry A. Warsaw) writes:

    >> "MH" == Michael Hudson <mwh@python.net> writes:
    >> I'd leave them out of the picture, unless you mean to imply
    >> that a list is acceptable in that position <wink>.
    >> MH> Well, it is, at the moment...
    >> Well, that's pretty neat!  Maybe FAST, but neat. :)

    MH> No, you're going to have to explain that.  (Googling for
    MH> "FAST" isn't terribly enlightening...).

It stands for "fascinating and stomach turning", a reference to a
docstring-based mechanism John Aycock used for his parser technology.

:)

-Barry


From barry@zope.com  Wed Feb 13 19:33:13 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 13 Feb 2002 14:33:13 -0500
Subject: [Python-Dev] SSL support in _socket
References: <3C692370.D21EF15D@lemburg.com>
 <m33d06xmfz.fsf@mira.informatik.hu-berlin.de>
 <3C6A303D.E0DDC16A@lemburg.com>
 <15466.31856.880945.17273@anthem.wooz.org>
 <2m66515sxc.fsf@starship.python.net>
Message-ID: <15466.49017.607224.678650@anthem.wooz.org>

>>>>> "MH" == Michael Hudson <mwh@python.net> writes:

    >> It shouldn't have been this difficult to debug.  Surely there
    >> must be some way to tell setup.py not to delete .so's it can't
    >> import so we have a prayer of finding the real problems.

    MH> Maybe it just shouldn't install the shared libs if they fail
    MH> to import?

That would be an improvement because at least there'd be an artifact
you can poke at after the build process is complete.

-Barry


From Anthony Baxter <anthony@ekit-inc.com>  Thu Feb 14 00:55:52 2002
From: Anthony Baxter <anthony@ekit-inc.com> (Anthony Baxter)
Date: Thu, 14 Feb 2002 11:55:52 +1100
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: Message from "M.-A. Lemburg" <mal@lemburg.com>
 of "Wed, 13 Feb 2002 10:22:05 BST." <3C6A303D.E0DDC16A@lemburg.com>
Message-ID: <200202140055.g1E0tq420840@burswood.off.ekorp.com>


The whole subject of socket and SSL support came up at a lunchtime
chat on developers day (and my jetlagged brain has totally failed to
supply me the names of who I was talking to at the time...). Wouldn't
it be better to rip the SSL support out entirely, and provide a way
to hook the transport layer stuff on top of the standard socket 
object? 

Anthony

-- 
Anthony Baxter     <anthony@interlink.com.au>   
It's never too late to have a happy childhood.



From martin@v.loewis.de  Thu Feb 14 01:02:35 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Feb 2002 02:02:35 +0100
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: <200202140055.g1E0tq420840@burswood.off.ekorp.com>
References: <200202140055.g1E0tq420840@burswood.off.ekorp.com>
Message-ID: <m3k7tgj2ok.fsf@mira.informatik.hu-berlin.de>

Anthony Baxter <anthony@ekit-inc.com> writes:

> The whole subject of socket and SSL support came up at a lunchtime
> chat on developers day (and my jetlagged brain has totally failed to
> supply me the names of who I was talking to at the time...). Wouldn't
> it be better to rip the SSL support out entirely, and provide a way
> to hook the transport layer stuff on top of the standard socket 
> object? 

With OpenSSL? How do you make OpenSSL's internals use the Python
socket object?

Regards,
Martin



From jeremy@zope.com  Thu Feb 14 17:39:26 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Thu, 14 Feb 2002 12:39:26 -0500
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: <200202140055.g1E0tq420840@burswood.off.ekorp.com>
References: <mal@lemburg.com>
 <3C6A303D.E0DDC16A@lemburg.com>
 <200202140055.g1E0tq420840@burswood.off.ekorp.com>
Message-ID: <15467.63054.123422.382424@gondolin.digicool.com>

>>>>> "AB" == Anthony Baxter <anthony@ekit-inc.com> writes:

  AB> The whole subject of socket and SSL support came up at a
  AB> lunchtime chat on developers day (and my jetlagged brain has
  AB> totally failed to supply me the names of who I was talking to at
  AB> the time...). Wouldn't it be better to rip the SSL support out
  AB> entirely, and provide a way to hook the transport layer stuff on
  AB> top of the standard socket object?

It is certainly attractive to focus future development on a separate C
extension module.  The current SSL support was included because we
thought it would be nice to allow people to open https URLs.  The code
itself is problematic for many reasons, not least of which is its very
minimal feature set.  But getting the right Python interface for a
large library like OpenSSL is a big task.  I think it's better suited
for 3rd party libraries than the core (and such libraries do exist,
though I've never used them).

Jeremy



From mal@lemburg.com  Thu Feb 14 17:45:40 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Feb 2002 18:45:40 +0100
Subject: [Python-Dev] SSL support in _socket
References: <mal@lemburg.com>
 <3C6A303D.E0DDC16A@lemburg.com>
 <200202140055.g1E0tq420840@burswood.off.ekorp.com> <15467.63054.123422.382424@gondolin.digicool.com>
Message-ID: <3C6BF7C4.5386DA52@lemburg.com>

Jeremy Hylton wrote:
> 
> >>>>> "AB" == Anthony Baxter <anthony@ekit-inc.com> writes:
> 
>   AB> The whole subject of socket and SSL support came up at a
>   AB> lunchtime chat on developers day (and my jetlagged brain has
>   AB> totally failed to supply me the names of who I was talking to at
>   AB> the time...). Wouldn't it be better to rip the SSL support out
>   AB> entirely, and provide a way to hook the transport layer stuff on
>   AB> top of the standard socket object?
> 
> It is certainly attractive to focus future development on a separate C
> extension module.  The current SSL support was included because we
> thought it would be nice to allow people to open https URLs.  The code
> itself is problematic for many reasons, not least of which is its very
> minimal feature set.  But getting the right Python interface for a
> large library like OpenSSL is a big task.  I think it's better suited
> for 3rd party libraries than the core (and such libraries do exist,
> though I've never used them).

FYI, I'm moving the SSL support out of _socket and into _ssl.c. socket.py
will then try to import _ssl, but move along if it cannot import
that module for some reason.
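
Roughly, the "try to import, but move along" pattern could look like the sketch below. The helper optional_import and the HAVE_SSL flag are hypothetical names chosen for illustration; this is not the actual socket.py code.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # A missing or broken extension must not take the caller down too.
        return None


# Assumption: the SSL glue lives in an extension module called "_ssl".
ssl_support = optional_import("_ssl")
HAVE_SSL = ssl_support is not None
```

Plain socket support then stays available whether or not the SSL extension could be built or loaded.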

For true SSL support, you should look at M2Crypto.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Thu Feb 14 18:43:15 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Feb 2002 19:43:15 +0100
Subject: [Python-Dev] The Python of Pythagoras?
Message-ID: <3C6C0543.D6F57864@lemburg.com>

Pythaguidoras ?!

   http://greatserpentmound.org/articles/python.html

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From Greg.Wilson@baltimore.com  Thu Feb 14 18:42:02 2002
From: Greg.Wilson@baltimore.com (Greg Wilson)
Date: Thu, 14 Feb 2002 13:42:02 -0500
Subject: [Python-Dev] student projects
Message-ID: <930BBCA4CEBBD411BE6500508BB3328F523333@nsamcanms1.ca.baltimore.com>

I'm going to be supervising one-term programming projects
for several (3-4) senior Computer Science majors starting
in September.  They'll have 5-6 hours a week for 13 weeks
to (a) learn their way around whatever technology is thrown
at them, (b) build something worth building, and (c) write
it up.

So: any little itches anyone on this list would like to
see scratched?  Any little add-ons or uses for Jabber,
SOAP, etc. that would only take you a weekend or two, but
you've never quite gotten around to building?

Thanks,
Greg

p.s. please reply to me directly; if there's enough interest,
I'll put together a summary and re-post.




From trentm@ActiveState.com  Thu Feb 14 22:58:12 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Thu, 14 Feb 2002 14:58:12 -0800
Subject: [Python-Dev] Is Barry receiving email at barry@zope.com? PEP number please.
Message-ID: <20020214145812.F25577@ActiveState.com>

Barry,
Could I have a PEP number for my logging system proposal please?
Here is what I have put together so far.

Others,
Feel free to send me comments on this if you like. I will officially post a
request for comment when I get a PEP number for it.


Trent

------------------------------------------------------------------------------
PEP: XXX
Title: A Logging System
Version: $Revision$
Last-Modified: $Date$
Author: trentm@activestate.com (Trent Mick)
Python-Version: 2.3
Status: Draft
Type: Standards Track
Created: 4-Feb-2002
Post-History:


Abstract

    This PEP describes a proposed logging package for Python's standard
    library.

    Basically the system involves the user creating one or more logging
    objects on which methods are called to log debugging notes/general
    information/warnings/errors/etc. Different logging 'levels' can be used
    to distinguish important messages from trivial ones.
    
    A registry of named singleton logger objects is maintained so that
        (1) different logical logging streams (or 'channels') exist (say, one
            for 'zope.zodb' stuff and another for 'mywebsite'-specific
            stuff); and
        (2) one does not have to pass logger object references around.

    The system is configurable at runtime. This configuration mechanism
    allows one to tune the level and type of logging done while not touching
    the application itself.

    
Motivation

    If a single logging mechanism is enshrined in the standard library, 1)
    logging is more likely to be done 'well', and 2) multiple libraries will
    be able to be integrated into larger applications which can be logged
    reasonably coherently.


Influences

    This proposal was put together after having somewhat studied the
    following logging packages:
        o java.util.logging in JDK 1.4 (a.k.a. JSR047) [1]
        o log4j [2]
          These two systems are *very* similar.
        o the Syslog package from the Protomatter project [3]
        o MAL's mx.Log package [4]

    This proposal will basically look like java.util.logging with a
    smattering of log4j.


Simple Example

    This shows a very simple example of how the logging package can be used
    to generate simple logging output on stdout.
    
        --------- mymodule.py -------------------------------
        import logging
        log = logging.getLogger("MyModule")

        def doit():
            log.debug("doin' stuff")
            # do stuff ...
        -----------------------------------------------------

        --------- myapp.py ----------------------------------
        import mymodule, logging
        log = logging.getLogger("MyApp")

        log.info("start my app")
        try:
            mymodule.doit()
        except Exception, e:
            log.error("There was a problem doin' stuff.")
        log.info("end my app")
        -----------------------------------------------------

    > python myapp.py
    0    [myapp.py:4] INFO  MyApp - start my app
    36   [mymodule.py:5] DEBUG MyModule - doin' stuff
    51   [myapp.py:9] INFO  MyApp - end my app
    ^^   ^^^^^^^^^^^^ ^^^^  ^^^^^   ^^^^^^^^^^
    |    |            |     |       `-- message
    |    |            |     `-- logging name/channel
    |    |            `-- level
    |    `-- location
    `-- time

    NOTE: Not sure exactly what the default format will look like yet.


Control Flow

    [Note: excerpts from Java Logging Overview. [1]]

    Applications make logging calls on *Logger* objects. Loggers are
    organized in a hierarchical namespace and child Loggers may inherit some
    logging properties from their parents in the namespace.
    
    Notes on namespace: Logger names fit into a "dotted name" namespace, with
    dots (periods) indicating subnamespaces.  The namespace of logger objects
    therefore corresponds to a single tree data structure.

       "" is the root of the namespace
       "Zope" would be a child node of the root
       "Zope.ZODB" would be a child node of "Zope"

    These Logger objects allocate *LogRecord* objects which are passed to
    *Handler* objects for publication. Both Loggers and Handlers may use
    logging *levels* and (optionally) *Filters* to decide if they are
    interested in a particular LogRecord. When it is necessary to publish a
    LogRecord externally, a Handler can (optionally) use a *Formatter* to
    localize and format the message before publishing it to an I/O stream.

    Each Logger keeps track of a set of output Handlers. By default all
    Loggers also send their output to their parent Logger. But Loggers may
    also be configured to ignore Handlers higher up the tree. 

    The APIs are structured so that calls on the Logger APIs can be cheap
    when logging is disabled. If logging is disabled for a given log level,
    then the Logger can make a cheap comparison test and return. If logging
    is enabled for a given log level, the Logger is still careful to minimize
    costs before passing the LogRecord into the Handlers. In particular,
    localization and formatting (which are relatively expensive) are deferred
    until the Handler requests them.
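
    A minimal sketch of this flow, under stated assumptions: the numeric
    level values, the 'propagate' attribute, the ListHandler class and the
    get_logger() helper are hypothetical names chosen for illustration, not
    part of the API proposed here.

```python
DEBUG, INFO, WARN, ERROR, FATAL = 10, 20, 30, 40, 50  # assumed values


class LogRecord:
    def __init__(self, name, level, msg, args):
        self.name, self.level, self.msg, self.args = name, level, msg, args

    def getMessage(self):
        # Formatting is deferred until a Handler actually asks for the text.
        return self.msg % self.args if self.args else self.msg


class ListHandler:
    """Hypothetical Handler that collects formatted lines in a list."""

    def __init__(self):
        self.lines = []

    def handle(self, record):
        self.lines.append("%s - %s" % (record.name, record.getMessage()))


class Logger:
    def __init__(self, name, parent=None, level=DEBUG):
        self.name, self.parent, self.level = name, parent, level
        self.handlers = []
        self.propagate = True   # False: ignore Handlers higher up the tree

    def log(self, level, msg, *args):
        if level < self.level:
            return              # cheap integer comparison, nothing allocated
        record = LogRecord(self.name, level, msg, args)
        logger = self
        while logger is not None:
            for handler in logger.handlers:
                handler.handle(record)
            logger = logger.parent if logger.propagate else None


_registry = {"": Logger("")}    # "" is the root of the namespace


def get_logger(name):
    """Return the Logger for a dotted name, creating missing ancestors."""
    if name not in _registry:
        parent_name = name.rpartition(".")[0]   # "Zope.ZODB" -> "Zope"
        _registry[name] = Logger(name, parent=get_logger(parent_name))
    return _registry[name]
```

    In this sketch a record logged on "Zope.ZODB" reaches Handlers attached
    to "Zope" and to the root unless 'propagate' is switched off somewhere
    along the way.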


Levels
    
    The logging levels, in increasing order of importance, are:
        DEBUG
        INFO
        WARN
        ERROR
        FATAL
        ALL
    This is consistent with log4j and Protomatter's Syslog and not with
    JSR047 which has a few more levels and some different names.

    Implementation-wise: these are just integer constants, to allow simple
    comparison of importance.  See "What Logging Levels?" below for a debate
    on what standard levels should be defined.
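
    As a rough illustration of "just integer constants": the numeric values
    below are an assumption (this PEP does not fix them), and is_enabled()
    is a hypothetical helper name.  ALL is left out because its numeric
    value is not settled here.

```python
# Assumed numeric values; only their relative order matters.
DEBUG = 10
INFO = 20
WARN = 30
ERROR = 40
FATAL = 50

LEVEL_NAMES = {DEBUG: "DEBUG", INFO: "INFO", WARN: "WARN",
               ERROR: "ERROR", FATAL: "FATAL"}


def is_enabled(level, threshold):
    """A message is logged when its level is at least the threshold."""
    return level >= threshold
```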


Loggers

    Each Logger object keeps track of a log level (or threshold) that it is
    interested in, and discards log requests below that level.

    The *LogManager* maintains a hierarchical namespace of named Logger
    objects. Generations are denoted with dot-separated names: Logger "foo"
    is the parent of Loggers "foo.bar" and "foo.baz".

    The main logging method is:
        class Logger:
            def log(self, level, msg, *args):
                """Log 'msg % args' at logging level 'level'."""
                ...
    however convenience functions are defined for each logging level:
            def debug(self, msg, *args): ...
            def info(self, msg, *args): ...
            def warn(self, msg, *args): ...
            def error(self, msg, *args): ...
            def fatal(self, msg, *args): ...

    XXX How to define a nice convenience function for logging an exception?
        mx.Log has something like this, doesn't it?
    XXX What about a .raising() convenience function? How about:
            def raising(self, exception, level=ERROR): ...
        It would create a log message describing an exception that is about
        to be raised. I don't like that 'level' is not first when it *is*
        first for .log().
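
    One possible shape for .raising(), sketched under assumptions: the
    numeric value of ERROR, the 'records' list standing in for real
    Handlers, and the "raising: ..." message wording are all made up for
    illustration.

```python
import traceback

ERROR = 40  # assumed numeric value


class Logger:
    def __init__(self):
        self.records = []       # stand-in for real Handlers

    def log(self, level, msg, *args):
        """Log 'msg % args' at logging level 'level'."""
        self.records.append((level, msg % args if args else msg))

    def error(self, msg, *args):
        self.log(ERROR, msg, *args)

    def raising(self, exception, level=ERROR):
        """Log a description of an exception that is about to be raised."""
        lines = traceback.format_exception_only(type(exception), exception)
        self.log(level, "raising: %s", "".join(lines).strip())
```

    A caller would then write something like
    "err = ValueError(...); log.raising(err); raise err".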


Handlers

    Handlers are responsible for doing something useful with a given
    LogRecord. The following core Handlers will be implemented:

    - StreamHandler: A handler for writing to a file-like object.
    - FileHandler: A handler for writing to a single file or set of rotating
      files.

    More standard Handlers may be implemented if deemed desirable and
    feasible. Other interesting candidates:

    - SocketHandler: A handler for writing to remote TCP ports.
    - CreosoteHandler: A handler for writing to UDP packets, for low-cost
      logging.  Jeff Bauer already had such a system [5].
    - MemoryHandler: A handler that buffers log records in memory (JSR047).
    - SyslogHandler: Akin to log4j's SyslogAppender.
    - NTEventLogHandler: Akin to log4j's NTEventLogAppender.
    - SMTPHandler: Akin to log4j's SMTPAppender.
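
    A minimal sketch of what a StreamHandler could look like.  The
    handle() method name and the optional 'formatter' callable are
    assumptions, and a plain string stands in for a LogRecord to keep the
    example short.

```python
import io


class StreamHandler:
    """Sketch of a Handler that writes to any file-like object."""

    def __init__(self, stream, formatter=None):
        self.stream = stream
        # 'formatter' is a hypothetical callable: record -> str
        self.formatter = formatter or str

    def handle(self, record):
        self.stream.write(self.formatter(record) + "\n")
        self.stream.flush()


buffer = io.StringIO()          # stands in for sys.stdout or an open file
handler = StreamHandler(buffer)
handler.handle("start my app")  # a plain string stands in for a LogRecord
```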


Formatters

    A Formatter is responsible for converting a LogRecord to a string
    representation. A Handler may call its Formatter before writing a
    record. The following core Formatters will be implemented:

    - Formatter: Provide printf-like formatting, perhaps akin to
      log4j's PatternLayout.
    
    Other possible candidates for implementation:

    - XMLFormatter: Serialize a LogRecord according to a specific schema.
      Could copy the schema from JSR047's XMLFormatter or log4j's
      XMLLayout.
    - HTMLFormatter: Provide a simple HTML output of log information. (See
      log4j's HTMLLayout.)
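
    A sketch of printf-like formatting, assuming for illustration that a
    record can be viewed as a mapping with 'level', 'name' and 'message'
    keys.  The default pattern is made up; the default format is not yet
    decided.

```python
class Formatter:
    """Render a record (here a plain dict) with a printf-style pattern."""

    def __init__(self, pattern="%(level)-5s %(name)s - %(message)s"):
        self.pattern = pattern

    def format(self, record):
        return self.pattern % record


line = Formatter().format(
    {"level": "INFO", "name": "MyApp", "message": "start my app"})
```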


Filters

    A Filter can be called by a Logger or Handler to decide if a LogRecord
    should be logged.

    JSR047 and log4j have slightly different filtering interfaces. The former
    is simpler:
        class Filter:
            def isLoggable(self, record):
                """Return a boolean: should 'record' be logged?"""
    The latter is modeled after Linux's ipchains (where Filters can be
    chained with each filter either 'DENY'ing, 'ACCEPT'ing, or being
    'NEUTRAL' on each check). I would probably favor the former because it
    is simpler and I don't immediately see the need for the latter.
    
    No filter implementations are currently proposed (other than the
    do-nothing base class) because I don't have enough experience to know what
    kinds of filters would be common. Users can always subclass Filter for
    their own purposes. Log4j includes a few filters that might be
    interesting.
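
    To make the simpler interface concrete, here is a sketch that assumes
    isLoggable() receives the record being considered.  ChannelFilter is a
    hypothetical subclass, and records are plain dicts in this example.

```python
class Filter:
    """Do-nothing base class: every record is loggable."""

    def isLoggable(self, record):
        return True


class ChannelFilter(Filter):
    """Hypothetical filter that accepts one channel and its children."""

    def __init__(self, prefix):
        self.prefix = prefix

    def isLoggable(self, record):
        name = record["name"]   # records are plain dicts in this sketch
        return name == self.prefix or name.startswith(self.prefix + ".")
```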


Configuration

    Note: Configuration for the proposed logging system is currently
    under-specified.

    The main benefit of a logging system like this is that one can control
    how much and what logging output one gets from an application without
    changing that application's source code.

    Log4j and Syslog provide for configuration via an external XML file.
    Log4j and JSR047 provide for configuration via Java properties (similar
    to -D #define's to a C/C++ compiler). All three provide for configuration
    via API calls.

    Configuration includes the following:
        - What logging level a logger should be interested in.
        - What handlers should be attached to which loggers.
        - What filters should be attached to which handlers and loggers.
        - Specifying attributes specific to certain Handlers and Filters.
        - Defining the default configuration.
        - XXX Add others. 

    In general each application will have its own requirements for how a user
    may configure logging output. One application (e.g. distutils) may want
    to control logging levels via '-q,--quiet,-v,--verbose' options to
    setup.py. Zope may want to configure logging via certain environment
    variables (e.g. 'STUPID_LOG_FILE' :). Komodo may want to configure
    logging via its preferences system.

    This PEP proposes to clearly document the API for configuring each of the
    above listed configurable elements and to define a reasonable default
    configuration.  This PEP does not propose to define a general XML or .ini
    file configuration schema and the backend to parse it.
    
    It might, however, be worthwhile to define an abstraction of the
    configuration API to allow the expressiveness of Syslog configuration.
    Greg Wilson made this argument:
        In Protomatter [Syslog], you configure by saying "give me everything
        that matches these channel+level combinations", such as
        "server.error" and "database.*".  The log4j "configure by
        inheritance" model, on the other hand, is very clever, but hard for
        non-programmers to manage without a GUI that essentially reduces it
        to Protomatter's.
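
    A sketch of the "channel+level combinations" idea using glob-style
    patterns.  This is only an interpretation of the examples quoted
    above, not Protomatter's actual syntax, and matches() is a
    hypothetical helper.

```python
from fnmatch import fnmatchcase


def matches(patterns, channel, level):
    """True if 'channel.level' matches any of the glob-style patterns."""
    key = "%s.%s" % (channel, level)
    return any(fnmatchcase(key, pattern) for pattern in patterns)


wanted = ["server.error", "database.*"]
```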


Case Scenarios

    This section presents a few usage scenarios which will be used to help
    decide how best to specify the logging API.

    (1) A short simple script.
        This script does not have many lines. It does not heavily use any
        third-party modules (i.e. the only code doing any logging would be
        the main script). Only one logging channel is really needed and
        thus, the channel name is unnecessary. The user doesn't want to
        bother with logging system configuration much.

    (2) Medium sized app with C extension module.
        Includes a few Python modules and a main script. Employs, perhaps, a
        few logging channels. Includes a C extension module which might want
        to make logging calls as well.

    (3) Distutils.
        A large number of Python packages/modules. Perhaps (but not
        necessarily) a number of logging channels are used. Specifically
        needs to facilitate controlling verbosity levels via simple
        command line options to 'setup.py'.

    (4) Large, possibly multi-language, app. E.g. Zope or (my experience)
        Komodo.
        (I don't expect this logging system to deal with any cross-language
        issues but it is something to think about.) Many channels are used.
        Many developers involved. People providing user support are possibly
        not the same people who developed the application. Users should be
        able to generate log files (i.e. configure logging) while reproducing
        a bug to send back to developers.
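
    For scenario (3), one way the command line options could map onto
    logging levels is sketched below.  The numeric values, the step of one
    level per flag and the level_from_argv() name are assumptions, not
    distutils behaviour.

```python
DEBUG, INFO, WARN, ERROR = 10, 20, 30, 40  # assumed numeric values


def level_from_argv(argv, default=INFO):
    """Map -q/--quiet and -v/--verbose flags to a logging threshold."""
    level = default
    for arg in argv:
        if arg in ("-v", "--verbose"):
            level = max(DEBUG, level - 10)   # one step chattier
        elif arg in ("-q", "--quiet"):
            level = min(ERROR, level + 10)   # one step quieter
    return level
```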


Implementation

    XXX Details to follow consensus that this proposal is a good idea.


What Logging Levels?

    The following are the logging levels defined by the systems I looked at:
    - log4j: DEBUG, INFO, WARN, ERROR, FATAL
    - syslog: DEBUG, INFO, WARNING, ERROR, FATAL
    - JSR047: FINEST, FINER, FINE, CONFIG, INFO, WARNING, SEVERE
    - zLOG (used by Zope):
        TRACE=-300   -- Trace messages
        DEBUG=-200   -- Debugging messages
        BLATHER=-100 -- Somebody shut this app up.
        INFO=0       -- For things like startup and shutdown.
        PROBLEM=100  -- This isn't causing any immediate problems, but
                        deserves attention.
        WARNING=100  -- A wishy-washy alias for PROBLEM.
        ERROR=200    -- This is going to have adverse effects.
        PANIC=300    -- We're dead!
    - mx.Log:
        SYSTEM_DEBUG
        SYSTEM_INFO
        SYSTEM_UNIMPORTANT
        SYSTEM_MESSAGE
        SYSTEM_WARNING
        SYSTEM_IMPORTANT
        SYSTEM_CANCEL
        SYSTEM_ERROR
        SYSTEM_PANIC
        SYSTEM_FATAL

    The current proposal is to copy log4j. XXX I suppose I could see adding
    zLOG's "TRACE" level, but I am not sure of the usefulness of others.


Static Logging Methods (as per Syslog)?

    Both zLOG and Syslog provide module-level logging functions rather than
    (or in addition to) logging methods on a created Logger object. XXX Is
    this something that is deemed worth including?

    Pros:
        - It would make the simplest case shorter:
            import logging
            logging.error("Something is wrong")
          instead of
            import logging
            log = logging.getLogger("")
            log.error("Something is wrong")

    Cons:
        - It provides more than one way to do it.
        - It encourages logging without a channel name, because this mechanism
          would likely be implemented by implicitly logging on the root (and
          nameless) logger of the hierarchy.
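
    A sketch of how such module-level functions might sit on top of the
    registry of named singleton loggers.  The 'messages' list stands in
    for real Handlers, and the internals shown are assumptions rather than
    a proposed implementation.

```python
class Logger:
    def __init__(self, name):
        self.name = name
        self.messages = []      # stand-in for real Handlers

    def error(self, msg, *args):
        self.messages.append("ERROR " + (msg % args if args else msg))


_loggers = {}


def getLogger(name=""):
    """Return the singleton Logger registered under 'name'."""
    if name not in _loggers:
        _loggers[name] = Logger(name)
    return _loggers[name]


def error(msg, *args):
    """Module-level shortcut: log on the root (nameless) logger."""
    getLogger("").error(msg, *args)
```

    The second "con" is visible here: error() has no channel name to work
    with, so everything lands on the root logger.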



References

    [1] java.util.logging
        http://java.sun.com/j2se/1.4/docs/guide/util/logging/

    [2] log4j: a Java logging package
        http://jakarta.apache.org/log4j/docs/index.html

    [3] Protomatter's Syslog
        http://protomatter.sourceforge.net/1.1.6/index.html
        http://protomatter.sourceforge.net/1.1.6/javadoc/com/protomatter/syslog/syslog-whitepaper.html

    [4] MAL mentions his mx.Log logging module:
        http://mail.python.org/pipermail/python-dev/2002-February/019767.html 

    [5] Jeff Bauer's Mr. Creosote
        http://starship.python.net/crew/jbauer/creosote/

Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End:



-- 
Trent Mick
TrentM@ActiveState.com


From DavidA@ActiveState.com  Fri Feb 15 01:06:52 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Thu, 14 Feb 2002 17:06:52 -0800
Subject: [Python-Dev] Reminder: Python track at OSCON -- Deadline March 1!
Message-ID: <3C6C5F2C.9E9B1B79@activestate.com>

Reminder:

The O'Reilly Open Source Convention (July 22-26, 2002 -- San Diego,
CA) is accepting proposals for tutorials, talks, panels, and lightning
talks.  See the Call for Participation in the Python and Zope track on
python.org.  Proposals are due by March 1, so don't wait a moment
longer!

Details available at:

CFP URL: http://www.python.org/workshops/oscon2002/cfp.html
form:    http://conferences.oreillynet.com/cs/os2002/create/e_sess

Cheers,

-- David Ascher
   [Guido's the program chair, but he's on the road, so I'm filling in]

PS: Feel free to resend this to whatever mailing lists may be
interested.


From barry@zope.com  Fri Feb 15 04:10:45 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 14 Feb 2002 23:10:45 -0500
Subject: [Python-Dev] Is Barry receiving email at barry@zope.com? PEP number please.
References: <20020214145812.F25577@ActiveState.com>
Message-ID: <15468.35397.821775.509858@anthem.wooz.org>

>>>>> "TM" == Trent Mick <trentm@ActiveState.com> writes:

    TM> Barry, Could I have a PEP number for my logging system
    TM> proposal please?  Here is what I have put together so far.

PEP 282.  Spell checked and formatted, and checked in.

(Remember that I usually batch up PEP stuff and handle them about once
a week, so please be patient if I don't get to it for a day or two.)

-Barry


From oren-py-d@hishome.net  Fri Feb 15 09:25:55 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 15 Feb 2002 04:25:55 -0500
Subject: [Python-Dev] syntactic sugar idea for {static,class}methods
In-Reply-To: <200202131712.RAA29416@synaptics-uk.com>
References: <200202131712.RAA29416@synaptics-uk.com>
Message-ID: <20020215092555.GA12028@hishome.net>

On Wed, Feb 13, 2002 at 05:11:57PM +0000, Gareth McCaughan wrote:
> One drawback of allowing an arbitrary list of transformations
> is that it might not be completely clear what order they're done in.
> I conjecture that most people will have the same intuition
> as I do about this, namely that the first-listed transformation
> is applied first. (It would be less obvious if the list came
> before the name of the definiendum instead of after.)

The modifier order [memoize, staticmethod] sounds more like the sentence 
"foo is a memoized staticmethod" - at least in English it does.  In French, 
Hebrew and several other languages it's the other way around, but Python 
is definitely English-oriented.

So, do adjectives come before or after the noun in Dutch? :-)

	Oren



From martin@v.loewis.de  Fri Feb 15 09:36:20 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: Fri, 15 Feb 2002 10:36:20 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
Message-ID: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>

I have a patch that makes the Tcl object API available to _tkinter, in
the sense that Tcl invocations don't necessarily return strings, but
return Tcl objects (or appropriately converted Python objects). This
both helps to improve efficiency, and to improve correctness of Tkinter
applications since less type guessing is needed.

For backward compatibility, there is an option on the tkapp object to
determine whether strings or objects are returned. This is on by
default when using Tkinter, but can be turned off through setting
Tkinter.want_objects to 0. I found that IDLE still works fine when
using Tcl objects.

Do I need to write a PEP for this change, should I post a patch to SF,
or should I just apply the change to the CVS?

Regards,
Martin


From gmccaughan@synaptics-uk.com  Fri Feb 15 10:09:24 2002
From: gmccaughan@synaptics-uk.com (Gareth McCaughan)
Date: Fri, 15 Feb 2002 10:09:24 +0000 (GMT)
Subject: Re[2]: [Python-Dev] syntactic sugar idea for {static,class}methods
In-Reply-To: <20020215092555.GA12028@hishome.net>
References: <200202131712.RAA29416@synaptics-uk.com>
 <20020215092555.GA12028@hishome.net>
Message-ID: <200202151010.KAA03070@synaptics-uk.com>

On Fri, 15 Feb 2002 04:25:55 -0500, Oren Tirosh <oren-py-d@hishome.net> wrote:

> On Wed, Feb 13, 2002 at 05:11:57PM +0000, Gareth McCaughan wrote:
> > One drawback of allowing an arbitrary list of transformations
> > is that it might not be completely clear what order they're done in.
> > I conjecture that most people will have the same intuition
> > as I do about this, namely that the first-listed transformation
> > is applied first. (It would be less obvious if the list came
> > before the name of the definiendum instead of after.)
> 
> The modifier order [memoize, staticmethod] sounds more like the sentence 
> "foo is a memoized staticmethod" - at least in English it does.  In French, 
> Hebrew and several other languages it's the other way around, but Python 
> is definitely English-oriented.

Interesting. I read it more as: "Define a function, then memoize it
and make it a static method".

> So, do adjectives come before or after the noun in Dutch? :-)

I don't think they do. :-)

By the way, the fact that adjectives go before nouns in English
is one reason why I don't read "def foo() [wibblify]" as if "wibblify"
is an adjective. It can't be: it comes after the noun.

PS. Court martial. C sharp. Letters patent. Bother. :-)

-- 
g




From mwh@python.net  Fri Feb 15 10:29:59 2002
From: mwh@python.net (Michael Hudson)
Date: 15 Feb 2002 10:29:59 +0000
Subject: Re[2]: [Python-Dev] syntactic sugar idea for {static,class}methods
In-Reply-To: Gareth McCaughan's message of "Fri, 15 Feb 2002 10:09:24 +0000 (GMT)"
References: <200202131712.RAA29416@synaptics-uk.com> <20020215092555.GA12028@hishome.net> <200202151010.KAA03070@synaptics-uk.com>
Message-ID: <2m3d03xck8.fsf@starship.python.net>

Gareth McCaughan <gmccaughan@synaptics-uk.com> writes:

> On Fri, 15 Feb 2002 04:25:55 -0500, Oren Tirosh <oren-py-d@hishome.net> wrote:
> > The modifier order [memoize, staticmethod] sounds more like the sentence 
> > "foo is a memoized staticmethod" - at least in English it does.  In French, 
> > Hebrew and several other languages it's the other way around, but Python 
> > is definitely English-oriented.
> 
> Interesting. I read it more as: "Define a function, then memoize it
> and make it a static method".

That's what my patch does, too, but I can't remember whether this was
by accident or design :-/.

Incidentally, I'm not sure 

class C:
    def s():
        print 1
    s = memoize(staticmethod(s))

would actually work (s would have type 'function').  I guess memoize
could be made cleverer than the version I posted.
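
For comparison, a sketch of the opposite nesting, where the function is
memoized first and the result is then wrapped as a static method.  The
memoize below is a hypothetical stand-in, not the version from the patch.

```python
def memoize(func):
    """Hypothetical memoizer; caches results by positional arguments."""
    cache = {}

    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


class C:
    calls = []

    def s(x):
        C.calls.append(x)       # record each real invocation
        return x * 2

    # memoize first, then wrap, so the class attribute is a staticmethod
    s = staticmethod(memoize(s))
```

With this nesting the attribute stored in the class is a staticmethod
object, so lookup through either the class or an instance gives the
memoized function without binding an instance to it.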

Cheers,
M.

-- 
  "declare"?  my bogometer indicates that you're really programming
  in some other language and trying to force Common Lisp into your
  mindset.  this won't work.            -- Erik Naggum, comp.lang.lisp


From jacobs@penguin.theopalgroup.com  Fri Feb 15 14:57:19 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 15 Feb 2002 09:57:19 -0500 (EST)
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <3C6C5F2C.9E9B1B79@activestate.com>
Message-ID: <Pine.LNX.4.33.0202150942190.20291-100000@penguin.theopalgroup.com>

[I tried to post this on SourceForge, but as usual, it hates my guts]

I have been hacking on ways to make lighter-weight Python objects using the
__slots__ mechanism that came with Python 2.2 new-style classes.  Everything
has gone swimmingly until I noticed that slots do not get pickled/cPickled
at all!

Here is a simple test case:

  import pickle,cPickle
  class Test(object):
    __slots__ = ['x']
    def __init__(self):
      self.x = 66666

  test = Test()

  pickle_str  = pickle.dumps( test )
  cpickle_str = cPickle.dumps( test )

  untest  = pickle.loads( pickle_str )
  untestc = cPickle.loads( cpickle_str )

  print untest.x    # raises AttributeError
  print untestc.x   # raises AttributeError

Clearly, this is incorrect behavior.  The problem is due to a change in object
reflection semantics.  Previously (before type-class unification), a
standard Python object instance always contained a __dict__ that listed all
of its attributes.  Now, with __slots__, some classes that do store
attributes have no __dict__ or one that only contains what did not fit into
slots.

Unfortunately, there is no trivial way to know what slots a particular class
instance really has. This is because the __slots__ list in classes and
instances can be mutable!  Changing these lists _does not_ change the object
layout at all, so I am unsure why they are not stored as tuples and the
'__slots__' attribute is not made read-only.  To be pedantic, the C
implementation does have an immutable and canonical list(s) of slots, though
they are well buried within the C extended type implementation.

So, IMHO this bug needs to be fixed in two steps:

First, I propose that class and instance __slots__ be made read-only and the lists
made immutable.  Otherwise, pickle, cPickle, and any others that want to use
reflection will be SOL.  There is certainly good precedent in several places
for this change (e.g., __bases__, __mro__, etc.) I can submit a fairly
trivial patch to do so.  This change requires Guido's input, since I am
guessing that I am simply not privy to the method, only the madness.

Second, after the first issue is resolved, pickle and cPickle must then be
modified to iterate over an instance's __slots__ (or possibly its class's)
and store any attributes that exist.  i.e., some __slots__ can be empty and
thus should not be pickled.  I can also whip up patches for this, though I'll
wait to see how the first issue shakes out.
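
Until something like that exists, a class can supply its own state.  The
following is a minimal sketch, not the proposed patch: slot_names() is a
hypothetical helper that walks the MRO and trusts each class's __slots__
attribute as written, which is exactly the assumption questioned above.

```python
def slot_names(cls):
    """Collect slot names declared anywhere in the class's MRO."""
    names = []
    for klass in cls.__mro__:
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)    # a bare string declares a single slot
        names.extend(slots)
    return names


class Test(object):
    __slots__ = ["x", "y"]

    def __init__(self):
        self.x = 66666          # 'y' is deliberately left empty

    def __getstate__(self):
        # Store only the slots that currently hold a value.
        return dict((name, getattr(self, name))
                    for name in slot_names(type(self))
                    if hasattr(self, name))

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)
```

With these two methods in place pickle has explicit state to store and
restore, independent of any __dict__, and empty slots are skipped.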

Regards,
-Kevin

PS:  There may be another problem when one class inherits from another
     and both have a slot with the same name.

     e.g.:
       class Test(object):
         __slots__ = ['a']

       class Test2(Test):
         __slots__ = ['a']

       test=Test()
       test2=Test2()
       test2.__class__ = Test

    This code results in this error:

      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
      TypeError: __class__ assignment: 'Test' object layout differs from 'Test2'

    However, Test2's slot 'a' entirely masks Test's slot 'a'.  So, either
    there should be some complex attribute access scheme to make both slots
    available OR (in my mind, the preferable solution) slots with the same
    name can simply be re-used or coalesced.  Now that I think about it,
    this has implications for pickling objects as well.  I'll likely leave
    this patch for Guido -- it tickles some fairly hairy bits of typeobject.

    Cool stuff, but the rabbit hole just keeps getting deeper and deeper....

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From barry@zope.com  Fri Feb 15 15:03:07 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 15 Feb 2002 10:03:07 -0500
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
Message-ID: <15469.9003.4790.636511@anthem.wooz.org>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    MvL> Do I need to write a PEP for this change, should I post a
    MvL> patch to SF, or should I just apply the change to the CVS?

As this is an enhancement to a library, I don't think you'd need a
PEP.

-Barry


From guido@python.org  Fri Feb 15 15:34:28 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Feb 2002 10:34:28 -0500
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: Your message of "Fri, 15 Feb 2002 10:36:20 +0100."
 <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
Message-ID: <200202151534.g1FFYSK25793@pcp742651pcs.reston01.va.comcast.net>

> I have a patch that makes the Tcl object API available to _tkinter, in
> the sense that Tcl invocations don't necessarily return strings, but
> return Tcl objects (or appropriately converted Python objects). This
> both helps to improve efficiency, and to improve correctness of Tkinter
> applications since less type guessing is needed.

Cool!

> For backward compatibility, there is an option on the tkapp object to
> determine whether strings or objects are returned. This is on by
> default when using Tkinter, but can be turned off through setting
> Tkinter.want_objects to 0. I found that IDLE still works fine when
> using Tcl objects.
> 
> Do I need to write a PEP for this change, should I post a patch to SF,
> or should I just apply the change to the CVS?

My only worry is about breaking old apps.  I'd like to see a patch on
SF.  No PEP is needed IMO.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Fri Feb 15 15:15:10 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 15 Feb 2002 16:15:10 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
Message-ID: <04dc01c1b636$d7324650$0900a8c0@spiff>

martin wrote:

> For backward compatibility, there is an option on the tkapp object to
> determine whether strings or objects are returned. This is on by
> default when using Tkinter

"on" as in "return strings" or "return objects" ?

I doubt it's a good idea to change the return type without any
warning.

> Do I need to write a PEP for this change, should I post a patch to SF,
> or should I just apply the change to the CVS?

if the default is "use old behaviour", check it in.

if you insist on changing the return types, post it to SF.

</F>



From john_coppola_r_s@yahoo.com  Fri Feb 15 17:57:12 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Fri, 15 Feb 2002 09:57:12 -0800 (PST)
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <Pine.LNX.4.33.0202150942190.20291-100000@penguin.theopalgroup.com>
Message-ID: <20020215175712.96785.qmail@web11807.mail.yahoo.com>

Hi Kevin.  I'm with you: slots have great potential.
First of all, I think it is a very bad idea that slots
can be mutable.  Here is something I wrote on
python-list.  (Python-list is too noisy and has too many
dumb questions from people who don't think before
tackling a problem, with the exception of an
interesting thread on licensing and a Spanish
message.)

Anyway, here is some food for thought on slots.
Perhaps long-winded and already talked about, but here
it is.

(If you want to skip the intro and get to the meat and
potatoes, go to the code and read my remarks.)
- - - - - -
Hello Python developers!  I have been reading about
Python 2.2's new features and I have to say there's
lots of good stuff going on. The feature I dedicate
this message to is slots.  This ubiquitous new feature
has the potential to boost the performance of Python quite
considerably, and it is the first time Python has ever
hinted at, for lack of a better word, "being static".

I use this word loosely.  What I mean is that
typically one could set attributes arbitrarily on
Python objects, but now, with slots, if __slots__ is
defined, one can set only the attribute names listed
in __slots__.

In the introductory essay on the new Python object system,
"Unifying types and classes in 2_2", I found the
benevolent dictator's answer for the existence of
__slots__ unsatisfactory (or at least the example).
As implied by the introductory essay, __slots__ was
used as a means to eliminate a double dictionary for
the dictionary subclass in the example (please don't
be offended -- so what?).  That is one application of
slots, but there is a more powerful way to use this
new feature.

First let me discuss some notational issues I have
with __slots__.  It is possible to use any object that
supports iteration as the __slots__ object (if my
memory serves me correctly).  I feel that this is
not a good idea.  Consider the following example:

    class Foo(object):
        __slots__=['a','b','c']

    foo=Foo()

Bad, but you could do it.

    foo.__class__.__slots__.append('d') #Ouch!

Suppose someone wrote code to initialize some instance
like so:

    for attr in foo.__class__.__slots__:
        setattr(foo,attr,0)

Granted, this is decidedly bad code, but the problem
it creates is that attribute "d" does not exist in
foo or any other instance of Foo, before or after the
"append event", so an AttributeError will be raised.  Slot
attributes are read-only, thus del foo.__slots__ is not
possible.  Neither should __slots__.append be
legal.  So the answer is to enforce that __slots__ is
a tuple, or for a single attribute a string,
when compiling objects.
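
The failure can be reproduced as follows. This is a sketch under the assumption of a current Python 3 interpreter, where a list assigned to __slots__ is still kept as-is on the class; the results dictionary is only there to record what happens:

```python
class Foo(object):
    __slots__ = ["a", "b", "c"]


foo = Foo()
Foo.__slots__.append("d")        # changes the list, not the object layout

results = {}
for attr in Foo.__slots__:
    try:
        setattr(foo, attr, 0)
        results[attr] = "ok"
    except AttributeError:
        # No slot was ever allocated for 'd', and foo has no __dict__.
        results[attr] = "AttributeError"
```

Declaring __slots__ as a tuple avoids the problem, since the append call itself then fails.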

Another issue I have with __slots__ is that there
exists better notation to do the same thing, and part
of Python is maintaining a clean coding style. Why not
introduce a new keyword into the language so
"static attribute sets" are created like so:

    class Foo(object):
        attribute a
        attribute b
        attribute c

Perhaps this syntax could be expanded further to
include strict typing:

    class Foo(object):
        attribute a is str
        attribute b is float
        attribute c is int

I feel that this notation is much more lucid and could
lead to even more interesting notation:

    class Foo(object):
        attribute a is str
        attribute b is float
        attribute c is property
        attribute g is property of float

I hope that it is understood from this context that
"is" compares type(<attrname>) to the trailing type in
the declaration.  The meaning of the modifier "of" in
the last line is a bit ambiguous at this moment.
Perhaps it enforces the type passed into fset, and the
return value of fget.

The programmer doesn't need to know what the
underlying __slots__ structure looks like, unless they
are doing some heavy-duty C, in which case they could
probably handle it in all its grand ugliness.

Is the Python interpreter __slots__ aware?  Instead of
the usual variable name lookup in __dict__ (which is
not supported by instances whose class implements
__slots__), a slot attribute could be directly
referenced via pointer indirection.  Does the
interpreter substitute references for slot
attributes?  For that matter, any namespace could
use this new approach, even the local
namespace of a code segment.  Slots could be built
implicitly for a local namespace.  This could lead to
a significant performance boost, or perhaps a JIT of
sorts for Python, because all you need is a bit of type
info, and those segments could be accurately
translated into native code.
------
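
On the question of how slot attributes are looked up: the sketch below, which assumes a current CPython 3 interpreter and uses a hypothetical Point class, shows that each slot appears on the class as a descriptor and that instances carry no __dict__. The lookup still goes through the class by name; the descriptor then reads the value from a fixed position in the object.

```python
class Point(object):
    __slots__ = ("x", "y")


p = Point()
p.x = 1.5

descriptor = Point.__dict__["x"]
kind = type(descriptor).__name__           # 'member_descriptor' on CPython
has_dict = hasattr(p, "__dict__")          # no per-instance dictionary
same_value = descriptor.__get__(p, Point) == p.x
```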


--- Kevin Jacobs <jacobs@penguin.theopalgroup.com>
wrote:
> [I tried to post this on SourceForge, but as usual,
> it hates my guts]
> 
> I have been hacking on ways to make lighter-weight
> Python objects using the
> __slots__ mechanism that came with Python 2.2
> new-style class.  Everything
> has gone swimmingly until I noticed that slots do
> not get pickled/cPickled
> at all!
> 
> Here is a simple test case:
> 
>   import pickle,cPickle
>   class Test(object):
>     __slots__ = ['x']
>     def __init__(self):
>       self.x = 66666
> 
>   test = Test()
> 
>   pickle_str  = pickle.dumps( test )
>   cpickle_str = cPickle.dumps( test )
> 
>   untest  = pickle.loads( pickle_str )
>   untestc = cPickle.loads( cpickle_str )
> 
>   print untest.x    # raises AttributeError
>   print untestc.x   # raises AttributeError
> 
> Clearly, this is incorrect behavior.  The problem is
> due to a change in object
> reflection semantics.  Previously (before type-class
> unification), a
> standard Python object instance always contained a
> __dict__ that listed all
> of its attributes.  Now, with __slots__, some
> classes that do store
> attributes have no __dict__ or one that only
> contains what did not fit into
> slots.
> 
> Unfortunately, there is no trivial way to know what
> slots a particular class
> instance really has. This is because the __slots__
> list in classes and
> instances can be mutable!  Changing these lists
> _does not_ change the object
> layout at all, so I am unsure why they are not
> stored as tuples and the
> '__slots__' attribute is not made read-only.  To be
> pedantic, the C
> implementation does have an immutable and canonical
> list(s) of slots, though
> they are well buried within the C extended type
> implementation.
> 
> So, IMHO this bug needs to be fixed in two steps:
> 
> First, I propose that class and instance __slots__
> be made read-only and the lists
> made immutable.  Otherwise, pickle, cPickle, and any
> others that want to use
> reflection will be SOL.  There is certainly good
> precedent in several places
> for this change (e.g., __bases__, __mro__, etc.) I
> can submit a fairly
> trivial patch to do so.  This change requires
> Guido's input, since I am
> guessing that I am simply not privy to the method,
> only the madness.
> 
> Second, after the first issue is resolved, pickle
> and cPickle must then be
> modified to iterate over an instance's __slots__ (or
> possibly its class's)
> and store any attributes that exist.  i.e., some
> __slots__ can be empty and
> thus should not be pickled.  I can also whip up
> patches for this, though I'll
> wait to see how the first issue shakes out.
> 
> Regards,
> -Kevin
> 
> PS:  There may be another problem when one
> class inherits from another
>      and both have a slot with the same name.
> 
>      e.g.:
>        class Test(object):
>          __slots__ = ['a']
> 
>        class Test2(Test):
>          __slots__ = ['a']
> 
>        test=Test()
>        test2=Test2()
>        test2.__class__ = Test
> 
>     This code results in this error:
> 
>       Traceback (most recent call last):
>         File "<stdin>", line 1, in ?
>       TypeError: __class__ assignment: 'Test' object
> layout differs from 'Test2'
> 
>     However, Test2's slot 'a' entirely masks Test's
> slot 'a'.  So, either
>     there should be some complex attribute access
> scheme to make both slots
>     available OR (in my mind, the preferable
> solution) slots with the same
>     name can simply be re-used or coalesced.  Now
> that I think about it,
>     this has implications for pickling objects as
> well.  I'll likely leave
>     this patch for Guido -- it tickles some fairly
> hairy bits of typeobject.
> 
>     Cool stuff, but the rabbit hole just keeps
> getting deeper and deeper....
> 
> --
> Kevin Jacobs
> The OPAL Group - Enterprise Systems Architect
> Voice: (216) 986-0710 x 19         E-mail:
> jacobs@theopalgroup.com
> Fax:   (216) 986-0714              WWW:   
> http://www.theopalgroup.com
> 
> 


From jacobs@penguin.theopalgroup.com  Fri Feb 15 18:08:39 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 15 Feb 2002 13:08:39 -0500 (EST)
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find
 __slots__
In-Reply-To: <20020215175712.96785.qmail@web11807.mail.yahoo.com>
Message-ID: <Pine.LNX.4.33.0202151302250.21612-100000@penguin.theopalgroup.com>

On Fri, 15 Feb 2002, john coppola wrote:
> Hi Kevin.  I'm with you slots have great potential.
> First of all, I think it is a very bad idea that slots
> could be mutable.

They don't have to be mutable -- it's more a relic of the implementation.
e.g.:

class A(object):
  __slots__ = ('a','b')

A.__slots__.append('c')

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'tuple' object has no attribute 'append'

Of course, I think that this is something of a mistake, though I'll reserve
final judgment until after I've heard Guido's reasoning.  It's clear that he
has bigger plans for both the syntax and semantics (especially if you heard
some of his off-hand remarks at IPC10).

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From skip@pobox.com  Fri Feb 15 19:13:34 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 15 Feb 2002 13:13:34 -0600
Subject: [Python-Dev] IRC?
Message-ID: <15469.24030.981884.932642@12-248-41-177.client.attbi.com>

Is there a set time people congregate on IRC (#python-dev @
irc.openprojects.net)? 

Skip


From mclay@nist.gov  Fri Feb 15 19:19:19 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 15 Feb 2002 14:19:19 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <20020215175712.96785.qmail@web11807.mail.yahoo.com>
References: <20020215175712.96785.qmail@web11807.mail.yahoo.com>
Message-ID: <200202151923.g1FJNVJ5017966@email.nist.gov>

On Friday 15 February 2002 12:57 pm, john coppola wrote:


> Another issue I have with __slots__ is that there
> exists better notation to do the same thing.  And part
> of python is to maintain a clean coding style. Why not
> introduce new a keyword into the language so
> "static attribute sets" are created like so:
>
>     class Foo(object):
>         attribute a
>         attribute b
>         attribute c

Guido suggested several alternative syntaxes during his presentation on
developers day at the Python conference.  His slides are online at
python.org.  He used the word "slot" instead of "attribute".  Both names have
advantages.  He questioned the use of the term "slot" and was looking for an
appropriate substitute, though it has the advantage of being short and easy to
type.  The term "attribute" would be more descriptive, but I find it very
long to type and somewhat visually distracting.

>
> Perhaps this syntax could be expanded further to
> include strict typing:
>
>     class Foo(object):
>         attribute a is str
>         attribute b is float
>         attribute c is int

I suggested a method for adding optional static typing to slots and submitted
a patch[1] last November.  My patch eliminated the special __slots__ member
and replaced it with a special function called addmember() that
could be used to add slot definitions to a class.  A docstring, an
optional type, and a default value could be specified as parameters to
addmember().  If a type or tuple of types was defined, the
tp_descr_set call in C would first check the type of the object being
assigned to the member name using the isinstance() builtin.  If isinstance()
failed, an exception would be raised.

While my approach was patterned after the property() builtin, the Python
Labs crowd didn't like the notation and rejected the patch.  (I don't think it
helped that the feature freeze was about to start and Guido was out on
paternity leave.)  Here is an example of how addmember() worked:

>>> class B(object):
    """class B's docstring
    """
    a = addmember(types=int, default=56, doc="a docstring")
    b = addmember(types=int, doc="b's docstring")
    c = addmember(types=(int,float), default=5.0, doc="c docstring")
    d = addmember(types=(str,float), default="ham", doc="d docstring")
>>> b = B()
>>> b.a
56
>>> b.d
'ham'
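
The behaviour shown above can be approximated in pure Python with a descriptor. What follows is only a sketch for a current Python 3 interpreter: TypedMember is a hypothetical stand-in and not the addmember() from the patch, which did its checking in C and stored values in slots rather than in the instance dictionary.

```python
class TypedMember(object):
    """Hypothetical stand-in for addmember(); not the code from the patch."""

    def __init__(self, types=None, default=None, doc=None):
        self.types = types
        self.default = default
        self.__doc__ = doc

    def __set_name__(self, owner, name):
        # Called by Python 3.6+ when the class body has been executed.
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        # Same check the patch description mentions: isinstance() on assignment.
        if self.types is not None and not isinstance(value, self.types):
            raise TypeError("%s must be of type %r" % (self.name, self.types))
        obj.__dict__[self.name] = value


class B(object):
    """class B's docstring"""
    a = TypedMember(types=int, default=56, doc="a docstring")
    d = TypedMember(types=(str, float), default="ham", doc="d docstring")
```

Assigning a value of the wrong type, such as B().a = "spam", raises TypeError in this sketch.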

[1]http://sourceforge.net/tracker/index.php?func=detail&aid=480562&group_id=5470&atid=305470

I think most everyone would agree that adding optional static typing could
enable some interesting optimizations, but not everyone at the conference was
supportive of the encroachment of static typing on their favorite
dynamically typed language.

Greg Ward presented a proposed syntax for optional static typing at the end
of the optimization session on developer's day.  Greg had made a
presentation on Grouch, his postprocessing type checker for ZODB, during the
lightning talks.  The proposed syntax for optional static typing was a meld of
Guido's proposed new syntax for slots, my addmember() work, and Greg's
Grouch.  Guido wasn't keen on the docstring notation, but otherwise seemed to
be receptive to the syntax and the idea of adding optional static typing.
Here is what Greg presented:

# ======================================================================
# WHAT IF...
#
#   Grouch's type syntax were integrated into Python?

class Thing:
    slot name : string
        "The name of the thing"

class Animal (Thing):
    slot num_legs : int = 4
        "Number of legs this animal has"
    slot furry : boolean = true	
        "Does this animal have full-body fur?"

class Vegetable (Thing):
    # hypothetical constraint (is this type info?)
    slot colour : string oneof ("red", "green", "blue")
        "Colour of this vegetable's external surface"

class Mineral (Thing):
    slot atoms : [(atom:Atom, n:int)]
        """Characterize the proportion of elements in this
        mineral's crystal lattice: each tuple records the
        relative count of that atom in the lattice."""
        # possible constraints on this attribute:
        #   n > 0
        #   atom in AtomSet (collection of all atoms)
        # how to code these?  do we even *want* syntax
        # for arbitrary constraints?  or just require that
        # you code a set_atoms() method? (or a property
        # modifier, or something)

>>> class B(object):
    """Docstring of class B
    """
    slot a: int = 56
        """a docstring
        """
        precondition:
            if b > 3:
                default=3

    slot b: int
        """Docstring of b
        """

    slot c: int | float = 5.0
        """Docstring of attribute c
        """

    slot d: str | float = "spam"
        """Docstring of d
        """
        postcondition:
            if self.b = 42.0

>>> b = B()
>>> b.a



From mal@lemburg.com  Fri Feb 15 19:35:44 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Feb 2002 20:35:44 +0100
Subject: [Python-Dev] IRC?
References: <15469.24030.981884.932642@12-248-41-177.client.attbi.com>
Message-ID: <3C6D6310.E69758D2@lemburg.com>

Skip Montanaro wrote:
> 
> Is there a set time people congregate on IRC (#python-dev @
> irc.openprojects.net)?

Not really. 

I think that IRC is mostly a waste of time unless
you have something serious to talk about (e.g. a meeting,
a specific problem, etc.), but maybe that's just me.

It would probably help with some tough problems though,
e.g. porting issues, etc. Sort of like online support :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From fdrake@acm.org  Fri Feb 15 19:35:20 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 15 Feb 2002 14:35:20 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <200202151923.g1FJNVJ5017966@email.nist.gov>
References: <20020215175712.96785.qmail@web11807.mail.yahoo.com>
 <200202151923.g1FJNVJ5017966@email.nist.gov>
Message-ID: <15469.25336.313450.27375@grendel.zope.com>

Michael McLay writes:
 > While my approach was patterned after the property() builtin, the
 > Python Labs crowd didn't like the notation and rejected the

I'll note as well that at least some of us, if not all, don't like the
property() syntax either.  My current favorite was one of Guido's
proposals at Python 10:


class Foo(object):
    property myprop:
        """A computed property on Foo objects."""

        def __get__(self):
            return ...
        def __set__(self):
            ...
        def __delete__(self):
            ...
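
For comparison, here is the same property written with the property() builtin as it exists today. This is a sketch for illustration; the _value attribute and the helper method names are assumptions, not part of the proposal.

```python
class Foo(object):
    def __init__(self):
        self._value = 0

    def _get_myprop(self):
        return self._value

    def _set_myprop(self, value):
        self._value = value

    def _del_myprop(self):
        del self._value

    # The three accessors and the docstring are passed positionally:
    # property(fget, fset, fdel, doc)
    myprop = property(_get_myprop, _set_myprop, _del_myprop,
                      "A computed property on Foo objects.")
```

The helper methods stay visible in the class namespace, which is the clutter the proposed block syntax would avoid.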
        


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From mclay@nist.gov  Fri Feb 15 19:46:10 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 15 Feb 2002 14:46:10 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <15469.25336.313450.27375@grendel.zope.com>
References: <20020215175712.96785.qmail@web11807.mail.yahoo.com> <200202151923.g1FJNVJ5017966@email.nist.gov> <15469.25336.313450.27375@grendel.zope.com>
Message-ID: <200202151950.g1FJoMJ5006922@email.nist.gov>

On Friday 15 February 2002 02:35 pm, Fred L. Drake, Jr. wrote:
> Michael McLay writes:
>  > While my approach was patterned after the property() builtin, the
>  > Python Labs crowd didn't like the notation and rejected the
>
> I'll note as well that at least some of us, if not all, don't like the
> property() syntax either.  My current favorite was one of Guido's
> proposals at Python 10:

I agree with you on this being a better notation. It unclutters the class 
definition. Had Guido suggested the alternative slot syntaxes back at the 
start of November I would have used one of the alternative syntaxes instead 
of creating a new builtin function.  BTW, adding a builtin function is a 
pain. The trick of counting the number of parameters to determine behavior 
caused strange things to happen during the testing of the addmember function.

> class Foo(object):
>     property myprop:
>         """A computed property on Foo objects."""
>
>         def __get__(self):
>             return ...
>         def __set__(self):
>             ...
>         def __delete__(self):

Is someone working on an implementation of this?



From fdrake@acm.org  Fri Feb 15 20:00:44 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 15 Feb 2002 15:00:44 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <200202151950.g1FJoMJ5006922@email.nist.gov>
References: <20020215175712.96785.qmail@web11807.mail.yahoo.com>
 <200202151923.g1FJNVJ5017966@email.nist.gov>
 <15469.25336.313450.27375@grendel.zope.com>
 <200202151950.g1FJoMJ5006922@email.nist.gov>
Message-ID: <15469.26860.234987.122278@grendel.zope.com>

Michael McLay writes:
 > Is someone working on an implementation of this?

Not that I know of.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From tim.one@comcast.net  Fri Feb 15 20:13:32 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 15 Feb 2002 15:13:32 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find
 __slots__
In-Reply-To: <200202151950.g1FJoMJ5006922@email.nist.gov>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEKKNMAA.tim.one@comcast.net>

[Michael McLay]
> ...
> Had Guido suggested the alternative slot syntaxes back at the start of
> November I would have used one of the alternative syntaxes instead
> of creating a new builtin function.

Guido said several times during 2.2 development that he didn't like using builtin
functions for some of the new class features, but that syntax issues were off the table
before 2.3 because there wasn't time to address them for 2.2.  He didn't repeat this
every time it was brought up, though; you can't imagine how pressed for time we all were,
and Guido especially.

>> class Foo(object):
>>     property myprop:
>>         """A computed property on Foo objects."""
>>
>>         def __get__(self):
>>             return ...
>>         def __set__(self):
>>             ...
>>         def __delete__(self):

> Is someone working on an implementation of this?

Not within PythonLabs at present.



From john_coppola_r_s@yahoo.com  Fri Feb 15 20:56:50 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Fri, 15 Feb 2002 12:56:50 -0800 (PST)
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <200202151921.g1FJLdJ5016693@email.nist.gov>
Message-ID: <20020215205650.44487.qmail@web11808.mail.yahoo.com>

I definitely prefer "slot" to "attribute".
But maybe we could just use def.

class Foo:
   def a
   def b
   def c
   def somemethod(self):
      pass

The distinction is that the def statement is "unbound",
so to speak.  Hmm... that code is not clear enough.

class Foo:
   slot a
   slot b
   slot c
   def foo(self):
      pass

Better, much better.

class Foo:
   slot a is str
   slot b is float
   slot c is property of float
   def foo(self): pass

I like this.  It reads like sentences.
(Remember, 'of' modifies property to ensure that the
fget return type is correct and that the object passed
in to fset is correct.)

I didn't much care for Grouch's notation.
In particular, if a slot could have multiple types,
why bother assigning types to it at all?  It should
definitely be singular.  One to one.

The concept of a slot does not need to be solely related
to attributes within a class.  Why not reserve slots
for methods, classes within a module, and modules
imported within modules?  Then it will be easier to
see the overall picture.  Does the slot pattern relate
to every object in Python?  I think it does.  That's
when the real benefit comes in.  If Python could
use this pattern in every aspect, the big
performance boost will occur.

In a strange way, slots have made Python even more
dynamic than it ever was.  Prior to slots, objects had
a static C structure.  Slots enable variability in
another dimension for the underlying C struct.  On the
outside it looks like Python is becoming static, but
what's really going on under the hood is quite the
contrary.  There is definitely more burden on the compiler to
build correct references and make the correct
substitutions.







From john_coppola_r_s@yahoo.com  Fri Feb 15 21:03:10 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Fri, 15 Feb 2002 13:03:10 -0800 (PST)
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <15469.25336.313450.27375@grendel.zope.com>
Message-ID: <20020215210310.45339.qmail@web11808.mail.yahoo.com>

This is beautiful! It's like an inner class.
Perfect!  Encapsulated delegate.
I can't wait for Python 3.0.

> class Foo(object):
>     property myprop:
>         """A computed property on Foo objects."""
> 
>         def __get__(self):
>             return ...
>         def __set__(self):
>             ...
>         def __delete__(self):
>             ...
>   -Fred
> 
> -- 
> Fred L. Drake, Jr.  <fdrake at acm.org>
> PythonLabs at Zope Corporation




From fdrake@acm.org  Fri Feb 15 21:04:33 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 15 Feb 2002 16:04:33 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <20020215210310.45339.qmail@web11808.mail.yahoo.com>
References: <15469.25336.313450.27375@grendel.zope.com>
 <20020215210310.45339.qmail@web11808.mail.yahoo.com>
Message-ID: <15469.30689.678773.846329@grendel.zope.com>

john coppola writes:
 > This is beautiful! It's like an inner class.
 > Perfect!  Encapsulated delegate.
 > I can't wait for Python 3.0.

I can't wait to work on it!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From jacobs@penguin.theopalgroup.com  Fri Feb 15 21:09:01 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 15 Feb 2002 16:09:01 -0500 (EST)
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find
 __slots__
In-Reply-To: <15469.30689.678773.846329@grendel.zope.com>
Message-ID: <Pine.LNX.4.33.0202151607080.24268-100000@penguin.theopalgroup.com>

On Fri, 15 Feb 2002, Fred L. Drake, Jr. wrote:
> john coppola writes:
>  > This is beautiful! It's like an inner class.
>  > Perfect!  Encapsulated delegate.
>  > I can't wait for Python 3.0.
>
> I can't wait to work on it!

In the meantime, does anyone have any comments on the original bug
report(s)?

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From fdrake@acm.org  Fri Feb 15 21:10:17 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 15 Feb 2002 16:10:17 -0500
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find __slots__
In-Reply-To: <15469.25336.313450.27375@grendel.zope.com>
References: <20020215175712.96785.qmail@web11807.mail.yahoo.com>
 <200202151923.g1FJNVJ5017966@email.nist.gov>
 <15469.25336.313450.27375@grendel.zope.com>
Message-ID: <15469.31033.656478.828919@grendel.zope.com>

Fred L. Drake, Jr. writes:
[describing a suggested property syntax]
 > class Foo(object):
 >     property myprop:
 >         """A computed property on Foo objects."""
 > 
 >         def __get__(self):
 >             return ...

Perhaps it was obvious to everyone else, but it just occurred to me
that this lends itself to inheriting descriptor types:


class ReadOnly(object):
    def __get__(self):
        raise NotImplementedError("sub-class must override this!")

    def __set__(self):
        raise AttributeError("read-only attribute")

    def __delete__(self):
        raise AttributeError("read-only attribute")


class Foo(object):
    property myprop(ReadOnly):
        def __get__(self):
            return ...
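
With the descriptor protocol as it exists today, the same inheritance idea can be sketched without new syntax. The example assumes a current Python 3 interpreter; the Answer subclass and its return value are made up for illustration, and the method signatures are the real protocol ones rather than the abbreviated forms above.

```python
class ReadOnly(object):
    def __get__(self, obj, objtype=None):
        raise NotImplementedError("sub-class must override this!")

    def __set__(self, obj, value):
        raise AttributeError("read-only attribute")

    def __delete__(self, obj):
        raise AttributeError("read-only attribute")


class Answer(ReadOnly):
    # Only the getter is overridden; set and delete are inherited.
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self          # accessed on the class itself
        return 42


class Foo(object):
    myprop = Answer()
```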



  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From martin@v.loewis.de  Fri Feb 15 20:13:43 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Feb 2002 21:13:43 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <04dc01c1b636$d7324650$0900a8c0@spiff>
References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
 <04dc01c1b636$d7324650$0900a8c0@spiff>
Message-ID: <m3r8nm1p1k.fsf@mira.informatik.hu-berlin.de>

"Fredrik Lundh" <fredrik@pythonware.com> writes:

> > For backward compatibility, there is an option on the tkapp object to
> > determine whether strings or objects are returned. This is on by
> > default when using Tkinter
> 
> "on" as in "return strings" or "return objects" ?

In Tkinter, it returns objects by default.

> I doubt it's a good idea to change the return type without any
> warning.

This is not as bad as it sounds. For most functions, the return type
does not change at all. Consider

    def winfo_depth(self):
        """Return the number of bits per pixel."""
        return getint(self.tk.call('winfo', 'depth', self._w))

'winfo depth' will return a Tcl int in Tcl, which is currently
converted to a string in _tkinter, then converted back to an int.
With the change, tk.call will already return an int, so the getint
invocation becomes a no-op.

For others, a conversion into string will continue to return the value
that it currently returns:

>>> l=Tkinter.Label()
>>> l.config("foreground")[3]
<color object at 0x0824b4b0>
>>> str(_)
'Black'

I would expect that few if any applications will be affected; those
would need to change the default after import Tkinter.

> if the default is "use old behaviour", check it in.
> 
> if you insist on changing the return types, post it to SF.

I'd like to change the return types. If that is not acceptable, I'd
like to produce a DeprecationWarning if Tkinter is imported and the
new-style behaviour (return objects) is not enabled.

Regards,
Martin


From DavidA@ActiveState.com  Fri Feb 15 23:17:17 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Fri, 15 Feb 2002 15:17:17 -0800
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
 <04dc01c1b636$d7324650$0900a8c0@spiff> <m3r8nm1p1k.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C6D96FD.20A4AA85@activestate.com>

While we're on the topic of Tkinter, I got an email from Jeff Hobbs (Tcl
guy at AS) re: Tkinter.  He suspects that in:

                Py_BEGIN_ALLOW_THREADS
                PyThread_acquire_lock(tcl_lock, 1);
                tcl_tstate = tstate;
                result = Tcl_DoOneEvent(TCL_DONT_WAIT);
                tcl_tstate = NULL;
                PyThread_release_lock(tcl_lock);
                if (result == 0)
                        Sleep(20);
                Py_END_ALLOW_THREADS


The Sleep() call is a perf problem.  If anyone wants to discuss it with
Jeff, I've cc'ed him here.

building bridges,

--da


From goodger@users.sourceforge.net  Fri Feb 15 23:38:04 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 15 Feb 2002 18:38:04 -0500
Subject: [Python-Dev] spice for PEP 282
Message-ID: <B893060A.1EAB3%goodger@users.sourceforge.net>

Here's some spice for the logger recipe.  Please season to taste.

Abstract
========

The dps.utils.Reporter class (http://docstring.sf.net/dps/utils.py)
implements a logger but with *multiple* thresholds per category
(stream/channel).  Similarly to log4j, there's a "warninglevel"
threshold, which determines if a message gets sent to the warning
stream (sys.stderr).  There's also an "errorlevel" threshold which
determines if a message is converted into a *raised exception*,
potentially halting processing.  And a "debug" flag which turns debug
messages on or off independently of the "warninglevel" threshold.  I
suggest that the Python stdlib logging module adopt some of these
features.

Background
==========

I've been working on the DPS & reStructuredText projects (soon to be
merged and officially renamed to "Docutils") on and off for some
months now.  Docutils will parse texts (files or docstrings) into
DOM-like document trees, then convert them to HTML etc.  Early on I
saw the need to insert "system_message" feedback elements of different
levels into the doctree and implemented dps.utils.Reporter.  I
included thresholds for logging to sys.stderr and raising exceptions,
initially with only one setting (like log4j with only the root
"category" set).  This Reporter class has been very successful.

As a pointed reminder of how wheels are continually reinvented, I
learned about log4j (just before the python-dev effort got underway; I
probably read the same message that got Guido started).  I already had
4 message levels (what log4j called "logging priorities"), and log4j's
notion of "logging categories" seemed a powerful one, so I retrofitted
the dps.utils.Reporter class with support for categories.  I also
added a "debug" category, which I had been handling separately.

>From the revised PEP 258 (not yet checked in to the Python CVS):

    When the parser encounters an error in markup, it inserts a system
    message (DTD element 'system_message').  There are five levels of
    system messages:

    - Level-0, "DEBUG": an internal reporting issue.  There is no
      effect on the processing.  Level-0 system messages are
      handled separately from the others.

    - Level-1, "INFO": a minor issue that can be ignored.  There is no
      effect on the processing.  Typically level-1 system messages are
      not reported.

    - Level-2, "WARNING": an issue that should be addressed.  If
      ignored, there may be unpredictable problems with the output.

    - Level-3, "ERROR": an error that should be addressed.  If
      ignored, the output will contain errors.

    - Level-4, "SEVERE": a severe error that must be addressed.
      Typically level-4 system messages are turned into exceptions
      which halt processing.  If ignored, the output will contain
      severe errors.

    Although the initial message levels were devised independently,
    they have a strong correspondence to VMS error condition severity
    levels [9]; the names in quotes for levels 1 through 4 were
    borrowed from VMS.  Error handling has since been influenced by
    the log4j project [10].

    ...

    [9] http://www.openvms.compaq.com:8000/73final/5841/
        5841pro_027.html#error_cond_severity

    [10] http://jakarta.apache.org/log4j/

Here's the docstring of dps.utils.Reporter:

    Info/warning/error reporter and ``system_message`` element
    generator.

    Five levels of system messages are defined, along with
    corresponding methods: `debug()`, `info()`, `warning()`,
    `error()`, and `severe()`.

    There is typically one Reporter object per process.  A Reporter
    object is instantiated with thresholds for generating warnings and
    errors (raising exceptions), a switch to turn debug output on or
    off, and an I/O stream for warnings.  These are stored in the
    default reporting category, '' (zero-length string).

    Multiple reporting categories may be set, each with its own
    warning and error thresholds, debugging switch, and warning
    stream.  Categories are hierarchically-named strings that look
    like attribute references: 'spam', 'spam.eggs', 'neeeow.wum.ping'.
    The 'spam' category is the ancestor of 'spam.bacon.eggs'.  Unset
    categories inherit stored values from their closest ancestor
    category that has been set.

    When a system message is generated, the stored values from its
    category (or ancestor if unset) are retrieved.  The system message
    level is compared to the thresholds stored in the category, and a
    warning or error is generated as appropriate.  Debug messages are
    produced iff the stored debug switch is on.  Message output is
    sent to the stored warning stream.
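
The ancestor lookup described in the docstring can be sketched in a few
lines.  This is only an illustration of the rule "unset categories inherit
from their closest set ancestor"; the function name and the shape of the
stored values are assumptions, not the actual dps.utils.Reporter code.

```python
def closest_set_category(category, settings):
    """Return the nearest category (itself or an ancestor) found in settings."""
    while category and category not in settings:
        # Drop the last dotted component: 'spam.bacon.eggs' -> 'spam.bacon'
        category = category.rpartition(".")[0]
    return category  # '' is the default category


# Hypothetical stored values: (warning threshold, error threshold, debug switch)
settings = {"": (2, 4, False), "spam": (1, 3, True)}

print(closest_set_category("spam.bacon.eggs", settings))
print(closest_set_category("neeeow.wum.ping", settings))
```

Here 'spam.bacon.eggs' resolves to 'spam', while 'neeeow.wum.ping' has no
set ancestor and falls back to the default category ''.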

The Point
=========

I submit that the priority/level spectrum is not continuous.  There is
a break between "debug" and "info":

    DEBUG -/- INFO - WARNING - ERROR - FATAL/SEVERE

In the Docutils application, and (I think) in general, debug logging
is better treated separately from info/warning/error logging.  Debug
logging is often used by the developer but only rarely used by the
end-user.  However, depending on the type of application,
info/warning/error logging can be useful to the end-user.  Compilers,
parsers, and filters are such applications.

In addition, having a separate threshold for warning output (typical
logging) and error generation (raising exceptions) has been very
useful to the Docutils application, and may be useful in general for a
logging module.

When I run the test suite, I run with warnings and errors turned off
(I get feedback from system_message elements added to the doctree).
Real processing typically runs with "WARNING" and higher generating
warning output, and "SEVERE" (FATAL) raising exceptions.  A "make sure
this input has absolutely no problems whatsoever" run might have
thresholds set lower, so "INFO" is reported and "WARNING" and higher
turn into exceptions.
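
To make the two-threshold idea concrete, here is a minimal sketch.  It is
not the dps.utils.Reporter implementation; the class name MiniReporter,
the exception name, and the output format are all made up for the example.
Debug messages are governed only by the debug switch, and the warning and
error thresholds are checked independently of each other.

```python
import io
import sys

DEBUG, INFO, WARNING, ERROR, SEVERE = range(5)


class SystemMessageError(Exception):
    """Raised when a message reaches the error threshold (hypothetical name)."""


class MiniReporter:
    def __init__(self, warning_level, error_level, debug=False, stream=None):
        self.warning_level = warning_level
        self.error_level = error_level
        self.debug = debug
        self.stream = stream if stream is not None else sys.stderr

    def report(self, level, message):
        if level == DEBUG:
            # Debug output depends only on the switch, never on the thresholds.
            if self.debug:
                self.stream.write("DEBUG: %s\n" % message)
            return
        if level >= self.warning_level:
            self.stream.write("level-%d: %s\n" % (level, message))
        if level >= self.error_level:
            raise SystemMessageError(message)


# "Real processing" settings: WARNING and up are written, SEVERE raises.
buf = io.StringIO()
reporter = MiniReporter(WARNING, SEVERE, stream=buf)
reporter.report(DEBUG, "parser state")   # dropped: debug switch is off
reporter.report(INFO, "minor issue")     # dropped: below the warning threshold
reporter.report(ERROR, "bad markup")     # written to the stream, no exception
```

A strict run would be MiniReporter(INFO, WARNING), so that INFO is reported
and WARNING and higher raise.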

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net



From martin@v.loewis.de  Sat Feb 16 00:09:31 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 16 Feb 2002 01:09:31 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <3C6D96FD.20A4AA85@activestate.com>
References: <200202150936.g1F9aKw03405@mira.informatik.hu-berlin.de>
 <04dc01c1b636$d7324650$0900a8c0@spiff>
 <m3r8nm1p1k.fsf@mira.informatik.hu-berlin.de>
 <3C6D96FD.20A4AA85@activestate.com>
Message-ID: <m3d6z61e4k.fsf@mira.informatik.hu-berlin.de>

David Ascher <DavidA@ActiveState.com> writes:

> The Sleep() call is a perf problem.

It certainly is, but it is also necessary to have.

Regards,
Martin


From john_coppola_r_s@yahoo.com  Sat Feb 16 00:34:47 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Fri, 15 Feb 2002 16:34:47 -0800 (PST)
Subject: [Python-Dev] property syntax
In-Reply-To: <15469.31033.656478.828919@grendel.zope.com>
Message-ID: <20020216003447.37144.qmail@web11807.mail.yahoo.com>

Hey Fred.  It might not be a good idea to nest the
"property class" like an inner class.  It is plausible
that property objects could be reused between classes,
and this syntax implies that they wouldn't be reusable.
Another point is that they may be very large, which
would be messy.

I did a bit of brainstorming.

One purpose of the type objects is a means to coerce
one object to another.  So here is the pattern.

Just like str(MyObject) requires __str__, or
len(MyObject) requires __len__, or any of the factory
functions for that matter, the property factory
function would require that your object support all of
__get__, __set__, and __delete__.  That's it.  So
instead of property(fget, fset, fdel) you would
have property(AnyObjectSupportingAboveInterface).

How the property factory function differs from the
others is that it will only check for the existence of
these methods, and will not execute the code within
them.  It instead sets a flag on the object indicating
that it is active.  It will be necessary to check
every object on every set and every get.  Not too
bad though.  How time consuming are two if statements?
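
A rough sketch of the existence check, assuming the three methods use the
descriptor spellings from Fred's example (__get__, __set__, __delete__).
The helper name and the Celsius class are hypothetical; this is not how
the built-in property() works today.

```python
REQUIRED_METHODS = ("__get__", "__set__", "__delete__")


def supports_property_interface(obj):
    """True if obj's type defines all three methods; none of them is called."""
    return all(callable(getattr(type(obj), name, None))
               for name in REQUIRED_METHODS)


class Celsius(object):
    """Hypothetical reusable object offering the full interface."""

    def __get__(self, instance, owner):
        return 21.0

    def __set__(self, instance, value):
        raise AttributeError("read-only attribute")

    def __delete__(self, instance):
        raise AttributeError("read-only attribute")


print(supports_property_interface(Celsius()))
print(supports_property_interface(object()))
```

The check only looks the methods up on the type, so nothing inside them
runs.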

Is this making sense?

John Coppola

--- "Fred L. Drake, Jr." <fdrake@acm.org> wrote:
> 
> Fred L. Drake, Jr. writes:
> [describing a suggested property syntax]
>  > class Foo(object):
>  >     property myprop:
>  >         """A computed property on Foo objects."""
>  > 
>  >         def __get__(self):
>  >             return ...
> 
> Perhaps it was obvious to everyone else, but it just
> occurred to me
> that this lends itself to inheriting descriptor
> types:
> 
> 
> class ReadOnly(object):
>     def __get__(self):
>         raise NotImplementedError("sub-class must override this!")
> 
>     def __set__(self):
>         raise AttributeError("read-only attribute")
> 
>     def __delete__(self):
>         raise AttributeError("read-only attribute")
> 
> 
> class Foo(object):
>     property myprop(ReadOnly):
>         def __get__(self):
>             return ...
> 
> 
> 
>   -Fred
> 
> -- 
> Fred L. Drake, Jr.  <fdrake at acm.org>
> PythonLabs at Zope Corporation
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev


__________________________________________________
Do You Yahoo!?
Yahoo! Sports - Coverage of the 2002 Olympic Games
http://sports.yahoo.com


From trentm@ActiveState.com  Sat Feb 16 00:43:33 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Fri, 15 Feb 2002 16:43:33 -0800
Subject: [Python-Dev] PEP 282: A Logging System  --  comments please
Message-ID: <20020215164333.A31903@ActiveState.com>

Howdy all,

I would appreciate any comments you might have on this proposal for adding a
logging system to the Python Standard Library. This PEP is still an early
draft so please forward your comments just to me directly for now.

Thanks,
Trent

-----------------------------------------------------------
PEP: 282
Title: A Logging System
Version: $Revision: 1.1 $
Last-Modified: $Date: 2002/02/15 04:09:17 $
Author: trentm@activestate.com (Trent Mick)
Status: Draft
Type: Standards Track
Created: 4-Feb-2002
Python-Version: 2.3
Post-History:


Abstract

    This PEP describes a proposed logging package for Python's
    standard library.

    Basically the system involves the user creating one or more
    logging objects on which methods are called to log debugging
    notes/general information/warnings/errors/etc.  Different logging
    'levels' can be used to distinguish important messages from
    trivial ones.
    
    A registry of named singleton logger objects is maintained so that

        1) different logical logging streams (or 'channels') exist
           (say, one for 'zope.zodb' stuff and another for
           'mywebsite'-specific stuff)
           
        2) one does not have to pass logger object references around.

    The system is configurable at runtime.  This configuration
    mechanism allows one to tune the level and type of logging done
    while not touching the application itself.

    
Motivation

    If a single logging mechanism is enshrined in the standard
    library, 1) logging is more likely to be done 'well', and 2)
    multiple libraries will be able to be integrated into larger
    applications which can be logged reasonably coherently.


Influences

    This proposal was put together after having somewhat studied the
    following logging packages:

        o java.util.logging in JDK 1.4 (a.k.a. JSR047) [1]
        o log4j [2]
          These two systems are *very* similar.
        o the Syslog package from the Protomatter project [3]
        o MAL's mx.Log package [4]

    This proposal will basically look like java.util.logging with a
    smattering of log4j.


Simple Example

    This shows a very simple example of how the logging package can be
    used to generate simple logging output on stdout.
    
        --------- mymodule.py -------------------------------
        import logging
        log = logging.getLogger("MyModule")

        def doit():
            log.debug("doin' stuff")
            # do stuff ...
        -----------------------------------------------------

        --------- myapp.py ----------------------------------
        import mymodule, logging
        log = logging.getLogger("MyApp")

        log.info("start my app")
        try:
            mymodule.doit()
        except Exception, e:
            log.error("There was a problem doin' stuff.")
        log.info("end my app")
        -----------------------------------------------------

    > python myapp.py
    0    [myapp.py:4] INFO  MyApp - start my app
    36   [mymodule.py:5] DEBUG MyModule - doin' stuff
    51   [myapp.py:9] INFO  MyApp - end my app
    ^^   ^^^^^^^^^^^^ ^^^^  ^^^^^   ^^^^^^^^^^
    |    |            |     |       `-- message
    |    |            |     `-- logging name/channel
    |    |            `-- level
    |    `-- location
    `-- time

    NOTE: Not sure exactly what the default format will look like yet.


Control Flow

    [Note: excerpts from Java Logging Overview. [5]]

    Applications make logging calls on *Logger* objects.  Loggers are
    organized in a hierarchical namespace and child Loggers may
    inherit some logging properties from their parents in the
    namespace.
    
    Notes on namespace: Logger names fit into a "dotted name"
    namespace, with dots (periods) indicating sub-namespaces.  The
    namespace of logger objects therefore corresponds to a single tree
    data structure.

       "" is the root of the namespace
       "Zope" would be a child node of the root
       "Zope.ZODB" would be a child node of "Zope"

    These Logger objects allocate *LogRecord* objects which are passed
    to *Handler* objects for publication.  Both Loggers and Handlers
    may use logging *levels* and (optionally) *Filters* to decide if
    they are interested in a particular LogRecord.  When it is
    necessary to publish a LogRecord externally, a Handler can
    (optionally) use a *Formatter* to localize and format the message
    before publishing it to an I/O stream.

    Each Logger keeps track of a set of output Handlers.  By default
    all Loggers also send their output to their parent Logger.  But
    Loggers may also be configured to ignore Handlers higher up the
    tree.
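
A minimal sketch of the dispatch rule just described: a record goes to the
Logger's own Handlers and then up the parent chain, unless a Logger is
configured to ignore Handlers higher up the tree.  The class shapes and
the use_parent_handlers flag are assumptions made for illustration (the
flag name is modeled on JSR047), not the proposed API.

```python
class Handler:
    """Collects published records so the example can be inspected."""

    def __init__(self):
        self.records = []

    def publish(self, record):
        self.records.append(record)


class Logger:
    def __init__(self, name, parent=None, use_parent_handlers=True):
        self.name = name
        self.parent = parent
        self.handlers = []
        self.use_parent_handlers = use_parent_handlers

    def dispatch(self, record):
        logger = self
        while logger is not None:
            for handler in logger.handlers:
                handler.publish(record)
            if not logger.use_parent_handlers:
                break  # this Logger ignores Handlers higher up the tree
            logger = logger.parent


root = Logger("")
zope = Logger("Zope", parent=root)
zodb = Logger("Zope.ZODB", parent=zope, use_parent_handlers=False)

root_h, zope_h, zodb_h = Handler(), Handler(), Handler()
root.handlers.append(root_h)
zope.handlers.append(zope_h)
zodb.handlers.append(zodb_h)

zope.dispatch("from Zope")       # reaches zope_h and root_h
zodb.dispatch("from Zope.ZODB")  # reaches zodb_h only
```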

    The APIs are structured so that calls on the Logger APIs can be
    cheap when logging is disabled.  If logging is disabled for a
    given log level, then the Logger can make a cheap comparison test
    and return.  If logging is enabled for a given log level, the
    Logger is still careful to minimize costs before passing the
    LogRecord into the Handlers.  In particular, localization and
    formatting (which are relatively expensive) are deferred until the
    Handler requests them.
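
The cost argument can be illustrated with a small sketch.  The level
values, the LogRecord shape, and the get_message() name are assumptions
for the example, not part of this proposal.  The point is that a disabled
level costs one integer comparison, and 'msg % args' is evaluated only
when something asks for the formatted text.

```python
DEBUG, INFO, WARN, ERROR, FATAL = 10, 20, 30, 40, 50  # hypothetical values


class LogRecord:
    def __init__(self, level, msg, args):
        self.level, self.msg, self.args = level, msg, args

    def get_message(self):
        # Formatting is deferred until a Handler actually asks for it.
        return self.msg % self.args if self.args else self.msg


class Logger:
    def __init__(self, threshold):
        self.threshold = threshold
        self.records = []

    def log(self, level, msg, *args):
        if level < self.threshold:
            return  # cheap comparison, nothing allocated or formatted
        self.records.append(LogRecord(level, msg, args))


class Expensive:
    """Counts how often it is converted to a string."""
    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive"


log = Logger(threshold=INFO)
log.log(DEBUG, "state: %s", Expensive())  # discarded by the comparison alone
log.log(ERROR, "state: %s", Expensive())  # record kept, still unformatted
```

After both calls Expensive.__str__ has not run at all; it runs once when
get_message() is called on the stored record.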


Levels
    
    The logging levels, in increasing order of importance, are:

        DEBUG
        INFO
        WARN
        ERROR
        FATAL
        ALL

    This is consistent with log4j and Protomatter's Syslog and not
    with JSR047 which has a few more levels and some different names.

    Implementation-wise: these are just integer constants, to allow
    simple comparison of importance.  See "What Logging Levels?" below
    for a debate on what standard levels should be defined.


Loggers

    Each Logger object keeps track of a log level (or threshold) that
    it is interested in, and discards log requests below that level.

    The *LogManager* maintains a hierarchical namespace of named
    Logger objects.  Generations are denoted with dot-separated names:
    Logger "foo" is the parent of Loggers "foo.bar" and "foo.baz".

    The main logging method is:

        class Logger:
            def log(self, level, msg, *args):
                """Log 'msg % args' at logging level 'level'."""
                ...

    however convenience functions are defined for each logging level:

        def debug(self, msg, *args): ...
        def info(self, msg, *args): ...
        def warn(self, msg, *args): ...
        def error(self, msg, *args): ...
        def fatal(self, msg, *args): ...

    XXX How to define a nice convenience function for logging an exception?
        mx.Log has something like this, doesn't it?

    XXX What about a .raising() convenience function?  How about:

            def raising(self, exception, level=ERROR): ...

        It would create a log message describing an exception that is
        about to be raised.  I don't like that 'level' is not first
        when it *is* first for .log().


Handlers

    Handlers are responsible for doing something useful with a given
    LogRecord.  The following core Handlers will be implemented:

        - StreamHandler: A handler for writing to a file-like object.
        - FileHandler: A handler for writing to a single file or set
          of rotating files.

    More standard Handlers may be implemented if deemed desirable and
    feasible.  Other interesting candidates:

        - SocketHandler: A handler for writing to remote TCP ports.
        - CreosoteHandler: A handler for writing to UDP packets, for
          low-cost logging.  Jeff Bauer already had such a system [5].
        - MemoryHandler: A handler that buffers log records in memory
          (JSR047).
        - SMTPHandler: Akin to log4j's SMTPAppender.
        - SyslogHandler: Akin to log4j's SyslogAppender.
        - NTEventLogHandler: Akin to log4j's NTEventLogAppender.


Formatters

    A Formatter is responsible for converting a LogRecord to a string
    representation.  A Handler may call its Formatter before writing a
    record.  The following core Formatters will be implemented:

        - Formatter: Provide printf-like formatting, perhaps akin to
          log4j's PatternAppender.
    
    Other possible candidates for implementation:

        - XMLFormatter: Serialize a LogRecord according to a specific
          schema.  Could copy the schema from JSR047's XMLFormatter or
          log4j's XMLAppender.
        - HTMLFormatter: Provide a simple HTML output of log
          information. (See log4j's HTMLAppender.)


Filters

    A Filter can be called by a Logger or Handler to decide if a
    LogRecord should be logged.

    JSR047 and log4j have slightly different filtering interfaces. The
    former is simpler:

        class Filter:
            def isLoggable(self):
                """Return a boolean."""

    The latter is modeled after Linux's ipchains (where Filters can
    be chained with each filter either 'DENY'ing, 'ACCEPT'ing, or
    being 'NEUTRAL' on each check).  I would probably favor the former
    because it is simpler and I don't immediately see the need for the
    latter.
    
    No filter implementations are currently proposed (other than the
    do-nothing base class) because I don't have enough experience to
    know what kinds of filters would be common.  Users can always
    subclass Filter for their own purposes.  Log4j includes a few
    filters that might be interesting.
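
As an illustration of subclassing the simpler, boolean-style Filter, here
is a hedged sketch.  It assumes isLoggable() receives the record being
considered (as JSR047's version does); the Record tuple, ChannelFilter,
and passes() are hypothetical names invented for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for a LogRecord.
Record = namedtuple("Record", "channel level msg")


class Filter:
    """Do-nothing base class: every record is loggable."""

    def isLoggable(self, record):
        return True


class ChannelFilter(Filter):
    """Accept only records from one channel or its descendants."""

    def __init__(self, channel):
        self.channel = channel

    def isLoggable(self, record):
        return (record.channel == self.channel
                or record.channel.startswith(self.channel + "."))


def passes(filters, record):
    """A record is logged only if every attached filter accepts it."""
    return all(f.isLoggable(record) for f in filters)


filters = [Filter(), ChannelFilter("zope")]
print(passes(filters, Record("zope.zodb", "INFO", "commit")))
print(passes(filters, Record("mywebsite", "INFO", "hit")))
```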


Configuration

    Note: Configuration for the proposed logging system is currently
    under-specified.

    The main benefit of a logging system like this is that one can
    control how much and what logging output one gets from an
    application without changing that application's source code.

    Log4j and Syslog provide for configuration via an external XML
    file.  Log4j and JSR047 provide for configuration via Java
    properties (similar to -D #define's to a C/C++ compiler).  All
    three provide for configuration via API calls.

    Configuration includes the following:

        - What logging level a logger should be interested in.
        - What handlers should be attached to which loggers.
        - What filters should be attached to which handlers and loggers.
        - Specifying attributes specific to certain Handlers and Filters.
        - Defining the default configuration.
        - XXX Add others. 

    In general each application will have its own requirements for how
    a user may configure logging output.  One application
    (e.g. distutils) may want to control logging levels via
    '-q,--quiet,-v,--verbose' options to setup.py.  Zope may want to
    configure logging via certain environment variables
    (e.g. 'STUPID_LOG_FILE' :).  Komodo may want to configure logging
    via its preferences system.

    This PEP proposes to clearly document the API for configuring each
    of the above listed configurable elements and to define a
    reasonable default configuration.  This PEP does not propose to
    define a general XML or .ini file configuration schema and the
    backend to parse it.
    
    It might, however, be worthwhile to define an abstraction of the
    configuration API to allow the expressiveness of Syslog
    configuration.  Greg Wilson made this argument:

        In Protomatter [Syslog], you configure by saying "give me
        everything that matches these channel+level combinations",
        such as "server.error" and "database.*".  The log4j "configure
        by inheritance" model, on the other hand, is very clever, but
        hard for non-programmers to manage without a GUI that
        essentially reduces it to Protomatter's.
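
One way to picture the channel+level style of configuration is glob
matching on a "channel.level" key.  This sketch is an assumption about
what such matching could look like, not Protomatter's implementation;
is_selected() is a hypothetical helper, and fnmatchcase comes from
Python's standard fnmatch module.

```python
from fnmatch import fnmatchcase


def is_selected(channel, level, patterns):
    """True if 'channel.level' matches any of the configured patterns."""
    key = "%s.%s" % (channel, level.lower())
    return any(fnmatchcase(key, pattern) for pattern in patterns)


# "Give me everything that matches these channel+level combinations."
patterns = ["server.error", "database.*"]

print(is_selected("server", "ERROR", patterns))
print(is_selected("server", "INFO", patterns))
print(is_selected("database", "DEBUG", patterns))
```

With these patterns, server errors and everything from the database
channel are selected, while server info messages are not.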


Case Scenarios

    This section presents a few usage scenarios which will be used to
    help decide how best to specify the logging API.

    (1) A short simple script.

        This script does not have many lines.  It does not heavily use
        any third party modules (i.e. the only code doing any logging
        would be the main script).  Only one logging channel is really
        needed and thus, the channel name is unnecessary.  The user
        doesn't want to bother with logging system configuration much.

    (2) Medium sized app with C extension module.
    
        Includes a few Python modules and a main script.  Employs,
        perhaps, a few logging channels.  Includes a C extension
        module which might want to make logging calls as well.

    (3) Distutils.

        A large number of Python packages/modules.  Perhaps (but not
        necessarily) a number of logging channels are used.
        Specifically needs to facilitate controlling verbosity
        levels via simple command line options to 'setup.py'.

    (4) Large, possibly multi-language, app. E.g. Zope or (my
        experience) Komodo.

        (I don't expect this logging system to deal with any
        cross-language issues but it is something to think about.)
        Many channels are used.  Many developers involved.  People
        providing user support are possibly not the same people who
        developed the application.  Users should be able to generate
        log files (i.e. configure logging) while reproducing a bug to
        send back to developers.


Implementation

    XXX Details to follow consensus that this proposal is a good idea.


What Logging Levels?

    The following are the logging levels defined by the systems I looked at:

    - log4j: DEBUG, INFO, WARN, ERROR, FATAL
    - syslog: DEBUG, INFO, WARNING, ERROR, FATAL
    - JSR047: FINEST, FINER, FINE, CONFIG, INFO, WARNING, SEVERE
    - zLOG (used by Zope):
        TRACE=-300   -- Trace messages
        DEBUG=-200   -- Debugging messages
        BLATHER=-100 -- Somebody shut this app up.
        INFO=0       -- For things like startup and shutdown.
        PROBLEM=100  -- This isn't causing any immediate problems, but
                        deserves attention.
        WARNING=100  -- A wishy-washy alias for PROBLEM.
        ERROR=200    -- This is going to have adverse effects.
        PANIC=300    -- We're dead!
    - mx.Log:
        SYSTEM_DEBUG
        SYSTEM_INFO
        SYSTEM_UNIMPORTANT
        SYSTEM_MESSAGE
        SYSTEM_WARNING
        SYSTEM_IMPORTANT
        SYSTEM_CANCEL
        SYSTEM_ERROR
        SYSTEM_PANIC
        SYSTEM_FATAL

    The current proposal is to copy log4j.  XXX I suppose I could see
    adding zLOG's "TRACE" level, but I am not sure of the usefulness
    of others.


Static Logging Methods (as per Syslog)?

    Both zLOG and Syslog provide module-level logging functions rather
    than (or in addition to) logging methods on a created Logger object.
    XXX Is this something that is deemed worth including?

    Pros:
        - It would make the simplest case shorter:

            import logging
            logging.error("Something is wrong")

          instead of

            import logging
            log = logging.getLogger("")
            log.error("Something is wrong")

    Cons:
        - It provides more than one way to do it.
        - It encourages logging without a channel name, because this
          mechanism would likely be implemented by implicitly logging
          on the root (and nameless) logger of the hierarchy.
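
For discussion, a sketch of how a module-level function could sit on top
of the registry of named singleton loggers.  Everything here is a
stand-in written for the example (the tiny Logger class, the _loggers
dictionary); it only shows that the static form would implicitly log on
the root, nameless logger.

```python
class Logger:
    """Tiny stand-in that just remembers what was logged."""

    def __init__(self, name):
        self.name = name
        self.messages = []

    def error(self, msg, *args):
        self.messages.append(("ERROR", msg % args if args else msg))


_loggers = {}  # registry of named singleton loggers


def getLogger(name=""):
    """Return the one Logger for 'name', creating it on first use."""
    if name not in _loggers:
        _loggers[name] = Logger(name)
    return _loggers[name]


def error(msg, *args):
    """Module-level convenience: log on the root (nameless) logger."""
    getLogger("").error(msg, *args)


error("Something is wrong")
print(getLogger("").messages)
```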


References

    [1] java.util.logging
        http://java.sun.com/j2se/1.4/docs/guide/util/logging/

    [2] log4j: a Java logging package
        http://jakarta.apache.org/log4j/docs/index.html

    [3] Protomatter's Syslog
        http://protomatter.sourceforge.net/1.1.6/index.html
        http://protomatter.sourceforge.net/1.1.6/javadoc/com/protomatter/syslog/syslog-whitepaper.html

    [4] MAL mentions his mx.Log logging module:
        http://mail.python.org/pipermail/python-dev/2002-February/019767.html 

    [5] Jeff Bauer's Mr. Creosote
        http://starship.python.net/crew/jbauer/creosote/

Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End:

-- 
Trent Mick
TrentM@ActiveState.com


From loth@users.sourceforge.net  Sat Feb 16 01:05:00 2002
From: loth@users.sourceforge.net (Burton Radons)
Date: Fri, 15 Feb 2002 17:05:00 -0800
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find
 __slots__
References: <20020215175712.96785.qmail@web11807.mail.yahoo.com>
 <200202151923.g1FJNVJ5017966@email.nist.gov>
 <15469.25336.313450.27375@grendel.zope.com>
Message-ID: <3C6DB03C.8080101@users.sourceforge.net>

Fred L. Drake, Jr. wrote:

> Michael McLay writes:
>  > While my approach was patterned after the property() builtin, the
>  > Python Labs crowd didn't like the notation and rejected the
> 
> I'll note as well that at least some of us, if not all, don't like the
> property() syntax as well.  My current favorite was one of Guido's
> proposals at Python 10:
> 
> 
> class Foo(object):
>     property myprop:
>         """A computed property on Foo objects."""
> 
>         def __get__(self):
>             return ...
>         def __set__(self):
>             ...
>         def __delete__(self):
>             ...


What's wrong with:

   class Foo(object):
     class myprop_class (object):
       """A computed property on Foo objects."""

       def __get__(subclass, self, klass):
         return ...
       def __set__(subclass, self, value):
         ...
       def __delete__(subclass, self):
         ...

     myprop = myprop_class ()

It's a grand total of _two_ lines more than your example, and has more 
extension possibilities to boot.
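
For reference, a filled-in version of the nested descriptor class that
actually runs.  The storage in a private _myprop attribute and the None
default are assumptions added so the example does something observable;
the first parameter is named "descriptor" here only for readability.

```python
class Foo(object):
    class myprop_class(object):
        """A computed property on Foo objects."""

        def __get__(descriptor, self, klass):
            if self is None:
                return descriptor  # accessed on the class, not an instance
            return getattr(self, "_myprop", None)

        def __set__(descriptor, self, value):
            self._myprop = value

        def __delete__(descriptor, self):
            del self._myprop

    myprop = myprop_class()


f = Foo()
f.myprop = 3        # routed through __set__
print(f.myprop)     # routed through __get__
del f.myprop        # routed through __delete__
print(f.myprop)
```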

While we're discussing strange usages of blocks, one feature I've always 
wanted has been subblocks passed to functions.  It could be done using a 
new argument prefix "@" (for example).  If the block is not otherwise 
taken (if it is in an if statement, for example), it can be followed by 
":" and a normal block; this block is then put in the argument as a 
PyCodeObject (I think).  The argument can also be given a value 
normally.  The code object also has a few new methods for our convenience.

So to implement your example above:

   def field (@block):
     # Execute the block and return the locals dictionary rather than
     # destroy it.
     dict = block.exec_save_locals ()
     fget = dict.get ("__get__", None)
     fset = dict.get ("__set__", None)
     fdel = dict.get ("__delete__", None)
     fdoc = dict.get ("__doc__", None)

     return property (fget, fset, fdel, fdoc)

Now that we have that, we do your example:

   class Foo(object):
     myprop = field ():
       """A computed property on Foo objects."""

       def __get__(self):
         return ...
       def __set__(self, value):
         ...
       def __delete__(self):
         ...

There are other capabilities, but as I've never had a language that can 
do this I wouldn't know how many pragmatic possibilities there are.  The 
advantage over Guido's method is that his suggestion solves a single 
problem and has no use outside of it, while mine, at least on the face 
of it, could be applied in other ways.
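
The "@block" syntax does not exist, but the effect can be approximated
with what the language already has by letting a nested class body play
the role of the block.  This is a substitute technique, not the proposal
itself; field() and _myprop_body are hypothetical names, and the _value
storage is added only to make the example observable.

```python
def field(body):
    """Build a property from the functions defined in a class body."""
    namespace = vars(body)
    return property(namespace.get("__get__"),
                    namespace.get("__set__"),
                    namespace.get("__delete__"),
                    namespace.get("__doc__"))


class Foo(object):
    class _myprop_body:
        """A computed property on Foo objects."""

        def __get__(self):
            return self._value

        def __set__(self, value):
            self._value = value

        def __delete__(self):
            del self._value

    myprop = field(_myprop_body)


f = Foo()
f.myprop = 5
print(f.myprop)
```

Here "self" inside the nested functions is the Foo instance, because
property() calls them with the owning object.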



From mclay@nist.gov  Sat Feb 16 02:43:33 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 15 Feb 2002 21:43:33 -0500
Subject: [Python-Dev] PEP 282: A Logging System  --  comments please
In-Reply-To: <20020215164333.A31903@ActiveState.com>
References: <20020215164333.A31903@ActiveState.com>
Message-ID: <200202160247.g1G2lkJ5029529@email.nist.gov>

On Friday 15 February 2002 07:43 pm, Trent Mick wrote:
> Howdy all,
>
> I would appreciate any comments you might have on this proposal for adding
> a logging system to the Python Standard Library. This PEP is still an early
> draft so please forward your comments just to me directly for now.

I scanned the PEP and didn't find a reference to the logging package 
supporting logging over a network. 

> Influences
>
>     This proposal was put together after having somewhat studied the
>     following logging packages:
>
>         o java.util.logging in JDK 1.4 (a.k.a. JSR047) [1]
>         o log4j [2]
>           These two systems are *very* similar.
>         o the Syslog package from the Protomatter project [3]
>         o MAL's mx.Log package [4]
>
>     This proposal will basically look like java.util.logging with a
>     smattering of log4j.

Marshall Rose submitted RFC3195[1] to the IETF for a syslog protocol.  The 
specification is defined as a profile on top of the BEEP framework.  The 
messages are encoded in XML.  Here is an example of an "entry" 
element.

     C: <entry facility='8' severity='6'
     C:   hostname='pipeworks'
     C:   timestamp='Oct 31 23:59:59'
     C:  >&lt;.....eeeek!</entry>
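
A handler producing such elements would mostly need to quote attributes
and escape the message text.  The sketch below is an assumption about how
that could be done with Python's xml.sax.saxutils helpers; syslog_entry()
is a hypothetical function, it only covers the four attributes in the
example above, and it does not implement the BEEP transport.

```python
from xml.sax.saxutils import escape, quoteattr


def syslog_entry(facility, severity, hostname, timestamp, text):
    """Serialize one log message as an <entry> element."""
    attrs = [("facility", str(facility)),
             ("severity", str(severity)),
             ("hostname", hostname),
             ("timestamp", timestamp)]
    attr_text = " ".join("%s=%s" % (name, quoteattr(value))
                         for name, value in attrs)
    # escape() turns '<', '>' and '&' in the message into entities.
    return "<entry %s>%s</entry>" % (attr_text, escape(text))


print(syslog_entry(8, 6, "pipeworks", "Oct 31 23:59:59", "<.....eeeek!"))
```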

[1] http://www.beepcore.org/beepcore/docs/rfc3195.html.


From aahz@rahul.net  Sat Feb 16 03:22:02 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Fri, 15 Feb 2002 19:22:02 -0800 (PST)
Subject: [Python-Dev] [Python 2.2 BUG] pickle/cPickle does not find
In-Reply-To: <3C6DB03C.8080101@users.sourceforge.net> from "Burton Radons" at Feb 15, 2002 05:05:00 PM
Message-ID: <20020216032202.711AFE8C4@waltz.rahul.net>

Burton Radons wrote:
> 
> What's wrong with:
> 
>    class Foo(object):
>      class myprop_class (object):
>        """A computed property on Foo objects."""
> 
>        def __get__(subclass, self, klass):
>          return ...
>        def __set__(subclass, self, value):
>          ...
>        def __delete__(subclass, self):
>          ...

How about this:

    class Foo(object):
        class myprop(property):
            ...
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From martin@v.loewis.de  Sat Feb 16 09:44:54 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: Sat, 16 Feb 2002 10:44:54 +0100
Subject: [Python-Dev] sendall patches not in 2.2?
Message-ID: <200202160944.g1G9isc01555@mira.informatik.hu-berlin.de>

I wonder why Anthony's Grand sendall patch never found its way into
Python 2.2, see

http://aspn.activestate.com/ASPN/Mail/Message/Python-checkins/956422
http://mail.python.org/pipermail/python-bugs-list/2001-December/009299.html
https://sourceforge.net/tracker/index.php?func=detail&aid=516715&group_id=5470&atid=305470

Unless there are any objections, I'll forward this patch to 2.2 (not
sure what to do with imaplib, since that has been taken care of with
an explicit loop in Python meanwhile).
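
For reference, an explicit loop of the kind mentioned for imaplib can be
sketched as below. This is an illustration, not the imaplib code: the
name send_all is made up here, and the socketpair at the end exists only
to exercise it.

```python
import socket

def send_all(sock, data):
    """Keep calling send() until every byte has been handed to the OS."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        sent = sock.send(view[total:])   # may accept fewer bytes than offered
        if sent == 0:
            raise ConnectionError("socket connection broken")
        total += sent
    return total

# Exercise the helper over a connected pair of local sockets.
left, right = socket.socketpair()
try:
    count = send_all(left, b"hello, sendall")
    received = right.recv(1024)
finally:
    left.close()
    right.close()
```

A plain send() is allowed to return after writing only part of the
buffer, which is the case the loop (and sendall) covers.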

Regards,
Martin


From JeffH@ActiveState.com  Sat Feb 16 10:01:39 2002
From: JeffH@ActiveState.com (Jeff Hobbs)
Date: Sat, 16 Feb 2002 02:01:39 -0800
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <m3d6z61e4k.fsf@mira.informatik.hu-berlin.de>
Message-ID: <001b01c1b6d0$ee28f940$ba03a8c0@activestate.ca>

> From: martin@mira [mailto:martin@mira]On Behalf Of Martin v. Loewis
	...
> David Ascher <DavidA@ActiveState.com> writes:
> 
> > The Sleep() call is a perf problem.
> 
> It certainly is, but it is also necessary to have.

Why?  I suspect if you inverted the control behavior to run the
Tcl event loop as it's designed and trigger signals with
Tcl_AsyncMark, you would have no problem.  Alternatively, you
could do Tcl_CreateEventSource, or if threading is really
necessary, build Tcl with threads and use Tcl_ThreadQueueEvent.
It has all the APIs to approach this from several different
angles without having to toss a gratuitous Sleep in there that
does nothing more than have people scratch their head and
wonder why Tkinter appears so slow.

BTW, I know you were tying into Tk before Tk was properly
thread-safe, but those issues have been addressed (although
it is highly recommended to stick to using Tk in one thread
as things like X aren't guaranteed to be thread-safe).

Jeff


From martin@v.loewis.de  Sat Feb 16 12:29:02 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 16 Feb 2002 13:29:02 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <001b01c1b6d0$ee28f940$ba03a8c0@activestate.ca>
References: <001b01c1b6d0$ee28f940$ba03a8c0@activestate.ca>
Message-ID: <m3it8xk3u9.fsf@mira.informatik.hu-berlin.de>

"Jeff Hobbs" <JeffH@ActiveState.com> writes:

> > It certainly is, but it is also necessary to have.
> 
> Why?  I suspect if you inverted the control behavior to run the
> Tcl event loop as it's designed and trigger signals with
> Tcl_AsyncMark, you would have no problem.  Alternatively, you
> could do Tcl_CreateEventSource, or if threading is really
> necessary, build Tcl with threads and use Tcl_ThreadQueueEvent.

Let me first state what problem I think this Sleep call solves:
it allows a thread switch to occur, by blocking the thread so that the
OS knows that it should schedule a different thread. Otherwise, this
thread would hold the tcl lock essentially forever, since releasing
the tcl lock would be immediately followed by regaining it. Some
thread implementations won't allow, in this case, other threads blocked
for the tcl lock to run.
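
The pattern can be sketched in Python, with a threading.Lock standing in
for the tcl lock and time.sleep() standing in for the Sleep call. This is
an analogy under those assumptions, not the _tkinter code; the names
worker and event_loop are invented for the illustration.

```python
import threading
import time

lock = threading.Lock()            # stands in for the tcl lock
worker_ran = threading.Event()

def worker():
    """A second thread that needs the lock once."""
    with lock:
        worker_ran.set()

def event_loop(max_iterations=2000, pause=0.001):
    """Handle events under the lock, pausing after each release."""
    iterations = 0
    while not worker_ran.is_set() and iterations < max_iterations:
        with lock:
            iterations += 1        # pretend to process one event
        time.sleep(pause)          # blocked here, so another thread can run
    return iterations

thread = threading.Thread(target=worker)
thread.start()
iterations = event_loop()
thread.join(timeout=5)
```

Without the pause, the loop would release the lock and ask for it again
at once, and whether the waiting thread ever gets a turn would depend on
the thread implementation.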

In the light of this rationale, can you please explain what
Tcl_AsyncMark is and how it would avoid the problem, or what effect
calling Tcl_CreateEventSource would have, or how Tcl_ThreadQueueEvent
would help?

> It has all the APIs to approach this from several different
> angles without having to toss a gratuitous Sleep in there that
> does nothing more than have people scratch their head and
> wonder why Tkinter appears so slow.

It does more than that: it avoids people thinking that their threads
have blocked indefinitely, for no good reason.

> BTW, I know you were tying into Tk before Tk was properly
> thread-safe, but those issues have been addressed (although
> it is highly recommended to stick to using Tk in one thread
> as things like X aren't guaranteed to be thread-safe).

Let's assume thread-safety of X is not a problem (as it isn't in most
current installations). Are you then saying that Tk is thread-safe?
What is the minimum Tk version that makes this guarantee? Where is
this documented?

I'm all in favour of getting rid of the Tcl lock.

Regards,
Martin


From guido@python.org  Sat Feb 16 16:23:09 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 16 Feb 2002 11:23:09 -0500
Subject: [Python-Dev] sendall patches not in 2.2?
In-Reply-To: Your message of "Sat, 16 Feb 2002 10:44:54 +0100."
 <200202160944.g1G9isc01555@mira.informatik.hu-berlin.de>
References: <200202160944.g1G9isc01555@mira.informatik.hu-berlin.de>
Message-ID: <200202161623.g1GGN9130315@pcp742651pcs.reston01.va.comcast.net>

> I wonder why Anthony's Grand sendall patch never found its way into
> Python 2.2, see
> 
> http://aspn.activestate.com/ASPN/Mail/Message/Python-checkins/956422
> http://mail.python.org/pipermail/python-bugs-list/2001-December/009299.html
> https://sourceforge.net/tracker/index.php?func=detail&aid=516715&group_id=5470&atid=305470
> 
> Unless there are any objections, I'll forward this patch to 2.2 (not
> sure what to do with imaplib, since that has been taken care of with
> an explicit loop in Python meanwhile).

An oversight!  This should go into 2.2.1 definitely.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sat Feb 16 16:28:01 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 16 Feb 2002 11:28:01 -0500
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: Your message of "Sat, 16 Feb 2002 02:01:39 PST."
 <001b01c1b6d0$ee28f940$ba03a8c0@activestate.ca>
References: <001b01c1b6d0$ee28f940$ba03a8c0@activestate.ca>
Message-ID: <200202161628.g1GGS1x30329@pcp742651pcs.reston01.va.comcast.net>

> > > The Sleep() call is a perf problem.
> > 
> > It certainly is, but it is also necessary to have.
> 
> Why?  I suspect if you inverted the control behavior to run the
> Tcl event loop as it's designed and trigger signals with
> Tcl_AsyncMark, you would have no problem.  Alternatively, you
> could do Tcl_CreateEventSource, or if threading is really
> necessary, build Tcl with threads and use Tcl_ThreadQueueEvent.
> It has all the APIs to approach this from several different
> angles without having to toss a gratuitous Sleep in there that
> does nothing more than have people scratch their head and
> wonder why Tkinter appears so slow.

Jeff, I really hope you can help us with this.  I know it's a twisted
mess.  Years ago, I asked Ousterhout's help, but he was already too
busy to pay attention to a competing language designer. :-(

I hope that it's possible to do something better with Tcl/Tk 8.3 that
doesn't require the sleep and maintains the existing _tkinter API /
semantics.

> BTW, I know you were tying into Tk before Tk was properly
> thread-safe, but those issues have been addressed (although
> it is highly recommended to stick to using Tk in one thread
> as things like X aren't guaranteed to be thread-safe).

Are they solved in Tcl/Tk 8.3?  I'd be happy to require that version.
I'm not (yet) happy to require an alpha/beta of 9.0 or whatever the
Tcl community is now working at.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Sat Feb 16 18:29:46 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 16 Feb 2002 19:29:46 +0100
Subject: [Python-Dev] SSL support in _socket
References: <3C692370.D21EF15D@lemburg.com> <m33d06xmfz.fsf@mira.informatik.hu-berlin.de> <3C6A303D.E0DDC16A@lemburg.com> <200202131301.g1DD1ou07332@pcp742651pcs.reston01.va.comcast.net>
 <3C6A66B3.3C4AE597@lemburg.com> <200202131336.g1DDaiV07604@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C6EA51A.E8F58B0C@lemburg.com>

Guido van Rossum wrote:
> 
> > Checking the code it should be easy to do. I'll look
> > into this later this week.
> 
> Great!

Done -- wasn't that easy after all, because the ssl object
relies on the socket object.

Please review and test. The header file chaos at the top of
socketmodule.* looks scary. It works fine on Linux, but I have
no idea what the situation is on other platforms.

Side-note: I've added the "inter-module dynamic C API linking 
via Python trick" from the mx tools to the _socket module. _ssl
only uses it to get at the type object, but the support can easily
be extended if this should be needed for more C APIs from
_socket.

Also note: the non-Unix build process files need to be updated.
 
> > Funny, BTW, that the source file is named socketmodule.c
> > while the resulting DLL is called _socket... I suppose
> > renaming socketmodule.c to _socket.c would be advisable.
> 
> That requires asking the SF sysadmin a favor to move a file, or loses
> all the CVS history.  So who cares.

I have left out this step. Perhaps Barry knows a way to
rename the socketmodule.* files without losing the 
history ?!

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From tim.one@comcast.net  Sun Feb 17 03:41:53 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 16 Feb 2002 22:41:53 -0500
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: <3C6EA51A.E8F58B0C@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEPNNMAA.tim.one@comcast.net>

[MAL]
> Side-note: I've added the "inter-module dynamic C API linking
> via Python trick" from the mx tools to the _socket module. _ssl
> only uses it to get at the type object, but the support can easily
> be extended if this should be needed for more C APIs from
> _socket.
>
> Also note: the non-Unix build process files need to be updated.

I don't know what "inter-module dynamic C API linking via Python trick"
means, but the Windows build doesn't compile anymore despite that it didn't
and doesn't support SSL.  I suspect it's because "inter-module" wrt sockets
is really "cross-DLL" on Windows, and clever tricks are going to bite hard
because of that.  It's griping here:

static PyTypeObject PySocketSock_Type = {
C:\Code\python\Modules\socketmodule.c(1768) : error C2491:
    'PySocketSock_Type' : definition of dllimport data not allowed

and here:

    &PySocketSock_Type,
C:\Code\python\Modules\socketmodule.c(2650) : error C2099:
    initializer is not a constant

The changes to socketmodule.h pretty much baffle me.  Why is the body of the
function PySocketModule_ImportModuleAndAPI included in the header file?  Why
is the body of this function skipped unless PySocket_BUILDING_SOCKET is
defined?  All in all, this appears to be an extremely confusing way to
define a function named PySocketModule_ImportModuleAndAPI in the new _ssl.c
alone.  So why isn't the function just defined in _ssl.c directly?  There
appears no reason to put it in the header file, and it's confusing there.

This shows signs of adapting a complicated framework to a situation too
simple to require most of what the framework does.  If so, since there is no
other use of this framework in Python, and the framework isn't documented in
the Python codebase, the framework should be tossed, and something as simple
as possible done instead.

I can't make more time to sort this out now.  It would help if the code were
made more transparent (see last paragraph), so it consumed less time to
figure out what it's intending to do.  In the meantime, the Windows build
will remain broken.



From andymac@pcug.org.au  Sun Feb 17 05:34:16 2002
From: andymac@pcug.org.au (Andrew MacIntyre)
Date: Sun, 17 Feb 2002 16:34:16 +1100 (EST)
Subject: [Python-Dev] OS/2 EMX port build directory committed
Message-ID: <Pine.GSO.4.21.0202171629120.28778-100000@supreme.pcug.org.au>

I have committed PC/os2emx and its contents.  If disaster results, please
cc this e-mail address, as I haven't been able to get into my main e-mail
account (ISP equipment problems).

Andrew I MacIntyre                    "These thoughts are mine alone ..."
Email: andymac@bullseye.apana.org.au  (preferred) | Snail: PO Box 370
       andymac@pcug.org.au            (alternate) |        Belconnen ACT 2616
       andrew.macintyre@aba.gov.au    (work)      |        Australia



From andymac@pcug.org.au  Sun Feb 17 05:43:08 2002
From: andymac@pcug.org.au (Andrew MacIntyre)
Date: Sun, 17 Feb 2002 16:43:08 +1100 (EST)
Subject: [Python-Dev] %#x/%#X format conversion patches for review
Message-ID: <Pine.GSO.4.21.0202171634360.28778-100000@supreme.pcug.org.au>

Following on from discussion of the patch I uploaded containing OS/2 EMX
changes to the Python core, I have uploaded patches to
Objects/stringobject.c and Objects/unicodeobject.c to simplify the mess of
dealing with these format conversions in the face of Python's preferences
vs the C standard, C standard violations, and bugs.

http://sf.net/tracker/?func=detail&aid=450266&group_id=5470&atid=305470
(this is my "Python core" OS/2 EMX patch in the Patch manager)

I have assigned the patch to Martin von Loewis; if they should be moved to
a separate patch for future reference, please let me know.
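
As an illustration of where Python's preferences and the C standard part
ways: for the value 0, C's printf leaves the prefix off under %#x, while
Python keeps it. The sketch below shows only the Python-level results;
which platform quirks the patch itself works around is not spelled out
in this message, and alt_hex is just a name chosen for the example.

```python
def alt_hex(value, upper=False):
    """Format value with the '#' (alternate form) flag."""
    return ("%#X" if upper else "%#x") % value

lower = alt_hex(255)               # prefix and digits in lower case
upper = alt_hex(255, upper=True)   # prefix and digits in upper case
zero = alt_hex(0)                  # Python keeps the prefix for zero
print(lower, upper, zero)
```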

Andrew I MacIntyre                    "These thoughts are mine alone ..."
Email: andymac@bullseye.apana.org.au  (preferred) | Snail: PO Box 370
       andymac@pcug.org.au            (alternate) |        Belconnen ACT 2616
       andrew.macintyre@aba.gov.au    (work)      |        Australia



From andymac@pcug.org.au  Sun Feb 17 05:47:09 2002
From: andymac@pcug.org.au (Andrew MacIntyre)
Date: Sun, 17 Feb 2002 16:47:09 +1100 (EST)
Subject: [Python-Dev] Re: %#x/%#X format conversion patches for review
In-Reply-To: <Pine.GSO.4.21.0202171634360.28778-100000@supreme.pcug.org.au>
Message-ID: <Pine.GSO.4.21.0202171645440.529-100000@supreme.pcug.org.au>

Sorry, the correct patch URL is
http://sf.net/tracker/?func=detail&aid=450267&group_id=5470&atid=305470

Andrew I MacIntyre                    "These thoughts are mine alone ..."
Email: andymac@bullseye.apana.org.au  (preferred) | Snail: PO Box 370
       andymac@pcug.org.au            (alternate) |        Belconnen ACT 2616
       andrew.macintyre@aba.gov.au    (work)      |        Australia

On Sun, 17 Feb 2002, Andrew MacIntyre wrote:

> Following on from discussion of the patch I uploaded containing OS/2 EMX
> changes to the Python core, I have uploaded patches to
> Objects/stringobject.c and Objects/unicodeobject.c to simplify the mess of
> dealing with these format conversions in the face of Python's preferences
> vs the C standard, C standard violations, and bugs.



From tim.one@comcast.net  Sun Feb 17 06:06:57 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 17 Feb 2002 01:06:57 -0500
Subject: [Python-Dev] %#x/%#X format conversion patches for review
In-Reply-To: <Pine.GSO.4.21.0202171634360.28778-100000@supreme.pcug.org.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEADNNAA.tim.one@comcast.net>

[Andrew MacIntyre]
> Following on from discussion of the patch I uploaded containing OS/2 EMX
> changes to the Python core, I have uploaded patches to
> Objects/stringobject.c and Objects/unicodeobject.c to simplify the mess
> of dealing with these format conversions in the face of Python's
> preferences vs the C standard, C standard violations, and bugs.
>
> http://sf.net/tracker/?func=detail&aid=450266&group_id=5470&atid=305470
> (this is my "Python core" OS/2 EMX patch in the Patch manager)

There are 4 patch files attached to that report, and while I may have missed
them, I didn't see any changes to stringobject or unicodeobject in any of
them.  Which patch contains these changes?  Or are these changes in some
other patch submission?



From tim.one@comcast.net  Sun Feb 17 06:11:06 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 17 Feb 2002 01:11:06 -0500
Subject: [Python-Dev] OS/2 EMX port build directory committed
In-Reply-To: <Pine.GSO.4.21.0202171629120.28778-100000@supreme.pcug.org.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEADNNAA.tim.one@comcast.net>

[Andrew MacIntyre]
> I have committed PC/os2emx and its contents.  If disaster results,

Didn't hurt the Windows build, so there's no disaster from my POV.  Thanks!

> please cc this e-mail address,

Which e-mail address?  You listed three addresses below, and none of them
are labelled "this" <wink>.

> as I haven't been able to get into my main e-mail account (ISP equipment
> problems).
>
> Andrew I MacIntyre                    "These thoughts are mine alone ..."
> Email: andymac@bullseye.apana.org.au  (preferred) | Snail: PO Box 370
>        andymac@pcug.org.au            (alternate) |        Belconnen ACT 2616
>        andrew.macintyre@aba.gov.au    (work)      |        Australia



From tim.one@comcast.net  Sun Feb 17 06:19:12 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 17 Feb 2002 01:19:12 -0500
Subject: [Python-Dev] Re: %#x/%#X format conversion patches for review
In-Reply-To: <Pine.GSO.4.21.0202171645440.529-100000@supreme.pcug.org.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAENNAA.tim.one@comcast.net>

[Andrew MacIntyre]
> Sorry, the correct patch URL is
> http://sf.net/tracker/?func=detail&aid=450267&group_id=5470&atid=305470

Thanks.  The changes to {string,unicode}object.c are decidedly non-scary.
Check 'em in if you like.  I think we're now officially at the point where
our code would be smaller and simpler if we implemented sprintf entirely by
ourselves instead of fighting platform sprintf quirks <0.5 wink>.



From andymac@pcug.org.au  Sun Feb 17 07:32:22 2002
From: andymac@pcug.org.au (Andrew MacIntyre)
Date: Sun, 17 Feb 2002 18:32:22 +1100 (EST)
Subject: [Python-Dev] OS/2 EMX port build directory committed
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEADNNAA.tim.one@comcast.net>
Message-ID: <Pine.GSO.4.21.0202171814510.9745-100000@supreme.pcug.org.au>

On Sun, 17 Feb 2002, Tim Peters wrote:

> [Andrew MacIntyre]
> > I have committed PC/os2emx and its contents.  If disaster results,
> 
> Didn't hurt the Windows build, so there's no disaster from my POV.  Thanks!

I like good news!

> > please cc this e-mail address,
> 
> Which e-mail address?  You listed three addresses below, and none of them
> are labelled "this" <wink>.

"this address" was supposed to imply the "from address" of the message,
sorry.

I've now closed patch #450265.

Andrew I MacIntyre                    "These thoughts are mine alone ..."
Email: andymac@bullseye.apana.org.au  (preferred) | Snail: PO Box 370
       andymac@pcug.org.au            (alternate) |        Belconnen ACT 2616
       andrew.macintyre@aba.gov.au    (work)      |        Australia



From mal@lemburg.com  Sun Feb 17 12:38:07 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 17 Feb 2002 13:38:07 +0100
Subject: [Python-Dev] SSL support in _socket
References: <LNBBLJKPBEHFEDALKOLCGEPNNMAA.tim.one@comcast.net>
Message-ID: <3C6FA42F.928A0062@lemburg.com>

Tim Peters wrote:
> 
> [MAL]
> > Side-note: I've added the "inter-module dynamic C API linking
> > via Python trick" from the mx tools to the _socket module. _ssl
> > only uses it to get at the type object, but the support can easily
> > be extended if this should be needed for more C APIs from
> > _socket.
> >
> > Also note: the non-Unix build process files need to be updated.
> 
> I don't know what "inter-module dynamic C API linking via Python trick"
> means, but the Windows build doesn't compile anymore despite that it didn't
> and doesn't support SSL.  I suspect it's because "inter-module" wrt sockets
> is really "cross-DLL" on Windows, and clever tricks are going to bite hard
> because of that. 

No it's not (and that's the main advantage of the "trick").

Some explanation:

The _ssl module needs access to the type object defined in 
the _socket module. Since cross-DLL linking introduces a lot of
problems on many platforms, the "trick" is to wrap the
C API of a module in a struct which then gets exported to
other modules via a PyCObject.

The code in socketmodule.c defines this struct (which currently
only contains the type object reference, but could very
well also include other C APIs needed by other modules)
and exports it as PyCObject via the module dictionary
under the name "CAPI".

Other modules can now include the socketmodule.h file
which defines the needed C APIs to import and set up
a static copy of this struct in the importing module.

After initialization, the importing module can then
access the C APIs from the _socket module by simply
referring to the static struct, e.g.

        /* Load _socket module and its C API; this sets up the global
           PySocketModule */
        if (PySocketModule_ImportModuleAndAPI())
                return;

        ...
        if (!PyArg_ParseTuple(args, "O!|zz:ssl",

                              PySocketModule.Sock_Type,

                              (PyObject*)&Sock,
                              &key_file, &cert_file))
                return NULL;

(Perhaps I should copy the above explanation into the source
 files ?!)
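
The same idea can be shown at the Python level, purely as an analogy:
the provider publishes a struct-like bundle under the name "CAPI", and
the consumer looks it up by name at run time and keeps its own reference.
Everything named here (fake_socket, SocketCAPI, FakeSocketType,
import_module_and_api) is hypothetical; the real mechanism is C code
that passes a pointer through a PyCObject.

```python
import importlib
import sys
import types
from collections import namedtuple

# Provider side: bundle the exported objects in one struct-like value
# and publish it in the module namespace under the agreed name "CAPI".
SocketCAPI = namedtuple("SocketCAPI", ["Sock_Type"])

class FakeSocketType:
    """Stand-in for the socket type object."""

provider = types.ModuleType("fake_socket")      # hypothetical module
provider.CAPI = SocketCAPI(Sock_Type=FakeSocketType)
sys.modules["fake_socket"] = provider

# Consumer side: fetch the bundle by name at run time and keep a
# module-level reference, instead of linking against the provider.
_api = None

def import_module_and_api(name="fake_socket"):
    """Return 0 on success and -1 on failure, like the C convention."""
    global _api
    try:
        _api = getattr(importlib.import_module(name), "CAPI")
    except (ImportError, AttributeError):
        return -1
    return 0

status = import_module_and_api()
sock = _api.Sock_Type()
```

The consumer never names the provider's symbols directly; the only
things the two sides share are the module name and the name "CAPI".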

> It's griping here:
> 
> static PyTypeObject PySocketSock_Type = {
> C:\Code\python\Modules\socketmodule.c(1768) : error C2491:
>     'PySocketSock_Type' : definition of dllimport data not allowed
> 
> and here:
> 
>     &PySocketSock_Type,
> C:\Code\python\Modules\socketmodule.c(2650) : error C2099:
>     initializer is not a constant

Ah, you're right, the export of the type object is not
needed anymore since this is now done using the PyCObject.
Sorry, my bad.
 
> The changes to socketmodule.h pretty much baffle me.  Why is the body of the
> function PySocketModule_ImportModuleAndAPI included in the header file?  Why
> is the body of this function skipped unless PySocket_BUILDING_SOCKET is
> defined?  All in all, this appears to be an extremely confusing way to
> define a function named PySocketModule_ImportModuleAndAPI in the new _ssl.c
> alone.  So why isn't the function just defined in _ssl.c directly?  There
> appears no reason to put it in the header file, and it's confusing there.

The reason for putting the code in the header file is
to avoid duplication of code. The import API is needed by
all modules wishing to use the C API of the socket module.

Currently, only _ssl needs this, but I think it would be
a good strategy to extend this technique to other modules
as well (esp. the array module would be a good candidate).
 
> This shows signs of adapting a complicated framework to a situation too
> simple to require most of what the framework does.  If so, since there is no
> other use of this framework in Python, and the framework isn't documented in
> the Python codebase, the framework should be tossed, and something as simple
> as possible done instead.

I don't think it's overly complicated. It's been in use in
mxDateTime and various database modules including mxODBC 
for many years and I haven't received any complaints about
it in the last few years. It would be nice if we could
integrate better support for it into the Python core.
Then we wouldn't need the header file source code
definition anymore.

IMHO, it's a very useful way of doing cross-DLL "linking"
in a platform independent manner. Note that the whole idea
originated from a discussion I had with Jim Fulton some
years ago. As I understand, the PyCObject was invented for
just this purpose.
 
> I can't make more time to sort this out now.  It would help if the code were
> made more transparent (see last paragraph), so it consumed less time to
> figure out what it's intending to do.  In the meantime, the Windows build
> will remain broken.

As I read the checkins, you've removed the type object 
export. 

I am curious why the test_socket still fails on Windows
though. Both test_socket and test_socket_ssl work just fine on
Linux.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From ping@lfw.org  Mon Feb 18 04:27:02 2002
From: ping@lfw.org (Ka-Ping Yee)
Date: Sun, 17 Feb 2002 22:27:02 -0600 (CST)
Subject: [Python-Dev] Global name lookup schemes
Message-ID: <Pine.LNX.4.33.0202172213330.15644-100000@server1.lfw.org>

Okay, i spent another afternoon drawing silly pictures full
of boxes and arrows.  I swear, i'm going to be seeing pointers
in my dreams tonight.

Here are figures representing my current understanding of the
various schemes on the table:

    Jeremy 1: the dlict scheme
        http://lfw.org/python/jeremy1.gif
        http://lfw.org/python/jeremy1.tif
        http://lfw.org/python/jeremy1.ai

        Jeremy, i think i'm still somewhat unclear -- notice the
        two question marks in the figure.  What kind of animal
        is the cache?  I assumed that the invalidation info lives
        in an array parallel to the dlict's array.  Is this right?

    Guido 1: the original cellptr/objptr scheme
        http://lfw.org/python/guido1.gif
        http://lfw.org/python/guido1.tif
        http://lfw.org/python/guido1.ai

    Ping 1: guido1 + a tweak to always use two dereferencing steps
        http://lfw.org/python/ping1.gif
        http://lfw.org/python/ping1.tif
        http://lfw.org/python/ping1.ai

    Tim 1: timdicts and cells with shadow flags
        http://lfw.org/python/tim1.gif
        http://lfw.org/python/tim1.tif
        http://lfw.org/python/tim1.ai

GIFs are small versions, TIFs are big versions, AIs are
Adobe Illustrator source files.

Please examine, send me corrections, discuss, enjoy... :)


Still to do:

    Guido 2: the globals_vector scheme

    Skip 1: the global-tracking scheme
        (I don't actually know yet what in this diagram
        would be different from the way things work today.
        Statically, Skip's picture is mostly the same;
        it's the runtime behaviour that's different.
        Still, it's probably good to have a reference
        picture of today's data structures anyway.)


-- ?!ng



From tim.one@comcast.net  Mon Feb 18 05:19:26 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 18 Feb 2002 00:19:26 -0500
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: <3C6FA42F.928A0062@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEDJNNAA.tim.one@comcast.net>

[M.-A. Lemburg]
> Some explanation:
>
> The _ssl module needs access to the type object defined in
> the _socket module. Since cross-DLL linking introduces a lot of
> problems on many platforms, the "trick" is to wrap the
> C API of a module in a struct which then gets exported to
> other modules via a PyCObject.
>
> The code in socketmodule.c defines this struct (which currently
> only contains the type object reference, but could very
> well also include other C APIs needed by other modules)
> and exports it as PyCObject via the module dictionary
> under the name "CAPI".
>
> Other modules can now include the socketmodule.h file
> which defines the needed C APIs to import and set up
> a static copy of this struct in the importing module.
>
> After initialization, the importing module can then
> access the C APIs from the _socket module by simply
> referring to the static struct, e.g.
>
>         /* Load _socket module and its C API; this sets up the global
>            PySocketModule */
>         if (PySocketModule_ImportModuleAndAPI())
>                 return;
>
>         ...
>         if (!PyArg_ParseTuple(args, "O!|zz:ssl",
>
>                               PySocketModule.Sock_Type,
>
>                               (PyObject*)&Sock,
>                               &key_file, &cert_file))
>                 return NULL;
>
> (Perhaps I should copy the above explanation into the source
>  files ?!)

I don't know.  I really don't have time to try and understand this, but I
can tell you I spent a lot of time staring at the code just trying to fix
the part that didn't work, and it was slow and painful going.  Without deep
understanding, I can only repeat that all this machinery *seems* to be
overkill in this specific case; and since there is no other case in the
Python core, a mass of overly general machinery in the Python core seems out
of place.

> ...
> Ah, you're right, the export of the type object is not
> needed anymore since this is now done using the PyCObject.
> Sorry, my bad.

No problem -- that part turned out to be easy, once I found it.

> ...
> The reason for putting the code in the header file is
> to avoid duplication of code. The import API is needed by
> all modules wishing to use the C API of the socket module.

But in this specific case you confirm that there is only one client:

> Currently, only _ssl needs this, but I think it would be
> a good strategy to extend this technique to other modules
> as well (esp. the array module would be a good candidate).

Possibly, but it's overly elaborate in this specific case.  If it needed to
be hypergeneral (and it doesn't here), it seems it would be better to make
the code template *more* general, so that every importer of every module
could include a common (e.g.) PyImportModuleAndApi.h header file one or more
times, after setting a pile of #defines to specialize it to the module at
hand.

> I don't think it's overly complicated.

You've confirmed that it is in this specific case, and that's the only case
there is in the codebase all the Python developers work with.

...
> As I read the checkins, you've removed the type object export.

Well, I removed the DL_IMPORT.  The problem was more that it wasn't
exported, and now it doesn't need to be imported or exported.

> I am curious why the test_socket still fails on Windows
> though. Both test_socket and test_socket_ssl work just fine on
> Linux.

test_socket was a red herring.  Merely trying to import socket died with
NameError on Windows.  That got fixed too, and the non-SSL socket tests on
Windows worked fine then.



From john_coppola_r_s@yahoo.com  Mon Feb 18 08:41:06 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Mon, 18 Feb 2002 00:41:06 -0800 (PST)
Subject: [Python-Dev] new property factory arguments
Message-ID: <20020218084106.26444.qmail@web11807.mail.yahoo.com>

--0-1827853335-1014021666=:25719
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hello python developers.  After discussions with Fred
about defining how property objects are created, I
decided to give it a whirl myself.  After about a minute
of piddling, I came up with something I thought would
be a hack but is quite interesting.

Ultimately, this is what I came up with.

class Foo(object):
    def __get__(self, container):
        print "this is get"
        print "self:", type(self)
        print "container:", type(container)
        print

    def __set__(self, container, value):
        print "this is set"
        print "self:", type(self)
        print "container:", type(container)
        print "value:", value
        print

    def __del__(self, container):
        print "this is del"
        print "self:", type(self)
        print "container:", type(container)
        print

class Spam(object):
    x=property(Foo())

I feel this has the benefit of encapsulating the x
property from the Spam object.  If coupling is needed,
then access to Spam can be obtained via container.

This was the interesting hack I was talking about.  I
first tried it without container and got an error, but
then I decided to see what that second argument was.
Hmm, interesting, and I liked it.

What's really cool is that Foo can be used by a
completely separate class.  Perhaps Foo is a singleton
for a DB connection. A single connection could be
created in __new__, and other attribute details
created in __init__.  So class Spam is decoupled from
what is going on in Foo.  With the former syntax,
this was not possible.


I've attached a descrobject.diff file to this email as
well as testprop.py. (I've never tried sending an
attachment to python-dev, I hope it works.)

Enjoy,
John Coppola



__________________________________________________
Do You Yahoo!?
Yahoo! Sports - Coverage of the 2002 Olympic Games
http://sports.yahoo.com
--0-1827853335-1014021666=:25719
Content-Type: text/plain; name="descrobject.diff"
Content-Description: descrobject.diff
Content-Disposition: inline; filename="descrobject.diff"

*** PythonOrig/Python-2.2/Objects/descrobject.c	Sat Dec 15 00:00:30 2001
--- PythonDev/Python-2.2/Objects/descrobject.c	Sun Feb 17 21:52:55 2002
***************
*** 1003,1024 ****
  }
  
  static int
! property_init(PyObject *self, PyObject *args, PyObject *kwds)
  {
! 	PyObject *get = NULL, *set = NULL, *del = NULL, *doc = NULL;
! 	static char *kwlist[] = {"fget", "fset", "fdel", "doc", 0};
! 	propertyobject *gs = (propertyobject *)self;
! 
! 	if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOOO:property",
! 	     				 kwlist, &get, &set, &del, &doc))
! 		return -1;
! 
! 	if (get == Py_None)
! 		get = NULL;
! 	if (set == Py_None)
! 		set = NULL;
! 	if (del == Py_None)
! 		del = NULL;
  
  	Py_XINCREF(get);
  	Py_XINCREF(set);
--- 1003,1023 ----
  }
  
  static int
! property_init(PyObject *self, PyObject *args, PyObject *kw)
  {
! 	PyObject *get=NULL, *set=NULL, *del=NULL, *doc=NULL, *arg=NULL;
!         static char *kwlist[] = {"object", 0};
!         propertyobject *gs = (propertyobject *)self;
!         if (!PyArg_ParseTupleAndKeywords(args,kw,"|O:property",kwlist,&arg))
!              return -1;
!     
!         get = PyObject_GetAttrString(arg,"__get__");
!         set = PyObject_GetAttrString(arg,"__set__");
!         del = PyObject_GetAttrString(arg,"__del__");
!         doc = PyObject_GetAttrString(arg,"__doc__");
! 	if (get == Py_None) get = NULL;
! 	if (set == Py_None) set = NULL;
! 	if (del == Py_None) del = NULL;
  
  	Py_XINCREF(get);
  	Py_XINCREF(set);
***************
*** 1034,1049 ****
  }
  
  static char property_doc[] =
! "property(fget=None, fset=None, fdel=None, doc=None) -> property attribute\n"
  "\n"
! "fget is a function to be used for getting an attribute value, and likewise\n"
! "fset is a function for setting, and fdel a function for del'ing, an\n"
! "attribute.  Typical use is to define a managed attribute x:\n"
  "class C(object):\n"
! "    def getx(self): return self.__x\n"
! "    def setx(self, value): self.__x = value\n"
! "    def delx(self): del self.__x\n"
! "    x = property(getx, setx, delx, \"I'm the 'x' property.\")";
  
  static int
  property_traverse(PyObject *self, visitproc visit, void *arg)
--- 1033,1050 ----
  }
  
  static char property_doc[] =
! "property(object) -> property attribute\n"
  "\n"
! "__get__ is a function to be used for getting an attribute value, and\n"
! "likewise __set__ is a function for setting, and __del__ a function for\n"
! "del'ing, an attribute.  Typical use is to define a managed attribute x:\n"
  "class C(object):\n"
! "    def __get__(self, container): return self.__x\n"
! "    def __set__(self, container, value): self.__x = value\n"
! "    def __del__(self, container): del self.__x\n"
! "\n"
! "class D(object):\n"
! "    x = property(object=C(), \"I'm the 'x' property.\")";
  
  static int
  property_traverse(PyObject *self, visitproc visit, void *arg)

--0-1827853335-1014021666=:25719
Content-Type: text/plain; name="TESTPROP.PY"
Content-Description: TESTPROP.PY
Content-Disposition: inline; filename="TESTPROP.PY"

__doc__="""\
class Foo(object):
    def __get__(self, container):
	print "this is get"
	print "self:", type(self)
	print "container:", type(container)
	print

    def __set__(self, container, value):
	print "this is set"
	print "self:", type(self)
	print "container:", type(container)
	print "value:", value
	print
	
    def __del__(self, container):
	print "this is del"
	print "self:", type(self)
	print "container:", type(container)
	print
	
    __doc__ = "this is doc"
"""

exec __doc__
class Spam(object):
    x=property(Foo())

print __doc__
a=Spam()
a.x=5
print "getting:", a.x
del a.x

--0-1827853335-1014021666=:25719--


From jason@jorendorff.com  Mon Feb 18 09:24:47 2002
From: jason@jorendorff.com (Jason Orendorff)
Date: Mon, 18 Feb 2002 03:24:47 -0600
Subject: [Python-Dev] new property factory arguments
In-Reply-To: <20020218084106.26444.qmail@web11807.mail.yahoo.com>
Message-ID: <HFEKILOLEFEFMKAECNDLMEFPDDAA.jason@jorendorff.com>

With minor changes, this works already.

class Foo(property):
    def __get__(self, container, _type=None):
	print "this is get"
	print "self:", self
	print "container:", container
	print

    def __set__(self, container, value):
	print "this is set"
	print "self:", self
	print "container:", container
	print "value:", value
	print
	
    def __delete__(self, container):
	print "this is del"
	print "self:", type(self)
	print "container:", type(container)
	print

class Spam(object):
    x = Foo()

## Jason Orendorff    http://www.jorendorff.com/



From mal@lemburg.com  Mon Feb 18 11:04:29 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Feb 2002 12:04:29 +0100
Subject: [Python-Dev] SSL support in _socket
References: <LNBBLJKPBEHFEDALKOLCIEDJNNAA.tim.one@comcast.net>
Message-ID: <3C70DFBD.6D0DB861@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > Some explanation:
> >
> > The _ssl module needs access to the type object defined in
> > the _socket module. Since cross-DLL linking introduces a lot of
> > problems on many platforms, the "trick" is to wrap the
> > C API of a module in a struct which then gets exported to
> > other modules via a PyCObject.
> >
> > The code in socketmodule.c defines this struct (which currently
> > only contains the type object reference, but could very
> > well also include other C APIs needed by other modules)
> > and exports it as PyCObject via the module dictionary
> > under the name "CAPI".
> >
> > Other modules can now include the socketmodule.h file
> > which defines the needed C APIs to import and set up
> > a static copy of this struct in the importing module.
> >
> > After initialization, the importing module can then
> > access the C APIs from the _socket module by simply
> > referring to the static struct, e.g.
> >
> >         /* Load _socket module and its C API; this sets up the global
> >            PySocketModule */
> >         if (PySocketModule_ImportModuleAndAPI())
> >                 return;
> >
> >         ...
> >         if (!PyArg_ParseTuple(args, "O!|zz:ssl",
> >
> >                               PySocketModule.Sock_Type,
> >
> >                               (PyObject*)&Sock,
> >                               &key_file, &cert_file))
> >                 return NULL;
> >
> > (Perhaps I should copy the above explanation into the source
> >  files ?!)
> 
> I don't know.  I really don't have time to try and understand this, but I
> can tell you I spent a lot of time staring at the code just trying to fix
> the part that didn't work, and it was slow and painful going.  Without deep
> understanding, I can only repeat that all this machinery *seems* to be
> overkill in this specific case; and since there is no other case in the
> Python core, a mass of overly general machinery in the Python core seems out
> of place.

The idea of using the above framework was to get the 
discussion started and then perhaps extend this kind of 
support to other modules as well, e.g. to be able to
create and access types from other modules at C level.

Note that the framework only seems to be overkill
at the moment (since it only exports one symbol).
As soon as you add more APIs to the API struct,
things look different -- e.g. a socket constructor
at C level would be nice to have.
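
For readers on a current interpreter, the exported struct described above
can still be observed from Python.  A small sketch, assuming CPython 3, where
PyCObject has since been replaced by PyCapsule and the _socket module
publishes its C API under the attribute CAPI.  On other implementations the
attribute may be absent, which the code tolerates:

```python
import _socket

# CAPI is the attribute name used in the thread; it may be missing on
# interpreters other than CPython, so fall back to None instead of failing.
capi = getattr(_socket, "CAPI", None)

# The object is opaque from Python: it only carries a C pointer that other
# extension modules unwrap when they import the API.
capi_type_name = type(capi).__name__
print(capi_type_name)
```

Nothing useful can be done with the object at Python level; the point is
only that the module dictionary is the hand-over place, so no cross-DLL
symbol linking is involved.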

> > ...
> > Ah, you're right, the export of the type object is not
> > needed anymore since this is now done using the PyCObject.
> > Sorry, my bad.
> 
> No problem -- that part turned out to be easy, once I found it.

You should have just thrown the error message in my Inbox.

> > ...
> > The reason for putting the code in the header file is
> > to avoid duplication of code. The import API is needed by
> > all modules wishing to use the C API of the socket module.
> 
> But in this specific case you confirm that there is only one client:
> 
> > Currently, only _ssl needs this, but I think it would be
> > a good strategy to extend this technique to other modules
> > as well (esp. the array module would be a good candidate).
> 
> Possibly, but it's overly elaborate in this specific case.  If it needed to
> be hypergeneral (and it doesn't here), it seems it would be better to make
> the code template *more* general, so that every importer of every module
> could include a common (e.g.) PyImportModuleAndApi.h header file one or more
> times, after setting a pile of #defines to specialize it to the module at
> hand.

Right.
 
> > I don't think it's overly complicated.
> 
> You've confirmed that it is in this specific case, and that's the only case
> there is in the codebase all the Python developers work with.

Yeah, well, ok :-) You have to get the ball rolling 
somehow ;-) 
 
> > I am curious why the test_socket still fails on Windows
> > though. Both test_socket and test_socket_ssl work just fine on
> > Linux.
> 
> test_socket was a red herring.  Merely trying to import socket died with
> NameError on Windows.  That got fixed too, and the non-SSL socket tests on
> Windows worked fine then.

Thanks.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Mon Feb 18 11:17:17 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Feb 2002 12:17:17 +0100
Subject: [Python-Dev] new property factory arguments
References: <20020218084106.26444.qmail@web11807.mail.yahoo.com>
Message-ID: <3C70E2BD.3E72C6CA@lemburg.com>

john coppola wrote:
> 
> Hello python developers.  After discussions with Fred
> about defining how property objects are created, I
> decided to give it a whirl myself.  After about a minute
> of piddling, I came up with something I thought would
> be a hack but is quite interesting.
> ...
> ! property_init(PyObject *self, PyObject *args, PyObject *kwds)
>   {
> !       PyObject *get = NULL, *set = NULL, *del = NULL, *doc = NULL;
> !       static char *kwlist[] = {"fget", "fset", "fdel", "doc", 0};
> !       propertyobject *gs = (propertyobject *)self;
> !
> !       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOOO:property",
> !                                        kwlist, &get, &set, &del, &doc))
> !               return -1;
> ...
> --- 1003,1023 ----
>   }
> 
>   static int
> ! property_init(PyObject *self, PyObject *args, PyObject *kw)
>   {
> !       PyObject *get=NULL, *set=NULL, *del=NULL, *doc=NULL, *arg=NULL;
> !         static char *kwlist[] = {"object", 0};
> !         propertyobject *gs = (propertyobject *)self;
> !         if (!PyArg_ParseTupleAndKeywords(args,kw,"|O:property",kwlist,&arg))
> !              return -1;
> !
> !         get = PyObject_GetAttrString(arg,"__get__");
> !         set = PyObject_GetAttrString(arg,"__set__");
> !         del = PyObject_GetAttrString(arg,"__del__");
> !         doc = PyObject_GetAttrString(arg,"__doc__");

Wouldn't this break the documented API ? 

If so, I'd suggest to provide a second constructor which 
exposes the new signature instead. Should be easy to
do in Python...
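
A minimal sketch of what such a second constructor could look like, written
for Python 3 rather than the 2.2 of this thread.  The helper name
property_from and the method names get, set and delete are hypothetical,
chosen here only for illustration (plain names avoid clashing with __del__,
which Python reserves for finalizers).  The built-in property() keeps its
documented signature:

```python
def property_from(handler):
    """Build a standard property from an object that bundles accessors.

    Hypothetical helper, not part of Python: it looks for optional methods
    named get, set and delete on handler and hands them to property().
    """
    fget = getattr(handler, "get", None)
    fset = getattr(handler, "set", None)
    fdel = getattr(handler, "delete", None)
    doc = getattr(handler, "__doc__", None)
    return property(fget, fset, fdel, doc)


class Celsius(object):
    """Temperature stored on the container as _celsius."""

    def get(self, container):
        return container._celsius

    def set(self, container, value):
        container._celsius = float(value)


class Thermometer(object):
    # The bound methods of the Celsius instance receive the Thermometer
    # instance as their "container" argument.
    temperature = property_from(Celsius())


t = Thermometer()
t.temperature = 21
print(t.temperature)
```

Because Celsius defines no delete method, the resulting property has no
deleter and "del t.temperature" raises AttributeError, exactly as with a
property built by hand.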

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Mon Feb 18 11:25:26 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Feb 2002 12:25:26 +0100
Subject: [Python-Dev] Global name lookup schemes
References: <Pine.LNX.4.33.0202172213330.15644-100000@server1.lfw.org>
Message-ID: <3C70E4A6.57CD7E1E@lemburg.com>

[Very nice pictures]

Way cool, Ping ! Does AI provide tools for simplifying these
kinds of diagrams or did you do it all by hand ?

Perhaps Guido ought to add these to the PEP as external
reference ?!

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From Jack.Jansen@oratrix.com  Mon Feb 18 11:35:40 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Mon, 18 Feb 2002 12:35:40 +0100
Subject: [Python-Dev] SSL support in _socket
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEDJNNAA.tim.one@comcast.net>
Message-ID: <A3101458-2463-11D6-B2BF-0030655234CE@oratrix.com>

On Monday, February 18, 2002, at 06:19 , Tim Peters wrote:
> I don't know.  I really don't have time to try and understand this, but I
> can tell you I spent a lot of time staring at the code just trying to fix
> the part that didn't work, and it was slow and painful going.  Without deep
> understanding, I can only repeat that all this machinery *seems* to be
> overkill in this specific case; and since there is no other case in the
> Python core, a mass of overly general machinery in the Python core seems out
> of place.

Well... The MacOS toolbox modules have a similar requirement (but 
currently implemented in a different way, see pymactoolboxglue.c if 
you're interested in the gory details) and various extension packages 
(such as Numeric) also have their own implementation of something 
similar.

And there's packages like VTK which currently do hard cross-dll linking 
which could benefit from such a scheme.

Maybe someone should try and come up with a list of requirements for 
inter-extension-module communication and PEP it?
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From mal@lemburg.com  Mon Feb 18 11:53:35 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Feb 2002 12:53:35 +0100
Subject: [Python-Dev] SSL support in _socket
References: <A3101458-2463-11D6-B2BF-0030655234CE@oratrix.com>
Message-ID: <3C70EB3F.249B55E0@lemburg.com>

Jack Jansen wrote:
> 
> On Monday, February 18, 2002, at 06:19 , Tim Peters wrote:
> > I don't know.  I really don't have time to try and understand this, but I
> > can tell you I spent a lot of time staring at the code just trying to fix
> > the part that didn't work, and it was slow and painful going.  Without deep
> > understanding, I can only repeat that all this machinery *seems* to be
> > overkill in this specific case; and since there is no other case in the
> > Python core, a mass of overly general machinery in the Python core seems out
> > of place.
> 
> Well... The MacOS toolbox modules have a similar requirement (but
> currently implemented in a different way, see pymactoolboxglue.c if
> you're interested in the gory details) and various extension packages
> (such as Numeric) also have their own implementation of something
> similar.
> 
> And there's packages like VTK which currently do hard cross-dll linking
> which could benefit from such a scheme.
> 
> Maybe someone should try and come up with a list of requirements for
> inter-extension-module communication and PEP it?

Good idea. I can have a go at this next weekend.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From ping@lfw.org  Mon Feb 18 12:15:09 2002
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 18 Feb 2002 06:15:09 -0600 (CST)
Subject: [Python-Dev] Global name lookup schemes
In-Reply-To: <3C70E4A6.57CD7E1E@lemburg.com>
Message-ID: <Pine.LNX.4.33.0202180610220.18893-100000@server1.lfw.org>

On Mon, 18 Feb 2002, M.-A. Lemburg wrote:
> Way cool, Ping ! Does AI provide tools for simplifying these
> kinds of diagrams or did you do it all by hand ?

I spent all day on it.  Illustrator was not much help, sadly.
It's so lame that it can't even keep arrowheads stuck to arrows,
and its grid-snapping behaviour is mysterious and unpredictable.
Unfortunately it's the only tool i have that's remotely good
enough for the job (works with my Wacom tablet, fast navigation,
multiple undo).

> Perhaps Guido ought to add these to the PEP as external
> reference ?!

I would like him to, once we have made sure they are accurate.


-- ?!ng



From mal@lemburg.com  Mon Feb 18 13:49:52 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Feb 2002 14:49:52 +0100
Subject: [Python-Dev] Global name lookup schemes
References: <Pine.LNX.4.33.0202180610220.18893-100000@server1.lfw.org>
Message-ID: <3C710680.A06C59D@lemburg.com>

Ka-Ping Yee wrote:
> 
> On Mon, 18 Feb 2002, M.-A. Lemburg wrote:
> > Way cool, Ping ! Does AI provide tools for simplifying these
> > kinds of diagrams or did you do it all by hand ?
> 
> I spent all day on it.  Illustrator was not much help, sadly.

Ouch... and I thought someone had finally come up with a great
tool for doing technical diagrams. Oh well; I'll stick with Corel
Draw then.

> It's so lame that it can't even keep arrowheads stuck to arrows,
> and its grid-snapping behaviour is mysterious and unpredictable.
> Unfortunately it's the only tool i have that's remotely good
> enough for the job (works with my Wacom tablet, fast navigation,
> multiple undo).

Hmm, you sure did a great job on the diagrams given this
environment.
 
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From john_coppola_r_s@yahoo.com  Mon Feb 18 14:20:28 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Mon, 18 Feb 2002 06:20:28 -0800 (PST)
Subject: [Python-Dev] new property factory arguments
In-Reply-To: <HFEKILOLEFEFMKAECNDLMEFPDDAA.jason@jorendorff.com>
Message-ID: <20020218142028.9929.qmail@web11806.mail.yahoo.com>

--- Jason Orendorff <jason@jorendorff.com> wrote:
> With minor changes, this works already.

> class Foo(property):
>     def __get__(self, container, _type=None):
> 	print "this is get"
> 	print "self:", self
 . . . 


I didn't subclass from property.  I do believe that with my
example, any object, new or old, could be used as a
property.  And by looking at the code, property_init
clearly did not include GetAttr methods for __get__,
__set__, __del__.

In fact, there is no reason to include __delete__;
why not use __del__ instead?

If you send any ole python class instance to property,
your code fails.  That's the need for the change.

Without my patch...
class Bar(object): # <== object!
	def __get__(self,container,tp=None):
		print "get"
	def __set__(self,container,value):
		print "set"
	def __delete__(self,container):
		print "del"
		
>>> class Foo(object):
	x=property(Bar()) #performs coercion 
	
>>> a=Foo()
>>> a.x

Fails!
With my patch, it works.

In fact, I feel very strongly that the old syntax
should be removed.  Better now than later.
property(fget,fset,fdel,fdoc) does not make much sense
in the new object oriented world of python.
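
For reference, a short sketch of the behaviour under discussion, written for
Python 3 and using illustrative class names (Wrapped, Direct) and an
illustrative storage key "_x" that do not appear in the thread.  It assumes
the stock property(): property treats its first argument as fget and calls
it, so a non-callable instance fails on access, while the same instance
assigned directly as a class attribute is already honoured as a descriptor.
The hook for attribute deletion is __delete__ because __del__ is the
finalizer run when the descriptor object itself is destroyed:

```python
class Bar(object):
    """A plain descriptor: no property subclass, only the protocol methods."""

    def __get__(self, container, owner=None):
        if container is None:
            return self  # accessed on the class, not on an instance
        return container.__dict__.get("_x", "unset")

    def __set__(self, container, value):
        container.__dict__["_x"] = value

    def __delete__(self, container):
        container.__dict__.pop("_x", None)


class Wrapped(object):
    x = property(Bar())  # Bar() is stored as fget and later called


class Direct(object):
    x = Bar()  # the descriptor protocol applies without property()


try:
    Wrapped().x
    wrapped_error = None
except TypeError as exc:
    wrapped_error = exc

d = Direct()
before = d.x
d.x = 5
during = d.x
del d.x
after = d.x
print(before, during, after)
```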


Sincerely,
John Coppola



From gward@python.net  Mon Feb 18 15:18:09 2002
From: gward@python.net (Greg Ward)
Date: Mon, 18 Feb 2002 10:18:09 -0500
Subject: [Python-Dev] PEP 282: A Logging System  --  comments please
In-Reply-To: <200202160247.g1G2lkJ5029529@email.nist.gov>
References: <20020215164333.A31903@ActiveState.com> <200202160247.g1G2lkJ5029529@email.nist.gov>
Message-ID: <20020218151809.GA1335@gerg.ca>

On 15 February 2002, Michael McLay said:
> I scanned the PEP and didn't find a reference to the logging package 
> supporting logging over a network. 

I think the right way to handle that would be to pass a file-like object
to the logging framework, which it then write()'s to.  That should work
just fine for stream (TCP) sockets; I *think* it would work for datagram
(UDP) sockets too, but I'd want to test it first.
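
A rough sketch of the stream-socket case, in Python 3.  The log_line helper
and the message format are made up for illustration and are not the PEP 282
API; socket.socketpair() stands in for a real TCP connection so the example
is self-contained.  The datagram case is not covered here:

```python
import socket


def log_line(stream, level, message):
    """Write one formatted record to any object with write() and flush()."""
    stream.write("%s: %s\n" % (level, message))
    stream.flush()


# A connected pair of stream sockets plays the role of client and server.
receiver, sender = socket.socketpair()

# makefile() wraps the socket in a file-like object, which is all the
# hypothetical logger needs.
out = sender.makefile("w")
log_line(out, "INFO", "service started")
out.close()
sender.close()

reader = receiver.makefile("r")
received = reader.readline()
reader.close()
receiver.close()
print(received)
```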

        Greg
-- 
Greg Ward - Unix weenie                                 gward@python.net
http://starship.python.net/~gward/
Eschew obfuscation!


From Jack.Jansen@oratrix.com  Mon Feb 18 15:38:20 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Mon, 18 Feb 2002 16:38:20 +0100
Subject: [Python-Dev] Global name lookup schemes
In-Reply-To: <3C710680.A06C59D@lemburg.com>
Message-ID: <88F73897-2485-11D6-B2BF-0030655234CE@oratrix.com>

On Monday, February 18, 2002, at 02:49 , M.-A. Lemburg wrote:

> Ka-Ping Yee wrote:
>>
>> On Mon, 18 Feb 2002, M.-A. Lemburg wrote:
>>> Way cool, Ping ! Does AI provide tools for simplifying these
>>> kinds of diagrams or did you do it all by hand ?
>>
>> I spent all day on it.  Illustrator was not much help, sadly.
>
> Ouch... and I thought someone had finally come up with a great
> tool for doing technical diagrams. Oh well; I'll stick with Corel
> Draw then.

I've heard very good things of OmniGraffle. Never used it myself, but 
their web browser OmniWeb absolutely rooooooooooolz! Check out 
www.omnigroup.com.

MacOSX only, of course.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From fdrake@zope.com  Mon Feb 18 15:51:14 2002
From: fdrake@zope.com (Fred Drake)
Date: Mon, 18 Feb 2002 10:51:14 -0500
Subject: [Python-Dev] PEP 282: A Logging System  --  comments please
In-Reply-To: <20020218151809.GA1335@gerg.ca>
Message-ID: <web-3559495@digicool.com>

On 15 February 2002, Michael McLay said:
 > I scanned the PEP and didn't find a reference to the logging package
 > supporting logging over a network.

On Mon, 18 Feb 2002 10:18:09 -0500
 Greg Ward <gward@python.net> wrote:
> I think the right way to handle that would be to pass a file-like object
> to the logging framework, which it then write()'s to.

It seems to me that it should be trivial to map from the
logging API to syslog; it already provides for logging to
remote systems as well as filtering.

I'm sure this is already hashed out in the PEP though, so I
should be reading that instead of commenting here.  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From jacobs@penguin.theopalgroup.com  Mon Feb 18 16:29:14 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 18 Feb 2002 11:29:14 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
Message-ID: <Pine.LNX.4.33.0202180845530.13612-100000@penguin.theopalgroup.com>

Hello all,

I've been meta-reflecting a lot lately: reflecting on reflection.

My recent post on __slots__ not being picklable (and the resounding lack of
response to it) inspired me to try my hand at channeling Guido and reverse-
engineer some of the design decisions that went into the new-style class
system.  Unfortunately, the more I dug into the code, the more philosophical
my questions became.  So, I've written up some questions that help lay bare
some of the basic design questions that I've been asking myself and that you
should be aware of.

While there are several subtle issues I could raise, I do want some feedback
on some simple and fundamental ones first.  Please don't disqualify yourself
from commenting because you haven't read the code or used the new features
yet.  I've written my examples assuming only a basic and cursory
understanding of the new Python 2.2 features.

  [In this discussion I am only going to talk about native Python classes,
   not C-extension or native Python types (e.g., ints, lists, tuples,
   strings, cStringIO, etc.)]

  1) Should class instances explicitly/directly know all of their attributes?

     Before Python 2.2, all object instances contained a __dict__ attribute
     that mapped attribute names to their values.  This made pickling and
     some other reflection tasks fairly easy.

        e.g.:

        class Foo:
          def __init__(self):
            self.a = 1
            self.b = 2

        class Bar(Foo):
          def __init__(self):
            Foo.__init__(self)
            self.c = 3

        bar = Bar()
        print bar.__dict__
        > {'a': 1, 'c': 3, 'b': 2}

     I am aware that there are situations where this simple case does not
     hold (e.g., when implementing __setattr__ or __getattr__), but let's
     ignore those for now.  Rather, I will concentrate on how this classical
     Python idiom interacts with the new slots mechanism.  Here is the above
     example using slots:

        e.g.:

        class Foo(object):
          __slots__ = ['a','b']
          def __init__(self):
            self.a = 1
            self.b = 2

        class Bar(Foo):
          __slots__ = ['c']
          def __init__(self):
            Foo.__init__(self)
            self.c = 3

        bar = Bar()
        print bar.__dict__
        > AttributeError: 'Bar' object has no attribute '__dict__'

     We can see that the class instance 'bar' has no __dict__ attribute.
     This is because the slots mechanism allocates space for attribute
     storage directly inside the object, and thus does not use (or need) a
     per-object instance dictionary to store attributes.  Of course, it is
     possible to request a per-instance dictionary by inheriting from a
     new-style class that does not list any slots.  e.g. continuing from
     above:

        class Baz(Bar):
          def __init__(self):
            Bar.__init__(self)
            self.d = 4
            self.e = 5

        baz = Baz()
        print baz.__dict__
        > {'e': 5, 'd': 4}

     We have now created a class that has __dict__, but it only contains the
     attributes not stored in slots!  So, should class instances explicitly
     know their attributes?  Or more precisely, should class instances
     always have a __dict__ attribute that contains their attributes?  Don't
     worry, this does not mean that we cannot also have slots, though it
     does have some other implications.  Keep reading...


  2) Should attribute access follow the same resolution order rules as
     methods?

        class Foo(object):
          __slots__ = ['a']
          def __init__(self):
            self.a = 1

        class Bar(Foo):
          __slots__ = ('a',)
          def __init__(self):
            Foo.__init__(self)
            self.a = 2

        bar = Bar()
        print bar.a
        > 2
        print super(Bar,bar).a   # this doesn't actually work
        > 2 or 1?

    Don't worry -- this isn't a proposal and no, this doesn't actually work.
    However, the current implementation only narrowly escapes this trap:

        print bar.__class__.a.__get__(bar)
        > 2
        print bar.__class__.__base__.a.__get__(bar)
        > AttributeError: a

    Ok, let me explain what just happened.  Slots are implemented via the
    new descriptor interface.  In short, descriptor objects are properties
    and support __get__ and __set__ methods.  The slot descriptors are told
    the offset within an object instance where the PyObject* lives and proxy
    operations for it.  So getting and setting slots involves:

        # print bar.a
        a_descr = bar.__class__.a
        print a_descr.__get__(bar)

        # bar.a = 1
        a_descr = bar.__class__.a
        a_descr.__set__(bar, 1)

    So, above we get an attribute error when trying to access the 'a' slot
    from Bar since it was never initialized.  However, with a little
    ugliness you can do the following:

        # Get the descriptors for Foo.a and Bar.a
        a_foo_descr = bar.__class__.__base__.a
        a_bar_descr = bar.__class__.a
        a_foo_descr.__set__(bar,1)
        a_bar_descr.__set__(bar,2)

        print bar.a
        > 2
        print a_foo_descr.__get__(bar)
        > 1
        print a_bar_descr.__get__(bar)
        > 2

    In other words, the namespace for slots is not really flat, although
    there is no simple way to access these hidden attributes since method
    resolution order rules are not invoked by default.
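
The hidden-slot effect described above can be reproduced on a current
interpreter.  A minimal sketch in Python 3 syntax, reusing the Foo and Bar
names from the example; the variable names visible and hidden are
illustrative only:

```python
class Foo(object):
    __slots__ = ['a']


class Bar(Foo):
    __slots__ = ('a',)  # shadows Foo's slot, but Foo's storage still exists


bar = Bar()

# Each class owns a separate descriptor pointing at a separate offset.
Foo.a.__set__(bar, 1)
Bar.a.__set__(bar, 2)

visible = bar.a             # normal lookup finds Bar's descriptor first
hidden = Foo.a.__get__(bar)  # the base-class slot is reachable only this way
print(visible, hidden)
```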


  3) Should __slots__ be immutable?

     The __slots__ attribute of a new-style class lists all of the slots
     defined by that class.  It is represented as whatever sequence type
     was given when the class was declared:

       print Foo.__slots__
       > ['a']
       print Bar.__slots__
       > ('a',)

     This allows us to do things like:

       Foo.__slots__.append('b')
       foo = Foo()
       foo.b = 42
       > AttributeError: 'Foo' object has no attribute 'b'

     So modifying the slots does not do what one may expect.  This is
     because slot descriptors and the space for slots are only allocated
     when the classes are created (i.e., when they are inherited from
     'object', or from an object that descends from 'object').
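
A small sketch of the same experiment in Python 3 syntax, assuming the list
form of __slots__ from the Foo example; the names outcome and has_descriptor
are illustrative.  The list can be mutated, but no descriptor or storage is
created for the new name:

```python
class Foo(object):
    __slots__ = ['a']


# The class keeps the very list it was given, so appending succeeds...
Foo.__slots__.append('b')

foo = Foo()
try:
    foo.b = 42
    outcome = "assigned"
except AttributeError:
    # ...but the layout was fixed at class creation, so 'b' has no slot.
    outcome = "AttributeError"

has_descriptor = 'b' in Foo.__dict__
print(Foo.__slots__, outcome, has_descriptor)
```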


  4) Should __slots__ be flat?

     Bar.__slots__ only lists the slots specifically requested in Bar, even
     though it inherits from Foo, which has its own slots.  Which would be
     the preferable behavior?

       class Foo(object):
         __slots__ = ('a','b')
       class Bar(Foo):
         __slots__ = ('c','d')

       print Bar.__slots__
       > ('c','d')           # current behavior
       or
       > ('a','b','c','d')   # alternate behavior

    Clearly, this issue goes back to the ideas addressed in question 1.  If
    slot descriptors are not stored in a per-instance dictionary, then
    the assumptions on how to do object reflection must change.  However,
    which version of the following code do you prefer to print all
    attributes of a given object:

      Old style or if descriptors are stored in obj.__dict__:

        if hasattr(obj,'__dict__'):
          print ''.join([ '%s=%s' % nameval for nameval in obj.__dict__.items() ])


      Currently in Python 2.2 (and still not quite correct):

        def print_slot_attrs(obj,cls=None):
          if not cls:
            cls = obj.__class__
          for name,descr in cls.__dict__.items():
            if str(type(descr)) == "<type 'member_descriptor'>":
              if hasattr(obj, name):
                print "%s=%s" % (name,getattr(obj, name))
          for base in cls.__bases__:
            print_slot_attrs(obj,base)

        if hasattr(obj,'__dict__'):
          print [ '%s=%s' % nameval for nameval in obj.__dict__.items() ]
        print_slot_attrs(obj)


      Flat and immutable slot namespace:

        a  = [ '%s=%s' % nameval for nameval in obj.__dict__.items() ]
        a += [ '%s=%s' % (name,getattr(obj, name)) for name in obj.__slots__ \
                                         if hasattr(obj, name) ]
        print ''.join(a)

  So, which one of these do you want to support or explain to a new user?
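
For comparison, a sketch of how the per-class __slots__ lists can be
flattened by hand today.  It is written for Python 3, and the helper name
all_attributes is made up for illustration.  It walks the MRO, reads each
class's own __slots__, and then adds the instance __dict__ if there is one.
It ignores name-mangled private slots and the shadowed-slot case from
question 2, so it is a convenience, not a complete answer:

```python
def all_attributes(obj):
    """Collect slot-stored and __dict__-stored attributes in one mapping."""
    found = {}
    for cls in reversed(type(obj).__mro__):
        names = cls.__dict__.get('__slots__', ())
        if isinstance(names, str):
            names = (names,)  # a single string declares a single slot
        for name in names:
            if name in ('__dict__', '__weakref__'):
                continue  # bookkeeping slots, not user attributes
            if hasattr(obj, name):  # unset slots raise AttributeError
                found[name] = getattr(obj, name)
    found.update(getattr(obj, '__dict__', {}))
    return found


class Foo(object):
    __slots__ = ['a', 'b']

    def __init__(self):
        self.a = 1
        self.b = 2


class Bar(Foo):
    __slots__ = ['c']

    def __init__(self):
        Foo.__init__(self)
        self.c = 3


class Baz(Bar):  # no __slots__, so instances also get a __dict__
    def __init__(self):
        Bar.__init__(self)
        self.d = 4
        self.e = 5


print(all_attributes(Baz()))
```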


Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From john_coppola_r_s@yahoo.com  Mon Feb 18 16:46:55 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Mon, 18 Feb 2002 08:46:55 -0800 (PST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <Pine.LNX.4.33.0202180845530.13612-100000@penguin.theopalgroup.com>
Message-ID: <20020218164655.38970.qmail@web11806.mail.yahoo.com>

I haven't even finished reading this yet.
This is good stuff!

--- Kevin Jacobs <jacobs@penguin.theopalgroup.com>
wrote:
> Hello all,
> 
> I've been meta-reflecting a lot lately: reflecting
> on reflection.
> 
> My recent post on __slots__ not being picklable (and
> the resounding lack of
> response to it) inspired me to try my hand at
> channeling Guido and reverse-
> engineer some of the design decisions that went into
> the new-style class
> system.  Unfortunately, the more I dug into the
> code, the more philosophical
> my questions became.  So, I've written up some
> questions that help lay bare
> some of the basic design questions that I've been asking
> myself and that you
> should be aware of.
> 
> While there are several subtle issues I could raise,
> I do want some feedback
> on some simple and fundamental ones first.  Please
> don't disqualify yourself
> from commenting because you haven't read the code or
> used the new features
> yet.  I've written my examples assuming only a basic
> and cursory
> understanding of the new Python 2.2 features.
> 
>   [In this discussion I am only going to talk about
> native Python classes,
>    not C-extension or native Python types (e.g.,
> ints, lists, tuples,
>    strings, cStringIO, etc.)]
> 
>   1) Should class instances explicitly/directly know
> all of their attributes?
> 
>      Before Python 2.2, all object instances
> contained a __dict__ attribute
>      that mapped attribute names to their values. 
> This made pickling and
>      some other reflection tasks fairly easy.
> 
>         e.g.:
> 
>         class Foo:
>           def __init__(self):
>             self.a = 1
>             self.b = 2
> 
>         class Bar(Foo):
>           def __init__(self):
>             Foo.__init__(self)
>             self.c = 3
> 
>         bar = Bar()
>         print bar.__dict__
>         > {'a': 1, 'c': 3, 'b': 2}
> 
>      I am aware that there are situations where this
> simple case does not
>      hold (e.g., when implementing __setattr__ or
> __getattr__), but let's
>      ignore those for now.  Rather, I will
> concentrate on how this classical
>      Python idiom interacts with the new slots
> mechanism.  Here is the above
>      example using slots:
> 
>         e.g.:
> 
>         class Foo(object):
>           __slots__ = ['a','b']
>           def __init__(self):
>             self.a = 1
>             self.b = 2
> 
>         class Bar(Foo):
>           __slots__ = ['c']
>           def __init__(self):
>             Foo.__init__(self)
>             self.c = 3
> 
>         bar = Bar()
>         print bar.__dict__
>         > AttributeError: 'Bar' object has no
> attribute '__dict__'
> 
>      We can see that the class instance 'bar' has no
> __dict__ attribute.
>      This is because the slots mechanism allocates
> space for attribute
>      storage directly inside the object, and thus
> does not use (or need) a
>      per-object instance dictionary to store
> attributes.  Of course, it is
>      possible to request a per-instance
> dictionary by inheriting from a
>      new-style class that does not list any slots. 
> e.g. continuing from
>      above:
> 
>         class Baz(Bar):
>           def __init__(self):
>             Bar.__init__(self)
>             self.d = 4
>             self.e = 5
> 
>         baz = Baz()
>         print baz.__dict__
>         > {'e': 5, 'd': 4}
> 
>      We have now created a class that has __dict__,
> but it only contains the
>      attributes not stored in slots!  So, should
> class instances explicitly
>      know their attributes?  Or more precisely,
> should class instances
>      always have a __dict__ attribute that contains
> their attributes?  Don't
>      worry, this does not mean that we cannot also
> have slots, though it
>      does have some other implications.  Keep
> reading...
> 
> 
>   2) Should attribute access follow the same
> resolution order rules as
>      methods?
> 
>         class Foo(object):
>           __slots__ = ['a']
>           def __init__(self):
>             self.a = 1
> 
>         class Bar(Foo):
>           __slots__ = ('a',)
>           def __init__(self):
>             Foo.__init__(self)
>             self.a = 2
> 
>         bar = Bar()
>         print bar.a
>         > 2
>         print super(Bar,bar).a   # this doesn't
> actually work
>         > 2 or 1?
> 
>     Don't worry -- this isn't a proposal and no,
> this doesn't actually work.
>     However, the current implementation only
> narrowly escapes this trap:
> 
>         print bar.__class__.a.__get__(bar)
>         > 2
>         print bar.__class__.__base__.a.__get__(bar)
>         > AttributeError: a
> 
>     Ok, let me explain what just happened.  Slots
> are implemented via the
>     new descriptor interface.  In short, descriptor
> objects are properties
>     and support __get__ and __set__ methods.  The
> slot descriptors are told
>     the offset within an object instance the
> PyObject* lives and proxy
>     operations for them.  So getting and setting
> slots involves:
> 
>         # print bar.a
>         a_descr = bar.__class__.a
>         print a_descr.__get__(bar)
> 
>         # bar.a = 1
>         a_descr = bar.__class__.a
>         a_descr.__set__(bar, 1)
> 
>     So, above we get an attribute error when trying
> to access the 'a' slot
>     from Bar since it was never initialized. 
> However, with a little
>     ugliness you can do the following:
> 
>         # Get the descriptors for Foo.a and Bar.a
>         a_foo_descr = bar.__class__.__base__.a
>         a_bar_descr = bar.__class__.a
>         a_foo_descr.__set__(bar,1)
>         a_bar_descr.__set__(bar,2)
> 
>         print bar.a
>         > 2
>         print a_foo_descr.__get__(bar)
>         > 1
>         print a_bar_descr.__get__(bar)
>         > 2
> 
>     In other words, the namespace for slots is not
> really flat, although
>     there is no simple way to access these hidden
> attributes 
=== message truncated ===


__________________________________________________
Do You Yahoo!?
Yahoo! Sports - Coverage of the 2002 Olympic Games
http://sports.yahoo.com


From DavidA@ActiveState.com  Mon Feb 18 19:01:51 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Mon, 18 Feb 2002 11:01:51 -0800
Subject: [Python-Dev] Meta-reflections
References: <Pine.LNX.4.33.0202180845530.13612-100000@penguin.theopalgroup.com>
Message-ID: <3C714F9F.841DB772@activestate.com>

I think that you're making useful points, but I think that it's worth
stepping even further back and deciding what the reflection API should
be like from a "what's it for" POV.

This relates to much of the discussion about what dir() should do on
new-style classes, as well as why some Python objects have 'members',
some have 'methods', etc. 

In my opinion, __dict__ is mostly an implementation detail, and it makes
sense to me that the slot names don't show up in there (after all, it's
not a dictionary!).  

What I'd propose is that the inspect module grow some "abstract"
reflection APIs which make it possible for folks who don't need to know
about implementation details to get away with it.

Looking at it, maybe it already has everything we need.  I'm not quite
sure why inspect.getmembers is called that, but maybe I'm the only one
who's not sure what 'members' mean in Python.
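
As a rough illustration of what inspect already offers, the sketch
below filters inspect.getmembers() down to data attributes.  The
helper name data_members and the Point class are made up for the
example, the filter (skip double-underscore names and routines) is
just one plausible choice, and the code assumes a current Python
rather than 2.2.

```python
import inspect


class Point(object):
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


def data_members(obj):
    """Return (name, value) pairs for non-dunder, non-routine members."""
    return [(name, value)
            for name, value in inspect.getmembers(obj)
            if not name.startswith('__') and not inspect.isroutine(value)]


print(data_members(Point(1, 2)))
```

The caller never touches __dict__ or __slots__ here, which is the
kind of "abstract" access being asked for.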

--david


From jacobs@penguin.theopalgroup.com  Mon Feb 18 19:33:04 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 18 Feb 2002 14:33:04 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <3C714F9F.841DB772@activestate.com>
Message-ID: <Pine.LNX.4.33.0202181401470.16028-100000@penguin.theopalgroup.com>

On Mon, 18 Feb 2002, David Ascher wrote:
> I think that you're making useful points, but I think that it's worth
> stepping even further back and deciding what the reflection API should
> be like from a "what's it for" POV?

Exactly!  However, a meta-discussion on meta-reflection is a little
too abstract for the uninterested to jump in on.  Most people who
read python-dev, though, use and have come to rely on __dict__ as The
Python Reflection API for instance attributes.

> This relates to much of the discussion about what dir() should do on
> new-style classes, as well as why some Python objects have 'members',
> some have 'methods', etc.

Sure, except that I've _NEVER_ assumed dir() was anything more than a
quick-and-dirty ultra-high level hack that was occasionally useful for doing
reflection.  One call does not a reflection API make.

> In my opinion, __dict__ is mostly an implementation detail, and it makes
> sense to me that the slot names don't show up in there (after all, it's
> not a dictionary!).

I think so too, though I don't want to ram my own views down people's
throats on the matter.  However, it is potentially valid to view __dict__ as
the one true reflection API for getting access to all attributes.  This
isn't too outlandish since it effectively is in Python 2.2.  Pickle and
cPickle 2.2 (among several dozen other examples I've found) are currently
implemented assuming this.  If we wanted to keep this existing API we could
support reflection on slots by extending object instances with only slot
attributes to share a common read-only __dict__.  New style class instances
with per-instance __dict__'s should start with a mutable copy when
instantiated.  For the record, I don't think this is the right way to go,
even though it is a valid way of defining the Python reflection API.

> What I'd propose is that the inspect module grow some "abstract"
> reflection APIs which make it possible for folks who don't need to know
> about implementation details to get away with it.

Great idea!  I've already got a stack of suggestions and patches that
clean up other various bits of it.

However, there was an unstated and important question left out of my last
e-mail: We need to decide if slots are really 'attributes' or "something
else".  The "something else" being akin to __set/getattr__ virtual
attributes, pure properties, and other techniques that will almost always
require explicit hooks into reflection APIs.  My preference is the
former, that slot declarations simply affect the allocation choices made by
an object, and not the semantics of what may be done with that object
(modulo issues when per-instance dicts are not allocated).
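
To make the "slots are just attributes" view concrete for pickling,
here is a minimal sketch of a mixin that supplies the state hooks
pickle looks for.  SlotStateMixin is a hypothetical name, the classes
are the toy ones from this thread, and the code assumes a current
Python; it is not a description of how pickle or cPickle handle slots
themselves.

```python
class SlotStateMixin(object):
    """Expose slot values (and __dict__, if present) as one state dict."""
    __slots__ = ()     # the mixin itself adds no storage and no __dict__

    def __getstate__(self):
        state = {}
        for klass in type(self).__mro__:
            slots = klass.__dict__.get('__slots__', ())
            if isinstance(slots, str):
                slots = (slots,)
            for name in slots:
                if name in ('__dict__', '__weakref__'):
                    continue
                if name not in state and hasattr(self, name):
                    state[name] = getattr(self, name)
        state.update(getattr(self, '__dict__', {}))
        return state

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)


class Foo(SlotStateMixin):
    __slots__ = ('a', 'b')

class Bar(Foo):
    __slots__ = ('c',)


bar = Bar()
bar.a, bar.c = 1, 3                 # slot 'b' stays unset
state = bar.__getstate__()

clone = Bar.__new__(Bar)            # roughly what unpickling does
clone.__setstate__(state)
print(sorted(state.items()), clone.a, clone.c)
```

Unset slots are simply left out of the state, so they stay unset on
the restored object.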

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From john_coppola_r_s@yahoo.com  Mon Feb 18 20:01:17 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Mon, 18 Feb 2002 12:01:17 -0800 (PST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <3C714F9F.841DB772@activestate.com>
Message-ID: <20020218200117.32087.qmail@web11808.mail.yahoo.com>

[ David Ascher <DavidA@ActiveState.com> wrote:]
> I think that you're making useful points, but I
> think that it's worth stepping even further back
> and deciding what the reflection API should
> be like from a "what's it for" POV?
 
> This relates to much of the discussion about what
> dir() should do on
> new-style classes, as well as why some Python
> objects have 'members',
> some have 'methods', etc. 

I think his points were more than useful.  His
examples expose serious flaws with the use of slots,
which I hardly see as satisfactory behavior. 
Particularly, are slot attributes to be treated like
MRO or not?  Hummm...

Does each class have a separate set of slot
attributes?

class A(object):
   __slots__=('foo','bar')

class B(A):
   __slots__=('foo','spam')

What are we supposed to expect here?

I believe things would be greatly simplified if the
inheritance tree was traversed and all slot attributes
were concatenated and regarded as unique for a given
instance.  So in the above example we would expect
(breadth first then depth),
__slots__ =('foo', 'spam', 'bar').

A note on dir...
Since we have no introspection tools other than dir
as of this revision of Python, I believe dir should
correctly display slot attributes along with dict
attributes.

Does "dir" stand for directory or does it mean
__dict__.keys()?  I believe this is an implementation
detail.  Dir should lookup every attribute, but now we
need two additional functions: slotdir(), dictdir(). 
Maybe better names: slots(), dictdir().
Another argument for changing dir is that cPickle and
pickle would probably function correctly as a result
of changing dir's implementation to reveal slot
attributes.

But even on another note which digs deeper into the
philosophy of __slots__, why not use slots for class
methods?  Can this already be done?  It can be used
everywhere, modules, imported modules, classes,
instances, etc.




 

> 
> In my opinion, __dict__ is mostly an implementation
> detail, and it makes
> sense to me that the slot names don't show up in
> there (after all, it's
> not a dictionary!).  
> 
> What I'd propose is that the inspect module grow
> some "abstract"
> reflection APIs which make it possible for folks who
> don't need to know
> about implementation details to get away with it.
> 
> Looking at it, maybe it already has everything we
> need.  I'm not quite
> sure why inspect.getmembers is called that, but
> maybe I'm the only one
> who's not sure what 'members' mean in Python.
> 
> --david
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev




From john_coppola_r_s@yahoo.com  Mon Feb 18 20:07:33 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Mon, 18 Feb 2002 12:07:33 -0800 (PST)
Subject: [Python-Dev] Re: comp.lang.python in English? (was Re: Why Python is like BASIC)
In-Reply-To: <oqofim3dei.fsf@titan.progiciels-bpi.ca>
Message-ID: <20020218200733.73832.qmail@web11803.mail.yahoo.com>

--- François Pinard <pinard@iro.umontreal.ca> wrote:
> 
> Exactement.  Bravo!  Bien dit! :-)

I agree with François and I'm English-speaking.

-john




From mal@lemburg.com  Mon Feb 18 20:19:47 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Feb 2002 21:19:47 +0100
Subject: [Python-Dev] Re: comp.lang.python in English? (was Re: Why Python is
 like BASIC)
References: <20020218200733.73832.qmail@web11803.mail.yahoo.com>
Message-ID: <3C7161E3.F195BE73@lemburg.com>

john coppola wrote:
> 
> --- François Pinard <pinard@iro.umontreal.ca> wrote:
> >
> > Exactement.  Bravo!  Bien dit! :-)
> 
> I agree with François and I'm English-speaking.

Either my mailer is broken or I am missing some context here...
what does this have to do with python-dev ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From john_coppola_r_s@yahoo.com  Mon Feb 18 20:49:49 2002
From: john_coppola_r_s@yahoo.com (john coppola)
Date: Mon, 18 Feb 2002 12:49:49 -0800 (PST)
Subject: [Python-Dev] inadvertantly posted to wrong discussion
In-Reply-To: <3C7161E3.F195BE73@lemburg.com>
Message-ID: <20020218204949.12674.qmail@web11805.mail.yahoo.com>

It was completely an accident.

Sorry.





From DavidA@ActiveState.com  Mon Feb 18 21:30:30 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Mon, 18 Feb 2002 13:30:30 -0800
Subject: [Python-Dev] Global name lookup schemes
References: <Pine.LNX.4.33.0202180610220.18893-100000@server1.lfw.org>
Message-ID: <3C717276.9FF18C31@activestate.com>

Ka-Ping Yee wrote:
> 
> On Mon, 18 Feb 2002, M.-A. Lemburg wrote:
> > Way cool, Ping ! Does AI provide tools for simplifying these
> > kind of diagrams or did you do it all by hand ?
> 
> I spent all day on it.  Illustrator was not much help, sadly.
> It's so lame that it can't even keep arrowheads stuck to arrows,
> and its grid-snapping behaviour is mysterious and unpredictable.

FYI, a language that I really enjoyed working with for illustrations is
MetaPost:

  http://cm.bell-labs.com/who/hobby/MetaPost.html
  http://www.tug.org/metapost.html

It's a really cool language, and like a lot of drawing languages,
debugging is pretty. =)

I've wanted to bridge Python and MetaPost, but never found the time (or
frankly, a good excuse).

--da


From martin@v.loewis.de  Mon Feb 18 21:31:22 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 18 Feb 2002 22:31:22 +0100
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <Pine.LNX.4.33.0202180845530.13612-100000@penguin.theopalgroup.com>
References: <Pine.LNX.4.33.0202180845530.13612-100000@penguin.theopalgroup.com>
Message-ID: <m3adu6zdcl.fsf@mira.informatik.hu-berlin.de>

Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes:

>   1) Should class instances explicitly/directly know all of their attributes?

Since types are classes, this is the same question as "should type
instances know all their attributes?" I don't think they should, in
general: For example, there is no way to find out whether a string
object has an interned pointer, and I don't think there should be.

The __slots__ aren't really different here. In fact, if you do

class Spam(object):
  __slots__ = ('a','b')

s = Spam()
s.a = {}
del Spam.a

you lose access to s.a, even though it is still available (I guess it
is actually a bug that cyclic garbage collection won't find cycles
involving slots).
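
A small sketch of that behaviour, assuming a current CPython (the
variable names are only for illustration): once the descriptor is
removed from the class, ordinary lookup fails, but the object stored
in the slot can still be reached through a saved reference to the
descriptor.

```python
class Spam(object):
    __slots__ = ('a', 'b')


s = Spam()
value = {}
s.a = value

descr = Spam.__dict__['a']   # keep a reference to the slot descriptor
del Spam.a                   # remove the descriptor from the class

try:
    s.a
    lookup_failed = False
except AttributeError:
    lookup_failed = True

print(lookup_failed)               # whether normal attribute lookup failed
print(descr.__get__(s) is value)   # whether the slot still holds the object
```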

>   2) Should attribute access follow the same resolution order rules as
>      methods?

Yes, I think so.

>   4) Should __slots__ be flat?

Yes. They should also be a property of the type, not a member of the
dict of the type, and they should be a tuple of member objects, not a
list of strings. It might be reasonable to call this property
__members__.

>        > ('c','d')           # current behavior
>        or
>        > ('a','b','c','d')   # alternate behavior

Neither, nor; assuming you meant Bar to inherit from Foo, it should be

(<member 'a' of 'Foo' objects>, <member 'b' of 'Foo' objects>,
 <member 'c' of 'Bar' objects>, <member 'd' of 'Bar' objects>)

Regards,
Martin


From ping@lfw.org  Mon Feb 18 21:44:06 2002
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 18 Feb 2002 15:44:06 -0600 (CST)
Subject: [Python-Dev] Global name lookup schemes
In-Reply-To: <3C717276.9FF18C31@activestate.com>
Message-ID: <Pine.LNX.4.33.0202181542190.22288-100000@server1.lfw.org>

On Mon, 18 Feb 2002, David Ascher wrote:
> FYI, a language that I really enjoyed working with for illustrations is
> MetaPost:

I'd rather discuss the *diagrams* on this list than the diagram-making
tools.  :)  (You can send your suggestions for better tools to me
individually, if you like, and i'll summarize later if there's interest.)

So what do you all think of the various global name lookup proposals?


-- ?!ng



From tim.one@comcast.net  Mon Feb 18 22:25:58 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 18 Feb 2002 17:25:58 -0500
Subject: [Python-Dev] Global name lookup schemes
In-Reply-To: <Pine.LNX.4.33.0202181542190.22288-100000@server1.lfw.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGPNNAA.tim.one@comcast.net>

[Ping]
> I'd rather discuss the *diagrams* on this list than the diagram-making
> tools.  :)

Unless you write a new tool in Python <wink>.

> ...
> So what do you all think of the various global name lookup proposals?

I expect reality has once again managed to extinguish most post-Conference
euphoria.  I spent 200% of my "free time" this weekend doing research for
PSF board issues, and still haven't gotten to even reading about Oren's
(IIRC) dict gimmicks.

Guido is off traveling.  Jeremy is in the midst of moving.  Skip is too busy
approving posts of mine that stinking SpamCop rejects since I involuntarily
switched ISPs.

So I'm glad we harassed Guido into at least starting a PEP ...

once-upon-a-time-ly y'rs  - tim



From nas@python.ca  Mon Feb 18 22:58:06 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 18 Feb 2002 14:58:06 -0800
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEGPNNAA.tim.one@comcast.net>; from tim.one@comcast.net on Mon, Feb 18, 2002 at 05:25:58PM -0500
References: <Pine.LNX.4.33.0202181542190.22288-100000@server1.lfw.org> <LNBBLJKPBEHFEDALKOLCEEGPNNAA.tim.one@comcast.net>
Message-ID: <20020218145806.A26111@glacier.arctrix.com>

I've been working on Skip's rattlesnake and have made some progress.
Right now I'm trying to hack the compiler package and am looking for a
good reference on code generation.  I have the "New Dragon book" as well
as "Essentials of Programming Languages" but neither one seems to be
telling me what I want to know.  Any suggestions?

  Neil


From simon@netthink.co.uk  Mon Feb 18 23:12:17 2002
From: simon@netthink.co.uk (Simon Cozens)
Date: Mon, 18 Feb 2002 23:12:17 +0000
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <20020218145806.A26111@glacier.arctrix.com>
References: <Pine.LNX.4.33.0202181542190.22288-100000@server1.lfw.org> <LNBBLJKPBEHFEDALKOLCEEGPNNAA.tim.one@comcast.net> <20020218145806.A26111@glacier.arctrix.com>
Message-ID: <20020218231217.GA5118@netthink.co.uk>

Neil Schemenauer:
> good reference on code generation.  I have the "New Dragon book" as well
> as "Essentials of Programming Languages" but neither one seems to be
> telling me what I want to know.  Any suggestions?

You could try Appel: Modern Compiler Implementation in {C,ML,Java}.

-- 
"How should I know if it works?  That's what beta testers are for.  I only
coded it."
(Attributed to Linus Torvalds, somewhere in a posting)


From JeffH@ActiveState.com  Tue Feb 19 00:25:57 2002
From: JeffH@ActiveState.com (Jeff Hobbs)
Date: Mon, 18 Feb 2002 16:25:57 -0800
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <200202161628.g1GGS1x30329@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <009201c1b8dc$00eb6fb0$ba03a8c0@activestate.ca>

> I hope that it's possible to do something better with Tcl/Tk 8.3 that
> doesn't require the sleep and maintains the existing _tkinter API /
> semantics.

I guess I would need to get a better understanding of why it was
designed with the sleep in the first place.  Martin mentioned
that it allows a thread switch to occur, but a shorter sleep
interval would have done the same.

Tk 8.1+ has been thread-safe, but only in 8.3 have people been
pushing it a little harder (most users of threads are Tcl-only).
However, there are the different models between Python and Tcl
threading, and perhaps that is a reason another method wasn't
attempted early.

Anyway, as to Martin's questions:

> In the light of this rationale, can you please explain what
> Tcl_AsyncMark is and how it would avoid the problem, or what effect
> calling Tcl_CreateEventSource would have, or how Tcl_ThreadQueueEvent
> would help?

Tcl_AsyncMark is what you would call if you left the while loop
looking more like:

[global or tls]
static int stopRequested = 0;

[in func]
	while (!stopRequested && foundEvent) {
	    foundEvent = Tcl_DoOneEvent(TCL_ALL_EVENTS);
	}

And whenever a signal occurs, you would do:

ProcessSignal()
{
    stopRequested = 1;
    Tcl_AsyncMark(asyncHandler);
}

The asyncHandler then has its own callback routine of your
choosing.  Now this might not be what you want, as this is more
the design for single-threaded systems that want an event loop.

There is also the Tcl_CreateEventSource route.  This allows you
to provide a proc that gets called in addition to the internal
Tcl one for processing events.  This is most often used when
tying together event sources like Tk and Gtk, or Tk and MFC, ...

You may simply need to call Tcl_SetMaxBlockTime.  This will
prevent Tcl from indefinitely blocking when no events are
received.  This may be the simplest solution to create the same
effect as the Sleep, but without any other negative effects.

Almost all of these are described in fairly good detail here:
	http://www.tcl.tk/man/tcl8.3/TclLib/Notifier.htm

Jeff


From gward@python.net  Tue Feb 19 01:14:19 2002
From: gward@python.net (Greg Ward)
Date: Mon, 18 Feb 2002 20:14:19 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <Pine.LNX.4.33.0202180845530.13612-100000@penguin.theopalgroup.com>
References: <Pine.LNX.4.33.0202180845530.13612-100000@penguin.theopalgroup.com>
Message-ID: <20020219011419.GB5121@gerg.ca>

On 18 February 2002, Kevin Jacobs said:
> My recent post on __slots__ not being picklable (and the resounding lack of
> response to it)

Certainly caught my attention, but I had nothing to add.

>   1) Should class instances explicitly/directly know all of their attributes?

I'm not sure you're asking the right question.  If you're concerned with
introspection, shouldn't the question be: "Should arbitrary code be able
to find out the set of attributes associated with a given object?"  The
Pythonic answer is clearly yes.  And if "attribute" means "something
that follows a dot", then you can do this using dir().  Unfortunately,
the expansion of dir() to include methods means it's no longer very
useful for getting just instance attributes, whether they're in a
__dict__ or some other method.

So the obvious answer is to use vars(), which works on classic classes
and __slots__-less new-style classes.  (I think vars(x) is just a more
sociable way to spell x.__dict__.)  But it bombs on classes with
__slots__:

>>> class C(object):
...   __slots__ = ['a', 'b']
... 
>>> c = C()
>>> vars(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: vars() argument must have __dict__ attribute

Uh-oh.  This is a problem.

>   3) Should __slots__ be immutable?

Yes, definitely.  Clearly __slots__ is a property of the type (class),
not of the instance, and once the class is defined, that's it.  (Or that
should be it.)  It looks as though you can modify __slots__, but it has
no effect; that's mildly bogus.

>   4) Should __slots__ be flat?

Hmmmm... probably.  That's certainly consistent with "... once the class
is defined, that's it".

        Greg
-- 
Greg Ward - geek-at-large                               gward@python.net
http://starship.python.net/~gward/
If you and a friend are being chased by a lion, it is not necessary to
outrun the lion.  It is only necessary to outrun your friend.


From tim.one@comcast.net  Mon Feb 18 23:51:52 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 18 Feb 2002 18:51:52 -0500
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <20020218145806.A26111@glacier.arctrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHDNNAA.tim.one@comcast.net>

[Neil Schemenauer]
> I've been working on Skip's rattlesnake and have made some progress.

Cool!  I encourage this.  That and 2 dollars will buy you a cup of coffee.

> Right now I'm trying to hack the compiler package and am looking for a
> good reference on code generation.  I have the "New Dragon book" as well
> as "Essentials of Programming Languages" but neither one seems to be
> telling me what I want to know.  Any suggestions?

Write it in Python because you won't be happy with your first 2 or 3
attempts.

Compiler books have historically been very heavy on front end issues, in
large part because the theory of parsing is well understood.  What
relatively little you can find on back end issues tends to be sketchy and
severely limited by the author's personal experience.  In large part this is
because almost no interesting optimization problem can be solved in linear
time (whether it's optimal instruction selection, optimal register
assignment, optimal instruction ordering, ...), so real-life back ends are a
mountain of idiosyncratic heuristics.

Excellent advice that almost nobody follows <0.5 wink>:  choose a flexible
intermediate representation, then structure all your transformations as
independent passes, such that the output of every pass is acceptable as the
input to every pass.  Then keep each pass focused, as simple as possible
(for example, if a transformation may create regions of dead code, don't
dare try to clean it up in the same pass, or contort the logic even a little
bit to try to avoid creating dead code -- instead let it create all the dead
code it wants, and (re)invoke a "remove dead code" pass afterwards).
*Because* back ends are a mountain of idiosyncratic heuristics, this design
lets you add new ones, remove old ones, and reorder them with minimal pain.
One compiler I read about (but didn't have the pleasure of using) actually
allowed you to specify the sequence of back end transformations on the
cmdline, using a regular expression notation where, e.g.

   (hoist dead)+

meant "run the hoist pass followed by the dead code removal pass, one or
more times, until a fixed point is reached".
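
A minimal sketch of that structure in Python.  Everything in it is an
assumption made for illustration: the toy IR (tuples of the form
('const', dst, value), ('add', dst, a, b) and ('ret', src)), the rule
that each register is assigned only once, and the pass names
fold_constants and remove_dead.  It does not describe any particular
compiler.

```python
def fold_constants(code):
    """Replace an 'add' of two known constants with a 'const'.

    Leaves the now-unused constant definitions behind on purpose;
    cleaning them up is the job of remove_dead.
    """
    known = {}
    out = []
    for ins in code:
        if ins[0] == 'add' and ins[2] in known and ins[3] in known:
            ins = ('const', ins[1], known[ins[2]] + known[ins[3]])
        if ins[0] == 'const':
            known[ins[1]] = ins[2]
        out.append(ins)
    return out


def remove_dead(code):
    """Drop instructions whose result is never used (one backward sweep)."""
    live = set()
    out = []
    for ins in reversed(code):
        if ins[0] == 'ret':
            live.add(ins[1])
            out.append(ins)
        elif ins[1] in live:
            if ins[0] == 'add':
                live.update(ins[2:4])
            out.append(ins)
    out.reverse()
    return out


def run_to_fixed_point(code, passes):
    """Apply the passes in order, repeating until the IR stops changing."""
    while True:
        before = code
        for run_pass in passes:
            code = run_pass(code)
        if code == before:
            return code


program = [
    ('const', 'a', 2),
    ('const', 'b', 3),
    ('add', 'c', 'a', 'b'),
    ('const', 'd', 10),
    ('add', 'e', 'c', 'd'),
    ('ret', 'e'),
]
optimized = run_to_fixed_point(program, [fold_constants, remove_dead])
print(optimized)
```

Each pass takes a list of instructions and returns a new list in the
same format, so passes can be added, dropped, or reordered by editing
the list handed to run_to_fixed_point.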

Since none of that told you want to know either <wink>, what do you want to
know?  Sounds like a legit topic for python-dev.



From dan@dberlin.org  Tue Feb 19 01:50:57 2002
From: dan@dberlin.org (Daniel Berlin)
Date: Mon, 18 Feb 2002 20:50:57 -0500 (EST)
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <20020218231217.GA5118@netthink.co.uk>
Message-ID: <Pine.LNX.4.44.0202182048030.10978-100000@dberlin.org>

On Mon, 18 Feb 2002, Simon Cozens wrote:
> Neil Schemenauer:
> > good reference on code generation.  I have the "New Dragon book" as well
> > as "Essentials of Programming Languages" but neither one seems to be
> > telling me what I want to know.  Any suggestions?
> 
> You could try Appel: Modern Compiler Implementation in {C,ML,Java}.

When you get to optimizations, you want Advanced Compiler Design and 
Implementation by Muchnick.

And/Or Building an Optimizing Compiler by Morgan.





From jacobs@penguin.theopalgroup.com  Tue Feb 19 01:51:48 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 18 Feb 2002 20:51:48 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <m3adu6zdcl.fsf@mira.informatik.hu-berlin.de>
Message-ID: <Pine.LNX.4.33.0202182023170.19182-100000@penguin.theopalgroup.com>

On 18 Feb 2002, Martin v. Loewis wrote:
> Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes:
>
> >   1) Should class instances explicitly/directly know all of their attributes?
>
> Since types are classes, this is the same question as "should type
> instances know all their attributes?" I don't think they should, in
> general: For example, there is no way to find out whether a string
> object has an interned pointer, and I don't think there should be.

I explicitly made note that my discussion of slots was in the context of
native new-style Python classes and not C types, even ones that can be used as
base classes for other new-style classes.  We will always need to hide C
implementation details behind Python objects, but we are not talking about
reflection on such hidden state.  My belief is that slots should be treated
as much as possible like normal attributes and not as "hidden object state".

> class Spam(object):
>   __slots__ = ('a','b')
>
> s = Spam()
> s.a = {}
> del Spam.a
>
> you lose access to s.a, even though it is still available (I guess it
> is actually a bug that cyclic garbage collection won't find cycles
> involving slots).

Not exactly -- the semantics are the same as regular attributes in this
case.  Continuing your example, you can then do

  s.a = 5

so access to the slot is not lost, only to the value.

> >   2) Should attribute access follow the same resolution order rules as
> >      methods?
>
> Yes, I think so.

Ouch!  This implies a great deal more than you may be thinking of.  For
example, do you really want to be able to do this:

    class Foo(object):
      __slots__ = ('a',)

    class Bar(Foo):
      __slots__ = ('a',)

    bar = Bar()
    bar.a = 1
    super(Bar, bar).a = 2
    print bar.a
    > 1

This violates the traditional Python idiom of having a flat namespace for
attributes, even in the presence of inheritance.  This has very profound
implications to Python semantics and performance.

> >   4) Should __slots__ be flat?
>
> Yes. They should also be a property of the type, not a member of the
> dict of the type, and they should be a tuple of member object, not a
> list of strings. It might be reasonable to call this property
> __members__.
>
> >        > ('c','d')           # current behavior
> >        or
> >        > ('a','b','c','d')   # alternate behavior
>
> Neither, nor; assuming you meant Bar to inherit from Foo, it should be
>
> (<member 'a' of 'Foo' objects>, <member 'b' of 'Foo' objects>,
>  <member 'c' of 'Bar' objects>, <member 'd' of 'Bar' objects>)

An interesting idea that I had not considered.  Currently the slot
descriptor objects do not directly expose the name or type of the object
except in the repr.  This could easily be fixed.

However this brings up another issue.  The essence of a slot (or, more
correctly, a slot descriptor) is to store an offset into a PyObject* that
represents a value within an object.  The name to which the slot is bound is
not the intrinsic and defining characteristic.  So it would be somewhat
illogical to mandate static name bindings to slots.  This supports the
notion of rebinding slot names during object inheritance (this is already
partially implemented), or storing the descriptor objects in a __slots__
tuple and providing an interface to query and reset the name binding for
each of them.

Comments?  Thoughts?

Thanks,
-Kevin


--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From tim.one@comcast.net  Tue Feb 19 04:57:22 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 18 Feb 2002 23:57:22 -0500
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <009201c1b8dc$00eb6fb0$ba03a8c0@activestate.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEIANNAA.tim.one@comcast.net>

[Jeff Hobbs]
> I guess I would need to get a better understanding of why it was
> designed with the sleep in the first place.  Martin mentioned
> that it allows a thread switch to occur, but a shorter sleep
> interval would have done the same.

I believe Martin was correct in large part.  The other part is that, without
a sleep at all, we would have a pure busy loop here, competing for cycles
non-stop with every process on the box.

About the length of the sleep, do note that Sleep(20) sleeps 20 milliseconds
here (not seconds), and that the sleep is skipped so long as
Tcl_DoOneEvent() says it's finding things to do.  IOW, Tcl gets all the
cycles it can eat so long as it says it's busy, and doesn't generally
wait more than about 0.02 seconds for another chance after it runs out of
work to do.
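
A minimal sketch of the loop shape described above, assuming Python 3.  This
is a model, not the _tkinter.c code: the deque stands in for Tcl's event
queue, and the max_idle_polls cutoff exists only so the demo terminates (the
real loop runs until the application quits).  The sleep is skipped whenever
an event was found, and taken only when the queue was empty:

```python
import time
from collections import deque


def mainloop(events, handle, idle_sleep=0.020, max_idle_polls=3,
             sleep=time.sleep):
    """Drain events without pausing; sleep briefly only when idle."""
    idle_polls = 0
    sleeps = 0
    while idle_polls < max_idle_polls:
        if events:
            handle(events.popleft())   # busy: go straight round again
            idle_polls = 0
        else:
            sleep(idle_sleep)          # idle: give up the CPU for ~20 ms
            sleeps += 1
            idle_polls += 1
    return sleeps


handled = []
# Pass a no-op sleep so the demo does not actually wait.
sleep_count = mainloop(deque([1, 2, 3]), handled.append,
                       sleep=lambda seconds: None)
```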

Back when 20 was first picked, machines were slow enough that an utterly
idle Tkinter app in the background still showed up as consuming a measurable
percentage of a CPU, thanks to this not-so-busy loop.  We could afford to
make the sleep shorter on faster boxes, but I'm not sure I buy the argument
that we're making Tcl/Tk look sluggish.  The reason we hate the loop is more
that it's a miserably ugly hack.



From martin@v.loewis.de  Tue Feb 19 08:43:21 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Feb 2002 09:43:21 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEIANNAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOEIANNAA.tim.one@comcast.net>
Message-ID: <m38z9pkgk6.fsf@mira.informatik.hu-berlin.de>

Tim Peters <tim.one@comcast.net> writes:

> I believe Martin was correct in large part.  The other part is that, without
> a sleep at all, we would have a pure busy loop here, competing for cycles
> non-stop with every process on the box.

Avoiding a busy wait is probably the #1 reason for the
sleep. The alternative way to avoid a busy wait would be to call
Tcl_DoOneEvent with TCL_ALL_EVENTS; however, once Tcl becomes idle,
this will block, depriving any other thread of the opportunity to
invoke Tcl.

Of Jeff's options, invoking Tcl_SetMaxBlockTime seemed to be most
promising: I want Tcl_DoOneEvent to return after 20ms, to give other
Tcl threads a chance. So I invented the patch

Index: _tkinter.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/_tkinter.c,v
retrieving revision 1.123
diff -u -r1.123 _tkinter.c
--- _tkinter.c	26 Jan 2002 20:21:50 -0000	1.123
+++ _tkinter.c	19 Feb 2002 08:34:17 -0000
@@ -1676,7 +1967,11 @@
 {
 	int threshold = 0;
 #ifdef WITH_THREAD
+	Tcl_Time blocktime = {0, 20000}; 
 	PyThreadState *tstate = PyThreadState_Get();
+	ENTER_TCL
+	Tcl_SetMaxBlockTime(&blocktime);
+	LEAVE_TCL
 #endif
 
 	if (!PyArg_ParseTuple(args, "|i:mainloop", &threshold))
@@ -1688,16 +1983,15 @@
 	       !errorInCmd)
 	{
 		int result;
+						    
 
 #ifdef WITH_THREAD
 		Py_BEGIN_ALLOW_THREADS
 		PyThread_acquire_lock(tcl_lock, 1);
 		tcl_tstate = tstate;
-		result = Tcl_DoOneEvent(TCL_DONT_WAIT);
+		result = Tcl_DoOneEvent(0);
 		tcl_tstate = NULL;
 		PyThread_release_lock(tcl_lock);
-		if (result == 0)
-			Sleep(20);
 		Py_END_ALLOW_THREADS
 #else
 		result = Tcl_DoOneEvent(0);

However, it does not work. The script

import Tkinter
import thread
import time

c = 0
l = Tkinter.Label(text = str(c))
l.pack()

def doit():
    global c
    while 1:
        c+=1
        l['text']=str(c)
        time.sleep(1)

thread.start_new(doit, ())
l.tk.mainloop()

ought to continuously increase the counter in the label (once a
second), but doesn't, at least not on Linux, using Tcl 8.3.3.  In the
strace output, it appears that it first does a select call with a
timeout, but that is followed by one without time limit before
Tcl_DoOneEvent returns.

Jeff, any ideas as to why this is happening?

Regards,
Martin


From mwh@python.net  Tue Feb 19 09:50:21 2002
From: mwh@python.net (Michael Hudson)
Date: 19 Feb 2002 09:50:21 +0000
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: martin@v.loewis.de's message of "19 Feb 2002 09:43:21 +0100"
References: <LNBBLJKPBEHFEDALKOLCOEIANNAA.tim.one@comcast.net> <m38z9pkgk6.fsf@mira.informatik.hu-berlin.de>
Message-ID: <2m3czx3in6.fsf@starship.python.net>

martin@v.loewis.de (Martin v. Loewis) writes:

[schniiiip]
> ought to continuously increase the counter in the label (once a
> second), but doesn't, at least not on Linux, using Tcl 8.3.3.  In the
> strace output, it appears that it first does a select call with a
> timeout, but that is followed by one without time limit before
> Tcl_DoOneEvent returns.
> 
> Jeff, any ideas as to why this is happening?

Well, at least this one is easy.  From the link Jeff posted:

  Information provided to Tcl_SetMaxBlockTime is only used for the
  next call to Tcl_WaitForEvent; it is discarded after
  Tcl_WaitForEvent returns. The next time an event wait is done each
  of the event sources' setup procedures will be called again, and
  they can specify new information for that event wait.

so you need to move the Tcl_SetMaxBlockTime inside the while loop.
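
A small Python 3 model of the one-shot behaviour quoted above.  The class
and method names are hypothetical and only mimic the documented semantics;
this is not the Tcl API.  A timeout of None stands for "block without
limit":

```python
class NotifierModel:
    """Toy model: a max block time applies to the next wait only."""

    def __init__(self):
        self._max_block = None

    def set_max_block_time(self, seconds):
        self._max_block = seconds

    def wait_for_event(self):
        # The stored limit is consumed by this wait and then discarded.
        timeout, self._max_block = self._max_block, None
        return timeout


def timeouts_when_set_once(iterations):
    notifier = NotifierModel()
    notifier.set_max_block_time(0.020)          # set before the loop
    return [notifier.wait_for_event() for _ in range(iterations)]


def timeouts_when_set_each_time(iterations):
    notifier = NotifierModel()
    seen = []
    for _ in range(iterations):
        notifier.set_max_block_time(0.020)      # set inside the loop
        seen.append(notifier.wait_for_event())
    return seen
```

In this model, setting the limit once bounds only the first wait, which
matches the unbounded select call seen in the strace output.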

It certainly looks to this novice as if 8.3 provides enough hooks to
do what we want, but...

Cheers,
M.

-- 
  Like most people, I don't always agree with the BDFL (especially
  when he wants to change things I've just written about in very 
  large books), ... 
         -- Mark Lutz, http://python.oreilly.com/news/python_0501.html


From simon@netthink.co.uk  Tue Feb 19 10:35:24 2002
From: simon@netthink.co.uk (Simon Cozens)
Date: Tue, 19 Feb 2002 10:35:24 +0000
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <Pine.LNX.4.44.0202182048030.10978-100000@dberlin.org>
References: <20020218231217.GA5118@netthink.co.uk> <Pine.LNX.4.44.0202182048030.10978-100000@dberlin.org>
Message-ID: <20020219103524.GB8249@netthink.co.uk>

Daniel Berlin:
> > You could try Appel: Modern Compiler Implementation in {C,ML,Java}.
> 
> When you get to optimizations, you want Advanced Compiler Design and 
> Implementation by Muchnick.
> 
> And/Or Building an Optimizing Compiler by Morgan.

Yeah. See also http://www.perldoc.com/readinglist.pl

Don't-be-put-off-by-the-domain-name-ly yrs,
Simon

-- 
You are in a maze of little twisting passages, all different.


From kjetilja@cs.uit.no  Tue Feb 19 13:04:44 2002
From: kjetilja@cs.uit.no (Kjetil Jacobsen)
Date: 19 Feb 2002 14:04:44 +0100
Subject: [Python-Dev] asyncore.poll behaviour
Message-ID: <1014123884.20195.93.camel@tac-ce1.cs.UiT.No>

hello,

in python2.2 the semantics for asyncore.poll (which uses the select()
system call) is different than for asyncore.poll3 (which uses the poll()
system call) when an EINTR exception occurs.  in asyncore.poll3, the
pollset is correctly reset to an empty list, but in asyncore.poll this
is not done, which in turn causes a lot of strange things to happen when
an EINTR occurs (spurious handler invocations and so on).

i've tried to upload the patch to sourceforge, but the patch manager has
not responded for me the last couple of days so i'm sending it here
instead.  

the fix is a simple one-liner which makes the semantics of asyncore.poll
and asyncore.poll3 similar:

*** /usr/local/lib/python2.2/asyncore.py        Wed Jan 30 15:51:00 2002
--- asyncore.py Wed Jan 30 16:19:28 2002
***************
*** 80,85 ****
--- 80,86 ----
          except select.error, err:
              if err[0] != EINTR:
                  raise
+             r, w, e = [], [], []
  
          if DEBUG:
              print r,w,e
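
The same idea as a Python 3 sketch, for illustration only.  poll_once and
its select_fn parameter are hypothetical names, not part of asyncore.  In
Python 3 an EINTR from select surfaces as InterruptedError, and since
Python 3.5 (PEP 475) select.select normally retries on EINTR by itself, so
the fake select functions below exist purely to exercise both paths:

```python
import select


def poll_once(rlist, wlist, xlist, timeout, select_fn=select.select):
    """Return the ready lists, or three empty lists if interrupted."""
    try:
        r, w, e = select_fn(rlist, wlist, xlist, timeout)
    except InterruptedError:
        # Without this reset, r, w and e would be unbound or stale and the
        # dispatch code that follows could fire spurious handlers.
        r, w, e = [], [], []
    return r, w, e


def interrupted_select(rlist, wlist, xlist, timeout):
    raise InterruptedError("simulated EINTR")


def ready_select(rlist, wlist, xlist, timeout):
    return list(rlist), [], []


after_interrupt = poll_once([3, 4], [5], [], 0.1, select_fn=interrupted_select)
after_success = poll_once([3, 4], [5], [], 0.1, select_fn=ready_select)
```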


btw, the asyncore.poll2 function does not seem to have either the
behaviour of asyncore.poll or asyncore.poll3 with respect to handling of
EINTR.  perhaps asyncore.poll2 should be removed altogether or just
remapped to asyncore.poll3?

regards,

	- kjetil


From dan@dberlin.org  Tue Feb 19 13:14:20 2002
From: dan@dberlin.org (Daniel Berlin)
Date: Tue, 19 Feb 2002 08:14:20 -0500 (EST)
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHDNNAA.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0202190803390.14186-100000@dberlin.org>

On Mon, 18 Feb 2002, Tim Peters wrote:

> [Neil Schemenauer]
> > I've been working on Skip's rattlesnake and have made some progress.
> 
> Cool!  I encourage this.  That and 2 dollars will buy you a cup of coffee.
> 
> > Right now I'm trying to hack the compiler package and am looking for a
> > good reference on code generation.  I have the "New Dragon book" as well
> > as "Essentials of Programming Languages" but neither one seems to be
> > telling me what I want to know.  Any suggestions?
> 
> Write it in Python because you won't be happy with your first 2 or 3
> attempts.
> 
> Compiler books have historically been very heavy on front end issues, in
> large part because the theory of parsing is well understood.  What
> relatively little you can find on back end issues tends to be sketchy and
> severely limited by the author's personal experience.  In large part this is
> because almost no interesting optimization problem can be solved in linear
> time (whether it's optimal instruction selection, optimal register
> assignment, optimal instruction ordering, ...), so real-life back ends are a
> mountain of idiosyncratic heuristics.

This is true.
In fact, it's actually worse than "can't be solved in linear time", it's 
"are currently thought/proved to be NP-hard".  For graph coloring register 
allocation algorithms, it's even worse (if you thought that was possible).  
You can't even approximate the chromatic number of the graph (IE, the 
number of colors, and therefore, registers, it would take to color it) 
to more than a certain degree in an absurd time bound.
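
Since the optimal coloring is out of reach, allocators fall back on
heuristics.  A minimal Python 3 sketch of one such heuristic, greedy
coloring in order of decreasing degree, on a made-up interference graph.
The function name and the graph are illustrative assumptions, and the result
is a valid coloring but not necessarily a minimal one:

```python
def greedy_color(interference):
    """Assign each node the lowest color not used by a colored neighbour."""
    colors = {}
    by_degree = sorted(interference, key=lambda n: len(interference[n]),
                       reverse=True)
    for node in by_degree:
        taken = {colors[n] for n in interference[node] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[node] = color
    return colors


# Nodes are values; an edge means the two values are live at the same time.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

assignment = greedy_color(graph)
registers_needed = len(set(assignment.values()))
```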

However, you've missed the middle end, where a lot of 
interesting optimizations *can* be done in linear time or n log n time, 
and where most people now concentrate their time.

On an SSA graph, you can do at least the following in linear time or n 
log n time:

Partial Redundancy Elimination
Conditional Constant Propagation
Copy propagation
Dead store elimination
Dead load elimination
Global code motion
Global value numbering
Store motion
Load motion
Dead code elimination
Lots of loop optimizations
Lots of memory hierarchy optimizations

I've ignored interprocedural optimizations, including various pointer 
analyses that are linear time or close to it, because it would be harder 
to apply them to python.

>
> Excellent advice that almost nobody follows <0.5 wink>:  choose a flexible
> intermediate representation, then structure all your transformations as
> independent passes, such that the output of every pass is acceptable as the
> input to every pass. 

Everyone tries to do this these days, actually.
At least, from my working on gcc and looking at the source to tons of 
compilers each year.
You really need more than one level of IR to do serious optimization.
The tradeoff between losing valuable info (such as array indexing operations) 
and the simplicity of writing optimization passes usually causes people to do 
some types of optimization on higher-level IR's (particularly loop 
optimizations), and other optimization passes on lower-level IR's.

GCC is moving towards 3 IR's, a language independent tree IR, a mid-level 
RTL, and the current low-level RTL.

>  Then keep each pass focused, as simple as possible
> (for example, if a transformation may create regions of dead code, don't
> dare try to clean it up in the same pass, or contort the logic even a little
> bit to try to avoid creating dead code -- instead let it create all the dead
> code it wants, and (re)invoke a "remove dead code" pass afterwards).

Usually you don't hit this problem inside a single pass like DCE, because 
they iterate until nothing changes.


> One compiler I read about (but didn't have the pleasure of using) actually
> allowed you to specify the sequence of back end transformations on the
> cmdline, using a regular expression notation where, e.g.
> 
>    (hoist dead)+
> 
> meant "run the hoist pass followed by the dead code removal pass, one or
> more times, until a fixed point is reached".
> 
> Since none of that told you what you want to know either <wink>, what do you
> want to know?  Sounds like a legit topic for python-dev.



From mwh@python.net  Tue Feb 19 14:10:40 2002
From: mwh@python.net (Michael Hudson)
Date: 19 Feb 2002 14:10:40 +0000
Subject: [Python-Dev] 2.2.1 issues
Message-ID: <2madu5zhnj.fsf@starship.python.net>

Well, we have the first 2.2 bugfix that isn't a no-brainer to port to
2.2.1.  This is to do with the 

[ #495401 ] Build troubles: --with-pymalloc

bug.

As far as I understand it, there were two problems.

1) with wide unicode characters, some function in unicodeobject.c to
   do with interpreting escape codes could write into memory it didn't
   own.

2) something to do with the handling of "unpaired high surrogates" in
   the utf-8 codec.

Were these problems related?  I think they got fixed at the same time,
but I may have gotten confused.

1) shouldn't be too much of an issue to get into 2.2.1 (there was some
contention about which fix performed better, but for 2.2.1 I don't
care too much).

2) is more troublesome, because to fix it properly breaks .pycs, in
turn because marshal uses the utf-8 codec to store unicode string
constants, and this is a no-no according to PEP 6.

Is it possible to worm around 2) by reconstructing valid strings from
the bad marshal data, or has information been lost?  How severe is the
bug?  Maybe it would be best to leave it unfixed in 2.2.1.

Basically, I guess I'm saying I'm too much of a unicode dunce to
understand all the issues involved in fixing these problems in 2.2, so
as unofficial bugfix-porter, I'd like someone else (Marc?  Martin?) to
port these particular fixes.  If the mechanics of fiddling with the
branch is too much, sending me patches is fine.

Cheers,
M.

-- 
  This is the fixed point problem again; since all some implementors
  do is implement the compiler and libraries for compiler writing, the
  language becomes good at writing compilers and not much else!
                                 -- Brian Rogoff, comp.lang.functional


From mal@lemburg.com  Tue Feb 19 14:34:24 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 19 Feb 2002 15:34:24 +0100
Subject: [Python-Dev] 2.2.1 issues
References: <2madu5zhnj.fsf@starship.python.net>
Message-ID: <3C726270.7D33E687@lemburg.com>

Michael Hudson wrote:
> 
> Well, we have the first 2.2 bugfix that isn't a no-brainer to port to
> 2.2.1.  This is to do with the
> 
> [ #495401 ] Build troubles: --with-pymalloc
> 
> bug.
> 
> As far as I understand it, there were two problems.
> 
> 1) with wide unicode characters, some function in unicodeobject.c to
>    do with interpreting escape codes could write into memory it didn't
>    own.
> 
> 2) something to do with the handling of "unpaired high surrogates" in
>    the utf-8 codec.
> 
> Were these problems related?  I think they got fixed at the same time,
> but I may have gotten confused.

Right. 1) was caused by 2). Both are fixed now.
 
> 1) shouldn't be too much of an issue to get into 2.2.1 (there was some
> contention about which fix performed better, but for 2.2.1 I don't
> care too much).
> 
> 2) is more troublesome, because to fix it properly breaks .pycs, in
> turn because marshal uses the utf-8 codec to store unicode string
> constants, and this is a no-no according to PEP 6.
> 
> Is it possible to worm around 2) by reconstructing valid strings from
> the bad marshal data, or has information been lost?  How severe is the
> bug?  Maybe it would be best to leave it unfixed in 2.2.1.

Well, I posted a message to python-dev or the checkins list about 
this (don't remember). The situation is basically like this:

In Python <= 2.2.0, you could write

u = u"\uD800"

in a .py file. The first time you import this file, Python will
create a .pyc file for it using the broken UTF-8 encoding. The
import will succeed. The second time you import the module,
Python will try to use the .pyc file. Now reading that file
in fails with a UnicodeError and Python also does not revert
to the .py file.

As a result, modules using unpaired surrogates in Unicode
literals are simply broken in Python <= 2.2.0.
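
The asymmetry can be reproduced in Python 3, as a sketch of the mechanism
rather than of the 2.2 marshal code.  Python 3's strict UTF-8 codec refuses
a lone surrogate in both directions, and the "surrogatepass" error handler
is used here to stand in for the old lenient encoder:

```python
lone = "\ud800"   # unpaired high surrogate

# Strict encoding rejects the lone surrogate outright.
try:
    lone.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False

# A lenient encoder writes three bytes anyway ...
raw = lone.encode("utf-8", "surrogatepass")

# ... which a strict decoder then refuses to read back.
try:
    raw.decode("utf-8")
    strict_decode_ok = True
except UnicodeDecodeError:
    strict_decode_ok = False

# Only a decoder that is lenient in the same way recovers the original.
recovered = raw.decode("utf-8", "surrogatepass")
```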

The problem with backporting this patch is that in order
for Python to properly recompile any broken module, the
magic will have to be changed. Question is whether this
is a reasonable thing to do in a patch level release...
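
For reference, a Python 3 sketch of how the magic number gates reuse of a
.pyc file.  It uses importlib.util.MAGIC_NUMBER and py_compile from the
modern standard library, which did not exist in this form in 2.2.  The
helper names are made up, and the file is corrupted deliberately to imitate
a .pyc written under a different magic:

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_has_current_magic(pyc_path):
    """True if the file starts with this interpreter's magic number."""
    with open(pyc_path, "rb") as f:
        return f.read(4) == importlib.util.MAGIC_NUMBER


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "mod.py")
        pyc = os.path.join(tmp, "mod.pyc")
        with open(src, "w") as f:
            f.write("x = 1\n")
        py_compile.compile(src, cfile=pyc, doraise=True)
        fresh = pyc_has_current_magic(pyc)

        # Overwrite the first four bytes to imitate an older magic number.
        with open(pyc, "r+b") as f:
            f.write(b"\x00\x00\x00\x00")
        stale = pyc_has_current_magic(pyc)
    return fresh, stale


fresh_matches, stale_matches = demo()
```

A mismatch is what makes the import machinery ignore the cached file and
recompile from the .py source, which is why bumping the magic would force
every broken .pyc to be regenerated automatically.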

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From nas@python.ca  Tue Feb 19 14:51:51 2002
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 19 Feb 2002 06:51:51 -0800
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <Pine.LNX.4.44.0202182048030.10978-100000@dberlin.org>; from dan@dberlin.org on Mon, Feb 18, 2002 at 08:50:57PM -0500
References: <20020218231217.GA5118@netthink.co.uk> <Pine.LNX.4.44.0202182048030.10978-100000@dberlin.org>
Message-ID: <20020219065151.A28722@glacier.arctrix.com>

Daniel Berlin wrote:
> When you get to optimizations, you want Advanced Compiler Design and 
> Implementation by Muchnick.

Right now I'm not planning to do any optimizations (except perhaps
limiting the number of registers used).

  Neil


From fdrake@acm.org  Tue Feb 19 15:23:07 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 19 Feb 2002 10:23:07 -0500
Subject: [Python-Dev] 2.2.1 issues
In-Reply-To: <3C726270.7D33E687@lemburg.com>
References: <2madu5zhnj.fsf@starship.python.net>
 <3C726270.7D33E687@lemburg.com>
Message-ID: <15474.28123.180241.360278@grendel.zope.com>

M.-A. Lemburg writes:
 > The problem with backporting this patch is that in order
 > for Python to properly recompile any broken module, the
 > magic will have to be changed. Question is whether this
 > is a reasonable thing to do in a patch level release...

Guido can rule as he sees fit, but I don't see any reason *not* to
change the magic number.  This seems like a pretty important fix to
me.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From dan@dberlin.org  Tue Feb 19 15:57:59 2002
From: dan@dberlin.org (Daniel Berlin)
Date: Tue, 19 Feb 2002 10:57:59 -0500
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <20020219065151.A28722@glacier.arctrix.com>
Message-ID: <72820FED-2551-11D6-9D9B-000393575BCC@dberlin.org>

On Tuesday, February 19, 2002, at 09:51  AM, Neil Schemenauer wrote:

> Daniel Berlin wrote:
>> When you get to optimizations, you want Advanced Compiler Design and
>> Implementation by Muchnick.
>
> Right now I'm not planning to do any optimizations (except perhaps
> limiting the number of registers used).
>
This is, of course, a tricky optimization to do.
Limiting registers used involves splitting live ranges at the right 
places, etc.
--Dan



From jacobs@penguin.theopalgroup.com  Tue Feb 19 16:01:26 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Tue, 19 Feb 2002 11:01:26 -0500 (EST)
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <72820FED-2551-11D6-9D9B-000393575BCC@dberlin.org>
Message-ID: <Pine.LNX.4.33.0202191059090.23215-100000@penguin.theopalgroup.com>

On Tue, 19 Feb 2002, Daniel Berlin wrote:
> On Tuesday, February 19, 2002, at 09:51  AM, Neil Schemenauer wrote:
>
> > Daniel Berlin wrote:
> >> When you get to optimizations, you want Advanced Compiler Design and
> >> Implementation by Muchnick.
> >
> > Right now I'm not planning to do any optimizations (except perhaps
> > limiting the number of registers used).
> >
> This is, of course, a tricky optimization to do.
> Limiting registers used involves splitting live ranges at the right
> places, etc.

Why limit the number of registers at all?  So long as they fit in L1 cache
you are golden.  If not, no great loss.  Of course, this does mean that you
will want to have the ability to heap-allocate large register files, though
I suspect that frame objects do this already for fast locals (of course, I
haven't looked).

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From dan@dberlin.org  Tue Feb 19 16:37:22 2002
From: dan@dberlin.org (Daniel Berlin)
Date: Tue, 19 Feb 2002 11:37:22 -0500
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <Pine.LNX.4.33.0202191059090.23215-100000@penguin.theopalgroup.com>
Message-ID: <F304B40A-2556-11D6-9D9B-000393575BCC@dberlin.org>

On Tuesday, February 19, 2002, at 11:01  AM, Kevin Jacobs wrote:

> On Tue, 19 Feb 2002, Daniel Berlin wrote:
>> On Tuesday, February 19, 2002, at 09:51  AM, Neil Schemenauer wrote:
>>
>>> Daniel Berlin wrote:
>>>> When you get to optimizations, you want Advanced Compiler Design and
>>>> Implementation by Muchnick.
>>>
>>> Right now I'm not planning to do any optimizations (except perhaps
>>> limiting the number of registers used).
>>>
>> This is, of course, a tricky optimization to do.
>> Limiting registers used involves splitting live ranges at the right
>> places, etc.
>
> Why limit the number of registers at all?  So long as they fit in L1 
> cache
> you are golden.

Err, what makes you think this?
The largest problem on architectures like x86 is the number of registers.
You end up with about 4 usable registers. (hardware register renaming 
only helps eliminate instruction dependencies, before someone mentions 
it).
Performance quickly drops when you start spilling registers to the stack.

In fact, I've seen multiple SPEC regressions of 15% or more caused by a 
single extra spilled register.
Why?
Because you have to save it and reload it multiple times.
These *kill* pipelines, and instruction scheduling.

It's also *much* harder to model the cache hierarchy properly so that 
you can make sure they'd fit in the l1 cache, than it is to make sure 
they stay in registers where needed in the first place.


Try taking a performance critical loop entirely in registers, and change 
it to save to and load from memory into a register on every iteration.
See how much slower it gets.


--Dan



From mwh@python.net  Tue Feb 19 16:50:04 2002
From: mwh@python.net (Michael Hudson)
Date: 19 Feb 2002 16:50:04 +0000
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: Daniel Berlin's message of "Tue, 19 Feb 2002 11:37:22 -0500"
References: <F304B40A-2556-11D6-9D9B-000393575BCC@dberlin.org>
Message-ID: <2mbselpgar.fsf@starship.python.net>

Daniel Berlin <dan@dberlin.org> writes:

> On Tuesday, February 19, 2002, at 11:01  AM, Kevin Jacobs wrote:
> 
> > On Tue, 19 Feb 2002, Daniel Berlin wrote:
> >> On Tuesday, February 19, 2002, at 09:51  AM, Neil Schemenauer wrote:
> >>
> >>> Daniel Berlin wrote:
> >>>> When you get to optimizations, you want Advanced Compiler Design and
> >>>> Implementation by Muchnick.
> >>>
> >>> Right now I'm not planning to do any optimizations (except perhaps
> >>> limiting the number of registers used).
> >>>
> >> This is, of course, a tricky optimization to do.
> >> Limiting registers used involves splitting live ranges at the right
> >> places, etc.
> >
> > Why limit the number of registers at all?  So long as they fit in L1 
> > cache
> > you are golden.
> 
> Err, what makes you think this?
> The largest problem on architectures like x86 is the number of registers.
> You end up with about 4 usable registers. (hardware register renaming 
> only helps eliminate instruction dependencies, before someone mentions 
> it).
> Performance quickly drops when you start spilling registers to the stack.

I think you misunderstand what Rattlesnake is; AIUI it is (or
will/intends to be) a register based VM for Python replacing the
current stack based VM -- I think gcc still gets to decide which x86
registers to use...
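
To make the distinction concrete, a toy Python 3 sketch of the two VM styles
evaluating a + b * c.  The opcodes and both interpreters are invented for
illustration; they are not Rattlesnake's instruction set and not CPython's
bytecode.  The "registers" here are just slots in a Python list:

```python
def run_stack(code, env):
    """Stack VM: operands are pushed and popped implicitly."""
    stack = []
    for op, arg in code:
        if op == "LOAD":
            stack.append(env[arg])
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
    return stack.pop()


def run_register(code, registers):
    """Register VM: each instruction names its operands and its target."""
    regs = list(registers)
    for op, dst, src1, src2 in code:
        if op == "MUL":
            regs[dst] = regs[src1] * regs[src2]
        elif op == "ADD":
            regs[dst] = regs[src1] + regs[src2]
        elif op == "RET":
            return regs[dst]


stack_code = [("LOAD", "a"), ("LOAD", "b"), ("LOAD", "c"),
              ("MUL", None), ("ADD", None)]
register_code = [("MUL", 3, 1, 2),      # r3 = r1 * r2
                 ("ADD", 3, 0, 3),      # r3 = r0 + r3
                 ("RET", 3, None, None)]

stack_result = run_stack(stack_code, {"a": 2, "b": 3, "c": 4})
register_result = run_register(register_code, [2, 3, 4, 0])
```

The register form needs fewer instructions because locals already live in
numbered slots, which is why the size of the register file is a VM design
question rather than an x86 one.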

Cheers,
M.

-- 
  ARTHUR:  The Ravenous Bugblatter Beast of Traal ... is it safe?
    FORD:  Oh yes, it's perfectly safe ... it's just us who are in 
           trouble.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 6


From jeff@hobbs.org  Tue Feb 19 16:53:53 2002
From: jeff@hobbs.org (Jeffrey Hobbs)
Date: Tue, 19 Feb 2002 08:53:53 -0800
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <m38z9pkgk6.fsf@mira.informatik.hu-berlin.de>
Message-ID: <AEEJJLFGECCKDONDOCKEGELBDJAA.jeffh@activestate.com>

Martin,

	...
> Of Jeff's options, invoking Tcl_SetMaxBlockTime seemed to be most
> promising: I want Tcl_DoOneEvent to return after 20ms, to give other
> Tcl threads a chance. So I invented the patch
	...
> ought to continuously increase the counter in the label (once a
> second), but doesn't, at least not on Linux, using Tcl 8.3.3.  In the
> strace output, it appears that it first does a select call with a
> timeout, but that is followed by one without time limit before
> Tcl_DoOneEvent returns.

IIRC, Tcl_SetMaxBlockTime is a one-shot call - it sets the next
block time, not all block times.  I'm sure there was a reason for
this, but that was implemented before I was a core guy.  Anyway,
I think you just need to try:

-		result = Tcl_DoOneEvent(TCL_DONT_WAIT);
+		Tcl_SetMaxBlockTime(&blocktime);
+		result = Tcl_DoOneEvent(0);

and see if that satisfies the need for responsiveness as well as
not blocking.

Thanks,

Jeff


From aahz@rahul.net  Tue Feb 19 17:14:47 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Tue, 19 Feb 2002 09:14:47 -0800 (PST)
Subject: [Python-Dev] 2.2.1 issues
In-Reply-To: <15474.28123.180241.360278@grendel.zope.com> from "Fred L. Drake, Jr." at Feb 19, 2002 10:23:07 AM
Message-ID: <20020219171447.EBA84E8C8@waltz.rahul.net>

Fred L. Drake, Jr. wrote:
> M.-A. Lemburg writes:
>>
>> The problem with backporting this patch is that in order
>> for Python to properly recompile any broken module, the
>> magic will have to be changed. Question is whether this
>> is a reasonable thing to do in a patch level release...
> 
> Guido can rule as he sees fit, but I don't see any reason *not* to
> change the magic number.  This seems like a pretty important fix to
> me.

The question is not whether it's an important fix, but whether the fix
and its consequences are important enough to warrant changing the magic
number.  It's obviously possible for people to regen their .pyc files by
deleting them, so I think we should wait for Guido to say "yes" before
bumping the magic number, given that one of the cardinal points of the
new bugfix process is that .pyc files will not be regenerated due to a
bugfix release.

Note carefully that I do agree that it's a serious enough issue to
consider the possibility of breaking that rule, but I think we can't
afford to pull the trigger without Guido's specific buy-in.  We'll also
need to think about how we're going to market it if we do bump the magic
number.

To me, then, the proper question is, "Is this an issue where *automatic*
regeneration of .pyc files is sufficiently important?"

(I don't know enough to have an opinion myself ;-), but I'll point out
that the import failure means that at least it isn't a silent failure --
which I would absolutely agree needs a magic number bump.)
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From pedroni@inf.ethz.ch  Tue Feb 19 17:29:55 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Tue, 19 Feb 2002 18:29:55 +0100
Subject: [Python-Dev] Meta-reflections
References: <Pine.LNX.4.33.0202182023170.19182-100000@penguin.theopalgroup.com>
Message-ID: <017501c1b96b$0c4e93c0$6d94fea9@newmexico>

Hi.

From: Kevin Jacobs <jacobs@penguin.theopalgroup.com>
> On 18 Feb 2002, Martin v. Loewis wrote:
> > Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes:
> > >   2) Should attribute access follow the same resolution order rules as
> > >      methods?
> >
> > Yes, I think so.
> 
> Ouch!  This implies a great deal more than you may be thinking of.  For
> example, do you really want to be able to do this:
> 
>     class Foo(object):
>       __slots__ = ('a',)
> 
>     class Bar(Foo):
>       __slots__ = ('a',)
> 
>     bar = Bar()
>     bar.a = 1
>     super(Bar, bar).a = 2
>     print bar.a
>     > 1
> 
> This violates the traditional Python idiom of having a flat namespace for
> attributes, even in the presence of inheritance.  This has very profound
> implications to Python semantics and performance.
> 

Probably I have not followed the discussion closely enough.
The variant with super does not work but

>>> bar=Bar()
>>> bar.a=1
>>> Foo.a.__set__(bar,2)
>>> bar.a
1
>>> Foo.a.__get__(bar)
2
>>>

works. Slots are already not flat.
They have basically a similar behavior to fields
in JVM object model (and I presume in .NET).

Given that implementing slots with fields is
one of the possibilities for Jython (and some
possible Python over .NET), with indeed
some practical advantages [Btw for the
moment I don't see obstacles to such an 
approach but I have not considered
all the details], it is probably
reasonable to keep things as they are.

Consider also:

>>> class Goo(object):
...  __slots__ = ('a',)
...
>>> class Bar(Goo,Foo): pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: multiple bases have instance lay-out conflict

that helps and reinforces that model.

Samuele.








From jacobs@penguin.theopalgroup.com  Tue Feb 19 17:53:03 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Tue, 19 Feb 2002 12:53:03 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <017501c1b96b$0c4e93c0$6d94fea9@newmexico>
Message-ID: <Pine.LNX.4.33.0202191245190.24116-100000@penguin.theopalgroup.com>

On Tue, 19 Feb 2002, Samuele Pedroni wrote:
> Slots are already not flat.
> They have basically a similar behavior to fields
> in JVM object model (and I presume in .NET).

I agree, but do we want slots to be non-flat?  It goes very much against the
traditional Python idiom.  In my opinion, I believe that slots should have
exactly the same semantics as normal instance attributes, except for
how/where they are allocated.

> Given that implementing slots with fields is one of the possibility for
> Jython

This is possible for flat slot namespaces too; just remap new slots to
existing ones when they overlap, instead of allocating a new one.
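
A rough sketch of that remapping idea, written for modern Python 3 and not taken from any actual patch: the metaclass name FlatSlots is hypothetical, and it simply drops slot names that a base class has already declared, so the subclass reuses the inherited descriptor instead of allocating a second slot.

```python
class FlatSlots(type):
    """Hypothetical metaclass: drop slot names already declared by a base."""

    def __new__(mcls, name, bases, namespace):
        if '__slots__' in namespace:
            declared = namespace['__slots__']
            if isinstance(declared, str):
                declared = (declared,)
            # Collect every slot name declared anywhere in the bases' MROs.
            inherited = set()
            for base in bases:
                for klass in base.__mro__:
                    slots = klass.__dict__.get('__slots__', ())
                    if isinstance(slots, str):
                        slots = (slots,)
                    inherited.update(slots)
            # Keep only the names that are genuinely new in this class.
            namespace['__slots__'] = tuple(
                s for s in declared if s not in inherited)
        return super().__new__(mcls, name, bases, namespace)


class Foo(metaclass=FlatSlots):
    __slots__ = ('a',)


class Bar(Foo):
    __slots__ = ('a', 'b')   # 'a' is remapped to Foo's slot


bar = Bar()
bar.a = 1
```

With this in place Bar.a and Foo.a are the same descriptor, so there is only one storage location for "a" per instance.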

> Consider also:
>
> >>> class Goo(object):
> ...  __slots__ = ('a',)
> ...
> >>> class Bar(Goo,Foo): pass
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: multiple bases have instance lay-out conflict
>
> that helps and reinforces that model.

I'll contend that the current implementation is flawed for this and several
other reasons I've stated in my previous e-mails.  Of course, we're waiting
to hear back from Guido when he returns, since his opinion is infinitely
more important than mine in this matter.

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From mal@lemburg.com  Tue Feb 19 18:00:24 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 19 Feb 2002 19:00:24 +0100
Subject: [Python-Dev] 2.2.1 issues
References: <20020219171447.EBA84E8C8@waltz.rahul.net>
Message-ID: <3C7292B8.89E43335@lemburg.com>

Aahz Maruch wrote:
> 
> Fred L. Drake, Jr. wrote:
> > M.-A. Lemburg writes:
> >>
> >> The problem with backporting this patch is that in order
> >> for Python to properly recompile any broken module, the
> >> magic will have to be changed. Question is whether this
> >> is a reasonable thing to do in a patch level release...
> >
> > Guido can rule as he sees fit, but I don't see any reason *not* to
> > change the magic number.  This seems like a pretty important fix to
> > me.
> 
> The question is not whether it's an important fix, but whether the fix
> and its consequences are important enough to warrant changing the magic
> number.  It's obviously possible for people to regen their .pyc files by
> deleting them, so I think we should wait for Guido to say "yes" before
> bumping the magic number, given that one of the cardinal points of the
> new bugfix process is that .pyc files will not be regenerated due to a
> bugfix release.

We could of course ship the patch level release with the same
magic number. Modules that haven't worked before will then start
to work.

Note that we haven't had *any* bug report directly related to 
this, so it's likely that no one has actually hit this bug in
practice.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From nas@python.ca  Tue Feb 19 18:20:53 2002
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 19 Feb 2002 10:20:53 -0800
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <F304B40A-2556-11D6-9D9B-000393575BCC@dberlin.org>; from dan@dberlin.org on Tue, Feb 19, 2002 at 11:37:22AM -0500
References: <Pine.LNX.4.33.0202191059090.23215-100000@penguin.theopalgroup.com> <F304B40A-2556-11D6-9D9B-000393575BCC@dberlin.org>
Message-ID: <20020219102053.A29414@glacier.arctrix.com>

Daniel Berlin wrote:
> The largest problem on architectures like x86 is the number of
> registers.  You end up with about 4 usable registers. (hardware
> register renaming only helps eliminate instruction dependencies,
> before someone mentions it).  Performance quickly drops when you start
> spilling registers to the stack.

I'm not going to be using hardware registers.  Bytecode will be
generated to run on a virtual machine.  I can use as many registers as I
want.  However, I suspect it would be better to reuse registers rather
than have one for every intermediate result.

  Neil


From pedroni@inf.ethz.ch  Tue Feb 19 18:25:26 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Tue, 19 Feb 2002 19:25:26 +0100
Subject: [Python-Dev] Meta-reflections
References: <Pine.LNX.4.33.0202191245190.24116-100000@penguin.theopalgroup.com>
Message-ID: <01a001c1b972$cdf9e040$6d94fea9@newmexico>

From: Kevin Jacobs <jacobs@penguin.theopalgroup.com>
> On Tue, 19 Feb 2002, Samuele Pedroni wrote:
> > Slots are already not flat.
> > They have basically a similar behavior to fields
> > in JVM object model (and I presume in .NET).
> 
> I agree, but do we want slots to be non-flat?  It goes very much against the
> traditional Python idiom.  In my opinion, I believe that slots should have
> exactly the same semantics as normal instance attributes, except for
> how/where they are allocated.

Personally I don't expect slots to behave like attributes. I mean,
the different naming is a hint.

> > Given that implementing slots with fields is one of the possibility for
> > Jython
> 
> This is possible for flat slot namespaces too; just remap new slots to
> existing ones when they overlap, instead of allocating a new one.

Yes, but from the POV of fields this is less natural.
There's a trade-off issue here.

> > Consider also:
> >
> > >>> class Goo(object):
> > ...  __slots__ = ('a',)
> > ...
> > >>> class Bar(Goo,Foo): pass
> > ...
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> > TypeError: multiple bases have instance lay-out conflict
> >
> > that helps and reinforces that model.
> 
> I'll contend that the current implementation is flawed for this and several
> other reasons I've stated in my previous e-mails.  Of course, we're waiting
> to hear back from Guido when he returns, since his opinion is infinitely
> more important than mine in this matter.
> 

It is not flawed, it is just single-inheritance-of-struct-like-layout-based.
I'm fine with that.

To be honest I would find it very annoying if what we are about
to implement in Jython 2.2 had to be radically changed for Jython 2.3.
We do not have the human resources
to happily play that kind of game.

I hope and presume that Guido did know what he was designing,
and I had that impression too.
OTOH I agree that pickle should work for new-style classes too.

regards, Samuele Pedroni.



From jacobs@penguin.theopalgroup.com  Tue Feb 19 17:23:40 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Tue, 19 Feb 2002 12:23:40 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <01a001c1b972$cdf9e040$6d94fea9@newmexico>
Message-ID: <Pine.LNX.4.33.0202191157150.1210-100000@penguin.theopalgroup.com>

On Tue, 19 Feb 2002, Samuele Pedroni wrote:
> Personally I don't expect slots to behave like attributes. I mean,
> the different naming is a hint.

For me, slot declarations are a hint that certain attributes should be
allocated 'statically' versus the normal Python 'dynamic' attribute
allocation.  In virtually all other ways I expect them to act like
attributes.  The question is should the static allocation introduce all the
complex scoping rules that come with Java fields or C++ instance variables.
If we go by the "principle of least surprise", it seems much better to keep
the normal Python attribute rules than those of Java or C++.

> > > Given that implementing slots with fields is one of the possibility for
> > > Jython
> >
> > This is possible for flat slot namespaces too; just remap new slots to
> > existing ones when they overlap, instead of allocating a new one.
>
> Yes, but from the POV of fields this is less natural.
> There's a trade-off issue here.

Less natural for Java maybe, but not for Python.

> > I'll contend that the current implementation is flawed for this and several
> > other reasons I've stated in my previous e-mails.  Of course, we're waiting
> > to hear back from Guido when he returns, since his opinion is infinitely
> > more important than mine in this matter.
>
> It is not flawed, it is just single-inheritance-of-struct-like-layout-based.
> I'm fine with that.

Please read some of my earlier messages.  There are other 'warts'.

> To be honest I would find very annoying that what we are about
> to implement in Jython 2.2 should be somehow radically changed for Jython 2.3.
> We have not the necessary amount of human resources
> to happily play that kind of game.

Well, we are dealing with an implementation that is not documented _at all_.
So, in virtually all respects, Jython 2.2 could ignore their existence
totally and still function correctly.  I hope that you will be pleased by
the in-depth discussions on this topic, since it will likely lead to the
formulation of refined documentation for many of these very fuzzily defined
features.  As an implementer, that kind of clarification can be invaluable
since it means you don't have to guess at the semantics and have to change
it later.

> I hope and presume that Guido did know what he was designing,
> and I had that impression too.
> OTOH I agree that pickle should work for new-style classes too.

He knew what he was designing, but was focused on achieving other goals with
this infrastructure (like class/type unification).  I have the feeling that
slots were more of an experiment than anything else.  Don't get me wrong --
they are insanely useful even in their current state.  On the other hand, I
don't think they're ready for prime-time until we smooth over the picky
semantic issues relating to slot overloading, reflection and renaming. Just
look at the Python standard library -- you'll notice that slots are not used
anywhere.  I predict that we will be using them extensively, especially in
the standard library, as soon as they are deemed ready for prime-time.

Best regards,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From tismer@tismer.com  Tue Feb 19 19:15:31 2002
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 19 Feb 2002 20:15:31 +0100
Subject: [Python-Dev] Stackless Design Q.
Message-ID: <3C72A453.7080905@tismer.com>

Hi friends,

my tasklets are flying.

Now I am at the point where I'm worst suited for:
Design an interface.

First of all what it is:
We have a "tasklet", a tiny object around chains of frames,
with some additional structure that keeps C stack snippets.

Ignore the details, tasklets are simply the heart of Stackless'
coro/uthread/anything.

The tstate maintains two lists of tasklets.
One is the list of all running (or better "runnable"?) tasklets.
These tasklets are not in "value state", they don't want to
transmit a value. They can be scheduled as you like it.

The other list keeps record of blocked tasklets. These are
tasklets which are in "value state", they are waiting to
do some transmission.

Whenever a tasklet calls another tasklet's "transfer"
method for data transfer, the following happens:
- the other tasklet is checked to be in blocked state.
- the tasklet is removed from the runnable list,
- it is blocked
- data is transferred
- the other tasklet is unblocked
- the other tasklet is inserted into the runnables
- the other tasklet is run
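
The steps above can be modelled as pure bookkeeping. This is only a sketch in modern Python 3: the class names Tasklet and TState and all attribute names are invented for illustration, and no real frame switching takes place.

```python
class Tasklet:
    """Bookkeeping-only stand-in for a tasklet (no frames, no C stack)."""

    def __init__(self, name, blocked=False):
        self.name = name
        self.blocked = blocked
        self.received = None   # last value handed over by a transfer


class TState:
    """Hypothetical model of the two tasklet lists kept by the tstate."""

    def __init__(self):
        self.runnable = []
        self.blocked = []
        self.current = None

    def add(self, t):
        (self.blocked if t.blocked else self.runnable).append(t)

    def transfer(self, target, value):
        caller = self.current
        # the other tasklet is checked to be in blocked state
        if not target.blocked:
            raise RuntimeError("target tasklet is not blocked")
        # the caller is removed from the runnable list and blocked
        self.runnable.remove(caller)
        caller.blocked = True
        self.blocked.append(caller)
        # data is transferred
        target.received = value
        # the other tasklet is unblocked, inserted into the runnables, and run
        self.blocked.remove(target)
        target.blocked = False
        self.runnable.append(target)
        self.current = target
```

In this model "run" only means becoming the current tasklet; a real implementation would switch frame chains at that point.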

The "transfer" word comes from Modula II. It implements
coroutine behavior.

Then, I have something like the ICON suspend, and I gave
it the name "suspend" for now, since yield is in use.
Suspend is a one-sided thing, and it is also needed
to initiate a blocked state at all.

There is a "client" variable that holds a reference
to the tasklet that called "transfer".

When suspend is called, then we have two cases:
1) There is another tasklet in the client variable.
    We take the client and call client.transfer(data)
2) There is no client set already.
    We go into blocked state and wait, until some
    tasklet transfers to us.

What suspend does is yielding (like in generators),
but also initial blocking, providing targets for
transfer to jump to.

What I'm missing is name and concept of an opposite
method: Finish the data transfer, but leave both
partners in runnable state.

OK, here is a summary of what I have.
Please tell me what you think, and what you'd like
to change.

stackless module:
-----------------

schedule()
     switch to the next task in the runnable list.

taskoutlet(func)
     call it with a function, and it generates a generator
     for tasklets.

ret = suspend(value)
    initiates data exchange, see above. The current tasklet
    gets blocked. If client is set already, a transfer is
    performed.

Example:

def demo(n):
     print n

factory = taskoutlet(demo)

t = factory(42)   # this is now a tasklet with bound arguments.

tasklet methods:
----------------

t.insert()
     inserts t into the according tasklet ring at the "end".
     if t is in a ring already, it is removed first.
     The ring is either "runnables" or "blocked", depending
     on t's state.

t.remove()
     removes t from whatever ring it is in.

t.run()
     inserts t into runnables and switches to it immediately.

ret = t.transfer(value)
     unblocks t, transfers data, blocks myself.

*Wanted*

Again, What is the name of this function/method, and
where does it belong?
It
- unblocks another tasklet
- transfers data
- does not block anything
- schedules the other tasklet

Or is this a bad design at all?
If so, please help me as well.

Thanks in advance - ciao - chris


p.s.: Not all of the above is implemented already.

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net/
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
      where do you want to jump today?   http://www.stackless.com/




From tim.one@comcast.net  Tue Feb 19 20:01:28 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 19 Feb 2002 15:01:28 -0500
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <20020219102053.A29414@glacier.arctrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEKHNNAA.tim.one@comcast.net>

[Neil Schemenauer]
> I'm not going to be using hardware registers.  Bytecode will be
> generated to run on a virtual machine.  I can use as many registers as I
> want.  However, I suspect it would be better to reuse registers rather
> than have one for every intermediate result.

I think your intuition there is darned good.

Within a basic block, "the obvious" greedy scheme is provably optimal wrt
minimizing the # of temp registers needed by the block:  start the block
with an initially empty set of temp registers.  March down the instructions
one at a time.  For each input temp register whose contained value's last
use is in the current instruction, return that temp register to the set of
free temp registers.  Then for each output temp register needed by the
current instruction, take one (any) from the set of free temp registers;
else if the set is empty invent a new temp register out of thin air (bumping
a block high-water mark is an easy way to do this).
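
A minimal sketch of that greedy scheme in modern Python 3. The instruction format is an assumption made for illustration: each instruction is a (dest, sources) pair over symbolic temp names, each temp is assigned exactly once, and names never assigned in the block (such as locals) are ignored.

```python
def allocate_temps(block):
    """Greedy temp-register assignment for one basic block.

    `block` is a list of (dest, sources) tuples.  Returns a mapping from
    temp name to register number plus the block's high-water mark.
    """
    # Record the index of the last instruction that reads each name.
    last_use = {}
    for i, (dest, sources) in enumerate(block):
        for s in sources:
            last_use[s] = i

    free = []            # registers currently available for reuse
    assignment = {}      # temp name -> register number
    high_water = 0       # number of registers invented so far

    for i, (dest, sources) in enumerate(block):
        # Release input temps whose last use is this instruction.
        for s in dict.fromkeys(sources):      # de-duplicate, keep order
            if s in assignment and last_use[s] == i:
                free.append(assignment[s])
        # Give the output a free register, else invent a new one.
        if dest is not None:
            if free:
                reg = free.pop()
            else:
                reg = high_water
                high_water += 1
            assignment[dest] = reg
    return assignment, high_water


# (a + b) * (c + d): two temps are live at once, so two registers suffice.
block = [('t1', ('a', 'b')), ('t2', ('c', 'd')), ('t3', ('t1', 't2'))]
assignment, needed = allocate_temps(block)
```

Inputs are released before outputs are assigned, so an instruction's result can land in a register that one of its own operands just vacated.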

That part is easy.  What's hard (too expensive to be practical, in general)
is provably minimizing the overall number of temps *across* basic blocks.
Still, look up "graph coloring register assignment" on google and you'll
find lots of effective heuristics.  For a first cut, punt:  just store
everything still alive into memory at the end of a basic block.  If you're
retaining Rattlesnake's idea of treating "the register file" as a superset
of the local vrbl vector, mounds of such "stores" will be nops.

What's also hard is selecting instructions in such a way as to minimize the
number of temp registers needed, ditto ordering instructions toward that
end.  When you think about those, you realize that minimizing the number of
temps is actually a tradeoff, not an absolute good, both affecting and
affected by other decisions.  A mountain of idiosyncratic heuristics follows
soon after <wink>.

but-you-don't-have-to-solve-everything-at-the-start-ly y'rs  - tim



From martin@v.loewis.de  Tue Feb 19 20:22:16 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Feb 2002 21:22:16 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <AEEJJLFGECCKDONDOCKEGELBDJAA.jeffh@activestate.com>
References: <AEEJJLFGECCKDONDOCKEGELBDJAA.jeffh@activestate.com>
Message-ID: <m3g03x8bnr.fsf@mira.informatik.hu-berlin.de>

"Jeffrey Hobbs" <jeffh@ActiveState.com> writes:

> IIRC, Tcl_SetMaxBlockTime is a one-shot call - it sets the next
> block time, not all block times.  I'm sure there was a reason for
> this, but that was implemented before I was a core guy.  Anyway,
> I think you just need to try:
> 
> -		result = Tcl_DoOneEvent(TCL_DONT_WAIT);
> +		Tcl_SetMaxBlockTime(&blocktime);
> +		result = Tcl_DoOneEvent(0);
> 
> and see if that satisfies the need for responsiveness as well as
> not blocking.

Thanks, but that won't help. Tcl still performs a blocking select.
Studying the Tcl source, it seems that the SetMaxBlockTime feature is
broken in Tcl 8.3. DoOneEvent has

	/*
	 * If TCL_DONT_WAIT is set, be sure to poll rather than
	 * blocking, otherwise reset the block time to infinity.
	 */

	if (flags & TCL_DONT_WAIT) {
	    tsdPtr->blockTime.sec = 0;
	    tsdPtr->blockTime.usec = 0;
	    tsdPtr->blockTimeSet = 1;
	} else {
	    tsdPtr->blockTimeSet = 0;
	}

So if TCL_DONT_WAIT is set, the blocktime is 0, otherwise, it is
considered not set. It then goes on doing

	if ((flags & TCL_DONT_WAIT) || tsdPtr->blockTimeSet) {
	    timePtr = &tsdPtr->blockTime;
	} else {
	    timePtr = NULL;
	}
	result = Tcl_WaitForEvent(timePtr);

So if TCL_DONT_WAIT isn't set, it will block; if it is, it will
busy-wait. Looks like we lose either way.

In-between, it invokes the setupProcs of each input source, so that
they can set a maxblocktime, but I don't think _tkinter should hack
itself into that process.

So I don't see a solution on the path of changing how Tcl invokes
select.

About thread-safety: Is Tcl 8.3 thread-safe in its standard
installation, so that we can just use it from multiple threads? If
not, what is the compile-time check to determine whether it is
thread-safe? If there is none, I really don't see a solution, and the
Sleep must stay.
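
For illustration, the poll-then-sleep compromise referred to above can be sketched in Python 3 as follows. This is not the _tkinter C code: poll_once, should_quit, and the default interval are hypothetical stand-ins for a non-blocking event call, a quit flag, and the sleep duration.

```python
import time


def run_event_loop(poll_once, should_quit, sleep_interval=0.02):
    """Poll for events without blocking; sleep briefly when idle.

    Returns the number of events handled.  Sleeping only when nothing
    was pending avoids a pure busy-wait while keeping latency bounded
    by `sleep_interval`.
    """
    handled = 0
    while not should_quit():
        if poll_once():            # non-blocking: True if an event ran
            handled += 1
        else:
            time.sleep(sleep_interval)
    return handled
```

The trade-off is visible in the single parameter: a shorter interval improves responsiveness at the cost of more wake-ups.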

Regards,
Martin


From martin@v.loewis.de  Tue Feb 19 20:29:49 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Feb 2002 21:29:49 +0100
Subject: [Python-Dev] 2.2.1 issues
In-Reply-To: <3C726270.7D33E687@lemburg.com>
References: <2madu5zhnj.fsf@starship.python.net>
 <3C726270.7D33E687@lemburg.com>
Message-ID: <m3bsel8bb6.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> Right. 1) was caused by 2).

That wasn't actually the case. The overwriting of memory was really
independent of the error in surrogate processing, and can be fixed
independently.

> As a result, modules using unpaired surrogates in Unicode
> literals are simply broken in Python <= 2.2.0.

I think this is unimportant enough to just accept this bug for Python
2.2.x. If people ever run into the problem, well: just don't do this.
Unpaired surrogates will be entirely in Unicode 3.2.

> The problem with backporting this patch is that in order
> for Python to properly recompile any broken module, the
> magic will have to be changed. Question is whether this
> is a reasonable thing to do in a patch level release...

The memory-overwriting problem can be fixed independently, e.g. with

https://sourceforge.net/tracker/download.php?group_id=5470&atid=105470&file_id=15248&aid=495401

Regards,
Martin



From nas@python.ca  Tue Feb 19 20:35:07 2002
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 19 Feb 2002 12:35:07 -0800
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEKHNNAA.tim.one@comcast.net>; from tim.one@comcast.net on Tue, Feb 19, 2002 at 03:01:28PM -0500
References: <20020219102053.A29414@glacier.arctrix.com> <LNBBLJKPBEHFEDALKOLCKEKHNNAA.tim.one@comcast.net>
Message-ID: <20020219123507.B29834@glacier.arctrix.com>

Tim Peters wrote:
> Within a basic block, "the obvious" greedy scheme is provably optimal wrt
> minimizing the # of temp registers needed by the block

I already had this part of the plan mostly figured out.  Thanks for
verifying my thinking however.

> What's hard (too expensive to be practical, in general) is provably
> minimizing the overall number of temps *across* basic blocks.

This was the part that was worrying me.

> Still, look up "graph coloring register assignment" on google and
> you'll find lots of effective heuristics.  For a first cut, punt:
> just store everything still alive into memory at the end of a basic
> block.

Okay, that's easy.  I suspect it will work fairly well in practice since
most functions have a small number of basic blocks and increasing
the number of registers is cheap.

> If you're retaining Rattlesnake's idea of treating "the register file"
> as a superset of the local vrbl vector, mounds of such "stores" will
> be nops.

I'm planning to keep this idea.  There seems to be no good reason to
treat local variables any differently than registers.  I suppose it
would be fairly easy to add a simple peep-hole optimizer that would
clean out the redundant stores.  When you talked about flexible
intermediate code did you have anything in mind?

Hmm, perhaps constants can be handled in a similar way.  The only way I
can think of doing it at the moment is to copy the list of constants
into registers when the frame is created.  That seems like it could
easily end up as a net loss though.

> What's also hard is selecting instructions in such a way as to
> minimize the number of temp registers needed, ditto ordering
> instructions toward that end.

But is there really any freedom to do reordering?  For example, a
BINARY_ADD opcode can end up doing practically anything.

  Neil


From mal@lemburg.com  Tue Feb 19 21:21:33 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 19 Feb 2002 22:21:33 +0100
Subject: [Python-Dev] 2.2.1 issues
References: <2madu5zhnj.fsf@starship.python.net>
 <3C726270.7D33E687@lemburg.com> <m3bsel8bb6.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C72C1DD.B561052A@lemburg.com>

"Martin v. Loewis" wrote:
> 
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> > Right. 1) was caused by 2).
> 
> That wasn't actually the case. The overwriting of memory was really
> independent of the error in surrogate processing, and can be fixed
> independently.

In that case, it's probably best to just use this patch and leave 
the UTF-8 fix in 2.3 only.
 
> The memory-overwriting problem can be fixed independently, e.g. with
> 
> https://sourceforge.net/tracker/download.php?group_id=5470&atid=105470&file_id=15248&aid=495401

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From martin@v.loewis.de  Tue Feb 19 20:31:25 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Feb 2002 21:31:25 +0100
Subject: [Python-Dev] 2.2.1 issues
In-Reply-To: <15474.28123.180241.360278@grendel.zope.com>
References: <2madu5zhnj.fsf@starship.python.net>
 <3C726270.7D33E687@lemburg.com>
 <15474.28123.180241.360278@grendel.zope.com>
Message-ID: <m37kp98b8i.fsf@mira.informatik.hu-berlin.de>

"Fred L. Drake, Jr." <fdrake@acm.org> writes:

> Guido can rule as he sees fit, but I don't see any reason *not* to
> change the magic number.  This seems like a pretty important fix to
> me.

The memory-overwriting problem can be fixed without bumping the pyc
magic. The rationale for bumping the pyc magic is pretty weak, IMO,
so that aspect should not be propagated to 2.2.1.

Regards,
Martin


From fdrake@acm.org  Tue Feb 19 21:46:23 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 19 Feb 2002 16:46:23 -0500
Subject: [Python-Dev] 2.2.1 issues
In-Reply-To: <m37kp98b8i.fsf@mira.informatik.hu-berlin.de>
References: <2madu5zhnj.fsf@starship.python.net>
 <3C726270.7D33E687@lemburg.com>
 <15474.28123.180241.360278@grendel.zope.com>
 <m37kp98b8i.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15474.51119.2069.367137@grendel.zope.com>

Martin v. Loewis writes:
 > The memory-overwriting problem can be fixed without bumping the pyc
 > magic. The rationale for bumping the pyc magic is pretty weak, IMO,
 > so that aspect should not be propagated to 2.2.1.

I'm happy with that.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From pedroni@inf.ethz.ch  Tue Feb 19 22:23:05 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Tue, 19 Feb 2002 23:23:05 +0100
Subject: [Python-Dev] Meta-reflections
References: <Pine.LNX.4.33.0202191157150.1210-100000@penguin.theopalgroup.com>
Message-ID: <036001c1b994$00dd8360$6d94fea9@newmexico>


From: Kevin Jacobs <jacobs@penguin.theopalgroup.com>
> On Tue, 19 Feb 2002, Samuele Pedroni wrote:
> > Personally I don't expect slots to behave like attributes. I mean,
> > the different naming is a hint.
>
> For me, slot declarations are a hint that certain attributes should be
> allocated 'statically' versus the normal Python 'dynamic' attribute
> allocation.

Interesting but for the implementation:

class f(file):
  __slots__ = ('a',)

slot a and file.softspace are in the same league,
which is not attributes' league.

They are struct members, and the descriptor
logic that accesses them exploits this.

From the implementation it seems clear that slots and attributes
are not interchangeable.

On the other hand that means that there cannot be a slot-only future
for Python.

> > > I'll contend that the current implementation is flawed for this and several
> > > other reasons I've stated in my previous e-mails.  Of course, we're waiting
> > > to hear back from Guido when he returns, since his opinion is infinitely
> > > more important than mine in this matter.
> >
> > It is not flawed, it is just single-inheritance-of-struct-like-layout-based.
> > I'm fine with that.
>
> Please read some of my earlier messages.  There are other 'warts'.

Yes, but changing the whole impl design is probably not the only solution.
I mean this literally.

> > To be honest I would find very annoying that what we are about
> > to implement in Jython 2.2 should be somehow radically changed for Jython 2.3.
> > We have not the necessary amount of human resources
> > to happily play that kind of game.
>
> Well, we are dealing with an implementation that is not documented _at all_.

The 2.2 type/class unification tutorial has references to __slots__:

http://www.python.org/2.2/descrintro.html

What is true is that the surface aspects of the underlying design are not
documented.

> So, in virtually all respects, Jython 2.2 could ignore their existence
> totally and still function correctly.

False. See above.

> I hope that you will be pleased by
> the in-depth discussions on this topic, since it will likely lead to the
> formulation of refined documentation for many of these very fuzzily defined
> features.  As an implementer, that kind of clarification can be invaluable
> since it means you don't have to guess at the semantics and have to change
> it later.

This one is insolent.
Btw the tutorial contains this:

There's no check that prevents you to override an instance variable already
defined by a base class using a __slots__ declaration. If  you do that, the
instance variable defined by the base class is inaccessible (except by
retrieving its descriptor directly from the base class; this could be used to
rename it). Doing this renders the meaning of your program undefined; a check
to prevent this may be added in the future.

> > I hope and presume that Guido did know what he was designing,
> > and I had that impression too.
> > OTOH I agree that pickle should work for new-style classes too.
>
> He knew what he was designing, but was focused on achieving other goals with
> this infrastructure (like class/type unification).  I have the feeling that
> slots were more of an experiment than anything else.  Don't get me wrong --
> they are insanely useful even in their current state.  On the other hand, I
> don't think they're ready for prime-time until we smooth over the picky
> semantic issues relating to slot overloading, reflection and renaming. Just
> look at the Python standard library -- you'll notice that slots are not used
> anywhere.  I predict that we will be using them extensively, especially in
> the standard library, as soon as they are deemed ready for prime-time.
>

A possible approach:
write a patch implementing your preferred semantics.
You can keep it orthogonal from the rest, using
a name different than "__slots__", for the first
cut.

regards, Samuele Pedroni.



From greg@cosc.canterbury.ac.nz  Wed Feb 20 03:01:34 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 20 Feb 2002 16:01:34 +1300 (NZDT)
Subject: [Python-Dev] Stackless Design Q.
In-Reply-To: <3C72A453.7080905@tismer.com>
Message-ID: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz>

> Now I am at the point where I'm worst suited for:
> Design an interface.

> Please tell me what you think, and what you'd like
> to change.

It's not clear exactly what you're after here. Are you
trying to define the lowest-level interface upon which
everything else will be built? If so, I think what you
have presented is FAR too complex.

It seems to me you need only two things:

(1) A constructor for new tasklets:

   t = tasklet(f)

      Takes a callable object f of no parameters and returns
      a tasklet which will execute the code of f. The tasklet
      is initially suspended and does not execute any of f's
      code until it is switched to for the first time.

(2) A way of switching to another tasklet:

   t.transfer()

      Suspends the currently-running tasklet and resumes
      tasklet t where it last left off. This will either be
      at the beginning or where it last called the transfer()
      of another tasklet.

All the other stuff you talk about -- passing values between
tasklets, rings of runnable tasklets, scheduling policies, etc --
can all be implemented in Python on top of these primitives.
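
A minimal sketch of how these two primitives could be emulated today, so
that value passing can be layered on top in plain Python. It is not the
Stackless implementation: each tasklet is backed by an OS thread and a
semaphore, and the names Tasklet, bootstrap() and the rule that control
falls back to the main tasklet when f returns are all assumptions made
for illustration.

```python
import threading


class Tasklet:
    """Thread-backed stand-in for the two proposed primitives."""

    _current = None  # the tasklet that holds control right now
    _main = None     # the tasklet wrapping the initial thread

    def __init__(self, func=None):
        self._func = func
        self._wake = threading.Semaphore(0)
        self._started = False  # suspended until first switched to

    @classmethod
    def bootstrap(cls):
        """Wrap the calling thread as the already-running main tasklet."""
        main = cls()
        main._started = True
        cls._main = cls._current = main
        return main

    def transfer(self):
        """Suspend the running tasklet and resume this one."""
        me = Tasklet._current
        if me is self:
            return
        Tasklet._current = self
        if self._started:
            self._wake.release()
        else:
            self._started = True
            threading.Thread(target=self._run, daemon=True).start()
        me._wake.acquire()  # sleep until somebody transfers back to us

    def _run(self):
        self._func()
        # Assumption: when f returns, control falls back to the main tasklet.
        Tasklet._current = Tasklet._main
        Tasklet._main._wake.release()


# Passing values between tasklets, built purely on top of transfer().
main = Tasklet.bootstrap()
mailbox = []


def producer():
    for item in ("a", "b"):
        mailbox.append(item)
        main.transfer()


t = Tasklet(producer)
received = []
for _ in range(2):
    t.transfer()
    received.append(mailbox.pop())
```

Only one thread is ever runnable at a time, so the shared mailbox list
needs no locking in this sketch.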

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From rushing@nightmare.com  Wed Feb 20 07:41:39 2002
From: rushing@nightmare.com (Sam Rushing)
Date: 19 Feb 2002 23:41:39 -0800
Subject: [Python-Dev] Re: [Stackless] Stackless Design Q.
In-Reply-To: <3C72A453.7080905@tismer.com>
References: <3C72A453.7080905@tismer.com>
Message-ID: <1014190899.31006.5.camel@fang.nightmare.com>

--=-o0D+Wc9k+nOeeoHZukjD
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

On Tue, 2002-02-19 at 11:15, Christian Tismer wrote:
> Again, What is the name of this function/method, and
> where does it belong?
> It
> - unblocks another tasklet
> - transfers data
> - does not block anything
> - schedules the other tasklet
>
> Or is this a bad design at all?

In our current system, this function is called 'schedule()';
and it takes an 'args' tuple.  It doesn't transfer control, it just
makes the other coro ready to run ASAP. [i.e., next trip through the
event loop it will be added to the set of 'runnable' coros].

Here is our coro::condition_variable::wake_one() for context:

    def wake_one (self, args=()):
        for coro in self._waiting:
            try:
                schedule (coro, args)
            except ScheduleError:
                pass
            else:
                self._waiting.pop(0)
                return 1
        else:
            return 0


[ScheduleError is thrown if the coro has already been scheduled]
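
For readers without the coro library at hand, here is a self-contained
sketch of the same idea. Only schedule(), ScheduleError, wake_one() and
_waiting come from the text above; the Coro class, the runnable list and
wait() are hypothetical stand-ins. It also differs from the quoted method
in one respect: it removes the coro that was actually scheduled rather
than the head of the list.

```python
class ScheduleError(Exception):
    """Raised when a coro is scheduled a second time."""


class Coro:
    # Hypothetical stand-in for a real coroutine object.
    def __init__(self, name):
        self.name = name
        self.scheduled = False
        self.args = None


runnable = []  # what the event loop would pick up on its next trip


def schedule(coro, args=()):
    """Mark coro ready to run; does not transfer control."""
    if coro.scheduled:
        raise ScheduleError(coro.name)
    coro.scheduled = True
    coro.args = args
    runnable.append(coro)


class ConditionVariable:
    def __init__(self):
        self._waiting = []

    def wait(self, coro):
        self._waiting.append(coro)

    def wake_one(self, args=()):
        # Iterate over a copy so the list can be modified safely.
        for coro in list(self._waiting):
            try:
                schedule(coro, args)
            except ScheduleError:
                continue  # already scheduled by someone else, try the next
            self._waiting.remove(coro)
            return 1
        return 0


a, b = Coro("a"), Coro("b")
cv = ConditionVariable()
cv.wait(a)
cv.wait(b)
schedule(a)                      # a gets scheduled behind cv's back
first = cv.wake_one(("data",))   # skips a, wakes b
second = cv.wake_one()           # nobody left that can be woken
```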

-Sam


--=-o0D+Wc9k+nOeeoHZukjD--



From tim.one@comcast.net  Wed Feb 20 07:46:07 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 20 Feb 2002 02:46:07 -0500
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <20020219123507.B29834@glacier.arctrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMONNAA.tim.one@comcast.net>

[Tim]
>> Within a basic block, "the obvious" greedy scheme is ...

[Neil Schemenauer]
> I already had this part of the plan mostly figured out.  Thanks for
> verifying my thinking however.

You're welcome.  Note that you've been pretty mysterious about what it is
you do want to know, so I'm more pleased that I channeled you than that I
was slightly helpful <wink>.

>> What's hard (too expensive to be practical, in general) is provably
>> minimizing the overall number of temps *across* basic blocks.

> This was the part that was worrying me.

It can worry you later just as well.  Python isn't C, and the dynamic
semantics make it very much harder to prove that a subexpression is, e.g., a
loop invariant, or that an instance of "+" won't happen to change the
binding of every global, etc (ha! now I see you pointed that out yourself
later -- good channeling on your end too <wink>).  For that reason there's
less need to get gonzo at the start.  IIRC, the primary motivation for
Rattlesnake was to cut eval loop overhead, and it's enough to aim for just
that much at the start.

>> If you're retaining Rattlesnake's idea of treating "the register file"
>> as a superset of the local vrbl vector, mounds of such "stores" will
>> be nops.

> I'm planning to keep this idea.  There seems to be no good reason to
> treat local variables any differently than registers.

Not now.  If you're looking at going on to generate native machine code
someday, well, this isn't the only simplification that will bite.

> I suppose it would be fairly easy to add a simple peep-hole optimizer
> that would clean out the redundant stores.  When you talked about
> flexible intermediate code did you have anything in mind?

That's why I urged you to keep it in Python at the start.  IRs come in all
sorts of flavors, and AFAICT people more or less stumble into one that works
well for them based on what they've done so far (and that was my experience
too in my previous lives).  You have to expect to rework it as ambitions
grow.  Note Daniel Berlin's helpful comment that gcc is moving toward 3 IRs
now; that's the way these things always go.

At the start, I expect I'd represent a Python basic block as an object
containing a plain Python list of instruction objects.  Then you've got the
whole universe of zippy Python builtin list ops to build on when mutating
the instruction stream.  Note that my focus is to minimize *your* time
wrestling with this stuff:  implementing fancy data structures is a waste of
time at the start.  I'd also be delighted to let cyclic gc clean up dead
graph structures, so wouldn't spend any time trying, e.g., to craft a
gimmick out of weakrefs to avoid hard cycles.
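
A rough sketch of what such a representation might look like. The class
names, the opcode names (ADD, MOVE, RETURN) and the register names are
invented for illustration and are not Rattlesnake's actual instruction
set.

```python
class Instruction:
    def __init__(self, op, *args):
        self.op = op
        self.args = args

    def __repr__(self):
        return "%s %s" % (self.op, ", ".join(map(str, self.args)))


class BasicBlock:
    def __init__(self, label):
        self.label = label
        self.instructions = []  # plain list: slicing, insert, del all work
        self.successors = []    # other BasicBlock objects; cycles are fine

    def drop_self_moves(self):
        """Remove MOVE r, r instructions, which copy a register onto itself."""
        self.instructions[:] = [
            ins for ins in self.instructions
            if not (ins.op == "MOVE" and ins.args[0] == ins.args[1])
        ]


block = BasicBlock("entry")
block.instructions += [
    Instruction("ADD", "r0", "r1", "r2"),
    Instruction("MOVE", "r2", "r2"),  # a "store" to a local that is already r2
    Instruction("RETURN", "r2"),
]
block.successors.append(block)  # a hard cycle, left for cyclic gc to collect
block.drop_self_moves()
```

Because the instruction stream is an ordinary list, a pass like
drop_self_moves() is a single comprehension.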

You may or may not want to do a survey of popular IRs.  Here's a nice
*brief* page with links to follow:

    http://www.math.tau.ac.il/~guy/pa/ir.html

I think a lot of this is a matter of taste.  For example, some people swear
by the "Value Dependence Graphs" that came out of Microsoft Research.  I
never liked it, and it's hard to explain why ("too complicated").  Static
Single Assignment form is more to my tastes, but again it's hard to explain
why ("almost but not quite too simple").

Regardless, you can get a lot done at the Rattlesnake level just following
your gut intuitions, prodded by what turns out to be too clumsy.  As with
most other things, reading papers is much more useful after you've wrestled
with the problems on your own.  There's only one way to do it, but what that
way is remains a mystery to me.

> ...
> But is there really any freedom to do reordering?  For example, a
> BINARY_ADD opcode can end up doing practically anything.

That's right, and that's why you should gently file away but ignore almost
all the advice you'll get <wink>.  Skip kept discovering over and over again
just how little his peephole optimizer could get away with doing on Python
bytecode.  Collapsing jumps to jumps is one of the few safe things you can
get away with.
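
As an illustration of the jump-to-jump case, here is a small sketch over
a toy representation: a dict mapping labels to lists of (op, arg) tuples.
The representation and the function name collapse_jumps are assumptions
for the example, not Skip's optimizer.

```python
JUMP_OPS = {"JUMP", "JUMP_IF_TRUE", "JUMP_IF_FALSE"}


def collapse_jumps(blocks):
    """Retarget every jump that lands on a block consisting of one JUMP."""
    # Blocks that do nothing except jump somewhere else.
    trivial = {
        label: ins[0][1]
        for label, ins in blocks.items()
        if len(ins) == 1 and ins[0][0] == "JUMP"
    }

    def final_target(label):
        seen = set()  # guards against a cycle of trivial jumps
        while label in trivial and label not in seen:
            seen.add(label)
            label = trivial[label]
        return label

    return {
        label: [
            (op, final_target(arg)) if op in JUMP_OPS else (op, arg)
            for op, arg in ins
        ]
        for label, ins in blocks.items()
    }


code = {
    "L0": [("LOAD", "x"), ("JUMP_IF_FALSE", "L1")],
    "L1": [("JUMP", "L2")],
    "L2": [("JUMP", "L3")],
    "L3": [("RETURN", None)],
}
optimized = collapse_jumps(code)
```

The rewrite is safe here because an unconditional jump has no effect other
than moving the program counter, so skipping it cannot change what any
object's methods observe.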

BTW, the JUMP_IF_TRUE and JUMP_IF_FALSE opcodes peek at the top of the stack
instead of consuming it.  As a result, both the fall-through and target
instructions are almost always the no-bang-for-the-buck POP_TOP.  This
always irritated me to death (it's *useful* behavior, IIRC, only for chained
comparisons).  If Rattlesnake cleans up just that much, it will be a huge
win in my eyes, despite not being measurable <ahem>.



From jacobs@penguin.theopalgroup.com  Wed Feb 20 10:45:38 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 05:45:38 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <036001c1b994$00dd8360$6d94fea9@newmexico>
Message-ID: <Pine.LNX.4.33.0202200442480.6805-100000@penguin.theopalgroup.com>

On Tue, 19 Feb 2002, Samuele Pedroni wrote:
> From: Kevin Jacobs <jacobs@penguin.theopalgroup.com>
> > On Tue, 19 Feb 2002, Samuele Pedroni wrote:
> > > Personally I don't expect slots to behave like attributes. I mean,
> > > the different naming is a hint.
> >
> > For me, slot declarations are a hint that certain attributes should be
> > allocated 'statically' versus the normal Python 'dynamic' attribute
> > allocation.
>
> Interesting but for the implementation:
>
> class f(file):
>   __slots__ = ('a',)
>
> slot a and file.softspace are in the same league,
> which is not attributes' league.

Currently this is true, though only you and Martin v. Loewis have replied
agreeing that this should be the case.  Everyone else I've spoken to
_wants_ slots to act more like instance attributes.

> Yes, but changing the whole impl design is probably not the only solution.
> I mean this literally.

I realize this.  That's why I'm trying to build a consensus until some sort
of clarity emerges.  That's also why I'm asking for feedback on what should
be the correct semantics of slots instead of assuming that the current
implementation is the One True Bible of slots.  If you think that slots are
implemented correctly, then I welcome you to work with me to make exactly
that case.  Unless you (or others) step up to do that, I will continue to
feel that the current slot implementation is flawed and will continue to
advocate their reform.  Unfortunately, repeatedly pointing out that my
suggestions are not how they are implemented doesn't advance either of our
cases.

> > Well, we are dealing with an implementation that is not documented _at all_.
>
> The 2.2 type/class unification tutorial has references to __slots__:
>
> http://www.python.org/2.2/descrintro.html
>
> What is true is that the surface aspects of the underlying design are not
> documented.

A tutorial is not documentation.  It is certainly suggestive of what will
eventually be documented, but it is not documented until it is part of the
Python Reference Manual.  For example, I would not be surprised if large
hunks of the descrintro cease to work after the 'super' and 'property'
syntax changes slated for Python 2.3.

> > So, in virtually all respects, Jython 2.2 could ignore their existence
> > totally and still function correctly.
>
> False. See above.

Don't take my word for it -- ask Guido when he gets back.

> > I hope that you will be pleased by
> > the in-depth discussions on this topic, since it will likely lead to the
> > formulation of refined documentation for many of these very fuzzily defined
> > features.  As an implementer, that kind of clarification can be invaluable
> > since it means you don't have to guess at the semantics and have to change
> > it later.
>
> This one is insolent.

Please, let's not descend into name calling.  I truly believe that I am
providing a service to the general Python community by engaging in these
discussions.  If you feel that it is insolent to question the language
implementers just because I am a newcomer and have some controversial
issues, then I recommend that you rapidly get used to it.  I do it all the
time and don't plan to stop.

> A possible approach:
> write a patch implementing your preferred semantics.
> You can keep it orthogonal from the rest, using
> a name different than "__slots__", for the first
> cut.

I fully intend to provide a reference implementation of some of these ideas.
In fact, it's likely to be a fairly small patch.  However, I still don't know
what the ideal semantics are.  I would very much value your input on the
matter, even if on a purely theoretical level.  So, let's start with the
premise that __attrs__ is a declaration like __slots__ except that:

  1) the namespace of these 'attrs' is flat  -- repeated names in descendant
     classes either result in an error or silently re-use the existing
     slot.  This maintains the traditional flat instance namespace of
     attributes.

  2) A complete and immutable list of slots is available as a member of each
     type to allow for easy and efficient reflection.  (though I am also in
     favor of working on better formal reflection APIs)

  3) These 'attrs' are to otherwise have the same semantics as normal
     __dict__ instance attributes.  e.g., they should be automatically
     picklable, they can have properties assigned to them, etc.
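
To make the three points concrete, here is a rough sketch written as a
metaclass in present-day Python syntax. It is not the patch discussed
here: FlatAttrsMeta and __all_attrs__ are hypothetical names, and of the
two options in point 1 it arbitrarily picks "raise an error".

```python
class FlatAttrsMeta(type):
    """Treat __attrs__ as a slot declaration with one flat namespace."""

    def __new__(mcls, name, bases, namespace):
        declared = namespace.get("__attrs__", ())
        if isinstance(declared, str):
            declared = (declared,)
        declared = tuple(declared)

        # Collect everything the base classes already declared, in order.
        inherited = []
        for base in bases:
            for attr in getattr(base, "__all_attrs__", ()):
                if attr not in inherited:
                    inherited.append(attr)

        # Point 1: a repeated name is an error rather than a hidden slot.
        clashes = [attr for attr in declared if attr in inherited]
        if clashes:
            raise TypeError("%s redeclares %s" % (name, ", ".join(clashes)))

        namespace = dict(namespace)
        namespace["__slots__"] = declared  # allocate them like slots
        # Point 2: a complete, immutable list available on each type.
        namespace["__all_attrs__"] = tuple(inherited) + declared
        return super().__new__(mcls, name, bases, namespace)


class Base(metaclass=FlatAttrsMeta):
    __attrs__ = ("a",)


class Derived(Base):
    __attrs__ = ("b",)


d = Derived()
d.a, d.b = 1, 2
```

Declaring __attrs__ = ("a",) again in a subclass of Base raises TypeError
under this sketch. Point 3 is left to the interpreter.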

Regards,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From ping@lfw.org  Wed Feb 20 12:40:49 2002
From: ping@lfw.org (Ka-Ping Yee)
Date: Wed, 20 Feb 2002 06:40:49 -0600 (CST)
Subject: [Python-Dev] Global variable access schemes
Message-ID: <Pine.LNX.4.33.0202200628140.4954-100000@server1.lfw.org>

I've added diagrams for Guido's more recent proposal, and
summarized everything on a web page:

    http://lfw.org/python/globals.html

Check out http://lfw.org/python/guido2a.gif and
http://lfw.org/python/guido2b.gif.

About Guido2:

    - I renamed some things -- globals_vector is a structure,
      not a vector, so i put it in md_cache and used the
      prefix mc_ for its fields.

    - When you del a module variable, do you just go through
      all of mc_names to find the entry to invalidate?
      (I suppose if you sort mc_names you can binary search.)

    - It should be possible to add entries in the cache for
      attributes in other modules, too, right?  If we assume
      that variables don't get deleted often, it should pay off.

Haven't heard anything from anybody about this topic in a while.
Has anyone been thinking about it?


-- ?!ng



From tismer@tismer.com  Wed Feb 20 12:44:31 2002
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 20 Feb 2002 13:44:31 +0100
Subject: [Python-Dev] Stackless Design Q.
References: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz>
Message-ID: <3C739A2F.5030502@tismer.com>

Greg Ewing wrote:

>>Now I am at the point where I'm worst suited for:
>>Design an interface.
>>
> 
>>Please tell me what you think, and what you'd like
>>to change.
>>
> 
> It's not clear exactly what you're after here. Are you
> trying to define the lowest-level interface upon which
> everything else will be built? If so, I think what you
> have presented is FAR too complex.


The old Stackless with its continuations was at the lowest
possible level, in a sense.
What I now try to do is a compromise: I would like to
build the simplest possible but powerful set of
methods. At the same time, I'd like to keep track
of tasklets, since they now contain vital
information about stack state, and I cannot afford
to lose one of them, or we'll crash.
The little doubly-linked list maintenance is very
cheap to do. So my basic idea was to provide
what is needed to get uthreads at very high speed,
without the need to use Python for the basic
machinery.

> It seems to me you need only two things:


<snip/>

Yes, I need these two things, and some more.

> All the other stuff you talk about -- passing values between
> tasklets, rings of runnable tasklets, scheduling policies, etc --
> can all be implemented in Python on top of these primitives.


Sure it can, with one exception:
My tasklets will also support threading, that is they
will become auto-scheduled if the user switches this
on. But auto-scheduled frames are a different kind
of thing than those which are in "waiting for data"
state. I need to distinguish them or I will crash.
That's the reason why I keep these linked lists.
Switching to the wrong tasklet should be rock solid
in the kernel, this is nothing that I want people
to play with from Python.

Thanks a lot anyway, I'll try to make it even simpler.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net/
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
      where do you want to jump today?   http://www.stackless.com/




From Paul.Moore@atosorigin.com  Wed Feb 20 14:31:39 2002
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 20 Feb 2002 14:31:39 -0000
Subject: [Python-Dev] Meta-reflections
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F1@UKRUX002.rundc.uk.origin-it.com>

> I fully intend to provide a reference implementation of
> some of these ideas. In fact, it's likely to be a fairly
> small patch. However, I still don't know what the ideal
> semantics are. I would very much value your input on the
> matter, even if on a purely theoretical level. So, let's
> start with the premise that __attrs__ is a declaration like
> __slots__ except that:

It seems relevant to me that your choice of name ("attrs") indicates a
relationship with attributes - which your ideas seem to deny. It is
important to make the distinction entirely clear, otherwise your points are
going to get obscured. I've been having a hard time keeping track of
precisely what you are suggesting (on which basis, thanks for this
summary...)

>   1) the namespace of these 'attrs' is flat -- repeated
>   names in descendant classes either result in an error or
>   silently re-use the existing slot. This maintains the
>   traditional flat instance namespace of attributes.

FWIW, I disagree with this completely. I would expect slots with a
particular name in derived classes to hide the same-named slots in base
classes. Whether or not the base class slot is available via some sort of
"super" shenanigans is less relevant. But hiding semantics is critical. How
do you expect to reliably make it possible to derive from a class with slots
otherwise? Perl gets into this sort of mess with its implementation of
objects.

>   2) A complete and immutable list of slots is available
>   as a member of each type to allow for easy and efficient
>   reflection. (though I am also in favor of working on
>   better formal reflection APIs)

Agreed - up to a point. I don't see a need for a way to distinguish between
slots and "normal" attributes, personally. But I don't do anything fancy
here, so my experience isn't very indicative.

I'm more or less happy with dir() as a start, although I agree that a better
formal reflection API would be helpful. I suspect that such a thing could be
built in Python on top of the existing facilities, however...

>   3) These 'attrs' are to otherwise have the same semantics
>   as normal __dict__ instance attributes. e.g., they should
>   be automatically picklable, they can have properties
>   assigned to them, etc.

I think I agree here. However, if you want slots to behave like normal
attributes, except for the flat namespace, I see no value. Why have the
exception at all?

Hmm, this raises the question of why we have slots at all. If they act
exactly like attributes, why have them? As a user, I perceive them as an
efficiency thing - they save the memory associated with a dictionary, and
are probably faster as well. There can be tradeoffs which you pay for that
efficiency, but that's all. No semantic difference. Actually, that's pretty
much how the descrintro document describes slots. Strange that...

I guess I am saying that I'm happy with slots as designed (documented in
descrintro) - modulo some implementation bugs such as not getting
automatically pickled.

Paul.


From jacobs@penguin.theopalgroup.com  Wed Feb 20 13:30:05 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 08:30:05 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F1@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <Pine.LNX.4.33.0202200747380.8251-100000@penguin.theopalgroup.com>

Paul, thanks for the very constructive feedback!

On Wed, 20 Feb 2002, Moore, Paul wrote:

> > I fully intend to provide a reference implementation of
> > some of these ideas. In fact, it's likely to be a fairly
> > small patch. However, I still don't know what the ideal
> > semantics are. I would very much value your input on the
> > matter, even if on a purely theoretical level. So, let's
> > start with the premise that __attrs__ is a declaration like
> > __slots__ except that:
>
> It seems relevant to me that your choice of name ("attrs") indicates a
> relationship with attributes - which your ideas seem to deny.

You are right -- lets call them 'slotattrs', since they should ideally have
virtually the same semantics as attributes, except they are allocated like
slots.

> >   1) the namespace of these 'attrs' is flat -- repeated
> >   names in descendant classes either result in an error or
> >   silently re-use the existing slot. This maintains the
> >   traditional flat instance namespace of attributes.
>
> FWIW, I disagree with this completely. I would expect slots with a
> particular name in derived classes to hide the same-named slots in base
> classes. Whether or not the base class slot is available via some sort of
> "super" shenanigans is less relevant. But hiding semantics is critical. How
> do you expect to reliably make it possible to derive from a class with slots
> otherwise? Perl gets into this sort of mess with its implementation of
> objects.

Attributes currently have a flat namespace, and the construct that I feel is
most natural would maintain that characteristic.  e.g.:

class Base:
  def __init__(self):
    self.foo = 1

class Derived(Base):
  def __init__(self):
    Base.__init__(self)
    self.foo = 2          # this is the same foo as in Base


Python already implements a form of data hiding in a different way,
so I'm not sure it is a good idea to add another ad-hoc method to do the
same thing.  The current way to implement data hiding is by using the
namespace munging trick by prefixing symbols with '__', which is then munged
by prepending an underscore and the name of the class.

class Foo:
  __var = 1

dir(Foo)
> ['_Foo__var', '__doc__', '__module__']
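
A slightly longer sketch of the same mangling rule, showing that a
subclass which reuses the name gets its own, separately mangled
attribute. The get() method and the Bar class are added for illustration.

```python
class Foo:
    __var = 1  # stored as _Foo__var

    def get(self):
        return self.__var  # compiled as self._Foo__var


class Bar(Foo):
    __var = 2  # stored as _Bar__var, does not replace Foo's


foo_names = [name for name in dir(Foo) if name.endswith("__var")]
bar_names = [name for name in dir(Bar) if name.endswith("__var")]
from_base_method = Bar().get()
```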

> >   2) A complete and immutable list of slots is available
> >   as a member of each type to allow for easy and efficient
> >   reflection. (though I am also in favor of working on
> >   better formal reflection APIs)
>
> Agreed - up to a point. I don't see a need for a way to distinguish between
> slots and "normal" attributes, personally. But I don't do anything fancy
> here, so my experience isn't very indicative.

Without a more formal reflection API, the traditional way to get all normal
dictionary attributes is by using instance.__dict__.keys().  All I'm
proposing is that instance.__slotattrs__ (or possibly
instance.__class__.__slotattrs__) returns a list of objects that reveal the
name of all slots in the instance (including those declared in base
classes).  I am not sure what that list should look like, though here are
the current suggestions:

  1)  __slotattrs__ = ('a','b')

  2)  # slot descriptor objects -- the repr is shown here
      __slotattrs__ = (<member 'a' of 'Baz' objects>,
                       <member 'b' of 'Baz' objects>)

The only issue that concerns me is that I am not sure if the slot to slot
name mapping should be fixed.  The intrinsic definition of a slot is a type
and the offset of the slot in the type.  The name is just a binding to a
slot descriptor, so it "feels" unnecessary to make that immutable.  In
either case, it is not a big issue for me.
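
Either form of the list can already be computed by walking the class
hierarchy, as this sketch suggests. The helper names slot_names() and
slot_descriptors() are hypothetical; when a name is declared more than
once, the descriptor from the most derived class is the one returned.

```python
def slot_names(cls):
    """Names of all slots declared by cls and its bases, base classes first."""
    names = []
    for klass in reversed(cls.__mro__):
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)
        for name in slots:
            if name not in names:
                names.append(name)
    return tuple(names)


def slot_descriptors(cls):
    """The descriptor objects that normal attribute lookup would use."""
    return tuple(getattr(cls, name) for name in slot_names(cls))


class Baz:
    __slots__ = ("a", "b")


class Qux(Baz):
    __slots__ = ("c",)


names = slot_names(Qux)
descriptors = slot_descriptors(Qux)
```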

> >   3) These 'attrs' are to otherwise have the same semantics
> >   as normal __dict__ instance attributes. e.g., they should
> >   be automatically picklable, they can have properties
> >   assigned to them, etc.
>
> I think I agree here. However, if you want slots to behave like normal
> attributes, except for the flat namespace, I see no value. Why have the
> exception at all?

Attributes currently have a flat namespace?  I must not have been clear -- I
_do_ want my slotattrs to be allocated like slots, mainly for efficiency
reasons.

> Hmm, this raises the question of why we have slots at all. If they act
> exactly like attributes, why have them? As a user, I perceive them as an
> efficiency thing - they save the memory associated with a dictionary, and
> are probably faster as well. There can be tradeoffs which you pay for that
> efficiency, but that's all. No semantic difference. Actually, that's pretty
> much how the descrintro document describes slots. Strange that...

EXACTLY!  I want to use slots (or slotattrs, or whatever you call them) to
be solely an allocation declaration.  For small objects that you expect to
create many, many instances of, they make a huge difference.  I have some
results that I measured on various implementations of a class to represent
database rows.  The purpose of this class is to give simple dictionary-like
attribute access semantics to tuples returned as a result of default DB-API
relational database queries.  The trick is to add the ability to access
fields by name (instead of only by index) without incurring the overhead
of allocating a dictionary for every instance.  Below are results of a
benchmark that compares various ways of implementing this class:

                                         time
                                 SIZE    (sec)   Bytes/row
                              --------   ------  ---------
                 overhead:     4,744KB     0.56        -
                    tuple:    18,948KB     2.49       73
     C extension w/ slots:    18,924KB     4.85       73
             native dict*:       117MB    13.50      589
    Python class w/ slots:    18,960KB    17.23       73
   Python class w/o slots:       117MB    24.09      589

     * the native dict implementation does not allow indexed access,
       and is only included as a reference point.

[For more details and discussion of this specific application, please see
 this URL: http://opensource.theopalgroup.com/ ]
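
To give a flavour of the technique, here is a minimal sketch of a row
type that adds access by field name to a tuple without allocating a
per-instance dictionary. It is not the benchmarked code; make_row_class
and the field names are invented for the example.

```python
def make_row_class(field_names):
    """Build a tuple subclass whose fields are also readable by name."""
    index = {name: i for i, name in enumerate(field_names)}

    class Row(tuple):
        __slots__ = ()   # no per-instance __dict__
        _index = index   # one mapping shared by every row of the query

        def __getattr__(self, name):
            # Only reached when normal lookup fails, i.e. for field names.
            try:
                return self[self._index[name]]
            except KeyError:
                raise AttributeError(name) from None

    return Row


Row = make_row_class(("id", "name"))
row = Row((7, "spam"))
by_name = row.name
by_index = row[0]
```

The name-to-index mapping lives on the class, so its cost is paid once
per query rather than once per row.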

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From Paul.Moore@atosorigin.com  Wed Feb 20 15:53:41 2002
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 20 Feb 2002 15:53:41 -0000
Subject: [Python-Dev] Meta-reflections
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F2@UKRUX002.rundc.uk.origin-it.com>

From: Kevin Jacobs [mailto:jacobs@penguin.theopalgroup.com]
> > FWIW, I disagree with this completely. I would expect
> > slots with a particular name in derived classes to hide
> > the same-named slots in base classes. Whether or not the
> > base class slot is available via some sort of "super"
> > shenanigans is less relevant. But hiding semantics
> > is critical. How do you expect to reliably make it
> > possible to derive from a class with slots otherwise?
> > Perl gets into this sort of mess with its implementation
> > of objects.
>
> Attributes currently have a flat namespace, and the
> construct that I feel is most natural would maintain that
> characteristic. e.g.:

Oops. Sorry, I got that completely wrong. OK, I side with silent re-use.
That's what attributes do. [Meta-meta comment: the rule should be to work
"just like" attributes.] So why did you feel the need to separate this point
from your point (3)? It gives the impression that this point differs from
attributes, in contrast to the things mentioned in (3).

> > Agreed - up to a point. I don't see a need for a way
> > to distinguish between slots and "normal" attributes,
> > personally. But I don't do anything fancy here, so my
> > experience isn't very indicative.
>
> Without a more formal reflection API, the traditional
> way to get all normal dictionary attributes is by using
> instance.__dict__.keys().

The official, and supported, way is to use dir(). This was hashed out on
python-dev at the time. As I understand it, dir() always "worked", and was
extended to support slots when they were added. __dict__ clearly only
handles dict-based attributes, and so cannot be extended to include slots.
The official advice on reflection was therefore modified to point out that
dir() and __dict__.keys() were no longer equivalent, and dir() was "the way"
to get the full set. (Whether this advice was included into the formal
documentation, I couldn't confirm, but it was written down - arguing "if
it's not in the documentation, it's not official", is a little naive, given
the new and relatively experimental status of the whole area...)
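
A short sketch of the difference, using a class that has both slots and
a __dict__. The class and attribute names are made up for the example.

```python
class Point:
    __slots__ = ("x", "y")


class Tagged(Point):
    pass  # no __slots__ here, so instances also get a __dict__


p = Tagged()
p.x = 1             # stored in a slot
p.tag = "origin"    # stored in the instance dictionary

dict_names = sorted(p.__dict__.keys())
all_names = [name for name in dir(p) if not name.startswith("__")]
```

Here dict_names contains only tag, while all_names also lists x and y
(y appears even though it has not been assigned yet).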

> All I'm proposing is that instance.__slotattrs__ (or
> possibly instance.__class__.__slotattrs__) returns a
> list of objects that reveal the name of all slots in the
> instance (including those declared in base classes).

Do you have any reason why you would need to get a list of only slots, or
only dict-based attributes? Given that I'm arguing that the two should work
exactly the same (apart from storage and efficiency details), it seems
unreasonable to want to make the distinction (unless you're doing something
incestuous and low-level, when you're on your own...)

Remember, instance.__dict__['attrname'] is now regarded as incomplete in the
face of slots. Again, I point you to the descrintro document, just below the
discussion of slots, in the paragraphs starting from "The correct way to get
any attribute from self inside __getattribute__ is to call the base class's
__getattribute__ method".

> The only issue that concerns me is that I am not sure if the 
> slot to slot name mapping should be fixed.

I haven't melted my brain enough to understand the PEPs, but I believe that
there are ways of doing all sorts of low-level hacking with descriptors, if
you really want to. I don't believe that making this easy for "normal" users
is a good thing.

[BTW, this reminds me of your point on what is documented. I believe that
the PEPs and descrintro count as the canonical documentation of these
features. If they haven't been fully migrated into the Python documentation
set yet, that's a secondary issue. The PEPs *are* the definition of what was
agreed - people had time to comment on the PEPs at the time. And the
descrintro document is the current draft of the user-level documentation.
You can assume it will end up in the manuals. That's my view...]

> EXACTLY! I want to use slots (or slotattrs, or whatever
> you call them) to be solely an allocation declaration.
> For small objects that you expect to create many, many
> instances of, they make a huge difference.

In which case, you are giving the impression of wanting large changes, when
you really want things to stay as they are (modulo some relatively
non-controversial (IMHO) bugs). If you read the descrintro document on
slots, you will see that it presents an identical viewpoint. OK, there are
some technical restrictions, which will always apply because of the nature
of the optimisation, but the intention is clear. And it matches yours...

Paul.


From senn@maya.com  Wed Feb 20 16:00:37 2002
From: senn@maya.com (Jeff Senn)
Date: Wed, 20 Feb 2002 11:00:37 -0500
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
In-Reply-To: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz> (Greg
 Ewing's message of "Wed, 20 Feb 2002 16:01:34 +1300 (NZDT)")
References: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz>
Message-ID: <g03wqh22.fsf@SNIPE.maya.com>

Greg Ewing <greg@cosc.canterbury.ac.nz> writes:
> It's not clear exactly what you're after here. Are you
> trying to define the lowest-level interface upon which
> everything else will be built? If so, I think what you
> have presented is FAR too complex.
>
> It seems to me you need only two things:
...
>    t = tasklet(f)
>    t.transfer()

(Sorry if I missed something -- I've been *way* busy lately and
haven't been giving this much attention -- that said...)

But (if I understand the current plan) we will need mechanisms
internal to the Python interpreter to transfer values and maintain
blocked/running state anyway; since when you generate a tasklet and
run it:

 t = tasklet(f)
 t.transfer()

That may cause many more tasklets to be generated, run, and destroyed
that you don't ever see ...  recursions/function calls in f, and
only-Christian-knows what else...  so the transfer value mechanism
might as well be built in.

I haven't thought enough about the "unnamed produce-and-continue
function" to decide how exactly it should work.

I have two concerns in implementing uthreads this way (scheduler in
C):

 1 -- there doesn't seem to be any way to "kill" a tasklet
 2 -- the scheduling algorithm will be hard to tune (we'll probably
      *at least* need tasklet priority...)  Maybe there should still
      be a "timeslice" function so an in-Python scheduler can be written?

-- 
-Jas   --------------------     www.maya.com
       Jeff Senn          |   / / |-/ \ / /|®
       Chief Technologist |  /|/| |/ o | /-|
       Head of R&D        | Taming Complexity®



From fdrake@acm.org  Wed Feb 20 16:04:00 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Feb 2002 11:04:00 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F2@UKRUX002.rundc.uk.origin-it.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F2@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <15475.51440.121712.347215@grendel.zope.com>

Moore, Paul writes:
 > [BTW, this reminds me of your point on what is documented. I believe that
 > the PEPs and descrintro count as the canonical documentation of these
 > features. If they haven't been fully migrated into the Python documentation
 > set yet, that's a secondary issue. The PEPs *are* the definition of what was
 > agreed - people had time to comment on the PEPs at the time. And the
 > descrintro document is the current draft of the user-level documentation.
 > You can assume it will end up in the manuals. That's my view...]

This is my perspective as well.  I'm not in a hurry to document
relatively volatile features that may change and (hopefully!) be
available using a nicer syntax in the future.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From jacobs@penguin.theopalgroup.com  Wed Feb 20 16:32:17 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 11:32:17 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <15475.51440.121712.347215@grendel.zope.com>
Message-ID: <Pine.LNX.4.33.0202201116410.9061-100000@penguin.theopalgroup.com>

On Wed, 20 Feb 2002, Fred L. Drake, Jr. wrote:
> Moore, Paul writes:
>  > [BTW, this reminds me of your point on what is documented. I believe that
>  > the PEPs and descrintro count as the canonical documentation of these
>  > features. If they haven't been fully migrated into the Python documentation
>  > set yet, that's a secondary issue. The PEPs *are* the definition of what was
>  > agreed - people had time to comment on the PEPs at the time. And the
>  > descrintro document is the current draft of the user-level documentation.
>  > You can assume it will end up in the manuals. That's my view...]
>
> This is my perspective as well.  I'm not in a hurry to document
> relatively volatile feature that may change and (hopefully!) be
> available using a nicer syntax in the future.

Then what is the criterion for deciding when to apply the standard Python
deprecation procedures when things like super() and the __slots__ change?

  # Python 2.3?
  from __future__ import super_as_builtin
  from __future__ import hidden_slots

I had (possibly incorrectly) assumed that the criterion was when it was
officially documented in a release.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From A.M. Kuchling <akuchlin@mems-exchange.org>  Wed Feb 20 16:54:55 2002
From: A.M. Kuchling <akuchlin@mems-exchange.org> (Andrew Kuchling)
Date: Wed, 20 Feb 2002 11:54:55 -0500
Subject: [Python-Dev] Parser-SIG created
Message-ID: <E16da19-0005oR-00@dust.mems-exchange.org>

Parser SIG: Selection of a parser for the standard library

Description: This SIG is for discussing and comparing several
different parser generators in order to assess which one would be
worth including in the Python standard library.

Deliverables (in roughly this order):

1) A list of requirements for a parser generator suitable for
   inclusion.

2) If no parser meets those requirements, the SIG might work to
   enhance one or more parsers until the requirements are met.
   (It would be nice if this step became a null operation; otherwise
   we might fall prey to creeping scope.)
   
3) A recommendation for a parser to include, along with a patch
   against the Python CVS tree.  The BDFL can then ignore or follow the
   recommendation and patch as he sees fit.

Martin von Loewis presented a paper at Python10 comparing several
different parser generators in order to assess which one would be
worth adding to the standard library; it will likely serve as the
starting point for discussion.  Jonathan Riehl suggested creating a
Parser SIG, and I offered to champion it.

The SIG will aim to complete its task in time for Python 2.3.  No
schedule for 2.3 has been officially announced yet, but probably the
SIG will have to complete its mission by May or June 2002.

To join the SIG mailing list, use Mailman at:
   http://mail.python.org/mailman/listinfo/parser-sig/

--amk                                                  (www.amk.ca)
I can see you've been doing the TARDIS up a bit. I don't like it.
    -- The second Doctor, in "The Three Doctors"


From jacobs@penguin.theopalgroup.com  Wed Feb 20 17:08:07 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 12:08:07 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F2@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <Pine.LNX.4.33.0202201133170.9061-100000@penguin.theopalgroup.com>

On Wed, 20 Feb 2002, Moore, Paul wrote:
> Oops. Sorry, I got that completely wrong. OK, I side with silent re-use.
> That's what attributes do. [Meta-meta comment: the rule should be to work
> "just like" attributes.] So why did you feel the need to separate this point
> from your point (3)? It gives the impression that this point differs from
> attributes, in contrast to the things mentioned in (3).

Having a flat namespace (i.e., no hidden slots), and having all 'reachable'
slots in a single list are really two separate issues.  Right now, we have
this situation:

   class Foo(object):
     __slots__ = ('a','b')

   class Bar(Foo):
     __slots__ = ('c','d')

   bar=Bar()
   print bar.__slots__
   > ('c', 'd')
   print bar.__class__.__base__.__slots__
   > ('a', 'b')

   I contend that bar.__slots__ should be:
   > ('a', 'b', 'c', 'd')

I think somewhere along the line I may have mixed up which 'flatness' I was
talking about.
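
The flattened view argued for above can be sketched by walking the class's
MRO.  This is only a minimal sketch, assuming a modern Python 3 interpreter
rather than 2.2; all_slots is a hypothetical helper name, not an existing
API.

```python
def all_slots(cls):
    """Collect slot names from every class in the MRO, base classes first."""
    names = []
    for klass in reversed(cls.__mro__):
        slots = klass.__dict__.get('__slots__', ())
        if isinstance(slots, str):      # a bare string declares a single slot
            slots = (slots,)
        for name in slots:
            if name not in names:       # report a re-declared name only once
                names.append(name)
    return tuple(names)


class Foo(object):
    __slots__ = ('a', 'b')


class Bar(Foo):
    __slots__ = ('c', 'd')


print(Bar.__slots__)     # only the slots declared by Bar itself
print(all_slots(Bar))    # the flattened view
```

Returning a tuple keeps the result immutable, which matches the "immutable
flat tuple" wording of the proposal.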

> The official, and supported, way is to use dir().

I agree, but that "official" support has clear limitations.  Right now,
there are several examples in the Python standard library where we use
obj.__dict__.keys() --  most significantly in pickle and cPickle.  There is
also vars(obj), which may be the better reflection method, though it
currently doesn't know about slots.

> (Whether this advice was included into the formal documentation, I
> couldn't confirm, but it was written down - arguing "if it's not in the
> documentation, it's not official", is a little naive, given the new and
> relatively experimental status of the whole area...)

Naive, maybe, but saying "undocumented" is equivalent to "unsupported
implementation detail" saves us from having to maintain backward
compatibility and following the official Python deprecation process.

> Do you have any reason why you would need to get a list of only slots, or
> only dict-based attributes?

Yes.  Dict-based attributes always have values, while slot-based attributes
can be unset and raise AttributeErrors when trying to access them.  e.g.,
here is how I would handle pickling (excerpt from pickle.py):

        try:
            getstate = object.__getstate__
        except AttributeError:
            stuff = object.__dict__

            # added to support slots
            if hasattr(object, '__slots__'):
              for slot in object.__slots__:
                if hasattr(object, slot):
                  stuff[slot] = getattr(object, slot)

        else:
            stuff = getstate()
            _keep_alive(stuff, memo)
        save(stuff)
        write(BUILD)
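
The same idea can be sketched outside pickle.py as a standalone function.
This is a minimal sketch, assuming Python 3; get_state is a hypothetical
name, and unlike the excerpt above it copies the instance dict instead of
adding slot values to it, and it tolerates objects that have no __dict__
at all.

```python
def get_state(obj):
    """Return a dict of dict-based attributes plus any slots that are set."""
    state = dict(getattr(obj, '__dict__', {}))   # copy; never mutate obj.__dict__
    for klass in type(obj).__mro__:
        slots = klass.__dict__.get('__slots__', ())
        if isinstance(slots, str):               # a bare string is one slot
            slots = (slots,)
        for name in slots:
            if name in ('__dict__', '__weakref__'):
                continue                         # bookkeeping slots, not data
            if hasattr(obj, name):               # unset slots raise AttributeError
                state.setdefault(name, getattr(obj, name))
    return state


class Point(object):
    __slots__ = ('x', 'y')


p = Point()
p.x = 1                  # 'y' is deliberately left unset
print(get_state(p))
```

Only the slots that actually hold a value end up in the state, which is the
distinction between dict-based and slot-based attributes made above.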

> Given that I'm arguing that the two should work
> exactly the same (apart from storage and efficiency details), it seems
> unreasonable to want to make the distinction (unless you're doing something
> incestuous and low-level, when you're on your own...)

I'm not suggesting anything more incestuous and low-level than what is
already done in many, many, many places.  A larger, more-encompassing
proposal is definitely welcome.

> > EXACTLY! I want to use slots (or slotattrs, or whatever
> > you call them) to be solely an allocation declaration.
> > For small objects that you expect to create many, many
> > instance of, they make a huge difference.
>
> In which case, you are giving the impression of wanting large changes, when
> you really want things to stay as they are (modulo some relatively
> non-controversial (IMHO) bugs). If you read the descrintro document on
> slots, you will see that it presents an identical viewpoint. OK, there are
> some technical restrictions, which will always apply because of the nature
> of the optimisation, but the intention is clear. And it matches yours...

Well, I've not found resounding agreement on the first two of my three basic
issues/bugs I've raised so far:

  1) Flat slot namespaces: Objects should not hide slots when inherited by
     classes implementing the same slot name.

  2) Flat slot descriptions:  object.__slots__ being an immutable flat tuple
     of all slot names (including inherited ones), as opposed to being a
     potentially mutable sequence of only the slots defined by the most
     derived class.

  3) First class status for slot reflection: making slots picklable by
     default, returned by vars(object), and made part of other relevant
     reflection APIs and standard implementations.

The good news is that once Guido and others have spoken, I can have patches
that accomplish all of this fairly quickly.  I just don't want to do a lot
of unnecessary work if it won't be accepted.

Thanks,
-Kevin


--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From gmcm@hypernet.com  Wed Feb 20 18:27:58 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 20 Feb 2002 13:27:58 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <Pine.LNX.4.33.0202200747380.8251-100000@penguin.theopalgroup.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F1@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <3C73A45E.16869.1EABF7B2@localhost>

On 20 Feb 2002 at 8:30, Kevin Jacobs wrote:

> Attributes currently have a flat namespace, 

Instance attributes do, but that's a tautology.

> and the
> construct that I feel is most natural would maintain
> that characteristic.  e.g.:
> 
> class Base:
>   def __init__(self):
>     self.foo = 1
> 
> class Derived(Base):
>   def __init__(self):
>     Base.__init__(self)
>     self.foo = 2          # this is the same foo as in
>     Base

But these aren't:

 class Base:
  foo = 1

 class Derived(Base):
  foo = 2
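
A quick way to see that the two class attributes stay distinct (a minimal
sketch, assuming Python 3; only the two class definitions come from the
message above):

```python
class Base:
    foo = 1


class Derived(Base):
    foo = 2


d = Derived()
print(d.foo)                   # found on Derived first
print(super(Derived, d).foo)   # Base's binding is still reachable
print(Base.foo)                # and is unchanged
```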

-- Gordon
http://www.mcmillan-inc.com/



From JeffH@ActiveState.com  Wed Feb 20 18:33:25 2002
From: JeffH@ActiveState.com (Jeff Hobbs)
Date: Wed, 20 Feb 2002 10:33:25 -0800
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <m3g03x8bnr.fsf@mira.informatik.hu-berlin.de>
Message-ID: <001101c1ba3d$15e9cda0$ba03a8c0@activestate.ca>

> So if TCL_DONT_WAIT isn't set, it will block; if it is, it will
> busy-wait. Looks like we lose either way.
> 
> In-between, it invokes the setupProcs of each input source, so that
> they can set a maxblocktime, but I don't think _tkinter should hack
> itself into that process.

That's correct - I should have looked a bit more into what I did
before (I was always tying in another GUI's event loop).  However,
I don't see why you should not consider the extra event source.
Tk uses this itself for X.  It would be something like:

[in tk setup]
    Tcl_CreateEventSource(TkinterSetupProc, NULL, NULL);

/*
 *----------------------------------------------------------------------
 *
 * TkinterSetupProc --
 *
 *	This procedure implements the setup part of the Tkinter
 *	event source.  It is invoked by Tcl_DoOneEvent before entering
 *	the notifier to check for events on all displays.
 *
 * Results:
 *	None.
 *
 * Side effects:
 *	The maximum block time will be set to 20000 usecs to ensure that
 *	the notifier returns control to Tcl.
 *
 *----------------------------------------------------------------------
 */

static void
TkinterSetupProc(clientData, flags)
    ClientData clientData;	/* Not used. */
    int flags;
{
    static Tcl_Time blockTime = { 0, 20000 };
    Tcl_SetMaxBlockTime(&blockTime);
}

In fact, you can look at tk/unix/tkUnixEvent.c to see something
similar already done in Tk.

> About thread-safety: Is Tcl 8.3 thread-safe in its standard
> installation, so that we can just use it from multiple threads? If
> not, what is the compile-time check to determine whether it is
> thread-safe? If there is none, I really don't see a solution, and the

You would compile with --enable-threads (both Tcl and Tk).

Jeff


From jacobs@penguin.theopalgroup.com  Wed Feb 20 18:49:51 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 13:49:51 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <3C73A45E.16869.1EABF7B2@localhost>
Message-ID: <Pine.LNX.4.33.0202201340270.11372-100000@penguin.theopalgroup.com>

On Wed, 20 Feb 2002, Gordon McMillan wrote:
> On 20 Feb 2002 at 8:30, Kevin Jacobs wrote:
> > Attributes currently have a flat namespace,
>
> Instance attributes do, but that's a tautology.

Yes, though one implication of the new slots mechanism in Python 2.2 is that
we now have non-flat per-instance namespaces for slots.  i.e., we would
have per-instance slots that would hide other per-instance slots of the same
name from ancestor classes:

  class Base(object):
     __slots__ = ['foo']
     def __init__(self):
       self.foo = 1          # which slot this sets depends on type(self)
                             # if type(self) == Base, then the slot is
                             # described by Base.foo.
                             # else if type(self) == Derived, then the
                             # slot is described by Derived.foo

  class Derived(Base):
     __slots__ = ['foo']
     def __init__(self):
       Base.__init__(self)
       self.foo = 2          # this is NOT the same foo as in Base

  o = Derived()
  print o.foo
  > 2
  o.__class__.__base__.foo = 3
  print o.foo
  > 2
  print o.__class__.__base__.foo
  > 3

So slots, as currently implemented, do not act like attributes, and this
whole discussion revolves around whether they should or should not.
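
The hidden per-instance storage can be made visible by going through the
descriptor stored on the base class.  This is a minimal sketch, assuming
CPython 3, where slot descriptors expose __get__ and __set__; the name
base_slot is only illustrative.

```python
class Base(object):
    __slots__ = ['foo']

    def __init__(self):
        self.foo = 1      # resolved through type(self), so Derived.foo if subclassed


class Derived(Base):
    __slots__ = ['foo']   # re-declares the name, allocating a second cell

    def __init__(self):
        Base.__init__(self)
        self.foo = 2


o = Derived()
base_slot = Base.__dict__['foo']        # descriptor for the hidden cell

print(o.foo)                            # value in the cell described by Derived.foo
try:
    base_slot.__get__(o, Derived)
except AttributeError:
    print("Base's cell was never set")

base_slot.__set__(o, 'hidden')          # write to the hidden cell directly
print(o.foo, base_slot.__get__(o, Derived))
```

Both assignments in the two __init__ methods land in the cell described by
Derived.foo, so the cell described by Base.foo stays empty until it is
written through its own descriptor.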

Regards,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From DavidA@ActiveState.com  Wed Feb 20 19:25:42 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Wed, 20 Feb 2002 11:25:42 -0800
Subject: [Python-Dev] Meta-reflections
References: <Pine.LNX.4.33.0202201340270.11372-100000@penguin.theopalgroup.com>
Message-ID: <3C73F836.5FCDB2EE@activestate.com>

Kevin Jacobs wrote:

> Yes, though one implication of the new slots mechanism in Python 2.2 is that
> we now have have non-flat per-instance namespaces for slots.  i.e., we would
> have per-instance slots that would hide other per-instance slots of the same
> name from ancestor classes:
> 
>   class Base(object):
>      __slots__ = ['foo']
>      def __init__(self):
>        self.foo = 1          # which slot this sets depends on type(self)
>                              # if type(self) == Base, then the slot is
>                              # described by Base.foo.
>                              # else if type(self) == Derived, then the
>                              # slot is described by Derived.foo
> 
>    class Derived(Base):
>      __slots__ = ['foo']
>      def __init__(self):
>        Base.__init__(self)
>        self.foo = 2          # this is NOT the same foo as in Base
> 
>   o = Derived()
>   print o.foo
>   > 2
>   o.__class__.__base__.foo = 3
>   print o.foo
>   > 2
>   print o.__class__.__base__.foo
>   > 3
> 
> So slots, as currently implemented, do not act like attributes, and this
> whole discussion revolves around whether they should or should not.

This example is not a great example of that, since the code above does
exactly the same thing if you delete the lines defining __slots__. 
You're modifying class attributes in that case, but I think it's
important to keep the examples which illustrate the problem "pure" and
"realistic".

My take on this thread is that I think it's simply not going to happen
that slots are going to act 100% like attributes except for
performance/memory considerations.  It would be nice, but if that had
been possible, then they'd simply be an internal optimization and would
have no syntactic impact.

There are much more shallow ways in which slots aren't like attributes:

>>> class A(object):
...   __slots__ = ('a',)
...
>>> a = A()
>>> a.a = 123             # set a slot on a
>>> A.a = 555             # set a slot on A
>>> a.a                   # Way 1: A's slot overrides a's
555
>>> b = A()
>>> b.a                   
555
>>> del A.a
>>> a.a                   # Way 2: deleting the class slot
                          # did not uncover the instance slot
AttributeError: 'A' object has no attribute 'a'

--david


From martin@v.loewis.de  Wed Feb 20 19:33:01 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 20 Feb 2002 20:33:01 +0100
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <Pine.LNX.4.33.0202201116410.9061-100000@penguin.theopalgroup.com>
References: <Pine.LNX.4.33.0202201116410.9061-100000@penguin.theopalgroup.com>
Message-ID: <m3r8ngj6du.fsf@mira.informatik.hu-berlin.de>

Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes:

> Then what is the criterion for deciding when to apply the standard Python
> deprecation procedures when things like super() and the __slots__ change?

It may be that the change does not need to involve deprecation of
anything; first let's see the new feature, then decide how to
deprecate the existing one.

Regards,
Martin



From JeffH@ActiveState.com  Wed Feb 20 19:43:00 2002
From: JeffH@ActiveState.com (Jeff Hobbs)
Date: Wed, 20 Feb 2002 11:43:00 -0800
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <m3g03x8bnr.fsf@mira.informatik.hu-berlin.de>
Message-ID: <002601c1ba46$ce683f20$ba03a8c0@activestate.ca>

> In-between, it invokes the setupProcs of each input source, so that
> they can set a maxblocktime, but I don't think _tkinter should hack
> itself into that process.

BTW in addition to my last message, you might want to create
an ExitHandler that deletes the event source.  Also, you might
add more code to the TkinterSetupProc to only set a block time
if multiple threads are actually used (or only create the
event source at that time).  This would make simple Tkinter
apps be efficient and snappy all the time.

Jeff


From jacobs@penguin.theopalgroup.com  Wed Feb 20 19:49:43 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 14:49:43 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <3C73F836.5FCDB2EE@activestate.com>
Message-ID: <Pine.LNX.4.33.0202201434310.11865-100000@penguin.theopalgroup.com>

On Wed, 20 Feb 2002, David Ascher wrote:
> Kevin Jacobs wrote:
> This example is not a great example of that, since the code above does
> exactly the same thing if you delete the lines defining __slots__.

True, which is why the current implementation (IMHO) isn't broken; just
flawed.  There is, in effect, a flat slot namespace, only by virtue of the
fact that there is no simple and explicit slot resolution syntax.  This
basically means that most arguments for using the current implementation of
slots as a data-hiding mechanism over inheritance are very weak unless
significant additional syntax is created.

> You're modifying class attributes in that case, but I think it's
> important to keep the examples which illustrate the problem "pure" and
> "realistic".

Nope -- they aren't class attributes at all, they are per-instance slots
with class-level descriptors (with which you expose another bug below).

> My take on this thread is that I think it's simply not going to happen
> that slots are going to act 100% like attributes except for
> performance/memory considerations.  It would be nice, but if that had
> been possible, then they'd simply be an internal optimization and would
> have no syntactic impact.

I'd like to know why else you think that?  I'm fairly confident that I can
submit a patch that accomplishes this (and even fix the following issue).

> There are much more shallow ways in which slots aren't like attributes:
>
> >>> class A(object):
> ...   __slots__ = ('a',)
> ...
> >>> a = A()
> >>> a.a = 123             # set a slot on a
> >>> A.a = 555             # set a slot on A
> >>> a.a                   # Way 1: A's slot overrides a's
> 555
> >>> b = A()
> >>> b.a
> 555
> >>> del A.a
> >>> a.a                   # Way 2: deleting the class slot
>                           # did not uncover the instance slot
> AttributeError: 'A' object has no attribute 'a'

Ouch!  You've discovered another problem with the current implementation.
You have effectively removed the slot descriptor from class A and replaced
it with a class attribute.  In fact, I don't think you can ever re-create
the slot descriptor!  This is actually the best form of data hiding in pure
Python I've seen to date.  The fix is to make slot descriptors read-only,
like the rest of the immutable class attributes.
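
The sequence can be replayed as a script.  This is a minimal sketch,
assuming CPython 3; it also assumes a reference to the descriptor was saved
before it was overwritten (the name saved is illustrative), which is the
only way this sketch gets the slot back.

```python
class A(object):
    __slots__ = ('a',)


a = A()
a.a = 123
saved = A.__dict__['a']      # keep a reference to the slot descriptor

A.a = 555                    # rebinding on the class replaces the descriptor
print(a.a)                   # the plain class attribute now shadows the slot

del A.a                      # neither descriptor nor class attribute remains
try:
    a.a = 1100
except AttributeError as exc:
    print('cannot set:', exc)

A.a = saved                  # reinstalling the saved descriptor restores access
print(a.a)                   # the slot's storage was untouched throughout
```

Without the saved reference the storage is still allocated in every
instance but is no longer reachable by name.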

Sigh,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From DavidA@ActiveState.com  Wed Feb 20 19:57:37 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Wed, 20 Feb 2002 11:57:37 -0800
Subject: [Python-Dev] Meta-reflections
References: <Pine.LNX.4.33.0202201434310.11865-100000@penguin.theopalgroup.com>
Message-ID: <3C73FFB1.667C2D8B@activestate.com>

Kevin:

> > My take on this thread is that I think it's simply not going to happen
> > that slots are going to act 100% like attributes except for
> > performance/memory considerations.  It would be nice, but if that had
> > been possible, then they'd simply be an internal optimization and would
> > have no syntactic impact.
> 
> I'd like to know why else you think that?  

Ye old poverty of the imagination argument, coupled with the assumption
that Guido had time to finish what he'd started I guess =).

> I'm fairly confident that I can
> submit a patch that accomplishes this (and even fix the following issue).

Great!  

I can't channel Guido very well, but every time that he's talked about
slots in my presence, he talked about their main purpose as a
memory-savings one.  If he didn't have other intents, and if you can
limit their impact to pure memory savings, then more power to all of us
thanks to you.

> Ouch!  You've discovered another problem with the current implementation.
> You have effectively removed the slot descriptor from class A and replaced
> it with a class attribute.  In fact, I don't think you can ever re-create
> the slot descriptor!  

Ooh, cool.  You're right:

after deleting the class attribute:

>>> a.a = 1100
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'A' object has no attribute 'a'

which is a really wacky error message if you look at the case of the
letters...

--david


From jacobs@penguin.theopalgroup.com  Wed Feb 20 20:09:07 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 20 Feb 2002 15:09:07 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <3C73FFB1.667C2D8B@activestate.com>
Message-ID: <Pine.LNX.4.33.0202201501300.11865-100000@penguin.theopalgroup.com>

On Wed, 20 Feb 2002, David Ascher wrote:
> > > My take on this thread is that I think it's simply not going to happen
> > > that slots are going to act 100% like attributes except for
> > > performance/memory considerations.  It would be nice, but if that had
> > > been possible, then they'd simply be an internal optimization and would
> > > have no syntactic impact.
> >
> > I'd like to know why else you think that?
>
> Ye old poverty of the imagination argument, coupled with the assumption
> that Guido had time to finish what he'd started I guess =).

This is why I haven't unleashed a patch, even though I pretty much know
exactly how to fix all of the problems we've noticed and make slots work as
I imagine they should.  Some of the things I heard Guido say at IPC10 lead
me to believe that he has something up his sleeve wrt slots (specifically
some plan about storing dicts in __slots__ that do something nifty).

So I'm going to sit on my hands until Guido gets back into town and sets us
all straight.

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From greg@cosc.canterbury.ac.nz  Thu Feb 21 00:08:01 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 21 Feb 2002 13:08:01 +1300 (NZDT)
Subject: [Python-Dev] Stackless Design Q.
In-Reply-To: <3C739A2F.5030502@tismer.com>
Message-ID: <200202210008.NAA22179@s454.cosc.canterbury.ac.nz>

Christian Tismer <tismer@tismer.com>:

> I'd like to keep track
> of tasklets, since they are now containing vital
> information about stack state, and I cannot afford
> to lose one of them, or we'll crash.

I'm not sure what the problem is here. A tasklet isn't going to go
away until there are no more references to it anywhere, and once that
happens, there is no longer any way of switching to it.

> So my basic idea was to provide what is needed to get uthreads at
> very high speed, without the need to use Python for the basic
> machinery.

Well, the higher-level stuff doesn't *have* to be implemented in
Python. But I think it should in principle be possible. Then you can
experiment with different designs for higher-level facilities in
Python, find out which ones are most useful, and re-code those in C
later.

> My tasklets will also support threading, that is they will become
> auto-scheduled if the user switches this on.

I'm not sure how this is going to interact with the facility for
switching to a specific tasklet. Seems to me that, in the presence of
pre-emptive scheduling, it no longer makes sense to do so, since some
other tasklet could easily get scheduled a moment later.  The most you
can do is say "I don't want to run any more now, let some other
tasklet have a go".

So it appears that we already have two distinct layers of
functionality here: a low-level, non-preemptive layer where we
explicitly switch from one tasklet to another, and a higher-level,
preemptive one where we let the scheduler take care of picking what to
run next. 

These two layers should be clearly separated, with the higher one
built strictly on the facilities provided by the lower one. In
particular, there should be exactly one way of switching between
tasklets, i.e. by calling t.transfer().  Preemptive switching should
be done by some kind of signal or event handler which does this.

> But auto-scheduled frames are a different kind
> of thing than those which are in "waiting for data"
> state. I need to distinguish them or I will crash.

If you get rid of the idea of passing values between tasklets as part
of the switching process, then this distinction disappears. I think
that value-passing and tasklet-switching are orthogonal activities and
would be better decoupled.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Thu Feb 21 00:59:46 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 21 Feb 2002 13:59:46 +1300 (NZDT)
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
In-Reply-To: <g03wqh22.fsf@SNIPE.maya.com>
Message-ID: <200202210059.NAA22187@s454.cosc.canterbury.ac.nz>

Jeff Senn <senn@maya.com>:

> That may cause many more tasklets to be generated, run, and
> destroyed that you don't ever see ...  recursions/function calls in
> f, and only-Christian-knows what else...  so the transfer value
> mechanism might as well be built in.

I don't understand what you mean. Are you saying that every function
call creates a new tasklet? That stack frame == tasklet?  If that's
the case, then we're back to continuations! But I don't think so,
because Christian said that a tasklet contains "a chain of frames",
not just one frame.

> 2 -- the scheduling algorithm will be hard to tune (we'll probably
>      *at least* need tasklet priority...)  Maybe there should still
>      be a "timeslice" function so an in-Python scheduler can be written?

The huge variety of possible scheduling policies is all the more
reason *not* to make scheduling part of the core functionality.
The user should be free to implement his own scheduling layer
on top of the primitives if he doesn't like what is provided.
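
As an illustration of a scheduling layer written in Python on top of a bare
switching primitive, here is a minimal sketch.  It assumes Python 3 and uses
ordinary generators as a stand-in for tasklets, with yield standing in for
giving up control; it is not the Stackless API, and round_robin and worker
are hypothetical names.

```python
from collections import deque


def round_robin(tasks):
    """Run generator-based tasks in turn until all have finished."""
    ready = deque(tasks)
    order = []
    while ready:
        task = ready.popleft()
        try:
            order.append(next(task))   # resume the task until its next yield
        except StopIteration:
            continue                   # finished tasks are dropped
        ready.append(task)             # otherwise requeue at the back
    return order


def worker(name, steps):
    for i in range(steps):
        yield (name, i)                # yield stands in for giving up control


print(round_robin([worker('a', 2), worker('b', 3)]))
```

Swapping the deque for a priority queue would give a different policy
without touching the switching mechanism itself.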

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From gmcm@hypernet.com  Thu Feb 21 02:00:50 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 20 Feb 2002 21:00:50 -0500
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
In-Reply-To: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz>
References: <3C72A453.7080905@tismer.com>
Message-ID: <3C740E82.8148.204A91B4@localhost>

On 20 Feb 2002 at 16:01, Greg Ewing wrote:

[Christian's plea]

> It seems to me you need only two things:
> 
> (1) A constructor for new tasklets:
> 
>    t = tasklet(f)

[snip] 

> (2) A way of switching to another tasklet:
> 
>    t.transfer()

[snip]

> All the other stuff you talk about -- passing
> values between tasklets, rings of runnable tasklets,
> scheduling policies, etc -- can all be implemented
> in Python on top of these primitives. 

Unless you've got a way to detect or pass tasklets through transfer,
you don't have enough.

-- Gordon
http://www.mcmillan-inc.com/



From greg@cosc.canterbury.ac.nz  Thu Feb 21 02:08:47 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 21 Feb 2002 15:08:47 +1300 (NZDT)
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
In-Reply-To: <3C740E82.8148.204A91B4@localhost>
Message-ID: <200202210208.PAA22202@s454.cosc.canterbury.ac.nz>

Gordon McMillan <gmcm@hypernet.com>:

> Unless you've got a way to detect or pass tasklets through transfer,
> you don't have enough.

You'll have to elaborate. I don't have any idea what
you mean by that!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From Paul.Moore@atosorigin.com  Thu Feb 21 09:58:36 2002
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Thu, 21 Feb 2002 09:58:36 -0000
Subject: [Python-Dev] Meta-reflections
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F3@UKRUX002.rundc.uk.origin-it.com>

From: Kevin Jacobs [mailto:jacobs@penguin.theopalgroup.com]
> Having a flat namespace (i.e., no hidden slots), and having 
> all 'reachable' slots in a single list are really two
> separate issues.  Right now, we have this situation:
> 
>    class Foo(object):
>      __slots__ = ('a','b')
> 
>    class Bar(Foo):
>      __slots__ = ('c','d')
> 
>    bar=Bar()
>    print bar.__slots__
>    > ('c', 'd')
>    print bar.__class__.__base__.__slots__
>    > ('a', 'b')
> 
>    I contend that bar.__slots__ should be:
>    > ('a', 'b', 'c', 'd')
> 
> I think somewhere along the line I may have mixed up which 
> 'flatness' I was talking about.

Um. Be aware that I'm not 100% sure about the "no hidden slots" point. I
only support it on the basis of acting the same as attributes. But I'm not
sure about it for attributes, either... (although I doubt it will ever
change).

As for your contention over __slots__, I don't agree. I don't have a strong
view, but I feel that __slots__ is really only meant as a way of *defining*
the slots (and as such, may be replaced later by better syntax). Think of it
as write-only. Modifying it after the class is defined, or reading it,
aren't really well defined (and don't need to be). If slot definition had
been spelt "slot a" instead of "__slots__ = ['a']", you wouldn't necessarily
expect to have a readable attribute containing the list of slots...
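
For reference, the two views of __slots__ being debated can be put side by
side. This is only a sketch in current Python 3 syntax, not the 2.2 under
discussion; all_slots() is a hypothetical helper, not an existing API. It
walks the MRO and collects what each class declared, which gives the
flattened tuple Kevin is asking for while leaving each class's own
__slots__ exactly as it was written.

```python
class Foo(object):
    __slots__ = ('a', 'b')

class Bar(Foo):
    __slots__ = ('c', 'd')

def all_slots(cls):
    """Hypothetical helper: slot names from every class in the MRO, bases first."""
    names = []
    for klass in reversed(cls.__mro__):
        # Look only at what this class itself declared, not what it inherits.
        slots = klass.__dict__.get('__slots__', ())
        if isinstance(slots, str):   # a bare string declares a single slot
            slots = (slots,)
        for name in slots:
            if name not in names:
                names.append(name)
    return tuple(names)

print(Bar.__slots__)    # only what Bar itself declared
print(all_slots(Bar))   # flattened view across the inheritance chain
```

Keeping the flattening in a separate function means reading __slots__ still
returns what was assigned to it.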

> > The official, and supported, way is to use dir().
> 
> I agree, but that "official" support has clear limitations.  

I'm not sure what you mean.

> Right now, there are several examples in the Python standard
> library where we use obj.__dict__.keys() --  most significantly
> in pickle and cPickle.

But aren't we agreed that this is the source of a bug (that slots aren't
picklable)?

> There is also the vars(obj), which may be the better reflection 
> method, though it currently doesn't know about slots.

Possibly you're right. This could easily be raised as a feature request. And
possibly even as a bug (vars() should know about slots).
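
To make the dir()/vars() difference concrete, here is a small sketch in
current Python 3 (an assumption; the thread is about 2.2, and Point is a
made-up class). dir() reports the slot names because they are descriptors
on the class, while vars() has nothing to return for an instance that has
no __dict__ at all.

```python
class Point(object):
    __slots__ = ('x', 'y')

p = Point()
p.x = 1

# dir() sees the slot names, whether or not they have been assigned.
slot_names_in_dir = [n for n in dir(p) if n in ('x', 'y')]

# vars() needs an instance __dict__, which a purely slotted instance lacks.
try:
    vars(p)
    vars_error = None
except TypeError as exc:
    vars_error = exc

print(slot_names_in_dir)
print(type(vars_error).__name__)
```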

> Naive, maybe, but saying "undocumented" is equivalent to "unsupported
> implementation detail" saves us from having to maintain backward
> compatibility and following the official Python deprecation process.

Equally, saying "it's not in the manual, so tough luck" is unreasonably
harsh. We need to be reasonable about this.

> > Do you have any reason why you would need to get a list of 
> > only slots, or only dict-based attributes?
> 
> Yes.  Dict-based attributes always have values, while 
> slot-based attributes can be unset and raise AttributeErrors
> when trying to access them.

Hmm. I could argue this a couple of ways. Slots should contain None when
unassigned (no AttributeErrors), or code should be (re-)written to cope with
AttributeError from things in dir(). I wouldn't argue it as "slots can raise
AttributeError, so we need to treat slots and dict-based attributes
separately, in 2 passes".

> here is how I would handle pickling (excerpt from pickle.py):
> 
>         try:
>             getstate = object.__getstate__
>         except AttributeError:
>             stuff = object.__dict__
> 
>             # added to support slots
>             if hasattr(object, '__slots__'):
>               for slot in object.__slots__:
>                 if hasattr(object, slot):
>                   stuff[slot] = getattr(object, slot)
> 
>         else:
>             stuff = getstate()
>             _keep_alive(stuff, memo)
>         save(stuff)
>         write(BUILD)

Why not just change the line stuff = object.__dict__ to

    stuff = [a for a in dir(object)
             if hasattr(object,a) and not callable(getattr(object,a))]

[The hasattr() gets rid of unbound slots - this may be another argument for
unbound slots containing None, and the callable() gets rid of methods]. Then
slots are covered, as well as any future non-dict-based attribute types.

If vars(obj) was fixed to include slots, like dir() was, then this could be
reduced to "stuff = vars(object)" (modulo protection against
AttributeError).

> I'm not suggesting anything more incestuous and low-level than what is
> already done in many, many, many places.  A larger, more-encompassing
> proposal is definitely welcome.

I'm not sure we need a larger proposal. A smaller one may work better. I'm
arguing above for

1. Make unbound slots return None rather than AttributeError
2. Make vars() return slots as well as dict-based attributes
3. Document __dict__ as legacy usage, not slots-aware
4. Fix bugs caused by use of the legacy __dict__ approach
5. Educate users in the new approaches which are slots-aware
   (dir/vars, calling base class setattr, etc)

(and maybe a sixth, don't make __slots__ a reflection API - make it an
implementation detail, a bit like __dict__ is now viewed)

> Well, I've not found resounding agreement on the first two of 
> my three basic issues/bugs I've raised so far:
> 
>   1) Flat slot namespaces: Objects should not hide slots 
>      when inherited by classes implementing the same slot name.

You're right - I'm not in "resounding" agreement. I think it's probably
better, for consistency with dict-based attributes, but I sort of wish it
wasn't. (The fact that I've not hit problems because of the equivalent
property of attributes means that I'm probably wrong, and attributes are
fine as they are, though.)

>   2) Flat slot descriptions:  object.__slots__ being an 
>      immutable flat tuple of all slot names (including
>      inherited ones), as opposed to being a potentially
>      mutable sequence of only the slots defined by the most
>      derived class.

This I disagree with. I think __slots__ should be immutable. But I'm happy
with "don't do that" as the way of implementing immutability, if that's what
Guido prefers. I definitely don't think __slots__ should return something
different when you read it than what you assigned to it in the first place
(which is what including inherited slots does). But I don't really think
people have any business reading __slots__ in any case (see my arguments
above).

>   3) First class status for slot reflection: making slots picklable by
>      default, returned by vars(object), and made part of 
>      other relevant reflection APIs and standard implementations.

I agree on this one.

Paul.


From tismer@tismer.com  Thu Feb 21 12:13:28 2002
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 21 Feb 2002 13:13:28 +0100
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
References: <200202210008.NAA22179@s454.cosc.canterbury.ac.nz>
Message-ID: <3C74E468.1090403@tismer.com>

Greg Ewing wrote:

> Christian Tismer <tismer@tismer.com>:


...


>>But auto-scheduled frames are a different kind
>>of thing than those which are in "waiting for data"
>>state. I need to distinguish them or I will crash.
>>
> 
> If you get rid of the idea of passing values between tasklets as part
> of the switching process, then this distinction disappears. I think
> that value-passing and tasklet-switching are orthogonal activities and
> would be better decoupled.


Hmm, first I thought you were wrong:

Any Python function that calls something, may it be a stackless
schedule function or something else, expects a value to
be returned. Always and ever.

But when I have a scheduler counter built into the Python
interpreter loop, then a schedule will happen *between*
opcodes. Such a frame is not awaiting data, and therefore
not suitable to be switched to by one which is in data
transfer.

Now I see it: You mean I can make this schedule function behave
like a normal function call, that accepts and drops a dummy
value? In fact, this would make all tasklets compatible.
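
The idea can be illustrated with plain generators. This is an analogy
only, not the Stackless API: worker() and run_round_robin() are
hypothetical names, and a generator's yield stands in for the schedule
call. Every scheduling point looks like a call that returns a value, and
the scheduler always sends a dummy that the tasklet drops, so no frame has
to be treated specially for "waiting for data".

```python
from collections import deque

def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        # Scheduling point: give up control; whatever is sent back is dropped.
        _ = yield

def run_round_robin(tasklets):
    ready = deque(tasklets)
    while ready:
        t = ready.popleft()
        try:
            t.send(None)        # the dummy value
        except StopIteration:
            continue            # this tasklet has finished
        ready.append(t)

log = []
run_round_robin([worker('a', 2, log), worker('b', 3, log)])
print(log)
```

Passing real data between tasklets could then be layered on top (for
example through a shared queue) without touching the switching step.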

thinking - thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net/
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
      where do you want to jump today?   http://www.stackless.com/




From jacobs@penguin.theopalgroup.com  Thu Feb 21 12:36:14 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 21 Feb 2002 07:36:14 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F3@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <Pine.LNX.4.33.0202210644490.16498-100000@penguin.theopalgroup.com>

On Thu, 21 Feb 2002, Moore, Paul wrote:
> > I agree, but that "official" support has clear limitations.
>
> I'm not sure what you mean.

When you request dir(object), there is a fairly significant amount of work
done.  Even though it is implemented in C, I can foresee a non-trivial
performance hit to a great deal of code.  vars(object) is better, though it
also has some performance implications.

I know that we are talking about Python, and performance is not of paramount
importance.  But realize that my company produces _extremely_ large Python
applications used for financial and business tracking and forecasting.  We
are acutely aware of bottle-necks in critical paths such as object
serialization.  I'm just not looking forward to 25% slowdowns in pickling
(number pulled out of hat) and I'm sure the Zope guys aren't either...

> > Right now, there are several examples in the Python standard
> > library where we use obj.__dict__.keys() --  most significantly
> > in pickle and cPickle.
>
> But aren't we agreed that this is the source of a bug (that slots aren't
> picklable)?

That is not the bug -- if for no other reason, the standard library is free
to use implementation specific knowledge.  Getting obj.__dict__ is a really
slick and efficient way to reflect on all normal instance variables.

> > Naive, maybe, but saying "undocumented" is equivalent to "unsupported
> > implementation detail" saves us from having to maintain backward
> > compatibility and following the official Python deprecation process.
>
> Equally, saying "it's not in the manual, so tough luck" is unreasonably
> harsh. We need to be reasonable about this.

I don't mean to imply that we need to be harsh, though in some classes we do
not want to worry about backward compatibility.  How do we tell users which
features are safe to use, so that they don't write thousands of lines of
code that suddenly break when the next Python version is released?  Well,
not documenting it in the official Python reference manual is a pretty good
way.  Personally, I'm extremely wary of using anything that isn't.  Such
features can also be documented in the reference manual and explicitly
marked as "subject to change", but that is not the case we are currently
dealing with.

> > Yes.  Dict-based attributes always have values, while
> > slot-based attributes can be unset and raise AttributeErrors
> > when trying to access them.
>
> Hmm. I could argue this a couple of ways. Slots should contain None when
> unassigned (no AttributeErrors), or code should be (re-)written to cope with
> AttributeError from things in dir(). I wouldn't argue it as "slots can raise
> AttributeError, so we need to treat slots and dict-based attributes
> separately, in 2 passes".

I don't see how filling slots with default values is compatible with the
premise that we want slots to act as close to normal instance attributes as
possible.  I've implemented quite a few things with slots.  In fact, I have
an experimental branch of a 200k LOC project that re-implements many low level
components using slots.  The speed-ups and memory savings from doing so are
very, very compelling.  There are cases where I declare slots that may never
be used during any particular instance lifetime.  I do expect them to raise
an AttributeError, just like they would have before they were slots.  If I
fill the slot, assigning it to None has another very different semantic
meaning than an AttributeError.
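
A short sketch of that distinction, written for current Python 3 (Node is
a made-up class): a slot that was never assigned raises AttributeError,
while a slot explicitly set to None is a present attribute whose value
happens to be None.

```python
class Node(object):
    __slots__ = ('value', 'cache')

n = Node()
n.value = None                     # assigned: the slot holds None

has_value = hasattr(n, 'value')    # the slot is filled
has_cache = hasattr(n, 'cache')    # the slot was never assigned

try:
    n.cache
    raised = False
except AttributeError:
    raised = True

print(has_value, has_cache, raised)
```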

Another good example is pickling.  Why would you ever want to pickle empty
slots?  They have no value, not even a default one, so why waste the
processor cycles or the disk space?

> Why not just change the line stuff = object.__dict__ to
>
>     stuff = [a for a in dir(object)
>              if hasattr(object,a) and not callable(getattr(object,a))]

Um, because it's wrong?  I pickle _lots_ of callable objects.  It also
pickles class-attributes as instance-attributes.  Here is a better version
that can be used once vars(object) has been fixed:

  stuff = dict([ (a,getattr(object,a)) for a in vars(object) if hasattr(object,a)])

Note that it does an unnecessary getattr, hasattr, memory allocation and
incurs loop overhead on every dict attribute, but otherwise it should work
once vars is fixed.
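
For comparison, here is a sketch that does not wait for vars() to change.
It is written for current Python 3, and gather_state() is a hypothetical
helper, not what pickle.py does. It copies the instance __dict__ when one
exists, then walks the MRO for declared slots and skips the empty ones.
Name-mangled private slots are not handled here.

```python
def gather_state(obj):
    """Hypothetical helper: dict-based attributes plus every filled slot."""
    state = dict(getattr(obj, '__dict__', {}))
    for klass in type(obj).__mro__:
        slots = klass.__dict__.get('__slots__', ())
        if isinstance(slots, str):      # a bare string declares a single slot
            slots = (slots,)
        for name in slots:
            if name in ('__dict__', '__weakref__'):
                continue                # bookkeeping slots, not instance data
            if hasattr(obj, name):      # empty slots are skipped
                state[name] = getattr(obj, name)
    return state

class Base(object):
    __slots__ = ('a', 'b')

class Derived(Base):
    __slots__ = ('c', '__dict__')       # slots plus an ordinary instance dict

d = Derived()
d.a = 1
d.c = 3
d.extra = 'dict-based'
print(gather_state(d))
```

Only declared slots are probed with hasattr/getattr, so ordinary dict
attributes cost a single dict copy.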

> 1. Make unbound slots return None rather than AttributeError

I am strongly against this.  It doesn't make sense to start supplying
implicit default values.  Explicit is better than implicit...

> 2. Make vars() return slots as well as dict-based attributes

Agree.

> 3. Document __dict__ as legacy usage, not slots-aware

Agree, though __dict__ should still be a valid way of accessing all non-slot
instance attributes.  Too much legacy code would break if this were not so.

> 4. Fix bugs caused by use of the legacy __dict__ approach

I'd rephrase that as: fix reflection code that assumes attributes only live
in __dict__.

> 5. Educate users in the new approaches which are slots-aware
>    (dir/vars, calling base class setattr, etc)

Calling base class setattr?  I'm not sure what you mean?

> >   2) Flat slot descriptions:  object.__slots__ being an
> >      immutable flat tuple of all slot names (including
> >      inherited ones), as opposed to being a potentially
> >      mutable sequence of only the slots defined by the most
> >      derived class.
>
> This I disagree with. I think __slots__ should be immutable. But I'm happy
> with "don't do that" as the way of implementing immutability, if that's what
> Guido prefers.

Not knowing what Guido is planning, all I can say is that he has made
__bases__ and __mro__ explicitly immutable.  If we now intend to make
__slots__ immutable as well, then it is better to do so explicitly.

> I definitely don't think __slots__ should return something
> different when you read it than what you assigned to it in the first place
> (which is what including inherited slots does). But I don't really think
> people have any business reading __slots__ in any case (see my arguments
> above).

By your logic, people don't have any business reading __dict__, but they do.
Imagine what would happen if we didn't expose __dict__ in Python 2.3?

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From David Abrahams" <david.abrahams@rcn.com  Thu Feb 21 13:06:05 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 21 Feb 2002 08:06:05 -0500
Subject: [Python-Dev] A little GC confusion
Message-ID: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com>

Hi All,

The following extension module (AA) is a reduced example of what I'm doing
to make extension classes in 2.2. I followed the examples given by
typeobject.c. When I
"import AA,pdb" I get a crash in GC. Investigating further, I see this makes
sense: GC is enabled in class_metatype_object, yet class_type_object does
not follow the first rule of objects whose type has GC enabled:

    "The memory for the object must be allocated using PyObject_GC_New()
or PyObject_GC_VarNew()."

So, I guess the question is, how does PyBaseObject_Type (also statically
allocated) get away with it?

TIA,
Dave

----------------

// Copyright David Abrahams 2002. Permission to copy, use,
// modify, sell and distribute this software is granted provided this
// copyright notice appears in all copies. This software is provided
// "as is" without express or implied warranty, and with no claim as
// to its suitability for any purpose.
#include <Python.h>

PyTypeObject class_metatype_object = {
    PyObject_HEAD_INIT(0)
        0,
        "Boost.Python.class",
        PyType_Type.tp_basicsize,
        0,
        0,                                      /* tp_dealloc */
        0,                                      /* tp_print */
        0,                                      /* tp_getattr */
        0,                                      /* tp_setattr */
        0,                                      /* tp_compare */
        0,                                      /* tp_repr */
        0,                                      /* tp_as_number */
        0,                                      /* tp_as_sequence */
        0,                                      /* tp_as_mapping */
        0,                                      /* tp_hash */
        0,                                      /* tp_call */
        0,                                      /* tp_str */
        0,                                      /* tp_getattro */
        0,                                      /* tp_setattro */
        0,                                      /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC
  | Py_TPFLAGS_BASETYPE,  /* tp_flags */
        0,                                      /* tp_doc */
        0,                                      /* tp_traverse */
        0,                                      /* tp_clear */
        0,                                      /* tp_richcompare */
        0,                                      /* tp_weaklistoffset */
        0,                                      /* tp_iter */
        0,                                      /* tp_iternext */
        0,                                      /* tp_methods */
        0,                                      /* tp_members */
        0,                                      /* tp_getset */
        0, // &PyType_Type,                           /* tp_base */
        0,                                      /* tp_dict */
        0,                                      /* tp_descr_get */
        0,                                      /* tp_descr_set */
        0,                                      /* tp_dictoffset */
        0,                                      /* tp_init */
        0,                                      /* tp_alloc */
        0,
        // PyType_GenericNew                       /* tp_new */
};

// Get the metatype object for all extension classes.
PyObject* class_metatype()
{
    if (class_metatype_object.tp_dict == 0)
    {
        class_metatype_object.ob_type = &PyType_Type;
        class_metatype_object.tp_base = &PyType_Type;
        if (PyType_Ready(&class_metatype_object))
            return 0;
    }
    Py_INCREF(&class_metatype_object);
    return (PyObject*)&class_metatype_object;
}

// Each extension instance will be one of these
typedef struct instance
{
    PyObject_HEAD
    void* objects;
} instance;

static void instance_dealloc(PyObject* inst)
{
    instance* kill_me = (instance*)inst;

    inst->ob_type->tp_free(inst);
}

PyTypeObject class_type_object = {
    PyObject_HEAD_INIT(0) file://&class_metatype_object)
        0,
        "Boost.Python.instance",
        sizeof(PyObject),
        0,
        instance_dealloc,                       /* tp_dealloc */
        0,                                      /* tp_print */
        0,                                      /* tp_getattr */
        0,                                      /* tp_setattr */
        0,                                      /* tp_compare */
        0,                                      /* tp_repr */
        0,                                      /* tp_as_number */
        0,                                      /* tp_as_sequence */
        0,                                      /* tp_as_mapping */
        0,                                      /* tp_hash */
        0,                                      /* tp_call */
        0,                                      /* tp_str */
        0,                                      /* tp_getattro */
        0,                                      /* tp_setattro */
        0,                                      /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC
  | Py_TPFLAGS_BASETYPE,  /* tp_flags */
        0,                                      /* tp_doc */
        0,                                      /* tp_traverse */
        0,                                      /* tp_clear */
        0,                                      /* tp_richcompare */
        0,                                      /* tp_weaklistoffset */
        0,                                      /* tp_iter */
        0,                                      /* tp_iternext */
        0,                                      /* tp_methods */
        0,                                      /* tp_members */
        0,                                      /* tp_getset */
    0, file://&PyBaseObject_Type,                     /* tp_base */
        0,                                      /* tp_dict */
        0,                                      /* tp_descr_get */
        0,                                      /* tp_descr_set */
        0,                                      /* tp_dictoffset */
        0,                                      /* tp_init */
        PyType_GenericAlloc,                    /* tp_alloc */
        PyType_GenericNew
};

PyObject* class_type()
{
    if (class_type_object.tp_dict == 0)
    {
        class_type_object.ob_type = (PyTypeObject*)class_metatype();
        class_type_object.tp_base = &PyBaseObject_Type;
        if (PyType_Ready(&class_type_object))
            return 0;
    }
    Py_INCREF(&class_type_object);
    return (PyObject*)&class_type_object;
}

PyObject* make_class()
{
    PyObject* bases, *args, *mt, *result;
    bases = PyTuple_New(1);
    PyTuple_SET_ITEM(bases, 0, class_type());

    args = PyTuple_New(3);
    PyTuple_SET_ITEM(args, 0, PyString_FromString("AA"));
    PyTuple_SET_ITEM(args, 1, bases);
    PyTuple_SET_ITEM(args, 2, PyDict_New());

    mt = class_metatype();
    result = PyObject_CallObject(mt, args);
    Py_XDECREF(mt);
    Py_XDECREF(args);
    return result;
}

static PyMethodDef SpamMethods[] = {
    {NULL,      NULL}        /* Sentinel */
};


DL_EXPORT(void)
initAA()
{
    PyObject *m, *d;

    m = Py_InitModule("AA", SpamMethods);
    d = PyModule_GetDict(m);
    PyDict_SetItemString(d, "AA", make_class());
}


+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+




From Paul.Moore@atosorigin.com  Thu Feb 21 13:19:47 2002
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Thu, 21 Feb 2002 13:19:47 -0000
Subject: [Python-Dev] Meta-reflections
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F6@UKRUX002.rundc.uk.origin-it.com>

From: Kevin Jacobs [mailto:jacobs@penguin.theopalgroup.com]
> On Thu, 21 Feb 2002, Moore, Paul wrote:
> > > I agree, but that "official" support has clear limitations.
> >
> > I'm not sure what you mean.
> 
> When you request dir(object), there is a fairly significant 
> amount of work done.
[...]
> I know that we are talking about Python, and performance is 
> not of paramount importance.

Hmm. I tend to favour "do it right, then do it fast". If there's a
performance hit on dir(), why can't it be made faster? If nothing else, as a
part of the core, dir() has the right to access __dict__ and __slots__. So
there's no a priori reason why dir() should be slower than *any* user-coded
way of doing the same.

Of course, we *really* want vars() here, as we're otherwise doing work in
dir() to get entries that we then throw away. But that's the only issue. Get
vars() to work, and if it's too slow, you can argue that it's a bug because
"I can get the same results by using the following code, which is faster".

> I'm just not looking forward to 25% slowdowns in pickling
> (number pulled out of hat) and I'm sure the Zope guys
> aren't either...

There's bound to be some slowdown, from the (new) need to find slots as well
as dict-based attributes. I'm happy if you want it minimised. But that's a
new point you've raised, which I don't have the expertise to comment on.

> That is not the bug -- if for no other reason, the standard 
> library is free to use implementation specific knowledge.  Getting 
> obj.__dict__ is a really slick and efficient way to reflect
> on all normal instance variables.

I'm not sure I agree here - it's better if the standard library uses
interfaces which are available to the user. And if pickling can be made
fast, why shouldn't the machinery that makes this possible be made available
to the end user?

You could say that this argues in favour of making __dict__ and __slots__
part of the "official" reflection API. My view is that it argues for making
the "official" API (which I'm assuming will be vars() for now) efficient
enough that people don't need to use __dict__ and __slots__. Encapsulation
is good.

> I don't see how filling slots with default values is 
> compatible with the premise that we want slots to act
> as close to normal instance attributes as possible.

Fair enough. I offered that as one option. Clearly you prefer the other
(that what's in dir() and/or __slots__ cannot be guaranteed not to raise
AttributeError). I'm happy either way, not having a vested interest in the
issue.

> > Why not just change the line stuff = object.__dict__ to
> >
> >     stuff = [a for a in dir(object)
> >              if hasattr(object,a) and not callable(getattr(object,a))]
> 
> Um, because it's wrong?

Sorry - it was an off-the-top-of-the-head suggestion. But it made my real
point, which was that you can do it with dir().

> stuff = dict([ (a,getattr(object,a)) for a in vars(object) 
> if hasattr(object,a)])
> 
> Note that it does an unnecessary getattr, hasattr, memory 
> allocation and incurs loop overhead on every dict attribute,
> but otherwise it should work once vars is fixed.

Efficiency again. I'd have to bow to your greater experience here. Although
with pickling, doesn't I/O usually outweigh any performance cost?

> > 3. Document __dict__ as legacy usage, not slots-aware
> 
> Agree, though __dict__ should still be a valid way of 
> accessing all non-slot instance attributes.  Too much
> legacy code would break if this were not so.

That's what I meant. Document it as the historical way of getting at
instance attributes. Still available, but code which uses it will not
support slots. After all, if you pass classes using slots into code which
uses __dict__, things will go wrong. That's just another sort of breakage.
Nobody's arguing that __dict__ should go away. Except possibly from the
documentation :-)

> Calling base class setattr?  I'm not sure what you mean?

It's in the part of the descrintro document I pointed you at. Traditional
implementations of setattr used assignment to self.__dict__['attr'] to avoid
infinite recursion. The "new way" discussed in the descrintro document is to
call the base class setattr.
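
As a sketch of the difference, in current Python 3 syntax (Traced is a
made-up class): the override below stores the value by delegating to
object.__setattr__, so it never touches self.__dict__ and keeps working
when the attributes live in slots.

```python
class Traced(object):
    __slots__ = ('x', 'log')

    def __init__(self):
        # Bypass our own override while setting up the log itself.
        object.__setattr__(self, 'log', [])

    def __setattr__(self, name, value):
        self.log.append(name)
        # The "new way": let the base class do the actual store.
        object.__setattr__(self, name, value)

t = Traced()
t.x = 5
print(t.x, t.log)
```

The traditional self.__dict__['x'] = value spelling would fail here,
because a purely slotted instance has no __dict__.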

> By your logic, people don't have any business reading 
> __dict__, but they do.

They don't have any business *any more*. An important distinction. (And it's
not anywhere near as black and white as that comment implies - I know that).

> Imagine what would happen if we didn't expose __dict__ in Python 2.3?

Nothing at all, if we provide an alternative. Except for backward
compatibility issues, which there's a well-documented deprecation process to
address. Of course, nobody is proposing the removal of __dict__. All I'm
suggesting is that we document its limitations, point out better ways, and
leave it at that.

Paul.


From jacobs@penguin.theopalgroup.com  Thu Feb 21 13:38:42 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 21 Feb 2002 08:38:42 -0500 (EST)
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com>
Message-ID: <Pine.LNX.4.33.0202210830500.16498-100000@penguin.theopalgroup.com>

On Thu, 21 Feb 2002, David Abrahams wrote:
> The following extension module (AA) is a reduced example of what I'm doing
> to make extension classes in 2.2. I followed the examples given by
> typeobject.c. When I
> "import AA,pdb" I get a crash in GC. Investigating further, I see this makes
> sense: GC is enabled in class_metatype_object, yet class_type_object does
> not follow the first rule of objects whose type has GC enabled:
>
>     "The memory for the object must be allocated using PyObject_GC_New()
> or PyObject_GC_VarNew()."
>
> So, I guess the question is, how does PyBaseObject_Type (also statically
> allocated) get away with it?

I don't have any time to really look at your code, but I thought I'd point
out a trick that several extension modules use to protect statically
allocated type objects.  Here is the code from socketmodule.c:

static PyTypeObject PySocketSock_Type = {
.
.
.
        0,      /* set below */                 /* tp_alloc */
        PySocketSock_new,                       /* tp_new */
        0,      /* set below */                 /* tp_free */
};

/* buried in init_socket */
        PySocketSock_Type.tp_alloc = PyType_GenericAlloc;
        PySocketSock_Type.tp_free = _PyObject_Del;

This trick ensures that the static type object is never freed.

Also, there is a funny-looking line in your PyTypeObject:

    0, file://&PyBaseObject_Type,                     /* tp_base */

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From tismer@tismer.com  Thu Feb 21 13:48:07 2002
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 21 Feb 2002 14:48:07 +0100
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
References: <200202200301.QAA22063@s454.cosc.canterbury.ac.nz> <g03wqh22.fsf@SNIPE.maya.com>
Message-ID: <3C74FA97.7080700@tismer.com>

Jeff Senn wrote:

> Greg Ewing <greg@cosc.canterbury.ac.nz> writes:
> 
>>It's not clear exactly what you're after here. Are you
>>trying to define the lowest-level interface upon which
>>everything else will be built? If so, I think what you
>>have presented is FAR too complex.
>>
>>It seems to me you need only two things:
>>
> ...
> 
>>   t = tasklet(f)
>>   t.transfer()
>>
> 
> (Sorry if I missed something -- I've been *way* busy lately and
> haven't been giving this much attention -- that said...)
> 
> But (if I understand the current plan) we will need mechanisms
> internal to the Python interpreter to transfer values and maintain
> blocked/running state anyway; since when you generate a tasklet and
> run it:
> 
>  t = tasklet(f)
>  t.transfer()
> 
> That may cause many more tasklets to be generated, run, and destroyed
> that you don't ever see ...  recursions/function calls in f, and
> only-Christian-knows what else...  so the transfer value mechanism
> might as well be built in.


I think all these little things are cheap to implement.

> I haven't thought enough about the "unnamed produce-and-continue
> function" to decide how exactly it should work.


Somebody named it "resume", and together with "suspend" we get
a nice couple.
On the other hand: I'm not sure whether resume should block
its caller. After all the input I got, I'm very undecided
whether it is in fact better to forget data transfer completely
for now and just make switching primitives which always work.

> I have two concerns in implementing uthreads this way (scheduler in
> C):
> 
>  1 -- there doesn't seem to be any way to "kill" a tasklet


Not yet, but I want to provide an exception to kill tasklets.
Also it will be possible to just pick it off and drop it,
but I'm a little concerned about the C stack inside. This
might be the last resort if the exception doesn't work.


>  2 -- the scheduling algorithm will be hard to tune (we'll probably
>       *at least* need tasklet priority...)  Maybe there should still
>       be a "timeslice" function so an in-Python scheduler can be written?


We had the timeslice function, yes. I am thinking of making things
simpler this time by just periodically calling the scheduler,
which is written in C. I also have a rough concept of
priorities which can be implemented very cheaply.
Maybe I'll implement some default behavior, but allow these
objects to be subclassed from Python?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net/
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
      where do you want to jump today?   http://www.stackless.com/




From mwh@python.net  Thu Feb 21 13:47:56 2002
From: mwh@python.net (Michael Hudson)
Date: 21 Feb 2002 13:47:56 +0000
Subject: [Python-Dev] A little GC confusion
In-Reply-To: Kevin Jacobs's message of "Thu, 21 Feb 2002 08:38:42 -0500 (EST)"
References: <Pine.LNX.4.33.0202210830500.16498-100000@penguin.theopalgroup.com>
Message-ID: <2mr8nfeyk3.fsf@starship.python.net>

Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes:

> I don't have any time to really look at your code, but I thought I'd point
> out a trick that several extension modules use to protect statically
> allocated type objects.  Here is the code from socketmodule.c:
> 
> /* static PyTypeObject PySocketSock_Type = {
> .
> .
> .
>         0,      /* set below */                 /* tp_alloc */
>         PySocketSock_new,                       /* tp_new */
>         0,      /* set below */                 /* tp_free */
> };
> 
> /* buried in init_socket */
>         PySocketSock_Type.tp_alloc = PyType_GenericAlloc;
>         PySocketSock_Type.tp_free = _PyObject_Del;
> 
> This trick ensures that the static type object is never freed.

Um, I think you'll find this is because PyType_GenericAlloc &
_PyObject_Del aren't compile-time constants when _socket is
dynamically linked (they're defined in a different dll).

Cheers,
M.

-- 
  > so python will fork if activestate starts polluting it?
  I find it more relevant to speculate on whether Python would fork
  if the merpeople start invading our cities riding on the backs of 
  giant king crabs.                 -- Brian Quinlan, comp.lang.python


From David Abrahams" <david.abrahams@rcn.com  Thu Feb 21 14:14:48 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 21 Feb 2002 09:14:48 -0500
Subject: [Python-Dev] Re: A little GC confusion
Message-ID: <0ffe01c1bae2$21033c80$0500a8c0@boostconsulting.com>

BTW, I haven't been approved for this list yet, so if you could cc: any
replies to me directly at david.abrahams@rcn.com it would be greatly
appreciated.

Thanks,
Dave



From jacobs@penguin.theopalgroup.com  Thu Feb 21 14:10:54 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 21 Feb 2002 09:10:54 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B1F6@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <Pine.LNX.4.33.0202210838480.16498-100000@penguin.theopalgroup.com>

On Thu, 21 Feb 2002, Moore, Paul wrote:
> Hmm. I tend to favour "do it right, then do it fast". If there's a
> performance hit on dir(), why can't it be made faster?
[snip]
> Of course, we *really* want vars() here, as we're otherwise doing work in
> dir() to get entries that we then throw away.

dir(object) simply doesn't do what we want.  I've tried several times to
write a correct pickler using dir(object) and have always run into problems
due to pathological corner-cases.  I encourage you to try your hand at it.

In the process I've found another issue with the slots implementation.
I'll post the details to python-dev in a separate e-mail.

> > Note that it does an unnecessary getattr, hasattr, memory
> > allocation and incurs loop overhead on every dict attribute,
> > but otherwise it should work once vars is fixed.
>
> Efficiency again. I'd have to bow to your greater experience here. Although
> with pickling, doesn't I/O usually outweigh any performance cost?

I can't speak for everyone's applications, but we frequently pickle to
memory, or to files that live only in the operating system buffer-cache and
never last long enough to hit the disk.

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From thomas.heller@ion-tof.com  Thu Feb 21 14:14:03 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 21 Feb 2002 15:14:03 +0100
Subject: [Python-Dev] A little GC confusion
References: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com>
Message-ID: <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook>

From: "David Abrahams" <david.abrahams@rcn.com>
> Hi All,
> 
> The following extension module (AA) is a reduced example of what I'm doing
> to make extension
> classes in 2.2. I followed the examples given by typeobject.c. When I
> "import AA,pdb" I get a crash in GC.

For me it crashes after
  import AA, gc
  gc.collect()
(Win2k Prof, SP1, MSVC6.0)

I'm not really sure, but it seems your code does not crash any longer
if you remove the Py_TPFLAGS_HAVE_GC from your definition of class_metatype_object.

This flag *will* be set by PyType_Ready(); I guess it is inherited
from the base type (PyType_Type in your case).

Thomas



From fdrake@acm.org  Thu Feb 21 14:22:01 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 21 Feb 2002 09:22:01 -0500
Subject: [Python-Dev] Re: A little GC confusion
In-Reply-To: <0ffe01c1bae2$21033c80$0500a8c0@boostconsulting.com>
References: <0ffe01c1bae2$21033c80$0500a8c0@boostconsulting.com>
Message-ID: <15477.649.42690.538861@grendel.zope.com>

David Abrahams writes:
 > BTW, I haven't been approved for this list yet, so if you could cc: any
 > replies to me directly at david.abrahams@rcn.com it would be greatly
 > appreciated.

You have now been approved.  The usual list admins are either away on
vacation or are having connectivity problems; sorry for the delay.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From thomas.heller@ion-tof.com  Thu Feb 21 14:36:14 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 21 Feb 2002 15:36:14 +0100
Subject: [Python-Dev] A little GC confusion
References: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com> <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook>
Message-ID: <1aea01c1bae5$2088d500$e000a8c0@thomasnotebook>

PS: I have code very similar to yours, and a question:

Why does your class_type_object have PyBaseObject_Type as
tp_base? To implement a subtypable type this is not needed
IMO, or am I missing something?

Thomas



From David Abrahams" <david.abrahams@rcn.com  Thu Feb 21 14:42:34 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 21 Feb 2002 09:42:34 -0500
Subject: [Python-Dev] A little GC confusion
References: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com> <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook>
Message-ID: <102b01c1bae6$5463de00$0500a8c0@boostconsulting.com>

----- Original Message -----
From: "Thomas Heller" <thomas.heller@ion-tof.com>


> From: "David Abrahams" <david.abrahams@rcn.com>
> > Hi All,
> >
> > The following extension module (AA) is a reduced example of what I'm
> > doing to make extension
> > classes in 2.2. I followed the examples given by typeobject.c. When I
> > "import AA,pdb" I get a crash in GC.
>
> For me it crashes after
>   import AA, gc
>   gc.collect()
> (Win2k Prof, SP1, MSVC6.0)
>
> I'm not really sure, but it seems your code does not crash any longer
> if you remove the Py_TPFLAGS_HAVE_GC from your definition of
> class_metatype_object.

Yes, I'm aware of that. What I don't understand is how  the builtin metatype
gets away with Py_TPFLAGS_HAVE_GC when some of its instance types are not
even heap-allocated.

-Dave



From thomas.heller@ion-tof.com  Thu Feb 21 14:52:11 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 21 Feb 2002 15:52:11 +0100
Subject: [Python-Dev] A little GC confusion
References: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com> <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook> <1aea01c1bae5$2088d500$e000a8c0@thomasnotebook> <104301c1bae7$bbb57e50$0500a8c0@boostconsulting.com>
Message-ID: <1b2a01c1bae7$5ae47d60$e000a8c0@thomasnotebook>

From: "David Abrahams" <david.abrahams@rcn.com>
> From: "Thomas Heller" <thomas.heller@ion-tof.com>
> 
> > PS: I have code very similar to yours, and a question:
> >
> > Why does your class_type_object have PyBaseObject_Type as
> > tp_base? To implement a subtypable type this is not needed
> > IMO, or am I missing something?
> 
> I wanted subclasses of my class_type_object to have the same properties as
> new-style classes, so it just made sense to me to do that.
> 
> Python's documentation is... less than complete... so I'm sure I'm missing
> something.

Sure. Unfortunately, PEP253 still talks of PyType_InitDict instead
of PyType_Ready, but we're beyond that already;-)

Thomas



From thomas.heller@ion-tof.com  Thu Feb 21 15:01:16 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 21 Feb 2002 16:01:16 +0100
Subject: [Python-Dev] A little GC confusion
References: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com> <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook> <102b01c1bae6$5463de00$0500a8c0@boostconsulting.com>
Message-ID: <1b6b01c1bae8$9f6af350$e000a8c0@thomasnotebook>

> > I'm not really sure, but it seems your code does not crash any longer
> > if you remove the Py_TPFLAGS_HAVE_GC from your definition of
> > class_metatype_object.
> 
> Yes, I'm aware of that. What I don't understand is how the builtin metatype
> gets away with Py_TPFLAGS_HAVE_GC when some of its instance types are not
> even heap-allocated.
> 
Hm, I don't understand you. Are you talking about Py_TPFLAGS_HEAPTYPE?

Thomas



From David Abrahams" <david.abrahams@rcn.com  Thu Feb 21 15:16:24 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 21 Feb 2002 10:16:24 -0500
Subject: [Python-Dev] A little GC confusion
References: <0f5001c1bad8$885e0590$0500a8c0@boostconsulting.com> <1a5f01c1bae2$0767c160$e000a8c0@thomasnotebook> <102b01c1bae6$5463de00$0500a8c0@boostconsulting.com> <1b6b01c1bae8$9f6af350$e000a8c0@thomasnotebook>
Message-ID: <10f401c1baeb$3f4f5850$0500a8c0@boostconsulting.com>

----- Original Message -----
From: "Thomas Heller" <thomas.heller@ion-tof.com>
To: "David Abrahams" <david.abrahams@rcn.com>; <python-dev@python.org>
Sent: Thursday, February 21, 2002 10:01 AM
Subject: Re: [Python-Dev] A little GC confusion


> > > I'm not really sure, but it seems your code does not crash any longer
> > > if you remove the Py_TPFLAGS_HAVE_GC from your definition of
> > > class_metatype_object.
> >
> > Yes, I'm aware of that. What I don't understand is how the builtin
> > metatype gets away with Py_TPFLAGS_HAVE_GC when some of its instance
> > types are not even heap-allocated.
> >
> Hm, I don't understand you. Are you talking about Py_TPFLAGS_HEAPTYPE?

No, please re-read my initial posting. Py_TPFLAGS_HAVE_GC places
requirements on the allocation method of instances, at least according to
the docs.
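
For what it's worth, in current CPython the metatype appears to get away
with this through its tp_is_gc slot, which reports statically allocated
types as not collectable. A small sketch in Python of how that looks from
the outside, assuming a modern CPython (gc.is_tracked did not exist in 2.2)
and the flag values from object.h; the names Heap and describe are made up
for illustration:

```python
import gc

# Flag values as defined in CPython's object.h (assumed here, not queried).
Py_TPFLAGS_HEAPTYPE = 1 << 9
Py_TPFLAGS_HAVE_GC = 1 << 14


class Heap(object):
    """A class created at run time, so its type object lives on the heap."""


def describe(tp):
    """Report the GC-related facts about a type object."""
    return {
        # Was the type object itself allocated dynamically?
        "heaptype": bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE),
        # Does its metatype claim that instances take part in GC?
        "metatype_have_gc": bool(type(tp).__flags__ & Py_TPFLAGS_HAVE_GC),
        # Is this particular type object actually tracked by the collector?
        "tracked": gc.is_tracked(tp),
    }


static_info = describe(int)   # int is a statically allocated type
heap_info = describe(Heap)
print(static_info)
print(heap_info)
```

Both types have a metatype with Py_TPFLAGS_HAVE_GC set, but only the heap
type is tracked by the collector.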




From gmcm@hypernet.com  Thu Feb 21 15:45:00 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 21 Feb 2002 10:45:00 -0500
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
In-Reply-To: <200202210208.PAA22202@s454.cosc.canterbury.ac.nz>
References: <3C740E82.8148.204A91B4@localhost>
Message-ID: <3C74CFAC.24406.233D20CD@localhost>

On 21 Feb 2002 at 15:08, Greg Ewing wrote:

> Gordon McMillan <gmcm@hypernet.com>:
> 
> > Unless you've got a way to detect or pass tasklet's
> > through transfer, you don't have enough.
> 
> You'll have to elaborate. I don't have any idea what
> you mean by that!

You need a way to refer to "this" tasklet from 
Python, and pass that to the "other" tasklet. Alternatively, you need "the tasklet that
transferred to me". This is implicit in generators;
it needs to be explicit to do coroutines. You 
can't write a scheduler in Python without
it - you need the client tasklets to transfer
to the scheduler tasklet.
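
A rough sketch of the generator case, where "the tasklet that transferred
to me" is implicit: each yield hands control back to whoever resumed the
generator, which here is a round-robin scheduler written in Python. This
uses plain generators, not the tasklet API under discussion, and the names
worker and run_round_robin are invented for the example:

```python
from collections import deque


def worker(name, steps, log):
    """A client task: do one unit of work, then yield to the scheduler."""
    for i in range(steps):
        log.append((name, i))
        yield  # control returns to whoever called next() on us


def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # "transfer" to the task
        except StopIteration:
            continue            # task finished; do not reschedule it
        ready.append(task)      # task yielded; put it at the back


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

With true coroutines the task would have to name its target explicitly,
which is why some way of referring to the current or the calling tasklet
is needed.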

-- Gordon
http://www.mcmillan-inc.com/



From pedroni@inf.ethz.ch  Thu Feb 21 16:34:39 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Thu, 21 Feb 2002 17:34:39 +0100
Subject: [Python-Dev] Meta-reflections
References: <Pine.LNX.4.33.0202210838480.16498-100000@penguin.theopalgroup.com>
Message-ID: <00bb01c1baf5$a8e4b800$6d94fea9@newmexico>

[Kevin Jacobs]
>
> In the process I've found another issue with the slots implementation.
> I'll post the details to python-dev in a separate e-mail.
>

FYI, bugs reported only on python-dev have a high probability
of getting lost in the vacuum (Tim often warns against that).

Now a seeming bug is a seeming bug, so I have reported
your bug to SF:

http://sourceforge.net/tracker/index.php?func=detail&aid=520644&group_id=5470&atid=105470

In general don't expect that someone will post bugs on your behalf.

regards, Samuele Pedroni.




From jacobs@penguin.theopalgroup.com  Thu Feb 21 17:30:56 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 21 Feb 2002 12:30:56 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <00bb01c1baf5$a8e4b800$6d94fea9@newmexico>
Message-ID: <Pine.LNX.4.33.0202211208180.18978-100000@penguin.theopalgroup.com>

On Thu, 21 Feb 2002, Samuele Pedroni wrote:
> [Kevin Jacobs]
> >
> > In the process I've found another issue with the slots implementation.
> > I'll post the details to python-dev in a separate e-mail.
> >
>
> FYI, bugs reported only on python-dev have a high probability
> of getting lost in the vacuum (Tim often warns against that).
>
> Now a seeming bug is a seeming bug, so I have reported
> your bug to SF:
>
> http://sourceforge.net/tracker/index.php?func=detail&aid=520644&group_id=5470&atid=105470
>
> In general don't expect that someone will post bugs on your behalf.

Thanks.  I have a collection of about 8 more bugs that is expanding as I
grow my test suite.  Before I spray all of them onto SF, I want to hear from
Guido, since some of my "bugs" are potentially subjective.

I _have_ tried three times to post a summary-bug to SF and it's not worked
(as usual).  Is it just me or is SF flaky as hell?  The last time I tried to
post a bug, it kicked me out and was "Down for maintenance" for some time
after that.  Now it won't let me login since it thinks I haven't responded
to the new account confirmation e-mail.  Grrrrrrrrrr

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From tim.one@comcast.net  Thu Feb 21 21:06:47 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 21 Feb 2002 16:06:47 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <Pine.LNX.4.33.0202211208180.18978-100000@penguin.theopalgroup.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFGNOAA.tim.one@comcast.net>

[Kevin Jacobs]
> ...
> I have a collection of about 8 more bugs that is expanding as I
> grow my test suite.  Before I spray all of them onto SF, I want
> to hear from Guido, since some of my "bugs" are potentially subjective.

The best way to hear from Guido is to post bugs, and suspected bugs, to
SourceForge, one bug per report.  There's so much verbiage about this now on
Python-Dev that I doubt he'll ever be able to make time to catch up with it
when he returns.  A great advantage of a good bug report is that it's
focused and brief.

Slots were definitely intended as a memory optimization, and the ways in
which they don't act like "regular old attributes" are at best warts.

> I _have_ tried three times to post a summary-bug to SF and it's not worked
> (as usual).  Is it just me or is SF flaky as hell?  The last time I tried to
> post a bug, it kicked me out and was "Down for maintenance" for some time
> after that.  Now it won't let me login since it thinks I haven't
> responded to the new account confirmation e-mail.  Grrrrrrrrrr

It *sounds* like you're getting started with SF.  Once it agrees not to hate
you <wink>, life gets a lot easier.  It's not flaky in general, but it does
suffer bouts of extreme flakiness from time to time.



From David Abrahams" <david.abrahams@rcn.com  Thu Feb 21 21:23:36 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 21 Feb 2002 16:23:36 -0500
Subject: [Python-Dev] Meta-reflections
References: <LNBBLJKPBEHFEDALKOLCCEFGNOAA.tim.one@comcast.net>
Message-ID: <12d501c1bb1e$8778d2e0$0500a8c0@boostconsulting.com>

FWIW, some of my Boost colleagues have been watching SF's future prospects
with some suspicion. The financial outlook is worrisome; I submitted a
support request in April 2001 that still hasn't been addressed
(http://sourceforge.net/tracker/?func=detail&aid=414066&group_id=1&atid=350001).
We're establishing all new services elsewhere, and even moving some old
ones. For the long-term health of Python, you might want to make sure you're
prepared to move quickly if necessary.

-Dave
----- Original Message -----
From: "Tim Peters" <tim.one@comcast.net>
To: "'Python Dev'" <python-dev@python.org>
Sent: Thursday, February 21, 2002 4:06 PM
Subject: RE: [Python-Dev] Meta-reflections


> [Kevin Jacobs]
> > ...
> > I have a collection of about 8 more bugs that is expanding as I
> > grow my test suite.  Before I spray all of them onto SF, I want
> > to hear from Guido, since some of my "bugs" are potentially subjective.
>
> The best way to hear from Guido is to post bugs, and suspected bugs, to
> SourceForge, one bug per report.  There's so much verbiage about this now
> on Python-Dev that I doubt he'll ever be able to make time to catch up
> with it when he returns.  A great advantage of a good bug report is that
> it's focused and brief.
>
> Slots were definitely intended as a memory optimization, and the ways in
> which they don't act like "regular old attributes" are at best warts.
>
> > I _have_ tried three times to post a summary-bug to SF and it's not
> > worked (as usual).  Is it just me or is SF flaky as hell?  The last time
> > I tried to post a bug, it kicked me out and was "Down for maintenance"
> > for some time after that.  Now it won't let me login since it thinks I
> > haven't responded to the new account confirmation e-mail.  Grrrrrrrrrr
>
> It *sounds* like you're getting started with SF.  Once it agrees not to
> hate you <wink>, life gets a lot easier.  It's not flaky in general, but
> it does suffer bouts of extreme flakiness from time to time.



From pedroni@inf.ethz.ch  Thu Feb 21 21:13:57 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Thu, 21 Feb 2002 22:13:57 +0100
Subject: [Python-Dev] Meta-reflections
References: <LNBBLJKPBEHFEDALKOLCCEFGNOAA.tim.one@comcast.net>
Message-ID: <01ed01c1bb1c$acd05f60$6d94fea9@newmexico>

> [Kevin Jacobs]
> > ...
> > I have a collection of about 8 more bugs that is expanding as I
> > grow my test suite.  Before I spray all of them onto SF, I want
> > to hear from Guido, since some of my "bugs" are potentially subjective.
> 
> The best way to hear from Guido is to post bugs, and suspected bugs, to
> SourceForge, one bug per report.  There's so much verbiage about this now on
> Python-Dev that I doubt he'll ever be able to make time to catch up with it
> when he returns.  A great advantage of a good bug report is that it's
> focused and brief.

It's very true.
 
> Slots were definitely intended as a memory optimization, and the ways in
> which they don't act like "regular old attributes" are at best warts.
> 

I see, but it seems that the only way to coherently and transparently
remove the warts implies that the __dict__ of a new-style class 
instance with slots should be tied to the instance and can no
longer be a vanilla dict. Something only Guido can rule on.

some-more-verbiage-ly y'rs - Samuele.



From tim.one@comcast.net  Thu Feb 21 22:41:18 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 21 Feb 2002 17:41:18 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <12d501c1bb1e$8778d2e0$0500a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFMNOAA.tim.one@comcast.net>

[David Abrahams]
> FWIW, some of my Boost colleagues have been watching SF's future
> prospects with some suspicion.

It's worth a lot, and we do too -- at least in fits, when somebody remembers
it's something that's going to kill us someday.

> The financial outlook is worrisome; I submitted a
> support request in April 2001 that still hasn't been addressed
> (http://sourceforge.net/tracker/?func=detail&aid=414066&group_id=1&atid=350001).

Well, that's really a feature request, and *nobody* responds well to witty
oblique references to the Odyssey except me <wink>.

> We're establishing all new services elsewhere, and even moving some old
> ones. For the long-term health of Python, you might want to make
> sure you're prepared to move quickly if necessary.

We supposedly have a cron job set up to suck down Python's CVS tarball every
night (the people who would know if this is currently working are out this
week).

What I don't think we ever figured out how to do was capture the info in the
trackers (bugs, patches, feature requests).  That would be a major loss, as
well as a chance to forget about 500 people who can't figure out how to use
threads on HP-UX, so let's call it a wash <wink>.



From tim.one@comcast.net  Thu Feb 21 22:51:13 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 21 Feb 2002 17:51:13 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <01ed01c1bb1c$acd05f60$6d94fea9@newmexico>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEFNNOAA.tim.one@comcast.net>

[Tim]
> Slots were definitely intended as a memory optimization, and the ways in
> which they don't act like "regular old attributes" are at best warts.

[Samuele Pedroni]
> I see, but it seems that the only way to coherently and transparently
> remove the warts implies that the __dict__ of a new-style class
> instance with slots should be tied to the instance and can no
> longer be a vanilla dict. Something only Guido can rule on.

He'll be happy to <wink>.  Optimizations aren't always wart-free, and then
living with warts is a price paid for benefiting from the optimization.  I'm
sure Guido would consider it "a bug" if slots are ignored by the pickling
mechanism, but wouldn't for an instant consider it "a bug" that the set of
slots in effect when a class is created can't be dynamically expanded later
(this latter is more a sensible restriction than a wart, IMO -- and likely
in Guido's too).
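
A minimal sketch of both points, assuming a present-day Python rather than
2.2: a slotted class can opt in to state saving by spelling out
__getstate__/__setstate__, while the set of slots stays fixed once the class
exists. The class Point is hypothetical, and copy.deepcopy stands in for a
pickle round trip here because it goes through the same two hooks:

```python
import copy


class Point(object):
    """A slotted class: no per-instance __dict__, only 'x' and 'y'."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __getstate__(self):
        # Collect the values of whichever slots are currently filled.
        return {name: getattr(self, name)
                for name in self.__slots__ if hasattr(self, name)}

    def __setstate__(self, state):
        # Restore slot values through normal attribute assignment.
        for name, value in state.items():
            setattr(self, name, value)


original = Point(1, 2)
clone = copy.deepcopy(original)   # calls __getstate__, then __setstate__

# The slot set cannot be widened after the class has been created.
try:
    original.z = 3
    expanded = True
except AttributeError:
    expanded = False

print(clone.x, clone.y, expanded)
```

pickle.dumps and pickle.loads would call the same two methods, provided the
class can be imported by name from the unpickling side.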



From guido@python.org  Fri Feb 22 00:28:19 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 21 Feb 2002 19:28:19 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: Your message of "Thu, 21 Feb 2002 17:41:18 EST."
 <LNBBLJKPBEHFEDALKOLCAEFMNOAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCAEFMNOAA.tim.one@comcast.net>
Message-ID: <200202220028.g1M0SJ024961@pcp742651pcs.reston01.va.comcast.net>

> What I don't think we ever figured out how to do was capture the
> info in the trackers (bugs, patches, feature requests).  That would
> be a major loss, as well as a chance to forget about 500 people who
> can't figure out how to use threads on HP-UX, so let's call it a
> wash <wink>.

From a recent SF mailing to project administrators:

  DATA EXPORT
  ---------------------------
  We have added a new tool for project administrators to backup their
  Project data.  It is now possible to export data from the Trackers (bug
  tracker, support tracker, etc), mailing lists,  and forum data in to a
  single XML text file.  This can be done at any time.

  This is actually not a new feature.   The ability to export data was
  available through March of 2001 until we did a major upgrade of the
  site, which broke the export scripts.  We have now re-worked the code,
  and it's available again.   Enjoy.   http://sourceforge.net/export

SOMEBODY with admin perms should set up a cron job to suck down the
nightly XML.  It's big!  (Are we still sucking down the nightly cvs
tarballs?  We should!)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From pedroni@inf.ethz.ch  Fri Feb 22 00:38:27 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Fri, 22 Feb 2002 01:38:27 +0100
Subject: [Python-Dev] Meta-reflections
References: <LNBBLJKPBEHFEDALKOLCOEFNNOAA.tim.one@comcast.net>
Message-ID: <043801c1bb39$3ed8a540$6d94fea9@newmexico>

From: Tim Peters <tim.one@comcast.net>
> [Tim]
> > Slots were definitely intended as a memory optimization, and the ways in
> > which they don't act like "regular old attributes" are at best warts.
>
> [Samuele Pedroni]
> > I see, but it seems that the only way to coherently and transparently
> > remove the warts implies that the __dict__ of a new-style class
> > instance with slots should be tied to the instance and can no
> > longer be a vanilla dict. Something only Guido can rule on.
>
> He'll be happy to <wink>.  Optimizations aren't always wart-free, and then
> living with warts is a price paid for benefiting from the optimization.  I'm
> sure Guido would consider it "a bug" if slots are ignored by the pickling
> mechanism, but wouldn't for an instant consider it "a bug" that the set of
> slots in effect when a class is created can't be dynamically expanded later
> (this latter is more a sensible restriction than a wart, IMO -- and likely
> in Guido's too).
>

I was thinking along the line of the C equiv of this:
[Yup the situation of a subclass of a class with slots
is more relevant]

class C(object):
  __slots__ = ['_a']


class D(C): pass


def allslots(cls):
  mro = list(cls.__mro__)
  mro.reverse()
  allslots = {}
  for c in mro:
    cdict = c.__dict__
    if '__slots__' in cdict:
      for slot in cdict['__slots__']:
        allslots[slot] = cdict[slot]
  return allslots

class slotdict(dict):
   __slots__ = ['_inst','_allslots']
   def __init__(self,inst,allslots):
     self._inst = inst
     self._allslots = allslots

   def __getitem__(self,k):
     if k in self._allslots:
        # self._allslots should be reachable as
        # self._inst.__class__.__allslots__
        # AttributeError should become a KeyError ?
        return self._allslots[k].__get__(self._inst)
     else:
        return dict.__getitem__(self,k)

   def __setitem__(self,k,v):
     if k in self._allslots:
        # self._allslots should be reachable as
        # self._inst.__class__.__allslots__
        # AttributeError should become a KeyError ?
        return self._allslots[k].__set__(self._inst,v)
     else:
        return dict.__setitem__(self,k,v)

   # other methods accordingly

d=D()
d.__dict__ = slotdict(d,allslots(D)) # should be so automagically

# allslots(D) should be probably accessible as d.__class__.__allslots__
# for transparency C.__dict__ should not contain any slot descr

#  __allslots__ should be readonly and disallow rebinding
# d.__dict__ should disallow rebinding

# c =C() ; c.__dict__ should return a proxy dict lazily or even more so ...

Lots of things to rule about and trade-offs to consider.
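
To make the wart concrete, here is a small self-contained sketch, assuming a
present-day Python: slot values never show up in the instance __dict__, so a
complete view of the state has to merge __dict__ with the slot descriptors
found along the MRO. The helper all_slots is a hypothetical variant of the
idea above; it ignores name mangling and the special '__dict__' and
'__weakref__' slot names.

```python
def all_slots(cls):
    """Map slot name -> slot descriptor, walking the MRO base-first."""
    found = {}
    for klass in reversed(cls.__mro__):
        names = klass.__dict__.get("__slots__", ())
        if isinstance(names, str):   # __slots__ = 'a' means one slot
            names = (names,)
        for name in names:
            found[name] = klass.__dict__[name]
    return found


class C(object):
    __slots__ = ["_a"]


class D(C):          # no __slots__ here, so instances also get a __dict__
    pass


d = D()
d._a = 1             # stored in the slot inherited from C
d.b = 2              # stored in d.__dict__

merged = dict(d.__dict__)
for name, descr in all_slots(D).items():
    try:
        merged[name] = descr.__get__(d, D)
    except AttributeError:
        pass         # slot exists but has not been assigned yet

print(d.__dict__, merged)
```

The plain __dict__ only knows about 'b', while the merged view also carries
'_a'.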

the-more-it's-arbitrary-the-more-you-need-_one_-ruler-ly y'rs - Samuele.



From tim.one@comcast.net  Fri Feb 22 01:46:35 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 21 Feb 2002 20:46:35 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <200202220028.g1M0SJ024961@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGNNOAA.tim.one@comcast.net>

[Guido]
> From a recent SF mailing to project administrators:
>
>   DATA EXPORT
>   ---------------------------

Jeremy (and less so I) played with that in the past (before it was
publicized), but hit a brick wall:  there seemed to be a cap on how many
records it would deliver, and we couldn't brute-force our way around it.
Maybe it's better now.

> ...
> SOMEBODY with admin perms should set up a cron job to suck down the
> nightly XML.  It's big!  (Are we still sucking down the nightly cvs
> tarballs?  We should!)

IIRC, Barry was doing that on a home machine, and if so he's not around this
week to answer.



From greg@cosc.canterbury.ac.nz  Fri Feb 22 02:41:29 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 22 Feb 2002 15:41:29 +1300 (NZDT)
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <Pine.LNX.4.33.0202210830500.16498-100000@penguin.theopalgroup.com>
Message-ID: <200202220241.PAA22324@s454.cosc.canterbury.ac.nz>

Kevin Jacobs <jacobs@penguin.theopalgroup.com>:

> I don't have any time to really look at your code, but I thought I'd point
> out a trick that several extension modules use to protect statically
> allocated type objects.

>         0,      /* set below */                 /* tp_alloc */
>         PySocketSock_new,                       /* tp_new */
>         0,      /* set below */                 /* tp_free */

I don't think that has anything to do with protecting the type
object.

As I understand it, static type objects are protected by
having their refcount statically initialised to 1, so that
it will never drop to zero.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From Anthony Baxter <anthony@ekit-inc.com>  Fri Feb 22 03:04:57 2002
From: Anthony Baxter <anthony@ekit-inc.com> (Anthony Baxter)
Date: Fri, 22 Feb 2002 14:04:57 +1100
Subject: [Python-Dev] Meta-reflections
In-Reply-To: Message from Tim Peters <tim.one@comcast.net>
 of "Thu, 21 Feb 2002 17:41:18 CDT." <LNBBLJKPBEHFEDALKOLCAEFMNOAA.tim.one@comcast.net>
Message-ID: <200202220304.g1M34vn21406@burswood.off.ekorp.com>

>>> Tim Peters wrote
> What I don't think we ever figured out how to do was capture the info in the
> trackers (bugs, patches, feature requests).  That would be a major loss, as
> well as a chance to forget about 500 people who can't figure out how to use
> threads on HP-UX, so let's call it a wash <wink>.

I still think adding a 'Resolution' of "HP/UX" would be a good way to
clean up the trackers...

Anthony.

-- 
Anthony Baxter     <anthony@interlink.com.au>   
It's never too late to have a happy childhood.



From fdrake@acm.org  Fri Feb 22 03:17:21 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 21 Feb 2002 22:17:21 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <200202220028.g1M0SJ024961@pcp742651pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCAEFMNOAA.tim.one@comcast.net>
 <200202220028.g1M0SJ024961@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15477.47169.910859.986885@grendel.zope.com>

Guido van Rossum writes:
 > SOMEBODY with admin perms should set up a cron job to suck down the
 > nightly XML.  It's big!  (Are we still sucking down the nightly cvs
 > tarballs?  We should!)

It's failing for me now; I'll submit a support request.

I think the tarballs are being downloaded to the python.org machine;
I'm not sure if they're still landing on Barry's home machine.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From greg@cosc.canterbury.ac.nz  Fri Feb 22 03:21:09 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 22 Feb 2002 16:21:09 +1300 (NZDT)
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
In-Reply-To: <3C74CFAC.24406.233D20CD@localhost>
Message-ID: <200202220321.QAA22332@s454.cosc.canterbury.ac.nz>

Gordon McMillan <gmcm@hypernet.com>:

> You need a way to refer to "this" tasklet from Python

Yes, that occurred to me as well. Would a built-in function
called current_tasklet() provide what you want?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From fdrake@acm.org  Fri Feb 22 03:28:14 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 21 Feb 2002 22:28:14 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <15477.47169.910859.986885@grendel.zope.com>
References: <LNBBLJKPBEHFEDALKOLCAEFMNOAA.tim.one@comcast.net>
 <200202220028.g1M0SJ024961@pcp742651pcs.reston01.va.comcast.net>
 <15477.47169.910859.986885@grendel.zope.com>
Message-ID: <15477.47822.794530.781796@grendel.zope.com>

I wrote:
 > It's failing for me now; I'll submit a support request.

http://sourceforge.net/tracker/index.php?func=detail&aid=521302&group_id=1&atid=200001


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From greg@cosc.canterbury.ac.nz  Fri Feb 22 03:54:53 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 22 Feb 2002 16:54:53 +1300 (NZDT)
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
In-Reply-To: <3C74FA97.7080700@tismer.com>
Message-ID: <200202220354.QAA22337@s454.cosc.canterbury.ac.nz>

Christian Tismer <tismer@tismer.com>:

> Now I see it: You mean I can make this schedule function behave
> like a normal function call, that accepts and drops a dummy
> value?

Yes, that's right. (Or more precisely, it would take
no parameters and return None.)

> when I have a scheduler counter built into the Python
> interpreter loop

I can see the attraction of having pre-emption built in this
way -- it would indeed be extremely efficient.

But I think you need to make a decision about whether your
tasklet model is going to be fundamentally pre-emptive or
fundamentally non-pre-emptive, because, as I said before,
the notion of switching to a specific tasklet is incompatible
with pre-emptive scheduling.

If you want to go with a fundamentally pre-emptive model,
I would suggest the following primitives:

   t = tasklet(f)
      Creates a new tasklet executing f. The new tasklet
      is initially blocked.

   t.block()
      Removes tasklet t from the set of runnable tasklets.

   t.unblock()
      Adds tasklet t to the set of runnable tasklets.

   current_tasklet()
      A built-in function which returns the currently
      running tasklet.

Using this model, a coroutine switch would be implemented
using something like

   def transfer(t):
      "Transfer from the currently running tasklet to t."
      t.unblock()
      current_tasklet().block()

although some locking may be needed in there somewhere.
Have to think about that some more.
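
A minimal sketch of where the locking worry bites, assuming tasklets
are emulated with OS threads (an assumption made only so the example
runs on stock Python; the Tasklet class, the f=None "adopt the calling
thread" convention and the _wakeups attribute are hypothetical). If
unblock() merely set a flag, a wake-up arriving between the two calls
in transfer() could be lost. Counting wake-ups with a semaphore is one
way to avoid that:

```python
import threading

_current = threading.local()   # remembers which tasklet owns each thread


class Tasklet:
    """Hypothetical tasklet emulated by an OS thread (illustration only)."""

    def __init__(self, f=None):
        # Counts pending wake-ups; zero means "initially blocked".
        self._wakeups = threading.Semaphore(0)
        if f is None:
            # Assumption: f=None adopts the calling thread as a tasklet.
            _current.tasklet = self
        else:
            threading.Thread(target=self._run, args=(f,), daemon=True).start()

    def _run(self, f):
        _current.tasklet = self
        self.block()            # wait for the first unblock()
        f()

    def unblock(self):
        # A wake-up sent before the matching block() is remembered, not lost.
        self._wakeups.release()

    def block(self):
        # In this emulation a tasklet can only block itself.
        self._wakeups.acquire()


def current_tasklet():
    return _current.tasklet


def transfer(t):
    "Transfer from the currently running tasklet to t."
    t.unblock()
    current_tasklet().block()


log = []
main = Tasklet()                # adopt the main thread


def worker():
    for i in range(3):
        log.append(("worker", i))
        transfer(main)


w = Tasklet(worker)
for i in range(3):
    log.append(("main", i))
    transfer(w)

print(log)
# [('main', 0), ('worker', 0), ('main', 1), ('worker', 1), ('main', 2), ('worker', 2)]
```

Unlike the primitives proposed above, this emulation cannot block a
tasklet other than the current one, and the worker stays blocked
forever once the ping-pong ends.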

For sending values from one tasklet to another, I think
I'd use an intermediate object to mediate the transfer,
something like a channel in Occam:

   c = channel()

   # tasklet 1 does:
   c.send(value)

   # tasklet 2 does:
   value = c.receive()

Tasklet 1 blocks at the send() until tasklet 2 reaches
the receive(), or vice versa if tasklet 2 reaches the
receive() first. When they're both ready, the value is
transferred and both tasklets are unblocked.

The advantage of this is that it's more symmetrical.
Instead of one tasklet having to know about the
other, they don't know about each other but they
both know about the intermediate object.
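
The rendezvous behaviour can be sketched as follows, again assuming
tasklets are stood in for by OS threads. The Channel class and its
attribute names are hypothetical, chosen to match the send()/receive()
spelling above; this is not an existing Stackless API.

```python
import threading


class Channel:
    """Hypothetical unbuffered channel: send() and receive() rendezvous."""

    def __init__(self):
        self._send_lock = threading.Lock()            # one sender at a time
        self._item_ready = threading.Semaphore(0)     # sender -> receiver
        self._item_taken = threading.Semaphore(0)     # receiver -> sender
        self._slot = None

    def send(self, value):
        with self._send_lock:
            self._slot = value
            self._item_ready.release()   # wake one receiver, if any is waiting
            self._item_taken.acquire()   # block until a receiver has the value

    def receive(self):
        self._item_ready.acquire()       # block until a sender has deposited
        value = self._slot
        self._slot = None
        self._item_taken.release()       # let the sender continue
        return value


c = Channel()
sent = []


def producer():
    for i in range(3):
        c.send(i)           # returns only after the value was received
        sent.append(i)


t = threading.Thread(target=producer)
t.start()
received = [c.receive() for _ in range(3)]
t.join()
print(received)   # [0, 1, 2]
```

Neither side names the other; both hold only a reference to c, which
is the symmetry described above.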

> I want to provide an exception to kill tasklets.
> Also it will be possible to just pick it off and drop it,
> but I'm a little concerned about the C stack inside.

As I said before, if there are no references left to a
tasklet, there's no way it can ever be switched to again,
so its C stack is no longer relevant. Unless you can have
return addresses from one C stack pointing into another,
or something... can you?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@comcast.net  Fri Feb 22 05:01:27 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 22 Feb 2002 00:01:27 -0500
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <15477.47169.910859.986885@grendel.zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHENOAA.tim.one@comcast.net>

[Guido]
> SOMEBODY with admin perms should set up a cron job to suck down the
> nightly XML.

[Fred]
> It's failing for me now; I'll submit a support request.

It doesn't crap out for me, but this is the entire file I get back:

"""
<project_export>

<artifacts>
"""

Yes, I was logged in as an admin at the time.  Else I get this:

"""
<project_export>
You are not an admin of this project.  Permission denied.
"""

BTW, from the verbal description of what's supposed to happen, it sounds
like it may not include attachments (like patches).



From adam@isdn.net.il  Fri Feb 22 07:23:12 2002
From: adam@isdn.net.il (adam)
Date: Fri, 22 Feb 2002 09:23:12 +0200
Subject: [Python-Dev] warning before a legal claim
Message-ID: <001301c1bb71$c9832c00$0101c80a@LocalHost>

warning before a legal claim

Remove us from your announcements list !!



From adam@isdn.net.il  Fri Feb 22 07:23:54 2002
From: adam@isdn.net.il (adam)
Date: Fri, 22 Feb 2002 09:23:54 +0200
Subject: [Python-Dev] warning before a legal claim
References: <E16e6oT-00010r-00@mail.python.org>
Message-ID: <001701c1bb71$e30c0980$0101c80a@LocalHost>

warning before a legal claim

Remove us from your announcements list !!





From adam@isdn.net.il  Fri Feb 22 07:25:31 2002
From: adam@isdn.net.il (adam)
Date: Fri, 22 Feb 2002 09:25:31 +0200
Subject: [Python-Dev] warning before a legal claim
References: <E16dvRD-0006lh-00@mail.python.org>
Message-ID: <002301c1bb72$1c8923a0$0101c80a@LocalHost>

warning before a legal claim

Remove us from your announcements list !! 




From adam@isdn.net.il  Fri Feb 22 07:29:23 2002
From: adam@isdn.net.il (adam)
Date: Fri, 22 Feb 2002 09:29:23 +0200
Subject: [Python-Dev] warning before a legal claim
References: <E16dwbl-0003mg-00@mail.python.org>
Message-ID: <004d01c1bb72$a675fa20$0101c80a@LocalHost>

warning before a legal claim
 
Remove us from your announcements list 



From tismer@tismer.com  Fri Feb 22 08:15:19 2002
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 22 Feb 2002 09:15:19 +0100
Subject: [Stackless] Re: [Python-Dev] Stackless Design Q.
References: <200202220354.QAA22337@s454.cosc.canterbury.ac.nz>
Message-ID: <3C75FE17.1040807@tismer.com>

Greg Ewing wrote:

> Christian Tismer <tismer@tismer.com>:


...

> I can see the attraction of having pre-emption built in this
> way -- it would indeed be extremely efficient.
> 
> But I think you need to make a decision about whether your
> tasklet model is going to be fundamentally pre-emptive or
> fundamentally non-pre-emptive, because, as I said before,
> the notion of switching to a specific tasklet is incompatible
> with pre-emptive scheduling.


Yes. I will go a bit further and adapt the process model
of the Alef language a bit.

> If you want to go with a fundamentally pre-emptive model,
> I would suggest the following primitives:


[blocking stuff  - ok]


> Using this model, a coroutine switch would be implemented
> using something like
> 
>    def transfer(t):
>       "Transfer from the currently running tasklet to t."
>       t.unblock()
>       current_tasklet().block()
> 
> although some locking may be needed in there somewhere.
> Have to think about that some more.
> 
> For sending values from one tasklet to another, I think
> I'd use an intermediate object to mediate the transfer,
> something like a channel in Occam:
> 
>    c = channel()
> 
>    # tasklet 1 does:
>    c.send(value)
> 
>    # tasklet 2 does:
>    value = c.receive()
> 
> Tasklet 1 blocks at the send() until tasklet 2 reaches
> the receive(), or vice versa if tasklet 2 reaches the
> receive() first. When they're both ready, the value is
> transferred and both tasklets are unblocked.
> 
> The advantage of this is that it's more symmetrical.
> Instead of one tasklet having to know about the
> other, they don't know about each other but they
> both know about the intermediate object.


Yes. This all sounds very familiar to me. In private
conversation with Russ Cox, Bell Labs, I learned
about rendezvous techniques which are quite similar.
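
To make the rendezvous idea concrete, here is a minimal sketch in
present-day Python 3, built on ordinary OS threads rather than
tasklets. It is not Stackless code and not the Alef implementation;
the names Channel and demo() and the private fields are assumptions
invented for this illustration.

```python
import threading


class Channel:
    """Unbuffered rendezvous channel (hypothetical sketch on OS threads)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._send_lock = threading.Lock()  # one sender offers a value at a time
        self._has_value = False
        self._value = None

    def send(self, value):
        """Block until a receiver has taken `value`."""
        with self._send_lock:
            with self._cond:
                self._value = value
                self._has_value = True
                self._cond.notify_all()
                # Stay blocked until a receiver picks the value up.
                while self._has_value:
                    self._cond.wait()

    def receive(self):
        """Block until a sender offers a value, then return it."""
        with self._cond:
            while not self._has_value:
                self._cond.wait()
            value = self._value
            self._value = None
            self._has_value = False
            self._cond.notify_all()  # wake the blocked sender
            return value


def demo():
    """Pass three values from the main thread to a consumer thread."""
    c = Channel()
    received = []

    def consumer():
        for _ in range(3):
            received.append(c.receive())

    t = threading.Thread(target=consumer)
    t.start()
    for v in (1, 2, 3):
        c.send(v)
    t.join()
    return received
```

Neither side needs a reference to the other; both only know the
channel object, which is the symmetry described above.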

Having read the Alef user guide which can be found at
http://plan9.bell-labs.com/who/rsc/thread.html
http://plan9.bell-labs.com/who/rsc/ug.pdf
I got the following picture:

(Thanks to Russ Cox, these are his ideas!)
We use a two-level structure. The top level is something
similar to threads, called processes in the Alef language.
These are pre-emptively scheduled by an internal
scheduler that switches after every n opcodes.
Each of these threads is a group of tasklets, which are
scheduled cooperatively among themselves.

This gives us a lot of flexibility: If people prefer
thread-like behavior, they can use the system
provided approach and just use the toplevel layer
with just one tasklet in it.
Creating new tasklets inside a process then has
coroutine-like behavior.
I'm just busy designing the necessary structures,
things should not get too complicated on the C level.
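
A rough model of the two-level structure, written with Python 3
generators. This is only a sketch under loud assumptions: the names
tasklet, Process, schedule and demo are made up here, a "step" stands
in for the n-opcode time slice, and pre-emption is simulated by a step
budget rather than by interrupting running code.

```python
from collections import deque


def tasklet(name, steps, log):
    """A cooperative tasklet: does one unit of work, then yields."""
    for i in range(steps):
        log.append((name, i))
        yield  # explicit, cooperative switch point


class Process:
    """Top-level unit: a group of tasklets that switch cooperatively."""

    def __init__(self, tasklets):
        self.tasklets = deque(tasklets)

    def run(self, budget):
        """Run at most `budget` steps; return True if work remains."""
        while budget > 0 and self.tasklets:
            t = self.tasklets.popleft()
            try:
                next(t)
            except StopIteration:
                continue  # finished tasklet is dropped
            self.tasklets.append(t)
            budget -= 1
        return bool(self.tasklets)


def schedule(processes, budget):
    """Round-robin over processes, taking control back after `budget` steps."""
    ready = deque(processes)
    while ready:
        p = ready.popleft()
        if p.run(budget):
            ready.append(p)


def demo():
    log = []
    p1 = Process([tasklet("a", 2, log), tasklet("b", 2, log)])
    p2 = Process([tasklet("c", 3, log)])
    schedule([p1, p2], budget=2)
    return log
```

A process holding a single tasklet behaves like a plain thread in this
model; adding more tasklets to it gives coroutine-like switching inside
that process.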

>>I want to provide an exception to kill tasklets.
>>Also it will be possible to just pick it off and drop it,
>>but I'm a little concerned about the C stack inside.
>>
> 
> As I said before, if there are no references left to a
> tasklet, there's no way it can ever be switched to again,
> so its C stack is no longer relevant. Unless you can have
> return addresses from one C stack pointing into another,
> or something... can you?


Well, the problem is that an extension *might* be sitting
inside a tasklet's stack with a couple of allocated
objects. I would assume that the extension frees these
objects when I send an exception to the tasklet.
But without that, I cannot be sure if all resources
are freed.

thanks a lot for your help - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net/
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
      where do you want to jump today?   http://www.stackless.com/




From jason@jorendorff.com  Fri Feb 22 09:31:55 2002
From: jason@jorendorff.com (Jason Orendorff)
Date: Fri, 22 Feb 2002 03:31:55 -0600
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <1b6b01c1bae8$9f6af350$e000a8c0@thomasnotebook>
Message-ID: <HFEKILOLEFEFMKAECNDLIEHIDEAA.jason@jorendorff.com>

Thomas Heller wrote:
> David Abrahams wrote:
> > What I don't understand is how the builtin metatype
> > gets away with Py_TPFLAGS_HAVE_GC when some of its instance 
> > types are not even heap-allocated.
>
> Hm, I don't understand you. Are you talking about Py_TPFLAGS_HEAPTYPE?

I think David is asking about line 1404 of Objects/typeobject.c,
where it says that PyType_Type is Py_TPFLAGS_HAVE_GC.
How can it have GC when many instances are static objects, not
allocated with PyObject_GC_VarNew()?

I don't know the answer.

## Jason Orendorff    http://www.jorendorff.com/


From martin@v.loewis.de  Fri Feb 22 10:03:53 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 22 Feb 2002 11:03:53 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <001101c1ba3d$15e9cda0$ba03a8c0@activestate.ca>
References: <001101c1ba3d$15e9cda0$ba03a8c0@activestate.ca>
Message-ID: <m3bsehal4m.fsf@mira.informatik.hu-berlin.de>

"Jeff Hobbs" <JeffH@ActiveState.com> writes:

> That's correct - I should have looked a bit more into what I did
> before (I was always tying in another GUI's event loop).  However,
> I don't see why you should not consider the extra event source.
> Tk uses this itself for X.  It would be something like:

That does not work, either. I'm using the patch attached below, and
I'm getting the output

...
setupproc called 729
setupproc called 730
setupproc called 731
setupproc called 732
setupproc called 733
setupproc called 734
setupproc called 735
...

That is, even though the setupproc is called, and even though the
select is not blocking anymore, DoOneEvent does not return (I don't
see the "Event done" messages).

Regards,
Martin

Index: _tkinter.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/_tkinter.c,v
retrieving revision 1.123
diff -u -r1.123 _tkinter.c
--- _tkinter.c	26 Jan 2002 20:21:50 -0000	1.123
+++ _tkinter.c	22 Feb 2002 09:56:13 -0000
@@ -133,6 +139,10 @@
    These locks expand to several statements and brackets; they should not be
    used in branches of if statements and the like.
 
+   To give other threads a chance to access Tcl while the Tk mainloop is
+   running, an input source is registered with Tcl which results in Tcl
+   not blocking for more than 20ms.
+
 */
 
 static PyThread_type_lock tcl_lock = 0;
@@ -237,24 +248,6 @@
 
 /**** Utils ****/
 
-#ifdef WITH_THREAD
-#ifndef MS_WINDOWS
-
-/* Millisecond sleep() for Unix platforms. */
-
-static void
-Sleep(int milli)
-{
-	/* XXX Too bad if you don't have select(). */
-	struct timeval t;
-	t.tv_sec = milli/1000;
-	t.tv_usec = (milli%1000) * 1000;
-	select(0, (fd_set *)0, (fd_set *)0, (fd_set *)0, &t);
-}
-#endif /* MS_WINDOWS */
-#endif /* WITH_THREAD */
-
-
 static char *
 AsString(PyObject *value, PyObject *tmp)
 {
@@ -1671,6 +1948,37 @@
 
 /** Event Loop **/
 
+#ifdef WITH_THREAD
+static int setupproc_registered;
+/*
+ *----------------------------------------------------------------------
+ *
+ * TkinterSetupProc --
+ *
+ *	This procedure implements the setup part of the Tkinter
+ *	event source.  It is invoked by Tcl_DoOneEvent before entering
+ *	the notifier to check for events on all displays.
+ *
+ * Results:
+ *	None.
+ *
+ * Side effects:
+ *	The maximum block time will be set to 20000 usecs to ensure that
+ *	the notifier returns control to Tcl.
+ *
+ *----------------------------------------------------------------------
+ */
+
+static void
+TkinterSetupProc(ClientData clientData, int flags)
+{
+    static Tcl_Time blockTime = { 0, 20000 };
+    static int i = 0;
+    printf("setupproc called %d\n",++i);
+    Tcl_SetMaxBlockTime(&blockTime);
+}
+#endif
+
 static PyObject *
 Tkapp_MainLoop(PyObject *self, PyObject *args)
 {
@@ -1682,22 +1990,29 @@
 	if (!PyArg_ParseTuple(args, "|i:mainloop", &threshold))
 		return NULL;
 
+#ifdef WITH_THREAD
+	if (!setupproc_registered) {
+		Tcl_CreateEventSource(TkinterSetupProc, NULL, NULL);
+		setupproc_registered = 1;
+	}
+#endif
+
 	quitMainLoop = 0;
 	while (Tk_GetNumMainWindows() > threshold &&
 	       !quitMainLoop &&
 	       !errorInCmd)
 	{
 		int result;
+						    
 
 #ifdef WITH_THREAD
 		Py_BEGIN_ALLOW_THREADS
 		PyThread_acquire_lock(tcl_lock, 1);
 		tcl_tstate = tstate;
-		result = Tcl_DoOneEvent(TCL_DONT_WAIT);
+		result = Tcl_DoOneEvent(0);
+		printf("Event done\n");
 		tcl_tstate = NULL;
 		PyThread_release_lock(tcl_lock);
-		if (result == 0)
-			Sleep(20);
 		Py_END_ALLOW_THREADS
 #else
 		result = Tcl_DoOneEvent(0);
@@ -2033,12 +2364,10 @@
 		PyThread_acquire_lock(tcl_lock, 1);
 		tcl_tstate = event_tstate;
 
-		result = Tcl_DoOneEvent(TCL_DONT_WAIT);
+		result = Tcl_DoOneEvent(0);
 
 		tcl_tstate = NULL;
 		PyThread_release_lock(tcl_lock);
-		if (result == 0)
-			Sleep(20);
 		Py_END_ALLOW_THREADS
 #else
 		result = Tcl_DoOneEvent(0);



From martin@v.loewis.de  Fri Feb 22 10:10:01 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 22 Feb 2002 11:10:01 +0100
Subject: [Python-Dev] PEP needed? Introducing Tcl objects
In-Reply-To: <002601c1ba46$ce683f20$ba03a8c0@activestate.ca>
References: <002601c1ba46$ce683f20$ba03a8c0@activestate.ca>
Message-ID: <m37kp5akue.fsf@mira.informatik.hu-berlin.de>

"Jeff Hobbs" <JeffH@ActiveState.com> writes:

> BTW in addition to my last message, you might want to create
> an ExitHandler that deletes the event source.  Also, you might
> add more code to the TkinterSetupProc to only set a block time
> if multiple threads are actually used (or only create the
> event source at that time).  This would make simple Tkinter
> apps be efficient and snappy all the time.

I'm not sure this will be necessary (provided I get this to work at
all); after all, all that the timeout will do is to set up the event
loop 50 times a second. Computers should have no problems with that
these days; in a snappy Tkinter app, there will be much more than 50
events per second. Furthermore, such a change would not affect
snappiness at all, only efficiency (and only slightly so).

Regards,
Martin


From martin@v.loewis.de  Fri Feb 22 10:25:09 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 22 Feb 2002 11:25:09 +0100
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <HFEKILOLEFEFMKAECNDLIEHIDEAA.jason@jorendorff.com>
References: <HFEKILOLEFEFMKAECNDLIEHIDEAA.jason@jorendorff.com>
Message-ID: <m3sn7t95kq.fsf@mira.informatik.hu-berlin.de>

"Jason Orendorff" <jason@jorendorff.com> writes:

> I think David is asking about line 1404 of Objects/typeobject.c,
> where it says that PyType_Type is Py_TPFLAGS_HAVE_GC.
> How can it have GC when many instances are static objects, not
> allocated with PyObject_GC_VarNew()?

Because the type type implements tp_is_gc (typeobject.c:1378),
declaring static type objects as not being gc. In turn, garbage
collection will not attempt to look at the GC header of these type
objects.
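
The same check can be observed from Python. This is only an
illustrative sketch for a present-day CPython 3: it relies on the
CPython-specific __flags__ attribute, and the bit values below are
taken from current headers, so treat them as assumptions for any other
version. The helper name type_is_gc merely mirrors the C function.

```python
# Bit values as in current CPython's Include/object.h (assumed here;
# check the headers of the version you build against).
Py_TPFLAGS_HEAPTYPE = 1 << 9
Py_TPFLAGS_HAVE_GC = 1 << 14


def type_is_gc(tp):
    """Python rendering of the check: only heap types carry a GC header."""
    return bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE)


class Heap(object):
    """A class statement creates a heap-allocated type object."""


# The metatype itself advertises GC support ...
metatype_has_gc = bool(type.__flags__ & Py_TPFLAGS_HAVE_GC)
# ... but a statically allocated instance such as int opts out,
static_is_gc = type_is_gc(int)
# while a type created at run time opts in.
heap_is_gc = type_is_gc(Heap)
```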

Regards,
Martin


From David Abrahams" <david.abrahams@rcn.com  Fri Feb 22 12:34:23 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 22 Feb 2002 07:34:23 -0500
Subject: [Python-Dev] A little GC confusion
References: <HFEKILOLEFEFMKAECNDLIEHIDEAA.jason@jorendorff.com> <m3sn7t95kq.fsf@mira.informatik.hu-berlin.de>
Message-ID: <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com>

----- Original Message -----
From: "Martin v. Loewis" <martin@v.loewis.de>
> "Jason Orendorff" <jason@jorendorff.com> writes:
>
> > I think David is asking about line 1404 of Objects/typeobject.c,
> > where it says that PyType_Type is Py_TPFLAGS_HAVE_GC.
> > How can it have GC when many instances are static objects, not
> > allocated with PyObject_GC_VarNew()?
>
> Because the type type implements tp_is_gc (typeobject.c:1378),
> declaring static type objects as not being gc. In turn, garbage
> collection will not attempt to look at the GC header of these type
> objects.


Aha! And the implementation is...

static int
type_is_gc(PyTypeObject *type)
{
    return type->tp_flags & Py_TPFLAGS_HEAPTYPE;
}

so, wouldn't it make more sense that the Python source always checks
Py_TPFLAGS_HEAPTYPE before tp_is_gc?

Also, is there any guideline for which type slots get automatically copied
from the base type? Since my slots are nearly all zero I expected to inherit
most of the slots from type_type.

-Dave



From David Abrahams" <david.abrahams@rcn.com  Fri Feb 22 13:07:28 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 22 Feb 2002 08:07:28 -0500
Subject: [Python-Dev] A little GC confusion
References: <HFEKILOLEFEFMKAECNDLIEHIDEAA.jason@jorendorff.com> <m3sn7t95kq.fsf@mira.informatik.hu-berlin.de> <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com>
Message-ID: <150601c1bba1$e22467d0$0500a8c0@boostconsulting.com>

----- Original Message -----
From: "David Abrahams" <david.abrahams@rcn.com>

> ----- Original Message -----
> From: "Martin v. Loewis" <martin@v.loewis.de>

> > Because the type type implements tp_is_gc (typeobject.c:1378),
> > declaring static type objects as not being gc. In turn, garbage
> > collection will not attempt to look at the GC header of these type
> > objects.
>
>
> Aha! And the implementation is...
>
> static int
> type_is_gc(PyTypeObject *type)
> {
>     return type->tp_flags & Py_TPFLAGS_HEAPTYPE;
> }
>
> so, wouldn't it make more sense that the Python source always checks
> Py_TPFLAGS_HEAPTYPE before tp_is_gc?

Nice try, but no cigar I'm afraid: copying the tp_is_gc slot from
PyType_Type into my metatype before PyType_Ready() doesn't prevent the
crash.

Does anyone really understand what happens here?




From mwh@python.net  Fri Feb 22 13:46:10 2002
From: mwh@python.net (Michael Hudson)
Date: 22 Feb 2002 13:46:10 +0000
Subject: [Python-Dev] 2.2.1 issues
In-Reply-To: martin@v.loewis.de's message of "19 Feb 2002 21:29:49 +0100"
References: <2madu5zhnj.fsf@starship.python.net> <3C726270.7D33E687@lemburg.com> <m3bsel8bb6.fsf@mira.informatik.hu-berlin.de>
Message-ID: <2mzo21fx3x.fsf@starship.python.net>

martin@v.loewis.de (Martin v. Loewis) writes:

> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> > Right. 1) was caused by 2).
> 
> That wasn't actually the case. The overwriting of memory was really
> independent of the error in surrogate processing, and can be fixed
> independently.

OK, thanks for the clarification.

> > As a result, modules using unpaired surrogates in Unicode
> > literals are simply broken in Python <= 2.2.0.
> 
> I think this is unimportant enough to just accept this bug for Python
> 2.2.x. If people ever run into the problem, well: just don't do this.
> Unpaired surrogates will be entirely in Unicode 3.2.

I think you're missing a word in the last sentence?

> > The problem with backporting this patch is that in order
> > for Python to properly recompile any broken module, the
> > magic will have to be changed. Question is whether this
> > is a reasonable thing to do in a patch level release...
> 
> The memory-overwriting problem can be fixed independently, e.g. with
> 
> https://sourceforge.net/tracker/download.php?group_id=5470&atid=105470&file_id=15248&aid=495401

Thanks, I've now checked this fix in, and will consider the whole
issue to be closed until further notice.

Cheers,
M.

-- 
  That's why the smartest companies use Common Lisp, but lie about it
  so all their competitors think Lisp is slow and C++ is fast.  (This
  rumor has, however, gotten a little out of hand. :)
                                        -- Erik Naggum, comp.lang.lisp


From mwh@python.net  Fri Feb 22 14:10:10 2002
From: mwh@python.net (Michael Hudson)
Date: 22 Feb 2002 14:10:10 +0000
Subject: [Python-Dev] What's blocking 2.2.1?
Message-ID: <2mwux5fvzx.fsf@starship.python.net>

I think I'm caught up on porting fixes from the trunk to the
release22-maint branch.  Now would be a good time to shout if you
think I've missed something (although I might not read my email before
Monday).

(I've entirely ignored the Mac subtree here.  Jack, that's your
problem, I'm afraid).

There are still some bugs in the trunk that need fixing, though.

[ #496873 ] cPickle / time.struct_time loop
 - I think this one is firmly in Guido's domain.
[ #501591 ] dir() doc is old
 - probably not that hard.

are all that are marked as 2.2.1 candidates (apart from two MacOS
bugs), but there are probably more.  I don't want to trawl through all
the 250+ (!) open bugs to look for them if I don't have to, so can I
ask people to nominate bugs they know of?

Cheers,
M.

-- 
  We've had a lot of problems going from glibc 2.0 to glibc 2.1.
  People claim binary compatibility.  Except for functions they
  don't like.                       -- Peter Van Eynde, comp.lang.lisp


From jacobs@penguin.theopalgroup.com  Fri Feb 22 15:05:09 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 22 Feb 2002 10:05:09 -0500 (EST)
Subject: [Python-Dev] Meta-reflections
In-Reply-To: <043801c1bb39$3ed8a540$6d94fea9@newmexico>
Message-ID: <Pine.LNX.4.33.0202220953380.1315-100000@penguin.theopalgroup.com>

On Fri, 22 Feb 2002, Samuele Pedroni wrote:
> I was thinking along the line of the C equiv of this:
[...code snipped...]

[
  An updated version of a comment to SF issue:
  http://sourceforge.net/tracker/?func=detail&atid=105470&aid=520644&group_id=5470
]

Samuele's sltattr.py is an interesting approach, though I am not entirely
sure it is sufficient to address all of the problems with slots.  Here is a
mostly complete list of smaller changes that are somewhat orthogonal to how
we address accesses to __dict__:

  1) Flatten slot lists:  Change obj.__class__.__slots__ to return an
     immutable list of all slot descriptors in the object (including all
     those of base classes).  The motivation for this is similar in spirit
     to storing a flattened __mro__.

     The advantages of this change are:

     a) allows for fast and explicit object reflection that correctly finds
        all dict attributes and all slot attributes.

     b) allows reflection implementations (like vars(object) and pickle) to
        treat dict and slot attrs differently if we choose not to proxy
        __dict__.  This has several advantages, as explained in change #2.
        Also importantly, this way it is not possible to "lose" descriptors
        permanently by deleting them from obj.__class__.__dict__.

  2) Update reflection API even if we do not choose to proxy __dict__: Alter
     vars(object) to return a dictionary of all attributes, including both
     the contents of the non-proxied __dict__ and the valid attributes that
     result from iterating over __slots__ and evaluating the descriptors.
     The details of how this is best implemented depend on how we wish to
     define the behavior of modifying the resulting dictionary.  It could be
     either:

           a) explicitly immutable, which involves creating proxy objects
           b) mutable, which involves copying
           c) undefined, which means implicitly immutable

     Aside from the questions over the nature of the return type, this
     implementation (coupled with #1) has distinct advantages.  Specifically
     the native object.__dict__ has a very natural internal representation
     that pairs attribute names directly with values.  In contrast, a fair
     amount of additional work is needed to extract the slots that store
     values and create a dictionary of their names and values.  Other
     implementations will require a great deal more work since they would
     have to traverse through base classes to collect slot descriptors.

  3) Flatten slot inheritance:  Update the new-style object inheritance
     mechanism to re-use slots of the same name, rather than creating a new
     slot and hiding the old.  This makes the inheritance semantics of slots
     equivalent to those of normal instance attributes and avoids
     introducing an ad-hoc and obscure method of data hiding.

  4) Update standard library to use new reflection API (and make them robust
     to properties at the same time) if we choose not to proxy __dict__.
     Virtually all of the changes are simple and involve updating these
     constructs:

           a) obj.__dict__
           b) obj.__dict__[blah]
           c) obj.__dict__[blah] = x

     (What these will become depends on other factors, including the context
      and semantics of vars(obj).)

     Here is a fairly complete list of Python 2.2 modules that will need to
     be updated:
       copy, copy_reg, inspect, pickle, pydoc, cPickle, Bastion, codeop,
       dis, doctest, gettext, ihooks, imputil, knee, pdb, profile, rexec,
       rlcompleter, tempfile, unittest, xmllib, xmlrpclib

  5) (NB: potentially controversial and not required) We could alter the
     descriptor protocol to make slots (and properties) more transparent
     when the values they reference do not exist.  Here is an example to
     illustrate this:

           class A(object):
             foo = 1

           class B(A):
             __slots__ = ('foo',)

           b = B()
           print b.foo
           > 1 or AttributeError?

     Currently an AttributeError is raised.  However, it is a fairly easy
     change to make AttributeErrors signal that attribute resolution is
     to continue until either a valid descriptor is evaluated, an
     instance-attribute is found, or until the resolution fails after
     searching the meta-type, the type, and the instance dictionary.
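
To illustrate changes #1 and #2, here is a pure-Python approximation
of what a flattened slot list and a slot-aware vars() might look like.
It is a sketch in present-day Python 3, not the proposed C patch; the
function names all_slot_names and all_attrs are hypothetical, and
name-mangled (double-underscore) slots are ignored for brevity.

```python
def all_slot_names(cls):
    """Collect slot names from every class in the MRO, base classes first."""
    names = []
    for klass in reversed(cls.__mro__):
        slots = klass.__dict__.get("__slots__", ())
        if isinstance(slots, str):  # __slots__ = "x" means a single slot
            slots = (slots,)
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue  # special entries, not ordinary attributes
            if name not in names:
                names.append(name)
    return tuple(names)


def all_attrs(obj):
    """Return a new dict of instance attributes from __dict__ and filled slots."""
    attrs = dict(getattr(obj, "__dict__", {}))
    for name in all_slot_names(type(obj)):
        try:
            attrs[name] = getattr(obj, name)
        except AttributeError:
            pass  # slot exists but holds no value yet
    return attrs


class Base(object):
    __slots__ = ("x",)


class Derived(Base):
    __slots__ = ("y",)


d = Derived()
d.x = 1
slot_names = all_slot_names(Derived)
attrs = all_attrs(d)
```

This version corresponds to option (b) of change #2: the result is a
copy, so modifying it does not affect the object.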

I am prepared to submit patches to address each of these issues.  However, I
do want feedback beforehand, so that I do not waste time implementing
something that will never be accepted.

Regards,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From Jack.Jansen@oratrix.com  Fri Feb 22 15:07:10 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Fri, 22 Feb 2002 16:07:10 +0100
Subject: [Python-Dev] What's blocking 2.2.1?
In-Reply-To: <2mwux5fvzx.fsf@starship.python.net>
Message-ID: <D83C0489-27A5-11D6-BAAA-0030655234CE@oratrix.com>

On Friday, February 22, 2002, at 03:10 , Michael Hudson wrote:

> I think I'm caught up on porting fixes from the trunk to the
> release22-maint branch.  Now would be a good time to shout if you
> think I've missed something (although I might not read my email before
> Monday).
>
> (I've entirely ignored the Mac subtree here.  Jack, that's your
> problem, I'm afraid).

I'll do a quick check to see whether there's anything that is vital for 
Mac OS X unix Python that has to go in, I'll let you know.

I think MacPython will have to be done after the unix/win distribution 
has been made, but that depends on your timeframe (i.e. if you can hand 
the Mac/ portion of the tree over to me real soon now I can try and 
squeeze the time in).
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From fdrake@acm.org  Fri Feb 22 15:26:21 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Feb 2002 10:26:21 -0500
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <m3sn7t95kq.fsf@mira.informatik.hu-berlin.de>
References: <HFEKILOLEFEFMKAECNDLIEHIDEAA.jason@jorendorff.com>
 <m3sn7t95kq.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15478.25373.94172.389436@grendel.zope.com>

"Jason Orendorff" <jason@jorendorff.com> writes:
 > How can it have GC when many instances are static objects, not
 > allocated with PyObject_GC_VarNew()?

Martin v. Loewis writes:
 > Because the type type implements tp_is_gc (typeobject.c:1378),
 > declaring static type objects as not being gc. In turn, garbage
 > collection will not attempt to look at the GC header of these type
 > objects.

I'm starting to really fear writing the documentation for all this!
There are going to be a lot of mostly-inscrutable details to get
right, and people are already asking the questions, so it really will
need to be written down.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From jafo@tummy.com  Fri Feb 22 23:01:27 2002
From: jafo@tummy.com (Sean Reifschneider)
Date: Fri, 22 Feb 2002 16:01:27 -0700
Subject: [Python-Dev] patch: speed up name access by up to 80%
In-Reply-To: <20020211225538.GA93506@hishome.net>; from oren-py-d@hishome.net on Mon, Feb 11, 2002 at 05:55:38PM -0500
References: <20020211220954.A12061@hishome.net> <3C6844F9.954E1DA2@metaslash.com> <20020211225538.GA93506@hishome.net>
Message-ID: <20020222160127.A13114@tummy.com>

On Mon, Feb 11, 2002 at 05:55:38PM -0500, Oren Tirosh wrote:
>I got strange results comparing to the python2.2 RPM package (some faster,
>some slower).  I didn't start to get consistent results until I used two

The 2.2-3 RPM from python.org does "./configure --prefix=/usr" (unless you
enable the pymalloc or ipv6 flags before building, the RPMs up there do
not).  It then does "make".

About the only thing that may be unusual would be that RPM may
automatically strip the resulting binary.  I'm not doing it manually.

Interesting that you're seeing oddness.  Note that if you download the
SRPM, you can install the .src.rpm and then build it by doing:

   rpm -bc /usr/src/redhat/SPECS/python-2.2.spec

(or other similar location that the spec file would be installed depending
on your distribution).  Also note that you can have it build a patched
version by adding the patch below the "Patch1:" line as "Patch2:", and also
add a line below "%patch1" which reads "%patch2" (unless you have to give
options such as "-p1" -- see the example for "%patch0").  You can then
do a fresh build of the code using the above command.  You can also build
an RPM by using "rpm -ba [...]".

One of the big wins with a packaging system -- reproducibility...

Sean
-- 
 Let us live!!!  Let us love!!!  Let us share the deepest secrets
 of our souls!!!       You first.
Sean Reifschneider, Inimitably Superfluous <jafo@tummy.com>
tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python


From tim.one@comcast.net  Fri Feb 22 23:55:05 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 22 Feb 2002 18:55:05 -0500
Subject: [Python-Dev] What's blocking 2.2.1?
In-Reply-To: <2mwux5fvzx.fsf@starship.python.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEKKNOAA.tim.one@comcast.net>

[Michael Hudson]
> I think I'm caught up on porting fixes from the trunk to the
> release22-maint branch.

Voluminous thanks for the work, Michael!

> There are still some bugs in the trunk that need fixing, though.
> ...
> [ #501591 ] dir() doc is old
>  - probably not that hard.

I just reassigned that one to me; Fred is off today, and it's shallow (the
docstring got updated to a correct state when I implemented 2.2 dir()
changes, but somehow or other the docs didn't).  I'll fix this before I pass
out tonight.



From martin@v.loewis.de  Sat Feb 23 00:47:44 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 23 Feb 2002 01:47:44 +0100
Subject: [Python-Dev] 2.2.1 issues
In-Reply-To: <2mzo21fx3x.fsf@starship.python.net>
References: <2madu5zhnj.fsf@starship.python.net>
 <3C726270.7D33E687@lemburg.com>
 <m3bsel8bb6.fsf@mira.informatik.hu-berlin.de>
 <2mzo21fx3x.fsf@starship.python.net>
Message-ID: <m3bsehvxan.fsf@mira.informatik.hu-berlin.de>

Michael Hudson <mwh@python.net> writes:

> > Unpaired surrogates will be entirely in Unicode 3.2.
> 
> I think you're missing a word in the last sentence?

"banned" or "outlawed" is the right word, I guess :-)

> Thanks, I've now checked this fix in, and will consider the whole
> issue to be closed until further notice.

Thanks! If I find the time, I'll review the code.

Regards,
Martin


From martin@v.loewis.de  Sat Feb 23 00:43:55 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 23 Feb 2002 01:43:55 +0100
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com>
References: <HFEKILOLEFEFMKAECNDLIEHIDEAA.jason@jorendorff.com>
 <m3sn7t95kq.fsf@mira.informatik.hu-berlin.de>
 <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com>
Message-ID: <m3k7t5vxh0.fsf@mira.informatik.hu-berlin.de>

"David Abrahams" <david.abrahams@rcn.com> writes:

> static int
> type_is_gc(PyTypeObject *type)
> {
>     return type->tp_flags & Py_TPFLAGS_HEAPTYPE;
> }
> 
> so, wouldn't it make more sense that the Python source always checks
> Py_TPFLAGS_HEAPTYPE before tp_is_gc?

No. Most GC objects do not have the HEAPTYPE flag (they actually
aren't even type objects).
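
A small illustration of that point, as a sketch for a present-day
CPython 3 (the __flags__ attribute is CPython-specific, the bit values
are assumed from current headers, and the helper name describe is made
up here). A list instance is tracked by the collector although its
type is a static type without HEAPTYPE:

```python
import gc

# Bit values as in current CPython's Include/object.h (assumed here).
Py_TPFLAGS_HEAPTYPE = 1 << 9
Py_TPFLAGS_HAVE_GC = 1 << 14


def describe(obj):
    """Report the GC-related flags of obj's type and whether obj is tracked."""
    flags = type(obj).__flags__
    return {
        "type_has_gc": bool(flags & Py_TPFLAGS_HAVE_GC),
        "type_is_heaptype": bool(flags & Py_TPFLAGS_HEAPTYPE),
        "tracked": gc.is_tracked(obj),
    }


list_info = describe([])  # container object: collected, yet list is a static type
int_info = describe(0)    # atomic object: its type does not take part in GC
```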

Regards,
Martin



From martin@v.loewis.de  Sat Feb 23 00:45:58 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 23 Feb 2002 01:45:58 +0100
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <150601c1bba1$e22467d0$0500a8c0@boostconsulting.com>
References: <HFEKILOLEFEFMKAECNDLIEHIDEAA.jason@jorendorff.com>
 <m3sn7t95kq.fsf@mira.informatik.hu-berlin.de>
 <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com>
 <150601c1bba1$e22467d0$0500a8c0@boostconsulting.com>
Message-ID: <m3g03tvxdl.fsf@mira.informatik.hu-berlin.de>

"David Abrahams" <david.abrahams@rcn.com> writes:

> Nice try, but no cigar I'm afraid: copying the tp_is_gc slot from
> PyType_Type into my metatype before PyType_Ready() doesn't prevent the
> crash.
> 
> Does anyone really understand what happens here?

Understand why your code crashes? Because there is a bug in it...  To
understand what the bug is, one would have to study your code first.

Regards,
Martin



From fdrake@acm.org  Sat Feb 23 02:09:33 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Feb 2002 21:09:33 -0500
Subject: [Python-Dev] What's blocking 2.2.1?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEKKNOAA.tim.one@comcast.net>
References: <2mwux5fvzx.fsf@starship.python.net>
 <LNBBLJKPBEHFEDALKOLCMEKKNOAA.tim.one@comcast.net>
Message-ID: <15478.63965.177519.255626@grendel.zope.com>

Tim Peters writes:
 > I just reassigned that one to me; Fred is off today, and it's shallow (the
 > docstring got updated to a correct state when I implemented 2.2 dir()

Thanks, Tim!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From David Abrahams" <david.abrahams@rcn.com  Sat Feb 23 02:44:45 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 22 Feb 2002 21:44:45 -0500
Subject: [Python-Dev] A little GC confusion
References: <HFEKILOLEFEFMKAECNDLIEHIDEAA.jason@jorendorff.com><m3sn7t95kq.fsf@mira.informatik.hu-berlin.de><14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com><150601c1bba1$e22467d0$0500a8c0@boostconsulting.com> <m3g03tvxdl.fsf@mira.informatik.hu-berlin.de>
Message-ID: <013901c1bc14$ba52c940$0500a8c0@boostconsulting.com>

----- Original Message -----
From: "Martin v. Loewis" <martin@v.loewis.de>


> "David Abrahams" <david.abrahams@rcn.com> writes:
>
> > Nice try, but no cigar I'm afraid: copying the tp_is_gc slot from
> > PyType_Type into my metatype before PyType_Ready() doesn't prevent the
> > crash.
> >
> > Does anyone really understand what happens here?
>
> Understand why your code crashes?

I'm not asking that. I'm asking if anyone really understands how the flags
and tp_xxx slots are supposed to interact.

> Because there is a bug in it...

I /guess/ there's a bug in my code if you measure it against the standard
that says "if it doesn't work with the current Python source code, it's
buggy". I'd consider that standard a bit more legitimate if I could find,
for example, a mention of Py_TPFLAGS_HEAPTYPE *anywhere* in the Python docs.
As it stands, your position seems a bit more unhelpful than necessary.

I can live with incomplete documentation if there's someone around who can
explain how the software is supposed to be used; I just want to fill in the
holes so that I know I'm not making important errors. I thought I was doing
everything right until a few days ago when someone tried something new with
my code and uncovered the GC crash. One can only cover so many cases with
tests. Even if I repair this problem, how can I be sure I've got the rest of
the formula right? Better docs would fix that problem, and give us an
objective standard against which to judge which code has bugs. In lieu of
that, I would hope that my questions would be answered in good faith.

[In the meantime, GC remains turned off for my types and metatypes]

> To understand what the bug is, one would have to study your code first.

I posted the code yesterday. Did you miss it? I'm sure you could figure out
how to apply the simple modification described at the top of this message.

-Dave




From jeremy@alum.mit.edu  Sat Feb 23 03:51:28 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 22 Feb 2002 22:51:28 -0500
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <013901c1bc14$ba52c940$0500a8c0@boostconsulting.com>
Message-ID: <BCEJJGNAEAKMACBPLMEPOEEECAAA.jeremy@alum.mit.edu>

[David Abrahams]
> I /guess/ there's a bug in my code if you measure it against the standard
> that says "if it doesn't work with the current Python source code, it's
> buggy". I'd consider that standard a bit more legitimate if I could find,
> for example, a mention of Py_TPFLAGS_HEAPTYPE *anywhere* in the
> Python docs.

I've been struggling with the meaning of the various TPFLAGS myself.  I
don't think it's documented anywhere, and I don't think anyone except Guido
really understands what all the flags mean.

One property of types that do not define HEAPTYPE is that their
__module__ attribute is always __builtin__.  This makes them mighty hard to
pickle.  It further suggests that every type that isn't a builtin type
should define HEAPTYPE.

There are lots of other cases affected by HEAPTYPE.  I imagine you've done
the same grep that I did.

Jeremy



From jeremy@alum.mit.edu  Sat Feb 23 03:59:58 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 22 Feb 2002 22:59:58 -0500
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <BCEJJGNAEAKMACBPLMEPOEEECAAA.jeremy@alum.mit.edu>
Message-ID: <BCEJJGNAEAKMACBPLMEPAEEGCAAA.jeremy@alum.mit.edu>

[I wrote:]
> One property of types that do not define HEAPTYPE is that their
> __module__ attribute is always __builtin__.  This makes them mighty
> hard to pickle.  It further suggests that every type that isn't a
> builtin type should define HEAPTYPE.

I don't think I made much sense above.  I meant to say: When my C types
didn't define HEAPTYPE, it was impossible to pickle them.  When I added the
HEAPTYPE and defined __safe_for_unpickling__ as a data member, it became
possible to pickle instances of those types.  It was far from obvious,
though, that I needed to do those two things to make pickling work.
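
The __module__ half of this can be sketched from pure Python. This is only
an illustration under stated assumptions: it uses a modern Python 3 (where
__builtin__ is spelled builtins), and Point and demo_ext are hypothetical
names standing in for a C type and its extension module. It does not touch
the HEAPTYPE flag itself; it only shows that pickle locates a class through
its __module__ and name.

```python
import pickle
import sys
import types


class Point:
    """Stand-in for an extension type; the name is hypothetical."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def can_pickle(obj):
    """Return True if pickle.dumps() accepts obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError):
        return False
    return True


# Mimic a type whose __module__ claims it lives in builtins: pickle looks
# for builtins.Point, does not find it, and refuses to pickle the instance.
Point.__module__ = "builtins"
broken = can_pickle(Point(1, 2))

# Point __module__ at a module that really exposes the class.
demo_ext = types.ModuleType("demo_ext")  # hypothetical extension module
demo_ext.Point = Point
sys.modules["demo_ext"] = demo_ext
Point.__module__ = "demo_ext"
restored = pickle.loads(pickle.dumps(Point(1, 2)))
```

With the misleading __module__ the instance cannot be pickled; once
__module__ names a module that actually holds the class, the round trip
works.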

Jeremy



From tim.one@comcast.net  Sat Feb 23 05:10:42 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 23 Feb 2002 00:10:42 -0500
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <BCEJJGNAEAKMACBPLMEPOEEECAAA.jeremy@alum.mit.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCEELINOAA.tim.one@comcast.net>

[Jeremy Hylton]
> I've been struggling with the meaning of the various TPFLAGS myself.  I
> don't think it's documented anywhere, and I don't think anyone
> except Guido really understands what all the flags mean.

I agree that at this point Guido is the only one who fully understands what
they were all *intended* to mean, but I don't believe even Guido can tell
you (without the same kinds of study and experimentation and hair-pulling
you're doing) what the flags actually do today in all circumstances and
combinations.  A consequence is that neither can he (or anyone else) always
predict what you need to do to get a desired result.

What shipped in 2.2 was solid to the extent that it supported everything
used by the Python core.  You and David are pushing it in other directions,
and while it was intended to support them, this stuff was never really
*tried* at the C level beyond the demo xxsubtype.c module and some
ExtensionClass fiddling.  Most "weird experiments" were tried at the Python
level instead, just because it's so much more time-efficient to try stuff in
Python, and time was in short supply.

So you're pioneers, and you've got to draw your own maps of the new
territory.  Luckily, God isn't resting yet, so He can still create new
lifeforms if needed <wink>.

> One property of types that do not define HEAPTYPE is that their
> __module__ attribute is always __builtin__.  This makes them
> mighty hard to pickle.  It further suggests that every type that isn't
> a builtin type should define HEAPTYPE.

Yup, all kinds of questions get answered by "does it have HEAPTYPE?" that
don't have any obvious connection to heaps.  One of my favorites is this
seemingly straightforward branch in type_repr():

	if (type->tp_flags & Py_TPFLAGS_HEAPTYPE)
		kind = "class";
	else
		kind = "type";

The philosophical questions that raises could go on for pages <wink>.
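
The same flag test can be tried from Python. A minimal sketch, assuming a
modern CPython 3 where a type's tp_flags are readable as __flags__ and
Py_TPFLAGS_HEAPTYPE is bit 9 (the value in CPython's object.h); Spam is a
hypothetical class, and the bit value may not hold on other implementations.

```python
# Bit value copied from CPython's Include/object.h; an assumption on any
# other implementation.
Py_TPFLAGS_HEAPTYPE = 1 << 9


def is_heap_type(tp):
    """Return True if the type object carries the HEAPTYPE flag."""
    return bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE)


class Spam:
    """A class created at run time by a class statement."""


static_result = is_heap_type(int)   # int is a statically allocated C type
heap_result = is_heap_type(Spam)    # Spam was allocated on the heap
print(static_result, heap_result)
```

The statically allocated int should report no HEAPTYPE flag, while the
class statement should produce a type that has it.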



From adam@isdn.net.il  Sat Feb 23 08:28:49 2002
From: adam@isdn.net.il (adam)
Date: Sat, 23 Feb 2002 10:28:49 +0200
Subject: [Python-Dev] remove
Message-ID: <002001c1bc44$1f0e1860$0101c80a@LocalHost>

pls remove us from yr mail list !!

>>>>> "adam" ==   <adam@isdn.net.il> writes:

    adam> warning before a legal claim

    adam> Remove us from your announcements list !!

You are not on any "announcements list".  You are, however on
python-dev@python.org which is a technical mailing list for developers
of the free Python programming language.  If you want to be removed
from that, please let me know.  Please do not mail administrivia
requests to the whole mailing list.

-Barry



From mwh@python.net  Sat Feb 23 08:46:43 2002
From: mwh@python.net (Michael Hudson)
Date: 23 Feb 2002 08:46:43 +0000
Subject: [Python-Dev] What's blocking 2.2.1?
In-Reply-To: Jack Jansen's message of "Fri, 22 Feb 2002 16:07:10 +0100"
References: <D83C0489-27A5-11D6-BAAA-0030655234CE@oratrix.com>
Message-ID: <2m1yfcmvpo.fsf@starship.python.net>

Jack Jansen <Jack.Jansen@oratrix.com> writes:

> On Friday, February 22, 2002, at 03:10 , Michael Hudson wrote:
> 
> > I think I'm caught up on porting fixes from the trunk to the
> > release22-maint branch.  Now would be a good time to shout if you
> > think I've missed something (although I might not read my email before
> > Monday).
> >
> > (I've entirely ignored the Mac subtree here.  Jack, that's your
> > problem, I'm afraid).
> 
> I'll do a quick check to see whether there's anything that is vital for 
> Mac OS X unix Python that has to go in, I'll let you know.

Fine.  I've done the first one.

> I think MacPython will have to be done after the unix/win distribution 
> has been made, but that depends on your timeframe (i.e. if you can hand 
> the Mac/ portion of the tree over to me real soon now I can try and 
> squeeze the time in).

?  Sorry, you've lost me, I'm afraid.  The Mac/ portion of the tree is
yours.

Cheers,
M.

-- 
  That's why the smartest companies use Common Lisp, but lie about it
  so all their competitors think Lisp is slow and C++ is fast.  (This
  rumor has, however, gotten a little out of hand. :)
                                        -- Erik Naggum, comp.lang.lisp


From mwh@python.net  Sat Feb 23 08:58:10 2002
From: mwh@python.net (Michael Hudson)
Date: 23 Feb 2002 08:58:10 +0000
Subject: [Python-Dev] What's blocking 2.2.1?
In-Reply-To: Tim Peters's message of "Fri, 22 Feb 2002 18:55:05 -0500"
References: <LNBBLJKPBEHFEDALKOLCMEKKNOAA.tim.one@comcast.net>
Message-ID: <2mbsegtw0t.fsf@starship.python.net>

Tim Peters <tim.one@comcast.net> writes:

> [Michael Hudson]
> > I think I'm caught up on porting fixes from the trunk to the
> > release22-maint branch.
> 
> Voluminous thanks for the work, Michael!

It's now *really* easy for me to get checkins across to the branch.
Does anyone else use gnus to read their -checkins mail?  I could
probably put the little bundle of scripts I use under Tools/scripts/.

Cheers,
M.

-- 
  In short, just business as usual in the wacky world of floating
  point <wink>.                        -- Tim Peters, comp.lang.python


From martin@v.loewis.de  Sat Feb 23 10:51:57 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 23 Feb 2002 11:51:57 +0100
Subject: [Python-Dev] A little GC confusion
In-Reply-To: <150601c1bba1$e22467d0$0500a8c0@boostconsulting.com>
References: <HFEKILOLEFEFMKAECNDLIEHIDEAA.jason@jorendorff.com>
 <m3sn7t95kq.fsf@mira.informatik.hu-berlin.de>
 <14c501c1bb9d$d774a510$0500a8c0@boostconsulting.com>
 <150601c1bba1$e22467d0$0500a8c0@boostconsulting.com>
Message-ID: <m3wux45v3m.fsf@mira.informatik.hu-berlin.de>

"David Abrahams" <david.abrahams@rcn.com> writes:

> Nice try, but no cigar I'm afraid: copying the tp_is_gc slot from
> PyType_Type into my metatype before PyType_Ready() doesn't prevent the
> crash.
> 
> Does anyone really understand what happens here?

After studying your code in a debugger, it turns out that the code now
crashes for a different reason: The "AA" class created in make_class
is traversed in subtract_refs. To do so, its type's traverse function
is invoked, i.e. classtype_meta_object.tp_traverse. This is a null
pointer, hence the crash.

If you want objects (in your case, classes) to participate in GC, the
type of the objects (in your case, the metaclass) needs to implement the GC
API. IOW, don't set Py_TPFLAGS_HAVE_GC in a type unless you also set
tp_clear and tp_traverse in the same type, see

http://www.python.org/doc/current/api/supporting-cycle-detection.html

for details. This likely has been the problem all the time; if I
remove tp_is_gc, but implement tp_traverse, your test case (import
AA,pdb) does not crash anymore.
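
The relationship can be observed from Python as well. A minimal sketch,
assuming a modern CPython 3 where tp_flags are readable as __flags__ and
Py_TPFLAGS_HAVE_GC is bit 14 (the value in CPython's object.h); Meta is a
hypothetical metaclass that simply inherits the traverse and clear slots of
type, which is the part the C metatype in this thread was missing.

```python
import gc

# Bit value copied from CPython's Include/object.h; an assumption on any
# other implementation.
Py_TPFLAGS_HAVE_GC = 1 << 14


class Meta(type):
    """Hypothetical metaclass; inherits type's traverse and clear slots."""


class AA(metaclass=Meta):
    pass


# The metatype advertises GC support, so its instances (classes) are tracked.
has_gc_flag = bool(Meta.__flags__ & Py_TPFLAGS_HAVE_GC)
tracked = gc.is_tracked(AA)

# gc.get_referents() reports what the C-level traverse function of
# type(AA), that is Meta, visits when handed AA.
referents = gc.get_referents(AA)
print(has_gc_flag, tracked, len(referents))
```

Because traversing AA goes through its type's traverse slot, a metatype
that sets the GC flag without supplying that slot leaves the collector
with a null pointer to call.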

BTW, gcc rejects the code you've posted, as you cannot use
PyType_Type.tp_basicsize in an initializer of a global object (it's
not a constant).

HTH,
Martin


From Jack.Jansen@oratrix.com  Mon Feb 25 15:23:09 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Mon, 25 Feb 2002 16:23:09 +0100
Subject: [Python-Dev] Pthreads wizard needed to look at bug #522393
Message-ID: <9314C5E0-2A03-11D6-8301-0030655234CE@oratrix.com>

Folks,
in the process of testing 2.2.1 I ran into a bug on SGI that has been 
there since at least 2.2: if you build with threads you get an undefined 
error on pthread_detach() while linking the interpreter.

I think the solution is to change the autoconf test that decides which 
libraries to add for pthread (make it refer not only to pthread_create() 
but also to pthread_detach()), but as pthread_detach apparently isn't 
available in all pthread implementations (if I understand the small 
forest of ifdefs inside thread_pthread.h correctly) I'm not sure this 
won't break anything else.

Could some pthread guru please have a look at bug #522393?
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From skip@pobox.com  Mon Feb 25 15:53:34 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 25 Feb 2002 09:53:34 -0600
Subject: [Python-Dev] PEP 282: A Logging System  --  comments please
In-Reply-To: <20020215164333.A31903@ActiveState.com>
References: <20020215164333.A31903@ActiveState.com>
Message-ID: <15482.24062.951258.55849@12-248-41-177.client.attbi.com>

    Trent> More standard Handlers may be implemented if deemed desirable and
    Trent> feasible.  Other interesting candidates:

    ...
    Trent>     - SyslogHandler: Akin to log4j's SyslogAppender.

I'd implement at least this one, both as a proof of concept and to provide a
standard mapping between the levels used in your logger and those syslog
provides.
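
One possible shape for such a mapping, as a sketch only. It assumes the
level names of the logging module that later shipped in the standard
library (DEBUG through CRITICAL) and the conventional numeric syslog
severities (0 for emerg through 7 for debug); syslog_priority is a
hypothetical helper, not an API from the PEP.

```python
import logging

# Conventional syslog severities (lower number means more severe).
LOG_CRIT, LOG_ERR, LOG_WARNING, LOG_INFO, LOG_DEBUG = 2, 3, 4, 6, 7

# Assumed mapping from logger levels to syslog severities.
LEVEL_TO_SYSLOG = {
    logging.DEBUG: LOG_DEBUG,
    logging.INFO: LOG_INFO,
    logging.WARNING: LOG_WARNING,
    logging.ERROR: LOG_ERR,
    logging.CRITICAL: LOG_CRIT,
}


def syslog_priority(levelno):
    """Map a numeric logger level to a syslog severity.

    Levels that fall between two named levels use the severity of the
    nearest named level below them; anything under DEBUG maps to debug.
    """
    chosen = LOG_DEBUG
    for threshold, priority in sorted(LEVEL_TO_SYSLOG.items()):
        if levelno >= threshold:
            chosen = priority
    return chosen


print(syslog_priority(logging.ERROR), syslog_priority(25))
```

Rounding in-between levels down to the nearest named level is a design
choice made here for illustration; a handler could equally reject them.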

Skip


From skip@pobox.com  Mon Feb 25 16:45:17 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 25 Feb 2002 10:45:17 -0600
Subject: [Python-Dev] Rattlesnake progress
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHDNNAA.tim.one@comcast.net>
References: <20020218145806.A26111@glacier.arctrix.com>
 <LNBBLJKPBEHFEDALKOLCIEHDNNAA.tim.one@comcast.net>
Message-ID: <15482.27165.395216.650499@12-248-41-177.client.attbi.com>

    Tim> Excellent advice that almost nobody follows <0.5 wink>: choose a
    Tim> flexible intermediate representation, then structure all your
    Tim> transformations as independent passes, such that the output of
    Tim> every pass is acceptable as the input to every pass.  

I did this with my peephole optimizer.  It worked great.  Each peephole
optimization is a very simple subclass of an OptimizeFilter base class.  The
IR is essentially the bytecode split into basic blocks, with each basic
block a list of (opcode, argument) tuples.  Jump targets are represented as
simple indexes into the block list.  (In fact, my Rattlesnake converter was
just a "peephole optimizer" named InstructionSetConverter.)  As Tim
mentioned about KISS, this means you sometimes have to run particular
optimizations or groups of optimizations multiple times.
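
The structure described above might be sketched roughly as follows.  Only
the OptimizeFilter name comes from the message; the method names, the two
example passes and run_passes() are assumptions made for illustration, and
opcodes are plain strings here rather than real bytecode.

```python
class OptimizeFilter:
    """Base class: a pass maps a list of basic blocks to a list of blocks."""

    def optimize(self, blocks):
        return [self.optimize_block(block) for block in blocks]

    def optimize_block(self, block):
        return block


class RemoveNops(OptimizeFilter):
    """Drop NOP instructions."""

    def optimize_block(self, block):
        return [(op, arg) for (op, arg) in block if op != 'NOP']


class FoldRotTwo(OptimizeFilter):
    """Drop adjacent ROT_TWO pairs, which cancel each other out."""

    def optimize_block(self, block):
        out = []
        for instr in block:
            if instr[0] == 'ROT_TWO' and out and out[-1][0] == 'ROT_TWO':
                out.pop()
            else:
                out.append(instr)
        return out


def run_passes(blocks, passes):
    """Apply all passes repeatedly until a full round changes nothing."""
    while True:
        before = blocks
        for p in passes:
            blocks = p.optimize(blocks)
        if blocks == before:
            return blocks


blocks = [[('LOAD_FAST', 0), ('LOAD_FAST', 1), ('ROT_TWO', None),
           ('NOP', None), ('ROT_TWO', None), ('BINARY_ADD', None)]]
result = run_passes(blocks, [FoldRotTwo(), RemoveNops()])
```

Here FoldRotTwo finds nothing on the first round because a NOP separates
the two ROT_TWO instructions; only after RemoveNops has run does a second
round fold the pair.  Since every pass takes and returns the same block
list shape, the passes can be run in any order and any number of times.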

I want to get it checked into the sandbox where others can play with it, but
time has shifted its foundation a tad and a couple optimizations don't work
any longer.

Skip


From pedronis@bluewin.ch  Mon Feb 25 17:01:32 2002
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 25 Feb 2002 18:01:32 +0100
Subject: [Python-Dev] Re: Jython bugs or features?
Message-ID: <00cb01c1be1e$159b9a60$6d94fea9@newmexico>

Dinu Gherman <gherman@europemail.com> wrote in message
gherman-035CF3.16342425022002@news.t-online.com...
> Hi,
>
> I'm trying to get a little more familiar with Jython, but after having
> just installed the 2.1 final release on OS X I find the following
> rather surprising differences between Jython and CPython (2.1 and 2.2
> respectively, but that doesn't matter here):
>
>   1. cStringIO.StringIO.reset() is missing in Jython
>   2. list() doesn't perform as expected in Jython

[namely it does not do a copy :( ]

>
> Are these bugs and if so: how many people are actually using Jython,
> then, especially as books start to come out for it from O'Reilly and
> New Riders...?
>

Yes they are bugs. Please report them.
Since the 2.1 release (two months ago) there have been approximately 10000
downloads.

How are such bugs possible?

1. the typical idiom for copying a list is l[:], not list(l) (mildly
   irrelevant, I know)
2. if you check the test suite for CPython 2.1 and Jython there is no test
for list(l) behavior
3. the new reset method is also not tested by the test suite and is not
reported in the NEWS file

(These are the result of some grepping; maybe I'm wrong, but given the bugs
I'm probably right.)
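
A minimal sketch of the kind of test that would have caught the list()
problem.  The function name check_list_copies() is hypothetical; it relies
only on the documented behaviour that list(l) returns a new list.

```python
def check_list_copies(original):
    """Return True if list(original) is an equal but distinct list."""
    copy = list(original)
    if copy is original:
        return False
    if copy != original:
        return False
    # Mutating the copy must not affect the original.
    copy.append('sentinel')
    return len(original) == len(copy) - 1


ok = check_list_copies([1, 2, 3])
```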

We try to follow the big picture and the PEPs, and we check the NEWS file, but
for the rest the test suites are our best hope (or our delusional friend).
If a feature is old or under-used, or the bug does not show through in
ordinary usage, and the feature is also untested, things become tricky.

It is open source: Jython is as conforming as its community and the CPython
community ACTIVELY want it to be.

regards, Samuele Pedroni.





From barry@zope.com  Mon Feb 25 17:41:26 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 12:41:26 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
Message-ID: <15482.30534.640750.875602@anthem.wooz.org>

I'm still woefully behind on my email since returning from vacation,
but I thought I'd rehash a bit on PEP 215, string interpolation, given
some recent hacking and thinking about stuff we talked about at IPC10.

Background: PEP 215 has some interesting ideas, but IMHO is more than
I'm comfortable with.  At IPC10, Guido described his rules for string
interpolation as they would be if his time machine were more powerful.
These follow some discussions we've had during various Zope sprints
about making the rules simpler for non-programmers to understand.
I've also been struggling with how error prone %(var)s substitutions
can be in the thru-the-web Mailman strings where this is supported.
Here's what I've come up with.

Guido's rules for $-substitutions are really simple:

1. $$ substitutes to just a single $

2. $identifier followed by non-identifier characters gets interpolated
   with the value of the 'identifier' key in the substitution
   dictionary.

3. For handling cases where the identifier is followed by identifier
   characters that aren't part of the key, ${identifier} is equivalent
   to $identifier.

And that's it.  For the sake of discussion, forget about where the
dictionary for string interpolation comes from.
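
The three rules can be sketched directly with a single regexp.  This is an
illustration only, not the code from this message; substitute() and
_dollar are hypothetical names.

```python
import re

# Matches $$, $identifier, or ${identifier}.
_dollar = re.compile(r'\$(?:(\$)|([_a-zA-Z]\w*)|\{([_a-zA-Z]\w*)\})')


def substitute(template, mapping):
    """Apply the three $-rules to template, looking names up in mapping."""
    def repl(match):
        escaped, plain, braced = match.groups()
        if escaped:
            return '$'                      # rule 1: $$ -> $
        return str(mapping[plain or braced])  # rules 2 and 3
    return _dollar.sub(repl, template)


text = substitute('$$5 for $who, ${what}s', {'who': 'Joe', 'what': 'apple'})
```

In this sketch a lone $ that is not followed by $, an identifier, or a
braced identifier is left untouched, and a missing key raises KeyError.
Both are choices the rules above do not settle.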

I've hacked together 4 functions which I'm experimentally using to
provide these rules in thru-the-web string editing, and also for
sanity checking the strings as they're submitted.  I think there's a
fairly straightforward conversion between traditional %-strings and
these newfangled $-strings, and so two of the functions do the
conversions back and forth.

The second two functions attempt to return a list of all the
substitution variables found in either a %-string or a $-string.  I
match this against the list of known legal substitution variables, and
bark loudly if there's some mismatch.

The one interesting thing about %-to-$ conversion is that the regexp I
use leaves the trailing `s' in %(var)s as optional, so I can
auto-correct for those that are missing.  I think this was an idea
that Paul Dubois came up with during the lunch discussion.  Seems to
work well, and I can do a %-to-$-to-% roundtrip; if the strings at the
ends are the same then there weren't any missing `s's, otherwise the
conversion auto-corrected and I can issue a warning.

This is all really proto-stuff, but I've done some limited testing and
it seems to work pretty well.  So without changing the language we can
play with $-strings using Guido's rules to see if we like them or not,
by simply converting them to traditional %-strings manually, and then
doing the mod-operator substitutions.

Hopefully I've extracted the right bits of code from my modules for
you to get the idea.  There may be bugs <wink>.

-Barry

-------------------- snip snip --------------------
import re

from string import digits
try:
    # Python 2.2
    from string import ascii_letters
except ImportError:
    # Older Pythons
    _lower = 'abcdefghijklmnopqrstuvwxyz'
    ascii_letters = _lower + _lower.upper()

# Search for %(identifier)s strings, except that the trailing s is optional,
# since that's a common mistake
cre = re.compile(r'%\(([_a-z]\w*?)\)s?', re.IGNORECASE)
# Search for $$, $identifier, or ${identifier}
dre = re.compile(r'(\${2})|\$([_a-z]\w*)|\${([_a-z]\w*)}', re.IGNORECASE)

IDENTCHARS = ascii_letters + digits + '_'
EMPTYSTRING = ''

# Utilities to convert from simplified $identifier substitutions to/from
# standard Python %(identifier)s substitutions.  The "Guido rules" for the
# former are:
#    $$ -> $
#    $identifier -> %(identifier)s
#    ${identifier} -> %(identifier)s

def to_dollar(s):
    """Convert from %-strings to $-strings."""
    s = s.replace('$', '$$')
    parts = cre.split(s)
    for i in range(1, len(parts), 2):
        if parts[i+1] and parts[i+1][0] in IDENTCHARS:
            parts[i] = '${' + parts[i] + '}'
        else:
            parts[i] = '$' + parts[i]
    return EMPTYSTRING.join(parts)


def to_percent(s):
    """Convert from $-strings to %-strings."""
    s = s.replace('%', '%%')
    parts = dre.split(s)
    for i in range(1, len(parts), 4):
        if parts[i] is not None:
            parts[i] = '$'
        elif parts[i+1] is not None:
            parts[i+1] = '%(' + parts[i+1] + ')s'
        else:
            parts[i+2] = '%(' + parts[i+2] + ')s'
    return EMPTYSTRING.join(filter(None, parts))


def dollar_identifiers(s):
    """Return the set (dictionary) of identifiers found in a $-string."""
    d = {}
    for name in filter(None, [b or c or None for a, b, c in dre.findall(s)]):
        d[name] = 1
    return d


def percent_identifiers(s):
    """Return the set (dictionary) of identifiers found in a %-string."""
    d = {}
    for name in cre.findall(s):
        d[name] = 1
    return d

-------------------- snip snip --------------------
Python 2.2 (#1, Dec 24 2001, 15:39:01) 
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dollar
>>> dollar.to_dollar('%(one)s %(two)three %(four)seven')
'$one ${two}three ${four}even'
>>> dollar.to_percent(dollar.to_dollar('%(one)s %(two)three %(four)seven'))
'%(one)s %(two)sthree %(four)seven'
>>> dollar.percent_identifiers('%(one)s %(two)three %(four)seven')
{'four': 1, 'two': 1, 'one': 1}
>>> dollar.dollar_identifiers(dollar.to_dollar('%(one)s %(two)three %(four)seven'))
{'four': 1, 'two': 1, 'one': 1}
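
The roundtrip check and the legal-variable check described in the message
could be combined along these lines.  The block restates the two regexps
and the two conversion functions from above (with the braces escaped) so
that it runs on its own; check() and its return shape are hypothetical.

```python
import re
from string import ascii_letters, digits

IDENTCHARS = ascii_letters + digits + '_'
cre = re.compile(r'%\(([_a-z]\w*?)\)s?', re.IGNORECASE)
dre = re.compile(r'(\$\$)|\$([_a-z]\w*)|\$\{([_a-z]\w*)\}', re.IGNORECASE)


def to_dollar(s):
    """Convert from %-strings to $-strings."""
    parts = cre.split(s.replace('$', '$$'))
    for i in range(1, len(parts), 2):
        if parts[i+1] and parts[i+1][0] in IDENTCHARS:
            parts[i] = '${' + parts[i] + '}'
        else:
            parts[i] = '$' + parts[i]
    return ''.join(parts)


def to_percent(s):
    """Convert from $-strings to %-strings."""
    parts = dre.split(s.replace('%', '%%'))
    for i in range(1, len(parts), 4):
        if parts[i] is not None:
            parts[i] = '$'
        elif parts[i+1] is not None:
            parts[i+1] = '%(' + parts[i+1] + ')s'
        else:
            parts[i+2] = '%(' + parts[i+2] + ')s'
    return ''.join(filter(None, parts))


def check(s, legal):
    """Return (corrected %-string, list of warnings) for a %-string s."""
    warnings = []
    fixed = to_percent(to_dollar(s))
    if fixed != s:
        warnings.append('missing trailing s was auto-corrected')
    unknown = sorted(set(cre.findall(s)) - set(legal))
    if unknown:
        warnings.append('unknown variables: ' + ', '.join(unknown))
    return fixed, warnings


fixed, warnings = check('%(one)s %(two)three', ['one'])
```

One caveat with this sketch: a %-string that already contains a literal %%
does not survive the roundtrip unchanged, because to_percent() doubles
every % it sees, so such strings would trigger a spurious warning.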


From mal@lemburg.com  Mon Feb 25 18:08:48 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 25 Feb 2002 19:08:48 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
Message-ID: <3C7A7DB0.A8A3416E@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> Background: PEP 215 has some interesting ideas, but IMHO is more than
> I'm comfortable with.  At IPC10, Guido described his rules for string
> interpolation as they would be if his time machine were more powerful.
> These follow some discussions we've had during various Zope sprints
> about making the rules simpler for non-programmers to understand.
> I've also been struggling with how error prone %(var)s substitutions
> can be in the thru-the-web Mailman strings where this is supported.
> Here's what I've come up with.
> 
> Guido's rules for $-substitutions are really simple:
> 
> 1. $$ substitutes to just a single $
> 
> 2. $identifier followed by non-identifier characters gets interpolated
>    with the value of the 'identifier' key in the substitution
>    dictionary.
> 
> 3. For handling cases where the identifier is followed by identifier
>    characters that aren't part of the key, ${identifier} is equivalent
>    to $identifier.
> 
> And that's it.  For the sake of discussion, forget about where the
> dictionary for string interpolation comes from.

Wouldn't it be a lot simpler and more in line with what we
already have, if we used '%' as the escape character?

1. %% becomes %

2. %ident maps to %(ident)s as we have it now

3. %{ident} maps to %(ident)s

4. %(ident)s continues to have the same semantics as
   before

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From guido@python.org  Mon Feb 25 18:33:32 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 25 Feb 2002 13:33:32 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: Your message of "Mon, 25 Feb 2002 19:08:48 +0100."
 <3C7A7DB0.A8A3416E@lemburg.com>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
Message-ID: <200202251833.g1PIXW912610@pcp742651pcs.reston01.va.comcast.net>

[Barry]
> > Guido's rules for $-substitutions are really simple:
> > 
> > 1. $$ substitutes to just a single $
> > 
> > 2. $identifier followed by non-identifier characters gets interpolated
> >    with the value of the 'identifier' key in the substitution
> >    dictionary.
> > 
> > 3. For handling cases where the identifier is followed by identifier
> >    characters that aren't part of the key, ${identifier} is equivalent
> >    to $identifier.
> > 
> > And that's it.  For the sake of discussion, forget about where the
> > dictionary for string interpolation comes from.

[MAL]
> Wouldn't it be a lot simpler and more in line with what we
> already have, if we used '%' as the escape character?
> 
> 1. %% becomes %
> 
> 2. %ident maps to %(ident)s as we have it now
> 
> 3. %{ident} maps to %(ident)s
> 
> 4. %(ident)s continues to have the same semantics as
>    before

That's not simpler, it's more complicated.  Any tool dealing with
these will have to understand all the rules.

The point of switching to $ is twofold: (1) it avoids confusion with
the old %-based syntax (which can continue to exist for different
purposes), (2) it is familiar to people who have seen substitution in
other languages.  $ is nearly universal (Perl, Tcl, Ruby, shell, etc.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@zope.com  Mon Feb 25 18:48:32 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 13:48:32 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
Message-ID: <15482.34560.688685.262327@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> 1. %% becomes %

    MAL> 2. %ident maps to %(ident)s as we have it now

    MAL> 3. %{ident} maps to %(ident)s

    MAL> 4. %(ident)s continues to have the same semantics as
    MAL>    before

What happens to %dogfood or %sickpuppy?  If you're trying to maintain
backwards compatibility with existing syntax, you can't use %ident
strings.
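
A short illustration of the clash, using only today's % operator: in both
strings the first letter after the % is already a valid conversion code,
so a bare %ident could not be told apart from existing usage.  The helper
name old_style() is hypothetical.

```python
def old_style(template, value):
    """Format template with the existing %-operator."""
    return template % value


dog = old_style('%dogfood', 3)        # parsed as %d followed by 'ogfood'
sick = old_style('%sickpuppy', 'x')   # parsed as %s followed by 'ickpuppy'
```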

-Barry


From mal@lemburg.com  Mon Feb 25 19:25:59 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 25 Feb 2002 20:25:59 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org>
Message-ID: <3C7A8FC7.9CB321EE@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MAL" == M  <mal@lemburg.com> writes:
> 
>     MAL> 1. %% becomes %
> 
>     MAL> 2. %ident maps to %(ident)s as we have it now
> 
>     MAL> 3. %{ident} maps to %(ident)s
> 
>     MAL> 4. %(ident)s continues to have the same semantics as
>     MAL>    before
> 
> What happens to %dogfood or %sickpuppy?  If you're trying to maintain
> backwards compatibility with existing syntax, you can't use %ident
> strings.

That's what I was trying to achieve. The only gripe I sometimes
have with '%(ident)s' is that users forget the 's' behind 
'%(ident)'; I'd be ok with dropping 2. and only adding 3.

Whatever you do, just please don't mix the old and new 
semantics...

   'Joe has $ %(a)5.2f in his pocket.' % locals()

is perfectly valid now and should continue to be valid.
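
For reference, a small runnable version of that example.  An explicit
dictionary stands in for locals() here so the block is self-contained;
the value of a is made up.

```python
a = 3.14159

# '%(a)5.2f' formats a as a float with two decimals, padded to width 5.
# The '$' has no special meaning to the % operator and is copied as-is.
s = 'Joe has $ %(a)5.2f in his pocket.' % {'a': a}
```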

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From barry@zope.com  Mon Feb 25 19:28:13 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 14:28:13 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
Message-ID: <15482.36941.605165.133988@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> Whatever you do, just please don't mix the old and new 
    MAL> semantics...

    MAL>    'Joe has $ %(a)5.2f in his pocket.' % locals()

    MAL> is perfectly valid now and should continue to be valid.

I agree completely; it ought to be one or the other.  In the code I
emailed, you actually had to do a conversion step from $-strings to
%-strings to use the built-in string-mod operator.  In practice, if
$-strings were to be added to the language, I suspect some new prefix
would have to designate a new type of string object, e.g. $''
strings.  Or perhaps a different binary operator could be used.

-Barry


From mal@lemburg.com  Mon Feb 25 19:44:48 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 25 Feb 2002 20:44:48 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org>
Message-ID: <3C7A9430.1E1077F8@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MAL" == M  <mal@lemburg.com> writes:
> 
>     MAL> Whatever you do, just please don't mix the old and new
>     MAL> semantics...
> 
>     MAL>    'Joe has $ %(a)5.2f in his pocket.' % locals()
> 
>     MAL> is perfectly valid now and should continue to be valid.
> 
> I agree completely; it ought to be one or the other.  In the code I
> emailed, you actually had to do a conversion step from $-strings to
> %-strings to use the built-in string-mod operator.  In practice, if
> $-strings were to be added to the language, I suspect some new prefix
> would have to designate a new type of string object, e.g. $''
> strings.  Or perhaps a different binary operator could be used.

Good.

Since the strings themselves don't really change and to
avoid confusing string modifiers...

	ur$'my $format \$tring'

I'd suggest using a new operator, e.g.

	'Joe has $$ $a in his pocket.' $ locals()

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From barry@zope.com  Mon Feb 25 19:55:29 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 14:55:29 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
Message-ID: <15482.38577.933015.221824@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> 	'Joe has $$ $a in his pocket.' $ locals()

I'd prefer to hijack an existing operator -- one that's unsupported by
the string object.  Perhaps / or - or & or |

?
-Barry


From mal@lemburg.com  Mon Feb 25 20:07:02 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 25 Feb 2002 21:07:02 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org>
Message-ID: <3C7A9966.CE6C7CCB@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MAL" == M  <mal@lemburg.com> writes:
> 
>     MAL>        'Joe has $$ $a in his pocket.' $ locals()
> 
> I'd prefer to hijack an existing operator -- one that's unsupported by
> the string object.  Perhaps / or - or & or |

'/' looks nice and has this "interpret under" sort of meaning:

	'Joe has $$ $a in his pocket.' / locals()

If you are more into algebra, then '*' would probably also appeal 
to the eye:

	'Joe has $$ $a in his pocket.' * locals()

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From fdrake@acm.org  Mon Feb 25 20:08:49 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 25 Feb 2002 15:08:49 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <3C7A9966.CE6C7CCB@lemburg.com>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9966.CE6C7CCB@lemburg.com>
Message-ID: <15482.39377.989664.700535@grendel.zope.com>

M.-A. Lemburg writes:
 > '/' looks nice and has this "interpret under" sort of meaning:
 > 
 > 	'Joe has $$ $a in his pocket.' / locals()

I'd read that more as "mapped over" rather than "interpret under".  ;)

 > If you are more into algebra, then '*' would probably also appeal 
 > to the eye:
 > 
 > 	'Joe has $$ $a in his pocket.' * locals()

But * is already meaningful for strings, so not a good choice.
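
To experiment with the '/' spelling without changing the language, one
could use a str subclass.  This is only a sketch: DollarString is a
hypothetical name, and the regexp is one possible reading of the three
$-rules quoted earlier in the thread.

```python
import re

# Matches $$, $identifier, or ${identifier}.
_dollar = re.compile(r'\$(?:(\$)|([_a-zA-Z]\w*)|\{([_a-zA-Z]\w*)\})')


class DollarString(str):
    """A str subclass where s / mapping performs $-substitution."""

    def __truediv__(self, mapping):
        def repl(match):
            escaped, plain, braced = match.groups()
            return '$' if escaped else str(mapping[plain or braced])
        return _dollar.sub(repl, self)

    __div__ = __truediv__   # the method name used by classic division


money = DollarString('Joe has $$ $a in his pocket.') / {'a': 5}
repeated = 'ab' * 2       # '*' already means repetition for strings
```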


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From barry@zope.com  Mon Feb 25 20:10:59 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 15:10:59 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9966.CE6C7CCB@lemburg.com>
Message-ID: <15482.39507.311309.96141@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> '/' looks nice and has this "interpret under" sort of
    MAL> meaning:

    MAL> 	'Joe has $$ $a in his pocket.' / locals()

I agree, I like that one.

    MAL> If you are more into algebra, then '*' would probably also
    MAL> appeal to the eye:

    MAL> 	'Joe has $$ $a in his pocket.' * locals()

I avoid it because then you'd have to add another type test to
operator-*.

Ping, if you're around and care to comment, perhaps we can try to
update PEP 215 and maybe add a reference implementation?

-Barry


From mal@lemburg.com  Mon Feb 25 20:23:04 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 25 Feb 2002 21:23:04 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9966.CE6C7CCB@lemburg.com> <15482.39507.311309.96141@anthem.wooz.org>
Message-ID: <3C7A9D28.DE993EEC@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MAL" == M  <mal@lemburg.com> writes:
> 
>     MAL> '/' looks nice and has this "interpret under" sort of
>     MAL> meaning:
> 
>     MAL>        'Joe has $$ $a in his pocket.' / locals()
> 
> I agree, I like that one.

Fine with me.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From pedronis@bluewin.ch  Mon Feb 25 20:18:36 2002
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 25 Feb 2002 21:18:36 +0100
Subject: [Python-Dev] Re: Jython bugs or features?
Message-ID: <021a01c1be39$9cfb1b00$6d94fea9@newmexico>

 [Dinu Gherman]
>
> I'm pretty surprised! I knew the Python test suite is *very* far
> from complete, but list() is an *extremely* crucial function,
> isn't it?

Given how things concretely are and not considering
some abstract point of view, I would say YMMV.

I have skimmed over the CPython 2.1 std lib; honestly,
there are not many places relevant to Jython where list() is used for
copying lists and where the bug can show through.
Consider things like:

a += list(b) OR b = ... + list(b)

and all the places where l = list(sq) and sq is not a list;
those work in Jython.

Btw, here Jython simply was never up to speed
(that is, it always implemented the wrong semantics)
and nobody noticed this or, to be precise, ever
reported it.

OTOH we try to be responsive to bug reports.

Yes the test suites are *far* from optimal
and CPython sometimes regresses too.

>
> My impression is that Jython claims to be an implementation of
> Python in Java.

Yes, but your PS (see below) shows that this can mean
a lot of things. Jython is more about writing new programs
and Java integration (which is a big and non-trivial part of its
codebase) than about, for example, completely supporting CPython os
and shell-like programming or allowing effortless porting of
all kinds of CPython programs (clearly the list bug is another kind of issue,
given that the right behaviour is well documented).

Java integration has a higher priority than those things.
Jython is more about embracing the Java platform
(philosophy) than working around it.
For example anything that would require writing
JNI glue and native C code is simply discarded.

There are things for which CPython is simply
better suited than Java/Jython and the other way
around.

They are not fully equivalent substitutes.

> Everybody understands the existence of bugs, but
> if functions are missing I'm not sure there is sufficient quality
> control, let alone a useful test suite.
>
> My bold guess is that
> it should be very easy to check automatically for each module in
> the std.lib. at least if the same classes and methods do exist
> in CPython and JPython.
>
> Honestly, I don't think it makes much sense to maintain two
> code bases without some degree of automatic testing...

JPython was not born with such a test, and Jython has
never grown one so far. I have taken note of this by adding
a feature request for such a test, but it is more a matter of resources
and priorities than of easiness.

Btw after some checking of the CPython CVS

http://groups.google.com/groups?q=g:thl2649249624d&hl=en&selm=mailman.1007425027.30019.python-list%40python.org

(look at the change in the module-level __doc__ )

it seems that reset is vestigial and one should use seek(0)
instead.
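
A small sketch of the seek(0) spelling.  It uses io.StringIO from current
Python as a stand-in for cStringIO.StringIO, which is an assumption made
so that the example runs today; the point is only that seek(0) rewinds
the buffer the way reset() was used to.

```python
import io

buf = io.StringIO('spam and eggs')
first = buf.read(4)     # read the first four characters
buf.seek(0)             # rewind to the start, in place of reset()
again = buf.read(4)     # the same four characters once more
```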

reset is not supported by StringIO or by files
and not documented (not after 1.5 for sure).

You made me think that it was added in 2.1; since it was not,
at least IMO it is a legitimate option for Jython to decide
not to support it.

>
> When seeing such bugs, my immediate reaction (like that of most
> others) is to think that not many people can be using this
> seriously.
>

See e.g.:
http://groups.google.com/groups?q=g:thl2649249624d&hl=en&selm=mailman.1007425027.30019.python-list%40python.org
and thread.

> PS: BTW, how about this one:
>
> [localhost:~] dinu% jython
> Jython 2.1 on java1.3.1 (JIT: null)
> Type "copyright", "credits" or "license" for more information.
> >>> import os
> >>> os.chdir('..')
> Traceback (innermost last):
>   File "<console>", line 1, in ?
>   File ".../Jython-2.1/Lib/javaos.py", line 56, in chdir
> OSError: [Errno 0] chdir not supported in Java: ..

We don't support this one, sorry.





From paul@prescod.net  Mon Feb 25 20:31:55 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 12:31:55 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org>
Message-ID: <3C7A9F3B.B42265DC@prescod.net>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MAL" == M  <mal@lemburg.com> writes:
> 
>     MAL>        'Joe has $$ $a in his pocket.' $ locals()
> 
> I'd prefer to hijack an existing operator -- one that's unsupported by
> the string object.  Perhaps / or - or & or |

Yuck!

String interpolation should be a *compile time* action, not an
operator. One of the goals, in my mind, is to allow people to string
interpolate without knowing what the locals() function does. After all,
that function is otherwise useless for most Python programmers (and
should probably be moved to an introspection module). 

Your strategy requires the naive user to learn a) the $ syntax, b) the
magic operator syntax and c) the meaning of the locals() function. Plus
you've thrown away the idea that interpolation works as it does in the
shell or in Perl/Awk/Ruby etc.

At that point, in my mind, we're back where we started and should just
use %. We'll have reinvented it with a few small tweaks.

Plus, operator-based evaluation has some security implications that
compile time evaluation does not. In particular, if the left-hand thing
can be any string then you have the potential of accidentally allowing
the user to supply a string that allows him/her to introspect your local
variables. That can't happen if the interpolation is done at compile
time.
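
The concern can be illustrated with the existing % operator standing in
for the proposed one.  The function and variable names below are made up
for the example; the point is only that a template supplied from outside
can name any local variable.

```python
def render(template):
    """Interpolate a caller-supplied template against this function's locals."""
    password = 'hunter2'      # something the caller was never meant to see
    greeting = 'hello'
    return template % locals()


intended = render('%(greeting)s, world')
leaked = render('%(password)s')   # an untrusted template reads a local
```

Passing an explicit dictionary of permitted names instead of locals()
would close this particular hole in the sketch.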

 Paul Prescod


From fredrik@pythonware.com  Mon Feb 25 20:44:13 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 25 Feb 2002 21:44:13 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>	<3C7A7DB0.A8A3416E@lemburg.com>	<15482.34560.688685.262327@anthem.wooz.org>	<3C7A8FC7.9CB321EE@lemburg.com>	<15482.36941.605165.133988@anthem.wooz.org>	<3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net>
Message-ID: <07fe01c1be3d$32699fb0$ced241d5@hagrid>

paul wrote:
> Your strategy requires the naive user to learn a) the $ syntax, b) the
> magic operator syntax and c) the meaning of the locals() function. Plus
> you've thrown away the idea that interpolation works as it does in the
> shell or in Perl/Awk/Ruby etc.
> 
> At that point, in my mind, we're back where we started and should just
> use %.

    # interpolate!
    s = I('Joe has $ ', a, ' in his pocket.')

or perhaps

    # print-like interpolation
    s = P('Joe has $', a, 'in his pocket.')

works pretty well too.  in all versions of python, with all existing
syntax-aware tools.  and if written in C, it's probably as fast as
any other solution...

(implementing I/P is left as an exercise etc etc)
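
One possible pure-Python answer to that exercise, as a sketch only.  The
exact semantics of I and P are not spelled out in the message, so plain
concatenation for I and space-joining (as the print statement does) for P
are assumptions.

```python
def I(*parts):
    """Interpolate: concatenate the str() of every argument."""
    return ''.join([str(p) for p in parts])


def P(*parts):
    """Print-like: join the str() of every argument with single spaces."""
    return ' '.join([str(p) for p in parts])


a = 5
s1 = I('Joe has $ ', a, ' in his pocket.')
s2 = P('Joe has $', a, 'in his pocket.')
```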

</F>



From nas@python.ca  Mon Feb 25 20:50:39 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 25 Feb 2002 12:50:39 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <3C7A9F3B.B42265DC@prescod.net>; from paul@prescod.net on Mon, Feb 25, 2002 at 12:31:55PM -0800
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net>
Message-ID: <20020225125039.A22519@glacier.arctrix.com>

Paul Prescod wrote:
> At that point, in my mind, we're back where we started and should just
> use %.

I agree.

  Neil


From fdrake@acm.org  Mon Feb 25 20:55:15 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 25 Feb 2002 15:55:15 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <3C7A9F3B.B42265DC@prescod.net>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
Message-ID: <15482.42163.449374.118807@grendel.zope.com>

Paul Prescod writes:
 > String interpolation should be a *compile time* action, not an
 > operator. One of the goals, in my mind, is to allow people to string

This doesn't work as soon as the string is not a constant.  Many of
the discussions at PythonLabs did not involve text included as part of
an application's source, and the conversion operation would not be
driven by application code but by library/service code.  Even if it
were a constant, needing to add in message catalog support changes
things as well.  So auto-magical interpolation doesn't seem like a
good idea.

 > interpolate without knowing what the locals() function does. After all,
 > that function is otherwise useless for most Python programmers (and
 > should probably be moved to an introspection module). 

You'd still only need to use locals() if that's your source of
variables.

 > Your strategy requires the naive user to learn a) the $ syntax, b) the
 > magic operator syntax and c) the meaning of the locals() function. Plus
 > you've thrown away the idea that interpolation works as it does in the
 > shell or in Perl/Awk/Ruby etc.

a) The $ syntax is easier than the % syntax, and already more familiar
   to most new users.
b) What's a magic operator?  string % mapping is already pretty
   magical as far as the modulus operation is concerned.
c) And you still don't have to use locals() if you don't want to.

And the string syntax matches a common subset of what's used
elsewhere.  We just have the added control over the source of
substitution values (a good thing).

 > At that point, in my mind, we're back where we started and should just
 > use %. We'll have reinvented it with a few small tweaks.

And we've made it a lot easier for strings that are not part of Python
source code, and for people who produce that data but never know
Python.

 > Plus, operator-based evaluation has some security implications that
 > compile time evaluation does not. In particular, if the left-hand thing
 > can be any string then you have the potential of accidentally allowing
 > the user to supply a string that allows him/her to introspect your local
 > variables. That can't happen if the interpolation is done at compile
 > time.

I'm not sure I understand this.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From jepler@unpythonic.dhs.org  Mon Feb 25 21:01:06 2002
From: jepler@unpythonic.dhs.org (Jeff Epler)
Date: Mon, 25 Feb 2002 15:01:06 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <3C7A9F3B.B42265DC@prescod.net>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net>
Message-ID: <20020225150106.B22803@unpythonic.dhs.org>

On Mon, Feb 25, 2002 at 12:31:55PM -0800, Paul Prescod wrote:
> Yuck!
> 
> String interpolation should be a *compile time* action, not an
> operator. One of the goals, in my mind, is to allow people to string
> interpolate without knowing what the locals() function does. After all,
> that function is otherwise useless for most Python programmers (and
> should probably be moved to an introspection module). 
> 
> Your strategy requires the naive user to learn a) the $ syntax, b) the
> magic operator syntax and c) the meaning of the locals() function. Plus
> you've thrown away the idea that interpolation works as it does in the
> shell or in Perl/Awk/Ruby etc.
> 
> At that point, in my mind, we're back where we started and should just
> use %. We'll have reinvented it with a few small tweaks.
> 
> Plus, operator-based evaluation has some security implications that
> compile time evaluation does not. In particular, if the left-hand thing
> can be any string then you have the potential of accidentally allowing
> the user to supply a string that allows him/her to introspect your local
> variables. That can't happen if the interpolation is done at compile
> time.

But how do you internationalize your program once you use $-subs?  The
great strength of %-formats, and the *printf functions that inspired
them, is that the interpretation of the format takes place at runtime.
(printf has added positional specifiers, spelled like "%1$s", to permit
reordering of items in the format, while Python has added
key-specifiers, spelled like "%(id)s", but they're about equally
powerful)

With %-subs, we can write
    def gettext(s):
        """ Return the localized version of s from the message catalog """
        return s

    def print_chance(who, chance):
        print gettext("%(who)s has a %(percent).2f%% chance of surviving") % {
            'who': who,
            'percent': chance * 100}

    print_chance("Jeff", 1./3)

I'm not interested in any proposal that turns code that's easy to
internationalize (just add calls to gettext(), commonly spelled _(),
around each string that needs translating, then fix up the places where
the programmer was too clever) into code that's impossible to
internationalize by design.

Jeff
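
The point about runtime interpretation can be sketched as follows. This is
only an illustration under stated assumptions: CATALOG and lookup() are
hypothetical stand-ins for a real gettext message catalog, and the code is
written for current Python 3 rather than the Python 2 used in the thread.
Because the format string is chosen at runtime, the translated entry is
free to reorder the named fields.

```python
# Hypothetical in-memory catalog; a real program would load this via gettext.
CATALOG = {
    "%(who)s has a %(percent).2f%% chance of surviving":
        "Survival odds of %(percent).2f%% for %(who)s",
}

def lookup(msgid):
    """Return the catalog entry for msgid, or msgid itself if none exists."""
    return CATALOG.get(msgid, msgid)

def format_chance(who, chance):
    """Look up the template at runtime, then fill in the named fields."""
    template = lookup("%(who)s has a %(percent).2f%% chance of surviving")
    return template % {"who": who, "percent": chance * 100}
```

The calling code never changes when a catalog entry moves %(who)s to the
end of the sentence; only the string looked up at runtime differs.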


From akuchlin@mems-exchange.org  Mon Feb 25 21:04:23 2002
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 25 Feb 2002 16:04:23 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.42163.449374.118807@grendel.zope.com>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com>
Message-ID: <20020225210423.GA2398@crystal.mems-exchange.org>

On Mon, Feb 25, 2002 at 03:55:15PM -0500, Fred L. Drake, Jr. wrote:
>And we've made it a lot easier for strings that are not part of Python
>source code, and for people who produce that data but never know
>Python.

But for applications where people don't edit Python, this could just
be a library module and doesn't need a new operator in the Python
code.  I agree with Paul; there's no actual gain in clarity from the
new syntax.

>
> > Plus, operator-based evaluation has some security implications that
> > compile time evaluation does not. In particular, if the left-hand thing
>
>I'm not sure I understand this.

Presumably Paul is thinking of something like:
mlist = load_list('listname')
# Lists have .title, .password, ...
form_value = cgi.form['text'] # User puts $password into text
print form_value \ vars(mlist)

--amk                                                  (www.amk.ca)
The most merciful thing in the world, I think, is the inability of the
human mind to correlate all its contents.
    -- H.P. Lovecraft, "The Call of Cthulhu"



From guido@python.org  Mon Feb 25 21:06:15 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 25 Feb 2002 16:06:15 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: Your message of "Mon, 25 Feb 2002 12:31:55 PST."
 <3C7A9F3B.B42265DC@prescod.net>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
Message-ID: <200202252106.g1PL6FY13393@pcp742651pcs.reston01.va.comcast.net>

> String interpolation should be a *compile time* action, not an
> operator. One of the goals, in my mind, is to allow people to string
> interpolate without knowing what the locals() function does. After all,
> that function is otherwise useless for most Python programmers (and
> should probably be moved to an introspection module). 
> 
> Your strategy requires the naive user to learn a) the $ syntax, b) the
> magic operator syntax and c) the meaning of the locals() function. Plus
> you've thrown away the idea that interpolation works as it does in the
> shell or in Perl/Awk/Ruby etc.
> 
> At that point, in my mind, we're back where we started and should just
> use %. We'll have reinvented it with a few small tweaks.
> 
> Plus, operator-based evaluation has some security implications that
> compile time evaluation does not. In particular, if the left-hand thing
> can be any string then you have the potential of accidentally allowing
> the user to supply a string that allows him/her to introspect your local
> variables. That can't happen if the interpolation is done at compile
> time.

All right, but there *also* needs to be a way to invoke interpolation
explicitly -- just like eval().  This has applicability e.g. in i18n.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nas@python.ca  Mon Feb 25 21:14:36 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 25 Feb 2002 13:14:36 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.42163.449374.118807@grendel.zope.com>; from fdrake@acm.org on Mon, Feb 25, 2002 at 03:55:15PM -0500
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com>
Message-ID: <20020225131436.B22519@glacier.arctrix.com>

Fred L. Drake, Jr. wrote:
> This doesn't work as soon as the string is not a constant.  Many of
> the discussions at PythonLabs did not involve text included as part of
> an application's source, and the conversion operation would not be
> driven by application code but by library/service code.

Write a function or use %.  This is not a good reason to add a string
interpolation operator to the language.  Note that this does not mean
I'm against PEP 215.  PEP 215 proposes to solve a different problem and
should not be hijacked, IMHO.

  Neil


From skip@pobox.com  Mon Feb 25 21:16:19 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 25 Feb 2002 15:16:19 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.36941.605165.133988@anthem.wooz.org>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
Message-ID: <15482.43427.691923.976178@12-248-41-177.client.attbi.com>

    BAW> ... I suspect some new prefix would have to designate a new type of
    BAW> string object, e.g. $'' strings.  Or perhaps a different binary
    BAW> operator could be used.

I'm still not at all fond of the $-string idea, but in the interests of
completeness, perhaps using '$' as a binary operator (by analogy with '%' as
a binary operator having nothing to do with modulo when the left arg is a
string) would be appropriate.

Skip


From nas@python.ca  Mon Feb 25 21:18:57 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 25 Feb 2002 13:18:57 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <20020225150106.B22803@unpythonic.dhs.org>; from jepler@unpythonic.dhs.org on Mon, Feb 25, 2002 at 03:01:06PM -0600
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org>
Message-ID: <20020225131857.A22769@glacier.arctrix.com>

Jeff Epler wrote:
> But how do you internationalize your program once you use $-subs?

So don't use them.  What's the problem?

  Neil


From jepler@unpythonic.dhs.org  Mon Feb 25 21:20:36 2002
From: jepler@unpythonic.dhs.org (Jeff Epler)
Date: Mon, 25 Feb 2002 15:20:36 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.42163.449374.118807@grendel.zope.com>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com>
Message-ID: <20020225152035.C22803@unpythonic.dhs.org>

On Mon, Feb 25, 2002 at 03:55:15PM -0500, Fred L. Drake, Jr. wrote:
> Paul Prescod writes:
>  > Plus, operator-based evaluation has some security implications that
>  > compile time evaluation does not. In particular, if the left-hand thing
>  > can be any string then you have the potential of accidentally allowing
>  > the user to supply a string that allows him/her to introspect your local
>  > variables. That can't happen if the interpolation is done at compile
>  > time.
> 
> I'm not sure I understand this.

Imagine that you have:
    def print_crypted_passwd(name, plaintext, salt="Xx"):
        crypted = crypt.crypt(plaintext, salt)
        print _("""%(name)s, your crypted password is %(crypted)s.""") % locals()

and that some crafty devil translates this as
    msgstr "%(name)s, your plaintext password is %(plaintext)s.  HA HA HA"

i.e., the translator (or other person who can influence the format
string) can access other information in the dict you pass in, even if
you didn't intend it.

Personally, I tend to view this as showing that using % locals() is
unsanitary.  But that means that the problem is in using the locals()
dictionary, a problem made worse by making the use of locals() implicit.

(And under $-substitution, if locals() is implicit, how do I substitute
with a dictionary other than locals()?

    def print_crypted_passwd(accountinfo):
        print "%(name)s, your crypted password is %(crypted)s." \
              % accountinfo.__dict__
vs
    def print_crypted_passwd(accountinfo):
        def really_subst(name, crypted):
            return $"$name, your crypted password is $crypted"
        print really_subst(accountinfo.name, accountinfo.crypted)
or
    def print_crypted_passwd(accountinfo):
        name = accountinfo.name
        crypted = accountinfo.crypted
        print $"$name, your crypted password is $crypted"
???)

Jeff
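
The exposure described above, and the effect of passing an explicit
dictionary instead of locals(), can be sketched like this. It is a minimal
illustration in current Python 3; the function names are hypothetical, and
string reversal stands in for crypt.crypt() purely to keep the example
self-contained (it is not hashing).

```python
def leaky_message(template, name, plaintext):
    """Format with locals(): every local name is reachable from the template."""
    crypted = plaintext[::-1]  # stand-in for crypt.crypt(); NOT real hashing
    return template % locals()

def safer_message(template, name, plaintext):
    """Format with an explicit dict: only the listed keys are reachable."""
    crypted = plaintext[::-1]  # same stand-in as above
    return template % {"name": name, "crypted": crypted}
```

With a hostile template such as
"%(name)s, your plaintext password is %(plaintext)s." the first function
returns the plaintext, while the second raises KeyError because
'plaintext' was never placed in the mapping.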


From jepler@unpythonic.dhs.org  Mon Feb 25 21:26:01 2002
From: jepler@unpythonic.dhs.org (Jeff Epler)
Date: Mon, 25 Feb 2002 15:26:01 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <20020225131857.A22769@glacier.arctrix.com>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com>
Message-ID: <20020225152600.E22803@unpythonic.dhs.org>

On Mon, Feb 25, 2002 at 01:18:57PM -0800, Neil Schemenauer wrote:
> Jeff Epler wrote:
> > But how do you internationalize your program once you use $-subs?
> 
> So don't use them.  What's the problem?

The problem is when I have to internationalize a program some schmuck
wrote using $-subs throughout.

Jeff


From paul@prescod.net  Mon Feb 25 21:37:26 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 13:37:26 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org>
Message-ID: <3C7AAE96.BA4D19FE@prescod.net>

Jeff Epler wrote:
> 
> ...
> 
> Imagine that you have:
>     def print_crypted_passwd(name, plaintext, salt="Xx"):
>         crypted = crypt.crypt(plaintext, salt)
>         print _("""%(name)s, your crypted password is %(crypted)s.""") % locals()
> 
> and that some crafty devil translates this as
>     msgstr "%(name)s, your plaintext password is %(plaintext)s.  HA HA HA"
> 
> i.e., the translator (or other person who can influence the format
> string) can access other information in the dict you pass in, even if
> you didn't intend it.

Right. I don't claim that this is a killer problem. I'm actually much
more concerned about the usability aspects. But if we can improve
security at the same time, then let's.

> Personally, I tend to view this as showing that using % locals() is
> unsanitary.  But that means that the problem is in using the locals()
> dictionary, a problem made worse by making the use of locals() implicit.

If it is done at compile time then the crafty devil couldn't get in the
alternate string!

On the other hand, if you're doing runtime translation stuff then of
course you need to use a runtime function, like "%" or maybe a new
"interpol". I am not against the existence of such a thing. I'm against
it being the default way to do interpolation. It's like "eval": a
compile-time tool that sophisticated users have access to at runtime.

> (And under $-substitution, if locals() is implicit, how do I substitute
> with a dictionary other than locals()?

Well I don't think you should have to, because you could use the
"interpol" function (maybe from the "interpol" module). But anyhow, your
question has a factual answer and you already gave it!

>     def print_crypted_passwd(accountinfo):
>         def really_subst(name, crypted):
>             return $"$name, your crypted password is $crypted"
>         print really_subst(accountinfo.name, accountinfo.crypted)
> or
>     def print_crypted_passwd(accountinfo):
>         name = accountinfo.name
>         crypted = accountinfo.crypted
>         print $"$name, your crypted password is $crypted"

This last one looks very clear and simple to me! What's the problem with
it?

Still, I don't argue against the need for something at runtime -- as a
power tool. Either we could just keep "%" or make a function. 

Okay, so my proposal for $ doesn't do everything that % does. It was
never spec'd to do everything "%" does. For instance it doesn't do float
formatting tricks.

 Paul Prescod


From barry@zope.com  Mon Feb 25 21:44:02 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 16:44:02 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
Message-ID: <15482.45090.3848.616817@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    SM> I'm still not at all fond of the $-string idea, but in the
    SM> interests of completeness, perhaps using '$' as a binary
    SM> operator (by analogy with '%' as a binary operator having
    SM> nothing to do with modulo when the left arg is a string) would
    SM> be appropriate.

I can't say whether it's a good thing to add this to the language or
not.  I tend to think that %(var)s is just fine from a Python
programmer's point of view, and in the interest of TOOWTDI, we don't
need anything else.

From a /non-programmer's/ point of view, %(var)s is way too error
prone, and $-strings are an attempt at implementing a simple to
explain, hard to get wrong, rule for thru-the-web supplied template
strings.  There's been no usability testing yet to know whether
$-strings actually will be easier to use <wink>, but I've got plenty
of anecdotal evidence that %-strings suck badly for usability by
non-Python programmers.

Still, if $-strings are better for non-programmers, maybe they're
better for programmers too.  There's certainly evidence that
translators get them wrong too.
-Barry


From paul@prescod.net  Mon Feb 25 21:46:46 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 13:46:46 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org>
Message-ID: <3C7AB0C6.8C3BF369@prescod.net>

Jeff Epler wrote:
> 
> On Mon, Feb 25, 2002 at 01:18:57PM -0800, Neil Schemenauer wrote:
> > Jeff Epler wrote:
> > > But how do you internationalize your program once you use $-subs?
> >
> > So don't use them.  What's the problem?
> 
> The problem is when I have to internationalize a program some schmuck
> wrote using $-subs throughout.

I think you go through and remove the "$" signs (probably at the same
time you are removing "_") and use a runtime function to do the
translation (probably the same function doing the interpolation). Then
you take on the responsibility yourself for making sure that the
original string is a constant (not a user-supplied variable) and that
the replacement strings come from somewhere secure.

So:

a = $"Hello there $name"

becomes:

a = _("Hello there $name")

I think Barry's gettext already does that or something, doesn't it?

 Paul Prescod
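
A runtime function of the kind discussed here can be sketched with
string.Template, which was added to Python's standard library after this
thread and performs $-substitution from an explicit mapping. The CATALOG
dictionary and the translate() name are assumptions made for illustration;
this is not the gettext wrapper mentioned above.

```python
from string import Template

# Hypothetical message catalog mapping source strings to translations.
CATALOG = {"Hello there $name": "Bonjour $name"}

def translate(msgid, **values):
    """Look up msgid at runtime, then substitute $-placeholders from values."""
    template = Template(CATALOG.get(msgid, msgid))
    # substitute() raises KeyError for a placeholder missing from values,
    # so a translation cannot pull in names the caller did not supply.
    return template.substitute(values)
```

Here the caller decides exactly which names are available, so the string
stays a runtime value while the substitution values remain explicit.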


From barry@zope.com  Mon Feb 25 21:51:00 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 16:51:00 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225152035.C22803@unpythonic.dhs.org>
Message-ID: <15482.45508.332144.622180@anthem.wooz.org>

>>>>> "JE" == Jeff Epler <jepler@unpythonic.dhs.org> writes:

    JE> Imagine that you have:
    >>    def print_crypted_passwd(name, plaintext, salt="Xx"):
    >>        crypted = crypt.crypt(plaintext, salt)
    >>        print _("""%(name)s, your crypted password is %(crypted)s.""") % locals()

    JE> and that some crafty devil translates this as msgstr
    JE> "%(name)s, your plaintext password is %(plaintext)s.  HA HA HA"

    JE> i.e., the translator (or other person who can influence the
    JE> format string) can access other information in the dict you
    JE> pass in, even if you didn't intend it.

That's a very interesting vulnerability you bring up!

In my own implementation, _() uses sys._getframe(1) to gather up the
caller's locals and globals into the interpolation dictionary,
i.e. you don't need to specify it explicitly.  Damn convenient, but
also vulnerable to this exploit.  In that case, I'd be very careful to
make sure that print_crypted_passwd() was written such that the
plaintext wasn't available via a variable in the caller's frame.

    JE> Personally, I tend to view this as showing that using %
    JE> locals() is unsanitary.

Nope, but you have to watch out not to mix cooked and raw food on the
same plate (to stretch an unsavory analogy).
    
    JE> But that means that the problem is in using the locals()
    JE> dictionary, a problem made worse by making the use of locals()
    JE> implicit.

    JE> (And under $-substitution, if locals() is implicit, how do I
    JE> substitute with a dictionary other than locals()?

def print_crypted_passwd(name, crypted):
    print $"$name, your crypted password is $crypted"

print_crypted_passwd(yername, crypt.crypt(plaintext, salt))

-Barry
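
The frame-based approach described in this message can be sketched as
below. This is a guess at the general shape, not the actual Mailman code:
interpolate() and the two caller functions are hypothetical, the code is
current Python 3, and sys._getframe() is a CPython implementation detail.

```python
import sys

def interpolate(template):
    """Fill %(key)s fields from the caller's globals and locals."""
    caller = sys._getframe(1)          # frame of whoever called interpolate()
    namespace = dict(caller.f_globals)
    namespace.update(caller.f_locals)  # locals shadow globals
    return template % namespace

def risky(template, name, plaintext):
    """The plaintext is a local here, so any template can reach it."""
    crypted = plaintext[::-1]  # stand-in for crypt.crypt(); NOT real hashing
    return interpolate(template)

def careful(template, name, crypted):
    """Only the already-crypted value ever enters this frame."""
    return interpolate(template)
```

Calling careful() with a template that names %(plaintext)s fails with
KeyError, which is the arrangement suggested above: keep the sensitive
value out of the frame that does the interpolation.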


From barry@zope.com  Mon Feb 25 21:53:12 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 16:53:12 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <20020225150106.B22803@unpythonic.dhs.org>
Message-ID: <15482.45640.614214.271184@anthem.wooz.org>

>>>>> "JE" == Jeff Epler <jepler@unpythonic.dhs.org> writes:

    JE> I'm not interested in any proposal that turns code that's easy
    JE> to internationalize (just add calls to gettext(), commonly
    JE> spelled _(), around each string that needs translating, then
    JE> fix up the places where the programmer was too clever) into
    JE> code that's impossible to internationalize by design.

I'm with you there, Jeff.

-Barry


From barry@zope.com  Mon Feb 25 21:55:48 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 16:55:48 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225152035.C22803@unpythonic.dhs.org>
 <3C7AAE96.BA4D19FE@prescod.net>
Message-ID: <15482.45796.817626.14965@anthem.wooz.org>

>>>>> "PP" == Paul Prescod <paul@prescod.net> writes:

    PP> Okay, so my proposal for $ doesn't do everything that %
    PP> does. It was never spec'd to do everything "%" does. For
    PP> instance it doesn't do float formatting tricks.

Does anybody ever even use something other than `s' for %() strings?

>>> '%(float)f' % {'float': 3.9}
'3.900000'

I never have.
-Barry


From barry@zope.com  Mon Feb 25 21:57:27 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 16:57:27 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <20020225150106.B22803@unpythonic.dhs.org>
 <20020225131857.A22769@glacier.arctrix.com>
 <20020225152600.E22803@unpythonic.dhs.org>
 <3C7AB0C6.8C3BF369@prescod.net>
Message-ID: <15482.45895.373283.698600@anthem.wooz.org>

>>>>> "PP" == Paul Prescod <paul@prescod.net> writes:

    PP> I think Barry's gettext already does that or something,
    PP> doesn't it?

Yes, I have a function that does that.

-Barry


From paul@prescod.net  Mon Feb 25 21:56:14 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 13:56:14 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225152035.C22803@unpythonic.dhs.org>
 <3C7AAE96.BA4D19FE@prescod.net> <15482.45796.817626.14965@anthem.wooz.org>
Message-ID: <3C7AB2FE.D1FFE397@prescod.net>

"Barry A. Warsaw" wrote:
> 
>...
> 
> Does anybody ever even use something other than `s' for %() strings?
> 
> >>> '%(float)f' % {'float': 3.9}
> '3.900000'

Presumably numerical analysts do....and David Ascher once told me he
uses %d as a sanity type-check. I don't bother.

 Paul Prescod


From fredrik@pythonware.com  Mon Feb 25 21:59:14 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 25 Feb 2002 22:59:14 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org><3C7A7DB0.A8A3416E@lemburg.com><15482.34560.688685.262327@anthem.wooz.org><3C7A8FC7.9CB321EE@lemburg.com><15482.36941.605165.133988@anthem.wooz.org><15482.43427.691923.976178@12-248-41-177.client.attbi.com> <15482.45090.3848.616817@anthem.wooz.org>
Message-ID: <0a1801c1be47$b9b4bdb0$ced241d5@hagrid>

barry wrote:
> From a /non-programmer's/ point of view, %(var)s is way too error
> prone, and $-strings are an attempt at implementing a simple to
> explain, hard to get wrong, rule for thru-the-web supplied template
> strings.

how about making that "s" optional?

    1. %% substitutes to just a single %

    2. %(identifier) followed by non-identifier characters gets
    interpolated with the value of the 'identifier' key in the
    substitution dictionary.

    3. For handling cases where the identifier is followed by
    identifier characters that aren't part of the key, %(identifier)s
    is equivalent to %(identifier)

</F>
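
The three rules above can be approximated as a preprocessing step that
rewrites a template into an ordinary %-format string. This is a rough
sketch in current Python 3; add_missing_s() and its regular expression are
assumptions for illustration, not an existing API. Note one consequence of
rule 2 taken literally: a field such as %(percent).2f would be read as
%(percent) followed by the text ".2f".

```python
import re

# Match either a literal "%%" (left alone) or "%(identifier)" that is not
# followed by another identifier character (rule 2).
_FIELD = re.compile(r"%%|%\(([A-Za-z_][A-Za-z0-9_]*)\)(?![A-Za-z0-9_])")

def add_missing_s(template):
    """Append the 's' conversion to bare %(identifier) fields."""
    def fix(match):
        if match.group(1) is None:   # matched "%%": keep it unchanged
            return match.group(0)
        return match.group(0) + "s"  # bare field: make it %(identifier)s
    return _FIELD.sub(fix, template)
```

The result can then be passed to the normal % operator, so
add_missing_s("Hi %(name), 100%% sure") % {"name": "Barry"} goes through
the existing formatting machinery unchanged.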



From skip@pobox.com  Mon Feb 25 22:01:53 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 25 Feb 2002 16:01:53 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.45090.3848.616817@anthem.wooz.org>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
Message-ID: <15482.46161.285288.743587@12-248-41-177.client.attbi.com>

    BAW> There's been no usability testing yet to know whether $-strings
    BAW> actually will be easier to use <wink>, but I've got plenty of
    BAW> anecdotal evidence that %-strings suck badly for usability by
    BAW> non-Python programmers.

I presume your anecdotal evidence comes from Mailman.  If you have a pair of
functions that implement the %-to-$-to-% transformation and can catch the
missing 's' problem automatically (is that the biggest problem non-
programmers have?), then why not just use this in Mailman and be done with
the problem?  In fact, why not just document Mailman so that "%(var)" is the
correct form and silently add the "missing" 's' in your transformation step?

That %-strings suck for Mailman administrators does not mean they
necessarily suck for programmers.  The two populations obviously overlap
somewhat, but not tremendously.  I have never had a problem with
%-strings, certainly not with omitting the trailing 's'.  Past experience
with printf() doesn't obviously pollute the sample population too much
either, since the %(var)s type of format is not supported by printf().

    BAW> Still, if $-strings are better for non-programmers, maybe they're
    BAW> better for programmers too.  There's certainly evidence that
    BAW> translators get them wrong too.

What do you mean by "translators"?

Skip


From fdrake@acm.org  Mon Feb 25 21:59:51 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 25 Feb 2002 16:59:51 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <20020225210423.GA2398@crystal.mems-exchange.org>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225210423.GA2398@crystal.mems-exchange.org>
Message-ID: <15482.46039.769808.389448@grendel.zope.com>

Andrew Kuchling writes:
 > But for applications where people don't edit Python, this could just
 > be a library module and doesn't need a new operator in the Python
 > code.  I agree with Paul; there's no actual gain in clarity from the
 > new syntax.

I'm happy with that as well.

 > Presumably Paul is thinking of something like:
 > mlist = load_list('listname')
 > # Lists have .title, .password, ...
 > form_value = cgi.form['text'] # User puts $password into text
 > print form_value \ vars(mlist)

Yes, but I'm not convinced this has any more security implications
than using a library function to perform the transformation.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From barry@zope.com  Mon Feb 25 22:01:25 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 17:01:25 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225152035.C22803@unpythonic.dhs.org>
 <3C7AAE96.BA4D19FE@prescod.net>
 <15482.45796.817626.14965@anthem.wooz.org>
 <3C7AB2FE.D1FFE397@prescod.net>
Message-ID: <15482.46133.509028.780744@anthem.wooz.org>

>>>>> "PP" == Paul Prescod <paul@prescod.net> writes:

    PP> "Barry A. Warsaw" wrote:
    >>
    >> ...  Does anybody ever even use something other than `s' for
    >> %() strings?
    >>
    >> >>> '%(float)f' % {'float': 3.9}
    >> '3.900000'

    PP> Presumably numerical analysts do....and David Ascher once told
    PP> me he uses %d as a sanity type-check. I don't bother.

%d I sometimes use, but I don't think I've ever (purposely) used
%(var)d.

-Barry


From guido@python.org  Mon Feb 25 22:04:04 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 25 Feb 2002 17:04:04 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: Your message of "Mon, 25 Feb 2002 16:44:02 EST."
 <15482.45090.3848.616817@anthem.wooz.org>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
Message-ID: <200202252204.g1PM44I13781@pcp742651pcs.reston01.va.comcast.net>

There are two entirely different potential uses for interpolation.
One is for the Python programmer; call this literal interpolation.
It's cute to be able to write

    a = 12
    b = 15
    c = a*b
    print $"A rectangle of $a x $b has an area of $c."

This is arguably better than

    print "A rectangle of", a, "x", b, "has an area of", c, "."

(and to get rid of the space between the value of c and the '.' a
totally different paradigm would have to be used).
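
For comparison, a small sketch (Python 3 syntax, not from the original
message) of the stray space and of one existing spelling that avoids it,
the % operator. The names a, b and c are those of the example above.

```python
a = 12
b = 15
c = a * b

# Comma-separated print arguments are joined with single spaces,
# so a space ends up between the value of c and the period.
spaced = " ".join(["A rectangle of", str(a), "x", str(b),
                   "has an area of", str(c), "."])

# The % operator builds the whole string first, so the period can
# directly follow the number.
formatted = "A rectangle of %d x %d has an area of %d." % (a, b, c)

print(spaced)
print(formatted)
```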

A totally *different* use of interpolation is for templates, where
both the template (any data containing the appropriate $ syntax) and
the set of variables to be substituted (any mapping) should be under
full control of the program.  This is what Mailman needs.

Literal interpolation has no security issues, if done properly.  In
the latter use, the security issues can be taken care of by carefully
deciding what data is available in the set of variables to be
interpolated.  The interpolation syntax I've proposed is intentionally
very simple, so that this is relatively easy.  I recall seeing slides
at the conference of a templating system (maybe Twisted's?) that
allowed expressions like $foo.bar[key] which would be much harder to
secure.
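
A rough sketch of template interpolation along these lines, written in
Python 3 and not taken from any proposal text. The substitute() function
and the exposed mapping are hypothetical; the sketch assumes that only
plain $name placeholders are recognised and does not handle an escaped
dollar sign.

```python
import re

# Only plain identifiers are recognised: no attribute access,
# indexing or calls.
_PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")

def substitute(template, mapping):
    """Replace each $name in template with str(mapping[name]).

    Raises KeyError for a name the program did not expose.
    """
    def replace(match):
        return str(mapping[match.group(1)])
    return _PLACEHOLDER.sub(replace, template)

# The program decides exactly which values a template author may reach.
exposed = {"listname": "python-dev", "owner": "barry"}
print(substitute("Welcome to $listname, run by $owner.", exposed))
```

Because a placeholder is nothing more than a key lookup, securing the
template reduces to choosing the keys of the mapping.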

I18n of templates is easy -- just look up the template string in the
translation database.

I18n of apps using literal interpolation is more of a can of worms,
and I have no clear solution.  I agree that a solution is needed --
otherwise literal interpolation would be *worse* than what we have now!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Feb 25 22:05:59 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 25 Feb 2002 17:05:59 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: Your message of "Mon, 25 Feb 2002 13:56:14 PST."
 <3C7AB2FE.D1FFE397@prescod.net>
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <3C7AAE96.BA4D19FE@prescod.net> <15482.45796.817626.14965@anthem.wooz.org>
 <3C7AB2FE.D1FFE397@prescod.net>
Message-ID: <200202252205.g1PM5xD13825@pcp742651pcs.reston01.va.comcast.net>

> > Does anybody ever even use something other than `s' for %() strings?
> > 
> > >>> '%(float)f' % {'float': 3.9}
> > '3.900000'

I never use this in combination with named variables, but I often
write timing programs that format times using "%6.3f" to get
millisecond precision.
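
A short illustration of that format (Python 3, not from the original
message); the summed range is only a stand-in for the code being timed.

```python
import time

start = time.perf_counter()
sum(range(100000))          # stand-in for the code being timed
elapsed = time.perf_counter() - start

# Width 6 with three digits after the decimal point: seconds shown
# to the millisecond.
print("%6.3f seconds" % elapsed)
```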

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@zope.com  Mon Feb 25 22:16:18 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 17:16:18 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
 <15482.46161.285288.743587@12-248-41-177.client.attbi.com>
Message-ID: <15482.47026.213046.548051@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    BAW> There's been no usability testing yet to know whether
    BAW> $-strings actually will be easier to use <wink>, but I've got
    BAW> plenty of anecdotal evidence that %-strings suck badly for
    BAW> usability by non-Python programmers.

    SM> I presume your anecdotal evidence comes from Mailman.

Correct.
    
    SM> If you have a pair of functions that implement the %-to-$-to-%
    SM> transformation and can catch the missing 's' problem
    SM> automatically (is that the biggest problem non-programmers
    SM> have?),

The biggest, yes, but not necessarily the only one.
    
    SM> then why not just use this in Mailman and be done with the
    SM> problem?

That's what I plan on doing for MM2.1, except I won't force it down
people's throats yet.  It'll be optional (but it'll be an either-or
option).  I won't use it in Python code yet though (too disruptive),
just the thru-the-web template defining text-boxes.
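
One way the %-to-$-to-% transformation discussed above might look, as a
sketch in Python 3. This is not Mailman's code; the three function names
are made up for illustration, and the sketch assumes placeholders are
plain identifiers and that a bare %(name) is never immediately followed
by a letter.

```python
import re

_NAME = r"[A-Za-z_][A-Za-z0-9_]*"

def add_missing_s(text):
    """Turn a bare %(name) into %(name)s when no letter follows it."""
    return re.sub(r"(%\(" + _NAME + r"\))(?![A-Za-z])", r"\1s", text)

def dollar_to_percent(text):
    """Rewrite $name placeholders as %(name)s, escaping literal % signs."""
    text = text.replace("%", "%%")
    return re.sub(r"\$(" + _NAME + r")", r"%(\1)s", text)

def percent_to_dollar(text):
    """Rewrite %(name)s placeholders as $name, tolerating a missing 's'."""
    text = add_missing_s(text)
    text = re.sub(r"%\((" + _NAME + r")\)s(?![A-Za-z0-9_])", r"$\1", text)
    return text.replace("%%", "%")

print(percent_to_dollar("No such list: %(listname)"))
print(dollar_to_percent("No such list: $listname") % {"listname": "test"})
```

Other conversions such as %(count)d are left untouched by all three
functions, so they would still need separate handling.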
    
    SM> In fact, why not just document Mailman so that "%(var)" is the
    SM> correct form and silently add the "missing" 's' in your
    SM> transformation step?

    SM> That %-strings suck for Mailman administrators does not mean
    SM> they necessarily suck for programmers.

True, but who knows?  I wouldn't necessarily classify python-dev as a
representative sample of users.
    
    SM> The two populations obviously overlap somewhat, but not
    SM> tremendously.  I have never had a problem with %-strings,
    SM> certainly not with omitting the trailing 's'.  Past experience
    SM> with printf() doesn't obviously pollute the sample population
    SM> too much either, since the %(var)s type of format is not
    SM> supported by printf().

    BAW> Still, if $-strings are better for non-programmers, maybe
    BAW> they're better for programmers too.  There's certainly
    BAW> evidence that translators get them wrong too.

    SM> What do you mean by "translators"?

Someone who is fluent in a natural language other than English, and
translates a catalog of English source strings to a target non-English
natural language.  E.g.

    "No such list: %(listname)s" -> "Non esiste la lista: %(listname)s"

-Barry


From fdrake@acm.org  Mon Feb 25 22:16:48 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 25 Feb 2002 17:16:48 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.45090.3848.616817@anthem.wooz.org>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
Message-ID: <15482.47056.544681.741629@grendel.zope.com>

Barry A. Warsaw writes:
 > I can't say whether it's a good thing to add this to the language or
 > not.  I tend to think that %(var)s is just fine from a Python
 > programmer's point of view, and in the interest of TOOWTDI, we don't

We're definitely seeing a lot of reasonable concern over adding
another formatting operator, and my own interest in the proposal has
nothing to do with having an operator to do this.  I probably
shouldn't have said anything about the topic (I don't recall even
noting a preference, myself, just that I'd read one alternative
differently than Marc-Andre and that another already had a meaning).

 > From a /non-programmer's/ point of view, %(var)s is way too error
 > prone, and $-strings are an attempt at implementing a simple to
 > explain, hard to get wrong, rule for thru-the-web supplied template

How the string was obtained is irrelevant, only that it is not part of
the source code and the author may not be a programmer.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From barry@zope.com  Mon Feb 25 22:19:41 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 17:19:41 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
 <200202252204.g1PM44I13781@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15482.47229.674597.577768@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> There are two entirely different potential uses for
    GvR> interpolation.

Ah yes Guido, thanks for the clarity!

-Barry


From barry@zope.com  Mon Feb 25 22:23:02 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 17:23:02 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
 <15482.47056.544681.741629@grendel.zope.com>
Message-ID: <15482.47430.913924.157520@anthem.wooz.org>

>>>>> "Fred" == Fred L Drake, Jr <fdrake@acm.org> writes:

    >> From a /non-programmer's/ point of view, %(var)s is way too
    >> error prone, and $-strings are an attempt at implementing a
    >> simple to explain, hard to get wrong, rule for thru-the-web
    >> supplied template

    Fred> How the string was obtained is irrelevant, only that it is
    Fred> not part of the source code and the author may not be a
    Fred> programmer.

Correct.
-Barry


From martin@v.loewis.de  Mon Feb 25 22:27:49 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Feb 2002 23:27:49 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <3C7AB0C6.8C3BF369@prescod.net>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <20020225150106.B22803@unpythonic.dhs.org>
 <20020225131857.A22769@glacier.arctrix.com>
 <20020225152600.E22803@unpythonic.dhs.org>
 <3C7AB0C6.8C3BF369@prescod.net>
Message-ID: <m3wux1urh6.fsf@mira.informatik.hu-berlin.de>

Paul Prescod <paul@prescod.net> writes:

> I think you go through and remove the "$" signs (probably at the same
> time you are removing "_") and use a runtime function to do the
> translation (probably the same function doing the interpolation).

I could not accept any solution that cannot offer anything but this.
This kind of interpolation is plain broken.

Regards,
Martin


From martin@v.loewis.de  Mon Feb 25 22:25:48 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Feb 2002 23:25:48 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.45508.332144.622180@anthem.wooz.org>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225152035.C22803@unpythonic.dhs.org>
 <15482.45508.332144.622180@anthem.wooz.org>
Message-ID: <m31yf9w64z.fsf@mira.informatik.hu-berlin.de>

barry@zope.com (Barry A. Warsaw) writes:

>     JE> i.e., the translator (or other person who can influence the
>     JE> format string) can access other information in the dict you
>     JE> pass in, even if you didn't intend it.
> 
> That's a very interesting vulnerability you bring up!

That's not a vulnerability. It assumes that the translator is an
attacker, or that the attacker can change the catalogs. If he is or
can, you could not trust them, anyway, as they could cause arbitrary
other failures, as well.

Regards,
Martin


From jepler@unpythonic.dhs.org  Mon Feb 25 22:34:49 2002
From: jepler@unpythonic.dhs.org (Jeff Epler)
Date: Mon, 25 Feb 2002 16:34:49 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <m3wux1urh6.fsf@mira.informatik.hu-berlin.de>
References: <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> <m3wux1urh6.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020225163447.G22803@unpythonic.dhs.org>

On Mon, Feb 25, 2002 at 11:27:49PM +0100, Martin v. Loewis wrote:
> Paul Prescod <paul@prescod.net> writes:
> 
> > I think you go through and remove the "$" signs (probably at the same
> > time you are removing "_") and use a runtime function to do the
> > translation (probably the same function doing the interpolation).
> 
> I could not accept any solution that cannot offer anything but this.
> This kind of interpolation is plain broken.

Exactly.  Why spend all this time and effort complicating the Python
parser and compiler, only to find that all real-world programs just
instead implement the feature inside a function call?

Jeff


From jepler@unpythonic.dhs.org  Mon Feb 25 22:45:33 2002
From: jepler@unpythonic.dhs.org (Jeff Epler)
Date: Mon, 25 Feb 2002 16:45:33 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <m31yf9w64z.fsf@mira.informatik.hu-berlin.de>
References: <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <15482.42163.449374.118807@grendel.zope.com> <20020225152035.C22803@unpythonic.dhs.org> <15482.45508.332144.622180@anthem.wooz.org> <m31yf9w64z.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020225164532.H22803@unpythonic.dhs.org>

On Mon, Feb 25, 2002 at 11:25:48PM +0100, Martin v. Loewis wrote:
> That's not a vulnerability. It assumes that the translator is an
> attacker, or that the attacker can change the catalogs. If he is or
> can, you could not trust them, anyway, as they could cause arbitrary
> other failures, as well.

It means that you must audit not only your source code, but also your
message catalogs, to determine whether information that is supposed to
remain internal to a program is not formatted into a string.  Of course,
it is fairly easy to do this audit by showing that the translated string
doesn't contain substitution on any identifiers that the original string
did not.
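
A sketch of that audit in Python 3, assuming %(name) style placeholders.
The helper names are hypothetical, and the sample strings reuse the
translation example from earlier in the thread plus an invented
malicious variant.

```python
import re

_KEY = re.compile(r"%\(([^)]*)\)")

def placeholder_names(text):
    """Return the set of names used in %(name)... placeholders."""
    # Drop escaped percent signs first so "%%(x)" is not counted.
    return set(_KEY.findall(text.replace("%%", "")))

def unexpected_names(original, translated):
    """Names the translation substitutes that the original string did not."""
    return placeholder_names(translated) - placeholder_names(original)

original = "No such list: %(listname)s"
honest = "Non esiste la lista: %(listname)s"
hostile = "Non esiste la lista: %(listname)s %(password)s"

print(unexpected_names(original, honest))
print(unexpected_names(original, hostile))
```

An empty result for every catalog entry shows that no translation
reaches data the source string did not already use.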

I don't think it's impossible that someone supplying catalogs could be
an "attacker", even if a plausible scenario doesn't come directly to
mind.

Jeff


From barry@zope.com  Mon Feb 25 23:04:46 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 18:04:46 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225152035.C22803@unpythonic.dhs.org>
 <15482.45508.332144.622180@anthem.wooz.org>
 <m31yf9w64z.fsf@mira.informatik.hu-berlin.de>
 <20020225164532.H22803@unpythonic.dhs.org>
Message-ID: <15482.49934.923783.45301@anthem.wooz.org>

>>>>> "JE" == Jeff Epler <jepler@unpythonic.dhs.org> writes:

    JE> On Mon, Feb 25, 2002 at 11:25:48PM +0100, Martin v. Loewis
    JE> wrote:
    >> That's not a vulnerability. It assumes that the translator is
    >> an attacker, or that the attacker can change the catalogs. If
    >> he is or can, you could not trust them, anyway, as they could
    >> cause arbitrary other failures, as well.

    JE> It means that you must audit not only your source code, but
    JE> also your message catalogs, to determine whether information
    JE> that is supposed to remain internal to a program is not
    JE> formatted into a string.  Of course, it is fairly easy to do
    JE> this audit by showing that the translated string doesn't
    JE> contain substitution on any identifiers that the original
    JE> string did not.

>From what I've been told, newer versions (possibly not yet released)
of the GNU gettext tools, will do exactly that, and understand Python
syntax too (hmm, an argument for keeping the current crop of %-string
rules?).

Alternatively, or in conjunction, you should be auditing your
translation sites to make sure that maliciously translated strings
can't access sensitive information.

-Barry


From paul@prescod.net  Mon Feb 25 23:04:33 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 15:04:33 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org> <3C7A9F3B.B42265DC@prescod.net> <20020225150106.B22803@unpythonic.dhs.org> <20020225131857.A22769@glacier.arctrix.com> <20020225152600.E22803@unpythonic.dhs.org> <3C7AB0C6.8C3BF369@prescod.net> <m3wux1urh6.fsf@mira.informatik.hu-berlin.de> <20020225163447.G22803@unpythonic.dhs.org>
Message-ID: <3C7AC301.20660EBA@prescod.net>

Jeff Epler wrote:
> 
>...
> 
> Exactly.  Why spend all this time and effort complicating the Python
> parser and compiler, only to find that all real-world programs just
> instead implement the feature inside a function call?

Nobody said to reimplement it. I've said on several occasions that there
should be a runtime version.

 Paul Prescod


From paul@prescod.net  Mon Feb 25 23:05:11 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 15:05:11 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <20020225150106.B22803@unpythonic.dhs.org>
 <20020225131857.A22769@glacier.arctrix.com>
 <20020225152600.E22803@unpythonic.dhs.org>
 <3C7AB0C6.8C3BF369@prescod.net> <m3wux1urh6.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C7AC327.4164C382@prescod.net>

"Martin v. Loewis" wrote:
> 
> Paul Prescod <paul@prescod.net> writes:
> 
> > I think you go through and remove the "$" signs (probably at the same
> > time you are removing "_") and use a runtime function to do the
> > translation (probably the same function doing the interpolation).
> 
> I could not accept any solution that cannot offer anything but this.
> This kind of interpolation is plain broken.

How so? I need more info to go on.

 Paul Prescod


From skip@pobox.com  Mon Feb 25 23:16:24 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 25 Feb 2002 17:16:24 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.47026.213046.548051@anthem.wooz.org>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
 <15482.46161.285288.743587@12-248-41-177.client.attbi.com>
 <15482.47026.213046.548051@anthem.wooz.org>
Message-ID: <15482.50632.874371.694865@12-248-41-177.client.attbi.com>

    SM> What do you mean by "translators"?

    BAW> Someone who is fluent in a natural language other than English, and
    BAW> translates a catalog of English source strings to a target
    BAW> non-English natural language.  E.g.

    BAW>  "No such list: %(listname)s" -> "Non esiste la lista: %(listname)s"

So translators aren't programmers either.  Just tell them anything between
%(...) and the first alphabetic character after that is off-limits.  Again,
it doesn't look to me like a programmer problem.

Just to play the devil's advocate (and ignoring the bit about $-strings not
being i18n-friendly), I suspect non-programming translators would have just
as much trouble with something like

    $"Please confirm your choice of color ($color)..."

"$color" will look like a word to be translated.  You would have to tell
them "don't translate anything immediately following a dollar sign up to,
but not including the next character that can't be part of a Python
identifier."  Seems either a bit error-prone or confusing to me if I pretend
I'm not a programmer.

Skip


From paul@prescod.net  Mon Feb 25 23:12:31 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 15:12:31 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225210423.GA2398@crystal.mems-exchange.org> <15482.46039.769808.389448@grendel.zope.com>
Message-ID: <3C7AC4DF.61A7D7AB@prescod.net>

"Fred L. Drake, Jr." wrote:
> 
>...
> 
> Yes, but I'm not convinced this has any more security implications
> than using a library function to perform the
> transformation.

The point is that the simplest mechanism, that we teach to newbies, has
non-obvious security "concerns". If we have literal interpolation, then
a library function would be used by people who WANT to do it at runtime
because they have a REASON for doing it at runtime and thus have a
pretty clear concept of the distinction between runtime and compile
time.

But as I've said, the major reason for this is not security. I don't
know that a Python program has been hacked through "%" so it doesn't
make sense to lose sleep over it. The major reason for doing it at
compile time (for me) is that you can have a nice syntax that doesn't
involve modulus-ing (or dividing) an otherwise useless vars() or locals()
dictionary.

 Paul Prescod


From barry@zope.com  Mon Feb 25 23:19:24 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 18:19:24 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
 <15482.46161.285288.743587@12-248-41-177.client.attbi.com>
 <15482.47026.213046.548051@anthem.wooz.org>
 <15482.50632.874371.694865@12-248-41-177.client.attbi.com>
Message-ID: <15482.50812.53834.442570@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    SM> So translators aren't programmers either.

Well, they may not be /Python/ programmers. ;)
    
    SM> Just tell them anything between %(...) and the first
    SM> alphabetic character after that is off-limits.  Again, it
    SM> doesn't look to me like a programmer problem.

    SM> Just to play the devil's advocate (and ignoring the bit about
    SM> $-strings not being i18n-friendly), I suspect non-programming
    SM> translators would have just as much trouble with something
    SM> like

    SM>     $"Please confirm your choice of color ($color)..."

    SM> "$color" will look like a word to be translated.  You would
    SM> have to tell them "don't translate anything immediately
    SM> following a dollar sign up to, but not including the next
    SM> character that can't be part of a Python identifier."  Seems
    SM> either a bit error-prone or confusing to me if I pretend I'm
    SM> not a programmer.

To be clear, I think the ideal interface would be a graphical one,
with drag-n-drop icons for the textual placeholders.  This would allow
them to re-arrange the order of the placeholder, and it would be
obvious what is variable in your templates, but it wouldn't allow them
to change, remove, or add placeholders.

Then it wouldn't matter what syntax you actually used.

I'm holding my breath... ready... go! <wink>
-Barry


From paul@prescod.net  Mon Feb 25 23:19:06 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 15:19:06 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org> <200202252204.g1PM44I13781@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7AC66A.5B7EF75C@prescod.net>

Guido van Rossum wrote:
> 
> There are two entirely different potential uses for interpolation.
> One is for the Python programmer; call this literal interpolation.

True!

>...
> A totally *different* use of interpolation is for templates, where
> both the template (any data containing the appropriate $ syntax) and
> the set of variables to be substituted (any mapping) should be under
> full control of the program.  This is what Mailman needs.

True!

But we've already got a solution for this. Is there something wrong with
it? I guess I don't know what problem we're trying to solve. My only
interest in interpolation was to make the common, simple case easier.

> Literal interpolation has no security issues, if done properly.  In
> the latter use, the security issues can be taken care of by carefully
> deciding what data is available in the set of variables to be
> interpolated.  The interpolation syntax I've proposed is intentionally
> very simple, so that this is relatively easy.  I recall seeing slides
> at the conference of a templating system (maybe Twisted's?) that
> allowed expressions like $foo.bar[key] which would be much harder to
> secure.

I'm not attached enough to fight for these but I'll re-emphasize your
implicit point that these are entirely secure if used in literal
interpolation.

> I18n of templates is easy -- just look up the template string in the
> translation database.
> 
> I18n of apps using literal interpolation is more of a can of worms,
> and I have no clear solution.  I agree that a solution is needed --
> otherwise literal interpolation would be *worse* than what we have now!

You translate them from compile time interpolation to runtime by
removing a $ and replacing it by a function call.

a = $"My name is $name"

 becomes:

a = interp(_("My name is $name"))

But of course it is trivial to make the last line of '_' return
interp(rc) so that the client doesn't have to do it.
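
A minimal sketch of that arrangement in Python 3. Only the names interp
and _ come from the text above; the CATALOG dictionary, its made-up
translation and the depth parameter are assumptions for illustration.
The lookup of caller variables relies on sys._getframe, which is
specific to CPython.

```python
import re
import sys

_PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")

# Stand-in for a translation catalog; a real program would use gettext.
CATALOG = {"My name is $name": "Mi chiamo $name"}

def interp(template, depth=1):
    """Substitute $name using the caller's local and global variables."""
    frame = sys._getframe(depth)
    namespace = dict(frame.f_globals)
    namespace.update(frame.f_locals)
    return _PLACEHOLDER.sub(lambda m: str(namespace[m.group(1)]), template)

def _(message):
    """Look up the translation, then interpolate in the caller's scope."""
    # depth=2 skips this function's own frame.
    return interp(CATALOG.get(message, message), depth=2)

def greet():
    name = "Paul"
    return _("My name is $name")

print(greet())
```

Folding the interpolation into '_' keeps the call site down to a single
function call, at the cost of doing the substitution at runtime.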

 Paul Prescod


From tim.one@comcast.net  Mon Feb 25 23:17:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 25 Feb 2002 18:17:11 -0500
Subject: [Python-Dev] Tkinter versus Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEIANNAA.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEFDNPAA.tim.one@comcast.net>

FYI, someone just added a new entry to the ancient Python "Programs using
Tkinter sometimes can't shut down (Windows)" bug report that everyone gave
up on.  John Popplewell claims to have found the (a?) cause, in part:

"""
Managed to track it down to a problem inside Tcl.
For the Tcl8.3.4 source distribution the problem is in
the file win/tclWinNotify.c
"""

Beats me, but a claim so specific is probably worth checking out:

<http://sf.net/tracker/?func=detail&atid=105470&aid=216289&group_id=5470>



From skip@pobox.com  Mon Feb 25 23:27:18 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 25 Feb 2002 17:27:18 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <3C7AC327.4164C382@prescod.net>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <20020225150106.B22803@unpythonic.dhs.org>
 <20020225131857.A22769@glacier.arctrix.com>
 <20020225152600.E22803@unpythonic.dhs.org>
 <3C7AB0C6.8C3BF369@prescod.net>
 <m3wux1urh6.fsf@mira.informatik.hu-berlin.de>
 <3C7AC327.4164C382@prescod.net>
Message-ID: <15482.51286.256028.67275@12-248-41-177.client.attbi.com>

>>>>> "Paul" == Paul Prescod <paul@prescod.net> writes:

    Paul> "Martin v. Loewis" wrote:
    >> I could not accept any solution that cannot offer anything but this.
    >> This kind of interpolation is plain broken.

    Paul> How so? I need more info to go on.

I have no direct experience with text translation, but in this internet day
and age, it seems to me that a change to the language shouldn't make
internationalization more difficult than it already is.  (I doubt anyone
will claim that it's truly easy, even with gettext.)  Guido mentioned a
number of other languages that already use $-interpolation, Perl, the
shells, awk and Ruby I think.  Of those, all but Ruby were around before the
explosion of the internet in general and the web and Unicode in particular,
so internationalization wasn't a prime consideration when those languages'
$-interpolation facilities were implemented.

Skip


From skip@pobox.com  Mon Feb 25 23:28:27 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 25 Feb 2002 17:28:27 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.50812.53834.442570@anthem.wooz.org>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
 <15482.46161.285288.743587@12-248-41-177.client.attbi.com>
 <15482.47026.213046.548051@anthem.wooz.org>
 <15482.50632.874371.694865@12-248-41-177.client.attbi.com>
 <15482.50812.53834.442570@anthem.wooz.org>
Message-ID: <15482.51355.567748.233425@12-248-41-177.client.attbi.com>


From skip@pobox.com  Mon Feb 25 23:29:21 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 25 Feb 2002 17:29:21 -0600
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.50812.53834.442570@anthem.wooz.org>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
 <15482.46161.285288.743587@12-248-41-177.client.attbi.com>
 <15482.47026.213046.548051@anthem.wooz.org>
 <15482.50632.874371.694865@12-248-41-177.client.attbi.com>
 <15482.50812.53834.442570@anthem.wooz.org>
Message-ID: <15482.51409.945722.29088@12-248-41-177.client.attbi.com>

Sorry about the first, fumbled reply...

    BAW> To be clear, I think the ideal interface would be a graphical one,
    BAW> with drag-n-drop icons for the textual placeholders.  This would
    BAW> allow them to re-arrange the order of the placeholder, and it would
    BAW> be obvious what is variable in your templates, but it wouldn't
    BAW> allow them to change, remove, or add placeholders.

This places the onus back on the application programmer, not the language
designer.

Skip



From paul@prescod.net  Mon Feb 25 23:29:39 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 15:29:39 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <20020225150106.B22803@unpythonic.dhs.org>
 <20020225131857.A22769@glacier.arctrix.com>
 <20020225152600.E22803@unpythonic.dhs.org>
 <3C7AB0C6.8C3BF369@prescod.net>
 <m3wux1urh6.fsf@mira.informatik.hu-berlin.de>
 <3C7AC327.4164C382@prescod.net> <15482.51286.256028.67275@12-248-41-177.client.attbi.com>
Message-ID: <3C7AC8E3.1175D99D@prescod.net>

Skip Montanaro wrote:
> 
> >>>>> "Paul" == Paul Prescod <paul@prescod.net> writes:
> 
>     Paul> "Martin v. Loewis" wrote:
>     >> I could not accept any solution that cannot offer anything but this.
>     >> This kind of interpolation is plain broken.
> 
>     Paul> How so? I need more info to go on.
> 
> I have no direct experience with text translation, but in this internet day
> and age, it seems to me that a change to the language shouldn't make
> internationalization more difficult than it already is.  

I've proposed that whereas today you add a "_( )" in the future you
would add "_( )" and remove "$" if it happens to occur at the start of
the string. If the string didn't start with a "$" you might also have to
scan to see if it contains one. In that case you double it up.

This doesn't make internationalization more difficult. As proof I
present mailman, which *already* does the interpolation I ask for as a
feature of its implementation of "_()". All I'm asking is that mailman's
interpolation feature ALSO be available under a simplified syntax at
compile time.

 Paul Prescod


From martin@v.loewis.de  Mon Feb 25 23:34:26 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 26 Feb 2002 00:34:26 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <20020225164532.H22803@unpythonic.dhs.org>
References: <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225152035.C22803@unpythonic.dhs.org>
 <15482.45508.332144.622180@anthem.wooz.org>
 <m31yf9w64z.fsf@mira.informatik.hu-berlin.de>
 <20020225164532.H22803@unpythonic.dhs.org>
Message-ID: <m3d6yt3zlp.fsf@mira.informatik.hu-berlin.de>

Jeff Epler <jepler@unpythonic.dhs.org> writes:

> It means that you must audit not only your source code, but also your
> message catalogs, to determine whether information that is supposed to
> remain internal to a program is not formatted into a string.  Of course,
> it is fairly easy to do this audit by showing that the translated string
> doesn't contain substitution on any identifiers that the original string
> did not.

That specific test could be done automatically. In fact, GNU msgfmt
already performs the test for c-format strings; msgfmt.py should
probably learn about the common notations for string interpolation.
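
A minimal sketch of such a check, in present-day Python. The placeholder
grammar ($name, ${name}, with $$ as an escaped dollar sign) and the function
names are assumptions for illustration; this is not how GNU msgfmt or
msgfmt.py implements it.

```python
import re

# Matches "$$" (escaped dollar), "$name", or "${name}".
# The grammar is an assumption, not a notation taken from msgfmt.
_PLACEHOLDER = re.compile(r"\$(?:(\$)|([A-Za-z_]\w*)|\{([A-Za-z_]\w*)\})")


def placeholders(text):
    """Return the set of identifiers interpolated in text."""
    names = set()
    for escaped, bare, braced in _PLACEHOLDER.findall(text):
        if not escaped:
            names.add(bare or braced)
    return names


def extra_placeholders(original, translated):
    """Identifiers the translation substitutes that the original did not."""
    return placeholders(translated) - placeholders(original)
```

An empty result from extra_placeholders() means the translated string does
not substitute any identifier that the original string did not.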

Regards,
Martin



From martin@v.loewis.de  Mon Feb 25 23:38:58 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 26 Feb 2002 00:38:58 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <3C7AC327.4164C382@prescod.net>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <20020225150106.B22803@unpythonic.dhs.org>
 <20020225131857.A22769@glacier.arctrix.com>
 <20020225152600.E22803@unpythonic.dhs.org>
 <3C7AB0C6.8C3BF369@prescod.net>
 <m3wux1urh6.fsf@mira.informatik.hu-berlin.de>
 <3C7AC327.4164C382@prescod.net>
Message-ID: <m38z9h3ze5.fsf@mira.informatik.hu-berlin.de>

Paul Prescod <paul@prescod.net> writes:

> > > I think you go through and remove the "$" signs (probably at the same
> > > time you are removing "_") and use a runtime function to do the
> > > translation (probably the same function doing the interpolation).
> > 
> > I could not accept any solution that cannot offer anything but this.
> > This kind of interpolation is plain broken.
> 
> How so? I need more info to go on.

In the applications that I have in mind, interpolated strings are
typically presented to the user, so there must be a way to localize
them. An extension to the language that does not support localization
is useless if I have to find some other means for l10n. 

If there will be a standard library function that does the
interpolation anyway, I'd prefer not to have a language extension that
achieves the same thing but is more limited. If anything, the language
extension should be more powerful, not more limited, in applicability.

Regards,
Martin


From JeffH@ActiveState.com  Mon Feb 25 23:38:35 2002
From: JeffH@ActiveState.com (Jeff Hobbs)
Date: Mon, 25 Feb 2002 15:38:35 -0800
Subject: [Python-Dev] RE: Tkinter versus Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEFDNPAA.tim.one@comcast.net>
Message-ID: <008401c1be55$8b5d7520$ba03a8c0@activestate.ca>

I added a note to the bug report, which was correct.  The fix was
already made in the Tcl head, but I back-ported it to the 8.3-branch
of Tcl for those who want to be able to grab that and work against it.

Jeff

> -----Original Message-----
> From: Tim Peters [mailto:tim.one@comcast.net]
> 
> FYI, someone just added a new entry to the ancient Python "Programs using
> Tkinter sometimes can't shut down (Windows)" bug report that everyone gave
> up on.  John Popplewell claims to have found the (a?) cause, in part:
> 
> """
> Managed to track it down to a problem inside Tcl.
> For the Tcl8.3.4 source distribution the problem is in
> the file win/tclWinNotify.c
> """
> 
> Beats me, but a claim so specific is probably worth checking out:
> 
> <http://sf.net/tracker/?func=detail&atid=105470&aid=216289&group_id=5470>



From martin@v.loewis.de  Mon Feb 25 23:42:41 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 26 Feb 2002 00:42:41 +0100
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.50632.874371.694865@12-248-41-177.client.attbi.com>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <15482.43427.691923.976178@12-248-41-177.client.attbi.com>
 <15482.45090.3848.616817@anthem.wooz.org>
 <15482.46161.285288.743587@12-248-41-177.client.attbi.com>
 <15482.47026.213046.548051@anthem.wooz.org>
 <15482.50632.874371.694865@12-248-41-177.client.attbi.com>
Message-ID: <m33czp3z7y.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> Just to play the devil's advocate (and ignoring the bit about $-strings not
> being i18n-friendly), I suspect non-programming translators would have just
> as much trouble with something like
> 
>     $"Please confirm your choice of color ($color)..."
> 
> "$color" will look like a word to be translated.  You would have to tell
> them "don't translate anything immediately following a dollar sign up to,
> but not including the next character that can't be part of a Python
> identifier."  Seems either a bit error-prone or confusing to me if I pretend
> I'm not a programmer.

Indeed. Therefore, the only true solution is to have an automatic
check that verifies that the translated string has the same inserts as
the original. Such a check could instruct users to follow any
interpolation scheme; even if translators don't know the programming
language of the application, they still are typically capable of
understanding the error messages from msgfmt.

Regards,
Martin


From paul@prescod.net  Mon Feb 25 23:46:31 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 15:46:31 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <20020225150106.B22803@unpythonic.dhs.org>
 <20020225131857.A22769@glacier.arctrix.com>
 <20020225152600.E22803@unpythonic.dhs.org>
 <3C7AB0C6.8C3BF369@prescod.net>
 <m3wux1urh6.fsf@mira.informatik.hu-berlin.de>
 <3C7AC327.4164C382@prescod.net> <m38z9h3ze5.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C7ACCD7.9219AAF5@prescod.net>

"Martin v. Loewis" wrote:
> 
>...
> 
> In the applications that I have in mind, interpolated strings are
> typically presented to the user, so there must be a way to localize
> them. An extension to the language that does not support localization
> is useless if I have to find some other means for l10n.

You will use another invocation syntax, but probably the same string
interpolation syntax.

> If there will be a standard library function that does the
> interpolation anyway, I'd prefer not to have a language extension that
> achieves the same thing but is more limited. If anything, the language
> extension should be more powerful, not more limited, in applicability.

The language extension should be syntactically simpler because it is
what is used for simpler cases. Simpler constructs are also less likely
to open up security issues.

 Paul Prescod


From barry@zope.com  Tue Feb 26 00:20:49 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 19:20:49 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <20020225150106.B22803@unpythonic.dhs.org>
 <20020225131857.A22769@glacier.arctrix.com>
 <20020225152600.E22803@unpythonic.dhs.org>
 <3C7AB0C6.8C3BF369@prescod.net>
 <m3wux1urh6.fsf@mira.informatik.hu-berlin.de>
 <3C7AC327.4164C382@prescod.net>
 <15482.51286.256028.67275@12-248-41-177.client.attbi.com>
 <3C7AC8E3.1175D99D@prescod.net>
Message-ID: <15482.54497.412834.587484@anthem.wooz.org>

>>>>> "PP" == Paul Prescod <paul@prescod.net> writes:

    PP> This doesn't make internationalization more difficult. As
    PP> proof I present mailman, which *already* does the
    PP> interpolation I ask for as a feature of its implementation of
    PP> "_()". All I'm asking is that mailman's interpolation feature
    PP> ALSO be available under a simplified syntax at compile time.

Except that remember the interpolation step must happen /after/ the
translation step, otherwise it's worse than useless.
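
A small sketch of why the order matters, in present-day Python. It uses
string.Template (which postdates this thread) as a stand-in for $-style
interpolation, and a hypothetical one-entry catalog in place of gettext.

```python
from string import Template

# Toy message catalog keyed by the untranslated template (hypothetical data).
CATALOG = {"My name is $name": "Je m'appelle $name"}


def translate(msgid):
    """Look up msgid in the catalog, falling back to the original text."""
    return CATALOG.get(msgid, msgid)


def translate_then_interpolate(msgid, values):
    # The catalog is keyed by the template, so the lookup succeeds.
    return Template(translate(msgid)).substitute(values)


def interpolate_then_translate(msgid, values):
    # The interpolated text is no longer a catalog key, so the lookup misses.
    return translate(Template(msgid).substitute(values))
```

With values {"name": "Barry"}, the first function finds the French entry and
then fills it in; the second fills in the English text first and so returns
it untranslated.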

-Barry


From paul@prescod.net  Tue Feb 26 00:39:21 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 16:39:21 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org> <3C7A7DB0.A8A3416E@lemburg.com> <15482.34560.688685.262327@anthem.wooz.org> <3C7A8FC7.9CB321EE@lemburg.com> <15482.36941.605165.133988@anthem.wooz.org> <3C7A9430.1E1077F8@lemburg.com> <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net> <200202252106.g1PL6FY13393@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7AD939.FCFE163D@prescod.net>

[meant to send this before]

Guido van Rossum wrote:
> 
>...
> 
> All right, but there *also* needs to be a way to invoke interpolation
> explicitly -- just like eval().  This has applicability e.g. in i18n.

Agree 100%. The last time we discussed this I proposed there should be a
function to do this. But the naive integrated syntax could be compile
time.

My complaints with the current interpolation are:

 1. they require too many magical incantations to invoke (especially %
vars())
 2. they require too much thinking about types and conversions in the
syntax
 3. special behaviour with dictionaries and tuples and singleton tuples
etc.
 4. operator abuse
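
To illustrate point 3, here is a short sketch of how the % operator treats
its right-hand operand differently depending on type. The helper name
try_format is hypothetical.

```python
def try_format(fmt, arg):
    """Apply fmt % arg; return the result, or "TypeError" if it fails."""
    try:
        return fmt % arg
    except TypeError:
        return "TypeError"


# A plain value fills the single %s.
plain = try_format("%s", "abc")

# A tuple is unpacked as the argument list, so two items do not fit one %s.
pair = try_format("%s", (1, 2))

# To format the tuple itself, it has to be wrapped in a singleton tuple.
wrapped = try_format("%s", ((1, 2),))

# A dict is used for %(key)s lookups, but is formatted whole by a bare %s.
by_key = try_format("%(x)s", {"x": 1})
whole = try_format("%s", {"x": 1})
```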

I would only be in favour of a replacement if for *simple cases* it
cleared up all of these issues so that it is roughly as easy as in
Perl/Ruby/sh/tcl:

a = $"My name is $name"

If there is any more syntax than that then personally I think that the
cost/benefit ratio falls down. So I don't see this as a big win:

a = "My name is $name" \ locals()

It solves two of my four problems.

Maybe other people have different goals than I do and that's why they
see the above as a "win".

 Paul Prescod


From paul@prescod.net  Tue Feb 26 00:49:19 2002
From: paul@prescod.net (Paul Prescod)
Date: Mon, 25 Feb 2002 16:49:19 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <20020225150106.B22803@unpythonic.dhs.org>
 <20020225131857.A22769@glacier.arctrix.com>
 <20020225152600.E22803@unpythonic.dhs.org>
 <3C7AB0C6.8C3BF369@prescod.net>
 <m3wux1urh6.fsf@mira.informatik.hu-berlin.de>
 <3C7AC327.4164C382@prescod.net>
 <15482.51286.256028.67275@12-248-41-177.client.attbi.com>
 <3C7AC8E3.1175D99D@prescod.net> <15482.54497.412834.587484@anthem.wooz.org>
Message-ID: <3C7ADB8F.7A7304BC@prescod.net>

"Barry A. Warsaw" wrote:
> 
> >>>>> "PP" == Paul Prescod <paul@prescod.net> writes:
> 
>     PP> This doesn't make internationalization more difficult. As
>     PP> proof I present mailman, which *already* does the
>     PP> interpolation I ask for as a feature of its implementation of
>     PP> "_()". All I'm asking is that mailman's interpolation feature
>     PP> ALSO be available under a simplified syntax at compile time.
> 
> Except that remember the interpolation step must happen /after/ the
> translation step, otherwise it's worse than useless.

Right, that's why *for localized software* you should do it at
runtime. And insofar as the process of localization *already* consists
of touching every string, it takes no extra effort to change a
compile-time interpolation to a runtime one while you are at it.

But the newbie to Python should not be saddled with a syntax optimized
towards advanced users, and even as a person often hacking
single-language software I shouldn't be saddled with dynamic
interpolation until I need it either! "Saddled with" means "required to
use a verbose, non-intuitive syntax with a bunch of special cases for a
simple and common operation."

 Paul Prescod


From fdrake@acm.org  Tue Feb 26 02:00:54 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 25 Feb 2002 21:00:54 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <3C7AC4DF.61A7D7AB@prescod.net>
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225210423.GA2398@crystal.mems-exchange.org>
 <15482.46039.769808.389448@grendel.zope.com>
 <3C7AC4DF.61A7D7AB@prescod.net>
Message-ID: <15482.60502.420424.995597@grendel.zope.com>

Paul Prescod writes:
 >                                   The major reason for doing it at
 > compile time (for me) is that you can have a nice syntax that doesn't
 > involve modulus-ing (or dividing) an otherwise useless vars() or locals()
 > dictionary.

Which has everything to do with your usage.  I almost never use % with
locals() or vars(), so I don't share that motivation.  I'm much more
likely to build a dict specifically for the purpose, which includes
computed values, or have something already created which includes this
usage as part of the larger picture.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From greg@cosc.canterbury.ac.nz  Tue Feb 26 02:14:10 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 26 Feb 2002 15:14:10 +1300 (NZDT)
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <3C7AD939.FCFE163D@prescod.net>
Message-ID: <200202260214.PAA23584@s454.cosc.canterbury.ac.nz>

I'm not sure I like the idea of using $ as the character
for prefixing interpolated strings. Somehow

  a = $"My name is $name"

looks confusing. I think it has something to do with the
fact that $ is appearing both inside and outside the
quotes, making my visual parser worry that the quotes
are misplaced.

Also, it uses up one of the three precious not-yet-used
characters, and I think we should keep those for some
future time when we really need them. We don't need one
for this -- there are plenty of operators available
that haven't yet been used on strings.

I suggest '^', since it does a nice job of suggesting
"inject stuff into this string". We can have both a
prefix form for compile-time interpolation:

  a = ^ "My name is $name"

and an infix form for run-time interpolation:

  a = "My name is $name" ^ dict
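
The infix form can be approximated today with a str subclass, as a rough
sketch only. The class name Interp is hypothetical, string.Template stands in
for the interpolation rules, and the prefix form has no equivalent because
Python has no unary ^ operator.

```python
from string import Template


class Interp(str):
    """A str whose ^ operator interpolates $name from a mapping."""

    def __xor__(self, mapping):
        # Template.substitute raises KeyError for a missing name.
        return Template(self).substitute(mapping)


a = Interp("My name is $name") ^ {"name": "Greg"}
```

Plain str does not define ^, so the wrapper class is needed for the
expression to work at all.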

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From barry@zope.com  Tue Feb 26 03:12:22 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 25 Feb 2002 22:12:22 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <3C7AD939.FCFE163D@prescod.net>
 <200202260214.PAA23584@s454.cosc.canterbury.ac.nz>
Message-ID: <15482.64790.812638.147714@anthem.wooz.org>

>>>>> "GE" == Greg Ewing <greg@cosc.canterbury.ac.nz> writes:

    GE> Also, it uses up one of the three precious not-yet-used
    GE> characters, and I think we should keep those for some
    GE> future time when we really need them. We don't need one
    GE> for this -- there are plenty of operators available
    GE> that haven't yet been used on strings.

    GE> I suggest '^', since it does a nice job of suggesting
    GE> "inject stuff into this string". We can have both a
    GE> prefix form for compile-time interpolation:

    GE>   a = ^ "My name is $name"

    GE> and an infix form for run-time interpolation:

    GE>   a = "My name is $name" ^ dict

I think I suggested using ~ for this at IPC10:

    a = ~'my name is $name'

for the compile-time interpolation.  I don't think it matters much
which operator is chosen (let Guido decide).

-Barry


From tim.one@comcast.net  Tue Feb 26 03:24:03 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 25 Feb 2002 22:24:03 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.45796.817626.14965@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEGGNPAA.tim.one@comcast.net>

[Barry A. Warsaw]
> Does anybody ever even use something other than `s' for %() strings?
>
> >>> '%(float)f' % {'float': 3.9}
> '3.900000'
>
> I never have.

Then again, you've never used a floating-point number either <wink>.  I've
certainly used %(x)f/g/e with float formats.

Not quite speaking of which, if Python grows a new $ operator, let's get the
precedence right.  This kind of thing is too common today:

>>> amount = 3.50
>>> n = 3
>>> print "Total: $%.2f." % amount*n
Total: $3.50.Total: $3.50.Total: $3.50.
>>>
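
The same session written out in present-day Python, with the parenthesized
form that gives the intended total:

```python
amount = 3.50
n = 3

# % and * share one precedence level and group left to right, so the
# string is formatted with amount first and then repeated n times.
unintended = "Total: $%.2f." % amount * n

# Parenthesizing the product formats the total once.
intended = "Total: $%.2f." % (amount * n)
```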



From tim.one@comcast.net  Tue Feb 26 03:29:05 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 25 Feb 2002 22:29:05 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <200202252205.g1PM5xD13825@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEGHNPAA.tim.one@comcast.net>

[Guido]
> I never use this in combination with named variables, but I often
> write timing programs that format times using "%6.3f" to get
> millisecond precision.

Note that you also use %(name)s with width, precision and justification
modifiers.  For example, this line is yours:

     s = "%(name)-20.20s %(sts)-10s %(uptime)6s %(idletime)6s" % locals()
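
A short sketch of what those modifiers do, using an explicit dict with
made-up values in place of locals():

```python
# Hypothetical sample values; the original line formats locals() instead.
row = {
    "name": "a_rather_long_hostname.example.org",
    "sts": "up",
    "uptime": "3d",
    "idletime": "12m",
}

# -20.20s: left-justify, pad to 20 columns, and truncate to 20 characters.
# -10s:    left-justify and pad to 10 columns.
# 6s:      right-justify in 6 columns.
s = "%(name)-20.20s %(sts)-10s %(uptime)6s %(idletime)6s" % row
```

The long name is cut to its first 20 characters, so every row built this way
has the same column layout.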



From DavidA@ActiveState.com  Tue Feb 26 04:25:59 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Mon, 25 Feb 2002 20:25:59 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225152035.C22803@unpythonic.dhs.org>
 <3C7AAE96.BA4D19FE@prescod.net> <15482.45796.817626.14965@anthem.wooz.org> <3C7AB2FE.D1FFE397@prescod.net>
Message-ID: <3C7B0E57.E0FC3960@ActiveState.com>

Paul Prescod wrote:
> 
> "Barry A. Warsaw" wrote:
> >
> >...
> >
> > Does anybody ever even use something other than `s' for %() strings?
> >
> > >>> '%(float)f' % {'float': 3.9}
> > '3.900000'
> 
> Presumably numerical analysts do....and David Ascher once told me he
> uses %d as a sanity type-check. I don't bother.

Paul's starting to turn into my brother -- quoting things I said twenty
years ago and that I have no way of disproving.  As Bill said, "I don't
recall".

These days, I rarely think in FP, even if I use FP, so I typically use
%s.  Back then I probably cared about mantissa and her friends.

--da


From mal@lemburg.com  Tue Feb 26 10:27:46 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 11:27:46 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C7B6322.440D21E7@lemburg.com>

I consider the above PEP ready for review by the developers.
Please comment.

    http://python.sourceforge.net/peps/pep-0263.html
	
After approval, the next step would be to implement phase 1
for 2.3. Step two would then be on the plate for 2.4.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jim@zope.com  Tue Feb 26 14:04:27 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 09:04:27 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <029a01c1b0cb$195cb400$ced241d5@hagrid> <200202081814.g18IEYe02932@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7B95EB.952037FE@zope.com>

Guido van Rossum wrote:
> 
> >     http://effbot.org/ideas/time-type.htm
> >
> > I can produce PEP and patch if necessary.
> 
> Yes, a PEP, please!  Jim Fulton has been asking for this for a long
> time too.  His main requirement is that timestamp objects are small,
> both in memory and as pickles, because Zope keeps a lot of these
> around. 

I also need time-zone support.

> They are currently represented either as long ints (with a
> little under 64 bits) or as 8-byte strings. 

ZODB has a TimeStamp type that uses a 32-bit unsigned integer 
to store year, month, day, hour, and minute in a way that makes it dirt
simple to extract a component.  It uses a 32-bit integer to store
seconds in units of 60/2**32 seconds.

This type isn't appropriate for general use because it only allows
dates later than Dec 31, 1899.
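
A rough sketch of one way such a layout could work. The mixed-radix packing,
the 1900 base year, and all function names are assumptions made for
illustration; the message does not spell out the actual encoding.

```python
EPOCH_YEAR = 1900  # assumed base year, consistent with the 1899 cutoff

# One tick of the seconds field, in seconds.
SECOND_TICK = 60.0 / 2**32


def pack_minutes(year, month, day, hour, minute):
    """Pack the date fields into one unsigned 32-bit mixed-radix integer."""
    v = ((year - EPOCH_YEAR) * 12 + (month - 1)) * 31 + (day - 1)
    v = (v * 24 + hour) * 60 + minute
    if not 0 <= v < 2**32:
        raise ValueError("date out of range for 32 bits")
    return v


def unpack_minutes(v):
    """Recover (year, month, day, hour, minute) with successive divmods."""
    v, minute = divmod(v, 60)
    v, hour = divmod(v, 24)
    v, day0 = divmod(v, 31)
    year0, month0 = divmod(v, 12)
    return (year0 + EPOCH_YEAR, month0 + 1, day0 + 1, hour, minute)


def pack_seconds(seconds):
    """Express seconds within the minute in units of 60/2**32 seconds."""
    return int(seconds / SECOND_TICK)


def unpack_seconds(ticks):
    """Convert ticks of 60/2**32 seconds back to seconds."""
    return ticks * SECOND_TICK
```

Under this assumed layout, extracting a component takes only a few integer
divisions, and any year before 1900 packs to a negative value and is
rejected.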

> A dedicated timestamp
> object could be smaller than that.

A type that only needed minute precision could easily be expressed
with 32-bits. Of course, the two-word object overhead makes the difference
between 32-bits and 64-bits rather unexciting.
 
> Your idea of a base type (which presumably standardizes at least one
> form of representation) sounds like a breakthrough that can help
> satisfy different other needs.

I agree.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org


From jim@zope.com  Tue Feb 26 14:05:11 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 09:05:11 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7B9617.D22E449F@zope.com>

Guido van Rossum wrote:
> 
> [Tim]
> > Are you sure Jim is looking to replace the TimeStamp object?  All the
> > complaints I've seen aren't about the relatively tiny TimeStamp object, but
> > about Zope's relatively huge DateTime class (note that you won't have source
> > for that if you're looking at a StandaloneZODB checkout -- DateTime is used
> > at higher Zope levels), which is a Python class with a couple dozen(!)
> > instance attributes.  See, e.g.,
> >
> >     http://dev.zope.org/Wikis/DevSite/Proposals/ReplacingDateTime
> >
> > It seems clear from the source code that TimeStamp is exactly what Jim
> > intended it to be <wink>.
> 
> I'm notoriously bad at channeling Jim.  Nevertheless, I do recall him
> saying he wanted a lightweight time object.  I think the mistake of
> DateTime is that it stores the broken-out info, rather than computing
> it on request.

I don't think the mistake was so much to store broken-out info, 
but to store too much data in general. It stores redundant data, which
makes its implementation a bit difficult to understand and maintain.

Note that scalability was not a goal of Zope's DateTime type. We meant
to replace it with something much tighter a long time ago, but never
got around to it.


> > > Your idea of a base type (which presumably standardizes at least one
> > > form of representation) sounds like a breakthrough that can help
> > > satisfy different other needs.
> >
> > Best I can make out, /F is only proposing what Jim would call an Interface:
> > the existence of two methods, timetuple() and utctimetuple().  In a comment
> > on his page, /F calls it an "abstract" base class, which is more C++-ish
> > terminology, and the sample implementation makes clear it's a "pure"
> > abstract base class, so same thing as a Jim Interface in the end.

Yup.
 
> I'll show the PEP to Jim when it appears.

Thanks. ;)

BTW, has there been any progress on this?

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org


From jim@zope.com  Tue Feb 26 14:05:44 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 09:05:44 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net>
 <15459.9239.83647.334632@gondolin.digicool.com>
 <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net>
 <15460.15864.266226.241495@grendel.zope.com>
 <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> <15460.17309.905113.103005@grendel.zope.com> <3C644E7B.F69AF73C@lemburg.com>
Message-ID: <3C7B9638.C59EEC35@zope.com>

"M.-A. Lemburg" wrote:
> 
> "Fred L. Drake, Jr." wrote:
> >
> > Guido van Rossum writes:
> >  > Is comparison the same what Tim mentioned as range searches?  I guess
> >  > a representation like current Zope timestamps or what time.time()
> >  > returns is fine for that -- it is monotonic even if not necessarily
> >  > continuous.  I guess a broken-out time tuple is much harder to compare.
> >
> > Yes; as long as ordering is easy to check, we're fine with a long int
> > or some such thing.  The range search is indeed the specific
> > application Jim has in mind.
> 
> Uhm... I think this thread is heading in the wrong direction.
> 
> Fredrik wasn't proposing a solution to Jim's particular
> problem (whatever it was ;-), but instead opting for a solution
> of a large number of Python users out there.

Right. ;)

> While mxDateTime probably works for most of them (and is used by
> pretty much all major database modules out there), some may feel
> that they don't want to rely on external libs for their software
> to run on.

I have no problem relying on external libraries.
 
> I would be willing to make the mxDateTime types subtypes of
> whatever Fredrik comes up with. The only requirement I have is
> that the binary footprint of the types needs to match today's
> layout of mxDateTime types since I need to maintain binary
> compatibility.

The binary footprint of your types, not the standard base class, 
right? I don't see a problem with that.

> The other possibility would be adding a set of new types
> to mxDateTime which focus on low memory requirements rather
> than data roundtrip safety and speed.

What is data roundtrip safety?

I rarely do date-time arithmetic, but I often do date-time-part
extraction. I think that mxDateTime is optimized for arithmetic,
whereas I'd prefer a type more focussed on extraction efficiency
and memory usage, and that efficiently supports time zones.
This is, of course, no knock on mxDateTime. I also want
fast comparisons, which I presume mxDateTime provides.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org


From jim@zope.com  Tue Feb 26 14:05:32 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 09:05:32 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net> <15459.9239.83647.334632@gondolin.digicool.com>
Message-ID: <3C7B962C.147AAA15@zope.com>

Jeremy Hylton wrote:
> 
> >>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:
> 
...
> Also, it may not be necessary to have a TimeStamp object in ZODB 4.
> There are three uses for the timestamp: tracking how recently an
> object was used for cache eviction,

Time-stamp isn't used for this.

> providing a last modified time to
> users, and as a simple version number.

This is certainly a hack.
 
> In ZODB 4, the cache eviction may be done quite differently. 

Yup, or, with Toby Dickenson's patches, in ZODB 3. :)

> The
> version number may be a simple int.  The last mod time will not be
> provided for each object; instead, users will need to define this
> themselves if they care about it.  If they define it themselves,
> they'd probably use a DateTime object, but we'd care much less about
> how small it is.

The TimeStamp type will still be useful for storage implementations
that want compact time strings.

I could imagine an alternate implementation that conformed to the new
interface and retained the compact representation.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org


From guido@python.org  Tue Feb 26 14:20:47 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 09:20:47 -0500
Subject: [Python-Dev] OS/2 EMX changes to dynload_shlib.c
In-Reply-To: Your message of "Tue, 26 Feb 2002 03:41:36 PST."
 <E16ffzE-0007dJ-00@usw-pr-cvs1.sourceforge.net>
References: <E16ffzE-0007dJ-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <200202261420.g1QEKld15953@pcp742651pcs.reston01.va.comcast.net>

Given the number of OS/2 EMX specific changes to dynload_shlib.c,
wouldn't it be better to create a separate dynload_os2.c?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Feb 26 14:22:58 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 09:22:58 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: Your message of "Tue, 26 Feb 2002 11:27:46 +0100."
 <3C7B6322.440D21E7@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
Message-ID: <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>

> I consider the above PEP ready for review by the developers.
> Please comment.
> 
>     http://python.sourceforge.net/peps/pep-0263.html
> 	
> After approval, the next step would be to implement phase 1
> for 2.3. Step two would then be on the plate for 2.4.

That looks OK to me.  Is the Emacs-style comment in fact compatible
with Emacs?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Tue Feb 26 14:31:52 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 15:31:52 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7B9C58.5D9367E0@lemburg.com>

Guido van Rossum wrote:
> 
> > I consider the above PEP ready for review by the developers.
> > Please comment.
> >
> >     http://python.sourceforge.net/peps/pep-0263.html
> >
> > After approval, the next step would be to implement phase 1
> > for 2.3. Step two would then be on the plate for 2.4.
> 
> That looks OK to me.  Is the Emacs-style comment in fact compatible
> with Emacs?

According to Martin, it is compatible. If it's not we'll make it 
so :-)

Barry ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Tue Feb 26 14:33:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 15:33:05 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net>
 <15459.9239.83647.334632@gondolin.digicool.com>
 <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net>
 <15460.15864.266226.241495@grendel.zope.com>
 <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> <15460.17309.905113.103005@grendel.zope.com> <3C644E7B.F69AF73C@lemburg.com> <3C7B9529.309D9902@zope.com>
Message-ID: <3C7B9CA1.B7798317@lemburg.com>

Jim Fulton wrote:
> 
> > I would be willing to make the mxDateTime types subtypes of
> > whatever Fredrik comes up with. The only requirement I have is
> > that the binary footprint of the types needs to match today's
> > layout of mxDateTime types since I need to maintain binary
> > compatibility.
> 
> The binary footprint of your types, not the standard base class,
> right? I don't see a problem with that.

Fredrik's solution only provides an abstract base type 
with no additional parameters in the type object (only an
interface definition on top of it) -- this
would work nicely.
 
> > The other possibility would be adding a set of new types
> > to mxDateTime which focus on low memory requirements rather
> > than data roundtrip safety and speed.
> 
> What is data roundtrip safety?

Roundtrip safety means that e.g. if you take a COM date value
from an ADO and create a DateTime object with it, you can
be sure to get back the exact same value via the COMDate()
method.

The same is true for broken down values and, of course,
the internal values .absdate and .abstime.

This may not be too important for most applications, but
it certainly is for database related ones, since rounding
can cause e.g. 14:00:00.00 to become 13:59:59.99 and that's
not what you want if you transfer data from one database
to another.
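
A minimal sketch of that rounding hazard, for illustration only. The
function names are hypothetical and are not part of mxDateTime; the
point is just that truncating a float loses a hundredth, while an
integer tick count survives the round trip exactly.

```python
# Hypothetical helpers; none of these names come from mxDateTime.

def hundredths_truncated(seconds):
    """Extract hundredths of a second by truncation (lossy)."""
    return int(seconds * 100)

def hundredths_rounded(seconds):
    """Extract hundredths of a second by rounding to nearest."""
    return int(round(seconds * 100))

def to_ticks(hours, minutes, seconds, hundredths):
    """Store a time of day as an integer count of hundredths."""
    return ((hours * 60 + minutes) * 60 + seconds) * 100 + hundredths

def from_ticks(ticks):
    """Recover the broken-down parts with integer arithmetic only."""
    rest, hundredths = divmod(ticks, 100)
    rest, seconds = divmod(rest, 60)
    hours, minutes = divmod(rest, 60)
    return hours, minutes, seconds, hundredths

# 0.29 has no exact binary representation, so truncation drops a tick.
lossy = hundredths_truncated(0.29)
safe = hundredths_rounded(0.29)
parts = from_ticks(to_ticks(14, 0, 0, 0))
```

Here the truncating version turns 0.29 seconds into 28 hundredths,
which is the same kind of slip as 14:00:00.00 becoming 13:59:59.99.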

> I rarely do date-time arithmetic, but I often do date-time-part
> extraction. I think that mxDateTime is optimized for arithmetic,
> whereas, I'd prefer a type more focussed on extraction efficiency,
> and memory usage, and that efficiently supports time zones.
> This is, of course, no knock on mxDateTime. I also want
> fast comparisons, which I presume mxDateTime provides.

DateTime objects use .abstime and .absdate for doing
arithmetic since these provide the best accuracy. The most
important broken down values are calculated once at creation
time; a few others are done on-the-fly. 

I suppose that I could easily make a few calculations
lazy to enhance speed; memory footprint would not change
though. It's currently at 56 bytes per DateTime object
and 36 bytes per DateTimeDelta object.

To get similar accuracy in Python, you'd need a float and
an integer per object, that's 16 bytes + 12 bytes == 28 bytes
+ malloc() overhead for the two and the wrapping instance
which gives another 32 bytes (provided you store the two
objects in slots)... >60 bytes per Python based date time 
object.
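
As a rough way to check such an estimate, here is a sketch that sums
the shallow sizes of a slot-based wrapper and its two values. The class
name PyDateTime is hypothetical, the slot names merely echo .absdate
and .abstime, and the byte counts reported by sys.getsizeof() vary
with Python version and platform, so no figures are claimed here.

```python
import sys

class PyDateTime(object):
    """Hypothetical pure-Python date/time value: day number plus seconds."""
    __slots__ = ("absdate", "abstime")

    def __init__(self, absdate, abstime):
        self.absdate = absdate   # integer day count
        self.abstime = abstime   # float seconds since midnight

def footprint(obj):
    """Sum the shallow sizes of the wrapper and its two slot values."""
    return (sys.getsizeof(obj)
            + sys.getsizeof(obj.absdate)
            + sys.getsizeof(obj.abstime))

value = PyDateTime(730902, 50400.0)
total_bytes = footprint(value)
```

Allocator overhead is not included in these numbers, so the real cost
per object is likely somewhat higher than total_bytes suggests.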

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From barry@zope.com  Tue Feb 26 15:57:49 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 26 Feb 2002 10:57:49 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>
 <3C7B9C58.5D9367E0@lemburg.com>
Message-ID: <15483.45181.422942.816096@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> According to Martin, it is compatible. If it's not we'll make
    MAL> it so :-)

    MAL> Barry ?

I believe so, although I haven't ever used this trick to specify the
file's encoding.  In some quick tests, at least XEmacs doesn't bomb
out on it (if I stick a real encoding in for <encoding name>).

-Barry


From martin@v.loewis.de  Tue Feb 26 18:00:22 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 26 Feb 2002 19:00:22 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de>

--=-=-=

Guido van Rossum <guido@python.org> writes:

> That looks OK to me.  Is the Emacs-style comment in fact compatible
> with Emacs?

It is. I expect many people want to put "utf-8" as the encoding name,
and you need Emacs 21 for that (or Emacs with Mule-UCS, or some such).

In GNU Emacs, you see the effect of the coding: directive in the Emacs
status line. Just try the attached file, it will indicate "R" for
KOI8-R. Not sure about XEmacs.

Regards,
Martin


--=-=-=
Content-Type: application/octet-stream
Content-Disposition: attachment; filename=foo.py
Content-Transfer-Encoding: quoted-printable

#! /usr/bin/python
# -*- coding: koi8-r -*-

print "=ED=C1=D2=D4=C9=CE"


--=-=-=--


From guido@python.org  Tue Feb 26 18:17:54 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 13:17:54 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: Your message of "26 Feb 2002 19:00:22 +0100."
 <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>
 <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200202261817.g1QIHsO19019@pcp742651pcs.reston01.va.comcast.net>

> In GNU Emacs, you see the effect of the coding: directive in the Emacs
> status line. Just try the attached file, it will indicate "R" for
> KOI8-R. Not sure about XEmacs.

Cool!  It worked for me in Emacs, but not in XEmacs.  Oh well.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Tue Feb 26 18:19:33 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 26 Feb 2002 12:19:33 -0600
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>
 <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15483.53685.311901.785106@12-248-41-177.client.attbi.com>

    >> That looks OK to me.  Is the Emacs-style comment in fact compatible
    >> with Emacs?

    Martin> It is. I expect many people want to put "utf-8" as the encoding
    Martin> name, and you need Emacs 21 for that (or Emacs with Mule-UCS, or
    Martin> some such).

    Martin> In GNU Emacs, you see the effect of the coding: directive in the
    Martin> Emacs status line. Just try the attached file, it will indicate
    Martin> "R" for KOI8-R. Not sure about XEmacs.

I use XEmacs 21.4.5 (non-MULE).  I see nothing particularly interesting when
visiting that file.  Apropos doesn't indicate there is a variable named
"coding" either.  I see ":encoding", an undocumented variable.  Everything
else containing "coding" is more complex and seems package-specific (tramp,
vm, ediff, etc).  Perhaps using MULE would make a difference.

Skip



From paul@prescod.net  Tue Feb 26 18:21:08 2002
From: paul@prescod.net (Paul Prescod)
Date: Tue, 26 Feb 2002 10:21:08 -0800
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
References: <15482.30534.640750.875602@anthem.wooz.org>
 <3C7A7DB0.A8A3416E@lemburg.com>
 <15482.34560.688685.262327@anthem.wooz.org>
 <3C7A8FC7.9CB321EE@lemburg.com>
 <15482.36941.605165.133988@anthem.wooz.org>
 <3C7A9430.1E1077F8@lemburg.com>
 <15482.38577.933015.221824@anthem.wooz.org>
 <3C7A9F3B.B42265DC@prescod.net>
 <15482.42163.449374.118807@grendel.zope.com>
 <20020225210423.GA2398@crystal.mems-exchange.org>
 <15482.46039.769808.389448@grendel.zope.com>
 <3C7AC4DF.61A7D7AB@prescod.net> <15482.60502.420424.995597@grendel.zope.com>
Message-ID: <3C7BD214.5065100B@prescod.net>

"Fred L. Drake, Jr." wrote:
> 
> Paul Prescod writes:
>  >                                   The major reason for doing it at
>  > compile time (for me) is that you can have a nice syntax that doesn't
>  > involve modulus-ing (or dividing) an otherwise useless vars() or locals()
>  > dictionary.
> 
> Which has everything to do with your usage.  I almost never use % with
> locals() or vars(), so I don't share that motivation.  

Even so you have to modulus a tuple or a variable. That doesn't make any
more sense for a newbie and is just as inconvenient for the script
kiddie (which is often me!), compared to languages like Perl, Ruby, Tcl,
sh etc.

Python's interpolation syntax is: more verbose, more complicated, less
secure and also more powerful. I have no problem with keeping the power
but I'd like something less verbose and less complicated alongside it.

> I'm much more
> likely to build a dict specifically for the purpose, which includes
> computed values, or have something already created which includes this
> usage as part of the larger picture.

I don't believe that this feature should be taken away from you. But I
don't see how it relates to the PEP because what you want to do is
already doable. PEP 215 is about making things *easier for simple
cases*. If you have new, high-end needs for runtime string interpolation
then PEP 215 probably won't address them.

 Paul Prescod


From barry@zope.com  Tue Feb 26 18:24:41 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 26 Feb 2002 13:24:41 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>
 <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15483.53993.852170.135298@anthem.wooz.org>

--9qjFu7wnRj
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit


>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    MvL> Guido van Rossum <guido@python.org> writes:

    >> That looks OK to me.  Is the Emacs-style comment in fact
    >> compatible with Emacs?

    MvL> It is. I expect many people want to put "utf-8" as the
    MvL> encoding name, and you need Emacs 21 for that (or Emacs with
    MvL> Mule-UCS, or some such).

    MvL> In GNU Emacs, you see the effect of the coding: directive in
    MvL> the Emacs status line. Just try the attached file, it will
    MvL> indicate "R" for KOI8-R. Not sure about XEmacs.

I don't think it works for XEmacs.  I've got a MULE-aware XEmacs
21.4.6 and while it asks if I want to set the local variables in the
-*- line, I still see "Raw" in the modeline, and I see the following
letters in print string (with funny little lines above the
characters): iAOOEI.  See attached capture.  That doesn't seem right,
does it?

-Barry


--9qjFu7wnRj
Content-Type: image/gif
Content-Disposition: inline;
	filename="capture.gif"
Content-Transfer-Encoding: base64

[base64-encoded GIF data for capture.gif omitted]
--9qjFu7wnRj--


From martin@v.loewis.de  Tue Feb 26 18:49:38 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 26 Feb 2002 19:49:38 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <15483.53993.852170.135298@anthem.wooz.org>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>
 <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de>
 <15483.53993.852170.135298@anthem.wooz.org>
Message-ID: <m3n0xww01p.fsf@mira.informatik.hu-berlin.de>

barry@zope.com (Barry A. Warsaw) writes:

> I don't think it works for XEmacs.  I've got a MULE-aware XEmacs
> 21.4.6 and while it asks if I want to set the local variables in the
> -*- line, I still see "Raw" in the modeline, and I see the following
> letters in print string (with funny little lines above the
> characters): iAOOEI.  See attached capture.  That doesn't seem right,
> does it?

Indeed not: It interprets it as latin-1. I hope XEmacs will eventually
follow the GNU Emacs conventions here, since I think they are useful.
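
For illustration, the six bytes from the print statement in foo.py can
be decoded both ways. This is a small sketch written for a current
Python; the variable names are made up for the example.

```python
# The quoted-printable payload =ED=C1=D2=D4=C9=CE as raw bytes.
payload = b"\xed\xc1\xd2\xd4\xc9\xce"

# Decoded as KOI8-R, which is what the coding: line asks for.
as_koi8r = payload.decode("koi8-r")

# Decoded as Latin-1, which is what XEmacs appears to be doing.
as_latin1 = payload.decode("latin-1")

# Cyrillic letters written as escapes so this file stays pure ASCII.
cyrillic_martin = "\u041c\u0430\u0440\u0442\u0438\u043d"
accented_latin = "\xed\xc1\xd2\xd4\xc9\xce"
```

The Latin-1 reading gives accented Latin letters, which matches the
"funny little lines above the characters" in the capture.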

Regards,
Martin



From fredrik@pythonware.com  Tue Feb 26 19:08:03 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 26 Feb 2002 20:08:03 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com>
Message-ID: <065201c1bef8$f3e03ad0$ced241d5@hagrid>

jim wrote:

> BTW, has there been any progress on this?

the current pepoid proposal is here:

    http://effbot.org/ideas/time-type.htm

major issues right now are 1. should the type know about
timezones (probably not), and 2. should it support basic
arithmetics (probably yes).

(and 3. find time to write a html-to-pep converter ;-)

cheers /F



From bckfnn@worldonline.dk  Tue Feb 26 19:44:39 2002
From: bckfnn@worldonline.dk (Finn Bock)
Date: Tue, 26 Feb 2002 19:44:39 GMT
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7B6322.440D21E7@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com>
Message-ID: <3c7bbf00.17218508@mail.wanadoo.dk>

[MAL]

>I consider the above PEP ready for review by the developers.
>Please comment.

The pep seems to dictate that the source by default must be read as
latin-1:

"""
Python will default to Latin-1 as standard encoding if no other
encoding hints are given.
"""

Jython already reads the python source with the default java encoding
which usually depends on the PCs locale.

If a small loophole could be added to that requirement, then the pep
has my full support.

regards,
finn


From mal@lemburg.com  Tue Feb 26 19:50:35 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 20:50:35 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>
 <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de>
 <15483.53993.852170.135298@anthem.wooz.org> <m3n0xww01p.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C7BE70B.74713ED2@lemburg.com>

"Martin v. Loewis" wrote:
> 
> barry@zope.com (Barry A. Warsaw) writes:
> 
> > I don't think it works for XEmacs.  I've got a MULE-aware XEmacs
> > 21.4.6 and while it asks if I want to set the local variables in the
> > -*- line, I still see "Raw" in the modeline, and I see the following
> > letters in print string (with funny little lines above the
> > characters): iAOOEI.  See attached capture.  That doesn't seem right,
> > does it?
> 
> Indeed not: It interprets it as latin-1. I hope XEmacs will eventually
> follow the GNU Emacs conventions here, since I think they are useful.

After reading some of the Emacs docs, I think we should allow
a more flexible coding line:

	-*- ... coding: (\w+) ... -*-

because you will sometimes want to add more variables to that
Emacs init line than just the encoding declaration.
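
A sketch of how such a line could be matched, under one assumption
that differs from the pattern above: the character class also allows
"-" and ".", because \w+ alone would stop at the hyphen in names like
"koi8-r" or "utf-8". The names CODING_LINE and find_coding are
hypothetical.

```python
import re

# Assumption: accept '-' and '.' in the encoding name, since a bare
# \w+ would capture only "koi8" from "koi8-r".
CODING_LINE = re.compile(r"-\*-.*?coding:\s*([-\w.]+).*?-\*-")

def find_coding(line):
    """Return the declared encoding name, or None if there is none."""
    match = CODING_LINE.search(line)
    if match is None:
        return None
    return match.group(1)

simple = find_coding("# -*- coding: koi8-r -*-")
mixed = find_coding("# -*- mode: python; coding: utf-8; tab-width: 4 -*-")
missing = find_coding("# just an ordinary comment")
```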

Does anybody know where XEmacs is moving w/r to this ? (and
for that matter what about vi, vim, etc. ?)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jim@zope.com  Tue Feb 26 19:53:49 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 14:53:49 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid>
Message-ID: <3C7BE7CD.EAA8BA24@zope.com>

Fredrik Lundh wrote:
> 
> jim wrote:
> 
> > BTW, has there been any progress on this?
> 
> the current pepoid proposal is here:
> 
>     http://effbot.org/ideas/time-type.htm
> 
> major issues right now are 1. should the type know about
> timezones (probably not),

:(

-1

Doesn't the proposal sort of imply time-zone
awareness of some kind? Or does it simply imply
UT storage?

> and 2. should it support basic
> arithmetics (probably yes).

Does this imply leap second hell, or will we 
simply be vague about expectations?

I'd also like to see simple access methods for year, 
month, day, hours, minutes, and seconds, with date parts
being one based and time parts being zero based.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org


From guido@python.org  Tue Feb 26 19:58:54 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 14:58:54 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: Your message of "Tue, 26 Feb 2002 19:44:39 GMT."
 <3c7bbf00.17218508@mail.wanadoo.dk>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
Message-ID: <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>

> """
> Python will default to Latin-1 as standard encoding if no other
> encoding hints are given.
> """

I missed this.  Why not default to ASCII like any decent programming
language does in the absence of an explicit encoding?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@zope.com  Tue Feb 26 19:59:45 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 26 Feb 2002 14:59:45 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net>
 <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de>
 <15483.53993.852170.135298@anthem.wooz.org>
 <m3n0xww01p.fsf@mira.informatik.hu-berlin.de>
 <3C7BE70B.74713ED2@lemburg.com>
Message-ID: <15483.59697.784765.121045@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> Does anybody know where XEmacs is moving w/r to this ? (and
    MAL> for that matter what about vi, vim, etc. ?)

I'll ask around in the XEmacs community.

-Barry


From mal@lemburg.com  Tue Feb 26 20:11:47 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 21:11:47 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com>
Message-ID: <3C7BEC03.E8225CFD@lemburg.com>

Jim Fulton wrote:
> 
> > major issues right now is 1. should the type know about
> > timezones (probably not),
> 
> :(
> 
> -1
> 
> Doesn't the proposal sort of imply time-zone
> awareness of some kind? Or does it simply imply
> UT storage?

I tried that in an early version of mxDateTime -- it fails
badly. I switched to the local time assumption very
early in the development.

Note that Fredrik's type is an abstract type; it doesn't
even store anything -- that's up to subtypes which of
course can implement timezones at their liking.
 
> > and 2. should it support basic
> > arithmetics (probably yes).
> 
> Does this imply leap second hell, or will we
> simply be vague about expectations?

The type will store a fixed point in time, so why
worry about leap seconds (most systems don't support these
anyway and if they do, the support is usually switched off per
default) ?
 
> I'd also like to see simple access methods for year,
> month, day, hours, minutes, and seconds, with date parts
> being one based and time parts being zero based.

In the abstract base type ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Tue Feb 26 20:15:40 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 21:15:40 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7BECEC.E1550553@lemburg.com>

Guido van Rossum wrote:
> 
> > """
> > Python will default to Latin-1 as standard encoding if no other
> > encoding hints are given.
> > """
> 
> I missed this.  Why not default to ASCII like any decent programming
> language does in the absence of an explicit encoding?

Jack had the same question. The simple answer is: we need this
in order to maintain backward compatibility when we move to
phase two of the implementation.

Here's the longer one:

ASCII is the standard encoding for Python keywords and identifiers. 
There is no standard source code encoding for string literals. 
Unicode literals are interpreted using 'unicode-escape' which 
is an enhanced Latin-1 with escape semantics.

This makes Latin-1 the right choice:

* Unicode literals already use it today

* As soon as we get to phase two of the implementation,
  8-bit string literals will have to make the round trip
  raw binary -> Unicode -> raw binary and this only works
  if you make Latin-1 the default.
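
The round-trip property can be checked directly. This is only a sketch
of the argument, not of the planned implementation: Latin-1 maps every
byte value 0-255 to the code point with the same number, so decoding
and re-encoding returns the original bytes, whereas ASCII rejects
anything above 127.

```python
# Every possible byte value, as it might appear in an 8-bit literal.
raw = bytes(bytearray(range(256)))

# raw binary -> Unicode -> raw binary, with Latin-1 as the codec.
roundtrip = raw.decode("latin-1").encode("latin-1")

# The same first step with ASCII fails on the high-bit bytes.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```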

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From fredrik@pythonware.com  Tue Feb 26 20:17:23 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 26 Feb 2002 21:17:23 +0100
Subject: [Python-Dev] proposal: add basic money type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid>
Message-ID: <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid>

and while I'm at it:

I propose adding an "abstract" money base type to the standard
library, to be subclassed by real money/decimal implementations.

    if isinstance(v, basemoney):
        # yay! it's money
        print float(money) # let's hope it's not too much

The goal is not to standardize any behaviour beyond this; anything
else should be provided by subtypes.

More details here:

    http://effbot.org/ideas/money-type.htm

I can produce PEP and patch if necessary.

</F>



From mal@lemburg.com  Tue Feb 26 20:20:56 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 21:20:56 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
 <LNBBLJKPBEHFEDALKOLCKEKDNKAA.tim.one@comcast.net>
 <15459.9239.83647.334632@gondolin.digicool.com>
 <200202082103.g18L3X404624@pcp742651pcs.reston01.va.comcast.net>
 <15460.15864.266226.241495@grendel.zope.com>
 <200202082122.g18LMOo05568@pcp742651pcs.reston01.va.comcast.net> <15460.17309.905113.103005@grendel.zope.com> <3C644E7B.F69AF73C@lemburg.com> <3C7B9529.309D9902@zope.com> <3C7B9C08.EAE1A30C@lemburg.com> <3C7BA3D8.7C417182@zope.com>
Message-ID: <3C7BEE28.9030E4EA@lemburg.com>

Jim Fulton wrote:
> 
> "M.-A. Lemburg" wrote:
> >
> > Jim Fulton wrote:
> ...
> 
> > > What is data roundtrip safety?
> >
> > Roundtrip safety means that e.g. if you take a COM date value
> > from an ADO and create a DateTime object with it, you can
> > be sure to get back the exact same value via the COMDate()
> > method.
> 
> Since I don't know what COMDate is, this doesn't mean
> anything to me. :)

Then you're lucky -- COMDates are just about the strangest
beast I've ever seen as date/time encoding.
 
> ...
> > I suppose that I could easily make a few calculations
> > lazy to enhance speed; memory footprint would not change
> > though. It's currently at 56 bytes per DateTime object
> > and 36 bytes per DateTimeDelta object.
> 
> Does that include the two words of Python object overhead?

I suppose so -- the values I quoted are the tp_size
values of the types. The instance will probably also require
a dictionary and the weak ref list on top of those figures.
 
> > To get similar accuracy in Python,
> 
> I assume you mean precision.

Eh, yes.
 
> > you'd need a float and
> > an integer per object,
> 
> It depends on the desired precision. To get minute
> precision, an int will do. Two ints can get you about
> a hundredth of a microsecond precision, which is more than
> most people need.

I was just trying to compare apples to apples :-)

mxDateTime offers the same precision as a float (for daytime) 
and an integer (for the day) can give.
 
> > that's 16 bytes + 12 bytes == 28 bytes
> > + malloc() overhead for the two and the wrapping instance
> > which gives another 32 bytes (provided you store the two
> > objects in slots)... >60 bytes per Python based date time
> > object.
> 
> A Python-based date-time object isn't very interesting to me.

You should have mentioned that earlier ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From fredrik@pythonware.com  Tue Feb 26 20:22:59 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 26 Feb 2002 21:22:59 +0100
Subject: [Python-Dev] proposal: add basic money type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid>
Message-ID: <071b01c1bf03$645eb520$ced241d5@hagrid>

oops.

two lines of code, and one bug.  should be:

>     if isinstance(v, basemoney):
>         # yay! it's money
>         print float(v) # let's hope it's not too much
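
Fleshed out slightly, the idea might look like the sketch below. It is
written in present-day Python 3; basemoney here is a stand-in defined
locally, and CentsMoney and describe are made-up names for illustration,
not part of the proposal:

```python
class basemoney(object):
    """Hypothetical abstract base: promises only conversion to float."""

    def __float__(self):
        raise NotImplementedError


class CentsMoney(basemoney):
    """Toy subtype (an assumption) storing an integer number of cents."""

    def __init__(self, cents):
        self.cents = cents

    def __float__(self):
        return self.cents / 100.0


def describe(v):
    # The only thing client code relies on is the isinstance check
    # plus float(); everything else is left to the subtype.
    if isinstance(v, basemoney):
        return "money: %.2f" % float(v)
    return "not money"
```

describe(CentsMoney(1250)) takes the money branch, while a plain float
such as 12.5 does not.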

</F>



From guido@python.org  Tue Feb 26 20:28:24 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 15:28:24 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 14:53:49 EST."
 <3C7BE7CD.EAA8BA24@zope.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid>
 <3C7BE7CD.EAA8BA24@zope.com>
Message-ID: <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net>

> Does this imply leap second hell, or will we 
> simply be vague about expectations?

IMO, leap seconds should be ignored.  Time stands still during a leap
second.  Consider this a BDFL pronouncement if you wish. :-)

> I'd also like to see simple access methods for year, 
> month, day, hours, minutes, and seconds,

The timetuple() method provides access to all of these
simultaneously.  Isn't that enough?  t.year() could be spelled as
t.timetuple()[0].  I expect that usually you'd request several of
these together anyway, in order to do some fancy formatting, so the
timetuple() approach makes sense.

> with date parts
> being one based and time parts being zero based.

I'm not sure what you mean here.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Feb 26 20:29:59 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 15:29:59 -0500
Subject: [Python-Dev] proposal: add basic money type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 21:17:23 +0100."
 <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid>
 <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid>
Message-ID: <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net>

> I propose adding an "abstract" money base type to the standard
> library, to be subclassed by real money/decimal implementations.

Why do we need this?  I guess that would be Question #1...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Tue Feb 26 20:31:52 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 21:31:52 +0100
Subject: [Python-Dev] proposal: add basic money type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid>
Message-ID: <3C7BF0B8.84C89070@lemburg.com>

Fredrik Lundh wrote:
> 
> and while I'm at it:
> 
> I propose adding an "abstract" money base type to the standard
> library, to be subclassed by real money/decimal implementations.
> 
>     if isinstance(v, basemoney):
>         # yay! it's money
>         print float(money) # let's hope it's not too much
> 
> The goal is not to standardize any behaviour beyond this; anything
> else should be provided by subtypes.
> 
> More details here:
> 
>     http://effbot.org/ideas/money-type.htm
> 
> I can produce PEP and patch if necessary.

Sounds like a plan. 

One thing though: the RE "[+|-]?\d+(.\d+)?" should be extended 
to allow for currency symbols and names in front or after the
monetary value.

Currency for money is a bit like timezones for datetime,
so it's a good idea not to add it to the base type 
interface. However, the interface should be extendable
to include currency information.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Tue Feb 26 20:33:27 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 21:33:27 +0100
Subject: [Python-Dev] proposal: add basic money type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid>
 <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7BF117.D7417F7D@lemburg.com>

Guido van Rossum wrote:
> 
> > I propose adding an "abstract" money base type to the standard
> > library, to be subclassed by real money/decimal implementations.
> 
> Why do we need this?  I guess that would be Question #1...

For databases ?! The DB API has long had a monetary or at least
decimal type on its plate... never got around to implementing
one, though :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From guido@python.org  Tue Feb 26 20:37:05 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 15:37:05 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: Your message of "Tue, 26 Feb 2002 21:15:40 +0100."
 <3C7BECEC.E1550553@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
Message-ID: <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>

> > I missed this.  Why not default to ASCII like any decent programming
> > language does in the absence of an explicit encoding?
> 
> Jack had the same question. The simple answer is: we need this
> in order to maintain backward compatibility when we move to
> phase two of the implementation.
> 
> Here's the longer one:
> 
> ASCII is the standard encoding for Python keywords and identifiers. 
> There is no standard source code encoding for string literals. 
> Unicode literals are interpreted using 'unicode-escape' which 
> is an enhanced Latin-1 with escape semantics.
> 
> This makes Latin-1 the right choice:
> 
> * Unicode literals already use it today

But they shouldn't, IMO.

We should require an explicit encoding when more than ASCII is used,
and I'd like to enforce this.

> * As soon as we get to phase two of the implementation,
>   8-bit string literals will have to make the round trip
>   raw binary -> Unicode -> raw binary and this only works
>   if you make Latin-1 the default.

Sorry, I don't understand what you're trying to say here.  Can you
explain this with an example?  Why can't we require any program
encoded in more than pure ASCII to have an encoding magic comment?  I
guess I don't understand what you mean by "raw binary".

Once you've explained it to me, the PEP should address this issue.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Tue Feb 26 20:42:02 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 26 Feb 2002 21:42:02 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com>
Message-ID: <075f01c1bf06$1383d380$ced241d5@hagrid>

jim wrote:

> Doesn't the proposal sort of imply time-zone
> awareness of some kind? Or does it simply imply
> UT storage?

as written, it still implies time-zone awareness.  the question is
whether to remove that constraint (and the utc* methods).

*all* early reviewers argued that time zones are a representation
thingie, and don't belong in the abstract type.

I'm tempted to agree, but I'm not sure I can explain why...

> > and 2. should it support basic
> > arithmetics (probably yes).
> 
> Does this imply leap second hell, or will we 
> simply be vague about expectations?

vague.

> I'd also like to see simple access methods for year, 
> month, day, hours, minutes, and seconds, with date parts
> being one based and time parts being zero based.

use timetuple().

(I'd rather not add too much stuff to the abstract interface;
the goal is to let MAL turn mxDateTime into a basetime sub-
type without breaking any application code...)

</F>



From jim@zope.com  Tue Feb 26 20:39:22 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 15:39:22 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <3C7BEC03.E8225CFD@lemburg.com>
Message-ID: <3C7BF27A.BD588B4B@zope.com>

"M.-A. Lemburg" wrote:
> 
> Jim Fulton wrote:
> >
> > > major issues right now is 1. should the type know about
> > > timezones (probably not),
> >
> > :(
> >
> > -1
> >
> > Doesn't the proposal sort of imply time-zone
> > awareness of some kind? Or does it simply imply
> > UT storage?
> 
> I tried that in an early version of mxDateTime -- it fails
> badly. I switched to the local time assumption very
> early in the development.

That's a really bad assumption if times are used in different 
time zones.

> Note that Fredrik's type is an abstract type; it doesn't
> even store anything -- that's up to subtypes which of
> course can implement timezones at their liking.

No, it does store something, it just doesn't say how.
There are methods for returning localtime and gmtime, so
there is (abstract) storage.

> > > and 2. should it support basic
> > > arithmetics (probably yes).
> >
> > Does this imply leap second hell, or will we
> > simply be vague about expectations?
> 
> The type will store a fixed point in time, so why
> worry about leap seconds (most systems don't support these
> anyway and if they do, the support is usually switched off per
> default) ?

There are a lot of semantic issues with date-time math.
Leap seconds is an example. If you store local time, are
date-time subtractions affected by daylight-savings time?
Do the calculations depend on the calendar? Do you take
into account the lost days in the switch from the Julian
to the Gregorian calendar?

I'm not really opposed to doing math, but we need to at least
recognize the fuzziness of the semantics.
 
> > I'd also like to see simple access methods for year,
> > month, day, hours, minutes, and seconds, with date parts
> > being one based and time parts being zero based.
> 
> In the abstract base type ?

Yes.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org


From guido@python.org  Tue Feb 26 20:49:04 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 15:49:04 -0500
Subject: [Python-Dev] proposal: add basic money type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 21:33:27 +0100."
 <3C7BF117.D7417F7D@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net>
 <3C7BF117.D7417F7D@lemburg.com>
Message-ID: <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net>

> > > I propose adding an "abstract" money base type to the standard
> > > library, to be subclassed by real money/decimal implementations.
> > 
> > Why do we need this?  I guess that would be Question #1...
> 
> For databases ?! The DB API has long had a monetary or at least
> decimal type on its plate... never got around to implementing
> one, though :-)

I can only find one reference to money or decimal in the DB API PEP,
and that's as a future task.  I guess that's what you mean by "on its
plate".

Since I'm not a database expert, maybe you can explain the use of this
in more detail?  And why would we need a monetary type rather than a
fixed-point decimal type?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com  Tue Feb 26 20:48:57 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 21:48:57 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
Message-ID: <3C7BF4B9.472C9207@lemburg.com>

Finn Bock wrote:
> 
> [MAL]
> 
> >I consider the above PEP ready for review by the developers.
> >Please comment.
> 
> The pep seems to dictate that the source by default must be read as
> latin-1:
> 
> """
> Python will default to Latin-1 as standard encoding if no other
> encoding hints are given.
> """
> 
> Jython already reads the python source with the default java encoding
> which usually depends on the PCs locale.
> 
> If a small loophole could be added to that requirement, then the pep
> has my full support.

Hmm, in phase two we will need to decode the source code
file using some encoding into Unicode and then reencode the
8-bit string parts using that same encoding. The only 
requirement we have for that is round-trip safety, so that
string literals turn out as the same value you see in the
source file.

Now, Unicode literals are explicit about this: unicode-escape
is a latin-1 codec with some escaping knowledge. I'm not sure
how to get this in line with the "any round-trip safe encoding"
strategy...

OTOH, if Jython users write source code which depends on the
PC's locale then they are bound to write non-portable code,
so fixing one encoding would certainly help here.

What I don't understand is why you read the file using the
PC's locale. Wouldn't it be possible to set the file encoding 
prior to reading from it ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From fredrik@pythonware.com  Tue Feb 26 20:48:47 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 26 Feb 2002 21:48:47 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <3C7BEC03.E8225CFD@lemburg.com>
Message-ID: <079f01c1bf07$11963ee0$ced241d5@hagrid>

mal wrote:

> > Doesn't the proposal sort of imply time-zone
> > awareness of some kind? Or does it simply imply
> > UT storage?
> 
> I tried that in an early version of mxDateTime -- it fails
> badly.

can you elaborate?

> > Does this imply leap second hell, or will we
> > simply be vague about expectations?
> 
> The type will store a fixed point in time, so why
> worry about leap seconds (most systems don't support these
> anyway and if they do, the support is usually switched off per
> default) ?

the updated proposal adds __hash__ and __cmp__, and
the following (optional?) operations:

    deltaobject = timeobject - timeobject
    floatobject = float(deltaobject) # fractional seconds
    timeobject = timeobject + integerobject
    timeobject = timeobject + floatobject
    timeobject = timeobject + deltaobject

note that "deltaobject" can be anything; the abstract type
only says that if you manage to subtract one time object from
another one of the same type, you get some object that you
can 1) convert to a float, and 2) add to another time object.

vague, but pretty useful.
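
a rough sketch of a subtype that satisfies those operations, in
present-day Python 3 (so __eq__ stands in for __cmp__).  basetime is
defined locally as a stand-in, and SecondsTime is a made-up example
subtype, not anything from the proposal:

```python
class basetime(object):
    """Hypothetical abstract marker type."""


class SecondsTime(basetime):
    """Toy subtype (an assumption) storing seconds since some epoch."""

    def __init__(self, seconds):
        self.seconds = float(seconds)

    def __sub__(self, other):
        if isinstance(other, SecondsTime):
            # The "deltaobject" is simply a float in this sketch.
            return self.seconds - other.seconds
        return NotImplemented

    def __add__(self, delta):
        # Accepts an integer, a float, or anything float() can convert.
        return SecondsTime(self.seconds + float(delta))

    def __eq__(self, other):
        return isinstance(other, SecondsTime) and self.seconds == other.seconds

    def __hash__(self):
        return hash(self.seconds)


start = SecondsTime(1000)
end = SecondsTime(1090.5)
delta = end - start          # deltaobject = timeobject - timeobject
restored = start + delta     # timeobject = timeobject + deltaobject
```

here the delta converts to a float and can be added back to a time
object, which is all the abstract type asks for.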

> > I'd also like to see simple access methods for year,
> > month, day, hours, minutes, and seconds, with date parts
> > being one based and time parts being zero based.
> 
> In the abstract base type ?

Q. does mxDateTime provide separate accessors for individual
members?

</F>



From fredrik@pythonware.com  Tue Feb 26 20:52:48 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 26 Feb 2002 21:52:48 +0100
Subject: [Python-Dev] proposal: add basic money type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <3C7BF0B8.84C89070@lemburg.com>
Message-ID: <07c101c1bf07$8fbe1540$ced241d5@hagrid>

mal wrote:

> > I propose adding an "abstract" money base type to the standard
> > library, to be subclassed by real money/decimal implementations.
> > 
> >     if isinstance(v, basemoney):
> >         # yay! it's money
> >         print float(money) # let's hope it's not too much
> > 
> > The goal is not to standardize any behaviour beyond this; anything
> > else should be provided by subtypes.
> > 
> > More details here:
> > 
> >     http://effbot.org/ideas/money-type.htm
> > 
> > I can produce PEP and patch if necessary.
> 
> Sounds like a plan. 
> 
> One thing though: the RE "[+|-]?\d+(.\d+)?" should be extended 
> to allow for currency symbols and names in front or after the
> monetary value.

isn't this better handled by a separate method/attribute?

(otherwise, I fear that we'll end up adding all possible currency
notations to the abstract type.  but maybe there is a standard
for this, somewhere?)

</F>



From guido@python.org  Tue Feb 26 20:54:30 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 15:54:30 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 21:42:02 +0100."
 <075f01c1bf06$1383d380$ced241d5@hagrid>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com>
 <075f01c1bf06$1383d380$ced241d5@hagrid>
Message-ID: <200202262054.g1QKsUk19953@pcp742651pcs.reston01.va.comcast.net>

> *all* early reviewers argued that time zones are a representation
> thingie, and don't belong in the abstract type.

Then maybe the reviewers didn't have a sufficiently wide range of
applications in mind?

If I were to create a database of email messages, I'd be seriously
annoyed if it normalized the timezone info away.  A message sent at
10pm EST has a different feel to it than one sent at 4am MET.  It
should *sort* on UTC, but it should use the original timezone to
display the dates.
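
A small illustration of the distinction, using the datetime module from
present-day Python 3 (which postdates this thread).  The fixed UTC
offsets for EST and MET are assumptions made for the example:

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zones, assumed for illustration: EST = UTC-5, MET = UTC+1.
EST = timezone(timedelta(hours=-5), "EST")
MET = timezone(timedelta(hours=1), "MET")

sent_est = datetime(2002, 2, 26, 22, 0, tzinfo=EST)  # 10pm EST
sent_met = datetime(2002, 2, 27, 4, 0, tzinfo=MET)   # 4am MET
earlier = datetime(2002, 2, 26, 21, 0, tzinfo=MET)

# Aware datetimes compare by the instant they denote, i.e. on UTC.
same_instant = sent_est == sent_met

# Sorting uses that same UTC ordering...
ordered = sorted([sent_met, earlier, sent_est])

# ...while formatting still shows each message's original zone.
shown = [d.strftime("%H:%M %Z") for d in ordered]
```

The two messages denote the same instant, so they sort together, yet
each keeps its own local time and zone name for display.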

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@zope.com  Tue Feb 26 20:53:01 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 15:53:01 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid>
 <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7BF5AD.45BACF05@zope.com>

Guido van Rossum wrote:
> 
> > Does this imply leap second hell, or will we
> > simply be vague about expectations?
> 
> IMO, leap seconds should be ignored.  Time stands still during a leap
> second.  Consider this a BDFL pronouncement if you wish. :-)
> 
> > I'd also like to see simple access methods for year,
> > month, day, hours, minutes, and seconds,
> 
> The timetuple() method provides access to all of these
> simultaneously.  Isn't that enough? 

>From a minimalist point of view, yes, but from a usability point 
of view, no.

> t.year() could be spelled as
> t.timetuple()[0]. 

Yes, but t.year() is a lot more readable.

> I expect that usually you'd request several of
> these together anyway, in order to do some fancy formatting, so the
> timetuple() approach makes sense.

I find the time tuples to be really inconvenient. I *always*
have to slice off the parts I don't want, which I find annoying.

Hm, now that I mention the extra parts, it seems kind of silly
to make implementors of the type come up with weekday, julian day, and
a daylight-savings flag. This time format is really biased by
the C time library, which is fine for a module that wraps the C library
but seems a bit silly for a standard date-time interface.

> > with date parts
> > being one based and time parts being zero based.
> 
> I'm not sure what you mean here.

Years, months, and days should start from 1.
Hours, minutes and seconds should start from 0.

One confusion I often have with time tuples is that I know
too much about C's time struct, which numbers months from 0
and which has years since 1900. 
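
To make the difference concrete, here is a short check against Python's
own time module (present-day Python 3 syntax).  The c_style value is
only a reconstruction of what C's struct tm would hold for the same
instant, computed by hand rather than read from C:

```python
import time

# The Unix epoch, broken down in UTC.
epoch = time.gmtime(0)
year, month, day, hour, minute, second = epoch[:6]

# Python's time tuple: full year, month and day counted from 1,
# hours, minutes and seconds counted from 0.
python_style = (year, month, day, hour, minute, second)

# What C's struct tm would store for the same instant:
# years since 1900 and months counted from 0.
c_style = (year - 1900, month - 1)
```

The slice at the end is the kind of trimming described above: the
remaining fields are weekday, day of year and the DST flag.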

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org


From guido@python.org  Tue Feb 26 20:58:57 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 15:58:57 -0500
Subject: [Python-Dev] proposal: add basic money type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 21:31:52 +0100."
 <3C7BF0B8.84C89070@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid>
 <3C7BF0B8.84C89070@lemburg.com>
Message-ID: <200202262058.g1QKwvA19990@pcp742651pcs.reston01.va.comcast.net>

> Currency for money is a bit like timezones for datetime,
> so it's a good idea not to add it to the base type 
> interface. However, the interface should be extendable
> to include currency information.

Currency is much worse than timezones -- once you are interested in
exchange rates, you need to know *when* to calculate the exchange rate
(as well as other parameters such as which exchange rate).

So please let's keep the currency out of the money type; it's utterly
application dependent what to do with that information.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Tue Feb 26 21:00:40 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 26 Feb 2002 15:00:40 -0600
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net>
 <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
 <3C7B9617.D22E449F@zope.com>
 <065201c1bef8$f3e03ad0$ced241d5@hagrid>
 <3C7BE7CD.EAA8BA24@zope.com>
 <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15483.63352.850857.833197@12-248-41-177.client.attbi.com>

    Guido> The timetuple() method provides access to all of these
    Guido> simultaneously.  Isn't that enough?  t.year() could be spelled as
    Guido> t.timetuple()[0].

Since we're discussing an abstract type it probably doesn't apply directly,
but perhaps timetuple() could be specified to return a super-tuple like
os.stat() does...
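
Something along these lines, sketched in present-day Python 3.  The
TimeTuple name and its field list are made up for the example; only the
os.stat() behaviour is real:

```python
import os
import stat
from collections import namedtuple

# Hypothetical "super-tuple": indexable like a plain tuple, but each
# position also has a name.
TimeTuple = namedtuple("TimeTuple", "year month day hour minute second")

t = TimeTuple(2002, 2, 26, 15, 0, 40)
by_index = t[0]
by_name = t.year
date_part = t[:3]

# os.stat() results already work this way: index and attribute agree.
st = os.stat(os.curdir)
sizes_agree = st[stat.ST_SIZE] == st.st_size
```

Old code that indexes or slices the tuple keeps working, and new code
can use the names.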

Skip


From jepler@unpythonic.dhs.org  Tue Feb 26 21:01:43 2002
From: jepler@unpythonic.dhs.org (jepler@unpythonic.dhs.org)
Date: Tue, 26 Feb 2002 15:01:43 -0600
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7BE70B.74713ED2@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de> <15483.53993.852170.135298@anthem.wooz.org> <m3n0xww01p.fsf@mira.informatik.hu-berlin.de> <3C7BE70B.74713ED2@lemburg.com>
Message-ID: <20020226150143.C16980@unpythonic.dhs.org>

On Tue, Feb 26, 2002 at 08:50:35PM +0100, M.-A. Lemburg wrote:
> Does anybody know where XEmacs is moving w/r to this ? (and
> for that matter what about vi, vim, etc. ?)

I'm working with Vim 6.0, 2001 Sep 14.

VIM lets you set variables with text similar to
	vim:KEY=VALUE:KEY=VALUE:....:
Apparently you would use
	vim:fileencoding=sjis:
to select shift-jis encoding.  In the vim style, it seems most common to
place this at the bottom of a file, but it can be placed at the top too.
The variable "modelines" controls how many lines at each end of the file are
inspected, with the default being 5.  It's documented that the form
	vi:set KEY=VALUE:
may be compatible with "some versions of Vi" but does not say which.  (I
can't get this to work)

You can set a list of encodings to attempt when a file is loaded, which
defaults to "ucs-bom,utf-8,latin1".  A user who wanted to treat
non-unicode files as shift-jis by default would
	:set fileencodings=ucs-bom,utf-8,sjis
You can also load a particular file with the ++enc parameter:
	:edit ++enc=koi8-r russian.txt
(I can get this to work, but I have to do it manually to load anything in
an odd character set)

The emacs line is harmless in vim, but doesn't do anything.  It's possible
that using :autocmd someone could make vim use the emacs line to set
encoding, but I'm not sure -- setting fileencoding after a file is loaded
seems to perform a translation from the old characterset to the new.
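
For what it's worth, a tool other than vim could pick up the same hint
with something like the sketch below (present-day Python 3).  The
function name and the regular expression are my own guesses at a
reasonable approximation; this is not how vim itself parses modelines:

```python
import re

# Rough approximation of a "vim:...fileencoding=NAME:" modeline.
MODELINE = re.compile(r"\bvim:.*\bfileencoding=([\w.-]+)")


def find_vim_fileencoding(text, modelines=5):
    """Return the fileencoding named in a vim-style modeline, or None.

    Only the first and last `modelines` lines are inspected, mirroring
    the default of 5 described above.
    """
    if modelines <= 0:
        return None
    lines = text.splitlines()
    candidates = lines[:modelines] + lines[-modelines:]
    for line in candidates:
        match = MODELINE.search(line)
        if match:
            return match.group(1)
    return None


top = find_vim_fileencoding("# vim:fileencoding=sjis:\nprint(1)\n")
# A modeline buried in the middle of a long file is not seen.
body = ["x = 1"] * 10 + ["# vim:fileencoding=koi8-r:"] + ["y = 2"] * 10
middle = find_vim_fileencoding("\n".join(body))
```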

Jeff


From fdrake@acm.org  Tue Feb 26 21:01:12 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 26 Feb 2002 16:01:12 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202262054.g1QKsUk19953@pcp742651pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net>
 <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net>
 <3C7B9617.D22E449F@zope.com>
 <065201c1bef8$f3e03ad0$ced241d5@hagrid>
 <3C7BE7CD.EAA8BA24@zope.com>
 <075f01c1bf06$1383d380$ced241d5@hagrid>
 <200202262054.g1QKsUk19953@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15483.63384.640568.56380@grendel.zope.com>

Guido van Rossum writes:
 > 10pm EST has a different feel to it than one sent at 4am MET.  It
 > should *sort* on UTC, but it should use the original timezone to
 > display the dates.

Sounds like a user preference, not a universal truth.

Is it important that the timezone is part of the date/time type,
though?  Is it important that it be part of the abstract base
date/time?

Specific implementations should certainly be able to add support for
timezones, and perhaps some hypothetical default date/time type should
include it for convenience, but that doesn't tell me it's fundamental.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From tim@zope.com  Tue Feb 26 21:05:10 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 26 Feb 2002 16:05:10 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <3C7B9CA1.B7798317@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEJBNPAA.tim@zope.com>

[M.-A. Lemburg]
>>> The other possibility would be adding a set of new types
>>> to mxDateTime which focus on low memory requirements rather
>>> than data roundtrip safety and speed.

[Jim Fulton]
>> What is data roundtrip safety?

[MAL]
> Roundtrip safety means that e.g. if you take a COM date value
> from an ADO and create a DateTime object with it, you can
> be sure to get back the exact same value via the COMDate()
> method.
>
> The same is true for broken down values and, of course,
> the internal values .absdate and .abstime.

I'm unsure whether Jim is aware of this, but if not he should be:  the
non-trivial time I spent over the last week repairing test failures in
Zope's current DateTime.py was all spent finding & repairing basic roundtrip
failures.  These came in two flavors:

1.
        dt = DateTime()

        dt2 = DateTime(
            dt.year(),
            dt.month(),
            dt.day(),
            dt.hour(),
            dt.minute(),
            dt.second())

        assert dt == dt2

could fail, depending on the exact time you tried it.  This was the root
cause of many test failures (they were usually reported as failures after
doing some sort of arithmetic first, but the arithmetic was actually
irrelevant:  when these failed, the base objects didn't match from the
start).

2. Failure of roundtrip conversion between DateTime objects D and repr(D),
then back again, to reproduce the original DateTime or string.

"Floating-point" got the generic blame for these things under the covers,
but it was really people's spectacular inability to deal with *binary*
floating-point that caused all the problems.  People just can't help but
see, e.g., "50.327" as an exact decimal value, so just can't help writing
code that assumes false correlates (such as, e.g., that int((50.327 - 50) *
1000) will return 327; but it doesn't; it returns 326).  If we were using
decimal floating-point instead, the numerically naive code here would have
worked fine.
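
To make the 326-versus-327 point concrete, here is a minimal sketch.  It
assumes IEEE 754 doubles, and it uses the later `decimal` module purely as a
stand-in for "decimal floating-point"; the variable names are illustrative.

```python
from decimal import Decimal

# Binary floating-point: 50.327 has no exact binary representation, so the
# stored value is slightly below 50.327 and the scaled fraction falls just
# short of 327 before int() truncates it.
binary_millis = int((50.327 - 50) * 1000)

# Decimal arithmetic keeps "50.327" exact, so the naive code works.
decimal_millis = int((Decimal("50.327") - 50) * 1000)

print(binary_millis, decimal_millis)   # 326 327
```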

> ...
> DateTime objects use .abstime and .absdate for doing
> arithmetic since these provide the best accuracy. The most
> important broken down values are calculated once at creation
> time; a few others are done on-the-fly.

Note that 2.2 properties allow natural support of calculated attributes, and
that a computed attribute can easily arrange to cache its value.
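
A rough sketch of a computed attribute that caches its value.  The class and
attribute names are hypothetical (this is neither mxDateTime nor Zope's
DateTime), and the decorator spelling is newer than 2.2, where the same thing
would be written as weeks = property(get_weeks).

```python
class Stamp(object):
    """Hypothetical date-time-like class used only for illustration."""

    def __init__(self, absdays):
        self.absdays = absdays      # days since an arbitrary epoch
        self._weeks = None          # cache slot, filled on first access
        self.computations = 0       # only here to show the cache working

    @property
    def weeks(self):
        # Compute the derived value once, then reuse the cached result.
        if self._weeks is None:
            self.computations += 1
            self._weeks = self.absdays // 7
        return self._weeks


stamp = Stamp(15)
first, second = stamp.weeks, stamp.weeks   # second access hits the cache
```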

> I suppose that I could easily make a few calculations
> lazy to enhance speed; memory footprint would not change
> though. It's currently at 56 bytes per DateTime object
> and 36 bytes per DateTimeDelta object.

I'm assuming that counts Python object header overhead, but does not count
hidden malloc overhead.  Switching to pymalloc would slash the latter.

> To get similar accuracy in Python, you'd need a float and
> an integer per object, that's 16 bytes + 12 bytes == 28 bytes
> + malloc() overhead for the two and the wrapping instance
> which gives another 32 bytes (provided you store the two
> objects in slots)... >60 bytes per Python based date time
> object.

Just noting that a Zope DateTime instance is huge, with a dozen named
attributes, one of which holds a Python long (unbounded integer).



From mal@lemburg.com  Tue Feb 26 21:06:15 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 22:06:15 +0100
Subject: [Python-Dev] proposal: add basic money type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net>
 <3C7BF117.D7417F7D@lemburg.com> <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7BF8C7.DECB7D38@lemburg.com>

Guido van Rossum wrote:
> 
> > > > I propose adding an "abstract" money base type to the standard
> > > > library, to be subclassed by real money/decimal implementations.
> > >
> > > Why do we need this?  I guess that would be Question #1...
> >
> > For databases ?! The DB API has long had a monetary or at least
> > decimal type on its plate... never got around to implementing
> > one, though :-)
> 
> I can only find one reference to money or decimal in the DB API PEP,
> and that's as a future task.  I guess that's what you mean by "on its
> plate".

Exactly :-)
 
> Since I'm not a database expert, maybe you can explain the use of this
> in more detail?  And why would we need a monetary type rather than a
> fixed-point decimal type?

A decimal would do as well, I suppose, at least in
terms of storing the raw value. The reason for trying
to come up with a monetary type is to make operations between
monetary values having two different currencies illegal. 
Coercion between two of those would always have to be 
made explicit (for obvious reasons).
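
As a sketch of what "illegal" could mean here: the Money and CurrencyMismatch
names below are hypothetical, the later `decimal` module stands in for
whatever decimal implementation would hold the raw value, and the conversion
rate is something the caller has to supply.

```python
from decimal import Decimal


class CurrencyMismatch(TypeError):
    """Raised when two Money values with different currencies are combined."""


class Money(object):
    """Hypothetical monetary type: a decimal amount tagged with a currency."""

    def __init__(self, amount, currency):
        self.amount = Decimal(amount)
        self.currency = currency

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        if other.currency != self.currency:
            raise CurrencyMismatch(
                "cannot add %s to %s" % (other.currency, self.currency))
        return Money(self.amount + other.amount, self.currency)

    def convert(self, currency, rate):
        # Conversion is always explicit; the caller supplies the rate.
        return Money(self.amount * Decimal(rate), currency)


total = Money("1.50", "EUR") + Money("2.25", "EUR")
```

Adding Money("1.00", "USD") to total raises CurrencyMismatch unless one side
is passed through convert() first.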

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From guido@python.org  Tue Feb 26 21:07:48 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 16:07:48 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 15:53:01 EST."
 <3C7BF5AD.45BACF05@zope.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net>
 <3C7BF5AD.45BACF05@zope.com>
Message-ID: <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net>

> > The timetuple() method provides access to all of these
> > simultaneously.  Isn't that enough? 
> 
> From a minimalist point of view, yes, but from a usability point 
> of view, no.
> 
> > t.year() could be spelled as
> > t.timetuple()[0]. 
> 
> Yes, but t.year() is a lot more readable.

When do you ever use this in isolation?  I'd expect in 99% of the
cases you hand it off to a formatting routine, and guess what --
strftime() takes a time tuple.
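
For instance, with a hand-written nine-field time tuple (the values are just
an example, not taken from any real object):

```python
import time

# A nine-field time tuple: year, month, day, hour, minute, second,
# weekday (Monday == 0), day of the year, DST flag.
stamp = (2002, 2, 26, 16, 7, 48, 1, 57, 0)

year = stamp[0]                                  # the t.timetuple()[0] spelling
text = time.strftime("%Y-%m-%d %H:%M", stamp)    # formatting needs no unpacking
print(year, text)                                # 2002 2002-02-26 16:07
```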

I worry about the time wasted by calling all of t.year(), t.month(),
t.day() (etc.) -- given that they do so little, the call overhead is
probably near 100%.

I wonder how often this is needed.  The only occurrences of year() in
the entire Zope source that I found are in various test routines.

> > I expect that usually you'd request several of
> > these together anyway, in order to do some fancy formatting, so the
> > timetuple() approach makes sense.
> 
> I find the time tuples to be really inconvenient. I *always*
> have to slice off the parts I don't want, which I find annoying.

Serious question: what do you tend to do with time values?  I imagine
that once we change strftime() to accept an abstract time object,
you'll never need to call either timetuple() or year() -- strftime()
will do it for you.

> Hm, now that I mention the extra parts, it seems kind of silly
> to make implementors of the type come up with weekday, julian day, and
> a daylight-savings flag. This time format is really biased by
> the C time library, which is fine for a module that wraps the C library
> but seems a bit silly for a standard date-time interface.

That's why /F's pre-PEP allows the implementation to leave these
three set to -1.

> > > with date parts
> > > being one based and time parts being zero based.
> > 
> > I'm not sure what you mean here.
> 
> Years, months, and days should start from 1.
> Hours, minutes and seconds should start from 0.
> 
> One confusion I often have with time tuples is that I know
> too much about C's time struct, which numbers months from 0
> and which has years since 1900. 

I guess that confusion is yours alone.  In Python, of course month and
day start from 1.  Whether years start from 1 is a theological
question. :-)
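
A quick check of the numbering, assuming the usual 1970 epoch:

```python
import time

epoch = time.gmtime(0)   # 1 January 1970, 00:00:00 UTC

# Python reports the full year and a one-based month and day, whereas C's
# struct tm would hold tm_year == 70 and tm_mon == 0 for the same instant.
print(epoch.tm_year, epoch.tm_mon, epoch.tm_mday)   # 1970 1 1
```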

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Feb 26 21:11:34 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 16:11:34 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 16:01:12 EST."
 <15483.63384.640568.56380@grendel.zope.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <075f01c1bf06$1383d380$ced241d5@hagrid> <200202262054.g1QKsUk19953@pcp742651pcs.reston01.va.comcast.net>
 <15483.63384.640568.56380@grendel.zope.com>
Message-ID: <200202262111.g1QLBZA20104@pcp742651pcs.reston01.va.comcast.net>

> Guido van Rossum writes:
>  > 10pm EST has a different feel to it than one sent at 4am MET.  It
>  > should *sort* on UTC, but it should use the original timezone to
>  > display the dates.

[Fred]
> Sounds like a user preference, not a universal truth.

Fair enough.  Even some of my well-traveled friends cannot do timezone
arithmetic in their head... :-)

> Is it important that the timezone is part of the date/time type,
> though?  Is it important that it be part of the abstract base
> date/time?
> 
> Specific implementations should certainly be able to add support for
> timezones, and perhaps some hypothetical default date/time type should
> include it for convenience, but that doesn't tell me it's fundamental.

I guess I want it to be possible to have an implementation that keeps
track of the timezone as entered.

It's true that time deltas are a nightmare when dealing with different
timezones.
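
A sketch of "sort on UTC, display in the original timezone".  It relies on
the `datetime` module, which did not exist when this was written, and the
fixed EST/MET offsets are assumptions that ignore daylight saving.

```python
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5), "EST")
MET = timezone(timedelta(hours=1), "MET")

sent_est = datetime(2002, 2, 26, 22, 0, tzinfo=EST)    # 10pm EST
sent_met = datetime(2002, 2, 27, 4, 0, tzinfo=MET)     # 4am MET, next day
sent_later = datetime(2002, 2, 27, 3, 30, tzinfo=timezone.utc)

# Aware datetimes compare by the instant they denote, so sorting is
# effectively "on UTC" while each value still remembers its own zone.
ordered = sorted([sent_later, sent_met, sent_est])
for stamp in ordered:
    print(stamp.strftime("%Y-%m-%d %H:%M %Z"))
```

The first two values denote the same instant, so their difference is a zero
timedelta even though they display differently.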

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Feb 26 21:16:31 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 16:16:31 -0500
Subject: [Python-Dev] proposal: add basic money type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 22:06:15 +0100."
 <3C7BF8C7.DECB7D38@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> <3C7BF117.D7417F7D@lemburg.com> <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net>
 <3C7BF8C7.DECB7D38@lemburg.com>
Message-ID: <200202262116.g1QLGVN20144@pcp742651pcs.reston01.va.comcast.net>

> A decimal would do as well, I suppose, at least in
> terms of storing the raw value. The reason for trying
> to come up with a monetary type is to make operations between
> monetary values having two different currencies illegal. 
> Coercion between two of those would always have to be 
> made explicit (for obvious reasons).

Are you sure you're trying to solve a real problem here?  There are
lots of operations on monetary values that make no sense (try
multiplying two amounts of money).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Tue Feb 26 21:26:47 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 22:26:47 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net>
 <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7BFD97.B69DDDFD@lemburg.com>

Guido van Rossum wrote:
> 
> > > The timetuple() method provides access to all of these
> > > simultaneously.  Isn't that enough?
> >
> > From a minimalist point of view, yes, but from a usability point
> > of view, no.
> >
> > > t.year() could be spelled as
> > > t.timetuple()[0].
> >
> > Yes, but t.year() is a lot more readable.
> 
> When do you ever use this in isolation?  I'd expect in 99% of the
> cases you hand it off to a formatting routine, and guess what --
> strftime() takes a time tuple.

FWIW, mxDateTime exposes these values as attributes -- there
is no call overhead.
 
> I worry about the time wasted by calling all of t.year(), t.month(),
> t.day() (etc.) -- given that they do so little, the call overhead is
> probably near 100%.
> 
> I wonder how often this is needed.  The only occurrences of year() in
> the entire Zope source that I found are in various test routines.

I actually use the attributes quite a bit in my stuff
and hardly ever use .strftime(). The mxDateTime is different
though, e.g. I sometimes get questions about how to make
strftime() output fractions of a second (doesn't work, because
strftime() doesn't support it).
 
> > > I expect that usually you'd request several of
> > > these together anyway, in order to do some fancy formatting, so the
> > > timetuple() approach makes sense.
> >
> > I find the time tuples to be really inconvenient. I *always*
> > have to slice off the parts I don't want, which I find annoying.
> 
> Serious question: what do you tend to do with time values?  I imagine
> that once we change strftime() to accept an abstract time object,
> you'll never need to call either timetuple() or year() -- strftime()
> will do it for you.

Depends on the application space. Database applications
will call .timetuple() very often and use strftime() hardly
ever.
 
> > > > with date parts
> > > > being one based and time parts being zero based.
> > >
> > > I'm not sure what you mean here.
> >
> > Years, months, and days should start from 1.
> > Hours, minutes and seconds should start from 0.
> >
> > One confusion I often have with time tuples is that I know
> > too much about C's time struct, which numbers months from 0
> > and which has years since 1900.
> 
> I guess that confusion is yours alone.  In Python, of course month and
> day start from 1.  Whether years start from 1 is a theological
> question. :-)

It's not really a question: the year 0 simply does not exist 
in reality ! (Christians didn't have a 0 available at the 
time ;-)

Still, historic dates are usually referenced by making year 0 ==
1 b.c., -1 == 2 b.c., etc.
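
In code, that convention looks roughly like this (era_label is a hypothetical
helper name, not part of mxDateTime):

```python
def era_label(year):
    """Format a year number that uses the 0 == 1 b.c. convention."""
    if year <= 0:
        # 0 -> 1 b.c., -1 -> 2 b.c., and so on.
        return "%d b.c." % (1 - year)
    return "%d a.d." % year


print(era_label(0), era_label(-1), era_label(2002))
```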

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jim@zope.com  Tue Feb 26 21:23:42 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 16:23:42 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net>
 <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7BFCDE.F920078F@zope.com>

Guido van Rossum wrote:
> 
> > > The timetuple() method provides access to all of these
> > > simultaneously.  Isn't that enough?
> >
> > From a minimalist point of view, yes, but from a usability point
> > of view, no.
> >
> > > t.year() could be spelled as
> > > t.timetuple()[0].
> >
> > Yes, but t.year() is a lot more readable.
> 
> When do you ever use this in isolation?  I'd expect in 99% of the
> cases you hand it off to a formatting routine, and guess what --
> strftime() takes a time tuple.
> 
> I worry about the time wasted by calling all of t.year(), t.month(),
> t.day() (etc.) -- given that they do so little, the call overhead is
> probably near 100%.
> 
> I wonder how often this is needed.  The only occurrences of year() in
> the entire Zope source that I found are in various test routines.

These methods and others are used a lot in presentation code, 
which tends to be expressed in DTML or ZPT.

It's not uncommon to select/categorize things by year or month.

I think most people would find individual date-part methods
a lot more natural than tuples.

> > > I expect that usually you'd request several of
> > > these together anyway, in order to do some fancy formatting, so the
> > > timetuple() approach makes sense.
> >
> > I find the time tuples to be really inconvenient. I *always*
> > have to slice off the parts I don't want, which I find annoying.
> 
> Serious question: what do you tend to do with time values? 

I format them in various ways and I sort them.

> I imagine
> that once we change strftime() to accept an abstract time object,
> you'll never need to call either timetuple() or year() -- strftime()
> will do it for you.

Maybe, if I use strftime, but I don't use strftime all that much.
I can certainly think of even formatting cases (e.g. internationalized
dates) where it's not adequate.
 
> > Hm, now that I mention the extra parts, it seems kind of silly
> > to make implementors of the type come up with weekday, julian day, and
> > a daylight-savings flag. This time format is really biased by
> > the C time library, which is fine for a module that wraps the C library
> > but seems a bit silly for a standard date-time interface.
> 
> That's why /F's pre-PEP allows the implementation to leave these
> three set to -1.

I missed these. Still, providing -1s seems, uh, vestigial. 
 
> > > > with date parts
> > > > being one based and time parts being zero based.
> > >
> > > I'm not sure what you mean here.
> >
> > Years, months, and days should start from 1.
> > Hours, minutes and seconds should start from 0.
> >
> > One confusion I often have with time tuples is that I know
> > too much about C's time struct, which numbers months from 0
> > and which has years since 1900.
> 
> I guess that confusion is yours alone.  In Python, of course month and
> day start from 1.  Whether years start from 1 is a theological
> question. :-)

I doubt the confusion is mine alone, but I'll take your word for it.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org


From mal@lemburg.com  Tue Feb 26 21:32:31 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 22:32:31 +0100
Subject: [Python-Dev] proposal: add basic money type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> <3C7BF117.D7417F7D@lemburg.com> <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net>
 <3C7BF8C7.DECB7D38@lemburg.com> <200202262116.g1QLGVN20144@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7BFEEF.16BDC177@lemburg.com>

Guido van Rossum wrote:
> 
> > A decimal would do as well, I suppose, at least in
> > terms of storing the raw value. The reason for trying
> > to come up with a monetary type is to make operations between
> > monetary values having two different currencies illegal.
> > Coercion between two of those would always have to be
> > made explicit (for obvious reasons).
> 
> Are you sure you're trying to solve a real problem here?  There are
> lots of operations on monetary values that make no sense (try
> multiplying two amounts of money).

Indeed, monetary types solve different problems than decimal 
types. Financial applications do have a need for this kind
of implicit error check.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From guido@python.org  Tue Feb 26 21:37:31 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 16:37:31 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 16:23:42 EST."
 <3C7BFCDE.F920078F@zope.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net>
 <3C7BFCDE.F920078F@zope.com>
Message-ID: <200202262137.g1QLbWq20272@pcp742651pcs.reston01.va.comcast.net>

[me]
> > I wonder how often this is needed.  The only occurrences of year() in
> > the entire Zope source that I found are in various test routines.

[Jim]
> These methods and others are used a lot in presentation code, 
> which tends to be expressed in DTML or ZPT.
> 
> It's not uncommon to select/categorize things by year or month.
> 
> I think most people would find individual date-part methods
> a lot more natural than tuples.

OK, that explains a lot.  For this context I agree, although I think
they should probably appear as (computed) attributes rather than
methods.  Properties seem perfect.
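
Something along these lines, as a sketch only; the TimeStamp class is made up
for illustration and says nothing about what the eventual type would look
like:

```python
import time


class TimeStamp(object):
    """Hypothetical time object built on a seconds-since-epoch value."""

    def __init__(self, seconds):
        self._tuple = time.gmtime(seconds)

    def timetuple(self):
        return self._tuple

    # Read-only computed attributes: t.year rather than t.year().
    year = property(lambda self: self._tuple[0])
    month = property(lambda self: self._tuple[1])
    day = property(lambda self: self._tuple[2])


t = TimeStamp(0)
print(t.year, t.month, t.day)   # no call syntax needed in presentation code
```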

> > I imagine
> > that once we change strftime() to accept an abstract time object,
> > you'll never need to call either timetuple() or year() -- strftime()
> > will do it for you.
> 
> Maybe, if I use strftime, but I don't use strftime all that much.

Maybe you should. :-)

> I can certainly think of even formatting cases (e.g. internationalized
> dates) where it's not adequate.

Then a super-strftime() should be invented that *is* enough, rather
than fumbling with hand-coded solutions.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de  Tue Feb 26 21:36:58 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 26 Feb 2002 22:36:58 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> > This makes Latin-1 the right choice:
> > 
> > * Unicode literals already use it today
> 
> But they shouldn't, IMO.

I agree. I recommend deprecating this feature and raising a
DeprecationWarning if a Unicode literal contains non-ASCII characters
but no encoding has been declared.

> Sorry, I don't understand what you're trying to say here.  Can you
> explain this with an example?  Why can't we require any program
> encoded in more than pure ASCII to have an encoding magic comment?  I
> guess I don't understand what you mean by "raw binary".

With the proposed implementation, the encoding declaration is only
used for Unicode literals. In all other places where non-ASCII
characters can occur (comments, string literals), those characters are
treated as "bytes", i.e. it is not verified that these bytes are
meaningful under the declared encoding.

Marc's original proposal was to apply the declared encoding to the
complete source code, but I objected claiming that it would make the
tokenizer changes more complex, and the resulting tokenizer likely
significantly slower (at least if you use the codecs API to perform the
decoding).

In phase 2, the encoding will apply to all strings. So it will not be
possible to put arbitrary byte sequences in a string literal, at least
if the encoding disallows certain byte sequences (like UTF-8, or
ASCII). Since this is currently possible, we have a backwards
compatibility problem.
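
To illustrate which byte sequences an encoding can disallow, here is a small
sketch written with present-day bytes/str semantics (an assumption; nothing
here reflects the proposed tokenizer changes):

```python
raw = b"caf\xe9"   # "cafe" with an e-acute, as Latin-1 bytes

# Latin-1 maps every byte to a character, so decoding cannot fail.
as_latin1 = raw.decode("latin-1")

# UTF-8 rejects the same bytes: 0xE9 starts a multi-byte sequence that
# never completes.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(as_latin1), utf8_ok)
```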

Regards,
Martin



From guido@python.org  Tue Feb 26 21:40:42 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 16:40:42 -0500
Subject: [Python-Dev] proposal: add basic money type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 22:32:31 +0100."
 <3C7BFEEF.16BDC177@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> <3C7BF117.D7417F7D@lemburg.com> <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net> <3C7BF8C7.DECB7D38@lemburg.com> <200202262116.g1QLGVN20144@pcp742651pcs.reston01.va.comcast.net>
 <3C7BFEEF.16BDC177@lemburg.com>
Message-ID: <200202262140.g1QLehH20305@pcp742651pcs.reston01.va.comcast.net>

> Indeed, monetary types solve different problems than decimal 
> types. Financial applications do have a need for this kind
> of implicit error check.

But this is easily done by creating a custom class -- which has the
advantage that the set of constraints can be specialized to the needs
of a specific application.  When we add a monetary type to the
language we'll never get it right for all apps.  OTOH, I think we
could get a fixed point type right.

How many other languages have a monetary type?

What support for money does SQL have?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Feb 26 21:53:55 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 16:53:55 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: Your message of "26 Feb 2002 22:36:58 +0100."
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200202262153.g1QLrt220437@pcp742651pcs.reston01.va.comcast.net>

> In phase 2, the encoding will apply to all strings. So it will not be
> possible to put arbitrary byte sequences in a string literal, at least
> if the encoding disallows certain byte sequences (like UTF-8, or
> ASCII). Since this is currently possible, we have a backwards
> compatibility problem.

I would say that any program that currently uses non-ASCII in string
literals (whether Unicode or 8-bit literals) is strictly speaking
undefined.  For cases where a specific encoding is used, the solution
is easy: add an explicit encoding.  Other cases are simply garbage and
should use \xDD escapes instead.

Maybe an implementation phase 1a should be introduced that warns about
the occurrence of non-ASCII characters anywhere in the source code
when no encoding is specified.
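
A rough sketch of such a check.  The helper name is hypothetical, the regular
expression is only a simplified approximation of a magic-comment pattern, and
a real implementation would live in the tokenizer rather than in a
stand-alone function like this.

```python
import re

# Simplified approximation of the encoding magic comment (an assumption).
CODING_RE = re.compile(br"coding[:=]\s*([-\w.]+)")


def undeclared_non_ascii(source):
    """Return 1-based numbers of lines holding non-ASCII bytes, or [] if
    the source declares an encoding in one of its first two lines.

    `source` is the raw bytes of a Python file.
    """
    lines = source.splitlines()
    for line in lines[:2]:
        if line.lstrip().startswith(b"#") and CODING_RE.search(line):
            return []
    return [number for number, line in enumerate(lines, 1)
            if any(byte > 127 for byte in line)]


print(undeclared_non_ascii(b"x = 1\ns = '\xe9'\n"))
```

Each reported line number would then be a candidate for a warning.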

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Nikolai <trastme@honkong.com>  Tue Feb 26 19:12:01 2002
From: Nikolai <trastme@honkong.com> (Nikolai)
Date: Tue, 26 Feb 2002 23:12:01 +0400
Subject: [Python-Dev] =?koi8-r?B?8NLP3tTJ1MUsIM3P1sXULCDc1M8g98HbINvBztMuIPXEwczJ1NggzsXJy8/H?=
 =?koi8-r?B?xMEgzsUg0M/axM7PLg==?=
Message-ID: <147594609.20020226231201@honkong.com>

SGVsbG8sDQoNCg0KICAgICDtxc7RINrP19XUIO7Jy8/MwcouIOna18nOydTFINrBIMLF09DPy8/K
09TXzywgzs8gzsUg1M/Sz9DJ1MXT2CDVxMHM0dTYINzUzyDQydPYzc8gLSDPzs8gzc/WxdQg09nH
0sHU2CDXwdbO1cAg0s/M2CDXDQr3wdvFyiDWydrOySEg6c3Fzs7PINTByyDQ0s/J2s/bzM8g088g
zc7PyiDQz9PMxSDQz8zV3sXOydEgLSDOxcvP1M/Sz8Ug19LFzdEgzsHawcQgLSDUwcvPx88g1sUg
0MnT2M3BLiD2xczBwCDJIPfBzSDOxQ0K1dDV09TJ1Ngg09fPyiDbwc7TISDv4vH64fTl7Pju7yDz
7+jy4e7p9OUg/PTvIPDp8/jt7yDu4SD25fP06+/tIOkg5+ni6+noIOTp8+vh6CENCg0KICAgICD3
0NLP3sXNLCDF08zJIPfZIM7FIMjP1MnUxSDTz9fF0tvFzs7PINrBy8/Ozs8gxM/Qz8zOydTFzNjO
zyDawdLBwsHU2dfB1NggzsXTy8/M2MvPINTZ09HeIMnMySDExdPR1MvP1yDU2dPR3iBVUyQg1w0K
zcXT0cMgzsUgz9TIz8TRIM/UIM3F09TBLCDHxMUg09TPydQg98HbIMvPzdDYwNTF0iAoy8/Oy9LF
1M7B0SDT1c3NwSDC1cTF1CDawdfJ08XU2CDUz8zYy88gz9Qg19LFzcXOySwgy8/Uz9LPxSD32SDa
wdPUwdfJ1MUNCtfB2yDLz83Q2MDUxdIgydPLwdTYIEUtbWFpbCjZKSDJIM/U0NLB18zR1Ngg0M8g
zsnNIMTBzs7PxSDQydPYzc8sIMTB1sUgxdPMySD32SDTwc3JLCDQ0skg3NTPzSwgydrXyc7J1MUs
INPQydTFIC0g1yDMwMLPzQ0K083Z08zFINzUz8fPINPMz9fBKSwg0M/Mzs/T1NjAIM7F2sHXydPJ
zc8gz9Qgy8/HzyDC2SDUzyDOySDC2czPIMkg08/I0sHO0dEgKMXTzMkg98HNINzUzyDC1cTF1CDV
x8/Ezs8pIMHOz87Jzc7P09TYLA0Kyc7XxdPUydLV0SDOwSDT1MHS1MUg19PFx88gMjAgVVMkICjL
z9TP0tnFLCDLINTPzdUg1sUsIM/L1dDR1NPRIMLVy9fBzNjOzyDexdLF2iDOxcTFzMAgyczJIMTX
xSksINPPyNLBztHRINDSySDX08XNINzUz80NCt7J09TVwCDTz9fF09TYICjULssuIPfBzSDOySDS
wdrVIM7FINDSycTF1NPRIM7Jy8/HzyDPws3BztnXwdTYIC0g/PTvIO7lIPDp8uHt6eThIMEgzNEg
7e3tIMnMySwgxMHWxSwg58XSwsHMwcrGIC0g1NXUDQrPws/Hwd3BwNTT0SDX08UsIMvUzyDSxcHM
2M7PINLBws/UwcXUIMLF2iDcy9PQzNXB1MHDyckgItfF0sjBzckiLCDLz9TP0tnIINrExdPYINDS
z9PUzyDOxdQsIM7PIMkgwsXaINXSwdfOyczP18vJLCDLz87F3s7PKSwNCtPQz8vPys7ZyiDTz84g
ySDQz8TPwsHA3cXFINbFzMHOycUsINTPIM3P1sXUxSDIzMHEzs/L0s/Xzs8g1cLJ18HU2CDT18/A
IO7h5OX25PUg7uEg9ez1/vvl7unlIOvh/uXz9PfhIPbp+u7pLCDVzsne1M/WwdEg3NTPDQrQydPY
zc8hDQoNCiAgICAg98HWzs8g2sHNxdTJ1NgsIN7UzyDIz9TRINDPxM/CztnKINDSz8XL1CDawdLP
xMnM09Eg1yDz++EsIPfBzSDQ0sXEzMHHwcXU09EgxcfPINfB0snBztQg8O/s7u/z9PjgIOHk4fD0
6fLv9+Hu7vnqIOsNCvPw5ePp5unr5SDy9fPz6+/x+vn+7u/qIP7h8/TpIOnu9OXy7uX04SAyMDAy




From jim@zope.com  Tue Feb 26 21:50:21 2002
From: jim@zope.com (Jim Fulton)
Date: Tue, 26 Feb 2002 16:50:21 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net>
 <3C7BFCDE.F920078F@zope.com> <200202262137.g1QLbWq20272@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7C031D.4376D05A@zope.com>

Guido van Rossum wrote:
> 
> [me]
> > > I wonder how often this is needed.  The only occurrences of year() in
> > > the entire Zope source that I found are in various test routines.
> 
> [Jim]
> > These methods and others are used a lot in presentation code,
> > which tends to be expressed in DTML or ZPT.
> >
> > It's not uncommon to select/categorize things by year or month.
> >
> > I think most people would find individual date-part methods
> > a lot more natural than tuples.
> 
> OK, that explains a lot.  For this context I agree, although I think
> they should probably appear as (computed) attributes rather than
> methods.  Properties seem perfect.

That's fine with me.
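
A minimal sketch of the computed-attribute idea, assuming a hypothetical
SimpleTime class that stores UTC ticks (neither the class name nor the
storage choice comes from the proposal):

```python
import time


class SimpleTime:
    """Hypothetical time object: stores UTC ticks, exposes date parts as properties."""

    def __init__(self, ticks):
        self._ticks = ticks

    def timetuple(self):
        # time.gmtime() turns seconds since the epoch into a UTC struct_time.
        return time.gmtime(self._ticks)

    @property
    def year(self):
        return self.timetuple().tm_year

    @property
    def month(self):
        return self.timetuple().tm_mon  # one-based

    @property
    def day(self):
        return self.timetuple().tm_mday  # one-based


t = SimpleTime(1014760221)
print(t.year, t.month, t.day)
```

Presentation code can then write t.year instead of t.year() or
t.timetuple()[0], which is the kind of access DTML and ZPT templates favor.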

> > > I imagine
> > > that once we change strftime() to accept an abstract time object,
> > > you'll never need to call either timetuple() or year() -- strftime()
> > > will do it for you.
> >
> > Maybe, if I use strftime, but I don't use strftime all that much.
> 
> Maybe you should. :-)

I do when I can. But it often doesn't meet my needs.
 
> > I can certainly think of even formatting cases (e.g. internationalized
> > dates) where it's not adequate.
> 
> Then a super-strftime() should be invented that *is* enough, rather
> than fumbling with hand-coded solutions.

I think we don't need a one-size-fits-all all-powerful date-time
formatting solution. ;)

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org


From guido@python.org  Tue Feb 26 22:01:59 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 17:01:59 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 16:50:21 EST."
 <3C7C031D.4376D05A@zope.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFCDE.F920078F@zope.com> <200202262137.g1QLbWq20272@pcp742651pcs.reston01.va.comcast.net>
 <3C7C031D.4376D05A@zope.com>
Message-ID: <200202262201.g1QM1x620487@pcp742651pcs.reston01.va.comcast.net>

> I think we don't need a one-size-fits-all all-powerful date-time
> formatting solution. ;)

It's probably impossible to create one, but I think there's also no
reason to require people to invent the wheel over and over.  I've seen
enough broken code attempting to do date/time formatting that I
strongly prefer the creation of a few standard solutions that will
work for most people, rather than only giving people the low-level
bits to work with.

Another thing to consider is that for most apps, the choice of the
date/time format should be taken out of the hands of the programmer
and placed into the hands of the user, through some kind of preference
setting.  I18n and L10n also strongly suggest taking this route.
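
One way to read this, sketched under the assumption of a hypothetical
per-user preference table (USER_DATE_FORMATS and format_date_for are
illustrative names, not an existing API):

```python
import time

# Hypothetical per-user preference store; the keys and formats are illustrative.
USER_DATE_FORMATS = {
    "alice": "%Y-%m-%d",
    "bob": "%d.%m.%Y",
}
DEFAULT_DATE_FORMAT = "%Y-%m-%d"


def format_date_for(user, timetuple):
    """Format a struct_time using the format the user selected, not the programmer."""
    fmt = USER_DATE_FORMATS.get(user, DEFAULT_DATE_FORMAT)
    return time.strftime(fmt, timetuple)


moment = time.gmtime(0)  # 1970-01-01 UTC
print(format_date_for("alice", moment))
print(format_date_for("bob", moment))
```

The application code never assembles the date from year, month, and day
itself; it only looks up a format and hands the time value to strftime().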

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Jack.Jansen@oratrix.com  Tue Feb 26 22:04:16 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Tue, 26 Feb 2002 23:04:16 +0100
Subject: [Python-Dev] Adding lots of basic types to Python
Message-ID: <C67F9572-2B04-11D6-8581-003065517236@oratrix.com>

I can think of a lot more basic types that we could add to 
Python that would make as much sense as currencies: pixels, 
points, geometric figures, (audio) samples, images, ...

But: aren't we really trying to standardise interfaces? The main 
benefit of a basic currency type would be that it defines the 
set of operations allowed on it (add, subtract are fine, divide 
gives a normal number, multiply isn't allowed) much more than 
code sharing.
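
The operation set described above could be sketched like this, assuming a
hypothetical CurrencyAmount class that stores integer minor units (the name
and the representation are illustrative only):

```python
class CurrencyAmount:
    """Hypothetical currency value stored as an integer number of minor units."""

    def __init__(self, minor_units):
        self.minor_units = int(minor_units)

    def __add__(self, other):
        if not isinstance(other, CurrencyAmount):
            return NotImplemented
        return CurrencyAmount(self.minor_units + other.minor_units)

    def __sub__(self, other):
        if not isinstance(other, CurrencyAmount):
            return NotImplemented
        return CurrencyAmount(self.minor_units - other.minor_units)

    def __truediv__(self, other):
        # Dividing one amount by another yields a plain number (a ratio).
        if not isinstance(other, CurrencyAmount):
            return NotImplemented
        return self.minor_units / other.minor_units

    # No __mul__ is defined, so amount * amount raises TypeError.

    def __eq__(self, other):
        return (isinstance(other, CurrencyAmount)
                and self.minor_units == other.minor_units)

    def __hash__(self):
        return hash(self.minor_units)


total = CurrencyAmount(500) + CurrencyAmount(250)
ratio = CurrencyAmount(500) / CurrencyAmount(250)
print(total.minor_units, ratio)
```

Any other class that implements the same four rules would satisfy the same
interface, which is the point being made: the rules matter more than the
shared code.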

Note that I do think standardised interfaces would be a great 
thing, and if there was a common Python "pixel" interface that 
would free up quite a lot of my brain cells that are now used 
for remembering the 4 or 5 different pixel interfaces that I use 
regularly, but I'm not sure that a standard Python pixel type is 
the solution.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- 
Emma Goldman -



From guido@python.org  Tue Feb 26 22:06:14 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 17:06:14 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 22:26:47 +0100."
 <3C7BFD97.B69DDDFD@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net>
 <3C7BFD97.B69DDDFD@lemburg.com>
Message-ID: <200202262206.g1QM6En20549@pcp742651pcs.reston01.va.comcast.net>

> FWIW, mxDateTime exposes these values as attributes -- there
> is no call overhead.

Good, I think this is the way to go.  (Of course there will be some
C-level call overhead if we make these properties.)

> > Serious question: what do you tend to do with time values?  I imagine
> > that once we change strftime() to accept an abstract time object,
> > you'll never need to call either timetuple() or year() -- strftime()
> > will do it for you.
> 
> Depends on the application space. Database applications
> will call .timetuple() very often and use strftime() hardly
> ever.

What does a database app do with the resulting tuple?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Tue Feb 26 22:15:22 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 23:15:22 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <3C7BEC03.E8225CFD@lemburg.com> <079f01c1bf07$11963ee0$ced241d5@hagrid>
Message-ID: <3C7C08FA.C6375FD7@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> 
> > > Doesn't the proposal sort of imply time-zone
> > > awareness of some kind? Or does it simply imply
> > > UT storage?
> >
> > I tried that in an early version of mxDateTime -- it fails
> > badly.
> 
> can you elaborate?

First of all, the C lib only supports UTC and local time,
so you don't really have a chance of correctly converting
a non-local time in a different time zone to either local
time or UTC: there simply are no C APIs you could use and
the problems which DST and leap seconds introduce are
no fun at all (they are fun to read about, though: figuring out
the various DST switch times is an adventure -- just have
a look at the C lib's DST files).

The next problem is that the C lib only provides a standard
API (mktime()) for converting a broken-down local time to ticks,
but none for converting a broken-down UTC time to ticks. There
is an API called timegm() for this on some platforms, but it's
non-standard.

As a result, making UTC the default won't allow you
to safely represent the datetime value in local time.
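
For comparison, Python's standard library ships calendar.timegm(), which
interprets a time tuple as UTC, alongside time.mktime(), which interprets it
as local time. A small sketch of both round trips (the field values are just
an example):

```python
import calendar
import time

# A broken-down UTC time (the field values are just an example).
utc_fields = (2002, 2, 26, 22, 15, 22, 0, 0, 0)

# calendar.timegm() treats the fields as UTC; it is the inverse of time.gmtime().
ticks = calendar.timegm(utc_fields)
utc_round_trip = time.gmtime(ticks)[:6]

# time.mktime() treats the fields as local time; it is the inverse of
# time.localtime(), so this round trip also returns the original ticks.
local_fields = time.localtime(ticks)
local_round_trip = int(time.mktime(local_fields))

print(ticks, utc_round_trip, local_round_trip)
```

Neither call helps with a third time zone that is neither UTC nor the
process's local zone, which is the gap described above.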

A third obstacle is typical user assumptions: users simply
assume local time and it's hard to tell them otherwise
(people have very personal feelings about date and time 
for some reason...).

> > > Does this imply leap second hell, or will we
> > > simply be vague about expectations?
> >
> > The type will store a fixed point in time, so why
> > worry about leap seconds (most systems don't support these
> > anyway and if they do, the support is usually switched off by
> > default) ?
> 
> the updated proposal adds __hash__ and __cmp__, and
> the following (optional?) operations:
> 
>     deltaobject = timeobject - timeobject
>     floatobject = float(deltaobject) # fractional seconds
>     timeobject = timeobject + integerobject
>     timeobject = timeobject + floatobject
>     timeobject = timeobject + deltaobject
> 
> note that "deltaobject" can be anything; the abstract type
> only says that if you manage to subtract one time object from
> another one of the same type, you get some object that you
> can 1) convert to a float, and 2) add to another time object.
> 
> vague, but pretty useful.

Indeed :-)
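
A minimal sketch of the quoted operations, assuming hypothetical TimePoint
and Delta classes that store float seconds. The proposal names __cmp__;
current Python spells comparisons with __eq__ and __lt__, so the sketch
uses those instead:

```python
import functools


class Delta:
    """Hypothetical difference between two time points, in seconds."""

    def __init__(self, seconds):
        self.seconds = float(seconds)

    def __float__(self):
        return self.seconds  # fractional seconds


@functools.total_ordering
class TimePoint:
    """Hypothetical abstract time object storing seconds since some epoch."""

    def __init__(self, ticks):
        self.ticks = float(ticks)

    def __sub__(self, other):
        # timeobject - timeobject -> deltaobject
        if isinstance(other, TimePoint):
            return Delta(self.ticks - other.ticks)
        return NotImplemented

    def __add__(self, other):
        # timeobject + integer / float / deltaobject -> timeobject
        if isinstance(other, (int, float, Delta)):
            return TimePoint(self.ticks + float(other))
        return NotImplemented

    def __eq__(self, other):
        return isinstance(other, TimePoint) and self.ticks == other.ticks

    def __lt__(self, other):
        if not isinstance(other, TimePoint):
            return NotImplemented
        return self.ticks < other.ticks

    def __hash__(self):
        return hash(self.ticks)


later = TimePoint(100)
earlier = TimePoint(40)
delta = later - earlier
print(float(delta), (earlier + delta) == later)
```

The delta is deliberately opaque here: all the caller relies on is that it
converts to a float and can be added back to a time point.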
 
> > > I'd also like to see simple access methods for year,
> > > month, day, hours, minutes, and seconds, with date parts
> > > being one based and time parts being zero based.
> >
> > In the abstract base type ?
> 
> Q. does mxDateTime provide separate accessors for individual
> members?

Yes, it provides access to these in the form of attributes.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Tue Feb 26 22:18:22 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 23:18:22 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net>
 <3C7BFD97.B69DDDFD@lemburg.com> <200202262206.g1QM6En20549@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7C09AE.54D16E2D@lemburg.com>

Guido van Rossum wrote:
> 
> > FWIW, mxDateTime exposes these values as attributes -- there
> > is no call overhead.
> 
> Good, I think this is the way to go.  (Of course there will be some
> C-level call overhead if we make these properties.)

Right.
 
> > > Serious question: what do you tend to do with time values?  I imagine
> > > that once we change strftime() to accept an abstract time object,
> > > you'll never need to call either timetuple() or year() -- strftime()
> > > will do it for you.
> >
> > Depends on the application space. Database applications
> > will call .timetuple() very often and use strftime() hardly
> > ever.
> 
> What does a database app do with the resulting tuple?

It puts the values into struct fields for year, month, day, etc.
(Databases usually avoid using Unix ticks since these cause
the known problems with dates prior to 1970)
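
As an illustration only, here is a sketch of copying a time tuple into
struct fields with ctypes. The TimestampStruct layout is a hypothetical
stand-in loosely modeled on SQL/CLI-style timestamp structs, not the
definition any particular driver uses:

```python
import ctypes
import time


class TimestampStruct(ctypes.Structure):
    # Hypothetical layout, loosely modeled on SQL/CLI-style timestamp structs.
    _fields_ = [
        ("year", ctypes.c_short),
        ("month", ctypes.c_ushort),
        ("day", ctypes.c_ushort),
        ("hour", ctypes.c_ushort),
        ("minute", ctypes.c_ushort),
        ("second", ctypes.c_ushort),
    ]


def to_timestamp_struct(timetuple):
    """Copy the first six fields of a time tuple into the struct."""
    year, month, day, hour, minute, second = timetuple[:6]
    return TimestampStruct(year, month, day, hour, minute, second)


ts = to_timestamp_struct(time.gmtime(0))
# A date before 1970 needs no special handling, unlike Unix ticks.
old = to_timestamp_struct((1850, 7, 4, 12, 0, 0, 0, 0, 0))
print(ts.year, ts.month, ts.day, old.year)
```

Because each field is stored separately, the representable range depends
only on the field widths, not on an epoch.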

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From tim@zope.com  Tue Feb 26 22:23:59 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 26 Feb 2002 17:23:59 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
Message-ID: <LNBBLJKPBEHFEDALKOLCCEJHNPAA.tim@zope.com>

[Guido]
> ...
> Another thing to consider is that for most apps, the choice of the
> date/time format should be taken out of the hands of the programmer
> and placed into the hands of the user, through some kind of preference
> setting.  I18n and L10n also strongly suggest taking this route.

I'm sure nobody wants to admit this <wink>, but in sheer numbers, nobody has
more experience with this stuff than Microsoft.  If you sit at your Windows
box and go to Start -> Settings -> Control Panel -> Regional Settings,
you'll get a tabbed dialog for specifying the format of number, currency,
time, and date displays.  A Windows app that ignores the settings here is
considered to be broken (and rightly so).

Idiosyncratic formats for user-visible number/currency/date/time info are
going to become an increasingly Bad Idea on other OSes too.



From guido@python.org  Tue Feb 26 22:26:19 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 17:26:19 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 23:18:22 +0100."
 <3C7C09AE.54D16E2D@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFD97.B69DDDFD@lemburg.com> <200202262206.g1QM6En20549@pcp742651pcs.reston01.va.comcast.net>
 <3C7C09AE.54D16E2D@lemburg.com>
Message-ID: <200202262226.g1QMQKG20726@pcp742651pcs.reston01.va.comcast.net>

> > What does a database app do with the resulting tuple?
> 
> It puts the values into struct fields for year, month, day, etc.
> (Databases usually avoid using Unix ticks since these cause
> the known problems with dates prior to 1970)

Hm, I thought that databases have their own date/time types?  Aren't
these used?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Feb 26 22:29:36 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 17:29:36 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 17:23:59 EST."
 <LNBBLJKPBEHFEDALKOLCCEJHNPAA.tim@zope.com>
References: <LNBBLJKPBEHFEDALKOLCCEJHNPAA.tim@zope.com>
Message-ID: <200202262229.g1QMTat20771@pcp742651pcs.reston01.va.comcast.net>

> Idiosyncratic formats for user-visible number/currency/date/time
> info are going to become an increasingly Bad Idea on other OSes too.

Oh, so it'll be at least another 10 years before the same wisdom
reaches the typical web application... :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Tue Feb 26 22:36:01 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 23:36:01 +0100
Subject: [Python-Dev] proposal: add basic money type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <06dd01c1bf02$9d5ca9a0$ced241d5@hagrid> <200202262029.g1QKTx519708@pcp742651pcs.reston01.va.comcast.net> <3C7BF117.D7417F7D@lemburg.com> <200202262049.g1QKn4H19925@pcp742651pcs.reston01.va.comcast.net> <3C7BF8C7.DECB7D38@lemburg.com> <200202262116.g1QLGVN20144@pcp742651pcs.reston01.va.comcast.net>
 <3C7BFEEF.16BDC177@lemburg.com> <200202262140.g1QLehH20305@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7C0DD1.A71FDEB2@lemburg.com>

Guido van Rossum wrote:
> 
> > Indeed, monetary types solve different problems than decimal
> > types. Financial applications do have a need for these kind
> > of implicit error checks.
> 
> But this is easily done by creating a custom class -- which has the
> advantage that the set of constraints can be specialized to the needs
> of a specific application.  When we add a monetary type to the
> language we'll never get it right for all apps.  OTOH, I think we
> could get a fixed point type right.

True. 

A real implementation of a good working decimal
type with adjustable rounding rules would certainly go
a long way and the money type could be built on top of 
it.
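
A sketch of that layering, using the decimal module that Python's standard
library gained after this thread. The Money class, its two-digit precision,
and its currency check are assumptions made for illustration:

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


class Money:
    """Hypothetical money type built on Decimal with a selectable rounding rule."""

    def __init__(self, amount, currency, rounding=ROUND_HALF_EVEN):
        self.currency = currency
        self.rounding = rounding
        # quantize() rounds to two decimal places using the chosen rule.
        self.amount = Decimal(str(amount)).quantize(
            Decimal("0.01"), rounding=rounding)

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        # The "implicit error check": amounts in different currencies don't mix.
        if other.currency != self.currency:
            raise ValueError("currency mismatch")
        return Money(self.amount + other.amount, self.currency, self.rounding)


bankers = Money("2.665", "USD")                  # round-half-even
commercial = Money("2.665", "USD", ROUND_HALF_UP)
print(bankers.amount, commercial.amount)
```

The decimal layer owns precision and rounding; the money layer adds only
the application-specific constraints, which is why those can vary per app.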
 
> What support for money does SQL have?

SQL-92 doesn't have support for it, but some modern database
engines do, e.g. MS SQL Server, PostgreSQL (even though it's
deprecated there, now), MS Access.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Tue Feb 26 22:38:49 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 26 Feb 2002 23:38:49 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFD97.B69DDDFD@lemburg.com> <200202262206.g1QM6En20549@pcp742651pcs.reston01.va.comcast.net>
 <3C7C09AE.54D16E2D@lemburg.com> <200202262226.g1QMQKG20726@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7C0E79.AB97C994@lemburg.com>

Guido van Rossum wrote:
> 
> > > What does a database app do with the resulting tuple?
> >
> > It puts the values into struct fields for year, month, day, etc.
> > (Databases usually avoid using Unix ticks since these cause
> > the known problems with dates prior to 1970)
> 
> Hm, I thought that databases have their own date/time types?  Aren't
> these used?

At C level, interfacing is usually done using structs (ISO SQL/CLI 
defines these).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From barry@zope.com  Tue Feb 26 22:38:28 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 26 Feb 2002 17:38:28 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCCEJHNPAA.tim@zope.com>
Message-ID: <15484.3684.503252.531036@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim@zope.com> writes:

    TP> [Guido]
    >> ...  Another thing to consider is that for most apps, the
    >> choice of the date/time format should be taken out of the hands
    >> of the programmer and placed into the hands of the user,
    >> through some kind of preference setting.  I18n and L10n also
    >> strongly suggest taking this route.

    TP> I'm sure nobody wants to admit this <wink>, but in sheer
    TP> numbers, nobody has more experience with this stuff than
    TP> Microsoft.  If you sit at your Windows box and go to Start ->
    TP> Settings -> Control Panel -> Regional Settings, you'll get a
    TP> tabbed dialog for specifying the format of number, currency,
    TP> time, and date displays.  A Windows app that ignores the
    TP> settings here is considered to be broken (and rightly so).

    TP> Idiosyncratic formats for user-visible
    TP> number/currency/date/time info are going to become an
    TP> increasingly Bad Idea on other OSes too.

I've not been following this thread at all, so apologies if this has
been brought up already.

The localization context should not (always) be taken from the user
environment.  In systems like web-based services, the context will
instead be relative to the person/entity making the remote request, so
we have to be able to explicitly specify the localization context, or
at least query, modify, and restore some global context.
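
The "query, modify, and restore" variant can be sketched with the locale
module. The localization_context helper is a hypothetical name, and since
the process-wide locale is shared by all threads, this is at best a
stopgap for the explicit-context approach:

```python
import contextlib
import locale


@contextlib.contextmanager
def localization_context(name, category=locale.LC_TIME):
    """Temporarily switch one locale category, then put the old value back."""
    saved = locale.setlocale(category)        # query the current setting
    try:
        locale.setlocale(category, name)      # modify
        yield
    finally:
        locale.setlocale(category, saved)     # restore


before = locale.setlocale(locale.LC_TIME)
with localization_context("C"):
    inside = locale.setlocale(locale.LC_TIME)
after = locale.setlocale(locale.LC_TIME)
print(before, inside, after)
```

A web service handling requests for several users concurrently would want
the context passed explicitly to the formatting call instead.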

-Barry


From tim@zope.com  Tue Feb 26 22:39:59 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 26 Feb 2002 17:39:59 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
In-Reply-To: <200202262229.g1QMTat20771@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEJJNPAA.tim@zope.com>

[Tim]
> Idiosyncratic formats for user-visible number/currency/date/time
> info are going to become an increasingly Bad Idea on other OSes too.

[Guido]
> Oh, so it'll be at least another 10 years before the same wisdom
> reaches the typical web application... :-)

Optimist.


From guido@python.org  Tue Feb 26 22:45:30 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 17:45:30 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 17:38:28 EST."
 <15484.3684.503252.531036@anthem.wooz.org>
References: <LNBBLJKPBEHFEDALKOLCCEJHNPAA.tim@zope.com>
 <15484.3684.503252.531036@anthem.wooz.org>
Message-ID: <200202262245.g1QMjUf23970@pcp742651pcs.reston01.va.comcast.net>

> I've not been following this thread at all, so apologies if this has
> been brought up already.

No, but unclear if it's relevant.

> The localization context should not (always) be taken from the user
> environment.  In systems like web-based services, the context will
> instead be relative to the person/entity making the remote request, so
> we have to be able to explicitly specify the localization context, or
> at least query, modify, and restore some global context.

Sure.  So the interface may be different.  The main argument (that you
shouldn't be using t.year() to format dates) remains the same.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@zope.com  Tue Feb 26 22:48:24 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 26 Feb 2002 17:48:24 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCCEJHNPAA.tim@zope.com>
 <15484.3684.503252.531036@anthem.wooz.org>
 <200202262245.g1QMjUf23970@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15484.4280.89844.484487@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    >> The localization context should not (always) be taken from the
    >> user environment.  In systems like web-based services, the
    >> context will instead be relative to the person/entity making
    >> the remote request, so we have to be able to explicitly specify
    >> the localization context, or at least query, modify, and
    >> restore some global context.

    GvR> Sure.  So the interface may be different.  The main argument
    GvR> (that you shouldn't be using t.year() to format dates)
    GvR> remains the same.

Doesn't Java have separate formatting objects?  You decide which
format object you need based on the localization context, then you
pass in the timestamp/date/money/whatever thingie and the format
object knows how to render that data representation in the
appropriate localization.

makes-sense-to-me-ly y'rs,
-Barry


From pedroni@inf.ethz.ch  Tue Feb 26 22:36:28 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Tue, 26 Feb 2002 23:36:28 +0100
Subject: [Python-Dev] Mmh, unrelated "silly" test suite patch
Message-ID: <081501c1bf16$0841cbc0$6d94fea9@newmexico>

Hi, mmh, kind of a break.

Strange as it may seem (CVS homework results),
in 1994 Guido added some tests for the tuple
built-in to the test suite, but the suite has
never grown any explicit tests for the basic behavior
of list(SEQ).

Maybe there is some esoteric reason for this,
but anyway I have just posted a patch to SF

https://sourceforge.net/tracker/index.php?func=detail&aid=523169&group_id=5470&atid=305470

with tests for list() and an amended test for tuple(),
in particular they try to check:

list2 = list(LIST)
list2 is not LIST and list2 == LIST

tuple(TUPLE) is TUPLE

also the documented behavior.
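
The two checks described above can be sketched as a small self-contained
function.  This is only an illustration, not the contents of the SF patch;
the function name is hypothetical, and the tuple identity check reflects
CPython's behavior for exact tuple instances.

```python
def check_list_and_tuple_copy_semantics():
    """Illustrative checks for list(LIST) and tuple(TUPLE)."""
    original_list = [1, 2, 3]
    copied = list(original_list)
    # list() must build a new object with equal contents.
    assert copied is not original_list
    assert copied == original_list

    original_tuple = (1, 2, 3)
    # tuple() of an exact tuple may hand back the same immutable object.
    assert tuple(original_tuple) is original_tuple
    return True
```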

Yup, there is no hurry to check this in.
I wrote this because
yesterday the missing test "badly" burned your Jython brothers
[we don't have a time machine on our side :(]

and for the benefit of future generations of Python
re-implementers.

The new tests pass Python 2.2.

hoping-that-doing-this-so-late-in-game-will-not-cause-some-kind-of-karmic-
unbalance-also-because-I-have-just-bumped-the-#-of-patches-from-sacred-128-
to-129-ly y'rs - Samuele.


From guido@python.org  Tue Feb 26 22:53:07 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 17:53:07 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 23:38:49 +0100."
 <3C7C0E79.AB97C994@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEKANKAA.tim.one@comcast.net> <200202082027.g18KRGq04141@pcp742651pcs.reston01.va.comcast.net> <3C7B9617.D22E449F@zope.com> <065201c1bef8$f3e03ad0$ced241d5@hagrid> <3C7BE7CD.EAA8BA24@zope.com> <200202262028.g1QKSPE19679@pcp742651pcs.reston01.va.comcast.net> <3C7BF5AD.45BACF05@zope.com> <200202262107.g1QL7mQ20059@pcp742651pcs.reston01.va.comcast.net> <3C7BFD97.B69DDDFD@lemburg.com> <200202262206.g1QM6En20549@pcp742651pcs.reston01.va.comcast.net> <3C7C09AE.54D16E2D@lemburg.com> <200202262226.g1QMQKG20726@pcp742651pcs.reston01.va.comcast.net>
 <3C7C0E79.AB97C994@lemburg.com>
Message-ID: <200202262253.g1QMr7J24039@pcp742651pcs.reston01.va.comcast.net>

> > Hm, I thought that databases have their own date/time types?  Aren't
> > these used?
> 
> At C level, interfacing is usually done using structs (ISO SQL/CLI 
> defines these).

Ah, there's another requirement that we (or at least I) almost forgot.
There should be an efficient C-level interface for the abstract
date/time type.

This stuff is hairier than it seems!  I think the main tension is
between improving upon the Unix time_t type, and improving upon the
Unix "struct tm" type.  Improving upon time_t could mean to extend the
range beyond 1970-2038, and/or extend the precision to milliseconds or
microseconds.  Improving upon struct tm is hard (it has all the
necessary fields and no others), unless you want to add operations
(just add methods) or make the representation more compact (several of
the fields can be packed in 4-6 bits each).  A third dimension might
be to provide better date/time arithmetic, but I'm not sure if there's
much of a market for that, given all the fuzzy semantics (leap
seconds, differences across DST changes, timezones).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Tue Feb 26 23:00:09 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 26 Feb 2002 17:00:09 -0600
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEJHNPAA.tim@zope.com>
References: <LNBBLJKPBEHFEDALKOLCCEJHNPAA.tim@zope.com>
Message-ID: <15484.4985.471940.886541@12-248-41-177.client.attbi.com>

    Tim> .... If you sit at your Windows box and go to Start -> Settings ->
    Tim> Control Panel -> Regional Settings, you'll get a tabbed dialog for
    Tim> specifying the format of number, currency, time, and date displays.

I'm gonna go ever so slightly out on a limb and make a wild-ass guess
here that Apple probably had this functionality before Microsoft and that,
like on Windows, all well-behaved Mac applications had to use the user's
settings.  Maybe this abstract time object's strftime method (or
time.strftime) should grow format specifiers for the user-specified date and
time...

    Tim> Idiosyncratic formats for user-visible number/currency/date/time
    Tim> info is going to become an increasingly Bad Idea on other OSes too.

Of course, neither Apple's nor Microsoft's efforts in this area will help
the poor person trying to emit a dynamic web page containing "correctly"
formatted dates.  You still have to guess or just fall back to something
most everyone can deduce.

can-we-squeeze-it-into-http-2.0?-ly, y'rs,

Skip


From tim@zope.com  Tue Feb 26 22:59:41 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 26 Feb 2002 17:59:41 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
In-Reply-To: <15484.3684.503252.531036@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJKNPAA.tim@zope.com>

[Barry]
> I've not been following this thread at all, so apologies if this has
> been brought up already.
>
> The localization context should not (always) be taken from the user
> environment.  In systems like web-based services, the context will
> instead be relative to the person/entity making the remote request, so
> we have to be able to explicitly specify the localization context, or
> at least query, modify, and restore some global context.

Like I said <wink>, Microsoft has more experience with this stuff than
anyone.  Check out

    http://www.trigeminal.com/

Provided you're using IE, it should come up in the right language for you.
Try viewing it in different languages, and note, e.g., how date formats
change automatically.

Here's an article on how they do it:

    http://msdn.microsoft.com/msdnmag/issues/0700/localize/localize.asp



From guido@python.org  Tue Feb 26 23:04:23 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 26 Feb 2002 18:04:23 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
In-Reply-To: Your message of "Tue, 26 Feb 2002 17:48:24 EST."
 <15484.4280.89844.484487@anthem.wooz.org>
References: <LNBBLJKPBEHFEDALKOLCCEJHNPAA.tim@zope.com> <15484.3684.503252.531036@anthem.wooz.org> <200202262245.g1QMjUf23970@pcp742651pcs.reston01.va.comcast.net>
 <15484.4280.89844.484487@anthem.wooz.org>
Message-ID: <200202262304.g1QN4Of24123@pcp742651pcs.reston01.va.comcast.net>

> Doesn't Java have separate formatting objects?  You decide which
> format object you need based on the localization context, then you
> pass in the timestamp/date/money/whatever thingie and the format
> object knows how to render that data representation in the
> appropriate localization.

Yes, that's probably a good way to do it in general.  There may be
global functions that are initialized based on getenv(), and Zope may
provide a different set of global functions that are initialized based
on the preferences of the client.
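
A minimal sketch of the "separate formatting object" idea, assuming a
hypothetical DateFormatter class and a hand-written table of patterns keyed
by locale name.  None of these names come from Java, Zope, or the standard
library; a real version would draw its patterns from locale data rather
than a hard-coded dict.

```python
from datetime import datetime


class DateFormatter:
    """Hypothetical format object: it owns the pattern, not the data."""

    def __init__(self, pattern):
        self.pattern = pattern

    def format(self, when):
        # The date object stays locale-agnostic; rendering happens here.
        return when.strftime(self.pattern)


# Assumed, hand-written patterns for illustration only.
FORMATTERS = {
    "en_US": DateFormatter("%m/%d/%Y"),
    "en_AU": DateFormatter("%d/%m/%Y"),
    "de_DE": DateFormatter("%d.%m.%Y"),
}


def formatter_for(context, default="en_US"):
    """Pick a formatter from an explicit context, not from getenv()."""
    return FORMATTERS.get(context, FORMATTERS[default])
```

A server would call formatter_for() with the requesting client's
preference, while a desktop program could pass a value derived from the
environment; the date value itself is the same in both cases.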

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim@zope.com  Tue Feb 26 23:08:10 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 26 Feb 2002 18:08:10 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
In-Reply-To: <15484.4985.471940.886541@12-248-41-177.client.attbi.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEJMNPAA.tim@zope.com>

[Skip Montanaro]
> ...
> Of course, neither Apple's nor Microsoft's efforts in this area will help
> the poor person trying to emit a dynamic web page containing "correctly"
> formatted dates.  You still have to guess or just fall back to something
> most everyone can deduce.

No, MS languages support APIs for high-level things like FormatCurrency(),
and some have dedicated Currency, Time and Date types.  You don't *get*
low-level control under these things, and the high-level APIs automatically
respect user preferences.  So, for example, if you're generating dynamic
content via a VBScript program, it's easy provided you stick to VBScript's
high-level date format functions when you pump out a date:  you can't not
respect the user's date format preferences then.  See the links I posted
just before this to see how a server can suck down the user's preferences,
at least to a first approximation (the default formats for the user's
primary language).



From DavidA@ActiveState.com  Tue Feb 26 23:37:37 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Tue, 26 Feb 2002 15:37:37 -0800
Subject: [Python-Dev] Re: proposal: add basic time type to the standard
 library
References: <LNBBLJKPBEHFEDALKOLCKEJKNPAA.tim@zope.com>
Message-ID: <3C7C1C41.AB594D72@activestate.com>

Tim Peters wrote:

> Like I said <wink>, Microsoft has more experience with this stuff than
> anyone.  Check out
>
>     http://www.trigeminal.com/
>
> Provided you're using IE, it should come up in the right language for you.
> Try viewing it in different languages, and note, e.g., how date formats
> change automatically.

Except that:

	Dernière mise-à-jour:  02/08/02 05:33 AM

is not correct.  The AM/PM distinction is one which no right-thinking
folks bother with.  Then again, it looks like that particular line is
not using their fancy logic, since the date is in the future if you
follow the pattern set by other dates.

Also note:

    Des problemes avec ce site? SVP, contacter le webmaster
    avec vos commentaires, questions ou suggestions (si possible, en
    anglais).

which is pretty funny =).

--david


From tim@zope.com  Wed Feb 27 00:06:52 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 26 Feb 2002 19:06:52 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard  library
In-Reply-To: <3C7C1C41.AB594D72@activestate.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEKANPAA.tim@zope.com>

[David Ascher]
> Except that:
>
> 	Dernière mise-à-jour:  02/08/02 05:33 AM
>
> is not correct.

Well, if you're not viewing the page in English like God intended, you
should be grateful it talked back to you at all.

> ...
> Also note:
>
>     Des problemes avec ce site? SVP, contacter le webmaster
>     avec vos commentaires, questions ou suggestions (si possible, en
>     anglais).
>
> which is pretty funny =).

Maybe to you.  Me, I don't know French at all, so I'm delighted to see them
produce French that I can understand easily.  Hell, I even recognize SVP as
a suffix of the good old English RSVP -- it figures the French would drop
the first and most important letter <wink>.

the-kind-of-i18n-a-sensitive-american-can-support-ly y'rs  - tim



From Anthony Baxter <anthony@ekit-inc.com>  Wed Feb 27 00:21:22 2002
From: Anthony Baxter <anthony@ekit-inc.com> (Anthony Baxter)
Date: Wed, 27 Feb 2002 11:21:22 +1100
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Message from "M.-A. Lemburg" <mal@lemburg.com>
 of "Tue, 26 Feb 2002 21:11:47 BST." <3C7BEC03.E8225CFD@lemburg.com>
Message-ID: <200202270021.g1R0LMr15633@burswood.off.ekorp.com>

>>> "M.-A. Lemburg" wrote
> I tried that in early version of mxDateTime -- it fails
> badly. I switched to the local time assumption very
> early in the development.

If you must store stuff without timezones, _please_ don't use
localtime. localtime is a variable thing (think what happens
when daylight savings goes on and off).

Storing UTC, or else the local time and enough timezone data to
get to UTC reliably, is the only thing that will lead to hugs and
puppies.
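
A small sketch of the "store UTC" advice, using today's datetime module
(which did not exist when this thread was written).  The helper name and the
choice to pass the offset in minutes are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_naive, utc_offset_minutes):
    """Attach a known UTC offset to a naive local time and convert to UTC.

    The offset must be recorded when the timestamp is taken, because the
    local offset changes when daylight saving starts or ends.
    """
    local_zone = timezone(timedelta(minutes=utc_offset_minutes))
    return local_naive.replace(tzinfo=local_zone).astimezone(timezone.utc)
```

For example, 11:21:22 local time at an offset of +1100 (660 minutes) is
stored as 00:21:22 UTC on the same date.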

Anthony, who deals with stupid date/times all the time, in billing
systems, and in different time zones.

-- 
Anthony Baxter     <anthony@interlink.com.au>   
It's never too late to have a happy childhood.



From ping@lfw.org  Wed Feb 27 00:35:05 2002
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 26 Feb 2002 18:35:05 -0600 (CST)
Subject: [Python-Dev] Re: proposal: add basic time type to the standard
 library
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEKANPAA.tim@zope.com>
Message-ID: <Pine.LNX.4.33.0202261833340.511-100000@server1.lfw.org>

David Ascher wrote:
> Also note:
>
>     Des problemes avec ce site? SVP, contacter le webmaster
>     avec vos commentaires, questions ou suggestions (si possible, en
>     anglais).
>
> which is pretty funny =).

Tim Peters wrote:
> Maybe to you.  Me, I don't know French at all, so I'm delighted to see them
> produce French that I can understand easily.

In case the humour wasn't clear, here's the translation:

    Problems with this site?  Please contact the webmaster
    with your comments, questions, or suggestions (if possible,
    in English).


-- ?!ng



From tim@zope.com  Wed Feb 27 00:42:28 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 26 Feb 2002 19:42:28 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <3C7B95EB.952037FE@zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEKDNPAA.tim@zope.com>

[Jim Fulton]
> ZODB has a TimeStamp type that uses a 32-bit unsigned integer
> to store year, month, day, hour, and minute in a way that makes it dirt
> simple to extract a component.

You really think so?  It's a mixed-radix scheme:

    v=((((y-1900)*12+mo-1)*31+d-1)*24+h)*60+m;

so requires lots of expensive integer division and remainder operations to
pick apart again (the trend in CPUs is to make these relatively more
expensive, not less, and e.g. Itanium doesn't even have an integer division
instruction).

If we had this to do over again, I'd strongly suggest assigning 12 bits to
the year, 4 to the month, 5 each to day and hour, and 6 to the minute.  The
components would then be truly dirt simple and dirt cheap to extract, and we
wouldn't even have to bother switching between 0-based and 1-based for the
months and days (let 'em stay 1-based).  They would still sort and compare
correctly in packed format.  The only downside I can see is that not
pursuing every last drop of potential compression would shrink the dynamic
range from 8000+ years to 4000+ years, but we're likely to have much worse
problems in Zope by the year 5900 anyway <wink>.
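
To make the comparison concrete, here is a sketch of both layouts in Python.
The function names are hypothetical, and the bit positions are just one
natural reading of "12 bits to the year, 4 to the month, 5 each to day and
hour, and 6 to the minute"; the 1900 base year is taken from the mixed-radix
formula above.

```python
YEAR_BASE = 1900  # assumption: same base year as the mixed-radix formula


def pack_mixed_radix(y, mo, d, h, m):
    """The mixed-radix scheme quoted above."""
    return ((((y - YEAR_BASE) * 12 + mo - 1) * 31 + d - 1) * 24 + h) * 60 + m


def unpack_mixed_radix(v):
    """Picking it apart needs a division and remainder per field."""
    v, m = divmod(v, 60)
    v, h = divmod(v, 24)
    v, d = divmod(v, 31)
    y, mo = divmod(v, 12)
    return y + YEAR_BASE, mo + 1, d + 1, h, m


def pack_bitfields(y, mo, d, h, m):
    """12 bits year | 4 bits month | 5 bits day | 5 bits hour | 6 bits minute."""
    return ((y - YEAR_BASE) << 20) | (mo << 16) | (d << 11) | (h << 6) | m


def unpack_bitfields(v):
    """Picking it apart needs only shifts and masks; months and days stay 1-based."""
    return (
        (v >> 20) + YEAR_BASE,
        (v >> 16) & 0xF,
        (v >> 11) & 0x1F,
        (v >> 6) & 0x1F,
        v & 0x3F,
    )
```

Because the most significant field sits in the highest bits, packed values
still compare in chronological order, and twelve bits of year cover 4096
years starting from the base year.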



From tim@zope.com  Wed Feb 27 00:51:17 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 26 Feb 2002 19:51:17 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard  library
In-Reply-To: <Pine.LNX.4.33.0202261833340.511-100000@server1.lfw.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEKENPAA.tim@zope.com>

[Ping]
> In case the humour wasn't clear, here's the translation:
>
>     Problems with this site?  Please contact the webmaster
>     with your comments, questions, or suggestions (if possible,
>     in English).

Yes, it was clear.  If the French could get over their speech impediment of
using "avec" when they mean "with", and stuck to webspeak in this style,
they *would* be speaking English.  I think that's even funnier <wink>.



From nhodgson@bigpond.net.au  Wed Feb 27 01:23:28 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Wed, 27 Feb 2002 12:23:28 +1100
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCKEJKNPAA.tim@zope.com>
Message-ID: <076601c1bf2d$5c389490$0acc8490@neil>

Tim Peters:

> Like I said <wink>, Microsoft has more experience with this stuff than
> anyone.  Check out
>
>     http://www.trigeminal.com/
>
> Provided you're using IE, it should come up in the right language for you.
> Try viewing it in different languages, and note, e.g., how date formats
> change automatically.

   Well, it crushes us Aussies under the jackboot of US cultural
imperialism, unless month 31 was recently added to the calendar. This is
despite sending a nice
Accept-Language: en-au
   It does work if I change to German.

   Neil



From gmcm@hypernet.com  Wed Feb 27 01:30:29 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Tue, 26 Feb 2002 20:30:29 -0500
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
In-Reply-To: <15484.4280.89844.484487@anthem.wooz.org>
Message-ID: <3C7BF065.20061.3F14F1B8@localhost>

On 26 Feb 2002 at 17:48, Barry A. Warsaw wrote:

> Doesn't Java have separate formatting objects?  You
> decide which format object you need based on the
> localization context, then you pass in the
> timestamp/date/money/whatever thingie and the
> format object knows how to render that data
> representation in
> the appropriate localization.

Yeah. In an optimization gig I had a couple years
ago, I had them take out their use of the fancy
format objects. It took 7,000 calls to print a
date time according to the trace.

-- Gordon
http://www.mcmillan-inc.com/



From greg@cosc.canterbury.ac.nz  Wed Feb 27 02:58:12 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 27 Feb 2002 15:58:12 +1300 (NZDT)
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
In-Reply-To: <3C7C1C41.AB594D72@activestate.com>
Message-ID: <200202270258.PAA23722@s454.cosc.canterbury.ac.nz>

David Ascher <DavidA@ActiveState.com>:

>    Des problemes avec ce site? SVP, contacter le webmaster
>    avec vos commentaires, questions ou suggestions (si possible, en
>    anglais).
>
> which is pretty funny =).

Obviously they haven't yet implemented the natural-language-
understanding webmaster-bot that adapts to the user's
language settings.

They could just pipe your query through Babelfish, though...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Wed Feb 27 03:10:05 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 27 Feb 2002 16:10:05 +1300 (NZDT)
Subject: [Python-Dev] Re: proposal: add basic time type to the standard library
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEKANPAA.tim@zope.com>
Message-ID: <200202270310.QAA23726@s454.cosc.canterbury.ac.nz>

Tim Peters <tim@zope.com>:

> Hell, I even recognize SVP as a suffix of the good old English RSVP --
> it figures the French would drop the first and most important letter
> <wink>.

RSVP = repondez s'il vous plait = reply, please
SVP = s'il vous plait = please

(I know, this has nothing to do with timezones.
Sorry. I'll stop now.)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From andymac@bullseye.apana.org.au  Tue Feb 26 21:44:36 2002
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Wed, 27 Feb 2002 08:44:36 +1100 (EDT)
Subject: [Python-Dev] Re: OS/2 EMX changes to dynload_shlib.c
In-Reply-To: <web-3614176@digicool.com>
Message-ID: <Pine.OS2.4.32.0202270837460.53-100000@tenring.andymac.org>

On Tue, 26 Feb 2002, Guido Van Rossum wrote:

> Given the number of OS/2 EMX specific changes to
> dynload_shlib.c, wouldn't it be better to create a
> separate dynload_os2.c?

It's just occurred to me that you might in fact be referring to the changes
to import.c, which were in the same commit (& thus the same checkin
message) and were extensive.  I did try to make it clear in the checkin
message that changes to multiple files were committed.

It seems to be de rigueur to commit a single file at a time; something I
hadn't appreciated and don't remember being advised about.  I will follow
this practice from now on.

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  | Snail: PO Box 370
        andymac@pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia



From andymac@bullseye.apana.org.au  Tue Feb 26 21:06:47 2002
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Wed, 27 Feb 2002 08:06:47 +1100 (EDT)
Subject: [Python-Dev] Re: OS/2 EMX changes to dynload_shlib.c
In-Reply-To: <web-3614176@digicool.com>
Message-ID: <Pine.OS2.4.32.0202270752380.45-100000@tenring.andymac.org>

On Tue, 26 Feb 2002, Guido Van Rossum wrote:

> [I sent this to python-dev, but my copy to you
> bounced; I'm not sure if you're on python-dev yet.]

Have been for some time.

> Given the number of OS/2 EMX specific changes to
> dynload_shlib.c, wouldn't it be better to create a
> separate dynload_os2.c?

There is already a dynload_os2.c for the VACPP port, which implements a
module loader via direct OS/2 API calls.

Because the EMX stuff that I picked up (a 1.5.2 port by Andrew Zabolotny)
used an emulation of dlopen() (PC/os2emx/dlfcn.c) I carried that over.

I hadn't considered the changes to dynload_shlib.c that significant;
however, I'll look into whether I can adapt the EMX port to use the VACPP
module loader in dynload_os2.c.

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  | Snail: PO Box 370
        andymac@pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia



From tim_one@email.msn.com  Wed Feb 27 06:06:28 2002
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 27 Feb 2002 01:06:28 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202262253.g1QMr7J24039@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCELHNPAA.tim_one@email.msn.com>

[Guido]
> ...
> This stuff is hairier than it seems!

You're just getting your toes wet:  it's impossible to make any two of
{astronomers, businessfolk, Jim} happy at the same time, even if they're all
American and live in the same city.

> I think the main tension is between improving upon the Unix time_t type,
> and improving upon the Unix "struct tm" type.  Improving upon time_t
> could mean to extend the range beyond 1970-2038, and/or extend the
> precision to milliseconds or microseconds.
>
> Improving upon struct tm is hard (it has all the necessary fields and no
> others), unless you want to add operations (just add methods) or make
> the representation more compact (several of the fields can be packed in
> 4-6 bits each).

I'm surprised you say "all the necessary fields", because a tm contains no
info about timezone.  C99 introduces a struct tmx that does.  The initial
segment of a struct tmx must be identical to a struct tm, but the meaning of
tmx.tm_isdst differs from tm.tm_isdst.  tmx.tm_isdst

    is the positive number of minutes of offset if Daylight Saving
    Time is in effect, zero if Daylight Saving Time is not in effect,
    and -1 if the information is not available.

Then it adds some fields not present in a struct tm:

    int tm_version;  // version number
    int tm_zone;     // time zone offset in minutes from UTC [-1439, +1439]
    int tm_leapsecs; // number of leap seconds applied
    void *tm_ext;    // extension block
    size_t tm_extlen;// size of the extension block

The existence of tm_version, tm_ext and tm_extlen can be fairly viewed as a
committee's inability to say "no" <wink>.

> A third dimension might be to provide better date/time arithmetic, but
> I'm not sure if there's much of a market for that, given all the fuzzy
> semantics (leap seconds, differences across DST changes, timezones).

I don't think we can get off that easy.  Time computation is critical for
businesses and astronomers, and leap seconds etc are a PITA independent of
time computations.  Time computations seem to me to be the easiest of all,
provided we've already "done something" intelligible about the rest:  any
calculation boils down to factoring away leap seconds etc in conversion to a
canonical form, doing the computing there, then injecting leap seconds etc
back in to the result when converting out of canonical form again.

The ECMAScript std (nee Javascript) has, I think, a good example of a usable
facility that refused to get mired down in impossible details; e.g., it
flat-out refuses to recognize leap seconds.  mxDateTime is similarly sane,
but MAL keeps threatening to flirt with insanity <wink>.

BTW, I doubt there'd be any discussion of leap seconds in the C std if some
astronomers hadn't been early Unix users.  It's never a net win in the end
to try to make a scientist happy <0.9 wink>.



From mal@lemburg.com  Wed Feb 27 09:16:18 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 10:16:18 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C7CA3E2.C3705289@lemburg.com>

"Martin v. Loewis" wrote:
> 
> Guido van Rossum <guido@python.org> writes:
> 
> > > This makes Latin-1 the right choice:
> > >
> > > * Unicode literals already use it today
> >
> > But they shouldn't, IMO.
> 
> I agree. I recommend to deprecate this feature, and raise a
> DeprecationWarning if a Unicode literal contains non-ASCII characters
> but no encoding has been declared.
> 
> > Sorry, I don't understand what you're trying to say here.  Can you
> > explain this with an example?  Why can't we require any program
> > encoded in more than pure ASCII to have an encoding magic comment?  I
> > guess I don't understand why you mean by "raw binary".
> 
> With the proposed implementation, the encoding declaration is only
> used for Unicode literals. In all other places where non-ASCII
> characters can occur (comments, string literals), those characters are
> treated as "bytes", i.e. it is not verified that these bytes are
> meaningful under the declared encoding.
> 
> Marc's original proposal was to apply the declared encoding to the
> complete source code, but I objected claiming that it would make the
> tokenizer changes more complex, and the resulting tokenizer likely
> significantly slower (at least if you use the codecs API to perform the
> decoding).

I don't think that the codecs will significantly slow down
overall compilation -- the compiler is not fast to begin 
with.

However, changing the base type in the tokenizer and compiler
from char* to Py_UNICODE* will be a significant effort and
that's why I added two phases to the implementation.

The first phase will only touch Unicode literals as proposed by Martin.
 
> In phase 2, the encoding will apply to all strings. So it will not be
> possible to put arbitrary byte sequences in a string literal, at least
> if the encoding disallows certain byte sequences (like UTF-8, or
> ASCII). Since this is currently possible, we have a backwards
> compatibility problem.

Right and I believe that a lot of people in European 
countries write string literals with a Latin-1 encoding
in mind. We cannot simply break all that code.

The other problem is with comments found in Python source
code. In phase 2 these will break as well.

So how about this:

In phase 1, the tokenizer checks the *complete file* for
non-ASCII characters and outputs a single warning 
per file if it doesn't find a coding declaration at
the top. Unicode literals continue to use [raw-]unicode-escape
as codec.

In phase 2, we enforce ASCII as default encoding, i.e.
the warning will turn into an error. The [raw-]unicode-escape
codec will be extended to also support converting Unicode
to Unicode, that is, only handle escape sequences in this
case.
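
The phase 1 check could look roughly like the sketch below.  It is written
in Python for illustration, whereas the real tokenizer is C.  The function
name, the declaration pattern, and the choice to look only at the first two
lines are all assumptions, not something settled in this message.

```python
import re
import warnings

# Assumed declaration pattern; it accepts both "coding: x" and "coding=x".
CODING_RE = re.compile(rb"coding[=:]\s*([\w-]+)")


def check_source(data, filename="<string>"):
    """Warn once per file if `data` (bytes) has non-ASCII bytes but no
    coding declaration near the top.  Returns True if a warning was issued.
    """
    first_lines = data.split(b"\n", 2)[:2]
    declared = any(CODING_RE.search(line) for line in first_lines)
    has_non_ascii = any(byte > 127 for byte in data)
    if has_non_ascii and not declared:
        warnings.warn(
            f"{filename}: non-ASCII bytes but no coding declaration",
            stacklevel=2,
        )
        return True
    return False
```

In phase 2 the same condition would raise an error instead of warning.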

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Wed Feb 27 09:20:57 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 10:20:57 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de> <15483.53993.852170.135298@anthem.wooz.org> <m3n0xww01p.fsf@mira.informatik.hu-berlin.de> <3C7BE70B.74713ED2@lemburg.com> <20020226150143.C16980@unpythonic.dhs.org>
Message-ID: <3C7CA4F9.86626985@lemburg.com>

jepler@unpythonic.dhs.org wrote:
> 
> On Tue, Feb 26, 2002 at 08:50:35PM +0100, M.-A. Lemburg wrote:
> > Does anybody know where XEmacs is moving w/r to this ? (and
> > for that matter what about vi, vim, etc. ?)
> 
> I'm working with Vim 6.0, 2001 Sep 14.
> 
> VIM lets you set variables with text similar to
>         vim:KEY=VALUE:KEY=VALUE:....:
> Apparently you would use
>         vim:fileencoding=sjis:
> to select shift-jis encoding.  In the vim style, it seems most common to
> place this at the bottom of a file, but it can be placed at the top too.
> The variable "modelines" controls how many lines at each end of the file are
> inspected, with the default being 5.  It's documented that the form
>         vi:set KEY=VALUE:
> may be compatible with "some versions of Vi" but does not say which.  (I
> can't get this to work)
> 
> You can set a list of encodings to attempt when a file is loaded, which
> defaults to "ucs-bom,utf-8,latin1".  A user who wanted to treat
> non-unicode files as shift-jis by default would
>         :set fileencodings=ucs-bom,utf-8,sjis
> You can also load a particular file with the ++enc parameter:
>         :edit ++enc=koi8-r russian.txt
> (I can get this to work, but I have to do it manually to load anything in
> an odd character set)
> 
> The emacs line is harmless in vim, but doesn't do anything.  It's possible
> that using :autocmd someone could make vim use the emacs line to set
> encoding, but I'm not sure -- setting fileencoding after a file is loaded
> seems to perform a translation from the old characterset to the new.

So if we use the RE "coding[=:]\s*([\w-]+)" on the first line,
we should be able to reach out for the encoding, right ?

This RE would then cover both vim and emacs.
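
A quick way to check this is to run the RE over one Emacs-style and one
vim-style line.  This is only a sketch in present-day Python; the two
sample comment lines and the helper name find_coding() are made up for
illustration and are not taken from the PEP.

```python
import re

# The RE proposed above; it is searched for anywhere in the line.
CODING_RE = re.compile(r"coding[=:]\s*([\w-]+)")

def find_coding(line):
    """Return the encoding name declared in `line`, or None."""
    match = CODING_RE.search(line)
    return match.group(1) if match else None

emacs_line = "# -*- coding: latin-1 -*-"   # Emacs file-variable style
vim_line = "# vim:fileencoding=sjis:"      # vim modeline style

print(find_coding(emacs_line))    # latin-1
print(find_coding(vim_line))      # sjis
print(find_coding("import sys"))  # None
```

The vim form matches because "fileencoding=" ends in "coding=".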

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From martin@v.loewis.de  Wed Feb 27 09:26:15 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 27 Feb 2002 10:26:15 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7CA3E2.C3705289@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
Message-ID: <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> In phase 1, the tokenizer checks the *complete file* for
> non-ASCII characters and outputs a single warning
> per file if it doesn't find a coding declaration at
> the top. Unicode literals continue to use [raw-]unicode-escape
> as codec.

Do you suggest that in this phase, the declared encoding is not used
for anything except to complain? -1. I think people need to gain
something from declaring the encoding; what they gain is that Unicode
literals work right (i.e. that they really denote the strings that
people see on their screen - given the appropriate editor).

Regards,
Martin


From mal@lemburg.com  Wed Feb 27 09:36:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 10:36:05 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <200202270021.g1R0LMr15633@burswood.off.ekorp.com>
Message-ID: <3C7CA885.1E4F1ED8@lemburg.com>

Anthony Baxter wrote:
> 
> >>> "M.-A. Lemburg" wrote
> > I tried that in an early version of mxDateTime -- it fails
> > badly. I switched to the local time assumption very
> > early in the development.
> 
> If you must store stuff without timezones, _please_ don't use
> localtime. localtime is a variable thing (think what happens
> when daylight savings goes on and off).

You probably didn't notice the "assumption" -- mxDateTime
has a few APIs which make assumptions about the value
stored in DateTime objects; however, you can just as well
store UTC in them. In that case, the APIs making the
local time assumption will produce wrong data of course.

>>> from mx.DateTime import *
>>> now() # local time
<DateTime object for '2002-02-27 10:33:52.63' at 8199ec0>
>>> now().gmtime() # in UTC
<DateTime object for '2002-02-27 09:33:55.72' at 8196b38>

In the end, it's better to leave the decision about what to store
in a DateTime object to the programmer. Timezones, DST and
leap seconds sometimes have their application and sometimes
just cause plain confusion. 

IMHO, the application should decide what to do about them 
and manage the data storage aspects of its decision.
 
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From Jack.Jansen@oratrix.com  Wed Feb 27 09:38:28 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Wed, 27 Feb 2002 10:38:28 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7CA3E2.C3705289@lemburg.com>
Message-ID: <C0CDD070-2B65-11D6-97B1-0030655234CE@oratrix.com>

On Wednesday, February 27, 2002, at 10:16 , M.-A. Lemburg wrote:
> In phase 1, the tokenizer checks the *complete file* for
> non-ASCII characters and outputs a single warning
> per file if it doesn't find a coding declaration at
> the top. Unicode literals continue to use [raw-]unicode-escape
> as codec.
>
> In phase 2, we enforce ASCII as default encoding, i.e.
> the warning will turn into an error. The [raw-]unicode-escape
> codec will be extended to also support converting Unicode
> to Unicode, that is, only handle escape sequences in this
> case.

+1
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From mal@lemburg.com  Wed Feb 27 09:48:25 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 10:48:25 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com> <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C7CAB69.FF421A6E@lemburg.com>

"Martin v. Loewis" wrote:
> 
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> > In phase 1, the tokenizer checks the *complete file* for
> > non-ASCII characters and outputs a single warning
> > per file if it doesn't find a coding declaration at
> > the top. Unicode literals continue to use [raw-]unicode-escape
> > as codec.
> 
> Do you suggest that in this phase, the declared encoding is not used
> for anything except to complain? 

No. This is just an extra step on top of what is proposed in
the PEP to make people aware of the problem in phase 1.
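
Roughly, the phase 1 check could look like the sketch below.  It is
written in present-day Python and is not the tokenizer code; the
function name check_source(), the warning text, and the choice to look
for the declaration in the first two lines are all assumptions made for
the example.

```python
import re
import warnings

# Same shape as the RE discussed for Emacs/vim comments, applied to bytes.
CODING_RE = re.compile(rb"coding[=:]\s*([\w-]+)")

def check_source(data, filename="<string>", header_lines=2):
    """Warn once if `data` (bytes) holds non-ASCII bytes but no declaration.

    Returns True when a warning was issued, False otherwise.
    """
    header = data.splitlines()[:header_lines]
    declared = any(CODING_RE.search(line) for line in header)
    # The complete file is scanned, not just the header.
    has_non_ascii = any(byte > 127 for byte in data)
    if has_non_ascii and not declared:
        warnings.warn("%s: non-ASCII character found, but no encoding "
                      "declared" % filename)
        return True
    return False
```

In phase 2 the same condition would raise an error instead of warning.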

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Wed Feb 27 09:56:45 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 10:56:45 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com> <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C7CAD5D.6692F44@lemburg.com>

I just got a private response about the proposal from Atsuo Ishimoto, 
Japan. They use two different encodings in day-to-day life (one for
Windows, one for Unix) and have their complete tool chain set up
to auto-convert all files between the two environments.

Recognizing the magic comment would pose a problem for them,
since their tools assume conversion to the PC's locale setting.

He proposed making the interpreter's default encoding the default
for source files which don't specify an encoding. That is
ASCII on all standard Python installations and different
encodings on tweaked installations.

He also told me that they put raw Shift-JIS and EUC-JP
into Python literal strings -- just like Europeans do
with Latin-1.

Wouldn't his suggestion be a good compromise for phase 2 ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Wed Feb 27 10:07:15 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 11:07:15 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>
Message-ID: <3C7CAFD3.60B32168@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > Jack had the same question. The simple answer is: we need this
> > in order to maintain backward compatibility when we move to
> > phase two of the implementation.
> >
> > Here's the longer one:
> >
> > ASCII is the standard encoding for Python keywords and identifiers.
> > There is no standard source code encoding for string literals.
> 
> But there is:
> 
>     Python uses the 7-bit ASCII character set for program text and
>     string literals.  8-bit characters may be used in string literals
>     and comments but their interpretation is platform dependent; the
>     proper way to insert 8-bit characters in string literals is by
>     using octal or hexadecimal escape sequences.
> 
> The Ref Man has said "7-bit ASCII" for both "program text and string
> literals" for a long time.  The formal grammar in the Ref Man agrees with
> this (including the formal grammar for Unicode literals).  It's an
> historical accident that the tokenizer happened to use C isalpha() to
> "enforce" this for identifiers, and that C isalpha() happened to grow
> locale-dependence while Guido was too drunk with power to notice <wink>.

It's a fact of life that users don't read reference manuals,
but simply write programs and feel good if they happen to
work :-)

As a result, programs have used string literals in many different
encodings for a long time. Changing this situation will take 
time. The proposal aims to clarify the situation and to
make the transition less painful.

> > Unicode literals are interpreted using 'unicode-escape' which
> > is an enhanced Latin-1 with escape semantics.
> 
> I'm sure they *do* "act like" Latin-1 on your box, and that identifiers also
> act like Latin-1 was in effect on your box.  But the Ref Man explicitly says
> all that is platform dependent; there's no "backward compatibility" to
> preserve here beyond 7-bit ASCII unless you want to preserve that Python
> always rely on what C isalpha() says.

You tell that to the Russians, Japanese or the Europeans 
writing Python programs -- it just happens that comments and
literals are bound to end up using local encodings.

Anyway, with the PEP implemented we'll no longer have to
restrict ourselves to 7-bit US-ASCII, so all these problems
will go away.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Wed Feb 27 11:16:25 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 12:16:25 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net> <3C7CAFD3.60B32168@lemburg.com>
Message-ID: <3C7CC009.F33C4AEE@lemburg.com>

I've updated the PEP with the new requirements.

	http://python.sourceforge.net/peps/pep-0263.html

The new scheme for the default encoding now follows the standard
procedure for all other conversions in Python which go
from strings to Unicode: use sys.getdefaultencoding().

This happens to be ASCII in all standard installations,
but sys admins may change it at their own risk and liking.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From tim.one@comcast.net  Wed Feb 27 08:25:02 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 27 Feb 2002 03:25:02 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7BECEC.E1550553@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>

[M.-A. Lemburg]
> Jack had the same question. The simple answer is: we need this
> in order to maintain backward compatibility when we move to
> phase two of the implementation.
>
> Here's the longer one:
>
> ASCII is the standard encoding for Python keywords and identifiers.
> There is no standard source code encoding for string literals.

But there is:

    Python uses the 7-bit ASCII character set for program text and
    string literals.  8-bit characters may be used in string literals
    and comments but their interpretation is platform dependent; the
    proper way to insert 8-bit characters in string literals is by
    using octal or hexadecimal escape sequences.

The Ref Man has said "7-bit ASCII" for both "program text and string
literals" for a long time.  The formal grammar in the Ref Man agrees with
this (including the formal grammar for Unicode literals).  It's an
historical accident that the tokenizer happened to use C isalpha() to
"enforce" this for identifiers, and that C isalpha() happened to grow
locale-dependence while Guido was too drunk with power to notice <wink>.

> Unicode literals are interpreted using 'unicode-escape' which
> is an enhanced Latin-1 with escape semantics.

I'm sure they *do* "act like" Latin-1 on your box, and that identifiers also
act like Latin-1 was in effect on your box.  But the Ref Man explicitly says
all that is platform dependent; there's no "backward compatibility" to
preserve here beyond 7-bit ASCII unless you want to preserve that Python
always rely on what C isalpha() says.



From mal@lemburg.com  Wed Feb 27 11:38:39 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 12:38:39 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCCELHNPAA.tim_one@email.msn.com>
Message-ID: <3C7CC53F.DF13D8D7@lemburg.com>

Tim Peters wrote:
> 
> The ECMAScript std (nee Javascript) has, I think, a good example of a usable
> facility that refused to get mired down in impossible details; e.g., it
> flat-out refuses to recognize leap seconds.  mxDateTime is similarly sane,
> but MAL keeps threatening to flirt with insanity <wink>.

FYI, mxDateTime tests the C lib for leap second support; if leap
seconds are used, then it has to support these too in conversions
from and to Unix ticks.

> BTW, I doubt there'd be any discussion of leap seconds in the C std if some
> astronomers hadn't been early Unix users.  It's never a net win in the end
> to try to make a scientist happy <0.9 wink>.

What's strange about leap seconds is that they don't fit well with
the idea of counting seconds since some fixed point in history.
They are only useful for conversions from this count to a broken
down date and time representation.... time simply doesn't
leap.

>From a comment in mxDateTime:
/* This function checks whether the system uses the POSIX time_t rules
   (which do not support leap seconds) or a time package with leap
   second support enabled. Return 1 if it uses POSIX time_t values, 0
   otherwise.

   POSIX: 1986-12-31 23:59:59 UTC == 536457599

   With leap seconds:             == 536457612

   (since there were 13 leap seconds in the years 1972-1985 according
   to the tz package available from ftp://elsie.nci.nih.gov/pub/)

*/
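
The two numbers in that comment can be reproduced with a few lines of
present-day Python.  This is only an illustration of the arithmetic, not
the mxDateTime check itself; the constant name is made up, and the count
of 13 leap seconds is simply taken from the comment.

```python
import calendar

# calendar.timegm() applies the POSIX rules: every day is exactly
# 86400 seconds long, so leap seconds are not counted.
posix_ticks = calendar.timegm((1986, 12, 31, 23, 59, 59))

LEAP_SECONDS_1972_1985 = 13  # figure taken from the comment above
leap_aware_ticks = posix_ticks + LEAP_SECONDS_1972_1985

print(posix_ticks)       # 536457599
print(leap_aware_ticks)  # 536457612
```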

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From andy@reportlab.com  Wed Feb 27 12:08:22 2002
From: andy@reportlab.com (Andy Robinson)
Date: Wed, 27 Feb 2002 12:08:22 -0000
Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1917 - 14 msgs
In-Reply-To: <E16foYw-0001dt-00@mail.python.org>
Message-ID: <LKENLBBMDHMKBECHIAIAAELNCEAA.andy@reportlab.com>

> > I propose adding an "abstract" money base type to the standard
> > library, to be subclassed by real money/decimal implementations.
> 
> Why do we need this?  I guess that would be Question #1...
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

I can think of 3 reasons; I've seen all these occur in real life.

Reason 1: Currency safety. Having a special type can rule out 
subtle programming errors. Imagine this:
 
>>   x = Money(3, "USD")
>>   y = Money(4.5, "NLG")
>>   z = x + y
TypeError:  can't add different currencies

Likewise, if you add or subtract money you get money; if you
divide money in the same currency you get a float; and just
about any other operation might be an error.  IMHO a basic type
should just rule out operations; subclasses could do clever
conversions etc.  (Does anyone need Euro triangulation rules 
in the Python standard library?)
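
A minimal sketch of such a base type, in present-day Python, might look
like this.  The class layout, the helper _check() and the error
messages are assumptions for illustration, not an existing library; a
real implementation would also want a decimal amount rather than a
float.

```python
class Money:
    """Hypothetical base type: an amount tagged with a currency code."""

    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def _check(self, other, verb):
        # Rule out anything that is not Money in the same currency.
        if not isinstance(other, Money):
            raise TypeError("can only %s Money and Money" % verb)
        if other.currency != self.currency:
            raise TypeError("can't %s different currencies" % verb)

    def __add__(self, other):
        self._check(other, "add")
        return Money(self.amount + other.amount, self.currency)

    def __sub__(self, other):
        self._check(other, "subtract")
        return Money(self.amount - other.amount, self.currency)

    def __truediv__(self, other):
        # Money / Money in the same currency is a plain ratio.
        self._check(other, "divide")
        return self.amount / other.amount

    def __repr__(self):
        return "Money(%r, %r)" % (self.amount, self.currency)


x = Money(3, "USD")
print(x + Money(2, "USD"))    # Money(5, 'USD')
print(Money(9, "USD") / x)    # 3.0
try:
    x + Money(4.5, "NLG")
except TypeError as exc:
    print("TypeError:", exc)  # TypeError: can't add different currencies
```

Multiplying two Money values is left undefined here, so it fails with a
TypeError as well.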

Reason 2: fixed decimals
SQL databases and AS400s have fixed decimal data types and
can do math on thousands or millions of numeric fields at
C-like speeds.  There would be a (very small) market for
a type that could do this.  

Reason 3: speed
If I went for a Python "money class" with smart behaviour,
I'd get a sizable speed hit compared to floats.  Let's say I 
want to average a time series of 1000 bond prices; it will be
faster on floats than on Python classes.

IMHO all these are best served by an extension package not
in the core language - but having a common base for them to
inherit from would get a thumbs-up from me.  


Best Regards,

Andy Robinson



From guido@python.org  Wed Feb 27 12:40:01 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 07:40:01 -0500
Subject: [Python-Dev] Re: OS/2 EMX changes to dynload_shlib.c
In-Reply-To: Your message of "Wed, 27 Feb 2002 08:44:36 +1100."
 <Pine.OS2.4.32.0202270837460.53-100000@tenring.andymac.org>
References: <Pine.OS2.4.32.0202270837460.53-100000@tenring.andymac.org>
Message-ID: <200202271240.g1RCe1x25304@pcp742651pcs.reston01.va.comcast.net>

> On Tue, 26 Feb 2002, Guido Van Rossum wrote:
> 
> > Given the number of OS/2 EMX specific changes to
> > dynload_shlib.c, wouldn't it be better to create a
> > separate dynload_os2.c?
> 
> It's just occurred to me that you might in fact be referring to the changes
> to import.c, which were in the same commit (& thus the same checkin
> message) and were extensive.  I did try to make it clear in the checkin
> message that changes to multiple files were committed.
> 
> It seems to be de rigueur to commit a single file at a time; something I
> hadn't appreciated and don't remember being advised about.  I will follow
> this practice from now on.

No need -- you're right, I mistook the diff, but that's purely my
fault.  Multi-file commits are quite common when they're related.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Feb 27 12:52:42 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 07:52:42 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: Your message of "Wed, 27 Feb 2002 10:16:18 +0100."
 <3C7CA3E2.C3705289@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
Message-ID: <200202271252.g1RCqgG25420@pcp742651pcs.reston01.va.comcast.net>

> So how about this:
> 
> In phase 1, the tokenizer checks the *complete file* for
> non-ASCII characters and outputs a single warning
> per file if it doesn't find a coding declaration at
> the top. Unicode literals continue to use [raw-]unicode-escape
> as codec.
> 
> In phase 2, we enforce ASCII as default encoding, i.e.
> the warning will turn into an error. The [raw-]unicode-escape
> codec will be extended to also support converting Unicode
> to Unicode, that is, only handle escape sequences in this
> case.

+1.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Feb 27 12:57:11 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 07:57:11 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: Your message of "Wed, 27 Feb 2002 10:56:45 +0100."
 <3C7CAD5D.6692F44@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <m38z9gudqd.fsf@mira.informatik.hu-berlin.de> <3C7CA3E2.C3705289@lemburg.com> <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
Message-ID: <200202271257.g1RCvB525447@pcp742651pcs.reston01.va.comcast.net>

> I just got a private response about the proposal from Atsuo Ishimoto, 
> Japan. They use two different encodings in day-to-day life (one for
> Windows, one for Unix) and have their complete tool chain set up
> to auto-convert all files between the two environments.
> 
> Recognizing the magic comment would pose a problem for them,
> since their tools assume conversion to the PC's locale setting.
> 
> He proposed making the interpreter's default encoding the default
> for source files which don't specify an encoding. That is
> ASCII on all standard Python installations and different
> encodings on tweaked installations.
> 
> He also told me that they put raw Shift-JIS and EUC-JP
> into Python literal strings -- just like Europeans do
> with Latin-1.
> 
> Wouldn't his suggestion be a good compromise for phase 2 ?

I'm OK with a way to change the default to something locale-specific,
as long as there's also a way to make the default strict ASCII (for
export).  Maybe python -A could force the default encoding to be ASCII
even if the locale specifies something different.

(I'd still *prefer* it the other way around, where you have to specify
an explicit option to make the default equal to the locale rather than
ASCII, but I can see the other side.  Sigh.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@zope.com  Wed Feb 27 13:01:43 2002
From: jim@zope.com (Jim Fulton)
Date: Wed, 27 Feb 2002 08:01:43 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCCEKDNPAA.tim@zope.com>
Message-ID: <3C7CD8B7.3E9A89A3@zope.com>

Tim Peters wrote:
> 
> [Jim Fulton]
> > ZODB has a TimeStamp type that uses a 32-bit unsigned integer
> > to store year, month, day, hour, and minute in a way that makes it dirt
> > simple to extract a component.
> 
> You really think so?  It's a mixed-radix scheme:
> 
>           v=((((y-1900)*12+mo-1)*31+d-1)*24+h)*60+m;
> 
> so requires lots of expensive integer division and remainder operations to
> pick apart again (the trend in CPUs is to make these relatively more
> expensive, not less, and e.g. Itanium doesn't even have an integer division
> instruction).

Compared to storing date-times as offsets from an epoch, this is
much simpler and cheaper.

> If we had this to do over again, I'd strongly suggest assigning 12 bits to
> the year, 4 to the month, 5 each to day and hour, and 6 to the minute.  The
> components would then be truly dirt simple and dirt cheap to extract, and we
> wouldn't even have to bother switching between 0-based and 1-based for the
> months and days (let 'em stay 1-based).  They would still sort and compare
> correctly in packed format.  The only downside I can see is that not
> pursuing every last drop of potential compression would shrink the dynamic
> range from 8000+ years to 4000+ years, but we're likely to have much worse
> problems in Zope by the year 5900 anyway <wink>.

Sounds good to me.
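
For comparison, here are both packings side by side as a Python sketch.
The mixed-radix formula is the one quoted above; the bit-field layout
follows the 12/4/5/5/6 split, with the year stored as an offset from
1900 as an assumption.  The function names are invented for the example.

```python
def pack_mixed(y, mo, d, h, m):
    """Mixed-radix packing, as in the formula quoted above."""
    return ((((y - 1900) * 12 + mo - 1) * 31 + d - 1) * 24 + h) * 60 + m

def unpack_mixed(v):
    """Undo pack_mixed(); every step needs a division and a remainder."""
    v, m = divmod(v, 60)
    v, h = divmod(v, 24)
    v, d = divmod(v, 31)
    y, mo = divmod(v, 12)
    return y + 1900, mo + 1, d + 1, h, m

def pack_bits(y, mo, d, h, m):
    """Bit-field packing: 12 bits year, 4 month, 5 day, 5 hour, 6 minute."""
    return ((y - 1900) << 20) | (mo << 16) | (d << 11) | (h << 6) | m

def unpack_bits(v):
    """Undo pack_bits() using only shifts and masks."""
    return ((v >> 20) + 1900, (v >> 16) & 0xF, (v >> 11) & 0x1F,
            (v >> 6) & 0x1F, v & 0x3F)

stamp = (2002, 2, 27, 8, 1)
print(unpack_mixed(pack_mixed(*stamp)))  # (2002, 2, 27, 8, 1)
print(unpack_bits(pack_bits(*stamp)))    # (2002, 2, 27, 8, 1)
```

Both packed forms compare in chronological order, since the most
significant component is the year in either scheme.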

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org


From guido@python.org  Wed Feb 27 13:07:15 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 08:07:15 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Wed, 27 Feb 2002 12:38:39 +0100."
 <3C7CC53F.DF13D8D7@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCCELHNPAA.tim_one@email.msn.com>
 <3C7CC53F.DF13D8D7@lemburg.com>
Message-ID: <200202271307.g1RD7FC25501@pcp742651pcs.reston01.va.comcast.net>

> FYI, mxDateTime tests the C lib for leap second support; if leap
> seconds are used, then it has to support these too in conversions
> from and to Unix ticks.

Since AFAIK POSIX doesn't admit the existence of leap seconds, how do
you ask the C library for leap seconds?

> > BTW, I doubt there'd be any discussion of leap seconds in the C
> > std if some astronomers hadn't been early Unix users.  It's never
> > a net win in the end to try to make a scientist happy <0.9 wink>.

Yeah, we learned that the hard way by adding complex numbers. :-)

> What's strange about leap seconds is that they don't fit well with
> the idea of counting seconds since some fixed point in history.
> They are only useful for conversions from this count to a broken
> down date and time representation.... time simply doesn't
> leap.
> 
> >From a comment in mxDateTime:
> /* This function checks whether the system uses the POSIX time_t rules
>    (which do not support leap seconds) or a time package with leap
>    second support enabled. Return 1 if it uses POSIX time_t values, 0
>    otherwise.
> 
>    POSIX: 1986-12-31 23:59:59 UTC == 536457599
> 
>    With leap seconds:             == 536457612
> 
>    (since there were 13 leap seconds in the years 1972-1985 according
>    to the tz package available from ftp://elsie.nci.nih.gov/pub/)
> 
> */

I think an important (but so far unvoiced) requirement is that
datetime objects can be stored in a database.  Since the database may
be read by systems that may or may not support leap seconds, we should
be independent of the leap second support in the C library. As I've
said before, we should ignore leap seconds.  Even if we end up
expressing time deltas as a number of seconds, that should be
understood to be calendar seconds and not astronomical seconds.  Let
the astronomers deal with leap seconds themselves -- they should know
how to.

BTW, this means that we can't use the C calls mktime(), timegm(),
localtime(), and gmtime(), or their Python wrappers in the time
module!  That's fine by me.
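
To show that calendar seconds need nothing from the C library, here is
a sketch in plain integer arithmetic.  It uses the well-known
"days from civil date" technique for the proleptic Gregorian calendar,
which is my choice for the example and not something proposed in this
thread; the function names are made up.

```python
def days_from_civil(year, month, day):
    """Days from 1970-01-01 to the given proleptic Gregorian date."""
    if month <= 2:
        year -= 1                         # count Jan/Feb with the previous year
    era, year_of_era = divmod(year, 400)  # 400-year cycles of 146097 days
    month_from_march = (month + 9) % 12   # March = 0 ... February = 11
    day_of_year = (153 * month_from_march + 2) // 5 + day - 1
    day_of_era = (year_of_era * 365 + year_of_era // 4
                  - year_of_era // 100 + day_of_year)
    return era * 146097 + day_of_era - 719468

def calendar_seconds(year, month, day, hour, minute, second):
    """Seconds since 1970-01-01 00:00:00, with every day 86400 s long."""
    days = days_from_civil(year, month, day)
    return days * 86400 + hour * 3600 + minute * 60 + second

print(calendar_seconds(1970, 1, 1, 0, 0, 0))        # 0
print(calendar_seconds(1986, 12, 31, 23, 59, 59))   # 536457599
```

The second value agrees with the POSIX figure in the mxDateTime comment,
because no leap seconds are counted.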

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Feb 27 13:27:10 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 08:27:10 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Wed, 27 Feb 2002 01:06:28 EST."
 <LNBBLJKPBEHFEDALKOLCCELHNPAA.tim_one@email.msn.com>
References: <LNBBLJKPBEHFEDALKOLCCELHNPAA.tim_one@email.msn.com>
Message-ID: <200202271327.g1RDRBl25604@pcp742651pcs.reston01.va.comcast.net>

[me]
> > Improving upon struct tm is hard (it has all the necessary fields and no
> > others), unless you want to add operations (just add methods) or make
> > the representation more compact (several of the fields can be packed in
> > 4-6 bits each).

[Tim]
> I'm surprised you say "all the necessary fields", because a tm contains no
> info about timezone.

Oops.  My mistake.  I thought it had timezone.

[tmx details snipped]

> > A third dimension might be to provide better date/time arithmetic, but
> > I'm not sure if there's much of a market for that, given all the fuzzy
> > semantics (leap seconds, differences across DST changes, timezones).
> 
> I don't think we can get off that easy.  Time computation is critical for
> businesses and astronomers, and leap seconds etc are a PITA independent of
> time computations.  Time computations seem to me to be the easiest of all,
> provided we've already "done something" intelligible about the rest:  any
> calculation boils down to factoring away leap seconds etc in conversion to a
> canonical form, doing the computing there, then injecting leap seconds etc
> back in to the result when converting out of canonical form again.
> 
> The ECMAScript std (nee Javascript) has, I think, a good example of a usable
> facility that refused to get mired down in impossible details; e.g., it
> flat-out refuses to recognize leap seconds.  mxDateTime is similarly sane,
> but MAL keeps threatening to flirt with insanity <wink>.
> 
> BTW, I doubt there'd be any discussion of leap seconds in the C std if some
> astronomers hadn't been early Unix users.  It's never a net win in the end
> to try to make a scientist happy <0.9 wink>.

I'd be happy to support time computations, provided we keep the leap
seconds out.

I propose a representation that resembles a compressed struct tm (or
tmx), with appropriately-sized bit fields for year, month, day, hour,
minute, second, millisecond, and microsecond, and timezone and DST
info.  Since the most likely situation is extraction in local time,
these should be stored as local time with an explicit timezone.  (I
don't want to store these things in a database without an explicit
timezone, even if it costs another 12 bit field.)  An app extracting
the local time without checking the timezone could be fooled by a time
stored with a different timezone.  Do we care?

Time computations are only slightly complex because they have to be
calendar-aware, but at least they don't have to be DST-aware -- they
can just take the timezone offset in minutes and apply it.

The DST info should probably be two bits: one telling whether DST is
in effect at the given time, one telling whether DST is honored in the
given timezone.  Maybe it should also allow "missing info" for either.
Details, details.
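
As a rough sketch of that representation (unpacked, and in present-day
Python): local-time fields plus an explicit offset in minutes and two
DST flags.  The name Stamp, the field names and the helper functions
are hypothetical, and the datetime/timedelta types are used here only
to do the calendar arithmetic.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical field layout; tz_minutes is the offset of local time from
# UTC.  Either DST flag may be None to mean "missing info".
Stamp = namedtuple(
    "Stamp",
    "year month day hour minute second tz_minutes dst_in_effect dst_honored")

def to_utc(stamp):
    """Apply the stored timezone offset; no DST rules are consulted."""
    local = datetime(stamp.year, stamp.month, stamp.day,
                     stamp.hour, stamp.minute, stamp.second)
    return local - timedelta(minutes=stamp.tz_minutes)

def seconds_between(a, b):
    """Calendar seconds from stamp a to stamp b."""
    return (to_utc(b) - to_utc(a)).total_seconds()

# The same instant stored with two different timezones.
berlin = Stamp(2002, 2, 27, 14, 0, 0, 60, False, True)
reston = Stamp(2002, 2, 27, 8, 0, 0, -300, False, True)
print(seconds_between(berlin, reston))   # 0.0
```

An app that compared only the hour fields of these two stamps would see
a six-hour difference, which is the "could be fooled" case.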

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Wed Feb 27 13:35:07 2002
From: mwh@python.net (Michael Hudson)
Date: 27 Feb 2002 13:35:07 +0000
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include patchlevel.h,2.60.2.1,2.60.2.1.2.1
In-Reply-To: Guido van Rossum's message of "Wed, 27 Feb 2002 08:11:57 -0500"
References: <E16g3kB-0008UJ-00@usw-pr-cvs1.sourceforge.net> <200202271311.g1RDBvW25563@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <2madtvkpys.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> > I *think* this is the only place I need to do this.
> 
> I think so too.

Good.

> > There are also some "(c) 2001"s that should probably be turned into
> > "(c) 2001, 2002"s -- should this be done on the trunk too?
> 
> Yes -- can you take care of it?

I'm such a sucker.

-- 
  > so python will fork if activestate starts polluting it?
  I find it more relevant to speculate on whether Python would fork
  if the merpeople start invading our cities riding on the backs of 
  giant king crabs.                 -- Brian Quinlan, comp.lang.python


From gward@python.net  Wed Feb 27 13:56:16 2002
From: gward@python.net (Greg Ward)
Date: Wed, 27 Feb 2002 08:56:16 -0500
Subject: [Python-Dev] PEP 215 redux: toward a simplified consensus?
In-Reply-To: <15482.64790.812638.147714@anthem.wooz.org>
References: <3C7AD939.FCFE163D@prescod.net> <200202260214.PAA23584@s454.cosc.canterbury.ac.nz> <15482.64790.812638.147714@anthem.wooz.org>
Message-ID: <20020227135616.GA8928@gerg.ca>

[Greg Ewing]
> I suggest '^', since it does a nice job of suggesting
> "inject stuff into this string". We can have both a
> prefix form for compile-time interpolation:
>
>   a = ^ "My name is $name"
>
> and an infix form for run-time interpolation:
>
>   a = "My name is $name" ^ dict

[Barry]
> I think I suggested using ~ for this at IPC10:
> 
>     a = ~'my name is $name'
> 
> for the compile-time interpolation.  I don't think it matters much
> which operator is chosen (let Guido decide).

-1 on all line-noise string modifiers.  (I just looked at Barry's
example and part of my reptilian hindbrain thought it was a regex match.
Don't do that to Perl and awk refugees, please!)

All existing string modifiers are letters; how about "i" for
"interpolation":

   a = i"my name is $name"

Assuming of course that we really do need yet another flavour of
strings...
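
For comparison, run-time interpolation with the same $name syntax can be
done without any new string flavour.  A minimal sketch, assuming a library
helper like the string.Template class that later Python releases provide
(it is not part of the Python under discussion here):

```python
from string import Template

name = "Guido"

# Run-time interpolation from an explicit mapping.
greeting = Template("my name is $name").substitute({"name": name})
print(greeting)  # my name is Guido
```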

        Greg
-- 
Greg Ward - programmer-at-large                         gward@python.net
http://starship.python.net/~gward/
Time flies like an arrow; fruit flies like a banana.


From mal@lemburg.com  Wed Feb 27 14:12:25 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 15:12:25 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <m38z9gudqd.fsf@mira.informatik.hu-berlin.de> <3C7CA3E2.C3705289@lemburg.com> <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com> <200202271257.g1RCvB525447@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7CE949.8893873D@lemburg.com>

Guido van Rossum wrote:
> 
> > I just got a private response about the proposal from Atsuo Ishimoto,
> > Japan. They use two different encodings in day-to-day life (one for
> > Windows, one for Unix) and have their complete tool chain set up
> > to auto-convert all files between the two environments.
> >
> > Recognizing the magic comment would pose a problem for them,
> > since their tools assume conversion to the PC's locale setting.
> >
> > He proposed to make the interpreter's default encoding the default
> > for source files which don't specify an encoding. That is
> > ASCII on all standard Python installations and different
> > encodings on tweaked installations.
> >
> > He also told me that they put raw Shift-JIS and EUC-JP
> > into Python literal strings -- just like Europeans do
> > with Latin-1.
> >
> > Wouldn't his suggestion be a good compromise for phase 2 ?
> 
> I'm OK with a way to change the default to something locale-specific,
> as long as there's also a way to make the default strict ASCII (for
> export).  Maybe python -A could force the default encoding to be ASCII
> even if the locale specifies something different.
> 
> (I'd still *prefer* it the other way around, where you have to specify
> an explicit option to make the default equal to the locale rather than
> ASCII, but I can see the other side.  Sigh.)

Let's put it this way: the interpreter's default encoding has
to be changed explicitly by the sys admin (in sitecustomize.py),
so the decision to take e.g. a locale specific default encoding
is one which the admin maintaining the installation has
to make (with all the consequences that go with it).

Per default, the default encoding is ASCII, so I don't
think we really need an extra option. 

Hmm, could be that python -S already implies this, BTW...
checking this reveals that even sys.setdefaultencoding()
remains available if -S is used. Perhaps we should remove
the API with -S too ?!

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Wed Feb 27 14:21:54 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 15:21:54 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCCELHNPAA.tim_one@email.msn.com> <200202271327.g1RDRBl25604@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7CEB82.3035BE20@lemburg.com>

The discussion is going astray again: Fredrik proposed an abstract
base type, i.e. a type providing only the name and an interface
which is defined as convention.

I am all for adding such an abstract base type (and others
as well, e.g. for numbers, sequences, money, decimal, etc.)
with minimal interfaces, but not for fixing a complex interface 
on top of these.

What you are currently discussing is heading in the direction
of implementing one or more time subclasses. That's two steps
ahead of what Fredrik was proposing.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Wed Feb 27 14:33:00 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 15:33:00 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCCELHNPAA.tim_one@email.msn.com>
 <3C7CC53F.DF13D8D7@lemburg.com> <200202271307.g1RD7FC25501@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7CEE1C.D6EF5C4B@lemburg.com>

Guido van Rossum wrote:
> 
> > FYI, mxDateTime test the C lib for leap second support; if leap
> > seconds are used, then it has to support these too in conversions
> > from and to Unix ticks.
> 
> Since AFAIK POSIX doesn't admit the existence of leap seconds, how do
> you ask the C library for leap seconds?

See below (the quoted C comment).
 
> > > BTW, I doubt there'd be any discussion of leap seconds in the C
> > > std if some astronomers hadn't been early Unix users.  It's never
> > > a net win in the end to try to make a scientist happy <0.9 wink>.
> 
> Yeah, we learned that the hard way by adding complex numbers. :-)
> 
> > What's strange about leap seconds is that they don't fit well with
> > the idea of counting seconds since some fixed point in history.
> > They are only useful for conversions from this count to a broken
> > down date and time representation.... time simply doesn't
> > leap.
> >
> > From a comment in mxDateTime:
> > /* This function checks whether the system uses the POSIX time_t rules
> >    (which do not support leap seconds) or a time package with leap
> >    second support enabled. Return 1 if it uses POSIX time_t values, 0
> >    otherwise.
> >
> >    POSIX: 1986-12-31 23:59:59 UTC == 536457599
> >
> >    With leap seconds:             == 536457612
> >
> >    (since there were 13 leap seconds in the years 1972-1985 according
> >    to the tz package available from ftp://elsie.nci.nih.gov/pub/)
> >
> > */
> 
> I think an important (but so far unvoiced) requirement is that
> datetime objects can be stored in a database.  Since the database may
> be read by systems that may or may not support leap seconds, ...

SQL databases don't deal with leap seconds. They store
the broken down value (in some way) without time zone information
and that's it, fortunately :-)
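
Going back to the mxDateTime comment quoted above, the POSIX value is easy
to reproduce.  A minimal sketch using calendar.timegm(), which follows the
POSIX rules; the constant name is made up for illustration and the 13 second
figure is taken from the comment, not computed:

```python
import calendar

# calendar.timegm() applies POSIX rules: every day has exactly 86400 seconds.
posix_ticks = calendar.timegm((1986, 12, 31, 23, 59, 59))
print(posix_ticks)  # 536457599

# A leap-second-aware time package would report 13 seconds more for the
# same instant, per the comment quoted above.
LEAP_SECONDS_1972_1985 = 13
print(posix_ticks + LEAP_SECONDS_1972_1985)  # 536457612
```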

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From Jack.Jansen@oratrix.com  Wed Feb 27 14:40:43 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Wed, 27 Feb 2002 15:40:43 +0100
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <3C7CEB82.3035BE20@lemburg.com>
Message-ID: <FAA6A80D-2B8F-11D6-9384-0030655234CE@oratrix.com>

On Wednesday, February 27, 2002, at 03:21 , M.-A. Lemburg wrote:

> The discussion is going astray again: Fredrik proposed an abstract
> base type, i.e. a type providing only the name and an interface
> which is defined as convention.
>
> I am all for adding such an abstract base type (and others
> as well, e.g. for numbers, sequences, money, decimal, etc.)
> with minimal interfaces, but not for fixing a complex interface
> on top of these.

Oops, I had missed that bit as well, that adding an *abstract* base type 
was the intention.

I'm all for that as well.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -



From guido@python.org  Wed Feb 27 14:42:09 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 09:42:09 -0500
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: Your message of "Wed, 27 Feb 2002 15:12:25 +0100."
 <3C7CE949.8893873D@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de> <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk> <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net> <3C7BECEC.E1550553@lemburg.com> <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net> <m38z9gudqd.fsf@mira.informatik.hu-berlin.de> <3C7CA3E2.C3705289@lemburg.com> <m3sn7nz360.fsf@mira.informatik.hu-berlin.de> <3C7CAD5D.6692F44@lemburg.com> <200202271257.g1RCvB525447@pcp742651pcs.reston01.va.comcast.net>
 <3C7CE949.8893873D@lemburg.com>
Message-ID: <200202271442.g1REg9D25882@pcp742651pcs.reston01.va.comcast.net>

> > (I'd still *prefer* it the other way around, where you have to specify
> > an explicit option to make the default equal to the locale rather than
> > ASCII, but I can see the other side.  Sigh.)
> 
> Let's put it this way: the interpreter's default encoding has
> to be changed explicitly by the sys admin (in sitecustomize.py),
> so the decision to take e.g. a locale specific default encoding
> is one which the admin maintaining the installation has
> to make (with all the consequences that go with it).

OK.  I missed that part -- I thought that it would look in the locale
by default.

> Per default, the default encoding is ASCII, so I don't
> think we really need an extra option. 

Agreed.

> Hmm, could be that python -S already implies this, BTW...

:-)

> checking this reveals that even sys.setdefaultencoding()
> remains available if -S is used. Perhaps we should remove
> the API with -S too ?!

I don't think so.  It should be left in, caveat emptor.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Feb 27 14:43:37 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 09:43:37 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Wed, 27 Feb 2002 15:21:54 +0100."
 <3C7CEB82.3035BE20@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCCELHNPAA.tim_one@email.msn.com> <200202271327.g1RDRBl25604@pcp742651pcs.reston01.va.comcast.net>
 <3C7CEB82.3035BE20@lemburg.com>
Message-ID: <200202271443.g1REhbQ25904@pcp742651pcs.reston01.va.comcast.net>

> The discussion is going astray again: Fredrik proposed an abstract
> base type, i.e. a type providing only the name and an interface
> which is defined as convention.
> 
> I am all for adding such an abstract base type (and others
> as well, e.g. for numbers, sequences, money, decimal, etc.)
> with minimal interfaces, but not for fixing a complex interface 
> on top of these.
> 
> What you are currently discussing is heading in the direction
> of implementing one or more time subclasses. That's two steps
> ahead of what Fredrik was proposing.

Good point.  The two discussions are both useful, but should be
separated.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jacobs@penguin.theopalgroup.com  Wed Feb 27 15:11:56 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 27 Feb 2002 10:11:56 -0500 (EST)
Subject: [Python-Dev] proposal: add basic time type to the standard
 library
In-Reply-To: <3C7CEE1C.D6EF5C4B@lemburg.com>
Message-ID: <Pine.LNX.4.33.0202270957390.28771-100000@penguin.theopalgroup.com>

On Wed, 27 Feb 2002, M.-A. Lemburg wrote:
> SQL databases don't deal with leap seconds. They store
> the broken down value (in some way) without time zone information
> and that's it, fortunately :-)

Er...  SQL99 (and I believe SQL92) have native support for time with and
without time zones, and neither says anything about how databases are to
"store" those values.  I don't have a copy in front of me, so I can't tell
you what they say about leap-seconds.  Of course, few implementations
support this yet, though it is worth being forward-looking.

For my own uses, I have a base time class that encapsulates either
mxDateTime objects or unix time-since-epoch, and implements the basic time
and date accessors and simple arithmetic.  A subclass of that type then adds
awareness of timezones and daylight savings time.  My first effort at trying
to do all of those things in one big monolithic class was a nightmare.  This
layering does result in some (relative) inefficiency, but correctness and
maintainability is vastly more important to me.
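
The layering looks roughly like this.  A much reduced sketch, not the real
code: the class names BaseTime and ZonedTime are made up for illustration,
only two accessors are shown, and DST handling is left out entirely:

```python
from datetime import datetime, timedelta


class BaseTime:
    """Timezone-unaware time: wraps seconds since the Unix epoch (UTC)."""

    def __init__(self, ticks):
        self.ticks = ticks

    def _broken_down(self):
        # Plain calendar arithmetic from the epoch; no timezone involved.
        return datetime(1970, 1, 1) + timedelta(seconds=self.ticks)

    @property
    def year(self):
        return self._broken_down().year

    @property
    def hour(self):
        return self._broken_down().hour

    def __sub__(self, other):
        """Difference between two times, in seconds."""
        return self.ticks - other.ticks


class ZonedTime(BaseTime):
    """Adds a fixed UTC offset (in minutes) on top of the base accessors."""

    def __init__(self, ticks, offset_minutes=0):
        super().__init__(ticks)
        self.offset_minutes = offset_minutes

    def _broken_down(self):
        # Shift the UTC fields into local wall-clock time.
        return super()._broken_down() + timedelta(minutes=self.offset_minutes)


t = ZonedTime(536457599, offset_minutes=60)
print(t.year, t.hour)  # 1987 0
```

The accessors live only in the base class; the subclass overrides one
internal hook, which is where the extra indirection (and cost) comes from.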

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From jepler@unpythonic.dhs.org  Wed Feb 27 17:10:59 2002
From: jepler@unpythonic.dhs.org (Jeff Epler)
Date: Wed, 27 Feb 2002 11:10:59 -0600
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7CA4F9.86626985@lemburg.com>
References: <3C7B5E35.129E5501@lemburg.com> <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de> <15483.53993.852170.135298@anthem.wooz.org> <m3n0xww01p.fsf@mira.informatik.hu-berlin.de> <3C7BE70B.74713ED2@lemburg.com> <20020226150143.C16980@unpythonic.dhs.org> <3C7CA4F9.86626985@lemburg.com>
Message-ID: <20020227111058.B30863@unpythonic.dhs.org>

On Wed, Feb 27, 2002 at 10:20:57AM +0100, M.-A. Lemburg wrote:
> So if we use the RE "coding[=:]\s*([\w-]+)" on the first line,
> we should be able to reach out for the encoding, right ?
> 
> This RE would then cover both vim and emacs.

I've been informed on a #vim irc channel that "vim:fileencoding=blah:"
does not work.  Unfortunate.  I overlooked the part of the documentation
which states
    To read a file in a certain encoding it won't work by setting
    'fileencoding', use the |++enc| argument.

However, there's a "charset plugin" for vim:
    http://vim.sourceforge.net/scripts/script.php?script_id=199
which could be adapted to follow whatever convention is chosen for
Python.  However, this plugin is not standard in any version of
vim.  It's not clear what license it's under, but referencing it from
the PEP and documenting that something like
    au BufReadPost *.py ReloadWhenCharset(1, "coding[:=]\s([\w-]+)")
    au BufReadPost *.py ReloadWhenCharset(2, "coding[:=]\s([\w-]+)")
(search the first two lines for the emacs coding special marker) would
cause it to detect the charset of a Python file would certainly be
possible.  The plugin functions by executing a reload of the file with
++enc when ReloadWhenCharset matches its pattern.

Jeff


From skip@pobox.com  Wed Feb 27 17:20:51 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 27 Feb 2002 11:20:51 -0600
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7CAFD3.60B32168@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>
 <3C7CAFD3.60B32168@lemburg.com>
Message-ID: <15485.5491.709403.99698@beluga.mojam.com>

    >> Python uses the 7-bit ASCII character set for program text and string
    >> literals.  8-bit characters may be used in string literals and
    >> comments but their interpretation is platform dependent; the proper
    >> way to insert 8-bit characters in string literals is by using octal
    >> or hexadecimal escape sequences.

    mal> It's a fact of life that users don't read reference manuals, but
    mal> simply write programs and feel good if they happen to work :-)

Perhaps a warning should be emitted by the compiler if a plain string
literal is found that contains 8-bit characters.  Better yet, perhaps Neal
can add this to PyChecker if he hasn't already...
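
A rough sketch of such a check, assuming a hypothetical helper named
find_8bit_lines (this is not PyChecker code).  It flags whole lines rather
than only string literals, which a real check would narrow down with the
tokenizer:

```python
def find_8bit_lines(source_bytes):
    """Return 1-based line numbers that contain any byte above 0x7F."""
    hits = []
    for lineno, line in enumerate(source_bytes.splitlines(), start=1):
        if any(byte > 0x7F for byte in bytearray(line)):
            hits.append(lineno)
    return hits


sample = b'x = "abc"\ny = "caf\xe9"\n'
print(find_8bit_lines(sample))  # [2]
```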

Skip


From martin@v.loewis.de  Wed Feb 27 17:26:54 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 27 Feb 2002 18:26:54 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7CAD5D.6692F44@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
Message-ID: <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> He also told me that they put raw Shift-JIS and EUC-JP
> into Python literal strings -- just like Europeans do
> with Latin-1.

I expected that much; choosing Latin-1 as the default encoding is
certainly Euro-centric.

At the moment, declaring either eucJP or Shift-JIS wouldn't work
with the proposed implementation, anyway, since those encodings are
not supported in the standard Python installation.

> Wouldn't his suggestion be a good compromise for phase 2 ?

This raises the question what exactly should be deprecated. AFAIK,
both eucJP and Shift-JIS use non-ASCII bytes to denote Japanese
characters, so they'd get a DeprecationWarning on every file. However,
they could not put an encoding declaration into the file, as Python
would not recognize the encoding.

I don't see the convention to convert as too much of a stumbling
block; to my knowledge, many editors can display text in both
encodings correctly these days (but I may be wrong with that
assumption).

Regards,
Martin


From mal@lemburg.com  Wed Feb 27 17:31:13 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 18:31:13 +0100
Subject: [Python-Dev] proposal: add basic time type to the standardlibrary
References: <Pine.LNX.4.33.0202270957390.28771-100000@penguin.theopalgroup.com>
Message-ID: <3C7D17E1.326DB5AF@lemburg.com>

Kevin Jacobs wrote:
> 
> On Wed, 27 Feb 2002, M.-A. Lemburg wrote:
> > SQL databases don't deal with leap seconds. They store
> > the broken down value (in some way) without time zone information
> > and that's it, fortunately :-)
> 
> Er...  SQL99 (and I believe SQL92) have native support for time with and
> without time zones, and neither says anything about how databases are to
> "store" those values.  I don't have a copy in front of me, so I can't tell
> you what they say about leap-seconds.  Of course, few implementations
> support this yet, though it is worth being forward-looking.

True, SQL-92 defines data types "TIME WITH TIME ZONE" and
"TIMESTAMP WITH TIME ZONE". The standard is only available
as a book, but here's a draft which has all the details:

http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

Still, only Oracle and PostgreSQL seem to actually implement these 
and ODBC (SQL/CLI), the de facto standard for database interfacing, 
doesn't even provide interfaces to query or store time zone 
information (you can put the information directly in the SQL 
string, but not use it in bound variables).

Basically, you should not store local time in databases,
but instead use UTC. If you need the original time zone 
information for reference, you'd keep this in separate
DB columns (e.g. as strings).
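
A minimal sketch of that split, assuming a datetime/timezone API like the
one in current Python releases.  The helper name to_db_row and the
"+HH:MM" string format are made up for illustration:

```python
from datetime import datetime, timedelta, timezone


def to_db_row(local_naive, offset_minutes):
    """Split a local time into (UTC timestamp, original offset as a string)."""
    tz = timezone(timedelta(minutes=offset_minutes))
    aware = local_naive.replace(tzinfo=tz)
    utc = aware.astimezone(timezone.utc).replace(tzinfo=None)
    sign = "+" if offset_minutes >= 0 else "-"
    hours, minutes = divmod(abs(offset_minutes), 60)
    return utc, "%s%02d:%02d" % (sign, hours, minutes)


row = to_db_row(datetime(2002, 2, 27, 18, 31, 13), 60)
print(row[0], row[1])  # 2002-02-27 17:31:13 +01:00
```

The first element goes into a plain TIMESTAMP column, the second into a
separate string column that is only used for display.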

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jepler@unpythonic.dhs.org  Wed Feb 27 17:32:50 2002
From: jepler@unpythonic.dhs.org (Jeff Epler)
Date: Wed, 27 Feb 2002 11:32:50 -0600
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <20020227111058.B30863@unpythonic.dhs.org>
References: <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de> <15483.53993.852170.135298@anthem.wooz.org> <m3n0xww01p.fsf@mira.informatik.hu-berlin.de> <3C7BE70B.74713ED2@lemburg.com> <20020226150143.C16980@unpythonic.dhs.org> <3C7CA4F9.86626985@lemburg.com> <20020227111058.B30863@unpythonic.dhs.org>
Message-ID: <20020227113249.C30863@unpythonic.dhs.org>

This actually works in vim with "charset plugin":
        let s:pep263='coding[:=]\s*\([-A-Za-z0-9_]\+\)'
        au BufReadPost *.py call ReloadWhenCharsetSet(1, s:pep263)
        au BufReadPost *.py call ReloadWhenCharsetSet(2, s:pep263)
It searches for a RE compatible with PEP263 in the first and second lines.

You could change the pattern from *.py to * if you want to recognize the
emacs-style coding in all files.
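
For reference, roughly the same search written with Python's re module.
This is only a sketch of the idea, not the code the PEP implementation
would use; CODING_RE and find_coding are names made up for illustration:

```python
import re

# Roughly the same pattern as the vim RE above, in Python RE syntax.
CODING_RE = re.compile(r"coding[:=]\s*([-\w]+)")


def find_coding(lines):
    """Return the declared encoding from the first two lines, or None."""
    for line in lines[:2]:
        match = CODING_RE.search(line)
        if match:
            return match.group(1)
    return None


print(find_coding(["#!/usr/bin/env python", "# -*- coding: latin-1 -*-"]))
print(find_coding(["# coding=utf-8"]))
print(find_coding(["import sys"]))
```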

Jeff


From mal@lemburg.com  Wed Feb 27 17:36:39 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 18:36:39 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>
 <3C7CAFD3.60B32168@lemburg.com> <15485.5491.709403.99698@beluga.mojam.com>
Message-ID: <3C7D1927.3414607E@lemburg.com>

Skip Montanaro wrote:
> 
>     >> Python uses the 7-bit ASCII character set for program text and string
>     >> literals.  8-bit characters may be used in string literals and
>     >> comments but their interpretation is platform dependent; the proper
>     >> way to insert 8-bit characters in string literals is by using octal
>     >> or hexadecimal escape sequences.
> 
>     mal> It's a fact of life that users don't read reference manuals, but
>     mal> simply write programs and feel good if they happen to work :-)
> 
> Perhaps a warning should be emitted by the compiler if a plain string
> literal is found that contains 8-bit characters.  Better yet, perhaps Neal
> can add this to PyChecker if he hasn't already...

See the PEP: this is what phase 1 will do; phase 2 won't accept such
a file without an explicit encoding declaration.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From steve@cat-box.net  Wed Feb 27 17:44:28 2002
From: steve@cat-box.net (Steve Alexander)
Date: Wed, 27 Feb 2002 17:44:28 +0000
Subject: [Python-Dev] Supporting precision in a DateTime type
Message-ID: <3C7D1AFC.4050809@cat-box.net>

Hi folks,

On #zope3-dev, we were discussing how best to implement a DateTime type 
in Python.

Leaving aside arguments of whether to store it as a packed C tuple, or 
as ms since an epoch, I'd like to think about the concept of precision 
as it relates to dates and times.

As a writer of software applications that people use in people settings, 
I like to use types that reflect the elements of reality that people 
find important.

One aspect of time that is important is its precision. Here's an example

   How long is it between 1992 and March 15, 1993 ?

There isn't a sensible answer. Or, rather, there are many answers, some 
more sensible than others. The correct answer might be "1 year", a date 
range, or an error (perhaps a ValueError). In any case, the correct 
answer depends on the nature of the application.

Thus, if I'm only interested in using dates, such as in an application 
where I'm interested in birthdays, I want to be able to describe a date 
without reference to a particular time. It isn't just a default time, it 
is a "no time specified".
So, I won't get caught later on if I compare that datetime instance with 
another that has a different precision.

It is often possible to resolve differing precisions in an 
application-specific way.

Another way of thinking about precision is as a constraint on possible 
more precise values. So, I can play an April fool prank any time in the 
morning of April 1, in my local time-zone. The actual exact time of my 
pranks will fall within the less precise constraint. This makes dates 
with precision similar to durations.

Common precisions in applications include years, months, iso weeks of a 
year, days.
Any finer precision doesn't really matter; the max precision of time in 
C is ok for most human purposes.

Although you could catch some cases by having distinct types for dates 
and times, this only captures the precision of days. It doesn't help for 
other precisions.
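
To make this concrete, here is a minimal sketch of a date that carries its
precision and behaves like a constraint on more precise values.  The class
name TimePoint, its methods, and the choice to raise ValueError on mixed
precisions are all assumptions for illustration; an application might
resolve the mismatch differently:

```python
from datetime import date, timedelta

YEAR, MONTH, DAY = "year", "month", "day"


class TimePoint:
    """A date known only to a given precision (hypothetical sketch)."""

    def __init__(self, year, month=None, day=None):
        self.year, self.month, self.day = year, month, day
        if month is None:
            self.precision = YEAR
        elif day is None:
            self.precision = MONTH
        else:
            self.precision = DAY

    def as_range(self):
        """Return the (first, last) calendar days this point could mean."""
        if self.precision == YEAR:
            return date(self.year, 1, 1), date(self.year, 12, 31)
        if self.precision == MONTH:
            first = date(self.year, self.month, 1)
            # First day of the following month, minus one day.
            next_first = date(self.year + self.month // 12,
                              self.month % 12 + 1, 1)
            return first, next_first - timedelta(days=1)
        day = date(self.year, self.month, self.day)
        return day, day

    def __sub__(self, other):
        if self.precision != other.precision:
            raise ValueError("cannot subtract %s precision from %s precision"
                             % (other.precision, self.precision))
        return self.as_range()[0] - other.as_range()[0]


print(TimePoint(1992).as_range())
try:
    TimePoint(1993, 3, 15) - TimePoint(1992)
except ValueError as exc:
    print("ValueError:", exc)
```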

Here's a paper I found via google, that discusses these issues:

   http://www.martinfowler.com/ap2/timePoint.html


ps. I'm not a regular reader of python-dev. Guido suggested I post this 
here for further discussion.
I'll catch up via the web eventually, but please cc me into any relevant 
replies.

--
Steve Alexander



From mal@lemburg.com  Wed Feb 27 17:43:10 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 18:43:10 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com> <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C7D1AAE.234A01F7@lemburg.com>

"Martin v. Loewis" wrote:
> 
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> > He also told me that they put raw Shift-JIS and EUC-JP
> > into Python literal strings -- just like Europeans do
> > with Latin-1.
> 
> I expected that much; choosing Latin-1 as the default encoding is
> certainly Euro-centric.
> 
> At the moment, declaring either eucJP or Shift-JIS wouldn't work
> with the proposed implementation, anyway, since those encodings are
> not supported in the standard Python installation.

But they will be using Tamito's Japanese codecs... and, of course,
they do work now in string literals, since there is no enforcement of 
any encoding in the compiler.
 
> > Wouldn't his suggestion be a good compromise for phase 2 ?
> 
> This raises the question what exactly should be deprecated. AFAIK,
> both eucJP and Shift-JIS use non-ASCII bytes to denote Japanese
> characters, so they'd get a DeprecationWarning on every file. However,
> they could not put an encoding declaration into the file, as Python
> would not recognize the encoding.

With Tamito's codecs installed, this wouldn't be a problem. 
Putting the encoding comment in the files will turn the compiler
quiet in phase 1 and in phase 2 assure that their editors
do in fact use the defined encoding.

FYI, I've updated the PEP to use the interpreter's default 
encoding as basis for the source file encoding too.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Wed Feb 27 17:44:09 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 18:44:09 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de> <3C7B6322.440D21E7@lemburg.com> <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de> <15483.53993.852170.135298@anthem.wooz.org> <m3n0xww01p.fsf@mira.informatik.hu-berlin.de> <3C7BE70B.74713ED2@lemburg.com> <20020226150143.C16980@unpythonic.dhs.org> <3C7CA4F9.86626985@lemburg.com> <20020227111058.B30863@unpythonic.dhs.org> <20020227113249.C30863@unpythonic.dhs.org>
Message-ID: <3C7D1AE9.11C2740F@lemburg.com>

Jeff Epler wrote:
> 
> This actually works in vim with "charset plugin":
>         let s:pep263='coding[:=]\s*\([-A-Za-z0-9_]\+\)'
>         au BufReadPost *.py call ReloadWhenCharsetSet(1, s:pep263)
>         au BufReadPost *.py call ReloadWhenCharsetSet(2, s:pep263)
> It searches for a RE compatible with PEP263 in the first and second lines.
> 
> You could change the pattern from *.py to * if you want to recognize the
> emacs-style coding in all files.

Great ! So we can say that the RE fits vim and emacs, right ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jacobs@penguin.theopalgroup.com  Wed Feb 27 17:45:17 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Wed, 27 Feb 2002 12:45:17 -0500 (EST)
Subject: [Python-Dev] proposal: add basic time type to the standardlibrary
In-Reply-To: <3C7D17E1.326DB5AF@lemburg.com>
Message-ID: <Pine.LNX.4.33.0202271237200.30291-100000@penguin.theopalgroup.com>

On Wed, 27 Feb 2002, M.-A. Lemburg wrote:
> True, SQL-92 defines data types "TIME WITH TIME ZONE" and
> "TIMESTAMP WITH TIME ZONE". The standard is only available
> as a book, but here's a draft which has all the details:
>
> http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
>
> Still, only Oracle and PostgreSQL seem to actually implement these
> and ODBC (SQL/CLI), the de facto standard for database interfacing,
> doesn't even provide interfaces to query or store time zone
> information (you can put the information directly in the SQL
> string, but not use it in bound variables).

Strangely enough I use TIMESTAMP WITH TIME ZONE quite a bit on both Oracle
and PostgreSQL using native drivers.  I'm also fairly sure that Sybase and
MS-SQL store timestamps with timezone somehow, though my memory on the
project that did so is a little fuzzy.

> Basically, you should not store local time in databases,
> but instead use UTC. If you need the original time zone
> information for reference, you'd keep this in separate
> DB columns (e.g. as strings).

Why not minute offset from UTC like C99?

Anyhow, everyone knows that time zones and daylight savings time are a pain
to deal with.  However, let's work toward a sane implementation that
can relieve the end-user from having to smack their head against this
particular brick wall every time.  (even if it means smacking our collective
heads against the brick wall until we're happy, or reduced to unintelligible
ranting, or possibly both).

Regards,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From skip@pobox.com  Wed Feb 27 18:30:55 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 27 Feb 2002 12:30:55 -0600
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7D1927.3414607E@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>
 <3C7CAFD3.60B32168@lemburg.com>
 <15485.5491.709403.99698@beluga.mojam.com>
 <3C7D1927.3414607E@lemburg.com>
Message-ID: <15485.9695.126411.600632@beluga.mojam.com>

    >> Perhaps a warning should be emitted by the compiler if a plain string
    >> literal is found that contains 8-bit characters.  Better yet, perhaps
    >> Neal can add this to PyChecker if he hasn't already...

    mal> See the PEP: this is what phase 1 will do; phase 2 won't accept
    mal> such a file without an explicit encoding declaration.

That wasn't what I was getting at.  The quoted part of the reference manual
seemed to suggest that programmers should be using hex escapes in string
literals instead of 8-bit characters.  This doesn't seem to me to be related
to what encoding the file is in.

Skip



From David Abrahams" <david.abrahams@rcn.com  Wed Feb 27 18:46:34 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 27 Feb 2002 13:46:34 -0500
Subject: [Python-Dev] Supporting precision in a DateTime type
References: <3C7D1AFC.4050809@cat-box.net>
Message-ID: <133401c1bfbf$46e3fd90$0500a8c0@boostconsulting.com>

----- Original Message -----
From: "Steve Alexander" <steve@cat-box.net>
> Here's a paper I found via google, that discusses these issues:
>
>    http://www.martinfowler.com/ap2/timePoint.html
>
>
> ps. I'm not a regular reader of python-dev. Guido suggested I post this
> here for further discussion.

You might also be interested in what's happening in this area in the C++
world. AFAIK the most advanced C++ date/time library development is centered
here:

    http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?GDTL

with a nice paper from OOPSLA available here:

    http://www.oonumerics.org/tmpw01/garland.pdf

or-you-might-run-away-screaming-ly y'rs,
Dave



From Barrett@stsci.edu  Wed Feb 27 19:04:40 2002
From: Barrett@stsci.edu (Paul Barrett)
Date: Wed, 27 Feb 2002 14:04:40 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
References: <LNBBLJKPBEHFEDALKOLCCELHNPAA.tim_one@email.msn.com>              <3C7CC53F.DF13D8D7@lemburg.com> <200202271307.g1RD7FC25501@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7D2DC8.90802@STScI.Edu>

Guido van Rossum wrote:

> 
> I think an important (but so far unvoiced) requirement is that
> datetime objects can be stored in a database.  Since the database may
> be read by systems that may or may not support leap seconds, we should
> be independent of the leap second support in the C library. As I've
> said before, we should ignore leap seconds.  Even if we end up
> expressing times deltas as a number of seconds, that should be
> understood to be calendar seconds and not astronomical seconds.  Let
> the astronomers deal with leap seconds themselves -- they should know
> how to.

As for us astronomers, we're supposed to represent time in Julian days and 
fractions thereof since the beginning of time (about 6714 years ago).  Today is 
day 2452346.  In practice we use whole days and represent the fractional part in 
seconds, because floating point numbers don't have a sufficient number of bits 
to represent Julian days to nanosecond precision.  A typical day contains 86400 
seconds.  In essence we use Julian days as our reference point and seconds of a 
day as our delta time.  From these two values you can theoretically calculate 
any time past, present, or future with or without leap seconds (if known).
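
A rough sketch of the precision argument, assuming IEEE 754 double
precision floats (53 significant bits).  The helper names are
hypothetical.  It estimates the smallest time step a single float
Julian day can resolve and shows the (whole day, seconds of day) pair
described above.

```python
import math

SECONDS_PER_DAY = 86400


def float_resolution_seconds(julian_day):
    """Spacing between adjacent doubles near julian_day, in seconds."""
    # frexp gives julian_day = m * 2**exponent with 0.5 <= m < 1,
    # so adjacent doubles are 2**(exponent - 53) apart.
    _, exponent = math.frexp(julian_day)
    return math.ldexp(1.0, exponent - 53) * SECONDS_PER_DAY


def normalise(day, seconds):
    """Carry whole days out of the seconds part: 0 <= seconds < 86400."""
    extra_days, seconds = divmod(seconds, SECONDS_PER_DAY)
    return day + int(extra_days), seconds


# A Julian day of the magnitude quoted above resolves to tens of
# microseconds at best, far coarser than a nanosecond.
resolution = float_resolution_seconds(2452346.5)
```

Keeping the day as an integer and the seconds as a separate value
means the seconds part never has to share its bits with a seven-digit
day count.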

Just thought you might like to know, if you didn't already.

  -- Paul

-- 
Paul Barrett, PhD      Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218



From guido@python.org  Wed Feb 27 19:55:40 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 14:55:40 -0500
Subject: [Python-Dev] Supporting precision in a DateTime type
In-Reply-To: Your message of "Wed, 27 Feb 2002 17:44:28 GMT."
 <3C7D1AFC.4050809@cat-box.net>
References: <3C7D1AFC.4050809@cat-box.net>
Message-ID: <200202271955.g1RJteW26624@pcp742651pcs.reston01.va.comcast.net>

Thanks Steve, for posting this summary.  I'm going to take a different
route for now but will keep your remarks in mind.  Martin Fowler's
note on time points was really helpful!

Also thanks to David Abrahams for the pointer to Boost GDTL.  Python's
standard date/time type will relate to GDTL like Python's iterator
concept relates to C++ iterators. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From David Abrahams" <david.abrahams@rcn.com  Wed Feb 27 20:01:50 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 27 Feb 2002 15:01:50 -0500
Subject: [Python-Dev] Alignment assumptions
Message-ID: <13b201c1bfc9$c94d1b90$0500a8c0@boostconsulting.com>

A quick grep-find through the Python-2.2 sources reveals the following:

Include/dictobject.h:49: long aligner;
Include/objimpl.h:275: double dummy;  /* force worst-case alignment */
Modules/addrinfo.h:162: LONG_LONG __ss_align; /* force desired structure storage alignment */
Modules/addrinfo.h:164: double __ss_align; /* force desired structure storage alignment */

At first glance, there appear to be different assumptions at work here about
what constitutes maximal alignment on any given platform. I've been using a
little C++ metaprogram to find a type which will properly align any other
given type. Because of limitations of one compiler, I had to disable the
computation and instead used the objimpl.h assumption that double was
maximally aligned, but also added a compile-time assertion to check that the
alignment is always greater than or equal to that of the target type. Well,
it failed today on Tru64 Unix with the latest compaq CXX 6.5 prerelease
compiler; it appears that the alignment of long double is greater than that
of double on that platform.
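
One way to probe this on a given machine, sketched with the ctypes
module (an assumption: ctypes is not part of the Python 2.2 sources
discussed here, and it reports what the C compiler used to build it
chose).  The CANDIDATES table is a hypothetical selection of types,
not an exhaustive one.

```python
import ctypes

# Types that the headers above use, or could use, as an "aligner".
CANDIDATES = {
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
    "double": ctypes.c_double,
    "long double": ctypes.c_longdouble,
    "void *": ctypes.c_void_p,
}


def alignments():
    """Map each candidate type name to its alignment in bytes."""
    return {name: ctypes.alignment(ctype) for name, ctype in CANDIDATES.items()}


table = alignments()
strictest = max(table.values())

# True only where double really is the worst case among the candidates.
double_is_enough = table["double"] >= strictest
```

On a platform like the one described, double_is_enough would come out
False because long double has the stricter requirement.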

I thought someone might want to know,
Dave


+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+



From barry@zope.com  Wed Feb 27 20:09:43 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 27 Feb 2002 15:09:43 -0500
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15485.15623.543255.443894@anthem.wooz.org>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    MvL> At the moment, declaring either eucJP or Shift-JIS
    MvL> wouldn't work with the proposed implementation, anyway, since
    MvL> those encodings are not supported in the standard Python
    MvL> installation.

Which actually touches on something I wanted to bring up.  Why don't
we include the Japanese codecs with Python?  Is it just a size issue?

The gzip'd tarball of the JapaneseCodecs-1.4.3 is 258k, unpacked it's
3.2M.  Okay, so that's nontrivial, but I can think of 2 approaches:

- Have a second, sumo (no pun intended) release that includes the
  codecs

- Include the gzip'd tarball and do a distutils install at Python
  install time

I bet we'd win some Ruby converts if we did this <wink>.  For
reference, I'm thinking about including the Japanese and Chinese
codecs with MM2.1 because it makes little sense to claim support for
those languages without them.

-Barry


From mal@lemburg.com  Wed Feb 27 20:33:58 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 21:33:58 +0100
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de> <15485.15623.543255.443894@anthem.wooz.org>
Message-ID: <3C7D42B6.A88568CD@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:
> 
>     MvL> At the moment, declaring either eucJP or Shift-JIS
>     MvL> wouldn't work with the proposed implementation, anyway, since
>     MvL> those encodings are not supported in the standard Python
>     MvL> installation.
> 
> Which actually touches on something I wanted to bring up.  Why don't
> we include the Japanese codecs with Python?  Is it just a size issue?
> 
> The gzip'd tarball of the JapaneseCodecs-1.4.3 is 258k, unpacked it's
> 3.2M.  Okay, so that's nontrivial, but I can think of 2 approaches:
> 
> - Have a second, sumo (no pun intended) release that includes the
>   codecs
> 
> - Include the gzip'd tarball and do a distutils install at Python
>   install time

Why not simply make the installation a configure option ?

We could easily extend setup.py to grab the tarball from 
the web in case it is needed.
 
> I bet we'd win some Ruby converts if we did this <wink>.  For
> reference, I'm thinking about including the Japanese and Chinese
> codecs with MM2.1 because it makes little sense to claim support for
> those languages without them.

Agreed.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jepler@unpythonic.dhs.org  Wed Feb 27 20:43:39 2002
From: jepler@unpythonic.dhs.org (Jeff Epler)
Date: Wed, 27 Feb 2002 14:43:39 -0600
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7D1AE9.11C2740F@lemburg.com>
References: <200202261422.g1QEMxX15975@pcp742651pcs.reston01.va.comcast.net> <m3lmdgyvgp.fsf@mira.informatik.hu-berlin.de> <15483.53993.852170.135298@anthem.wooz.org> <m3n0xww01p.fsf@mira.informatik.hu-berlin.de> <3C7BE70B.74713ED2@lemburg.com> <20020226150143.C16980@unpythonic.dhs.org> <3C7CA4F9.86626985@lemburg.com> <20020227111058.B30863@unpythonic.dhs.org> <20020227113249.C30863@unpythonic.dhs.org> <3C7D1AE9.11C2740F@lemburg.com>
Message-ID: <20020227144337.D30863@unpythonic.dhs.org>

On Wed, Feb 27, 2002 at 06:44:09PM +0100, M.-A. Lemburg wrote:
> Great ! So we can say that the RE fits vim and emacs, right ?

Fits vim 6.0 with additional configuration of that editor.

Jeff


From martin@v.loewis.de  Wed Feb 27 21:01:25 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 27 Feb 2002 22:01:25 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <15485.9695.126411.600632@beluga.mojam.com>
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>
 <3C7CAFD3.60B32168@lemburg.com>
 <15485.5491.709403.99698@beluga.mojam.com>
 <3C7D1927.3414607E@lemburg.com>
 <15485.9695.126411.600632@beluga.mojam.com>
Message-ID: <m3g03miqqi.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

>     >> Perhaps a warning should be emitted by the compiler if a plain string
>     >> literal is found that contains 8-bit characters.  Better yet, perhaps
>     >> Neal can add this to PyChecker if he hasn't already...
> 
>     mal> See the PEP: this is what phase 1 will do; phase 2 won't accept
>     mal> such a file without an explicit encoding declaration.
> 
> That wasn't what I was getting at.  The quoted part of the reference manual
> seemed to suggest that programmers should be using hex escapes in string
> literals instead of 8-bit characters.  This doesn't seem to me to be related
> to what encoding the file is in.

PEP 263 says "the tokenizer must check the complete source file for
compliance with the default encoding". The part of the reference
manual will become incorrect: the meaning of 8-bit characters (rather:
bytes) will be well-defined if you have an encoding declaration.

If the default encoding is ASCII, and you have an 8-bit character, the
compiler will emit a warning if it is enhanced to follow PEP 263. So
what were you getting at?
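
A small sketch of that check, written as an ordinary function rather
than as tokenizer code.  The regular expression is an assumption
modelled on the "coding:" comment form under discussion, and
declared_encoding and needs_warning are hypothetical names.

```python
import re

# Assumed pattern for an encoding declaration inside a comment line.
CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def declared_encoding(source):
    """Return the encoding named in the first two lines, or None."""
    for line in source.splitlines()[:2]:
        if line.lstrip().startswith(b"#"):
            match = CODING_RE.search(line)
            if match:
                return match.group(1).decode("ascii")
    return None


def needs_warning(source):
    """True if source has bytes above 0x7F but no encoding declaration."""
    has_8bit = any(byte > 0x7F for byte in source)
    return has_8bit and declared_encoding(source) is None
```

A file that spells the same byte as a hex escape is pure ASCII and
passes without a declaration, which is the case the reference manual
wording was steering people toward.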

Regards,
Martin


From skip@pobox.com  Wed Feb 27 21:41:53 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 27 Feb 2002 15:41:53 -0600
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <m3g03miqqi.fsf@mira.informatik.hu-berlin.de>
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>
 <3C7CAFD3.60B32168@lemburg.com>
 <15485.5491.709403.99698@beluga.mojam.com>
 <3C7D1927.3414607E@lemburg.com>
 <15485.9695.126411.600632@beluga.mojam.com>
 <m3g03miqqi.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15485.21153.951244.102021@beluga.mojam.com>

    Martin> If the default encoding is ASCII, and you have an 8-bit
    Martin> character, the compiler will emit a warning if it is enhanced to
    Martin> follow PEP 263. So what were you getting at?

I was thinking about strings used as byte containers for non-character data.

Skip


From martin@v.loewis.de  Wed Feb 27 22:03:51 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 27 Feb 2002 23:03:51 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <15485.21153.951244.102021@beluga.mojam.com>
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>
 <3C7CAFD3.60B32168@lemburg.com>
 <15485.5491.709403.99698@beluga.mojam.com>
 <3C7D1927.3414607E@lemburg.com>
 <15485.9695.126411.600632@beluga.mojam.com>
 <m3g03miqqi.fsf@mira.informatik.hu-berlin.de>
 <15485.21153.951244.102021@beluga.mojam.com>
Message-ID: <m3zo1ud1ko.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> I was thinking about strings used as byte containers for
> non-character data.

Ok, but then you also said that you would want to produce a warning
for those? How can you tell them apart from "proper" character strings
if the encoding allows arbitrary byte sequences (like Latin-1)?
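
To make the point concrete, a short sketch (decodes_cleanly is a
hypothetical helper): every possible byte value is a valid Latin-1
character, so decoding can never reject binary data, while stricter
encodings can.

```python
def decodes_cleanly(data, encoding):
    """Return True if data is a valid byte sequence in encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# All 256 byte values, as might appear in a string used as a byte container.
every_byte = bytes(range(256))

latin1_ok = decodes_cleanly(every_byte, "latin-1")
ascii_ok = decodes_cleanly(every_byte, "ascii")
utf8_ok = decodes_cleanly(every_byte, "utf-8")
```

Only an encoding with invalid sequences gives the compiler anything to
complain about.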

Regards,
Martin



From martin@v.loewis.de  Wed Feb 27 22:02:03 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 27 Feb 2002 23:02:03 +0100
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding)
In-Reply-To: <15485.15623.543255.443894@anthem.wooz.org>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
Message-ID: <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>

barry@zope.com (Barry A. Warsaw) writes:

> Which actually touches on something I wanted to bring up.  Why don't
> we include the Japanese codecs with Python?  Is it just a size issue?

I think Guido's original concern was about the size (apart from the
fact that they were not available before).

My concern is also correctness and efficiency. Most current systems
provide high-performance well-tested codecs, since they need those
frequently. It is a waste of resources not to make use of these
codecs. The counter-argument, of course, is that you cannot always
rely on these codecs being available (apart from the fact that you
need wrappers around the platform API).

> I bet we'd win some Ruby converts if we did this <wink>.  For
> reference, I'm thinking about including the Japanese and Chinese
> codecs with MM2.1 because it makes little sense to claim support for
> those languages without them.

That is certainly the right thing to do. If correctness could be
verified independently, I'd be in favour of including them with Python
- even though they will likely never get the efficiency that wrappers
around the platform's codecs would have.

Regards,
Martin


From barry@zope.com  Wed Feb 27 22:53:02 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 27 Feb 2002 17:53:02 -0500
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15485.25422.524082.109890@anthem.wooz.org>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    >> I bet we'd win some Ruby converts if we did this <wink>.  For
    >> reference, I'm thinking about including the Japanese and
    >> Chinese codecs with MM2.1 because it makes little sense to
    >> claim support for those languages without them.

    MvL> That is certainly the right thing to do. If correctness could
    MvL> be verified independently, I'd be in favour of including them
    MvL> with Python - even though they will likely never get the
    MvL> efficiency that wrappers around the platform's codecs would
    MvL> have.

I'm obviously not qualified to verify them independently, but I have
had some initial positive feedback from a few Japanese users of the
MM2.1 alphas.  My second hand information indicates that the Japanese
codecs are pretty good, the Chinese are okay, and the Korean ones need
a lot of work.

Also, it's a bit of a catch 22, in that the more official exposure
these codecs get, the better they will eventually become, hopefully.
I'd be +1 on including them in Python 2.3.

-Barry


From barry@zope.com  Wed Feb 27 22:53:55 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 27 Feb 2002 17:53:55 -0500
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <3C7D42B6.A88568CD@lemburg.com>
Message-ID: <15485.25475.913116.826208@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> Why not simply make the installation a configure option ?

    MAL> We could easily extend setup.py to grab the tarball from 
    MAL> the web in case it is needed.

That's another option.  Certainly stuff like that is becoming fairly
common for installers these days.
 
    >> I bet we'd win some Ruby converts if we did this <wink>.  For
    >> reference, I'm thinking about including the Japanese and
    >> Chinese codecs with MM2.1 because it makes little sense to
    >> claim support for those languages without them.

    MAL> Agreed.

-Barry


From gsw@agere.com  Wed Feb 27 22:54:32 2002
From: gsw@agere.com (Gerald S. Williams)
Date: Wed, 27 Feb 2002 17:54:32 -0500
Subject: [Python-Dev] POSIX thread code
Message-ID: <GBEGLOMMCLDACBPKDIHFKECGCHAA.gsw@agere.com>

This is a multi-part message in MIME format.

------=_NextPart_000_0023_01C1BFB7.CF9678F0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

I recently came up with a fix for thread support in Python
under Cygwin. Jason Tishler and Norman Vine are looking it
over, but I'm pretty sure something similar should be used
for the Cygwin Python port.

This is easily done--simply add a few lines to thread.c
and create a new thread_cygwin.h (context diff and new file
both provided).

But there is a larger issue:

The thread interface code in thread_pthread.h uses mutexes
and condition variables to emulate semaphores, which are
then used to provide Python "lock" and "sema" services.

I know this is a common practice since those two thread
synchronization primitives are defined in "pthread.h". But
it comes with quite a bit of overhead. (And in the case of
Cygwin causes race conditions, but that's another matter.)

POSIX does define semaphores, though. (In fact, it's in
the standard just before Mutexes and Condition Variables.)
According to POSIX, they are found in <semaphore.h> and
_POSIX_SEMAPHORES should be defined if they work as POSIX
expects.

If they are available, it seems like providing direct
semaphore services would be preferable to emulating them
using condition variables and mutexes.
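
The two approaches can be sketched in Python as an analogy.  This is
not the C code in the attachments: the class names are hypothetical,
and threading.Semaphore merely stands in for a POSIX sem_t, so the
sketch shows the shape of each approach, not its cost.

```python
import threading


class EmulatedSemaphore:
    """Counter guarded by a mutex and a condition variable."""

    def __init__(self, value=1):
        self._value = value
        self._cond = threading.Condition(threading.Lock())

    def down(self, waitflag=True):
        with self._cond:
            if not waitflag and self._value == 0:
                return False          # would block, caller asked not to wait
            while self._value == 0:
                self._cond.wait()     # releases the mutex while sleeping
            self._value -= 1
            return True

    def up(self):
        with self._cond:
            self._value += 1
            self._cond.notify()


class DirectLock:
    """Lock built straight on a semaphore initialised to 1."""

    def __init__(self):
        self._sem = threading.Semaphore(1)

    def acquire(self, waitflag=True):
        return self._sem.acquire(blocking=bool(waitflag))

    def release(self):
        self._sem.release()
```

The emulated version takes the mutex on every operation and signals a
condition variable on every release; the direct version is a single
call each way.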

thread_posix.h.diff-c is a context diff that can be used
to convert thread_pthread.h into a more general POSIX
version that will use semaphores if available.

thread_cygwin.h would no longer be needed then, since all
it does is use POSIX semaphores directly rather than
mutexes/condition vars. Changing the interface to POSIX
threads should bring a performance improvement to any
POSIX platform that supports semaphores directly.

Does this sound like a good idea? Should I create a
more thorough set of patch files and submit them?

(I haven't been accepted to the python-dev list yet, so
please CC me. Thanks.)

-Jerry

-O Gerald S. Williams, 22Y-103GA : mailto:gsw@agere.com O-
-O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661  O-
-O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592  O-

------=_NextPart_000_0023_01C1BFB7.CF9678F0
Content-Type: application/octet-stream;
	name="thread.c.diff-c"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
	filename="thread.c.diff-c"

*** thread.c	Tue Oct 16 17:13:49 2001
--- thread.c.new	Tue Feb 26 07:49:13 2002
***************
*** 113,118 ****
--- 113,123 ----
  #include "thread_pth.h"
  #endif
  
+ #ifdef __CYGWIN__
+ #include "thread_cygwin.h"
+ #undef _POSIX_THREADS
+ #endif
+ 
  #ifdef _POSIX_THREADS
  #include "thread_pthread.h"
  #endif

------=_NextPart_000_0023_01C1BFB7.CF9678F0
Content-Type: application/octet-stream;
	name="thread_cygwin.h"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
	filename="thread_cygwin.h"


/* Posix threads interface */

/*
 * Modified to avoid condition variables, which cause race conditions in Cygwin.
 * Gerald Williams, gsw@agere.com
 * $Id: thread_cygwin.h,v 1.6 2002/02/27 19:34:08 gsw Exp $
 */

#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <signal.h>
#include <semaphore.h>


/* try to determine what version of the Pthread Standard is installed.
 * this is important, since all sorts of parameter types changed from
 * draft to draft and there are several (incompatible) drafts in
 * common use.	these macros are a start, at least. 
 * 12 May 1997 -- david arnold <davida@pobox.com>
 */

#if defined(__ultrix) && defined(__mips) && defined(_DECTHREADS_)
/* _DECTHREADS_ is defined in cma.h which is included by pthread.h */
#  define PY_PTHREAD_D4

#elif defined(__osf__) && defined (__alpha)
/* _DECTHREADS_ is defined in cma.h which is included by pthread.h */
#  if !defined(_PTHREAD_ENV_ALPHA) || defined(_PTHREAD_USE_D4) || defined(PTHREAD_USE_D4)
#	 define PY_PTHREAD_D4
#  else
#	 define PY_PTHREAD_STD
#  endif

#elif defined(_AIX)
/* SCHED_BG_NP is defined if using AIX DCE pthreads
 * but it is unsupported by AIX 4 pthreads. Default
 * attributes for AIX 4 pthreads equal to NULL. For
 * AIX DCE pthreads they should be left unchanged.
 */
#  if !defined(SCHED_BG_NP)
#	 define PY_PTHREAD_STD
#  else
#	 define PY_PTHREAD_D7
#  endif

#elif defined(__DGUX)
#  define PY_PTHREAD_D6

#elif defined(__hpux) && defined(_DECTHREADS_)
#  define PY_PTHREAD_D4

#else /* Default case */
#  define PY_PTHREAD_STD

#endif

#ifdef USE_GUSI
/* The Macintosh GUSI I/O library sets the stackspace to
** 20KB, much too low. We up it to 64K.
*/
#define THREAD_STACK_SIZE 0x10000
#endif


/* set default attribute object for different versions */

#if defined(PY_PTHREAD_D4) || defined(PY_PTHREAD_D7)
#  define pthread_attr_default pthread_attr_default
#  define pthread_mutexattr_default pthread_mutexattr_default
#elif defined(PY_PTHREAD_STD) || defined(PY_PTHREAD_D6)
#  define pthread_attr_default ((pthread_attr_t *)NULL)
#  define pthread_mutexattr_default ((pthread_mutexattr_t *)NULL)
#endif


/* On platforms that don't use standard POSIX threads pthread_sigmask()
 * isn't present.  DEC threads uses sigprocmask() instead as do most
 * other UNIX International compliant systems that don't have the full
 * pthread implementation.
 */
#ifdef HAVE_PTHREAD_SIGMASK
#  define SET_THREAD_SIGMASK pthread_sigmask
#else
#  define SET_THREAD_SIGMASK sigprocmask
#endif

#define CHECK_STATUS(name)	if (status != 0) { perror(name); error = 1; }

/*
 * Initialization.
 */

#ifdef _HAVE_BSDI
static
void _noop(void)
{
}

static void
PyThread__init_thread(void)
{
	/* DO AN INIT BY STARTING THE THREAD */
	static int dummy = 0;
	pthread_t thread1;
	pthread_create(&thread1, NULL, (void *) _noop, &dummy);
	pthread_join(thread1, NULL);
}

#else /* !_HAVE_BSDI */

static void
PyThread__init_thread(void)
{
#if defined(_AIX) && defined(__GNUC__)
	pthread_init();
#endif
}

#endif /* !_HAVE_BSDI */

/*
 * Thread support.
 */


long
PyThread_start_new_thread(void (*func)(void *), void *arg)
{
	pthread_t th;
	int success;
	sigset_t oldmask, newmask;
#if defined(THREAD_STACK_SIZE) || defined(PTHREAD_SYSTEM_SCHED_SUPPORTED)
	pthread_attr_t attrs;
#endif
	dprintf(("PyThread_start_new_thread called\n"));
	if (!initialized)
		PyThread_init_thread();

#if defined(THREAD_STACK_SIZE) || defined(PTHREAD_SYSTEM_SCHED_SUPPORTED)
	pthread_attr_init(&attrs);
#endif
#ifdef THREAD_STACK_SIZE
	pthread_attr_setstacksize(&attrs, THREAD_STACK_SIZE);
#endif
#ifdef PTHREAD_SYSTEM_SCHED_SUPPORTED
		pthread_attr_setscope(&attrs, PTHREAD_SCOPE_SYSTEM);
#endif

	/* Mask all signals in the current thread before creating the new
	 * thread.	This causes the new thread to start with all signals
	 * blocked.
	 */
	sigfillset(&newmask);
	SET_THREAD_SIGMASK(SIG_BLOCK, &newmask, &oldmask);

	success = pthread_create(&th, 
#if defined(PY_PTHREAD_D4)
				 pthread_attr_default,
				 (pthread_startroutine_t)func, 
				 (pthread_addr_t)arg
#elif defined(PY_PTHREAD_D6)
				 pthread_attr_default,
				 (void* (*)(void *))func,
				 arg
#elif defined(PY_PTHREAD_D7)
				 pthread_attr_default,
				 func,
				 arg
#elif defined(PY_PTHREAD_STD)
#if defined(THREAD_STACK_SIZE) || defined(PTHREAD_SYSTEM_SCHED_SUPPORTED)
				 &attrs,
#else
				 (pthread_attr_t*)NULL,
#endif
				 (void* (*)(void *))func,
				 (void *)arg
#endif
				 );

	/* Restore signal mask for original thread */
	SET_THREAD_SIGMASK(SIG_SETMASK, &oldmask, NULL);

#if defined(THREAD_STACK_SIZE) || defined(PTHREAD_SYSTEM_SCHED_SUPPORTED)
	pthread_attr_destroy(&attrs);
#endif
	if (success == 0) {
#if defined(PY_PTHREAD_D4) || defined(PY_PTHREAD_D6) || defined(PY_PTHREAD_D7)
		pthread_detach(&th);
#elif defined(PY_PTHREAD_STD)
		pthread_detach(th);
#endif
	}
#if SIZEOF_PTHREAD_T <= SIZEOF_LONG
	return (long) th;
#else
	return (long) *(long *) &th;
#endif
}

/* XXX This implementation is considered (to quote Tim Peters) "inherently
   hosed" because:
	 - It does not guarantee the promise that a non-zero integer is returned.
	 - The cast to long is inherently unsafe.
	 - It is not clear that the 'volatile' (for AIX?) and ugly casting in the
	   latter return statement (for Alpha OSF/1) are any longer necessary.
*/
long 
PyThread_get_thread_ident(void)
{
	volatile pthread_t threadid;
	if (!initialized)
		PyThread_init_thread();
	/* Jump through some hoops for Alpha OSF/1 */
	threadid = pthread_self();
#if SIZEOF_PTHREAD_T <= SIZEOF_LONG
	return (long) threadid;
#else
	return (long) *(long *) &threadid;
#endif
}

static void 
do_PyThread_exit_thread(int no_cleanup)
{
	dprintf(("PyThread_exit_thread called\n"));
	if (!initialized) {
		if (no_cleanup)
			_exit(0);
		else
			exit(0);
	}
}

void 
PyThread_exit_thread(void)
{
	do_PyThread_exit_thread(0);
}

void 
PyThread__exit_thread(void)
{
	do_PyThread_exit_thread(1);
}

#ifndef NO_EXIT_PROG
static void 
do_PyThread_exit_prog(int status, int no_cleanup)
{
	dprintf(("PyThread_exit_prog(%d) called\n", status));
	if (!initialized)
		if (no_cleanup)
			_exit(status);
		else
			exit(status);
}

void 
PyThread_exit_prog(int status)
{
	do_PyThread_exit_prog(status, 0);
}

void 
PyThread__exit_prog(int status)
{
	do_PyThread_exit_prog(status, 1);
}
#endif /* NO_EXIT_PROG */

/*
 * Lock support.
 */

PyThread_type_lock 
PyThread_allocate_lock(void)
{
	sem_t *lock;
	int status, error = 0;

	dprintf(("PyThread_allocate_lock called\n"));
	if (!initialized)
		PyThread_init_thread();

	lock = (sem_t *)malloc(sizeof(sem_t));

	if (lock) {
		status = sem_init(lock,0,1);
		CHECK_STATUS("sem_init");

		if (error) {
			free((void *)lock);
			lock = NULL;
		}
	}

	dprintf(("PyThread_allocate_lock() -> %p\n", lock));
	return (PyThread_type_lock)lock;
}

void 
PyThread_free_lock(PyThread_type_lock lock)
{
	sem_t *thelock = (sem_t *)lock;
	int status, error = 0;

	dprintf(("PyThread_free_lock(%p) called\n", lock));

	if (!thelock)
		return;

	status = sem_destroy(thelock);
	CHECK_STATUS("sem_destroy");

	free((void *)thelock);
}

int 
PyThread_acquire_lock(PyThread_type_lock lock, int waitflag)
{
	int success;
	sem_t *thelock = (sem_t *)lock;
	int status, error = 0;

	dprintf(("PyThread_acquire_lock(%p, %d) called\n", lock, waitflag));

	if (waitflag) {
		status = sem_wait(thelock);
		CHECK_STATUS("sem_wait");
	} else {
		status = sem_trywait(thelock);
	}

	success = (status == 0) ? 1 : 0;

	dprintf(("PyThread_acquire_lock(%p, %d) -> %d\n", lock, waitflag, success));
	return success;
}

void 
PyThread_release_lock(PyThread_type_lock lock)
{
	sem_t *thelock = (sem_t *)lock;
	int status, error = 0;

	dprintf(("PyThread_release_lock(%p) called\n", lock));

	status = sem_post(thelock);
	CHECK_STATUS("sem_post");
}

/*
 * Semaphore support.
 */

PyThread_type_sema 
PyThread_allocate_sema(int value)
{
	sem_t *sema;
	int status, error = 0;

	dprintf(("PyThread_allocate_sema called\n"));
	if (!initialized)
		PyThread_init_thread();

	sema = (sem_t *)malloc(sizeof(sem_t));

	if (sema) {
		status = sem_init(sema,0,value);
		CHECK_STATUS("sem_init");

		if (error) {
			free((void *)sema);
			sema = NULL;
		}
	}
	dprintf(("PyThread_allocate_sema() -> %p\n",  sema));
	return (PyThread_type_sema)sema;
}

void 
PyThread_free_sema(PyThread_type_sema sema)
{
	int status, error = 0;
	sem_t *thesema = (sem_t *)sema;

	dprintf(("PyThread_free_sema(%p) called\n",  sema));

	if (!thesema)
		return;

	status = sem_destroy(thesema);
	CHECK_STATUS("sem_destroy");

	free((void *) thesema);
}

int 
PyThread_down_sema(PyThread_type_sema sema, int waitflag)
{
	int status, error = 0, success;
	sem_t *thesema = (sem_t *)sema;

	dprintf(("PyThread_down_sema(%p, %d) called\n",  sema, waitflag));

	if (waitflag) {
		status = sem_wait(thesema);
		CHECK_STATUS("sem_wait");
	} else {
		status = sem_trywait(thesema);
	}

	success = (status == 0) ? 1 : 0;

	dprintf(("PyThread_down_sema(%p) return\n",  sema));
	return success;
}

void 
PyThread_up_sema(PyThread_type_sema sema)
{
	int status, error = 0;
	sem_t *thesema = (sem_t *)sema;

	dprintf(("PyThread_up_sema(%p)\n",	sema));

	status = sem_post(thesema);
	CHECK_STATUS("sem_post");
}

------=_NextPart_000_0023_01C1BFB7.CF9678F0
Content-Type: application/octet-stream;
	name="thread_posix.diff-c"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
	filename="thread_posix.diff-c"

*** thread_pthread.h	Wed Feb 27 17:35:11 2002
--- thread_posix.h	Wed Feb 27 17:39:30 2002
***************
*** 5,10 ****
--- 5,13 ----
  #include <string.h>
  #include <pthread.h>
  #include <signal.h>
+ #ifdef _POSIX_SEMAPHORES
+ #include <semaphore.h>
+ #endif
  
  
  /* try to determine what version of the Pthread Standard is installed.
***************
*** 288,293 ****
--- 291,457 ----
  }
  #endif /* NO_EXIT_PROG */
  
+ #ifdef _POSIX_SEMAPHORES
+ /*
+  * Lock support.
+  */
+ 
+ PyThread_type_lock 
+ PyThread_allocate_lock(void)
+ {
+ 	sem_t *lock;
+ 	int status, error = 0;
+ 
+ 	dprintf(("PyThread_allocate_lock called\n"));
+ 	if (!initialized)
+ 		PyThread_init_thread();
+ 
+ 	lock = (sem_t *)malloc(sizeof(sem_t));
+ 
+ 	if (lock) {
+ 		status = sem_init(lock,0,1);
+ 		CHECK_STATUS("sem_init");
+ 
+ 		if (error) {
+ 			free((void *)lock);
+ 			lock = NULL;
+ 		}
+ 	}
+ 
+ 	dprintf(("PyThread_allocate_lock() -> %p\n", lock));
+ 	return (PyThread_type_lock)lock;
+ }
+ 
+ void 
+ PyThread_free_lock(PyThread_type_lock lock)
+ {
+ 	sem_t *thelock = (sem_t *)lock;
+ 	int status, error = 0;
+ 
+ 	dprintf(("PyThread_free_lock(%p) called\n", lock));
+ 
+ 	if (!thelock)
+ 		return;
+ 
+ 	status = sem_destroy(thelock);
+ 	CHECK_STATUS("sem_destroy");
+ 
+ 	free((void *)thelock);
+ }
+ 
+ int 
+ PyThread_acquire_lock(PyThread_type_lock lock, int waitflag)
+ {
+ 	int success;
+ 	sem_t *thelock = (sem_t *)lock;
+ 	int status, error = 0;
+ 
+ 	dprintf(("PyThread_acquire_lock(%p, %d) called\n", lock, waitflag));
+ 
+ 	if (waitflag) {
+ 		status = sem_wait(thelock);
+ 		CHECK_STATUS("sem_wait");
+ 	} else {
+ 		status = sem_trywait(thelock);
+ 	}
+ 
+ 	success = (status == 0) ? 1 : 0;
+ 
+ 	dprintf(("PyThread_acquire_lock(%p, %d) -> %d\n", lock, waitflag, success));
+ 	return success;
+ }
+ 
+ void 
+ PyThread_release_lock(PyThread_type_lock lock)
+ {
+ 	sem_t *thelock = (sem_t *)lock;
+ 	int status, error = 0;
+ 
+ 	dprintf(("PyThread_release_lock(%p) called\n", lock));
+ 
+ 	status = sem_post(thelock);
+ 	CHECK_STATUS("sem_post");
+ }
+ 
+ /*
+  * Semaphore support.
+  */
+ 
+ PyThread_type_sema 
+ PyThread_allocate_sema(int value)
+ {
+ 	sem_t *sema;
+ 	int status, error = 0;
+ 
+ 	dprintf(("PyThread_allocate_sema called\n"));
+ 	if (!initialized)
+ 		PyThread_init_thread();
+ 
+ 	sema = (sem_t *)malloc(sizeof(sem_t));
+ 
+ 	if (sema) {
+ 		status = sem_init(sema,0,value);
+ 		CHECK_STATUS("sem_init");
+ 
+ 		if (error) {
+ 			free((void *)sema);
+ 			sema = NULL;
+ 		}
+ 	}
+ 	dprintf(("PyThread_allocate_sema() -> %p\n",  sema));
+ 	return (PyThread_type_sema)sema;
+ }
+ 
+ void 
+ PyThread_free_sema(PyThread_type_sema sema)
+ {
+ 	int status, error = 0;
+ 	sem_t *thesema = (sem_t *)sema;
+ 
+ 	dprintf(("PyThread_free_sema(%p) called\n",  sema));
+ 
+ 	if (!thesema)
+ 		return;
+ 
+ 	status = sem_destroy(thesema);
+ 	CHECK_STATUS("sem_destroy");
+ 
+ 	free((void *) thesema);
+ }
+ 
+ int 
+ PyThread_down_sema(PyThread_type_sema sema, int waitflag)
+ {
+ 	int status, error = 0, success;
+ 	sem_t *thesema = (sem_t *)sema;
+ 
+ 	dprintf(("PyThread_down_sema(%p, %d) called\n",  sema, waitflag));
+ 
+ 	if (waitflag) {
+ 		status = sem_wait(thesema);
+ 		CHECK_STATUS("sem_wait");
+ 	} else {
+ 		status = sem_trywait(thesema);
+ 	}
+ 
+ 	success = (status == 0) ? 1 : 0;
+ 
+ 	dprintf(("PyThread_down_sema(%p) return\n",  sema));
+ 	return success;
+ }
+ 
+ void 
+ PyThread_up_sema(PyThread_type_sema sema)
+ {
+ 	int status, error = 0;
+ 	sem_t *thesema = (sem_t *)sema;
+ 
+ 	dprintf(("PyThread_up_sema(%p)\n",	sema));
+ 
+ 	status = sem_post(thesema);
+ 	CHECK_STATUS("sem_post");
+ }
+ #else /* _POSIX_SEMAPHORES */
  /*
   * Lock support.
   */
***************
*** 497,499 ****
--- 661,664 ----
  	status = pthread_mutex_unlock(&thesema->mutex);
  	CHECK_STATUS("pthread_mutex_unlock");
  }
+ #endif /* _POSIX_SEMAPHORES */

------=_NextPart_000_0023_01C1BFB7.CF9678F0--



From skip@pobox.com  Wed Feb 27 22:17:15 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 27 Feb 2002 16:17:15 -0600
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <m3zo1ud1ko.fsf@mira.informatik.hu-berlin.de>
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>
 <3C7CAFD3.60B32168@lemburg.com>
 <15485.5491.709403.99698@beluga.mojam.com>
 <3C7D1927.3414607E@lemburg.com>
 <15485.9695.126411.600632@beluga.mojam.com>
 <m3g03miqqi.fsf@mira.informatik.hu-berlin.de>
 <15485.21153.951244.102021@beluga.mojam.com>
 <m3zo1ud1ko.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15485.23275.603452.414165@beluga.mojam.com>

    >> I was thinking about strings used as byte containers for
    >> non-character data.

    Martin> Ok, but then you also said that you would want to produce a
    Martin> warning for those?

Never mind.  I'm probably just confused.

Skip


From mal@lemburg.com  Wed Feb 27 21:59:34 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 27 Feb 2002 22:59:34 +0100
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>
 <3C7CAFD3.60B32168@lemburg.com>
 <15485.5491.709403.99698@beluga.mojam.com>
 <3C7D1927.3414607E@lemburg.com>
 <15485.9695.126411.600632@beluga.mojam.com>
 <m3g03miqqi.fsf@mira.informatik.hu-berlin.de> <15485.21153.951244.102021@beluga.mojam.com>
Message-ID: <3C7D56C6.E9BAA5E4@lemburg.com>

Skip Montanaro wrote:
> 
>     Martin> If the default encoding is ASCII, and you have a 8-bit
>     Martin> character, the compiler will emit a warning if it is enhanced to
>     Martin> follow PEP 263. So what were you getting at?
> 
> I was thinking about strings used as byte containers for non-character data.

In string literals ? I think it is common to encode this sort of
data as hex or using octal escapes. Since these encodings are
plain 7-bit ASCII I don't see a problem.

Your hint about the manual is correct though: we'll have to adapt
that to the new reading as well.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From skip@pobox.com  Thu Feb 28 01:26:57 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 27 Feb 2002 19:26:57 -0600
Subject: [Python-Dev] PEP 263 -- Python Source Code Encoding
In-Reply-To: <3C7D56C6.E9BAA5E4@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEMPNPAA.tim.one@comcast.net>
 <3C7CAFD3.60B32168@lemburg.com>
 <15485.5491.709403.99698@beluga.mojam.com>
 <3C7D1927.3414607E@lemburg.com>
 <15485.9695.126411.600632@beluga.mojam.com>
 <m3g03miqqi.fsf@mira.informatik.hu-berlin.de>
 <15485.21153.951244.102021@beluga.mojam.com>
 <3C7D56C6.E9BAA5E4@lemburg.com>
Message-ID: <15485.34657.168922.781138@12-248-41-177.client.attbi.com>

    >> I was thinking about strings used as byte containers for
    >> non-character data.

    mal> In string literals ? I think it is common to encode this sort of
    mal> data as hex or using octal escapes. Since these encodings are plain
    mal> 7-bit ASCII I don't see a problem.

Precisely.  I was thinking about situations where they aren't encoded, but
sitting there naked, so to speak.

Skip


From guido@python.org  Thu Feb 28 02:11:08 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 21:11:08 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
Message-ID: <200202280211.g1S2B8U27062@pcp742651pcs.reston01.va.comcast.net>

We had a brief jam on date/time objects at Zope Corp. HQ today.  I
won't get to writing up the full proposal that came out of this, but
I'd like to give at least a summary.  (Those who were there: my
thoughts have advanced a bit since this afternoon.)

My plan is to create a standard timestamp object in C that can be
subclassed.  The internal representation will favor extraction of
broken-out time fields (year etc.) in local time.  It will support
comparison, basic time computations, and effbot's minimal API, as well
as conversions to and from the two currently most popular time
representations used by the time module: posix timestamps in UTC and
9-tuples in local time.  There will be a C API.

Proposal for internal representation (also the basis for an efficient
pickle format):

year	 2 bytes, big-endian, unsigned (0 .. 65535)
month	 1 byte
day	 1 byte
hour	 1 byte
minute	 1 byte
second	 1 byte
usecond	 3 bytes, big-endian
tzoffset 2 bytes, big-endian, signed (in minutes, -1439 .. 1439)

total	12 bytes

Things this will not address (but which you may address through
subclassing):

- leap seconds
- alternate calendars
- years far in the future or BC
- precision of timepoints (e.g. a separate Date type)
- DST flags (DST is accounted for by the tzoffset field)

Mini-FAQ

- Why store a broken-out local time rather than seconds (or
  microseconds) relative to an epoch in UTC?  There are two kinds of
  operations on times: accessing the broken-out fields (probably in
  local time), and time computations.  The chosen representation
  favors accessing broken-out fields, which I expect to be more common
  than time computations.

- Why a big-endian internal representation?  So that comparison can be
  done using a single memcmp() call as long as the tzoffset fields are
  the same.

- Why not pack the fields closer to save a few bytes?  To make the
  pack and unpack operations more efficient; the object footprint
  isn't going to make much of a difference.

- Why is the year unsigned?  So memcmp() will do the right thing for
  comparing dates (in the same timezone).

- What's the magic number 1439?  One less than 24 * 60.  Timezone
  offsets may be up to 24 hours.  (The C99 standard does it this way.)
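
A minimal sketch of the layout above, assuming Python's struct module is
used for the packing.  The function names pack_timestamp and
unpack_timestamp are hypothetical and not part of the proposal; the real
object would do this in C.

```python
import struct

def pack_timestamp(year, month, day, hour, minute, second, usecond, tzoffset):
    """Pack broken-out local-time fields into the proposed 12-byte layout."""
    if not 0 <= usecond < 1000000:
        raise ValueError("usecond out of range")
    if not -1439 <= tzoffset <= 1439:
        raise ValueError("tzoffset out of range")
    # 2-byte unsigned year followed by five 1-byte fields, all big-endian.
    head = struct.pack(">HBBBBB", year, month, day, hour, minute, second)
    # struct has no 3-byte integer code, so the microseconds are packed by hand.
    return head + usecond.to_bytes(3, "big") + struct.pack(">h", tzoffset)

def unpack_timestamp(data):
    """Inverse of pack_timestamp(); returns the eight fields as a tuple."""
    year, month, day, hour, minute, second = struct.unpack(">HBBBBB", data[:7])
    usecond = int.from_bytes(data[7:10], "big")
    (tzoffset,) = struct.unpack(">h", data[10:12])
    return (year, month, day, hour, minute, second, usecond, tzoffset)

earlier = pack_timestamp(2002, 2, 27, 21, 11, 8, 0, -300)
later = pack_timestamp(2002, 2, 28, 2, 33, 52, 500000, -300)
```

Comparing the two byte strings (earlier < later) is the Python
counterpart of the single memcmp() call mentioned above, and it is only
meaningful here because both values carry the same tzoffset.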

I'll try to turn this into a proper PEP ASAP.

(Stephan: do I need to CC you or are you reading python-dev?)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Thu Feb 28 02:33:52 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 27 Feb 2002 20:33:52 -0600
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <200202280211.g1S2B8U27062@pcp742651pcs.reston01.va.comcast.net>
References: <200202280211.g1S2B8U27062@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15485.38672.466928.755447@12-248-41-177.client.attbi.com>

    Guido> Proposal for internal representation (also the basis for an
    Guido> efficient pickle format):

    Guido> year  2 bytes, big-endian, unsigned (0 .. 65535)
    ...
    Guido> - Why is the year unsigned?  So memcmp() will do the right thing
    Guido>   for comparing dates (in the same timezone).

So the earliest year it can represent is 1BC (or does year == 0 represent
some other base year)?

One of MAL's desires was that he could use the abstract interface /F
defined and remain binary compatible with the current mxDateTime layout.
Will your layout work for him?

Skip


From guido@python.org  Thu Feb 28 02:39:10 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 21:39:10 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: Your message of "Wed, 27 Feb 2002 20:33:52 CST."
 <15485.38672.466928.755447@12-248-41-177.client.attbi.com>
References: <200202280211.g1S2B8U27062@pcp742651pcs.reston01.va.comcast.net>
 <15485.38672.466928.755447@12-248-41-177.client.attbi.com>
Message-ID: <200202280239.g1S2dA927244@pcp742651pcs.reston01.va.comcast.net>

>     Guido> Proposal for internal representation (also the basis for an
>     Guido> efficient pickle format):
> 
>     Guido> year  2 bytes, big-endian, unsigned (0 .. 65535)
>     ...
>     Guido> - Why is the year unsigned?  So memcmp() will do the right thing
>     Guido>   for comparing dates (in the same timezone).
> 
> So the earliest year it can represent is 1BC (or does year == 0 represent
> some other base year)?

Correct.

> One of MAL's desires was that he could use the abstract interface /F
> defined and remain binary compatible with the current mxDateTime layout.
> Will your layout work for him?

My layout is incompatible with that of mxDateTime, but this is not
supposed to be /F's abstract interface -- this is supposed to be one
implementation of it, mxDateTime can be another.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Thu Feb 28 03:40:08 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 27 Feb 2002 22:40:08 -0500
Subject: [Python-Dev] Manning Seeking Python book authors
Message-ID: <200202280340.g1S3e8j27431@pcp742651pcs.reston01.va.comcast.net>

Is anybody interested in writing any of the titles below, or can
you recommend someone?

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Wed, 27 Feb 2002 15:02:17 -0500
From:    Susan Capparelle <suca@manning.com>
To:      Guido van Rossum <guido@CNRI.Reston.VA.US>
Subject: Seeking Python book authors

Hi Guido,

I hope all is well?

As someone who has done valuable reviewing for us before in the 
Python arena, I thought you might be one of the right people to 
contact.

We're currently seeking authors for a number of Python-related books.
A couple of titles or topics would be: 'Enterprise system development
with Python,' 'Practical Python' and 'Effective Python.'  Can you
recommend anyone with the necessary experience and skills to
undertake any of these books?

Looking forward to your response and thanks in advance for your help.

Sincerely,


=======================================
Susan W. Capparelle
Assistant Publisher
Manning Publications Co.
209 Bruce Park Avenue, Greenwich, CT 06830
suca@manning.com
tel. 203.629.2211   www.manning.com
fax. 203.629.2084
=======================================

------- End of Forwarded Message



From tim@zope.com  Thu Feb 28 04:24:15 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 27 Feb 2002 23:24:15 -0500
Subject: [Python-Dev] POSIX thread code
In-Reply-To: <GBEGLOMMCLDACBPKDIHFKECGCHAA.gsw@agere.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAIOAAA.tim@zope.com>

[Gerald S. Williams]
> I recently came up with a fix for thread support in Python
> under Cygwin. Jason Tishler and Norman Vine are looking it
> over, but I'm pretty sure something similar should be used
> for the Cygwin Python port.
>
> This is easily done--simply add a few lines to thread.c
> and create a new thread_cygwin.h (context diff and new file
> both provided).
>
> But there is a larger issue:
>
> The thread interface code in thread_pthread.h uses mutexes
> and condition variables to emulate semaphores, which are
> then used to provide Python "lock" and "sema" services.

Please use current CVS Python for patches.  For example, all the "sema" code
no longer exists (it was undocumented and unused).

> I know this is a common practice since those two thread
> synchronization primitives are defined in "pthread.h". But
> it comes with quite a bit of overhead. (And in the case of
> Cygwin causes race conditions, but that's another matter.)
>
> POSIX does define semaphores, though. (In fact, it's in
> the standard just before Mutexes and Condition Variables.)

Semaphores weren't defined by POSIX at the time this code was written; IIRC,
they were first introduced in the later and then-rarely implemented POSIX
realtime extensions.  How stable are they?  Some quick googling didn't
inspire a lot of confidence, but maybe I was just bumping into early bug
reports.

> According to POSIX, they are found in <semaphore.h> and
> _POSIX_SEMAPHORES should be defined if they work as POSIX
> expects.

This may be a nightmare; for example, I don't see anything in the Single
UNIX Specification about this symbol, and as far as I'm concerned POSIX as a
distinct standard is a DSW (dead standard walking <wink>).  That's one for
the Unixish geeks to address.

> If they are available, it seems like providing direct
> semaphore services would be preferable to emulating them
> using condition variables and mutexes.

They could be hugely better on Linux, but I don't know:  there's anecdotal
evidence that Linux scheduling of threads competing for a mutex can get
itself into a vastly unfair state.  Provided Linux implements semaphores
properly, semaphore contention can be tweaked (and Python should do so), as
befits a realtime gimmick, to guarantee fairness (SCHED_FIFO and SCHED_RR).

> thread_posix.h.diff-c is a context diff that can be used
> to convert thread_pthread.h into a more general POSIX
> version that will use semaphores if available.

I believe your PyThread_acquire_lock() code has two holes:

1. sem_trywait() is not checked for an error return.

2. sem_wait() and sem_trywait() can be interrupted by signal, and
   that's not an error condition.

So these calls should be stuck in a loop:

	do {
		... call the right one ...
	} while (status < 0 && errno == EINTR);

	if (status < 0) {
		/* an unexpected exceptional return */
		...
	}
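
The retry shape above, sketched in Python rather than C purely as an
illustration (the patch itself needs the C loop).  The helper name
retry_on_eintr and the stand-in flaky_wait are hypothetical; flaky_wait
only imitates a sem_wait() that is interrupted twice before succeeding.

```python
import errno

def retry_on_eintr(call, *args):
    """Call call(*args), retrying for as long as it fails with EINTR."""
    while True:
        try:
            return call(*args)
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise  # an unexpected exceptional return
            # Interrupted by a signal: not an error, so just try again.

# Stand-in for sem_wait(): interrupted twice, then succeeds.
attempts = []

def flaky_wait():
    attempts.append(1)
    if len(attempts) < 3:
        raise OSError(errno.EINTR, "interrupted system call")
    return 0

status = retry_on_eintr(flaky_wait)
```

Any failure other than EINTR is re-raised, which corresponds to the
"status < 0" branch after the loop.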

> ...
> Does this sound like a good idea?

Yes, provided it works <wink>.

> Should I create a more thorough set of patch files and submit them?

I'd like that, but please don't email patches -- they'll just be forgotten.
Upload patches to the Python patch manager instead:

    http://sf.net/tracker/?group_id=5470&atid=305470

Discussion about the patches remains appropriate on Python-Dev.



From martin@v.loewis.de  Thu Feb 28 07:57:32 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 28 Feb 2002 08:57:32 +0100
Subject: [Python-Dev] POSIX thread code
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEAIOAAA.tim@zope.com>
References: <LNBBLJKPBEHFEDALKOLCKEAIOAAA.tim@zope.com>
Message-ID: <m3n0xu9gyb.fsf@mira.informatik.hu-berlin.de>

"Tim Peters" <tim@zope.com> writes:

> Semaphores weren't defined by POSIX at the time this code was written; IIRC,
> they were first introduced in the later and then-rarely implemented POSIX
> realtime extensions.  How stable are they?

They are in Single UNIX V2 (1997), so anybody claiming conformance to
Single UNIX has implemented them:
- AIX 4.3.1 and later
- Tru64 UNIX V5.1A and later
- Solaris 7 and later
[from the list of certified Unix98 systems]

In addition, the following implementations document support for sem_init:
- LinuxThreads since glibc 2.0 (1996)
- IRIX at least since 6.5 (a patch for 6.2 has been available since 1996)

> > According to POSIX, they are found in <semaphore.h> and
> > _POSIX_SEMAPHORES should be defined if they work as POSIX
> > expects.
> 
> This may be a nightmare; for example, I don't see anything in the Single
> UNIX Specification about this symbol, and as far as I'm concerned POSIX as a
> distinct standard is a DSW (dead standard walking <wink>).  That's one for
> the Unixish geeks to address.

You didn't ask google for _POSIX_SEMAPHORES, right? The first hit
brings you to

http://www.opengroup.org/onlinepubs/7908799/xsh/feature.html

_POSIX_SEMAPHORES
   Implementation supports the Semaphores option.

A quick check shows that both Solaris 8 and glibc 2.2 do indeed define
the symbol.

> They could be hugely better on Linux, but I don't know:  there's anecdotal
> evidence that Linux scheduling of threads competing for a mutex can get
> itself into a vastly unfair state.  

For glibc 2.1, semaphores have been reimplemented; they now provide
FIFO wakeup (sorted by thread priority). Same for mutexes: the
highest-priority oldest-waiting thread will be resumed.

> 	do {
> 		... call the right one ...
> 	} while (status < 0 && errno == EINTR);

Shouldn't EINTR check for KeyboardInterrupt?

Regards,
Martin


From mal@lemburg.com  Thu Feb 28 08:14:20 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 09:14:20 +0100
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <m34rk2eg84.fsf@mira.informatik.hu-berlin.de> <15485.25422.524082.109890@anthem.wooz.org>
Message-ID: <3C7DE6DC.893E594B@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:
> 
>     >> I bet we'd win some Ruby converts if we did this <wink>.  For
>     >> reference, I'm thinking about including the Japanese and
>     >> Chinese codecs with MM2.1 because it makes little sense to
>     >> claim support for those languages without them.
> 
>     MvL> That is certainly the right thing to do. If correctness could
>     MvL> be verified independently, I'd be in favour of including them
>     MvL> with Python - even though they will likely never get the
>     MvL> efficiency that wrappers around the platform's codecs would
>     MvL> have.
> 
> I'm obviously not qualified to verify them independently, but I have
> had some initial positive feedback from a few Japanese users of the
> MM2.1 alphas.  My second-hand information indicates that the Japanese
> codecs are pretty good, the Chinese are okay, and the Korean ones need
> a lot of work.
> 
> Also, it's a bit of a catch 22, in that the more official exposure
> these codecs get, the better they will eventually become, hopefully.
> I'd be +1 on including them in Python 2.3.

You could (and probably should) add Tamito's codecs in Python, 
but the others have licensing problems :-/ 

It shouldn't be hard though for native speakers and programmers 
to build upon the work of Tamito and get those codecs done 
as well. Alternatively, the PSF or some company interested
in having these codecs available could fund the development.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Thu Feb 28 08:45:53 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 09:45:53 +0100
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <3C7D42B6.A88568CD@lemburg.com> <15485.25475.913116.826208@anthem.wooz.org>
Message-ID: <3C7DEE41.F31FAEA@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MAL" == M  <mal@lemburg.com> writes:
> 
>     MAL> Why not simply make the installation a configure option ?
> 
>     MAL> We could easily extend setup.py to grab the tarball from
>     MAL> the web in case it is needed.
> 
> That's another option.  Certainly stuff like that is becoming fairly
> common for installers these days.

Hmm, make that ZIP-ball (we have no .tar support in the standard
lib, only ZIP-file support). Also, the setup.py will have to 
check whether it has to grab a level 0 compression ZIP file or
a level 9 one.

Nothing which cannot be done, of course... net installers
are quite common these days (see e.g. Mozilla, IE and others),
so people are probably quite used to them already. And
we can always provide a full install download as well.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From martin@v.loewis.de  Thu Feb 28 08:34:32 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 28 Feb 2002 09:34:32 +0100
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code  Encoding)
In-Reply-To: <3C7DE6DC.893E594B@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>
 <15485.25422.524082.109890@anthem.wooz.org>
 <3C7DE6DC.893E594B@lemburg.com>
Message-ID: <m3vgci80o7.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> You could (and probably should) add Tamito's codecs in Python, 
> but the others have licensing problems :-/ 

I would not recommend incorporating any of this into Python without
asking the author(s). When doing so, it would be appropriate, IMO, to
ask them whether they would fill out the contributor agreement. Then,
the presumed licensing problems would be gone.

Regards,
Martin


From mal@lemburg.com  Thu Feb 28 09:08:17 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 10:08:17 +0100
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org> <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3C7DF381.C2E1335A@lemburg.com>

"Martin v. Loewis" wrote:
> 
> barry@zope.com (Barry A. Warsaw) writes:
> 
> > Which actually touches on something I wanted to bring up.  Why don't
> > we include the Japanese codecs with Python?  Is it just a size issue?
> 
> I think Guido's original concern was about the size (apart from the
> fact that they were not available before).
> 
> My concern is also correctness and efficiency. Most current systems
> provide high-performance well-tested codecs, since they need those
> frequently. It is a waste of resources not to make use of these
> codecs. The counter-argument, of course, is that you cannot always
> rely on these codecs being available (apart from the fact that you
> need wrappers around the platform API).

Which wrapper APIs do we currently have which could actually
be made part of the Python core ?

Aside: while it's true that we could use those, the Unicode 
implementation has shown that rolling our own has worked out
quite well too.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jacobs@penguin.theopalgroup.com  Thu Feb 28 11:46:24 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 28 Feb 2002 06:46:24 -0500 (EST)
Subject: [Python-Dev] Manning Seeking Python book authors
In-Reply-To: <200202280340.g1S3e8j27431@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.33.0202280643330.3969-100000@penguin.theopalgroup.com>

On Wed, 27 Feb 2002, Guido van Rossum wrote:
> Is anybody interested in writing any of the titles below, or can
> you recommend someone?

If I were to find one or two motivated co-authors, I would strongly
consider tackling 'Enterprise system development with Python'.  I'm in the
final stretches of my upcoming book on data manipulation and statistical
analysis in Python for programmers and graduate students, and have been
looking around for new ideas.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From mal@lemburg.com  Thu Feb 28 12:11:37 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 13:11:37 +0100
Subject: [Python-Dev] PEP 0275 -- Switching on Multiple Values
Message-ID: <3C7E1E79.751AF37A@lemburg.com>

I consider the PEP 0275 ready for review by the developers.
Comments please.

	http://python.sourceforge.net/peps/pep-0275.html

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From tim.one@comcast.net  Thu Feb 28 06:57:36 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 28 Feb 2002 01:57:36 -0500
Subject: [Python-Dev] Alignment assumptions
In-Reply-To: <13b201c1bfc9$c94d1b90$0500a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBDOAAA.tim.one@comcast.net>

[David Abrahams]
> A quick grep-find through the Python-2.2 sources reveals the following:
>
> Include/dictobject.h:49: long aligner;

This is in

#ifdef USE_CACHE_ALIGNED
	long	aligner;
#endif

and AFAIK nobody ever defines the symbol.  It's a cache-line optimization
gimmick, but is effectively a nop (except to waste memory) on "almost all"
machines.  IIRC, the author never measured any improvement by using it (not
surprising, since I believe almost all mallocs at least 8-byte align now).
I vote we delete it.

> Include/objimpl.h:275: double dummy;  /* force worst-case alignment */

One branch of a union, forces enough padding in the gc header so that
whatever follows the gc header is "aligned enough".  This is sufficient for
all core gc types, but may not be sufficient for user-defined gc types.  I'm
happy enough to view it as a restriction on what user-defined gc'able types
can contain.

> Modules/addrinfo.h:162: LONG_LONG __ss_align; /* force desired structure
> storage alignment */
> Modules/addrinfo.h:164: double __ss_align; /* force desired structure
> storage alignment */

This isn't our code (it's imported from the WIDE project), and I have no
idea what it thinks it's trying to accomplish (neither the mystery padding,
nor really much of anything else in the WIDE code!).

> At first glance, there appear to be different assumptions at work
> here about what constitutes maximal alignment on any given platform.

Only the objimpl.h trick might benefit from maximal alignment.

> I've been using a little C++ metaprogram to find a type which will
> properly align any other given type. Because of limitations of one
> compiler, I had to disable the computation and instead used the
> objimpl.h assumption that double was maximally aligned, but also
> added a compile-time assertion to check that the alignment is always
> greater than or equal to that of the target type. Well, it failed today
> on Tru64 Unix with the latest compaq CXX 6.5 prerelease compiler; it
> appears that the alignment of long double is greater than that
> of double on that platform.
>
> I thought someone might want to know,

If you ever compile on a KSR machine, you'll discover there's no std C type
that captures maximal alignment.  You'd have to guess it's an extension type
named "_subpage".  I'm not sure that even C++ template metaprogramming could
manage that bit of channeling <wink> (FYI, _subpage required 128-byte
alignment).

Stupid trick:  If you can compute this at run time, do malloc(1) a few
times, count the number of trailing 0 bits in the returned addresses, and
take the minimum.  Since malloc has to return memory "suitably aligned so
that it may be assigned to a pointer to any type of object and then used to
access such an object or an array of such objects", you'd soon discover you
always got at least 7 trailing zero bits back from KSR malloc(), and
presumably at least 4 under Tru64.
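
A minimal sketch of the run-time trick described above, written in Python with ctypes rather than C. The helper names are made up for illustration, and sample_malloc_addresses() assumes a platform where ctypes can locate the C library; it is not taken from any Python source file.

```python
import ctypes
import ctypes.util


def trailing_zero_bits(address):
    """Number of trailing zero bits in a positive integer address."""
    if address <= 0:
        raise ValueError("address must be a positive integer")
    # address & -address isolates the lowest set bit.
    return (address & -address).bit_length() - 1


def guaranteed_alignment_bits(addresses):
    """Smallest trailing-zero count seen across the sampled addresses."""
    return min(trailing_zero_bits(a) for a in addresses)


def sample_malloc_addresses(count=16):
    """Call the C library's malloc(1) `count` times and return the addresses.

    Assumption: ctypes.util.find_library("c") can find the C library here.
    """
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        raise OSError("could not locate the C library")
    libc = ctypes.CDLL(libc_name)
    libc.malloc.restype = ctypes.c_void_p
    libc.malloc.argtypes = [ctypes.c_size_t]
    libc.free.restype = None
    libc.free.argtypes = [ctypes.c_void_p]
    # Keep every block alive while sampling so malloc cannot reuse one.
    blocks = [libc.malloc(1) for _ in range(count)]
    try:
        return [b for b in blocks if b]
    finally:
        for b in blocks:
            libc.free(b)  # free(NULL) is a no-op
```

Calling guaranteed_alignment_bits(sample_malloc_addresses()) gives a lower bound observed on that run only; it is evidence about the allocator, not a guarantee from the standard.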

there's-the-standard-and-then-there's-real-life<wink>-ly y'rs  - tim



From thomas.heller@ion-tof.com  Thu Feb 28 13:19:20 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 28 Feb 2002 14:19:20 +0100
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
Message-ID: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook>

[Jeremy on python-checkins list, PEP 283: Python 2.3 release schedule]
> Planned features for 2.3
>     Here are a few PEPs that I know to be under consideration.
[...]
>  S   273  Import Modules from Zip Archives             Ahlstrom

I haven't participated in the discussion of PEP 273,
IIRC it was mostly about implementation details...

Wouldn't it be the right time now, instead of complicating
the builtin import mechanism further, to simplify the builtin
import code, and use it as the foundation of a Python coded
implementation - imputil, or better Gordon's iu.py, or whatever?


Thomas



From david.abrahams@rcn.com  Thu Feb 28 13:44:00 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Thu, 28 Feb 2002 08:44:00 -0500
Subject: [Python-Dev] Alignment assumptions
References: <LNBBLJKPBEHFEDALKOLCEEBDOAAA.tim.one@comcast.net>
Message-ID: <172601c1c05e$0c0ea630$0500a8c0@boostconsulting.com>

----- Original Message -----
From: "Tim Peters" <tim.one@comcast.net>

> > Include/objimpl.h:275: double dummy;  /* force worst-case alignment */
>
> One branch of a union, forces enough padding in the gc header so that
> whatever follows the gc header is "aligned enough".  This is sufficient
> for all core gc types, but may not be sufficient for user-defined gc
> types.  I'm happy enough to view it as a restriction on what
> user-defined gc'able types can contain.

As I read the code, it affects all types (doesn't this header begin every
object, regardless of its GC flags?) and I think that's a very unhappy
circumstance for your numeric community. Remember, the type that raised the
alarm here was just a long double.

> > At first glance, there appear to be different assumptions at work
> > here about what constitutes maximal alignment on any given platform.
>
> Only the objimpl.h trick might benefit from maximal alignment.

I'm not actually after maximal alignment; I look for a
minimally-sized/aligned type whose alignment is a multiple of the target
type's alignment. In any case, I was just using the assumption that double
was maximally aligned since I was linking with Python code and the EDG
front-end was too slow to handle the metaprogram -- I figured that if the
assumption was good enough for Python and my clients were depending on it
anyway, it was good enough for my code (not!).
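
The search described here (the smallest type whose alignment is a multiple of the target's) can be sketched in Python with ctypes. This is only an illustration of the idea, not the C++ metaprogram itself; the candidate list and the name find_aligner are assumptions.

```python
import ctypes

# Hypothetical list of "likely candidates"; a real list is platform-specific.
CANDIDATES = [
    ctypes.c_char, ctypes.c_short, ctypes.c_int, ctypes.c_long,
    ctypes.c_longlong, ctypes.c_float, ctypes.c_double, ctypes.c_longdouble,
]


def find_aligner(target, candidates=CANDIDATES):
    """Return the smallest candidate whose alignment is a multiple of target's.

    Returns None when no candidate qualifies, which is the situation the
    compile-time assertion in the message above is meant to catch.
    """
    need = ctypes.alignment(target)
    fits = [c for c in candidates if ctypes.alignment(c) % need == 0]
    if not fits:
        return None
    # Prefer the smallest type, then the weakest alignment among equals.
    return min(fits, key=lambda c: (ctypes.sizeof(c), ctypes.alignment(c)))
```

On a platform where long double is more strictly aligned than double, find_aligner(ctypes.c_longdouble) would not return c_double, which is the failure reported for Tru64.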

> If you ever compile on a KSR machine, you'll discover there's no std C
> type that captures maximal alignment.

I was aware that this was a theoretical possibility, but not that it was a
practical one. What's KSR?

> You'd have to guess it's an extension type
> named "_subpage".  I'm not sure that even C++ template metaprogramming
> could manage that bit of channeling <wink>

Nope; we can only look through a list of likely candidates to try to find a
match. We're hoping to address this for the next standard -- I'm pushing for
allowing non-POD types in unions, leaving construction/destruction up to the
user.

> (FYI, _subpage required 128-byte
> alignment).

I guess that strictly speaking, requiring maximal alignment wouldn't be
appropriate for objimpl ;-)

> Stupid trick:  If you can compute this at run time, do malloc(1) a few
> times, count the number of trailing 0 bits in the returned addresses, and
> take the minimum.  Since malloc has to return memory "suitably aligned so
> that it may be assigned to a pointer to any type of object and then used
> to access such an object or an array of such objects", you'd soon
> discover you always got at least 7 trailing zero bits back from KSR
> malloc(), and presumably at least 4 under Tru64.

Sounds like a good candidate for your autoconf script.
Seriously, though, I think it would be reasonable to stick to aligning the
standard builtin types, in which case you can do the test without calling
malloc, FWIW.

> there's-the-standard-and-then-there's-real-life<wink>-ly y'rs  - tim

in-theory-theory-and-practice-are-the-same-and-to-hell-with-what-happens-in-
practice-ly y'rs

-Dave




From mal@lemburg.com  Thu Feb 28 13:44:52 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 14:44:52 +0100
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
References: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook>
Message-ID: <3C7E3454.B6690A7@lemburg.com>

Thomas Heller wrote:
> 
> [Jeremy on python-checkins list, PEP 283: Python 2.3 release schedule]
> > Planned features for 2.3
> >     Here are a few PEPs that I know to be under consideration.
> [...]
> >  S   273  Import Modules from Zip Archives             Ahlstrom
> 
> I haven't participated in the discussion of PEP 273,
> IIRC it was mostly about implementation details...
> 
> Wouldn't it be the right time now, instead of complicating
> the builtin import mechanism further, to simplify the builtin
> import code, and use it as the foundation of a Python coded
> implementation - imputil, or better Gordon's iu.py, or whatever?

This would be nice to have, but how do you bootstrap the 
importer if it's written in Python ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From gsw@agere.com  Thu Feb 28 13:47:42 2002
From: gsw@agere.com (Gerald S. Williams)
Date: Thu, 28 Feb 2002 08:47:42 -0500
Subject: [Python-Dev] POSIX thread code
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEBKOAAA.tim.one@comcast.net>
Message-ID: <GBEGLOMMCLDACBPKDIHFIECHCHAA.gsw@agere.com>

Tim Peters wrote:
> Please use current CVS Python for patches.  For example, all the "sema" code
> no longer exists (it was undocumented and unused).

DOH! Sorry, I thought of that after pressing SEND. I had
been using a specific Cygwin version to relay and test the
proposed changes.

DOH again! I just realized that a thread_nt.h patch that I
submitted to the patch manager has the same problem!

I'd better go get the latest CVS sources before commenting
any further about the code...

You and Martin have good points about the implementation,
some of which I had intended to address once I knew which
implementation to target.

It sounds like I'll be targeting the general POSIX thread
version of Python's thread interface code. I'd definitely
at least check for _POSIX_SEMAPHORES before changing the
behavior, though.

One question left is whether to continue calling the file
thread_pthread.h or to rename it thread_posix.h.

> /* Thread package.
>    This is intended to be usable independently from Python.
> 
> That's why there are no calls to Python runtime functions in
> thread_pthread.h (etc) files now; e.g., they call malloc() and free()
> directly, and don't reference any PyExc_XXX symbols.  That's a lot to
> overcome just to break existing code <wink>.

Actually, this isn't true. The current thread_nt.h creates
a Python dictionary to keep track of thread handles. This
was what my earlier patch was for--the dictionary isn't
even used (and creates a memory leak to boot). I proposed
removing it entirely (along with the #include <Python.h>).
I'll update my previous patch with one based on current
CVS sources.

-Jerry

-O Gerald S. Williams, 22Y-103GA : mailto:gsw@agere.com O-
-O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661  O-
-O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592  O-



From guido@python.org  Thu Feb 28 13:49:12 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 28 Feb 2002 08:49:12 -0500
Subject: [Python-Dev] Version updates etc.
Message-ID: <200202281349.g1SDnCi28561@pcp742651pcs.reston01.va.comcast.net>

Maybe it's time for a quick informative PEP explaining where, when and
how version numbers, copyright dates and the like should be updated?
This info is currently spread all over the place (PEP 101 and 102 have
some, the rest is in the minds of various PythonLabs folks).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Feb 28 13:58:43 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 28 Feb 2002 08:58:43 -0500
Subject: [Python-Dev] PEP 0275 -- Switching on Multiple Values
In-Reply-To: Your message of "Thu, 28 Feb 2002 13:11:37 +0100."
 <3C7E1E79.751AF37A@lemburg.com>
References: <3C7E1E79.751AF37A@lemburg.com>
Message-ID: <200202281358.g1SDwh428640@pcp742651pcs.reston01.va.comcast.net>

> I consider the PEP 0275 ready for review by the developers.
> Comments please.
> 
> 	http://python.sourceforge.net/peps/pep-0275.html

I think it's fine to look into this, but I believe for Python
2.3 we should focus more on stabilization than on new language features.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Thu Feb 28 13:57:49 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 28 Feb 2002 14:57:49 +0100
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
References: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook> <3C7E3454.B6690A7@lemburg.com>
Message-ID: <053201c1c05f$e86a9610$e000a8c0@thomasnotebook>

From: "M.-A. Lemburg" <mal@lemburg.com>
> Thomas Heller wrote:
> > Wouldn't it be the right time now, instead of complicating
> > the builtin import mechanism further, to simplify the builtin
> > import code, and use it as the foundation of a Python coded
> > implementation - imputil, or better Gordon's iu.py, or whatever?
> 
> This would be nice to have, but how do you bootstrap the 
> importer if it's written in Python ?
> 
Have you looked at imputil? It bootstraps itself only from builtin
modules (which may be the only mechanism to be in the core).
Probably everything else, even packages can be implemented outside.
How did ni do it?
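
A small sketch of the bootstrapping constraint: an importer written in Python can only rely on modules compiled into the interpreter until it has installed itself. The function name is illustrative, and which modules imputil actually needs is not stated here, so any list passed in is an assumption.

```python
import sys


def missing_from_builtins(required):
    """Return the names in `required` that are not compiled into the interpreter."""
    builtin = set(sys.builtin_module_names)
    return [name for name in required if name not in builtin]
```

An empty result means every listed dependency is available before any file-based import has to work.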

Also I think Gordon's rimport and aimport are good ideas.

Thomas



From guido@python.org  Thu Feb 28 14:03:51 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 28 Feb 2002 09:03:51 -0500
Subject: [Python-Dev] Alignment assumptions
In-Reply-To: Your message of "Thu, 28 Feb 2002 01:57:36 EST."
 <LNBBLJKPBEHFEDALKOLCEEBDOAAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEEBDOAAA.tim.one@comcast.net>
Message-ID: <200202281403.g1SE3ph28665@pcp742651pcs.reston01.va.comcast.net>

> This is in
> 
> #ifdef USE_CACHE_ALIGNED
> 	long	aligner;
> #endif
> 
> and AFAIK nobody ever defines the symbol.  It's a cache-line
> optimization gimmick, but is effectively a nop (except to waste
> memory) on "almost all" machines.  IIRC, the author never measured
> any improvement by using it (not surprising, since I believe almost
> all mallocs at least 8-byte align now).  I vote we delete it.

The malloc 8-byte align argument doesn't apply, since this struct is
used in an array.  Since the struct itself doesn't require alignment
beyond 4 bytes, the array entries can be 12 bytes apart.  So I don't
think this is a nop -- I think it would waste 4 bytes per hash table
entry on most machines.
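
The array-stride argument can be checked with ctypes. The sketch below uses fixed-width 32-bit fields as stand-ins for a long and two pointers on a 32-bit machine; the field names are illustrative and are not copied from dictobject.h.

```python
import ctypes

# Stand-ins for "long hash; PyObject *key; PyObject *value" on a 32-bit box.
BASE_FIELDS = [
    ("hash", ctypes.c_int32),
    ("key", ctypes.c_uint32),
    ("value", ctypes.c_uint32),
]


class Entry(ctypes.Structure):
    """Hash table entry without the extra member."""
    _fields_ = BASE_FIELDS


class PaddedEntry(ctypes.Structure):
    """The same entry with the extra 'aligner' member compiled in."""
    _fields_ = BASE_FIELDS + [("aligner", ctypes.c_int32)]


# Entries in an array sit sizeof(entry) bytes apart, whatever malloc
# guarantees for the start of the array.
STRIDE_PLAIN = ctypes.sizeof(Entry)
STRIDE_PADDED = ctypes.sizeof(PaddedEntry)
```

With these stand-ins the padded variant costs STRIDE_PADDED - STRIDE_PLAIN extra bytes per entry, which is the waste described above.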

This was added by Jack Jansen ages ago -- I think he did measure a
speedup on an old Mac compiler, or he wouldn't have added it, and I
bet there was a #define USE_CACHE_ALIGNED in his config.h then.

But that's all history; I agree it should be deleted.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Thu Feb 28 14:07:47 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 15:07:47 +0100
Subject: [Python-Dev] PEP 0275 -- Switching on Multiple Values
References: <3C7E1E79.751AF37A@lemburg.com> <200202281358.g1SDwh428640@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3C7E39B3.AB4203F7@lemburg.com>

Guido van Rossum wrote:
> 
> > I consider the PEP 0275 ready for review by the developers.
> > Comments please.
> >
> >       http://python.sourceforge.net/peps/pep-0275.html
> 
> I think it's fine to look into this, but I believe for Python
> 2.3 we should focus more on stabilization than on new language features.

Should I move this to 2.4 then ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From mal@lemburg.com  Thu Feb 28 14:12:44 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 15:12:44 +0100
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
References: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook> <3C7E3454.B6690A7@lemburg.com> <053201c1c05f$e86a9610$e000a8c0@thomasnotebook>
Message-ID: <3C7E3ADC.DCED6D15@lemburg.com>

Thomas Heller wrote:
> 
> From: "M.-A. Lemburg" <mal@lemburg.com>
> > Thomas Heller wrote:
> > > Wouldn't it be the right time now, instead of complicating
> > > the builtin import mechanism further, to simplify the builtin
> > > import code, and use it as the foundation of a Python coded
> > > implementation - imputil, or better Gordon's iu.py, or whatever?
> >
> > This would be nice to have, but how do you bootstrap the
> > importer if it's written in Python ?
> >
> Have you looked at imputil? It bootstraps itself only from builtin
> modules (which may be the only mechanism to be in the core).

Sure, but for finding imputil itself you still need the C import
mechanism. Even worse: if Python can't find imputil (for some 
reason), it would be completely broken.

My only gripe with the existing C implementation is that
I would like to have more hooks available. Currently, you
have to replace the complete API in order to add new 
features -- not exactly OO :-/

BTW, how is progress on the ZIP import patch doing ?
Perhaps Jim should just check in what he has so that the code
gets a little more code review...

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From guido@python.org  Thu Feb 28 14:15:58 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 28 Feb 2002 09:15:58 -0500
Subject: [Python-Dev] PEP 0275 -- Switching on Multiple Values
In-Reply-To: Your message of "Thu, 28 Feb 2002 15:07:47 +0100."
 <3C7E39B3.AB4203F7@lemburg.com>
References: <3C7E1E79.751AF37A@lemburg.com> <200202281358.g1SDwh428640@pcp742651pcs.reston01.va.comcast.net>
 <3C7E39B3.AB4203F7@lemburg.com>
Message-ID: <200202281415.g1SEFwU28798@pcp742651pcs.reston01.va.comcast.net>

> > > I consider the PEP 0275 ready for review by the developers.
> > > Comments please.
> > >
> > >       http://python.sourceforge.net/peps/pep-0275.html
> > 
> > I think it's fine to look into this, but I believe for Python
> > 2.3 we should focus more on stabilization than on new language features.
> 
> Should I move this to 2.4 then ?

Yes, if that.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Thu Feb 28 08:35:35 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 28 Feb 2002 03:35:35 -0500
Subject: [Python-Dev] POSIX thread code
In-Reply-To: <m3n0xu9gyb.fsf@mira.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEBKOAAA.tim.one@comcast.net>

[Martin v. Loewis]
> ...
> You didn't ask google for _POSIX_SEMAPHORES, right? The first hit
> brings you to
>
> http://www.opengroup.org/onlinepubs/7908799/xsh/feature.html
>
> _POSIX_SEMAPHORES
>    Implementation supports the Semaphores option.

Good catch!  I didn't get a hit from the Open Group's SUS search box:

    http://www.opengroup.org/onlinepubs/7908799/

> A quick check shows that both Solaris 8 and glibc 2.2 do indeed define
> the symbol.

Cool.

> ...
> For glibc 2.1, semaphores have been reimplemented; they now provide
> FIFO wakeup (sorted by thread priority). Same for mutexes: the
> highest-priority oldest-waiting thread will be resumed.

My impression is that some at Zope Corp would find it hard to believe that
works.

>> 	do {
>> 		... call the right one ...
>> 	} while (status < 0 && errno == EINTR);

> Shouldn't EINTR check for KeyboardInterrupt?

Sorry, too much a can of worms for me -- the question and the possible
answers are irrelevant on my box <wink>.  Complications include that
interrupts weren't able to break out of a wait on a Python lock before (so
you'd change endcase semantics).  If you don't care about that, how would
you go about "checking for KeyboardInterrupt"?  Note thread.c's initial
comment:

/* Thread package.
   This is intended to be usable independently from Python.

That's why there are no calls to Python runtime functions in
thread_pthread.h (etc) files now; e.g., they call malloc() and free()
directly, and don't reference any PyExc_XXX symbols.  That's a lot to
overcome just to break existing code <wink>.
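
The quoted do/while loop simply restarts the call whenever it is cut short by a signal. A Python rendering of the same pattern is sketched below; FlakyWait is a made-up stand-in for a blocking call so the example is self-contained. Like the C loop, it retries silently and makes no attempt to check for KeyboardInterrupt.

```python
import errno


def retry_on_eintr(call, *args):
    """Keep calling `call` until it completes without being interrupted."""
    while True:
        try:
            return call(*args)
        except InterruptedError:
            # Equivalent of (status < 0 && errno == EINTR): try again.
            continue


class FlakyWait:
    """Hypothetical blocking call that is interrupted a fixed number of times."""

    def __init__(self, interruptions):
        self.remaining = interruptions
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.remaining > 0:
            self.remaining -= 1
            raise InterruptedError(errno.EINTR, "interrupted system call")
        return 0
```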



From jim@interet.com  Thu Feb 28 14:44:44 2002
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 28 Feb 2002 09:44:44 -0500
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
References: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook> <3C7E3454.B6690A7@lemburg.com> <053201c1c05f$e86a9610$e000a8c0@thomasnotebook> <3C7E3ADC.DCED6D15@lemburg.com>
Message-ID: <3C7E425C.5060003@interet.com>

M.-A. Lemburg wrote:


> Sure, but for finding imputil itself you still need the C import
> mechanism. Even worse: if Python can't find imputil (for some 
> reason), it would be completely broken.


The other objection raised at the time was the possible
slow down of imports.

I think the existing C module search code is basically
good, although I wouldn't mind moving module import into
a Python method.  But since the C works, I have little
motivation to replace it.

 
> My only gripe with the existing C implementation is that
> I would like to have more hooks available. Currently, you
> have to replace the complete API in order to add new 
> features -- not exactly OO :-/


My code uses os.listdir to cache directory contents, but
defers its use until the os module can be imported using
the C import code.  I think a similar trick could be used
to replace imports with a new module.  This would make it
easy to replace imports.  But this would not
make it easy to add features unless a module were available
which implemented the current import semantics in Python.
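
A minimal sketch of the directory-caching idea, assuming a cache keyed by directory name. The class and parameter names are illustrative and are not taken from the patch; the sketch also leaves out the bootstrapping step of deferring the cache until os has been imported.

```python
import os


class DirectoryCache:
    """Remember each directory listing so repeated lookups avoid the filesystem."""

    def __init__(self, lister=os.listdir):
        self._lister = lister
        self._listings = {}

    def contains(self, directory, filename):
        names = self._listings.get(directory)
        if names is None:
            try:
                names = frozenset(self._lister(directory))
            except OSError:
                names = frozenset()  # unreadable or missing directory
            self._listings[directory] = names
        return filename in names
```

The listing is read once per directory, so it pays off only when the same directory is probed several times, and it will not notice files added later.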


> BTW, how is progress on the ZIP import patch doing ?
> Perhaps Jim should just check in what he has so that the code
> gets a little more code review...


The code is "done" and has been in SourceForge patch 492105
for some time.

I am leaving for Panama tomorrow for 8 days, so if I
seem to disappear, that's why.  I would be happy to work
hard on this after I get back, because I think it is an
important addition for Python.

JimA






From thomas.heller@ion-tof.com  Thu Feb 28 14:57:07 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 28 Feb 2002 15:57:07 +0100
Subject: [Python-Dev] Version updates etc.
References: <200202281349.g1SDnCi28561@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <05e501c1c068$3124c670$e000a8c0@thomasnotebook>

From: "Guido van Rossum" <guido@python.org>
> Maybe it's time for a quick informative PEP explaining where, when and
> how version numbers, copyright dates and the like should be updated?
> This info is currently spread all over the place (PEP 101 and 102 have
> some, the rest is in the minds of various PythonLabs folks).

This PEP should also define the policy for the distutils version
number - when there is one. Maybe distutils should simply use the
Python version number, because Python is released more often than
distutils.

Thomas



From gmcm@hypernet.com  Thu Feb 28 15:55:39 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 28 Feb 2002 10:55:39 -0500
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
In-Reply-To: <3C7E3ADC.DCED6D15@lemburg.com>
Message-ID: <3C7E0CAB.19066.47536147@localhost>

[M.-A. Lemburg]
> > > This would be nice to have, but how do you
> > > bootstrap the importer if it's written in Python ?

The response to "let's revamp C import" is "Oh no,
we need a chance to play with it in Python first."

[Thomas Heller]
> > Have you looked at imputil? It bootstraps itself only
> > from builtin modules (which may be the only mechanism
> > to be in the core).

True when Greg wrote it, but strop is now
deprecated, and not necessarily builtin. It's
still the best route, because strop has no
dependencies, while string does.
 
> Sure, but for finding imputil itself you still need the
> C import mechanism. Even worse: if Python can't find
> imputil (for some reason), it would be completely
> broken.

If Python can't find the std lib, it's broken. No
change there.
 
> My only gripe with the existing C implementation is
> that I would like to have more hooks available.

All the more reason to try it in Python first. There's
never been agreement about what hooks should
be available. The import-sig was founded so ihooks
defenders could hash it out with imputil defenders
(the ihooks camp has never said a word).

It's my observation that most import hacks these
days are really namespace hacks anyway (that
is, they do a relatively normal import, and then
alter the way it's exposed so that "replace dots
with slashes and look in the filesystem" no
longer applies).

-- Gordon
http://www.mcmillan-inc.com/



From gmcm@hypernet.com  Thu Feb 28 15:55:39 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 28 Feb 2002 10:55:39 -0500
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
In-Reply-To: <3C7E425C.5060003@interet.com>
Message-ID: <3C7E0CAB.6729.475361AB@localhost>

On 28 Feb 2002 at 9:44, James C. Ahlstrom wrote:

> The other objection raised at the time was the
> possible slow down of imports. 

imputil was 30 to 40% slower than C import. iu
is about 10 to 15% slower under normal usage, but
can be faster if you use archives and arrange sys.path
intelligently.
 
> I think the existing C module search code is
> basically good, although I wouldn't mind moving
> module import into a Python method.  But since the C
> works, I have little motivation to replace it. 

It works because its implementation is the
definition of what works. Note that while the
import namespace (pkg.submodule.module)
is mapped to the filesystem, the two namespaces
are not isomorphic.
 
> My code uses os.listdir to cache directory
> contents, but defers its use until the os module can
> be imported using the C import code. 

A win over some threshold of number of hits on
that directory; a loss under that threshold.

> I think a
> similar trick could be used to replace imports with
> a new module.  This would make it easy to replace
> imports.  But this would not make it easy to add
> features unless a module were available which
> implemented the current import semantics in Python. 

The only incompatibility I'm aware of in iu.py is that
it doesn't have an import lock.
 

-- Gordon
http://www.mcmillan-inc.com/



From skip@pobox.com  Thu Feb 28 16:15:38 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 28 Feb 2002 10:15:38 -0600
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
In-Reply-To: <3C7E0CAB.19066.47536147@localhost>
References: <3C7E3ADC.DCED6D15@lemburg.com>
 <3C7E0CAB.19066.47536147@localhost>
Message-ID: <15486.22442.738570.615670@beluga.mojam.com>

    Gordon> [Thomas Heller]
    >> > Have you looked at imputil? It bootstraps itself only from builtin
    >> > modules (which may be the only mechanism to be in the core).

    Gordon> True when Greg wrote it, but strop is now deprecated, and not
    Gordon> necessarily builtin. It's still the best route, because strop
    Gordon> has no dependencies, while string does.
 
What do strop or string provide that string methods don't?  It's likely that
if you needed to import either in the past, you don't need to now.
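
For the kind of name handling an importer does, string methods are enough and need no import at all. A small sketch in current Python, with illustrative function names:

```python
def split_module_name(dotted):
    """Split 'pkg.sub.mod' into its parts using only string methods."""
    return dotted.split(".")


def parent_package(dotted):
    """Return the enclosing package name, or '' for a top-level module."""
    head, _, _ = dotted.rpartition(".")
    return head
```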

Skip


From sdm7g@virginia.edu  Thu Feb 28 16:15:18 2002
From: sdm7g@virginia.edu (Steven Majewski)
Date: Thu, 28 Feb 2002 11:15:18 -0500 (EST)
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
In-Reply-To: <3C7E3ADC.DCED6D15@lemburg.com>
Message-ID: <Pine.OSX.4.40.0202281107120.2955-100000@d-128-54-224.bootp.virginia.edu>


On Thu, 28 Feb 2002, M.-A. Lemburg wrote:

> My only gripe with the existing C implementation is that
> I would like to have more hooks available. Currently, you
> have to replace the complete API in order to add new
> features -- not exactly OO :-/

It might be time to consider, rather than a special case for
zip files only, adding an extensible import mechanism ( something
like the protocol or mime-type handlers for browsers ).

If there's a zipfile in sys.path, then import calls the zipfile
handler to search it, if there's a URL in the path, it calls
a handler for that, etc. ( Maybe even a url for some sort of
directory service that finds the module for you. )
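
The dispatch suggested here can be sketched as a lookup from the kind of sys.path entry to a handler. Everything below is hypothetical: the classification rules, the handler signature (entry, name), and the function names are assumptions made for illustration only.

```python
def classify_path_entry(entry):
    """Guess which kind of handler a sys.path entry needs."""
    if "://" in entry:
        return "url"
    if entry.lower().endswith(".zip"):
        return "zip"
    return "directory"


def search_path(name, path, handlers):
    """Ask the handler registered for each entry's kind; the first hit wins."""
    for entry in path:
        handler = handlers.get(classify_path_entry(entry))
        if handler is None:
            continue  # nobody registered for this kind of entry
        found = handler(entry, name)
        if found is not None:
            return found
    return None
```

New kinds of path entry then only need a new classification rule and a registered handler, rather than a change to the search loop.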

-- Steve       [Obviously thinking about TimBL's talk...]




From barry@zope.com  Thu Feb 28 16:17:01 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 28 Feb 2002 11:17:01 -0500
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>
 <15485.25422.524082.109890@anthem.wooz.org>
 <3C7DE6DC.893E594B@lemburg.com>
Message-ID: <15486.22525.324049.844325@anthem.wooz.org>

[This thread probably ought to be moved to i18n-sig, so I'm CC'ing
them and will remove all future cc's to python-dev.  -BAW]

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> You could (and probably should) add Tamito's codecs in
    MAL> Python, but the others have licensing problems :-/

I believe I am using Tamito KAJIYAMA's codecs, from:

    http://pseudo.grad.sccs.chukyo-u.ac.jp/~kajiyama/python/

Or were you thinking about some different Japanese codecs?  The ones
at this url are BSD-ish and so should be compatible with the PSF
license, GPL, etc.

    MAL> It shouldn't be hard though for native speakers and
    MAL> programmers to build upon the work of Tamito and get those
    MAL> codecs done as well. Alternatively, the PSF or some company
    MAL> interested in having these codecs available could fund the
    MAL> development.

All good points.  I still think that giving more visibility to the
codecs (i.e. adding them to the Python distro) would help bring muscle
to the effort.

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    MvL> I would not recommend incorporating any of this into Python
    MvL> without asking the author(s). When doing so, it would be
    MvL> appropriate, IMO, to ask them whether they would fill out the
    MvL> contributor agreement. Then, the presumed licensing problems
    MvL> would be gone.

Agreed on both points!

-Barry


From barry@zope.com  Thu Feb 28 16:18:21 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 28 Feb 2002 11:18:21 -0500
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <3C7D42B6.A88568CD@lemburg.com>
 <15485.25475.913116.826208@anthem.wooz.org>
 <3C7DEE41.F31FAEA@lemburg.com>
Message-ID: <15486.22605.863259.997769@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> Hmm, make that ZIP-ball (we have no .tar support in the
    MAL> standard lib, only ZIP-file support). Also, the setup.py will
    MAL> have to check whether it has to grab a level 0 compression
    MAL> ZIP file or a level 9 one.

    MAL> Nothing which cannot be done, of course... net installers are
    MAL> quite common these days (see e.g. Mozilla, IE and others), so
    MAL> people are probably quite used to them already. And we can
    MAL> always provide a full install download as well.

Isn't there some PEP about all this? <wink>

-Barry


From tree@basistech.com  Thu Feb 28 16:27:41 2002
From: tree@basistech.com (Tom Emerson)
Date: Thu, 28 Feb 2002 11:27:41 -0500
Subject: [I18n-sig] Re: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
In-Reply-To: <15486.22525.324049.844325@anthem.wooz.org>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>
 <15485.25422.524082.109890@anthem.wooz.org>
 <3C7DE6DC.893E594B@lemburg.com>
 <15486.22525.324049.844325@anthem.wooz.org>
Message-ID: <15486.23165.349397.260521@magrathea.basistech.com>

I've been working on a unified architecture for the Asian codecs. I
presented a paper about it at the last Unicode Conference in
Washington D.C. You can find it at

http://www.basistech.com/articles/python-zh-transcoding_iuc20_TE2.pdf

The presentation concentrates on Chinese, but the architecture will
work for JK as well.

    -tree

-- 
Tom Emerson                                          Basis Technology Corp.
Sr. Computational Linguist                         http://www.basistech.com
  "Beware the lollipop of mediocrity: lick it once and you suck forever"


From guido@python.org  Thu Feb 28 16:31:10 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 28 Feb 2002 11:31:10 -0500
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
In-Reply-To: Your message of "Thu, 28 Feb 2002 10:55:39 EST."
 <3C7E0CAB.19066.47536147@localhost>
References: <3C7E0CAB.19066.47536147@localhost>
Message-ID: <200202281631.g1SGVAw29092@pcp742651pcs.reston01.va.comcast.net>

> True when Greg wrote it, but strop is now
> deprecated, and not necessarily builtin. It's
> still the best route, because strop has no
> dependencies, while string does.

Have a look at the code.  It no longer imports strop -- it uses string
methods now. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com  Thu Feb 28 16:38:27 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 28 Feb 2002 11:38:27 -0500
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
In-Reply-To: <15486.22442.738570.615670@beluga.mojam.com>
References: <3C7E0CAB.19066.47536147@localhost>
Message-ID: <3C7E16B3.1124.477A9147@localhost>

On 28 Feb 2002 at 10:15, Skip Montanaro wrote:

> What do strop or string provide that string methods
> don't?  It's likely that if you needed to import
> either in the past, you don't need to now. 

Oops, you're right. iu doesn't use strop.
Just sys, imp and marshal (and optionally
Win32api if required and found).

-- Gordon
http://www.mcmillan-inc.com/



From mal@lemburg.com  Thu Feb 28 16:40:28 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 17:40:28 +0100
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>
 <15485.25422.524082.109890@anthem.wooz.org>
 <3C7DE6DC.893E594B@lemburg.com> <15486.22525.324049.844325@anthem.wooz.org>
Message-ID: <3C7E5D7C.A62CC10F@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:
> 
>     MvL> I would not recommend to incorporate any of this into Python
>     MvL> without asking the author(s). When doing so, it would be
>     MvL> appropriate, IMO, to ask them whether they would fill out the
>     MvL> contributor agreement. Then, the presumed licensing problems
>     MvL> would be gone.
> 
> Agreed on both points!

+1.

The PSF will have to agree on the contribution docs first,
though. Since there's no discussion on the PSF docs discussion
list, I suppose everybody is happy with them :-) 

BTW, I was referring to the other codecs in the python-codecs
project on SF. Most of those are encumbered by the GPL and thus
unusable in non-GPL projects.

Tamito has switched to a BSD-license after some private 
discussions about this, which is goodness :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From david.abrahams@rcn.com  Thu Feb 28 16:45:59 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Thu, 28 Feb 2002 11:45:59 -0500
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
References: <3C7E1533.4516.4774B668@localhost>
Message-ID: <18e001c1c078$1cb02940$0500a8c0@boostconsulting.com>

----- Original Message -----
From: "Gordon McMillan" <gmcm@hypernet.com>

> That's not even part of import. import is done when it
> has [name1, name2, name3]. It's ceval.c that
> does the binding.

Yep, so I discovered.

> Sounds to me like you want to override __setitem__
> on the module's __dict__.

Not necessarily, though that might be one approach. I might want to treat
explicit setting of attributes differently from an import.
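
A minimal sketch of that distinction in present-day Python 3, under the
assumption that subclassing types.ModuleType is acceptable. The names
WatchedModule, hypothetical_mod and _explicit_sets are made up for
illustration. An explicit "mod.x = ..." goes through __setattr__, while
a name bound by code running in the module's namespace is stored
straight into the __dict__ and is not intercepted:

```python
import types


class WatchedModule(types.ModuleType):
    """Hypothetical module subclass that records explicit attribute sets."""

    def __setattr__(self, name, value):
        # Record the name, then store it the normal way.
        self.__dict__.setdefault("_explicit_sets", []).append(name)
        super().__setattr__(name, value)


mod = WatchedModule("hypothetical_mod")
mod.x = 1                        # explicit set: goes through __setattr__
exec("import os", mod.__dict__)  # binding done by executing code: bypasses it

explicit = mod.__dict__["_explicit_sets"]
```

Here "os" ends up in the module's __dict__ without ever passing through
__setattr__, which is roughly the split between the two cases.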

> Tricky, 'cause a module
> is hardly in charge of its own __dict__.
>
> But if you see value in it, you'd better pursue it
> now, because Jeremy's plans for optimization of
> module __dict__ will likely make things harder.

I thought this /was/ pursuing it. What did you have in mind?

-Dave



From barry@zope.com  Thu Feb 28 16:46:35 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 28 Feb 2002 11:46:35 -0500
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>
 <15485.25422.524082.109890@anthem.wooz.org>
 <3C7DE6DC.893E594B@lemburg.com>
 <15486.22525.324049.844325@anthem.wooz.org>
 <3C7E5D7C.A62CC10F@lemburg.com>
Message-ID: <15486.24299.262770.702438@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> The PSF will have to agree on the contribution docs first,
    MAL> though. Since there's no discussion on the PSF docs
    MAL> discussion list, I suppose everybody is happy with them :-)

I am.  What do we need to do next?

    MAL> BTW, I was referring to the other codecs in the python-codecs
    MAL> project on SF. Most of those are encumbered by the GPL and
    MAL> thus unusable in non-GPL projects.

    MAL> Tamito has switched to a BSD-license after some private 
    MAL> discussions about this, which is goodness :-)

From what I've been told, the Japanese codecs are the most stable.
I'm really not qualified to judge though.

-Barry


From mal@lemburg.com  Thu Feb 28 16:54:16 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 17:54:16 +0100
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>
 <15485.25422.524082.109890@anthem.wooz.org>
 <3C7DE6DC.893E594B@lemburg.com>
 <15486.22525.324049.844325@anthem.wooz.org>
 <3C7E5D7C.A62CC10F@lemburg.com> <15486.24299.262770.702438@anthem.wooz.org>
Message-ID: <3C7E60B8.41458EE8@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "MAL" == M  <mal@lemburg.com> writes:
> 
>     MAL> The PSF will have to agree on the contribution docs first,
>     MAL> though. Since there's no discussion on the PSF docs
>     MAL> discussion list, I suppose everybody is happy with them :-)
> 
> I am.  What do we need to do next?

Wait. The deadline is mid-March. After that the docs will have
to go to the lawyer and only then we can use them...
 
>     MAL> BTW, I was referring to the other codecs in the python-codecs
>     MAL> project on SF. Most of those are encumbered by the GPL and
>     MAL> thus unusable in non-GPL projects.
> 
>     MAL> Tamito has switched to a BSD-license after some private
>     MAL> discussions about this, which is goodness :-)
> 
> From what I've been told, the Japanese codecs are the most stable.
> I'm really not qualified to judge though.

Me neither, but Tamito has put a lot of work into them and
with his move to C for the codec engine, speed is not an
issue anymore either.

Also, I've asked him about his thoughts about having them 
included in the core before. He would be happy with that
move.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From barry@zope.com  Thu Feb 28 16:56:42 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 28 Feb 2002 11:56:42 -0500
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code
 Encoding)
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com>
 <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>
 <15485.25422.524082.109890@anthem.wooz.org>
 <3C7DE6DC.893E594B@lemburg.com>
 <15486.22525.324049.844325@anthem.wooz.org>
 <3C7E5D7C.A62CC10F@lemburg.com>
 <15486.24299.262770.702438@anthem.wooz.org>
 <3C7E60B8.41458EE8@lemburg.com>
Message-ID: <15486.24906.938762.229351@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> Wait. The deadline is mid-March. After that the docs will
    MAL> have to go to the lawyer and only then we can use them...

Right, I forgot. ;)
 
    MAL> Me neither, but Tamito has put a lot of work into them and
    MAL> with his move to C for the codec engine, speed is not an
    MAL> issue anymore either.

    MAL> Also, I've asked him about his thoughts about having them 
    MAL> included in the core before. He would be happy with that
    MAL> move.

Cool!
-Barry


From mal@lemburg.com  Thu Feb 28 17:00:24 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 18:00:24 +0100
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
References: <04da01c1c05a$886255a0$e000a8c0@thomasnotebook> <3C7E3454.B6690A7@lemburg.com> <053201c1c05f$e86a9610$e000a8c0@thomasnotebook> <3C7E3ADC.DCED6D15@lemburg.com> <3C7E425C.5060003@interet.com>
Message-ID: <3C7E6228.F8B3E59D@lemburg.com>

"James C. Ahlstrom" wrote:
> 
> M.-A. Lemburg wrote:
> > BTW, how is progress on the ZIP import patch doing ?
> > Perhaps Jim should just check in what he has so that the code
> > gets a little more code review...
> 
> The code is "done" and has been in SourceForge patch 492105
> for some time.
> 
> I am leaving for Panama tomorrow for 8 days, so if I
> seem to disappear, that's why.  I would be happy to work
> hard on this after I get back, because I think it is an
> important addition for Python.

Great !

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jim@interet.com  Thu Feb 28 17:02:27 2002
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 28 Feb 2002 12:02:27 -0500
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
References: <3C7E3ADC.DCED6D15@lemburg.com>        <3C7E0CAB.19066.47536147@localhost> <15486.22442.738570.615670@beluga.mojam.com>
Message-ID: <3C7E62A3.5090404@interet.com>

Skip Montanaro wrote:

>     Gordon> [Thomas Heller]
>     >> > Have you looked at imputil? It bootstraps itself only from builtin
>     >> > modules (which may be the only mechanism to be in the core).
> 
>     Gordon> True when Greg wrote it, but strop is now deprecated, and not
>     Gordon> necessarily builtin. It's still the best route, because strop
>     Gordon> has no dependencies, while string does.
>  
> What do strop or string provide that string methods don't?  It's likely that
> if you needed to import either in the past, you don't need to now.


The real problem isn't the string module, it is the os module.  Any
importer will need this.  The usual hack is to duplicate its logic
in the my_importer module.  That is, the selection of the correct
builtin os functions.

And MAL's point that you need a C importer to import
your Python importer is inescapable.

And suppose the whole Python library is in a zip file?  You
must have additional C code to extract and load your Python
importer as well as the modules it imports.

It seems to me that the correct solution is to use the C importer
to import the my_importer Python module, plus all the imports
that my_importer needs.  Then you switch to resolving imports
with my_importer.py.  Something like this is already in my
import.c patch.

I don't think this discussion should hold up installing
my zip import patches.  I believe these patches are required,
and can be the basis of a subsequent patch to add an external
Python importer.

JimA



From mal@lemburg.com  Thu Feb 28 17:09:46 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Feb 2002 18:09:46 +0100
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
References: <3C7E0CAB.19066.47536147@localhost>
Message-ID: <3C7E645A.3D464207@lemburg.com>

Gordon McMillan wrote:
> 
> [M.-A. Lemburg]
> > > > This would be nice to have, but how do you
> > > > bootstrap the importer if it's written in Python ?
> 
> The response to "let's revamp C import" is "Oh no,
> we need a chance to play with it in Python first."

I think you misunderstood my request: I *don't* want
to revamp import.c, I would just like some extra hooks
to be able to only replace those few parts which I'd
like to extend from time to time, e.g. instead of replacing
the complete __import__ machinery, it would be nice
to have a callback hook in the finder and another one
in the module loader.
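
For readers today: later Python versions split import along these lines,
with finder and loader objects registered on sys.meta_path. The sketch
below is a Python 3 illustration of such hooks, not anything proposed in
this thread. InMemoryFinder and hypothetical_inmemory_mod are made-up
names, and serving source text from a dict is only an assumption chosen
to keep the example self-contained:

```python
import importlib.abc
import importlib.util
import sys


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical finder/loader pair serving modules from a dict of source."""

    def __init__(self, sources):
        self.sources = sources

    # Finder hook: decide whether this importer handles the name.
    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.sources:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the rest of the import machinery try

    # Loader hooks: create and populate the module object.
    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        exec(self.sources[module.__name__], module.__dict__)


sys.meta_path.insert(0, InMemoryFinder({"hypothetical_inmemory_mod": "answer = 42\n"}))
import hypothetical_inmemory_mod
```

Only the two pieces of interest are replaced; everything else still goes
through the standard machinery.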

All this has nothing to do with the PEP, though :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jim@interet.com  Thu Feb 28 17:16:05 2002
From: jim@interet.com (James C. Ahlstrom)
Date: Thu, 28 Feb 2002 12:16:05 -0500
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
References: <3C7E0CAB.6729.475361AB@localhost>
Message-ID: <3C7E65D5.4080005@interet.com>

Gordon McMillan wrote:

> On 28 Feb 2002 at 9:44, James C. Ahlstrom wrote:
> 
>>The other objection raised at the time was the
>>possible slow down of imports. 
>>
> 
> imputil was 30 to 40% slower than C import. iu
> is about 10 to 15% slower under normal usage, but
> can be faster if you use archives and arrange sys.path
> intelligently.


I think I can add iu.py as the standard Python importer
to my import.c patches.  That is, if iu.py can be imported
(using C), then it takes over imports.  Note that the C code
changes to import.c are still required.  Also note that iu.py
may be in a zip file, and so the import.c changes are still
required.


>>My code uses os.listdir to cache directory
>>contents, but defers its use until the os module can
>>be imported using the C import code. 
>>
> 
> A win over some threshold of number of hits on
> that directory; a loss under that threshold.


Exactly correct.  It is a tradeoff between the OS caching
directory hits from fopen() versus using a Python cache
and os.listdir().  Dramatic gains are obtained when importing
from network file systems, an important case.
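
A rough sketch of the caching idea in Python 3, not the code from the
patch. DirectoryCache and its contains() method are hypothetical names.
It assumes one os.listdir() call per directory is cheaper than repeated
open attempts, and it ignores case-insensitive file systems:

```python
import os
import tempfile


class DirectoryCache:
    """Hypothetical cache: one os.listdir() per directory, then set lookups."""

    def __init__(self):
        self._listings = {}

    def contains(self, directory, filename):
        names = self._listings.get(directory)
        if names is None:
            try:
                names = frozenset(os.listdir(directory))
            except OSError:
                names = frozenset()  # unreadable or missing directory
            self._listings[directory] = names
        return filename in names


cache = DirectoryCache()
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "spam.py"), "w").close()
    found = cache.contains(tmp, "spam.py")
    missing = cache.contains(tmp, "eggs.py")
    # A file created after the listing was cached is not seen.
    open(os.path.join(tmp, "late.py"), "w").close()
    stale = cache.contains(tmp, "late.py")
```

The stale lookup at the end shows the price paid: the cache only pays
off when a directory is probed often and changes rarely.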

 
JimA




From aahz@rahul.net  Thu Feb 28 18:17:06 2002
From: aahz@rahul.net (Aahz Maruch)
Date: Thu, 28 Feb 2002 10:17:06 -0800 (PST)
Subject: [Python-Dev] PEP 1 update
In-Reply-To: <3C7B6322.440D21E7@lemburg.com> from "M.-A. Lemburg" at Feb 26, 2002 11:27:46 AM
Message-ID: <20020228181706.D0335E8C7@waltz.rahul.net>

M.-A. Lemburg wrote:
> 
> I consider the above PEP ready for review by the developers.
> Please comment.
> 
>     http://python.sourceforge.net/peps/pep-0263.html

After looking at several PEPs over the last couple of days, I suggest
that PEP 1 be updated to require inclusion of the Last-Modified:
field.  At the very least, I suggest that Post-History: be checked more
rigorously.  (PEP 263 contains a Post-History: field, but it is blank.)
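
Such a check could be automated, since PEP headers follow the RFC 822
style. A small sketch, assuming the standard email parser is good enough
for them. pep_header_problems and the sample PEP text are hypothetical,
not an existing tool:

```python
from email import message_from_string


def pep_header_problems(pep_text, required=("Last-Modified", "Post-History")):
    """Return a list of complaints about missing or blank header fields."""
    headers = message_from_string(pep_text)
    problems = []
    for name in required:
        value = headers[name]
        if value is None:
            problems.append(f"{name}: missing")
        elif not value.strip():
            problems.append(f"{name}: blank")
    return problems


# Made-up PEP text with a blank Post-History and no Last-Modified field.
sample = (
    "PEP: 9999\n"
    "Title: Hypothetical Example\n"
    "Post-History:\n"
    "\n"
    "Abstract\n"
)
problems = pep_header_problems(sample)
```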

I don't think it's necessary to retrofit every PEP, but I think that
every PEP up for consideration should be required to comply.
-- 
                      --- Aahz (@pobox.com)

Hugs and backrubs -- I break Rule 6       <*>       http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

We must not let the evil of a few trample the freedoms of the many.


From pedroni@inf.ethz.ch  Thu Feb 28 18:22:40 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Thu, 28 Feb 2002 19:22:40 +0100
Subject: [Python-Dev] PEP 1 update
References: <20020228181706.D0335E8C7@waltz.rahul.net>
Message-ID: <017101c1c084$e85cdaa0$6d94fea9@newmexico>

[Aahz Maruch]
> 
> After looking at several PEPs over the last couple of days, I suggest
> that PEP 1 be updated to require inclusion of the Last-Modified:
> field.  At the very least, I suggest that Post-History: be checked more
> rigorously.  (PEP 263 contains a Post-History: field, but it is blank.)
> 
> I don't think it's necessary to retrofit every PEP, but I think that
> every PEP up for consideration should be required to comply.
> -- 

From some posts on comp.lang.python it seems
that people have some problems keeping track
of PEPs and understanding their status /iter:

- whether they are there hanging around
 from version to version for possible consideration
 until the BDFL picks them up
- whether they are open to changes or just pending
  and pushed for approval  (there is only the draft/final
  distinction)
- wondering whether some things under consideration
  are just oddballs hanging around for long spans of time 
  and why they are not rapidly rejected or improbably accepted.

I know what PEP 1 says, but anyway the PEP
summary and PEP headers don't seem 
to properly and completely 
capture the right information needed
to make sense for a casual reader.

Another problem is that there are PEPs
that have multiple phases but are marked
as finished just because the main changes are
implemented (division changes)

and PEPs with important changes already done 
that are reported somehow just as unimplemented.

Even Alex Martelli was wondering what was
happening e.g. with PEP 246 (I think that was
resolved at IPC10).

Just my impressions.

regards, Samuele Pedroni.





From guido@python.org  Thu Feb 28 18:58:48 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 28 Feb 2002 13:58:48 -0500
Subject: [Python-Dev] Re: Of slots and metaclasses...
In-Reply-To: Your message of "Thu, 28 Feb 2002 09:30:51 EST."
 <Pine.LNX.4.33.0202280901150.4848-100000@penguin.theopalgroup.com>
References: <Pine.LNX.4.33.0202280901150.4848-100000@penguin.theopalgroup.com>
Message-ID: <200202281858.g1SIwm930118@pcp742651pcs.reston01.va.comcast.net>

[Kevin Jacobs wrote me in private to ask my position on __slots__.
I'm posting my reply here, quoting his full message -- I see no reason
to carry this on as a private conversation.  Sorry, Kevin, if this
wasn't your intention.]

> Hi Guido;
> 
> Now that you are back from your travels, I'll start bugging you, as
> gently as possible, for some insight into your intent wrt slots and
> metaclasses.  As you can read from the python-dev archives, I've
> instigated a fair amount of discussion on the topic, though the
> conversation is almost meaningless without your input.

Hi Kevin, you got me to finally browse the thread "Meta-reflections".
My first response was: "you've got it all wrong."  My second response
was a bit more nuanced: "that's not how I intended it to be at all!"
OK, let me elaborate. :-)

You want to be able to find out which instance attributes are defined
by __slots__, so that (by combining this with the instance's __dict__)
you can obtain the full set of attribute values.  But this defeats the
purpose of unifying built-in types and user-defined classes.

A new-style class, with or without __slots__, should be considered no
different from a new-style built-in type, except that all of the
methods happen to be defined in Python (except maybe for inherited
methods).

In order to find all attributes, you should *never* look at __slots__.
You should search the __dict__ of the class and its base classes, in
MRO order, looking for descriptors, and *then* add the keys of the
__dict__ as a special case.  This is how PEP 252 wants it to be.

If the descriptors don't tell you everything you need, too bad -- some
types just are like that.  For example, if you're deriving from a list
or tuple, there's no attribute that leads to the items: you have to
use __len__ and __getitem__ to find out about these, and you have to
"know" that that's how you get at them (although the presence of
__getitem__ should be a clue).

Why do I reject your suggestion of making __slots__ (more) usable for
introspection?  Because it would create another split between built-in
types and user-defined classes: built-in types don't have __slots__,
so any strategy based on __slots__ will only work for user-defined
types.  And that's exactly what I'm trying to avoid!

You may complain that there are so many things to be found in a
class's __dict__, it's hard to tell which things are descriptors.
Actually, it's easy: if it has a __get__ (method) attribute, it's a
descriptor; if it also has a __set__ attribute, it's a data attribute,
otherwise it's a method.  (Note that read-only data attributes have a
descriptor that has a __set__ method that always raises TypeError or
AttributeError.)
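
The lookup and the descriptor test described above can be sketched in
present-day Python 3 roughly as follows. The names describe_attributes,
Base and Derived are made up for illustration. Checking for __get__ and
__set__ on the type of each class __dict__ entry is one reading of the
rule, not code taken from the 2.2 sources:

```python
def describe_attributes(obj):
    """Map attribute name -> 'data', 'method' or 'instance'."""
    found = {}
    seen = set()
    # Walk the class and its bases in MRO order; the first definition wins.
    for klass in type(obj).__mro__:
        for name, value in vars(klass).items():
            if name in seen:
                continue
            seen.add(name)
            kind = type(value)
            if hasattr(kind, "__get__"):  # it is a descriptor
                found[name] = "data" if hasattr(kind, "__set__") else "method"
    # Then add the keys of the instance __dict__ as a special case.
    for name in getattr(obj, "__dict__", {}):
        found.setdefault(name, "instance")
    return found


class Base:
    __slots__ = ("x",)


class Derived(Base):  # no __slots__ here, so instances also get a __dict__
    def method(self):
        return self.x


d = Derived()
d.x = 1  # stored in the slot
d.y = 2  # stored in the instance __dict__
info = describe_attributes(d)
```

Note that __slots__ itself is never consulted: the slot "x" shows up
because its descriptor sits in the class __dict__ of Base.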

Given this viewpoint, you won't be surprised that I have little desire
to implement your other proposals, in particular, I reject all these:

- Proxy the instance __dict__ with something that makes the slots
  visible

- Flatten slot lists and make them immutable

- Alter vars(obj) to return a dict of all attrs

- Flatten slot inheritance (see below)

- Change descriptors to fall back on class variables for unfilled
  slots

I'll be the first to admit that some details are broken in 2.2.

In particular, the fact that instances of classes with __slots__
appear picklable but lose all their slot values is a bug -- these
should either not be picklable unless you add a __reduce__ method, or
they should be pickled properly.  This is a bug of the same kind as
the problem with pickling time.localtime() (SF bug #496873), so I'm
glad this problem has now been entered in the SF database (as
#520644).  I haven't made up my mind on how to fix this -- it would be
nice if __slots__ would automatically be pickled, but it's tricky
(although I think it's doable -- without ever referencing the
__slots__ variable :-).
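
A sketch of the __reduce__ route in Python 3 syntax. Point is a
hypothetical class. Current Python 3 releases pickle slot values by
default, so the explicit method here only illustrates the workaround and
is not something modern code needs:

```python
import pickle


class Point:
    """Hypothetical slotted class made picklable through __reduce__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __reduce__(self):
        # Tell pickle to rebuild the object by calling Point(x, y);
        # the __slots__ variable is never referenced.
        return (Point, (self.x, self.y))


restored = pickle.loads(pickle.dumps(Point(1, 2)))
```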

I'm not so sure that the fact that you can "override" or "hide" slots
defined in a base class should be classified as a bug.  I see it more
as a "don't do that" issue: If you're deriving a class that overrides
a base class slot, you haven't done your homework.  PyChecker could
warn about this though.

I think you're mostly right with your proposal "Update standard
library to use new reflection API".  Insofar as there are standard
support classes that use introspection to provide generic services for
classic classes, it would be nice if these could work correctly for
new-style classes even if they use slots or are derived from
non-trivial built-in types like dict or list.  This is a big job, and
I'd love some help.  Adding the right things to the inspect module
(without breaking pydoc :-) would probably be a first priority.

Now let me get to the rest of your letter.

> So I've been sitting on my hands and waiting for you to dive in and
> set us all straight.  Actually, that is not entirely true; I picked
> up a copy of 'Putting Metaclasses to Work' and read it cover to
> cover.

Wow.  That's more than I've ever managed (due to what I hope can still
be called a mild case of ADD :-).  But I think I studied all the
important parts.  (I should ask the authors for a percentage -- I
think they've made quite some sales because of my frequent quoting of
their book. :-)

> Many things you've done in Python 2.2 are much clearer now,
> though new questions have emerged.  I would greatly appreciate it if
> you would answer a few of them at a time.  In return, I will
> synthesize your ideas with my own and compile a document that
> clearly defines and justifies the new Python object model and
> metaclass protocol.

Maybe you can formulate it as a set of tentative clarifying patches to
PEPs 252, 253, and 254?

> To start, there are some fairly broad and overlapping questions to get
> started:
> 
>   1) How much of IBM's SOMobject MetaClass Protocol (SOMMCP) do you
>      want to adapt to Python?  For now (Python 2.2/2.3/2.4 time
>      frame)?  And in the future (Python 3.0/3000)?

Not much more than what I've done so far.  A lot of what they describe
is awfully C++ specific anyway; a lot of the things they struggle with
(such as the redispatch hacks and requestFirstCooperativeMethodCall)
can be done so much simpler in a dynamic language like Python that I
doubt we should follow their examples literally.

>   2) In Python 2.2, what intentional deviations have you chosen from the
>      SOMMCP and what differences are incidental or accidental?

Hard to say, unless you specifically list all the things that you
consider part of the SOMMCP.  Here are some things I know:

- In descrintro.html, I describe a slightly different algorithm for
  calculating the MRO than they use.  But my implementation is theirs
  -- I didn't realize the two were different until it was too late,
  and it only matters in uninteresting corner cases.

- I currently don't complain when there are serious order
  disagreements.  I haven't decided yet whether to make these an error
  (then I'd have to implement an overridable way of defining
  "serious") or whether it's more Pythonic to leave this up to the
  user.

- I don't enforce any of their rules about cooperative methods.  This
  is Pythonic: you can be cooperative but you don't have to be.  It
  would also be too incompatible with current practice (I expect few
  people will adopt super().)

- I don't automatically derive a new metaclass if multiple base
  classes have different metaclasses.  Instead, I see if any of the
  metaclasses of the bases is usable (i.e. I don't need to derive one
  anyway), and then use that; instead of deriving a new metaclass, I
  raise an exception.  To fix this, the user can derive a metaclass
  and provide it in the __metaclass__ variable in the class statement.
  I'm not sure whether I should automatically derive metaclasses; I
  haven't got enough experience with this stuff to get a good feel for
  when it's needed.  Since I expect that non-trivial metaclasses are
  often implemented in C, I'm not so comfortable with automatically
  merging multiple metaclasses -- I can't prove to myself that it's
  always safe.

- I don't check that a base class doesn't override instance
  variables.  As I stated above, I don't think I should, but I'm not
  100% sure.
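
The metaclass point above can be illustrated with a short sketch. It
uses the Python 3 "metaclass=" keyword rather than the 2.2-era
__metaclass__ variable, and all class names are hypothetical:

```python
class MetaA(type):
    pass


class MetaB(type):
    pass


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


# Neither MetaA nor MetaB is usable for both bases, so no metaclass
# is derived automatically and the class statement raises TypeError.
try:
    class Broken(A, B):
        pass
except TypeError as exc:
    conflict = str(exc)
else:
    conflict = None


# The fix: derive a metaclass by hand and name it explicitly.
class MetaAB(MetaA, MetaB):
    pass


class Works(A, B, metaclass=MetaAB):
    pass
```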

>   3) Do you intend to enforce monotonicity for all methods and slots?
>      (Clearly, this is not desirable for instance __dict__ attributes.)

If I understand the concept of monotonicity, no.  Python traditionally
allows you to override methods in ways that are incompatible with the
contract of the base class method, and I don't intend to forbid this.
It would be good if PyChecker checked for accidental mistakes in this
area, and maybe there should be a way to declare that you do want this
enforced; I don't know how though.

There's also the issue that (again, if I remember the concepts right)
there are some semantic requirements that would be really hard to
check at compile time for Python.

>   4) Should descriptors work cooperatively?  i.e., allowing a
>      'super' call within __get__ and __set__.

I don't think so, but I haven't thought through all the consequences
(I'm not sure why you're asking this, and whether it's still a
relevant question after my responses above).  You can do this for
properties though.

Thanks for the dialogue!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gmcm@hypernet.com  Thu Feb 28 19:06:33 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 28 Feb 2002 14:06:33 -0500
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
In-Reply-To: <3C7E65D5.4080005@interet.com>
Message-ID: <3C7E3969.14214.480227F9@localhost>

On 28 Feb 2002 at 12:16, James C. Ahlstrom wrote:

> I think I can add iu.py as the standard Python
> importer to my import.c patches.  That is, if iu.py
> can be imported (using C), then it takes over
> imports.  Note that the C code changes to import.c
> are still required. Also note that iu.py may be in a
> zip file, and so the import.c changes are still
> required. 

Thanks, but I don't want iu.py to be used instead
of c import in normal Python installations. I'll use
it that way in Installer, but since that's an
embedding app, it's not hard to bootstrap.

In the context of python-dev, iu is, I think, useful
because it (a) emulates nearly exactly Python's
import rules and (b) it does so in a nicely OO
framework with some interesting facilities. In
other words, as a model of what some future
revamp of c import might be.
 

-- Gordon
http://www.mcmillan-inc.com/



From gmcm@hypernet.com  Thu Feb 28 19:06:33 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 28 Feb 2002 14:06:33 -0500
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
In-Reply-To: <3C7E62A3.5090404@interet.com>
Message-ID: <3C7E3969.16843.4802279F@localhost>

On 28 Feb 2002 at 12:02, James C. Ahlstrom wrote:

> The real problem isn't the string module, it is the
> os module.  Any importer will need this.  The usual
> hack is to duplicate its logic in the my_importer
> module. That is, the selection of the correct
> builtin os functions. 

getpath.c has to invent the same filesystem 
primitives, since it runs before builtins are loaded.
 
> And MAL's point that you need a C importer to import
> your Python importer is inescapable.

Everybody has the same bootstrap problem.

> And suppose the whole Python library is in a zip
> file? You must have additional C code to extract
> and load your Python importer as well as the modules
> it imports. 

Right. Primitives have to come from somewhere.
 
> It seems to me that the correct solution is to use
> the C importer to import the my_importer Python
> module, plus all the imports that my_importer needs.
>  Then you switch to resolving imports with
> my_importer.py. Something like this is already in
> my import.c patch. 

Which is what almost everybody does, the 
exception being macPython. They use resources
a lot, and most of the import extensions are built
in at a very low level.
 
> I don't think this discussion should hold up
> installing my zip import patches. 

Not at all. Getting zip files onto sys.path is
a very good thing.
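
For readers who want to see what "zip files on sys.path" looks like in practice, here is a minimal sketch. It assumes a current Python 3 interpreter, where zip import is built in, rather than the import.c patches discussed in this thread; the helper name import_from_zip and the archive name bundle.zip are made up for illustration.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write `source` into a zip archive and import it via sys.path."""
    tmpdir = tempfile.mkdtemp()
    archive = os.path.join(tmpdir, "bundle.zip")

    # Store the module at the top level of the archive.
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(module_name + ".py", source)

    # The archive path itself goes on sys.path, just like a directory.
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(archive)


mod = import_from_zip("zipdemo_hello", "GREETING = 'hi'\n")
```

The imported module's __file__ then points inside the archive, which is one way to confirm it did not come from the filesystem.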

-- Gordon
http://www.mcmillan-inc.com/



From nas@python.ca  Thu Feb 28 19:33:28 2002
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 28 Feb 2002 11:33:28 -0800
Subject: [Python-Dev] PEP 273 - Import from Zip Archives
In-Reply-To: <3C7E645A.3D464207@lemburg.com>; from mal@lemburg.com on Thu, Feb 28, 2002 at 06:09:46PM +0100
References: <3C7E0CAB.19066.47536147@localhost> <3C7E645A.3D464207@lemburg.com>
Message-ID: <20020228113328.A3275@glacier.arctrix.com>

M.-A. Lemburg wrote:
> I think you misunderstood my request: I *don't* want
> to revamp import.c, I would just like some extra hooks
> to be able to only replace those few parts which I'd
> like to extend from time to time, e.g. instead of replacing
> the complete __import__ machinery, it would be nice
> to have a callback hook in the finder and another one
> in the module loader.

I have some rough code that does this.  I've stuck it on my web
site at:

    http://arctrix.com/nas/python/cimport-20020228.tar.gz

if anyone is interested.  I found that for my application (importing
.ptl modules that need to be compiled with a different compiler),
imputil did not have the right kind of hooks.  ihooks was better but
still clunky and slow.


  Neil


From tim.one@comcast.net  Thu Feb 28 19:42:35 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 28 Feb 2002 14:42:35 -0500
Subject: [Python-Dev] Alignment assumptions
In-Reply-To: <172601c1c05e$0c0ea630$0500a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEEKOAAA.tim.one@comcast.net>

[Jack, skip to the end please]

[David Abrahams, on
    Include/objimpl.h:275: double dummy;  /* force worst-case alignment */
]
> As I read the code, it affects all types (doesn't this header begin every
> object, regardless of its GC flags?)

Nope, only objects that go through _PyObject_GC_Malloc().  It could be a
nightmare if, e.g., every string and int object consumed another (at least)
12 bytes.

> and I think that's a very unhappy circumstance for your numeric
> community. Remember, the type that raised the alarm here was just a
> long double.

The *Python* numeric community is far more likely to embed a float than a
long double, and in any case seems unlikely to build a container type
mixing long double with PyObject* members (i.e., one that ought to
participate in cyclic gc).

I expect we have a blind spot towards long double in general since Python
doesn't expose or use such a thing, all the developers run on platforms
where (as far as they know <wink>) it's the same as a double, and "long
double" was introduced after K&R (so some old-timers likely aren't even
aware C89 introduced it).

But I'll change the code here to use long double instead -- it's harmless,
as it doesn't make a lick of difference on any platform that matters <0.7
wink>.

>> Only the objimpl.h trick might benefit from maximal alignment.

> I'm not actually after maximal alignment; I look for a minimally-
> sized/aligned type whose alignment is a multiple of the target
> type's alignment. In any case, I was just using the assumption that
> double was maximally aligned since I was linking with Python code
> and the EDG front-end was too slow to handle the metaprogram -- I
> figured that if the assumption was good enough for Python

Well, nobody has complained yet, but the core never needs alignment stricter
than double, and-- as above --an extension type that both did and needed to
participate in GC is unlikely.

> and my clients were depending on it anyway, it was good enough for
> my code (not!).

One of the secrets to Python's success is that we tell unreasonable users to
go away and bother the C++ committee instead.

[128-byte alignment needed for KSR's _subpage type]
> I was aware that this was a theoretical possibility, but not that it
> was a practical one. What's KSR?

Kendall Square Research, my (and Tani's, Tamah's and Steve Breit's) employer
before Dragon.  The address space was carved into 128-byte "subpages", and
the hardware supported Python-style (non-owned non-reentrant) locks directly
on a per-subpage basis (Python's lock.acquire() and lock.release() were one
machine instruction each!).  Subpages were also the unit for cache coherency
across processors.  So use of _subpage in our system code, and in
speed-obsessed app code, was ubiquitous.  I guess the main thing KSR proved
was that you can't stay in business designing custom hardware to execute
Python's semantics directly <wink>.

> ...
> Seriously, though, I think it would be reasonable to stick to aligning
> the standard builtin types, in which case you can do the test without
> calling malloc, FWIW.

I checked this in:

	long double dummy;  /* force worst-case alignment */
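
The effect of such a dummy member can be checked without a C compiler. The sketch below uses ctypes to model a union loosely patterned on the GC header; the class names and the three bookkeeping fields are assumptions for illustration, not a copy of objimpl.h.

```python
import ctypes


class GCHeadFields(ctypes.Structure):
    # Hypothetical stand-in for the bookkeeping fields of a GC header.
    _fields_ = [("gc_next", ctypes.c_void_p),
                ("gc_prev", ctypes.c_void_p),
                ("gc_refs", ctypes.c_int)]


class GCHead(ctypes.Union):
    # The dummy member carries no data; it only raises the alignment
    # of the union to that of long double.
    _fields_ = [("gc", GCHeadFields),
                ("dummy", ctypes.c_longdouble)]


def report():
    """Return the alignments relevant to the discussion, in bytes."""
    return {
        "double": ctypes.alignment(ctypes.c_double),
        "long double": ctypes.alignment(ctypes.c_longdouble),
        "fields only": ctypes.alignment(GCHeadFields),
        "union with dummy": ctypes.alignment(GCHead),
    }


alignments = report()
```

On platforms where long double and double share an alignment, the two union variants come out identical, which matches the "harmless" remark above.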

[Guido, on
  #ifdef USE_CACHE_ALIGNED
 	long	aligner;
 #endif
]
> The malloc 8-byte align argument doesn't apply, since this struct is
> used in an array.

I was composing email while asleep <wink>.  Gotcha.

> ...
> This was added by Jack Jansen ages ago -- I think he did measure a
> speedup on an old Mac compiler, or he wouldn't have added it, and I
> bet there was a #define USE_CACHE_ALIGNED in his config.h then.
>
> But that's all history; I agree it should be deleted.

Jack, do you still want this?

fighting-code-rot-ly y'rs  - tim



From david.abrahams@rcn.com  Thu Feb 28 20:44:32 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Thu, 28 Feb 2002 15:44:32 -0500
Subject: [Python-Dev] Alignment assumptions
References: <LNBBLJKPBEHFEDALKOLCMEEKOAAA.tim.one@comcast.net>
Message-ID: <1a2401c1c099$1088e1e0$0500a8c0@boostconsulting.com>

----- Original Message -----
From: "Tim Peters" <tim.one@comcast.net>

> [David Abrahams, on
>     Include/objimpl.h:275: double dummy;  /* force worst-case alignment */
> ]
> > As I read the code, it affects all types (doesn't this header begin
> > every object, regardless of its GC flags?)
>
> Nope, only objects that go through _PyObject_GC_Malloc().  It could be a
> nightmare if, e.g., every string and int object consumed another (at
> least) 12 bytes.

Oh! I guess I should explicitly avoid _PyObject_GC_Malloc() unless I'm
supporting GC, then. As you can see, there's a lot of basic stuff I still
don't understand.

> > and I think that's a very unhappy circumstance for your numeric
> > community. Remember, the type that raised the alarm here was just a
> > long double.
>
> The *Python* numeric community is far more likely to embed a float than a
> long double, and in any case seems unlikely to build a container type
> mixing long double with PyObject* members (i.e., one that ought to
> participate in cyclic gc).

OK, I get it. I'm still not clear on what happens by default, but I was
under the mistaken impression that some types get GC support "automatically"
and thus that people would be subject to undesired alignment problems
without explicitly choosing them.

> I expect we have a blind spot towards long double in general since Python
> doesn't expose or use such a thing, all the developers run on platforms
> where (as far as they know <wink>) it's the same as a double, and "long
> double" was introduced after K&R (so some old-timers likely aren't even
> aware C89 introduced it).
>
> But I'll change the code here to use long double instead -- it's harmless,
> as it doesn't make a lick of difference on any platform that matters <0.7
> wink>.

Just for the record, I didn't twist your arm about this (only the ends of
your moustache).

> Well, nobody has complained yet, but the core never needs alignment
> stricter than double, and-- as above --an extension type that both did
> and needed to participate in GC is unlikely.

Makes sense. And I guess because this is 'C', hacking in the appropriate
alignment if such a type ever arose wouldn't be that hard.

> > and my clients were depending on it anyway, it was good enough for
> > my code (not!).
>
> One of the secrets to Python's success is that we tell unreasonable users
> to go away and bother the C++ committee instead.

That explains everything, thank you (especially the loving relationship we
have with our lusers)!

> [128-byte alignment needed for KSR's _subpage type]
> > I was aware that this was a theoretical possibility, but not that it
> > was a practical one. What's KSR?
>
> Kendall Square Research, my (and Tani's, Tamah's and Steve Breit's)
> employer before Dragon.  The address space was carved into 128-byte
> "subpages", and the hardware supported Python-style (non-owned
> non-reentrant) locks directly on a per-subpage basis (Python's
> lock.acquire() and lock.release() were one machine instruction each!).
> Subpages were also the unit for cache coherency across processors.  So
> use of _subpage in our system code, and in speed-obsessed app code, was
> ubiquitous.  I guess the main thing KSR proved was that you can't stay in
> business designing custom hardware to execute Python's semantics directly
> <wink>.

/Please/ tell me you weren't trying to build a parallel Python machine
<5.99wink>.




From jacobs@penguin.theopalgroup.com  Thu Feb 28 20:48:05 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 28 Feb 2002 15:48:05 -0500 (EST)
Subject: [Python-Dev] Re: Of slots and metaclasses...
In-Reply-To: <200202281858.g1SIwm930118@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.33.0202281420030.8298-100000@penguin.theopalgroup.com>

On Thu, 28 Feb 2002, Guido van Rossum wrote:
> [Kevin Jacobs wrote me in private to ask my position on __slots__.
> I'm posting my reply here, quoting his full message -- I see no reason
> to carry this on as a private conversation.  Sorry, Kevin, if this
> wasn't your intention.]

No problem -- I sent it privately only to spare python-dev if you happened
to be too busy for a coherent reply.

> Hi Kevin, you got me to finally browse the thread "Meta-reflections".
> My first response was: "you've got it all wrong."  My second response
> was a bit more nuanced: "that's not how I intended it to be at all!"
> OK, let me elaborate. :-)

Yes -- I can see why my initial effort to make slots work "just like
__dict__ attributes" was a bad idea.  However, it took reading 'Putting
Metaclasses to Work' for me to realize that.

> You want to be able to find out which instance attributes are defined
> by __slots__, so that (by combining this with the instance's __dict__)
> you can obtain the full set of attribute values.  But this defeats the
> purpose of unifying built-in types and user-defined classes.

I suppose the purpose of unifying built-in types and user-defined classes is
rather subjective.  There are many roads that will get us there, and I
happened to fixate on another one...

> A new-style class, with or without __slots__, should be considered no
> different from a new-style built-in type, except that all of the
> methods happen to be defined in Python (except maybe for inherited
> methods).

Sure.  Except that I also want to be able to extend existing new-style
classes/types in C, as well as Python.  Here is how I do it now (minus error
checking and ref-counting):

static PyMethodDef PyRow_methods[] = {
        {"__init__",      (PyCFunction)rowinit,       METH_VARARGS},
        {"__repr__",      (PyCFunction)rowstrrepr,    METH_NOARGS },
        {"__getitem__",   (PyCFunction)rowgetitem,    METH_VARARGS},
        /* etc... */
        {NULL,            NULL}  /* sentinel: the loop below stops here */
};

  PyRow_Type = (PyTypeObject*)PyType_Type.tp_call((PyObject*)&PyType_Type,args, NULL);

  /* Methods must be added _after_ PyRow_Type has been created
    since the type is an argument to PyDescr_NewMethod */
  dict = PyRow_Type->tp_dict;
  meth = PyRow_methods;
  for (; meth->ml_name != NULL; meth++)
  {
      PyObject* method = PyDescr_NewMethod(PyRow_Type, meth);
      PyDict_SetItemString(dict,meth->ml_name,method);
  }

Though this doesn't look nearly as ugly as it did when I first wrote it,
before I read 'Putting Metaclasses to Work'; strangely enough it ends up
looking a lot like their metaclass interface.

> In order to find all attributes, you should *never* look at __slots__.
> You should search the __dict__ of the class and its base classes, in
> MRO order, looking for descriptors, and *then* add the keys of the
> __dict__ as a special case.  This is how PEP 252 wants it to be.

Sure.  I was just hoping to have that list of descriptors pre-computed and
stored in the class (like __mro__).  I suppose the question is why even
expose __slots__ if it is so worthless?
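
The lookup described in the quoted paragraph can be sketched as follows. This assumes a current Python 3 interpreter rather than 2.2, and it only collects member descriptors (the kind created for slots); the function name instance_attr_names is made up for illustration.

```python
import types


def instance_attr_names(obj):
    """Collect slot-like descriptors along the MRO, then __dict__ keys."""
    names = set()
    for klass in type(obj).__mro__:
        for name, value in vars(klass).items():
            # Slots show up as member descriptors in the class __dict__;
            # __slots__ itself is never consulted.
            if isinstance(value, types.MemberDescriptorType):
                names.add(name)
    # The instance __dict__, if any, is added as a special case.
    names.update(getattr(obj, "__dict__", {}))
    return names


class Base(object):
    __slots__ = ("x",)


class Derived(Base):
    __slots__ = ("y", "__dict__")


d = Derived()
d.x, d.y, d.z = 1, 2, 3
found = instance_attr_names(d)
```

Nothing is cached here; the walk is repeated on every call, which is the cost the reply above is asking to avoid.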

> If the descriptors don't tell you everything you need, too bad -- some
> types just are like that.

This has _never_ been a concern of mine --  I don't mind if the C
implementation chooses to hide things.

> Why do I reject your suggestion of making __slots__ (more) usable for
> introspection?  Because it would create another split between built-in
> types and user-defined classes: built-in types don't have __slots__,
> so any strategy based on __slots__ will only work for user-defined
> types.  And that's exactly what I'm trying to avoid!

Well, I'm busy creating C extension types that *do* have slots!  One of my
many current projects is to create a better type to store the results of
relational database queries.  I want the memory efficiency of tuples and the
ability to query by name (via __getitem__ or __getattr__).  So I basically
need to re-invent a magic tuple type that adds descriptors for every named
field.  Strangely enough, this is basically what the slots mechanism does.
I do realize that I could accomplish the same end by sub-classing tuple and
adding a bunch of descriptors.

> Given this viewpoint, you won't be surprised that I have little desire
> to implement your other proposals, in particular, I reject all these:
>
> - Proxy the instance __dict__ with something that makes the slots
>   visible

I wasn't real thrilled with this idea myself.  Among all the other reasons
why not to do this, it has some terrible performance implications.

> - Flatten slot lists and make them immutable

Again, why even have __slots__ if they are so useless?  Assuming that there
is a legitimate reason to peek at __slots__, why not at least make them
immutable?  Or, even better, why not use __slots__ to expose the etype slot
tuple instead?

> - Alter vars(obj) to return a dict of all attrs

Ok, I'm a little baffled by this.  Why not?

> I'll be the first to admit that some details are broken in 2.2.
>
> In particular, the fact that instances of classes with __slots__
> appear picklable but lose all their slot values is a bug -- these
> should either not be picklable unless you add a __reduce__ method, or
> they should be pickled properly.

My vote is that they should be pickled properly by default.  In my mind,
slots are a more static type of attribute.  Since they are more static, my
feeling is that they should be as or more accessible than dict attributes.
Descriptors are fine for handling the black magic of making them addressable
by name, but it just feels wrong to hide them from access by other means.
Of course, I am really talking about slots defined at the Python level --
not necessarily all storage allocated in the 'members' array.

> I'm not so sure that the fact that you can "override" or "hide" slots
> defined in a base class should be classified as a bug.  I see it more
> as a "don't do that" issue: If you're deriving a class that overrides
> a base class slot, you haven't done your homework.  PyChecker could
> warn about this though.

Unless attribute access becomes scoped based on the static type of the
method, then I think it is a bug.  Re-declared slots become effectively
orphaned and just waste memory.  Coalescing them or raising an exception
when they are re-declared seem much better alternatives.
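
A small sketch of the orphaning, assuming a current Python 3 interpreter (class names are illustrative). The derived class gets a second storage cell for 'a', and the base class's cell can only be reached by going through the base descriptor explicitly.

```python
class Base(object):
    __slots__ = ("a",)


class Derived(Base):
    __slots__ = ("a",)   # re-declares the slot: a second cell is allocated


d = Derived()
d.a = "visible"          # stored in Derived's cell

base_descr = Base.__dict__["a"]
try:
    base_descr.__get__(d, Derived)
    base_cell_empty = False
except AttributeError:
    base_cell_empty = True   # Base's cell was never written

# Writing through the base descriptor fills the hidden cell without
# touching what normal attribute access sees.
base_descr.__set__(d, "hidden")
extra_bytes = Derived.__basicsize__ - Base.__basicsize__
```

The positive extra_bytes value is the wasted per-instance memory referred to above.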

> I think you're mostly right with your proposal "Update standard
> library to use new reflection API".  Insofar as there are standard
> support classes that use introspection to provide generic services for
> classic classes, it would be nice if these could work correctly for
> new-style classes even if they use slots or are derived from
> non-trivial built-in types like dict or list.
>
> This is a big job, and I'd love some help.  Adding the right things to
> the inspect module (without breaking pydoc :-) would probably be a
> first priority.

Well, I'm happy to contribute, though my primary concern (other than
correctness and completeness) is efficiency.  The whole reason I'm using
slots is to save space when allocating huge numbers of fairly small objects.
I believe that there is a big performance difference between being able to
pickle based on arbitrary descriptors and pickling just slots.  Slots are
already nicely laid out in rows, just waiting to be plucked out and stuffed
into a pickle.  Even without flattened __slots__ lists, it is a fast and
trivial operation to iterate over a class and all its bases and extract
slots.  Doing so over dictionaries is not nearly so trivial.
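
The "iterate over a class and all its bases and extract slots" approach might look like the sketch below. It assumes a current Python 3 interpreter, trusts each class's __slots__ (which is exactly the assumption under dispute in this thread), and ignores name mangling of private slot names; SlotStateMixin and Row are hypothetical names.

```python
class SlotStateMixin(object):
    """Provide pickle-style state hooks for classes that use __slots__."""
    __slots__ = ()

    def __getstate__(self):
        state = {}
        for klass in type(self).__mro__:
            # Look only at slots declared directly on this class.
            slots = vars(klass).get("__slots__", ())
            if isinstance(slots, str):
                slots = (slots,)
            for name in slots:
                if hasattr(self, name):      # skip slots never assigned
                    state[name] = getattr(self, name)
        return state

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)


class Row(SlotStateMixin):
    __slots__ = ("id", "name")


original = Row()
original.id = 7
original.name = "spam"

# Round-trip the state by hand, the way pickle would call the hooks.
state = original.__getstate__()
copy = Row.__new__(Row)
copy.__setstate__(state)
```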

> Maybe you can formulate it as a set of tentative clarifying patches to
> PEPs 252, 253, and 254?

To be honest, I forgot that those PEPs existed!  I've been working off of
the Python 2.2 source and the tutorials.  I'll read them over tonight and
see.

> >   2) In Python 2.2, what intentional deviations have you chosen from the
> >      SOMMCP and what differences are incidental or accidental?
>
> Hard to say, unless you specifically list all the things that you
> consider part of the SOMMCP.

When I say SOMMCP, I really mean the "metaclass protocol" defined by the
various postulates and theorems in the first few chapters of the book.

> - I currently don't complain when there are serious order
>   disagreements.  I haven't decided yet whether to make these an error
>   (then I'd have to implement an overridable way of defining
>   "serious") or whether it's more Pythonic to leave this up to the
>   user.

Sure -- I noticed this.  Maybe you should store the order-safety in the
metaclass?  That way, the user can inspect it when they decide it is
important.

> - I don't enforce any of their rules about cooperative methods.  This
>   is Pythonic: you can be cooperative but you don't have to be.  It
>   would also be too incompatible with current practice (I expect few
>   people will adopt super().)

I agree with most of that, except that I expect that MANY people will start
using 'super'.  I've trained an office full of Java programmers to
program in Python and they are always complaining about the lack of super
calls.  Also, I've _always_ considered this idiom ugly and hackish:

  class Foo(Bar,Baz):
    def __init__(self):
      Bar.__init__(self)
      Baz.__init__(self)

It's so much better as:

  class Foo(Bar,Baz):
    def __init__(self):
      # when super becomes a keyword and we write nice cooperative __init__
      # methods
      super.__init__(self)
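
Until super becomes a keyword, the cooperative version can be spelled with the super() built-in. A minimal sketch, assuming every class in the hierarchy cooperates; the Root class and the calls list are added here purely to make the order visible.

```python
calls = []


class Root(object):
    def __init__(self):
        calls.append("Root")


class Bar(Root):
    def __init__(self):
        calls.append("Bar")
        super(Bar, self).__init__()   # next class in the MRO, not Root


class Baz(Root):
    def __init__(self):
        calls.append("Baz")
        super(Baz, self).__init__()


class Foo(Bar, Baz):
    def __init__(self):
        calls.append("Foo")
        super(Foo, self).__init__()   # one call reaches both bases


foo = Foo()
```

Each __init__ runs exactly once, in MRO order, whereas the explicit Bar.__init__/Baz.__init__ form would run Root.__init__ twice if both bases called it.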

> - I don't automatically derive a new metaclass if multiple base
>   classes have different metaclasses.

I have my own ideas about this, but like you, don't have enough experience
with them in practice to do anything about it.

>   Since I expect that non-trivial metaclasses are
>   often implemented in C, I'm not so comfortable with automatically
>   merging multiple metaclasses -- I can't prove to myself that it's
>   always safe.

It is always safe when the assumption of monotonicity is not violated.

> - I don't check that a base class doesn't override instance
>   variables.  As I stated above, I don't think I should, but I'm not
>   100% sure.

Do you mean slots or all Python instance attributes in this statement?

> >   3) Do you intend to enforce monotonicity for all methods and slots?
> >      (Clearly, this is not desirable for instance __dict__ attributes.)
>
> If I understand the concept of monotonicity, no.  Python traditionally
> allows you to override methods in ways that are incompatible with the
> contract of the base class method, and I don't intend to forbid this.

For Python, monotonicity means that the instance attributes and instance
methods of a class are a superset of those of all its ancestors.  This is
not the way that normal __dict__ attributes work in Python, so let's talk
only about slots when discussing monotonic properties.  In other words, it
means that the metaclass interface does not provide a way to delete a slot
or a method, only ways to add and override them.  Combined with some static
type information, the assumption of monotonicity will be very helpful when
we can eventually compile Python.

> It would be good if PyChecker checked for accidental mistakes in this
> area, and maybe there should be a way to declare that you do want this
> enforced; I don't know how though.

I have a pretty good idea how.  It's essentially a proof-based method that
works by solving metatype constraints.

> There's also the issue that (again, if I remember the concepts right)
> there are some semantic requirements that would be really hard to
> check at compile time for Python.

True for __dict__ instance attributes, not for slots!

> >   4) Should descriptors work cooperatively?  i.e., allowing a
> >      'super' call within __get__ and __set__.
>
> I don't think so, but I haven't thought through all the consequences
> (I'm not sure why you're asking this, and whether it's still a
> relevant question after my responses above).  You can do this for
> properties though.

  class Foo(object):
    __slots__=()
    a = 1

  class Bar(Foo):
    __slots__ = ('a',)

  bar = Bar()
  print dir(bar)
  print bar.a

The resolution rule for descriptors could work cooperatively to find Foo's
class attribute 'a' instead of giving up with an AttributeError.

Thanks for the very useful answers,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From Jack.Jansen@oratrix.com  Thu Feb 28 21:34:05 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Thu, 28 Feb 2002 22:34:05 +0100
Subject: [Python-Dev] Alignment assumptions
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEBDOAAA.tim.one@comcast.net>
Message-ID: <E3E1D0BD-2C92-11D6-8B25-003065517236@oratrix.com>

On donderdag, februari 28, 2002, at 07:57 , Tim Peters wrote:

> [David Abrahams]
>> A quick grep-find through the Python-2.2 sources reveals the 
>> following:
>>
>> Include/dictobject.h:49: long aligner;
>
> This is in
>
> #ifdef USE_CACHE_ALIGNED
> 	long	aligner;
> #endif
>
> and AFAIK nobody ever defines the symbol.  It's a cache-line 
> optimization
> gimmick, but is effectively a nop (except to waste memory) on 
> "almost all"
> machines.  IIRC, the author never measured any improvement by 
> using it (not
> surprising, since I believe almost all mallocs at least 8-byte 
> align now).
> I vote we delete it.

MacPython uses it. At the time it was put in it caused a 15% 
increase in Pystones because dictionary entries were aligned in 
cache lines. But: this was in the PPC 601 and 604 era, I must 
say that I've never tested whether it made any difference on G3 
and G4.

Put in a bug report in my name, and one day I'll get around to 
testing whether it still makes a difference on current hardware 
and rip it out if it doesn't.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- 
Emma Goldman -



From guido@python.org  Thu Feb 28 21:51:45 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 28 Feb 2002 16:51:45 -0500
Subject: [Python-Dev] Re: Of slots and metaclasses...
In-Reply-To: Your message of "Thu, 28 Feb 2002 15:48:05 EST."
 <Pine.LNX.4.33.0202281420030.8298-100000@penguin.theopalgroup.com>
References: <Pine.LNX.4.33.0202281420030.8298-100000@penguin.theopalgroup.com>
Message-ID: <200202282151.g1SLpjE30957@pcp742651pcs.reston01.va.comcast.net>

[me]
> > A new-style class, with or without __slots__, should be considered
> > no different from a new-style built-in type, except that all of
> > the methods happen to be defined in Python (except maybe for
> > inherited methods).

[Kevin]
> Sure.  Except that I also want to be able to extend existing
> new-style classes/types in C, as well as Python.  Here is how I do
> it now (minus error checking and ref-counting):
> 
> static PyMethodDef PyRow_methods[] = {
>         {"__init__",      (PyCFunction)rowinit,       METH_VARARGS},
>         {"__repr__",      (PyCFunction)rowstrrepr,    METH_NOARGS },
>         {"__getitem__",   (PyCFunction)rowgetitem,    METH_VARARGS},
>         /* etc... */
>         {NULL,            NULL}  /* sentinel: the loop below stops here */
> };
> 
>   PyRow_Type = (PyTypeObject*)PyType_Type.tp_call((PyObject*)&PyType_Type,args, NULL);
> 
>   /* Methods must be added _after_ PyRow_Type has been created
>     since the type is an argument to PyDescr_NewMethod */
>   dict = PyRow_Type->tp_dict;
>   meth = PyRow_methods;
>   for (; meth->ml_name != NULL; meth++)
>   {
>       PyObject* method = PyDescr_NewMethod(PyRow_Type, meth);
>       PyDict_SetItemString(dict,meth->ml_name,method);
>   }

Heh?!?!!!  Why can't you declare PyRow_Type as a statically
initialized struct like all extensions and the core do?

[snip]

> Sure.  I was just hoping to have that list of descriptors
> pre-computed and stored in the class (like __mro__).

__mro__ gets used *all the time*; on every method lookup at least.
The list of instance variable descriptors is only interesting to a
small number of highly introspective tools.

> I suppose the question is why even expose __slots__ if it is so
> worthless?

It's found in the dict when the class is defined.  Why delete it?  The
idea is that you can make it a dict that has other info about the
slots.  It's got a __foo__ name.  I can give it any semantics I damn
well please. :-)

> > If the descriptors don't tell you everything you need, too bad --
> > some types just are like that.
> 
> This has _never_ been a concern of mine --  I don't mind if the C
> implementation chooses to hide things.

Exactly, and I'm telling you to have the same attitude about
slots.  Let me repeat something I just sent someone else about slots:

It seems that unfortunately __slots__ is Python 2.2's most
misunderstood feature...

I see it as a hack that lets me define a special-purpose class whose
instances are (almost) as efficient as I can do using C, but without
having to write a C extension.  (I say "almost", because a C extension
can store simple values as C ints, while __slots__ only lets you store
PyObject pointers.  But still, it's a big savings compared to adding a
__dict__ to every instance, and sometimes the slot value is picked
from a small number of interned or cached ints or strings.)
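
A minimal sketch of that trade-off, assuming a current Python 3 interpreter (the Point class is illustrative): instances carry no per-instance __dict__, so the declared names are the only attributes they can hold.

```python
class Point(object):
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
has_dict = hasattr(p, "__dict__")    # no per-instance dictionary

try:
    p.z = 3                          # not a declared slot
    rejected = False
except AttributeError:
    rejected = True
```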

It has different semantics from regular attributes, and I don't try to
hide that: introspection doesn't find slots the same way as it finds
regular instance vars, you can't provide a default via a class
variable, and there are a bunch of "don't do that" things like
modifying __slots__ of an existing class or overriding a slot defined
by a base class.  (There's a whole list of warnings in
http://www.python.org/2.2/descrintro.html!)

I think as such, the feature is just right (except for the no-pickling
bug).  It's unfortunate that people have jumped on it as the answer to
all their questions.  I guess that means there's a big demand for more
control over instance variables -- whether that demand is created by a
real need or simply because that's how most other languages do it
remains to be seen...

> > Why do I reject your suggestion of making __slots__ (more) usable
> > for introspection?  Because it would create another split between
> > built-in types and user-defined classes: built-in types don't have
> > __slots__, so any strategy based on __slots__ will only work for
> > user-defined types.  And that's exactly what I'm trying to avoid!
> 
> Well, I'm busy creating C extension types that *do* have slots!
> One of my many current projects is to create a better type to store
> the results of relational database queries.  I want the memory
> efficiency of tuples and the ability to query by name (via
> __getitem__ or __getattr__).  So I basically need to re-invent a
> magic tuple type that adds descriptors for every named field.
> Strangely enough, this is basically what the slots mechanism does.
> I do realize that I could accomplish the same end by sub-classing
> tuple and adding a bunch of descriptors.

Note that there's something already there that you might reuse:
Objects/structseq.c, which is used to create the return values of
localtime(), stat() and a few others in a way that looks both like a
tuple and like a read-only record.  It may not be powerful enough
because I think the assumption is that the set of field names is
static, but you may be able to extend it or copy some good ideas.

(Just don't try to understand what it does to make the tuple shorter
than the record in some cases -- that's for backwards compatibility
because lots of code would break if e.g. stat() returned a longer
tuple than in previous Python versions, but we still want to provide
new fields when using named fields.  This part is not for the weak of
heart, and I didn't write it, and can't guarantee that it's 100%
bugfree.)
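
The behaviour is easy to observe from Python. A short sketch, assuming a current Python 3 interpreter; the variable names are illustrative.

```python
import os
import time

t = time.gmtime(0)
is_tuple = isinstance(t, tuple)          # behaves like a tuple...
same_value = (t[0] == t.tm_year)         # ...and like a record

try:
    t.tm_year = 2000                     # fields cannot be reassigned
    read_only = False
except (AttributeError, TypeError):
    read_only = True

st = os.stat(os.curdir)
# The named fields outnumber the tuple items: the extra ones are
# reachable by name only.
named_fields = [n for n in dir(st) if n.startswith("st_")]
```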

[items I rejected]

> > - Alter vars(obj) to return a dict of all attrs
> 
> Ok, I'm a little baffled by this.  Why not?

Currently, the assumption is that vars() returns a dict that can be
modified to modify the underlying object's attributes.  If it were to
return a synthetic dict, that wouldn't work, or it would require more
implementation effort than I care for -- again, since I doubt there is
much demand for this outside a small set of introspection tools.
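
A short sketch of the write-through assumption, using a current Python 3 interpreter (class names are illustrative): the dict returned by vars() is the object's real __dict__, and an object without one is rejected outright.

```python
class Plain(object):
    pass


class Slotted(object):
    __slots__ = ("x",)


p = Plain()
vars(p)["x"] = 1          # modifying the dict modifies the object

s = Slotted()
s.x = 1
try:
    vars(s)               # no __dict__ to hand back
    vars_failed = False
except TypeError:
    vars_failed = True
```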

> > I'll be the first to admit that some details are broken in 2.2.
> >
> > In particular, the fact that instances of classes with __slots__
> > appear picklable but lose all their slot values is a bug -- these
> > should either not be picklable unless you add a __reduce__ method,
> > or they should be pickled properly.
> 
> My vote is that they should be pickled properly by default.  In my
> mind, slots are a more static type of attribute.  Since they are
> more static, my feeling is that they should be as or more accessible
> than dict attributes.  Descriptors are fine for handling the black
> magic of making them addressable by name, but it just feels wrong to
> hide them from access by other means.  Of course, I am really
> talking about slots defined at the Python level -- not necessarily
> all storage allocated in the 'members' array.

Slots share their descriptor implementation with anything defined by
the tp_members array in a type object.  E.g. file.softspace is a
descriptor of the same type as used by slots.  What they share is that
they refer to "real" data stored in the instance -- either a PyObject*
or some basic C type like int or double.  I don't want to trust that
__slots__ has the right data: even if I made it immutable, someone
could still do C.__dict__['__slots__'] = <whatever>, and I don't want
to go so far as to make __slots__ a property stored in the type
object.  So I can't really tell which descriptors are slots and which
are other things -- and I don't want to, because I believe that would
be breaking through an abstraction.
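
A small sketch of the shared descriptor type, assuming a current CPython.
file.softspace is not available there, so slice.start (a built-in
attribute declared through tp_members) is used as a stand-in; the class
name C is hypothetical:

```python
import types

class C(object):
    __slots__ = ('a',)

slot_descr = C.__dict__['a']              # descriptor created for the slot
builtin_descr = slice.__dict__['start']   # descriptor from a tp_members entry

# Both are the same descriptor type, so the type alone cannot tell
# a Python-level slot apart from any other member.
same_type = type(slot_descr) is type(builtin_descr)
is_member = isinstance(slot_descr, types.MemberDescriptorType)
```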

> Unless attribute access becomes scoped based on the static type of
> the method, then I think it is a bug.  Re-declared slots become
> effectively orphaned and just waste memory.  Coalescing them or
> raising an exception when they are re-declared seem much better
> alternatives.

It's a bug to redeclare a slot.  I don't find it Python's job to make
it an error.
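
A sketch of what redeclaring a slot does, assuming CPython's instance
layout; the class names are hypothetical.  No error is raised, the
derived class simply gets a second storage cell under the same name:

```python
class Base(object):
    __slots__ = ('a',)

class Derived(Base):
    __slots__ = ('a',)      # redeclares the slot; Python does not complain

d = Derived()
d.a = 'derived'             # goes through Derived's descriptor

# The base-class cell still exists, but normal attribute access no longer
# reaches it; only Base's own descriptor does.
Base.a.__set__(d, 'base')
values = (d.a, Base.a.__get__(d, Derived))

# The instance layout grew by the extra cell.
grew = Derived.__basicsize__ > Base.__basicsize__
```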

> > I think you're mostly right with your proposal "Update standard
> > library to use new reflection API".  Insofar as there are standard
> > support classes that use introspection to provide generic services
> > for classic classes, it would be nice if these could work
> > correctly for new-style classes even if they use slots or are
> > derived from non-trivial built-in types like dict or list.  This
> > is a big job, and I'd love some help.  Adding the right things to
> > the inspect module (without breaking pydoc :-) would probably be a
> > first priority.
> 
> Well, I'm happy to contribute, though my primary concern (other than
> correctness and completeness) is efficiency.  The whole reason I'm
> using slots is to save space when allocating huge numbers of fairly
> small objects.  I believe that there is a big performance difference
> between being able to pickle based on arbitrary descriptors and
> pickling just slots.  Slots are already nicely laid out in rows,
> just waiting to be plucked out and stuffed into a pickle.  Even
> without flattened __slots__ lists, it is a fast and trivial
> operation to iterate over a class and all its bases and extract
> slots.  Doing so over dictionaries is not nearly so trivial.

I think you're overstating the simplicity of pickling slots.  There is
no guarantee that the slots of a derived class are contiguous with the
slots of a base class; a __weakref__ and a __dict__ field may be
placed in between, and another metaclass could add other things.  For
example, you could write a metaclass in C that took the __slots__ idea
one step further and let you declare the types of the slots as basic C
types, so that other structmember keys could be used, e.g. T_INT or
T_FLOAT.

If you want your instances to be pickled *efficiently*, you should
write a custom reduce method in C anyway -- right now, new-style
classes are pickled by a piece of Python code at the end of
copy_reg.py.
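
A sketch of the shape of a custom reduce method, written in Python rather
than C, so it shows the protocol and not the efficient version.  The
names rebuild and Point are hypothetical, and the slot walk ignores
details such as name-mangled private slots:

```python
def rebuild(cls, state):
    """Recreate an instance without calling __init__, then fill its slots."""
    obj = cls.__new__(cls)
    for name, value in state.items():
        setattr(obj, name, value)
    return obj

class Point(object):
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __reduce__(self):
        # Collect slot values from this class and every base class.
        state = {}
        for klass in type(self).__mro__:
            names = klass.__dict__.get('__slots__', ())
            if isinstance(names, str):
                names = (names,)
            for name in names:
                if hasattr(self, name):      # skip slots never assigned
                    state[name] = getattr(self, name)
        return rebuild, (type(self), state)

# pickle stores the (callable, args) pair and calls it again when loading.
func, args = Point(1, 2).__reduce__()
clone = func(*args)
```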

> > Maybe you can formulate it as a set of tentative clarifying
> > patches to PEPs 252, 253, and 254?
> 
> To be honest, I forgot that those PEPs existed!  I've been working
> off of the Python 2.2 source and the tutorials.  I'll read them over
> tonight and see.

I had a feeling you were missing something basic. :-)

> When I say SOMMCP, I really mean the "metaclass protocol" defined by the
> various postulates and theorems in the first few chapters of the book.

As I said, I don't have the whole set in my head, so you'll have to be
more specific in your questions.  (Basically, I don't expect to be
adding much from the book, but I'll be looking to the book for clues
as we find problems with how things are implemented now, e.g. the
automatically derived metaclass issue below.)

> > - I currently don't complain when there are serious order
> >   disagreements.  I haven't decided yet whether to make these an
> >   error (then I'd have to implement an overridable way of defining
> >   "serious") or whether it's more Pythonic to leave this up to the
> >   user.
> 
> Sure -- I noticed this.  Maybe you should store the order-safety in the
> metaclass?  That way, the user can inspect it when they decide it is
> important.

You mean in the class object?  I'm not sure what you mean by "storing
the order-safety".  I currently don't calculate whether there are any
order conflicts: serious_order_disagreements() returns 0 without doing
anything.  Someone who wants it can easily implement the check from
the book though.
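
A rough sketch of one possible check, not necessarily the book's
definition of a "serious" disagreement: it reports pairs of classes that
two linearizations list in opposite order.  The function name and the
example classes are hypothetical:

```python
def order_disagreements(first, second):
    """Return pairs (a, b) where a precedes b in first but follows it in second."""
    position = {cls: i for i, cls in enumerate(second)}
    common = [cls for cls in first if cls in position]
    conflicts = []
    for i, a in enumerate(common):
        for b in common[i + 1:]:
            if position[a] > position[b]:
                conflicts.append((a, b))
    return conflicts

class A(object): pass
class B(object): pass
class X(A, B): pass     # lists A before B
class Y(B, A): pass     # lists B before A

conflicts = order_disagreements(X.__mro__, Y.__mro__)
```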

> > - I don't enforce any of their rules about cooperative methods.
> >   This is Pythonic: you can be cooperative but you don't have to
> >   be.  It would also be too incompatible with current practice (I
> >   expect few people will adopt super().)
> 
> I agree with most of that, except that I expect that MANY people
> will start using 'super'.

I doubt it with the current super(Class,self).method(args) notation.
Probably they will once super is a keyword so you can write
super.method(args).
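
For reference, a sketch of cooperative __init__ methods in the current
super(Class, self) notation; all class names and the calls list are
hypothetical.  Each method runs exactly once, in method resolution order:

```python
calls = []

class Base(object):
    def __init__(self):
        calls.append('Base')

class Bar(Base):
    def __init__(self):
        calls.append('Bar')
        super(Bar, self).__init__()

class Baz(Base):
    def __init__(self):
        calls.append('Baz')
        super(Baz, self).__init__()

class Foo(Bar, Baz):
    def __init__(self):
        calls.append('Foo')
        super(Foo, self).__init__()   # continues along Foo's MRO, not just to Bar

Foo()
```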

> I've trained an office full of Java
> programmers to program in Python and they are always complaining
> about the lack of super calls.  Also, I've _always_ considered this
> idiom ugly and hackish:
> 
>   class Foo(Bar,Baz):
>     def __init__(self):
>       Bar.__init__(self)
>       Baz.__init__(self)

Strange that you mention Java in the same paragraph as an example
using multiple inheritance. ;-/

Also note that this is pretty much what C++ wants you to do, except it
uses '::' instead of '.' and doesn't require you to pass self (which
is a different issue).

I don't see this as a serious issue, just syntactic sugar.

> It's so much better as:
> 
>   class Foo(Bar,Baz):
>     def __init__(self):
>       # when super becomes a keyword and we write nice cooperative __init__
>       # methods
>       super.__init__(self)

But that's not what you'd be writing -- you'd be writing
super.__init__().

> > - I don't automatically derive a new metaclass if multiple base
> >   classes have different metaclasses.
> 
> I have my own ideas about this, but like you, don't have enough
> experience with them in practice to do anything about it.

Can you share them?  This might be interesting.

> >   Since I expect that non-trivial metaclasses are
> >   often implemented in C, I'm not so comfortable with automatically
> >   merging multiple metaclasses -- I can't prove to myself that it's
> >   always safe.
> 
> It is always safe when the assumption of monotonicity is not violated.

And that we can't know.

> > - I don't check that a base class doesn't override instance
> >   variables.  As I stated above, I don't think I should, but I'm not
> >   100% sure.
> 
> Do you mean slots or all Python instance attributes in this statement?

I just meant slots, but in a sense it's also true for other ivars: if
you don't know that your base class defines an ivar 'foo', you might
create your own ivar named 'foo' and use it in a way that's
inconsistent with the base class.  Because there are no type checks
and no ivar declarations, that's much harder to avoid in Python than
in more static languages like C++ or Java (I assume those will
complain when you redefine an ivar, even with the same type).
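
A small sketch of such an accidental clash; the class names are
hypothetical.  The derived class reuses 'foo' without knowing the base
class already depends on it:

```python
class Counter(object):
    def __init__(self):
        self.foo = 0          # base class treats foo as an integer count

    def bump(self):
        self.foo += 1
        return self.foo

class Labelled(Counter):
    def __init__(self, label):
        Counter.__init__(self)
        self.foo = label      # derived class reuses the name for a string

try:
    Labelled('spam').bump()   # the base-class method now sees a string
    outcome = 'ok'
except TypeError:
    outcome = 'TypeError'
```

Nothing flags the clash when the classes are defined; it only shows up
when the base-class method runs.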

> > >   3) Do you intend to enforce monotonicity for all methods and
> > >      slots?  (Clearly, this is not desirable for instance
> > >      __dict__ attributes.)
> >
> > If I understand the concept of monotonicity, no.  Python
> > traditionally allows you to override methods in ways that are
> > incompatible with the contract of the base class method, and I
> > don't intend to forbid this.
> 
> For Python, monotonicity means that the instance attributes and
> instance methods of a class are a superset of those of all its
> ancestors.  This is not the way that normal __dict__ attributes work
> in Python, so let's talk only about slots when discussing monotonic
> properties.

I'm not sure what you mean by "this is not the way that normal
__dict__ attrs work", unless you are talking about overriding __init__
without calling the base class __init__ (and perhaps the same for
other methods), which of course can mean that a derived class instance
lacks an ivar that a base class instance would have.  This is Pythonic
freedom IMO.
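
A minimal sketch of that freedom, with hypothetical class names: a
derived class that skips the base __init__ produces instances lacking an
ivar that base instances have.

```python
class Base(object):
    def __init__(self):
        self.ready = True

class Derived(Base):
    def __init__(self):
        pass                  # deliberately does not call Base.__init__

has_ready = (hasattr(Base(), 'ready'), hasattr(Derived(), 'ready'))
```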

Since it's not true for regular ivars, why worry about it for slots?

> In other words, it means that the metaclass interface
> does not provide a way to delete a slot or a method, only ways to
> add and override them.  Combined with some static type information,
> the assumption of monotonicity will be very helpful when we can
> eventually compile Python.

I don't think we should be guided here by what might be needed by a
compiler.  Without actually trying to build a compiler, we'll probably
miss important requirements that mean we'll have to change the
language anyway, and we'll impose requirements that we think might be
important without a good reason.  (E.g. structured programming was
once thought of as an aid to compiler technology as well as to the human
reader.  Nowadays, optimizers reduce all control flow to labels and
goto statements. :-)

> > It would be good if PyChecker checked for accidental mistakes in
> > this area, and maybe there should be a way to declare that you do
> > want this enforced; I don't know how though.
> 
> I have a pretty good idea how.  It's essentially a proof-based method
> that works by solving metatype constraints.

Isn't that how most of PyChecker works?  At least the proof-based part?

> > There's also the issue that (again, if I remember the concepts right)
> > there are some semantic requirements that would be really hard to
> > check at compile time for Python.
> 
> True for __dict__ instance attributes, not for slots!

Again, you're trying to hijack slots for purposes for which they
weren't created.  Think of slots as an efficiency hack, *not* as a
better way to declare ivars.

> > >   4) Should descriptors work cooperatively?  i.e., allowing a
> > >      'super' call within __get__ and __set__.
> >
> > I don't think so, but I haven't thought through all the
> > consequences (I'm not sure why you're asking this, and whether
> > it's still a relevant question after my responses above).  You can
> > do this for properties though.
> 
>   class Foo(object):
>     __slots__=()
>     a = 1
> 
>   class Bar(Foo):
>     __slots__ = ('a',)
> 
>   bar = Bar()
>   print dir(a)
>   print a

That's a NameError, I suppose you meant 'bar' instead of 'a' in the
last two lines, then it makes sense. :-)

> The resolution rule for descriptors could work cooperatively to find
> Foo's class attribute 'a' instead of giving up with an
> AttributeError.

Once a descriptor is found, that's the end of the line.  When you find
a method, you call it, and it raises an exception, you're not going to
continue looking for a base class method either!

The descriptor type used to implement slots could do this, but
doesn't.  I don't care about this feature.  With a __dict__, there's
some real saving in not storing default values, since it means a
smaller dict, which can save space.  The slot space is always there,
so you might as well initialize it.
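
The behavior under discussion, as a runnable sketch using the corrected
example (with 'bar' in place of 'a'):

```python
class Foo(object):
    __slots__ = ()
    a = 1

class Bar(Foo):
    __slots__ = ('a',)

bar = Bar()
try:
    bar.a                 # Bar's slot descriptor is found first and is empty
    found = True
except AttributeError:
    found = False         # lookup does not fall back to Foo.a

bar.a = 2                 # once initialized, the slot behaves normally
```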

Concluding: don't expect that you can take an arbitrary class, analyze
what ivars it uses, and add a __slots__ variable to speed it up.
There are lots of differences in semantics when you use slots, and I
don't want to hide those.
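
One of those differences, as a minimal sketch with hypothetical class
names: an instance of a class with __slots__ has no __dict__, so code
that attaches undeclared attributes stops working.

```python
class Plain(object):
    pass

class Slotted(object):
    __slots__ = ('x',)

p, s = Plain(), Slotted()
p.extra = 1                       # fine: stored in the instance dict
try:
    s.extra = 1                   # no __dict__ and no slot named 'extra'
    slotted_accepts_extra = True
except AttributeError:
    slotted_accepts_extra = False

has_dict = (hasattr(p, '__dict__'), hasattr(s, '__dict__'))
```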

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de  Thu Feb 28 21:51:46 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 28 Feb 2002 22:51:46 +0100
Subject: Japanese codecs (was Re: [Python-Dev] PEP 263 -- Python Source Code  Encoding)
In-Reply-To: <3C7DF381.C2E1335A@lemburg.com>
References: <200202250520.g1P5KKD01484@mira.informatik.hu-berlin.de>
 <3C7B5E35.129E5501@lemburg.com>
 <m31yf8fsxu.fsf@mira.informatik.hu-berlin.de>
 <3C7B6322.440D21E7@lemburg.com> <3c7bbf00.17218508@mail.wanadoo.dk>
 <200202261958.g1QJwsj19402@pcp742651pcs.reston01.va.comcast.net>
 <3C7BECEC.E1550553@lemburg.com>
 <200202262037.g1QKb5S19756@pcp742651pcs.reston01.va.comcast.net>
 <m38z9gudqd.fsf@mira.informatik.hu-berlin.de>
 <3C7CA3E2.C3705289@lemburg.com>
 <m3sn7nz360.fsf@mira.informatik.hu-berlin.de>
 <3C7CAD5D.6692F44@lemburg.com>
 <m3it8iltsx.fsf@mira.informatik.hu-berlin.de>
 <15485.15623.543255.443894@anthem.wooz.org>
 <m34rk2eg84.fsf@mira.informatik.hu-berlin.de>
 <3C7DF381.C2E1335A@lemburg.com>
Message-ID: <m38z9dcm19.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> Which wrapper APIs do we currently have which could actually
> be made part of the Python core ?

On Unix, we have iconv(3). On Windows, we have MultiByteToWideChar,
which would need to be wrapped with a map translating codec names to
codepage numbers. There is also a codec API through a COM interface
provided by Internet Exploder; I don't have the name of that interface
right now.
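
A sketch of what such a map might look like.  The table and the function
name codepage_for are hypothetical, the list is far from complete, and
some entries are only the nearest Windows code page rather than an exact
match for the named encoding:

```python
# Codec name (lowercased, '-' replaced by '_') -> Windows code page number.
CODEPAGES = {
    'cp1252': 1252,
    'shift_jis': 932,     # code page 932 is Microsoft's Shift_JIS variant
    'gb2312': 936,        # code page 936 is a superset of GB2312
    'big5': 950,
    'utf_8': 65001,
}

def codepage_for(codec_name):
    """Translate a codec name to the code page number passed to the Windows API."""
    key = codec_name.lower().replace('-', '_')
    try:
        return CODEPAGES[key]
    except KeyError:
        raise LookupError('no code page known for %r' % (codec_name,))
```

A wrapper codec would pass the resulting number as the CodePage argument
of MultiByteToWideChar.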

On all platforms, we could easily wrap the Tcl encodings, which are
available everywhere where Python is available. Not sure what the
performance implications would be.

There also could be a wrapper around ICU.

On OS X, CFStringCreateFromExternalRepresentation could be used.

> Aside: while it's true that we could use those, the Unicode 
> implementation has shown that rolling our own has worked out
> quite well too.

There have been a few correctness glitches in those, but overall, I'd
agree that they have worked quite well. Performance is a different
issue, though; people just haven't complained, yet, IMO.

Regards,
Martin


From martin@v.loewis.de  Thu Feb 28 22:00:01 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 28 Feb 2002 23:00:01 +0100
Subject: [Python-Dev] PEP 1 update
In-Reply-To: <017101c1c084$e85cdaa0$6d94fea9@newmexico>
References: <20020228181706.D0335E8C7@waltz.rahul.net>
 <017101c1c084$e85cdaa0$6d94fea9@newmexico>
Message-ID: <m34rk1clni.fsf@mira.informatik.hu-berlin.de>

"Samuele Pedroni" <pedroni@inf.ethz.ch> writes:

> Just my impressions.

I agree with the observations, but what would you do about this?

Regards,
Martin


From tim@zope.com  Thu Feb 28 22:30:49 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 28 Feb 2002 17:30:49 -0500
Subject: [Python-Dev] proposal: add basic time type to the standard library
In-Reply-To: <3C7CD8B7.3E9A89A3@zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEFJOAAA.tim@zope.com>

[Jim Fulton]
>>> ZODB has a TimeStamp type that uses a 32-bit unsigned integer
>>> to store year, month, day, hour, and minute in a way that
>>> makes it dirt simple to extract a component.

[Tim]
>> You really think so?  It's a mixed-radix scheme:
>>
>>           v=((((y-1900)*12+mo-1)*31+d-1)*24+h)*60+m;
>>
>> so requires lots of expensive integer division and remainder ...

[Jim]
> Compared to storing date-times as offsets from an epoch, this is
> much simpler and cheaper.

OK, as with most things, it boils down to the definition of dirt:  you're
contrasting hard-packed dirt with a 21%-dirt 79%-concrete mix, and I'm
contrasting hard-packed dirt with household dust.  I'm sure you'll agree
that's a rigorously correct summary <wink>.
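
For concreteness, a sketch of the mixed-radix scheme quoted above.  The
function names pack and unpack are hypothetical; the packing formula is
the one from the thread, and unpacking needs one division-with-remainder
per component:

```python
def pack(y, mo, d, h, m):
    """Pack year, month, day, hour, minute using the formula from the thread."""
    return ((((y - 1900) * 12 + mo - 1) * 31 + d - 1) * 24 + h) * 60 + m

def unpack(v):
    """Recover the components by peeling off one radix at a time."""
    v, m = divmod(v, 60)
    v, h = divmod(v, 24)
    v, d = divmod(v, 31)
    y, mo = divmod(v, 12)
    return y + 1900, mo + 1, d + 1, h, m

stamp = pack(2002, 2, 28, 17, 30)
parts = unpack(stamp)
```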



From pedroni@inf.ethz.ch  Thu Feb 28 23:21:29 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Fri, 1 Mar 2002 00:21:29 +0100
Subject: [Python-Dev] PEP 1 update
References: <20020228181706.D0335E8C7@waltz.rahul.net><017101c1c084$e85cdaa0$6d94fea9@newmexico> <m34rk1clni.fsf@mira.informatik.hu-berlin.de>
Message-ID: <035601c1c0ae$a6c11a00$6d94fea9@newmexico>

From: Martin v. Loewis <martin@v.loewis.de>
> "Samuele Pedroni" <pedroni@inf.ethz.ch> writes:
> 
> > Just my impressions.
> 
> I agree with the observations, but what would you do about this?

Some possible proposals (more or less easy to implement)

>From PEP 1:
    Standards track PEPs must have a Python-Version: header which
    indicates the version of Python that the feature will be released
    with.  Informational PEPs do not need a Python-Version: header.

- have in the summary an active (standard track) PEP category:
e.g.  PEP 237, PEP 252, PEP 253,  PEP 238 should go there;
maybe use for them the Python-Version (possibly renamed
 Implementation-Python-Versions: ) in a reasonable imaginative
way
   PEP 237: 2.2-2.3-2.4,...3.0
   PEP 238:  2.2...3.0
   PEP 252: 2.2....

- PEPs for which it is not clear whether they will be implemented
  should have Python-Version: ?,
  I think that for example PEPs 273 and 277 are fine reporting
  Python-Version: 2.3

- Maybe open PEPs should be divided between those
  that have at least a proof-of-concept or ref impl,
  and those that don't have one (the latter for obvious
 reasons are less likely to be implemented).
 Maybe other/richer categorizations would make sense
 but those would require more bureaucracy.

- Maybe status should go a bit beyond the actual
  draft/final dichotomy, but this needs discussion
  (thinking out loud: draft -> draft-stable vs. draft-incomplete
    or open-draft)

  OTOH the above proposals should already
  improve things a bit (if they are practicable).

- PEP workflow:
  at the moment it seems that a PEP champion
  can ask the BDFL to accept/reject and then
  things should reach "quickly" a final settlement.

  (Are all the PEPpers aware of this?  Sometimes
    it seems not, for some of the PEPs hanging around.)

 Now if this happened for all the PEPs
 on the plate, Guido would have a hard time :-)

  I think it is up to Guido to think/decide/change
  things in this respect.

(For sure I miss the pie-in-the-sky category;
maybe Guido should sometimes go over
all the PEPs and assign acceptance likelihood
measures, half-kidding <wink>.)

Just some vague ideas.

regards, Samuele Pedroni.


From pedroni@inf.ethz.ch  Thu Feb 28 23:31:27 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Fri, 1 Mar 2002 00:31:27 +0100
Subject: [Python-Dev] PEP 1 update
References: <20020228181706.D0335E8C7@waltz.rahul.net><017101c1c084$e85cdaa0$6d94fea9@newmexico> <m34rk1clni.fsf@mira.informatik.hu-berlin.de> <035601c1c0ae$a6c11a00$6d94fea9@newmexico>
Message-ID: <036a01c1c0b0$0b6f38a0$6d94fea9@newmexico>

Another maybe valuable thing:

probably another useful heuristic
to divide the open PEPs
beyond   proof-of-concept/no-proof-of-concept

is new-syntax/new-keywords/new-"funny"-semantics/
non-backward compatible

vs. infrastructure/library/etc/BDFL championed

Those PEPs especially make people wonder:
will that happen to my favorite language,
oh god, when?, it seems real soon now - gulp, gasp.

regards, Samuele Pedroni.



From pedroni@inf.ethz.ch  Thu Feb 28 23:59:08 2002
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Fri, 1 Mar 2002 00:59:08 +0100
Subject: R: [Python-Dev] PEP 1 update
References: <20020228181706.D0335E8C7@waltz.rahul.net><017101c1c084$e85cdaa0$6d94fea9@newmexico> <m34rk1clni.fsf@mira.informatik.hu-berlin.de> <035601c1c0ae$a6c11a00$6d94fea9@newmexico> <036a01c1c0b0$0b6f38a0$6d94fea9@newmexico>
Message-ID: <039a01c1c0b3$e99946e0$6d94fea9@newmexico>

From: Samuele Pedroni <pedroni@inf.ethz.ch>
> Another maybe valuable thing:
> 
> probably another useful heuristic
> to divide the open PEPs
> beyond   proof-of-concept/no-proof-of-concept
> 
> is new-syntax/new-keywords/new-"funny"-semantics/
> non-backward compatible
> 
> vs. infrastructure/library/etc/BDFL championed
> 
> Those PEPs especially make people wonder:
> will that happen to my favorite language,
> oh god, when?, it seems real soon now - gulp, gasp.
> 

Clearly my point is not against "disruptive" changes
but about making clear for people what 
is ongoing-work-in-progress and what is still just
undecided.

regards, Samuele Pedroni.