From guido@python.org  Sat Jun  1 02:21:59 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 31 May 2002 21:21:59 -0400
Subject: [Python-Dev] Customization docs
In-Reply-To: Your message of "Fri, 31 May 2002 18:45:13 EDT."
 <06da01c208f4$d69003c0$6601a8c0@boostconsulting.com>
References: <06da01c208f4$d69003c0$6601a8c0@boostconsulting.com>
Message-ID: <200206010121.g511LxX19223@pcp742651pcs.reston01.va.comcast.net>

I'll leave the doc questions for Fred (maybe better open a SF bug for
them though).  Then:

> From what I could find in the docs, it's completely non-obvious how the
> following works for immutable objects in containers:
> 
> >>> x = [ 1, 2, 3]
> >>> x[1] += 3
> >>> x
> [1, 5, 3]
> 
> Is the sequence of operations described someplace?

Um, in the code. :-( Using dis(), you'll find that x[1]+=3 executes
the following:

          6 LOAD_FAST                0 (x)
          9 LOAD_CONST               1 (1)
         12 DUP_TOPX                 2
         15 BINARY_SUBSCR       
         16 LOAD_CONST               2 (3)
         19 INPLACE_ADD         
         20 ROT_THREE           
         21 STORE_SUBSCR        
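
To reproduce a listing like that yourself (a minimal sketch; the exact
offsets and constant indices depend on the surrounding code):

    import dis

    def f(x):
        x[1] += 3

    dis.dis(f)    # prints the LOAD_FAST ... INPLACE_ADD ... STORE_SUBSCR sequence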

> How does Python decide that sequence elements are immutable?

Huh?  It doesn't.  If they were mutable, had you expected something
else?

    >>> x = [[1], [3], [5]]
    >>> x[1] += [6]
    >>> x
    [[1], [3, 6], [5]]
    >>> 

Basically, += on an attribute or subscripted container does the
following:

(1) get the thing out
(2) apply the inplace operation to the thing
(3) put the thing back in

The inplace operation, of course, is a binary operator that *may*
modify its first operand in place, but *must* return the resulting
value; if it modified the first operand in place, it *should* return
that operand.  If a type doesn't support an inplace operation, the
regular binary operator is invoked instead.
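
As an illustrative sketch (the Tracer class below is made up, not part of
any library), a type can watch steps (1)-(3) happen by defining __iadd__:

    class Tracer:
        def __init__(self, data):
            self.data = data
        def __iadd__(self, other):
            # step (2): the in-place operator may mutate self...
            self.data = self.data + other
            # ...but must return the resulting value; returning self means
            # step (3) stores the very same object back into the container.
            return self

    x = [Tracer([1]), Tracer([3])]
    x[1] += [6]        # subscript load, __iadd__, subscript store -- in that order
    print x[1].data    # [3, 6]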

Does this help?  (The whole thing is designed to be intuitive, but
that probably doesn't work in your case. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From s_lott@yahoo.com  Sat Jun  1 03:11:06 2002
From: s_lott@yahoo.com (Steven Lott)
Date: Fri, 31 May 2002 19:11:06 -0700 (PDT)
Subject: [Python-Dev] deprecating string module?
In-Reply-To: <m3znyhz1xh.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020601021106.29157.qmail@web9606.mail.yahoo.com>

In python, you don't need overloading, you have a variety of
optional parameter mechanisms.
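
For example (just a sketch; the function is made up), a single function
with optional parameters covers what would be several overloads in C++:

    def distance(p, q=(0, 0)):
        # one argument: distance from the origin; two: distance between points
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5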

I think the "member functions" issues from C++ don't apply to
Python because C++ is strongly typed, meaning that many similar
functions have to be written with slightly different type
signatures.  The lack of strong typing makes it practical to
write generic operations.

I find that use of free functions defeats good object-oriented
design and leads to functionality being informally bound by a
cluster of free functions that have similar names.  I'm
suspicious of this, finding it tiresome to maintain and debug.


--- "Martin v. Loewis" <martin@v.loewis.de> wrote:
> Guido van Rossum <guido@python.org> writes:
> 
> > Is this still relevant to Python?  Why are C++ member
> functions
> > difficult to generic programs?  Does the same apply to
> Python methods?
> 
> In a generic algorithm foo, you can write
> 
> def foo1(x):
>   bar(x)
> 
> if you have global functions. With methods, you need to write
> 
> def foo1(x):
>   x.bar()
> 
> which means that bar must be a method of x. This might be
> difficult to
> achieve if x is of a class that you cannot control. In C++, it
> is then
> still possible to define a function
> 
> def bar(x of-type-X):
>   pass
> 
> which, by means of overload resolution, will be found from
> foo1
> automatically.
> 
> In Python, this is not so easy since you don't have
> overloading.
> 
> Regards,
> Martin
> 
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev


=====
--
S. Lott, CCP :-{)
S_LOTT@YAHOO.COM
http://www.mindspring.com/~slott1
Buccaneer #468: KaDiMa

Macintosh user: drinking upstream from the herd.




From s_lott@yahoo.com  Sat Jun  1 03:45:53 2002
From: s_lott@yahoo.com (Steven Lott)
Date: Fri, 31 May 2002 19:45:53 -0700 (PDT)
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com>
Message-ID: <20020601024553.59600.qmail@web9607.mail.yahoo.com>

The class isn't really the unit of reuse.  The old
one-class-per-file rules from C++ aren't helpful for good
reusable design.  They are for optimizing compiling and making.

This is a great book on large-scale design considerations.  Much
of it is C++ specific, but parts apply to Python.

Large-Scale C++ Software Design, John Lakos Addison-Wesley,
Paperback, Published July 1996, 845 pages, ISBN 0201633620.

The module of related classes is the unit of reuse.  A cluster
of related modules can make sense for a large, complex reusable
component, like an application program.

As a user, anything in a module file that is not a class
definition (or the odd occasional convenience function) is a
show-stopper.  If there is some funny business to implement
submodules, that ends my interest.  Part of the open source
social contract is that if I'm going to use it, I'd better be
able to support it.  Even if you win the lottery and retire to a
fishing boat in the Caribbean. 


The question of <was>Optik</was><is>options</is> having several
reusable elements pushes my envelope.  If its job is to parse
command line arguments, how many different reusable elements can
there really be?  Perhaps there are several candidate modules
here.  It seems difficult to justify putting them all into a
library.  The problem doesn't seem complex enough to justify a
complex solution. 


--- "Patrick K. O'Brien" <pobrien@orbtech.com> wrote:
> [Barry A. Warsaw]
> > If that's so, then I'd prefer to see each class in its own
> module
> > inside a parent package.
> 
> Without trying to open a can of worms here, is there any sort
> of consensus
> on the use of packages with multiple smaller modules vs. one
> module
> containing everything? I'm asking about the Python standard
> library,
> specifically. According to the one-class-per-module rule of
> thumb, there are
> some Python modules that could be refactored into packages.
> Weighing against
> that is the convenience of importing a single module.
> 
> I'm just wondering if there are any guidelines that should
> frame one's
> thinking beyond the fairly obvious ones? For example, is the
> standard
> library an exceptional case because it must appeal to new
> users as well as
> experts? Does a good part of this issue come down to personal
> preference? Or
> are there advantages and disadvantages that should be
> documented? (Maybe
> they already have.)
> 
> Is the current library configuration considered healthy? There
> are a mix of
> packages and single modules. Are these implementations pretty
> optimal, or
> would they be organized differently if one had the chance to
> do it all over
> again?
> 
> Just curious.
> 
> ---
> Patrick K. O'Brien
> Orbtech
> 
> 
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev


=====
--
S. Lott, CCP :-{)
S_LOTT@YAHOO.COM
http://www.mindspring.com/~slott1
Buccaneer #468: KaDiMa

Macintosh user: drinking upstream from the herd.




From gward@python.net  Sat Jun  1 03:57:39 2002
From: gward@python.net (Greg Ward)
Date: Fri, 31 May 2002 22:57:39 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: <20020601024553.59600.qmail@web9607.mail.yahoo.com>
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com> <20020601024553.59600.qmail@web9607.mail.yahoo.com>
Message-ID: <20020601025739.GA17229@gerg.ca>

On 31 May 2002, Steven Lott said:
> The question of <was>Optik</was><is>options</is> having several
> reusable elements pushes my envelope.  If its job is to parse
> command line arguments, how many different reusable elements can
> there really be?  Perhaps there are several candidate modules
> here.  It seems difficult to justify putting them all into a
> library.  The problem doesn't seem complex enough to justify a
> complex solution. 

I think I agree with everything you said.  There are only two important
classes in Optik: OptionParser and Option.  Together with one trivial
support class (OptionValue) and some exception classes, that is the
module -- the unit of reusability, in your terms.

For convenience while developing, I split Optik into three source files
-- optik/option_parser.py, optik/option.py, and optik/errors.py.
There's not that much code; about 1100 lines.  And it's all pretty
tightly related -- the OptionParser class is useless without Option, and
vice-versa.

If you just want to use the code, it doesn't much matter if optik (or
OptionParser) is a package with three sub-modules or a single file.  If
you just want to read the code, it's probably easier to have a single
file.  If you're hacking on it, it's probably easier to split the code
up.  I think Optik is now moving into that long, happy phase where it is
mostly read and rarely hacked on, so I think it's time to merge the
three separate source files into one.  I very much doubt that it's too
complex for this -- I have worked hard to keep it tightly focussed on
doing one thing well.

        Greg
-- 
Greg Ward - nerd                                        gward@python.net
http://starship.python.net/~gward/
I appoint you ambassador to Fantasy Island!!!



From barry@zope.com  Sat Jun  1 04:16:48 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 31 May 2002 23:16:48 -0400
Subject: [Python-Dev] subclass a module?
Message-ID: <15608.15520.96707.809995@anthem.wooz.org>

Am I freaking out, did I miss something, or was the `root' in my
root beer float tonight something other than sarsaparilla?

-------------------- snip snip --------------------
Python 2.2.1 (#1, May 31 2002, 18:34:35) 
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import string
>>> class allmodcons(string): pass
... 
>>> string
<module 'string' from '/usr/local/lib/python2.2/string.pyc'>
>>> allmodcons
<module '?' (built-in)>
-------------------- snip snip --------------------

Can I now subclass from modules?  And if so, what good does that do
me?

-------------------- snip snip --------------------
>>> dir(allmodcons)
[]
>>> allmodcons.whitespace
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'module' object has no attribute 'whitespace'
>>> string.whitespace
'\t\n\x0b\x0c\r \xa0'
>>> 
-------------------- snip snip --------------------

stickin'-to-herbal-tea-and-dr.-pepper-ly y'rs,
-Barry



From goodger@users.sourceforge.net  Sat Jun  1 04:30:43 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 31 May 2002 23:30:43 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: <B919A917.23AB3%goodger@users.sourceforge.net>
Message-ID: <B91DB822.23E2F%goodger@users.sourceforge.net>

I ran across this wrinkle and hope that someone can shed some light.
First posted to comp.lang.python, but no help there.  Can anyone
here enlighten me?

I have a package on sys.path containing pairs of modules, each
importing the other::

    package/
        __init__.py:
            # empty

        module1.py:
            import module2   # relative import

        module2.py:
            import module1

Executing "from package import module1" works fine.  Changing the
import statements to absolute dotted forms also works for "from
package import module3"::

        module3.py:
            import package.module4   # absolute import

        module4.py:
            import package.module3

However, if I change both imports to be absolute using the
"from/import" form, it doesn't work::

        module5.py:
            from package import module6   # absolute import

        module6.py:
            from package import module5

Now I get an exception::

    >>> from package import module5
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "package/module5.py", line 1, in ?
        from package import module6
      File "package/module6.py", line 1, in ?
        from package import module5
    ImportError: cannot import name module5

Is this behavior expected?  Or is it a bug?  I note that FAQ entry
4.37 [*]_ says we shouldn't do "from <module> import *"; I'm not.  Are
all "from import" statements forbidden in this context?  Why?  (It
seems to me that "import package.module" and "from package import
module" are equivalent imports, except for their effect on the local
namespace.)  Is there an authoritative reference (docs, past c.l.p
post, bug report, etc.)?

.. [*] http://www.python.org/cgi-bin/faqw.py?req=show&file=faq04.037.htp

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/




From guido@python.org  Sat Jun  1 05:27:44 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 00:27:44 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: Your message of "Fri, 31 May 2002 22:57:39 EDT."
 <20020601025739.GA17229@gerg.ca>
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com> <20020601024553.59600.qmail@web9607.mail.yahoo.com>
 <20020601025739.GA17229@gerg.ca>
Message-ID: <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>

> If you're hacking on it, it's probably easier to split the code up.

Hm, that's not how I tend to hack on things (except when working with
others who like that style).  Why do you find hacking on several
(many?) small files easier for you than on a single large file?
Surely not because loading a large file (in the editor, or in Python)
takes too long?  That was in the 80s. :-)  Is it because multiple
Emacs buffers allow you to maintain multiple current positions, with
all the context that that entails?  Or is it something else?

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido@python.org  Sat Jun  1 05:29:51 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 00:29:51 -0400
Subject: [Python-Dev] subclass a module?
In-Reply-To: Your message of "Fri, 31 May 2002 23:16:48 EDT."
 <15608.15520.96707.809995@anthem.wooz.org>
References: <15608.15520.96707.809995@anthem.wooz.org>
Message-ID: <200206010429.g514Tpi19397@pcp742651pcs.reston01.va.comcast.net>

> Can I now subclass from modules?

It's a bug IMO.  

> And if so, what good does that do me?

None whatsoever.  The resulting class cannot be instantiated.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jun  1 05:42:42 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 00:42:42 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: Your message of "Fri, 31 May 2002 23:30:43 EDT."
 <B91DB822.23E2F%goodger@users.sourceforge.net>
References: <B91DB822.23E2F%goodger@users.sourceforge.net>
Message-ID: <200206010442.g514gg419477@pcp742651pcs.reston01.va.comcast.net>

> However, if I change both imports to be absolute using the
> "from/import" form, it doesn't work::
> 
>         module5.py:
>             from package import module6   # absolute import
> 
>         module6.py:
>             from package import module5
> 
> Now I get an exception::
> 
>     >>> from package import module5
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>       File "package/module5.py", line 1, in ?
>         from package import module6
>       File "package/module6.py", line 1, in ?
>         from package import module5
>     ImportError: cannot import name module5
> 
> Is this behavior expected?  Or is it a bug?

It's probably due to the extremely subtle (lame?) way that "from
package import module" is (has to be?) implemented.

It's too late at night for me to dig further to come up with an
explanation, but maybe reading the file knee.py is helpful -- it gives
the *algorithm* used for package and module import.  In 2.2 and
before, it's Lib/knee.py; in 2.3, it's been moved to
Demo/imputils/knee.py.

I think that you'll have to live with it.
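
One way to live with it (a sketch, not from knee.py): switch the circular
pair back to the plain dotted form, as in the module3/module4 example
above, and defer the attribute lookup until call time:

    # package/module5.py
    import package.module6          # instead of: from package import module6

    def f():
        # The attribute is resolved only when f() runs, by which time the
        # partially-initialized sibling module has finished loading.
        return package.module6.g()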

--Guido van Rossum (home page: http://www.python.org/~guido/)



From david.abrahams@rcn.com  Sat Jun  1 12:33:20 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Sat, 1 Jun 2002 07:33:20 -0400
Subject: [Python-Dev] Customization docs
References: <06da01c208f4$d69003c0$6601a8c0@boostconsulting.com>  <200206010121.g511LxX19223@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <073501c20960$23f367e0$6601a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>

> Um, in the code. :-( Using dis(), you'll find that x[1]+=3 executes
> the following:
>
>           6 LOAD_FAST                0 (x)
>           9 LOAD_CONST               1 (1)
>          12 DUP_TOPX                 2
>          15 BINARY_SUBSCR
>          16 LOAD_CONST               2 (3)
>          19 INPLACE_ADD
>          20 ROT_THREE
>          21 STORE_SUBSCR
>
> > How does Python decide that sequence elements are immutable?
>
> Huh?  It doesn't.  If they were mutable, had you expected something
> else?

Actually, yes. I had expected that Python would know it didn't need to
"put the thing back in", since the thing gets modified in place. Knowing
that it doesn't work that way clears up a lot.

>     >>> x = [[1], [3], [5]]
>     >>> x[1] += [6]
>     >>> x
>     [[1], [3, 6], [5]]
>     >>>

Well of /course/ I know that's the result. The question was, how is the
result achieved?

> Basically, += on an attribute or subscripted container does the
> following:
>
> (1) get the thing out
> (2) apply the inplace operation to the thing
> (3) put the thing back in
>
> The inplace operation, of course, is a binary operator that *may*
> modify its first operand in place, but *must* return the resulting
> value; if it modified the first operand in place, it *should* return
> that operand.  If a type doesn't support an inplace operation, the
> regular binary operator is invoked instead.

That's the easy part.

> Does this help?  (The whole thing is designed to be intuitive, but
> that probably doesn't work in your case. :-)

I use this stuff from Python without thinking about it, but when it comes
to building new types, I sometimes need to have a better sense of the
underlying mechanism.

Thanks,
Dave


P.S. Say, you could optimize away putting the thing back at runtime if the
inplace operation returns its first argument... but you probably already
thought of that one.





From david.abrahams@rcn.com  Sat Jun  1 12:42:09 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Sat, 1 Jun 2002 07:42:09 -0400
Subject: [Python-Dev] deprecating string module?
References: <20020601021106.29157.qmail@web9606.mail.yahoo.com>
Message-ID: <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com>

> In python, you don't need overloading, you have a variety of
> optional parameter mechanisms

...which forces users to write centralized dispatching mechanisms that could
be much more elegantly handled by the language. The language already does
something just for operators, but the rules are complicated and don't scale
well.

> I think the "member functions" issues from C++ don't apply to
> Python becuase C++ is strongly typed, meaning that many similar
> functions have to be written with slightly different type
> signatures.

That's very seldom the case in my C++ code. Why would you do that in lieu
of writing function templates?

I think Martin hit the nail on the head: you can achieve some decoupling of
algorithms from data structures using free functions, but you need some way
to look up the appropriate free function for a given data structure. For
that, you need some kind of overload resolution.

> The lack of strong typing makes it practical to
> write generic operations.

Templates and overloading in C++ make it practical to write
statically-type-checked generic operations.






From david.abrahams@rcn.com  Sat Jun  1 13:00:53 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Sat, 1 Jun 2002 08:00:53 -0400
Subject: [Python-Dev] subclass a module?
References: <15608.15520.96707.809995@anthem.wooz.org>  <200206010429.g514Tpi19397@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <076b01c20964$573b9f60$6601a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>


> > Can I now subclass from modules?
> 
> It's a bug IMO.  
> 
> > And if so, what good does that do me?
> 
> None whatsoever.  The resulting class cannot be instantiated.

Really?

>>> import re
>>> class X(type(re)):
...     def hello(): print 'hi'
...
>>> newmod = X()
>>> newmod.hello
<bound method X.hello of <module '?' (built-in)>>




From neal@metaslash.com  Sat Jun  1 14:20:04 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sat, 01 Jun 2002 09:20:04 -0400
Subject: [Python-Dev] PYC Magic
Message-ID: <3CF8CA04.C9070F3A@metaslash.com>

I recently posted a patch to fix a bug:  http://python.org/sf/561858.
The patch requires changing .pyc magic.  Since this bug goes back
to 2.1, what is the process for changing .pyc magic in bugfix releases?
ie, is it allowed?

In this case the co_stacksize > 32767 and only a short is written 
to disk.  This could be doubled to 65536 (probably should be) 
without changing the magic.  But even that isn't sufficient 
to solve this problem.
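
For context, one way to get such a code object (a sketch, not from the bug
report) is a huge literal list display, which the compiler of that era
turns into one element per stack slot:

    src = "[" + "0," * 40000 + "]"
    code = compile(src, "<big>", "eval")
    print code.co_stacksize    # well above the 32767 that fits in a signed short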

It also brings up a related problem.  If the PyCodeObject 
can't be written to disk, should a .pyc be created at all?  
The code will run fine the first time, but when imported 
the second time it will fail.

The other 16 bit values stored are:  co_argcount, co_nlocals, co_flags.
At least argcount & nlocals aren't too likely to exceed 32k, but
co_flags could, which would be silently ignored now.

Neal



From guido@python.org  Sat Jun  1 14:34:01 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 09:34:01 -0400
Subject: [Python-Dev] Customization docs
In-Reply-To: Your message of "Sat, 01 Jun 2002 07:33:20 EDT."
 <073501c20960$23f367e0$6601a8c0@boostconsulting.com>
References: <06da01c208f4$d69003c0$6601a8c0@boostconsulting.com> <200206010121.g511LxX19223@pcp742651pcs.reston01.va.comcast.net>
 <073501c20960$23f367e0$6601a8c0@boostconsulting.com>
Message-ID: <200206011334.g51DY1D21669@pcp742651pcs.reston01.va.comcast.net>

> > > How does Python decide that sequence elements are immutable?
> >
> > Huh?  It doesn't.  If they were mutable, had you expected something
> > else?
> 
> Actually, yes. I had expected that Python would know it didn't need
> to "put the thing back in", since the thing gets modified in
> place. Knowing that it doesn't work that way clears up a lot.

Still, I don't understand which other outcome than [1, 6, 5] you had
expected.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jun  1 14:35:09 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 09:35:09 -0400
Subject: [Python-Dev] deprecating string module?
In-Reply-To: Your message of "Sat, 01 Jun 2002 07:42:09 EDT."
 <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com>
References: <20020601021106.29157.qmail@web9606.mail.yahoo.com>
 <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com>
Message-ID: <200206011335.g51DZ9a21685@pcp742651pcs.reston01.va.comcast.net>

> > In python, you don't need overloading, you have a variety of
> > optional parameter mechanisms
> 
> ...which forces users to write centralized dispatching mechanisms
> that could be much more elegantly handled by the language. The
> language already does something just for operators, but the rules
> are complicated and don't scale well.

I don't think the situation can be improved without adding type
declarations.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jun  1 14:36:51 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 09:36:51 -0400
Subject: [Python-Dev] subclass a module?
In-Reply-To: Your message of "Sat, 01 Jun 2002 08:00:53 EDT."
 <076b01c20964$573b9f60$6601a8c0@boostconsulting.com>
References: <15608.15520.96707.809995@anthem.wooz.org> <200206010429.g514Tpi19397@pcp742651pcs.reston01.va.comcast.net>
 <076b01c20964$573b9f60$6601a8c0@boostconsulting.com>
Message-ID: <200206011336.g51Daqe21701@pcp742651pcs.reston01.va.comcast.net>

> > > Can I now subclass from modules?
> > 
> > It's a bug IMO.  
> > 
> > > And if so, what good does that do me?
> > 
> > None whatsoever.  The resulting class cannot be instantiated.
> 
> Really?
> 
> >>> import re
> >>> class X(type(re)):
> ...     def hello(): print 'hi'
> ...
> >>> newmod = X()
> >>> newmod.hello
> <bound method X.hello of <module '?' (built-in)>>

You subclass the module metaclass.  The example we were discussing was
different: it subclassed the module itself, like this:

    >>> import re
    >>> class X(re):
	 pass
    ...
    >>> X()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: 'module' object is not callable
    >>> 

--Guido van Rossum (home page: http://www.python.org/~guido/)



From david.abrahams@rcn.com  Sat Jun  1 14:32:39 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Sat, 1 Jun 2002 09:32:39 -0400
Subject: [Python-Dev] Customization docs
References: <06da01c208f4$d69003c0$6601a8c0@boostconsulting.com> <200206010121.g511LxX19223@pcp742651pcs.reston01.va.comcast.net>              <073501c20960$23f367e0$6601a8c0@boostconsulting.com>  <200206011334.g51DY1D21669@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <07d801c20970$cdebe050$6601a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>


> > > > How does Python decide that sequence elements are immutable?
> > >
> > > Huh?  It doesn't.  If they were mutable, had you expected something
> > > else?
> >
> > Actually, yes. I had expected that Python would know it didn't need
> > to "put the thing back in", since the thing gets modified in
> > place. Knowing that it doesn't work that way clears up a lot.
>
> Still, I don't understand which other outcome than [1, 6, 5] you had
> expected.

As I indicated in my previous mail, I didn't expect any other result.

My question was about what a new type needs to do in order for things to
work properly in Python. If, as I had incorrectly assumed, Python were
checking a type's mutability before deciding whether it would be putting
the result back into the sequence, I would need to know what criteria
Python uses to decide mutability.

-Dave





From guido@python.org  Sat Jun  1 14:45:09 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 09:45:09 -0400
Subject: [Python-Dev] PYC Magic
In-Reply-To: Your message of "Sat, 01 Jun 2002 09:20:04 EDT."
 <3CF8CA04.C9070F3A@metaslash.com>
References: <3CF8CA04.C9070F3A@metaslash.com>
Message-ID: <200206011345.g51Dj9h21769@pcp742651pcs.reston01.va.comcast.net>

> I recently posted a patch to fix a bug:  http://python.org/sf/561858.
> The patch requires changing .pyc magic.  Since this bug goes back
> to 2.1, what is the process for changing .pyc magic in bugfix releases?
> ie, is it allowed?

Absolutely not!!!!!  .pyc files must remain 100% compatible!!!
(Imagine someone doing a .pyc-only distribution for 2.1.3 and finding
that it doesn't work for 2.1.4!)

> In this case the co_stacksize > 32767 and only a short is written 
> to disk.  This could be doubled to 65536 (probably should be) 
> without changing the magic.  But even that isn't sufficient 
> to solve this problem.

I guess the only way to fix this in 2.1.x is to raise an error --
that's better than the crash that will follow if you try to execute
that code.

> It also brings up a related problem.  If the PyCodeObject 
> can't be written to disk, should a .pyc be created at all?  
> The code will run fine the first time, but when imported 
> the second time it will fail.

What do you mean by "can't be written to disk"?  Is the disk full?  Is
there another kind of write error?  The magic number is written last,
only when the write is successful.

> The other 16 bit values stored are:  co_argcount, co_nlocals, co_flags.
> At least argcount & nlocals aren't too likely to exceed 32k, but
> co_flags could, which would be silently ignored now.

If you're going to change the marshal format anyway, I'd increase all
of them to 32 bit ints.  After all, I thought the stacksize would
never exceed 32K either...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gward@python.net  Sat Jun  1 14:42:36 2002
From: gward@python.net (Greg Ward)
Date: Sat, 1 Jun 2002 09:42:36 -0400
Subject: [Python-Dev] Where to put wrap_text()?
Message-ID: <20020601134236.GA17691@gerg.ca>

Hidden away in distutils.fancy_getopt is an exceedingly handy function
called wrap_text().  It does just what you might expect from the name:

def wrap_text (text, width):
    """wrap_text(text : string, width : int) -> [string]

    Split 'text' into multiple lines of no more than 'width' characters
    each, and return the list of strings that results.
    """

Surprise surprise, Optik uses this.  I've never been terribly happy
about importing it from distutils.fancy_getopt, and putting Optik into
the standard library as OptionParser is a great opportunity for putting
wrap_text somewhere more sensible.

I happen to think that wrap_text() is useful for more than just
auto-formatting --help messages, so hiding it away in OptionParser.py
doesn't seem right.  Also, Perl has a Text::Wrap module that's been part
of the standard library for not-quite-forever -- so shouln't Python have
one too?

Proposal: a new standard library module, wrap_text, which combines the
best of distutils.fancy_getopt.wrap_text() and Text::Wrap.  Right now,
I'm thinking of an interface something like this:

  wrap(text : string, width : int) -> [string]

    Split 'text' into multiple lines of no more than 'width' characters
    each, and return the list of strings that results.  Tabs in 'text'
    are expanded with string.expandtabs(), and all other whitespace
    characters (including newline) are converted to space.

[This is identical to distutils.fancy_getopt.wrap_text(), but the
docstring is more complete.]

  wrap_nomunge(text : string, width : int) -> [string]

    Same as wrap(), without munging whitespace.

[Not sure if this is really useful to expose publicly.  Opinions?]

  fill(text : string,
       width : int,
       initial_tab : string = "",
       subsequent_tab : string = "")
  -> string

    Reformat the paragraph in 'text' to fit in lines of no more than
    'width' columns.  The first line is prefixed with 'initial_tab',
    and subsequent lines are prefixed with 'subsequent_tab'; the
    lengths of the tab strings are accounted for when wrapping lines
    to fit in 'width' columns.

[This is just a glorified "\n".join(wrap(...)); the idea to add initial_tab
and subsequent_tab was stolen from Perl's Text::Wrap.]
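
Purely to make the interface concrete, a rough sketch (nothing like final
code, and it punts on words longer than 'width'):

    def wrap(text, width):
        # Collapse tabs/newlines/runs of whitespace, then fill lines greedily.
        words = text.expandtabs().split()
        lines = []
        current = ""
        for word in words:
            if current and len(current) + 1 + len(word) > width:
                lines.append(current)
                current = word
            elif current:
                current = current + " " + word
            else:
                current = word
        if current:
            lines.append(current)
        return lines

    def fill(text, width, initial_tab="", subsequent_tab=""):
        # Simplified: only subsequent_tab's width is taken out of 'width'.
        lines = wrap(text, width - len(subsequent_tab))
        if not lines:
            return initial_tab
        result = [initial_tab + lines[0]]
        for line in lines[1:]:
            result.append(subsequent_tab + line)
        return "\n".join(result)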

I'll go whip up some code and submit a patch to SF.  If people like it,
I'll even write some tests and documentation too.

        Greg
-- 
Greg Ward - Unix nerd                                   gward@python.net
http://starship.python.net/~gward/
Support bacteria -- it's the only culture some people have!



From guido@python.org  Sat Jun  1 14:54:20 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 09:54:20 -0400
Subject: [Python-Dev] Where to put wrap_text()?
In-Reply-To: Your message of "Sat, 01 Jun 2002 09:42:36 EDT."
 <20020601134236.GA17691@gerg.ca>
References: <20020601134236.GA17691@gerg.ca>
Message-ID: <200206011354.g51DsKK21861@pcp742651pcs.reston01.va.comcast.net>

> Proposal: a new standard library module, wrap_text, which combines the
> best of distutils.fancy_getopt.wrap_text() and Text::Wrap.

I think this is a fine idea.  But *please* don't put an underscore in
the name.  I'd say "wrap" or "wraptext" are better than "wrap_text".

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aahz@pythoncraft.com  Sat Jun  1 14:49:46 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sat, 1 Jun 2002 09:49:46 -0400
Subject: [Python-Dev] Where to put wrap_text()?
In-Reply-To: <20020601134236.GA17691@gerg.ca>
References: <20020601134236.GA17691@gerg.ca>
Message-ID: <20020601134946.GA608@panix.com>

On Sat, Jun 01, 2002, Greg Ward wrote:
>
> Proposal: a new standard library module, wrap_text, which combines the
> best of distutils.fancy_getopt.wrap_text() and Text::Wrap.  

Personally, I'd like to at least get the functionality of some versions
of 'fmt', which have both goal and maxlength parameters.  If you feel
like getting ambitious, there's the 'par' program that can wrap quoted
text, but that can always be added to a later version of the library.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In the end, outside of spy agencies, people are far too trusting and
willing to help."  --Ira Winkler



From pinard@iro.umontreal.ca  Sat Jun  1 15:00:04 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 01 Jun 2002 10:00:04 -0400
Subject: [Python-Dev] Those (punctuations and skul heads) bug tracking systems! :-)
Message-ID: <oqptzbf6aj.fsf@titan.progiciels-bpi.ca>

Hi, gang.  Anecdotal rants from a technical Luddite, yours truly! :-)

A few days ago, I wrote Fred about a tiny problem in Python documentation.
Fred replied (very nicely, don't doubt it) something like "Yes, this should
absolutely be corrected, but being busy now, I might forget about this --
so please submit a bug report using the SF tracker".

I learnt to shudder with horror when people tell me such things.  Email is
so simple, clean, expeditious and human!  Each time I have to use a BTS,
it is the same story: I spend many hours studying around, then later
experimenting with the system.  And finally, I invariably fall into dead ends,
after having met a few blatant bugs in the BTS itself.  Don't tell me it's
my browser.  The browser is an integral part of the BTS.  Think "user" here!

So, trying once more to be a good citizen, I spent many hours yesterday
sorting and reading the email I saved over time, with various comments and
references from Python developers about the BTS in use.  I had filed these
foreseeing that I could not escape the Python BTS forever, especially if
I want to involve myself a bit more.  Reading all this more attentively,
I noticed a flurry of alternate, confusing, and sometimes heavy notations
for accessing already submitted reports and documentation, and changes in
numbering and methods over time that did not always seem to be fully
gracious.  I admired the relative nicety of the Python SF redirector, and
its minor shortcomings.  Notable to me were many developer comments about reports
being mis-attributed, re-filed, unduly aging or nearly lost in practice.

This morning, I decided to do the great try, knowing that there is
a facility to prepare the message offline using a reasonable editor
(Netscape is very far from my concept of a usable editor) and submit it
afterwards.  I prepared the message yesterday, saved it into a `temp0'
file, and moved it over to the machine here, coming back from travel.
Netscape first refused to see that `temp0' file in its directory in
the file browsing window; it apparently only saw `*.html' files.  I was
surely not going to turn my little communication into HTML first for Fred to
see, so I merely typed the file name in the upload box.  "Category",
"Group", "Summary", "Check to Upload and Attach File" all had a little
'?' beside them, from which I expected some documentation, but clicking
on them yielded "File loaded" in the bottom echo area, and _nothing_ more.
For "File Description" in particular, I would have needed more information,
but there was no `?' next to it, so I merely guessed it wanted a MIME type
and wrote "text/plain" within it.  Clicking "SUBMIT" gave something like
"ERROR Invalid file name", and no kind of feedback about the bug having
been submitted.  So I guessed the file needed an extension, and renamed
`temp0' into `temp0.txt', then modified the file name accordingly in
the upload box.  Re-attempting "SUBMIT" a second time yielded: "ERROR You
Attempted To Double-submit this item.  Please avoid double-clicking."

Sigh! The usual misery!

OK.  Instead of uploading a prepared file, I will now proceed to try cutting
and pasting into Netscape from a real editor, hoping that the mangling will
be limited.  I do know I have more comments and nuances for Fred, and I should
find the courage to share them: I fear one does not have much of a choice
for contributing.  Such rotten reporting systems are merely discouraging.
Hmph!  I'll continue trying to tame myself to these user interface failures.

Sometimes, I ponder that if all maintainers were using the same BTS,
the effort of learning to cope with _that one_ would probably be worth
more.  I do imagine that a BTS could be useful.  There are many
BTS around, random projects using random BTS -- so the effort of fighting
with a BTS often has to be restarted when you play in many fields.  Moreover,
the truth is, at least for Python, that using a BTS does not solve the main
problem, which is the insufficient number of contributors and developers.
Risk for risk, I still think I have a much better chance of being listened
to and understood when I write to Fred directly!  With some luck, Fred is
an ordered and careful man who, just like me, is able to handle folders.

I read with pleasure all the thread saying that `roundup' has an email
interface, is actively being improved, and could replace the SF tracker.
Let us hope it will be more usable than its predecessors!  You know, the
real goal of all this is allowing for simple and humble communication
between humans, about the knowledge of a problem.  I surely used to be
a very active reporter for all problems I saw everywhere, at the time
maintainers were still reachable.  When the effort gets too frustrating,
sadly, one might feel less inclined to offer contributions, and rather
choose to enjoy more of the sun, music, and life! :-)

It may look like a useless moan, but it might be worth saying after all.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From neal@metaslash.com  Sat Jun  1 15:15:22 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sat, 01 Jun 2002 10:15:22 -0400
Subject: [Python-Dev] PYC Magic
References: <3CF8CA04.C9070F3A@metaslash.com> <200206011345.g51Dj9h21769@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3CF8D6FA.D9D7B8CE@metaslash.com>

Guido van Rossum wrote:
> 
> > I recently posted a patch to fix a bug:  http://python.org/sf/561858.
> > The patch requires changing .pyc magic.  Since this bug goes back
> > to 2.1, what is the process for changing .pyc magic in bugfix releases?
> > ie, is it allowed?
> 
> Absolutely not!!!!!  .pyc files must remain 100% compatible!!!
> (Imagine someone doing a .pyc-only distribution for 2.1.3 and finding
> that it doesn't work for 2.1.4!)

Ok, I'll work on a patch for 2.1/2.2.

In looking through other magic code, I found that when -U
was removed, it was only removed from the usage msg.
Should the option and code be removed altogether?
ie, -U is still used by getopt and still changes the magic,
so -U is just a hidden option now.

> > It also brings up a related problem.  If the PyCodeObject
> > can't be written to disk, should a .pyc be created at all?
> > The code will run fine the first time, but when imported
> > the second time it will fail.
> 
> What do you mean by "can't be written to disk"?  Is the disk full?  Is
> there another kind of write error?  The magic number is written last,
> only when the write is successful.

Disk full was one condition.  The other condition was if a value
is 32 bits in memory, but only 16 bits are written to disk.  Based
on your comment to increase all of the 16 bit values for PyCode,
that will no longer be the case.  There could still be transient
write errors, though, and the file could be corrupted, since only part
of the data would be written.  One case where this could happen
is an interrupted system call.

There is one other possible problem.  [wr]_short() is now only
used in one place:  for long.digits which are unsigned ints.
But r_short() does sign extension.  Is this a problem?

Neal



From paul-python@svensson.org  Sat Jun  1 15:17:50 2002
From: paul-python@svensson.org (Paul Svensson)
Date: Sat, 1 Jun 2002 10:17:50 -0400 (EDT)
Subject: [Python-Dev] Customization docs
In-Reply-To: <200206011334.g51DY1D21669@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.44.0206011005450.22613-100000@familjen.svensson.org>

On Sat, 1 Jun 2002, Guido van Rossum wrote:

>> > > How does Python decide that sequence elements are immutable?
>> >
>> > Huh?  It doesn't.  If they were mutable, had you expected something
>> > else?
>>
>> Actually, yes. I had expected that Python would know it didn't need
>> to "put the thing back in", since the thing gets modified in
>> place. Knowing that it doesn't work that way clears up a lot.
>
>Still, I don't understand which other outcome than [1, 6, 5] you had
>expected.

Well, _I_ would have expected this to work:

	Python 2.1 (#4, Jun  6 2001, 08:54:49)
	[GCC 2.95.2 19991024 (release)] on linux2
	Type "copyright", "credits" or "license" for more information.
	>>> x = ([],[],[])
	>>> x[1] += [1]
	Traceback (most recent call last):
	  File "<stdin>", line 1, in ?
	TypeError: object doesn't support item assignment

Given that the object x[1] can be (and is) modified in place,
I find this behaviour quite counter-intuitive,
especially considering:

	>>> z = x[1]
	>>> z += [2]
	>>> x
	([], [1, 2], [])


		/Paul




From neal@metaslash.com  Sat Jun  1 15:18:31 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sat, 01 Jun 2002 10:18:31 -0400
Subject: [Python-Dev] Where to put wrap_text()?
References: <20020601134236.GA17691@gerg.ca> <200206011354.g51DsKK21861@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3CF8D7B7.BFA0AD14@metaslash.com>

Guido van Rossum wrote:
> 
> > Proposal: a new standard library module, wrap_text, which combines the
> > best of distutils.fancy_getopt.wrap_text() and Text::Wrap.
> 
> I think this is a fine idea.  But *please* don't put an underscore in
> the name.  I'd say "wrap" or "wraptext" are better than "wrap_text".

Some possibilities are:

  * a string method
  * a UserString method
  * a new module text, with a function wrap()
  * add function wrap() to UserString

Should it work on unicode strings too?

Neal



From walter@livinglogic.de  Sat Jun  1 15:23:39 2002
From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Sat, 01 Jun 2002 16:23:39 +0200
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello>
Message-ID: <3CF8D8EB.60604@livinglogic.de>

Raymond Hettinger wrote:

> While we're eliminating uses of the string and types modules, how about
> other code clean-ups and modernization:
> 
 > [...]

don't forget:
import stat; os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime

But to be able to remove "import stat" everywhere the remaining 
functions in stat.py would have to be implemented as methods.
And what about the remaining constants defined in stat.py?
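
For instance (just an illustration), the mode-testing helpers are the part
that has no attribute equivalent:

    import os, stat

    st = os.stat("foo")
    mtime = st.st_mtime                 # attribute access, no ST_MTIME needed
    isdir = stat.S_ISDIR(st.st_mode)    # still needs stat.S_ISDIR and friends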

Bye,
    Walter Dörwald




From gward@python.net  Sat Jun  1 15:38:55 2002
From: gward@python.net (Greg Ward)
Date: Sat, 1 Jun 2002 10:38:55 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com> <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020601143855.GA18632@gerg.ca>

On 01 June 2002, Guido van Rossum said:
> Hm, that's not how I tend to hack on things (except when working with
> others who like that style).  Why do you find hacking on several
> (many?) small files easier for you than on a single large file?

Actually, Optik started out in one file; I split it up somewhere around
600 or 700 lines of code expecting it to grow more.  It only grew to
around 1100 lines, which I suppose is a good thing.  I think having
small modules makes me more comfortable about adding code -- I don't
feel at all hemmed-in adding 50 lines to a 300-line module, but adding
50 lines to an 800-line module makes me nervous.

I think it all boils down to having things in easily-digested chunks,
rather than concerns about stressing Emacs out.

(OTOH and wildly OT: since I gave in a couple years ago and started
using Emacs syntax-colouring, it *does* take a lot longer to load
modules up -- eg. ~2 sec for the 1000-line rfc822.py.  But that's
probably just because Emacs is a great shaggy beast of an editor
("Eight(y) Megs and Constantly Swapping", "Eventually Mallocs All Core
Storage", you know...).  I'm sure if I got a brain transplant so that I
could use vim, it would be different.)

        Greg
-- 
Greg Ward - programmer-at-big                           gward@python.net
http://starship.python.net/~gward/
Gee, I feel kind of LIGHT in the head now, knowing I can't make my
satellite dish PAYMENTS!



From aahz@pythoncraft.com  Sat Jun  1 15:44:05 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sat, 1 Jun 2002 10:44:05 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com> <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020601144405.GA6490@panix.com>

On Sat, Jun 01, 2002, Guido van Rossum wrote:
>
> > If you're hacking on it, it's probably easier to split the code up.
> 
> Hm, that's not how I tend to hack on things (except when working with
> others who like that style).  Why do you find hacking on several
> (many?) small files easier for you than on a single large file?
> Surely not because loading a large file (in the editor, or in Python)
> takes too long?  That was in the 80s. :-)  Is it because multiple
> Emacs buffers allow you to maintain multiple current positions, with
> all the context that that entails?  Or is it something else?

s/Emacs/vi sessions/

Yes.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In the end, outside of spy agencies, people are far too trusting and
willing to help."  --Ira Winkler



From david.abrahams@rcn.com  Sat Jun  1 15:38:13 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Sat, 1 Jun 2002 10:38:13 -0400
Subject: [Python-Dev] Customization docs
References: <06da01c208f4$d69003c0$6601a8c0@boostconsulting.com> <200206010121.g511LxX19223@pcp742651pcs.reston01.va.comcast.net> <073501c20960$23f367e0$6601a8c0@boostconsulting.com> <200206011334.g51DY1D21669@pcp742651pcs.reston01.va.comcast.net>              <07d801c20970$cdebe050$6601a8c0@boostconsulting.com>  <200206011348.g51Dm0i21793@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <081001c2097d$4fce5740$6601a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>


> > As I indicated in my previous mail, I didn't expect any other result.
>
> Then your question was formulated strangely.  You showed the result
> and said "how does it know that list items are immutable"; the context
> suggested strongly to me that you had expected something else.
>
> > My question was about what a new type needs to do in order for things
to
> > work properly in Python.
>
> You could have asked that directly. :-)

Incorrect background assumptions have a way of fouling communication. I
hope it's obvious that I'm making an effort to be clear.

-Dave





From guido@python.org  Sat Jun  1 16:20:01 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 11:20:01 -0400
Subject: [Python-Dev] Where to put wrap_text()?
In-Reply-To: Your message of "Sat, 01 Jun 2002 10:18:31 EDT."
 <3CF8D7B7.BFA0AD14@metaslash.com>
References: <20020601134236.GA17691@gerg.ca> <200206011354.g51DsKK21861@pcp742651pcs.reston01.va.comcast.net>
 <3CF8D7B7.BFA0AD14@metaslash.com>
Message-ID: <200206011520.g51FK1P22041@pcp742651pcs.reston01.va.comcast.net>

> Some possibilities are:
> 
>   * a string method
>   * a UserString method

This should *definitely* not be a method.  Too specialized, too many
possibilities for tweaking the algorithm.

>   * a new module text, with a function wrap()
>   * add function wrap() to UserString
> 
> Should it work on unicode strings too?

Yes.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From david.abrahams@rcn.com  Sat Jun  1 16:15:35 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Sat, 1 Jun 2002 11:15:35 -0400
Subject: [Python-Dev] deprecating string module?
References: <20020601021106.29157.qmail@web9606.mail.yahoo.com>              <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com>  <200206011335.g51DZ9a21685@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <082601c2097f$f7b75900$6601a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>


> > > In python, you don't need overloading, you have a variety of
> > > optional parameter mechanisms
> >
> > ...which forces users to write centralized dispatching mechanism
> > that could be much more elegantly-handled by the language. The
> > language already does something just for operators, but the rules
> > are complicated and don't scale well.
>
> I don't think the situation can be improved without adding type
> declarations.

You could do a little without type declarations, to handle functions with
different numbers of arguments or keywords, though I don't think that would
be very satisfying.

A good solution would not necessarily need full type declaration
capability; just some way to annotate function signatures with types. What
I mean is that, for example, the ability to declare the type of a local
variable would not be of any use in overload resolution.

In fact, pure (sub)type comparison is probably not the best mechanism for
Python's overload resolution. For example, it should be possible to write a
function which will match any sequence object.

I'd like to see something like the following sketch:

1. Each function can have an optional associated rating function which,
given (args,kw) returns a float describing the quality of the match to its
arguments
2. When calling an overloaded function, the best match from the pool of
overloads is taken
3. The default rating function works as follows:

  a. Each formal argument has an optional associated match object
  b. The match object contributes a rating to the overall rating of the
function
  c. If the match object is a type T, the rating system favors arguments x
where T appears earlier in the mro of x.__class__. The availability of
explicit conversions such as int() and float() is considered, but always
produces a worse match than a subtype match.
  d. If the match object is a callable non-type, it's expected to produce
an argument match rating

Obviously, there are some details missing. I have to think about this stuff
anyway for Boost.Python (since its current trivial overload resolution is
not really adequate); if there's any interest here it would be nice for me,
since I'd stand a chance of doing something which is likely to be
consistent with whatever happens in Python.
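
To make the sketch concrete (every name below is invented, and the default
rating is reduced to a bare isinstance() check):

    class Overloaded:
        # Pool of (rater, implementation) pairs; the best-rated match wins.
        def __init__(self):
            self.overloads = []
        def add(self, rater, func):
            self.overloads.append((rater, func))
        def __call__(self, *args, **kw):
            best, best_func = -1.0, None
            for rater, func in self.overloads:
                rating = rater(args, kw)
                if rating > best:
                    best, best_func = rating, func
            if best_func is None:
                raise TypeError("no matching overload")
            return best_func(*args, **kw)

    def match_types(*types):
        # Simplified default rater: every positional argument must be an
        # instance of its match object; a real version would favor closer
        # positions in the mro, as described in (c).
        def rate(args, kw):
            if kw or len(args) != len(types):
                return -1.0
            for arg, t in zip(args, types):
                if not isinstance(arg, t):
                    return -1.0
            return float(len(args))
        return rate

    area = Overloaded()
    area.add(match_types(int, int), lambda w, h: w * h)
    area.add(match_types(float), lambda r: 3.14159 * r * r)
    print area(3, 4)    # rectangle overload -> 12
    print area(2.0)     # circle overload -> 12.56636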

-Dave





From guido@python.org  Sat Jun  1 16:26:37 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 11:26:37 -0400
Subject: [Python-Dev] PYC Magic
In-Reply-To: Your message of "Sat, 01 Jun 2002 10:15:22 EDT."
 <3CF8D6FA.D9D7B8CE@metaslash.com>
References: <3CF8CA04.C9070F3A@metaslash.com> <200206011345.g51Dj9h21769@pcp742651pcs.reston01.va.comcast.net>
 <3CF8D6FA.D9D7B8CE@metaslash.com>
Message-ID: <200206011526.g51FQbO22066@pcp742651pcs.reston01.va.comcast.net>

> In looking through other magic code, I found that when -U 
> was removed, it was only removed from the usage msg.
> Should the option and code be removed altogether?
> ie, -U is still used by getopt and still changes the magic,
> so -U is just a hidden option now.

-U is a handy option for developers wanting to test Unicode
conformance of their code, but the help message promised more than it
could deliver.  Please leave this alone.

> There is one other possible problem.  [wr]_short() is now only
> used in one place:  for long.digits which are unsigned ints.
> But r_short() does sign extension.  Is this a problem?

Long digits are only 15 bits, so if you change it to return an
unsigned short that shouldn't matter.  Dunno if there's magic for
negative numbers though (in memory, the length is negative, but the
digits are not).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jun  1 16:28:45 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 11:28:45 -0400
Subject: [Python-Dev] Other library code transformations
In-Reply-To: Your message of "Sat, 01 Jun 2002 16:23:39 +0200."
 <3CF8D8EB.60604@livinglogic.de>
References: <001501c208bd$46133420$d061accf@othello>
 <3CF8D8EB.60604@livinglogic.de>
Message-ID: <200206011528.g51FSjb22095@pcp742651pcs.reston01.va.comcast.net>

> But to be able to remove "import stat" everywhere the remaining 
> functions in stat.py would have to be implemented as methods.
> And what about the remaining constants defined in stat.py?

I see no reason to want to deprecate the stat module, only the
indexing constants in the stat tuple.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jun  1 16:27:48 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 11:27:48 -0400
Subject: [Python-Dev] Customization docs
In-Reply-To: Your message of "Sat, 01 Jun 2002 10:17:50 EDT."
 <Pine.LNX.4.44.0206011005450.22613-100000@familjen.svensson.org>
References: <Pine.LNX.4.44.0206011005450.22613-100000@familjen.svensson.org>
Message-ID: <200206011527.g51FRmt22081@pcp742651pcs.reston01.va.comcast.net>

> Well, _I_ would have expected this to work:
> 
> 	Python 2.1 (#4, Jun  6 2001, 08:54:49)
> 	[GCC 2.95.2 19991024 (release)] on linux2
> 	Type "copyright", "credits" or "license" for more information.
> 	>>> x = ([],[],[])
> 	>>> x[1] += [1]
> 	Traceback (most recent call last):
> 	  File "<stdin>", line 1, in ?
> 	TypeError: object doesn't support item assignment

Yes, but that can't be fixed without breaking other things.  Too bad.
It's not like this is an important use case in real life.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aahz@pythoncraft.com  Sat Jun  1 16:39:17 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sat, 1 Jun 2002 11:39:17 -0400
Subject: [Python-Dev] deprecating string module?
In-Reply-To: <200206011335.g51DZ9a21685@pcp742651pcs.reston01.va.comcast.net>
References: <20020601021106.29157.qmail@web9606.mail.yahoo.com> <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com> <200206011335.g51DZ9a21685@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020601153917.GA14320@panix.com>

>>> In python, you don't need overloading, you have a variety of
>>> optional parameter mechanisms
>> 
>> ...which forces users to write a centralized dispatching mechanism
>> that could be much more elegantly handled by the language. The
>> language already does something just for operators, but the rules
>> are complicated and don't scale well.
> 
> I don't think the situation can be improved without adding type
> declarations.

I thought this was the issue interfaces were supposed to handle?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In the end, outside of spy agencies, people are far too trusting and
willing to help."  --Ira Winkler



From barry@zope.com  Sat Jun  1 16:55:23 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 1 Jun 2002 11:55:23 -0400
Subject: [Python-Dev] Where to put wrap_text()?
References: <20020601134236.GA17691@gerg.ca>
Message-ID: <15608.61035.133229.77125@anthem.wooz.org>

>>>>> "GW" == Greg Ward <gward@python.net> writes:

    GW> Proposal: a new standard library module, wrap_text, which
    GW> combines the best of distutils.fancy_getopt.wrap_text() and
    GW> Text::Wrap.  Right now, I'm thinking of an interface something
    GW> like this:

You might consider a text package with submodules for various wrapping
algorithms.  The text package might even grow other functionality
later too.

I say this because in Mailman I also have a wrap() function (big
surprise, eh?) that implements the Python FAQ wizard rules for
wrapping:

def wrap(text, column=70, honor_leading_ws=1):
    """Wrap and fill the text to the specified column.

    Wrapping is always in effect, although if it is not possible to wrap a
    line (because some word is longer than `column' characters) the line is
    broken at the next available whitespace boundary.  Paragraphs are also
    always filled, unless honor_leading_ws is true and the line begins with
    whitespace.  This is the algorithm that the Python FAQ wizard uses, and
    seems like a good compromise.
    """

There's nothing at all Mailman specific about it, so I wouldn't mind
donating it to the standard library.
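
(For concreteness, a rough greedy sketch of this sort of wrapper -- not the
actual Mailman implementation, whose body is elided above -- might look
something like this:)

    def simple_wrap(text, column=70, honor_leading_ws=1):
        filled = []
        for para in text.split('\n\n'):
            if honor_leading_ws and para[:1].isspace():
                filled.append(para)          # leave indented paragraphs alone
                continue
            lines, current = [], ''
            for word in para.split():
                if current and len(current) + 1 + len(word) > column:
                    lines.append(current)    # line is full, start a new one
                    current = word
                elif current:
                    current = current + ' ' + word
                else:
                    current = word
            if current:
                lines.append(current)
            filled.append('\n'.join(lines))
        return '\n\n'.join(filled)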

-Barry



From aahz@pythoncraft.com  Sat Jun  1 17:07:00 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sat, 1 Jun 2002 12:07:00 -0400
Subject: Documenting practice (was Re: [Python-Dev] Python 2.3 release schedule)
In-Reply-To: <2mr8jwv84e.fsf@starship.python.net>
References: <LNBBLJKPBEHFEDALKOLCMEFJPIAA.tim.one@comcast.net> <2mr8jwv84e.fsf@starship.python.net>
Message-ID: <20020601160659.GA16298@panix.com>

On Tue, May 28, 2002, Michael Hudson wrote:
>
> Thanks; I think it is a good idea to describe intended usage in no
> uncertain terms *somewhere* at least.  Probably lots of places.  Any
> book authors reading python-dev?

Yes.  However, I'm no C programmer, and feedback on my book proposal
makes it likely that API stuff will be dumped -- which is fine with me.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In the end, outside of spy agencies, people are far too trusting and
willing to help."  --Ira Winkler



From guido@python.org  Sat Jun  1 17:17:50 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 12:17:50 -0400
Subject: [Python-Dev] deprecating string module?
In-Reply-To: Your message of "Sat, 01 Jun 2002 11:39:17 EDT."
 <20020601153917.GA14320@panix.com>
References: <20020601021106.29157.qmail@web9606.mail.yahoo.com> <073e01c20961$8ab0b770$6601a8c0@boostconsulting.com> <200206011335.g51DZ9a21685@pcp742651pcs.reston01.va.comcast.net>
 <20020601153917.GA14320@panix.com>
Message-ID: <200206011617.g51GHor22381@pcp742651pcs.reston01.va.comcast.net>

> I thought this was the issue interfaces were supposed to handle?

You'd still need a way to attach an interface declaration to a
function argument.  Smells like type declarations to me.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gmcm@hypernet.com  Sat Jun  1 17:24:25 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Sat, 1 Jun 2002 12:24:25 -0400
Subject: [Python-Dev] PYC Magic
In-Reply-To: <3CF8D6FA.D9D7B8CE@metaslash.com>
Message-ID: <3CF8BCF9.5557.4C49011F@localhost>

On 1 Jun 2002 at 10:15, Neal Norwitz wrote:

> Guido van Rossum wrote:

> > What do you mean by "can't be written to disk"? 

> Disk full was one condition.  

I can't be 100% sure of the cause, but I *have*
seen this (a bad .pyc file that had to be
deleted before the module would import). The .pyc
was woefully short but passed the magic
test. I think this was 2.1, maybe 2.0. 

This was during a firestorm at a client
site, so I didn't get around to a bug report.

-- Gordon
http://www.mcmillan-inc.com/




From pinard@iro.umontreal.ca  Sat Jun  1 17:51:03 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 01 Jun 2002 12:51:03 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <20020601134236.GA17691@gerg.ca>
References: <20020601134236.GA17691@gerg.ca>
Message-ID: <oqk7pjeydk.fsf@titan.progiciels-bpi.ca>

[Greg Ward]

> Proposal: a new standard library module, wrap_text, which combines the
> best of distutils.fancy_getopt.wrap_text() and Text::Wrap.

[Aahz]

> Personally, I'd like to at least get the functionality of some versions
> of 'fmt'

[Guido van Rossum]

> I think this is a fine idea.  But *please* don't put an underscore in
> the name.  I'd say "wrap" or "wraptext" are better than "wrap_text".

One thing that I would love to have available in Python is a function able
to wrap text using Knuth's filling algorithm.  GNU `fmt' does it, and it
is _so_ much better than dumb refilling, in my eyes at least, that I arranged
for Emacs' own filling algorithm to be short-circuited with an external call
(I do not mind the small fraction of a second it takes).

Also, is there some existing module in which `wraptext' would fit nicely?
That might be better than creating a new module for not many functions.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From tim.one@comcast.net  Sat Jun  1 17:50:29 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 01 Jun 2002 12:50:29 -0400
Subject: [Python-Dev] PYC Magic
In-Reply-To: <200206011526.g51FQbO22066@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCIELHPJAA.tim.one@comcast.net>

[Guido]
> Long digits are only 15 bits, so if you change it to return an
> unsigned short that shouldn't matter.  Dunno if there's magic for
> negative numbers though (in memory, the length is negative, but the
> digits are not).

The marshal format is the same:  signed length and unsigned digits.  The
signed length goes thru [rw]_long.




From aahz@pythoncraft.com  Sat Jun  1 17:59:06 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sat, 1 Jun 2002 12:59:06 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <oqk7pjeydk.fsf@titan.progiciels-bpi.ca>
References: <20020601134236.GA17691@gerg.ca> <oqk7pjeydk.fsf@titan.progiciels-bpi.ca>
Message-ID: <20020601165906.GA23320@panix.com>

On Sat, Jun 01, 2002, François Pinard wrote:
>
> Also, is there some existing module in which `wraptext' would fit nicely?
> That might be better than creating a new module for not many functions.

I'd prefer to create a package called 'text', with wrap being a module
inside it.  That way, as we add parsing (e.g. mxTextTools) and other
features to the standard library, they can be stuck in the package.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In the end, outside of spy agencies, people are far too trusting and
willing to help."  --Ira Winkler



From tim.one@comcast.net  Sat Jun  1 17:58:48 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 01 Jun 2002 12:58:48 -0400
Subject: [Python-Dev] Where to put wrap_text()?
In-Reply-To: <20020601134236.GA17691@gerg.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELIPJAA.tim.one@comcast.net>

[Greg Ward, on wrapping text]
> ...

Note that regrtest.py also has a wrapper:

def printlist(x, width=70, indent=4):
    """Print the elements of a sequence to stdout.

    Optional arg width (default 70) is the maximum line length.
    Optional arg indent (default 4) is the number of blanks with which to
    begin each line.
    """

This kind of thing gets reinvented too often, so +1 on a module from me.
Just make sure it handles the union of all possible desires, but has a simple
and intuitive interface <wink>.




From mal@lemburg.com  Sat Jun  1 18:36:34 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 01 Jun 2002 19:36:34 +0200
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de>
Message-ID: <3CF90622.6000106@lemburg.com>

Walter Dörwald wrote:
> Raymond Hettinger wrote:
>
>  > While we're eliminating uses of the string and types modules, how about
>  > other code clean-ups and modernization:
>  >
>  > [...]
>
> don't forget:
> import stat; os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime
>
> But to be able to remove "import stat" everywhere the remaining
> functions in stat.py would have to be implemented as methods.
> And what about the remaining constants defined in stat.py?

While you're at it: could you also write up all these little
"code cleanups" in some file so that Andrew can integrate them
in the migration guide ?!

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From barry@zope.com  Sat Jun  1 17:02:38 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 1 Jun 2002 12:02:38 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com>
 <20020601024553.59600.qmail@web9607.mail.yahoo.com>
 <20020601025739.GA17229@gerg.ca>
 <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>
 <20020601143855.GA18632@gerg.ca>
Message-ID: <15608.61470.981820.695481@anthem.wooz.org>

>>>>> "GW" == Greg Ward <gward@python.net> writes:

    GW> (OTOH and wildly OT: since I gave in a couple years ago and
    GW> started using Emacs syntax-colouring, it *does* take a lot
    GW> longer to load modules up -- eg. ~2 sec for the 1000-line
    GW> rfc822.py.  But that's probably just because Emacs is a great
    GW> shaggy beast of an editor ("Eight(y) Megs and Constantly
    GW> Swapping", "Eventually Mallocs All Core Storage", you
    GW> know...).  I'm sure if I got a brain transplant so that I
    GW> could use vim, it would be different.)

Actually, I've found jed to be a very nice quick-in-quick-out
alternative to XEmacs (the one true Emacs :).  Its default bindings
and operation is close enough that I never notice the difference, for
simple quick editing jobs.

-Barry



From python@rcn.com  Sat Jun  1 19:34:46 2002
From: python@rcn.com (Raymond Hettinger)
Date: Sat, 1 Jun 2002 14:34:46 -0400
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com>
Message-ID: <000b01c2099b$023984a0$1bea7ad1@othello>

From: "M.-A. Lemburg" <mal@lemburg.com>
> While you're at it: could you also write up all these little
> "code cleanups" in some file so that Andrew can integrate them
> in the migration guide ?!

Will do!


Raymond Hettinger




From mal@lemburg.com  Sat Jun  1 20:48:30 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 01 Jun 2002 21:48:30 +0200
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello>
Message-ID: <3CF9250E.30002@lemburg.com>

Raymond Hettinger wrote:
> From: "M.-A. Lemburg" <mal@lemburg.com>
> 
>>While you're at it: could you also write up all these little
>>"code cleanups" in some file so that Andrew can integrate them
>>in the migration guide ?!
> 
> 
> Will do!

Great.

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From python@rcn.com  Sat Jun  1 20:50:50 2002
From: python@rcn.com (Raymond Hettinger)
Date: Sat, 1 Jun 2002 15:50:50 -0400
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello> <3CF8D8EB.60604@livinglogic.de>
Message-ID: <001f01c209a5$a27007a0$1bea7ad1@othello>

From: "Walter Dörwald" <walter@livinglogic.de>
> don't forget:
> import stat; os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime

Done!

BTW, it was surprising how many times the above has been coded as:
     os.stat("foo")[8]


Raymond Hettinger




From gward@python.net  Sat Jun  1 23:05:29 2002
From: gward@python.net (Greg Ward)
Date: Sat, 1 Jun 2002 18:05:29 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <oqk7pjeydk.fsf@titan.progiciels-bpi.ca>
References: <20020601134236.GA17691@gerg.ca> <oqk7pjeydk.fsf@titan.progiciels-bpi.ca>
Message-ID: <20020601220529.GA20025@gerg.ca>

On 01 June 2002, François Pinard said:
> One thing that I would love to have available in Python is a function able
> to wrap text using Knuth's filling algorithm.  GNU `fmt' does it, and it
> is _so_ better than dumb refilling, in my eyes at least, that I managed
> so Emacs own filling algorithm is short-circuited with an external call
> (I do not mind the small fraction of a second it takes).

Damn, I had no idea there was a body of computer science (however small)
devoted to the art of filling text.  Trust Knuth to be there first.  Do
you have a reference for this algorithm apart from GNU fmt's source
code?  Google'ing for "knuth text fill algorithm" was unhelpful, ditto
with s/fill/wrap/.

Anyways, despite being warned just today on the conceptual/philosophical
danger of classes whose names end in "-er" [1], I'm leaning towards a
TextWrapper class, so that everyone may impose their desires through
subclassing.  I'll start with my simple naive text-wrapping algorithm,
and then we can see who wants to contribute fancy/clever algorithms to
the pot.

> Also, is there some existing module in which `wraptext' would fit nicely?
> That might be better than creating a new module for not many functions.

Not if it grows to accommodate Optik/OptionParser, Mailman, regrtest,
etc.

        Greg

[1] objects should *be*, not *do*, and class names like HelpFormatter
    and TextWrapper are impositions of procedural abstraction onto
    OOP.  It's something to be aware of, but still a useful idiom
    (IMHO).
-- 
Greg Ward - Unix geek                                   gward@python.net
http://starship.python.net/~gward/
No problem is so formidable that you can't just walk away from it.



From tim.one@comcast.net  Sun Jun  2 00:19:08 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 01 Jun 2002 19:19:08 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <20020601220529.GA20025@gerg.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMDPJAA.tim.one@comcast.net>

[Greg Ward]
> Damn, I had no idea there was a body of computer science (however small)
> devoted to the art of filling text.

I take it you don't spend much time surveying the range of computer science
literature <wink>.

> Trust Knuth to be there first.  Do you have a reference for this
> algorithm apart from GNU fmt's source code?  Google'ing for "knuth text
> fill algorithm" was unhelpful, ditto with s/fill/wrap/.

Search for

    Knuth hyphenation

instead.  Three months later, the best advice you'll have read is to avoid
hyphenation entirely.  But then you're stuck fighting snaky little rivers of
vertical whitespace without the biggest gun in the arsenal.  Avoid right
justification entirely too, and let the whitespace fall where it may.  Doing
justification with fixed-width fonts is like juggling dirt anyway <wink>.

> Anyways, despite being warned just today on the conceptual/philosophical
> danger of classes whose names end in "-er" [1], I'm leaning towards a
> TextWrapper class, so that everyone may impose their desires through
> subclassing.

LOL!  Resolved, that the world would be a better place if all classes ended
with "-ist".




From goodger@users.sourceforge.net  Sun Jun  2 01:30:26 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Sat, 01 Jun 2002 20:30:26 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
Message-ID: <B91EDF61.23EB4%goodger@users.sourceforge.net>

Greg Ward wrote:
> [1] objects should *be*, not *do*, and class names like
>     HelpFormatter and TextWrapper are impositions of procedural
>     abstraction onto OOP.

I don't see anything dangerous about -er objects.  There are plenty
of objects in the real world that end in -er, all nouns: Programmer,
Bookkeeper, Publisher, Reader, Writer, Trucker, ad infinitum.
Plenty of precedent in the OOP world too: Debugger, Profiler,
Parser, TestLoader, SequenceMatcher, Visitor.  Objects combine state
(data) with behavior (processing); sometimes the state is most
important, sometimes the behavior.  Following that kind of
over-simplified "rule" may do more harm than good.

I'm glad you didn't fall for it.  ;-)

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/




From guido@python.org  Sun Jun  2 01:42:12 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 01 Jun 2002 20:42:12 -0400
Subject: [Python-Dev] PYC Magic
In-Reply-To: Your message of "Sat, 01 Jun 2002 12:24:25 EDT."
 <3CF8BCF9.5557.4C49011F@localhost>
References: <3CF8BCF9.5557.4C49011F@localhost>
Message-ID: <200206020042.g520gCL22848@pcp742651pcs.reston01.va.comcast.net>

[GMcM]
> I can't be 100% sure of the cause, but I *have*
> seen this (a bad .pyc file that had to be
> deleted before the module would import). The .pyc
> was woefully short but passed the magic
> test. I think this was 2.1, maybe 2.0. 

Hm...  Here's the code responsible for writing .pyc files:

static void
write_compiled_module(PyCodeObject *co, char *cpathname, long mtime)
{
	[...]
	PyMarshal_WriteLongToFile(pyc_magic, fp);
	/* First write a 0 for mtime */
	PyMarshal_WriteLongToFile(0L, fp);
	PyMarshal_WriteObjectToFile((PyObject *)co, fp);
	if (ferror(fp)) {
		/* Don't keep partial file */
		fclose(fp);
		(void) unlink(cpathname);
		return;
	}
	/* Now write the true mtime */
	fseek(fp, 4L, 0);
	PyMarshal_WriteLongToFile(mtime, fp);
	fflush(fp);
	fclose(fp);
	[...]
}

It's been like this for a very long time.  It always writes the magic
number, but withholds the mtime until it's done writing without
errors.  And if the mtime doesn't match, the .pyc is ignored (unless
there's no .py file...).

The only way this could write the correct mtime but not all the
marshalled data would be if ferror(fp) doesn't actually indicate an
error after a write failure due to a disk full condition.  And that's
a stdio quality of implementation issue.

I'm not sure if there's anything I could do differently to make this
more robust.  (I guess I could write the correct magic number at the
end too.)
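
(For what it's worth, a hedged Python sketch of the same trick -- write a
placeholder first, patch in the real value only after everything else has
been written, and throw the file away if anything goes wrong; the helper
name and record format here are made up purely for illustration:)

    import os, struct

    def write_record(path, magic, mtime, payload):
        f = open(path, 'wb')
        try:
            f.write(struct.pack('<l', magic))
            f.write(struct.pack('<l', 0))      # placeholder: not yet valid
            f.write(payload)
            f.flush()
            f.seek(4)
            f.write(struct.pack('<l', mtime))  # patch in the real mtime last
            f.close()
        except IOError:
            f.close()
            os.unlink(path)                    # don't keep a partial file
            raise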

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip@pobox.com  Sun Jun  2 04:43:14 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 1 Jun 2002 22:43:14 -0500
Subject: [Python-Dev] Where to put wrap_text()?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKELIPJAA.tim.one@comcast.net>
References: <20020601134236.GA17691@gerg.ca>
 <LNBBLJKPBEHFEDALKOLCKELIPJAA.tim.one@comcast.net>
Message-ID: <15609.37970.691340.638319@12-248-41-177.client.attbi.com>

    Tim> Note that regrtest.py also has a wrapper:

Me too...

    def wrap(s, col=74, startcol=0, hangindent=0):
        """Insert newlines into 's' so it doesn't extend past 'col'.

        All lines are indented to 'startcol'.  The indentation of the first 
        line is adjusted further by hangindent.
        """

I guess everybody has one of these laying about...  I'll be happy to dump
mine once something mostly equivalent is available.  I love to throw out
code.

Skip



From smurf@noris.de  Sun Jun  2 05:04:59 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Sun, 2 Jun 2002 06:04:59 +0200
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg>
 import <mod>"
Message-ID: <p05111706b91f48617628@[192.109.102.36]>

>          module5.py:
>              from package import module6   # absolute import
>
>          module6.py:
>              from package import module5
>  [...]
>      ImportError: cannot import name module5
>
>  Is this behavior expected?  Or is it a bug?

The problem is that importing with from consists of two steps:
- load the module
- add the imported names to the local namespace

Since this addition is by reference to the actual object and not to 
the symbol's name in the other module, a concept which Python doesn't 
have (use Perl if you want this...), your recursive import doesn't 
work.

The solution would be:
	import package.module6 as module6

which should have the same effect.
-- 
Matthias Urlichs



From walter@livinglogic.de  Sun Jun  2 10:27:28 2002
From: walter@livinglogic.de (Walter Dörwald)
Date: Sun, 02 Jun 2002 11:27:28 +0200
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello>
Message-ID: <3CF9E500.9030103@livinglogic.de>

Raymond Hettinger wrote:
> From: "M.-A. Lemburg" <mal@lemburg.com>
> 
>>While you're at it: could you also write up all these little
>>"code cleanups" in some file so that Andrew can integrate them
>>in the migration guide ?!
> 
> 
> Will do!

There's another one:
"foobar"[:3]=="foo" --> "foobar".startswith("foo")

Bye,
    Walter Dörwald





From skip@mojam.com  Sun Jun  2 13:00:18 2002
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 2 Jun 2002 07:00:18 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200206021200.g52C0I721199@12-248-41-177.client.attbi.com>

Bug/Patch Summary
-----------------

263 open / 2539 total bugs (+8)
136 open / 1532 total patches (+8)

New Bugs
--------

crash in shelve module (2001-03-13)
	http://python.org/sf/408271
UTF-16 BOM handling counterintuitive (2002-05-13)
	http://python.org/sf/555360
import user doesn't work with CGIs (2002-05-14)
	http://python.org/sf/555779
removing extensions without admin rights (2002-05-14)
	http://python.org/sf/555810
installing extension w/o admin rights (2002-05-14)
	http://python.org/sf/555812
Flawed fcntl.ioctl implementation. (2002-05-14)
	http://python.org/sf/555817
Expat improperly described in setup.py (2002-05-15)
	http://python.org/sf/556370
illegal use of malloc/free (2002-05-16)
	http://python.org/sf/557028
TclError is a str should be an Exception (2002-05-17)
	http://python.org/sf/557436
netrc module can't handle all passwords (2002-05-18)
	http://python.org/sf/557704
faqwiz.py could do email obfuscation (2002-05-19)
	http://python.org/sf/558072
Compile error _sre.c on Cray T3E (2002-05-19)
	http://python.org/sf/558153
Shutdown of IDLE blows up (2002-05-19)
	http://python.org/sf/558166
rfc822.Message.get() incompatibility (2002-05-20)
	http://python.org/sf/558179
unittest.TestResult documentation (2002-05-20)
	http://python.org/sf/558278
\verbatiminput and name duplication (2002-05-20)
	http://python.org/sf/558279
DL_EXPORT on VC7 broken (2002-05-20)
	http://python.org/sf/558488
HTTPSConnection memory leakage (2002-05-22)
	http://python.org/sf/559117
imaplib.IMAP4.open() typo (2002-05-23)
	http://python.org/sf/559884
inconsistent behavior of __getslice__ (2002-05-24)
	http://python.org/sf/560064
PyType_IsSubtype can segfault (2002-05-24)
	http://python.org/sf/560215
Add docs for 'string' (2002-05-24)
	http://python.org/sf/560286
foo() doesn't use __getattribute__ (2002-05-25)
	http://python.org/sf/560438
deepcopy can't handle custom metaclasses (2002-05-26)
	http://python.org/sf/560794
Maximum recursion limit exceeded (2002-05-27)
	http://python.org/sf/561047
ConfigParser has_option case sensitive (2002-05-29)
	http://python.org/sf/561822
Assertion with very long lists (2002-05-29)
	http://python.org/sf/561858
test_signal.py fails on FreeBSD-4-stable (2002-05-29)
	http://python.org/sf/562188
build problems on DEC Unix 4.0f (2002-05-30)
	http://python.org/sf/562585
xmlrpclib.Binary.data undocumented (2002-05-31)
	http://python.org/sf/562878
Module can be used as a base class (2002-05-31)
	http://python.org/sf/563060
Clarify documentation for inspect (2002-06-01)
	http://python.org/sf/563273
Fuzziness in inspect module documentatio (2002-06-01)
	http://python.org/sf/563298
Heap corruption in debug (2002-06-01)
	http://python.org/sf/563303
Getting traceback in embedded python. (2002-06-01)
	http://python.org/sf/563338
Add separator argument to readline() (2002-06-02)
	http://python.org/sf/563491

New Patches
-----------

timeout socket implementation (2002-05-12)
	http://python.org/sf/555085
Mutable object change flag (2002-05-12)
	http://python.org/sf/555251
Cygwin AH_BOTTOM cleanup patch (2002-05-14)
	http://python.org/sf/555929
OSX build -- make python.app (2002-05-18)
	http://python.org/sf/557719
Ebcdic compliancy in stringobject source (2002-05-19)
	http://python.org/sf/557946
cmd.py: add instance-specific stdin/out (2002-05-20)
	http://python.org/sf/558544
SocketServer: don't flush closed wfile (2002-05-20)
	http://python.org/sf/558547
GC: untrack simple objects (2002-05-21)
	http://python.org/sf/558745
Use builtin boolean if present (2002-05-22)
	http://python.org/sf/559288
Expose xrange type in builtins (2002-05-23)
	http://python.org/sf/559833
isinstance error message (2002-05-24)
	http://python.org/sf/560250
os.uname() on Darwin space in machine (2002-05-24)
	http://python.org/sf/560311
Karatsuba multiplication (2002-05-24)
	http://python.org/sf/560379
Micro optimizations (2002-05-27)
	http://python.org/sf/561244
webchecker chokes at charsets. (2002-05-28)
	http://python.org/sf/561478
README additions for Cray T3E (2002-05-28)
	http://python.org/sf/561724
Installation database patch (2002-05-29)
	http://python.org/sf/562100
Getting rid of string, types and stat (2002-05-30)
	http://python.org/sf/562373
Prevent duplicates in readline history (2002-05-30)
	http://python.org/sf/562492
Add isxxx() methods to string objects (2002-05-30)
	http://python.org/sf/562501
First patch: start describing types... (2002-05-30)
	http://python.org/sf/562529
Remove UserDict from cookie.py (2002-05-31)
	http://python.org/sf/562987

Closed Bugs
-----------

pdb can only step when at botframe (PR#4) (2000-07-31)
	http://python.org/sf/210682
Copy from stdout after crash (2001-11-29)
	http://python.org/sf/487297
detail: tp_basicsize and tp_itemsize (2001-12-12)
	http://python.org/sf/492349
Finder Tool Move not working on MOSX (2001-12-15)
	http://python.org/sf/493826
plugin project generation has problems (2001-12-18)
	http://python.org/sf/494572
Inaccuracy(?) in tutorial section 9.2 (2002-01-07)
	http://python.org/sf/500539
random.cunifvariate() incorrect? (2002-01-21)
	http://python.org/sf/506647
random.gammavariate hosed (2002-03-07)
	http://python.org/sf/527139
__reduce__ does not work as documented (2002-03-21)
	http://python.org/sf/533291
rexec: potential security hole (2002-03-22)
	http://python.org/sf/533625
Running MacPython as non-priv user may fail (2002-03-23)
	http://python.org/sf/534158
Compile fails on posixmodule.c (2002-04-10)
	http://python.org/sf/542003
Distutils readme outdated (2002-04-12)
	http://python.org/sf/542912
bug? floor divison on complex (2002-04-13)
	http://python.org/sf/543387
base64 newlines - documentation (again) (2002-04-22)
	http://python.org/sf/547037
cStringIO mangles Unicode (2002-04-23)
	http://python.org/sf/547537
Missing or wrong index entries (2002-04-25)
	http://python.org/sf/548693
Poor error message for float() (2002-05-02)
	http://python.org/sf/551673
PDF won't print (2002-05-03)
	http://python.org/sf/551828
"./configure" crashes (2002-05-06)
	http://python.org/sf/553000
cPickle dies on short reads (2002-05-07)
	http://python.org/sf/553512
bug in telnetlib- 'opt' instead of 'c' (2002-05-09)
	http://python.org/sf/554073
test_fcntl fails on OpenBSD 3.0 (2002-05-10)
	http://python.org/sf/554663
--disable-unicode builds horked (2002-05-11)
	http://python.org/sf/554912
rfc822.Message.getaddrlist broken (2002-05-11)
	http://python.org/sf/555035

Closed Patches
--------------

Reminder: 2.3 should check tp_compare (2001-10-18)
	http://python.org/sf/472523
foreign-platform newline support (2001-10-31)
	http://python.org/sf/476814
Cygwin setup.py import workaround patch (2001-12-10)
	http://python.org/sf/491107
Fix webbrowser running on MachoPython (2002-01-10)
	http://python.org/sf/502205
fix random.gammavariate bug #527139 (2002-03-13)
	http://python.org/sf/529408
force gzip to open files with 'b' (2002-03-28)
	http://python.org/sf/536278
context sensitive help/keyword search (2002-04-08)
	http://python.org/sf/541031
error about string formatting rewording? (2002-04-26)
	http://python.org/sf/549187
Unittest for base64 (2002-04-28)
	http://python.org/sf/550002
__doc__ strings of builtin types (2002-04-29)
	http://python.org/sf/550290
GC collection frequency bug (2002-05-03)
	http://python.org/sf/551915
Add degrees() & radians() to math module (2002-05-05)
	http://python.org/sf/552452
Cygwin Makefile.pre.in vestige patch (2002-05-08)
	http://python.org/sf/553678
OpenBSD fixes for Python 2.2 (2002-05-10)
	http://python.org/sf/554719



From akuchlin@mems-exchange.org  Sun Jun  2 13:45:17 2002
From: akuchlin@mems-exchange.org (akuchlin@mems-exchange.org)
Date: Sun, 2 Jun 2002 08:45:17 -0400
Subject: [Python-Dev] Where to put wrap_text()?
In-Reply-To: <15608.61035.133229.77125@anthem.wooz.org>
References: <20020601134236.GA17691@gerg.ca> <15608.61035.133229.77125@anthem.wooz.org>
Message-ID: <20020602124517.GA31389@mems-exchange.org>

On Sat, Jun 01, 2002 at 11:55:23AM -0400, Barry A. Warsaw wrote:
>You might consider a text package with submodules for various wrapping
>algorithms.  The text package might even grow other functionality
>later too.

+1. 

--amk



From pinard@iro.umontreal.ca  Sun Jun  2 14:09:57 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 02 Jun 2002 09:09:57 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <20020601220529.GA20025@gerg.ca>
References: <20020601134236.GA17691@gerg.ca>
 <oqk7pjeydk.fsf@titan.progiciels-bpi.ca>
 <20020601220529.GA20025@gerg.ca>
Message-ID: <oqu1olesii.fsf@titan.progiciels-bpi.ca>

[Greg Ward]

> Do you have a reference for this algorithm apart from GNU fmt's source
> code?

Not handy, I'm afraid.  I heard about it, and others even more capable, many
years ago.  If I remember correctly, Knuth's algorithm works by moving line
breaks around and optimising a global cost function through dynamic
programming, giving more points, say, when punctuation coincides with the
end of a line, and removing points when a single-letter word lands at the
end of a line, and such things.  So lines are not guaranteed to be as filled
as possible, but the overall appearance of the paragraph gets better,
sometimes much better.
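
(A tiny, hedged sketch of that dynamic-programming idea -- nothing like the
real Knuth algorithm or GNU `fmt', just the "minimise a global cost over all
line breaks" skeleton, with squared leftover space as the cost:)

    def fill(words, width):
        # cost[i] is the best achievable badness for words[i:];
        # break_at[i] is where the first line of that best solution ends.
        n = len(words)
        INFINITY = 1e30
        cost = [0.0] * (n + 1)
        break_at = [n] * (n + 1)
        for i in range(n - 1, -1, -1):
            cost[i] = INFINITY
            length = -1
            for j in range(i + 1, n + 1):
                length = length + len(words[j - 1]) + 1
                if length > width and j > i + 1:
                    break
                badness = 0.0
                if j < n:                       # the last line costs nothing
                    badness = (width - length) ** 2
                if badness + cost[j] < cost[i]:
                    cost[i] = badness + cost[j]
                    break_at[i] = j
        lines, i = [], 0
        while i < n:
            lines.append(' '.join(words[i:break_at[i]]))
            i = break_at[i]
        return '\n'.join(lines)

    print(fill("The quick brown fox jumps over the lazy dog".split(), 15))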

I'm Cc:ing Ross Paterson, who wrote GNU `fmt', in the hope that he can shed
some light on references, or otherwise.

Some filling algorithms used by typographers (or so I heard) are even
careful about dismantling vertical or diagonal (aliased) white lines which
sometimes build up across paragraphs as a side effect of dumber filling.

> I'm leaning towards a TextWrapper class, so that everyone may impose
> their desires through subclassing.

Distutils experience speaking here? :-)

By the way, I would like it if the module was not named `text'.  I already
use `text' all over my programs as a common variable name, as a way to avoid
using `string' as a variable name, for obvious reasons.  Granted, `string'
is progressively becoming available again :-).  Maybe Python should try not
to name modules after likely use-everywhere local variable names.

> Not if it grows to accommodate Optik/OptionParser, Mailman, regrtest,
> etc.

In a few places in my code, I have unusual wrapping/filling needs, and
wonder if they could all fit in a generic scheme.  An interesting question
and exercise, surely.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From skip@pobox.com  Sun Jun  2 14:10:15 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 2 Jun 2002 08:10:15 -0500
Subject: [Python-Dev] "max recursion limit exceeded" canned response?
Message-ID: <15610.6455.96035.742110@12-248-41-177.client.attbi.com>

How would we go about adding a canned response to the commonly submitted
"max recursion limit exceeded" bug report?  I think Tim's discussion of re
design patterns to use in

    http://python.org/sf/493252

(or something like it) probably belongs in the re module docs since this is
such a common stumbling block for people used to using ".*?".  I'll work
something up for the Examples section and Jake's hockey game this morning.

Skip




From guido@python.org  Sun Jun  2 14:22:53 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 02 Jun 2002 09:22:53 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: Your message of "02 Jun 2002 09:09:57 EDT."
 <oqu1olesii.fsf@titan.progiciels-bpi.ca>
References: <20020601134236.GA17691@gerg.ca> <oqk7pjeydk.fsf@titan.progiciels-bpi.ca> <20020601220529.GA20025@gerg.ca>
 <oqu1olesii.fsf@titan.progiciels-bpi.ca>
Message-ID: <200206021322.g52DMr531014@pcp742651pcs.reston01.va.comcast.net>

> > Do you have a reference for this algorithm apart from GNU fmt's
> > source code?

Can we focus on getting the module/package structure and a basic
algorithm first?  It's fine to design the structure for easy
extensibility with other algorithms, but implementing Knuth's
algorithm seems hopelessly out of scope.  Even Emacs' fill-paragraph
is too fancy-schmancy for my taste (for inclusion in the Python standard
library).  Simply breaking lines at a certain limit is all that's
needed.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sun Jun  2 14:25:08 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 02 Jun 2002 09:25:08 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: Your message of "Sun, 02 Jun 2002 06:04:59 +0200."
 <p05111706b91f48617628@[192.109.102.36]>
References: <p05111706b91f48617628@[192.109.102.36]>
Message-ID: <200206021325.g52DP8031046@pcp742651pcs.reston01.va.comcast.net>

> >          module5.py:
> >              from package import module6   # absolute import
> >
> >          module6.py:
> >              from package import module5
> >  [...]
> >      ImportError: cannot import name module5
> >
> >  Is this behavior expected?  Or is it a bug?
> 
> The problem is that importing with from consists of two steps:
> - load the module
> - add the imported names to the local namespace

Good explanation!  This means it's an unavoidable problem.  Maybe you
can fix the FAQ entry?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From goodger@users.sourceforge.net  Sun Jun  2 15:11:31 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Sun, 02 Jun 2002 10:11:31 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg>
 import <mod>"
In-Reply-To: <p05111706b91f48617628@[192.109.102.36]>
Message-ID: <B91F9FD2.23EB8%goodger@users.sourceforge.net>

I wrote:
>> module5.py:
>> from package import module6   # absolute import
>> 
>> module6.py:
>> from package import module5
>> [...]
>> ImportError: cannot import name module5
>> 
>> Is this behavior expected?  Or is it a bug?

Matthias Urlichs replied:
> The problem is that importing with from consists of two steps:
> - load the module
> - add the imported names to the local namespace
> 
> Since this addition is by reference to the actual object and not to
> the symbol's name in the other module, a concept which Python doesn't
> have (use Perl if you want this...), your recursive import doesn't
> work.
> 
> The solution would be:
>  import package.module6 as module6
> 
> which should have the same effect.

Perhaps I'm just dense, or perhaps it's because of my choice of names
in my example, but I don't understand the explanation.  Could you be
more specific, perhaps with a concrete example?  Despite Guido's
"Good explanation!", the above text in the FAQ entry wouldn't
eliminate my confusion.  I suspect it's a good explanation for those
that already understand what's going on behind the scenes.

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/




From David Abrahams" <david.abrahams@rcn.com  Sun Jun  2 15:36:12 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Sun, 2 Jun 2002 10:36:12 -0400
Subject: [Python-Dev] Numeric conversions
Message-ID: <0b3501c20a43$394d77a0$6601a8c0@boostconsulting.com>

The following small program is giving me some unexpected results with
Python 2.2.1:

class Int(object):
    def __int__(self): return 10

class Float(object):
    def __float__(self): return 10.0

class Long(object):
    def __long__(self): return 10L

class Complex(object):
    def __complex__(self): return (10+0j)

def attempt(f,arg):
    try:
        return f(arg)
    except Exception,e:
        return str(e.__class__.__name__)+': '+str(e)

for f in int,float,long,complex:
    for t in Int,Float,Long,Complex:
        print f.__name__ + '(' + t.__name__ + ')\t\t',
        print attempt(f,t())

----- results ------

int(Int)                10
int(Float)              TypeError: object can't be converted to int
int(Long)               TypeError: object can't be converted to int
int(Complex)            TypeError: object can't be converted to int

*** OK, int() seems to work as expected

float(Int)              TypeError: float() needs a string argument
float(Float)            10.0
float(Long)             TypeError: float() needs a string argument
float(Complex)          TypeError: float() needs a string argument

*** float() seems to work, but what's with the error message about strings?

long(Int)               TypeError: object can't be converted to long
long(Float)             TypeError: object can't be converted to long
long(Long)              10
long(Complex)           TypeError: object can't be converted to long

**** OK, long seems to work as expected

complex(Int)            TypeError: complex() arg can't be converted to complex
complex(Float)          (10+0j)
complex(Long)           TypeError: complex() arg can't be converted to complex
complex(Complex)        TypeError: complex() arg can't be converted to complex

**** I can understand complex() handling Float implicitly, but only if it
also handles Complex! And if it does handle Float implicitly, shouldn't all
of these handle everything?

Comments?

-Dave

+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+




From David Abrahams" <david.abrahams@rcn.com  Sun Jun  2 15:38:52 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Sun, 2 Jun 2002 10:38:52 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg>import <mod>"
References: <B91F9FD2.23EB8%goodger@users.sourceforge.net>
Message-ID: <0b3701c20a43$39a61ef0$6601a8c0@boostconsulting.com>

From: "David Goodger" <goodger@users.sourceforge.net>

> Perhaps I'm just dense, or perhaps it's because of my choice of names
> in my example, but I don't understand the explanation.  Could you be
> more specific, perhaps with a concrete example?  Despite Guido's
> "Good explanation!", the above text in the FAQ entry wouldn't
> eliminate my confusion.  

Nor mine, FWIW.

-D





From smurf@noris.de  Sun Jun  2 15:44:07 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Sun, 2 Jun 2002 16:44:07 +0200
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: <B91F9FD2.23EB8%goodger@users.sourceforge.net>; from goodger@users.sourceforge.net on Sun, Jun 02, 2002 at 10:11:31AM -0400
References: <p05111706b91f48617628@[192.109.102.36]> <B91F9FD2.23EB8%goodger@users.sourceforge.net>
Message-ID: <20020602164407.E17316@noris.de>

Hi,

David Goodger:
> Perhaps I'm just dense, or perhaps it's because of my choice of names
> in my example, but I don't understand the explanation.  Could you be
> more specific, perhaps with a concrete example? 

foo.py: from bar import one
bar.py: from foo import two
main.py: import foo

So what happens is, more or less:

main imports foo
  Empty globals for foo are created
  foo is compiled
  foo loads bar
    Empty globals for bar are created
    bar is compiled
    bar loads foo (which is a no-op since there already is a module named foo)
    bar.two = foo.two
      ... which fails, because the compiler isn't done with foo yet and the
      global symbol dict for foo is still empty.

> eliminate my confusion.  I suspect it's a good explanation for those
> that already understand what's going on behind the scenes.
> 
_If_ you can change foo.py so that it reads:

two = 2
from bar import one

i.e., initialize the exports first and load afterwards, the test
would work. However, the following will NOT work:

two = None
from bar import one
two = do_something(with(bar.one))

for (hopefully) obvious reasons.

-- 
Matthias Urlichs     |     noris network AG     |     http://smurf.noris.de/



From mwh@python.net  Sun Jun  2 15:45:13 2002
From: mwh@python.net (Michael Hudson)
Date: 02 Jun 2002 15:45:13 +0100
Subject: [Python-Dev] PYC Magic
In-Reply-To: "Gordon McMillan"'s message of "Sat, 1 Jun 2002 12:24:25 -0400"
References: <3CF8BCF9.5557.4C49011F@localhost>
Message-ID: <2m1ybpu4cm.fsf@starship.python.net>

"Gordon McMillan" <gmcm@hypernet.com> writes:

> On 1 Jun 2002 at 10:15, Neal Norwitz wrote:
> 
> > Guido van Rossum wrote:
> 
> > > What do you mean by "can't be written to disk"? 
> 
> > Disk full was one condition.  
> 
> I can't be 100% sure of the cause, but I *have*
> seen this (a bad .pyc file that had to be
> deleted before the module would import). The .pyc
> was woefully short but passed the magic
> test. I think this was 2.1, maybe 2.0. 

Someone on comp.lang.python reported getting corrupt .pycs by having
modules in a user-writeable directory being accessed more-or-less
simultaneously by different Python versions.  I'm not sure what could
be done about that.

Cheers,
M.

-- 
  First of all, email me your AOL password as a security measure. You
  may find that won't be able to connect to the 'net for a while. This
  is normal. The next thing to do is turn your computer upside down
  and shake it to reboot it.                     -- Darren Tucker, asr



From mwh@python.net  Sun Jun  2 16:13:24 2002
From: mwh@python.net (Michael Hudson)
Date: 02 Jun 2002 16:13:24 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/test test_signal.py,1.9,1.10
In-Reply-To: Michael Hudson's message of "30 May 2002 17:05:04 +0100"
References: <Pine.OS2.4.32.0205292159370.1634-100000@tenring.andymac.org> <2mit56ros4.fsf@starship.python.net> <2msn49hp3i.fsf@starship.python.net> <200205301228.g4UCS9S07601@pcp742651pcs.reston01.va.comcast.net> <2mhekphjdi.fsf@starship.python.net> <200205301540.g4UFefk24115@odiug.zope.com> <2melfthb9r.fsf@starship.python.net>
Message-ID: <2m3cw5zpbf.fsf@starship.python.net>

Michael Hudson <mwh@python.net> writes:

> Now what do I do?  Back my patch out?  Not expose the functions on
> BSD?  It works on Linux...

But nowhere else, it would seem, at least not when python is built
threaded.  Darwin fails in a similar manner to FreeBSD, test_signal
hangs on Solaris, but only if it's run as part of the test suite --
it runs fine if you just run it alone.  Argh!  I really am starting to
think that I should back out my patch and distribute my code as an
extension module.  Opinions?

Cheers,
M.

-- 
  But since your post didn't lay out your assumptions, your goals,
  or how you view language characteristics as fitting in with 
  either, you're not a *natural* candidate for embracing Design by 
  Contract <0.6 wink>.    -- Tim Peters, giving Eiffel adoption advice



From gisle@ActiveState.com  Sun Jun  2 16:40:49 2002
From: gisle@ActiveState.com (Gisle Aas)
Date: 02 Jun 2002 08:40:49 -0700
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: <p05111706b91f48617628@[192.109.102.36]>
References: <p05111706b91f48617628@[192.109.102.36]>
Message-ID: <lrit51vgce.fsf@caliper.activestate.com>

Matthias Urlichs <smurf@noris.de> writes:

> Since this addition is by reference to the actual object and not to
> the symbol's name in the other module, a concept which Python doesn't
> have (use Perl if you want this...)

Perl doesn't add references to names.  It imports direct reference as
well.  The difference is that perl will create the named object in the
exporting package when it is imported, if the exporting package's init
code has not executed yet.

In Perl this works because we at import time know if we are importing
a variable (and what kind) or a function, and later assignments to
variables or redefinitions of functions mutate the object in-place.

Regards,
Gisle Aas



From barry@zope.com  Sun Jun  2 16:54:55 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sun, 2 Jun 2002 11:54:55 -0400
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello>
 <3CF8D960.8040402@livinglogic.de>
 <3CF90622.6000106@lemburg.com>
 <000b01c2099b$023984a0$1bea7ad1@othello>
 <3CF9E500.9030103@livinglogic.de>
Message-ID: <15610.16335.191744.30203@anthem.wooz.org>

What about "foo = foo + 1" => "foo += 1"?
-Barry



From aahz@pythoncraft.com  Sun Jun  2 16:56:33 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sun, 2 Jun 2002 11:56:33 -0400
Subject: [Python-Dev] Numeric conversions
In-Reply-To: <0b3501c20a43$394d77a0$6601a8c0@boostconsulting.com>
References: <0b3501c20a43$394d77a0$6601a8c0@boostconsulting.com>
Message-ID: <20020602155633.GA3139@panix.com>

On Sun, Jun 02, 2002, David Abrahams wrote:
>
> The following small program is giving me some unexpected results with
> Python 2.2.1:
> 
> class Int(object):
>     def __int__(self): return 10
> 
> class Float(object):
>     def __float__(self): return 10.0
> 
> ----- results ------
> 
> int(Int)                10
> int(Float)              TypeError: object can't be converted to int

Um.  I'm confuzzled.  Float doesn't have an __int__ method; why do you
expect it to work?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In the end, outside of spy agencies, people are far too trusting and
willing to help."  --Ira Winkler



From David Abrahams" <david.abrahams@rcn.com  Sun Jun  2 16:58:51 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Sun, 2 Jun 2002 11:58:51 -0400
Subject: [Python-Dev] Numeric conversions
References: <0b3501c20a43$394d77a0$6601a8c0@boostconsulting.com> <20020602155633.GA3139@panix.com>
Message-ID: <0b5701c20a4e$6648f760$6601a8c0@boostconsulting.com>

From: "Aahz" <aahz@pythoncraft.com>


> On Sun, Jun 02, 2002, David Abrahams wrote:
> >
> > The following small program is giving me some unexpected results with
> > Python 2.2.1:
> >
> > class Int(object):
> >     def __int__(self): return 10
> >
> > class Float(object):
> >     def __float__(self): return 10.0
> >
> > ----- results ------
> >
> > int(Int)                10
> > int(Float)              TypeError: object can't be converted to int
>
> Um.  I'm confuzzled.  Float doesn't have an __int__ method; why do you
> expect it to work?

I don't.  As I wrote below that:

*** OK, int() seems to work as expected

Each item beginning with '***' is meant to refer to the group of 4 lines
above.





From barry@zope.com  Sun Jun  2 17:00:27 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sun, 2 Jun 2002 12:00:27 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
References: <20020601134236.GA17691@gerg.ca>
 <oqk7pjeydk.fsf@titan.progiciels-bpi.ca>
 <20020601220529.GA20025@gerg.ca>
 <oqu1olesii.fsf@titan.progiciels-bpi.ca>
Message-ID: <15610.16667.463562.261068@anthem.wooz.org>

>>>>> "FP" =3D=3D Fran=E7ois Pinard <pinard@iro.umontreal.ca> writes:

    FP> By the way, I would like if the module was not named `text'.

Well, since Greg's writing it <wink>, I think "textutils" is a natural
package name. :)

-Barry



From smurf@noris.de  Sun Jun  2 17:05:37 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Sun, 2 Jun 2002 18:05:37 +0200
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: <lrit51vgce.fsf@caliper.activestate.com>; from gisle@ActiveState.com on Sun, Jun 02, 2002 at 08:40:49AM -0700
References: <p05111706b91f48617628@[192.109.102.36]> <lrit51vgce.fsf@caliper.activestate.com>
Message-ID: <20020602180537.G17316@noris.de>

Hi,

Gisle Aas:
> Perl doesn't add references to names.  It imports direct reference as
> well.

What I meant to say was: Perl shares the actual symbol table slot when you
import something; so a later reassignment to the variable in question will
affect every module.

Python doesn't have that additional indirection.

> In Perl this works because we at import time know if we are importing
> a variable (and what kind) or a function, and later assignments to
> variables or redefinitions of functions mutate the object in-place.
> 
... which is essentially a different way to state the same thing. ;-)

-- 
Matthias Urlichs     |     noris network AG     |     http://smurf.noris.de/



From goodger@users.sourceforge.net  Sun Jun  2 17:19:39 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Sun, 02 Jun 2002 12:19:39 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg>
 import <mod>"
In-Reply-To: <20020602164407.E17316@noris.de>
Message-ID: <B91FBDDA.23EED%goodger@users.sourceforge.net>

Matthias, thank you for your explanation.  I was operating under the
assumption that the mechanism behind "from package import module" was
somehow different from that behind "from module import name", because there
is no name "module" inside package/__init__.py.  A little experimentation
confirmed that it was a mistaken assumption.  Now all is clear.

The note in section 6.12 ("The import statement") of the Language Reference,
"XXX Can't be bothered to spell this out right now", has always bothered me.
I will endeavour to flesh it out for release 2.3.  Perhaps Guido's package
support essay (http://www.python.org/doc/essays/packages.html), edited,
should form a new section or appendix.  Ideas anyone?

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/




From pinard@iro.umontreal.ca  Sun Jun  2 17:35:42 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 02 Jun 2002 12:35:42 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <200206021322.g52DMr531014@pcp742651pcs.reston01.va.comcast.net>
References: <20020601134236.GA17691@gerg.ca>
 <oqk7pjeydk.fsf@titan.progiciels-bpi.ca>
 <20020601220529.GA20025@gerg.ca>
 <oqu1olesii.fsf@titan.progiciels-bpi.ca>
 <200206021322.g52DMr531014@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <oqlm9xeizl.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> > > Do you have a reference for this algorithm apart from GNU fmt's
> > > source code?

> Can we focus on getting the module/package structure and a basic
> algorithm first?  It's fine to design the structure for easy
> extensibility with other algorithms, but implementing Knuth's
> algorithm seems hopelessly out of scope.

There is no urgency about Knuth's algorithm, of course.  However, if I
mentioned it, it was as an invitation for the package to be designed with
an open mind about extensibility.  And pondering various avenues usually
helps in opening the mind.

What should we read in your "hopelessly out of scope" comment, above?
Do you mean you would object beforehand that Python offers it?

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From gisle@ActiveState.com  Sun Jun  2 17:44:25 2002
From: gisle@ActiveState.com (Gisle Aas)
Date: 02 Jun 2002 09:44:25 -0700
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: <20020602180537.G17316@noris.de>
References: <p05111706b91f48617628@[192.109.102.36]>
 <lrit51vgce.fsf@caliper.activestate.com>
 <20020602180537.G17316@noris.de>
Message-ID: <lrbsatd40m.fsf@caliper.activestate.com>

"Matthias Urlichs" <smurf@noris.de> writes:

> Gisle Aas:
> > Perl doesn't add references to names.  It imports direct reference as
> > well.
> 
> What I meant to say was: Perl shares the actual symbol table slot when you
> import something;

Wrong.  If you do:

   package Bar;
   use Foo qw(foo);

Then you end up with \&Bar::foo and \&Foo::foo pointing to the same
function object, but the symbol table slots are independent.  It is
exactly the same situation you would have in Python with two namespace
dicts pointing to the same function object.

But you can achieve sharing by explicit import of the symbol (aka
glob) using something like:

   use Foo qw(*foo);

Regards,
Gisle Aas



From mal@lemburg.com  Sun Jun  2 18:22:18 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 02 Jun 2002 19:22:18 +0200
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello>	<3CF8D960.8040402@livinglogic.de>	<3CF90622.6000106@lemburg.com>	<000b01c2099b$023984a0$1bea7ad1@othello>	<3CF9E500.9030103@livinglogic.de> <15610.16335.191744.30203@anthem.wooz.org>
Message-ID: <3CFA544A.6050701@lemburg.com>

Raymond,

while you're documenting the various changes, please include
a Python version number with all of them, so that the migration
guide can use this information as well.

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From smurf@noris.de  Sun Jun  2 18:45:06 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Sun, 2 Jun 2002 19:45:06 +0200
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: <lrbsatd40m.fsf@caliper.activestate.com>; from gisle@ActiveState.com on Sun, Jun 02, 2002 at 09:44:25AM -0700
References: <p05111706b91f48617628@[192.109.102.36]> <lrit51vgce.fsf@caliper.activestate.com> <20020602180537.G17316@noris.de> <lrbsatd40m.fsf@caliper.activestate.com>
Message-ID: <20020602194506.H17316@noris.de>

Hi,

Gisle Aas:
> > What I meant to say was: Perl shares the actual symbol table slot when you
> > import something;
> 
> Wrong.  If you do:
> 
>    package Bar;
>    use Foo qw(foo);
> 
> Then you end up with \&Bar::foo and \&Foo:foo pointing to the same
> function object, but the symbol table slots are independent.

Right, actually; we're just miscommunicating.

Let's state it differently: the situation is more like "use Foo qw($foo)"
(you can't assign to &Foo::foo). 

After the "use", \$Bar::foo and \$Foo:foo point to the same scalar
variable, thus $Foo::foo and $Bar::foo are the same variable and no longer
independent (unless a different scalar or glob reference is stored in
to *Foo::foo, but that's too much magic for a FAQ entry).

In Python, Bar.foo gets set to the contents of Foo.foo when the import
statement is processed, but the two variables are otherwise independent.
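
A minimal sketch of that behaviour (the module objects are built by hand
here, purely for illustration):

    import types

    Foo = types.ModuleType('Foo')
    Foo.foo = lambda: 'original'

    Bar = types.ModuleType('Bar')
    Bar.foo = Foo.foo                 # what "from Foo import foo" does in Bar

    Foo.foo = lambda: 'rebound'       # rebinding Foo.foo afterwards...
    assert Bar.foo() == 'original'    # ...leaves Bar.foo untouched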

-- 
Matthias Urlichs     |     noris network AG     |     http://smurf.noris.de/



From guido@python.org  Sun Jun  2 21:56:39 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 02 Jun 2002 16:56:39 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib formatter.py,1.20,1.21 ftplib.py,1.69,1.70 gettext.py,1.13,1.14 hmac.py,1.5,1.6
In-Reply-To: Your message of "Sun, 02 Jun 2002 16:18:25 +1100."
 <Pine.OS2.4.32.0206021610110.1651-100000@tenring.andymac.org>
References: <Pine.OS2.4.32.0206021610110.1651-100000@tenring.andymac.org>
Message-ID: <200206022056.g52KudD31601@pcp742651pcs.reston01.va.comcast.net>

[Andrew MacIntyre commenting on Raymond H's changes from "if x" to
"if x is not None"]
> You have in fact changed the semantics of this test with your change.
> 
> In the case where file = '', the original would fall through to the
> elif, whereas with your change it won't.

The question is whether that's an important change or not.  Raymond
has changed the semantics in every case where he made this particular
change.  Usually that's fine.  Occasionally it's not.
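
To spell out the difference for the quoted file = '' case:

    file = ''                  # the case Andrew points out

    assert not file            # old test "if file:" is false, so control
                               # falls through to the elif
    assert file is not None    # new test is true, so the elif is now skipped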

> It concerns me that your extensive changes have introduced some
> unexpected traps which won't be sprung until 2.3 is released.

Me too.  I think 99% of the changes were "right", but without looking
at actual use cases much more we won't know where the 1% mistakes are.

I think it's okay to do this (because of the 99%) but we should be
aware that we may be breaking some code and willing to revert the
decision in some cases.  I'm not sure how we can test this enough
before 2.3 is released -- surely the alphas won't shake out enough.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Sun Jun  2 22:53:20 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 02 Jun 2002 23:53:20 +0200
Subject: [Python-Dev] PYC Magic
In-Reply-To: <2m1ybpu4cm.fsf@starship.python.net>
References: <3CF8BCF9.5557.4C49011F@localhost>
 <2m1ybpu4cm.fsf@starship.python.net>
Message-ID: <m3d6v98i0f.fsf@mira.informatik.hu-berlin.de>

Michael Hudson <mwh@python.net> writes:

> Someone on comp.lang.python reported getting corrupt .pycs by having
> modules in a user-writeable directory being accessed more-or-less
> simultaneously by different Python versions.  I'm not sure what could
> be done about that.

The user could remove write permission on that directory. I think
Python should provide an option to never write .pyc files,
controllable through sys.something.

Regards,
Martin



From martin@v.loewis.de  Sun Jun  2 22:51:30 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 02 Jun 2002 23:51:30 +0200
Subject: [Python-Dev] "max recursion limit exceeded" canned response?
In-Reply-To: <15610.6455.96035.742110@12-248-41-177.client.attbi.com>
References: <15610.6455.96035.742110@12-248-41-177.client.attbi.com>
Message-ID: <m3hekl8i3h.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> How would we go about adding a canned response to the commonly submitted
> "max recursion limit exceeded" bug report?  

Post the precise text that you want to see as the canned response, and
somebody can install it.

Regards,
Martin




From tim.one@comcast.net  Sun Jun  2 23:03:55 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 02 Jun 2002 18:03:55 -0400
Subject: [Python-Dev] "max recursion limit exceeded" canned response?
In-Reply-To: <m3hekl8i3h.fsf@mira.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEOOPJAA.tim.one@comcast.net>

[Skip Montanaro]
> How would we go about adding a canned response to the commonly submitted
> "max recursion limit exceeded" bug report?

[Martin v. Loewis]
> Post the precise text that you want to see as the canned response, and
> somebody can install it.

I don't think any canned answer will suffice -- every context is different
enough that it needs custom help.  I vote instead that we stop answering
these reports at all:  let /F do it.  That will eventually provoke him into
either writing the canned response he wants to see, or completing the
long-delayed task of removing this ceiling from sre.




From akuchlin@mems-exchange.org  Mon Jun  3 01:37:09 2002
From: akuchlin@mems-exchange.org (akuchlin@mems-exchange.org)
Date: Sun, 2 Jun 2002 20:37:09 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <oqu1olesii.fsf@titan.progiciels-bpi.ca>
References: <20020601134236.GA17691@gerg.ca> <oqk7pjeydk.fsf@titan.progiciels-bpi.ca> <20020601220529.GA20025@gerg.ca> <oqu1olesii.fsf@titan.progiciels-bpi.ca>
Message-ID: <20020603003709.GA1214@mems-exchange.org>

On Sun, Jun 02, 2002 at 09:09:57AM -0400, François Pinard wrote:
>years ago.  If I remember well, Knuth's algorithm plays by moving line
>cuts and optimising a global function through dynamic programming, giving
>more points, say, when punctuation coincides with end of lines, ...

If that's the same algorithm that's used by TeX, see
http://www.amk.ca/python/code/tex_wrap.html .

--amk



From guido@python.org  Mon Jun  3 06:01:36 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 01:01:36 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: Your message of "Sun, 02 Jun 2002 12:19:39 EDT."
 <B91FBDDA.23EED%goodger@users.sourceforge.net>
References: <B91FBDDA.23EED%goodger@users.sourceforge.net>
Message-ID: <200206030501.g5351aG31933@pcp742651pcs.reston01.va.comcast.net>

> The note in section 6.12 ("The import statement") of the Language
> Reference, "XXX Can't be bothered to spell this out right now", has
> always bothered me.  I will endeavour to flesh it out for release
> 2.3.  Perhaps Guido's package support essay
> (http://www.python.org/doc/essays/packages.html), edited, should
> form a new section or appendix.  Ideas anyone?

The information should be worked into the language reference IMO.  I'm
not sure if an appendix is appropriate or if this could be part of
the documentation for the import statement.

Thanks for helping out!!!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jun  3 06:09:09 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 01:09:09 -0400
Subject: [Python-Dev] Other library code transformations
In-Reply-To: Your message of "Sun, 02 Jun 2002 11:54:55 EDT."
 <15610.16335.191744.30203@anthem.wooz.org>
References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de>
 <15610.16335.191744.30203@anthem.wooz.org>
Message-ID: <200206030509.g53599u31983@pcp742651pcs.reston01.va.comcast.net>

> What about "foo = foo + 1" => "foo += 1"?

I'm not for making peephole changes like this.  It's easy to make
mistakes (even if you run the test suite) if you don't guess the type
of a variable right.  I think it's better to bring code up to date in
style only as part of a serious rewrite of the module containing it --
so you can fix up all different aspects.  It's often kind of strange
to see a modernization like this in code that otherwise shows it
hasn't been modified in 5 years...

(Exceptions are to get rid of deprecation warnings, or outright
failures, of course.)

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido@python.org  Mon Jun  3 06:10:08 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 01:10:08 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/test test_signal.py,1.9,1.10
In-Reply-To: Your message of "02 Jun 2002 16:13:24 BST."
 <2m3cw5zpbf.fsf@starship.python.net>
References: <Pine.OS2.4.32.0205292159370.1634-100000@tenring.andymac.org> <2mit56ros4.fsf@starship.python.net> <2msn49hp3i.fsf@starship.python.net> <200205301228.g4UCS9S07601@pcp742651pcs.reston01.va.comcast.net> <2mhekphjdi.fsf@starship.python.net> <200205301540.g4UFefk24115@odiug.zope.com> <2melfthb9r.fsf@starship.python.net>
 <2m3cw5zpbf.fsf@starship.python.net>
Message-ID: <200206030510.g535A8g31997@pcp742651pcs.reston01.va.comcast.net>

> But nowhere else, it would seem, at least not when python is built
> threaded.  Darwin fails in a similar manner to FreeBSD, test_signal
> hangs on Solaris, but only if it's run as part of the test suite --
> it runs fine if you just run it alone.  Argh!  I really am starting to
> think that I should back out my patch and distribute my code as an
> extension module.  Opinions?

Back it out, and think of a way to only enable it on Linux.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jun  3 06:12:02 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 01:12:02 -0400
Subject: [Python-Dev] PYC Magic
In-Reply-To: Your message of "02 Jun 2002 15:45:13 BST."
 <2m1ybpu4cm.fsf@starship.python.net>
References: <3CF8BCF9.5557.4C49011F@localhost>
 <2m1ybpu4cm.fsf@starship.python.net>
Message-ID: <200206030512.g535C2Q32028@pcp742651pcs.reston01.va.comcast.net>

> Someone on comp.lang.python reported getting corrupt .pycs by having
> modules in a user-writeable directory being accessed more-or-less
> simultaneously by different Python versions.  I'm not sure what
> could be done about that.

Yes, that doesn't work...

I suggest creating copies of the code per Python version.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jun  3 06:23:03 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 01:23:03 -0400
Subject: [Python-Dev] Numeric conversions
In-Reply-To: Your message of "Sun, 02 Jun 2002 10:36:12 EDT."
 <0b3501c20a43$394d77a0$6601a8c0@boostconsulting.com>
References: <0b3501c20a43$394d77a0$6601a8c0@boostconsulting.com>
Message-ID: <200206030523.g535N3t32090@pcp742651pcs.reston01.va.comcast.net>

> The following small program is giving me some unexpected results with
> Python 2.2.1:
> 
> class Int(object):
>     def __int__(self): return 10
> 
> class Float(object):
>     def __float__(self): return 10.0
> 
> class Long(object):
>     def __long__(self): return 10L
> 
> class Complex(object):
>     def __complex__(self): return (10+0j)
> 
> def attempt(f,arg):
>     try:
>         return f(arg)
>     except Exception,e:
>         return str(e.__class__.__name__)+': '+str(e)
> 
> for f in int,float,long,complex:
>     for t in Int,Float,Long,Complex:
>         print f.__name__ + '(' + t.__name__ + ')\t\t',
>         print attempt(f,t())
> 
> ----- results ------
> 
> int(Int)                10
> int(Float)              TypeError: object can't be converted to int
> int(Long)               TypeError: object can't be converted to int
> int(Complex)            TypeError: object can't be converted to int
> 
> *** OK, int() seems to work as expected
> 
> float(Int)              TypeError: float() needs a string argument
> float(Float)            10.0
> float(Long)             TypeError: float() needs a string argument
> float(Complex)          TypeError: float() needs a string argument
> 
> *** float() seems to work, but what's with the error message about strings?

Sloppy coding.  float(), like int() and long(), takes either a number
or a string.  Raymond Hettinger fixed this in CVS in response to SF
bug 551673, about two weeks ago. :-)

> long(Int)               TypeError: object can't be converted to long
> long(Float)             TypeError: object can't be converted to long
> long(Long)              10
> long(Complex)           TypeError: object can't be converted to long
> 
> **** OK, long seems to work as expected
> 
> complex(Int)            TypeError: complex() arg can't be converted to
> complex
> complex(Float)          (10+0j)
> complex(Long)           TypeError: complex() arg can't be converted to
> complex
> complex(Complex)        TypeError: complex() arg can't be converted to
> complex
> 
> **** I can understand complex() handling Float implicitly, but only if it
> also handles Complex! And if it does handle Float implicitly, shouldn't all
> of these handle everything?

The signature of complex() is different -- it takes two float
arguments, the real and imaginary part.
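
For example:

    >>> complex(1.5, 2.0)
    (1.5+2j)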

It doesn't take Complex() because of a bug: it only looks for
__complex__ if the argument is a classic instance.  Maybe Raymond can
fix this?  I've added a bug report: python.org/sf/563740

--Guido van Rossum (home page: http://www.python.org/~guido/)



From smurf@noris.de  Mon Jun  3 07:01:21 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Mon, 3 Jun 2002 08:01:21 +0200
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: <200206021325.g52DP8031046@pcp742651pcs.reston01.va.comcast.net>; from guido@python.org on Sun, Jun 02, 2002 at 09:25:08AM -0400
References: <p05111706b91f48617628@[192.109.102.36]> <200206021325.g52DP8031046@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020603080121.I17316@noris.de>

Hi,

Guido van Rossum:
> Good explanation!  This means it's an unavoidable problem.  Maybe you
> can fix the FAQ entry?
> 
I've rewritten FAQ 4.37, though I just noticed that I mis-pasted the log
entry (it's incomplete). I'll be more careful in the future.  :-/

-- 
Matthias Urlichs     |     noris network AG     |     http://smurf.noris.de/



From fredrik@pythonware.com  Mon Jun  3 10:07:22 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 3 Jun 2002 11:07:22 +0200
Subject: [Python-Dev] Re: Adding Optik to the standard library
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com> <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca>
Message-ID: <01b401c20ade$13c7bdb0$0900a8c0@spiff>

greg wrote:

> (OTOH and wildly OT: since I gave in a couple years ago and started
> using Emacs syntax-colouring, it *does* take a lot longer to load
> modules up -- eg. ~2 sec for the 1000-line rfc822.py.

    (setq font-lock-support-mode 'lazy-lock-mode)

</F>




From fredrik@pythonware.com  Mon Jun  3 10:08:52 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 3 Jun 2002 11:08:52 +0200
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de>
Message-ID: <01d401c20ade$d0dc9650$0900a8c0@spiff>

walter wrote:

> import stat; os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime

or, nicer:

    os.path.getmtime("foo")

</F>




From fredrik@pythonware.com  Mon Jun  3 10:12:34 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 3 Jun 2002 11:12:34 +0200
Subject: [Python-Dev] Re: Where to put wrap_text()?
References: <20020601134236.GA17691@gerg.ca> <oqk7pjeydk.fsf@titan.progiciels-bpi.ca> <20020601220529.GA20025@gerg.ca>
Message-ID: <01d501c20ade$d0e76bc0$0900a8c0@spiff>

greg wrote:

> Damn, I had no idea there was a body of computer science (however small)
> devoted to the art of filling text.  Trust Knuth to be there first.  Do
> you have a reference for this algorithm apart from GNU fmt's source
> code?  Google'ing for "knuth text fill algorithm" was unhelpful, ditto
> with s/fill/wrap/.

http://www.amk.ca/python/code/tex_wrap.html

> > Also, is there some existing module in which `wraptext' would fit nicely?
> > That might be better than creating a new module for not many functions.

"string" (yes, I'm serious).

</F>




From barry@zope.com  Mon Jun  3 10:48:26 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 3 Jun 2002 05:48:26 -0400
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello>
 <3CF8D960.8040402@livinglogic.de>
 <3CF90622.6000106@lemburg.com>
 <000b01c2099b$023984a0$1bea7ad1@othello>
 <3CF9E500.9030103@livinglogic.de>
 <15610.16335.191744.30203@anthem.wooz.org>
 <200206030509.g53599u31983@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15611.15210.602452.975631@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    >> What about "foo = foo + 1" => "foo += 1"?

    GvR> I'm not for making peephole changes like this.  It's easy to
    GvR> make mistakes (even if you run the test suite) if you don't
    GvR> guess the type of a variable right.  I think it's better to
    GvR> bring code up to date in style only as part of a serious
    GvR> rewrite of the module containing it -- so you can fix up all
    GvR> different aspects.  It's often kind of strange to see a
    GvR> modernization like this in code that otherwise shows it
    GvR> hasn't been modified in 5 years...

I agree!  If it works, don't fix it.  

I was responding to this

    > While you're at it: could you also write up all these little
    > "code cleanups" in some file so that Andrew can integrate them
    > in the migration guide ?!

I think it's a nice "code cleanup" that is worth noting in a migration
guide.

-Barry



From barry@zope.com  Mon Jun  3 11:06:32 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 3 Jun 2002 06:06:32 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com>
 <20020601024553.59600.qmail@web9607.mail.yahoo.com>
 <20020601025739.GA17229@gerg.ca>
 <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>
 <20020601143855.GA18632@gerg.ca>
 <01b401c20ade$13c7bdb0$0900a8c0@spiff>
Message-ID: <15611.16296.202088.831238@anthem.wooz.org>

>>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:

    FL>     (setq font-lock-support-mode 'lazy-lock-mode)

(add-hook 'font-lock-mode-hook 'turn-on-fast-lock)

dueling-hooks-ly y'rs,
-Barry



From gmcm@hypernet.com  Mon Jun  3 13:37:10 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 3 Jun 2002 08:37:10 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: <200206021325.g52DP8031046@pcp742651pcs.reston01.va.comcast.net>
References: Your message of "Sun, 02 Jun 2002 06:04:59 +0200." <p05111706b91f48617628@[192.109.102.36]>
Message-ID: <3CFB2AB6.22345.55C5AC0E@localhost>

On 2 Jun 2002 at 9:25, Guido van Rossum wrote:

[Matthias Urlichs]
> > The problem is that importing with from consists of
> > two steps: - load the module - add the imported names
> > to the local namespace
> 
> Good explanation!  This means it's an unavoidable
> problem.  

Um, different problem. What Matthias explains is
unavoidable. But in David's case, the containing
namespace (the package) is not empty when 
module2 wants module1.  In fact, I believe that
sys.modules['package.module1'] is there (though
*it* is empty).

My guess is that import is looking for module1
as an attribute of package, and that that binding
hasn't taken place yet.

If I use iu instead of the builtin import, it works.

-- Gordon
http://www.mcmillan-inc.com/




From skip@pobox.com  Mon Jun  3 14:34:13 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 3 Jun 2002 08:34:13 -0500
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: <15611.16296.202088.831238@anthem.wooz.org>
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com>
 <20020601024553.59600.qmail@web9607.mail.yahoo.com>
 <20020601025739.GA17229@gerg.ca>
 <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>
 <20020601143855.GA18632@gerg.ca>
 <01b401c20ade$13c7bdb0$0900a8c0@spiff>
 <15611.16296.202088.831238@anthem.wooz.org>
Message-ID: <15611.28757.838957.151483@12-248-41-177.client.attbi.com>

    BAW> (add-hook 'font-lock-mode-hook 'turn-on-fast-lock)

Whoa!  What a difference!

After seeing your response to /F, I assume his solution was for GNU Emacs
and yours is for XEmacs, right?

Skip



From barry@zope.com  Mon Jun  3 14:37:48 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 3 Jun 2002 09:37:48 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com>
 <20020601024553.59600.qmail@web9607.mail.yahoo.com>
 <20020601025739.GA17229@gerg.ca>
 <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>
 <20020601143855.GA18632@gerg.ca>
 <01b401c20ade$13c7bdb0$0900a8c0@spiff>
 <15611.16296.202088.831238@anthem.wooz.org>
 <15611.28757.838957.151483@12-248-41-177.client.attbi.com>
Message-ID: <15611.28972.172040.401411@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    BAW> (add-hook 'font-lock-mode-hook 'turn-on-fast-lock)

    SM> Whoa!  What a difference!

    SM> After seeing your response to /F, I assume his solution was
    SM> for GNU Emacs and yours is for XEmacs, right?

What's "GNU Emacs"? <wink>

-Barry



From guido@python.org  Mon Jun  3 15:04:58 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 10:04:58 -0400
Subject: [Python-Dev] Other library code transformations
In-Reply-To: Your message of "Mon, 03 Jun 2002 05:48:26 EDT."
 <15611.15210.602452.975631@anthem.wooz.org>
References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de> <15610.16335.191744.30203@anthem.wooz.org> <200206030509.g53599u31983@pcp742651pcs.reston01.va.comcast.net>
 <15611.15210.602452.975631@anthem.wooz.org>
Message-ID: <200206031404.g53E4w400629@pcp742651pcs.reston01.va.comcast.net>

> I was responding to this
> 
>     > While you're at it: could you also write up all these little
>     > "code cleanups" in some file so that Andrew can integrate them
>     > in the migration guide ?!
> 
> I think it's a nice "code cleanup" that is worth noting in a migration
> guide.

I think MAL wrote that.  I interpreted it as "Raymond should document
which modules he changed, and how, so that people can be aware of the
subtle semantic changes."  I now realize that he probably meant "can
you write up a list of things you can do to modernize your code."

But IMO the latter doesn't belong in a migration guide; the migration
guide should focus on what you *have* to change in order to avoid
disappointments later.  Most of the things Raymond does aren't about
new features in 2.3 either.

And I *do* think that a migration guide should at least contain a
general warning about the kind of changes that Raymond did that might
affect 3rd party code.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From s_lott@yahoo.com  Mon Jun  3 14:58:24 2002
From: s_lott@yahoo.com (Steven Lott)
Date: Mon, 3 Jun 2002 06:58:24 -0700 (PDT)
Subject: [Python-Dev] Where to put wrap_text()?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKELIPJAA.tim.one@comcast.net>
Message-ID: <20020603135824.43784.qmail@web9605.mail.yahoo.com>

Another place for a text wrap function would be part of pprint.

I agree with M. Pinard that it doesn't deserve an entire module.

RE seems a little off-task for formatting text.
PPRINT seems more closely related to the core problem.

And it leaves room for adding additional formatting and
pretty-printing features.  Perhaps a small class hierarchy
with different wrapping algorithms (filling and justifying,
no filling, etc.)
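
For concreteness, a minimal greedy fill of the kind being discussed could
look something like this (an illustrative sketch only -- not Knuth's
optimal algorithm, and the name wrap_text is just a placeholder):

    def wrap_text(text, width=70):
        """Greedy fill: at most `width` columns per line."""
        lines = []
        current = ''
        for word in text.split():
            if current and len(current) + 1 + len(word) > width:
                lines.append(current)
                current = word
            elif current:
                current = current + ' ' + word
            else:
                current = word
        if current:
            lines.append(current)
        return lines

    # usage: print '\n'.join(wrap_text(some_long_string, width=30))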

--- Tim Peters <tim.one@comcast.net> wrote:
> [Greg Ward, on wrapping text]
> > ...
> 
> Note that regrtest.py also has a wrapper:
> 
> def printlist(x, width=70, indent=4):
>     """Print the elements of a sequence to stdout.
> 
>     Optional arg width (default 70) is the maximum line
> length.
>     Optional arg indent (default 4) is the number of blanks
> with which to
>     begin each line.
>     """
> 
> This kind of thing gets reinvented too often, so +1 on a
> module from me.
> Just make sure it handle the union of all possible desires,
> but has a simple
> and intuitive interface <wink>.
> 


=====
--
S. Lott, CCP :-{)
S_LOTT@YAHOO.COM
http://www.mindspring.com/~slott1
Buccaneer #468: KaDiMa

Macintosh user: drinking upstream from the herd.




From mal@lemburg.com  Mon Jun  3 15:15:06 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 03 Jun 2002 16:15:06 +0200
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de> <15610.16335.191744.30203@anthem.wooz.org> <200206030509.g53599u31983@pcp742651pcs.reston01.va.comcast.net>              <15611.15210.602452.975631@anthem.wooz.org> <200206031404.g53E4w400629@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3CFB79EA.3070607@lemburg.com>

Guido van Rossum wrote:
>>I was responding to this
>>
>>    > While you're at it: could you also write up all these little
>>    > "code cleanups" in some file so that Andrew can integrate them
>>    > in the migration guide ?!
>>
>>I think it's a nice "code cleanup" that is worth noting in a migration
>>guide.
> 
> 
> I think MAL wrote that.  I interpreted it as "Raymond should document
> which modules he changed, and how, so that people can be aware of the
> subtle semantic changes."  I now realize that he probably meant "can
> you write up a list of things you can do to modernize your code."

I meant the latter. This information is needed in order to modernise
Python code and also to have a reference for checking existing
code against a specific Python version.

> But IMO the latter doesn't belong in a migration guide; the migration
> guide should focus on what you *have* to change in order to avoid
> disappointments later.  Most of the things Raymond does aren't about
> new features in 2.3 either.

I know. That's why I asked Raymond to add a Python version to
each of the modifications (basically pointing out in which Python
version this coding style became available).

Note that migration does not only include correcting code
which might break; it also covers code cleanups like what
Raymond is currently doing.

I somehow have a feeling that you are afraid of such a guide,
Guido. Is that so? And if yes, why? I think all this is valuable
information and worth publishing.

> And I *do* think that a migration guide should at least contain a
> general warning about the kind of changes that Raymond did that might
> affect 3rd party code.

Certainly.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From guido@python.org  Mon Jun  3 15:21:04 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 10:21:04 -0400
Subject: [Python-Dev] intra-package mutual imports fail: "from <pkg> import <mod>"
In-Reply-To: Your message of "Mon, 03 Jun 2002 08:37:10 EDT."
 <3CFB2AB6.22345.55C5AC0E@localhost>
References: Your message of "Sun, 02 Jun 2002 06:04:59 +0200." <p05111706b91f48617628@[192.109.102.36]>
 <3CFB2AB6.22345.55C5AC0E@localhost>
Message-ID: <200206031421.g53EL4W00847@pcp742651pcs.reston01.va.comcast.net>

> Um, different problem. What Matthias explains is
> unavoidable. But in David's case, the containing
> namespace (the package) is not empty when 
> module2 wants module1.  In fact, I believe that
> sys.modules['package.module1'] is there (though
> *it* is empty).
> 
> My guess is that import is looking for module1
> as an attribute of package, and that that binding
> hasn't taken place yet.

Yes, that's what "from package import module" does -- it wants
"module" to be an attribute of "package".  This is because it doesn't
really distinguish between "from package import module" and "from
module import attribute".
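
A minimal sketch of the usual workaround, with hypothetical names (a
package "pkg" whose submodules module1 and module2 import each other):

    # pkg/module1.py
    #
    # from pkg import module2   # can fail during a circular import: module2
    #                           # is not yet an attribute of the half-
    #                           # initialized package
    #
    # The workaround is to bind the package itself and defer the attribute
    # lookup until the name is actually used:

    import pkg.module2

    def f():
        return pkg.module2.something()   # resolved at call time, after both
                                         # modules have finished loading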

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jun  3 15:50:39 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 10:50:39 -0400
Subject: [Python-Dev] Other library code transformations
In-Reply-To: Your message of "Mon, 03 Jun 2002 16:15:06 +0200."
 <3CFB79EA.3070607@lemburg.com>
References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <3CF90622.6000106@lemburg.com> <000b01c2099b$023984a0$1bea7ad1@othello> <3CF9E500.9030103@livinglogic.de> <15610.16335.191744.30203@anthem.wooz.org> <200206030509.g53599u31983@pcp742651pcs.reston01.va.comcast.net> <15611.15210.602452.975631@anthem.wooz.org> <200206031404.g53E4w400629@pcp742651pcs.reston01.va.comcast.net>
 <3CFB79EA.3070607@lemburg.com>
Message-ID: <200206031450.g53Eodx01137@pcp742651pcs.reston01.va.comcast.net>

> I somehow have a feeling that you are afraid of such a guide,
> Guido. Is that so? And if yes, why? I think all this is valuable
> information and worth publishing.

No, not at all!  I just misunderstood what the purpose of your migration
guide was.  Sorry for the confusion.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pobrien@orbtech.com  Mon Jun  3 16:03:42 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Mon, 3 Jun 2002 10:03:42 -0500
Subject: [Python-Dev] Other library code transformations
In-Reply-To: <200206031450.g53Eodx01137@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <NBBBIOJPGKJEKIECEMCBKEOENCAA.pobrien@orbtech.com>

[Guido van Rossum]
>
> No, not at all!  I just misunderstood what the purpose of your migration
> guide was.  Sorry for the confusion.

Certainly distinguishing between Required changes and Recommended changes
would be a good thing. Required changes are what needs to be done to keep
old code working. Recommended changes could include stylistic changes, new
features, new idioms, etc. I think it would even make sense to talk about
changes that should be made in anticipation of future feature deprecation.

---
Patrick K. O'Brien
Orbtech




From mgilfix@eecs.tufts.edu  Mon Jun  3 16:22:45 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Mon, 3 Jun 2002 11:22:45 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200205232013.g4NKD6X07596@odiug.zope.com>; from guido@python.org on Thu, May 23, 2002 at 04:13:06PM -0400
References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com>
Message-ID: <20020603112245.E19838@eecs.tufts.edu>

  Alrighty. Here's the monster reply. I'll be much faster with the
replies this week. Had a hectic week last week.

On Thu, May 23 @ 16:13, Guido van Rossum wrote:
> General style nits:
> 
> - You submitted a reverse diff!  Customary is diff(old, new).

  Oops. Will fix that this time round.

> - Please don't put a space between the function name and the open
>   parenthesis.  (You do this both in C and in Python code.)

  Fixed. Some personal preference bled in there. All removed in
my copy.

> - Also please don't put a space between the open parenthesis and the
>   first argument (you do this almost consistently in test_timeout.py).

  Couldn't really figure out what you were seeing here. I read that
you saw something like func( a, b), which I don't see in my local
copy. I do have something like this for list comprehension:

    [ x.find('\n') for x in self._rbuf ]

  Er, but I thought there were supposed to be surrounding spaces at the
edges...

> - Please don't introduce lines longer than 78 columns.

  Fixed my offending line. I've also corrected some other lines in
the socket module that went over 78 columns (there were a few).

> 
> Feedback on the patch to socket.py:
> 
> - I think you're changing the semantics of unbuffered and
>   line-buffered reads/writes on Windows.  For one thing, you no longer
>   implement line-buffered writes correctly.  The idea is that if the
>   buffer size is set to 1, data is flushed at \n only, so that if
>   the code builds up the line using many small writes, this doesn't
>   result in many small sends. There was code for this in write() --
>   why did you delete it?

  I screwed up the write. Now the write is:

    def write(self, data):
        self._wbuf = self._wbuf + data
        if self._wbufsize == 1:
            if '\n' in data:
                self.flush()
        elif len(self._wbuf) >= self._wbufsize:
            self.flush()

  which is pretty much the same as the old. The read should be ok
though. I could really use someone with a win compiler to test this
for me.

> - It also looks like you've broken the semantics of size<0 in read().

  Maybe I'm misunderstanding the code, but I thought that a size < 0
meant to read until there are no more? The statement:

    while size < 0 or buf_len < size:

  accomplishes the same thing as what's in the current socket.py
implementation.  If you look closely, the 'if >= 0' branch *always* returns,
meaning that the < 0 is equiv to while 1. Due to shortcutting, the same
thing happens in the above statement. Maybe a comment would make it clearer?

> - Maybe changing the write buffer to a list makes sense too?

  I could do this. Then just do a join before the flush. Is the append
/that/ much faster?

> - Since this code appears independent from the timeoutsocket code,
>   maybe we can discuss this separately?

  The point of this code was to keep from losing data when an exception
occurs (as timothy, if I remember correctly, pointed out). Hence the reason
for keeping a lot more data around in instance variables instead of local
variables. So the windows version might (in obscure cases) be affected
by the timeout changes. That's what this patch was addressing.

> 
> Feedback on the documentation:
> 
> - I would document that the argument to settimeout() should be int or
>   float (hm, can it be a long?  that should be acceptable even if it's
>   strange), and that the return value of gettimeout() is None or a
>   float.

  It can be a long in my local copy. The argument can be any numeric
value and the special None. I've updated my documentation to be more
explicit.

> - You may want to document the interaction with blocking mode.

  I've put notes in the tex documentation.  Here's how the interaction
works:

    if the socket is in non-blocking mode:
      All operations are non-blocking and setting timeouts doesn't
      mean anything (they are not enforced). A timeout can still
      be changed and gettimeout will reflect the value but the
      exception will never be raised.
    else if the socket is in blocking mode:
      enabling timeouts does the usual thing you would expect from
      timeouts.

> Feedback on the C socket module changes:
> 
> - Why import the select module?  It's much more efficient to use the
>   select system call directly.  You don't need all the overhead and
>   generality that the select module adds on top of it, and it costs a
>   lot (select allocates lots of objects and lots of memory and hence
>   is very expensive).

  Well, the thinking was that if there were any portability issues with
select, they could be taken care of in one place. At the time, I hadn't
really looked closely at the select module. Now that I glance at it,
pretty much all the code in the select module just extracts the necessary
information from the objects for polling. I suppose I could just use
select directly... There's also the advantage of all the error handling
in select. I could do a stripped down version of the code, I suppose, for
speed. Seemed like a good idea for code re-use.
> - <errno.h> is already included by Python.h.

  I didn't do this but it's been removed.

> - Please don't introduce more static functions with a 'Py' name
>   prefix.

  Only did this in one place, with PyTimeout_Err. The reason was that the
other Error functions used the Py prefix, so it was done for consistency. I
can change that.. or change the naming scheme with the others if you like.

> - You should raise TypeError if the type of the argument is wrong, and
>   ValueError if the value is wrong (out of range).  Not SocketError.

  Oops. Fixed.

> - I believe that you can't reliably maintain a "sock_blocking" flag;
>   there are setsockopt() or ioctl() calls that can make a socket
>   blocking or non-blocking.  Also, there's no way to know whether a
>   socket passed to fromfd() is in blocking mode or not.

  Well, upon socket creation (in init_sockobject), the socket is
set to blocking mode. I think that handles fromfd, right? Doesn't
every means of initialization have to call that function?

  The real problem would be someone using an ioctl or setsockopt (Can
you even do blocking stuff through setsockopt?). Ugh.  The original
timeoutsocket didn't really deal with anything like that.  Erm, seems like
an interface problem here - using ioctl kinda breaks the socket object
interface. Perhaps we should be doing some sort of getsockopt to figure out
the blocking mode and update our state accordingly? That would be an extra
call for each thing to the interface though.

  One solution is to set/unset blocking mode right before doing each
call to be sure of the state and based on the internally stored value
of the blocking attribute... but... then that kind of renders ioctl
useless.

  Another solution might be to set the blocking mode to on every time someone
sets a timeout. That would change the blocking/socket interaction already
described a bit but not drastically. Also easy to implement.  That sends the
message: Don't use ioctls when using timeouts.

  Hmm.. Will need to think about this more. Any insight would be helpful or
some wisdom about how you usually handle this sort of thing.

> - There are refcount bugs.  I didn't do a detailed review of these,
>   but I note that the return value from PyFloat_FromDouble() in
>   PySocketSock_settimeout() is leaked.  (There's an INCREF that's only
>   needed for the None case.)

  This has been fixed. I was one ref count too high in my scheme.

> - The function internal_select() is *always* used like this:
> 
>         count = internal_select (s, 1);
>         if (count < 0)
>                 return NULL;
>         else if (count ==  0) /* Timeout elapsed */
>                 return PyTimeout_Err ();
> 
>   If internal_select() called PyTimeout_Err() itself, all call sites
>   could be simplified to this:
> 
>         count = internal_select (s, 1);
>         if (count < 0)
>                 return NULL;
> 
>   or even (now that the count variable is no longer needed) to this:
> 
>         if(internal_select (s, 1) < 0)
>                 return NULL;

  Good point. Except the return value needs to be checked for <= 0
in this case. Changes were made.

> - The accept() wrapper contains this bit of code (only showing the
>   Unix variant):
> 
> 	if (newfd < 0)
> 		if (!s->sock_blocking || (errno != EAGAIN && errno != EWOULDBLOCK))
> 			return s->errorhandler ();
> 
>   Isn't the sense of testing s->sock_blocking wrong?  I would think
>   that if we're in blocking mode we'd want to return immediately
>   without even checking the errno.  I recommend writing this out more
>   clearly, e.g. like this:
> 
>     if (s->sock_blocking)
>        return s->errorhandler();
>     /* A non-blocking accept() failed */
>     if (errno != EAGAIN && errno != EWOULDBLOCK)
>        return s->errorhandler();

  I've written this out more explicitly as you suggest. It is supposed to be
!s->sock_blocking though. If we're in non-blocking mode at that point with
an error, then it's definitely an error. If we're in blocking mode, then we
have to check the type of error. The reason being that the underlying socket
is always in non-blocking mode (remember select) so we need to check that we
don't have a weird error.

  I've written it out like this:

    if (newfd < 0) {
        if (!s->sock_blocking)
            return s->errorhandler();
        /* Check if we have a true failure for a blocking socket */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return s->errorhandler();
    }

  I've also fixed a similar thing for connect.

> - What is s->errorhandler() for?  AFAICT, this is always equal to
>   PySocket_Err!

  This was always in the module. Not sure why it was put there initially.
I used it to be consistent.

> - The whole interaction between non-blocking mode and timeout mode is
>   confusing to me.  Are you sure that this always does the right
>   thing?  Have you even thought about what "the right thing" is in all
>   4 combinations?

  I think I've explained this earlier in the thread. Lemme know if I need
any more clarifications.

  If you made it this far, it's time for coffee.

                     -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From guido@python.org  Mon Jun  3 16:34:34 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 11:34:34 -0400
Subject: [Python-Dev] Other library code transformations
In-Reply-To: Your message of "Mon, 03 Jun 2002 10:03:42 CDT."
 <NBBBIOJPGKJEKIECEMCBKEOENCAA.pobrien@orbtech.com>
References: <NBBBIOJPGKJEKIECEMCBKEOENCAA.pobrien@orbtech.com>
Message-ID: <200206031534.g53FYYR01331@pcp742651pcs.reston01.va.comcast.net>

> Certainly distinguishing between Required changes and Recommended
> changes would be a good thing. Required changes are what needs to be
> done to keep old code working. Recommended changes could include
> stylistic changes, new features, new idioms, etc. I think it would
> even make sense to talk about changes that should be made in
> anticipation of future feature deprecation.

I'm not sure that using x+=1 instead of x=x+1 should be even a
recommended change.  This is a personal choice, just like using
True/False to indicate truth values.

The "is None" vs. "== None" issue is a general style recommendation,
not a migration tip.  This is a "should do" issue.

The "if not x" vs. "if x is None" issue is also a general style
recommendation.  This is a "could do" issue, because the semantics are
different.
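
For instance:

    for x in (None, 0, '', []):
        assert not x            # "if not x:" fires for every one of these
    # but only None itself passes the identity test:
    assert [x for x in (None, 0, '', []) if x is None] == [None]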

--Guido van Rossum (home page: http://www.python.org/~guido/)



From walter@livinglogic.de  Mon Jun  3 16:54:11 2002
From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Mon, 03 Jun 2002 17:54:11 +0200
Subject: [Python-Dev] Other library code transformations
References: <001501c208bd$46133420$d061accf@othello> <3CF8D960.8040402@livinglogic.de> <01d401c20ade$d0dc9650$0900a8c0@spiff>
Message-ID: <3CFB9123.7050203@livinglogic.de>

Fredrik Lundh wrote:
> walter wrote:
> 
> 
>>import stat; os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime
> 
> 
> or, nicer:
> 
>     os.path.getmtime("foo")

Is there an os.path function available for all the os.stat entries?

Which version should we use?

Should we change this at all? (stat.py won't go away,
only string.py and types.py will.)

Bye,
    Walter Dörwald




From pobrien@orbtech.com  Mon Jun  3 16:58:41 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Mon, 3 Jun 2002 10:58:41 -0500
Subject: [Python-Dev] Other library code transformations
In-Reply-To: <200206031534.g53FYYR01331@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <NBBBIOJPGKJEKIECEMCBAEOJNCAA.pobrien@orbtech.com>

[Guido van Rossum]
>
> > Certainly distinguishing between Required changes and Recommended
> > changes would be a good thing. Required changes are what needs to be
> > done to keep old code working. Recommended changes could include
> > stylistic changes, new features, new idioms, etc. I think it would
> > even make sense to talk about changes that should be made in
> > anticipation of future feature deprecation.
>
> I'm not sure that using x+=1 instead of x=x+1 should be even a
> recommended change.  This is a personal choice, just like using
> True/False to indicate truth values.

But the choice only becomes available with a certain version of Python.

> The "is None" vs. "== None" issue is a general style recommendation,
> not a migration tip.  This is a "should do" issue.

Right. But it wouldn't hurt to remind people in a migration guide, would it?

> The "if not x" vs. "if x is None" issue is also a general style
> recommendation.  This is a "could do" issue, because the semantics are
> different.

Perhaps three sections then: Required, Recommended and Optional (TMTOWTDI)?
I just think it would be good to know what coding changes one could/should
start making when a new version is released. And which practices one should
stop. And if there are lots of examples in real code of certain poor
practices, it wouldn't hurt to point them out as well, even if they aren't
necessarily tied to a particular release of Python. (But I'm willing to
concede that I might be stretching the scope of this migration guide beyond
its limits with that last item.)

---
Patrick K. O'Brien
Orbtech




From guido@python.org  Mon Jun  3 17:02:59 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 12:02:59 -0400
Subject: [Python-Dev] Other library code transformations
In-Reply-To: Your message of "Mon, 03 Jun 2002 10:58:41 CDT."
 <NBBBIOJPGKJEKIECEMCBAEOJNCAA.pobrien@orbtech.com>
References: <NBBBIOJPGKJEKIECEMCBAEOJNCAA.pobrien@orbtech.com>
Message-ID: <200206031602.g53G2xm02048@pcp742651pcs.reston01.va.comcast.net>

> > I'm not sure that using x+=1 instead of x=x+1 should be even a
> > recommended change.  This is a personal choice, just like using
> > True/False to indicate truth values.
> 
> But the choice only becomes available with a certain version of Python.

2.0.

> > The "is None" vs. "== None" issue is a general style recommendation,
> > not a migration tip.  This is a "should do" issue.
> 
> Right. But it wouldn't hurt to remind people in a migration guide,
> would it?

Yes it would.  A migration guide should focus on migration and leave
general style tips to other documents.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jun  3 18:22:16 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 13:22:16 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: Your message of "Mon, 03 Jun 2002 11:22:45 EDT."
 <20020603112245.E19838@eecs.tufts.edu>
References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com>
 <20020603112245.E19838@eecs.tufts.edu>
Message-ID: <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net>

[Addressing only points that need attention]

> > - Also please don't put a space between the open parenthesis and the
> >   first argument (you do this almost consistently in test_timeout.py).
> 
>   Couldn't really figure out what you were seeing here. I read that
> you saw something like func( a, b), which I don't see in my local
> copy.

test_timeout.py from the SF page has this.  I'm glad you fixed this
already in your own copy.

> I do have something like this for list comprehension:
> 
>     [ x.find('\n') for x in self._rbuf ]
> 
>   Er, but I though there were supposed to be surrounding spaces at the
> edges...

I prefer to see that as

    [x.find('\n') for x in self._rbuf]

>   I screwed up the write. Now the write is:
> 
[...]
> 
>   which is pretty much the same as the old. The read should be ok
> though. I could really use someone with a win compiler to test this
> for me.

I'll review it more when you next upload it.

> > - It also looks like you've broken the semantics of size<0 in read().
> 
>   Maybe I'm misunderstanding the code, but I thought that a size < 0
> meant to read until there are no more? The statement:
> 
>     while size < 0 or buf_len < size:
> 
>   accomplishes the same thing as what's in the current socket.py
> implementation.  If you look closely, the 'if >= 0' branch *always* returns,
> meaning that the < 0 is equiv to while 1. Due to shortcutting, the same
> thing happens in the above statement. Maybe a comment would make it clearer?

I was referring to this piece of code:

!         if buf_len > size:
!             self._rbuf.append (data[size:])
!             data = data[:size]

Here data[size:] gives you the last byte of the data and data[:size]
chops off the last byte.

> > - Maybe changing the write buffer to a list makes sense too?
> 
>   I could do this. Then just do a join before the flush. Is the append
> /that/ much faster?

Depends on how small the chunks are you write.  Roughly, repeated list
append is O(N log N), while repeated string append is O(N**2).
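
A rough sketch of what that could look like (the attribute names here
follow the patch under discussion, so treat them as assumptions):

    def write(self, data):
        self._wbuf.append(data)          # cheap append; join only at flush time
        if self._wbufsize == 1:
            if '\n' in data:
                self.flush()
        elif sum(map(len, self._wbuf)) >= self._wbufsize:
            self.flush()

    def flush(self):
        if self._wbuf:
            self._sock.sendall(''.join(self._wbuf))
            self._wbuf = []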

> > - Since this code appears independent from the timeoutsocket code,
> >   maybe we can discuss this separately?
> 
>   The point of this code was to keep from losing data when an exception
> occurs (as timothy, if I remember correctly, pointed out). Hence the reason
> for keeping a lot more data around in instance variables instead of local
> variables. So the windows version might (in obscure cases) be affected
> by the timeout changes. That's what this patch was addressing.

OK, but given the issues the first version had, I recommend that the
code gets more review and that you write unit tests for all cases.

> > - Please don't introduce more static functions with a 'Py' name
> >   prefix.
> 
>   Only did this in one place, with PyTimeout_Err. The reason was that the
> other Error functions used the Py prefix, so it was done for consistency. I
> can change that.. or change the naming scheme with the others if you like.

I like to do code cleanup that doesn't change semantics (like
renamings) as a separate patch and checkin.  You can do this before or
after the timeout changes, but don't merge it into the timeout
changes.  I'd still like the static names that you introduce not to
start with Py.

> > - I believe that you can't reliably maintain a "sock_blocking" flag;
> >   there are setsockopt() or ioctl() calls that can make a socket
> >   blocking or non-blocking.  Also, there's no way to know whether a
> >   socket passed to fromfd() is in blocking mode or not.
> 
>   Well, upon socket creation (in init_sockobject), the socket is
> set to blocking mode. I think that handles fromfd, right? Doesn't
> every means of initialization have to call that function?

OK, it looks like you call internal_setblocking(s, 0) to set the
socket in nonblocking mode.  (Hm, I don't see any calls to set the
socket in blocking mode!)

So do I understand that you are now always setting the socket in
non-blocking mode, even when there is no timeout specified, and that
you look at the sock_blocking flag to decide whether to do timeouts or
just pass the nonblocking behavior to the user?

This is a change in semantics, and could interfere with existing
applications that pass the socket's file descriptor off to other
code.  I think I'd be happier if the behavior wasn't changed at all
until a timeout is set for a socket -- then existing code won't
break.

>   The real problem would be someone using an ioctl or setsockopt (Can
> you even do blocking stuff through setsockopt?).

Yes, setblocking() makes a call to setsockopt(). :-)

> Ugh.  The original timeoutsocket didn't really deal with anything
> like that.  Erm, seems like an interface problem here - using ioctl
> kinda breaks the socket object interface. Perhaps we should be doing
> some sort of getsockopt to figure out the blocking mode and update
> our state accordingly? That would be an extra call for each thing to
> the interface though.

I only really care for sockets passed in to fromfd().  E.g. someone
can currently do:

  s1 = socket(AF_INET, SOCK_STREAM)
  s1.setblocking(0)

  s2 = fromfd(s1.fileno())
  # Now s2 is non-blocking too

I'd like this to continue to work as long as s1 doesn't set a timeout.

>   One solution is to set/unset blocking mode right before doing each
> call to be sure of the state and based on the internally stored value
> of the blocking attribute... but... then that kind of renders ioctl
> useless.

Don't worry so much about ioctl, but do worry about fromfd.

>   Another solution might be to set the blocking mode to on every time
> someone sets a timeout. That would change the blocking/socket
> interaction already described a bit but not drastically. Also easy
> to implement.  That sends the message: Don't use ioctls when using
> timeouts.

I like this.

>   Hmm.. Will need to think about this more. Any insight would be
> helpful or some wisdom about how you usually handle this sort of
> thing.

See above.  Since we don't know what people out there are doing, I
don't want to break existing code.  We do know that existing code
doesn't use (this form of) timeout, so we can exploit that knowledge.

> > - What is s->errorhandler() for?  AFAICT, this is always equal to
> >   PySocket_Err!
> 
>   This was always in the module. Not sure why it was put there initially.
> I used it to be consistent.

Argh, you're right.  MAL added this; I'll ask him why.

>   If you made it this far, it's time for coffee.

When can I expect a new version?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aahz@pythoncraft.com  Mon Jun  3 18:16:02 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 3 Jun 2002 13:16:02 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: <15611.16296.202088.831238@anthem.wooz.org>
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com> <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca> <01b401c20ade$13c7bdb0$0900a8c0@spiff> <15611.16296.202088.831238@anthem.wooz.org>
Message-ID: <20020603171602.GB20395@panix.com>

On Mon, Jun 03, 2002, Barry A. Warsaw wrote:
> 
> >>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:
> 
>     FL>     (setq font-lock-support-mode 'lazy-lock-mode)
> 
> (add-hook 'font-lock-mode-hook 'turn-on-fast-lock)

Which proves that vi[m] is the true Pythonic editor -- there's only one
way.

baiting-ly y'rs
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In the end, outside of spy agencies, people are far too trusting and
willing to help."  --Ira Winkler



From phoengeist38259@arcor.de" <phoengeist38259@arcor.de  Mon Jun  3 18:25:21 2002
From: phoengeist38259@arcor.de" <phoengeist38259@arcor.de (phoengeist38259@arcor.de)
Date: Mon, 3 Jun 2002 19:25:21 +0200
Subject: [Python-Dev] Entschuldigen Sie bitte die Störung!
Message-ID: <E17Eva9-00029z-00@mail.python.org>

Please excuse the disturbance!

Something has come to my ears.
A rather unusual rumour mill, from which I have been served
a hard-to-digest little soup, is the reason for my mail.
"Unappetizing" doesn't begin to describe it!
Is it possible to influence or manipulate someone by radio
(and in which frequency ranges?)
Or even to harass and terrorize them?
Along the lines of: "Someone on the transmitter? Not quite alone?
A little man in your ear? Wrong wavelength? Beans in your ears?
Had your teeth (amalgam) checked? A quick, non-committal listen-in?
The Pullach bug dance?"
Is this all nonsense? Surely it can't be done, can it?
And if it can, how does it look ethically and morally?
On the technical side of the matter there are reports and web sites:
Totalitaer,de - Die Waffe gegen die Kritik
http://www.raum-und-zeit.com/Aktuell/Brummton.htm
http://www.fosar-bludorf.com/Tempelhof/
http://jya.com/haarp.htm
http://www.zeitenschrift.at/magazin/zs_24_15/1_mikrowaffen.htm
http://www.bse-plus.de/d/doc/lbrief/lbmincontr.htm
http://home.nexgo.de/kraven/bigb/big3.html
http://w3.nrl.navy.mil/projects/haarp/index.html
http://cryptome.org/
http://www.raven1.net/ravindex.htm
http://www.calweb.com/~welsh/
http://www.cahe.org/
http://www.parascope.com/ds/mkultra0.htm
http://www.trufax.org/menu/mind.html
http://www.trufax.org/menu/elect.html
http://mindcontrolforum.com/
http://www.trufax.org/menu/elect.html
etc.
etc.
etc.
But surely nothing like that is actually done, is it?
A human rights violation without equal!?!
Is it possible, by preparing the ears, perhaps in combination with
existing dental work?
With relatively simple radio technology??
In this country? Here and now???
With what motives?
Where, by the way, is Department 5 of the BND and of the Verfassungsschutz?
Can it be that there are people who permanently deliver a situation report
to the BND/Verfassungsschutz by radio, without noticing it themselves,
something made feasible in childhood??
Do such unofficial collaborators, Stasi-style, let the BND and
Verfassungsschutz collect information from and about, purely theoretically,
every citizen?
Is there then still a right to privacy? And who actually checks the BND,
MAD and Verfassungsschutz for infiltration???
What this mail is really about is the question of whether criminal elements,
motivated by enrichment, or groups with ideological motives, are able to
acquire knowledge and technology that was developed in other times and for
other motives (Western television?).
And does the technical state of knowledge that is known to the general
public really mark the end of the line?
Is it not just as possible for criminal elements,
to put it in deliberately harmless and diminutive terms,
to spy on individual persons or groups with relatively simple means,
for whatever motives?
And does this "spying about" not constitute a considerable intrusion into
people's privacy?
Is it possible, assuming the (suggested?) acceptance of a certain public,
which could be created for instance with the help of web sites such as
"Der Pranger", to terrorize and/or harass individual persons or groups,
and to do so in full (suggested) public view? Do the people who are
pilloried there, or vilified or even slandered on some other site,
actually have any chance of a public reply? Is that not character
assassination?
A few years ago I stumbled across the site "Der Pranger" by chance;
back then it did not yet operate under the cover of a dating service.
Can individual persons, or interest groups, purely for their own ends,
make use of such sites in order to push their very personal interests
under the cover of a dubious civil courage, by instigating smear campaigns?
Can such sites serve to coordinate criminal machinations?
The question is whether it is possible or impossible, technically and
socially, to manipulate or influence, terrorize or harass individual
persons or groups, out of criminal or ideological energy, and to do so
deliberately.
Manipulation of target groups by the mass media is everyday manipulation,
which one can, more or less, escape.
Is the right to privacy being undermined, creepingly and at a
depth-psychological level, by programmes such as "Big Brother"?
Should any of those mailed have some knowledge of the subject,
I would be glad of any pointers on the topic.
In search of answers to my questions I am mailing various addresses
from the Internet, and I hope for constructive answers and criticism.
I would be pleased about a visit to the page
<http://hometown.aol.de/reinerhohn38259/homepage/index.html>
Should you have received this mail from me more than once,
I ask you to excuse me; that was not intended.
The reason for my anonymity is the fact that with this kind of questioning,
understandably, the call for psychiatry is quickly raised.
Which also has (is its) method.
Should you find this mail a nuisance, I would like to apologize for it here!
Big brother is watching you?






From tim.one@comcast.net  Mon Jun  3 18:32:27 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 03 Jun 2002 13:32:27 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBJPKAA.tim.one@comcast.net>

[Guido]
>>> - Maybe changing the write buffer to a list makes sense too?

[mgilfix@eecs.tufts.edu]
>>  I could do this. Then just do a join before the flush. Is the append
>> /that/ much faster?

[Guido]
> Depends on how small the chunks are you write.  Roughly, repeated list
> append is O(N log N), while repeated string append is O(N**2).

Repeated list append is O(N) amortized (a single append may take O(N) time
all by itself, but if you do N of them in a row the time is still no worse
than O(N) overall; a possible conceptual difficulty may arise here because
the value of "N" changes over time, and while growing to a total of size N
may require O(log N) whole-list copies, each of the copies involves far
fewer elements than the final value of N -- if you add up all these smaller
values of N in the worst case, the sum is O(N) wrt the final value of N, and
so it's worst-case O(N) overall wrt the final value of N).
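
To make the buffering idea concrete, here is a minimal sketch of the
list-plus-join approach Guido suggests; the BufferedWriter class and its
sink argument are made-up names, not the actual code from the socket
timeout patch:

    class BufferedWriter:
        def __init__(self, sink):
            self._sink = sink      # any object with a write() method
            self._chunks = []      # pending chunks, appended as they arrive

        def write(self, data):
            # list.append is amortized O(1), so N writes cost O(N) overall
            self._chunks.append(data)

        def flush(self):
            # one join and one write, instead of repeated string
            # concatenation, which is O(N**2) for N small chunks
            self._sink.write("".join(self._chunks))
            self._chunks = []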




From trentm@ActiveState.com  Mon Jun  3 19:04:16 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Mon, 3 Jun 2002 11:04:16 -0700
Subject: [Python-Dev] PYC Magic
In-Reply-To: <m3d6v98i0f.fsf@mira.informatik.hu-berlin.de>; from martin@v.loewis.de on Sun, Jun 02, 2002 at 11:53:20PM +0200
References: <3CF8BCF9.5557.4C49011F@localhost> <2m1ybpu4cm.fsf@starship.python.net> <m3d6v98i0f.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020603110416.A25092@ActiveState.com>

[Martin v. Loewis wrote]
> I think Python should provide an option to never write .pyc files,
> controllable through sys.something.

+1

ActivePython's uninstallation process has a custom action (which uses
the installed Python) to remove .pyc files before the MSI process
removes the other files. That process *creates* new .pyc files which
makes uninstallation a little bit of a pain.


Trent

-- 
Trent Mick
TrentM@ActiveState.com



From mgilfix@eecs.tufts.edu  Mon Jun  3 19:39:06 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Mon, 3 Jun 2002 14:39:06 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jun 03, 2002 at 01:22:16PM -0400
References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020603143906.J19838@eecs.tufts.edu>

On Mon, Jun 03 @ 13:22, Guido van Rossum wrote:
> >   If you made it this far, it's time for coffee.
> 
> When can I expect a new version?

  Give me a day or two to address these points and produce
the new version.

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From python@rcn.com  Mon Jun  3 19:51:20 2002
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 3 Jun 2002 14:51:20 -0400
Subject: [Python-Dev] Draft Guide for code migration and modernation
Message-ID: <001a01c20b2f$a7518060$7eec7ad1@othello>

Here is my cut at the migration and modernization guide.

Comments are welcome.

Walter and Neal, would you like to add the somewhat more involved steps for
eliminating the types and string modules?


Raymond Hettinger

-----------------------------------------------

Code Modernization and Migration Guide


Pattern:    if d.has_key(k):  --> if k in d:
Idea:   For testing dictionary membership, use the 'in' keyword instead of
the 'has_key()' method.
Version:   2.2 or greater
Benefits:   The result is shorter and more readable. The style becomes
consistent with tests for membership in lists.  The result is slightly
faster because has_key requires an attribute search.
Locating: grep has_key
Contra-indications:
1. if dictlike.has_key(k) ## objects like shelve do not define
__contains__()



Pattern:    for k in d.keys()  -->  for k in d
                for k in d.items() --> for k in d.iteritems()
                for k in d.values() -->  for k in d.itervalues()
Idea:   Use the new iter methods for looping over dictionaries
Version:   2.2 or greater
Benefits:   The iter methods are faster because they do not have to create a
new list object with a complete copy of all of the keys, values, or items.
Selecting only keys, items, or values as needed saves the time for creating
unused object references and, in the case of items, saves a second hash
look-up of the key.
Contra-indications:
1. def getids():  return d.keys()  ## do not change the return type
2. for k in dictlike.keys() ## objects like shelve do not define itermethods
3. k = d.keys(); j = k[:]   ## iterators do not support slicing, sorting or
other operations
4. for k in d.keys(): del d[k] ## dict iterators prohibit modifying the
dictionary


Pattern:    if v == None  -->  if v is None:
Idea:   Since there is only one None object, it can be tested with identity.
Version:   Any
Benefits:   Identity tests are slightly faster than equality tests. Also,
some object types may overload comparison to be much slower (or even break).
Locating: grep '== None' or grep '!= None'



Pattern:    os.stat("foo")[stat.ST_MTIME] --> os.stat("foo").st_mtime
                os.stat("foo")[stat.ST_MTIME] --> os.path.getmtime("foo")
Idea:   Replace stat constants or indices with the new stat attributes
Version:   2.2 or greater
Benefits:   The methods are not order dependent and do not require an import
of the stat module
Locating: grep os.stat


Pattern:    import whrandom --> import random
Idea:   Replace deprecated module
Version:   2.1 or greater
Benefits:   All random methods collected in one place
Locating:   grep whrandom
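
For illustration, the patterns above in one small before/after snippet
(the dictionary contents and the os.stat() argument are placeholders, and
iteritems() is the 2.2 spelling):

    import os, random

    d = {'spam': 1, 'eggs': 2}

    if 'spam' in d:                    # instead of d.has_key('spam')
        pass
    for k, v in d.iteritems():         # instead of d.items()
        pass
    v = None
    if v is None:                      # instead of v == None
        pass
    mtime = os.stat('.').st_mtime      # instead of os.stat('.')[stat.ST_MTIME]
    x = random.random()                # instead of whrandom.random()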






From skip@pobox.com  Mon Jun  3 20:00:22 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 3 Jun 2002 14:00:22 -0500
Subject: [Python-Dev] "max recursion limit exceeded" canned response?
In-Reply-To: <m3hekl8i3h.fsf@mira.informatik.hu-berlin.de>
References: <15610.6455.96035.742110@12-248-41-177.client.attbi.com>
 <m3hekl8i3h.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15611.48326.400187.797188@beluga.mojam.com>

    >> How would we go about adding a canned response to the commonly
    >> submitted "max recursion limit exceeded" bug report?

    Martin> Post the precise text that you want to see as the canned
    Martin> response, and somebody can install it.

How about:

    The max recursion limit problem in the re module is well-known.  Until
    this limitation in the implementation is removed, to work around it
    check

        http://www.python.org/dev/doc/devel/lib/module-re.html
        http://python.org/sf/493252

Note that the examples in the CVS version of the re module do contain some
tips for working around the problem, however they haven't yet percolated to
the main doc set.

Skip




From mwh@python.net  Mon Jun  3 20:04:03 2002
From: mwh@python.net (Michael Hudson)
Date: 03 Jun 2002 20:04:03 +0100
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: Guido van Rossum's message of "Mon, 03 Jun 2002 13:22:16 -0400"
References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <2mvg90mbfg.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> OK, but given the issues the first version had, I recommand that the
                                                    ^^^^^^^^^
I *like* this typo :)

> code gets more review and that you write unit tests for all cases.

Cheers,
M.

-- 
  I've reinvented the idea of variables and types as in a
  programming language, something I do on every project.
                                          -- Greg Ward, September 1998



From skip@pobox.com  Mon Jun  3 20:05:33 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 3 Jun 2002 14:05:33 -0500
Subject: [Python-Dev] Draft Guide for code migration and modernation
In-Reply-To: <001a01c20b2f$a7518060$7eec7ad1@othello>
References: <001a01c20b2f$a7518060$7eec7ad1@othello>
Message-ID: <15611.48637.35273.683341@beluga.mojam.com>

    Raymond> Pattern:    if d.has_key(k):  --> if k in d:
    Raymond> Idea:   For testing dictionary membership, use the 'in' keyword
    Raymond> instead of the 'has_key()' method.
    Raymond> Version:   2.2 or greater
    Raymond> Benefits:   The result is shorter and more readable. The style
    Raymond> becomes consistent with tests for membership in lists.  The
    Raymond> result is slightly faster because has_key requires an attribute
    Raymond> search.

Also faster (I think) because it avoids executing the expensive
CALL_FUNCTION opcode.  (Probably applies to the d.keys() part of second
pattern as well.)
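
For example, disassembling the two spellings shows the extra work; a rough
sketch (exact opcode names vary between Python versions, and has_key() is
only compiled here, never actually called):

    import dis

    def with_method(d, k):
        return d.has_key(k)     # attribute lookup plus a call opcode

    def with_in(d, k):
        return k in d           # a single comparison/containment opcode

    dis.dis(with_method)
    dis.dis(with_in)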

    Raymond> Pattern:    if v == None  -->  if v is None:
    Raymond> Idea:   Since there is only one None object, it can be tested
    Raymond> with identity. 
    Raymond> Version:   Any
    Raymond> Benefits:   Identity tests are slightly faster than equality
    Raymond> tests. Also, some object types may overload comparison to be
    Raymond> much slower (or even break). 
    Raymond> Locating: grep '== None' or grep '!= None'

Also:

    if v: --> if v is None

where appropriate (often when testing function arguments that default to
None).  This may change semantics though and has to be undertaken with some
care.

Skip



From tim.one@comcast.net  Mon Jun  3 20:04:50 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 03 Jun 2002 15:04:50 -0400
Subject: [Python-Dev] "max recursion limit exceeded" canned response?
In-Reply-To: <15611.48326.400187.797188@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECDPKAA.tim.one@comcast.net>

[Skip Montanaro]
> How about:
>
>     The max recursion limit problem in the re module is well-known.  Until
>     this limitation in the implementation is removed, to work around it
>     check
>
>         http://www.python.org/dev/doc/devel/lib/module-re.html
>         http://python.org/sf/493252

I've added this as a canned response, with name "SRE max recursion limit".
Thanks!




From skip@pobox.com  Mon Jun  3 20:18:37 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 3 Jun 2002 14:18:37 -0500
Subject: [Python-Dev] Draft Guide for code migration and modernation
In-Reply-To: <15611.48637.35273.683341@beluga.mojam.com>
References: <001a01c20b2f$a7518060$7eec7ad1@othello>
 <15611.48637.35273.683341@beluga.mojam.com>
Message-ID: <15611.49421.389497.166235@beluga.mojam.com>

    Skip> Also:

    Skip>     if v: --> if v is None

Ack!!!  Obviously I got the sense of the test backwards:

    if v: --> if v is not None:
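
For example (a made-up helper, just to show where the two tests differ
for falsy-but-not-None arguments):

    def append_item(item, seq=None):
        if seq is None:          # only a *missing* argument gets a fresh list
            seq = []
        seq.append(item)
        return seq

    existing = []
    append_item(1, existing)     # appends to the caller's list
    # existing is now [1]; with "if not seq:" the empty caller-supplied
    # list would have been silently replaced by a new one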

Skip



From neal@metaslash.com  Mon Jun  3 20:32:53 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 03 Jun 2002 15:32:53 -0400
Subject: [Python-Dev] Draft Guide for code migration and modernation
References: <001a01c20b2f$a7518060$7eec7ad1@othello>
Message-ID: <3CFBC465.9CE292D7@metaslash.com>

Raymond Hettinger wrote:
> 
> Here is my cut at the migration and modernization guide.
> 
> Comments are welcome.
> 
> Walter and Neal, would you like to add the somewhat more involved steps for
> eliminating the types and strings modules.

Here's some more.  Note the last one.  Martin wanted to make sure this made
it into whatsnew.  I have already changed a few, one in Bdb I think.
I will be changing TclError also.  This could be a problem if anyone
assumed these exceptions would be strings.

Neal
--

Pattern:  import types ; type(v) == types.IntType  -->  isinstance(v, int)
          isinstance(s, types.StringTypes)  -->  isinstance(s, basestring)
Idea:     The types module will likely be deprecated in the future.
Version:  2.2 or greater
Benefits: May be slightly faster, avoid a deprecated feature.
Locating: grep types *.py | grep import

Pattern:  import string ; string.method(s, ...)  -->  s.method(...)
          c in string.whitespace --> c.isspace()
Idea:     The string module will likely be deprecated in the future.
Version:  2.0 or greater
Benefits: Slightly faster, avoid a deprecated feature.
Locating: grep string *.py | grep import

Pattern:  NewError = 'NewError' --> class NewError(Exception): pass
Idea:     String exceptions are deprecated, derive from Exception base class.
Version:  Any
Benefits: String exceptions will not work in future versions.  Allows except Exception: clause to work.
Locating: Use PyChecker
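
For illustration, the three patterns combined in one small snippet
(made-up names, 2.2-era spelling):

    class NewError(Exception):       # instead of NewError = 'NewError'
        pass

    def check(v, s, c):
        if isinstance(v, int):       # instead of type(v) == types.IntType
            pass
        s = s.lower()                # instead of string.lower(s)
        return c.isspace()           # instead of c in string.whitespace

    try:
        raise NewError('something went wrong')
    except Exception:                # works because NewError derives from Exception
        pass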



From guido@python.org  Mon Jun  3 20:42:49 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 15:42:49 -0400
Subject: [Python-Dev] Draft Guide for code migration and modernation
In-Reply-To: Your message of "Mon, 03 Jun 2002 15:32:53 EDT."
 <3CFBC465.9CE292D7@metaslash.com>
References: <001a01c20b2f$a7518060$7eec7ad1@othello>
 <3CFBC465.9CE292D7@metaslash.com>
Message-ID: <200206031942.g53Jgnq17951@pcp742651pcs.reston01.va.comcast.net>

> Pattern:  NewError = 'NewError' --> class NewError(Exception): pass
> Idea:     String exceptions are deprecated, derive from Exception base class.
> Version:  Any
> Benefits: String exceptions will not work in future versions.  Allows except Exception: clause to work.
> Locating: Use PyChecker

Should also warn against class exceptions not deriving from Exception.

Be careful about generic phrases like "String exceptions will not work
in future versions."  Some people (especially those who tend to fear
change ;-) start to panic when they read this, thinking it might be in
2.4.  I don't think we'll be able to delete string exceptions before
Python 3.0, so you can be explicit in this case.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip@pobox.com  Mon Jun  3 19:41:22 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 3 Jun 2002 13:41:22 -0500
Subject: [Python-Dev] zap _Py prefix?
Message-ID: <15611.47186.828411.542771@beluga.mojam.com>

The issue of Michael's static PyTimeout_Err symbol reminded me about a
question I had about _Py-prefixed symbols.  I realize they are all
"internal", but I also recall Tim saying a couple of times that the ANSI C
standard reserves all symbols which begin with underscores for use by
compiler writers.

Should the _Py-prefixed symbols be renamed, for example, from

    _PyUnicode_IsDecimalDigit

to

    Py__Unicode_IsDecimalDigit

?  If so, we would then declare that all external symbols which begin with
"Py__" were part of the private API.  We would of course add macro
definitions during the deprecation period:

    #define _PyUnicode_IsDecimalDigit Py__Unicode_IsDecimalDigit

(It would also be nice to #warn when the macros are used.  Is that possible
with the C preprocessor?)

Skip



From neal@metaslash.com  Mon Jun  3 20:54:20 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 03 Jun 2002 15:54:20 -0400
Subject: [Python-Dev] Draft Guide for code migration and modernation
References: <001a01c20b2f$a7518060$7eec7ad1@othello>
 <3CFBC465.9CE292D7@metaslash.com> <200206031942.g53Jgnq17951@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3CFBC96C.C29CBD98@metaslash.com>

Guido van Rossum wrote:
> 
> > Pattern:  NewError = 'NewError' --> class NewError(Exception): pass
> > Idea:     String exceptions are deprecated, derive from Exception base class.
> > Version:  Any
> > Benefits: String exceptions will not work in future versions.  Allows except Exception: clause to work.
> > Locating: Use PyChecker
> 
> Should also warn against class exceptions not deriving from Exception.

I was going to add this check, but I then I noticed I already had. :-)

> Be careful about generic phrases like "String exceptions will not work
> in future versions."  Some people (especially those who tend to fear
> change ;-) start to panic when they read this, thinking it might be in
> 2.4.

Maybe people won't bitch at you so much for things like bool
and will start bitching at me. :-)

On a somewhat related note, I was perusing the Perl 5.8 RC1 notes.
While there was probably not any incompatible change as "major" as bool,
there were many, many "minor" incompatible changes.  Most seemed 
pretty small and had been warned about in the past.

1. I think Python is doing a good job wrt change.
2. Perhaps, there needs to be stronger warnings about
   deprecated or questionable features.  (We seem to be
   working towards this.)

Neal

PS I don't view the bool change as major and I think it was a good change.



From guido@python.org  Mon Jun  3 21:32:37 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 16:32:37 -0400
Subject: [Python-Dev] zap _Py prefix?
In-Reply-To: Your message of "Mon, 03 Jun 2002 13:41:22 CDT."
 <15611.47186.828411.542771@beluga.mojam.com>
References: <15611.47186.828411.542771@beluga.mojam.com>
Message-ID: <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net>

> Should the _Py-prefixed symbols be renamed, for example, from
> 
>     _PyUnicode_IsDecimalDigit
> 
> to
> 
>     Py__Unicode_IsDecimalDigit

I've replied to this and I'll reply again.  When a C compiler is
spotted that defines a conflicting symbol or that refuses to compile
our code because of this, it's early enough to change this.

> ?  If so, we would then declare that all external symbols which begin with
> "Py__" were part of the private API.

The problem is that Py__ doesn't scream "internal" like "_Py" does.
If we had to, I'd propose "_py".

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Mon Jun  3 21:29:12 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 3 Jun 2002 16:29:12 -0400
Subject: [Python-Dev] zap _Py prefix?
References: <15611.47186.828411.542771@beluga.mojam.com>
 <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <15611.53656.944393.842462@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> The problem is that Py__ doesn't scream "internal" like "_Py"
    GvR> does.  If we had to, I'd propose "_py".

Or how 'bout "mypy_". :)

keep-your-hands-off-of-my-pie-ly y'rs,
-Barry




From python@rcn.com  Mon Jun  3 21:42:11 2002
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 3 Jun 2002 16:42:11 -0400
Subject: [Python-Dev] Silent Deprecation
Message-ID: <005a01c20b3f$239c82a0$a9e77ad1@othello>

Did we ever decide how to implement silent deprecation in the docs?

Some of the choices were:
1.  Delete it from the current docs (making Fred cringe)
2.  Move it to a separate section of the docs
3.  Add a note to the docs.  Perhaps \dissuade{apply()}{Use
func(*args,**kwds) instead}


Raymond Hettinger




From tim.one@comcast.net  Mon Jun  3 21:49:17 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 03 Jun 2002 16:49:17 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects object.c,2.175,2.176
In-Reply-To: <E17Dscu-0007DU-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECLPKAA.tim.one@comcast.net>

[Guido]
> Modified Files:
> 	object.c
> Log Message:
> Implement the intention of SF patch 472523 (but coded differently).
>
> In the past, an object's tp_compare could return any value.  In 2.2
> the docs were tightened to require it to return -1, 0 or 1; and -1 for
> an error.
>
> We now issue a warning if the value is not in this range.
> ...
>
> I haven't decided yet whether to backport this to 2.2.x.  The patch
> applies fine.  But is it fair to start warning in 2.2.2 about code
> that worked flawlessly in 2.2.1?

If 2.2.x is the Python-in-a-Tie line, I say "no way".  People wearing ties
don't care whether a thing is right or wrong, so long as "it works" they
simply don't want to hear about it at all.  Converting old extensions to
avoid new warnings is a 2.3 task for them.




From guido@python.org  Mon Jun  3 21:55:35 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Jun 2002 16:55:35 -0400
Subject: [Python-Dev] Silent Deprecation
In-Reply-To: Your message of "Mon, 03 Jun 2002 16:42:11 EDT."
 <005a01c20b3f$239c82a0$a9e77ad1@othello>
References: <005a01c20b3f$239c82a0$a9e77ad1@othello>
Message-ID: <200206032055.g53KtZD21491@pcp742651pcs.reston01.va.comcast.net>

> Did we ever decide how to implement silent deprecation in the docs?
> 
> Some of the choices were:
> 1.  Delete it from the current docs (making Fred cringe)
> 2.  Move it to a separate section of the docs
> 3.  Add a note to the docs.  Perhaps \dissuade{apply()}{Use
> func(*args,**kwds) instead}

I'll leave this for Fred to pronounce on.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From David Abrahams" <david.abrahams@rcn.com  Mon Jun  3 21:51:00 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Mon, 3 Jun 2002 16:51:00 -0400
Subject: [Python-Dev] zap _Py prefix?
References: <15611.47186.828411.542771@beluga.mojam.com>  <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <100201c20b40$5f177320$6601a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>

> The problem is that Py__ doesn't scream "internal" like "_Py" does.
> If we had to, I'd propose "_py".

ISO/IEC 9899:1999:
7.1.3 Reserved identifiers
1 Each header declares or defines all identifiers listed in its associated
subclause, and optionally declares or defines identifiers listed in its
associated future library directions subclause and identifiers which are
always reserved either for any use or for use as file scope identifiers.
— All identifiers that begin with an underscore and either an uppercase
letter or another underscore are always reserved for any use.
— All identifiers that begin with an underscore are always reserved for use
as identifiers with file scope in both the ordinary and tag name spaces.



i-liked-mypy_-ly y'rs,

Dave






From mal@lemburg.com  Mon Jun  3 21:57:28 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 03 Jun 2002 22:57:28 +0200
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects object.c,2.175,2.176
References: <LNBBLJKPBEHFEDALKOLCEECLPKAA.tim.one@comcast.net>
Message-ID: <3CFBD838.5090205@lemburg.com>

Tim Peters wrote:
> [Guido]
> 
>>Modified Files:
>>	object.c
>>Log Message:
>>Implement the intention of SF patch 472523 (but coded differently).
>>
>>In the past, an object's tp_compare could return any value.  In 2.2
>>the docs were tightened to require it to return -1, 0 or 1; and -1 for
>>an error.
>>
>>We now issue a warning if the value is not in this range.
>>...

Another one of these little changes that slipped my radar...
migration guide candidate.

>>I haven't decided yet whether to backport this to 2.2.x.  The patch
>>applies fine.  But is it fair to start warning in 2.2.2 about code
>>that worked flawlessly in 2.2.1?
> 
> 
> If 2.2.x is the Python-in-a-Tie line, I say "no way".  People wearing ties
> don't care whether a thing is right or wrong, so long as "it works" they
> simply don't want to hear about it at all.  Converting old extensions to
> avoid new warnings is a 2.3 task for them.

Since when do you wear a tie, Tim ? ;-) (or have you found a
new employer requiring this... comcast.net ?)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From mgilfix@eecs.tufts.edu  Mon Jun  3 22:00:53 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Mon, 3 Jun 2002 17:00:53 -0400
Subject: [Python-Dev] zap _Py prefix?
In-Reply-To: <100201c20b40$5f177320$6601a8c0@boostconsulting.com>; from david.abrahams@rcn.com on Mon, Jun 03, 2002 at 04:51:00PM -0400
References: <15611.47186.828411.542771@beluga.mojam.com> <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net> <100201c20b40$5f177320$6601a8c0@boostconsulting.com>
Message-ID: <20020603170053.C2362@eecs.tufts.edu>

  There's also py backwards:

    yp_func

  or even:

   yP_func

  I think I like the last one better.

On Mon, Jun 03 @ 16:51, David Abrahams wrote:
> 
> From: "Guido van Rossum" <guido@python.org>
> 
> > The problem is that Py__ doesn't scream "internal" like "_Py" does.
> > If we had to, I'd propose "_py".
> 
> ISO/IEC 9899:1999:
> 7.1.3 Reserved identifiers
> 1 Each header declares or defines all identifiers listed in its associated
> subclause, and
> optionally declares or defines identifiers listed in its associated future
> library directions
> subclause and identifiers which are always reserved either for any use or
> for use as file
> scope identifiers.
> — All identifiers that begin with an underscore and either an uppercase
> letter or another
> underscore are always reserved for any use.
> — All identifiers that begin with an underscore are always reserved for use
> as identifiers with file scope in both the ordinary and tag name spaces.
> 
> 
> 
> i-liked-mypy_-ly y'rs,
> 
> Dave
> 
> 
> 
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
`-> (david.abrahams)

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From David Abrahams" <david.abrahams@rcn.com  Mon Jun  3 22:06:58 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Mon, 3 Jun 2002 17:06:58 -0400
Subject: [Python-Dev] zap _Py prefix?
References: <15611.47186.828411.542771@beluga.mojam.com> <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net> <100201c20b40$5f177320$6601a8c0@boostconsulting.com> <20020603170053.C2362@eecs.tufts.edu>
Message-ID: <106001c20b43$48870dc0$6601a8c0@boostconsulting.com>

From: "Michael Gilfix" <mgilfix@eecs.tufts.edu>


>   There's also py backwards:
> 
>     yp_func
> 
>   or even:
> 
>    yP_func
> 
>   I think I like the last one better.

Yeep! That's unpronounceably delicious!

frosted-lucky-charms-ly y'rs,
dave




From tim.one@comcast.net  Mon Jun  3 22:23:26 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 03 Jun 2002 17:23:26 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects
 object.c,2.175,2.176
In-Reply-To: <3CFBD838.5090205@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECOPKAA.tim.one@comcast.net>

[M.-A. Lemburg]
> ...
> Since when do you wear a tie, Tim ? ;-) (or have you found a
> new employer requiring this... comcast.net ?)

Ya, I'm now a sales rep for Comcast, selling cable TV door to door in rural
Virginia.  It's pretty much a nightmare, as they haven't yet laid any cable
in rural Virginia, and almost 10% of my customers ask for their money back
when they discover I've just sold them the right to get cable if it ever
comes to their area.  Still, it beats working for Guido.

so-glad-you-asked-and-btw-do-you-have-a-second-tv?-ly y'rs  - tim




From barry@zope.com  Mon Jun  3 22:28:42 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 3 Jun 2002 17:28:42 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects
 object.c,2.175,2.176
References: <3CFBD838.5090205@lemburg.com>
 <LNBBLJKPBEHFEDALKOLCMECOPKAA.tim.one@comcast.net>
Message-ID: <15611.57226.657099.191575@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:

    TP> [M.-A. Lemburg]
    >> ...  Since when do you wear a tie, Tim ? ;-) (or have you found
    >> a new employer requiring this... comcast.net ?)

    TP> Ya, I'm now a sales rep for Comcast, selling cable TV door to
    TP> door in rural Virginia.  It's pretty much a nightmare, as they
    TP> haven't yet laid any cable in rural Virginia, and almost 10%
    TP> of my customers ask for their money back when they discover
    TP> I've just sold them the right to get cable if it ever comes to
    TP> their area.  Still, it beats working for Guido.

And if you slip him $50 he'll hook you up to the illicit Uncle Timmy's
Farm Report channel.  Moo.

it's-even-entertaining-in-maryland-ly y'rs,
-Barry



From tim.one@comcast.net  Mon Jun  3 22:45:21 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 03 Jun 2002 17:45:21 -0400
Subject: [Python-Dev] Lazily GC tracking tuples
In-Reply-To: <Pine.LNX.4.44.0205300909110.28843-100000@penguin.theopalgroup.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDAPKAA.tim.one@comcast.net>

[Kevin Jacobs, on Neil's tuple-untracking patch]
> Sorry, I wasn't very clear here.  The patch _does_ fix the performance
> problem by untracking cycle-less tuples when we use the naive version of
> our code (i.e., the one that does not play with the garbage collector).
> However, the performance of the patched GC when compared to our GC-tuned
> code is very similar.

Then Neil's patch is doing all that we could wish of it in this case (you
seem to have counted it as a strike against the patch that it didn't do
better than you can by turning off gc by hand, but that's unrealistic if
so), and then some:

>>> The good news is that another (unrelated) part of our code just became
>>> about 20-40% faster with this patch, though I need to do some fairly
>>> major surgery to isolate why this is so.

That makes it a winner if it doesn't slow non-pathological cases "too much"
(counting your cases as pathological, just because they are <wink>).




From jacobs@penguin.theopalgroup.com  Mon Jun  3 23:04:00 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 3 Jun 2002 18:04:00 -0400 (EDT)
Subject: [Python-Dev] Lazily GC tracking tuples
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEDAPKAA.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0206031751360.6488-100000@penguin.theopalgroup.com>

On Mon, 3 Jun 2002, Tim Peters wrote:
> [Kevin Jacobs, on Neil's tuple-untracking patch]
> > Sorry, I wasn't very clear here.  The patch _does_ fix the performance
> > problem by untracking cycle-less tuples when we use the naive version of
> > our code (i.e., the one that does not play with the garbage collector).
> > However, the performance of the patched GC when compared to our GC-tuned
> > code is very similar.
> 
> Then Neil's patch is doing all that we could wish of it in this case (you
> seem to have counted it as a strike against the patch that it didn't do
> better than you can by turning off gc by hand, but that's unrealistic if
> so), and then some:

I didn't count it as a strike against the patch -- I had just hoped that
untracking tuples would result in faster execution than turning GC off and
letting my heap swell obscenely.  One extreme case could happen if I turn
off GC, run my code, and let it fill all of my real memory with tuples, and
start swapping to disk.  Clearly, keeping GC enabled with the tuple
untracking patch would result in huge performance gains.  This is not the
situation I was dealing with, though I was hoping for a relatively smaller
improvement from having a more compact and (hopefully) less fragmented heap.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From mwh@python.net  Mon Jun  3 23:23:44 2002
From: mwh@python.net (Michael Hudson)
Date: 03 Jun 2002 23:23:44 +0100
Subject: [Python-Dev] zap _Py prefix?
In-Reply-To: Michael Gilfix's message of "Mon, 3 Jun 2002 17:00:53 -0400"
References: <15611.47186.828411.542771@beluga.mojam.com> <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net> <100201c20b40$5f177320$6601a8c0@boostconsulting.com> <20020603170053.C2362@eecs.tufts.edu>
Message-ID: <2m1ybogfwv.fsf@starship.python.net>

Michael Gilfix <mgilfix@eecs.tufts.edu> writes:

>   There's also py backwards:
> 
>     yp_func
> 
>   or even:
> 
>    yP_func
> 
>   I think I like the last one better.

Unfortunately, that screams "Yellow Pages", even to me.

...-let's-call-the-whole-thing-off-ly y'rs
m.
-- 
  Well, yes.  I don't think I'd put something like "penchant for anal
  play" and "able to wield a buttplug" in a CV unless it was relevant
  to the gig being applied for...
                                 -- Matt McLeod, alt.sysadmin.recovery



From gward@python.net  Mon Jun  3 23:50:39 2002
From: gward@python.net (Greg Ward)
Date: Mon, 3 Jun 2002 18:50:39 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: <15611.16296.202088.831238@anthem.wooz.org>
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com> <20020601024553.59600.qmail@web9607.mail.yahoo.com> <20020601025739.GA17229@gerg.ca> <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net> <20020601143855.GA18632@gerg.ca> <01b401c20ade$13c7bdb0$0900a8c0@spiff> <15611.16296.202088.831238@anthem.wooz.org>
Message-ID: <20020603225039.GA6787@gerg.ca>

On 03 June 2002, Barry A. Warsaw said:
> 
> >>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:
> 
>     FL>     (setq font-lock-support-mode 'lazy-lock-mode)
> 
> (add-hook 'font-lock-mode-hook 'turn-on-fast-lock)

Neither one worked for me (XEmacs 21.4.6).  You're *never* going to
believe what did work:
  load a font-locked file
  go to Options menu
  go to "Syntax Highlighting"
  select "Lazy lock"
  back to Options menu
  select "Save Options ..."
  restart XEmacs

XEmacs seems to have added this bit of gibberish, err sorry line of Lisp
code, to my ~/.xemacs/custom.el:

'(lazy-lock-mode t nil (lazy-lock))

The wonderful thing about (X)Emacs is that there are so very many ways
for it not to do what you want it to do, and every one of those ways
just might work in some version of (X)Emacs somewhere...

        Greg
-- 
Greg Ward - Python bigot                                gward@python.net
http://starship.python.net/~gward/
Vote anarchist.



From tim.one@comcast.net  Mon Jun  3 23:51:33 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 03 Jun 2002 18:51:33 -0400
Subject: [Python-Dev] Lazily GC tracking tuples
In-Reply-To: <Pine.LNX.4.44.0206031751360.6488-100000@penguin.theopalgroup.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEDEPKAA.tim.one@comcast.net>

[Kevin Jacobs]
> I didn't count it as a strike against the patch -- I had just hoped that
> untracking tuples would result in faster execution than turning GC off and
> letting my heap swell obscenely.

Kevin, keep in mind that we haven't run your app:  the only things we know
about it are what you've told us.  That your heap swells obscenely when gc
is off is news to me. *Does* your heap swell obscenely when turning gc off,
but does not swell obscenely if you keep gc on?  If so, you should keep in
mind too that the best way to speed gc is not to create cyclic trash to
begin with.

> One extreme case could happend if I turn off GC, run my code, and let it
> fill all of my real memory with tuples, and start swapping to disk.
> Clearly, keeping GC enabled with the tuple untracking patch would result
> in huge performance gains.

Sorry, this isn't clear at all.  If your tuples aren't actually in cycles,
then whether GC is on or off is irrelevant to how long they live, and to how
much memory they consume.  It doesn't cost any extra memory (not even one
byte) for a tuple to live in a gc list; on the flip side, no memory is saved
by Neil's patch.

> This is not the situation I was dealing with, though I was hoping for a
> relatively smaller improvement from having a more compact and (hopefully)
> less fragmented heap.

Neil's patch should have no effect on memory use or fragmentation.  It's
only aiming at reducing the *time* spent uselessly scanning and rescanning
and rescanning and ... tuples in bad cases.  So long as your tuples stay
alive, they're going to consume the same amount of memory with or without
the patch, and whether or not you disable gc.




From gward@python.net  Tue Jun  4 00:46:12 2002
From: gward@python.net (Greg Ward)
Date: Mon, 3 Jun 2002 19:46:12 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEMDPJAA.tim.one@comcast.net>
References: <20020601220529.GA20025@gerg.ca> <LNBBLJKPBEHFEDALKOLCKEMDPJAA.tim.one@comcast.net>
Message-ID: <20020603234612.GA6891@gerg.ca>

On 01 June 2002, Tim Peters said:
> I take it you don't spend much time surveying the range of computer science
> literature <wink>.

*snort*  I went to grad school in CS.  Wasn't that enough?  ;->

> Search for
> 
>     Knuth hyphenation
> 
> instead.  Three months later, the best advice you'll have read is to avoid
> hyphenation entirely.

I have no desire to put auto-hyphenation into the Python standard
library -- isn't the whole world trying to get *away* from (natural)
language-specific code?  It's wonderful that Knuth came up with the
algorithm, and even more wonderful that Andrew implemented it for us in
Python.  My wrapping algorithm respects hyphens according to the
English-language conventions I learned in school, augmented by my
peculiar need to handle strings like "-b" and "--file".  But that's all
I need.

> Doing
> justification with fixed-width fonts is like juggling dirt anyway <wink>.

Don't worry, I have even less intention of going there.

        Greg
-- 
Greg Ward - Linux nerd                                  gward@python.net
http://starship.python.net/~gward/
Never put off till tomorrow what you can put off till the day after tomorrow.



From greg@cosc.canterbury.ac.nz  Tue Jun  4 01:19:37 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Jun 2002 12:19:37 +1200 (NZST)
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: <15607.2869.648570.950260@anthem.wooz.org>
Message-ID: <200206040019.MAA06851@s454.cosc.canterbury.ac.nz>

barry@zope.com (Barry A. Warsaw):

> I'm probably somewhat influenced too by
> my early C++ days when we adopted a one class per .h file (and one
> class implementation per .cc file).  IIRC, Objective-C also encouraged
> this granularity of organization.

Deciding how to split things up into files is not such
a big issue in C-related languages, because file
organisation is not tied to naming. You can change
your mind about it without having to change any of
the code which refers to the affected items.

In Python, one is encouraged to put more thought into
the matter, because it affects how things are named.
One-class-per-module is convenient for editing, but
it introduces an extra unneeded level into the
naming hierarchy.

It's unfortunate that editing convenience and naming
convenience seem to be in conflict in Python. Maybe
a folding editor is the answer...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+





From barry@zope.com  Tue Jun  4 02:35:32 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 3 Jun 2002 21:35:32 -0400
Subject: [Python-Dev] Re: Adding Optik to the standard library
References: <NBBBIOJPGKJEKIECEMCBCEIGNCAA.pobrien@orbtech.com>
 <20020601024553.59600.qmail@web9607.mail.yahoo.com>
 <20020601025739.GA17229@gerg.ca>
 <200206010427.g514Ri219383@pcp742651pcs.reston01.va.comcast.net>
 <20020601143855.GA18632@gerg.ca>
 <01b401c20ade$13c7bdb0$0900a8c0@spiff>
 <15611.16296.202088.831238@anthem.wooz.org>
 <20020603225039.GA6787@gerg.ca>
Message-ID: <15612.6500.399598.338375@anthem.wooz.org>

>>>>> "GW" == Greg Ward <gward@python.net> writes:

    GW> Neither one worked for me (XEmacs 21.4.6).

Well of course (wink) you also have to (require 'fast-lock).

    | You're *never*
    | going to believe what did work:
    |   load a font-locked file
    |   go to Options menu
    |   go to "Syntax Highlighting"
    |   select "Lazy lock"
    |   back to Options menu
    |   select "Save Options ..."
    |   restart XEmacs

Oh, but I do believe it.

    GW> XEmacs seems to have added this bit of gibberish, err sorry
    GW> line of Lisp code, to my ~/.xemacs/custom.el:

    GW> '(lazy-lock-mode t nil (lazy-lock))

Pshhh.  D'oh.  Obvious.

    GW> The wonderful thing about (X)Emacs is that there are so very
    GW> many ways for it not to do what you want it to do, and every
    GW> one of those ways just might work in some version of (X)Emacs
    GW> somewhere...

how-many-more-would-you-like?-ly y'rs,
-Barry



From jacobs@penguin.theopalgroup.com  Tue Jun  4 02:52:31 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 3 Jun 2002 21:52:31 -0400 (EDT)
Subject: [Python-Dev] Lazily GC tracking tuples
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEDEPKAA.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0206032127190.7099-100000@penguin.theopalgroup.com>

On Mon, 3 Jun 2002, Tim Peters wrote:
> [Kevin Jacobs]
> > I didn't count it as a strike against the patch -- I had just hoped that
> > untracking tuples would result in faster execution than turning GC off and
> > letting my heap swell obscenely.
> 
> Kevin, keep in mind that we haven't run your app:  the only things we know
> about it are what you've told us.  That your heap swells obscenely when gc
> is off is news to me. *Does* your heap swell obscenely when turning gc off,
> but does not swell obscenely if you keep gc on?  If so, you should keep in
> mind too that the best way to speed gc is not to create cyclic trash to
> begin with.

This part of my app allocates a mix of several kinds of objects.  Some are
cyclic tuple trees, many are acyclic tuples, and others are complex class
instances.  To make things worse, the object lifetimes vary from ephemeral
to very long-lived.  Due to this mix, turning GC off *does* cause the heap
to swell, and we have to very carefully monitor the algorithm to relieve the
swelling frequently enough to prevent eating too much system memory, but
infrequently enough that we don't spend all of our time in the GC tracking
through the many acyclic tuples.  Also, due to the dynamic nature of the
algorithm, it is not trivial to avoid creating cyclic trash.

> > One extreme case could happend if I turn off GC, run my code, and let it
> > fill all of my real memory with tuples, and start swapping to disk.
> > Clearly, keeping GC enabled with the tuple untracking patch would result
> > in huge performance gains.
> 
> Sorry, this isn't clear at all.  If your tuples aren't actually in cycles,
> then whether GC is on or off is irrelevant to how long they live, and to how
> much memory they consume.  It doesn't cost any extra memory (not even one
> byte) for a tuple to live in a gc list; on the flip side, no memory is saved
> by Neil's patch.

But I do generate cyclic trash, and it does build up when I turn off GC.
So, if it runs long enough with GC turned off, the machine will run out of
real memory.  This is all that I am saying.

> > This is not the situation I was dealing with, though I was hoping for a
> > relatively smaller improvement from having a more compact and (hopefully)
> > less fragmented heap.
> 
> Neil's patch should have no effect on memory use or fragmentation.  It's
> only aiming at reducing the *time* spent uselessly scanning and rescanning
> and rescanning and ... tuples in bad cases.  So long as your tuples stay
> alive, they're going to consume the same amount of memory with or without
> the patch, and whether or not you disable gc.

With Neil's patch and GC turned ON:

  Python untracks many, many tuples that live for several generations and
  that would otherwise have to be scanned to dispose of the accumulating
  cyclic garbage.  Some GC overhead is observed, due to normal periodic
  scanning and the extra work needed to untrack acyclic tuples.

without Neil's patch and automatic GC turned OFF:

  Python does not spend any time scanning for garbage, and things are very
  fast until we run out of core memory, or the patterns of allocations
  result in an extremely fragmented heap due to interleaved allocation of
  objects with very heterogeneous lifetimes.  At certain intervals, GC would
  be triggered manually to release the cyclic trash.
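
The "GC off, collect by hand" regime described above looks roughly like
this (a minimal sketch, not our actual code; work_items, process() and
the collection interval are made up for illustration):

    import gc

    gc.disable()                     # stop automatic cyclic collection
    n = 0
    for item in work_items:          # hypothetical workload
        process(item)
        n += 1
        if n % 100000 == 0:          # every so often, pay the cost explicitly
            gc.collect()             # release the accumulated cyclic trash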

My hope was that the more compact and hopefully less fragmented heap would
more than offset the overhead of automatic GC scans and the tuple untracking
sweep.

Hopefully my comments are now a little clearer...

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From greg@cosc.canterbury.ac.nz  Tue Jun  4 03:00:06 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Jun 2002 14:00:06 +1200 (NZST)
Subject: [Python-Dev] subclass a module?
In-Reply-To: <15608.15520.96707.809995@anthem.wooz.org>
Message-ID: <200206040200.OAA06890@s454.cosc.canterbury.ac.nz>

barry@zope.com (Barry A. Warsaw):

> Can I now subclass from modules?  And if so, what good does that do
> me?

This seems to be a side effect of two things:
(1) Python 2.2 will accept anything as a base class
whose type is callable with the appropriate arguments,
and (2) types.ModuleType doesn't seem to care what
arguments you give it:

Python 2.2 (#14, May 28 2002, 14:11:27) 
[GCC 2.95.2 19991024 (release)] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> from types import ModuleType
>>> ModuleType()
<module '?' (built-in)>
>>> ModuleType(42)
<module '?' (built-in)>
>>> ModuleType("dead", "parrot")
<module '?' (built-in)>
>>> ModuleType("nobody", "expects", "the", "spanish", "base", "class")
<module '?' (built-in)>
>>> 

So your class statement is simply creating an empty module.

An interesting feature of the new scheme is that the "class"
statement can be used to create things which don't even
remotely resemble classes. Brain-explosion for the masses!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Tue Jun  4 02:17:38 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Jun 2002 13:17:38 +1200 (NZST)
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <20020601220529.GA20025@gerg.ca>
Message-ID: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz>

Greg Ward <gward@python.net>:

> despite being warned just today on the conceptual/philosophical
> danger of classes whose names end in "-er" [1]
> 
> [1] objects should *be*, not *do*, and class names like HelpFormatter
>     and TextWrapper are impositions of procedural abstraction onto
>     OOP.

I disagree with this statement completely. Surely the
concept of objects *doing* things is central to the
whole idea of OO! Why do you think objects have
things called "methods"?-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From martin@v.loewis.de  Tue Jun  4 06:32:52 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 04 Jun 2002 07:32:52 +0200
Subject: [Python-Dev] zap _Py prefix?
In-Reply-To: <100201c20b40$5f177320$6601a8c0@boostconsulting.com>
References: <15611.47186.828411.542771@beluga.mojam.com>
 <200206032032.g53KWbm21270@pcp742651pcs.reston01.va.comcast.net>
 <100201c20b40$5f177320$6601a8c0@boostconsulting.com>
Message-ID: <m33cw37gmz.fsf@mira.informatik.hu-berlin.de>

"David Abrahams" <david.abrahams@rcn.com> writes:

>  All identifiers that begin with an underscore and either an
> uppercase letter or another underscore are always reserved for any
> use.

I agree with Guido that this is not enough incentive to change the
names of these functions. Even though a compiler may use them, no
compiler of interest will.

Regards,
Martin




From guido@python.org  Tue Jun  4 07:08:53 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 02:08:53 -0400
Subject: [Python-Dev] subclass a module?
In-Reply-To: Your message of "Tue, 04 Jun 2002 14:00:06 +1200."
 <200206040200.OAA06890@s454.cosc.canterbury.ac.nz>
References: <200206040200.OAA06890@s454.cosc.canterbury.ac.nz>
Message-ID: <200206040608.g5468rb31173@pcp742651pcs.reston01.va.comcast.net>

> > Can I now subclass from modules?  And if so, what good does that do
> > me?
> 
> This seems to be a side effect of two things:
> (1) Python 2.2 will accept anything as a base class
> whose type is callable with the appropriate arguments,
> and (2) types.ModuleType doesn't seem to care what
> arguments you give it:

Thanks for explaining this!  (I've been away from this so long that it
baffled me a bit. :-)  I've fixed this by making the module
constructor sane: it now requires a name and takes an optional
docstring.
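
Roughly, with that change (a minimal illustration of the intended
behaviour; calling ModuleType() with no arguments now raises TypeError):

>>> from types import ModuleType
>>> m = ModuleType("spam", "an optional docstring")
>>> m.__name__, m.__doc__
('spam', 'an optional docstring')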

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com  Tue Jun  4 08:48:36 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 04 Jun 2002 09:48:36 +0200
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects object.c,2.175,2.176
References: <LNBBLJKPBEHFEDALKOLCMECOPKAA.tim.one@comcast.net>
Message-ID: <3CFC70D4.6050303@lemburg.com>

Tim Peters wrote:
> [M.-A. Lemburg]
> 
>>...
>>Since when do you wear a tie, Tim ? ;-) (or have you found a
>>new employer requiring this... comcast.net ?)
> 
> 
> Ya, I'm now a sales rep for Comcast, selling cable TV door to door in rural
> Virginia.  It's pretty much a nightmare, as they haven't yet laid any cable
> in rural Virginia, and almost 10% of my customers ask for their money back
> when they discover I've just sold them the right to get cable if it ever
> comes to their area.  Still, it beats working for Guido.

Yeah, probably better than squashing dirty bugs living near
wild pythons on a daily basis.

> so-glad-you-asked-and-btw-do-you-have-a-second-tv?-ly y'rs  - tim

Not yet, but I promise to get one as soon as I move to Virginia.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From walter@livinglogic.de  Tue Jun  4 12:58:53 2002
From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Tue, 04 Jun 2002 13:58:53 +0200
Subject: [Python-Dev] Draft Guide for code migration and modernation
References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com>
Message-ID: <3CFCAB7D.4090507@livinglogic.de>

Neal Norwitz wrote:

> [...]
> Pattern:  import string ; string.method(s, ...)  -->  s.method(...)

join and zfill should probably be mentioned separately:
"""
Be careful with string.join(): The order of the arguments is reversed
here.

string.zfill has a "decadent feature": It also works for
non-string objects by calling repr before formatting.
"""

             string.atoi(s, ...)  -->  int(s, ...)
             string.atol(s, ...)  -->  long(s, ...)
             string.atof(s, ...)  -->  float(s, ...)

>           c in string.whitespace --> c.isspace()

This changes the meaning slightly for unicode characters, because
chr(i).isspace() != unichr(i).isspace()
for i in { 0x1c, 0x1d, 0x1e, 0x1f, 0x85, 0xa0 }
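
For example, with U+00A0 (NO-BREAK SPACE), assuming the default C locale
for the 8-bit case:

>>> chr(0xa0).isspace()      # 8-bit string
False
>>> unichr(0xa0).isspace()   # unicode string
True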

New ones:

Pattern:  "foobar"[:3] == "foo" -> "foobar".startswith("foo")
           "foobar"[-3:] == "bar" -> "foobar".endswith("bar")
Version:  ??? (It was added on the string_methods branch)
Benefits: Faster because no slice has to be created.
           No danger of miscounting.
Locating: grep "\[\w*-[0-9]*\w*:\w*\]" | grep "=="
           grep "\[\w*:\w*[0-9]*\w*\]" | grep "=="

Pattern:  import types;
           if hasattr(types, "UnicodeType"):
               foo
           else:
               bar
           -->
           try:
               unicode
           except NameError:
               bar
           else:
               foo
Idea:     The types module will likely be deprecated in the future.
Version:  2.2
Benefits: Avoid a deprecated feature.
Locating: grep "hasattr.*UnicodeType"

Bye,
    Walter Dörwald




From s_lott@yahoo.com  Tue Jun  4 13:03:04 2002
From: s_lott@yahoo.com (Steven Lott)
Date: Tue, 4 Jun 2002 05:03:04 -0700 (PDT)
Subject: [Python-Dev] Re: Adding Optik to the standard library
In-Reply-To: <200206040019.MAA06851@s454.cosc.canterbury.ac.nz>
Message-ID: <20020604120304.17738.qmail@web9605.mail.yahoo.com>

Or a literate programming tool that separates these concerns
nicely.  


--- Greg Ewing <greg@cosc.canterbury.ac.nz> wrote:
> barry@zope.com (Barry A. Warsaw):
> 
> > I'm probably somewhat influenced too by
> > my early C++ days when we adopted a one class per .h file (and one
> > class implementation per .cc file).  IIRC, Objective-C also encouraged
> > this granularity of organization.
> 
> Deciding how to split things up into files is not such
> a big issue in C-related languages, because file
> organisation is not tied to naming. You can change
> your mind about it without having to change any of
> the code which refers to the affected items.
> 
> In Python, one is encouraged to put more thought into
> the matter, because it affects how things are named.
> One-class-per-module is convenient for editing, but
> it introduces an extra unneeded level into the
> naming hierarchy.
> 
> It's unfortunate that editing convenience and naming
> convenience seem to be in conflict in Python. Maybe
> a folding editor is the answer...
> 
> Greg Ewing, Computer Science Dept, +--------------------------------------+
> University of Canterbury,          | A citizen of NewZealandCorp, a        |
> Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
> greg@cosc.canterbury.ac.nz         +--------------------------------------+
> 
> 
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev


=====
--
S. Lott, CCP :-{)
S_LOTT@YAHOO.COM
http://www.mindspring.com/~slott1
Buccaneer #468: KaDiMa

Macintosh user: drinking upstream from the herd.

__________________________________________________
Do You Yahoo!?
Yahoo! - Official partner of 2002 FIFA World Cup
http://fifaworldcup.yahoo.com



From guido@python.org  Tue Jun  4 14:16:11 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 09:16:11 -0400
Subject: [Python-Dev] Draft Guide for code migration and modernation
In-Reply-To: Your message of "Tue, 04 Jun 2002 13:58:53 +0200."
 <3CFCAB7D.4090507@livinglogic.de>
References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com>
 <3CFCAB7D.4090507@livinglogic.de>
Message-ID: <200206041316.g54DGBA00973@pcp742651pcs.reston01.va.comcast.net>

> string.zfill has a "decadent feature": It also works for
> non-string objects by calling repr before formatting.

Hm, but repr() was the wrong thing to call here anyway. :-(

> >           c in string.whitespace --> c.isspace()
> 
> This changes the meaning slightly for unicode characters, because
> chr(i).isspace() != unichr(i).isspace()
> for i in { 0x1c, 0x1d, 0x1e, 0x1f, 0x85, 0xa0 }

That's unfortunate, because I'd like unicode to be an extension of
ASCII also in this kind of functionality.  What are these and why are
they considered spaces?  Would it hurt to make them spaces in ASCII
too?

> New ones:
> 
> Pattern:  "foobar"[:3] == "foo" -> "foobar".startswith("foo")
>            "foobar"[-3:] == "bar" -> "foobar".endswith("bar")
> Version:  ??? (It was added on the string_methods branch)

2.0.

> Benefits: Faster because no slice has to be created.
>            No danger of miscounting.
> Locating: grep "\[\w*-[0-9]*\w*:\w*\]" | grep "=="
>            grep "\[\w*:\w*[0-9]*\w*\]" | grep "=="

Are these regexes really worth making part of the migration guide?
\w* isn't a good pattern to catch an arbitrary expression, it only
catches simple identifiers!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pobrien@orbtech.com  Tue Jun  4 14:23:34 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Tue, 4 Jun 2002 08:23:34 -0500
Subject: [Python-Dev] Draft Guide for code migration and modernation
In-Reply-To: <200206041316.g54DGBA00973@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <NBBBIOJPGKJEKIECEMCBKEABNDAA.pobrien@orbtech.com>

[Guido van Rossum]
> > Locating: grep "\[\w*-[0-9]*\w*:\w*\]" | grep "=="
> >            grep "\[\w*:\w*[0-9]*\w*\]" | grep "=="
>
> Are these regexes really worth making part of the migration guide?
> \w* isn't a good pattern to catch an arbitrary expression, it only
> catches simple identifiers!

Doesn't that make a pretty good case for including (properly formed)
regexes?

---
Patrick K. O'Brien
Orbtech




From pinard@iro.umontreal.ca  Tue Jun  4 14:29:51 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 04 Jun 2002 09:29:51 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz>
Message-ID: <oqlm9vw4s0.fsf@titan.progiciels-bpi.ca>

Hi, people.

For this incoming text wrapper facility, there is a feature that appears
really essential to me, and many others: the protection of full stops[1].

In a previous message, I spoke of Knuth's algorithm as a nice possibility,
but this is merely whipped cream and cherry over the ice cream.  Protection
of full stops does not fall in that decoration category, it is essential.
I mean, for those who care, a wrapper without full stop protection would
be rather unusable when there is more than one sentence to refill.

----------
[1] Full stops are punctuation ending sentences with two spaces guaranteed.
Full stops are defined that way for typography based on fixed width fonts,
like when we say "this many characters to a line".

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From walter@livinglogic.de  Tue Jun  4 15:06:29 2002
From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=)
Date: Tue, 04 Jun 2002 16:06:29 +0200
Subject: [Python-Dev] Draft Guide for code migration and modernation
References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com>              <3CFCAB7D.4090507@livinglogic.de> <200206041316.g54DGBA00973@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3CFCC965.5060200@livinglogic.de>

Guido van Rossum wrote:

>>string.zfill has a "decadent feature": It also works for
>>non-string objects by calling repr before formatting.
> 
> 
> Hm, but repr() was the wrong thing to call here anyway. :-(

The old code used `x`. Should we change it to use str()?

>>>          c in string.whitespace --> c.isspace()
>>
>>This changes the meaning slightly for unicode characters, because
>>chr(i).isspace() != unichr(i).isspace()
>>for i in { 0x1c, 0x1d, 0x1e, 0x1f, 0x85, 0xa0 }
> 
> 
> That's unfortunate, because I'd like unicode to be an extension of
> ASCII also in this kind of functionality.  What are these and why are
> they considered spaces?

http://www.unicode.org/Public/UNIDATA/NamesList.txt says:
001C 
<control>
	= INFORMATION SEPARATOR FOUR
	= file separator (FS)
001D 
<control>
	= INFORMATION SEPARATOR THREE
	= group separator (GS)
001E 
<control>
	= INFORMATION SEPARATOR TWO
	= record separator (RS)
001F 
<control>
	= INFORMATION SEPARATOR ONE
	= unit separator (US)
0085 
<control>
	= NEXT LINE (NEL)
00A0 
NO-BREAK SPACE
	x (space - 0020)
	x (figure space - 2007)
	x (narrow no-break space - 202F)
	x (word joiner - 2060)
	x (zero width no-break space - FEFF)
	# <noBreak> 0020

> Would it hurt to make them spaces in ASCII
> too?

stringobject.c::string_isspace() currently uses the isspace()
function from <ctype.h>.

>>New ones:
>>
>>Pattern:  "foobar"[:3] == "foo" -> "foobar".startswith("foo")
>>           "foobar"[-3:] == "bar" -> "foobar".endswith("bar")
>>Version:  ??? (It was added on the string_methods branch)
> 
> 
> 2.0.
> 
> 
>>Benefits: Faster because no slice has to be created.
>>           No danger of miscounting.
>>Locating: grep "\[\w*-[0-9]*\w*:\w*\]" | grep "=="
>>           grep "\[\w*:\w*[0-9]*\w*\]" | grep "=="
> 
> 
> Are these regexes really worth making part of the migration guide?
> \w* isn't a good pattern to catch an arbitrary expression, it only
> catches simple identifiers!

Ouch, that was meant to be

grep "\[[[:space:]]*-[[:digit:]]*[[:space:]]*:[[:space:]]*\]" | grep "=="
grep "\[[[:space:]]*:[[:space:]]*[[:digit:]]*[[:space:]]*\]" | grep "=="

This doesn't find "foobar"[-len("bar"):]=="bar", only constants.

But at least it's a little better than vgrep. ;)

Bye,
    Walter Dörwald




From akuchlin@mems-exchange.org  Tue Jun  4 15:07:34 2002
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 4 Jun 2002 10:07:34 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <oqlm9vw4s0.fsf@titan.progiciels-bpi.ca>
References: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz> <oqlm9vw4s0.fsf@titan.progiciels-bpi.ca>
Message-ID: <20020604140734.GB1039@ute.mems-exchange.org>

On Tue, Jun 04, 2002 at 09:29:51AM -0400, François Pinard wrote:
>[1] Full stops are punctuation ending sentences with two spaces guaranteed.
>Full stops are defined that way for typography based on fixed width fonts,
>like when we say "this many characters to a line".

I don't think this really matters, because I doubt anyone will be
implementing full justification.  Left justification is just a matter
of inserting newlines at particular points, so if the input data has
two spaces after punctuation, line-breaking won't introduce any errors.

--amk





From guido@python.org  Tue Jun  4 15:11:10 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 10:11:10 -0400
Subject: [Python-Dev] Draft Guide for code migration and modernation
In-Reply-To: Your message of "Tue, 04 Jun 2002 08:23:34 CDT."
 <NBBBIOJPGKJEKIECEMCBKEABNDAA.pobrien@orbtech.com>
References: <NBBBIOJPGKJEKIECEMCBKEABNDAA.pobrien@orbtech.com>
Message-ID: <200206041411.g54EBAI10211@odiug.zope.com>

> Doesn't that make a pretty good case for including (properly formed)
> regexes?

You can't match an expression with a regex.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pobrien@orbtech.com  Tue Jun  4 15:22:51 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Tue, 4 Jun 2002 09:22:51 -0500
Subject: [Python-Dev] Draft Guide for code migration and modernation
In-Reply-To: <200206041411.g54EBAI10211@odiug.zope.com>
Message-ID: <NBBBIOJPGKJEKIECEMCBKEAENDAA.pobrien@orbtech.com>

[Guido van Rossum]
> > Doesn't that make a pretty good case for including (properly formed)
> > regexes?
> 
> You can't match an expression with a regex.

Doh! Sorry. ;-)

---
Patrick K. O'Brien
Orbtech



From mal@lemburg.com  Tue Jun  4 15:17:26 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 04 Jun 2002 16:17:26 +0200
Subject: [Python-Dev] Draft Guide for code migration and modernation
References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com>              <3CFCAB7D.4090507@livinglogic.de> <200206041316.g54DGBA00973@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <3CFCCBF6.8010902@lemburg.com>


Guido van Rossum wrote:
>>>          c in string.whitespace --> c.isspace()
>>
>>This changes the meaning slightly for unicode characters, because
>>chr(i).isspace() != unichr(i).isspace()
>>for i in { 0x1c, 0x1d, 0x1e, 0x1f, 0x85, 0xa0 }
> 
> 
> That's unfortunate, because I'd like unicode to be an extension of
> ASCII also in this kind of functionality.  What are these and why are
> they considered spaces?  Would it hurt to make them spaces in ASCII
> too?

 From the Unicode database:

001C;<control>;Cc;0;B;;;;;N;FILE SEPARATOR;;;;
001D;<control>;Cc;0;B;;;;;N;GROUP SEPARATOR;;;;
001E;<control>;Cc;0;B;;;;;N;RECORD SEPARATOR;;;;
001F;<control>;Cc;0;S;;;;;N;UNIT SEPARATOR;;;;

0085;<control>;Cc;0;B;;;;;N;NEXT LINE;;;;

00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From guido@python.org  Tue Jun  4 15:32:01 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 10:32:01 -0400
Subject: [Python-Dev] Draft Guide for code migration and modernation
In-Reply-To: Your message of "Tue, 04 Jun 2002 16:06:29 +0200."
 <3CFCC965.5060200@livinglogic.de>
References: <001a01c20b2f$a7518060$7eec7ad1@othello> <3CFBC465.9CE292D7@metaslash.com> <3CFCAB7D.4090507@livinglogic.de> <200206041316.g54DGBA00973@pcp742651pcs.reston01.va.comcast.net>
 <3CFCC965.5060200@livinglogic.de>
Message-ID: <200206041432.g54EW1Q10622@odiug.zope.com>

> > Hm, but repr() was the wrong thing to call here anyway. :-(
> 
> The old code used `x`. Should we change it to use str()?

Can't do that, it's an incompatibility.  In a module of mostly
historic importance, it doesn't make sense to change it incompatibly.

> > Would it hurt to make them spaces in ASCII
> > too?
> 
> stringobject.c::string_isspace() currently uses the isspace()
> function from <ctype.h>.

I guess we'll have to live with this difference.  There's not much
harm, since nobody uses these anyway.

> grep "\[[[:space:]]*-[[:digit:]]*[[:space:]]*:[[:space:]]*\]" | grep "=="
> grep "\[[[:space:]]*:[[:space:]]*[[:digit:]]*[[:space:]]*\]" | grep "=="
> 
> This doesn't find "foobar"[-len("bar"):]=="bar", only constants.
> 
> But at least it's a little better than vgrep. ;)

Doesn't answer my question.  I'm doubting the wisdom of including
these grep instructions (correct or otherwise :-) for several reasons:

(1) It doesn't catch all cases (regexes aren't powerful enough to
    match arbitrary expressions)
(2) This recipe is Unix specific
(3) (Most important) it encourages "peephole changes"

By "peephole changes" I mean a very focused search-and-destroy looking
for a pattern and changing it, without looking at anything else.  This
can cause anachronistic code, where one line is modern style, and the
rest of a function uses outdated idioms.  IMO that looks worse than
all old style.  It can also cause bugs to slip in.  In my
recollection, every time someone went in and did a sweep over the
standard library looking for a particular pattern to fix, they
introduced at least one bug.

I much prefer such modernizations to be done only when you have a
reason to look at a particular module anyway, so you really understand
the code before you go in.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@strakt.com  Tue Jun  4 16:08:37 2002
From: martin@strakt.com (Martin =?iso-8859-1?Q?Sj=F6gren?=)
Date: Tue, 4 Jun 2002 17:08:37 +0200
Subject: [Python-Dev] Python 2.3 release schedule
In-Reply-To: <001c01c20474$e1a50140$f9d8accf@othello>
References: <200205242217.g4OMHbu25323@pcp742651pcs.reston01.va.comcast.net> <001c01c20474$e1a50140$f9d8accf@othello>
Message-ID: <20020604150837.GA30078@strakt.com>

On Sun, May 26, 2002 at 01:19:15AM -0400, Raymond Hettinger wrote:
>    ia.filter(pred)          # takewhile
>    ia.invfilter(pred)       # dropwhile

Err. I don't know what you mean with "filter", but in Haskell, there is a
big difference between filter and takeWhile.

Prelude> filter (>3) [1..5]
[4,5]
Prelude> takeWhile (>3) [1..5]
[]
Prelude> dropWhile (>3) [1..5]
[1,2,3,4,5]
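
The same distinction in Python terms, as a rough sketch (hypothetical
helper functions, just to illustrate the semantics; on 2.2 the generators
need the __future__ import):

    from __future__ import generators

    def takewhile(pred, seq):
        # yield items only until the predicate first fails
        for x in seq:
            if not pred(x):
                return
            yield x

    def dropwhile(pred, seq):
        # skip items while the predicate holds, then yield everything else
        it = iter(seq)
        for x in it:
            if not pred(x):
                yield x
                break
        for x in it:
            yield x

>>> filter(lambda x: x > 3, [1, 2, 3, 4, 5])
[4, 5]
>>> list(takewhile(lambda x: x > 3, [1, 2, 3, 4, 5]))
[]
>>> list(dropwhile(lambda x: x > 3, [1, 2, 3, 4, 5]))
[1, 2, 3, 4, 5]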


/Martin

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 7710870       Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html



From neal@metaslash.com  Tue Jun  4 16:17:50 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Tue, 04 Jun 2002 11:17:50 -0400
Subject: [Python-Dev] Changes to PEP 8 & 42
Message-ID: <3CFCDA1E.B0577968@metaslash.com>

I've also got a change to PEP 8 (for basestring).  Is the wording ok?
Should I check it in or should Barry/Guido?

Also, it seems that PEP 42 is a bit out of date.  I think the
stat/statvfs changes may be done (at least started), 
the std library uses 4 space indents, math.radians/degrees were added.
Probably others too.

Neal
--

Index: pep-0008.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0008.txt,v
retrieving revision 1.13
diff -C1 -r1.13 pep-0008.txt
*** pep-0008.txt        29 May 2002 16:07:27 -0000      1.13
--- pep-0008.txt        3 Jun 2002 20:25:54 -0000
***************
*** 542,548 ****
        When checking if an object is a string, keep in mind that it
!       might be a unicode string too!  In Python 2.2, the types module
!       has the StringTypes type defined for that purpose, e.g.:
  
!         from types import StringTypes:
!         if isinstance(strorunicodeobj, StringTypes):
  
--- 542,547 ----
        When checking if an object is a string, keep in mind that it
!       might be a unicode string too!  In Python 2.3, str and unicode
!       have a common base class, basestring, so you can do:
  
!         if isinstance(strorunicodeobj, basestring):



From guido@python.org  Tue Jun  4 16:39:32 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 11:39:32 -0400
Subject: [Python-Dev] Changes to PEP 8 & 42
In-Reply-To: Your message of "Tue, 04 Jun 2002 11:17:50 EDT."
 <3CFCDA1E.B0577968@metaslash.com>
References: <3CFCDA1E.B0577968@metaslash.com>
Message-ID: <200206041539.g54FdWF01219@odiug.zope.com>

> I've also got a change to PEP 8 (for basestring).  Is the wording ok?
> Should I check it in or should Barry/Guido?

You can check it in, but I suggest providing ways of doing this for
2.0/2.1, 2.2, and 2.3 (since they are all different).
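
A rough sketch of what those three variants might look like (hypothetical
helper names, not the PEP's wording):

    import types

    def is_string_20(obj):
        # Python 2.0 / 2.1: no common base class, compare the types directly
        return type(obj) in (types.StringType, types.UnicodeType)

    def is_string_22(obj):
        # Python 2.2: types.StringTypes is the tuple (StringType, UnicodeType)
        return isinstance(obj, types.StringTypes)

    def is_string_23(obj):
        # Python 2.3: str and unicode share the new basestring base class
        return isinstance(obj, basestring)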

> Also, it seems that PEP 42 is a bit out of date.  I think the
> stat/statvfs changes may be done (at least started), 
> the std library uses 4 space indents, math.radians/degrees were added.
> Probably others too.

Whoever fulfilled those wishes should ideally edit the PEP.  You can
do it too.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Tue Jun  4 16:43:29 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 4 Jun 2002 11:43:29 -0400
Subject: [Python-Dev] Changes to PEP 8 & 42
References: <3CFCDA1E.B0577968@metaslash.com>
Message-ID: <15612.57377.547271.755162@anthem.wooz.org>

>>>>> "NN" == Neal Norwitz <neal@metaslash.com> writes:

    NN> I've also got a change to PEP 8 (for basestring).  Is the
    NN> wording ok?  Should I check it in or should Barry/Guido?

I would add the Python 2.3 recommendation to the PEP instead of
replacing the Python 2.2 recommendation.  If you do that, feel free to
commit the change.

-Barry



From martin@v.loewis.de  Tue Jun  4 18:40:27 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: Tue, 4 Jun 2002 19:40:27 +0200
Subject: [Python-Dev] Patch #473512
Message-ID: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de>

I'm ready to apply patch 473512 : getopt with GNU style scanning,
which adds getopt.gnu_getopt.

Any objections?

Regards,
Martin



From guido@python.org  Tue Jun  4 19:04:28 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 14:04:28 -0400
Subject: [Python-Dev] Patch #473512
In-Reply-To: Your message of "Tue, 04 Jun 2002 19:40:27 +0200."
 <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de>
References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de>
Message-ID: <200206041804.g54I4Sd16333@odiug.zope.com>

> I'm ready to apply patch 473512 : getopt with GNU style scanning,
> which adds getopt.gnu_getopt.
> 
> Any objections?

Is there a point to adding more cruft to getopt.py now that we're
getting Greg Ward's Optik?

Also, I happen to hate GNU style getopt.  You may call me an old
fogey, but I think options should precede other arguments.

But other than that, no objections.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pinard@iro.umontreal.ca  Tue Jun  4 19:09:26 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 04 Jun 2002 14:09:26 -0400
Subject: [Python-Dev] Re: Patch #473512
In-Reply-To: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de>
References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de>
Message-ID: <oqd6v6hq5l.fsf@carouge.sram.qc.ca>

[Martin v. Loewis]

> I'm ready to apply patch 473512 : getopt with GNU style scanning, which
> adds getopt.gnu_getopt.  Any objections?

GNU getopt changes once in a while.  Will `getopt.gnu_getopt' track and
reflect these changes as they occur?  I mean, is it the intent?  If yes,
the name might be fine.  Otherwise, it might be better to name this other
`getopt' after some of its properties, instead of using `gnu' as a prefix.
If GNU it has to be, maybe it should be capitalised?  Some existing modules
suggest capitals where underlines would probably have been sufficient,
maybe we should use capitals where they are more naturally mandated.

Not a big matter for me, but probably worth a thought nevertheless?

                                Have a good day, everybody!

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard




From pinard@iro.umontreal.ca  Tue Jun  4 19:34:12 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 04 Jun 2002 14:34:12 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <20020604140734.GB1039@ute.mems-exchange.org>
References: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz>
 <oqlm9vw4s0.fsf@titan.progiciels-bpi.ca>
 <20020604140734.GB1039@ute.mems-exchange.org>
Message-ID: <oq8z5uhp0b.fsf@carouge.sram.qc.ca>

[Andrew Kuchling]

> On Tue, Jun 04, 2002 at 09:29:51AM -0400, François Pinard wrote:
> >[1] Full stops are punctuation ending sentences with two spaces guaranteed.
> >Full stops are defined that way for typography based on fixed width fonts,
> >like when we say "this many characters to a line".

> I don't think this really matters, because I doubt anyone will be
> implementing full justification.

This is an orthogonal matter, unrelated to full stops.  Simultaneous left
and right justification for fixed fonts texts is _not_ to be praised[1].
The real goal of any typographical device, like wrapping, is improving the
legibility of text.  Maybe simultaneous left and right justification is
more "good looking", some would even say "beautiful", but I think it is
considered well known that such simultaneous justification significantly
decreases legibility for fixed width fonts.  If a typographical device
aims at beauty instead of legibility, it misses the real goal.

> Left justification is just a matter of inserting newlines at particular
> points, so if the input data has two spaces after punctuation,
> line-breaking won't introduce any errors.

Excellent if it could be done exactly this way.  However, things are not
always that simple.  If a newline is inserted at some point for wrapping
purposes, it is desirable and usual to remove what was whitespace around
that point, so we do not have unwelcome spaces at start of the beginning
line, or spurious trailing whitespace at end of the previous line.  If the
wrapping device otherwise replaces sequences of many spaces by one, it
should be careful to replace them by two, not one, in the context of full stops.

----------
[1] I think, shudder and horror, that `man' does simultaneous left and right
justification when producing ASCII pages, this is especially bad since
`man' is about documentation to start with.  Of course, when generating
pages for laser printers, with proportional fonts and micro-spacing, things
are pretty different, and _then_ simultaneous left and right justification
makes sense for legibility, if kept within reasonable bounds of course.
I'm almost sure that all of us have seen dubious and unreasonable usages.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard




From martin@v.loewis.de  Tue Jun  4 19:34:07 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 04 Jun 2002 20:34:07 +0200
Subject: [Python-Dev] Patch #473512
In-Reply-To: <200206041804.g54I4Sd16333@odiug.zope.com>
References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de>
 <200206041804.g54I4Sd16333@odiug.zope.com>
Message-ID: <m38z5uzye8.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Is there a point to adding more cruft to getopt.py now that we're
> getting Greg Ward's Optik?

Perhaps ease-of-use - people wanting to use GNU getopt style only need
to change the function name in their existing application.
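
Roughly the difference the patch would expose (a sketch of the intended
behaviour of the proposed getopt.gnu_getopt, not yet in CVS):

>>> import getopt
>>> getopt.getopt(['arg1', '-v'], 'v')      # classic: stop at first non-option
([], ['arg1', '-v'])
>>> getopt.gnu_getopt(['arg1', '-v'], 'v')  # GNU style: options may follow args
([('-v', '')], ['arg1'])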

> Also, I happen to hate GNU style getopt.  You may call me an old
> fogey, but I think options should precede other arguments.

That certainly is debatable. However, since the patch is a pure
addition, every application author will have to make the choice
herself, and no existing application will break.

Regards,
Martin




From guido@python.org  Tue Jun  4 19:53:54 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 14:53:54 -0400
Subject: [Python-Dev] Patch #473512
In-Reply-To: Your message of "04 Jun 2002 20:34:07 +0200."
 <m38z5uzye8.fsf@mira.informatik.hu-berlin.de>
References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> <200206041804.g54I4Sd16333@odiug.zope.com>
 <m38z5uzye8.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206041853.g54Irsb17084@odiug.zope.com>

> > Is there a point to adding more cruft to getopt.py now that we're
> > getting Greg Ward's Optik?
> 
> Perhaps ease-of-use - people wanting to use GNU getopt style only need
> to change the function name in their existing application.
> 
> > Also, I happen to hate GNU style getopt.  You may call me an old
> > fogey, but I think options should precede other arguments.
> 
> That certainly is debatable. However, since the patch is a pure
> addition, every application author will have to make the choice
> herself, and no existing application will break.

So I'm a neutral 0 on the patch.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jun  4 19:55:11 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 14:55:11 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: Your message of "04 Jun 2002 14:34:12 EDT."
 <oq8z5uhp0b.fsf@carouge.sram.qc.ca>
References: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz> <oqlm9vw4s0.fsf@titan.progiciels-bpi.ca> <20020604140734.GB1039@ute.mems-exchange.org>
 <oq8z5uhp0b.fsf@carouge.sram.qc.ca>
Message-ID: <200206041855.g54ItBW17096@odiug.zope.com>

> Excellent if it could be done exactly this way.  However, things are
> not always that simple.  If a newline is inserted at some point for
> wrapping purposes, it is desirable and usual to remove what was
> whitespace around that point, so we do not have unwelcome spaces at
> start of the beginning line, or spurious trailing whitespace at end
> of the previous line.  If the wrapping device otherwise replaces
> sequences of many spaces by one, it should be careful at replacing
> many space by two, in context of full stops.

Emacs does it this way because you reformat the same paragraph over
and over.  The downside is that sometimes a line is shorter than it
could be because it would end in a period.  For what we're doing here
(producing tidy output) I prefer not to do the Emacs fiddling.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Tue Jun  4 20:48:20 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 04 Jun 2002 21:48:20 +0200
Subject: [Python-Dev] Re: Patch #473512
In-Reply-To: <oqd6v6hq5l.fsf@carouge.sram.qc.ca>
References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de>
 <oqd6v6hq5l.fsf@carouge.sram.qc.ca>
Message-ID: <m34rgizuyj.fsf@mira.informatik.hu-berlin.de>

pinard@iro.umontreal.ca (François Pinard) writes:

> GNU getopt changes once in a while.

In what way? When has it changed last, and what was that change?

> Will `getopt.gnu_getopt' track and reflect these changes as they
> occur?  I mean, is it the intent?  If yes, the name might be fine.
> Otherwise, it might be better to name this other `getopt' after some
> of its properties, instead of using `gnu' as a prefix.

Assuming that bug fixes are made to GNU getopt, it would certainly be
reasonable to reflect them in getopt.gnu_getopt.

> If GNU it has to be, maybe it should be capitalised?  Some existing modules
> suggest capitals where underlines would probably have been sufficient,
> maybe we should use capitals where they are more naturally mandated.
> 
> Not a big matter for me, but probably worth a thought nevertheless?

I've no opinion on that except that function names are traditionally
all lower-case. I'll ask the author of the patch.

Regards,
Martin




From guido@python.org  Tue Jun  4 20:58:31 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 15:58:31 -0400
Subject: [Python-Dev] Re: Patch #473512
In-Reply-To: Your message of "04 Jun 2002 21:48:20 +0200."
 <m34rgizuyj.fsf@mira.informatik.hu-berlin.de>
References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de> <oqd6v6hq5l.fsf@carouge.sram.qc.ca>
 <m34rgizuyj.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206041958.g54JwVS18336@odiug.zope.com>

> > If GNU it has to be, maybe it should be capitalised?  Some existing modules
> > suggest capitals where underlines would probably have been sufficient,
> > maybe we should use capitals where they are more naturally mandated.
> > 
> > Not a big matter for me, but probably worth a thought nevertheless?
> 
> I've no opinion on that except that function names are traditionally
> all lower-case. I'll ask the author of the patch.

I'm -1 on capitalizing GNU here -- the module name and function name
are all lowercase.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Tue Jun  4 21:08:08 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 4 Jun 2002 16:08:08 -0400
Subject: [Python-Dev] xrange identity crisis
Message-ID: <20020604200808.GA43351@hishome.net>

It seems that the xrange object in the current CVS can't make up its mind 
whether it's an iterator or an iterable:

>>> iterables = ["", (), [], {}, file('/dev/null'), xrange(10)]
>>> iterators = [iter(x) for x in iterables]
>>> for x in iterables + iterators:
...    print hasattr(x, 'next'), x is iter(x), type(x)
...
False False <type 'str'>
False False <type 'tuple'>
False False <type 'list'>
False False <type 'dict'>
False False <type 'file'>
True  False <type 'xrange'>
True  True  <type 'iterator'>
True  True  <type 'iterator'>
True  True  <type 'listiterator'>
True  True  <type 'dictionary-iterator'>
True  True  <type 'xreadlines.xreadlines'>
True  False <type 'xrange'>

Generally, iterables don't have a next() method and return a new object 
each time they are iter()ed. Iterators do have a next() method and return 
themselves on iter(). xrange is a strange hybrid.
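
Schematically, the usual split looks like this (a hypothetical minimal
pair, nothing to do with the actual xrange implementation):

    class MyRange:
        # the iterable: no next(), hands out a fresh iterator on each iter()
        def __init__(self, n):
            self.n = n
        def __iter__(self):
            return MyRangeIterator(self.n)

    class MyRangeIterator:
        # the iterator: has next(), returns itself from iter()
        def __init__(self, n):
            self.i = 0
            self.n = n
        def __iter__(self):
            return self
        def next(self):
            if self.i >= self.n:
                raise StopIteration
            self.i = self.i + 1
            return self.i - 1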

In Python 2.2.0/1 xrange behaved just like the other iterables:
>>> iterables = ["", (), [], {}, file('/dev/null'), xrange(10)]
>>> iterators = [iter(x) for x in iterables]
>>> for x in iterables + iterators:
...    print hasattr(x, 'next'), x is iter(x), type(x)
...
0 0 <type 'str'>
0 0 <type 'tuple'>
0 0 <type 'list'>
0 0 <type 'dict'>
0 0 <type 'file'>
0 0 <type 'xrange'>
1 1 <type 'iterator'>
1 1 <type 'iterator'>
1 1 <type 'iterator'>
1 1 <type 'dictionary-iterator'>
1 1 <type 'xreadlines.xreadlines'>
1 1 <type 'iterator'>

What's the rationale behind this change?

	Oren



From jepler@unpythonic.net  Tue Jun  4 22:01:24 2002
From: jepler@unpythonic.net (Jeff Epler)
Date: Tue, 4 Jun 2002 16:01:24 -0500
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: <20020604200808.GA43351@hishome.net>
References: <20020604200808.GA43351@hishome.net>
Message-ID: <20020604160123.D24361@unpythonic.net>

On Tue, Jun 04, 2002 at 04:08:08PM -0400, Oren Tirosh wrote:
> It seems that the xrange object in the current CVS can't make up its mind 
> whether it's an iterator or an iterable:

In 2.2, xrange had no "next" method, so it got wrapped by a generic
iterator object.  It was desirable for performance to have xrange also
act as an iterator.

See
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/rangeobject.c.diff?r1=2.35&r2=2.36
for the change.

See
http://www.python.org/sf/551410
for the sf patch this comes from.

However, the following code would give different results if 'iter(x)
is x' for xrange objects:
    x = xrange(5)
    for a in x:
	for b in x:
	    print a,b
(it'd print "0 1" "0 2" "0 3" "0 4" if they were the same iterator, just
as for 'x = iter(range(5))') so, it's necessary to return a *different*
xrange object from iter(x) so it can start iterating from the beginning
again.  I think there's an optimization that *the first time*, iter(x)
is x for an xrange object.

Hm, the python cvs I have here is too old to have this optimization ...
so I can't really tell you how it works now for sure.

Jeff



From python@rcn.com  Tue Jun  4 21:57:58 2002
From: python@rcn.com (Raymond Hettinger)
Date: Tue, 4 Jun 2002 16:57:58 -0400
Subject: [Python-Dev] xrange identity crisis
References: <20020604200808.GA43351@hishome.net>
Message-ID: <001901c20c0a$8254d880$f061accf@othello>

Xrange was given its own tp_iter slot and now runs as fast as range.  In
single pass timings, it runs faster.  In multiple passes, range is still
quicker because it only has to create the PyNumbers once.

Being immutable, xrange had the advantage that it could serve as its own
iterator and did not require the extra code needed for list iterators and
dict iterators.


Raymond Hettinger


----- Original Message -----
From: "Oren Tirosh" <oren-py-d@hishome.net>
To: <python-dev@python.org>
Sent: Tuesday, June 04, 2002 4:08 PM
Subject: [Python-Dev] xrange identity crisis


> It seems that the xrange object in the current CVS can't make up its mind
> whether it's an iterator or an iterable:
>
> >>> iterables = ["", (), [], {}, file('/dev/null'), xrange(10)]
> >>> iterators = [iter(x) for x in iterables]
> >>> for x in iterables + iterators:
> ...    print hasattr(x, 'next'), x is iter(x), type(x)
> ...
> False False <type 'str'>
> False False <type 'tuple'>
> False False <type 'list'>
> False False <type 'dict'>
> False False <type 'file'>
> True  False <type 'xrange'>
> True  True  <type 'iterator'>
> True  True  <type 'iterator'>
> True  True  <type 'listiterator'>
> True  True  <type 'dictionary-iterator'>
> True  True  <type 'xreadlines.xreadlines'>
> True  False <type 'xrange'>
>
> Generally, iterables don't have a next() method and return a new object
> each time they are iter()ed. Iterators do have a next() method and return
> themselves on iter(). xrange is a strange hybrid.
>
> In Python 2.2.0/1 xrange behaved just like the other iterables:
> >>> iterables = ["", (), [], {}, file('/dev/null'), xrange(10)]
> >>> iterators = [iter(x) for x in iterables]
> >>> for x in iterables + iterators:
> ...    print hasattr(x, 'next'), x is iter(x), type(x)
> ...
> 0 0 <type 'str'>
> 0 0 <type 'tuple'>
> 0 0 <type 'list'>
> 0 0 <type 'dict'>
> 0 0 <type 'file'>
> 0 0 <type 'xrange'>
> 1 1 <type 'iterator'>
> 1 1 <type 'iterator'>
> 1 1 <type 'iterator'>
> 1 1 <type 'dictionary-iterator'>
> 1 1 <type 'xreadlines.xreadlines'>
> 1 1 <type 'iterator'>
>
> What's the rationale behind this change?
>
> Oren
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
>




From guido@python.org  Tue Jun  4 22:12:45 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 17:12:45 -0400
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: Your message of "Tue, 04 Jun 2002 16:01:24 CDT."
 <20020604160123.D24361@unpythonic.net>
References: <20020604200808.GA43351@hishome.net>
 <20020604160123.D24361@unpythonic.net>
Message-ID: <200206042112.g54LCjl25275@odiug.zope.com>

> On Tue, Jun 04, 2002 at 04:08:08PM -0400, Oren Tirosh wrote:
> > It seems that the xrange object in the current CVS can't make up its mind 
> > whether it's an iterator or an iterable:
> 
> In 2.2, xrange had no "next" method, so it got wrapped by a generic
> iterator object.  It was desirable for performance to have xrange also
> act as an iterator.

This seems to propagate the confusion.  To avoid being wrapped by a
generic iterator object, you need to define an __iter__ method, not a
next method.

The current xrange code (from SF patch #551410) uses the xrange object
as both an iterator and iterable, and has an extra flag to make things
work right when the same object is iterated over more than once.
Without doing more of a review, I can only say that I'm a bit
uncomfortable with that approach.  Something like the more recent code
that Raymond H added to listobject.c to add a custom iterator makes
more sense.  But perhaps it is defensible.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jun  4 22:18:39 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 17:18:39 -0400
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: Your message of "Tue, 04 Jun 2002 16:57:58 EDT."
 <001901c20c0a$8254d880$f061accf@othello>
References: <20020604200808.GA43351@hishome.net>
 <001901c20c0a$8254d880$f061accf@othello>
Message-ID: <200206042118.g54LIdr25307@odiug.zope.com>

[Raymond Hettinger]
> Xrange was given its own tp_iter slot and now runs as fast a range.
> In single pass timings, it runs faster.  In multiple passes, range
> is still quicker because it only has to create the PyNumbers once.
> 
> Being immutable, xrange had the advantage that it could serve as its
> own iterator and did not require the extra code needed for list
> iterators and dict iterators.

Did you write the patch that Martin checked in?

It's broken.

>>> a = iter(xrange(10))
>>> for i in a:
        print i    
        if i == 4: print '*', a.next()
    
0
1
2
3
4
* 0
5
6
7
8
9
>>>

Compare to:

>>> a = iter(range(10))
>>> for i in a:
        print i
        if i == 4: print '*', a.next()
    
0
1
2
3
4
* 5
6
7
8
9
>>> 

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Tue Jun  4 22:42:52 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 5 Jun 2002 00:42:52 +0300
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: <20020604160123.D24361@unpythonic.net>; from jepler@unpythonic.net on Tue, Jun 04, 2002 at 04:01:24PM -0500
References: <20020604200808.GA43351@hishome.net> <20020604160123.D24361@unpythonic.net>
Message-ID: <20020605004252.A27339@hishome.net>

On Tue, Jun 04, 2002 at 04:01:24PM -0500, Jeff Epler wrote:
> On Tue, Jun 04, 2002 at 04:08:08PM -0400, Oren Tirosh wrote:
> > It seems that the xrange object in the current CVS can't make up its mind 
> > whether it's an iterator or an iterable:
> 
> In 2.2, xrange had no "next" method, so it got wrapped by a generic
> iterator object.  It was desirable for performance to have xrange also
> act as an iterator.

I understand the performance issue. But it is possible to improve the 
performance of iterating over xranges without creating this unholy chimera.

>>> type([]), type(iter([]))
(<type 'list'>, <type 'listiterator'>)

   ... lists have a listiterator

>>> type({}), type(iter({}))
(<type 'dict'>, <type 'dictionary-iterator'>)

   ... dictionaries have a dictionary-iterator

>>> type(xrange(10)), type(iter(xrange(10)))
(<type 'xrange'>, <type 'xrange'>)

   ... why shouldn't an xrange have an xrangeiterator? 

It's the only way to make xrange behave consistently with other iterables.  
 
> However, the following code would give different results if 'iter(x)
> is x' for xrange objects:
>     x = xrange(5)
>     for a in x:
> 	for b in x:
> 	    print a,b

xrange is currently stuck halfway between an iterable and an
iterator.  If it was made 100% iterator you would be right, it would
break this code.  What I'm saying is that it should be 100% iterable.

I know it works just fine the way it is.  But I see a lot of confusion on 
the python list around the semantics of iterators and this behavior might 
make it just a little bit worse.

	Oren




From martin@v.loewis.de  Tue Jun  4 22:47:00 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 04 Jun 2002 23:47:00 +0200
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: <200206042112.g54LCjl25275@odiug.zope.com>
References: <20020604200808.GA43351@hishome.net>
 <20020604160123.D24361@unpythonic.net>
 <200206042112.g54LCjl25275@odiug.zope.com>
Message-ID: <m3n0uayawb.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> The current xrange code (from SF patch #551410) uses the xrange object
> as both an iterator and iterable, and has an extra flag to make things
> work right when the same object is iterated over more than once.
> Without doing more of a review, I can only say that I'm a but
> uncomfortable with that approach.  Something like the more recent code
> that Raymond H added to listobject.c to add a custom iterator makes
> more sense.  But perhaps it is defensible.

The main defense is that the typical use case is 

for i in xrange(len(some_list))

In that case, it is desirable not to create an additional object, and
nobody will notice the difference.

Regards,
Martin




From martin@v.loewis.de  Tue Jun  4 22:52:30 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 04 Jun 2002 23:52:30 +0200
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: <20020604200808.GA43351@hishome.net>
References: <20020604200808.GA43351@hishome.net>
Message-ID: <m3it4yyan5.fsf@mira.informatik.hu-berlin.de>

Oren Tirosh <oren-py-d@hishome.net> writes:

> What's the rationale behind this change?

The rationale is that it is more efficient. You seem to think it is a
problem. Can you explain why you think so?

Regards,
Martin




From martin@v.loewis.de  Tue Jun  4 23:07:36 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 05 Jun 2002 00:07:36 +0200
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: <20020605004252.A27339@hishome.net>
References: <20020604200808.GA43351@hishome.net>
 <20020604160123.D24361@unpythonic.net>
 <20020605004252.A27339@hishome.net>
Message-ID: <m33cw2y9xz.fsf@mira.informatik.hu-berlin.de>

Oren Tirosh <oren-py-d@hishome.net> writes:

>    ... why shouldn't an xrange have an xrangeiterator? 

Because that would create an additional object.

> It's the only way to make xrange behave consistently with other iterables.  

Why does it have to be consistent?

> I know it works just fine the way it is.  But I see a lot of confusion on 
> the python list around the semantics of iterators and this behavior might 
> make it just a little bit worse.

Why do you think people will get confused? Most people will use it in
the canonical form

for i in range(maxvalue)

in which case they cannot experience any difference (except for the
performance boost)?

Regards,
Martin



From greg@cosc.canterbury.ac.nz  Wed Jun  5 01:24:58 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 05 Jun 2002 12:24:58 +1200 (NZST)
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: <m3n0uayawb.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206050024.MAA06957@s454.cosc.canterbury.ac.nz>

martin@v.loewis.de (Martin v. Loewis):

> The main defense is that the typical use case is 
> 
> for i in xrange(len(some_list))

How about deprecating xrange, and introducing a
new function such as

  indexes(sequence)

that returns a proper iterator. That would clear
up all the xrange confusion and make for nicer
looking code as well.
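
A minimal sketch of what such a helper could look like (hypothetical,
not a patch; needs 'from __future__ import generators' on 2.2):

    from __future__ import generators

    def indexes(sequence):
        # lazily yield 0, 1, ..., len(sequence)-1, one index at a time
        i = 0
        n = len(sequence)
        while i < n:
            yield i
            i = i + 1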

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+




From gward@python.net  Wed Jun  5 02:14:02 2002
From: gward@python.net (Greg Ward)
Date: Tue, 4 Jun 2002 21:14:02 -0400
Subject: [Python-Dev] Re: Where to put wrap_text()?
In-Reply-To: <oqlm9vw4s0.fsf@titan.progiciels-bpi.ca>
References: <200206040117.NAA06859@s454.cosc.canterbury.ac.nz> <oqlm9vw4s0.fsf@titan.progiciels-bpi.ca>
Message-ID: <20020605011402.GA13638@gerg.ca>

On 04 June 2002, François Pinard said:
> Hi, people.
> 
> For this incoming text wrapper facility, there is a feature that appears
> really essential to me, and many others: the protection of full stops[1].

If you mean reformatting this:

"""
This is a sentence ending.
If we convert each newline to a single space, there won't
be enough space after that period.
"""

to this:

"""
This is a sentence ending.  If we convert each newline to a single
space, there won't be enough space after that period.
"""

then my wrapping algorithm handles it.  However, it's currently limited
to English, because it relies on string.lowercase to detect sentence
ending periods -- this needs to be fixed, but I was going to post the
code and let someone who understands locales tell me what to do.  ;-)
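
Something like this heuristic, presumably (a rough sketch, not the actual
wrapping code):

    import string

    def ends_sentence(word):
        # treat a word as ending a sentence only if its final '.', '!' or '?'
        # follows a lowercase letter -- string.lowercase is what makes this
        # English/locale-specific
        if len(word) < 2:
            return 0
        return word[-1] in '.!?' and word[-2] in string.lowercase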

        Greg
-- 
Greg Ward - programmer-at-big                           gward@python.net
http://starship.python.net/~gward/
"... but in the town it was well known that when they got home their fat and
psychopathic wives would thrash them to within inches of their lives ..."



From guido@python.org  Wed Jun  5 02:22:30 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 21:22:30 -0400
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: Your message of "04 Jun 2002 23:47:00 +0200."
 <m3n0uayawb.fsf@mira.informatik.hu-berlin.de>
References: <20020604200808.GA43351@hishome.net> <20020604160123.D24361@unpythonic.net> <200206042112.g54LCjl25275@odiug.zope.com>
 <m3n0uayawb.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206050122.g551MUA01710@pcp02138704pcs.reston01.va.comcast.net>

> The main defense is that the typical use case is 
> 
> for i in xrange(len(some_list))
> 
> In that case, it is desirable not to create an additional object, and
> nobody will notice the difference.

Is it really so bad if this allocates *two* objects instead of one?

I think that's the only way to get my example to work correctly.  And it
*has* to work correctly.

If two objects are created anyway, I agree with Oren that it's better
to have a separate range-iterator object type.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jun  5 02:42:05 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 04 Jun 2002 21:42:05 -0400
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: Your message of "Wed, 05 Jun 2002 12:24:58 +1200."
 <200206050024.MAA06957@s454.cosc.canterbury.ac.nz>
References: <200206050024.MAA06957@s454.cosc.canterbury.ac.nz>
Message-ID: <200206050142.g551g5j01889@pcp02138704pcs.reston01.va.comcast.net>

> How about deprecating xrange,

Deprecating xrange has about as much chance as deprecating the string
module.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From greg@cosc.canterbury.ac.nz  Wed Jun  5 03:43:28 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 05 Jun 2002 14:43:28 +1200 (NZST)
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: <200206050142.g551g5j01889@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200206050243.OAA07021@s454.cosc.canterbury.ac.nz>

> Deprecating xrange has about as much chance as deprecating the string
> module.

Well, discouraging it then, or whatever *is* being done
with the string module.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From python@rcn.com  Wed Jun  5 05:52:16 2002
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 5 Jun 2002 00:52:16 -0400
Subject: [Python-Dev] xrange identity crisis
References: <20020604200808.GA43351@hishome.net>              <001901c20c0a$8254d880$f061accf@othello>  <200206042118.g54LIdr25307@odiug.zope.com>
Message-ID: <008101c20c4c$c4fa9700$a666accf@othello>

RDH> Xrange was given its own tp_iter slot and now runs as fast as range.
RDH> > In single pass timings, it runs faster.  In multiple passes, range
RDH> > is still quicker because it only has to create the PyNumbers once.
RDH> >
RDH> > Being immutable, xrange had the advantage that it could serve as its
RDH> > own iterator and did not require the extra code needed for list
RDH> > iterators and dict iterators.
>
GvR> Did you write the patch that Martin checked in?
GvR>
GvR> It's broken.
GvR>
GvR> >>> a = iter(xrange(10))
GvR> >>> for i in a:
GvR>         print i
GvR>         if i == 4: print '*', a.next()

Okay, here's the distilled analysis:

Given x=xrange(10),
1. Oren notes that id(iter(x)) == id(x) which is atypical of objects that
have special iterator types or get wrapped by the generic iterobject.
2. GvR notes that id(iter(x)) != id(iter(iter(x))) which is inconsistent
with range().

#1 should not be a requirement.  A call to iter should simply return
something that has an iterable interface whether it be a new object or the
current object.  In examples of user defined classes with their own
__iter__() method, we show the object returning itself.  At the same time,
we allow the __iter__ method to possibly be defined with a generator which
returns a new object.  In short, the object identity of iter(x) has not been
promised to be either equal or not equal to x.
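
For instance, a small sketch showing that both styles are legal (the class
names are made up for illustration):

    from __future__ import generators   # needed on 2.2; harmless later

    class ReturnsSelf:
        def __init__(self, n):
            self.i, self.n = 0, n
        def __iter__(self):
            return self                  # iter(x) is x
        def next(self):
            if self.i >= self.n:
                raise StopIteration
            self.i = self.i + 1
            return self.i

    class ReturnsGenerator:
        def __init__(self, n):
            self.n = n
        def __iter__(self):              # iter(x) is a brand-new generator
            for i in range(self.n):
                yield i

    a = ReturnsSelf(3)
    b = ReturnsGenerator(3)
    print iter(a) is a    # true
    print iter(b) is b    # false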

If we decide that #1 is required (for consistency with the way other
iterables are currently implemented), the most straightforward solution is
to add an xrange iteratorobject to rangeobject.c just like we did for
listobject.c.  I'll be happy to do this if it is what everyone wants.

For #2, the most compelling argument is that xrange should be a drop-in
replacement for range in *every* circumstance including the weird use case
of iter(iter(xrange(10))).  This is easily accomplished, and I've attached
a simple patch to the bug report that restores this behavior.
However, before accepting the patch, I think we ought to consider whether
the current xrange() behavior is more rational than the range() behavior.

PEP 234 says:  """Some folks have requested the ability to restart an
iterator.  This should be dealt with by calling iter() on a sequence
repeatedly, not by the iterator protocol itself. """

Maybe the right way to go is to ensure that iter(x) returns a freshly
loaded iterator instead of the same iterator in the same state.  Right now
(with xrange different from range), we get what I think is weirder behavior
from range():

>>> a = iter(range(3))
>>> for i in a:
 for j in a:
  print i,j


0 1
0 2
>>> a = iter(xrange(3))
>>> for i in a:
 for j in a:
  print i,j


0 0
0 1
0 2
1 0
1 1
1 2
2 0
2 1
2 2


BTW, I'm happy to do whatever you guys think best:
(a)  Adding an xrangeiteratorobject fixes #1 and #2 resulting in an xrange()
identical to range() with no cost to performance during the loop (creation
performance suffers just a bit).
(b)  Adding my other patch (attached to the bug report
www.python.org/sf/564601), fixes #2 only (again with no cost to loop
performance).
(c)  Leaving it the way it is gives xrange a behavior that is identical to
range for the common use cases, and arguably superior abilities for the
weird cases.


Raymond Hettinger











From greg@cosc.canterbury.ac.nz  Wed Jun  5 06:08:32 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 05 Jun 2002 17:08:32 +1200 (NZST)
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: <008101c20c4c$c4fa9700$a666accf@othello>
Message-ID: <200206050508.RAA07040@s454.cosc.canterbury.ac.nz>

> Maybe the right way to go is to ensure that iter(x) returns a freshly
> loaded iterator instead of the same iterator in the same state.

That would be a change to the semantics of all iterators,
not worth it just to fix a small oddity with xrange.
I think it's fairly clear that xrange is to be thought
of as a lazy list, *not* an iterator. The best way to
fix it (if it needs fixing) is to have iter(xrange(...)) 
always return a new object, I think.

It wouldn't be possible for all iterators to behave
the way you suggest, anyway, because some kinds of
iterator don't have an underlying sequence that can
be restarted (e.g. file iterators).

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From martin@v.loewis.de  Wed Jun  5 08:30:05 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 05 Jun 2002 09:30:05 +0200
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: <200206050122.g551MUA01710@pcp02138704pcs.reston01.va.comcast.net>
References: <20020604200808.GA43351@hishome.net>
 <20020604160123.D24361@unpythonic.net>
 <200206042112.g54LCjl25275@odiug.zope.com>
 <m3n0uayawb.fsf@mira.informatik.hu-berlin.de>
 <200206050122.g551MUA01710@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3y9du89oi.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Is it really so bad if this allocates *two* objects instead of one?

When accepting the patch, I assumed that the observed speed difference
between xrange and range originated from the fact that xrange
iteration allocates iterator objects. I'm not so sure anymore that
this is the real cause; more likely, it is again the exception
handling when exhausting the range.

> I think that's the only way to get my example to work correctly.  And it
> *has* to work correctly.
> 
> If two objects are created anyway, I agree with Oren that it's better
> to have a separate range-iterator object type.

I agree. I wouldn't mind if somebody reviewed Raymond's patch to
introduce such a thing.

Regards,
Martin




From oren-py-d@hishome.net  Wed Jun  5 10:21:23 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 5 Jun 2002 12:21:23 +0300
Subject: [Python-Dev] xrange identity crisis
In-Reply-To: <001901c20c0a$8254d880$f061accf@othello>; from python@rcn.com on Tue, Jun 04, 2002 at 04:57:58PM -0400
References: <20020604200808.GA43351@hishome.net> <001901c20c0a$8254d880$f061accf@othello>
Message-ID: <20020605122123.A1420@hishome.net>

On Tue, Jun 04, 2002 at 04:57:58PM -0400, Raymond Hettinger wrote:
> Being immutable, xrange had the advantage that it could serve as its own
> iterator and did not require the extra code needed for list iterators and
> dict iterators.

In its current form, xrange is no longer immutable. It has state information
and calling the next() method of an xrange object modifies it.

I guess the difference between us is that you are concerned with what works 
while I am irrationally obsessed with semantics :-)

	Oren




From thomas.heller@ion-tof.com  Wed Jun  5 15:25:35 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 5 Jun 2002 16:25:35 +0200
Subject: [Python-Dev] 'compile' error message
Message-ID: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook>

Consider:

Python 2.3a0 (#29, Jun  5 2002, 13:09:10) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> compile("1+*3", "myfile.py", "exec")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1
    1+*3
      ^
SyntaxError: invalid syntax
>>>

Shouldn't it print "myfile.py" instead of "<string>"?

Thomas




From Oleg Broytmann <phd@phd.pp.ru>  Wed Jun  5 15:48:14 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Wed, 5 Jun 2002 18:48:14 +0400
Subject: [Python-Dev] 'compile' error message
In-Reply-To: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Wed, Jun 05, 2002 at 04:25:35PM +0200
References: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook>
Message-ID: <20020605184814.D26674@phd.pp.ru>

On Wed, Jun 05, 2002 at 04:25:35PM +0200, Thomas Heller wrote:
> Python 2.3a0 (#29, Jun  5 2002, 13:09:10) [MSC 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> compile("1+*3", "myfile.py", "exec")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "<string>", line 1
>     1+*3
>       ^
> SyntaxError: invalid syntax
> >>>
> 
> Shouldn't it print "myfile.py" instead of "<string>"?

   I think it should. Just yesterday I got stuck on this bug. Please file a
bug report.

PS. Tomorrow I'll publish the code that uses "compile", watch c.l.py :)

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From david.abrahams@rcn.com  Wed Jun  5 16:07:42 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Wed, 5 Jun 2002 11:07:42 -0400
Subject: [Python-Dev] 'compile' error message
References: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook> <20020605184814.D26674@phd.pp.ru>
Message-ID: <177701c20ca3$16e69d10$6601a8c0@boostconsulting.com>

Didn't I report this problem for 2.2? I was getting this "<string>" thing
in my doctest outputs.
Could've sworn I phoned it in, and was told it was already fixed.

-Dave

----- Original Message -----
From: "Oleg Broytmann" <phd@phd.pp.ru>
Shouldn't it print "myfile.py" instead of "<string>"?
>
>    I think it should. Just yesterday I got stuck on this bug. Please file a
> bug report.
>
> PS. Tomorrow I'll publish the code that uses "compile", watch c.l.py :)
>
> Oleg.
> --
>      Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
>            Programmers don't die, they just GOSUB without RETURN.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev




From guido@python.org  Wed Jun  5 17:14:45 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 05 Jun 2002 12:14:45 -0400
Subject: [Python-Dev] 'compile' error message
In-Reply-To: Your message of "Wed, 05 Jun 2002 16:25:35 +0200."
 <0c6401c20c9c$db600610$e000a8c0@thomasnotebook>
References: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook>
Message-ID: <200206051614.g55GEjr01951@pcp02138704pcs.reston01.va.comcast.net>

> >>> compile("1+*3", "myfile.py", "exec")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "<string>", line 1
>     1+*3
>       ^
> SyntaxError: invalid syntax
> >>>
> 
> Shouldn't it print "myfile.py" instead of "<string>"?

Yes.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas.heller@ion-tof.com  Wed Jun  5 18:04:42 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 5 Jun 2002 19:04:42 +0200
Subject: [Python-Dev] 'compile' error message
References: <0c6401c20c9c$db600610$e000a8c0@thomasnotebook>  <200206051614.g55GEjr01951@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0d3001c20cb3$164b7b40$e000a8c0@thomasnotebook>

> > >>> compile("1+*3", "myfile.py", "exec")
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> >   File "<string>", line 1
> >     1+*3
> >       ^
> > SyntaxError: invalid syntax
> > >>>
> > 
> > Shouldn't it print "myfile.py" instead of "<string>"?
> 
> Yes.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

Submitted as SF bug #564931.

Thomas




From eikeon@eikeon.com  Wed Jun  5 20:43:27 2002
From: eikeon@eikeon.com (Daniel 'eikeon' Krech)
Date: 05 Jun 2002 15:43:27 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
Message-ID: <uvg8xbjfk.fsf@eikeon.com>

While attempting to "intern" the nodes in our rdflib's triple store I have come across the following question.

Is there or could there be an efficient way to get an existing key from a dictionary given a key that is == but whose id is not. For example:

    given a==b and id(a)!=id(b) and d[a] = 1

what is the best way to:

    d.get_key(b) -> a

--eikeon, http://eikeon.com/


PS: Here is the code where I am trying to get rid of multiple instances of equivalent nodes:

    http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/redfoot/rdflib-1.0/rdflib/store/memory.py?rev=1.1.1.1&content-type=text/vnd.viewcvs-markup


and a not so efficient first attempt:

# (could use a s.get(e2)->e1 as well given e1==e2 and id(e1)!=id(e2))

class Set(object):
    def __init__(self):
        self.__set = []

    def add(self, obj):
        e = self.get(obj)
        if e: # already have equivalent element, so return the one we have
            return e
        else:
            self.__set.append(obj)
            return obj

    def get(self, obj):
        if obj in self.__set:
            for e in self.__set:
                if e==obj:
                    return e
        return None

class Intern(object):

    def __init__(self):
        super(Intern, self).__init__()
        self.__nodes = Set()
        
    def add(self, subject, predicate, object):
        subject = self.__nodes.add(subject)
        predicate = self.__nodes.add(predicate)
        object = self.__nodes.add(object)
        super(Intern, self).add(subject, predicate, object)





From guido@python.org  Wed Jun  5 20:52:37 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 05 Jun 2002 15:52:37 -0400
Subject: [Python-Dev] SF sending content-free emails
Message-ID: <200206051952.g55Jqcl04146@pcp02138704pcs.reston01.va.comcast.net>

You may have noticed that the SF tracker is sending email that doesn't
contain any content, when an item is updated.

I've filed a bug report with SF.

http://sourceforge.net/tracker/index.php?func=detail&aid=565001&group_id=1&atid=200001

Gordon, how's the roundup project coming along? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mgilfix@eecs.tufts.edu  Wed Jun  5 20:52:35 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Wed, 5 Jun 2002 15:52:35 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jun 03, 2002 at 01:22:16PM -0400
References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net>
Message-ID: <20020605155235.B5911@eecs.tufts.edu>

On Mon, Jun 03 @ 13:22, Guido van Rossum wrote:
> >   Couldn't really figure out what you were seeing here. I read that
> > you saw something like func( a, b), which I don't see in my local
> > copy.
> 
> test_timeout.py from the SF page has this.  I'm glad you fixed this
> already in your own copy.

  Weird. I didn't change anything. Oh well. We'll see if it shows up in
the new patch this time round.

> > I do have something like this for list comprehension:
> > 
> >     [ x.find('\n') for x in self._rbuf ]
> > 
> >   Er, but I though there were supposed to be surrounding spaces at the
> > edges...
> 
> I prefer to see that as
> 
>     [x.find('\n') for x in self._rbuf]

  Ok. Done. One day, you can explain to me why you despise whitespace
so. Perhaps she was mean to you or something. She's always hanging around
with that tab guy at any rate and they make a bad mix.

> > > - It also looks like you've broken the semantics of size<0 in read().
>
> I was referring to this piece of code:
> 
> !         if buf_len > size:
> !             self._rbuf.append (data[size:])
> !             data = data[:size]
> 
> Here data[size:] gives you the last byte of the data and data[:size]
> chops off the last byte.

  Ok. This has been fixed. All read sizes now work and have been tested
by me.

> > > - Maybe changing the write buffer to a list makes sense too?
> > 
> >   I could do this. Then just do a join before the flush. Is the append
> > /that/ much faster?
> 
> Depends on how small the chunks are you write.  Roughly, repeated list
> append is O(N log N), while repeated string append is O(N**2).

  Done. The write buffer now uses a list, so it should be faster than
the initial version and the one currently in use.

> OK, but given the issues the first version had, I recommand that the
> code gets more review and that you write unit tests for all cases.

  I agree. I wasn't thorough enough in my checking. I'm going to see if
I can include a test case specifically to test the Windows file
class directly.

> > > - Please don't introduce more static functions with a 'Py' name
> > >   prefix.
> > 
> >   Only did this in one place, with PyTimeout_Err. The reason was that the
> > other Error functions used the Py prefix, so it was done for consistency. I
> > can change that.. or change the naming scheme with the others if you like.
> 
> I like to do code cleanup that doesn't change semantics (like
> renamings) as a separate patch and checkin.  You can do this before or
> after the timeout changes, but don't merge it into the timeout
> changes.  I still like the static names that you introduce not to
> start with Py.

  Ok. I'll change the PyTimeout_Err to just timeout_err. We can do some
other cleanup after the patch has been accepted. It's big enough as is, and
there's no need to add more complication.

> OK, it looks like you call internal_setblocking(s, 0) to set the
> socket in nonblocking mode.  (Hm, I don't see any calls to set the
> socket in blocking mode!)
> 
> So do I understand that you are now always setting the socket in
> non-blocking mode, even when there is no timeout specified, and that
> you look at the sock_blocking flag to decide whether to do timeouts or
> just pass the nonblocking behavior to the user?
> 
> This is a change in semantics, and could interfere with existing
> applications that pass the socket's file descriptor off to other
> code.  I think I'd be happier if the behavior wasn't changed at all
> until a timeout is set for a socket -- then existing code won't
> break.

  So, the best way to proceed seems to be:

    if (s->sock_timeout == Py_None)
       /* Perhaps do nothing, or just do original behavior */
    else
       /* Get funky. Do one of the solutions discussed below */

> I only really care for sockets passed in to fromfd().  E.g. someone
> can currently do:
> 
>   s1 = socket(AF_INET, SOCK_STREAM)
>   s1.setblocking(0)
> 
>   s2 = fromfd(s1.fileno())
>   # Now s2 is non-blocking too
> 
> I'd like this to continue to work as long as s1 doesn't set a timeout.

  I see the issue. We'll worry about this and not ioctl. So let's look
at solutions:

> >   One solution is to set/unset blocking mode right before doing each
> > call to be sure of the state and based on the internally stored value
> > of the blocking attribute... but... then that kind of renders ioctl
> > useless.
> 
> Don't worry so much about ioctl, but do worry about fromfd.

  Not so popular.

> >   Another solution might be to set the blocking mode to on everytime
> > someone sets a timeout. That would change the blocking/socket
> > interaction already described a bit but not drastically. Also easy
> > to implement.  That sends the message: Don't use ioctls when using
> > timeouts.
> 
> I like this.

  Alright. Well, using the above pseudo-code scheme, we should be alright.
So here are the new semantics:

  If you set_timeout(int/float/long != None):
    The actual socket gets put in non-blocking mode and the usual select
    stuff is done.
  If you set_timeout(None):
    The old behavior is used AND automatically, the socket is set
    to blocking mode. That means that someone who was doing non-blocking
    stuff before, sets a timeout, and then unsets one, will have to do
    a set_blocking call again if he wants non-blocking stuff. This makes
    sense 'cause timeout stuff is blocking by nature.

  That seems fairest and we always have an idea of what state we're in.
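
At the Python level, the "usual select stuff" boils down to something like
the following sketch (illustrative only; the real logic lives in the C
socket module patch, and the helper name here is made up):

    import select, socket

    def recv_with_timeout(sock, nbytes, timeout):
        # Non-blocking socket plus select(): wait up to 'timeout' seconds
        # for data, then either read or report a timeout.
        sock.setblocking(0)
        ready, _, _ = select.select([sock], [], [], timeout)
        if not ready:
            raise socket.error("timed out")
        return sock.recv(nbytes)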

                    -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From martin@v.loewis.de  Wed Jun  5 21:30:47 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 05 Jun 2002 22:30:47 +0200
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <uvg8xbjfk.fsf@eikeon.com>
References: <uvg8xbjfk.fsf@eikeon.com>
Message-ID: <m3elflo4co.fsf@mira.informatik.hu-berlin.de>

"Daniel 'eikeon' Krech" <eikeon@eikeon.com> writes:

> While attempting to "intern" the nodes in our rdflib's triple store
> I have come across the following question.

Why is that a python-dev question? Please use python-list to discuss
applications of Python.

Regards,
Martin




From guido@python.org  Wed Jun  5 22:33:55 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 05 Jun 2002 17:33:55 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: Your message of "Wed, 05 Jun 2002 15:52:35 EDT."
 <20020605155235.B5911@eecs.tufts.edu>
References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net>
 <20020605155235.B5911@eecs.tufts.edu>
Message-ID: <200206052133.g55LXtv04611@pcp02138704pcs.reston01.va.comcast.net>

>   Ok. Done. One day, you can explain to me why you despise whitespace
> so. Perhaps she was mean to you or something. She's always hanging
> around with that tab guy at any rate and they make a bad mix.

I like the whitespace use in the English language (like so) best.

>   Ok. This has been fixed. All read sizes now work and have been tested
> by me.

Have you written unit tests?  That would be really great.  Ideally,
the tests should pass both before and after your patches.

>   So, the best way to proceed seems to be:
> 
>     if (s->sock_timeout == Py_None)
>        /* Perhaps do nothing, or just do original behavior */
>     else
>        /* Get funky. Do one of the solutions discussed below */

Yes.

> So here are the new semantics:
> 
>   If you set_timeout(int/float/long != None):
>     The actual socket gets put in non-blocking mode and the usual select
>     stuff is done.
>   If you set_timeout(None):
>     The old behavior is used AND automatically, the socket is set
>     to blocking mode. That means that someone who was doing non-blocking
>     stuff before, sets a timeout, and then unsets one, will have to do
>     a set_blocking call again if he wants non-blocking stuff. This makes
>     sense 'cause timeout stuff is blocking by nature.

Sounds good!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From eikeon@eikeon.com  Wed Jun  5 22:33:36 2002
From: eikeon@eikeon.com (Daniel 'eikeon' Krech)
Date: 05 Jun 2002 17:33:36 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <m3elflo4co.fsf@mira.informatik.hu-berlin.de>
References: <uvg8xbjfk.fsf@eikeon.com>
 <m3elflo4co.fsf@mira.informatik.hu-berlin.de>
Message-ID: <u1yblv2a7.fsf@eikeon.com>

martin@v.loewis.de (Martin v. Loewis) writes:

> "Daniel 'eikeon' Krech" <eikeon@eikeon.com> writes:
> 
> > While attempting to "intern" the nodes in our rdflib's triple store
> > I have come across the following question.
> 
> Why is that a python-dev question? Please use python-list to discuss
> applications of Python.

Sorry, seems to me like it was on topic for python-dev seeing as
python dictionaries do not currently have the functionality I
desire. And it would make a great addition to an already great
language, IMO. Did not mean for my message to come across as a
question about applying python... it is not.

--eikeon




From ping@zesty.ca  Wed Jun  5 22:40:16 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Wed, 5 Jun 2002 16:40:16 -0500 (CDT)
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <uvg8xbjfk.fsf@eikeon.com>
Message-ID: <Pine.LNX.4.33.0206051638290.9557-100000@server1.lfw.org>

On 5 Jun 2002, Daniel 'eikeon' Krech wrote:
>
> Is there or could there be an efficient way to get an existing key from
> a dictionary given a key that is == but whose id is not. For example:

If i understand you correctly, a good way to solve this problem is to
provide a __hash__ method on the objects that you are using as keys to
your dictionary.  Dictionaries look up keys by hash equality.

Note that you will have to ensure that the keys are immutable (i.e.
once they are put in the dictionary, they should never change).
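
For instance, a minimal sketch (the Node class here is made up, not
rdflib's actual node type):

    class Node:
        def __init__(self, uri):
            self.uri = uri          # treated as immutable once constructed
        def __hash__(self):
            return hash(self.uri)
        def __eq__(self, other):
            return isinstance(other, Node) and self.uri == other.uri

    d = {}
    a = Node("http://example.org/thing")
    b = Node("http://example.org/thing")
    d[a] = 1
    print d[b]    # 1 -- b hashes and compares equal to a, though id(a) != id(b)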


-- ?!ng




From barry@barrys-emacs.org  Wed Jun  5 23:07:10 2002
From: barry@barrys-emacs.org (Barry Scott)
Date: Wed, 5 Jun 2002 23:07:10 +0100
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <uvg8xbjfk.fsf@eikeon.com>
Message-ID: <000001c20cdd$56a14330$070210ac@LAPDANCE>

Why not store the key as part of the value?

	d[a] = a

	d[b] => a

If you need more info in the value put a class instance or a
tuple with the key as part of the value.

	d[a] = (a,1)

		Barry


-----Original Message-----
From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On
Behalf Of Daniel 'eikeon' Krech
Sent: 05 June 2002 20:43
To: python-dev@python.org
Subject: [Python-Dev] d.get_key(key) -> key?



While attempting to "intern" the nodes in our rdflib's triple store I have
come across the following question.

Is there or could there be an efficient way to get an existing key from a
dictionary given a key that is == but whose id is not. For example:

    given a==b and id(a)!=id(b) and d[a] = 1

what is the best way to:

    d.get_key(b) -> a

--eikeon, http://eikeon.com/


PS: Here is the code where I am trying to get rid of multiple instances of
equivalent nodes:


http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/redfoot/rdflib-1.0/rdflib/sto
re/memory.py?rev=1.1.1.1&content-type=text/vnd.viewcvs-markup


and a not so efficient first attempt:

# (could use a s.get(e2)->e1 as well given e1==e2 and id(e1)!=id(e2))

class Set(object):
    def __init__(self):
        self.__set = []

    def add(self, obj):
        e = self.get(obj)
        if e: # already have equivalent element, so return the one we have
            return e
        else:
            self.__set.append(obj)
            return obj

    def get(self, obj):
        if obj in self.__set:
            for e in self.__set:
                if e==obj:
                    return e
        return None

class Intern(object):

    def __init__(self):
        super(Intern, self).__init__()
        self.__nodes = Set()

    def add(self, subject, predicate, object):
        subject = self.__nodes.add(subject)
        predicate = self.__nodes.add(predicate)
        object = self.__nodes.add(object)
        super(Intern, self).add(subject, predicate, object)




_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev





From mgilfix@eecs.tufts.edu  Wed Jun  5 23:07:29 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Wed, 5 Jun 2002 18:07:29 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206052133.g55LXtv04611@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Wed, Jun 05, 2002 at 05:33:55PM -0400
References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net> <20020605155235.B5911@eecs.tufts.edu> <200206052133.g55LXtv04611@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020605180728.C5911@eecs.tufts.edu>

On Wed, Jun 05 @ 17:33, Guido van Rossum wrote:
> >   Ok. This has been fixed. All read sizes now work and have been tested
> > by me.
> 
> Have you written unit tests?  That would be really great.  Ideally,
> the tests should pass both before and after your patches.

  Done. I've added them into the test_socket.py test as I didn't feel like
starting a new test that does roughly the same thing. Works on both the
old (2.1.3 source I had lying around my system) and the new.

                -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From niemeyer@conectiva.com  Wed Jun  5 23:09:01 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Wed, 5 Jun 2002 19:09:01 -0300
Subject: [Python-Dev] Patch #473512
In-Reply-To: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de>
References: <200206041740.g54HeR9V001953@mira.informatik.hu-berlin.de>
Message-ID: <20020605190901.A7546@ibook.distro.conectiva>

> I'm ready to apply patch 473512 : getopt with GNU style scanning,
> which adds getopt.gnu_getopt.

I'm +1 on that. I've written a wrapper by myself a few times. Having it
in the library will help. Even with Optik, this should be a small patch,
and I don't think getopt will be deprecated any time soon.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From mgilfix@eecs.tufts.edu  Wed Jun  5 23:25:15 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Wed, 5 Jun 2002 18:25:15 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206052133.g55LXtv04611@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Wed, Jun 05, 2002 at 05:33:55PM -0400
References: <20020512082740.C10230@eecs.tufts.edu> <200205232013.g4NKD6X07596@odiug.zope.com> <20020603112245.E19838@eecs.tufts.edu> <200206031722.g53HMGo02408@pcp742651pcs.reston01.va.comcast.net> <20020605155235.B5911@eecs.tufts.edu> <200206052133.g55LXtv04611@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020605182515.D5911@eecs.tufts.edu>

  Ok. The new version of the patch is in the sourceforge tracker.
Hopefully I haven't forgotten anything. Enjoy all.

               -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From eikeon@eikeon.com  Wed Jun  5 23:41:12 2002
From: eikeon@eikeon.com (Daniel 'eikeon' Krech)
Date: 05 Jun 2002 18:41:12 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <000001c20cdd$56a14330$070210ac@LAPDANCE>
References: <000001c20cdd$56a14330$070210ac@LAPDANCE>
Message-ID: <uvg8xtkl3.fsf@eikeon.com>

"Barry Scott" <barry.alan.scott@ntlworld.com> writes:

> Why not store the key as part of the value.
> 
> 	d[a] = a
> 
> 	d[b] => a
> 
> If you need more info in the value put a class instance or a
> tuple with the key as part of the value.
> 
> 	d[a] = (a,1)

Ideally it would be nice not to have to store it as part of the
value. But that should work. Thank you.

I should have split my question clearly into two questions. Sorry for
dragging the off-topic (for python-dev) aspect of my question into this
list.

The question I tried (poorly) to raise to this list is whether a
get_key(key) -> key as I described could be added to dictionaries in
future versions of Python. I know at least one user who would use it
:)

Thank you from a happy python user,
--eikeon





From barry@barrys-emacs.org  Thu Jun  6 00:14:07 2002
From: barry@barrys-emacs.org (Barry Scott)
Date: Thu, 6 Jun 2002 00:14:07 +0100
Subject: [Python-Dev] "max recursion limit exceeded" canned response?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEOOPJAA.tim.one@comcast.net>
Message-ID: <000301c20ce6$b113c730$070210ac@LAPDANCE>

I take it the bug is that .*? is implemented recursively rather
than iteratively? I wondered if .*? was broken, but it yields the
right answer for short input strings.

The case of * applied to a fixed width term could be implemented
interatively, ".*", "[axz]*" etc. But variable sized terms would
need a record of what they matched for back tracking. For example
"(\w+\s+)*". The compiler can figure these differences out.

Using a back tracking stack allocated from the heap would reduce
the memory used to run the search at the cost of code complexity.

Once the bug is fixed the canned message will only need to cover
the case of greedy repeats * and {n,} encountering an input string
line that is too long?

I'm working on a regex parser/engine for Barry's Emacs and these
design problems are fresh in my thoughts. 

	Barry


-----Original Message-----
From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On
Behalf Of Tim Peters
Sent: 02 June 2002 23:04
To: python-dev@python.org
Subject: RE: [Python-Dev] "max recursion limit exceeded" canned
response?


[Skip Montanaro]
> How would we go about adding a canned response to the commonly submitted
> "max recursion limit exceeded" bug report?

[Martin v. Loewis]
> Post the precise text that you want to see as the canned response, and
> somebody can install it.

I don't think any canned answer will suffice -- every context is different
enough that it needs custom help.  I vote instead that we stop answering
these reports at all:  let /F do it.  That will eventually provoke him into
either writing the canned response he wants to see, or to complete the
long-delayed task of removing this ceiling from sre.



_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev





From guido@python.org  Thu Jun  6 00:25:27 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 05 Jun 2002 19:25:27 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: Your message of "05 Jun 2002 17:33:36 EDT."
 <u1yblv2a7.fsf@eikeon.com>
References: <uvg8xbjfk.fsf@eikeon.com> <m3elflo4co.fsf@mira.informatik.hu-berlin.de>
 <u1yblv2a7.fsf@eikeon.com>
Message-ID: <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net>

> Sorry, seems to me like it was on topic for python-dev seeing as
> python dictionaries do not currently have the functionality I
> desire. And it would make a great addition to an already great
> language, IMO. Did not mean for my message to come across as a
> question about applying python... it is not.

The functionality you propose seems too esoteric to add.  It's probably
simpler to make sure you call intern() before storing the key in the
dict anyway.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From eikeon@eikeon.com  Thu Jun  6 01:02:34 2002
From: eikeon@eikeon.com (Daniel 'eikeon' Krech)
Date: 05 Jun 2002 20:02:34 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net>
References: <uvg8xbjfk.fsf@eikeon.com>
 <m3elflo4co.fsf@mira.informatik.hu-berlin.de>
 <u1yblv2a7.fsf@eikeon.com>
 <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <ur8jltgth.fsf@eikeon.com>

> The functionality you propose seems too esoteric to add.  

Fair enough. The fact that it is a bit esoteric is the reason why I
raised it... to me it seems like a nice boundary case to support, but
an unlikely one unless someone raised it.


> It's probably simpler to make sure you call intern() before storing
> the key in the dict anyway.

It is my understanding that calling intern creates a string that is
forever immortal? [Perhaps this is a question for python-list though?]
For our application, we can not afford to "leak" the memory we would
lose to the immortal strings.

[I will likely proceed by storing the key as part of the value as
Barry suggested.]


Thank you for your time to hear me out. Sorry it was too
esoteric... felt it at least deserved raising... now it is time to
forget that I did :) 

Thank you all for your time,
--eikeon





From pobrien@orbtech.com  Thu Jun  6 01:24:25 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Wed, 5 Jun 2002 19:24:25 -0500
Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention
Message-ID: <NBBBIOJPGKJEKIECEMCBMEEFNDAA.pobrien@orbtech.com>

Forgive me if this is slightly off-topic for this list, but since we've been
talking about migration guides and coding idioms and tweaking performance
and such, I've got a few questions I'd like to ask.

I'll start with an actual code sample. This is a very simple class that's
part of an xhtml toolkit I'm writing.

class Comment:

    def __init__(self, content=''):
        self.content = content

    def __call__(self, content=''):
        o = self.__class__(content)
        return str(o)

    def __str__(self):
        return '<!-- %s -->' % self.content

    def __repr__(self):
        return repr(self.__str__())

When I look at this, I see certain decisions I've made and I'm wondering if
I've made the best decisions. I'm wondering how to balance performance
against clarity and proper coding conventions.

1. In the __call__ I save a reference to the object. Instead, I could
simply:

       return str(self.__class__(content))

Is there much of a performance impact by explicitly naming intermediate
references? (I need some of Tim Peter's performance testing scripts.)

2. I chose the slightly indirect str(o) instead of o.__str__(). Is this
slower? Is one style preferred over the other and why?

3. I used a format string, '<!-- %s -->' % self.content, where I could just
as easily have concatenated '<!-- ' + self.content + ' -->' instead. Is one
faster than the other?

4. Is there any documentation that covers these kinds of issues where there
is more than one way to do something? I'd like to have some foundation for
making these decisions. As you can probably guess, I usually hate having
more than one way to do anything. ;-)

---
Patrick K. O'Brien
Orbtech




From greg@cosc.canterbury.ac.nz  Thu Jun  6 01:26:40 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 06 Jun 2002 12:26:40 +1200 (NZST)
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <uvg8xtkl3.fsf@eikeon.com>
Message-ID: <200206060026.MAA07140@s454.cosc.canterbury.ac.nz>

"Daniel 'eikeon' Krech" <eikeon@eikeon.com>:

> Ideally it would be nice not to have to store it as part of the
> value.

You could keep a separate dictionary mapping each
"canonical" value to itself, and use that for
normalising things before looking up the main
dictionary.
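
Something like this minimal sketch:

    canonical = {}

    def canon(obj):
        # Return the first-seen object equal to obj, remembering it if new.
        return canonical.setdefault(obj, obj)

    main = {}
    suffix = "42"
    a = "node-" + suffix      # two equal but distinct string objects
    b = "node-" + suffix
    main[canon(a)] = 1
    print canon(b) is a, main[canon(b)]    # the stored object a is reused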

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From guido@python.org  Thu Jun  6 01:53:02 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 05 Jun 2002 20:53:02 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: Your message of "Thu, 06 Jun 2002 12:26:40 +1200."
 <200206060026.MAA07140@s454.cosc.canterbury.ac.nz>
References: <200206060026.MAA07140@s454.cosc.canterbury.ac.nz>
Message-ID: <200206060053.g560r2h05238@pcp02138704pcs.reston01.va.comcast.net>

> You could keep a separate dictionary mapping each
> "canonical" value to itself, and use that for
> normalising things before looking up the main
> dictionary.

That's what intern() does.  Can't he just call intern()?  Or does he
want the *uninterned* version of the key back?  Why on earth?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun  6 02:07:11 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 05 Jun 2002 21:07:11 -0400
Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention
In-Reply-To: Your message of "Wed, 05 Jun 2002 19:24:25 CDT."
 <NBBBIOJPGKJEKIECEMCBMEEFNDAA.pobrien@orbtech.com>
References: <NBBBIOJPGKJEKIECEMCBMEEFNDAA.pobrien@orbtech.com>
Message-ID: <200206060107.g5617Cv05298@pcp02138704pcs.reston01.va.comcast.net>

> class Comment:
> 
>     def __init__(self, content=''):
>         self.content = content
> 
>     def __call__(self, content=''):
>         o = self.__class__(content)
>         return str(o)
> 
>     def __str__(self):
>         return '<!-- %s -->' % self.content
> 
>     def __repr__(self):
>         return repr(self.__str__())
> 
> When I look at this, I see certain decisions I've made and I'm wondering if
> I've made the best decisions. I'm wondering how to balance performance
> against clarity and proper coding conventions.
> 
> 1. In the __call__ I save a reference to the object. Instead, I could
> simply:
> 
>        return str(self.__class__(content))
> 
> Is there much of a performance impact by explicitly naming intermediate
> references? (I need some of Tim Peter's performance testing scripts.)

Since o is a "fast local" (all locals are fast locals except when a
function uses exec or import *), it is very fast.  The load and store
of fast locals are about the fastest opcodes around.

I am more worried about the inefficiency of instantiating
self.__class__ and then throwing it away after calling str() on it.
You could factor out the body of __str__ into a separate method so
that you can invoke it from __call__ without creating an instance.
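
For instance, a minimal sketch (the _format name is just for illustration):

    class Comment:
        def __init__(self, content=''):
            self.content = content

        def _format(self, content):
            return '<!-- %s -->' % content

        def __call__(self, content=''):
            return self._format(content)    # no throwaway instance

        def __str__(self):
            return self._format(self.content)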

> 2. I chose the slightly indirect str(o) instead of o.__str__(). Is this
> slower? Is one style preferred over the other and why?

str(o) is preferred.  I would say that you should never call __foo__
methods directly except when you're overriding a base class's __foo__
method.

> 3. I used a format string, '<!-- %s -->' % self.content, where I could just
> as easily have concatenated '<!-- ' + self.content + ' -->'
> instead. Is one faster than the other?

You could time it.  My personal belief is that for more than one +
operator, %s is faster.
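
A quick-and-dirty way to check (numbers will vary by build and platform):

    import time

    content = 'some comment text'
    N = 100000

    t = time.time()
    for i in xrange(N):
        s = '<!-- %s -->' % content
    t_format = time.time() - t

    t = time.time()
    for i in xrange(N):
        s = '<!-- ' + content + ' -->'
    t_concat = time.time() - t

    print 'format: %.3fs  concat: %.3fs' % (t_format, t_concat)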

> 4. Is there any documentation that covers these kinds of issues
> where there is more than one way to do something? I'd like to have
> some foundation for making these decisions. As you can probably
> guess, I usually hate having more than one way to do anything. ;-)

I'm not aware of documentation, and I think you should give yourself
some credit for having a personal opinion.  Study the standard library
and you'll get an idea of what's "done" and what's "not done".

BTW I have another gripe about your example.

>     def __str__(self):
>         return '<!-- %s -->' % self.content
> 
>     def __repr__(self):
>         return repr(self.__str__())

This definition of __repr__ makes no sense to me -- all it does is add
string quotes around the contents of the string (and escape
non-printing characters and quotes if there are any).  That is
confusing, because it will appear to the reader as if the object is a
string.  You probably should write

    __repr__ = __str__

instead.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From greg@cosc.canterbury.ac.nz  Thu Jun  6 02:06:48 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 06 Jun 2002 13:06:48 +1200 (NZST)
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <200206060053.g560r2h05238@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200206060106.NAA07146@s454.cosc.canterbury.ac.nz>

Guido:

> That's what intern() does.  Can't he just call intern()?

Because his keys aren't strings, I think.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From guido@python.org  Thu Jun  6 02:11:47 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 05 Jun 2002 21:11:47 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: Your message of "05 Jun 2002 20:02:34 EDT."
 <ur8jltgth.fsf@eikeon.com>
References: <uvg8xbjfk.fsf@eikeon.com> <m3elflo4co.fsf@mira.informatik.hu-berlin.de> <u1yblv2a7.fsf@eikeon.com> <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net>
 <ur8jltgth.fsf@eikeon.com>
Message-ID: <200206060111.g561Bl205351@pcp02138704pcs.reston01.va.comcast.net>

> It is my understanding that calling intern creates a string that is
> forever immortal? [Perhaps this is a question for python-list
> though?]

Yes.

> For our application, we can not afford to "leak" the memory we would
> lose to the immortal strings.

Thanks for explaining that.  The use case still seems too esoteric to
me to warrant adding a feature to Python.  You could always write an
extension that does it though. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From greg@cosc.canterbury.ac.nz  Thu Jun  6 02:12:56 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 06 Jun 2002 13:12:56 +1200 (NZST)
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <200206060111.g561Bl205351@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz>

> For our application, we can not afford to "leak" the memory we would
> lose to the immortal strings.

Seems to me that if the implementation of interning were
smart enough, it would be able to drop strings that were
not referenced from anywhere else.

Maybe *that* would be a useful feature to add?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From eikeon@eikeon.com  Thu Jun  6 02:29:45 2002
From: eikeon@eikeon.com (Daniel 'eikeon' Krech)
Date: 05 Jun 2002 21:29:45 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <200206060111.g561Bl205351@pcp02138704pcs.reston01.va.comcast.net>
References: <uvg8xbjfk.fsf@eikeon.com>
 <m3elflo4co.fsf@mira.informatik.hu-berlin.de>
 <u1yblv2a7.fsf@eikeon.com>
 <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net>
 <ur8jltgth.fsf@eikeon.com>
 <200206060111.g561Bl205351@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <un0u9tcs6.fsf@eikeon.com>

>Because his keys aren't strings, I think.

Our objects are subclasses of string.


>Seems to me that if the implementation of interning were
>smart enough, it would be able to drop strings that were
>not referenced from anywhere else.

>Maybe *that* would be a useful feature to add?

Yes. Especially if it could be made to work with subclasses of string ;)


--eikeon






From guido@python.org  Thu Jun  6 02:36:26 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 05 Jun 2002 21:36:26 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: Your message of "Thu, 06 Jun 2002 13:12:56 +1200."
 <200206060112.NAA07149@s454.cosc.canterbury.ac.nz>
References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz>
Message-ID: <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net>

> Seems to me that if the implementation of interning were
> smart enough, it would be able to drop strings that were
> not referenced from anywhere else.
> 
> Maybe *that* would be a useful feature to add?

An occasional run through the 'interned' dict (in stringobject.c)
looking for strings with refcount 2 would do this.  Maybe something
for the gc module to handle as a service whenever it runs its
last-generation collection?
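
A rough Python-level sketch of such a sweep (illustrative only -- the real
pass would live in stringobject.c and look at ob_refcnt directly, and the
exact count below is CPython-specific):

    import sys

    interned = {}    # value -> value, standing in for the C 'interned' dict

    def intern_obj(obj):
        return interned.setdefault(obj, obj)

    def sweep():
        # Drop entries whose only references are the dict's own key and
        # value slots.  Seen from getrefcount() here, such an entry shows
        # 5 references: key slot, value slot, the snapshot list element,
        # the loop variable, and getrefcount()'s own argument.
        for obj in list(interned.values()):
            if sys.getrefcount(obj) == 5:
                del interned[obj]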

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun  6 02:44:26 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 05 Jun 2002 21:44:26 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: Your message of "05 Jun 2002 21:29:45 EDT."
 <un0u9tcs6.fsf@eikeon.com>
References: <uvg8xbjfk.fsf@eikeon.com> <m3elflo4co.fsf@mira.informatik.hu-berlin.de> <u1yblv2a7.fsf@eikeon.com> <200206052325.g55NPSm04904@pcp02138704pcs.reston01.va.comcast.net> <ur8jltgth.fsf@eikeon.com> <200206060111.g561Bl205351@pcp02138704pcs.reston01.va.comcast.net>
 <un0u9tcs6.fsf@eikeon.com>
Message-ID: <200206060144.g561iQc05572@pcp02138704pcs.reston01.va.comcast.net>

> Our objects are subclasses of string.

Ah, those can't be interned.

> >Seems to me that if the implementation of interning were
> >smart enough, it would be able to drop strings that were
> >not referenced from anywhere else.
> 
> >Maybe *that* would be a useful feature to add?
> 
> Yes. Especially if it could be made to work with subclasses of string ;)

Alas, subclasses of str can't be interned.  Consider the following
scenario.  You intern a str-subclass-instance with value "foo" that
implements a funky __repr__.  Some other unrelated piece of code
interns the string "foo".  When they apply repr() to it, they'll be
very unhappy that their string has been turned into something else.

(In fact, the interning code, when it sees a str-subclass-instance,
makes a copy as a true str instance.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pobrien@orbtech.com  Thu Jun  6 03:15:51 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Wed, 5 Jun 2002 21:15:51 -0500
Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention
In-Reply-To: <200206060107.g5617Cv05298@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <NBBBIOJPGKJEKIECEMCBOEEJNDAA.pobrien@orbtech.com>

[Guido van Rossum]
>
> I am more worried about the inefficiency of instantiating
> self.__class__ and then throwing it away after calling str() on it.
> You could factor out the body of __str__ into a separate method so
> that you can invoke it from __call__ without creating an instance.

Some more code from the module might help explain this design decision. I'm
still sort of toying with this to see if I like it. The basic idea here is
that I'm trying to support both DOM-like xhtml structures as well as simple
function-like callables that return strings. When the instance is called it
needs a fresh state in order to better mimic a true function. It isn't
immediately obvious to me how I might refactor this to avoid instantiating a
throwaway.

class Element:

    def __init__(self, klass, id, style, title):
        self.name = self.__class__.__name__.lower()
        self.attrs = {
            'class': klass,  # Space-separated list of classes.
            'id': id,        # Document-wide unique id.
            'style': style,  # Associated style info.
            'title': title,  # Advisory title/amplification.
            }

    def attrstring(self):
        attrs = self.attrs.keys()
        attrs.sort()  # Sorting is only cosmetic, not required.
        l = []  # List of formatted attribute/value pairs.
        for attr in attrs:
            value = self.attrs[attr]
            if value is not None and value != '':
                l += ['%s="%s"' % (attr, convert(value))]
        s = ' ' + ' '.join(l)  # Prepend a single space.
        return s.rstrip()  # Reduce to an empty string if no attrs.

    def __str__(self):
        pass

    def __repr__(self):
        return repr(self.__str__())


class EmptyElement(Element):

    def __init__(self, klass=None, id=None, style=None, title=None):
        Element.__init__(self, klass, id, style, title)

    def __call__(self, klass=None, id=None, style=None, title=None):
        o = self.__class__(klass, id, style, title)
        return str(o)

    def __str__(self):
        attrstring = self.attrstring()
        return '<%s%s />\n' % (self.name, attrstring)


class SimpleElement(Element):

    def __init__(self, content='', klass=None, id=None, style=None,
title=None):
        self.content = content
        Element.__init__(self, klass, id, style, title)

    def __call__(self, content='', klass=None, id=None, style=None,
title=None):
        o = self.__class__(content, klass, id, style, title)
        return str(o)

    def __str__(self):
        attrstring = self.attrstring()
        return '<%s%s>\n%s\n</%s>\n' % \
               (self.name, attrstring, convert(self.content), self.name)


class Br(EmptyElement): pass
class Hr(EmptyElement): pass
class P(SimpleElement): pass

# The following singleton instances are callable, returning strings.
# They can be used like simple functions to return properly tagged contents.
br = Br()
comment = Comment()
hr = Hr()
p = P()


> BTW I have another gripe about your example.
>
> >     def __str__(self):
> >         return '<!-- %s -->' % self.content
> >
> >     def __repr__(self):
> >         return repr(self.__str__())
>
> This definition of __repr__ makes no sense to me -- all it does is add
> string quotes around the contents of the string (and escape
> non-printing characters and quotes if there are any).  That is
> confusing, because it will appear to the reader as if the object is a
> string.

Yes. This was a conscious design choice for this particular application.
Maybe there is a better way, and maybe I'm not being too Pythonic, but I'm
not particularly troubled by this even though I know I'm "breaking the
rules".

I guess I don't mind if there is more than one way to do something, as long
as one way is the Python way and the other way is my way. ;-)

---
Patrick K. O'Brien
Orbtech




From barry@zope.com  Thu Jun  6 04:35:55 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 5 Jun 2002 23:35:55 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz>
 <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15614.55451.1832.105520@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    >> Seems to me that if the implementation of interning were smart
    >> enough, it would be able to drop strings that were not
    >> referenced from anywhere else.  Maybe *that* would be a useful
    >> feature to add?

    GvR> An occasional run through the 'interned' dict (in
    GvR> stringobject.c) looking for strings with refcount 2 would do
    GvR> this.  Maybe something for the gc module to handle as a
    GvR> service whenever it runs its last-generation collection?

What about exposing _Py_ReleaseInternedStrings() to Python, say, from
the gc module?

-Barry



From aahz@pythoncraft.com  Thu Jun  6 05:03:33 2002
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 6 Jun 2002 00:03:33 -0400
Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention
In-Reply-To: <NBBBIOJPGKJEKIECEMCBOEEJNDAA.pobrien@orbtech.com>
References: <200206060107.g5617Cv05298@pcp02138704pcs.reston01.va.comcast.net> <NBBBIOJPGKJEKIECEMCBOEEJNDAA.pobrien@orbtech.com>
Message-ID: <20020606040333.GA11085@panix.com>

On Wed, Jun 05, 2002, Patrick K. O'Brien wrote:
>
> class Element:
> 
>     def __str__(self):
>         pass

Dunno about other people's opinions, but I have a strong distaste for
creating methods whose body contains pass.  I always use "raise
NotImplementedError".
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In the end, outside of spy agencies, people are far too trusting and
willing to help."  --Ira Winkler



From pobrien@orbtech.com  Thu Jun  6 05:38:51 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Wed, 5 Jun 2002 23:38:51 -0500
Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention
In-Reply-To: <20020606040333.GA11085@panix.com>
Message-ID: <NBBBIOJPGKJEKIECEMCBMEFBNDAA.pobrien@orbtech.com>

[Aahz]
>
> On Wed, Jun 05, 2002, Patrick K. O'Brien wrote:
> >
> > class Element:
> >
> >     def __str__(self):
> >         pass
>
> Dunno about other people's opinions, but I have a strong distaste for
> creating methods whose body contains pass.  I always use "raise
> NotImplementedError".

I agree. That's a bad habit of mine that I need to change. Thanks for the
reminder.

---
Patrick K. O'Brien
Orbtech




From martin@v.loewis.de  Thu Jun  6 07:36:41 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 06 Jun 2002 08:36:41 +0200
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net>
References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz>
 <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3sn40ud52.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> An occasional run through the 'interned' dict (in stringobject.c)
> looking for strings with refcount 2 would do this.  Maybe something
> for the gc module to handle as a service whenever it runs its
> last-generation collection?

This has the potential of breaking applications that remember the id()
of an interned string, instead of its value.

Regards,
Martin




From martin@v.loewis.de  Thu Jun  6 07:44:30 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 06 Jun 2002 08:44:30 +0200
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <u1yblv2a7.fsf@eikeon.com>
References: <uvg8xbjfk.fsf@eikeon.com>
 <m3elflo4co.fsf@mira.informatik.hu-berlin.de>
 <u1yblv2a7.fsf@eikeon.com>
Message-ID: <m3ofeoucs1.fsf@mira.informatik.hu-berlin.de>

"Daniel 'eikeon' Krech" <eikeon@eikeon.com> writes:

> Sorry, seems to me like it was on topic for python-dev seeing as
> python dictionaries do not currently have the functionality I
> desire. 

It sure is possible. You have been essentially asking the question
"How do I get the canonical member of an equivalence class in Python",
for which the canonical answer is "you intern the one member of each
equivalence class". As Barry Scott explains, this is best done with an
interning dictionary.

You are also asking "how do I efficiently implement sets in
Python". This is almost a FAQ, the answer is "if the elements are
hashable, use a dictionary".
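
For concreteness, here is a minimal sketch of both answers (the helper
names are mine, not a stdlib API): a dict used as an interning table hands
back the canonical member of an equivalence class, and a dict with dummy
values serves as a set.

    _interned = {}

    def intern_value(v):
        # Return the first object seen that compares equal to v.
        return _interned.setdefault(v, v)

    a = (1, 2, 3)
    b = (1, 2, 3)
    assert intern_value(a) is intern_value(b)   # same canonical tuple

    seen = {}                                   # dict used as a set
    for word in ["spam", "egg", "spam"]:
        seen[word] = 1
    print seen.keys()                           # ['egg', 'spam'] (order may vary)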

Regards,
Martin



From loewis@informatik.hu-berlin.de  Thu Jun  6 10:41:15 2002
From: loewis@informatik.hu-berlin.de (Martin v. Löwis)
Date: 06 Jun 2002 11:41:15 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
Message-ID: <j4wutc4udg.fsf@informatik.hu-berlin.de>

What terrible things would happen if ob_size would be changed from int
to size_t?

The question recently came up on comp.lang.python, where the poster
noticed that you cannot mmap large files on a 64-bit system where int
is 32 bits; there is a 2Gib limit on the length of objects on his
specific system.

About the only problem I can see is that you could not store negative
numbers anymore. Is ssize_t universally available, or could it be used on
systems where it is available?

Regards,
Martin



From mal@lemburg.com  Thu Jun  6 11:19:17 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 06 Jun 2002 12:19:17 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
References: <j4wutc4udg.fsf@informatik.hu-berlin.de>
Message-ID: <3CFF3725.303@lemburg.com>

Martin v. Löwis wrote:
> What terrible things would happen if ob_size would be changed from int
> to size_t?

This would cause binary incompatibility for all extension
types on 64-bit systems since the object struct layout
would change (probably not much of an issue since
binary compatibility is not guaranteed between releases anyway).

> The question recently came up on comp.lang.python, where the poster
> noticed that you cannot mmap large files on a 64-bit system where int
> is 32 bits; there is a 2Gib limit on the length of objects on his
> specific system.

Wouldn't it be easier to solve this particular problem in
the type used for mmapping files ?

> About the only problem I can see is that you could not store negative
> numbers anymore. Is ssize_t universally available, or could it be used on
> systems where it is available?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From guido@python.org  Thu Jun  6 13:50:38 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 08:50:38 -0400
Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention
In-Reply-To: Your message of "Wed, 05 Jun 2002 21:15:51 CDT."
 <NBBBIOJPGKJEKIECEMCBOEEJNDAA.pobrien@orbtech.com>
References: <NBBBIOJPGKJEKIECEMCBOEEJNDAA.pobrien@orbtech.com>
Message-ID: <200206061250.g56Cocp06447@pcp02138704pcs.reston01.va.comcast.net>

> Yes. This was a conscious design choice for this particular
> application.  Maybe there is a better way, and maybe I'm not being
> too Pythonic, but I'm not particularly troubled by this even though
> I know I'm "breaking the rules".

Maybe you shouldn't ask for advice if you have it all worked out
already? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun  6 13:57:41 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 08:57:41 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: Your message of "Wed, 05 Jun 2002 23:35:55 EDT."
 <15614.55451.1832.105520@anthem.wooz.org>
References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net>
 <15614.55451.1832.105520@anthem.wooz.org>
Message-ID: <200206061257.g56Cvfg06543@pcp02138704pcs.reston01.va.comcast.net>

>     GvR> An occasional run through the 'interned' dict (in
>     GvR> stringobject.c) looking for strings with refcount 2 would do
>     GvR> this.  Maybe something for the gc module to handle as a
>     GvR> service whenever it runs its last-generation collection?
> 
> What about exposing _Py_ReleaseInternedStrings() to Python, say, from
> the gc module?

If it's going to be an exposed API, it will have to live in
stringobject.c, since the 'interned' dict is a static global there.

Wanna give it a crack?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun  6 13:58:21 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 08:58:21 -0400
Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention
In-Reply-To: Your message of "Thu, 06 Jun 2002 00:03:33 EDT."
 <20020606040333.GA11085@panix.com>
References: <200206060107.g5617Cv05298@pcp02138704pcs.reston01.va.comcast.net> <NBBBIOJPGKJEKIECEMCBOEEJNDAA.pobrien@orbtech.com>
 <20020606040333.GA11085@panix.com>
Message-ID: <200206061258.g56CwLp06558@pcp02138704pcs.reston01.va.comcast.net>

> > class Element:
> > 
> >     def __str__(self):
> >         pass
> 
> Dunno about other people's opinions, but I have a strong distaste for
> creating methods whose body contains pass.  I always use "raise
> NotImplementedError".

But that has different semantics!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aahz@pythoncraft.com  Thu Jun  6 14:13:25 2002
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 6 Jun 2002 09:13:25 -0400
Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention
In-Reply-To: <200206061258.g56CwLp06558@pcp02138704pcs.reston01.va.comcast.net>
References: <200206060107.g5617Cv05298@pcp02138704pcs.reston01.va.comcast.net> <NBBBIOJPGKJEKIECEMCBOEEJNDAA.pobrien@orbtech.com> <20020606040333.GA11085@panix.com> <200206061258.g56CwLp06558@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020606131325.GA21511@panix.com>

On Thu, Jun 06, 2002, Guido van Rossum wrote:
>
>>> class Element:
>>> 
>>>     def __str__(self):
>>>         pass
>> 
>> Dunno about other people's opinions, but I have a strong distaste for
>> creating methods whose body contains pass.  I always use "raise
>> NotImplementedError".
> 
> But that has different semantics!

Yes, exactly.  My point was that one rarely wants the semantics of
"pass" for method definitions, and that goes double or triple for the
special methods such as __str__.  Consider what happens to an
application that calls str() on this object and gets back a None instead
of a string.  Blech -- errors should never pass silently.
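
A minimal sketch (class names hypothetical) of the difference in behaviour:

    class Quiet:
        def __str__(self):
            pass                       # body is just "pass": returns None

    class Loud:
        def __str__(self):
            raise NotImplementedError  # fail loudly, at the real source

    print Quiet().__str__()            # prints "None" -- the bug passes silently
    try:
        str(Quiet())                   # str() does complain, but one step removed
    except TypeError, e:
        print e                        # "__str__ returned non-string ..."
    try:
        str(Loud())
    except NotImplementedError:
        print "caller is told exactly what is missing"
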
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I had lots of reasonable theories about children myself, until I
had some."  --Michael Rios



From guido@python.org  Thu Jun  6 14:19:46 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 09:19:46 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: Your message of "06 Jun 2002 11:41:15 +0200."
 <j4wutc4udg.fsf@informatik.hu-berlin.de>
References: <j4wutc4udg.fsf@informatik.hu-berlin.de>
Message-ID: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net>

> What terrible things would happen if ob_size would be changed from int
> to size_t?

Binary incompatibility on 64-bit platforms, for one.

Also, many other APIs would have to be changed: everything that takes
or returns an int (e.g. PyObject_Size, PySequence_GetItem) would have
to be changed, and again would be a binary incompatibility.  Also
could cause lots of compilation warnings when user code stores the
result into an int.

> The question recently came up on comp.lang.python, where the poster
> noticed that you cannot mmap large files on a 64-bit system where int
> is 32 bits; there is a 2Gib limit on the length of objects on his
> specific system.

That is indeed painful.

> About the only problem I can see is that you could not store negative
> numbers anymore. Is ssize_t universally available, or could it be used on
> systems where it is available?

I've never heard of it, so it must be a relatively newfangled
thing. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun  6 14:07:29 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 09:07:29 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: Your message of "06 Jun 2002 08:36:41 +0200."
 <m3sn40ud52.fsf@mira.informatik.hu-berlin.de>
References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net>
 <m3sn40ud52.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net>

> > An occasional run through the 'interned' dict (in stringobject.c)
> > looking for strings with refcount 2 would do this.  Maybe something
> > for the gc module to handle as a service whenever it runs its
> > last-generation collection?
> 
> This has the potential of breaking applications that remember the id()
> of an interned string, instead of its value.

Ow, good point!  It's also quite possible that there are no outside
references to an interned string, but another string with the same
value still references the interned string from its ob_sinterned
field.  E.g.

    s = "frobnicate"*3
    t = intern(s)
    del t

To solve this, we would have to make the ob_sinterned slot count as a
reference to the interned string.  But then string_dealloc would be
complicated (it would have to call Py_XDECREF(op->ob_sinterned)),
possibly slowing things down.

Is this worth it?  The fear of unbounded growth of the interned
strings table is pretty common amongst authors of serious long-running
programs.
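
A tiny illustration of that worry (in today's CPython, where an interned
string is never freed while the interpreter runs):

    for i in xrange(100000):
        intern("key-%d" % i)    # each generated string now lives in the table
    # No user code references these strings any more, yet none of them can
    # be collected -- which is exactly what bites long-running programs.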

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Thu Jun  6 16:47:19 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 6 Jun 2002 11:47:19 -0400
Subject: Releasing the intern dictionary (was Re: [Python-Dev] d.get_key(key) -> key?)
References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz>
 <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net>
 <15614.55451.1832.105520@anthem.wooz.org>
 <200206061257.g56Cvfg06543@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15615.33799.915681.15874@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    >> What about exposing _Py_ReleaseInternedStrings() to Python,
    >> say, from the gc module?

    GvR> If it's going to be an exposed API, it will have to live in
    GvR> stringobject.c, since the 'interned' dict is a static global
    GvR> there.

Actually, I don't think so, since _Py_ReleaseInternedStrings() is
already an extern function.

    GvR> Wanna give it a crack?

http://sourceforge.net/tracker/index.php?func=detail&aid=565378&group_id=5470&atid=305470

Doc changes and test case included.
-Barry



From gward@python.net  Thu Jun  6 16:20:11 2002
From: gward@python.net (Greg Ward)
Date: Thu, 6 Jun 2002 11:20:11 -0400
Subject: [Python-Dev] Where to put wrap_text()?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKELIPJAA.tim.one@comcast.net>
References: <20020601134236.GA17691@gerg.ca> <LNBBLJKPBEHFEDALKOLCKELIPJAA.tim.one@comcast.net>
Message-ID: <20020606152011.GA16829@gerg.ca>

On 01 June 2002, Tim Peters said:
> [Greg Ward, on wrapping text]
> > ...
> 
> Note that regrtest.py also has a wrapper:
> 
> def printlist(x, width=70, indent=4):
>     """Print the elements of a sequence to stdout.
> 
>     Optional arg width (default 70) is the maximum line length.
>     Optional arg indent (default 4) is the number of blanks with which to
>     begin each line.
>     """

I think this one will probably stand; I've gotten to the point with my
text-wrapping code where I'm reimplementing the various other
text-wrappers people have mentioned on top of it, and
regrtest.printlist() is just not a good fit.  It's for printing
lists compactly, not for filling text.  Whatever.

> Just make sure it handle the union of all possible desires, but has a simple
> and intuitive interface <wink>.

Right.  Gotcha.  Code coming up soon.

        Greg
-- 
Greg Ward - Unix weenie                                 gward@python.net
http://starship.python.net/~gward/
Quick!!  Act as if nothing has happened!



From gward@python.net  Thu Jun  6 16:46:01 2002
From: gward@python.net (Greg Ward)
Date: Thu, 6 Jun 2002 11:46:01 -0400
Subject: [Python-Dev] textwrap.py
Message-ID: <20020606154601.GA16897@gerg.ca>

--jI8keyz6grp/JLjh
Content-Type: text/plain; charset=unknown-8bit
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

Hi all --

since my ISP seems to be taking a holiday today, I was able to polish
off my proposed text-wrapping module.  Of course, it'll sit in my mail
queue until my link to the outside world is back, but never mind.

Anyways, the code is attached.  I don't care if this becomes
textwrap.py, wraptext.py, text/wrap.py, or whatever -- let's concentrate
on the code for now.  I'll also attach my test script, so you can see
what TextWrapper can and cannot do.

Things to note:
  * The code is not locale-aware; it should be to detect sentence
    endings, which it needs to do to ensure that there are two spaces
    after each sentence ending.  Eg. it fixes
       "I have eaten. And you?"
    but not
       "Moi, j'ai mangé. Et toi?"

  * The code is not Unicode-aware.  I have no idea what will happen if
    you pass Unicode strings to it.

  * However, it is hyphen-aware.  Please spend a few minutes gawping at
    the enormity of wordsep_re -- that took a while to get right.  ;-)

  * Despite occasional complaints (hello, Jeremy and Neil),
    I still write "def foo (a, b)" rather than "def foo(a, b)".
    I'll (begrudgingly) fix this before checking anything in.

  * I'm not sure if exposing flags to make whitespace-munging optional
    is a good idea.  Opinions?

  * need to convert the test suite to unittest (I guess)

BTW, as an exercise I implemented Mailman's wrap() function on top of
TextWrapper, and it seems to have worked fine.  I tried to implement
regrtest's printlist(), but it's not a good fit (as I mentioned in my
last post).

        Greg
-- 
Greg Ward - nerd                                        gward@python.net
http://starship.python.net/~gward/
Support bacteria -- it's the only culture some people have!

--jI8keyz6grp/JLjh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="textwrap.py"

"""
Utilities for wrapping text strings and filling text paragraphs.
"""

__revision__ = "$Id$"

import string, re


# XXX is this going to be implemented properly somewhere in 2.3?
def islower (c):
    return c in string.lowercase


class TextWrapper:
    """
    Object for wrapping/filling text.  The public interface consists of
    the wrap() and fill() methods; the other methods are just there for
    subclasses to override in order to tweak the default behaviour.
    If you want to completely replace the main wrapping algorithm,
    you'll probably have to override _wrap_chunks().

    Several instance attributes control various aspects of
    wrapping:
      expand_tabs
        if true (default), tabs in input text will be expanded
        to spaces before further processing.  Each tab will
        become 1 .. 8 spaces, depending on its position in its line.
        If false, each tab is treated as a single character.
      replace_whitespace
        if true (default), all whitespace characters in the input
        text are replaced by spaces after tab expansion.  Note
        that if expand_tabs is false and replace_whitespace is true,
        every tab will be converted to a single space!
      break_long_words
        if true (default), words longer than the line width constraint
        will be broken.  If false, those words will not be broken,
        and some lines might be longer than the width constraint.
    """

    whitespace_trans = string.maketrans(string.whitespace,
                                        ' ' * len(string.whitespace))

    # This funky little regex is just the trick for splitting 
    # text up into word-wrappable chunks.  E.g.
    #   "Hello there -- you goof-ball, use the -b option!"
    # splits into
    #   Hello/ /there/ /--/ /you/ /goof-/ball,/ /use/ /the/ /-b/ /option!
    # (after stripping out empty strings).
    wordsep_re = re.compile(r'(\s+|'                  # any whitespace
                            r'\w{2,}-(?=\w{2,})|'     # hyphenated words
                            r'(?<=\w)-{2,}(?=\w))')   # em-dash


    def __init__ (self):
        self.expand_tabs = 1
        self.replace_whitespace = 1
        self.break_long_words = 1
        

    # -- Private methods -----------------------------------------------
    # (possibly useful for subclasses to override)

    def _munge_whitespace (self, text):
        """_munge_whitespace(text : string) -> string

        Munge whitespace in text: expand tabs and convert all other
        whitespace characters to spaces.  Eg. " foo\tbar\n\nbaz"
        becomes " foo    bar  baz".
        """
        if self.expand_tabs:
            text = text.expandtabs()
        if self.replace_whitespace:
            text = text.translate(self.whitespace_trans)
        return text


    def _split (self, text):
        """_split(text : string) -> [string]

        Split the text to wrap into indivisible chunks.  Chunks are
        not quite the same as words; see wrap_chunks() for full
        details.  As an example, the text
          Look, goof-ball -- use the -b option!
        breaks into the following chunks:
          'Look,', ' ', 'goof-', 'ball', ' ', '--', ' ',
          'use', ' ', 'the', ' ', '-b', ' ', 'option!'
        """
        chunks = self.wordsep_re.split(text)
        chunks = filter(None, chunks)
        return chunks

    def _fix_sentence_endings (self, chunks):
        """_fix_sentence_endings(chunks : [string])

        Correct for sentence endings buried in 'chunks'.  Eg. when the
        original text contains "... foo.\nBar ...", munge_whitespace()
        and split() will convert that to [..., "foo.", " ", "Bar", ...]
        which has one too few spaces; this method simply changes the one
        space to two.
        """
        i = 0
        while i < len(chunks)-1:
            # chunks[i] looks like the last word of a sentence,
            # and it's followed by a single space.
            if (chunks[i][-1] == "." and
                  chunks[i+1] == " " and
                  islower(chunks[i][-2])):
                chunks[i+1] = "  "
                i += 2
            else:
                i += 1

    def _handle_long_word (self, chunks, cur_line, cur_len, width):
        """_handle_long_word(chunks : [string],
                             cur_line : [string],
                             cur_len : int, width : int)

        Handle a chunk of text (most likely a word, not whitespace) that
        is too long to fit in any line.
        """
        space_left = width - cur_len

        # If we're allowed to break long words, then do so: put as much
        # of the next chunk onto the current line as will fit.
        if self.break_long_words:
            cur_line.append(chunks[0][0:space_left])
            chunks[0] = chunks[0][space_left:]

        # Otherwise, we have to preserve the long word intact.  Only add
        # it to the current line if there's nothing already there --
        # that minimizes how much we violate the width constraint.
        elif not cur_line:
            cur_line.append(chunks.pop(0))

        # If we're not allowed to break long words, and there's already
        # text on the current line, do nothing.  Next time through the
        # main loop of _wrap_chunks(), we'll wind up here again, but
        # cur_len will be zero, so the next line will be entirely
        # devoted to the long word that we can't handle right now.

    def _wrap_chunks (self, chunks, width):
        """_wrap_chunks(chunks : [string], width : int) -> [string]

        Wrap a sequence of text chunks and return a list of lines of
        length 'width' or less.  (If 'break_long_words' is false, some
        lines may be longer than 'width'.)  Chunks correspond roughly to
        words and the whitespace between them: each chunk is indivisible
        (modulo 'break_long_words'), but a line break can come between
        any two chunks.  Chunks should not have internal whitespace;
        ie. a chunk is either all whitespace or a "word".  Whitespace
        chunks will be removed from the beginning and end of lines, but
        apart from that whitespace is preserved.
        """
        lines = []

        while chunks:

            cur_line = []                   # list of chunks (to-be-joined)
            cur_len = 0                     # length of current line

            # First chunk on line is whitespace -- drop it.
            if chunks[0].strip() == '':
                del chunks[0]

            while chunks:
                l = len(chunks[0])

                # Can at least squeeze this chunk onto the current line.
                if cur_len + l <= width:
                    cur_line.append(chunks.pop(0))
                    cur_len += l

                # Nope, this line is full.
                else:
                    break

            # The current line is full, and the next chunk is too big to
            # fit on *any* line (not just this one).  
            if chunks and len(chunks[0]) > width:
                self._handle_long_word(chunks, cur_line, cur_len, width)

            # If the last chunk on this line is all whitespace, drop it.
            if cur_line and cur_line[-1].strip() == '':
                del cur_line[-1]

            # Convert current line back to a string and store it in list
            # of all lines (return value).
            if cur_line:
                lines.append(''.join(cur_line))

        return lines


    # -- Public interface ----------------------------------------------

    def wrap (self, text, width):
        """wrap(text : string, width : int) -> [string]

        Split 'text' into multiple lines of no more than 'width'
        characters each, and return the list of strings that results.
        Tabs in 'text' are expanded with string.expandtabs(), and all
        other whitespace characters (including newline) are converted to
        space.
        """
        text = self._munge_whitespace(text)
        if len(text) <= width:
            return [text]
        chunks = self._split(text)
        self._fix_sentence_endings(chunks)
        return self._wrap_chunks(chunks, width)

    def fill (self, text, width, initial_tab="", subsequent_tab=""):
        """fill(text : string,
                width : int,
                initial_tab : string = "",
                subsequent_tab : string = "")
           -> string

        Reformat the paragraph in 'text' to fit in lines of no more than
        'width' columns.  The first line is prefixed with 'initial_tab',
        and subsequent lines are prefixed with 'subsequent_tab'; the
        lengths of the tab strings are accounted for when wrapping lines
        to fit in 'width' columns.
        """
        lines = self.wrap(text, width)
        sep = "\n" + subsequent_tab
        return initial_tab + sep.join(lines)


# Convenience interface

_wrapper = TextWrapper()

def wrap (text, width):
    return _wrapper.wrap(text, width)

def fill (text, width, initial_tab="", subsequent_tab=""):
    return _wrapper.fill(text, width, initial_tab, subsequent_tab)

--jI8keyz6grp/JLjh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="test_textwrap.py"

#!/usr/bin/env python

from textwrap import TextWrapper

num = 0

def test (result, expect):
    global num
    num += 1
    if result == expect:
        print "%d: ok" % num
    else:
        print "%d: not ok, expected:" % num
        for i in range(len(expect)):
            print "  %d: %r" % (i, expect[i])
        print "but got:"
        for i in range(len(result)):
            print "  %d: %r" % (i, result[i])

wrapper = TextWrapper()
wrap = wrapper.wrap


# Simple case: just words, spaces, and a bit of punctuation.
t = "Hello there, how are you this fine day?  I'm glad to hear it!"
test(wrap(t, 12), ["Hello there,",
                   "how are you",
                   "this fine",
                   "day?  I'm",
                   "glad to hear",
                   "it!"])
test(wrap(t, 42), ["Hello there, how are you this fine day?",
                   "I'm glad to hear it!"])
test(wrap(t, 80), [t])

# Whitespace munging and end-of-sentence detection.
t = """\
This is a paragraph that already has
line breaks.  But some of its lines are much longer than the others,
so it needs to be wrapped.
Some lines are \ttabbed too.
What a mess!
"""
test(wrap(t, 45), ["This is a paragraph that already has line",
                   "breaks.  But some of its lines are much",
                   "longer than the others, so it needs to be",
                   "wrapped.  Some lines are  tabbed too.  What a",
                   "mess!"])


# Wrapping to make short lines longer.
t = "This is a\nshort paragraph."
test(wrap(t, 20), ["This is a short",
                   "paragraph."])
test(wrap(t, 40), ["This is a short paragraph."])


# Test breaking hyphenated words.
t = "this-is-a-useful-feature-for-reformatting-posts-from-tim-peters'ly"
test(wrap(t, 40), ["this-is-a-useful-feature-for-",
                   "reformatting-posts-from-tim-peters'ly"])
test(wrap(t, 41), ["this-is-a-useful-feature-for-",
                   "reformatting-posts-from-tim-peters'ly"])
test(wrap(t, 42), ["this-is-a-useful-feature-for-reformatting-",
                   "posts-from-tim-peters'ly"])

# Ensure that the standard _split() method works as advertised in
# the comments (don't you hate it when code and comments diverge?).
t = "Hello there -- you goof-ball, use the -b option!"
test(wrapper._split(t),
     ["Hello", " ", "there", " ", "--", " ", "you", " ", "goof-",
      "ball,", " ", "use", " ", "the", " ", "-b", " ",  "option!"])


text = '''
Did you say "supercalifragilisticexpialidocious?"
How *do* you spell that odd word, anyways?
'''
# XXX sentence ending not detected because of quotes
test(wrap(text, 30),
     ['Did you say "supercalifragilis',
      'ticexpialidocious?" How *do*',
      'you spell that odd word,',
      'anyways?'])
test(wrap(text, 50),
     ['Did you say "supercalifragilisticexpialidocious?"',
      'How *do* you spell that odd word, anyways?'])

wrapper.break_long_words = 0
test(wrap(text, 30),
     ['Did you say',
      '"supercalifragilisticexpialidocious?"',
      'How *do* you spell that odd',
      'word, anyways?'])

--jI8keyz6grp/JLjh--



From aahz@pythoncraft.com  Thu Jun  6 17:15:42 2002
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 6 Jun 2002 12:15:42 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <20020606154601.GA16897@gerg.ca>
References: <20020606154601.GA16897@gerg.ca>
Message-ID: <20020606161541.GA26647@panix.com>

On Thu, Jun 06, 2002, Greg Ward wrote:
>
>   * The code is not locale-aware; it should be to detect sentence
>     endings, which it needs to do to ensure that there are two spaces
>     after each sentence ending.  Eg. it fixes
>        "I have eaten. And you?"
>     but not
>        "Moi, j'ai mangé. Et toi?"

It should fix neither.  However, it should preserve sentence endings:

    "I have
    eaten.  And you?"

becomes

    "I have eaten.  And you?"

Writing the algorithm this way should require no locale-dependent code.
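
A rough sketch of that approach (a hypothetical helper, not part of Greg's
TextWrapper): join physical lines with single spaces and otherwise leave
the author's spacing alone, so a double space already typed after a
sentence survives untouched.

    def unwrap(text):
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        return ' '.join(lines)

    print unwrap("I have\neaten.  And you?")   # -> "I have eaten.  And you?"
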
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I had lots of reasonable theories about children myself, until I
had some."  --Michael Rios



From loewis@informatik.hu-berlin.de  Thu Jun  6 18:44:41 2002
From: loewis@informatik.hu-berlin.de (Martin v. Löwis)
Date: 06 Jun 2002 19:44:41 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net>
References: <j4wutc4udg.fsf@informatik.hu-berlin.de>
 <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <j4ptz4l2t2.fsf@informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Binary incompatibility on 64-bit platforms, for one.

Isn't Python 2.3 breaking binary compatibility, anyway, so that the
PYTHON_API_VERSION must be bumped?

Regards,
Martin




From guido@python.org  Thu Jun  6 18:48:00 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 13:48:00 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: Your message of "06 Jun 2002 19:44:41 +0200."
 <j4ptz4l2t2.fsf@informatik.hu-berlin.de>
References: <j4wutc4udg.fsf@informatik.hu-berlin.de> <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net>
 <j4ptz4l2t2.fsf@informatik.hu-berlin.de>
Message-ID: <200206061748.g56Hm0k15221@odiug.zope.com>

> > Binary incompatibility on 64-bit platforms, for one.
> 
> Isn't Python 2.3 breaking binary compatibility, anyway, so that the
> PYTHON_API_VERSION must be bumped?

Maybe, but we're still trying to be as compatible as possible --
sometimes it helps.

What about my other objections?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From neal@metaslash.com  Thu Jun  6 19:06:26 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 06 Jun 2002 14:06:26 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
Message-ID: <3CFFA4A2.9C2D9313@metaslash.com>

This is a multi-part message in MIME format.
--------------93293161D9DE985912DCB2D9
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Since the subject has come up several times recently,
and some one (Walter?) suggested a PEP be written....here goes.

Attached is a draft PEP.  Comments?

Neal
--------------93293161D9DE985912DCB2D9
Content-Type: text/plain; charset=us-ascii;
 name="pep-nn.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="pep-nn.txt"

PEP: XXX
Title: Backward Compatibility for Standard Library
Version: $Revision:$
Last-Modified: $Date:$
Author: neal@metaslash.com (Neal Norwitz)
Status: Draft
Type: Informational
Created: 06-Jun-2002
Post-History:
Python-Version: 2.3


Abstract

    This PEP describes the packages and modules in the standard
    library which should remain backward compatible with previous
    versions of Python.


Rationale

    Authors have various reasons why packages and modules should
    continue to work with previous versions of Python.  In order to
    maintain backward compatibility for these modules while moving the
    rest of the standard library forward, it is necessary to know
    which modules can be modified and which should use old and
    possibly deprecated features.

    Generally, authors should attempt to keep changes backward
    compatible with the previous released version of Python in order
    to make bug fixes easier to backport.


Backward Compatible Packages & Modules

    Package/Module     Maintainer(s)          Python Version
    --------------     -------------          --------------
    distutils          Andrew Kuchling             1.5.2
    email              Barry Warsaw                2.1
    sre                Fredrik Lundh               1.5.2
    xml (PyXML)        Martin v. Loewis            2.0


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

--------------93293161D9DE985912DCB2D9--




From loewis@informatik.hu-berlin.de  Thu Jun  6 19:16:10 2002
From: loewis@informatik.hu-berlin.de (Martin v. Löwis)
Date: 06 Jun 2002 20:16:10 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib getopt.py,1.17,1.18
In-Reply-To: <3CFF60F9.48B334FF@metaslash.com>
References: <E17FuyU-0002bX-00@usw-pr-cvs1.sourceforge.net>
 <3CFF60F9.48B334FF@metaslash.com>
Message-ID: <j4lm9sl1cl.fsf@informatik.hu-berlin.de>

Neal Norwitz <neal@metaslash.com> writes:

> For additions to the stdlib, should we try to make sure new features
> are used?  In the above code, type(longopts) ... ->
> isinstance(longopts, str) (or basestring?) and all_options_first
> could be a bool.

Done. It really should be ,str), since Unicode in command line options
is not yet supported (although it should be, since, on Windows,
command line options are "natively" Unicode).

Regards,
Martin




From guido@python.org  Thu Jun  6 19:22:32 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 14:22:32 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
In-Reply-To: Your message of "Thu, 06 Jun 2002 14:06:26 EDT."
 <3CFFA4A2.9C2D9313@metaslash.com>
References: <3CFFA4A2.9C2D9313@metaslash.com>
Message-ID: <200206061822.g56IMWT23115@odiug.zope.com>

> Since the subject has come up several times recently,
> and some one (Walter?) suggested a PEP be written....here goes.
> 
> Attached is a draft PEP.  Comments?

Good idea.

Maybe you should mention some of the most common things you need to avoid
to preserve backwards compatibility with 1.5.2, 2.0, 2.1?

Without trying for completeness:

For 1.5.2 (these were introduced in 2.0): string methods, unicode,
augmented assignment, list comprehensions, zip(), dict.setdefault(),
print >>f, calling f(*args), plus all of the following.

For 2.0 (introduced in 2.1): nested scopes with future statement, rich
comparisons, function attributes, plus all of the following.

For 2.1 (introduced in 2.2): new-style classes, iterators, generators
with future statement, nested scopes without future statement, plus
all of the following.

For 2.2 (introduced in 2.3): generators without future statement, bool
(what else?).
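
For concreteness, a few of the 1.5.2-safe spellings implied by the above
(purely illustrative, not text for the PEP):

    import string

    line = "GET /index.html HTTP/1.0"

    # string methods (2.0) -> string module functions
    parts = string.split(line)             # instead of line.split()

    # list comprehensions (2.0) -> an explicit loop
    upper_parts = []
    for p in parts:
        upper_parts.append(string.upper(p))

    # dict.setdefault() (2.0) -> get-then-store; dict.get() is 1.5.2-safe
    counts = {}
    for p in parts:
        counts[p] = counts.get(p, 0) + 1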

--Guido van Rossum (home page: http://www.python.org/~guido/)



From walter@livinglogic.de  Thu Jun  6 19:26:30 2002
From: walter@livinglogic.de (Walter Dörwald)
Date: Thu, 06 Jun 2002 20:26:30 +0200
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com>
Message-ID: <3CFFA956.4000004@livinglogic.de>

Neal Norwitz wrote:

> Since the subject has come up several times recently,
> and some one (Walter?) suggested a PEP be written....here goes.

It was Thomas Heller on http://www.python.org/sf/561478

> [...]
>     Package/Module                Maintainer(s)   Python Version
>     --------------                -------------   --------------
       Tools/freeze/modulefinder.py  Thomas Heller   1.5.2

Bye,
    Walter Dörwald




From fredrik@pythonware.com  Thu Jun  6 19:25:46 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 6 Jun 2002 20:25:46 +0200
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com>
Message-ID: <019a01c20d87$95ac5380$ced241d5@hagrid>

neal wrote:

> Backward Compatible Packages & Modules
> 
>     Package/Module     Maintainer(s)          Python Version
>     --------------     -------------          --------------
>     distutils          Andrew Kuchling             1.5.2
>     email              Barry Warsaw                2.1
>     sre                Fredrik Lundh               1.5.2

+     xmlrpclib         Fredrik Lundh               1.5.2

(the code says 1.5.1, but I don't think I've tested that in
quite a while...)

>     xml (PyXML)        Martin v. Loewis            2.0

</F>




From thomas.heller@ion-tof.com  Thu Jun  6 19:24:19 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 6 Jun 2002 20:24:19 +0200
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com>
Message-ID: <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook>

> Since the subject has come up several times recently,
> and some one (Walter?) suggested a PEP be written....here goes.
It was me (if you mean the comment on bug 561478), but who cares...
> 
> Attached is a draft PEP.  Comments?

Since it may become impossible in the future to remain backward
compatible, should there be a (planned) Python version
which no longer maintains backwards compatibility?

>     Package/Module     Maintainer(s)          Python Version
>     --------------     -------------          --------------
tools/scripts/freeze/modulefinder       ???             1.5.2

Thomas




From loewis@informatik.hu-berlin.de  Thu Jun  6 19:31:10 2002
From: loewis@informatik.hu-berlin.de (Martin v. Löwis)
Date: 06 Jun 2002 20:31:10 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: <200206061748.g56Hm0k15221@odiug.zope.com>
References: <j4wutc4udg.fsf@informatik.hu-berlin.de>
 <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net>
 <j4ptz4l2t2.fsf@informatik.hu-berlin.de>
 <200206061748.g56Hm0k15221@odiug.zope.com>
Message-ID: <j4d6v4l0nl.fsf@informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> What about my other objections?

Besides "breaks binary compatibility", the only other objection was:

> Also could cause lots of compilation warnings when user code stores
> the result into an int.

True; this would be a migration issue. To be safe, we probably would
define Py_size_t (or Py_ssize_t). People on 32-bit platforms would not
notice the problems; people on 64-bit platforms would soon provide
patches to use Py_ssize_t in the core.

That is a lot of work, so it requires careful planning, but I believe
this needs to be done sooner or later. Given MAL's and your response,
I already accepted that it would likely be done rather later than
sooner.

I don't agree with MAL's objection

> Wouldn't it be easier to solve this particular problem in
> the type used for mmapping files ?

Sure, it would be faster and easier, but that is the dark side of the
force. People will find that they cannot have string objects with more
than 2Gib one day, too, and, perhaps somewhat later, that they cannot
have more than 2 milliard objects in a list.

It is unlikely that the problem will go away, so at some point, all
the problems will become pressing. It is perfectly reasonable to defer
the binary breakage to that later point, except that probably more
users will be affected in the future than would be affected now
(because of the current rareness of 64-bit Python installations).

Regards,
Martin



From guido@python.org  Thu Jun  6 19:32:30 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 14:32:30 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
In-Reply-To: Your message of "Thu, 06 Jun 2002 20:24:19 +0200."
 <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook>
References: <3CFFA4A2.9C2D9313@metaslash.com>
 <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook>
Message-ID: <200206061832.g56IWU523219@odiug.zope.com>

> Since it may become impossible in the future to remain backward
> compatible, should there be a (planned) Python version
> which no longer maintains backwards compatibility?

That would be 3.0.

Of course minor incompatibilities creep in at each new release.

> >     Package/Module     Maintainer(s)          Python Version
> >     --------------     -------------          --------------
> tools/scripts/freeze/modulefinder       ???             1.5.2

Can I ask once more why?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From walter@livinglogic.de  Thu Jun  6 19:34:57 2002
From: walter@livinglogic.de (Walter Dörwald)
Date: Thu, 06 Jun 2002 20:34:57 +0200
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook>
Message-ID: <3CFFAB51.4050508@livinglogic.de>

Thomas Heller wrote:

>>Since the subject has come up several times recently,
>>and some one (Walter?) suggested a PEP be written....here goes.
> 
> It was me (if you mean the comment on bug 561478), but who cares...
> 
>>Attached is a draft PEP.  Comments?
> 
> 
> Since it may become impossible in the future to remain backward
> compatible, should there be a (planned) Python version
> which no longer maintains backwards compatibility?
> 
> 
>>    Package/Module     Maintainer(s)          Python Version
>>    --------------     -------------          --------------
> 
> tools/scripts/freeze/modulefinder       ???             1.5.2

Ouch, I misinterpreted your comment on bug #561478.

So who is ???.

Bye,
    Walter Dörwald




From thomas.heller@ion-tof.com  Thu Jun  6 19:37:33 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 6 Jun 2002 20:37:33 +0200
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com>              <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook>  <200206061832.g56IWU523219@odiug.zope.com>
Message-ID: <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook>

> > Since it may become impossible in the future to remain backward
> > compatible, should there be a (planned) Python version
> > which no longer maintains backwards compatibility?
> 
> That would be 3.0.
> 
> Of course minor incompatibilities creep in at each new release.
> 
> > >     Package/Module     Maintainer(s)          Python Version
> > >     --------------     -------------          --------------
> > tools/scripts/freeze/modulefinder       ???             1.5.2
> 
> Can I ask once more why?
> 
I use it in py2exe, and this still supports 1.5.2.

Thomas




From thomas.heller@ion-tof.com  Thu Jun  6 19:38:05 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 6 Jun 2002 20:38:05 +0200
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <3CFFAB51.4050508@livinglogic.de>
Message-ID: <133001c20d89$4bc7ef70$e000a8c0@thomasnotebook>

> >>    Package/Module     Maintainer(s)          Python Version
> >>    --------------     -------------          --------------
> > 
> > tools/scripts/freeze/modulefinder       ???             1.5.2
> 
> Ouch, I misinterpreted your comment on bug #561478.
> 
> So who is ???.
> 
Maybe 'Thomas Heller et al.'

Thomas




From guido@python.org  Thu Jun  6 19:44:21 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 14:44:21 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: Your message of "06 Jun 2002 20:31:10 +0200."
 <j4d6v4l0nl.fsf@informatik.hu-berlin.de>
References: <j4wutc4udg.fsf@informatik.hu-berlin.de> <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <j4ptz4l2t2.fsf@informatik.hu-berlin.de> <200206061748.g56Hm0k15221@odiug.zope.com>
 <j4d6v4l0nl.fsf@informatik.hu-berlin.de>
Message-ID: <200206061844.g56IiMY23310@odiug.zope.com>

> Besides "breaks binary compatibility", the only other objection was:
> 
> > Also could cause lots of compilation warnings when user code stores
> > the result into an int.
> 
> True; this would be a migration issue. To be safe, we probably would
> define Py_size_t (or Py_ssize_t). People on 32-bit platforms would not
> notice the problems; people on 64-bit platforms would soon provide
> patches to use Py_ssize_t in the core.
> 
> That is a lot of work, so it requires careful planning, but I believe
> this needs to be done sooner or later. Given MAL's and your response,
> I already accepted that it would likely be done rather later than
> sooner.

Perhaps we could introduce a new signed type in 2.3 that's implemented
as an int, and switch it to something of the same size as size_t in
a later revision.

> I don't agree with MAL's objection
> 
> > Wouldn't it be easier to solve this particular problem in
> > the type used for mmapping files ?
> 
> Sure, it would be faster and easier, but that is the dark side of the
> force. People will find that they cannot have string objects with more
> than 2Gib one day, too, and, perhaps somewhat later, that they cannot
> have more than 2 milliard objects in a list.

What's a milliard? <US-parochial wink>

Seriously, I think the problem for this "solution" would be that you
can't use index notation on an mmap object, because
PySequence_GetSlice takes two int args.

I'm not very concerned about strings or lists with more than 2GB
items, but I am concerned about other memory buffers.

> It is unlikely that the problem will go away, so at some point, all
> the problems will become pressing. It is perfectly reasonable to defer
> the binary breakage to that later point, except that probably more
> users will be affected in the future than would be affected now
> (because of the current rareness of 64-bit Python installations).

So we should be planning now.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun  6 19:51:30 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 14:51:30 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
In-Reply-To: Your message of "Thu, 06 Jun 2002 20:37:33 +0200."
 <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook>
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com>
 <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook>
Message-ID: <200206061851.g56IpUC23357@odiug.zope.com>

> > > tools/scripts/freeze/modulefinder       ???             1.5.2

I think the maintainer is Mark Hammond.  I doubt he cares about 1.5.2
compatibility though.

> > Can I ask once more why?
> > 
> I use it in py2exe, and this still supports 1.5.2.

Can you elaborate?  Can't you include the last version of
modulefinder.py that supports 1.5.2 in your py2exe distro?  Or run
py2exe with a 1.5.2 python?  It seems to me that modulefinder.py
depends on the dis.py module of the current Python -- how can you use
a modulefinder.py from Python 2.x for a Python 1.5.2 program?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From neal@metaslash.com  Thu Jun  6 20:03:04 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 06 Jun 2002 15:03:04 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com> <200206061822.g56IMWT23115@odiug.zope.com>
Message-ID: <3CFFB1E8.F60E3F4@metaslash.com>

This is a multi-part message in MIME format.
--------------6729E0A908F3290E4EFD95F5
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Guido van Rossum wrote:

> Maybe you should mention some of the most common things you need to avoid
> to preserve backwards compatibility with 1.5.2, 2.0, 2.1?

Updated version attached.  Not sure if the Tools should remain in there.

Neal
--------------6729E0A908F3290E4EFD95F5
Content-Type: text/plain; charset=us-ascii;
 name="pep-nn.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="pep-nn.txt"

PEP: XXX
Title: Backward Compatibility for Standard Library
Version: $Revision:$
Last-Modified: $Date:$
Author: neal@metaslash.com (Neal Norwitz)
Status: Draft
Type: Informational
Created: 06-Jun-2002
Post-History:
Python-Version: 2.3


Abstract

    This PEP describes the packages and modules in the standard
    library which should remain backward compatible with previous
    versions of Python.


Rationale

    Authors have various reasons why packages and modules should
    continue to work with previous versions of Python.  In order to
    maintain backward compatibility for these modules while moving the
    rest of the standard library forward, it is necessary to know
    which modules can be modified and which should use old and
    possibly deprecated features.

    Generally, authors should attempt to keep changes backward
    compatible with the previous released version of Python in order
    to make bug fixes easier to backport.


Features to Avoid

    The following list contains common features to avoid in order
    to maintain backward compatibility with each version of Python.
    This list is not complete!  It is only meant as a general guide.

    Note the features to avoid were implemented in the following
    version.  For example, features listed next to 1.5.2 were
    implemented in 2.0.

        Version    Features
        -------    --------
          1.5.2    string methods, Unicode, list comprehensions, 
                   augmented assignment (eg, +=), zip(), import x as y,
                   dict.setdefault(), print >> f, calling f(*args, **kw),
                   plus 2.0 features

          2.0      nested scopes, rich comparisons, function attributes,
                   plus 2.1 features

          2.1      use of object or new-style classes, iterators, 
                   using generators, nested scopes, or //
                   without from __future__ import ... statement,
                   plus 2.2 features

          2.2      bool, True, False, basestring, enumerate(), {}.pop(),
                   PendingDeprecationWarning, Universal Newlines,
                   plus 2.3 features


Backward Compatible Packages, Modules, and Tools

    Package/Module     Maintainer(s)          Python Version
    --------------     -------------          --------------
    compiler           Jeremy Hylton               2.1
    distutils          Andrew Kuchling             1.5.2
    email              Barry Warsaw                2.1
    sre                Fredrik Lundh               1.5.2
    xml (PyXML)        Martin v. Loewis            2.0
    xmlrpclib          Fredrik Lundh               1.5.2


    Tool                         Maintainer(s)   Python Version
    ----                         -------------   --------------
    scripts/freeze/modulefinder  Thomas Heller       1.5.2


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

--------------6729E0A908F3290E4EFD95F5--




From thomas.heller@ion-tof.com  Thu Jun  6 20:06:53 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 6 Jun 2002 21:06:53 +0200
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com>              <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook>  <200206061851.g56IpUC23357@odiug.zope.com>
Message-ID: <137201c20d8d$5226a790$e000a8c0@thomasnotebook>

> Can you elaborate?  Can't you include the last version of
> modulefinder.py that supports 1.5.2 in your py2exe distro?  Or run
> py2exe with a 1.5.2 python?  It seems to me that modulefinder.py
> depends on the dis.py module of the current Python -- how can you use
> a modulefinder.py from Python 2.x for a Python 1.5.2 program?
> 
First, I want to use a version-independent modulefinder in py2exe,
if possible.
Second, it seems to work from 1.5.2 up to 2.2, currently. Except
for a single use of a string method someone overlooked probably,
see http://www.python.org/sf/564840.

modulefinder simply uses some opnames from dis, and all seem to be
present already in 1.5.2.
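
Roughly the kind of dis usage meant here (a sketch, not the actual
modulefinder code):

    import dis

    # Look the interesting opcodes up by name; these names are already
    # present in the dis module of 1.5.2, which is what keeps this portable.
    IMPORT_NAME = dis.opname.index('IMPORT_NAME')
    IMPORT_FROM = dis.opname.index('IMPORT_FROM')
    STORE_NAME  = dis.opname.index('STORE_NAME')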

Thomas





From guido@python.org  Thu Jun  6 20:08:34 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 15:08:34 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
In-Reply-To: Your message of "Thu, 06 Jun 2002 15:03:04 EDT."
 <3CFFB1E8.F60E3F4@metaslash.com>
References: <3CFFA4A2.9C2D9313@metaslash.com> <200206061822.g56IMWT23115@odiug.zope.com>
 <3CFFB1E8.F60E3F4@metaslash.com>
Message-ID: <200206061908.g56J8YZ23563@odiug.zope.com>

> Updated version attached.  Not sure if the Tools should remain in there.

Go ahead and check it in as PEP 291.  (PEP 290 is reserved for
RaymondH's Migration Guide.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun  6 20:10:52 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 15:10:52 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
In-Reply-To: Your message of "Thu, 06 Jun 2002 21:06:53 +0200."
 <137201c20d8d$5226a790$e000a8c0@thomasnotebook>
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com>
 <137201c20d8d$5226a790$e000a8c0@thomasnotebook>
Message-ID: <200206061910.g56JAq023595@odiug.zope.com>

> > Can you elaborate?  Can't you include the last version of
> > modulefinder.py that supports 1.5.2 in your py2exe distro?  Or run
> > py2exe with a 1.5.2 python?  It seems to me that modulefinder.py
> > depends on the dis.py module of the current Python -- how can you use
> > a modulefinder.py from Python 2.x for a Python 1.5.2 program?
> > 
> First, I want to use a version-independent modulefinder in py2exe,
> if possible.
> Second, it seems to work from 1.5.2 up to 2.2, currently. Except
> for a single use of a string method someone overlooked probably,
> see http://www.python.org/sf/564840.
> 
> modulefinder simply uses some opnames from dis, and all seem to be
> present already in 1.5.2.

So why can't you include a copy of modulefinder.py in your distro?
You seem to be using it beyond its intended use.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas.heller@ion-tof.com  Thu Jun  6 20:15:26 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 6 Jun 2002 21:15:26 +0200
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com>              <137201c20d8d$5226a790$e000a8c0@thomasnotebook>  <200206061910.g56JAq023595@odiug.zope.com>
Message-ID: <13b101c20d8e$83858d00$e000a8c0@thomasnotebook>

> > > Can you elaborate?  Can't you include the last version of
> > > modulefinder.py that supports 1.5.2 in your py2exe distro?  Or run
> > > py2exe with a 1.5.2 python?  It seems to me that modulefinder.py
> > > depends on the dis.py module of the current Python -- how can you use
> > > a modulefinder.py from Python 2.x for a Python 1.5.2 program?
> > > 
> > First, I want to use a version-independent modulefinder in py2exe,
> > if possible.
> > Second, it seems to work from 1.5.2 up to 2.2, currently. Except
> > for a single use of a string method someone overlooked probably,
> > see http://www.python.org/sf/564840.
> > 
> > modulefinder simply uses some opnames from dis, and all seem to be
> > present already in 1.5.2.
> 
> So why can't you include a copy of modulefinder.py in your distro?
That's what I'm doing. I just want to keep it up-to-date with
the latest and greatest version from the Python distro with the minimum
effort.

> You seem to be using it beyond its intended use.
> 
You mean the cross-version use? As I said, it works nice.

Anyway, if there is a strong reason to do so, it can be
removed from PEP 291 - but string methods and booleans
aren't such a reason (IMO).

Thomas




From guido@python.org  Thu Jun  6 20:25:23 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 15:25:23 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
In-Reply-To: Your message of "Thu, 06 Jun 2002 21:15:26 +0200."
 <13b101c20d8e$83858d00$e000a8c0@thomasnotebook>
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com>
 <13b101c20d8e$83858d00$e000a8c0@thomasnotebook>
Message-ID: <200206061925.g56JPN623651@odiug.zope.com>

> > So why can't you include a copy of modulefinder.py in your distro?

> That's what I'm doing. I just want to keep it up-to-date with
> the latest and greatest version from the Python distro with the minimum
> effort.

The solution is simple.  Just don't pull the new copy from the next
Python release.

> > You seem to be using it beyond its intended use.

> You mean the cross-version use? As I said, it works nicely.

No, I meant that this module is part of the freeze tool.  That has no
requirement to be backwards compatible, since each Python version
comes with its own version of freeze.  Suppose that the .pyc file
format changes in a backwards incompatible way (we're considering this
too) and suppose modulefinder has to be changed.  I think it should be
possible to do that without consideration for older Python versions.

> Anyway, if there is a strong reason to do so, it can be
> removed from PEP 291 - but string methods and booleans
> aren't such a reason (IMO).

I think it should be removed.  I want to avoid having random claims
for backwards compatibility of arbitrary parts of the Python
distribution, because the more of these we have, the more constrained
we are as maintainers.

The other cases are all packages that are being distributed separately
by their maintainers for use with older Python versions.  I think your
use case is considerably different -- you are simply borrowing a
module.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas.heller@ion-tof.com  Thu Jun  6 20:38:09 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 6 Jun 2002 21:38:09 +0200
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com>              <13b101c20d8e$83858d00$e000a8c0@thomasnotebook>  <200206061925.g56JPN623651@odiug.zope.com>
Message-ID: <143201c20d91$b02be180$e000a8c0@thomasnotebook>

> > > You seem to be using it beyond its intended use.
> 
> > You mean the cross-version use? As I said, it works nicely.
> 
> No, I meant that this module is part of the freeze tool.  That has no
> requirement to be backwards compatible, since each Python version
> comes with its own version of freeze.  Suppose that the .pyc file
> format changes in a backwards incompatible way (we're considering this
> too) and suppose modulefinder has to be changed.  I think it should be
> possible to do that without consideration for older Python versions.

In this case I propose to add it to the standard library
(or maybe Gordon's mf replacement, together with iu, his imputil
replacement ?).

> 
> > Anyway, if there is a strong reason to do so, it can be
> > removed from PEP 291 - but string methods and booleans
> > aren't such a reason (IMO).
> 
> I think it should be removed.  I want to avoid having random claims
> for backwards compatibility of arbitrary parts of the Python
> distribution, because the more of these we have, the more constrained
> we are as maintainers.
> 
Ok.

> The other cases are all packages that are being distributed separately
> by their maintainers for use with older Python versions.  I think your
> use case is considerably different -- you are simply borrowing a
> module.
> 

Thomas




From guido@python.org  Thu Jun  6 20:49:51 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 15:49:51 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
In-Reply-To: Your message of "Thu, 06 Jun 2002 21:38:09 +0200."
 <143201c20d91$b02be180$e000a8c0@thomasnotebook>
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com> <13b101c20d8e$83858d00$e000a8c0@thomasnotebook> <200206061925.g56JPN623651@odiug.zope.com>
 <143201c20d91$b02be180$e000a8c0@thomasnotebook>
Message-ID: <200206061949.g56JnpZ30419@odiug.zope.com>

> > Suppose that the .pyc file
> > format changes in a backwards incompatible way (we're considering this
> > too) and suppose modulefinder has to be changed.  I think it should be
> > possible to do that without consideration for older Python versions.

> In this case I propose to add it to the standard library

That's a non-sequitur.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas.heller@ion-tof.com  Thu Jun  6 20:53:17 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 6 Jun 2002 21:53:17 +0200
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com> <13b101c20d8e$83858d00$e000a8c0@thomasnotebook> <200206061925.g56JPN623651@odiug.zope.com>              <143201c20d91$b02be180$e000a8c0@thomasnotebook>  <200206061949.g56JnpZ30419@odiug.zope.com>
Message-ID: <144e01c20d93$cd504a60$e000a8c0@thomasnotebook>

> > > Suppose that the .pyc file
> > > format changes in a backwards incompatible way (we're considering this
> > > too) and suppose modulefinder has to be changed.  I think it should be
> > > possible to do that without consideration for older Python versions.
> 
> > In this case I propose to add it to the standard library
> 
> That's a non-sequitur.

I don't understand what you mean.
What I mean is:
If modulefinder provides functionality outside the freeze tool,
and if it is maintained anyway, why not add it to the library?

Thomas




From guido@python.org  Thu Jun  6 21:01:16 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 16:01:16 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
In-Reply-To: Your message of "Thu, 06 Jun 2002 21:53:17 +0200."
 <144e01c20d93$cd504a60$e000a8c0@thomasnotebook>
References: <3CFFA4A2.9C2D9313@metaslash.com> <12f601c20d87$5fe3dcf0$e000a8c0@thomasnotebook> <200206061832.g56IWU523219@odiug.zope.com> <132a01c20d89$38c1f0b0$e000a8c0@thomasnotebook> <200206061851.g56IpUC23357@odiug.zope.com> <137201c20d8d$5226a790$e000a8c0@thomasnotebook> <200206061910.g56JAq023595@odiug.zope.com> <13b101c20d8e$83858d00$e000a8c0@thomasnotebook> <200206061925.g56JPN623651@odiug.zope.com> <143201c20d91$b02be180$e000a8c0@thomasnotebook> <200206061949.g56JnpZ30419@odiug.zope.com>
 <144e01c20d93$cd504a60$e000a8c0@thomasnotebook>
Message-ID: <200206062001.g56K1G730686@odiug.zope.com>

> > > > Suppose that the .pyc file
> > > > format changes in a backwards incompatible way (we're considering this
> > > > too) and suppose modulefinder has to be changed.  I think it should be
> > > > possible to do that without consideration for older Python versions.
> > 
> > > In this case I propose to add it to the standard library
> > 
> > That's a non-sequitur.
> 
> I don't understand what you mean.
> What I mean is:
> If modulefinder provides functionality outside the freeze tool,
> and if it is maintained anyway, why not add it to the library?

My "Suppose that..." was in the context of the freeze tool.  Since
modulefinder.py looks in .pyc files, it has to track the .pyc file
format, hence it cannot be required to be 1.5.2 compatible.

Only *you* are claiming that it is useful outside the freeze tool.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From mal@lemburg.com  Thu Jun  6 21:02:25 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 06 Jun 2002 22:02:25 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
References: <j4wutc4udg.fsf@informatik.hu-berlin.de>	<200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net>	<j4ptz4l2t2.fsf@informatik.hu-berlin.de>	<200206061748.g56Hm0k15221@odiug.zope.com> <j4d6v4l0nl.fsf@informatik.hu-berlin.de>
Message-ID: <3CFFBFD1.1050305@lemburg.com>

Martin v. Löwis wrote:
> Guido van Rossum <guido@python.org> writes:
> 
> 
>>What about my other objections?
> 
> 
> Besides "breaks binary compatibility", the only other objection was:
> 
> 
>>Also could cause lots of compilation warnings when user code stores
>>the result into an int.
> 
> 
> True; this would be a migration issue. To be safe, we probably would
> define Py_size_t (or Py_ssize_t). People on 32-bit platforms would not
> notice the problems; people on 64-bit platforms would soon provide
> patches to use Py_ssize_t in the core.
> 
> That is a lot of work, so it requires careful planning, but I believe
> this needs to be done sooner or later. Given MAL's and your response,
> I already accepted that it would likely be done rather later than
> sooner.
> 
> I don't agree with MAL's objection

Not that I would be surprised ;-)... but which one ?

>>Wouldn't it be easier to solve this particular problem in
>>the type used for mmapping files ?
> 
> 
> Sure, it would be faster and easier, but that is the dark side of the
> force. People will find that they cannot have string objects with more
> than 2Gib one day, too, and, perhaps somewhat later, that they cannot
> have more than 2 milliard objects in a list.
> 
> It is unlikely that the problem will go away, so at some point, all
> the problems will become pressing. It is perfectly reasonable to defer
> the binary breakage to that later point, except that probably more
> users will be affected in the future than would be affected now
> (because of the current rareness of 64-bit Python installations).

Why not leave this for Py3K when 64-bit platforms will have
become common enough to make this a real need (I doubt that
anyone is using 1GB Python strings nowadays without getting
MemoryErrors :-).

Until then, I'd rather like to see the file IO APIs and related
types fixed so that they can handle 2GB files all the way
through.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From guido@python.org  Thu Jun  6 21:04:47 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 16:04:47 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: Your message of "Thu, 06 Jun 2002 22:02:25 +0200."
 <3CFFBFD1.1050305@lemburg.com>
References: <j4wutc4udg.fsf@informatik.hu-berlin.de> <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <j4ptz4l2t2.fsf@informatik.hu-berlin.de> <200206061748.g56Hm0k15221@odiug.zope.com> <j4d6v4l0nl.fsf@informatik.hu-berlin.de>
 <3CFFBFD1.1050305@lemburg.com>
Message-ID: <200206062004.g56K4lS30831@odiug.zope.com>

> Until then, I'd rather like to see the file IO APIs and related
> types fixed so that they can handle 2GB files all the way
> through.

Which file IO APIs need to be fixed?  I thought we supported large
files already (when the OS supports them)?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake@acm.org  Thu Jun  6 20:12:36 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 6 Jun 2002 15:12:36 -0400
Subject: [Python-Dev] trace.py and the obscurity of Tools/scripts/
In-Reply-To: <15543.28437.909937.768107@slothrop.zope.com>
References: <15541.51980.403233.710018@anthem.wooz.org>
 <BIEJKCLHCIOIHAGOKOLHMEJPCPAA.tim.one@comcast.net>
 <15541.53473.786848.71301@anthem.wooz.org>
 <E16vjAg-0007Q9-00@imp>
 <15543.27943.6250.555793@12-248-41-177.client.attbi.com>
 <15543.28437.909937.768107@slothrop.zope.com>
Message-ID: <15615.46116.832422.534079@grendel.zope.com>

[cleaning out some old mail...]

Zooko wrote:
 > So in terms of `trace.py', it is a widely useful tool and
 > already has a programmatic interface.  Being added to the
 > hallowed Python Standard Library would be a major step up in
 > publicity and hence usage.  It would require better docs
 > regarding the programmatic usage.

Skip wrote:
 > Its speed cries out for a rewrite of some sort.  I haven't
 > thought about it, but I wonder if it could be layered on top of
 > hotshot.

Jeremy Hylton writes:
 > Can any of the handler methods be re-coded in C?  The hotshot changes
 > allow you to install a C function as a trace hook, but doesn't the
 > trace function in trace.py do a fair amount of work?

Another possibility is to use _hotshot.coverage() instead of the
"normal" profiler in HotShot.  This records the enter/exit events and
SET_LINENO instructions, but doesn't take any timing measurements, so
it operates much more quickly than profiling.
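
For the record, here is roughly how that might be used -- the exact
signatures are from memory, so treat them as assumptions rather than
documented API:

    import _hotshot

    # Record enter/exit and line events, but no timings, to a log file.
    p = _hotshot.coverage("trace.cover")   # log file name is arbitrary
    p.start()
    run_the_code_under_test()              # hypothetical function
    p.stop()
    p.close()
    # The log can then be post-processed (e.g. with hotshot.log.LogReader)
    # to tally which lines were executed.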


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From martin@v.loewis.de  Thu Jun  6 21:16:04 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 06 Jun 2002 22:16:04 +0200
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net>
References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz>
 <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net>
 <m3sn40ud52.fsf@mira.informatik.hu-berlin.de>
 <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3k7pcw4cb.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> To solve this, we would have to make the ob_sinterned slot count as a
> reference to the interned string.  But then string_dealloc would be
> complicated (it would have to call Py_XDECREF(op->ob_sinterned)),
> possibly slowing things down.
> 
> Is this worth it?  

That (latter) change seems "right" regardless of whether interned
strings are ever released.

> The fear for unbounded growth of the interned strings table is
> pretty common amongst authors of serious long-running programs.

I think it is. Unbounded growth of interned strings is precisely the
reason why the XML libraries repeatedly came up with their own
interning dictionaries, which only persist for the lifetime of parsing
the document, since the next document may want to intern entirely
different things. This is the reason that the intern() function is bad
to use for most applications.
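
A minimal sketch of that per-document pattern (the class here is made
up for illustration, not an existing API) -- the table lives only as
long as the parse, so nothing is pinned forever the way intern() would
pin it:

    class LocalInterner:
        """Share equal strings for the duration of one parse."""
        def __init__(self):
            self._table = {}
        def intern(self, s):
            return self._table.setdefault(s, s)

    interner = LocalInterner()
    a = interner.intern("tag" + "name")
    b = interner.intern("tagname")
    assert a is b          # equal strings now share one object
    del interner           # parse done; the table can be collected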

Regards,
Martin




From mal@lemburg.com  Thu Jun  6 21:25:11 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 06 Jun 2002 22:25:11 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
References: <j4wutc4udg.fsf@informatik.hu-berlin.de> <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <j4ptz4l2t2.fsf@informatik.hu-berlin.de> <200206061748.g56Hm0k15221@odiug.zope.com> <j4d6v4l0nl.fsf@informatik.hu-berlin.de>              <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com>
Message-ID: <3CFFC527.1090907@lemburg.com>

Guido van Rossum wrote:
>>Until then, I'd rather like to see the file IO APIs and related
>>types fixed so that they can handle 2GB files all the way
>>through.
> 
> 
> Which file IO APIs need to be fixed?  I thought we supported large
> files already (when the OS supports them)?

The file object does, but the mmap module doesn't, and
it is not clear to me whether all code in the standard lib
can actually deal with file positions outside the int range
(most code probably doesn't care, since it uses .read()
and .write() exclusively), e.g. can SRE scan mmapped
files of such size?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From guido@python.org  Thu Jun  6 21:33:57 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 16:33:57 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: Your message of "Thu, 06 Jun 2002 22:25:11 +0200."
 <3CFFC527.1090907@lemburg.com>
References: <j4wutc4udg.fsf@informatik.hu-berlin.de> <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <j4ptz4l2t2.fsf@informatik.hu-berlin.de> <200206061748.g56Hm0k15221@odiug.zope.com> <j4d6v4l0nl.fsf@informatik.hu-berlin.de> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com>
 <3CFFC527.1090907@lemburg.com>
Message-ID: <200206062033.g56KXvJ05145@odiug.zope.com>

> >>Until then, I'd rather like to see the file IO APIs and related
> >>types fixed so that they can handle 2GB files all the way
> >>through.

(I suppose you meant >2GB files.)

> > Which file IO APIs need to be fixed?  I thought we supported large
> > files already (when the OS supports them)?
> 
> The file object does, but the mmap module doesn't, and
> it is not clear to me whether all code in the standard lib
> can actually deal with file positions outside the int range
> (most code probably doesn't care, since it uses .read()
> and .write() exclusively), e.g. can SRE scan mmapped
> files of such size?

On a 32-bit machine you can mmap at most 2 GB anyway I expect, due to
the VM architecture (and otherwise the limit would obviously be 4 GB).

In which architecture are you interested?  The only place where this
might be a problem is when a pointer is 64 bits but an int is 32 bits.

What other modules are you worried about?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik@pythonware.com  Thu Jun  6 21:53:44 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 6 Jun 2002 22:53:44 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
References: <j4wutc4udg.fsf@informatik.hu-berlin.de> <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <j4ptz4l2t2.fsf@informatik.hu-berlin.de> <200206061748.g56Hm0k15221@odiug.zope.com> <j4d6v4l0nl.fsf@informatik.hu-berlin.de> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com>              <3CFFC527.1090907@lemburg.com>  <200206062033.g56KXvJ05145@odiug.zope.com>
Message-ID: <03e201c20d9c$4213d350$ced241d5@hagrid>

Guido van Rossum wrote:

> In which architecture are you interested?  The only place where this
> might be a problem is when a pointer is 64 bits but an int is 32 bits.

which means all 64-bit Unix machines...

</F>




From gward@python.net  Thu Jun  6 21:49:01 2002
From: gward@python.net (Greg Ward)
Date: Thu, 6 Jun 2002 16:49:01 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <20020606161541.GA26647@panix.com>
References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com>
Message-ID: <20020606204901.GA18310@gerg.ca>

On 06 June 2002, Aahz said:
> It should fix neither.  However, it should preserve sentence endings:

Actually, what it does fix is this:

  blah blah blah here's the end of sentence at the end of a line.
  And here's the next sentence.

Which, after whitespace-mangling, becomes

  ... end of a line. And here's ...

which is incorrect (sentences should be separated by two spaces in
fixed-width fonts).  The catch is that single-space-separated sentences
elsewhere in the text are also fixed, which *I* think is a good thing,
but should be optional.
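
The post-processing boils down to something like this sketch (not the
actual textwrap code, just the idea):

    import re

    _sentence_end = re.compile(r'([a-z][.!?]) (?=[A-Z])')

    def fix_sentence_endings(text):
        # Ensure two spaces after anything that looks like a sentence end.
        return _sentence_end.sub(r'\1  ', text)

    print fix_sentence_endings("end of a line. And here's the next sentence.")
    # -> end of a line.  And here's the next sentence.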

Whatever -- minor detail.  I'll check it in first and then worry about
making more features optional.

        Greg
-- 
Greg Ward - Unix geek                                   gward@python.net
http://starship.python.net/~gward/
Budget's in the red?  Let's tax religion!
    -- Dead Kennedys



From mal@lemburg.com  Thu Jun  6 22:39:58 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 06 Jun 2002 23:39:58 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
References: <j4wutc4udg.fsf@informatik.hu-berlin.de> <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <j4ptz4l2t2.fsf@informatik.hu-berlin.de> <200206061748.g56Hm0k15221@odiug.zope.com> <j4d6v4l0nl.fsf@informatik.hu-berlin.de> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com>              <3CFFC527.1090907@lemburg.com> <200206062033.g56KXvJ05145@odiug.zope.com>
Message-ID: <3CFFD6AE.9020602@lemburg.com>

Guido van Rossum wrote:
>>>>Until then, I'd rather like to see the file IO APIs and related
>>>>types fixed so that they can handle 2GB files all the way
>>>>through.
>>>
> 
> (I suppose you meant >2GB files.)

Yes.

>>>Which file IO APIs need to be fixed?  I thought we supported large
>>>files already (when the OS supports them)?
>>
>>The file object does, but the mmap module doesn't, and
>>it is not clear to me whether all code in the standard lib
>>can actually deal with file positions outside the int range
>>(most code probably doesn't care, since it uses .read()
>>and .write() exclusively), e.g. can SRE scan mmapped
>>files of such size?
> 
> 
> On a 32-bit machine you can mmap at most 2 GB anyway I expect, due to
> the VM architecture (and otherwise the limit would obviously be 4 GB).
 >
> In which architecture are you interested?  The only place where this
> might be a problem is when a pointer is 64 bits but an int is 32 bits.

64-bit Unix systems such as AIX 5L.

> What other modules are you worried about?

I'm not worried about any modules... take this as a PEP-42 wish:
someone would need to check all the code using e.g. file.seek()
and file.tell() to make sure that it works correctly with
long values.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From fredrik@pythonware.com  Thu Jun  6 22:44:02 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 6 Jun 2002 23:44:02 +0200
Subject: [Python-Dev] textwrap.py
References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com> <20020606204901.GA18310@gerg.ca>
Message-ID: <047001c20da3$49c4f4b0$ced241d5@hagrid>

Greg Ward wrote:

> Which, after whitespace-mangling, becomes
> 
>   ... end of a line. And here's ...
> 
> which is incorrect (sentences should be separated by two spaces in
> fixed-width fonts).

that depends on the locale.

the two space rule does not apply to swedish, for example.

and googling for "two space rule" and "one space after" + period
makes me think it doesn't really apply to english either...  see eg

http://www.press.uchicago.edu/Misc/Chicago/cmosfaq/cmosfaq.OneSpaceorTwo.html

    "There is a traditional American practice, favored by some,
    of leaving two spaces after colons and periods. This practice
    is discouraged /.../"

(and thousands of similar entries.  from what I can tell, the more
*real* research done by an author, the more likely he is to come
down on the one space side...)

</F>




From martin@v.loewis.de  Thu Jun  6 22:55:03 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 06 Jun 2002 23:55:03 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: <3CFFD6AE.9020602@lemburg.com>
References: <j4wutc4udg.fsf@informatik.hu-berlin.de>
 <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net>
 <j4ptz4l2t2.fsf@informatik.hu-berlin.de>
 <200206061748.g56Hm0k15221@odiug.zope.com>
 <j4d6v4l0nl.fsf@informatik.hu-berlin.de>
 <3CFFBFD1.1050305@lemburg.com>
 <200206062004.g56K4lS30831@odiug.zope.com>
 <3CFFC527.1090907@lemburg.com>
 <200206062033.g56KXvJ05145@odiug.zope.com>
 <3CFFD6AE.9020602@lemburg.com>
Message-ID: <m3k7pcul6w.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> 64-bit Unix systems such as AIX 5L.
> 
> > What other modules are you worried about?
> 
> I'm not worried about any modules... take this as a PEP-42 wish:
> someone would need to check all the code using e.g. file.seek()
> and file.tell() to make sure that it works correctly with
> long values.

That is supposed to work today. If it doesn't, make a detailed bug
report.

Regards,
Martin




From DavidA@ActiveState.com  Fri Jun  7 00:31:17 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Thu, 06 Jun 2002 16:31:17 -0700
Subject: [Python-Dev] textwrap.py
References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com> <20020606204901.GA18310@gerg.ca> <047001c20da3$49c4f4b0$ced241d5@hagrid>
Message-ID: <3CFFF0C5.7010902@ActiveState.com>

Fredrik Lundh wrote:

>Greg Ward wrote:
>
>  
>
>>Which, after whitespace-mangling, becomes
>>
>>  ... end of a line. And here's ...
>>
>>which is incorrect (sentences should be separated by two spaces in
>>fixed-width fonts).
>>    
>>
>
>that depends on the locale.
>
>the two space rule does not apply to swedish, for example.
>
>and googling for "two space rule" and "one space after" + period
>makes me think it doesn't really apply to english either...  see eg
>
>http://www.press.uchicago.edu/Misc/Chicago/cmosfaq/cmosfaq.OneSpaceorTwo.html
>
>    "There is a traditional American practice, favored by some,
>    of leaving two spaces after colons and periods. This practice
>    is discouraged /.../"
>
>(and thousands of similar entries.  from what I can tell, the more
>*real* research done by an author, the more likely he is to come
>down on the one space side...)
>  
>
I did some research on this in a previous life, and my memory is that 
the two-space rule was designed, much like using underscores, as a guide 
to the typesetter, since periods are easily missed.  Underscores were an 
indication that the text should be emphasized, and no well-typeset 
document will include real underscores (except for "effect").

--da




From tim.one@comcast.net  Fri Jun  7 01:12:58 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 06 Jun 2002 20:12:58 -0400
Subject: [Python-Dev] Where to put wrap_text()?
In-Reply-To: <20020606152011.GA16829@gerg.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEAMPLAA.tim.one@comcast.net>

[Tim]
> Note that regrtest.py also has a wrapper:
>
> def printlist(x, width=70, indent=4):
>     """Print the elements of a sequence to stdout.
>
>     Optional arg width (default 70) is the maximum line length.
>     Optional arg indent (default 4) is the number of blanks
>     with which to begin each line.
>    """

[Greg Ward]
> I think this one will probably stand; I've gotten to the point with my
> text-wrapping code where I'm reimplementing the various other
> text-wrappers people have mentioned on top of it, and
> regrtest.printlist() is just not a good fit.  It's for printing
> lists compactly, not for filling text.  Whatever.

regrtest's printlist is trivial to implement on top of the code you posted:

def printlist(x, width=70, indent=4):
    guts = map(str, x)
    blanks = ' ' * indent
    w = textwrap.TextWrapper()
    print w.fill(' '.join(guts), width, blanks, blanks)

TextWrapper certainly doesn't have to worry about changing the list into a
string; all I want is for it to wrap a string, and it does.

>> Just make sure it handle the union of all possible desires, but
>> has a simple and intuitive interface <wink>.

> Right.  Gotcha.  Code coming up soon.

It's no more than 10x more elaborate than necessary, so ship it <wink>.




From guido@python.org  Fri Jun  7 01:19:54 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 20:19:54 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: Your message of "Thu, 06 Jun 2002 23:44:02 +0200."
 <047001c20da3$49c4f4b0$ced241d5@hagrid>
References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com> <20020606204901.GA18310@gerg.ca>
 <047001c20da3$49c4f4b0$ced241d5@hagrid>
Message-ID: <200206070019.g570Jsx07402@pcp02138704pcs.reston01.va.comcast.net>

> (and thousands of similar entries.  from what I can tell, the more
> *real* research done by an author, the more likely he is to come
> down on the one space side...)

Knuth, when he invented TeX, heavily promoted a typesetting rule (for
variable-width fonts) that allowed the whitespace after a full stop to
stretch more than regular word space.  The Emacs folks, who love
Knuth, translated this idea for fixed-width text into two spaces.

Note that HTML also doesn't do this -- it always single-spaces text.
Looks fine to me.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun  7 01:23:46 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 20:23:46 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: Your message of "06 Jun 2002 23:55:03 +0200."
 <m3k7pcul6w.fsf@mira.informatik.hu-berlin.de>
References: <j4wutc4udg.fsf@informatik.hu-berlin.de> <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <j4ptz4l2t2.fsf@informatik.hu-berlin.de> <200206061748.g56Hm0k15221@odiug.zope.com> <j4d6v4l0nl.fsf@informatik.hu-berlin.de> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com> <3CFFC527.1090907@lemburg.com> <200206062033.g56KXvJ05145@odiug.zope.com> <3CFFD6AE.9020602@lemburg.com>
 <m3k7pcul6w.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206070023.g570NkZ07427@pcp02138704pcs.reston01.va.comcast.net>

> > I'm not worried about any modules... take this as a PEP-42 wish:
> > someone would need to check all the code using e.g. file.seek()
> > and file.tell() to make sure that it works correctly with
> > long values.
> 
> That is supposed to work today. If it doesn't, make a detailed bug
> report.

While file.seek() and file.tell() are indeed fixed, I think MAL has a
fear that some modules don't like getting a long from tell().  I fixed
a bug of this kind in dumbdbm more than three years ago, when a long
wasn't acceptable as a multiplier in string repetition.  Since then,
longs aren't quite so poisonous as they once were, and I don't think
this fear is rational any more.
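
A tiny illustration of the failure mode (a sketch; exactly which
release fixed it is from memory):

    pos = 3L                  # file.tell() can return a long past 2GB
    print '.' * int(pos)      # has always worked
    print '.' * pos           # TypeError in old Pythons; fine in current ones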

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jun  7 01:48:53 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 06 Jun 2002 20:48:53 -0400
Subject: [Python-Dev] Bizarre new test failure
Message-ID: <LNBBLJKPBEHFEDALKOLCMEAOPLAA.tim.one@comcast.net>

Guido noticed this on Linux late this afternoon.  I've seen it on Win2K and
Win98SE since.  test_gc fails if you run the whole test suite from the
start:

test test_gc failed -- test_list: actual 10, expected 1

This seems impossible (look at the test).  It doesn't fail in isolation.  It
fails in both debug and release builds.  Not *all* tests before test_gc need
to be run in order to provoke a failure, but I can detect no sense in which
ones do need to be run (not just one or two, but lots of them).

Here are the files that changed between a Python that does work (yesterday)
and now:

P python/configure
P python/configure.in
P python/pyconfig.h.in
P python/Doc/lib/libgetopt.tex
P python/Doc/lib/libsocket.tex
P python/Lib/copy.py
P python/Lib/fileinput.py
P python/Lib/getopt.py
P python/Lib/posixpath.py
P python/Lib/shutil.py
P python/Lib/socket.py
P python/Lib/compiler/pyassem.py
P python/Lib/compiler/pycodegen.py
P python/Lib/compiler/transformer.py
P python/Lib/distutils/command/clean.py
P python/Lib/test/test_b1.py
P python/Lib/test/test_commands.py
P python/Lib/test/test_descr.py
P python/Lib/test/test_getopt.py
P python/Lib/test/test_socket.py
U python/Lib/test/test_timeout.py
P python/Misc/ACKS
P python/Misc/NEWS
P python/Modules/gcmodule.c
P python/Modules/socketmodule.c
P python/Modules/socketmodule.h
P python/Objects/abstract.c
P python/Objects/complexobject.c
P python/Objects/rangeobject.c
P python/Tools/webchecker/webchecker.py

While Jeremy did fiddle gcmodule.c, that isn't the cause.

I changed the test like so:

def test_list():
    import sys
    l = []
    l.append(l)
    gc.collect()
    del l
    gc.set_debug(gc.DEBUG_SAVEALL)
    n = gc.collect()
    print >> sys.stderr, '*' * 30, n, gc.garbage
    expect(n, 1, "list")

Here's the list of garbage objects it found:

[<class 'test_descr.mysuper'>,
 {'__dict__': <attribute '__dict__' of 'mysuper' objects>,
 '__module__': 'test_descr',
 '__weakref__': <member '__weakref__' of 'mysuper' objects>,
 '__doc__': None,
 '__init__': <function __init__ at 0x00CC00D8>},
 (<class 'test_descr.mysuper'>, <type 'super'>, <type 'object'>),
 (<type 'super'>,),
 [[...]],
 <attribute '__dict__' of 'mysuper' objects>,
 <member '__weakref__' of 'mysuper' objects>,
 <function __init__ at 0x00CC00D8>,
 (<cell at 0x00CB4DB0: type object at 0x00768420>,),
 <cell at 0x00CB4DB0: type object at 0x00768420>]

The recursive list:

 [[...]]

is the only one expected here.

Your turn <wink>.




From greg@cosc.canterbury.ac.nz  Fri Jun  7 01:47:06 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 07 Jun 2002 12:47:06 +1200 (NZST)
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <m3sn40ud52.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206070047.MAA07206@s454.cosc.canterbury.ac.nz>

> This has the potential of breaking applications that remember the id()
> of an interned string, instead of its value.

Unless the manual promises that interned strings will
live forever, I'd say such an application is broken
already.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Fri Jun  7 01:53:52 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 07 Jun 2002 12:53:52 +1200 (NZST)
Subject: [Python-Dev] OT: Performance vs. Clarity vs. Convention
In-Reply-To: <200206061258.g56CwLp06558@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200206070053.MAA07210@s454.cosc.canterbury.ac.nz>

Guido van Rossum <guido@python.org>:

> > >     def __str__(self):
> > >         pass
> > 
> > Dunno about other people's opinions, but I have a strong distaste for
> > creating methods whose body contains pass.  I always use "raise
> > NotImplementedError".
> 
> But that has different semantics!

In this particular case, the program blows up anyway if this
method is ever called, so you might as well raise a meaningful
exception!

Python 2.2 (#14, May 28 2002, 14:11:27) 
[GCC 2.95.2 19991024 (release)] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> class C:
...  def __str__(self):
...   pass
... 
>>> c = C()
>>> str(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: __str__ returned non-string (type NoneType)
>>> 

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From neal@metaslash.com  Fri Jun  7 02:03:25 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 06 Jun 2002 21:03:25 -0400
Subject: [Python-Dev] Bizarre new test failure
References: <LNBBLJKPBEHFEDALKOLCMEAOPLAA.tim.one@comcast.net>
Message-ID: <3D00065D.5AEAECF4@metaslash.com>

Tim Peters wrote:
> 
> Guido noticed this on Linux late this afternoon.  I've seen it on Win2K and
> Win98SE since.  test_gc fails if you run the whole test suite from the
> start:
> 
> test test_gc failed -- test_list: actual 10, expected 1
> 
> This seems impossible (look at the test).  It doesn't fail in isolation.  It
> fails in both debug and release builds.  Not *all* tests before test_gc need
> to be run in order to provoke a failure, but I can detect no sense in which
> ones do need to be run (not just one or two, but lots of them).
> 
> Here are the files that changed between a Python that does work (yesterday)
> and now:

I've gotten this intermittently, although the first time I got it was
sometime yesterday, so I think you may have to go back a bit farther.

Neal



From greg@cosc.canterbury.ac.nz  Fri Jun  7 02:02:56 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 07 Jun 2002 13:02:56 +1200 (NZST)
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200206070102.NAA07214@s454.cosc.canterbury.ac.nz>

Guido:

> It's also quite possible that there are no outside
> references to an interned string, but another string with the same
> value still references the interned string from its ob_sinterned
> field.
> 
> To solve this, we would have to make the ob_sinterned slot count as a
> reference to the interned string.  But then string_dealloc would be
> complicated (it would have to call Py_XDECREF(op->ob_sinterned)),
> possibly slowing things down.

If the intern table cleanup is being done by a GC pass, you
don't need a full Py_XDECREF -- you only need to decrement
op->ob_sinterned->ob_refcnt. Doesn't sound excessively
expensive to me, but I suppose it would have to be timed
to make sure.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+




From greg@cosc.canterbury.ac.nz  Fri Jun  7 02:10:17 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 07 Jun 2002 13:10:17 +1200 (NZST)
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <m3k7pcw4cb.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206070110.NAA07217@s454.cosc.canterbury.ac.nz>

> > Is this worth it?  
> 
> I think it is.

I think so, too. Currently, interning can *almost* be
regarded as no more than an optimisation that speeds
up comparing strings -- almost, because it has this
side effect of making the strings immortal.

Removing that side effect would be good.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From guido@python.org  Fri Jun  7 02:20:26 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 21:20:26 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: Your message of "Thu, 06 Jun 2002 20:48:53 EDT."
 <LNBBLJKPBEHFEDALKOLCMEAOPLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEAOPLAA.tim.one@comcast.net>
Message-ID: <200206070120.g571KQC14344@pcp02138704pcs.reston01.va.comcast.net>

> Here are the files that changed between a Python that does work (yesterday)
> and now:
> Here's the list of garbage objects it found:
> 
> [<class 'test_descr.mysuper'>,
>  {'__dict__': <attribute '__dict__' of 'mysuper' objects>,
>  '__module__': 'test_descr',
>  '__weakref__': <member '__weakref__' of 'mysuper' objects>,
>  '__doc__': None,
>  '__init__': <function __init__ at 0x00CC00D8>},
>  (<class 'test_descr.mysuper'>, <type 'super'>, <type 'object'>),
>  (<type 'super'>,),
>  [[...]],
>  <attribute '__dict__' of 'mysuper' objects>,
>  <member '__weakref__' of 'mysuper' objects>,
>  <function __init__ at 0x00CC00D8>,
>  (<cell at 0x00CB4DB0: type object at 0x00768420>,),
>  <cell at 0x00CB4DB0: type object at 0x00768420>]

Most of these are leftovers from the supers() test in test_descr.py.
If Neal is right and this could be two days old, I'm curious if my
last change to typeobject.c (2.148) might not be the culprit, since it
messes with the garbage collector.

I'm trying to fix the non-blocking code in the socket module first, so
I doubt I'll get to this tonight.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Fri Jun  7 03:21:29 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 6 Jun 2002 22:21:29 -0400
Subject: [Python-Dev] textwrap.py
References: <20020606154601.GA16897@gerg.ca>
 <20020606161541.GA26647@panix.com>
 <20020606204901.GA18310@gerg.ca>
 <047001c20da3$49c4f4b0$ced241d5@hagrid>
 <200206070019.g570Jsx07402@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15616.6313.71537.816137@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> The Emacs folks, who love Knuth, translated this idea for
    GvR> fixed-width text into two spaces.

I always thought the Emacs folks adopted it because it made it easier
for the filling algorithms to deal with the difference between:

    ...on a stick with no mustard.  At least, that's how I prefer my...

and

    ...in love with Dr. Frankenstein, and who once noticed a watermelon...

Two spaces after the sentence end and one after the abbreviation.
Here's some interesting information from XEmacs:

C-h f fill-paragraph RET

    If `sentence-end-double-space' is non-nil, then period followed by one
    space does not end a sentence, so don't break a line there.

C-h v sentence-end-double-space RET

    *Non-nil means a single space does not end a sentence.  This
    variable applies only to filling, not motion commands.  To change
    the behavior of motion commands, see `sentence-end'.

It's clear to me that any standard wrapping code in Python needs to
handle either style.

-Barry



From guido@python.org  Fri Jun  7 03:33:18 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 06 Jun 2002 22:33:18 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: Your message of "Thu, 06 Jun 2002 22:21:29 EDT."
 <15616.6313.71537.816137@anthem.wooz.org>
References: <20020606154601.GA16897@gerg.ca> <20020606161541.GA26647@panix.com> <20020606204901.GA18310@gerg.ca> <047001c20da3$49c4f4b0$ced241d5@hagrid> <200206070019.g570Jsx07402@pcp02138704pcs.reston01.va.comcast.net>
 <15616.6313.71537.816137@anthem.wooz.org>
Message-ID: <200206070233.g572XIh14811@pcp02138704pcs.reston01.va.comcast.net>

> I always thought the Emacs folks adopted it because it made the
> filling algorithms easier to deal with the difference between:
> 
>     ...on a stick with no mustard.  At least, that's how I prefer my...
> 
> and
> 
>     ...in love with Dr. Frankenstein, and who once noticed a watermelon...

You've got that backwards of course.  If we didn't want two spaces
after a full stop, we wouldn't need any of this nonsense.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Fri Jun  7 03:54:43 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 6 Jun 2002 22:54:43 -0400
Subject: [Python-Dev] Proto-PEP for maintaining backward compatibility
References: <3CFFA4A2.9C2D9313@metaslash.com>
 <200206061822.g56IMWT23115@odiug.zope.com>
 <3CFFB1E8.F60E3F4@metaslash.com>
 <200206061908.g56J8YZ23563@odiug.zope.com>
Message-ID: <15616.8307.731761.157525@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> Go ahead and check it in as PEP 291.  (PEP 290 is reserved
    GvR> for RaymondH's Migration Guide.)

Today's thunderstorms knocked out my home network so I'm just now
catching up on email.

Great PEP Neal, thanks for doing it!

One nit: PEP file numbers must have 4 digits, so I cvs rm'd
pep-291.txt, copied it to pep-0291.txt and cvs added the latter.  I
also sync'd it to www.python.org.

Please make any future updates to the pep-0291.txt file.

Thanks,
-Barry



From guido@python.org  Fri Jun  7 05:01:02 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 00:01:02 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: Your message of "Wed, 05 Jun 2002 17:33:55 EDT."
Message-ID: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net>

I've more or less completed the introduction of timeout sockets.

Executive summary: after sock.settimeout(T), all methods of sock will
block for at most T floating-point seconds and fail if they can't complete
within that time.  sock.settimeout(None) restores full blocking mode.
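
A quick usage sketch (the host, port, and exact exception type here are
assumptions, not part of the patch):

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(5.0)                 # every call below gets ~5 seconds
    try:
        s.connect(("www.example.com", 80))
        s.send("GET / HTTP/1.0\r\n\r\n")
        data = s.recv(1024)
    except socket.error, why:         # timeouts surface as socket errors
        print "gave up:", why
    s.settimeout(None)                # back to full blocking mode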

I've also done some long-needed rigorous cleanup of the socket module
source code, e.g. I got rid of the PySock* static names.

Remaining issues:

- A test suite.  There's no decent test suite for the timeout code.
  The file test_timeout.py doesn't test the functionality (as I
  discovered when the test succeeded while I had several blunders in
  the select code that made everything always time out).

- Cross-platform testing.  It's possible that the cleanup broke things
  on some platforms, or that select() doesn't work the same way.  I
  can only test on Windows and Linux; there is code specific to OS/2
  and RISCOS in the module too.

- I'm not sure that the handling of timeout errors in accept(),
  connect() and connect_ex() is 100% correct (this code sniffs the
  error code and decides whether to retry or not).

- Should sock.settimeout(0.0) mean the same as sock.setblocking(0)?
  Currently it sets a timeout of zero seconds, and that behaves pretty
  much the same as setting the socket in nonblocking mode -- but not
  exactly.  Maybe these should be made the same?

- A socket filedescriptor passed to fromfd() is now assumed to be in
  blocking, non-timeout mode.

- The socket.py module has been changed too, changing the way
  buffering is done on Windows.  I haven't reviewed or tested this
  code thoroughly.

I hope some of the developers on this list will help me out with all
this!  In the mean time, many thanks to Michael Gilfix who did most of
the thinking and coding.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun  7 05:02:44 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 00:02:44 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: Your message of "06 Jun 2002 22:16:04 +0200."
 <m3k7pcw4cb.fsf@mira.informatik.hu-berlin.de>
References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz> <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net> <m3sn40ud52.fsf@mira.informatik.hu-berlin.de> <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net>
 <m3k7pcw4cb.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206070402.g5742il15834@pcp02138704pcs.reston01.va.comcast.net>

> > To solve this, we would have to make the ob_sinterned slot count as a
> > reference to the interned string.  But then string_dealloc would be
> > complicated (it would have to call Py_XDECREF(op->ob_sinterned)),
> > possibly slowing things down.
> > 
> > Is this worth it?  
> 
> That (latter) change seems "right" regardless of whether interned
> strings are ever released.

OK, let's do this.

> > The fear for unbounded growth of the interned strings table is
> > pretty common amongst authors of serious long-running programs.
> 
> I think it is. Unbounded growth of interned strings is precisely the
> reason why the XML libraries repeatedly came up with their own
> interning dictionaries, which only persist for the lifetime of parsing
> the document, since the next document may want to intern entirely
> different things. This is the reason that the intern() function is bad
> to use for most applications.

So let's expose a function that cleans out unused strings from the
interned dict.  Long-running apps can decide when to call this.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From greg@cosc.canterbury.ac.nz  Fri Jun  7 05:19:57 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 07 Jun 2002 16:19:57 +1200 (NZST)
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200206070419.QAA07241@s454.cosc.canterbury.ac.nz>

Guido:

> - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)?
>   Currently it sets a timeout of zero seconds, and that behaves pretty
>   much the same as setting the socket in nonblocking mode -- but not
>   exactly.  Maybe these should be made the same?

I'd say no. Someone might want the current behaviour,
whatever it is -- and if they don't, they can always
make it properly non-blocking. Don't make a special
case unless it's absolutely necessary.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Fri Jun  7 05:22:25 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 07 Jun 2002 16:22:25 +1200 (NZST)
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <200206070402.g5742il15834@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200206070422.QAA07244@s454.cosc.canterbury.ac.nz>

Guido:

> So let's expose a function that cleans out unused strings from the
> interned dict.  Long-running apps can decide when to call this.

Would it do any harm to call this automatically from
the garbage collector?

I suppose it should be exposed as well, in case you
want it but have GC turned off -- but in the normal
case you shouldn't have to do anything special to
get it.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From mgilfix@eecs.tufts.edu  Fri Jun  7 05:26:23 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Fri, 7 Jun 2002 00:26:23 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jun 07, 2002 at 12:01:02AM -0400
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020607002623.A20029@eecs.tufts.edu>

On Fri, Jun 07 @ 00:01, Guido van Rossum wrote:
> I've more or less completed the introduction of timeout sockets.
> 
> Executive summary: after sock.settimeout(T), all methods of sock will
> block for at most T floating seconds and fail if they can't complete
> within that time.  sock.settimeout(None) restores full blocking mode.
> 
> I've also done some long-needed rigorous cleanup of the socket module
> source code, e.g. I got rid of the PySock* static names.

  Good stuff. The module needed a little work as I discovered as well
:)

> Remaining issues:
> 
> - A test suite.  There's no decent test suite for the timeout code.
>   The file test_timeout.py doesn't test the functionality (as I
>   discovered when the test succeeded while I had several blunders in
>   the select code that made everything always time out).

  Er, hopefully Bernard is still watching this thread, as he wrote
test_timeout.py. He's been pretty quiet as of late, though... I'm
willing to rewrite the tests if he doesn't have the time.

  I think the tests should follow the same pattern as test_socket.py.
While adding my regression tests, I noted that the general socket test
suite could use some rewriting, but I didn't feel it appropriate to
tackle it at that point. Perhaps in a future patch?

> - Cross-platform testing.  It's possible that the cleanup broke things
>   on some platforms, or that select() doesn't work the same way.  I
>   can only test on Windows and Linux; there is code specific to OS/2
>   and RISCOS in the module too.

  This was a concern from the beginning but we had some chat on the
dev list and concluded that any system supporting sockets has to
support select or some equivalent (hence the initial reason for using
the select module, although I agree it was expensive).

> - I'm not sure that the handling of timeout errors in accept(),
>   connect() and connect_ex() is 100% correct (this code sniffs the
>   error code and decides whether to retry or not).

  I've tested these on linux (manually) and they seem to work just
fine. I didn't do as much testing with connect_ex, but the code is
very similar to connect, so confidence is high-er. The reason for the
two-pass approach is that the initial connect needs to be made to
start the process, and then we try again, based on the error codes,
for non-blocking connects. It's weird like that.
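
  Roughly, in Python terms, the dance looks like this (a sketch of the
idea, not the actual C code in socketmodule; the host and the 5-second
deadline are made up):

    import socket, select, errno

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(0)
    err = s.connect_ex(("www.example.com", 80))      # first pass
    if err in (errno.EINPROGRESS, errno.EWOULDBLOCK):
        # Wait until the socket is writable or the deadline expires,
        # then pick up the real result of the connect.
        r, w, x = select.select([], [s], [], 5.0)
        if not w:
            raise socket.error, "timed out"
        err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err:
        raise socket.error, (err, errno.errorcode.get(err, ''))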

> - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)?
>   Currently it sets a timeout of zero seconds, and that behaves pretty
>   much the same as setting the socket in nonblocking mode -- but not
>   exactly.  Maybe these should be made the same?

  I thought about this and whether or not I wanted to address it.  I
kinda decided to leave them separate, though. I don't think setting a
timeout means anything equivalent to setblocking(0). In fact, I can't
see why anyone should ever set a timeout of zero, and the immediate
throwing of the exception is a good alert as to what's going on. I
vote we leave them separate, as they are now...

> - A socket filedescriptor passed to fromfd() is now assumed to be in
>   blocking, non-timeout mode.
> 
> - The socket.py module has been changed too, changing the way
>   buffering is done on Windows.  I haven't reviewed or tested this
>   code thoroughly.

  I added a regression test to test_socket.py to test this; it works
on both the old code (I used 2.1.3) and the new code. Hopefully this
will be instrumental for those testing it, and it reflects my manual
tests.

                          -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From tim.one@comcast.net  Fri Jun  7 05:32:07 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 00:32:07 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <200206070019.g570Jsx07402@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBHPLAA.tim.one@comcast.net>

[Guido]
> Knuth, when he invented TeX, heavily promoted a typesetting rule (for
> variable-width fonts) that allowed the whitespace after a full stop to
> stretch more than regular word space.  The Emacs folks, who love
> Knuth, translated this idea for fixed-width text into two spaces.

Two spaces between sentences was the rule for monospaced fonts before Knuth
was born.  It got beaten into me by my mother when I learned to type, and is
still the rule for monospaced fonts according to several style guides.

    Are there TWO spaces after every sentence?  Manuscripts without
    two spaces after each sentence will be rejected.

> Note that HTML also doesn't do this -- it always single-spaces text.

    Are there TWO spaces after every sentence?  Manuscripts without
    two spaces after each sentence will be rejected.

The web page from which that quote was taken forces an extra space after the
question mark (I didn't insert it after pasting the quote) in the obvious
way:

    Are there TWO spaces after every sentence?&nbsp; Manuscripts
    without two spaces after each will be rejected.

> Looks fine to me.

It wouldn't if you viewed it in Courier; for fixed-width fonts it very
arguably helps people parse.  The two-space gimmick is out of favor for
published works because proportional fonts and kerning are adequate to
distinguish sentences.  It still Rulz the DOS box, though <wink>.




From tim.one@comcast.net  Fri Jun  7 05:40:54 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 00:40:54 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: <200206070023.g570NkZ07427@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBIPLAA.tim.one@comcast.net>

[Guido]
> While file.seek() and file.tell() are indeed fixed, I think MAL has a
> fear that some modules don't like getting a long from tell().  I fixed
> a bug of this kind in dumbdbm more than three years ago, when a long
> wasn't acceptable as a multiplier in string repetition.  Since then,
> longs aren't quite so poisonous as they once were, and I don't think
> this fear is rational any more.

Andrew K fixed a lot of these too, for some definition of "fixed" <wink>.



From xscottg@yahoo.com  Fri Jun  7 05:56:09 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Thu, 6 Jun 2002 21:56:09 -0700 (PDT)
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020607045609.8677.qmail@web12908.mail.yahoo.com>

--- Guido van Rossum <guido@python.org> wrote:
>
> Also
> could cause lots of compilation warnings when user code stores the
> result into an int.
>

The compiler won't complain a wink about int pointers passed to varargs
functions.  PyArg_ParseTuple and any format specifiers that have # after
the typecode could turn into quiet bugs in extension modules.  This could
be handled in a backwards-compatible fashion by adding a new code
indicating ssize_t while leaving '#' as indicating int.



__________________________________________________
Do You Yahoo!?
Yahoo! - Official partner of 2002 FIFA World Cup
http://fifaworldcup.yahoo.com



From nas@python.ca  Fri Jun  7 06:31:04 2002
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 6 Jun 2002 22:31:04 -0700
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEAOPLAA.tim.one@comcast.net>; from tim.one@comcast.net on Thu, Jun 06, 2002 at 08:48:53PM -0400
References: <LNBBLJKPBEHFEDALKOLCMEAOPLAA.tim.one@comcast.net>
Message-ID: <20020606223104.B1389@glacier.arctrix.com>

Tim Peters wrote:
> test test_gc failed -- test_list: actual 10, expected 1

Hmm.

> Here's the list of garbage objects it found:
> 
> [<class 'test_descr.mysuper'>,
>  {'__dict__': <attribute '__dict__' of 'mysuper' objects>,
>  '__module__': 'test_descr',
>  '__weakref__': <member '__weakref__' of 'mysuper' objects>,
>  '__doc__': None,
>  '__init__': <function __init__ at 0x00CC00D8>},
>  (<class 'test_descr.mysuper'>, <type 'super'>, <type 'object'>),
>  (<type 'super'>,),
>  [[...]],
>  <attribute '__dict__' of 'mysuper' objects>,
>  <member '__weakref__' of 'mysuper' objects>,
>  <function __init__ at 0x00CC00D8>,
>  (<cell at 0x00CB4DB0: type object at 0x00768420>,),
>  <cell at 0x00CB4DB0: type object at 0x00768420>]

I wonder if some new cyclic garbage structure needs two gc.collect()
passes in order to break it up.

  Neil



From nas@python.ca  Fri Jun  7 06:38:37 2002
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 6 Jun 2002 22:38:37 -0700
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEAOPLAA.tim.one@comcast.net>; from tim.one@comcast.net on Thu, Jun 06, 2002 at 08:48:53PM -0400
References: <LNBBLJKPBEHFEDALKOLCMEAOPLAA.tim.one@comcast.net>
Message-ID: <20020606223837.C1389@glacier.arctrix.com>

Tim Peters wrote:
> I can detect no sense in which do need to be run (not just one or two,
> but lots of them).

It's easy to reproduce.  First, disable the GC.  Next, run:

    regrtest.py test_descr test_gc

My wild guess is that some tp_clear method is not doing its job.  I'll
take a closer look tomorrow if someone hasn't figured it out by then.
Must sleep.  Too much CS.

  Neil



From loewis@informatik.hu-berlin.de  Fri Jun  7 08:05:31 2002
From: loewis@informatik.hu-berlin.de (Martin v. =?iso-8859-1?q?L=F6wis?=)
Date: 07 Jun 2002 09:05:31 +0200
Subject: [Python-Dev] Quota on sf.net
Message-ID: <j4y9drk1qc.fsf@informatik.hu-berlin.de>

It appears SF is rearranging servers, and asks projects to honor their
disk quota, see

https://sourceforge.net/forum/forum.php?forum_id=183601

There is a per-project disk quota of 100MB; /home/groups/p/py/python
currently consumes 880MB. Most of this (830MB) is in
htdocs/snapshots. Should we move those onto python.org?

Regards,
Martin




From loewis@informatik.hu-berlin.de  Fri Jun  7 08:09:35 2002
From: loewis@informatik.hu-berlin.de (Martin v. =?iso-8859-1?q?L=F6wis?=)
Date: 07 Jun 2002 09:09:35 +0200
Subject: [Python-Dev] Making doc strings optional
Message-ID: <j4u1ofk1jk.fsf@informatik.hu-berlin.de>

I'm ready to apply patch #505375. Any objections?

Martin



From mal@lemburg.com  Fri Jun  7 08:26:59 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 07 Jun 2002 09:26:59 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
References: <j4wutc4udg.fsf@informatik.hu-berlin.de> <200206061319.g56DJkl06699@pcp02138704pcs.reston01.va.comcast.net> <j4ptz4l2t2.fsf@informatik.hu-berlin.de> <200206061748.g56Hm0k15221@odiug.zope.com> <j4d6v4l0nl.fsf@informatik.hu-berlin.de> <3CFFBFD1.1050305@lemburg.com> <200206062004.g56K4lS30831@odiug.zope.com> <3CFFC527.1090907@lemburg.com> <200206062033.g56KXvJ05145@odiug.zope.com> <3CFFD6AE.9020602@lemburg.com>              <m3k7pcul6w.fsf@mira.informatik.hu-berlin.de> <200206070023.g570NkZ07427@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D006043.7010802@lemburg.com>

Guido van Rossum wrote:
>>>I'm not worried about any modules... take this as PEP-42 wish:
>>>someone would need to check all the code using e.g. file.seek()
>>>and file.tell() to make sure that it works correctly with
>>>long values.
>>
>>That is supposed to work today. If it doesn't, make a detailed bug
>>report.
> 
> 
> While file.seek() and file.tell() are indeed fixed, I think MAL has a
> fear that some modules don't like getting a long from tell().  I fixed
> a bug of this kind in dumbdbm more than three years ago, when a long
> wasn't acceptable as a multiplier in string repetition.  Since then,
> longs aren't quite so poisonous as they once were, and I don't think
> this fear is rational any more.

If Martin has checked the code for this already, I'm fine.

I stumbled across problems in this area with mxBeeBase, which did
not support using longs as addresses, and since the problems
are rather subtle, I assumed that other code not specifically
built for handling longs in file positions could have similar
problems.
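
For illustration, a tiny sketch of the kind of code I mean (the file
name, the offset and the check are all made up for the example):

    f = open("huge.dat", "rb")        # hypothetical file larger than 2GB
    f.seek(3L * 1024 ** 3)            # offsets past 2**31 only fit in a long
    pos = f.tell()                    # a long, not an int, on 32-bit builds
    if not isinstance(pos, int):      # the buggy hidden assumption
        raise TypeError("expected an int file position, got %r" % type(pos))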

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From eikeon@eikeon.com  Fri Jun  7 08:26:58 2002
From: eikeon@eikeon.com (Daniel 'eikeon' Krech)
Date: 07 Jun 2002 03:26:58 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: <200206070402.g5742il15834@pcp02138704pcs.reston01.va.comcast.net>
References: <200206060112.NAA07149@s454.cosc.canterbury.ac.nz>
 <200206060136.g561aQG05507@pcp02138704pcs.reston01.va.comcast.net>
 <m3sn40ud52.fsf@mira.informatik.hu-berlin.de>
 <200206061307.g56D7Tb06612@pcp02138704pcs.reston01.va.comcast.net>
 <m3k7pcw4cb.fsf@mira.informatik.hu-berlin.de>
 <200206070402.g5742il15834@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <un0u7tupp.fsf@eikeon.com>

Guido van Rossum <guido@python.org> writes:

> > > To solve this, we would have to make the ob_sinterned slot count as a
> > > reference to the interned string.  But then string_dealloc would be
> > > complicated (it would have to call Py_XDECREF(op->ob_sinterned)),
> > > possibly slowing things down.
> > > 
> > > Is this worth it?  
> > 
> > That (latter) change seem "right" regardless of whether interned
> > strings are ever released.
> 
> OK, let's do this.

Cool!





From tim_one@email.msn.com  Fri Jun  7 08:49:27 2002
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 7 Jun 2002 03:49:27 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <20020606223837.C1389@glacier.arctrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBPPLAA.tim_one@email.msn.com>

[Neil Schemenauer]
> It's easy to reproduce.  First, disable the GC.  Next, run:
>
>     regrtest.py test_descr test_gc

Sorry, thinking is cheating.

> My wild guess is that some tp_clear method is not doing it's job.  I'll
> take a closer look tomorrow if someone hasn't figured it out by then.
> Must sleep.  Too much CS.

Ya, Canadian sausage always does me in too.  I'll attach a self-contained
(in the sense that you can run it directly by itself, without regrtest.py)
test program.  Guido might have some idea what it does <wink>.  For me, it
prints:

    collected 3
    collected 51
    collected 9
    collected 0

    and, at the end, collected 1

and it's not a coincidence that 9+1 == 10 (the failing value seen when
running the test suite).  It suggests one easy way to fix test_gc <wink>.

> I wonder if some new cyclic garbage structure needs two gc.collect()
> passes in order to break it up.

If there isn't a bug, this case takes 3(!) passes.


from test_support import vereq

def supers():
    class A(object):
        def meth(self, a):
            return "A(%r)" % a

    class B(A):
        def __init__(self):
            self.__super = super(B, self)
        def meth(self, a):
            return "B(%r)" % a + self.__super.meth(a)

    class C(A):
        def meth(self, a):
            return "C(%r)" % a + self.__super.meth(a)
    C._C__super = super(C)

    class D(C, B):
        def meth(self, a):
            return "D(%r)" % a + super(D, self).meth(a)

    class mysuper(super):
        def __init__(self, *args):
            return super(mysuper, self).__init__(*args)

    class E(D):
        def meth(self, a):
            return "E(%r)" % a + mysuper(E, self).meth(a)

    class F(E):
        def meth(self, a):
            s = self.__super
            return "F(%r)[%s]" % (a, s.__class__.__name__) + s.meth(a)
    F._F__super = mysuper(F)

    vereq(F().meth(6), "F(6)[mysuper]E(6)D(6)C(6)B(6)A(6)")

import gc
gc.disable()

L = []
L.append(L)

supers()

while 1:
    n = gc.collect()
    print "collected", n
    if n == 0:
        break

del L
n = gc.collect()
print
print "and, at the end, collected", n




From just@letterror.com  Fri Jun  7 09:05:23 2002
From: just@letterror.com (Just van Rossum)
Date: Fri,  7 Jun 2002 10:05:23 +0200
Subject: [Python-Dev] textwrap.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEBHPLAA.tim.one@comcast.net>
Message-ID: <r01050300-1015-53F66B6679ED11D695FD003065D5E7E4@[10.0.0.23]>

Tim Peters wrote:

> It wouldn't if you viewed it in Courier; for fixed-width fonts it very
> arguably helps people parse.  The two-space gimmick is out of favor for
> published works because proportional fonts and kerning are adequate to
> distinguish sentences.  It still Rulz the DOS box, though <wink>.

Huh? In fixed-width fonts the period is a small dot on a huge area of white
space. It contains much more white than it would in a proportional font and/or
the dot is much bigger to compensate for that. Either way, in most fixed-width
fonts the period sticks out pretty well. I don't see how that can be harder to
parse than when set in a proportional font, let alone why an extra space would
help.

can't-afford-to-stay-out-of-a-typographical-flame-fest-on-python-dev-ly y'rs  -
Just



From fredrik@pythonware.com  Fri Jun  7 12:20:58 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 7 Jun 2002 13:20:58 +0200
Subject: [Python-Dev] textwrap.py
References: <LNBBLJKPBEHFEDALKOLCMEBHPLAA.tim.one@comcast.net>
Message-ID: <01f501c20e15$bb84cd60$0900a8c0@spiff>

tim wrote:

> Two spaces between sentences was the rule for monospaced fonts before
> Knuth was born.

in America, perhaps.  people from other parts of the world may
also wish to use the textwrap module (or better, string.wrap).

so let's add an option (e.g. ms_davis_told_me_so=1 ;-)

> It got beaten into me by my mother when I learned to type, and
> is still the rule for monospaced fonts according to several style
> guides.

if you do the google searches I mention, you'll find that the
word "some" is more correct than "several".

>     Are there TWO spaces after every sentence?  Manuscripts without
>     two spaces after each sentence will be rejected.

to quote another random web page:

    "Even I was told by my typing teacher to put two spaces after
    a period. It's just that I trusted the advice I got from graphic
    designers more than I trusted my typing teacher. My typing
    teacher also carried a lunch box and wore short-sleeved white
    dress shirts with really bad ties to school every day. It's up
    to you...."

and

    "... I've found tenacity and authority the overriding "arguments"
    for maintaining the two-space rule. Empirically and financially, the
    one-space rule makes sense."

> It wouldn't if you viewed it in Courier; for fixed-width fonts it very
> arguably helps people parse.

according to vision researchers, humans using their eyes to read
text don't care much about sentence breaks inside blocks of text
-- for some reason, they're probably more interested in the con-
tent.  and humans don't appear to use regular expressions at all.
how weird.

</F>




From neal@metaslash.com  Fri Jun  7 13:39:19 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 07 Jun 2002 08:39:19 -0400
Subject: [Python-Dev] Socket timeout patch
References: <200206070419.QAA07241@s454.cosc.canterbury.ac.nz>
Message-ID: <3D00A977.7FBDFC50@metaslash.com>

Greg Ewing wrote:
> 
> Guido:
> 
> > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)?
> >   Currently it sets a timeout of zero seconds, and that behaves pretty
> >   much the same as setting the socket in nonblocking mode -- but not
> >   exactly.  Maybe these should be made the same?
> 
> I'd say no. Someone might want the current behaviour,
> whatever it is -- and if they don't, they can always
> make it properly non-blocking. Don't make a special
> case unless it's absolutely necessary.

Another possibility would be to make settimeout(0.0) equivalent to
settimeout(None), ie disable timeouts.

Neal



From guido@python.org  Fri Jun  7 13:54:05 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 08:54:05 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: Your message of "Fri, 07 Jun 2002 00:32:07 EDT."
 <LNBBLJKPBEHFEDALKOLCMEBHPLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEBHPLAA.tim.one@comcast.net>
Message-ID: <200206071254.g57Cs5V16781@pcp02138704pcs.reston01.va.comcast.net>

> The web page from which that quote was taken forces an extra space
> after the question mark (I didn't insert it after pasting the quote)
> in the obvious way:
> 
>     Are there TWO spaces after every sentence?&nbsp; Manuscripts
>     without two spaces after each will be rejected.

How pedantic.  HTML wasn't intended to be written this way.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun  7 14:02:44 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 09:02:44 -0400
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: Your message of "07 Jun 2002 09:05:31 +0200."
 <j4y9drk1qc.fsf@informatik.hu-berlin.de>
References: <j4y9drk1qc.fsf@informatik.hu-berlin.de>
Message-ID: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net>

> It appears SF is rearranging servers, and asks projects to honor their
> disk quota, see
> 
> https://sourceforge.net/forum/forum.php?forum_id=183601
> 
> There is a per-project disk quota of 100MB; /home/groups/p/py/python
> currently consumes 880MB. Most of this (830MB) is in
> htdocs/snapshots. Should we move those onto python.org?

What is htdocs/snapshots?  There's plenty of space on creosote, but
maybe the snapshots should be reduced in volume first?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun  7 14:19:42 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 09:19:42 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: Your message of "Fri, 07 Jun 2002 16:19:57 +1200."
 <200206070419.QAA07241@s454.cosc.canterbury.ac.nz>
References: <200206070419.QAA07241@s454.cosc.canterbury.ac.nz>
Message-ID: <200206071319.g57DJgC17102@pcp02138704pcs.reston01.va.comcast.net>

[Guido]
> > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)?
> >   Currently it sets a timeout of zero seconds, and that behaves pretty
> >   much the same as setting the socket in nonblocking mode -- but not
> >   exactly.  Maybe these should be made the same?

[GregE]
> I'd say no. Someone might want the current behaviour,
> whatever it is -- and if they don't, they can always
> make it properly non-blocking. Don't make a special
> case unless it's absolutely necessary.

Why would someone want the current (as of last night) behavior?  IMO
it's useless.  The distinction with non-blocking mode is very minimal.

[Neal]
> Another possibility would be to make settimeout(0.0) equivalent to
> settimeout(None), ie disable timeouts.

Hm, but a zero really does smell more of non-blocking than of
blocking.  It would also be inconsistent with the timeout argument to
select(), which currently uses None for blocking, 0 for non-blocking,
and other positive numbers for a timeout in seconds -- just like
settimeout().
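
In other words (a tiny sketch, assuming settimeout() keeps the
select()-style convention):

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(None)    # block forever, like select(..., None)
    s.settimeout(0.0)     # don't wait at all -- effectively non-blocking
    s.settimeout(5.0)     # give up with an error after about 5 seconds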

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun  7 14:19:51 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 09:19:51 -0400
Subject: [Python-Dev] d.get_key(key) -> key?
In-Reply-To: Your message of "Fri, 07 Jun 2002 16:22:25 +1200."
 <200206070422.QAA07244@s454.cosc.canterbury.ac.nz>
References: <200206070422.QAA07244@s454.cosc.canterbury.ac.nz>
Message-ID: <200206071319.g57DJpH17110@pcp02138704pcs.reston01.va.comcast.net>

> > So let's expose a function that cleans out unused strings from the
> > interned dict.  Long-running apps can decide when to call this.
> 
> Would it do any harm to call this automatically from
> the garbage collector?

That's what I initially proposed -- do it in the last-generation GC
pass, which runs every million object allocations or so.  But since
this is potentially expensive (running through a large dict),
long-running processes might want to control when it runs.

> I suppose it should be exposed as well, in case you
> want it but have GC turned off -- but in the normal
> case you shouldn't have to do anything special to
> get it.

It's a pure slowdown for most programs, even long-running programs
(one could say *especially* for long-running programs, since
short-running programs won't get to the last generation GC pass).

Only long-running (24x7) servers that execute some kind of
(pseudo-)code submitted by clients are vulnerable to the
interned-dict-bloat problem.  Such programs are full of hacks to limit
memory bloat already, so this would be yet another trick for them to
deploy.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one@email.msn.com  Fri Jun  7 14:49:57 2002
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 7 Jun 2002 09:49:57 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <01f501c20e15$bb84cd60$0900a8c0@spiff>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com>

[Tim]
>> Two spaces between sentences was the rule for monospaced fonts before
>> Knuth was born.

[/F]
> in America, perhaps.  people from other parts of the world may
> also wish to use the textwrap modules (or better, string.wrap).

Despite that you never bought a shift key, you use two spaces between
sentences.  Are you American?  François Pinard's name can't even be spelled
in American <wink>, and he said

    Protection of full stops does not fall in that decoration category,
    it is essential.

Just complained about it, and I invite you to set your browser to a
fixed-width font and judge the readability of his msg compared to the pieces
of mine he quoted (Just, the point isn't to make the period stand out, it's
to make the start of the next sentence stand out):

    http://mail.python.org/pipermail/python-dev/2002-June/025141.html

to my eyes single-space sucks with a monospaced font and I agree with
François on this it makes monospaced text look like a giant run-on sentence.

> so let's add an option (e.g. ms_davis_told_me_so=1 ;-)

>> It got beaten into me by my mother when I learned to type, and
>> is still the rule for monospaced fonts according to several style
>> guides.

> if you do the google searches I mention,

I did, but I looked at a lot more than discussion boards.

> you'll find that the word "some" is more correct than "several".

I'm not sure that distinction means something; if it does, I don't buy it.

>>     Are there TWO spaces after every sentence?  Manuscripts without
>>     two spaces after each sentence will be rejected.

> to quote another random web page:
>
>     "Even I was told by my typing teacher to put two spaces after
>     a period. It's just that I trusted the advice I got from graphic
>     designers more than I trusted my typing teacher. My typing
>     teacher also carried a lunch box and wore short-sleeved white
>     dress shirts with really bad ties to school every day. It's up
>     to you...."

A difference is that my quote came from a publisher spelling out
requirements for submission, while yours is pulled from a casual msg in a
discussion board.  This is the difference between quoting a journal and an
Archimedes Plutonium post from sci.physics <wink>.

> and
>
>     "... I've found tenacity and authority the overriding "arguments"
>     for maintaining the two-space rule. Empirically and financially, the
>     one-space rule makes sense."

And in *that* discussion board, the preceding msg in the thread says

    At my last Technical Writing job my manager was adamant about using
    two spaces after a period, and I have become accustomed to using two
    spaces.

and

    Two spaces after a period is still the rule ...

Selective quoting of random people blathering at each other doesn't count as
"research" to me.  If it does to anyone else, you can find hundreds of
quotes supporting any view you like.

> ...
> according to vision researchers, humans using their eyes to read
> text don't care much about sentence breaks inside blocks of text

This reads like a garbled paraphrase; I assume that if you had a real
reference, you would have given it <0.9 wink>.

> -- for some reason, they're probably more interested in the con-
> tent.  and humans don't appear to use regular expressions at all.
> how weird.

    [from Patricia Godfrey's review of "The Mac Is Not a Typewriter"]
    ...
    The author details all the typewriter makeshifts, such as two
    hyphens for a dash, that no longer have to be—and should not be—
    employed when you’re working on a PC.  But in one case she reveals
    her youth.  Typing two spaces after an end-of-sentence period, she
    thinks, was only done on typewriters because typewriters have
    monospaced type, and you shouldn’t do it on a PC.

    Like many theories, it sounds logical, but those of us who read old
    books or are old enough to remember when typesetting was an art
    practiced by people, rather than the result of an algorithm, know
    better.  Typists were taught to hit two spaces after a period when
    typing because typeset material once upon a time used extra space
    there.

    ...

    This is an interesting instance of a phenomenon that we should all
    be aware of:  in times of much change, collective cultural amnesia
    can occur, and a whole society can forget something that "everyone
    knew."

The regexp attempts to preserve what everyone used to know, against
computer-inspired reduction to the simplest thing that can possibly be
implemented.

in-my-courier-new-world-i-know-what-works-ly y'rs  - tim




From tim_one@email.msn.com  Fri Jun  7 14:54:56 2002
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 7 Jun 2002 09:54:56 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <200206071254.g57Cs5V16781@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGECLPLAA.tim_one@email.msn.com>

>>     Are there TWO spaces after every sentence?&nbsp; Manuscripts
>>     without two spaces after each will be rejected.

[Guido]
> How pedantic.  HTML wasn't intended to be written this way.

The quote had nothing to do with HTML.  Or are you focusing exclusively on
the instance of &nbsp?  That's helpful <wink>.




From nas@python.ca  Fri Jun  7 15:10:54 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 7 Jun 2002 07:10:54 -0700
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEBPPLAA.tim_one@email.msn.com>; from tim_one@email.msn.com on Fri, Jun 07, 2002 at 03:49:27AM -0400
References: <20020606223837.C1389@glacier.arctrix.com> <LNBBLJKPBEHFEDALKOLCKEBPPLAA.tim_one@email.msn.com>
Message-ID: <20020607071054.A2511@glacier.arctrix.com>

Tim Peters wrote:
> [Neil Schemenauer]
> > Must sleep.  Too much CS.
> 
> Ya, Canadian sausage always does me in too.

But it's so good.

> If there isn't a bug, this case takes 3(!) passes.

Perhaps it is a reference counting bug.  If a reference count is too
high then tp_clear will keep decref'ing it until it gets to zero.

  Neil



From just@letterror.com  Fri Jun  7 15:15:11 2002
From: just@letterror.com (Just van Rossum)
Date: Fri,  7 Jun 2002 16:15:11 +0200
Subject: [Python-Dev] textwrap.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com>
Message-ID: <r01050300-1015-FCA717527A2011D695FD003065D5E7E4@[10.0.0.23]>

Tim Peters wrote:

> (Just, the point isn't to make the period stand out, it's
> to make the start of the next sentence stand out):

Sure, but there are already *two* things to make that clear: end the previous
sentence with a period, start the next with a capital letter. An extra space is
overkill. But I guess your point may be that caps usually stand out less in
fixed-width fonts, which may be true.

>     http://mail.python.org/pipermail/python-dev/2002-June/025141.html

That sucks only because the empty line between quote and followup was deleted
from the original...

> to my eyes single-space sucks with a monospaced font and I agree with
> François on this it makes monospaced text look like a giant run-on sentence.

Don't know about canadians, but I wouldn't listen to the french : they write
spaces *before* punctuation !

Just



From tim.one@comcast.net  Fri Jun  7 15:33:51 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 10:33:51 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <20020606223104.B1389@glacier.arctrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOECNPLAA.tim.one@comcast.net>

[Neil Schemenauer]
> I wonder if some new cyclic garbage structure needs two gc.collect()
> passes in order to break it up.

Can you dream up a way that can happen legitimately?  I haven't been able
to, short of assuming the existence of a container object that isn't tracked
by gc.  Else it seems that all the unreachable cycles that exist at any
given time will be found by a single all-generations gc pass (either that,
or gc is busted <wink>).




From perry@stsci.edu  Fri Jun  7 15:39:53 2002
From: perry@stsci.edu (Perry Greenfield)
Date: Fri, 07 Jun 2002 10:39:53 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
Message-ID: <NEBBIJKBMLDBLNCEEFOCIEJCCNAA.perry@stsci.edu>

Guido writes:

> I'm not very concerned about strings or lists with more than 2GB
> items, but I am concerned about other memory buffers.

Those in the Numeric/numarray community, for one, would also be
concerned. Although there aren't many data arrays these days that are
larger than 2GB there are some beginning to appear. I have no doubt
that within a few years there will be many more. I'm not sure I 
understand all the implications of the discussion here, but it sounds
like an important issue. Currently strings are frequently used as
a common "medium" to pass binary data from one module to another
(e.g., from Numeric to PIL); limiting strings to 2GB may prove
a problem in this area (though frankly, I suspect few will want
to use them as temporary buffers for objects that size until memories
have grown a bit more :-). 

Perry Greenfield



From gmcm@hypernet.com  Fri Jun  7 15:40:32 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 7 Jun 2002 10:40:32 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <r01050300-1015-FCA717527A2011D695FD003065D5E7E4@[10.0.0.23]>
References: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com>
Message-ID: <3D008DA0.20248.6AD00DBA@localhost>

Aargh!

It doesn't matter if it "makes sense"[1]! It's a
widely known rule that some people still insist
upon.  I don't see anyone arguing you should
adopt the convention, just that people who follow
the convention should see it respected.

-- Gordon
http://www.mcmillan-inc.com/

[1] Style guidelines frequently appeal to a
notion of "sense" that only makes sense
if the guideline appeals to you. I would cite
the GNU C code style guide as an example, 
but that would only get me flamed :-).




From tim.one@comcast.net  Fri Jun  7 15:44:46 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 10:44:46 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <r01050300-1015-FCA717527A2011D695FD003065D5E7E4@[10.0.0.23]>
Message-ID: <LNBBLJKPBEHFEDALKOLCOECPPLAA.tim.one@comcast.net>

>> (Just, the point isn't to make the period stand out, it's
>> to make the start of the next sentence stand out):

[Just]
> Sure, but there are already *two* things to make that clear: end
> the previous sentence with a period, start the next with a capital
> letter. An extra space is overkill. But I guess your point may be
> that caps usually stand out less in fixed-width fonts, which may be
> true.

They do seem to stand out less, but that isn't really my point.  My point is
that I've been living mostly with fixed-width fonts for more than 30 years,
and even in 1970 I noticed it was easier to read prose in such fonts when
sentences were separated by two spaces.  And that's before my eyes started
growing old -- it's gotten more noticeable over the years.  I don't know
exactly why that is, but I can't notice a thing hundreds of times and then
be convinced by abstract arguments that I've been hallucinating for decades.

> ...
> Don't know about canadians, but I wouldn't listen to the french :
> they write spaces *before* punctuation !

God knows I'd rather align myself with the Dutch, but in the grand tradition
of European wars, punctuation makes strange bedfellows <wink>.




From s_lott@yahoo.com  Fri Jun  7 15:48:51 2002
From: s_lott@yahoo.com (Steven Lott)
Date: Fri, 7 Jun 2002 07:48:51 -0700 (PDT)
Subject: [Python-Dev] textwrap.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com>
Message-ID: <20020607144851.21255.qmail@web9605.mail.yahoo.com>

> "old enough to remember when typesetting was
> an art
>     practiced by people"

Hey wait a minute, I resemble that remark!  Briefly, I spent
some time setting cold lead type with my stubbly little fingers.
You used an "em" space after full stops, an "en" space
otherwise.  You padded after the "em" first, then spread "thins"
around the line to get it long enough that you could clamp it
firmly in the frame.

However, this doesn't resolve the monofont issue.  An "em" is
(usually) not twice as wide as an "en".  An "en" is the width of
the letter "n"; about in the middle of all of the widths.  An
"em" is the width of the letter "m", the widest of all letters.

Anyway.  I'm not a big fan of flags and options and settings.
I think the text wrapper should have the "fix sentence ending"
method renamed to "find sentence ending", and the wrap() and
fill() functions could take a hook where a Strategy class can be
applied (see the sketch after this list).

- One strategy class puts a single space after full stops.
- Another puts double spaces after full stops.
- A subclass of either of these could spread space around
to justify the line.
- I think that Unicode offers "em" and "en"-sized spaces; what
this does with good old-fashioned Courier 12 I have no idea;
but someone could add this strategy if it made them happy.
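
Something like this, roughly (every name below is invented for the
example; none of it is in textwrap.py):

    import re

    _SENTENCE_END = re.compile(r'[a-z][.!?]["\']?$')   # stand-in detector

    class SingleSpace:
        pad = " "

    class DoubleSpace:
        pad = "  "

    def join_words(words, spacing=SingleSpace):
        # Rebuild a line from already-split words, padding each detected
        # sentence break according to the chosen strategy class.
        out = []
        for word in words:
            if out and _SENTENCE_END.search(out[-1]):
                out.append(spacing.pad)
            elif out:
                out.append(" ")
            out.append(word)
        return "".join(out)

    print join_words("One stop. Two stops.".split(), DoubleSpace)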


=====
--
S. Lott, CCP :-{)
S_LOTT@YAHOO.COM
http://www.mindspring.com/~slott1
Buccaneer #468: KaDiMa

Macintosh user: drinking upstream from the herd.




From guido@python.org  Fri Jun  7 15:58:02 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 10:58:02 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: Your message of "Fri, 07 Jun 2002 10:39:53 EDT."
 <NEBBIJKBMLDBLNCEEFOCIEJCCNAA.perry@stsci.edu>
References: <NEBBIJKBMLDBLNCEEFOCIEJCCNAA.perry@stsci.edu>
Message-ID: <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net>

> > I'm not very concerned about strings or lists with more than 2GB
> > items, but I am concerned about other memory buffers.
> 
> Those in the Numeric/numarray community, for one, would also be
> concerned. Although there aren't many data arrays these days that are
> larger than 2GB there are some beginning to appear. I have no doubt
> that within a few years there will be many more. I'm not sure I 
> understand all the implications of the discussion here, but it sounds
> like an important issue. Currently strings are frequently used as
> a common "medium" to pass binary data from one module to another
> (e.g., from Numeric to PIL); limiting strings to 2GB may prove
> a problem in this area (though frankly, I suspect few will want
> to use them as temporary buffers for objects that size until memories
> have grown a bit more :-). 

Sorry, I should have been more exact.  I meant 2 billion items, not 2
gigabytes.  That should give you an extra factor 4-8 to play with. :-)

We'll fix this in Python 3.0 for sure -- the question is, should we
start fixing it now and binary compatibility be damned, or should we
honor binary compatibility more?

Maybe someone in the Python-with-a-tie camp can comment?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jun  7 15:51:46 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 10:51:46 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <3D008DA0.20248.6AD00DBA@localhost>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEDAPLAA.tim.one@comcast.net>

[Gordon McMillan]
> Aargh!
>
> It doesn't matter if it "makes sense"[1]!

Indeed, it doesn't even matter if one side is dead wrong.  Hell, it doesn't
even matter if all sides are dead wrong.

> It's a widely known rule that some people still insist upon.

The ones with a normal sense of aesthetics, yes <wink>.

> ...
> [1] Style guidelines frequently appeal to a
> notion of "sense" that only makes sense
> if the guideline appeals to you. I would cite
> the GNU C code style guide as an example,
> but that would only get me flamed :-).

"""
Aside from this, I prefer code formatted like this:

  if (x < foo (y, z))
    haha = bar[4] + 5;
  else
    {
      while (z)
        {
          haha += foo (z, z);
          z--;
        }
      return ++x + bar ();
    }

I find it easier to read a program when it has spaces before the
open-parentheses and after the commas.
"""

I've recently had the opportunity to work with reams of code done this way.
The best that can be said of it is that it's dead wrong.  Out of
consideration for our youth, I'll refrain from revealing the worst that can
be said of it.  At least it has two spaces after right curly braces <wink>.




From guido@python.org  Fri Jun  7 16:00:46 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 11:00:46 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: Your message of "Fri, 07 Jun 2002 10:40:32 EDT."
 <3D008DA0.20248.6AD00DBA@localhost>
References: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com>
 <3D008DA0.20248.6AD00DBA@localhost>
Message-ID: <200206071500.g57F0lh18144@pcp02138704pcs.reston01.va.comcast.net>

> It doesn't matter if it "makes sense"[1]! It's a widely known rule
> that some people still insist upon.  I don't see anyone arguing you
> should adopt the convention, just that people who follow the
> convention should see it respected.

True, but then there needs to be a way to enable/disable it, since
even if you never use two spaces after a period, the rule can still
generate them for you in the output: when an input sentence ends at
the end of a line but the output sentence doesn't, the rule will
translate the newline into two spaces instead of one.

I vote to have it off by default.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From loewis@informatik.hu-berlin.de  Fri Jun  7 16:00:05 2002
From: loewis@informatik.hu-berlin.de (Martin v. =?iso-8859-1?q?L=F6wis?=)
Date: 07 Jun 2002 17:00:05 +0200
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net>
References: <j4y9drk1qc.fsf@informatik.hu-berlin.de>
 <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <j4elfjjfre.fsf@informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> What is htdocs/snapshots?  There's plenty of space on creosote, but
> maybe the snapshots should be reduced in volume first?

I'm not sure. Jeremy owns it, but I don't know what process creates it.

Regards,
Martin



From sholden@holdenweb.com  Fri Jun  7 15:59:48 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Fri, 7 Jun 2002 10:59:48 -0400
Subject: [Python-Dev] textwrap.py
References: <r01050300-1015-FCA717527A2011D695FD003065D5E7E4@[10.0.0.23]>
Message-ID: <012c01c20e33$f9462540$7201a8c0@holdenweb.com>

----- Original Message -----
From: "Just van Rossum" <just@letterror.com>
To: <python-dev@python.org>
Sent: Friday, June 07, 2002 10:15 AM
Subject: RE: [Python-Dev] textwrap.py


Tim Peters wrote:

> (Just, the point isn't to make the period stand out, it's
> to make the start of the next sentence stand out):

Sure, but there are already *two* things to make that clear: end the previous
sentence with a period, start the next with a capital letter. An extra space is
overkill. But I guess your point may be that caps usually stand out less in
fixed-width fonts, which may be true.

>     http://mail.python.org/pipermail/python-dev/2002-June/025141.html

That sucks only because the empty line between quote and followup was deleted
from the original...

> to my eyes single-space sucks with a monospaced font and I agree with
> François on this it makes monospaced text look like a giant run-on sentence.

Don't know about canadians, but I wouldn't listen to the french : they write
spaces *before* punctuation !
--------End Original Message--------


If the energy that has gone into this debate had gone into modifying
existing code, by now each different version of the text formatting function
being discussed could have a "twospace" keyword argument which could be set
to achieve the required behavior and defaulted to the author's preference.

I smell the bicycle shed here.

regards
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------






From mal@lemburg.com  Fri Jun  7 16:12:57 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 07 Jun 2002 17:12:57 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
References: <NEBBIJKBMLDBLNCEEFOCIEJCCNAA.perry@stsci.edu> <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D00CD79.6050409@lemburg.com>

Guido van Rossum wrote:
>>>I'm not very concerned about strings or lists with more than 2GB
>>>items, but I am concerned about other memory buffers.
>>
>>Those in the Numeric/numarray community, for one, would also be
>>concerned. Although there aren't many data arrays these days that are
>>larger than 2GB there are some beginning to appear. I have no doubt
>>that within a few years there will be many more. I'm not sure I 
>>understand all the implications of the discussion here, but it sounds
>>like an important issue. Currently strings are frequently used as
>>a common "medium" to pass binary data from one module to another
>>(e.g., from Numeric to PIL); limiting strings to 2GB may prove
>>a problem in this area (though frankly, I suspect few will want
>>to use them as temporary buffers for objects that size until memories
>>have grown a bit more :-). 
> 
> 
> Sorry, I should have been more exact.  I meant 2 billion items, not 2
> gigabytes.  That should give you an extra factor 4-8 to play with. :-)
> 
> We'll fix this in Python 3.0 for sure -- the question is, should we
> start fixing it now and binary compatibility be damned, or should we
> honor binary compatiblity more?

What binary compatibility ? I thought we had given that idea
up after 1.5.2 was out the door (which is also why the Windows
distutils installers are very picky about the Python version
to install an extension for).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From guido@python.org  Fri Jun  7 16:19:05 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 11:19:05 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: Your message of "Fri, 07 Jun 2002 10:33:51 EDT."
 <LNBBLJKPBEHFEDALKOLCOECNPLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOECNPLAA.tim.one@comcast.net>
Message-ID: <200206071519.g57FJ5k18395@pcp02138704pcs.reston01.va.comcast.net>

> > I wonder if some new cyclic garbage structure needs two gc.collect()
> > passes in order to break it up.
> 
> Can you dream up a way that can happen legitimately?  I haven't been able
> to, short of assuming the existence of a container object that isn't tracked
> by gc.  Else it seems that all the unreachable cycles that exist at any
> given time will be found by a single all-generations gc pass (either that,
> or gc is busted <wink>).

Any idea why this would only happen on Windows?  I tried it on Linux
and couldn't get it to fail.  Not even with gc.set_threshold(1).

I'll go review my tp_clear code next.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aahz@pythoncraft.com  Fri Jun  7 16:15:05 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 7 Jun 2002 11:15:05 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <200206071500.g57F0lh18144@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com> <3D008DA0.20248.6AD00DBA@localhost> <200206071500.g57F0lh18144@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020607151505.GA22182@panix.com>

On Fri, Jun 07, 2002, Guido van Rossum wrote:
>
> > It doesn't matter if it "makes sense"[1]! It's a widely known rule
> > that some people still insist upon.  I don't see anyone arguing you
> > should adopt the convention, just that people who follow the
> > convention should see it respected.
> 
> True, but then there needs to be a way to enable/disable it, since
> even if you never use two spaces after a period, the rule can still
> generate them for you in the output: when an input sentence ends at
> the end of a line but the output sentence doesn't, the rule will
> translate the newline into two spaces instead of one.
> 
> I vote to have it off by default.

How about a compromise?  If the algorithm discovers a sentence with two
or more spaces ending it, it goes into "two-space" mode; otherwise, it
defaults to one-space mode.  (I think fmt does this, but I'm not sure;
it's certainly the case that sometimes it preserves my spaces and
sometimes it doesn't, and I've never been able to figure it out
precisely.)
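
Roughly (the helper name and the regexp are made up for the example):

    import re

    # Made-up helper: guess whether the text's author already types two
    # spaces after sentence-ending punctuation.
    _TWO_SPACE = re.compile(r'[a-z][.!?]["\']?  +\S')

    def sniff_sentence_spacing(text):
        if _TWO_SPACE.search(text):
            return "two-space"
        return "one-space"

    print sniff_sentence_spacing("One stop.  Two stops.")   # -> two-space
    print sniff_sentence_spacing("One stop. Two stops.")    # -> one-space
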
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I had lots of reasonable theories about children myself, until I
had some."  --Michael Rios



From guido@python.org  Fri Jun  7 16:23:23 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 11:23:23 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: Your message of "Fri, 07 Jun 2002 17:12:57 +0200."
 <3D00CD79.6050409@lemburg.com>
References: <NEBBIJKBMLDBLNCEEFOCIEJCCNAA.perry@stsci.edu> <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net>
 <3D00CD79.6050409@lemburg.com>
Message-ID: <200206071523.g57FNNe18428@pcp02138704pcs.reston01.va.comcast.net>

> What binary compatibility ? I thought we had given that idea
> up after 1.5.2 was out the door (which is also why the Windows
> distutils installers are very picky about the Python version
> to install an extension for).

You keep saying this, and I keep denying it.  In everything I do I try
to remain binary compatible in struct layout and function signatures.
Can you point to a document that records a decision to the contrary?

--Guido van Rossum (home page: http://www.python.org/~guido/)




From tim.one@comcast.net  Fri Jun  7 16:23:04 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 11:23:04 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: <3D00CD79.6050409@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEDFPLAA.tim.one@comcast.net>

[M.-A. Lemburg]
> What binary compatibility ? I thought we had given that idea
> up after 1.5.2 was out the door (which is also why the Windows
> distutils installers are very picky about the Python version
> to install an extension for).

It's strange.  The binary API has changed with every non-bugfix release
since then, but if you stare at the details, old binaries will almost
certainly work correctly despite the API warning messages.  Guido explained
to me that this is why API mismatch is just a warning instead of an error.
This is also why it took weeks to enable pymalloc by default, instead of
days (i.e., to make sure that old binaries don't have to be recompiled).

I don't speak for distutils, of course.




From tim.one@comcast.net  Fri Jun  7 16:26:47 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 11:26:47 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <200206071519.g57FJ5k18395@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDGPLAA.tim.one@comcast.net>

[Guido]
> Any idea why this would only happen on Windows?  I tried it on Linux
> and couldn't get it to fail.  Not even with gc.set_threshold(1).

What exactly is "it"?  The failure when running regrtest.py in whole; the
failure Neil reported (and I assume on Linux) by running just test_descr and
test_gc after *disabling* gc in regrtest.py ("disable" == gc.disable() or
gc.set_threshold(0), not gc.set_threshold(1)); or the 3 gc.collect()s it
takes to clear out the cycles in the self-contained test program I posted?

> I'll go review my tp_clear code next.

Probably a good idea regardless <wink>.




From tim.one@comcast.net  Fri Jun  7 16:27:38 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 11:27:38 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <20020607151505.GA22182@panix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEDHPLAA.tim.one@comcast.net>

[Aahz]
> How about a compromise?  If the algorithm discovers a sentence with two
> or more spaces ending it, it goes into "two-space" mode; otherwise, it
> defaults to one-space mode.  (I think fmt does this, but I'm not sure;
> it's certainly the case that sometimes it preserves my spaces and
> sometimes it doesn't, and I've never been able to figure it out
> precisely.)

That's certainly worthy of emulation <wink>.



From mgilfix@eecs.tufts.edu  Fri Jun  7 16:45:26 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Fri, 7 Jun 2002 11:45:26 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEBPPLAA.tim_one@email.msn.com>; from tim_one@email.msn.com on Fri, Jun 07, 2002 at 03:49:27AM -0400
References: <20020606223837.C1389@glacier.arctrix.com> <LNBBLJKPBEHFEDALKOLCKEBPPLAA.tim_one@email.msn.com>
Message-ID: <20020607114526.B24428@eecs.tufts.edu>

On Fri, Jun 07 @ 03:49, Tim Peters wrote:
> Ya, Canadian sausage always does me in too.  I'll attach a self-contained
> (in the sense that you can run it directly by itself, without regrtest.py)
> test program.  Guido might have some idea what does <wink>.  For me, it
> prints:

  Every Canadian knows that you should always opt for the ham. I think
we feed our turkeys molson...

                   -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From guido@python.org  Fri Jun  7 16:50:13 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 11:50:13 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: Your message of "Fri, 07 Jun 2002 03:49:27 EDT."
 <LNBBLJKPBEHFEDALKOLCKEBPPLAA.tim_one@email.msn.com>
References: <LNBBLJKPBEHFEDALKOLCKEBPPLAA.tim_one@email.msn.com>
Message-ID: <200206071550.g57FoDk25701@pcp02138704pcs.reston01.va.comcast.net>

> > I wonder if some new cyclic garbage structure needs two gc.collect()
> > passes in order to break it up.
> 
> If there isn't a bug, this case takes 3(!) passes.

That same testcase prints the same output for me on Linux, with Python
2.2, with a 2.3 from June 4th, and with 2.3 from current CVS:

    collected 3
    collected 51
    collected 9
    collected 0

    and, at the end, collected 1

So there really are test cases that require more than one collection
to clean them up.

Next:
> [Guido]
> > Any idea why this would only happen on Windows?  I tried it on Linux
> > and couldn't get it to fail.  Not even with gc.set_threshold(1).

[Tim]
> What exactly is "it"?  The failure when running regrtest.py in
> whole; the failure Neil reported (and I assume on Linux) by running
> just test_descr and test_gc after *disabling* gc in regrtest.py
> ("disable" == gc.disable() or gc.set_threshold(0), not
> gc.set_threshold(1)); or the 3 gc.collect()s it takes to clear out
> the cycles in the self-contained test program I posted?

I meant the failure on Windows.

But I can now reproduce on Linux what Neil did using the new -t option
that I just added to regrtest.py:

    ./python ../Lib/test/regrtest.py -t0 test_descr test_gc

which tells me

    test test_gc failed -- test_list: actual 10, expected 1

When I put an extra gc.collect() call in test_gc.test_list(), the test
succeeds.

Is this the right fix?

I can't see anything particularly wrong with subtype_clear() or the
slot-traversing subtype_traverse() in typeobject.c.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas@python.ca  Fri Jun  7 16:54:22 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 7 Jun 2002 08:54:22 -0700
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <200206071519.g57FJ5k18395@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jun 07, 2002 at 11:19:05AM -0400
References: <LNBBLJKPBEHFEDALKOLCOECNPLAA.tim.one@comcast.net> <200206071519.g57FJ5k18395@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020607085422.A3051@glacier.arctrix.com>

Guido van Rossum wrote:
> Any idea why this would only happen on Windows?

What only happens on Windows?  I can reliably reproduce the problem on
Linux.

> I tried it on Linux and couldn't get it to fail.  Not even with
> gc.set_threshold(1).

I think you want gc.disable().

> I'll go review my tp_clear code next.

I'm narrowing in on the change that broke things.  It happened between
Dec 1 and Dec 15.

  Neil



From mal@lemburg.com  Fri Jun  7 17:10:13 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 07 Jun 2002 18:10:13 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
References: <NEBBIJKBMLDBLNCEEFOCIEJCCNAA.perry@stsci.edu> <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net>              <3D00CD79.6050409@lemburg.com> <200206071523.g57FNNe18428@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D00DAE5.10203@lemburg.com>

Guido van Rossum wrote:
>>What binary compatibility ? I thought we had given that idea
>>up after 1.5.2 was out the door (which is also why the Windows
>>distutils installers are very picky about the Python version
>>to install an extension for).
> 
> 
> You keep saying this, and I keep denying it. 

:-)

 > In everything I do I try
> to remain binary compatible in struct layout and function signatures.
> Can you point to a document that records a decision to the contrary?

Garbage collection, weak references, changes in the memory allocation,
etc. etc.

All these change the binary layout of structs or the semantics
of memory allocation -- mostly only in slight ways, but to a point
where 100% binary compatibility is not given anymore.

Other changes (which I know of) are e.g. the Unicode APIs which
have changed (they now include UCS2 or UCS4 depending on whether
you use 16-bit or 32-bit Unicode internally).

I don't think that binary compatibility is all that important;
it just requires a recompile (and hey, that way you even get
sub-type support for free ;-). Far more difficult to handle are all
those minute little changes which easily slip the developer's radar.

Luckily this will get approached now by Andrew and Raymond, so
things are getting much better for us poor souls having to
live on supporting 3-4 different Python versions :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From tim.one@comcast.net  Fri Jun  7 17:03:49 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 12:03:49 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <200206071550.g57FoDk25701@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDLPLAA.tim.one@comcast.net>

[Guido]
> That same testcase prints the same output for me on Linux, with Python
> 2.2, with a 2.3 from June 4th, and with 2.3 from current CVS:
>
>     collected 3
>     collected 51
>     collected 9
>     collected 0
>
>     and, at the end, collected 1
>
> So there really are test cases that require more than one collection
> to clean them up.

Same here.  I wish we understood why.  Or that at least one of Neil and I
understood why.

> ...
> But I can now reproduce on Linux what Neil did using the new -t option
> that I just added to regrtest.py:
>
>     ./python ../Lib/test/regrtest.py -t0 test_descr test_gc
>
> which tells me
>
>     test test_gc failed -- test_list: actual 10, expected 1
>
> When I put an extra gc.collect() call in test_gc.test_list(), the test
> succeeds.
>
> Is this the right fix?

No, but assuming there isn't a real bug here, repeating gc.collect() until
it returns 0 would be -- as the self-contained program showed, we *may* need
to call gc.collect() as many as 4 times before that happens.  And if it's
legit that it may need 4, I see no reason for believing there's any a priori
upper bound on how many may be needed.  And the test could have failed all
along, even in 2.2; it apparently depends on how many times gc just happens
to run before we get to test_gc.

I'll check in a "drain it" fix to test_gc, but I'm still squirming.
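
For concreteness, "drain it" here means nothing fancier than this (a
sketch of the idea, not necessarily the exact checkin):

    import gc

    def drain_gc():
        # Keep collecting until a full pass frees nothing more; as the test
        # program earlier in the thread shows, one pass isn't always enough.
        total = 0
        while 1:
            freed = gc.collect()
            if freed == 0:
                break
            total += freed
        return total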

> I can't see anything particilarly wrong with subtype_clear() or the
> slot-traversing subtype_traverse() in typeobject.c.

I couldn't either, but in my case I had scant idea what it thought it was
trying to do <0.9 wink>.




From guido@python.org  Fri Jun  7 17:29:40 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 12:29:40 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: Your message of "Fri, 07 Jun 2002 00:26:23 EDT."
 <20020607002623.A20029@eecs.tufts.edu>
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net>
 <20020607002623.A20029@eecs.tufts.edu>
Message-ID: <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net>

First, a few new issues in this thread:

- On Windows, setting a timeout on a socket and then using s.makefile()
  works as expected (the I/O operations on the file will time out
  according to the timeout set on the socket).  This is because
  makefile() returns a pseudo-file that calls s.recv() etc. on the
  socket object.  But on Unix, s.makefile() on a socket with a timeout
  is a disaster: because the socket is internally set to nonblocking
  mode, all I/O operations will fail if they cannot be completed
  immediately (effectively setting a timeout of 0 on the file).  I have
  currently documented around this, but maybe it would be better if
  makefile() used a pseudo-file on all platforms, for uniform behavior.
  Thoughts?  I'm also thinking of implementing the socket wrapper (which
  is currently a Python class that has a "real" socket object as an
  instance variable) as a subclass of the real socket class instead.
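  (A rough sketch of the pseudo-file idea follows after the next item.)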

- The original timeout socket code (in Python) by Tim O'Malley had a
  global timeout which you could set so that *all* sockets
  *automatically* had their timeout set.  This is nice if you want it
  to affect library modules like urllib or ftplib.  That feature is
  currently missing.  Should we add it?  (I have some concerns about
  it, in that it might break other code -- and it doesn't seem wise to
  do this for server-end sockets in general.  But it's a nice hack for
  smaller programs.)
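
Back to the first item: here's a rough sketch of what such a pseudo-file
could look like (the class name and the buffering details are invented
for illustration; this is not the actual code):

    class PseudoFile:
        # Minimal read-only sketch: every read goes through sock.recv(), so
        # a timeout set with sock.settimeout() applies on every platform.
        def __init__(self, sock, bufsize=8192):
            self._sock = sock
            self._buf = ""
            self._bufsize = bufsize

        def readline(self):
            while 1:
                i = self._buf.find("\n")
                if i >= 0:
                    line, self._buf = self._buf[:i+1], self._buf[i+1:]
                    return line
                data = self._sock.recv(self._bufsize)   # honors the timeout
                if not data:
                    line, self._buf = self._buf, ""
                    return line
                self._buf = self._buf + data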

Now on to my reply to Michael Gilfix:

>   Good stuff. The module needed a little work as I discovered as well
> :)

...and it still needs more.  There are still way too many #ifdefs in
the code.

>   Er, hopefully Bernard is still watching this thread as he wrote
> the test_timeout.py. He's been pretty quiet though as of late... I'm
> willing to rewrite the tests if he doesn't have the time. 

Either way would be good.

>   I think the tests should follow the same pattern as the
> test_socket.py.  While adding my regression tests, I noted that the
> general socket test suite could use some re-writing but I didn't feel
> it appropriate to tackle it at that point. Perhaps a next patch?

Yes, please!

> > - Cross-platform testing.  It's possible that the cleanup broke things
> >   on some platforms, or that select() doesn't work the same way.  I
> >   can only test on Windows and Linux; there is code specific to OS/2
> >   and RISCOS in the module too.
> 
>   This was a concern from the beginning but we had some chat on the
> dev list and concluded that any system supporting sockets has to
> support select or some equivalent (hence the initial reason for using
> the select module, although I agree it was expensive).

But that doesn't mean there aren't platform-specific tweaks necessary
to import the definition of select() and the FD_* macros.  We'll find
out soon enough, this is what alpha releases are for. :-)

> > - I'm not sure that the handling of timeout errors in accept(),
> >   connect() and connect_ex() is 100% correct (this code sniffs the
> >   error code and decides whether to retry or not).
> 
>   I've tested these on linux (manually) and they seem to work just
> fine. I didn't do as much testing with connect_ex but the code is
> very similar to connect, so confidence is high-er. The reason for the
> two-pass is because the initial connect needs to be made to start the
> process and then try again, based on the error codes, for non-blocking
> connects. It's weird like that.

I'll wait for the unit tests.  These should test all three modes
(blocking, non-blocking, and timeout).

Can you explain why on Windows you say that the socket is connected
when connect() returns a WSAEINVAL error?

Also, your code enters this block even in non-blocking mode, if a
timeout was also set.  (Fortunately I fixed this for you by setting
the timeout to -1.0 in setblocking(). Unfortunately there's still a
later test for !sock_blocking in the same block that cannot succeed
any more because of that.)

The same concerns apply to connect_ex() and accept(), which have very
similar logic.

I believe it is possible on some Unix variants (maybe on Linux) that
when select indicates that a socket is ready, if the socket is in
nonblocking mode, the call will return an error because some kernel
resource is unavailable.  This suggests that you may have to keep the
socket in blocking mode except when you have to do a connect() or
accept() (for which you can't do a select without setting the socket
in nonblocking mode first).

Looking at connect_ex, it seems to be missing the "res = errno" bit
in the case where it says "we're already connected".  It used to
return errno here, now it will return -1.  Maybe the conex_finally
label should be moved up to before the "if (res != 0) {" line?

> > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)?
> >   Currently it sets a timeout of zero seconds, and that behaves pretty
> >   much the same as setting the socket in nonblocking mode -- but not
> >   exactly.  Maybe these should be made the same?
> 
>   I thought about this and whether or not I wanted to address this.  I
> kinda decided to leave them separate though. I don't think setting a
> timeout means anything equivalent to setblocking(0). In fact, I can't
> see why anyone should ever set a timeout of zero and the immediate
> throwing of the exception is a good alert as to what's going on. I
> vote, leave them separate and as they are now...

OTOH, a timeout of 0 behaves very similar to nonblocking mode --
similar enough that a program that uses setblocking(0) would probably
also work when using settimeout(0).  I kind of like the idea of having
only a single internal flag value, sock_timeout, rather than two
(sock_timeout and sock_blocking).
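
The user-visible difference is mainly in which exception you get; a sketch
(using UDP sockets so there is no connection setup, and assuming the
current patch's behavior, where a zero timeout raises a "timed out" error
while nonblocking mode raises a plain errno-carrying socket.error):

    import errno, socket

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(('127.0.0.1', 0))
    s.setblocking(0)
    try:
        s.recvfrom(1024)               # nothing queued yet
    except socket.error, (code, msg):
        print 'setblocking(0):', errno.errorcode.get(code, code)

    t = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    t.bind(('127.0.0.1', 0))
    t.settimeout(0.0)
    try:
        t.recvfrom(1024)
    except socket.error, msg:          # with the patch, "timed out"
        print 'settimeout(0.0):', msg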

> > - The socket.py module has been changed too, changing the way
> >   buffering is done on Windows.  I haven't reviewed or tested this
> >   code thoroughly.
> 
>   I added a regression test to test_socket.py to test this, that works
> on both the old code (I used 2.1.3) and the new code. Hopefully, this
> will be instrumental for those testing it and it reflects my manual
> tests.

The tests don't look very systematic.  There are many cases (default
bufsize, unbuffered, bufsize=1, large bufsize; combine with read(),
readline(), read a line larger than the buffer size, etc.).  We need a
more systematic approach to unit testing here.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun  7 17:31:44 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 12:31:44 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: Your message of "Fri, 07 Jun 2002 12:03:49 EDT."
 <LNBBLJKPBEHFEDALKOLCOEDLPLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOEDLPLAA.tim.one@comcast.net>
Message-ID: <200206071631.g57GVix25975@pcp02138704pcs.reston01.va.comcast.net>

> No, but assuming there isn't a real bug here, repeating gc.collect() until
> it returns 0 would be -- as the self-contained program showed, we *may* need
> to call gc.collect() as many as 4 times before that happens.  And if it's
> legit that it may need 4, I see no reason for believing there's any a priori
> upper bound on how many may be needed.  And the test could have failed all
> along, even in 2.2; it apparently depends on how many times gc just happens
> to run before we get to test_gc.
> 
> I'll check in a "drain it" fix to test_gc, but I'm still squirming.

Hold off.  Neil said he thought there was a bug introduced early
December -- that's before 2.2 was released!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun  7 17:41:53 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 12:41:53 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: Your message of "Fri, 07 Jun 2002 18:10:13 +0200."
 <3D00DAE5.10203@lemburg.com>
References: <NEBBIJKBMLDBLNCEEFOCIEJCCNAA.perry@stsci.edu> <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> <3D00CD79.6050409@lemburg.com> <200206071523.g57FNNe18428@pcp02138704pcs.reston01.va.comcast.net>
 <3D00DAE5.10203@lemburg.com>
Message-ID: <200206071641.g57Gfr826040@pcp02138704pcs.reston01.va.comcast.net>

> > Can you point to a document that records a decision to the contrary?
> 
> Garbage collection, weak references, changes in the memory allocation,
> etc. etc.

IMO this is 100% FUD.

The GC does not change the object lay-out.  It is only triggered for
types that have a specific flag bit.  The changes in the GC API also
changed the flag bit used.  Weak references also use a flag bit in the
type object and if the flag bit is on, look at a field in the type
that checks whether there is a weakref pointer in the object struct.
All objects (with public object lay-out) that have had their struct
extended have always done so by appending to the end.

Tim spent weeks to make the memory allocation code backwards
compatible (with several different versions, binary and source
compatibility).

As an example, the old Zope ExtensionClass code works fine with Python
2.2.

> All these change the binary layout of structs or the semantics
> of memory allocation -- mostly only in slight ways, but to a point
> where 100% binary compatibility is not given anymore.

Still, I maintain that most extensions that work with 1.5.2 still work
today without recompilation, if you can live with the API version
change warnings.  Try it!

> Other changes (which I know of) are e.g. the Unicode APIs which
> have changed (they now include UCS2 or UCS4 depending on whether
> you use 16-bit or 32-bit Unicode internally).

When you compile with UCS2 it should be backward compatible.

> I don't think that binary compatibility is all that important;
> it just requires a recompile (and hey, that way you even get
> sub-type support for free ;-).

Actually, you don't -- you have to set a flag bit to make your type
subclassable.  There are too many things that classic extension types
don't provide.

> Far more difficult to handle are all those minute little changes
> which easily slip the developer's radar.

Examples?

> Luckily this will get approached now by Andrew and Raymond, so
> things are getting much better for us poor souls having to
> live on supporting 3-4 different Python versions :-)

I'm not sure which initiative you are referring to.  Or even which
Andrew (I presume you mean Raymond Hettinger)?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jun  7 17:38:19 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 12:38:19 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <200206071631.g57GVix25975@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDOPLAA.tim.one@comcast.net>

[Tim]
>> I'll check in a "drain it" fix to test_gc, but I'm still squirming.

[Guido]
> Hold off.  Neil said he thought there was a bug introduced early
> December -- that's before 2.2 was released!

Yup, I saw that.



From nas@python.ca  Fri Jun  7 17:33:27 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 7 Jun 2002 09:33:27 -0700
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEDLPLAA.tim.one@comcast.net>; from tim.one@comcast.net on Fri, Jun 07, 2002 at 12:03:49PM -0400
References: <200206071550.g57FoDk25701@pcp02138704pcs.reston01.va.comcast.net> <LNBBLJKPBEHFEDALKOLCOEDLPLAA.tim.one@comcast.net>
Message-ID: <20020607093327.A3135@glacier.arctrix.com>

--liOOAslEiF7prFVr
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Attached is a little program that triggers the behavior.  The CVS change
I finally narrowed in on was the addition of similar code to test_descr.
A reference counting bug is still my best guess.  Guido?

  Neil

--liOOAslEiF7prFVr
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="gcbug.py"

import gc
gc.disable()

def main():
    # must be inside function scope
    class A(object):
        def __init__(self):
            self.__super = super(A, self)

    A()
 
main()
print 'first collect', gc.collect()
print 'second collect', gc.collect()

--liOOAslEiF7prFVr--



From tim.one@comcast.net  Fri Jun  7 17:57:15 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 12:57:15 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <20020607093327.A3135@glacier.arctrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEEBPLAA.tim.one@comcast.net>

[Neil Schemenauer]
> Attached is a little program that triggers the behavior.  The CVS change
> I finally narrowed in on was the addition of similar code to test_descr.

Ouch!

> A reference counting bug is still my best guess.  Guido?

Here's the code:

import gc
gc.disable()

def main():
    # must be inside function scope
    class A(object):
        def __init__(self):
            self.__super = super(A, self)

    A()

main()
print 'first collect', gc.collect()
print 'second collect', gc.collect()


The first collect is getting these:

[<__main__.A object at 0x0066A090>,
 <super: <class 'A'>, <A object>>,
 {'_A__super': <super: <class 'A'>, <A object>>}
]


The second is getting these:

[<class '__main__.A'>,
 {'__dict__': <attribute '__dict__' of 'A' objects>,
 '__module__': '__main__',
 '__weakref__': <member '__weakref__' of 'A' objects>,
 '__doc__': None,
 '__init__': <function __init__ at 0x00674C70>},
 (<class '__main__.A'>, <type 'object'>), (<type 'object'>,),
 <attribute '__dict__' of 'A' objects>,
 <member '__weakref__' of 'A' objects>,
 <function __init__ at 0x00674C70>,
 (<cell at 0x0066A110: type object at 0x007687B0>,),
 <cell at 0x0066A110: type object at 0x007687B0>
]

For some reason, the cell nags me.  Perhaps because of your "must be inside
function scope" comment, and that cells are poorly understood by me <wink>.




From jeremy@zope.com  Fri Jun  7 13:41:20 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 7 Jun 2002 08:41:20 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEEBPLAA.tim.one@comcast.net>
References: <20020607093327.A3135@glacier.arctrix.com>
 <LNBBLJKPBEHFEDALKOLCMEEBPLAA.tim.one@comcast.net>
Message-ID: <15616.43504.457473.110331@slothrop.zope.com>

>>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:

  TP> For some reason, the cell nags me.  Perhaps because of your
  TP> "must be inside function scope" comment, and that cells are
  TP> poorly understood by me <wink>.

I can explain the cells, at least.

def main():
    # must be inside function scope
    class A(object):
        def __init__(self):
            self.__super = super(A, self)

    A()

In the example, the value bound to A is stored in a cell, because it
is a free variable in __init__().  There are two references to the
cell after the class statement is executed.  One is the frame for
main().  The other is the func_closure for __init__().

The second reference creates a cycle.  The cycle is:

class A refers to
function __init__ refers to
cell for A refers to
class A

That's it for what I understand.
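
A quick way to see the cell from the interpreter -- a sketch, assuming
2.2+ where functions expose func_closure:

    def main():
        class A(object):
            def __init__(self):
                self.__super = super(A, self)
        return A

    A = main()
    init = A.__init__.im_func      # the plain function under the method
    print init.func_closure        # a tuple of cells (here just one)
    print init.func_closure[0]     # the cell, which refers back to A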

It looks like the example code creates two cycles, and one cycle
refers to the other.  The first cycle is the one involving the A
instance and the super instance variable.  That cycle has a reference
to class A.

When the garbage collector runs, it determines the first cycle is
garbage.  It doesn't determine the second cycle is garbage because it
has an external reference from the first cycle.

I presume that the garbage collector can't collect both cycles in one
pass without re-running the update & subtract refs phase after
deleting all the garbage.  During the first refs pass, the second
cycle wasn't detected.  The second cycle is only collectable after the
first cycle has been collected.

Jeremy





From nas@python.ca  Fri Jun  7 18:41:37 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 7 Jun 2002 10:41:37 -0700
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <15616.43504.457473.110331@slothrop.zope.com>; from jeremy@zope.com on Fri, Jun 07, 2002 at 08:41:20AM -0400
References: <20020607093327.A3135@glacier.arctrix.com> <LNBBLJKPBEHFEDALKOLCMEEBPLAA.tim.one@comcast.net> <15616.43504.457473.110331@slothrop.zope.com>
Message-ID: <20020607104137.C3400@glacier.arctrix.com>

Jeremy Hylton wrote:
> When the garbage collector runs, it determines the first cycle is
> garbage.  It doesn't determine the second cycle is garbage because it
> has an external reference from the first cycle.

But both cycles should be in the set being collected.  It should be able
to collect them both at once.  If your theory is correct then we should
be able to construct some cyclic garbage using only list objects and get
the same behavior, right?
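
A list-only version of that experiment might look like this (a sketch;
with plain lists, both cycles should come back from the first collect()):

    import gc
    gc.disable()

    a = []
    b = []
    a.append(a)      # first cycle: a -> a
    b.append(b)      # second cycle: b -> b
    a.append(b)      # the first cycle points at the second

    del a, b
    print 'first collect', gc.collect()
    print 'second collect', gc.collect()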

  Neil



From mgilfix@eecs.tufts.edu  Fri Jun  7 18:40:36 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Fri, 7 Jun 2002 13:40:36 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jun 07, 2002 at 12:29:40PM -0400
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020607134036.C24428@eecs.tufts.edu>

On Fri, Jun 07 @ 12:29, Guido van Rossum wrote:
> First, a few new issues in this thread:
> 
> - On Windows, setting a timeout on a socket and then using s.makefile()
>   works as expected (the I/O operations on the file will time out
>   according to the timeout set on the socket).  This is because
>   makefile() returns a pseudo-file that calls s.recv() etc. on the
>   socket object.  But on Unix, s.makefile() on a socket with a timeout
>   is a disaster: because the socket is internally set to nonblocking
>   mode, all I/O operations will fail if they cannot be completed
>   immediately (effectively setting a timeout of 0 on the file).  I have
>   currently documented around this, but maybe it would be better if
>   makefile() used a pseudo-file on all platforms, for uniform behavior.
>   Thoughts?  I'm also thinking of implementing the socket wrapper (which
>   is currently a Python class that has a "real" socket object as an
>   instance variable) as a subclass of the real socket class instead.

  Glad to hear that _fileobject works well. Is there any benefit to
having the file code in C? I bet the python code isn't that much
slower. It does seem a shame to have to switch between the two. Maybe
one solution is that a makefile should set blocking back on if a
timeout exists? That would solve the problem and is a consistent
change since it only checks timeout behavior (= 2 lines of code). I'd
vote for using the python fileobject for both. Some profiling of the
two would be nice if you have the time.

  Er, what's the difference between that and _socketobject in
socket.py? Why not just use the python bit consistently?

> - The original timeout socket code (in Python) by Tim O'Malley had a
>   global timeout which you could set so that *all* sockets
>   *automatically* had their timeout set.  This is nice if you want it
>   to affect library modules like urllib or ftplib.  That feature is
>   currently missing.  Should we add it?  (I have some concerns about
>   it, in that it might break other code -- and it doesn't seem wise to
>   do this for server-end sockets in general.  But it's a nice hack for
>   smaller programs.)

  Is it really so painful for apps to keep track of all their sockets
and then do something like:

     for sock in sock_list:
        sock.settimeout (blah)

  Why keep track of them in the socket module, unless there's already code
for this.

> >   Good stuff. The module needed a little work as I discovered as well
> > :)
> 
> ...and it still needs more.  There are still way too many #ifdefs in
> the code.

  Well, we agreed to do some clean-up in a separate patch but you seem
anxious to get it in there :)

> >   I think the tests should follow the same pattern as the
> > test_socket.py.  While adding my regression tests, I noted that the
> > general socket test suite could use some re-writing but I didn't feel
> > it appropriate to tackle it at that point. Perhaps a next patch?
> 
> Yes, please!

  Alrighty. I'll re-write the test_socket.py and do the test_timeout.py
as well.

> >   This was a concern from the beginning but we had some chat on the
> > dev list and concluded that any system supporting sockets has to
> > support select or some equivalent (hence the initial reason for using
> > the select module, although I agree it was expensive).
> 
> But that doesn't mean there aren't platform-specific tweaks necessary
> to import the definition of select() and the FD_* macros.  We'll find
> out soon enough, this is what alpha releases are for. :-)

  Well, this was the initial reason to use the selectmodule.c code.
There's got to be a way to share code between the two for bare access
to select, since someone else might want to use such functionality one
day (and this has set the precedent). Why not make a small change to
selectmodule.c that opens up the code in a C API of some sort? And
then have select_select use that function internally.

> > > - I'm not sure that the handling of timeout errors in accept(),
> > >   connect() and connect_ex() is 100% correct (this code sniffs the
> > >   error code and decides whether to retry or not).

  This is how the original timeoutsocket.py did it and it seems
to be the way to do blocking connects. You try to do the connect, check
if it happened instantaneously and then if not, do the select, and then
try again. Errno is the way to check it. That's why if we're doing timeout
stuff, there's a second call to accept/connect. Says the linux man pages:

 EAGAIN or EWOULDBLOCK
   The socket is marked non-blocking and no connections are present to
   be accepted.
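
In Python terms, the dance is roughly this (a sketch of the
timeoutsocket.py-style approach, not the actual C code in the patch; the
function name is mine):

    import errno, select, socket

    def timeout_connect(sock, address, timeout):
        sock.setblocking(0)
        try:
            sock.connect(address)
            return                         # connected on the first try
        except socket.error, (code, msg):
            if code not in (errno.EINPROGRESS, errno.EWOULDBLOCK):
                raise
        # Wait until the socket is writable or the timeout expires.
        r, w, e = select.select([], [sock], [], timeout)
        if not w:
            raise socket.error('timed out')
        # Then check how the connect actually ended.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            raise socket.error(err, errno.errorcode.get(err, 'connect failed'))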

> >   I've tested these on linux (manually) and they seem to work just
> > fine. I didn't do as much testing with connect_ex but the code is
> > very similar to connect, so confidence is high-er. The reason for the
> > two-pass is because the initial connect needs to be made to start the
> > process and then try again, based on the error codes, for non-blocking
> > connects. It's weird like that.
> 
> I'll wait for the unit tests.  These should test all three modes
> (blocking, non-blocking, and timeout).

  Ok.. Should I merge the test_timeout.py and test_socket.py as well
then? A little off-topic, while I was thinking of restructuring these
tests, I was wondering what might be the best way to structure a unit
test where things have to work in separate processes/threads. What
I'd really like to do is:

  * Have the setup function set up server and client sockets
  * Have the tear-down function close them
  * Have some synchronization function or simple message (this is
    socket specific)
  * Then have a test that has access to both threads and can insert
    call-backs to run in each thread.

  This seems tricky with the current unit-test framework. The way I'd
do it is using threading.Event () and have the thing block until the
server-side and client-side respectively submit their test callbacks.
But it might be nice to have a general class that can be added to the
testing framework. Thoughts?
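
One way to sketch that with the current unittest module and
threading.Event (structure and names are mine, just to make the idea
concrete):

    import socket, threading, unittest

    class ThreadedSocketTest(unittest.TestCase):
        # Run a server callback in a helper thread and a client callback
        # in the main thread, synchronized with an Event.

        def setUp(self):
            self.server_ready = threading.Event()
            self.srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.srv.bind(('127.0.0.1', 0))
            self.srv.listen(1)
            self.port = self.srv.getsockname()[1]

        def tearDown(self):
            self.srv.close()

        def run_pair(self, server_func, client_func):
            def server():
                self.server_ready.set()
                conn, addr = self.srv.accept()
                try:
                    server_func(conn)
                finally:
                    conn.close()
            t = threading.Thread(target=server)
            t.start()
            self.server_ready.wait()
            cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            cli.connect(('127.0.0.1', self.port))
            try:
                client_func(cli)
            finally:
                cli.close()
                t.join()

        def testEcho(self):
            def server_func(conn):
                conn.sendall(conn.recv(16))
            def client_func(cli):
                cli.sendall('ping')
                self.assertEqual(cli.recv(16), 'ping')
            self.run_pair(server_func, client_func)

    if __name__ == '__main__':
        unittest.main()

Failures raised inside server_func still need to be funnelled back to the
main thread to be reported properly; that is the part a general helper
class in the testing framework could take care of.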

> Can you explain why on Windows you say that the socket is connected
> when connect() returns a WSAEINVAL error?

  This is what timeoutsocket.py used as the unix equivalent error
codes, and since I'm not set up to test windows and since it was
working code, I took their word for it.

> Also, your code enters this block even in non-blocking mode, if a
> timeout was also set.  (Fortunately I fixed this for you by setting
> the timeout to -1.0 in setblocking(). Unfortunately there's still a
> later test for !sock_blocking in the same block that cannot succeed
> any more because of that.)

  This confusion is arising because of the restructuring of the code.
Erm, this check applies if we have a timeout but are in non-blocking
mode. Perhaps you changed this? To make it clearer, originally before
the v2 of the patch, the socket was always in non-blocking mode, so
it was necessary to check whether we were examining error code with
non-blocking in mind, or whether we were checking for possible timeout
behavior. Since we've changed this, it now checks if non-blocking has
been set while a timeout has been set. Seems valid to me...

> The same concerns apply to connect_ex() and accept(), which have very
> similar logic.
> 
> I believe it is possible on some Unix variants (maybe on Linux) that
> when select indicates that a socket is ready, if the socket is in
> nonblocking mode, the call will return an error because some kernel
> resource is unavailable.  This suggests that you may have to keep the
> socket in blocking mode except when you have to do a connect() or
> accept() (for which you can't do a select without setting the socket
> in nonblocking mode first).

  Not sure about this. Checking the man pages, the error codes seem
to be the thing to check to determine what the behavior is. Perhaps
you could clarify?

> Looking at connect_ex, it seems to be missing the "res = errno" bit
> in the case where it says "we're already connected".  It used to
> return errno here, now it will return -1.  Maybe the conex_finally
> label should be moved up to before the "if (res != 0) {" line?

  Ah yes. I didn't look closely enough at the windows bit. On linux
it isn't necessary. Let's move it up.

> >   I thought about this and whether or not I wanted to address this.  I
> > kinda decided to leave them separate though. I don't think setting a
> > timeout means anything equivalent to setblocking(0). In fact, I can't
> > see why anyone should ever set a timeout of zero and the immediate
> > throwing of the exception is a good alert as to what's going on. I
> > vote, leave them separate and as they are now...
> 
> OTOH, a timeout of 0 behaves very similar to nonblocking mode --
> similar enough that a program that uses setblocking(0) would probably
> also work when using settimeout(0).  I kind of like the idea of having
> only a single internal flag value, sock_timeout, rather than two
> (sock_timeout and sock_blocking).

  But one throws an exception and one doesn't. It seems to me that
setting a timeout of 0 is sort of an error, if anything. It'll be an
easy way to do a superficial test of the functionality in the regr
test.

> > > - The socket.py module has been changed too, changing the way
> > >   buffering is done on Windows.  I haven't reviewed or tested this
> > >   code thoroughly.
> > 
> >   I added a regression test to test_socket.py to test this, that works
> > on both the old code (I used 2.1.3) and the new code. Hopefully, this
> > will be instrumental for those testing it and it reflects my manual
> > tests.
> 
> The tests don't look very systematic.  There are many cases (default
> bufsize, unbuffered, bufsize=1, large bufsize; combine with read(),
> readline(), read a line larger than the buffer size, etc.).  We need a
> more systematic approach to unit testing here.

  Ok, so to recap which tests we want:

     * Default read()
     * Read with size given
     * unbuffered read
     * large buffer size
     * Mix read and readline
     * Do a readline
     * Do a readline larger than buffer size.

  Any others in the list?
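
One way to cover those combinations without writing a separate test per
case is a single parameterized check run for several buffer sizes --
roughly like this sketch (the helper and the test data are made up for
illustration):

    import socket, unittest

    MSG = 'line one\n' + 'x' * 512 + '\n' + 'last line\n'

    class FileObjectBufferingTest(unittest.TestCase):

        def make_connected_pair(self):
            # Helper for the sketch: two connected TCP sockets over loopback.
            srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            srv.bind(('127.0.0.1', 0))
            srv.listen(1)
            cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            cli.connect(srv.getsockname())
            conn, addr = srv.accept()
            srv.close()
            return cli, conn

        def check_bufsize(self, bufsize):
            a, b = self.make_connected_pair()
            a.sendall(MSG)
            a.close()                      # so read() sees EOF
            f = b.makefile('rb', bufsize)
            self.assertEqual(f.readline(), 'line one\n')
            self.assertEqual(f.readline(), 'x' * 512 + '\n')  # > small buffers
            self.assertEqual(f.read(), 'last line\n')
            f.close()
            b.close()

        def testDefaultBufsize(self):  self.check_bufsize(-1)
        def testUnbuffered(self):      self.check_bufsize(0)
        def testLineBuffered(self):    self.check_bufsize(1)
        def testSmallBufsize(self):    self.check_bufsize(8)
        def testLargeBufsize(self):    self.check_bufsize(65536)

    if __name__ == '__main__':
        unittest.main()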

                              -- Mike

`-> (guido)

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From fredrik@pythonware.com  Fri Jun  7 18:42:13 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 7 Jun 2002 19:42:13 +0200
Subject: [Python-Dev] textwrap.py
References: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com>
Message-ID: <006d01c20e4a$d207a3c0$ced241d5@hagrid>

tim wrote:
> Despite that you never bought a shift key, you use two spaces between
> sentences.

that's only to compensate for the lack of uppercase letters.
I can change that, if you wish.

> A difference is that my quote came from a publisher spelling out
> requirements for submission

from what I can tell, the three most well-respected style guides
for American English are the Chicago Manual of Style, the MLA style,
and the APA.

the page I linked to in my first post on this topic was from the
official CMS FAQ:

    This practice [of using double spaces] is discouraged by the
    University of Chicago Press, especially for formally published
    works and the manuscripts from which they are published. 

the MLA FAQ says:

    Publications in the United States today usually have the same
    spacing after a punctuation mark as between words on the same
    line /.../ In addition, most publishers' guidelines for preparing a
    manuscript on disk ask authors to type only the spaces that are
    to appear in print.

and continues

    ... there is nothing wrong with using two spaces after concluding
    punctuation marks unless an instructor or editor requests that you
    do otherwise.

the APA don't have a FAQ, but according to a "crib sheet" I found
on the net, the 5th edition says something similar to:

    Use one space after all punctuation.

and finally, John Rhodes (of webword fame) has collected lots of arguments
for and against:

    http://www.webword.com/reports/period.html

his conclusion:

    If you can't decide for yourself based on the above information here is my
    advice: You should use one space. Period.

> > you'll find that the word "some" is more correct than "several".
> 
> I'm not sure that distinction means something; if it does, I don't buy it.

now that you mention it, I'm not sure either.

let's see: according to my dictionary, "some" implies "more than none but
not many" while "several" implies "two or more, but not a large number".

you're right; one could probably find two style guides that support your
view...

> >     "... I've found tenacity and authority the overriding "arguments"
> >     for maintaining the two-space rule. Empirically and financially, the
> >     one-space rule makes sense."
>
> Selective quoting of random people blathering at each other doesn't
> count as "research" to me.

no, but selective quoting can be used to make a point you're too
lazy to spell out yourself: most proponents rely on the authority of
their typing teacher or their mom.

> > according to vision researchers, humans using their eyes to read
> > text don't care much about sentence breaks inside blocks of text
> 
> This reads like a garbled paraphrase; I assume that if you had a real
> reference, you would have given it <0.9 wink>.

no, it was an attempt to summarize various sources (as I've inter-
preted them) in a way that could be understood by a bot.

as we all know, bots can simply copy bytes from an input device,
and don't have to learn how to carefully move their eyes in various
intricate patterns...

>    Like many theories, it sounds logical, but those of us who read old
>    books or are old enough to remember when typesetting was an art
>    practiced by people, rather than the result of an algorithm, know
>    better.  Typists were taught to hit two spaces after a period when
>    typing because typeset material once upon a time used extra space
>    there.
>
>    This is an interesting instance of a phenomenon that we should all
>    be aware of:  in times of much change, collective cultural amnesia
>    can occur, and a whole society can forget something that "everyone
>    knew."

the webword page mentions that it was difficult to typeset double
spaces on the first linotype machines, and when customers had to
pay extra to get double spaces, it quickly became unfashionable:

    If the operator typed two spaces in a row, you had two wedges next
    to each other, and that tended to gum up the operation. Clients who
    insisted could be accommodated by typing an en-space followed by
    a justifier-space, but printers charged extra for it and ridiculed it as
    'French Spacing, oo-la-la, you want it all fancy, huh?

iirc, the linotype was introduced in the 1890's. when did Patricia
write that review? ;-)

:::

anyway, to end this thread, the only reasonable thing is to do
like the "fmt" command, and provide a bunch of options:

    newtext = string.wrap(text, width=, french_spacing=, split=, prefix=)

I'll leave it to Guido to pick suitable defaults.
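
A toy version of that interface might look something like this (a sketch
only; it ignores split= and prefix=, and the eventual textwrap.py is not
based on it):

    def wrap(text, width=70, french_spacing=0):
        # french_spacing=0 here means: put two spaces after a
        # sentence-ending ., ! or ? when rejoining the words.
        words = text.split()
        lines = []
        current = ''
        for word in words:
            if current and not french_spacing and current[-1] in '.!?':
                sep = '  '
            else:
                sep = ' '
            if current:
                candidate = current + sep + word
            else:
                candidate = word
            if len(candidate) > width and current:
                lines.append(current)
                current = word
            else:
                current = candidate
        if current:
            lines.append(current)
        return '\n'.join(lines)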

Cheers /F




From mal@lemburg.com  Fri Jun  7 18:48:30 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 07 Jun 2002 19:48:30 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
References: <NEBBIJKBMLDBLNCEEFOCIEJCCNAA.perry@stsci.edu> <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> <3D00CD79.6050409@lemburg.com> <200206071523.g57FNNe18428@pcp02138704pcs.reston01.va.comcast.net>              <3D00DAE5.10203@lemburg.com> <200206071641.g57Gfr826040@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D00F1EE.9030501@lemburg.com>

Guido van Rossum wrote:
>>>Can you point to a document that records a decision to the contrary?
>>
>>Garbage collection, weak references, changes in the memory allocation,
>>etc. etc.
> 
> 
> IMO this is 100% FUD.

Could be 95% FUD :-), but I do remember that older versions of my
extensions broke when weak references were introduced.

> The GC does not change the object lay-out.  It is only triggered for
> types that have a specific flag bit.  The changes in the GC API also
> changed the flag bit used.  Weak references also use a flag bit in the
> type object and if the flag bit is on, look at a field in the type
> that checks whether there is a weakref pointer in the object struct.
> All objects (with public object lay-out) that have had their struct
> extended have always done so by appending to the end.
> 
> Tim spent weeks to make the memory allocation code backwards
> compatible (with several different versions, binary and source
> compatibility).
> 
> As an example, the old Zope ExtensionClass code works fine with Python
> 2.2.
 >
>>All these change the binary layout of structs or the semantics
>>of memory allocation -- mostly only in slight ways, but to a point
>>where 100% binary compatibility is not given anymore.
> 
> 
> Still, I maintain that most extensions that work with 1.5.2 still work
> today without recompilation, if you can live with the API version
> change warnings.  Try it!

That's true for most extensions.

Note that I wasn't saying that they all broke... distutils is
mainly being very careful about the Python version on Windows
because the name of the DLL contains the version name and the
reference is hard-coded into the extension DLLs.

Also, I don't have a problem with recompiling an extension for
a new Python version. What's important to me is that the
existing code continues to compile and work, not that a
compiled version for some old Python version continues
to run in a new version (the warnings are unacceptable in
a production environment, so there's no point in discussing
this).

>>Other changes (which I know of) are e.g. the Unicode APIs which
>>have changed (they now include UCS2 or UCS4 depending on whether
>>you use 16-bit or 32-bit Unicode internally).
> 
> 
> When you compile with UCS2 it should be backward compatible.

No, it's not:

08077f18 T PyUnicodeUCS2_AsASCIIString
0807da34 T PyUnicodeUCS2_AsCharmapString
080760a8 T PyUnicodeUCS2_AsEncodedString
08077b54 T PyUnicodeUCS2_AsLatin1String
0807d84c T PyUnicodeUCS2_AsRawUnicodeEscapeString
08076f64 T PyUnicodeUCS2_AsUTF16String
0807d6b4 T PyUnicodeUCS2_AsUTF8String
...

>>I don't think that binary compatibility is all that important;
>>it just requires a recompile (and hey, that way you even get
>>sub-type support for free ;-).
> 
> 
> Actually, you don't -- you have to set a flag bit to make your type
> subclassable.  There are too many things that classic extension types
> don't provide.

I was referring to the Py<Type>_Check() changes. After
a recompile they now also accept subtypes.

>>Far more difficult to handle are all those minute little changes
>>which easily slip the developer's radar.
> 
> 
> Examples?

Just see Raymond's PEP for a list of coding changes over the
years. Before this list existed, getting at that information was
hard.

Other subtle changes include the bool stuff, things like re starting
to fail when it sees multiple definitions of group names, changes
to xrange, change of character escaping, new scoping rules,
renaming in the C API (Length->Size) etc. etc.

>>Luckily this will get approached now by Andrew and Raymond, so
>>things are getting much better for us poor souls having to
>>live on supporting 3-4 different Python versions :-)
> 
> 
> I'm not sure which initiative you are referring to. 

The migration guide.

> Or even which
> Andrew (I presume you mean Raymond Hettinger)?

Andrew Kuchling.

Raymond Hettinger is summarizing the new coding style
possibilities (and how to write code for older Python
versions).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From tim.one@comcast.net  Fri Jun  7 18:51:16 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 13:51:16 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <15616.43504.457473.110331@slothrop.zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEEGPLAA.tim.one@comcast.net>

[Jeremy Hylton, explains the cells in Neil's example]

Thanks!  That was helpful.

> ...
> class A refers to
> function __init__ refers to
> cell for A refers to
> class A
>
> That's it for what I understand.

Where does the singleton tuple containing a cell come from?  I guess it must
be in function __init__.

> It looks like the example code creates two cycles, and one cycle
> refers to the other.  The first cycle is the one involving the A
> instance and the super instance variable.  That cycle has a reference
> to class A.
>
> When the garbage collector runs, it determines the first cycle is
> garbage.  It doesn't determine the second cycle is garbage because it
> has an external reference from the first cycle.

The proper understanding of "external" here is wrt all the objects GC
tracks.  So long as references are *within* that grand set, there are no
external references in a relevant sense.  "External" means stuff like the
reference is due to an untracked container, or to a C variable -- stuff like
that.

> I presume that the garbage collector can't collect both cycles in one
> pass without re-running the update & subtract refs phase after
> deleting all the garbage.  During the first refs pass, the second
> cycle wasn't detected.  The second cycle is only collectable after the
> first cycle has been collected.

I don't think that's it.  Here:

C:\Code\python\PCbuild>python
Python 2.3a0 (#29, Jun  5 2002, 23:17:02) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class A: pass
...
>>> class B: pass
...
>>> A.a = A # one cycle
>>> B.b = B # another
>>> A.b = B # point the first cycle at the second
>>> import gc
>>> gc.collect()
0
>>> gc.set_debug(gc.DEBUG_SAVEALL)
>>> del B
>>> gc.collect()   # A still keeping everything alive
0
>>> del A          # Both cycles are trash now
>>> gc.collect()   # And both are recovered in one pass
4
>>> print gc.garbage
[<class __main__.A at 0x0065D120>,

 {'a': <class __main__.A at 0x0065D120>,
  '__module__': '__main__',
  'b': <class __main__.B at 0x0065D1B0>,
  '__doc__': None
 },

 <class __main__.B at 0x0065D1B0>,

 {'__module__': '__main__',
  'b': <class __main__.B at 0x0065D1B0>,
  '__doc__': None
 }
]
>>>




From jeremy@zope.com  Fri Jun  7 14:12:33 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 7 Jun 2002 09:12:33 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEEGPLAA.tim.one@comcast.net>
References: <15616.43504.457473.110331@slothrop.zope.com>
 <LNBBLJKPBEHFEDALKOLCMEEGPLAA.tim.one@comcast.net>
Message-ID: <15616.45377.425270.366203@slothrop.zope.com>

>>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:

  TP> [Jeremy Hylton, explains the cells in Neil's example] Thanks!
  TP> That was helpful.

> ...
> class A refers to
> function __init__ refers to
> cell for A refers to
> class A

  TP> Where does the singleton tuple containing a cell come from?  I
  TP> guess it must be in function __init__.

As Guido mentioned, I was illustrative but avoided being thorough <0.5
wink>.

class A refers to
its __dict__ refers to 
function __init__ refers to
its func_closure (a tuple of cells) refers to
cell for A refers to
class A

Jeremy




From guido@python.org  Fri Jun  7 19:04:31 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 14:04:31 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: Your message of "Fri, 07 Jun 2002 19:42:13 +0200."
 <006d01c20e4a$d207a3c0$ced241d5@hagrid>
References: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com>
 <006d01c20e4a$d207a3c0$ced241d5@hagrid>
Message-ID: <200206071804.g57I4VT26617@pcp02138704pcs.reston01.va.comcast.net>

>     'French Spacing, oo-la-la, you want it all fancy, huh?

The really bizarre thing being that in LaTeX, \frenchspacing means
*not* to put extra space after a sentence!

It also appears right that (human) typesetters did stretch the space
between sentences more than the space between words when stretching a
line to right-justify it.  In order to do that with a computer
typesetting program, and to avoid it assuming a sentence ends after
other use of periods (e.g. in "Mr. Lundh"), you have to tell it where
the sentences end.  Double spacing is a convenient convention for
that.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From sholden@holdenweb.com  Fri Jun  7 19:19:18 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Fri, 7 Jun 2002 14:19:18 -0400
Subject: [Python-Dev] Socket timeout patch
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <20020607134036.C24428@eecs.tufts.edu>
Message-ID: <052101c20e4f$d8153930$7201a8c0@holdenweb.com>

[ ... ]
[Guido]
> > - The original timeout socket code (in Python) by Tim O'Malley had a
> >   global timeout which you could set so that *all* sockets
> >   *automatically* had their timeout set.  This is nice if you want it
> >   to affect library modules like urllib or ftplib.  That feature is
> >   currently missing.  Should we add it?  (I have some concerns about
> >   it, in that it might break other code -- and it doesn't seem wise to
> >   do this for server-end sockets in general.  But it's a nice hack for
> >   smaller programs.)
>
[Mike]
>   Is it really so painful for apps to keep track of all their sockets
> and then do something like:
>
>      for sock in sock_list:
>         sock.settimeout (blah)
>
>   Why keep track of them in the socket module, unless there's already code
> for this.
>
It isn't painful, it's impossible (unless you want to revise all the
libraries).

The real problem comes when a program uses a socket-based library such as
smtplib or ftplib. Without the ability to impose a default timeout the
library client has no way to set a timeout until the library has created the
socket (and even then it will break encapsulation to do so in many cases).

Unfortunately, the most common requirement for a timeout is to avoid socket
code hanging when it makes the initial attempt to connect to a
non-responsive host. Under these circumstances, if the connect() doesn't
time out it can apparently be as long as two hours before an exception is
raised.
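
For illustration, a module-level default along the lines of Tim
O'Malley's timeoutsocket.py could be as simple as this sketch (the
setdefaulttimeout() name and the monkey-patching are placeholders, not
part of the current patch):

    import socket

    _default_timeout = None          # None means no default timeout

    def setdefaulttimeout(timeout):
        global _default_timeout
        _default_timeout = timeout

    _real_socket = socket.socket

    def _socket_with_default(*args):
        s = _real_socket(*args)
        if _default_timeout is not None:
            s.settimeout(_default_timeout)
        return s

    socket.socket = _socket_with_default

    # Usage: call setdefaulttimeout(30.0) once at startup, and sockets
    # created inside urllib, ftplib, smtplib etc. give up after 30 seconds
    # instead of hanging on a non-responsive host.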

[...]

regards
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------






From fredrik@pythonware.com  Fri Jun  7 19:27:14 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 7 Jun 2002 20:27:14 +0200
Subject: [Python-Dev] textwrap.py
References: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com>              <006d01c20e4a$d207a3c0$ced241d5@hagrid>  <200206071804.g57I4VT26617@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <00a801c20e50$f5dc6730$ced241d5@hagrid>

Guido wrote:

> It also appears right that (human) typesetters did stretch the space
> between sentences more than the space between words when stretching a
> line to right-justify it.  In order to do that with a computer
> typesetting program, and to avoid it assuming a sentence ends after
> other use of periods (e.g. in "Mr. Lundh"), you have to tell it where
> the sentences end.  Double spacing is a convenient convention for
> that.

or you could use non-breaking spaces...

</F>




From guido@python.org  Fri Jun  7 19:39:18 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 14:39:18 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: Your message of "Fri, 07 Jun 2002 20:27:14 +0200."
 <00a801c20e50$f5dc6730$ced241d5@hagrid>
References: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com> <006d01c20e4a$d207a3c0$ced241d5@hagrid> <200206071804.g57I4VT26617@pcp02138704pcs.reston01.va.comcast.net>
 <00a801c20e50$f5dc6730$ced241d5@hagrid>
Message-ID: <200206071839.g57IdIk26862@pcp02138704pcs.reston01.va.comcast.net>

> or you could use non-breaking spaces...

My keyboard doesn't have one.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com  Fri Jun  7 19:36:53 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 07 Jun 2002 20:36:53 +0200
Subject: [Python-Dev] Socket timeout patch
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <20020607134036.C24428@eecs.tufts.edu> <052101c20e4f$d8153930$7201a8c0@holdenweb.com>
Message-ID: <3D00FD45.1090306@lemburg.com>

>>>- The original timeout socket code (in Python) by Tim O'Malley had a
>>>  global timeout which you could set so that *all* sockets
>>>  *automatically* had their timeout set.  This is nice if you want it
>>>  to affect library modules like urllib or ftplib.  That feature is
>>>  currently missing.  Should we add it?  (I have some concerns about
>>>  it, in that it might break other code -- and it doesn't seem wise to
>>>  do this for server-end sockets in general.  But it's a nice hack for
>>>  smaller programs.)

Would be nice to have this. Programs like Plucker which do a lot
of socket work could benefit from it.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From tim.one@comcast.net  Fri Jun  7 19:46:35 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Jun 2002 14:46:35 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <006d01c20e4a$d207a3c0$ced241d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEOPLAA.tim.one@comcast.net>

[Tim]
> Despite that you never bought a shift key, you use two spaces between
> sentences.

[/F]
> that's only to compensate for the lack of uppercase letters.
> I can change that, if you wish.

Goodness no!  I read email in Courier New, and the "extra" spaces do more to
make your style readable than would capital letters.

> ...
> from what I can tell, the three most well-respected style guides
> for American English are the Chicago Manual of Style, the MLA style,
> and the APA.

Ya, I saw all that stuff.  As I said at the very start, the "two space" rule
doesn't make sense for published works, as proportional fonts, kerning, and
the other gimmicks available to real typesetting are sufficient there.  I'm
solely talking about monospaced fonts.  The CMS etc are not.  If you follow
links deeply enough, you'll find at least one of the authors of these guides
"confessing" that they use two spaces in email, so that it's readable in a
fixed font.

> ...
>     Publications in the United States today usually have the same
>     spacing after a punctuation mark as between words on the same
>     line /.../

Except virtually no publications in the US today use monospaced fonts.

> ...
> and finally, John Rhodes (of webword fame) has collected lots of arguments
> for and against:
>
>     http://www.webword.com/reports/period.html

Yes, I read that too.  His "revelation" at the start is crucial:

    One of the next things I realized is that, in general, the spacing
    after a period will be irrelevant since most fonts used today are
    proportional

and goes on to reinforce the point in BOLD whenever he can <wink>:

    ... the current typographic standard for a single space after the
    period is a reflection of the power of proportionally spaced fonts.

Repetitions of this point are ubiquitous all over the web, not just in my
email <wink>.

> ...
> iirc, the linotype was introduced in the 1890's. when did Patricia
> write that review? ;-)

1996.  It's an OK review:

    http://www.the-efa.org/news/gramglean.html

> ...
> anyway, to end this thread, the only reasonable thing is to do
> like the "fmt" command, and provide a bunch of options:
>
>     newtext = string.wrap(text, width=, french_spacing=, split=, prefix=)
>
> I'll leave it to Guido to pick suitable defaults.

Greg seems to want to do it via setting vrbls on subclasses.  I couldn't
care less how it's done, so long as I have some way to wrap for readability
in a fixed-width font.




From guido@python.org  Fri Jun  7 20:25:10 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 15:25:10 -0400
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: Your message of "Fri, 07 Jun 2002 19:48:30 +0200."
 <3D00F1EE.9030501@lemburg.com>
References: <NEBBIJKBMLDBLNCEEFOCIEJCCNAA.perry@stsci.edu> <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net> <3D00CD79.6050409@lemburg.com> <200206071523.g57FNNe18428@pcp02138704pcs.reston01.va.comcast.net> <3D00DAE5.10203@lemburg.com> <200206071641.g57Gfr826040@pcp02138704pcs.reston01.va.comcast.net>
 <3D00F1EE.9030501@lemburg.com>
Message-ID: <200206071925.g57JPB627364@pcp02138704pcs.reston01.va.comcast.net>

> Could be 95% FUD :-), but I do remember that older versions of my
> extensions broke when weak references were introduced.

Could be that you were breaking the rules of course. :-)

> Also, I don't have a problem with recompiling an extension for
> a new Python version. What's important to me is that the
> existing code continues to compile and work, not that a
> compiled version for some old Python version continues
> to run in a new version (the warnings are unacceptable in
> a production environment, so there's no point in discussing
> this).

There are two kinds of case to be made for binary compatibility, both
involving 3rd party extensions whose maintainer has lost interest.
Case one is: it's only available in binary for a given platform (maybe
something it's linked with wasn't open source).  Case two: the code
doesn't compile any more under a new Python version, and the user who
wants to use it isn't sufficiently versatile in C to be able to fix it
(or has no time).

> > When you compile with UCS2 it should be backward compatible.
> 
> No, it's not:
> 
> 08077f18 T PyUnicodeUCS2_AsASCIIString
> 0807da34 T PyUnicodeUCS2_AsCharmapString
> 080760a8 T PyUnicodeUCS2_AsEncodedString
> 08077b54 T PyUnicodeUCS2_AsLatin1String
> 0807d84c T PyUnicodeUCS2_AsRawUnicodeEscapeString
> 08076f64 T PyUnicodeUCS2_AsUTF16String
> 0807d6b4 T PyUnicodeUCS2_AsUTF8String
> ...

Hm.  Maybe only the UCS4 variants should be renamed?

Of course, few extensions reference Unicode APIs...

> Just see Raymond's PEP for a list of coding changes over the
> years. Before this list existed, getting at that information was
> hard.

But you don't *have* to make any of those changes.  That's the whole
point of backwards compatibility.

> Other subtle changes include the bool stuff, things like re starting
> to fail when it sees multiple definitions of group names, changes
> to xrange, change of character escaping, new scoping rules,
> renaming in the C API (Length->Size) etc. etc.

I think we have left the topic of binary compatibility here. :-)

> Raymond Hettinger is summarizing the new coding style
> possibilities (and how to write code for older Python
> versions).

Yeah, I'm waiting for him to check it in as PEP 290.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From bernie@3captus.com  Fri Jun  7 20:17:52 2002
From: bernie@3captus.com (Bernard Yue)
Date: Fri, 07 Jun 2002 13:17:52 -0600
Subject: [Python-Dev] Socket timeout patch
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net>
 <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D0106E0.5D8DF0A5@3captus.com>

[Guido]
> Remaining issues:
> 
> - A test suite.  There's no decent test suite for the timeout code.
>   The file test_timeout.py doesn't test the functionality (as I
>   discovered when the test succeeded while I had several blunders 
>   in the select code that made everything always time out).

[Michael]
>   Er, hopefully Bernard is still watching this thread as he wrote
> the test_timeout.py. He's been pretty quiet though as of late... I'm
> willing to rewrite the tests if he doesn't have the time. 
>
>   I think the tests should follow the same pattern as the
> test_socket.py.  While adding my regression tests, I noted that the
> general socket test suite could use some re-writing but I didn't feel
> it appropriate to tackle it at that point. Perhaps a next patch?

Looks like I have missed the war, folks!  I will work on the test 
suite.  The original test_timeout.py is incomplete.  I actually had 
problems when writing test cases for accept(), using blocking() and 
makefile().  Guido, you are right on that point: the test suite 
should work without the timeout code as well.  If I'd done that ...

As for the scope of the test suite, I would prefer to focus on the socket 
timeout tests for now.  Though there will be overlap between the socket 
timeout tests and the socket tests, we can always merge them later.


[Guido]
> - Cross-platform testing.  It's possible that the cleanup broke things
>   on some platforms, or that select() doesn't work the same way.  I
>   can only test on Windows and Linux; there is code specific to OS/2
>   and RISCOS in the module too.

[Michael]
>   This was a concern from the beginning but we had some chat on the
> dev list and concluded that any system supporting sockets has to
> support select or some equivalent (hence the initial reason for using
> the select module, although I agree it was expensive).

I now have Visual C++ version 6, but still limited to Windows and 
Linux.  I think once we are done with these two platforms, we can ask 
people to run the test on other platforms.  But I agree with Michael
that using the python select module puts us on the safer side.


[Guido]
> - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)?

> OTOH, a timeout of 0 behaves very similar to nonblocking mode --
> similar enough that a program that uses setblocking(0) would 
> probably also work when using settimeout(0).  I kind of like the 
> idea of having only a single internal flag value, sock_timeout, 
> rather than two (sock_timeout and sock_blocking).

Agree.

[Guido]
> - The original timeout socket code (in Python) by Tim O'Malley had a
>  global timeout which you could set so that *all* sockets
>  *automatically* had their timeout set.  This is nice if you want it
>  to affect library modules like urllib or ftplib.  That feature is
>  currently missing.  Should we add it?  (I have some concerns about
>  it, in that it might break other code -- and it doesn't seem wise to
>  do this for server-end sockets in general.  But it's a nice hack for
>  smaller programs.)

Steve Holden and M.-A. Lemburg have spoken.


Bernie



From guido@python.org  Fri Jun  7 21:03:10 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 16:03:10 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: Your message of "Fri, 07 Jun 2002 13:40:36 EDT."
 <20020607134036.C24428@eecs.tufts.edu>
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net>
 <20020607134036.C24428@eecs.tufts.edu>
Message-ID: <200206072003.g57K3AJ27544@pcp02138704pcs.reston01.va.comcast.net>

[Jeremy, please skip forward to where it says "Stevens" or "Jeremy"
and comment.]

>   Glad to hear that _fileobject works well.

I didn't say that.  I was hoping it does though. :-)

> Is there any benefit to having the file code in C?

Some operations (notably pickle.dump() and .load()) only work with
real files.  Other operations (e.g. printing a large list or dict) can
be faster to real files because they don't have to build the str() or
repr() of the whole thing as a string first.

> I bet the python code isn't that much slower.

You're on.  Write a benchmark.  I notice that httplib uses makefile(),
and often reads very small chunks.
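
A rough starting point for such a benchmark (a sketch: it times many
small readline() calls through makefile() over a loopback connection;
doing the same run with the Python _fileobject swapped in is the real
comparison):

    import socket, threading, time

    NLINES = 20000
    LINE = 'GET /something/small HTTP/1.0\r\n'   # a short, httplib-ish line

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(('127.0.0.1', 0))                   # any free loopback port
    srv.listen(1)

    def serve():
        conn, addr = srv.accept()
        conn.sendall(LINE * NLINES)
        conn.close()

    t = threading.Thread(target=serve)
    t.start()

    cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    cli.connect(srv.getsockname())
    f = cli.makefile('rb')

    t0 = time.time()
    for i in xrange(NLINES):
        f.readline()
    print '%d readline() calls: %.3f seconds' % (NLINES, time.time() - t0)

    f.close()
    cli.close()
    t.join()
    srv.close()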

> It does seem a shame to have to switch
> between the two. Maybe one solution is that a makefile should set
> blocking back on if a timeout exists?

That's not very nice.  It could raise an exception.  But you could set
the timeout on the socket *after* calling makefile(), and then you'd
be hosed.  But if we always use the Python makefile(), the problem is
solved.

> That would solve the problem
> and is a consistent change since it only checks timeout behavior (=
> 2 lines of code). I'd vote for using the python fileobject for
> both. Some profiling of the two would be nice if you have the time.

I don't, maybe you do? :-)

> >   I'm also thinking of implementing the socket wrapper (which
> >   is currently a Python class that has a "real" socket object as an
> >   instance variable) as a subclass of the real socket class instead.

>   Er, what's the difference between that and _socketobject in
> socket.py? Why not just use the python bit consistently?

Making it a subclass should be faster.  But I don't know yet if it can
work -- I probably have to change the constructor at the C level to be
able to support dup() (which is the other reason for the Python
wrapper).

>   Is it really so painful for apps to keep track of all their sockets
> and then do something like:
> 
>      for sock in sock_list:
>         sock.settimeout (blah)
> 
>   Why keep track of them in the socket module, unless there's already code
> for this.

Steve Holden already answered this one.  Also, you don't have to keep
track of all sockets -- you just have to apply the timeout if one is
set in a global variable.
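
One hedged sketch of how such a global might be wired up (the names
_default_timeout, setdefaulttimeout() and _realsocket below are invented
for illustration, not taken from any patch): the Python-level wrapper
consults a module-level variable and only touches the socket when a
default has actually been set, so existing code is unaffected.

    from _socket import socket as _realsocket

    _default_timeout = None        # None means "no default": sockets stay blocking

    def setdefaulttimeout(timeout):
        global _default_timeout
        _default_timeout = timeout

    class _socketobject:
        def __init__(self, family, type, proto=0):
            self._sock = _realsocket(family, type, proto)
            # Apply the global default only if one was set.
            if _default_timeout is not None:
                self._sock.settimeout(_default_timeout)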

>   Well, we agreed to do some clean-up in a separate patch but you seem
> anxious to get it in there :)

I am relaxing now, waiting for you to pick up again.

>   Well, this was the initial reason to use the selectmodule.c code.
> There's got to be a way to share code between the two for bare access
> to select, since someone else might want to use such functionality one
> day (and this has set the precedent). Why not make a small change to
> selectmodule.c that opens up the code as a C API of some sort? And
> then have select_select use that function internally.

If two modules are both shared libraries, it's really painful to share
entry points (see the interface between _ssl.c and socketmodule.c for
an example -- it's written down in a large comment block in
socketmodule.h).  I really think one should be able to use select() in
more than one file.  At least it works on Windows. ;-)

> > > > - I'm not sure that the handling of timeout errors in accept(),
> > > >   connect() and connect_ex() is 100% correct (this code sniffs the
> > > >   error code and decides whether to retry or not).
> 
>   This is how the original timeoutsocket.py did it and it seems
> to be the way to do blocking connects.

I understand all that.  My comment is that your patch changed the
control flow in non-blocking mode too.  I think I accidentally fixed
it by setting sock_timeout to -1.0 in setblocking().  But I'm not 100%
sure so I'd like this aspect to be unit-tested thoroughly.

>   Ok.. Should I merge the test_timeout.py and test_socket.py as well
> then?

No, you can create as many (or as few) unit test files as you need.

> A little off-topic, while I was thinking of restructuring these
> tests, I was wondering what might be the best way to structure a unit
> test where things have to work in separate processes/threads. What
> I'd really like to do is:
> 
>   * Have the setup function set up server and client sockets
>   * Have the tear-down function close them
>   * Have some synchronization function or simple message (this is
>     socket specific)
>   * Then have a test that has access to both threads and can insert
>     call-backs to run in each thread.
> 
>   This seems tricky with the current unit-test framework. The way I'd
> do it is using threading.Event () and have the thing block until the
> server-side and client-side respectively submit their test callbacks.
> But it might be nice to have a general class that can be added to the
> testing framework. Thoughts?

If I were you I'd worry about getting it right once first.  Then we
can see if there's room for generalization.  (You might want to try to
convert test_socketserver.py to your proposed framework to see how
well it works.)
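
For what it's worth, a bare-bones sketch of the Event-based shape described
above (class and method names are made up; the server half would create,
bind and listen on its socket where the comments indicate):

    import threading
    import unittest

    class ThreadedSocketTest(unittest.TestCase):

        def setUp(self):
            self.server_ready = threading.Event()
            self.client_done = threading.Event()
            self.thread = threading.Thread(target=self.server_side)
            self.thread.start()
            self.server_ready.wait()   # don't connect before the server listens

        def tearDown(self):
            self.client_done.wait()    # let the client half finish first
            self.thread.join()

        def server_side(self):
            # ... create, bind and listen on the server socket here ...
            self.server_ready.set()    # unblock the client half
            # ... run the server half of the test callback ...

        def test_connect(self):
            # client half, running in the main thread
            # ... connect to the server and exercise the code under test ...
            self.client_done.set()

    if __name__ == '__main__':
        unittest.main()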

> > Can you explain why on Windows you say that the socket is connected
> > when connect() returns a WSAEINVAL error?
> 
>   This is what timeoutsocket.py used as the unix equivalent error
> codes, and since I'm not set up to test windows and since it was
> working code, I took their word for it.

Well, but WSAEINVAL can also be returned for other conditions.  See

http://msdn.microsoft.com/library/en-us/winsock/wsapiref_8m7m.asp

It seems that the *second* time you call connect() WSAEINVAL can only
mean that you're already connected.  But if this socket is in a
different state, and the connect() is simply not appropriate, I don't
like the fact that connect() would simply return "success" rather than
reporting an error.  E.g. I could do this:

  s = socket()
  s.settimeout(100)
  s.connect((host, port))
  .
  .
  .
  # By mistake:
  s.connect((otherhost, otherport))

I want the latter connect() to fail, but I think your code will make
it succeed.

> > Also, your code enters this block even in non-blocking mode, if a
> > timeout was also set.  (Fortunately I fixed this for you by setting
> > the timeout to -1.0 in setblocking(). Unfortunately there's still a
> > later test for !sock_blocking in the same block that cannot succeed
> > any more because of that.)
> 
>   This confusion is arising because of the restructuring of the code.
> Erm, this check applies if we have a timeout but are in non-blocking
> mode. Perhaps you changed this? To make it clearer, originally before
> the v2 of the patch, the socket was always in non-blocking mode, so
> it was necessary to check whether we were examining error code with
> non-blocking in mind, or whether we were checking for possible timeout
> behavior. Since we've changed this, it now checks if non-blocking has
> been set while a timeout has been set. Seems valid to me...

But I changed that again: setblocking() now always disables the
timeout.  Read the new source in CVS.

> > The same concerns apply to connect_ex() and accept(), which have very
> > similar logic.
> > 
> > I believe it is possible on some Unix variants (maybe on Linux) that
> > when select indicates that a socket is ready, if the socket is in
> > nonblocking mode, the call will return an error because some kernel
> > resource is unavailable.  This suggests that you may have to keep the
> > socket in blocking mode except when you have to do a connect() or
> > accept() (for which you can't do a select without setting the socket
> > in nonblocking mode first).
> 
>   Not sure about this. Checking the man pages, the error codes seem
> to be the thing to check to determine what the behavior is. Perhaps
> you could clarify?

When a timeout is set, the socket file descriptor is always in
nonblocking mode.  Take sock_recv() for example.  It calls
internal_select(), and if that returns >= 1, it calls recv().  But
according to the Stevens books, it is still possible (under heavy
load) that the recv() returns an EWOULDBLOCK error.

(We ran into this while debugging a high-performance application based
on Spread.  The select() succeeded, but the recv() failed, because the
socket was in nonblocking mode.  Well, I'm *almost* sure that this was
the case -- Jeremy Hylton should know the details.)
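
A hedged Python rendition of that failure mode (just an illustration of
the retry logic -- the real code is C in socketmodule.c, and the exception
message is made up): even after select() reports the descriptor readable,
recv() on a nonblocking socket can still fail with EWOULDBLOCK, so the
loop has to go around and wait again.

    import errno, select, socket

    def recv_with_timeout(sock, nbytes, timeout):
        sock.setblocking(0)
        while 1:
            r, w, x = select.select([sock], [], [], timeout)
            if not r:
                raise socket.error, 'timed out'
            try:
                return sock.recv(nbytes)
            except socket.error, e:
                if e.args[0] not in (errno.EWOULDBLOCK, errno.EAGAIN):
                    raise
                # select() said ready, but under load the kernel can
                # still refuse; loop and wait again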

> > Looking at connect_ex, it seems to be missing the "res = errno" bit
> > in the case where it says "we're already connected".  It used to
> > return errno here, now it will return -1.  Maybe the conex_finally
> > label should be moved up to before the "if (res != 0) {" line?
> 
>   Ah yes. I didn't look closely enough at the windows bit. On linux
> it isn't necessary. Let's move it up.

OK, done.

> > OTOH, a timeout of 0 behaves very similar to nonblocking mode --
> > similar enough that a program that uses setblocking(0) would probably
> > also work when using settimeout(0).  I kind of like the idea of having
> > only a single internal flag value, sock_timeout, rather than two
> > (sock_timeout and sock_blocking).
> 
>   But one throws an exception and one doesn't.

What do you mean?  In nonblocking mode you get an exception when the
socket isn't ready too.

> It seems to me that
> setting a timeout of 0 is sort of an error, if anything. It'll be an
> easy way to do a superficial test of the functionality in the regr
> test.

OK, we don't seem to be able to agree on this.  I'll let your wisdom
prevail.

> > The tests don't look very systematic.  There are many cases (default
> > bufsize, unbuffered, bufsize=1, large bufsize; combine with read(),
> > readline(), read a line larger than the buffer size, etc.).  We need a
> > more systematic approach to unit testing here.
> 
>   Ok, so to recap which tests we want:
> 
>      * Default read()
>      * Read with size given
>      * unbuffered read
>      * large buffer size
>      * Mix read and readline
>      * Do a readline
>      * Do a readline larger than buffer size.
> 
>   Any others in the list?

Check the socket.py source code and make sure that every code path
through every method is taken at least once.
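
One hedged way to make that systematic (the make_file helper is invented;
in the test it would presumably wrap a freshly connected socket, e.g. with
socket._fileobject(conn, 'rb', bufsize)): cross every read pattern with
every buffering mode instead of hand-writing each combination.

    BUFSIZES = [-1, 0, 1, 8192, 100000]   # default, unbuffered, line, small, large

    def read_all(f):
        return f.read()

    def read_in_pieces(f):
        return f.read(7) + f.read()

    def readline_all(f):
        data = ''
        while 1:
            line = f.readline()
            if not line:
                return data
            data = data + line

    PATTERNS = [read_all, read_in_pieces, readline_all]

    def run_matrix(make_file, expected):
        # make_file(bufsize) should return a file object with the test
        # data already queued on the underlying socket
        for bufsize in BUFSIZES:
            for pattern in PATTERNS:
                f = make_file(bufsize)
                assert pattern(f) == expected, (bufsize, pattern.__name__)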

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun  7 21:14:37 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Jun 2002 16:14:37 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: Your message of "Fri, 07 Jun 2002 13:17:52 MDT."
 <3D0106E0.5D8DF0A5@3captus.com>
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net>
 <3D0106E0.5D8DF0A5@3captus.com>
Message-ID: <200206072014.g57KEbl27613@pcp02138704pcs.reston01.va.comcast.net>

> Looks like I have missed the war, folks!  I will work on the test 
> suite.  The original test_timeout.py is incomplete.  I actually had 
> problems when writing test cases for accept(), using blocking() and 
> makefile().  Guido, you are right on the point, the test suite 
> should work without the timeout code as well.  If I've done that ...
> 
> As for the scope of the test suite, I would prefer to focus on the socket 
> timeout test for now.  Though there will be overlapping tests between the 
> socket timeout test and the socket test, we can always merge them later.

Thanks, Bernie!

> I now have Visual C++ version 6, but am still limited to Windows and 
> Linux.  I think once we are done with these two platforms, we can ask 
> people to run the test on other platforms.

Good idea.

> But I agree with Michael that using the Python select module puts us on
> the safer side.

But it's too slow.

> [Guido]
> > - Should sock.settimeout(0.0) mean the same as sock.setblocking(0)?
> 
> > OTOH, a timeout of 0 behaves very similar to nonblocking mode --
> > similar enough that a program that uses setblocking(0) would 
> > probably also work when using settimeout(0).  I kind of like the 
> > idea of having only a single internal flag value, sock_timeout, 
> > rather than two (sock_timeout and sock_blocking).
> 
> Agree.

Hm, finally someone who agrees with me on this. ;-)

> Steve Holden and M.-A. Lemburg have spoken.

Can I expect a patch from you or Michael?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mgilfix@eecs.tufts.edu  Fri Jun  7 21:32:32 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Fri, 7 Jun 2002 16:32:32 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <3D00FD45.1090306@lemburg.com>; from mal@lemburg.com on Fri, Jun 07, 2002 at 08:36:53PM +0200
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <20020607134036.C24428@eecs.tufts.edu> <052101c20e4f$d8153930$7201a8c0@holdenweb.com> <3D00FD45.1090306@lemburg.com>
Message-ID: <20020607163232.D24428@eecs.tufts.edu>

  I stand corrected. Seems like people want the feature...

On Fri, Jun 07 @ 20:36, M.-A. Lemburg wrote:
> >>>- The original timeout socket code (in Python) by Tim O'Malley had a
> >>>  global timeout which you could set so that *all* sockets
> >>>  *automatically* had their timeout set.  This is nice if you want it
> >>>  to affect library modules like urllib or ftplib.  That feature is
> >>>  currently missing.  Should we add it?  (I have some concerns about
> >>>  it, in that it might break other code -- and it doesn't seem wise to
> >>>  do this for server-end sockets in general.  But it's a nice hack for
> >>>  smaller programs.)
> 
> Would be nice to have this. Programs like Plucker which do a lot
> of socket work could benefit from it.
> 
> -- 
> Marc-Andre Lemburg
> CEO eGenix.com Software GmbH
> ______________________________________________________________________
> Company & Consulting:                           http://www.egenix.com/
> Python Software:                   http://www.egenix.com/files/python/
> Meet us at EuroPython 2002:                 http://www.europython.org/
> 
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
`-> (mal)

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From mgilfix@eecs.tufts.edu  Fri Jun  7 21:41:55 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Fri, 7 Jun 2002 16:41:55 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206072014.g57KEbl27613@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jun 07, 2002 at 04:14:37PM -0400
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <3D0106E0.5D8DF0A5@3captus.com> <200206072014.g57KEbl27613@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020607164154.F24428@eecs.tufts.edu>

  If no one's taken this after I finish the rewrite of test_socket.py,
I'll tackle it.  So you'll have either Bernard or me on it.

                   -- Mike

On Fri, Jun 07 @ 16:14, Guido van Rossum wrote:
> > Steve Holden and M.-A. Lemburg have spoken.
> 
> Can I expect a patch from you or Michael?
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
`-> (guido)

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From bernie@3captus.com  Fri Jun  7 21:32:39 2002
From: bernie@3captus.com (Bernard Yue)
Date: Fri, 07 Jun 2002 14:32:39 -0600
Subject: [Python-Dev] Socket timeout patch
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net>
 <3D0106E0.5D8DF0A5@3captus.com> <200206072014.g57KEbl27613@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D011867.EBC5470@3captus.com>

Guido van Rossum wrote:
> 
> > But I agree with Michael that using the Python select module puts us on
> > the safer side.
> 
> But it's too slow.
> 

Well, if we have to use native select(), can I assume that there will 
only be three cases, namely UNIX, Windows and BeOS (looks like that's 
the case from selectmodule.c)?

Assuming the above is true, what needs to be done is to create a C API 
from select_select() so that socketmodule.c can use it.  Is that 
correct?

> > Steve Holden and M.-A. Lemburg have spoken.
> 
> Can I expect a patch from you or Michael?
> 

Let's see where we are when I finish the test suite.  I'll do it 
if it still hasn't been done.

> --Guido van Rossum (home page: http://www.python.org/~guido/)


Bernie



From mgilfix@eecs.tufts.edu  Fri Jun  7 21:40:22 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Fri, 7 Jun 2002 16:40:22 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <3D0106E0.5D8DF0A5@3captus.com>; from bernie@3captus.com on Fri, Jun 07, 2002 at 01:17:52PM -0600
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net> <3D0106E0.5D8DF0A5@3captus.com>
Message-ID: <20020607164022.E24428@eecs.tufts.edu>

On Fri, Jun 07 @ 13:17, Bernard Yue wrote:
> Looks like I have missed the war, folks!  I will work on the test 
> suite.  The original test_timeout.py is incomplete.  I actually had 
> problems when writing test cases for accept(), using blocking() and 
> makefile().  Guido, you are right on the point, the test suite 
> should work without the timeout code as well.  If I've done that ...
> 
> As for the scope of the test suite, I would prefer to focus on the socket 
> timeout test for now.  Though there will be overlapping tests between the 
> socket timeout test and the socket test, we can always merge them later.

  Sounds good. I'll work on rewriting test_socket.py, which needs
to be done anyway to better test the _fileobject on Windows --
especially if we decide to adopt that later. I'll run some profiling
tests and we'll see how painful it is. That way Guido can smack me
appropriately. I'll probably be able to draft one up tomorrow (I doubt
this evening).

> 
> [Guido]
> > - Cross-platform testing.  It's possible that the cleanup broke things
> >   on some platforms, or that select() doesn't work the same way.  I
> >   can only test on Windows and Linux; there is code specific to OS/2
> >   and RISCOS in the module too.
> 
> [Michael]
> >   This was a concern from the beginning but we had some chat on the
> > dev list and concluded that any system supporting sockets has to
> > support select or some equivalent (hence the initial reason for using
> > the select module, although I agree it was expensive).
> 
> I now have Visual C++ version 6, but am still limited to Windows and 
> Linux.  I think once we are done with these two platforms, we can ask 
> people to run the test on other platforms.  But I agree with Michael
> that using the Python select module puts us on the safer side.

  Great. We need some more Windows testing since it's so different
from *nix.

                    -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From gward@python.net  Fri Jun  7 22:33:29 2002
From: gward@python.net (Greg Ward)
Date: Fri, 7 Jun 2002 17:33:29 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <200206071500.g57F0lh18144@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMECKPLAA.tim_one@email.msn.com> <3D008DA0.20248.6AD00DBA@localhost> <200206071500.g57F0lh18144@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020607213329.GA21836@gerg.ca>

On 07 June 2002, Guido van Rossum said:
> True, but then there needs to be a way to enable/disable it, since
> even if you never use two spaces after a period, the rule can still
> generate them for you in the output: when an input sentence ends at
> the end of a line but the output sentence doesn't, the rule will
> translate the newline into two spaces instead of one.
> 
> I vote to have it off by default.

Sounds about right to me.  Reading this thread has revealed that 1) I
was correct to add sentence-ending-detection code, 2) I missed a few
subtle details (e.g. my code will change "Dr. Frankenstein" to
"Dr.  Frankenstein" -- d'ohh!), and 3) the programmer must be
able to select whether she wants to use it.
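
(A tiny illustration of why that case is hard -- this is a naive stand-in,
not textwrap's actual detection code: a lowercase letter followed by a
period looks exactly like the end of a sentence.)

    import re

    naive_end = re.compile(r'[a-z][.!?]$')   # "lowercase letter + stop"
    for word in ['sentence.', 'Dr.', 'etc.', 'end!']:
        print word, '->', naive_end.search(word) and 'two spaces' or 'one space'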

        Greg
-- 
Greg Ward - Unix bigot                                  gward@python.net
http://starship.python.net/~gward/
A day for firm decisions!!!!!  Or is it?



From gward@python.net  Fri Jun  7 22:39:47 2002
From: gward@python.net (Greg Ward)
Date: Fri, 7 Jun 2002 17:39:47 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEEOPLAA.tim.one@comcast.net>
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <LNBBLJKPBEHFEDALKOLCEEEOPLAA.tim.one@comcast.net>
Message-ID: <20020607213947.GB21836@gerg.ca>

On 07 June 2002, Tim Peters said:
> Greg seems to want to do it via setting vrbls on subclasses.  I couldn't
> care less how it's done, so long as I have some way to wrap for readability
> in a fixed-width font.

No subclass required -- just an instance:

  wrapper = TextWrapper()
  wrapper.fix_sentence_endings = 0
  wrapper.wrap(...)

Not sure if "fix_sentence_endings" is the right spelling, but it'll do
for now.

no-i-do-NOT-know-what-a-strategy-class-is,

        Greg
-- 
Greg Ward - Unix weenie                                 gward@python.net
http://starship.python.net/~gward/
I just heard the SEVENTIES were over!!  And I was just getting in touch
with my LEISURE SUIT!!



From paul@prescod.net  Fri Jun  7 22:59:53 2002
From: paul@prescod.net (Paul Prescod)
Date: Fri, 07 Jun 2002 14:59:53 -0700
Subject: [Python-Dev] textwrap.py
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <LNBBLJKPBEHFEDALKOLCEEEOPLAA.tim.one@comcast.net> <20020607213947.GB21836@gerg.ca>
Message-ID: <3D012CD9.89D40523@prescod.net>

Greg Ward wrote:
> 
> On 07 June 2002, Tim Peters said:
> > Greg seems to want to do it via setting vrbls on subclasses.  I couldn't
> > care less how it's done, so long as I have some way to wrap for readability
> > in a fixed-width font.
> 
> No subclass required -- just an instance:
> 
>   wrapper = TextWrapper()
>   wrapper.fix_sentence_endings = 0
>   wrapper.wrap(...)
> 
> Not sure if "fix_sentence_endings" is the right spelling, but it'll do
> for now.

Why three statements instead of one expression?

textwrap.wrap_my_text(text, fix_sentence_endings = 0)

If you want to do class-y stuff internally, then go ahead. But wrapping
text is a stateless mathematical function with a domain and range. I'd
prefer function syntax.

 Paul Prescod



From gward@python.net  Fri Jun  7 23:06:40 2002
From: gward@python.net (Greg Ward)
Date: Fri, 7 Jun 2002 18:06:40 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <3D012CD9.89D40523@prescod.net>
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <LNBBLJKPBEHFEDALKOLCEEEOPLAA.tim.one@comcast.net> <20020607213947.GB21836@gerg.ca> <3D012CD9.89D40523@prescod.net>
Message-ID: <20020607220640.GA21975@gerg.ca>

On 07 June 2002, Paul Prescod said:
> Why three statements instead of one expression?
> 
> textwrap.wrap_my_text(text, fix_sentence_endings = 0)
> 
> If you want to do class-y stuff internally, then go ahead. But wrapping
> text is a stateless mathematical function with a domain and range. I'd
> prefer function syntax.

Yeah, me too.  But there are an unbounded number of possible options
that people might insist on, and making these options instance
attributes seems vaguely friendly to subclasses to me.  These are both
wild, unproven allegations, of course.  Patches welcome.

        Greg
-- 
Greg Ward - geek                                        gward@python.net
http://starship.python.net/~gward/
I just read that 50% of the population has below median IQ!



From paul@prescod.net  Sat Jun  8 02:24:40 2002
From: paul@prescod.net (Paul Prescod)
Date: Fri, 07 Jun 2002 18:24:40 -0700
Subject: [Python-Dev] textwrap.py
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <LNBBLJKPBEHFEDALKOLCEEEOPLAA.tim.one@comcast.net> <20020607213947.GB21836@gerg.ca> <3D012CD9.89D40523@prescod.net> <20020607220640.GA21975@gerg.ca>
Message-ID: <3D015CD8.DE7C4BC2@prescod.net>

Greg Ward wrote:
> 
>...
> 
> Yeah, me too.  But there are an unbounded number of possible options
> that people might insist on, and making these options instance
> attributes seems vaguely friendly to subclasses to me. 

I don't follow. If I want a subclass then I need to instantiate it
somehow. When I do, I'll call its constructor. I'll pass its constructor
the keyword arguments that the subclass expects.

 Paul Prescod



From aahz@pythoncraft.com  Sat Jun  8 02:40:33 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 7 Jun 2002 21:40:33 -0400
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net>
References: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net> <20020607002623.A20029@eecs.tufts.edu> <200206071629.g57GTe725941@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020608014033.GA9625@panix.com>

On Fri, Jun 07, 2002, Guido van Rossum wrote:
>
> - The original timeout socket code (in Python) by Tim O'Malley had a
>   global timeout which you could set so that *all* sockets
>   *automatically* had their timeout set.  This is nice if you want it
>   to affect library modules like urllib or ftplib.  That feature is
>   currently missing.  Should we add it?  (I have some concerns about
>   it, in that it might break other code -- and it doesn't seem wise to
>   do this for server-end sockets in general.  But it's a nice hack for
>   smaller programs.)

+1
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I had lots of reasonable theories about children myself, until I
had some."  --Michael Rios



From martin@v.loewis.de  Sat Jun  8 06:49:34 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 08 Jun 2002 07:49:34 +0200
Subject: [Python-Dev] Changing ob_size to [s]size_t
In-Reply-To: <3D00CD79.6050409@lemburg.com>
References: <NEBBIJKBMLDBLNCEEFOCIEJCCNAA.perry@stsci.edu>
 <200206071458.g57Ew2517792@pcp02138704pcs.reston01.va.comcast.net>
 <3D00CD79.6050409@lemburg.com>
Message-ID: <m3ptz2b9qp.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> What binary compatibility ? 

The binary compatibility of extension modules across Python
releases. That is not available on Windows, but it is available on
Unix.

Regards,
Martin



From mgilfix@eecs.tufts.edu  Sat Jun  8 22:26:28 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Sat, 8 Jun 2002 17:26:28 -0400
Subject: [Python-Dev] unittest and sockets. Ugh!?
Message-ID: <20020608172627.D9486@eecs.tufts.edu>

  Could someone please offer some advice/comments on this?  While
restructuring test_socket.py, I've come across the following
(limitation?) problem with the unittest module. To test socket stuff,
I really need to use two separate processes or threads.  Since
fork() is much better supported, here's my attempt at porting the
_fileobject tests into unittest. I really like the structure of the
test (hopefully you guys agree) and this is how I'd like to lay it out
ideally for testing things like accept/connect, etc., using the various
layers of my inheritance hierarchy.

  However, this test won't work because I get an error binding to the
socket in the setUp function because the socket is already taken. Is
this because unittest dispatches the tests at roughly the same
time? I'm not quite sure why this is failing in sequence (perhaps I'm
missing something). In addition, I thought that the setUp/tearDown
functions were shared between all tests within a class, not called
for each test, but this does not seem to be true. If I want the setup to be
shared between all tests, do I have to override __init__? Another
issue is that unittest doesn't seem to like that I've forked. It
considers it to be the equivalent of two tests. Perhaps I shouldn't
care, provided that they all pass anyway.

  Any other stuff that I've seen that uses forking/threading doesn't
seem to use the unittest style framework. Perhaps I shouldn't be using
this and should just write outside of it? That would be a shame since
I like many of the features of the framework but some seem limiting.

                    -- Mike

=======================================================================

#!/usr/bin/env python

import unittest
import test_support

import socket
import os
import time

PORT = 50007
HOST = 'localhost'

class SocketTest(unittest.TestCase):

    def setUp(self):
        canfork = hasattr (os, 'fork')
        if not canfork:
            raise test_support.TestSkipped, \
                  "Platform does not support forking."

        # Use this to figure out who we are in the tests
        self.parent = os.fork()

        if self.parent:
            self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            self.s.bind((HOST, PORT))
            self.s.listen(1)
        else:
            self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        time.sleep(1) # So we can catch up

    def tearDown(self):
        self.s.close ()
        self.s = None

class SocketConnectedTest(SocketTest):

    SYNCH_MSG = 'Michael Gilfix was here'

    def setUp(self):
        SocketTest.setUp(self)
        if self.parent:
            conn, addr = self.s.accept()
            self.conn = conn
        else:
            self.s.connect((HOST, PORT))
            self.conn = self.s

    def tearDown(self):
        if self.parent:
            self.conn.close()
        self.conn = None
        SocketTest.tearDown(self)

    def synchronize(self):
        time.sleep(1)
        if self.parent:
            msg = self.conn.recv(len(self.SYNCH_MSG))
            self.assertEqual(msg, self.SYNCH_MSG, "Parent synchronization error")
            self.conn.send(msg)
        else:
            self.conn.send(self.SYNCH_MSG)
            msg = self.conn.recv(len(self.SYNCH_MSG))
            self.assertEqual(msg, self.SYNCH_MSG, "Child synchronization error")
        time.sleep(1)

class FileObjectClassTestCase(SocketConnectedTest):

    def setUp(self):
        SocketConnectedTest.setUp(self)
        # Create a file object for both the parent/client processes
        self.f = socket._fileobject(self.conn, 'rb', 8192)

    def tearDown(self):
        self.f.close()
        SocketConnectedTest.tearDown(self)

    def testSmallRead(self):
        """Performing small read test."""
        if self.parent:
            first_seg = self.f.read(7)
            second_seg = self.f.read(25)
            msg = ''.join((first_seg, second_seg))
            self.assertEqual(msg, self.SYNCH_MSG, "Error performing small read.")
        else:
            self.f.write(self.SYNCH_MSG)
            self.f.flush()

    def testUnbufferedRead(self):
        """Performing unbuffered read test."""
        if self.parent:
            buf = ''
            while 1:
                char = self.f.read(1)
                self.failIf(not char, "Error performing unbuffered read.")
                buf += char
                if buf == self.SYNCH_MSG:
                    break
        else:
            self.f.write(self.SYNCH_MSG)
            self.f.flush()

    def testReadline(self):
        """Performing readline test."""
        if self.parent:
            line = self.f.readline()
            self.assertEqual(line, self.SYNCH_MSG, "Error performing readline.")
        else:
            self.f.write(self.SYNCH_MSG)
            self.f.flush()

def suite():
    suite = unittest.TestSuite()
    suite.addTest(unittest.makeSuite(FileObjectClassTestCase))
    return suite

if __name__ == '__main__':
    unittest.main()

--
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From gward@python.net  Sun Jun  9 01:27:22 2002
From: gward@python.net (Greg Ward)
Date: Sat, 8 Jun 2002 20:27:22 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <3D015CD8.DE7C4BC2@prescod.net>
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <LNBBLJKPBEHFEDALKOLCEEEOPLAA.tim.one@comcast.net> <20020607213947.GB21836@gerg.ca> <3D012CD9.89D40523@prescod.net> <20020607220640.GA21975@gerg.ca> <3D015CD8.DE7C4BC2@prescod.net>
Message-ID: <20020609002722.GA3750@gerg.ca>

On 07 June 2002, Paul Prescod said:
> Greg Ward wrote:
> > 
> >...
> > 
> > Yeah, me too.  But there are an unbounded number of possible options
> > that people might insist on, and making these options instance
> > attributes seems vaguely friendly to subclasses to me. 
> 
> I don't follow. If I want a subclass then I need to instantiate it
> somehow. When I do, I'll call its constructor. I'll pass its constructor
> the keyword arguments that the subclass expects.

Umm, ignore my original argument.  I don't understand what I was talking
about, and I understand your rebuttal even less.  Let's accept the fact
that we're not communicating and drop it.

However, I *still* don't want to make all of TextWrapper's options
keyword arguments to the wrap() method, because 1) I'd be morally bound
to make them kwargs to the fill() method, and to the standalone wrap()
and fill() functions as well, which is a PITA; and 2) I think it's
useful to be able to encode your preferences in an object for multiple
wrapping jobs.

Compromise: the TextWrapper constructor now looks like this:

    def __init__ (self,
                  expand_tabs=True,
                  replace_whitespace=True,
                  fix_sentence_endings=False,
                  break_long_words=True):
        self.expand_tabs = expand_tabs
        self.replace_whitespace = replace_whitespace
        self.fix_sentence_endings = fix_sentence_endings
        self.break_long_words = break_long_words

Good enough?  I'm happy with it.
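
A quick hedged usage sketch of that constructor (the wrap(text, width)
signature is assumed from the surrounding discussion, and the import
assumes the module ends up importable as textwrap):

    from textwrap import TextWrapper

    wrapper = TextWrapper(fix_sentence_endings=True, break_long_words=False)
    text = "The quick brown fox jumps over the lazy dog.  It was not amused.  " * 3
    for line in wrapper.wrap(text, 40):
        print line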

        Greg
-- 
Greg Ward - Unix geek                                   gward@python.net
http://starship.python.net/~gward/
Question authority!



From tim.one@comcast.net  Sun Jun  9 02:00:40 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 08 Jun 2002 21:00:40 -0400
Subject: [Python-Dev] unittest and sockets. Ugh!?
In-Reply-To: <20020608172627.D9486@eecs.tufts.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIEPLAA.tim.one@comcast.net>

[Michael Gilfix]
>   Could someone please offer some advice/comments on this?  While
> restructuring test_socket.py, I've come across the following
> (limitation?) problem with the unittest module. To test socket stuff,
> I really need to use two separate processes or threads.  Since
> fork() is much better supported,

Better supported than what?  Threads?  No way.  If you use fork(), the test
won't run at all except on Unixish systems.  If you use threads, it will run
just about everywhere.  Use threads.

Alas, I have no idea what unittest does in the presence of fork or threads,
and no desire to learn <wink>.

> ...
>   Any other stuff that I've seen that uses forking/threading doesn't
> seem to use the unittest style framework.

The existing fork and thread tests almost all long predate the invention of
unittest.  Frankly, I find that the layers of classes in elaborate unittests
ensure I almost always spend more time trying to understand what a failing
unittest *thinks* it's trying to do, and fixing what turn out to be bad
assumptions, than in fixing actual bugs in the stuff it's supposed to be
testing.  Combining that artificial complexity with the inherent complexity
of multiple processes or threads is something I instinctively shy away from.

My coworkers do not, and PythonLabs has done several projects now at Zope
Corp that try to mix unittest with multiple threads and processes in the
*app* being tested.  Even that much is a never-ending nightmare.  Then
again, I feel this more acutely than them because the tests always fail on
Windows -- or any other platform where the timing is 1% different <wink>.

no-easy-answers-here-ly y'rs  - tim




From mgilfix@eecs.tufts.edu  Sun Jun  9 02:43:41 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Sat, 8 Jun 2002 21:43:41 -0400
Subject: [Python-Dev] unittest and sockets. Ugh!?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEIEPLAA.tim.one@comcast.net>; from tim.one@comcast.net on Sat, Jun 08, 2002 at 09:00:40PM -0400
References: <20020608172627.D9486@eecs.tufts.edu> <LNBBLJKPBEHFEDALKOLCEEIEPLAA.tim.one@comcast.net>
Message-ID: <20020608214341.G9486@eecs.tufts.edu>

On Sat, Jun 08 @ 21:00, Tim Peters wrote:
> Better supported than what?  Threads?  No way.  If you use fork(), the test
> won't run at all except on Unixish systems.  If you use threads, it will run
> just about everywhere.  Use threads.

  Will do. I would much rather have used threads to begin with, in fact.
I just assumed that the reason the socket module used fork to begin with
is that it was considered more portable. Well, you know that thing about
how assuming makes an ass-out-of-u-and-me.

> Alas, I have no idea what unittest does in the presence of fork or threads,
> and no desire to learn <wink>.

  I'll just change it to threads happily and find out :)

> > ...
> >   Any other stuff that I've seen that uses forking/threading doesn't
> > seem to use the unittest style framework.
> 
> The existing fork and thread tests almost all long predate the invention of
> unittest.  Frankly, I find that the layers of classes in elaborate unittests
> ensure I almost always spend more time trying to understand what a failing
> unittest *thinks* it's trying to do, and fixing what turn out to be bad
> assumptions, than in fixing actual bugs in the stuff it's supposed to be
> testing.  Combining that artificial complexity with the inherent complexity
> of multiple processes or threads is something I instinctively shy away from.

  I would agree in some respects. When I first started looking
at unittest, I thought it seemed more complicated than it was
worth. Indeed, I'm sure the advanced features are. I don't find the
documentation to be very good at describing just what I needed to get
going - at least not up to par with, for example, the xml.minidom
documentation, which gets you going in 5 minutes.  I just haven't
made up my mind yet about what's bugging me and maybe I'll have more
insight after the process.

  However, after trying it a bit, I've decided that I really like the
format/layout and it's quite convenient. I'm just not sure what it can
and can't do yet.

> My coworkers do not, and PythonLabs has done several projects now at Zope
> Corp that try to mix unittest with multiple threads and processes in the
> *app* being tested.  Even that much is a never-ending nightmare.  Then
> again, I feel this more acutely than them because the tests always fail on
> Windows -- or any other platform where the timing is 1% different <wink>.

  Well, it's easier to envision with sockets where the timing issues
are easier to sort out. But well-written tests are a blessing, and the
more I look at the Python regression suite, the more I realize that
they are lacking <grin>.

> no-easy-answers-here-ly y'rs  - tim

  agreeingly-and-I-hope-I-don't-pick-up-this-habit-ly y'rs

                         -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From guido@python.org  Sun Jun  9 03:10:44 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 08 Jun 2002 22:10:44 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: Your message of "Fri, 07 Jun 2002 12:31:44 EDT."
Message-ID: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net>

Offline, Jeremy, Neil, Tim & I figured out what was really the case
here.  I believe the last thing we reported here was that we'd found a
sample program that required several gc.collect() calls to clean out
all its garbage, which surprised Neil and Tim.  Since they know the
collector inside out, if it surprises them, there's probably a bug.
Here's the analysis.

A new-style instance increfs its class (its ob_type), to make sure
that the class doesn't go away as long as the instance exists.  But
the tp_traverse handler for new-style instances didn't visit the
ob_type reference, so the collector thinks there are "outside"
references to the class, and doesn't collect it.  When the last
instance goes away, the refcnt to the class finally matches what the
collector sees, and it can collect the class (once all other
references to it are gone of course).

I was tempted to say "this ain't so bad, let's keep it that way."  But
Jeremy and Tim came up with counterexamples involving a cycle between
a class and its instance, where the cycle would never be broken.  Tim
found that this program grows without bounds, though ever slower,
since it spends more and more time in the 2nd generation collector,
where all the uncollected objects eventually end up:

  while 1:
      class A(object): pass
      A.a = A()

I tried the simplest possible fix, which was to visit self->ob_type in
the new-style instance tp_traverse handler (subtype_traverse() in
typeobject.c).  But this caused assertions to fail all over the place.
It turns out that when the collector decides to break a cycle like
this, it calls the tp_clear handler for each object in the cycle, and
then the subsequent deletion of the instance references the type in
ways that have been made invalid by the clearing of the type.  So this
was a dead end.

So here's a patch that does do the right thing.  It adds a new field
to type objects, tp_dependents.  This is essentially a second
reference count, counting the instances and direct subclasses.  As
long as tp_dependents is nonzero, this means there are still instances
or subclasses, and then the type's tp_clear handler doesn't do
anything.

A consequence of the patch is that there will always be examples that
take more than one collection to clean their garbage -- but eventually
all garbage will be cleared out.  (I suppose a worst-case example
would be a very long chain of subclasses, which would be cleared out
once class per collection.)  Consequently, I'm patching test_gc.py to
get rid of garbage left behind by previous tests in a loop.

A downside of the patch is that it adds a new field to the type object
structure; I believe this prevents it from being a backport candidate
to 2.2.2.  For half a blissful hour I believed that it would be
possible to do something much simpler by doubling the regular refcnt;
then I realized that the double refcnt would mean the collector would
never break the cycle.  Still, I am wishing for a solution that avoids
adding a new field.

Please comment on this patch!

Index: Include/object.h
===================================================================
RCS file: /cvsroot/python/python/dist/src/Include/object.h,v
retrieving revision 2.101
diff -c -r2.101 object.h
*** Include/object.h	12 Apr 2002 01:57:06 -0000	2.101
--- Include/object.h	8 Jun 2002 13:09:14 -0000
***************
*** 292,297 ****
--- 292,298 ----
  	PyObject *tp_cache;
  	PyObject *tp_subclasses;
  	PyObject *tp_weaklist;
+ 	int tp_dependents;
  
  #ifdef COUNT_ALLOCS
  	/* these must be last and never explicitly initialized */
Index: Objects/typeobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/typeobject.c,v
retrieving revision 2.148
diff -c -r2.148 typeobject.c
*** Objects/typeobject.c	4 Jun 2002 19:52:53 -0000	2.148
--- Objects/typeobject.c	8 Jun 2002 13:09:15 -0000
***************
*** 218,225 ****
  
  	memset(obj, '\0', size);
  
! 	if (type->tp_flags & Py_TPFLAGS_HEAPTYPE)
  		Py_INCREF(type);
  
  	if (type->tp_itemsize == 0)
  		PyObject_INIT(obj, type);
--- 218,227 ----
  
  	memset(obj, '\0', size);
  
! 	if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) {
  		Py_INCREF(type);
+ 		type->tp_dependents++;
+ 	}
  
  	if (type->tp_itemsize == 0)
  		PyObject_INIT(obj, type);
***************
*** 290,295 ****
--- 292,303 ----
  		}
  	}
  
+ 	if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) {
+ 		int err = visit((PyObject *)type, arg);
+ 		if (err)
+ 			return err;
+ 	}
+ 
  	if (basetraverse)
  		return basetraverse(self, visit, arg);
  	return 0;
***************
*** 464,469 ****
--- 472,478 ----
  	/* Can't reference self beyond this point */
  	if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) {
  		Py_DECREF(type);
+ 		type->tp_dependents--;
  	}
  }
  
***************
*** 1170,1175 ****
--- 1179,1192 ----
  	Py_INCREF(base);
  	type->tp_base = base;
  
+ 	/* Incref the bases' tp_dependents count */
+ 	for (i = 0; i < nbases; i++) {
+ 		PyTypeObject *b;
+ 		b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i);
+ 		if (PyType_Check(b) && (b->tp_flags & Py_TPFLAGS_HEAPTYPE))
+ 			b->tp_dependents++;
+ 	}
+ 
  	/* Initialize tp_dict from passed-in dict */
  	type->tp_dict = dict = PyDict_Copy(dict);
  	if (dict == NULL) {
***************
*** 1431,1441 ****
--- 1448,1475 ----
  static void
  type_dealloc(PyTypeObject *type)
  {
+ 	PyObject *bases;
  	etype *et;
  
  	/* Assert this is a heap-allocated type object */
  	assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE);
  	_PyObject_GC_UNTRACK(type);
+ 
+ 	/* Decref the bases' tp_dependents count */
+ 	bases = type->tp_bases;
+ 	if (bases) {
+ 		int i, nbases;
+ 		assert(PyTuple_Check(bases));
+ 		nbases = PyTuple_GET_SIZE(bases);
+ 		for (i = 0; i < nbases; i++) {
+ 			PyTypeObject *b;
+ 			b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i);
+ 			if (PyType_Check(b) &&
+ 			    (b->tp_flags & Py_TPFLAGS_HEAPTYPE))
+ 				b->tp_dependents--;
+ 		}
+ 	}
+ 
  	PyObject_ClearWeakRefs((PyObject *)type);
  	et = (etype *)type;
  	Py_XDECREF(type->tp_base);
***************
*** 1495,1502 ****
  	etype *et;
  	int err;
  
! 	if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE))
! 		return 0;
  
  	et = (etype *)type;
  
--- 1529,1535 ----
  	etype *et;
  	int err;
  
! 	assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE);
  
  	et = (etype *)type;
  
***************
*** 1524,1533 ****
  type_clear(PyTypeObject *type)
  {
  	etype *et;
! 	PyObject *tmp;
  
! 	if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE))
! 		return 0;
  
  	et = (etype *)type;
  
--- 1557,1583 ----
  type_clear(PyTypeObject *type)
  {
  	etype *et;
! 	PyObject *tmp, *bases;
  
! 	assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE);
! 
! 	if (type->tp_dependents)
! 		return 0; /* Not yet, there are still instances */
! 
! 	/* Decref the bases' tp_dependents count */
! 	bases = type->tp_bases;
! 	if (bases) {
! 		int i, nbases;
! 		assert(PyTuple_Check(bases));
! 		nbases = PyTuple_GET_SIZE(bases);
! 		for (i = 0; i < nbases; i++) {
! 			PyTypeObject *b;
! 			b = (PyTypeObject *)PyTuple_GET_ITEM(bases, i);
! 			if (PyType_Check(b) &&
! 			    (b->tp_flags & Py_TPFLAGS_HEAPTYPE))
! 				b->tp_dependents--;
! 		}
! 	}
  
  	et = (etype *)type;
  
***************
*** 1754,1763 ****
--- 1804,1815 ----
  	}
  	if (new->tp_flags & Py_TPFLAGS_HEAPTYPE) {
  		Py_INCREF(new);
+ 		new->tp_dependents++;
  	}
  	self->ob_type = new;
  	if (old->tp_flags & Py_TPFLAGS_HEAPTYPE) {
  		Py_DECREF(old);
+ 		old->tp_dependents--;
  	}
  	return 0;
  }
Index: Lib/test/test_gc.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/test_gc.py,v
retrieving revision 1.14
diff -c -r1.14 test_gc.py
*** Lib/test/test_gc.py	28 Mar 2002 21:22:25 -0000	1.14
--- Lib/test/test_gc.py	8 Jun 2002 13:09:15 -0000
***************
*** 220,228 ****
  def test():
      if verbose:
          print "disabling automatic collection"
      enabled = gc.isenabled()
      gc.disable()
!     verify(not gc.isenabled() )
      debug = gc.get_debug()
      gc.set_debug(debug & ~gc.DEBUG_LEAK) # this test is supposed to leak
  
--- 220,229 ----
  def test():
      if verbose:
          print "disabling automatic collection"
+     while gc.collect(): pass # collect garbage from previous tests
      enabled = gc.isenabled()
      gc.disable()
!     verify(not gc.isenabled())
      debug = gc.get_debug()
      gc.set_debug(debug & ~gc.DEBUG_LEAK) # this test is supposed to leak
  


--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sun Jun  9 05:17:11 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 09 Jun 2002 00:17:11 -0400
Subject: [Python-Dev] unittest and sockets. Ugh!?
In-Reply-To: Your message of "Sat, 08 Jun 2002 17:26:28 EDT."
 <20020608172627.D9486@eecs.tufts.edu>
References: <20020608172627.D9486@eecs.tufts.edu>
Message-ID: <200206090417.g594HBk03827@pcp02138704pcs.reston01.va.comcast.net>

>   Could someone please offer some advice/comments on this?  While
> restructuring test_socket.py, I've come across the following
> (limitation?) problem with the unittest module. To test socket stuff,
> I really need to use two separate processes or threads.  Since
> fork() is much better supported, here's my attempt at porting the
> _fileobject tests into unittest. I really like the structure of the
> test (hopefully you guys agree) and this is how I'd like to lay it out
> ideally for testing things like accept/connect, etc, using the various
> layers of my inhertiance hierarchy.

Please don't use fork -- that isn't supported on Windows.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Sun Jun  9 08:38:45 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 09 Jun 2002 09:38:45 +0200
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net>
References: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3bsaksxyy.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> I tried the simplest possible fix, which was to visit self->ob_type in
> the new-style instance tp_traverse handler (subtype_traverse() in
> typeobject.c).  But this caused assertions to fail all over the place.
> It turns out that when the collector decides to break a cycle like
> this, it calls the tp_clear handler for each object in the cycle, and
> then the subsequent deletion of the instance references the type in
> ways that have been made invalid by the clearing of the type.  So this
> was a dead end.

I'd like to question this statement. It ought to be possible, IMO, to
dealloc an instance whose type has been cleared.

The problem appears to be in the tp_clear. The task of tp_clear is to
clear all references that may participate in cycles (*not* to clear
all references per se). Now, if type_clear would clear tp_dict,
tp_subclasses, and et->slots, but leave alone tp_base, tp_bases, and
tp_mro, the type would still be "good enough" for subtype_dealloc, no?

Regards,
Martin



From bernie@3captus.com  Sun Jun  9 09:44:04 2002
From: bernie@3captus.com (Bernard Yue)
Date: Sun, 09 Jun 2002 02:44:04 -0600
Subject: [Python-Dev] Subclassing threading.Thread
Message-ID: <3D031553.D9E263C8@3captus.com>


Attached files test1.py and test2.py produce different results.  Is it a
bug?


Bernie
test1.py:

#!/usr/bin/env python

import socket
import threading


class server:
    def __init__(self):
        self._addr_local  = ('127.0.0.1', 25339)
        self._s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._s.bind(self._addr_local)


def test():
    for i in range(10):
        a = threading.Thread( target=server)
        a.start()
        a.run()
        a.join()

if __name__ == '__main__':
    test()


test2.py:

#!/usr/bin/env python

import socket
import threading


class server(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.__addr_local  = ('127.0.0.1', 25339)
        self.__s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.__s.bind(self.__addr_local)

def test():
    for i in range(10):
        a = server()
        a.start()
        a.run()
        a.join()

if __name__ == '__main__':
    test()






From martin@v.loewis.de  Sun Jun  9 10:20:18 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 09 Jun 2002 11:20:18 +0200
Subject: [Python-Dev] Subclassing threading.Thread
In-Reply-To: <3D031553.D9E263C8@3captus.com>
References: <3D031553.D9E263C8@3captus.com>
Message-ID: <m3fzzwrep9.fsf@mira.informatik.hu-berlin.de>

Bernard Yue <bernie@3captus.com> writes:

> Attached files test1.py and test2.py produce different result.  Is it a
> bug?

No.

Martin



From paul@prescod.net  Sun Jun  9 19:19:47 2002
From: paul@prescod.net (Paul Prescod)
Date: Sun, 09 Jun 2002 11:19:47 -0700
Subject: [Python-Dev] textwrap.py
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <LNBBLJKPBEHFEDALKOLCEEEOPLAA.tim.one@comcast.net> <20020607213947.GB21836@gerg.ca> <3D012CD9.89D40523@prescod.net> <20020607220640.GA21975@gerg.ca> <3D015CD8.DE7C4BC2@prescod.net> <20020609002722.GA3750@gerg.ca>
Message-ID: <3D039C43.786EE3D8@prescod.net>

Greg Ward wrote:
> 
>...
> 
> However, I *still* don't want to make all of TextWrapper's options
> keyword arguments to the wrap() method, because 1) I'd be morally bound
> to make them kwargs to the fill() method, and to the standalone wrap()
> and fill() functions as well, which is a PITA; and 2) I think it's
> useful to be able to encode your preferences in an object for multiple
> wrapping jobs.

I buy the second argument but not the first. I'm 95% happy with what you
propose and suggest only a tiny change.

If "expand_tabs" or "replace_whitespace" or "break_long_words" are
options that people will want to specify when doing text wrapping then
why *wouldn't* they be arguments to the wrap and fill functions? The
object is useful for when you want to keep those options persistently.
But it seems clear that you would want to pass the same argument to the
function versions.

Here's my proposed (untested) fix:

def wrap (text, width, **kwargs):
    return TextWrapper(**kwargs).wrap(text, width)

def fill (text, width, initial_tab="", subsequent_tab="", **kwargs):
    return TextWrapper(**kwargs).fill(text, width, initial_tab,
                                      subsequent_tab)

I'm not clear on why the "width" argument is special and should be on
the wrap method rather than in the constructor. But I suspect most
people will use the convenience functions so they'll never know the
difference.

 Paul Prescod



From s_lott@yahoo.com  Sun Jun  9 19:44:25 2002
From: s_lott@yahoo.com (Steven Lott)
Date: Sun, 9 Jun 2002 11:44:25 -0700 (PDT)
Subject: [Python-Dev] textwrap.py
In-Reply-To: <3D039C43.786EE3D8@prescod.net>
Message-ID: <20020609184425.18907.qmail@web9601.mail.yahoo.com>


Here's a version with the Strategy classes included.  This
allows for essentially unlimited alternatives on the subjects of
long words, full stops, and also permits right justification.

This is my preference for resolving "creeping featuritis".  Any
new feature can be implemented as yet another strategy plug-in.

Note that I am AR about superclasses.  Python does not require
this level of fussiness, but I find that when I leave them out,
I always wish I had them as a place to factor out common
functions.



=====
--
S. Lott, CCP :-{)
S_LOTT@YAHOO.COM
http://www.mindspring.com/~slott1
Buccaneer #468: KaDiMa

Macintosh user: drinking upstream from the herd.

--0-1592618721-1023648265=:17605
Content-Type: application/octet-stream; name="textwrap.py"
Content-Transfer-Encoding: base64
Content-Description: textwrap.py
Content-Disposition: attachment; filename="textwrap.py"

[base64 attachment body omitted.  The decoded textwrap.py defines a
TextWrapper class whose wrap() and fill() behaviour is customized by
plug-in strategy classes -- TextPreprocess (ExpandTabs, CleanWhitespace),
ChunkPreprocess (FullStopTwoSpace, FullStopOneSpace), LongWords
(BreakLongWords, KeepLongWords) and LinePostprocess -- plus module-level
wrap() and fill() convenience functions.]

--0-1592618721-1023648265=:17605
Content-Type: application/octet-stream; name="test_textwrap.py"
Content-Transfer-Encoding: base64
Content-Description: test_textwrap.py
Content-Disposition: attachment; filename="test_textwrap.py"

[base64 attachment body omitted.  The decoded test_textwrap.py exercises
wrapping at several widths, whitespace munging and sentence-ending
detection, re-wrapping of short lines, breaking of hyphenated words, the
_split() method, and the KeepLongWords strategy.]

--0-1592618721-1023648265=:17605--



From niemeyer@conectiva.com  Sun Jun  9 20:19:18 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sun, 9 Jun 2002 16:19:18 -0300
Subject: [Python-Dev] os.stat(filename).r_dev
Message-ID: <20020609161918.A17718@ibook.distro.conectiva>

While talking to Lars about tarfile.py, he has noted an interesting
detail in the current implementation of os.stat(filename).r_dev:

-------------
With the current os.stat(), it is impossible to implement adding device
files to an archive. That's because the result for st_rdev is just a plain
integer, which still must be divided into the major and minor part.
This division (resp. the C type dev_t) differs between several
operating systems:

OS        format    major          minor

Linux     32-bit    upper 16bits   lower 16bits
SVR4      32-bit    upper 14bits   lower 18bits
BSD       16-bit    upper 8bits    lower 8bits
-------------

It seems like we really need some way to decode r_dev. One possible
solutions are to implement major(), minor(), and makedev() somewhere.
Another solution, if r_dev's raw value has no obvious use, would be to
turn it into a two elements tuple like (major, minor).

Any suggestions?
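
For concreteness, a helper along these lines is the sort of thing I mean
(a sketch only, hard-wiring the Linux-style 16/16-bit split from the table
above; the whole problem is that the real split is platform dependent):

def major(dev):
    # Assumes the 32-bit dev_t layout with the major number in the
    # upper 16 bits -- illustrative only.
    return (dev >> 16) & 0xFFFF

def minor(dev):
    return dev & 0xFFFF

def makedev(maj, mnr):
    return ((maj & 0xFFFF) << 16) | (mnr & 0xFFFF)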

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From guido@python.org  Sun Jun  9 20:58:52 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 09 Jun 2002 15:58:52 -0400
Subject: [Python-Dev] os.stat(filename).r_dev
In-Reply-To: Your message of "Sun, 09 Jun 2002 16:19:18 -0300."
 <20020609161918.A17718@ibook.distro.conectiva>
References: <20020609161918.A17718@ibook.distro.conectiva>
Message-ID: <200206091958.g59JwqH15597@pcp02138704pcs.reston01.va.comcast.net>

> Any suggestions?

Submit a patch. ;-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sun Jun  9 21:02:20 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 09 Jun 2002 16:02:20 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: Your message of "09 Jun 2002 09:38:45 +0200."
 <m3bsaksxyy.fsf@mira.informatik.hu-berlin.de>
References: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net>
 <m3bsaksxyy.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206092002.g59K2Kt15647@pcp02138704pcs.reston01.va.comcast.net>

[Guido]
> > I tried the simplest possible fix, which was to visit self->ob_type in
> > the new-style instance tp_traverse handler (subtype_traverse() in
> > typeobject.c).  But this caused assertions to fail all over the place.
> > It turns out that when the collector decides to break a cycle like
> > this, it calls the tp_clear handler for each object in the cycle, and
> > then the subsequent deletion of the instance references the type in
> > ways that have been made invalid by the clearing of the type.  So this
> > was a dead end.

[Martin]
> I'd like to question this statement. It ought to be possible, IMO, to
> dealloc an instance whose type has been cleared.
> 
> The problem appears to be in the tp_clear. The task of tp_clear is to
> clear all references that may participate in cycles (*not* to clear
> all references per se). Now, if type_clear would clear tp_dict,
> tp_subclasses, and et->slots, but leave alone tp_base, tp_bases, and
> tp_mro, the type would still be "good enough" for subtype_dealloc, no?

Alas, I don't think so.

When tp_dict is cleared, this can remove the __del__ method before it
can be called (it is called by the instance's tp_dealloc).  But
tp_dict has to be cleared, because it can participate in cycles
(e.g. you could do A.A = A).

tp_mro participates in a cycle too: it is a tuple whose first element
is the type itself.  Tuples are immutable, so the tp_clear for tuples
doesn't do anything.  So type_clear is our only hope to break this
cycle.
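
A tiny illustration of the two cycles in question (just a throwaway class,
to make the C-level discussion concrete):

class A(object):
    pass

A.A = A                     # A's tp_dict now refers back to A
assert A.__mro__[0] is A    # and A's tp_mro tuple starts with A itself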

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Sun Jun  9 21:46:26 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 09 Jun 2002 22:46:26 +0200
Subject: [Python-Dev] os.stat(filename).r_dev
In-Reply-To: <20020609161918.A17718@ibook.distro.conectiva>
References: <20020609161918.A17718@ibook.distro.conectiva>
Message-ID: <m3sn3w9o4d.fsf@mira.informatik.hu-berlin.de>

Gustavo Niemeyer <niemeyer@conectiva.com> writes:

> It seems like we really need some way to decode r_dev. One possible
> solutions are to implement major(), minor(), and makedev() somewhere.
> Another solution, if r_dev's raw value has no obvious use, would be to
> turn it into a two elements tuple like (major, minor).
> 
> Any suggestions?

I'd add a field r_dev_pair which splits this into major and minor. I
would not remove r_dev, since existing code may break.

Notice that major, minor, and makedev are already available through
TYPES on many platforms, although this has the known limitations, and
is probably wrong for Linux at the moment.

Regards,
Martin




From martin@v.loewis.de  Sun Jun  9 21:56:00 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 09 Jun 2002 22:56:00 +0200
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <200206092002.g59K2Kt15647@pcp02138704pcs.reston01.va.comcast.net>
References: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net>
 <m3bsaksxyy.fsf@mira.informatik.hu-berlin.de>
 <200206092002.g59K2Kt15647@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3ofek9nof.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> When tp_dict is cleared, this can remove the __del__ method before it
> can be called (it is called by the instance's tp_dealloc).  

That cannot happen: an object whose type has an __del__ cannot refer
to an object for which tp_clear has been called. Objects with
finalizers go into gc.garbage, so in this case, the type is
resurrected, and not cleared.
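
(That is the usual finalizer rule; a minimal sketch of it in Python, using
a made-up Leaky class:

import gc

class Leaky:                      # old-style class with a finalizer
    def __del__(self):
        pass

a = Leaky(); b = Leaky()
a.partner = b; b.partner = a      # the two instances form a cycle
del a, b
gc.collect()                      # the cycle is uncollectable...
print gc.garbage                  # ...so the instances end up parked here

The collector refuses to tear such a cycle down behind the finalizer's
back.)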

> tp_mro participates in a cycle too: it is a tuple whose first element
> is the type itself.  Tuples are immutable, so the tp_clear for tuples
> doesn't do anything.  So type_clear is our only hope to break this
> cycle.

I see. So tp_mro must be cleared in tp_clear; it's not used from
subtype_dealloc, so it won't cause problems to clear it.

Regards,
Martin




From niemeyer@conectiva.com  Sun Jun  9 22:17:11 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sun, 9 Jun 2002 18:17:11 -0300
Subject: [Python-Dev] os.stat(filename).r_dev
In-Reply-To: <m3sn3w9o4d.fsf@mira.informatik.hu-berlin.de>
References: <20020609161918.A17718@ibook.distro.conectiva> <m3sn3w9o4d.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020609181710.A18935@ibook.distro.conectiva>

Hi Martin!

First, some self-corrections.. :-)

> > It seems like we really need some way to decode r_dev. One possible
> > solutions are to implement major(), minor(), and makedev() somewhere.

"solution is"

> > Another solution, if r_dev's raw value has no obvious use, would be to

This should be st_rdev.

> > turn it into a two elements tuple like (major, minor).

> I'd add a field r_dev_pair which splits this into major and minor. I
> would not remove r_dev, since existing code may break.

Isn't st_rdev being made available only in 2.3, through stat attributes?

> Notice that major, minor, and makedev is already available through
> TYPES on many platforms, although this has the known limitations, and
> is probably wrong for Linux at the moment.

Indeed. Here's what's defined here:

def major(dev): return ((int)(((dev) >> 8) & 0xff))
def minor(dev): return ((int)((dev) & 0xff))
def major(dev): return (((dev).__val[1] >> 8) & 0xff)
def minor(dev): return ((dev).__val[1] & 0xff)
def major(dev): return (((dev).__val[0] >> 8) & 0xff)
def minor(dev): return ((dev).__val[0] & 0xff)

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From guido@python.org  Mon Jun 10 00:15:39 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 09 Jun 2002 19:15:39 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: Your message of "09 Jun 2002 22:56:00 +0200."
 <m3ofek9nof.fsf@mira.informatik.hu-berlin.de>
References: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net> <m3bsaksxyy.fsf@mira.informatik.hu-berlin.de> <200206092002.g59K2Kt15647@pcp02138704pcs.reston01.va.comcast.net>
 <m3ofek9nof.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206092315.g59NFdx20542@pcp02138704pcs.reston01.va.comcast.net>

> > When tp_dict is cleared, this can remove the __del__ method before it
> > can be called (it is called by the instance's tp_dealloc).  
> 
> That cannot happen: an object whose type has an __del__ cannot refer
> to an object for which tp_clear has been called. Objects with
> finalizers go into gc.garbage, so in this case, the type is
> resurrected, and not cleared.

You're right!

> > tp_mro participates in a cycle too: it is a tuple whose first element
> > is the type itself.  Tuples are immutable, so the tp_clear for tuples
> > doesn't do anything.  So type_clear is our only hope to break this
> > cycle.
> 
> I see. So tp_mro must be cleared in tp_clear; it's not used from
> subtype_dealloc, so it won't cause problems to clear it.

You've convinced me.  Here's a patch that only touches typeobject.c.
It doesn't add any fields, and it doesn't require multiple collections
to clear out cycles involving a class and its type.  I like it!

(Note: at the top of type_traverse() and type_clear(), there used to
be code saying "if not a heaptype, return".  That code was never
necessary, because the collector doesn't call the traverse or clear
hooks when tp_is_gc() returns false -- which it does when the heaptype
flag isn't set.  So I replaced these two with an assert that this is a
heaptype.)

Index: typeobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/typeobject.c,v
retrieving revision 2.148
diff -c -c -r2.148 typeobject.c
*** typeobject.c	4 Jun 2002 19:52:53 -0000	2.148
--- typeobject.c	9 Jun 2002 23:05:47 -0000
***************
*** 290,295 ****
--- 290,301 ----
  		}
  	}
  
+ 	if (type->tp_flags & Py_TPFLAGS_HEAPTYPE) {
+ 		int err = visit((PyObject *)type, arg);
+ 		if (err)
+ 			return err;
+ 	}
+ 
  	if (basetraverse)
  		return basetraverse(self, visit, arg);
  	return 0;
***************
*** 1323,1329 ****
  			return NULL;
  		}
  		mro = type->tp_mro;
! 		assert(mro != NULL);
  	}
  	assert(PyTuple_Check(mro));
  	n = PyTuple_GET_SIZE(mro);
--- 1329,1336 ----
  			return NULL;
  		}
  		mro = type->tp_mro;
! 		if (mro == NULL)
! 			return NULL;
  	}
  	assert(PyTuple_Check(mro));
  	n = PyTuple_GET_SIZE(mro);
***************
*** 1335,1341 ****
  			assert(PyType_Check(base));
  			dict = ((PyTypeObject *)base)->tp_dict;
  		}
! 		assert(dict && PyDict_Check(dict));
  		res = PyDict_GetItem(dict, name);
  		if (res != NULL)
  			return res;
--- 1342,1349 ----
  			assert(PyType_Check(base));
  			dict = ((PyTypeObject *)base)->tp_dict;
  		}
! 		if (dict == NULL || !PyDict_Check(dict))
! 			continue;
  		res = PyDict_GetItem(dict, name);
  		if (res != NULL)
  			return res;
***************
*** 1495,1502 ****
  	etype *et;
  	int err;
  
! 	if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE))
! 		return 0;
  
  	et = (etype *)type;
  
--- 1503,1509 ----
  	etype *et;
  	int err;
  
! 	assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE);
  
  	et = (etype *)type;
  
***************
*** 1512,1519 ****
  	VISIT(type->tp_mro);
  	VISIT(type->tp_bases);
  	VISIT(type->tp_base);
- 	VISIT(type->tp_subclasses);
- 	VISIT(et->slots);
  
  #undef VISIT
  
--- 1519,1524 ----
***************
*** 1526,1533 ****
  	etype *et;
  	PyObject *tmp;
  
! 	if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE))
! 		return 0;
  
  	et = (etype *)type;
  
--- 1531,1537 ----
  	etype *et;
  	PyObject *tmp;
  
! 	assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE);
  
  	et = (etype *)type;
  
***************
*** 1541,1555 ****
  	CLEAR(type->tp_dict);
  	CLEAR(type->tp_cache);
  	CLEAR(type->tp_mro);
- 	CLEAR(type->tp_bases);
- 	CLEAR(type->tp_base);
- 	CLEAR(type->tp_subclasses);
- 	CLEAR(et->slots);
- 
- 	if (type->tp_doc != NULL) {
- 		PyObject_FREE(type->tp_doc);
- 		type->tp_doc = NULL;
- 	}
  
  #undef CLEAR
  
--- 1545,1550 ----
***************
*** 2166,2175 ****
  	PyTypeObject *base;
  	int i, n;
  
! 	if (type->tp_flags & Py_TPFLAGS_READY) {
! 		assert(type->tp_dict != NULL);
  		return 0;
- 	}
  	assert((type->tp_flags & Py_TPFLAGS_READYING) == 0);
  
  	type->tp_flags |= Py_TPFLAGS_READYING;
--- 2161,2168 ----
  	PyTypeObject *base;
  	int i, n;
  
! 	if (type->tp_flags & Py_TPFLAGS_READY)
  		return 0;
  	assert((type->tp_flags & Py_TPFLAGS_READYING) == 0);
  
  	type->tp_flags |= Py_TPFLAGS_READYING;

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Mon Jun 10 06:52:18 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Jun 2002 07:52:18 +0200
Subject: [Python-Dev] os.stat(filename).r_dev
In-Reply-To: <20020609181710.A18935@ibook.distro.conectiva>
References: <20020609161918.A17718@ibook.distro.conectiva>
 <m3sn3w9o4d.fsf@mira.informatik.hu-berlin.de>
 <20020609181710.A18935@ibook.distro.conectiva>
Message-ID: <m3d6uzn0j1.fsf@mira.informatik.hu-berlin.de>

Gustavo Niemeyer <niemeyer@conectiva.com> writes:

> Isn't st_rdev being made available only in 2.3, trough stat attributes?

No, it was available in 2.2 already. So there is a backwards
compatibility issue.

Regards,
Martin



From martin@v.loewis.de  Mon Jun 10 07:06:22 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Jun 2002 08:06:22 +0200
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <200206092315.g59NFdx20542@pcp02138704pcs.reston01.va.comcast.net>
References: <200206090210.g592Aip03694@pcp02138704pcs.reston01.va.comcast.net>
 <m3bsaksxyy.fsf@mira.informatik.hu-berlin.de>
 <200206092002.g59K2Kt15647@pcp02138704pcs.reston01.va.comcast.net>
 <m3ofek9nof.fsf@mira.informatik.hu-berlin.de>
 <200206092315.g59NFdx20542@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m38z5nmzvl.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> You've convinced me.  Here's a patch that only touches typeobject.c.
> It doesn't add any fields, and it doesn't require multiple collections
> to clear out cycles involving a class and its type.  I like it!

Looks good to me, too!

Martin



From mwh@python.net  Mon Jun 10 11:36:30 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jun 2002 11:36:30 +0100
Subject: [Python-Dev] pymemcompat.h & PyMem_New and friends
In-Reply-To: Michael Hudson's message of "29 May 2002 16:13:58 +0100"
References: <LNBBLJKPBEHFEDALKOLCMEFJPIAA.tim.one@comcast.net> <2mr8jwv84e.fsf@starship.python.net> <20020528080149.A7799@glacier.arctrix.com> <2mit57x9zd.fsf@starship.python.net>
Message-ID: <2msn3vpgi9.fsf_-_@starship.python.net>

Michael Hudson <mwh@python.net> writes:

> /* There are three "families" of memory API: the "raw memory", "object
>    memory" and "object" families.  (This is ignoring the matter of the
>    cycle collector, about which more is said below).

Of course this is an over-simplification.  There is at least one other
family in fairly widespread use in the Python core; the "typed memory
allocator", PyMem_New, PyMem_Resize and PyMem_Del.  Should this family
be listed in pymemcompat.h or subtly discouraged? (I don't think there
are any other options).

I think it should be subtly discouraged, for a couple of reasons:

a) three is a smaller number than four.
b) there is a non-analogy:

       PyMem_Malloc ---> PyMem_New
       PyObject_Malloc ---> PyObject_New

   They do rather different things.
c) I don't think omitting a cast and a sizeof is that much of a win.

I'm not proposing actually taking these interfaces away.

(as a special bonus I won't even mention the fact that we have
PyMem_Resize, PyObject_GC_Resize (only used in listobject.c) but not
PyObject_Resize...)

Cheers,
M.

-- 
  If a train station is a place where a train stops, what's a
  workstation?                            -- unknown (to me, at least)



From tim.one@comcast.net  Mon Jun 10 13:07:08 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 08:07:08 -0400
Subject: [Python-Dev] pymemcompat.h & PyMem_New and friends
In-Reply-To: <2msn3vpgi9.fsf_-_@starship.python.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELEPLAA.tim.one@comcast.net>

[Michael Hudson]
> ...
> There is at least one other family in fairly widespread use in the
> Python core; the "typed memory allocator", PyMem_New, PyMem_Resize and
> PyMem_Del.  Should this family be listed in pymemcompat.h or subtly
> discouraged? (I don't think there are any other options).

I left it out of the "recommended" memory API, so "subtly discouraged" gets
my vote.  I think the current comment in pymem.h shows this <wink>:

 * These are carried along for historical reasons.  There's rarely a good
 * reason to use them anymore (you can just as easily do the multiply and
 * cast yourself).

> I think it should be subtly discouraged, for a couple of reasons:
>
> a) three is a smaller number than four.
> b) there is a non-analogy:
>
>        PyMem_Malloc ---> PyMem_New
>        PyObject_Malloc ---> PyObject_New
>
>    They do rather different things.
> c) I don't think omitting a cast and a sizeof is that much of a win.

It also hides a multiply, and that's a Bad Idea because callers almost never
first check that the hidden multiply doesn't overflow a size_t -- and
neither do the macros.  There are a few calls in the Python core that do,
but only because I slammed those checks in when a real-life overflow bug
surfaced.  If the PyMem_XYZ family were reworked to detect overflow, it may
become valuable again.

> I'm not proposing actually taking these interfaces away.

I suppose that explains why you're still breathing <wink>.

> (as a special bonus I won't even mention the fact that we have
> PyMem_Resize, PyObject_GC_Resize (only used in listobject.c)

No, it's not used in listobject.c -- there's no need for it there, as list
guts are stored separately from the list object.  It is used in
tupleobject.c and in frameobject.c, as tuples and frames are the only
container types in Python that *embed* a variable amount of data in the
object and may participate in cycles.

> but not PyObject_Resize...)

That seems a curious omission, but it would only be useful for a variable-
size object that doesn't participate in cyclic gc.  8-bit strings are the
only type that comes to mind.




From tim.one@comcast.net  Mon Jun 10 13:16:51 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 08:16:51 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?)
Message-ID: <LNBBLJKPBEHFEDALKOLCKELFPLAA.tim.one@comcast.net>

I think Beni has a very nice idea here, especially for people who can't
visualize 2's-complement (not mentioning Guido by name <wink>).

-----Original Message-----
From: Beni Cherniavksy <cben@tx.technion.ac.il>
Sent: Monday, June 10, 2002 1:57 AM
To: python-list@python.org
Subject: Re: Does Python need a '>>>' operator?


... [quotes of old postings deleted] ...

I just got another idea: use 0x1234 for 0-filled numbers and 1xABCD for
1-filled ones.  That way you impose no restrictions on what follows the
prefix and keep backward compatibility.  0xFFFFFFFF stays a 2^n-1
_positive_ number, as it should be.  The look of 1x is weird at first but
it is very logical...

--
Beni Cherniavsky <cben@tx.technion.ac.il>

--
http://mail.python.org/mailman/listinfo/python-list




From guido@python.org  Mon Jun 10 13:52:43 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 08:52:43 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?)
In-Reply-To: Your message of "Mon, 10 Jun 2002 08:16:51 EDT."
 <LNBBLJKPBEHFEDALKOLCKELFPLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCKELFPLAA.tim.one@comcast.net>
Message-ID: <200206101252.g5ACqi922863@pcp02138704pcs.reston01.va.comcast.net>

> I think Beni has a very nice idea here, especially for people who can't
> visualize 2's-complement (not mentioning Guido by name <wink>).

In fact it's so subtle that I didn't notice what he proposed.  I
thought it had to do with the uppercase of 1xABCD.

Maybe that's too subtle?

Do we really need this?

> > I just got another idea: use 0x1234 for 0-filled numbers and 1xABCD for
> > 1-filled ones.  That way you impose no restrictions on what follows the
> > prefix and keep backward compatibility.  0xFFFFFFFF stays a 2^n-1
> > _positive_ number, as it should be.  The look of 1x is weird at first but
> > it is very logical...
> > 
> > Beni Cherniavsky <cben@tx.technion.ac.il>

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Mon Jun 10 13:50:10 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 08:50:10 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <m3bsaksxyy.fsf@mira.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCOELHPLAA.tim.one@comcast.net>

[Martin v. Loewis]
> ...
> The problem appears to be in the tp_clear. The task of tp_clear is to
> clear all references that may participate in cycles (*not* to clear
> all references per se).

That's the key insight, and one we all missed.  Thanks for sharing your
brain, Martin!




From mgilfix@eecs.tufts.edu  Mon Jun 10 13:56:40 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Mon, 10 Jun 2002 08:56:40 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKELFPLAA.tim.one@comcast.net>; from tim.one@comcast.net on Mon, Jun 10, 2002 at 08:16:51AM -0400
References: <LNBBLJKPBEHFEDALKOLCKELFPLAA.tim.one@comcast.net>
Message-ID: <20020610085640.D23641@eecs.tufts.edu>

On Mon, Jun 10 @ 08:16, Tim Peters wrote:
> I think Beni has a very nice idea here, especially for people who can't
> visualize 2's-complement (not mentioning Guido by name <wink>).

  I like the idea but I'm not sure that still solves the down casting
problem. Say I do some bit ops on a long type and want to get it
into an int size (for whatever reason and there are several), I need
somehow to tell python that it is not an overflow when I'm int()ing
the number. Perhaps int could take a second hidden argument. Be
able to do a:

  int(big_num, signed=1)

  which is pretty clear.

                       -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From niemeyer@conectiva.com  Mon Jun 10 13:58:52 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Mon, 10 Jun 2002 09:58:52 -0300
Subject: [Python-Dev] os.stat(filename).r_dev
In-Reply-To: <m3d6uzn0j1.fsf@mira.informatik.hu-berlin.de>
References: <20020609161918.A17718@ibook.distro.conectiva> <m3sn3w9o4d.fsf@mira.informatik.hu-berlin.de> <20020609181710.A18935@ibook.distro.conectiva> <m3d6uzn0j1.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020610095851.B1769@ibook.distro.conectiva>

> > Isn't st_rdev being made available only in 2.3, through stat attributes?
> 
> No, it was available in 2.2 already. So there is a backwards
> compatibility issue.

You're right then. st_rdev_pair may be the way to go. I'm not sure if
introducing major(), minor(), and makedev() would be a good idea,
since they are completely platform dependent.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From david.abrahams@rcn.com  Mon Jun 10 14:01:32 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Mon, 10 Jun 2002 09:01:32 -0400
Subject: [Python-Dev] Null checking
Message-ID: <005901c2107f$52925d10$6601a8c0@boostconsulting.com>

A couple of quick questions for the authors of the Python source: I notice
that most, if not all, of the Python 'C' API includes null checks for the
PyObject* arguments, meaning that you can't crash Python by passing the
result of a previous operation, even if it returns an error.

First question: can that be counted on? Hmm, I guess I've answered my own
question -- PyNumber_InPlaceAdd has no checks.

I note that the null_error() check in abstract.c is non-destructive: it
preserves any existing error, whereas other checks (e.g. in typeobject.c)
do not.

Second question: I guess I really want to know what the intention behind
these checks is. Is it something like "prevent extension writers from
crashing Python in some large percentage of cases", or is there a deeper
plan that I'm missing?

TIA,
Dave

+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+




From tim.one@comcast.net  Mon Jun 10 14:08:36 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 09:08:36 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a
 '>>>' operator?)
In-Reply-To: <200206101252.g5ACqi922863@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELJPLAA.tim.one@comcast.net>

>> I think Beni has a very nice idea here, especially for people who can't
>> visualize 2's-complement (not mentioning Guido by name <wink>).

[Guido]
> In fact it's so subtle that I didn't notice what he proposed.  I
> though it had to do with the uppercase of 1xABCD.
>
> Maybe that's too subtle?

In context, it was part of a long thread wherein assorted people griped that
they couldn't visualize what, e.g.,

>>> hex(-1L << 10)
'-0x400L'
>>>

means, recalling that hex() is often used when people are thinking of its
argument as a bitstring.  1xc00 "shows the bits" more clearly even in such
an easy case.  In a case like '-0xB373D', it's much harder to visualize the
bits, and this will grow more acute under int-long unification.  Right now,
hex(negative_plain_int) shows the bits directly; after unification,
hex(negative_plain_int) will likely have to resort to producing "negative
literals" as hex(negative_long_int) currently does.

> Do we really need this?

No, but I think it would make unification more attractive to people who care
about this sub-issue.  The 0x vs 1x idea grew on me the longer I played with
it.  Bonus:  we could generalize and say that integers beginning with "1"
are negative octal literals <wink>.




From tim.one@comcast.net  Mon Jun 10 14:12:55 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 09:12:55 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a
 '>>>' operator?)
In-Reply-To: <20020610085640.D23641@eecs.tufts.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELKPLAA.tim.one@comcast.net>

[Michael Gilfix]
>   I like the idea but I'm not sure that still solves the down casting
> problem.

It's not even pretending to have something to do with downcasting.

> Say I do some bit ops on a long type and want to get it into an int
> size (for whatever reason and there are several), I need somehow to
> tell python that it is not an overflow when I'm int()ing the number.

Sorry, I don't know what you want it to do.  You have to specify the
intended semantics first.

> Perhaps int could take a second hidden argument. Be
> able to do a:
>
>   int(big_num, signed=1)
>
> which is pretty clear.

After int/long unification is complete, int() and long() will likely be the
same function.  If you only want the last N bits, apply "&" to the long and
a bitmask with the N least-significant bits set.
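
Something along these lines (a trivial sketch; low_bits is just an
illustrative name):

def low_bits(n, nbits):
    # Keep only the nbits least-significant bits of n; the result is a
    # non-negative number carrying just the bit pattern.
    return n & ((1L << nbits) - 1)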




From mwh@python.net  Mon Jun 10 14:21:26 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jun 2002 14:21:26 +0100
Subject: [Python-Dev] Null checking
In-Reply-To: "David Abrahams"'s message of "Mon, 10 Jun 2002 09:01:32 -0400"
References: <005901c2107f$52925d10$6601a8c0@boostconsulting.com>
Message-ID: <2mptyz46cp.fsf@starship.python.net>

"David Abrahams" <david.abrahams@rcn.com> writes:

> A couple of quick questions for the authors of the Python source: I notice
> that most, if not all, of the Python 'C' API includes null checks for the
> PyObject* arguments, meaning that you can't crash Python by passing the
> result of a previous operation, even if it returns an error.
> 
> First question: can that be counted on? Hmm, I guess I've answered my own
> question -- PyNumber_InPlaceAdd has no checks.

You got it.

> I note that the null_error() check in abstract.c is non-destructive: it
> preserves any existing error, whereas other checks (e.g. in typeobject.c)
> do not.
> 
> Second question: I guess I really want to know what the intention behind
> these checks is.

I'm not sure there is one.  It may just be a bad example of defensive
programming (cf. OOSC).

> Is it something like "prevent extension writers from crashing Python
> in some large percentage of cases", or is there a deeper plan that
> I'm missing?

Well, if you're missing it, so am I.

I'd also like to know why all the (for instance) methods in
tupleobject.c start with "if (!PyTuple_Check(self)".  You'd have to
try REALLY hard to get those tests to fail...

Cheers,
M.

-- 
  Q: What are 1000 lawyers at the bottom of the ocean?
  A: A good start.
  (A lawyer told me this joke.)
                                  -- Michael Ströder, comp.lang.python



From guido@python.org  Mon Jun 10 14:28:42 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 09:28:42 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?)
In-Reply-To: Your message of "Mon, 10 Jun 2002 08:56:40 EDT."
 <20020610085640.D23641@eecs.tufts.edu>
References: <LNBBLJKPBEHFEDALKOLCKELFPLAA.tim.one@comcast.net>
 <20020610085640.D23641@eecs.tufts.edu>
Message-ID: <200206101328.g5ADSgB22999@pcp02138704pcs.reston01.va.comcast.net>

>   I like the idea but I'm not sure that still solves the down casting
> problem. Say I do some bit ops on a long type and want to get it
> into an int size (for whatever reason and there are several), I need
> somehow to tell python that it is not an overflow when I'm int()ing
> the number. Perhaps int could take a second hidden argument. Be
> able to do a:
> 
>   int(big_num, signed=1)
> 
>   which is pretty clear.

I haven't been following the thread on c.l.py.  What problem do you
think this is trying to solve?

Anyway, if you want to get an int back (which should be pretty rare in
2.2 and up since ints and longs are *almost* completely
interchangeable) you should be able to say something like

  x & 0x7fffffff

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mgilfix@eecs.tufts.edu  Mon Jun 10 14:24:19 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Mon, 10 Jun 2002 09:24:19 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKELKPLAA.tim.one@comcast.net>; from tim.one@comcast.net on Mon, Jun 10, 2002 at 09:12:55AM -0400
References: <20020610085640.D23641@eecs.tufts.edu> <LNBBLJKPBEHFEDALKOLCKELKPLAA.tim.one@comcast.net>
Message-ID: <20020610092418.E23641@eecs.tufts.edu>

On Mon, Jun 10 @ 09:12, Tim Peters wrote:
> [Michael Gilfix]
> >   I like the idea but I'm not sure that still solves the down casting
> > problem.
> 
> It's not even pretending to have something to do with downcasting.

  Er, I thought it was part of dealing with the int/long unification,
where it becomes more difficult to express signed numbers as well.
I think my phrasing was off. Should have been: Now if only we could
solve...

> > Say I do some bit ops on a long type and want to get it into an int
> > size (for whatever reason and there are several), I need somehow to
> > tell python that it is not an overflow when I'm int()ing the number.
> 
> Sorry, I don't know what you want it to do.  You have to specify the
> intended semantics first.

  Well, in today's python, if I want to operate on a 64-bit block (without
breaking it up into two ints), I could use a long to hold my value. Then
let's say I perform some operation and I know the result is a 32-bit value
and is signed. It's not easy to get it back into an int. I suppose with
unification, I could just do:

      if num & 0xA0000000:
          num = -num

  I just want a straight-forward way of expressing that it's signed.

                        -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From tim.one@comcast.net  Mon Jun 10 14:25:34 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 09:25:34 -0400
Subject: [Python-Dev] Null checking
In-Reply-To: <005901c2107f$52925d10$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIELLPLAA.tim.one@comcast.net>

[David Abrahams, on NULL-checking in the source]
> ...
> Second question: I guess I really want to know what the intention behind
> these checks is. Is it something like "prevent extension writers from
> crashing Python in some large percentage of cases", or is there a deeper
> plan that I'm missing?

Different authors have different paranoia levels.  My level is here, for
functions that don't intend to accept NULL arguments:

1. Public API functions should always do explicit NULL checks on
   pointer arguments, and it's the user's fault if they pass a NULL.
   A NULL argument should never crash Python regardless.

2. Private API functions should always assert non-NULL-ness on pointer
   arguments, and it's a bug in Python if a caller passes a NULL.

Any place where the Python code base deviates from those is simply a place I
didn't write <wink>.

> I note that the null_error() check in abstract.c is non-destructive: it
> preserves any existing error, whereas other checks (e.g. in typeobject.c)
> do not.

Different authors.  Guido is omnipotent but not omnipresent <wink>.  It
would be good (IMO) to expose something like null_error in the public API,
to encourage NULL-checking.  I don't know that there's real value in trying
to preserve a pre-existing exception, though (if the code is hosed, it's
hosed).




From guido@python.org  Mon Jun 10 14:32:08 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 09:32:08 -0400
Subject: [Python-Dev] Null checking
In-Reply-To: Your message of "Mon, 10 Jun 2002 09:01:32 EDT."
 <005901c2107f$52925d10$6601a8c0@boostconsulting.com>
References: <005901c2107f$52925d10$6601a8c0@boostconsulting.com>
Message-ID: <200206101332.g5ADW8v23024@pcp02138704pcs.reston01.va.comcast.net>

> A couple of quick questions for the authors of the Python source: I
> notice that most, if not all, of the Python 'C' API includes null
> checks for the PyObject* arguments, meaning that you can't crash
> Python by passing the result of a previous operation, even if it
> returns an error.
> 
> First question: can that be counted on? Hmm, I guess I've answered
> my own question -- PyNumber_InPlaceAdd has no checks.

Unless documented explicitly you cannot count on it!

> I note that the null_error() check in abstract.c is non-destructive:
> it preserves any existing error, whereas other checks (e.g. in
> typeobject.c) do not.

Different goals.  (I'm not sure which checks in typeobject.c you're
referring to.)

> Second question: I guess I really want to know what the intention
> behind these checks is. Is it something like "prevent extension
> writers from crashing Python in some large percentage of cases", or
> is there a deeper plan that I'm missing?

Jim Fulton contributed the code that uses null_error().  I think he
was making it possible to pass the result from one call to the next
without doing the error checking on the first call.  Personally, I
find that inexcusable laziness and I don't intend to encourage it or
propagate this style to other APIs.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Mon Jun 10 14:28:57 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 09:28:57 -0400
Subject: [Python-Dev] Null checking
In-Reply-To: <2mptyz46cp.fsf@starship.python.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOELMPLAA.tim.one@comcast.net>

[Michael Hudson]
> ...
> I'd also like to know why all the (for instance) methods in
> tupleobject.c start with "if (!PyTuple_Check(self)".  You'd have to
> try REALLY hard to get those tests to fail...

Not at all:  extension modules can pass any sort of nonsense to public API
functions.  Python shouldn't crash as a result.  The checking is expensive,
though.




From guido@python.org  Mon Jun 10 14:34:01 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 09:34:01 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?)
In-Reply-To: Your message of "Mon, 10 Jun 2002 09:08:36 EDT."
 <LNBBLJKPBEHFEDALKOLCKELJPLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCKELJPLAA.tim.one@comcast.net>
Message-ID: <200206101334.g5ADY1523041@pcp02138704pcs.reston01.va.comcast.net>

> > Do we really need this?
> 
> No, but I think it would make unification more attractive to people
> who care about this sub-issue.  The 0x vs 1x idea grew on me the
> longer I played with it.  Bonus: we could generalize and say that
> integers beginning with "1" are negative octal literals <wink>.

I'm not sure that we should extend the language at such a fundamental
level (as adding a new form of literal) to address such a minor issue.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From david.abrahams@rcn.com  Mon Jun 10 14:33:21 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Mon, 10 Jun 2002 09:33:21 -0400
Subject: [Python-Dev] Null checking
References: <LNBBLJKPBEHFEDALKOLCIELLPLAA.tim.one@comcast.net>
Message-ID: <009101c21083$6feb0a70$6601a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>

> Different authors.  Guido is omnipotent but not omnipresent <wink>.  It
> would be good (IMO) to expose something like null_error in the public API,
> to encourage NULL-checking.  I don't know that there's real value in trying
> to preserve a pre-existing exception, though (if the code is hosed, it's
> hosed).

It depends on whether you intend to make the null checks part of the public
interface. There is a style of programming which says: "write your code
with no error checks, then look at the end to see if something went wrong".
When, as in 'C', you don't have real exception-handling in the language, it
can lead to smaller/more-straightforward code.

-Dave





From guido@python.org  Mon Jun 10 14:41:47 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 09:41:47 -0400
Subject: [Python-Dev] Null checking
In-Reply-To: Your message of "10 Jun 2002 14:21:26 BST."
 <2mptyz46cp.fsf@starship.python.net>
References: <005901c2107f$52925d10$6601a8c0@boostconsulting.com>
 <2mptyz46cp.fsf@starship.python.net>
Message-ID: <200206101341.g5ADflr23092@pcp02138704pcs.reston01.va.comcast.net>

> I'd also like to know why all the (for instance) methods in
> tupleobject.c start with "if (!PyTuple_Check(self)".  You'd have to
> try REALLY hard to get those tests to fail...

I found exactly one call to PyTuple_Check() that satisfies that
description, and it was in your own (uncommitted) addition,
tuplesubscript(). :-)

Note that the PyXxx_Check() macros do *not* check for a NULL pointer
and crash hard if you pass them one.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mwh@python.net  Mon Jun 10 14:37:30 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jun 2002 14:37:30 +0100
Subject: [Python-Dev] Null checking
In-Reply-To: Guido van Rossum's message of "Mon, 10 Jun 2002 09:41:47 -0400"
References: <005901c2107f$52925d10$6601a8c0@boostconsulting.com> <2mptyz46cp.fsf@starship.python.net> <200206101341.g5ADflr23092@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2mlm9n45lx.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> > I'd also like to know why all the (for instance) methods in
> > tupleobject.c start with "if (!PyTuple_Check(self)".  You'd have to
> > try REALLY hard to get those tests to fail...
> 
> I found exactly one call to PyTuple_Check() that satisfies that
> description, and it was in your own (uncommitted) addition,
> tuplesubscript(). :-)

Yeah.  I wonder what I was thinking when I wrote that (it was two
years ago now, after all).

Never mind, I'll do my research better before my next post.

Cheers,
M.

-- 
  GET   *BONK*
  BACK  *BONK*
  IN    *BONK*
  THERE *BONK*             -- Naich using the troll hammer in cam.misc



From tim.one@comcast.net  Mon Jun 10 14:41:27 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 09:41:27 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a
 '>>>' operator?)
In-Reply-To: <20020610092418.E23641@eecs.tufts.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCCELPPLAA.tim.one@comcast.net>

[Michael Gilfix]
> ...
>   Well, in today's python, if I want to operate on a 64-bit block (without
> breaking it up into two ints), I could use a long to hold my value. Then
> let's say I perform some operation and I know the result is a 32-bit value
> and is signed. It's not easy to get it back into an int.

If it's a signed result that truly fits in a 32-bit signed int, and you know
you're running on a 32-bit box, simply do int(result).  Nothing more than
that is necessary or helpful.

If you have a *positive* long that would fit in a 32-bit unsigned int (which
type Python doesn't have), and know you're running on a 32-bit box, and only
want the same bit pattern in an int, you can do

def toint32(long):
    if long & 0x80000000L:
        long -= 1L << 32
    return int(long)

This also raises OverflowError if you're mistaken about it fitting in 32
bits.
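
For example, a quick sanity check of the helper above (assuming Python 2.x
longs on a 32-bit box; the values are arbitrary):

print toint32(0xFFFFFFFFL)   # -1, the 32-bit two's-complement pattern of -1
print toint32(0x7FFFFFFFL)   # 2147483647
toint32(0x100000000L)        # raises OverflowError: doesn't fit in 32 bits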

> I suppose with unification, I could just do:
>
>       if num & 0xA0000000:
>           num = -num

With full unification, there is no distinct int type, so there's nothing at
all you need to do.




From tim.one@comcast.net  Mon Jun 10 14:42:28 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 09:42:28 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a
 '>>>' operator?)
In-Reply-To: <200206101334.g5ADY1523041@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCIELPPLAA.tim.one@comcast.net>

> I'm not sure that we should extend the language at such a fundamental
> level (as adding a new form of literal) to address such a minor issue.

That was obvious the first time around <wink>.



From guido@python.org  Mon Jun 10 14:48:10 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 09:48:10 -0400
Subject: [Python-Dev] Null checking
In-Reply-To: Your message of "Mon, 10 Jun 2002 09:25:34 EDT."
 <LNBBLJKPBEHFEDALKOLCIELLPLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCIELLPLAA.tim.one@comcast.net>
Message-ID: <200206101348.g5ADmAV23128@pcp02138704pcs.reston01.va.comcast.net>

> 1. Public API functions should always do explicit NULL checks on
>    pointer arguments, and it's the user's fault if they pass a NULL.
>    A NULL argument should never crash Python regardless.

This is violated in 99% of the code (you've got to start writing more
code, Tim :-).  My position is different: extensions shouldn't pass
NULL pointers to Python APIs and if they do it's their fault.

> > I note that the null_error() check in abstract.c is non-destructive: it
> > preserves any existing error, whereas other checks (e.g. in typeobject.c)
> > do not.
> 
> Different authors.  Guido is omnipotent but not omnipresent <wink>.
> It would be good (IMO) to expose something like null_error in the
> public API, to encourage NULL-checking.  I don't know that there's
> real value in trying to preserve a pre-existing exception, though
> (if the code is hosed, it's hosed).

That was a specific semantic trick that Jim tried to use (see my
previous mail).  I guess the idea was that you could write things
like

PyObject_DelItemString(PyObject_GetAttr(PyEval_GetGlobals(), "foo"), "bar").

But this never caught on -- I'm sure in a large part because most
things require you to do a DECREF if the result is *not* NULL.

It *is* handy in Py_BuildValue(), and that has now grown a 'N' format
that eats a reference to an object.  Both 'O' and 'N' formats return
NULL while preserving an existing error if they see a NULL.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Mon Jun 10 14:55:12 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 09:55:12 -0400
Subject: [Python-Dev] Null checking
In-Reply-To: <200206101348.g5ADmAV23128@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEMBPLAA.tim.one@comcast.net>

[Tim, howling in the wilderness]
>> 1. Public API functions should always do explicit NULL checks on
>>    pointer arguments, and it's the user's fault if they pass a NULL.
>>    A NULL argument should never crash Python regardless.

[Guido]
> This is violated in 99% of the code (you've got to start writing more
> code, Tim :-).  My position is different: extensions shouldn't pass
> NULL pointers to Python APIs and if they do it's their fault.

Then let's compromise:

0. All functions in the API, whether public or private, that don't
   intend to do something sensible with a NULL pointer argument,
   should assert non-NULL-ness.

> ...
> That was a specific semantic trick that Jim tried to use (see my
> previous mail).  I guess the idea was that you could write things
> like
>
> PyObject_DelItemString(PyObject_GetAttr(PyEval_GetGlobals(),
> "foo"), "bar").
>
> But this never caught on -- I'm sure in a large part because most
> things require you to do a DECREF if the result is *not* NULL.

You may have noticed that I've been spending much of my recent life checking
in changes to clean up stray references when Zope's BTree code finds a
reason to exit prematurely.  It's due to a different set of mechanisms, but
the pattern is clear <wink>.

No more on null_error() from me.




From guido@python.org  Mon Jun 10 15:03:35 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 10:03:35 -0400
Subject: [Python-Dev] Null checking
In-Reply-To: Your message of "Mon, 10 Jun 2002 09:55:12 EDT."
 <LNBBLJKPBEHFEDALKOLCEEMBPLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEEMBPLAA.tim.one@comcast.net>
Message-ID: <200206101403.g5AE3Z323281@pcp02138704pcs.reston01.va.comcast.net>

> 0. All functions in the API, whether public or private, that don't
>    intend to do something sensible with a NULL pointer argument,
>    should assert non-NULL-ness.

Sounds good to me.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mgilfix@eecs.tufts.edu  Mon Jun 10 15:00:33 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Mon, 10 Jun 2002 10:00:33 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCELPPLAA.tim.one@comcast.net>; from tim.one@comcast.net on Mon, Jun 10, 2002 at 09:41:27AM -0400
References: <20020610092418.E23641@eecs.tufts.edu> <LNBBLJKPBEHFEDALKOLCCELPPLAA.tim.one@comcast.net>
Message-ID: <20020610100032.F23641@eecs.tufts.edu>

On Mon, Jun 10 @ 09:41, Tim Peters wrote:
> If you have a *positive* long that would fit in a 32-bit unsigned int (which
> type Python doesn't have), and know you're running on a 32-bit box, and only
> want the same bit pattern in an int, you can do
> 
> def toint32(long):
>     if long & 0x80000000L:
>         long -= 1L << 32
>     return int(long)
> 
> This also raises OverflowError if you're mistaken about it fitting in 32
> bits.

  Whoops. That should have been a 0x8... in my example :)

   At any rate, I wish this function was available as a built-in. It
would be nice if I had some conversion function where Python looked
at the highest bit and treated that as my word boundary to determine
whether the number is positive or not.
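
   Roughly something like this as a plain-Python helper, I imagine (the
name tosigned and its arguments are made up here, not an existing
built-in):

def tosigned(num, bits=32):
    # Reinterpret the low `bits` bits of num as a two's-complement value.
    num &= (1L << bits) - 1
    if num & (1L << (bits - 1)):
        num -= 1L << bits
    return num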

                           -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From tim.one@comcast.net  Mon Jun 10 15:08:15 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 10:08:15 -0400
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a
 '>>>' operator?)
In-Reply-To: <20020610100032.F23641@eecs.tufts.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEMCPLAA.tim.one@comcast.net>

>> If you have a *positive* long that would fit in a 32-bit
>> unsigned int ..
>   Whoops. That should have been a 0x8... in my example :)

[Michael Gilfix]
>    At any rate, I wish this function was available as a built-in. It
> would be nice if I had some conversion function where Python looked
> at the highest bit and treated that as my word boundary to determine
> whether the number is positive or not.

Write a patch and try to sell it to Guido.  I expect that with int/long
unification coming along nicely, it doesn't stand much chance.




From jeremy@zope.com  Mon Jun 10 10:24:10 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 10 Jun 2002 05:24:10 -0400
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net>
References: <j4y9drk1qc.fsf@informatik.hu-berlin.de>
 <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15620.28730.145429.690221@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  >> It appears SF is rearranging servers, and asks projects to honor
  >> their disk quota, see
  >>
  >> https://sourceforge.net/forum/forum.php?forum_id=183601
  >>
  >> There is a per-project disk quota of 100MB;
  >> /home/groups/p/py/python currently consumes 880MB. Most of this
  >> (830MB) is in htdocs/snapshots. Should we move those onto
  >> python.org?

  GvR> What is htdocs/snapshots?  There's plenty of space on creosote,
  GvR> but maybe the snapshots should be reduced in volume first?

Last time quotas came up, the SF managers said that our project could
exceed the normal quota.  Still, we didn't intend to have ~1GB of CVS
snapshots.  The script that deletes old snapshots had a bug -- didn't
deal with change-of-year -- that kept lots of old snapshots around.  I
just deleted a lot of them, so that we are using less space.

I'm not sure the snapshots are worth the bother at all.  Are there
downloads statistics for the SF web pages?  I'll bet no one has ever
looked at them.

Jeremy





From tim.one@comcast.net  Mon Jun 10 15:15:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 10:15:11 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: <200206092315.g59NFdx20542@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMDPLAA.tim.one@comcast.net>

[Guido, to MvL]
> ...
> You've convinced me.  Here's a patch that only touches typeobject.c.
> It doesn't add any fields, and it doesn't require multiple collections
> to clear out cycles involving a class and its type.  I like it!

Me too.  It's lovely.  Check it in!  It could use some comments about *why*
the clear function isn't clearing all the members.  That's unusual enough
for a clear function that it deserves some prose.




From fdrake@acm.org  Mon Jun 10 15:15:37 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 10 Jun 2002 10:15:37 -0400
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: <15620.28730.145429.690221@slothrop.zope.com>
References: <j4y9drk1qc.fsf@informatik.hu-berlin.de>
 <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net>
 <15620.28730.145429.690221@slothrop.zope.com>
Message-ID: <15620.46217.344039.853918@grendel.zope.com>

Jeremy Hylton writes:
 > I'm not sure the snapshots are worth the bother at all.  Are there
 > downloads statistics for the SF web pages?  I'll bet no one has ever
 > looked at them.

I think there's a way to get the logs, but we'd have to run our own
analysis tools to see if specific pages/directories get requests.

I'd be fine to just drop the snapshots; we can deal with it as needed
if we get requests for them.  At the most, there's no need to keep
more than a week's worth around.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From guido@python.org  Mon Jun 10 15:23:30 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 10:23:30 -0400
Subject: [Python-Dev] Bizarre new test failure
In-Reply-To: Your message of "Mon, 10 Jun 2002 10:15:11 EDT."
 <LNBBLJKPBEHFEDALKOLCCEMDPLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCCEMDPLAA.tim.one@comcast.net>
Message-ID: <200206101423.g5AENUq24016@pcp02138704pcs.reston01.va.comcast.net>

> Me too.  It's lovely.  Check it in!  It could use some comments
> about *why* the clear function isn't clearing all the members.
> That's unusual enough for a clear function that it deserves some
> prose.

Will do.  But first I have to fix SF 551412 for the third time.  The
issues here made me look at that once more and realize that the real
cause of the problem was a bug in slot_tp_number -- it was being
called on behalf of the second argument.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jun 10 15:19:48 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 10:19:48 -0400
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: Your message of "Mon, 10 Jun 2002 05:24:10 EDT."
 <15620.28730.145429.690221@slothrop.zope.com>
References: <j4y9drk1qc.fsf@informatik.hu-berlin.de> <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net>
 <15620.28730.145429.690221@slothrop.zope.com>
Message-ID: <200206101419.g5AEJnY23884@pcp02138704pcs.reston01.va.comcast.net>

> I'm not sure the snapshots are worth the bother at all.  Are there
> downloads statistics for the SF web pages?  I'll bet no one has ever
> looked at them.

I forget -- are these snapshots of a checkout or of the whole CVS
directory?  If the former, we can probably lose them if and when SF
starts to enforce quota.  If the latter, then I suggest that having
them on SF defeats the purpose -- we want them on hardware that is as
far away from SF as we can imagine, like halfway across the world on
www.python.org. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin@mems-exchange.org  Mon Jun 10 15:25:39 2002
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 10 Jun 2002 10:25:39 -0400
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: <15620.28730.145429.690221@slothrop.zope.com>
References: <j4y9drk1qc.fsf@informatik.hu-berlin.de> <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> <15620.28730.145429.690221@slothrop.zope.com>
Message-ID: <20020610142539.GA14084@ute.mems-exchange.org>

On Mon, Jun 10, 2002 at 05:24:10AM -0400, Jeremy Hylton wrote:
>I'm not sure the snapshots are worth the bother at all.  Are there
>downloads statistics for the SF web pages?  I'll bet no one has ever
>looked at them.

Wasn't the original goal of the snapshots to create an audit trail,
guarding against someone Trojaning the CVS repository?  And didn't
Sean Reifscheider burn a CD containing a whole bunch of snapshots?  

--amk                                                             (www.amk.ca)
Still, the future lies this way.
    -- The Doctor, in "Logopolis"



From barry@zope.com  Mon Jun 10 15:29:23 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 10 Jun 2002 10:29:23 -0400
Subject: [Python-Dev] Quota on sf.net
References: <j4y9drk1qc.fsf@informatik.hu-berlin.de>
 <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net>
 <15620.28730.145429.690221@slothrop.zope.com>
 <200206101419.g5AEJnY23884@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15620.47043.201842.176944@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> I forget -- are these snapshots of a checkout or of the whole
    GvR> CVS directory?  If the former, we can probably lose them if
    GvR> and when SF starts to enforce quota.  If the latter, then I
    GvR> suggest that having them on SF defeats the purpose -- we want
    GvR> them on hardware that is as far away from SF as we can
    GvR> imagine, like halfway across the world on www.python.org. :-)

creosote (at xs4all) is doing the nightly cvs repository snapshot
downloads.  It's probably due time to clear those out, but I haven't
heard Thomas complain yet. :)

>>>>> "AK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:

    AK> Wasn't the original goal of the snapshots to create an audit
    AK> trail, guarding against someone Trojaning the CVS repository?
    AK> And didn't Sean Reifscheider burn a CD containing a whole
    AK> bunch of snapshots?

I believe that's true.  I think Fred got one of the copies.
-Barry



From jeremy@zope.com  Mon Jun 10 10:46:06 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 10 Jun 2002 05:46:06 -0400
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: <20020610142539.GA14084@ute.mems-exchange.org>
References: <j4y9drk1qc.fsf@informatik.hu-berlin.de>
 <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net>
 <15620.28730.145429.690221@slothrop.zope.com>
 <20020610142539.GA14084@ute.mems-exchange.org>
Message-ID: <15620.30046.267559.90616@slothrop.zope.com>

>>>>> "AMK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:

  AMK> On Mon, Jun 10, 2002 at 05:24:10AM -0400, Jeremy Hylton wrote:
  >> I'm not sure the snapshots are worth the bother at all.  Are
  >> there downloads statistics for the SF web pages?  I'll bet no one
  >> has ever looked at them.

  AMK> Wasn't the original goal of the snapshots to create an audit
  AMK> trail, guarding against someone Trojaning the CVS repository?
  AMK> And didn't Sean Reifscheider burn a CD containing a whole bunch
  AMK> of snapshots?

There are two different sets of snapshots.  One is the copies of the
CVS repository that Barry makes every night.  The snapshots I'm
talking about are cvs checkouts done every night.  We did this because
someone requested them, and it wasn't too much trouble.  But now I'm
wondering whether continued maintenance is worth the trouble.

Jeremy





From guido@python.org  Mon Jun 10 15:39:10 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 10:39:10 -0400
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: Your message of "Mon, 10 Jun 2002 05:46:06 EDT."
 <15620.30046.267559.90616@slothrop.zope.com>
References: <j4y9drk1qc.fsf@informatik.hu-berlin.de> <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net> <15620.28730.145429.690221@slothrop.zope.com> <20020610142539.GA14084@ute.mems-exchange.org>
 <15620.30046.267559.90616@slothrop.zope.com>
Message-ID: <200206101439.g5AEdBp24496@pcp02138704pcs.reston01.va.comcast.net>

> The snapshots I'm talking about are cvs checkouts done every night.
> We did this because someone requested them, and it wasn't too much
> trouble.  But now I'm wondering whether continued maintenance is
> worth the trouble.

OK, lose them.  If there's a real need, someone in the community will
take over.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From niemeyer@conectiva.com  Mon Jun 10 15:53:34 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Mon, 10 Jun 2002 11:53:34 -0300
Subject: [Python-Dev] Null checking
In-Reply-To: <005901c2107f$52925d10$6601a8c0@boostconsulting.com>
References: <005901c2107f$52925d10$6601a8c0@boostconsulting.com>
Message-ID: <20020610115334.A3324@ibook.distro.conectiva>

Hello David!

> A couple of quick questions for the authors of the Python source: I notice
> that most, if not all, of the Python 'C' API includes null checks for the
> PyObject* arguments, meaning that you can't crash Python by passing the
> result of a previous operation, even if it returns an error.
> 
> First question: can that be counted on? Hmm, I guess I've answered my own
> question -- PyNumber_InPlaceAdd has no checks.

Something that should also be noted is that even if Python doesn't
break, leaving errors around until a later point makes the code harder
to debug when something goes wrong, and may even result in the wrong
error message being sent to the user.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From tim.one@comcast.net  Mon Jun 10 16:30:22 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 11:30:22 -0400
Subject: [Python-Dev] unittest and sockets. Ugh!?
In-Reply-To: <20020608214341.G9486@eecs.tufts.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMKPLAA.tim.one@comcast.net>

BTW, if you want to run threads with unittest, I expect you'll have to
ensure that only the thread that starts unittest reports errors to unittest.
I'll call that "the main thread".  You should be aware that if a non-main
thread dies, unittest won't know that.  A common problem in the threaded
tests PLabs has written is that a thread dies an ignoble death but unittest
goes on regardless and says "ok!" at the end; if you didn't stare at all the
output, you never would have known something went wrong.

So wrap the body of your thread's work in a catch-all try/except, and if
anything goes wrong communicate that back to the main thread.  For example,
a Queue object (one or more) could work nicely for this.
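
A minimal sketch of that pattern (the names here are made up for
illustration, not taken from any existing test):

import Queue
import threading
import unittest

class ThreadedExample(unittest.TestCase):
    def test_worker_errors_propagate(self):
        errors = Queue.Queue()

        def work():
            # Catch-all so a dead worker can't slip past unittest.
            try:
                pass   # the real body of the thread's work goes here
            except Exception, e:
                errors.put(e)

        t = threading.Thread(target=work)
        t.start()
        t.join()
        if not errors.empty():
            self.fail("worker thread failed: %s" % errors.get())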




From skip@mojam.com  Mon Jun 10 16:42:15 2002
From: skip@mojam.com (Skip Montanaro)
Date: Mon, 10 Jun 2002 10:42:15 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200206101542.g5AFgFR11272@12-248-41-177.client.attbi.com>

Bug/Patch Summary
-----------------

263 open / 2562 total bugs (-3)
131 open / 1541 total patches (-5)

New Bugs
--------

Missing operator docs (2002-06-02)
	http://python.org/sf/563530
urllib2 can't cope with error response (2002-06-02)
	http://python.org/sf/563665
os.tmpfile should use w+b, not w+ (2002-06-02)
	http://python.org/sf/563750
getttext defaults with unicode (2002-06-03)
	http://python.org/sf/563915
FixTk.py logic wrong (2002-06-04)
	http://python.org/sf/564729
compile traceback must include filename (2002-06-05)
	http://python.org/sf/564931
IDLE needs printing (2002-06-06)
	http://python.org/sf/565373
urllib FancyURLopener.__init__ / urlopen (2002-06-06)
	http://python.org/sf/565414
ImportError: No module named _socket (2002-06-07)
	http://python.org/sf/565710
crash on gethostbyaddr (2002-06-07)
	http://python.org/sf/565747
string.replace() can corrupt heap (2002-06-07)
	http://python.org/sf/565993
telnetlib makes Python dump core (2002-06-07)
	http://python.org/sf/566006
Popen exectuion blocking w/threads (2002-06-07)
	http://python.org/sf/566037
Bgen should generate 7-bit-clean code (2002-06-08)
	http://python.org/sf/566302
PyUnicode_Find() returns wrong results (2002-06-09)
	http://python.org/sf/566631
rotormodule's set_key calls strlen (2002-06-10)
	http://python.org/sf/566859
Typo in "What's new in Python 2.3" (2002-06-10)
	http://python.org/sf/566869

New Patches
-----------

experimental support for extended slicing on lists (2000-07-27)
	http://python.org/sf/400998
posixmodule.c RedHat 6.1 (bug #535545) (2002-06-03)
	http://python.org/sf/563954
error in weakref.WeakKeyDictionary (2002-06-04)
	http://python.org/sf/564549
modulefinder and string methods (2002-06-05)
	http://python.org/sf/564840
email Parser non-strict mode (2002-06-06)
	http://python.org/sf/565183
Expose _Py_ReleaseInternedStrings (2002-06-06)
	http://python.org/sf/565378
Rationalize DL_IMPORT and DL_EXPORT (2002-06-07)
	http://python.org/sf/566100
fix bug in shutil.rmtree exception case (2002-06-09)
	http://python.org/sf/566517

Closed Bugs
-----------

Coercion rules incomplete (2001-05-07)
	http://python.org/sf/421973
clean doesn't (2002-01-29)
	http://python.org/sf/510186
__slots__ may lead to undetected cycles (2002-02-18)
	http://python.org/sf/519621
make fails at posixmodule.c (2002-03-26)
	http://python.org/sf/535545
Warn for __coerce__ in new-style classes (2002-04-22)
	http://python.org/sf/547211
possible to fail to calc mro's (2002-05-02)
	http://python.org/sf/551412
UTF-16 BOM handling counterintuitive (2002-05-13)
	http://python.org/sf/555360
TclError is a str should be an Exception (2002-05-17)
	http://python.org/sf/557436
Shutdown of IDLE blows up (2002-05-19)
	http://python.org/sf/558166
rfc822.Message.get() incompatibility (2002-05-20)
	http://python.org/sf/558179
imaplib.IMAP4.open() typo (2002-05-23)
	http://python.org/sf/559884
PyType_IsSubtype can segfault (2002-05-24)
	http://python.org/sf/560215
foo() doesn't use __getattribute__ (2002-05-25)
	http://python.org/sf/560438
Maximum recursion limit exceeded (2002-05-27)
	http://python.org/sf/561047
Module can be used as a base class (2002-05-31)
	http://python.org/sf/563060
Heap corruption in debug (2002-06-01)
	http://python.org/sf/563303
Add separator argument to readline() (2002-06-02)
	http://python.org/sf/563491

Closed Patches
--------------

Remote execution patch for IDLE (2001-07-11)
	http://python.org/sf/440407
GNU/Hurd doesn't have large file support (2001-12-27)
	http://python.org/sf/497099
building a shared python library (2001-12-27)
	http://python.org/sf/497102
make setup.py less chatty by default (2002-01-17)
	http://python.org/sf/504889
Make doc strings optional (2002-01-18)
	http://python.org/sf/505375
Distutils & non-installed Python (2002-04-23)
	http://python.org/sf/547734
test_commands.py using . incorrectly (2002-05-03)
	http://python.org/sf/551911
Fix breakage of smtplib.starttls() (2002-05-03)
	http://python.org/sf/552060
Cygwin AH_BOTTOM cleanup patch (2002-05-14)
	http://python.org/sf/555929
Expose xrange type in builtins (2002-05-23)
	http://python.org/sf/559833
isinstance error message (2002-05-24)
	http://python.org/sf/560250
webchecker chokes at charsets. (2002-05-28)
	http://python.org/sf/561478
Getting rid of string, types and stat (2002-05-30)
	http://python.org/sf/562373



From bernie@3captus.com  Mon Jun 10 17:04:40 2002
From: bernie@3captus.com (Bernard Yue)
Date: Mon, 10 Jun 2002 10:04:40 -0600
Subject: [Python-Dev] unittest and sockets. Ugh!?
References: <LNBBLJKPBEHFEDALKOLCKEMKPLAA.tim.one@comcast.net>
Message-ID: <3D04CE18.7404CA90@3captus.com>

Tim Peters wrote:
> 
> BTW, if you want to run threads with unittest, I expect you'll have to
> ensure that only the thread that starts unittest reports errors to unittest.
> I'll call that "the main thread".  You should be aware that if a non-main
> thread dies, unittest won't know that.  A common problem in the threaded
> tests PLabs has written is that a thread dies an ignoble death but unittest
> goes on regardless and says "ok!" at the end; if you didn't stare at all the
> output, you never would have known something went wrong.
> 
> So wrap the body of your thread's work in a catch-all try/except, and if
> anything goes wrong communicate that back to the main thread.  For example,
> a Queue object (one or more) could work nicely for this.
> 

Thanks for the good tip Tim <wink>!  I will add those to my test case. 
The main challenge in socket testing, however, lies in thread
synchronization.  I have made some progress on that front, Michael.  See
if the following code fragment helps:


import socket
import threading
import unittest


class socketObjTestCase(unittest.TestCase):
    """Test Case for Socket Object"""
    def setUp(self):
        self.__s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

        self.__server_addr  = ('127.0.0.1', 25339)
        self.__ready = threading.Event()
        self.__done = threading.Event()
        self.__quit = threading.Event()
        self.__server = threading.Thread(target=server,
                args=(self.__server_addr, self.__ready, self.__done,
                self.__quit))

        self.__server.start()
        self.__ready.wait()

    def tearDown(self):
        self.__s.close()

        self.__quit.set()
        self.__server.join()
        del self.__server
        self.__done.wait()
        self.__ready.clear()
        self.__done.clear()
        self.__quit.clear()


class server:
    def __init__(self, addr, ready, done, quit):
        self.__ready = ready
        self.__dead = done
        self.__quit = quit

        self.__s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.__s.setblocking(0)
        self.__s.bind(addr)
        self.__s.listen(1)
        self.getclient()

    def __del__(self):
        self.__dead.set()

    def getclient(self):
        self.__ready.set()
        while not self.__quit.isSet():
            try:
                _client, _addr = self.__s.accept()
                self.serveclient(_client, _addr)
            except socket.error, msg:
                pass
        self.__s.shutdown(2)

    def serveclient(self, sock, addr):
        print sock, addr


Bernie



From mgilfix@eecs.tufts.edu  Mon Jun 10 16:43:59 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Mon, 10 Jun 2002 11:43:59 -0400
Subject: [Python-Dev] unittest and sockets. Ugh!?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEMKPLAA.tim.one@comcast.net>; from tim.one@comcast.net on Mon, Jun 10, 2002 at 11:30:22AM -0400
References: <20020608214341.G9486@eecs.tufts.edu> <LNBBLJKPBEHFEDALKOLCKEMKPLAA.tim.one@comcast.net>
Message-ID: <20020610114359.C25627@eecs.tufts.edu>

  Cool. Thanks for the tip. You wouldn't happen to have any publicly
available examples? Couldn't hurt to see someone else's layout so I
can be sure I have the best structure for mine.

                   -- Mike

On Mon, Jun 10 @ 11:30, Tim Peters wrote:
> BTW, if you want to run threads with unittest, I expect you'll have to
> ensure that only the thread that starts unittest reports errors to unittest.
> I'll call that "the main thread".  You should be aware that if a non-main
> thread dies, unittest won't know that.  A common problem in the threaded
> tests PLabs has written is that a thread dies an ignoble death but unittest
> goes on regardless and says "ok!" at the end; if you didn't stare at all the
> output, you never would have known something went wrong.
> 
> So wrap the body of your thread's work in a catch-all try/except, and if
> anything goes wrong communicate that back to the main thread.  For example,
> a Queue object (one or more) could work nicely for this.
`-> (tim.one)

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From loewis@informatik.hu-berlin.de  Mon Jun 10 18:04:43 2002
From: loewis@informatik.hu-berlin.de (Martin v. Löwis)
Date: 10 Jun 2002 19:04:43 +0200
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: <15620.28730.145429.690221@slothrop.zope.com>
References: <j4y9drk1qc.fsf@informatik.hu-berlin.de>
 <200206071302.g57D2jD16999@pcp02138704pcs.reston01.va.comcast.net>
 <15620.28730.145429.690221@slothrop.zope.com>
Message-ID: <j4y9dn12vo.fsf@informatik.hu-berlin.de>

Jeremy Hylton <jeremy@zope.com> writes:

> I'm not sure the snapshots are worth the bother at all.  Are there
> downloads statistics for the SF web pages?  I'll bet no one has ever
> looked at them.

My recommendation would be to disable the script and remove the
snapshots, perhaps leaving a page saying that anybody who wants the
snapshots should ask on python-dev to re-enable them.

Regards,
Martin




From gward@python.net  Mon Jun 10 21:18:00 2002
From: gward@python.net (Greg Ward)
Date: Mon, 10 Jun 2002 16:18:00 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <3D039C43.786EE3D8@prescod.net>
References: <006d01c20e4a$d207a3c0$ced241d5@hagrid> <LNBBLJKPBEHFEDALKOLCEEEOPLAA.tim.one@comcast.net> <20020607213947.GB21836@gerg.ca> <3D012CD9.89D40523@prescod.net> <20020607220640.GA21975@gerg.ca> <3D015CD8.DE7C4BC2@prescod.net> <20020609002722.GA3750@gerg.ca> <3D039C43.786EE3D8@prescod.net>
Message-ID: <20020610201800.GB7655@gerg.ca>

On 09 June 2002, Paul Prescod said:
> I'm not clear on why the "width" argument is special and should be on
> the wrap method rather than in the constructor. But I suspect most
> people will use the convenience functions so they'll never know the
> difference.

Beats me.  It does seem kind of silly.  I think I'll go fix it now.
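
For the record, the two spellings under discussion look roughly like this
(a sketch assuming the width argument moves into the constructor as
discussed; the sample text is arbitrary):

import textwrap

text = "The quick brown fox jumps over the lazy dog.  " * 4

print textwrap.fill(text, width=40)   # width passed per call

w = textwrap.TextWrapper(width=40)    # width fixed in the constructor
print w.fill(text)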

        Greg
-- 
Greg Ward - programmer-at-large                         gward@python.net
http://starship.python.net/~gward/
"He's dead, Jim.  You get his tricorder and I'll grab his wallet."



From gward@python.net  Mon Jun 10 21:17:15 2002
From: gward@python.net (Greg Ward)
Date: Mon, 10 Jun 2002 16:17:15 -0400
Subject: [Python-Dev] textwrap.py
In-Reply-To: <20020609184425.18907.qmail@web9601.mail.yahoo.com>
References: <3D039C43.786EE3D8@prescod.net> <20020609184425.18907.qmail@web9601.mail.yahoo.com>
Message-ID: <20020610201714.GA7655@gerg.ca>

On 09 June 2002, Steven Lott said:
> Here's a version with the Strategy classes included.  This
> allows for essentially unlimited alternatives on the subjects of
> long words, full stops, and also permits right justification.

Ahh, very interesting.  Smells like massive flaming overkill here, but
at least now I understand what you meant by "strategy class".  (I kept
having visions of a classroom full of kids playing chess... design
patterns are great, as long as everyone has a copy of *Design Patterns*
on their desk.  ;-)

I think my main reservation about this technique is that it does nothing
to make the simplest case simpler, and it makes the slightly complex
case ("I just want to disable breaking long words") a hell of a lot
harder.

        Greg
-- 
Greg Ward - just another /P(erl|ython)/ hacker          gward@python.net
http://starship.python.net/~gward/
The NSA.  We care: we listen to our customers.



From dan@sidhe.org  Mon Jun 10 21:42:45 2002
From: dan@sidhe.org (Dan Sugalski)
Date: Mon, 10 Jun 2002 16:42:45 -0400
Subject: [Python-Dev] Parrot in Phoenix
Message-ID: <a05111b1cb92abe8c26b6@[63.120.19.221]>

Dunno if anyone's interested, but I'll be giving a full-on 
presentation on Parrot, the dynamic language interpreter we're 
building to implement Perl 6 on top of, on June 20th in Phoenix, AZ 
to the Phoenix perlmongers. If anyone's interested, drop me a note 
and I'll get you directions and update the head count.

If you're worried about being surrounded by perl people, 
don't--you'll be surrounded by perl *and* ruby people. :) (Some of 
the folks involved in Cardinal, the project to layer Ruby on top of 
Parrot, will be there too)
-- 
                                         Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                       teddy bears get drunk



From tim.one@comcast.net  Mon Jun 10 22:34:27 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Jun 2002 17:34:27 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59
In-Reply-To: <E17HWeN-0008WF-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEOKPLAA.tim.one@comcast.net>

> Update of /cvsroot/python/python/nondist/peps
> In directory usw-pr-cvs1:/tmp/cvs-serv32728
> 
> Modified Files:
> 	pep-0042.txt 
> Log Message:
> Added another wish.  Removed a bunch of fulfilled wishes (no guarantee
> that I caught all of 'em).

...

> -     - Port the Python SSL code to Windows.
> - 
> -       http://www.python.org/sf/210683

That this was done is news to me.  That doesn't mean it isn't true <wink>.

Python 2.3a0 (#29, Jun  1 2002, 02:50:59) [MSC 32 bit (Intel)] on win32
...
>>> import socket
>>> hasattr(socket, 'ssl')
False
>>>



From guido@python.org  Mon Jun 10 22:46:10 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Jun 2002 17:46:10 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59
In-Reply-To: Your message of "Mon, 10 Jun 2002 17:34:27 EDT."
 <LNBBLJKPBEHFEDALKOLCOEOKPLAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOEOKPLAA.tim.one@comcast.net>
Message-ID: <200206102146.g5ALkFE06401@pcp02138704pcs.reston01.va.comcast.net>

> > -     - Port the Python SSL code to Windows.
> > - 
> > -       http://www.python.org/sf/210683
> 
> That this was done is news to me.  That doesn't mean it isn't true <wink>.

AFAIK the C code works on Windows; I've heard repeated confirmations.
It's just that we don't feel like configuring it (and there isn't a
lot of demand AFAICT :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From andymac@bullseye.apana.org.au  Mon Jun 10 21:26:05 2002
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Tue, 11 Jun 2002 07:26:05 +1100 (edt)
Subject: [Python-Dev] Socket timeout patch
In-Reply-To: <200206070401.g57412i15821@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.OS2.4.32.0206101842270.5661-100000@tenring.andymac.org>

On Fri, 7 Jun 2002, Guido van Rossum wrote:

> I've more or less completed the introduction of timeout sockets.

{...}

> - Cross-platform testing.  It's possible that the cleanup broke things
>   on some platforms, or that select() doesn't work the same way.  I
>   can only test on Windows and Linux; there is code specific to OS/2
>   and RISCOS in the module too.

wrt OS/2:  sock_init() is an OS/2 TCPIP public symbol, which is used in
the OS/2 os_init() (about line 2982 of socketmodule.c, as of yesterday).
This of course clashes with the sock_init() defined in socketmodule.c.

Even though the EMX port doesn't need the underlying sock_init(), EMX'
socket.h defines sock_init() for compatibility with VACPP.

Once the name clash is resolved, the module compiles and completes
test_socket with no problems.

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  | Snail: PO Box 370
        andymac@pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia




From DavidA@ActiveState.com  Tue Jun 11 07:51:29 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Mon, 10 Jun 2002 23:51:29 -0700
Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59
References: <LNBBLJKPBEHFEDALKOLCOEOKPLAA.tim.one@comcast.net> <200206102146.g5ALkFE06401@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D059DF1.4020003@ActiveState.com>

Guido van Rossum wrote:

>>>-     - Port the Python SSL code to Windows.
>>>- 
>>>-       http://www.python.org/sf/210683
>>>
>>That this was done is news to me.  That doesn't mean it isn't true <wink>.
>>
>
>AFAIK the C code works on Windows; I've heard repeated confirmations.
>It's just that we don't feel like configuring it (and there isn't a
>lot of demand AFAICT :-).
>
It works.  We've tested it, and we have some customers who need it.  The 
test suite was somewhat busted, and there was a bug in 2.2.0, but 2.2.1 
is fine.

The entire feature is somewhat underdocumented, though =(.

--david




From skip@pobox.com  Tue Jun 11 17:22:08 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 11 Jun 2002 11:22:08 -0500
Subject: [Python-Dev] Please give this patch for building bsddb a try
Message-ID: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>

If you build the bsddb module on a Unix-like system (that is, you use
configure and setup.py to build the interpreter and it attempts to build the
bsddb module), please give the new patch attached to

    http://python.org/sf/553108

a try.  Ignore the subject of the patch.  I just tacked my patch onto this
item and assigned it to myself.  If/when the issue is settled I'll track
down and close other patches and bug reports related to building the bsddb
module.

Briefly, it attempts the following:

  1. Makes it inconvenient (though certainly not impossible) to build/link
     with version 1 of the Berkeley DB library by commenting out the
     relevant part of the db_try_this dictionary in setup.py.

  2. Links the search for a DB library and corresponding include files so
     you don't find a version 2 include file and a version 3 library (for
     example).

  3. Attempts to do the same for the dbm module when it decides to link with
     the Berkeley DB library for compatibility (this is stuff under
     "development" and will almost certainly require further changes).  (You
     can ignore the debug print I forgot to remove before creating the
     patch. ;-)

I asked on c.l.py about where people have the Berkeley DB stuff installed so
I could tune the locations listed in db_try_this, but the thread almost
immediately went off into the weeds arguing about berkdb license issues.  I
therefore humbly request your more rational input on this topic.  If you
have a Unix-ish system and Berkeley DB is installed somewhere not listed in
the db_try_this dictionary in setup.py, please let me know.

Thx,

Skip



From Oleg Broytmann <phd@phd.pp.ru>  Tue Jun 11 17:39:06 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Tue, 11 Jun 2002 20:39:06 +0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>; from skip@pobox.com on Tue, Jun 11, 2002 at 11:22:08AM -0500
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
Message-ID: <20020611203906.V6026@phd.pp.ru>

Hello!

On Tue, Jun 11, 2002 at 11:22:08AM -0500, Skip Montanaro wrote:
>     http://python.org/sf/553108
> 
>   1. Makes it inconvenient (though certainly not impossible) to build/link
>      with version 1 of the Berkeley DB library by commenting out the
>      relevant part of the db_try_this dictionary in setup.py.

   Can I have two different modules simultaneously? For example, a module
linked with db.1.85 plus a module linked with db3.

>   2. Links the search for a DB library and corresponding include files so
>      you don't find a version 2 include file and a version 3 library (for
>      example).

   After compiling bsddb-3.2 from sources I have got
/usr/local/BerkeleyDB.3.2/ directory, with lib/include being its
subdirectories. The patch didn't look into this, as I understand it.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From skip@pobox.com  Tue Jun 11 17:58:42 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 11 Jun 2002 11:58:42 -0500
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <20020611203906.V6026@phd.pp.ru>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
Message-ID: <15622.11330.948519.279929@12-248-41-177.client.attbi.com>

    >> 1. Makes it inconvenient (though certainly not impossible) to
    >>    build/link with version 1 of the Berkeley DB library by commenting
    >>    out the relevant part of the db_try_this dictionary in setup.py.

    Oleg> Can I have two different modules simultaneously? For example, a
    Oleg> module linked with db.1.85 plus a module linked with db3.

Nope.  I don't believe you can do that today (at least not without some
build-time gymnastics), and I have no plans to support that.  For one thing,
you'd have to compile and link bsddbmodule.c twice.  To allow multiple
versions to be loaded into the interpreter you'd also have to name them
differently.  This would require source code changes to keep global symbols
(at least the module init functions) from clashing.

    >> 2. Links the search for a DB library and corresponding include files
    >>    so you don't find a version 2 include file and a version 3 library
    >>    (for example).

    Oleg> After compiling bsddb-3.2 from sources I have got
    Oleg> /usr/local/BerkeleyDB.3.2/ directory, with lib/include being its
    Oleg> subdirectories. The patch didn't look into this, as I understand
    Oleg> it.

Thanks, I'll add that.  I also notice that /usr/local/BerkeleyDB.4.0 is the
default install directory for the 4.0 source.

Skip



From guido@python.org  Tue Jun 11 19:28:05 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Jun 2002 14:28:05 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59
In-Reply-To: Your message of "Mon, 10 Jun 2002 23:51:29 PDT."
 <3D059DF1.4020003@ActiveState.com>
References: <LNBBLJKPBEHFEDALKOLCOEOKPLAA.tim.one@comcast.net> <200206102146.g5ALkFE06401@pcp02138704pcs.reston01.va.comcast.net>
 <3D059DF1.4020003@ActiveState.com>
Message-ID: <200206111828.g5BIS5p29303@pcp02138704pcs.reston01.va.comcast.net>

[About SSL on Windows]
> It works.  We've tested it, and we have some customers who need it.  The 
> test suite was somewhat busted, and there was a bug in 2.2.0, but 2.2.1 
> is fine.
> 
> The entire feature is somewhat underdocumented, though =(.

What do you mean.  Is this not enough? :-)

http://www.python.org/doc/current/lib/ssl-objects.html

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Tue Jun 11 20:00:24 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 11 Jun 2002 21:00:24 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15622.11330.948519.279929@12-248-41-177.client.attbi.com>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15622.11330.948519.279929@12-248-41-177.client.attbi.com>
Message-ID: <m37kl5y71z.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> This would require source code changes to keep global symbols (at
> least the module init functions) from clashing.

It actually only requires different init functions. To support that
with distutils, you need to tell distutils to generate different
object files from the same source file, which is probably not
supported out of the box.

Regards,
Martin



From Oleg Broytmann <phd@phd.pp.ru>  Tue Jun 11 20:48:52 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Tue, 11 Jun 2002 23:48:52 +0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <m37kl5y71z.fsf@mira.informatik.hu-berlin.de>; from martin@v.loewis.de on Tue, Jun 11, 2002 at 09:00:24PM +0200
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> <m37kl5y71z.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020611234852.D23356@phd.pp.ru>

On Tue, Jun 11, 2002 at 09:00:24PM +0200, Martin v. Loewis wrote:
> Skip Montanaro <skip@pobox.com> writes:
> 
> > This would require source code changes to keep global symbols (at
> > least the module init functions) from clashing.
> 
> It actually only requires different init functions. To support that
> with distutils, you need to tell distutils to generate different
> object files from the same source file, which is probably not
> supported out of the box.

   I know. Once I thought about sed/awk magic to generate two different
modules from one template.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From haering_python@gmx.de  Tue Jun 11 20:58:48 2002
From: haering_python@gmx.de (Gerhard Häring)
Date: Tue, 11 Jun 2002 21:58:48 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <20020611234852.D23356@phd.pp.ru>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> <m37kl5y71z.fsf@mira.informatik.hu-berlin.de> <20020611234852.D23356@phd.pp.ru>
Message-ID: <20020611195848.GA27976@lilith.my-fqdn.de>

* Oleg Broytmann <phd@phd.pp.ru> [2002-06-11 23:48 +0400]:
> On Tue, Jun 11, 2002 at 09:00:24PM +0200, Martin v. Loewis wrote:
> > Skip Montanaro <skip@pobox.com> writes:
> > 
> > > This would require source code changes to keep global symbols (at
> > > least the module init functions) from clashing.
> > 
> > It actually only requires different init functions. To support that
> > with distutils, you need to tell distutils to generate different
> > object files from the same source file, which is probably not
> > supported out of the box.
> 
> I know. Once I thought about sed/awk magic to generate two different
> modules from one template.

What about symlinks, like:

bsd18module.c -> bsd30module.c
                 bsd30module.c

and using a few #ifdefs in the C sources?

Gerhard
-- 
This sig powered by Python!
Outside temperature in München: 17.1 °C      Wind: 1.7 m/s



From martin@v.loewis.de  Tue Jun 11 21:17:15 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 11 Jun 2002 22:17:15 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <20020611195848.GA27976@lilith.my-fqdn.de>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15622.11330.948519.279929@12-248-41-177.client.attbi.com>
 <m37kl5y71z.fsf@mira.informatik.hu-berlin.de>
 <20020611234852.D23356@phd.pp.ru>
 <20020611195848.GA27976@lilith.my-fqdn.de>
Message-ID: <m3sn3twoxg.fsf@mira.informatik.hu-berlin.de>

Gerhard Häring <haering_python@gmx.de> writes:

> What about symlinks, like:

That can't work on Windows.

Regards,
Martin



From skip@pobox.com  Tue Jun 11 21:18:09 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 11 Jun 2002 15:18:09 -0500
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <m37kl5y71z.fsf@mira.informatik.hu-berlin.de>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15622.11330.948519.279929@12-248-41-177.client.attbi.com>
 <m37kl5y71z.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15622.23297.193301.295155@12-248-41-177.client.attbi.com>

    >> This would require source code changes to keep global symbols (at
    >> least the module init functions) from clashing.

    Martin> It actually only requires different init functions. To support
    Martin> that with distutils, you need to tell distutils to generate
    Martin> different object files from the same source file, which is
    Martin> probably not supported out of the box.

Thanks for the clarification Martin.  Even though this seems possible with
minimal changes to the source, I still think supporting this is not worth
it.

Oleg, at your end I suspect you could fairly easily copy a bit of code in
setup.py, copy bsddbmodule.c to bsddb1module.c, and rename the module init
function.

Skip




From jon+python-dev@unequivocal.co.uk  Tue Jun 11 21:19:59 2002
From: jon+python-dev@unequivocal.co.uk (Jon Ribbens)
Date: Tue, 11 Jun 2002 21:19:59 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59
In-Reply-To: <200206111828.g5BIS5p29303@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Tue, Jun 11, 2002 at 02:28:05PM -0400
References: <LNBBLJKPBEHFEDALKOLCOEOKPLAA.tim.one@comcast.net> <200206102146.g5ALkFE06401@pcp02138704pcs.reston01.va.comcast.net> <3D059DF1.4020003@ActiveState.com> <200206111828.g5BIS5p29303@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020611211959.M14101@snowy.squish.net>

Guido van Rossum <guido@python.org> wrote:
> > The entire feature is somewhat underdocumented, though =(.
> 
> What do you mean.  Is this not enough? :-)
> 
> http://www.python.org/doc/current/lib/ssl-objects.html

What about socket.sslerror, socket.SSL_ERROR_*, what to do about the
various socket.SSL_ERROR_* values, etc?



From guido@python.org  Tue Jun 11 21:28:30 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Jun 2002 16:28:30 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: Your message of "Tue, 11 Jun 2002 21:58:48 +0200."
 <20020611195848.GA27976@lilith.my-fqdn.de>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> <m37kl5y71z.fsf@mira.informatik.hu-berlin.de> <20020611234852.D23356@phd.pp.ru>
 <20020611195848.GA27976@lilith.my-fqdn.de>
Message-ID: <200206112028.g5BKSUD29813@pcp02138704pcs.reston01.va.comcast.net>

> > I know. Once I thought about sed/awk magic to generate two different
> > modules from one template.
> 
> What about symlinks, like:
> 
> bsd18module.c -> bsd30module.c
>                  bsd30module.c
> 
> and using a few #ifdefs in the C sources?

Instead of symlinks, how about one .h file containing most of the code
and two or three .c files that set a few #defines and then #include
the .h file?  Similar to what we do for Python/threads.c

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jun 11 21:53:39 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Jun 2002 16:53:39 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0042.txt,1.58,1.59
In-Reply-To: Your message of "Tue, 11 Jun 2002 21:19:59 BST."
 <20020611211959.M14101@snowy.squish.net>
References: <LNBBLJKPBEHFEDALKOLCOEOKPLAA.tim.one@comcast.net> <200206102146.g5ALkFE06401@pcp02138704pcs.reston01.va.comcast.net> <3D059DF1.4020003@ActiveState.com> <200206111828.g5BIS5p29303@pcp02138704pcs.reston01.va.comcast.net>
 <20020611211959.M14101@snowy.squish.net>
Message-ID: <200206112053.g5BKrd329949@pcp02138704pcs.reston01.va.comcast.net>

> > What do you mean.  Is this not enough? :-)
> > 
> > http://www.python.org/doc/current/lib/ssl-objects.html
> 
> What about socket.sslerror, socket.SSL_ERROR_*, what to do about the
> various socket.SSL_ERROR_* values, etc?

I was kidding.  I'm hoping that someone who has used this stuff can
contribute (a) a little more fleshed-out docs, and (b) a working
example that goes beyond implementing an "https://..." URL.
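
For concreteness, the kind of minimal example I mean (an untested sketch;
the host name is just for illustration):

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('www.python.org', 443))
ssl_conn = socket.ssl(s)    # only available when Python is built with SSL
ssl_conn.write('GET / HTTP/1.0\r\nHost: www.python.org\r\n\r\n')
print ssl_conn.read()       # first chunk of the server's reply
s.close()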

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jun 11 22:11:20 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Jun 2002 17:11:20 -0400
Subject: [Python-Dev] urllib.py and 303 redirect
Message-ID: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net>

There seems to be a move in the HTTP world to move away from 302
redirect responses to 303 redirects.  The 302 response was poorly
specified in the original HTTP spec, and most browsers implemented it
by doing a GET on the redirected URL even if the original request was
a POST, but the spec was ambiguous, and some browsers implemented it
by repeating the original request to the redirected URL.  The urllib
module does the latter.  The HTTP/1.1 spec now recommends doing the
former, and it also has two new responses, 303 to unambiguously
specify a redirect that must use a GET, and 307 to specify the
original intent of 302.

More info is on this page, which discusses the issue from a Zope
perspective (which is how I found out about this):

http://dev.zope.org/Wikis/DevSite/Projects/ComponentArchitecture/Use303RedirectsByDefault

and here is a nice general treatise on the subject:

http://ppewww.ph.gla.ac.uk/~flavell/www/post-redirect.html

It's clear that urllib would do well to implement a handler for the
303 response.  A 307 handler would be useful too.
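
Something along these lines, presumably -- a rough, untested sketch (the
class name and the simplified Location handling are mine):

import urllib

class Get303Opener(urllib.FancyURLopener):
    def http_error_303(self, url, fp, errcode, errmsg, headers, data=None):
        # Follow a 303 by re-fetching the Location with a plain GET,
        # i.e. deliberately dropping any POST data.  (A real handler
        # would also resolve a relative Location against the old URL.)
        fp.close()
        newurl = headers.getheader('location')
        if newurl is None:
            return self.http_error_default(url, fp, errcode, errmsg, headers)
        return self.open(newurl)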

But the big question is, should we also change the 302 handler to
implement the HTTP/1.1 recommended behavior?  I vaguely remember that
the 302 handler used to do this and that it was "fixed", but I can't
find it in the CVS log.  Changing it *could* break applications, but
is more likely to unbreak them, given that this is now the spec's
recommendation.

Opinions?

Whatever we do should probably also be backported to Python 2.2.1.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aahz@pythoncraft.com  Wed Jun 12 00:29:16 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 11 Jun 2002 19:29:16 -0400
Subject: [Python-Dev] urllib.py and 303 redirect
In-Reply-To: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net>
References: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020611232916.GA3126@panix.com>

On Tue, Jun 11, 2002, Guido van Rossum wrote:
>
> Whatever we do should probably also be backported to Python 2.2.1.

Should it?  IMO, not unless someone stands forward with a clear case that
the current behavior for 302 is buggy.  If the current behavior is simply
ambiguous and works well enough in many situations, I think that changing
semantics would be counter to the intention for bugfix releases.

No objection here to adding 303 and 307 handlers, though.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I had lots of reasonable theories about children myself, until I
had some."  --Michael Rios



From guido@python.org  Wed Jun 12 00:54:00 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Jun 2002 19:54:00 -0400
Subject: [Python-Dev] urllib.py and 303 redirect
In-Reply-To: Your message of "Tue, 11 Jun 2002 19:29:16 EDT."
 <20020611232916.GA3126@panix.com>
References: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net>
 <20020611232916.GA3126@panix.com>
Message-ID: <200206112354.g5BNs0u05206@pcp02138704pcs.reston01.va.comcast.net>

> > Whatever we do should probably also be backported to Python 2.2.1.
> 
> Should it?  IMO, not unless someone stands forward with a clear case
> that the current behavior for 302 is buggy.  If the current behavior
> is simply ambiguous and works well enough in many situations, I
> think that changing semantics would be counter to the intention for
> bugfix releases.

The recommendation in the HTTP/1.1 standard is unclear IMO.  It says:

| 10.3.3 302 Found
| 
| The requested resource resides temporarily under a different
| URI. Since the redirection might be altered on occasion, the client
| SHOULD continue to use the Request-URI for future requests. This
| response is only cacheable if indicated by a Cache-Control or
| Expires header field.
| 
| The temporary URI SHOULD be given by the Location field in the
| response. Unless the request method was HEAD, the entity of the
| response SHOULD contain a short hypertext note with a hyperlink to
| the new URI(s).

OK so far.

| If the 302 status code is received in response to a request other
| than GET or HEAD, the user agent MUST NOT automatically redirect the
| request unless it can be confirmed by the user, since this might
| change the conditions under which the request was issued.

I *think* this says that the current urllib behavior (to reissue a
POST request to the redirected URL) should *not* be done, since there
is no user confirmation.

|       Note: RFC 1945 and RFC 2068 specify that the client is not
|       allowed to change the method on the redirected request.
|       However, most existing user agent implementations treat 302 as
|       if it were a 303 response, performing a GET on the Location
|       field-value regardless of the original request method. The
|       status codes 303 and 307 have been added for servers that wish
|       to make unambiguously clear which kind of reaction is expected
|       of the client.

This is ambiguous but suggests that changing POST to GET is what most
servers expect by now.

> No objection here to adding 303 and 307 handlers, though.

Could I shame you into submitting a patch? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aahz@pythoncraft.com  Wed Jun 12 01:19:54 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 11 Jun 2002 20:19:54 -0400
Subject: [Python-Dev] urllib.py and 303 redirect
In-Reply-To: <200206112354.g5BNs0u05206@pcp02138704pcs.reston01.va.comcast.net>
References: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net> <20020611232916.GA3126@panix.com> <200206112354.g5BNs0u05206@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020612001954.GA10924@panix.com>

>> No objection here to adding 303 and 307 handlers, though.
> 
> Could I shame you into submitting a patch? :-)

Not until I figure out how I want to deal with SF not working with Lynx;
haven't been able to get anyone at SF interested in talking about fixing
the problem, and I've been too busy with writing (OSCON slides and book)
to investigate alternatives.  (Though I'll probably bite the bullet and
switch to using PPP.  :-(  Need to do something about ISPs for that; my
backup account supports PPP, but I'm limited in number of hours per
month.)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I had lots of reasonable theories about children myself, until I
had some."  --Michael Rios



From David Abrahams" <david.abrahams@rcn.com  Wed Jun 12 01:38:38 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Tue, 11 Jun 2002 20:38:38 -0400
Subject: [Python-Dev] behavior of inplace operations
Message-ID: <00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com>

My initial post asking about the implementation of += sparked a small
thread over on python-list, from which I've come to the conclusion that my
little optimization suggestion (don't try to set the attribute or item if
the inplace op returns its first argument) is actually more semantically
correct.

For better or worse, these ideas aren't all mine, as
http://aspn.activestate.com/ASPN/Mail/Message/python-list/1222524 attests.

Consider:

>>> t = ([1],[2],[3])
>>> t[0].append(2) # OK, the elements of the tuple are mutable
>>> t
([1, 2], [2], [3])
>>>
>>> t[1] += [3]    # ?? Just did the equivalent operation above
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object doesn't support item assignment
>>> t # Despite the exception, the operation succeeded!
([1, 2], [2, 3], [3])

So, the exception happens because the user is ostensibly trying to modify
this immutable tuple... but of course she's not. She's just trying to
modify the element of the tuple, which is itself mutable, and that makes
the exception surprising. Even more surprising in light of the exception is
the fact that everything seems to have worked. In order to get this all to
make sense, she needs to twist her brain into remembering that inplace
operations don't really just modify their targets "in place", but also try
to "replace" them.

However, if we just set up the inplace operations so that when they return
the original object there's no "replace", all of these problems go away. We
don't lose any safety; trying to do += on an immutable tuple element will
still fail. Also it makes tuples a generic replacement for lists in more
places.

There are other, more-perverse cases which the proposed change in semantics
would also fix. For example:

>>> class X(object):
...     def __init__(self, l):
...             self.container = l # will form a cycle
...             self.stuff  = []
...     def __iadd__(self, other):
...             self.stuff += other # add to X's internal list
...             del self.container[0]
...             return self
...
>>> l = [ 1, 2, 3]
>>> l.append(X(l))                # the X element refers to l
>>> l
[1, 2, 3, <__main__.X object at 0x00876668>]
>>> l[3] += 'a'     # the element is gone by write-back time.
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: list assignment index out of range
>>> l # But everything succeeded
[2, 3, <__main__.X object at 0x00876668>]
>>> l[2].stuff
['a']
>>> l.append('tail') # this one's even weirder
>>> l[2] += 'a'
>>> l
[3, <__main__.X object at 0x00876668>, <__main__.X object at 0x00876668>]

These are too esoteric to be compelling on their own, but my proposal would
also make them work as expected.

Thoughts?
-Dave

------------

P.S. This error message is kind of weird:

>>> t = ([1],[2],[3])
>>> t[1] += 3
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: argument to += must be iterable
                          ^^^^^^^^^^^^^^^^ ??

+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+




From guido@python.org  Wed Jun 12 03:06:42 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Jun 2002 22:06:42 -0400
Subject: [Python-Dev] behavior of inplace operations
In-Reply-To: Your message of "Tue, 11 Jun 2002 20:38:38 EDT."
 <00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com>
References: <00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com>
Message-ID: <200206120206.g5C26g005493@pcp02138704pcs.reston01.va.comcast.net>

> My initial post asking about the implementation of += sparked a
> small thread over on python-list, from which I've come to the
> conclusion that my little optimization suggestion (don't try to set
> the attribute or item if the inplace op returns its first argument)
> is actually more semantically correct.
[...]
> Thoughts?

One problem is that it's really hard to design the bytecode so that
this can be implemented.  The problem is that the compiler sees this:

   a[i] += x

and must compile bytecode that works for all cases: a can be mutable
or immutable, and += could return the same or a different object as
a[i].  It currently generates code that uses a STORE_SUBSCR opcode
(store into a[i]) with the saved value of the object and index used
for the BINARY_SUBSCR (load from a[i]) opcode.  It would have to
generate additional code to (a) save the object retrieved from a[i],
(b) compare the result to it using the 'is' operator, and (c) pop some
stuff off the stack and skip over the assignment if true.  That could
be done, but the extra test would definitely slow things down.
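
(Spelled out in Python, the code generated today behaves roughly like the
first function below, and the proposal amounts to the second -- a sketch of
the semantics only, not the actual opcodes:)

    def inplace_setitem_today(a, i, x):
        tmp = a[i]          # BINARY_SUBSCR
        tmp += x            # INPLACE_ADD -- may mutate tmp, returns result
        a[i] = tmp          # STORE_SUBSCR -- always executed

    def inplace_setitem_proposed(a, i, x):
        tmp = a[i]
        result = tmp
        result += x             # may mutate and return tmp itself
        if result is not tmp:   # the extra 'is' test
            a[i] = result       # store only when a new object came back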

A worse problem is that it's a semantic change.  For example,
persistent objects in Zope require (under certain circumstances) that
if you modify an object that lives in a persistent container, you have
to store it back into the container in order for the persistency
mechanism to pick up on the change.  Currently we can rely on a[i]+=x
and a.foo+=x to do the assignment.  Under your proposal, we couldn't
(unless we knew that the item was of an immutable type).  That is such
a subtle change in semantics that I don't want to risk it without
years of transitional warnings.
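
(To make the Zope point concrete, here's a toy container -- purely
illustrative, not Zope code -- that only learns about changes through
__setitem__:)

    class NotifyingList:
        def __init__(self, data):
            self._data = list(data)
            self.dirty = 0
        def __getitem__(self, i):
            return self._data[i]
        def __setitem__(self, i, value):
            self._data[i] = value
            self.dirty = 1      # the persistence machinery hooks in here

    c = NotifyingList([[1], [2]])
    c[0] += [9]     # today: mutates the element *and* triggers __setitem__
    assert c.dirty  # under the proposal the store -- and the notification --
                    # would be skipped, since the same list object came back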

Personally, I'd rather accept that if you have a = ([], [], []),
a[1]+=[2] won't work.  You can always write a[1].extend([2]).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From David Abrahams" <david.abrahams@rcn.com  Wed Jun 12 12:54:29 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 12 Jun 2002 07:54:29 -0400
Subject: [Python-Dev] behavior of inplace operations
References: <00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com>  <200206120206.g5C26g005493@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <018d01c21208$1efadab0$6501a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>


> > My initial post asking about the implementation of += sparked a
> > small thread over on python-list, from which I've come to the
> > conclusion that my little optimization suggestion (don't try to set
> > the attribute or item if the inplace op returns its first argument)
> > is actually more semantically correct.
> [...]
> > Thoughts?
>
> One problem is that it's really hard to design the bytecode so that
> this can be implemented.  The problem is that the compiler sees this:
>
>    a[i] += x
>
> and must compile bytecode that works for all cases: a can be mutable
> or immutable, and += could return the same or a different object as
> a[i].  It currently generates code that uses a STORE_SUBSCR opcode
> (store into a[i]) with the saved value of the object and index used
> for the BINARY_SUBSCR (load from a[i]) opcode.  It would have to
> generate additional code to (a) save the object retrieved from a[i]

Isn't that already lying about on the stack somewhere? Didn't you have to
have it in order to invoke "+= x" on it? (I'm totally ignorant of Python's
bytecode, I'll be the first to admit)

> (b) compare the result to it using the 'is' operator, and (c) pop some
> stuff off the stack and skip over the assignment if true.  That could
> be done, but the extra test would definitely slow things down.

As was suggested by someone else in the thread I referenced, I was thinking
that a new bytecode would be used to handle this. It has to be faster to do
one test in 'C' code than it is to re-index into a map or even to do the
refcount-twiddling that goes with an unneeded store into a list.

> A worse problem is that it's a semantic change.  For example,
> persistent objects in Zope require (under certain circumstances) that
> if you modify an object that lives in a persistent container, you have
> to store it back into the container in order for the persistency
> mechanism to pick up on the change.  Currently we can rely on a[i]+=x
> > and a.foo+=x to do the assignment.  Under your proposal, we couldn't
> (unless we knew that the item was of an immutable type).

That's right. I would have suggested that for persistent containers, the
object returned carries its own write-back knowledge.

> That is such
> a subtle change in semantics that I don't want to risk it without
> years of transitional warnings.

Hah, code breakage. The purity of the language must not be compromised, at
any cost! Well, ok, if someone's actually using this extra step I guess you
can't change it on a whim...

> Personally, I'd rather accept that if you have a = ([], [], []),
> a[1]+=[2] won't work.  You can always write a[1].extend([2]).

It's your choice, of course. However, it seems a little strange to have
this fundamental operation which is optimized for persistent containers but
doesn't work right -- and (I assert without evidence) must be slower than
necessary -- in some simple cases. The pathological/non-generic cases are
the ones that make me think twice about using the inplace ops at all. They
don't, in fact, "just work", so I have to think carefully about what's
happening to avoid getting myself in trouble.

-Dave





From David Abrahams" <david.abrahams@rcn.com  Wed Jun 12 15:09:49 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 12 Jun 2002 10:09:49 -0400
Subject: [Python-Dev] Rich Comparison from "C" API?
Message-ID: <020c01c2121a$d69c91b0$6501a8c0@boostconsulting.com>

Suppose I want to execute, from "C", the same steps taken by Python in
evaluating the expression

    x <= y

I see no documented "C" API function which can do that. I'm guessing
PyObject_RichCompare[Bool] may do what I want, but since it's undocumented
I assume it's untouchable. Guido?

-Dave

+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+




From tim.one@comcast.net  Wed Jun 12 16:02:50 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 12 Jun 2002 11:02:50 -0400
Subject: [Python-Dev] Rich Comparison from "C" API?
In-Reply-To: <020c01c2121a$d69c91b0$6501a8c0@boostconsulting.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEBHDEAA.tim.one@comcast.net>

[David Abrahams]
> Suppose I want to execute, from "C", the same steps taken by Python in
> evaluating the expression
>
>     x <= y
>
> I see no documented "C" API function which can do that. I'm guessing
> PyObject_RichCompare[Bool] may do what I want, but since it's
> undocumented I assume it's untouchable. Guido?

It doesn't start with an underscore, and is advertised in object.h, so that
it's undocumented just means you didn't yet volunteer a doc patch <wink>.

    /* Return -1 if error; 1 if v op w; 0 if not (v op w). */
    int PyObject_RichCompareBool(PyObject *v, PyObject *w, int op)


where op is one of (also in object.h)

    /* Rich comparison opcodes */
    #define Py_LT 0
    #define Py_LE 1
    #define Py_EQ 2
    #define Py_NE 3
    #define Py_GT 4
    #define Py_GE 5

So the answer to your question is

    int result = PyObject_RichCompareBool(x, y, Py_LE);
    if (result < 0)
        return error_indicator;
    /* now result is true/false */




From guido@python.org  Wed Jun 12 16:13:26 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 12 Jun 2002 11:13:26 -0400
Subject: [Python-Dev] behavior of inplace operations
In-Reply-To: Your message of "Wed, 12 Jun 2002 07:54:29 EDT."
 <018d01c21208$1efadab0$6501a8c0@boostconsulting.com>
References: <00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com> <200206120206.g5C26g005493@pcp02138704pcs.reston01.va.comcast.net>
 <018d01c21208$1efadab0$6501a8c0@boostconsulting.com>
Message-ID: <200206121513.g5CFDQa25012@odiug.zope.com>

> > One problem is that it's really hard to design the bytecode so that
> > this can be implemented.  The problem is that the compiler sees this:
> >
> >    a[i] += x
> >
> > and must compile bytecode that works for all cases: a can be mutable
> > or immutable, and += could return the same or a different object as
> > a[i].  It currently generates code that uses a STORE_SUBSCR opcode
> > (store into a[i]) with the saved value of the object and index used
> > for the BINARY_SUBSCR (load from a[i]) opcode.  It would have to
> > generate additional code to (a) save the object retrieved from a[i]
> 
> Isn't that already lying about on the stack somewhere? Didn't you have to
> have it in order to invoke "+= x" on it? (I'm totally ignorant of Python's
> bytecode, I'll be the first to admit)

Getting that object is the easy part.

> > (b) compare the result to it using the 'is' operator, and (c) pop some
> > stuff off the stack and skip over the assignment if true.  That could
> > be done, but the extra test would definitely slow things down.
> 
> As was suggested by someone else in the thread I referenced, I was thinking
> that a new bytecode would be used to handle this. It has to be faster to do
> one test in 'C' code than it is to re-index into a map or even to do the
> refcount-twiddling that goes with an unneeded store into a list.
> 
> > A worse problem is that it's a semantic change.  For example,
> > persistent objects in Zope require (under certain circumstances) that
> > if you modify an object that lives in a persistent container, you have
> > to store it back into the container in order for the persistency
> > mechanism to pick up on the change.  Currently we can rely on a[i]+=x
> > and a.foo+=x to do the assignment.  Under your proposal, we couldn't
> > (unless we knew that the item was of an immutable type).
> 
> That's right. I would have suggested that for persistent containers, the
> object returned carries its own write-back knowledge.

But that's not how it works.  Giving each container a persistent
object ID is not an option.

> > That is such
> > a subtle change in semantics that I don't want to risk it without
> > years of transitional warnings.
> 
> Hah, code breakage. The purity of the language must not be compromised, at
> any cost! Well, ok, if someone's actually using this extra step I guess you
> can't change it on a whim...
> 
> > Personally, I'd rather accept that if you have a = ([], [], []),
> > a[1]+=[2] won't work.  You can always write a[1].extend([2]).
> 
> It's your choice, of course. However, it seems a little strange to have
> this fundamental operation which is optimized for persistent containers but
> doesn't work right -- and (I assert without evidence) must be slower than
> necessary -- in some simple cases. The pathological/non-generic cases are
> the ones that make me think twice about using the inplace ops at all. They
> don't, in fact, "just work", so I have to think carefully about what's
> happening to avoid getting myself in trouble.

You have a habit of thinking too much instead of using common
sense. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From David Abrahams" <david.abrahams@rcn.com  Wed Jun 12 16:13:25 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 12 Jun 2002 11:13:25 -0400
Subject: [Python-Dev] Rich Comparison from "C" API?
References: <BIEJKCLHCIOIHAGOKOLHCEBHDEAA.tim.one@comcast.net>
Message-ID: <026301c21224$344847b0$6501a8c0@boostconsulting.com>

----- Original Message -----
From: "Tim Peters" <tim.one@comcast.net>
To: "David Abrahams" <david.abrahams@rcn.com>
Cc: <python-dev@python.org>
Sent: Wednesday, June 12, 2002 11:02 AM
Subject: RE: [Python-Dev] Rich Comparison from "C" API?


> [David Abrahams]
> > Suppose I want to execute, from "C", the same steps taken by Python in
> > evaluating the expression
> >
> >     x <= y
> >
> > I see no documented "C" API function which can do that. I'm guessing
> > PyObject_RichCompare[Bool] may do what I want, but since it's
> > undocumented I assume it's untouchable. Guido?
>
> It doesn't start with an underscore, and is advertised in object.h, so that
> it's undocumented just means you didn't yet volunteer a doc patch <wink>.

...and I didn't volunteer a doc patch yet because the source is too
complicated to easily determine if it's exactly what I'm looking for. Umm,
OK, I guess I was looking in the wrong place when poring over the
implementation of PyObject_RichCompare: ceval.c calls PyObject_RichCompare
directly. OK, doc patch coming up.

-Dave






From tim.one@comcast.net  Wed Jun 12 16:28:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 12 Jun 2002 11:28:11 -0400
Subject: [Python-Dev] Rich Comparison from "C" API?
In-Reply-To: <026301c21224$344847b0$6501a8c0@boostconsulting.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHGEBKDEAA.tim.one@comcast.net>

[Tim, claims that PyObject_RichCompareBool isn't documented because
 David hasn't yet submitted a doc patch]

[David Abrahams]
> ...and I didn't volunteer a doc patch yet because the source is too
> complicated to easily determine if it's exactly what I'm looking for.
> Umm, OK, I guess I was looking in the wrong place when poring over the
> implementation of PyObject_RichCompare: ceval.c calls
> PyObject_RichCompare directly. OK, doc patch coming up.

1. You really want PyObject_RichCompareBool in your example, not
   PyObject_RichCompare.  The former is much more efficient in some
   cases (e.g., a total ordering on dicts is much hairier to determine
   than just equality).

2. This is why I keep pulling your leg.  Sometimes you fall for it <wink>.

Thanks for the patch!



From David Abrahams" <david.abrahams@rcn.com  Wed Jun 12 16:35:02 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 12 Jun 2002 11:35:02 -0400
Subject: [Python-Dev] behavior of inplace operations
References: <00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com> <200206120206.g5C26g005493@pcp02138704pcs.reston01.va.comcast.net>              <018d01c21208$1efadab0$6501a8c0@boostconsulting.com>  <200206121513.g5CFDQa25012@odiug.zope.com>
Message-ID: <028701c21227$029a8220$6501a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>

> > That's right. I would have suggested that for persistent containers, the
> > object returned carries its own write-back knowledge.
>
> But that's not how it works.  Giving each container a persistent
> object ID is not an option.

I'm sure this is moot, but I don't think I was suggesting that. I was
suggesting that a persistent container's __getitem__() returns a proxy
object which contains a reference back to the container. You can either
write-back upon modifying the object, or, I suppose, upon __del__(). My
scheme may not work (I don't really understand the Zope requirements or
implementation), but it seems that the existing one is just as vulnerable
in the case of a container of mutable objects:

    x = container_of_lists[2]
    x += [3] # no write-back

-Dave




From David Abrahams" <david.abrahams@rcn.com  Wed Jun 12 16:36:50 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 12 Jun 2002 11:36:50 -0400
Subject: [Python-Dev] Rich Comparison from "C" API?
References: <BIEJKCLHCIOIHAGOKOLHGEBKDEAA.tim.one@comcast.net>
Message-ID: <028801c21227$02c3dc10$6501a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>

> 2. This is why I keep pulling your leg.  Sometimes you fall for it <wink>.

Heh, you may have been kidding but Guido, privately, gave me the old
Obi-wan line.

> Thanks for the patch!

Workin' on it...




From David Abrahams" <david.abrahams@rcn.com  Wed Jun 12 16:48:27 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 12 Jun 2002 11:48:27 -0400
Subject: [Python-Dev] Rich Comparison from "C" API?
References: <BIEJKCLHCIOIHAGOKOLHGEBKDEAA.tim.one@comcast.net>
Message-ID: <02a201c21229$1d6df620$6501a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>

> 1. You really want PyObject_RichCompareBool in your example, not
>    PyObject_RichCompare.  The former is much more efficient in some
>    cases (e.g., a total ordering on dicts is much hairier to determine
>    than just equality).

Now you're really pulling my leg. PyObject_RichCompareBool is just:

/* Return -1 if error; 1 if v op w; 0 if not (v op w). */
int
PyObject_RichCompareBool(PyObject *v, PyObject *w, int op)
{
 PyObject *res = PyObject_RichCompare(v, w, op);
 int ok;

 if (res == NULL)
  return -1;
 ok = PyObject_IsTrue(res);
 Py_DECREF(res);
 return ok;
}


How can that be more efficient than PyObject_RichCompare?




From nas@python.ca  Wed Jun 12 16:58:06 2002
From: nas@python.ca (Neil Schemenauer)
Date: Wed, 12 Jun 2002 08:58:06 -0700
Subject: [Python-Dev] Rich Comparison from "C" API?
In-Reply-To: <026301c21224$344847b0$6501a8c0@boostconsulting.com>; from david.abrahams@rcn.com on Wed, Jun 12, 2002 at 11:13:25AM -0400
References: <BIEJKCLHCIOIHAGOKOLHCEBHDEAA.tim.one@comcast.net> <026301c21224$344847b0$6501a8c0@boostconsulting.com>
Message-ID: <20020612085806.A23915@glacier.arctrix.com>

David Abrahams wrote:
> ...and I didn't volunteer a doc patch yet because the source is too
> complicated to easily determine if it's exactly what I'm looking for.

Dragons be there.  Comparison operations are, I think, the most
complicated part of the interpreter.   Be brave and good luck to you.  :-)

  Neil



From guido@python.org  Wed Jun 12 16:51:53 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 12 Jun 2002 11:51:53 -0400
Subject: [Python-Dev] behavior of inplace operations
In-Reply-To: Your message of "Wed, 12 Jun 2002 11:35:02 EDT."
 <028701c21227$029a8220$6501a8c0@boostconsulting.com>
References: <00ad01c211a9$b12b8600$6501a8c0@boostconsulting.com> <200206120206.g5C26g005493@pcp02138704pcs.reston01.va.comcast.net> <018d01c21208$1efadab0$6501a8c0@boostconsulting.com> <200206121513.g5CFDQa25012@odiug.zope.com>
 <028701c21227$029a8220$6501a8c0@boostconsulting.com>
Message-ID: <200206121551.g5CFprp25331@odiug.zope.com>

> I'm sure this is moot, but I don't think I was suggesting that. I was
> suggesting that a persistent container's __getitem__() returns a proxy
> object which contains a reference back to the container.

That's not sufficiently transparent for some purposes.

> You can either
> write-back upon modifying the object, or, I suppose, upon __del__(). My
> scheme may not work (I don't really understand the Zope requirements or
> implementation), but it seems that the existing one is just as vulnerable
> in the case of a container of mutable objects:
> 
>     x = container_of_lists[2]
>     x += [3] # no write-back

This is a known limitation.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From David Abrahams" <david.abrahams@rcn.com  Wed Jun 12 17:07:55 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 12 Jun 2002 12:07:55 -0400
Subject: [Python-Dev] Rich Comparison from "C" API?
References: <BIEJKCLHCIOIHAGOKOLHOECADEAA.tim.one@comcast.net>
Message-ID: <02cd01c2122b$9fd7aa00$6501a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>

> [David Abrahams]
> > Now you're really pulling my leg. PyObject_RichCompareBool is just:
> > ...
> > How can that be more efficient than PyObject_RichCompare?
>
> I wasn't pulling your leg there, I was simply wrong.  Who can blame me?  You
> never documented this stuff <wink>.

In my patch I left out any mention of efficiency, just in case you were
right <wink><wink>

-D




From tim.one@comcast.net  Wed Jun 12 17:01:10 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 12 Jun 2002 12:01:10 -0400
Subject: [Python-Dev] Rich Comparison from "C" API?
In-Reply-To: <02a201c21229$1d6df620$6501a8c0@boostconsulting.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOECADEAA.tim.one@comcast.net>

[Tim]
> 1. You really want PyObject_RichCompareBool in your example, not
>    PyObject_RichCompare.  The former is much more efficient in some
>    cases (e.g., a total ordering on dicts is much hairier to determine
>    than just equality).


[David Abrahams]
> Now you're really pulling my leg. PyObject_RichCompareBool is just:
> ...
> How can that be more efficient than PyObject_RichCompare?

I wasn't pulling your leg there, I was simply wrong.  Who can blame me?  You
never documented this stuff <wink>.




From fredrik@pythonware.com  Wed Jun 12 17:54:56 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 12 Jun 2002 18:54:56 +0200
Subject: [Python-Dev] urllib.py and 303 redirect
References: <200206112111.g5BLBKo30024@pcp02138704pcs.reston01.va.comcast.net> <20020611232916.GA3126@panix.com> <200206112354.g5BNs0u05206@pcp02138704pcs.reston01.va.comcast.net> <20020612001954.GA10924@panix.com>
Message-ID: <002d01c21231$e4e67080$ced241d5@hagrid>

Aahz wrote:

> > Could I shame you into submitting a patch? :-)
> 
> Not until I figure out how I want to deal with SF not working with Lynx;
> haven't been able to get anyone at SF interested in talking about fixing
> the problem

if you come up with a patch, I'm sure someone can
help you post it to SF.

(didn't someone report that SF worked perfectly fine
with "links", btw?)

</F>




From skip@pobox.com  Wed Jun 12 20:43:18 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 12 Jun 2002 14:43:18 -0500
Subject: [Python-Dev] code coverage updated
Message-ID: <15623.42070.960948.412843@12-248-41-177.client.attbi.com>

I updated the C and Python code coverage information at

    http://manatee.mojam.com/~skip/python/Python/dist/src/

Something changed about how gcov works, so all the .gcov files got dumped
into the top-level directory.  Accordingly, I needed to make a couple
changes to get the tables to display again, and all the .c file info is
jumbled together.  I don't think there are actually any name conflicts in
the Python C source, so the information there should be okay.  If I get a
few more minutes I will see if I can fix the problem.

Skip




From guido@python.org  Wed Jun 12 21:59:26 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 12 Jun 2002 16:59:26 -0400
Subject: [Python-Dev] test_socket failures
Message-ID: <200206122059.g5CKxQa16372@odiug.zope.com>

I know that there are problems with the two new socket tests:
test_timeout and test_socket.  The problems are varied: the tests
assume network access and a working and consistent DNS, they assume
predictable timing, and there are a number of Windows-specific failures
that I'm trying to track down.  Also, when the full test suite is run,
test_socket.py may hang, while in isolation it will work.  (Gosh if
only we had had these unit tests a few years ago.  They bring up all
sorts of issues that are good to know about.)

I'll try to fix these ASAP.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mgilfix@eecs.tufts.edu  Thu Jun 13 00:13:55 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Wed, 12 Jun 2002 19:13:55 -0400
Subject: [Python-Dev] test_socket failures
In-Reply-To: <200206122059.g5CKxQa16372@odiug.zope.com>; from guido@python.org on Wed, Jun 12, 2002 at 04:59:26PM -0400
References: <200206122059.g5CKxQa16372@odiug.zope.com>
Message-ID: <20020612191355.C10542@eecs.tufts.edu>

On Wed, Jun 12 @ 16:59, Guido van Rossum wrote:
> I know that there are problems with the two new socket tests:
> test_timeout and test_socket.  The problems are varied: the tests
> assume network access and a working and consistent DNS, they assume
> predictable timing, and there are a number of Windows-specific failures
> that I'm trying to track down.  Also, when the full test suite is run,
> test_socket.py may hang, while in isolation it will work.  (Gosh if
> only we had had these unit tests a few years ago.  They bring up all
> sorts of issues that are good to know about.)
> 
> I'll try to fix these ASAP.

  Yeah, I hadn't gotten around to checking them within the full test
suite. The version I had sent you was just for commentary <grin>.  I
try to do as much synchronization as possible. I'm sure fiddling
with the synchronization points in the ThreadableTest class in
test_socket.py will do the trick.

  BTW, I've concluded that the unittest module rocks. Just a show
of support here.

                        -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From mwh@python.net  Thu Jun 13 16:00:20 2002
From: mwh@python.net (Michael Hudson)
Date: 13 Jun 2002 16:00:20 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src setup.py,1.89,1.90
In-Reply-To: gvanrossum@users.sourceforge.net's message of "Thu, 13 Jun 2002 07:41:38 -0700"
References: <E17IVn8-0003EL-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <2m7kl3kyuz.fsf@starship.python.net>

gvanrossum@users.sourceforge.net writes:

> Index: setup.py
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/setup.py,v
> retrieving revision 1.89
> retrieving revision 1.90
> diff -C2 -d -r1.89 -r1.90
> *** setup.py	11 Jun 2002 06:22:30 -0000	1.89
> --- setup.py	13 Jun 2002 14:41:32 -0000	1.90
> ***************
> *** 272,275 ****
> --- 272,277 ----
>           exts.append( Extension('xreadlines', ['xreadlinesmodule.c']) )
>   
> +         exts.append( Extension("bits", ["bits.c"]) )
> + 

I'm guessing that wasn't meant to get checked in?

Cheers,
M.

-- 
  /* I'd just like to take this moment to point out that C has all
     the expressive power of two dixie cups and a string.
   */                       -- Jamie Zawinski from the xkeycaps source



From gward@python.net  Thu Jun 13 16:33:02 2002
From: gward@python.net (Greg Ward)
Date: Thu, 13 Jun 2002 11:33:02 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <m37kl5y71z.fsf@mira.informatik.hu-berlin.de>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> <m37kl5y71z.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020613153302.GA1918@gerg.ca>

On 11 June 2002, Martin v. Loewis said:
> It actually only requires different init functions. To support that
> with distutils, you need to tell distutils to generate different
> object files from the same source file, which is probably not
> supported out of the box.

In theory:

  setup(...
        ext_modules=[Extension("foo1", ["foo.c"],
                               define_macros=[('FOOSTYLE', 1)]),
                     Extension("foo2", ["foo.c"],
                               define_macros=[('FOOSTYLE', 2)])])

should work.  See
  http://www.python.org/doc/current/dist/setup-script.html#SECTION000330000000000000000

Untested, YMMV, etc.

        Greg
-- 
Greg Ward - just another Python hacker                  gward@python.net
http://starship.python.net/~gward/
I have the power to HALT PRODUCTION on all TEENAGE SEX COMEDIES!!



From thomas.heller@ion-tof.com  Thu Jun 13 17:03:00 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 13 Jun 2002 18:03:00 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15622.11330.948519.279929@12-248-41-177.client.attbi.com> <m37kl5y71z.fsf@mira.informatik.hu-berlin.de> <20020613153302.GA1918@gerg.ca>
Message-ID: <09da01c212f3$cacd72d0$e000a8c0@thomasnotebook>

From: "Greg Ward" <gward@python.net>
> On 11 June 2002, Martin v. Loewis said:
> > It actually only requires different init functions. To support that
> > with distutils, you need to tell distutils to generate different
> > object files from the same source file, which is probably not
> > supported out of the box.
> 
> In theory:
> 
>   setup(...
>         ext_modules=[Extension("foo1", ["foo.c"],
>                                define_macros=[('FOOSTYLE', 1)]),
>                      Extension("foo2", ["foo.c"],
>                                define_macros=[('FOOSTYLE', 2)])])
> 
> should work.
But not in practice, IIRC.

Because the build process for foo2 will see that foo.o
is up to date already.

Thomas




From jeremy@zope.com  Thu Jun 13 18:38:01 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Thu, 13 Jun 2002 13:38:01 -0400
Subject: [Python-Dev] change to compiler implementations
In-Reply-To: <E17IYMj-0007Sk-00@usw-pr-cvs1.sourceforge.net>
References: <E17IYMj-0007Sk-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <15624.55417.763429.27404@slothrop.zope.com>

I just checked in two sets of changes to distutils.  I refactored in
the implementation of compile() methods and I added some simple
dependency tracking.  I've only been able to test the changes on Unix,
and expect Guido will soon test it with MSVC.  I'd appreciate it if
people with other affected compilers (Borland, Cygwin, EMX) could test
it.

Jeremy




From skip@pobox.com  Thu Jun 13 18:49:04 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Jun 2002 12:49:04 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
Message-ID: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>

I wonder if it would be better to have distutils generate the appropriate
type of makefile and execute that instead of directly building objects and
shared libraries.  This would finesse some of the dependency tracking
problems that pop up frequently.

Skip



From jeremy@zope.com  Thu Jun 13 18:51:30 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Thu, 13 Jun 2002 13:51:30 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
Message-ID: <15624.56226.581221.448980@slothrop.zope.com>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

  SM> I wonder if it would be better to have distutils generate the
  SM> appropriate type of makefile and execute that instead of
  SM> directly building objects and shared libraries.  This would
  SM> finesse some of the dependency tracking problems that pop up
  SM> frequently.

That sounds really complicated.

Jeremy




From guido@python.org  Thu Jun 13 18:51:54 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Jun 2002 13:51:54 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Thu, 13 Jun 2002 12:49:04 CDT."
 <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
Message-ID: <200206131751.g5DHps801904@odiug.zope.com>

[Skip]
> I wonder if it would be better to have distutils generate the
> appropriate type of makefile and execute that instead of directly
> building objects and shared libraries.  This would finesse some of
> the dependency tracking problems that pop up frequently.

But that doesn't work for platforms that don't have a Make.  And while
Windows has one, its file format is completely different, so you'd
have to teach distutils how to write each platform's Makefile format.

-1

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun 13 18:55:13 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Jun 2002 13:55:13 -0400
Subject: [Python-Dev] change to compiler implementations
In-Reply-To: Your message of "Thu, 13 Jun 2002 13:38:01 EDT."
 <15624.55417.763429.27404@slothrop.zope.com>
References: <E17IYMj-0007Sk-00@usw-pr-cvs1.sourceforge.net>
 <15624.55417.763429.27404@slothrop.zope.com>
Message-ID: <200206131755.g5DHtED02030@odiug.zope.com>

> I just checked in two sets of changes to distutils.  I refactored in
> the implementation of compile() methods and I added some simple
> dependency tracking.  I've only been able to test the changes on Unix,
> and expect Guido will soon test it with MSVC.  I'd appreciate it if
> people with other affected compilers (Borland, Cygwin, EMX) could test
> it.

Um, I don't use setup.py with MSVC on Windows.  The MSVC project, for
better or for worse, has entries to build all the extensions we need,
and I don't have any 3rd party extensions I'd like to build.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From paul@prescod.net  Thu Jun 13 19:00:55 2002
From: paul@prescod.net (Paul Prescod)
Date: Thu, 13 Jun 2002 11:00:55 -0700
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
Message-ID: <3D08DDD7.BD8573D8@prescod.net>

Skip Montanaro wrote:
> 
> I wonder if it would be better to have distutils generate the appropriate
> type of makefile and execute that instead of directly building objects and
> shared libraries.  This would finesse some of the dependency tracking
> problems that pop up frequently.

That's what Perl does ("MakeMaker") but I think it just adds another
level of complexity, especially with different "makes" out there doing
different things. 

 Paul Prescod



From thomas.heller@ion-tof.com  Thu Jun 13 19:04:43 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 13 Jun 2002 20:04:43 +0200
Subject: [Python-Dev] Re: [Distutils] change to compiler implementations
References: <E17IYMj-0007Sk-00@usw-pr-cvs1.sourceforge.net> <15624.55417.763429.27404@slothrop.zope.com>
Message-ID: <0ae001c21304$cb9d0f20$e000a8c0@thomasnotebook>

From: "Jeremy Hylton" <jeremy@zope.com>
> I just checked in two sets of changes to distutils.  I refactored in
> the implementation of compile() methods and I added some simple
> dependency tracking.  I've only been able to test the changes on Unix,
> and expect Guido will soon test it with MSVC.  I'd appreciate it if
> people with other affected compilers (Borland, Cygwin, EMX) could test
> it.
> 
I've tested some of my extensions with MSVC, and it works fine.

Thomas




From niemeyer@conectiva.com  Thu Jun 13 19:55:13 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 13 Jun 2002 15:55:13 -0300
Subject: [Python-Dev] doc strings patch
Message-ID: <20020613155513.A15681@ibook.distro.conectiva>

Could someone please check out patch number 568124? It's a simple
patch. OTOH, it's huge, and may break shortly if not applied.

Thank you!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From paul@pfdubois.com  Thu Jun 13 19:59:58 2002
From: paul@pfdubois.com (Paul F Dubois)
Date: Thu, 13 Jun 2002 11:59:58 -0700
Subject: [Distutils] Re: [Python-Dev] change to compiler implementations
In-Reply-To: <200206131755.g5DHtED02030@odiug.zope.com>
Message-ID: <001101c2130c$8dee5fa0$0c01a8c0@NICKLEBY>

Numeric is a suitable stress test on Windows that uses a setup.py.

> -----Original Message-----
> From: distutils-sig-admin@python.org 
> [mailto:distutils-sig-admin@python.org] On Behalf Of Guido van Rossum
> Sent: Thursday, June 13, 2002 10:55 AM
> To: jeremy@zope.com
> Cc: python-dev@python.org; distutils-sig@python.org
> Subject: [Distutils] Re: [Python-Dev] change to compiler 
> implementations
> 
> 
> > I just checked in two sets of changes to distutils.  I refactored in
> > the implementation of compile() methods and I added some simple
> > dependency tracking.  I've only been able to test the changes on Unix,
> > and expect Guido will soon test it with MSVC.  I'd appreciate it if
> > people with other affected compilers (Borland, Cygwin, EMX) could test
> > it.
> 
> Um, I don't use setup.py with MSVC on Windows.  The MSVC 
> project, for better or for worse, has entries to build all 
> the extensions we need, and I don't have any 3rd party 
> extensions I'd like to build.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 
> 
> _______________________________________________
> Distutils-SIG maillist  -  Distutils-SIG@python.org 
> http://mail.python.org/mailman/listinfo/distutils-sig
> 




From guido@python.org  Thu Jun 13 20:04:48 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Jun 2002 15:04:48 -0400
Subject: [Python-Dev] doc strings patch
In-Reply-To: Your message of "Thu, 13 Jun 2002 15:55:13 -0300."
 <20020613155513.A15681@ibook.distro.conectiva>
References: <20020613155513.A15681@ibook.distro.conectiva>
Message-ID: <200206131904.g5DJ4mV04980@odiug.zope.com>

> Could someone please check out patch number 568124? It's a simple
> patch. OTOH, it's huge, and may break shortly if not applied.

Looks good to me, except for the socket module (where I changed some
docstrings).  We need a volunteer to check it in!

--Guido van Rossum (home page: http://www.python.org/~guido/)




From martin@v.loewis.de  Thu Jun 13 20:36:20 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Jun 2002 21:36:20 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
Message-ID: <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> I wonder if it would be better to have distutils generate the appropriate
> type of makefile and execute that instead of directly building objects and
> shared libraries.  This would finesse some of the dependency tracking
> problems that pop up frequently.

It was one of the design principles of distutils to not rely on any
other tools but Python and the C compiler.

Regards,
Martin




From skip@pobox.com  Thu Jun 13 20:43:12 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Jun 2002 14:43:12 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15624.62928.845160.407762@12-248-41-177.client.attbi.com>

    >> I wonder if it would be better to have distutils generate the
    >> appropriate type of makefile and execute that instead...

    Martin> It was one of the design principles of distutils to not rely on
    Martin> any other tools but Python and the C compiler.

Perhaps it's a design principle that needs to be rethought.  If you can
assume the presence of a C compiler I think you can generally assume the
presence of a make tool of some sort.

Skip




From paul-python@svensson.org  Thu Jun 13 20:55:51 2002
From: paul-python@svensson.org (Paul Svensson)
Date: Thu, 13 Jun 2002 15:55:51 -0400 (EDT)
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
In-Reply-To: <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
Message-ID: <Pine.LNX.4.44.0206131552280.7384-100000@familjen.svensson.org>

On Thu, 13 Jun 2002, Skip Montanaro wrote:

>    >> I wonder if it would be better to have distutils generate the
>    >> appropriate type of makefile and execute that instead...
>
>    Martin> It was one of the design principles of distutils to not rely on
>    Martin> any other tools but Python and the C compiler.
>
>Perhaps it's a design principle that needs to be rethought.  If you can
>assume the presence of a C compiler I think you can generally assume the
>presence of a make tool of some sort.
                         ^^^^^^^^^^^^

That's the rub.  The MAKE.EXE mostly found on WinDOS boxen doesn't have
much more than the name in common with the well known Unix tool.

	/Paul




From skip@pobox.com  Thu Jun 13 21:06:30 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Jun 2002 15:06:30 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206131751.g5DHps801904@odiug.zope.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <200206131751.g5DHps801904@odiug.zope.com>
Message-ID: <15624.64326.946056.280597@12-248-41-177.client.attbi.com>

    >> I wonder if it would be better to have distutils generate the
    >> appropriate type of makefile and execute that instead...

    Guido> But that doesn't work for platforms that don't have a Make.  And
    Guido> while Windows has one, its file format is completely different,
    Guido> so you'd have to teach distutils how to write each platform's
    Guido> Makefile format.

I don't see that writing different makefile formats is any harder than
writing different shell commands.  On those systems where you don't have a
make-like tool, either distutils already writes compile and link commands or
it doesn't work at all.  On those systems where you do have a make-like
facility, I see no reason to not use it.  You will get more reliable
dependency checking for one thing.

Skip



From skip@pobox.com  Thu Jun 13 21:10:04 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Jun 2002 15:10:04 -0500
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
In-Reply-To: <3D08DDD7.BD8573D8@prescod.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <3D08DDD7.BD8573D8@prescod.net>
Message-ID: <15624.64540.472905.469106@12-248-41-177.client.attbi.com>

    Paul> That's what Perl does ("MakeMaker") but I think it just adds
    Paul> another level of complexity, especially with different "makes" out
    Paul> there doing different things.

If the extra complexity came with no added benefits I'd agree with you.
However, most makes actually do support a fairly basic common syntax.  Who
cares about %-rules and suffix rules?  Those are only there to make it
easier for humans to maintain Makefiles.  Just generate a brute-force
low-level makefile.  Distutils will then do the right thing in the face of
file edits.

Skip



From gball@cfa.harvard.edu  Thu Jun 13 21:26:24 2002
From: gball@cfa.harvard.edu (Greg Ball)
Date: Thu, 13 Jun 2002 16:26:24 -0400 (EDT)
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
In-Reply-To: <15624.64540.472905.469106@12-248-41-177.client.attbi.com>
Message-ID: <Pine.LNX.4.44.0206131617320.12689-100000@tane.harvard.edu>

On Thu, 13 Jun 2002, Skip Montanaro wrote:

> If the extra complexity came with no added benefits I'd agree with you.
> However, most makes actually do support a fairly basic common syntax.  Who
> cares about %-rules and suffix rules?  Those are only there to make it
> easier for humans to maintain Makefiles.  Just generate a brute-force
> low-level makefile.  Distutils will then do the right thing in the face of
> file edits.

If you're not using sophisticated rules, the job make does is probably no
more complicated than the job of generating a makefile.  You just
construct a dependency graph, then walk over it stat-ing the files,
and running rules where needed.  It's a SMOP ;-)
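
(Roughly like this, say -- an illustration, nothing more:)

    import os

    def out_of_date(target, prereqs):
        if not os.path.exists(target):
            return 1
        t = os.path.getmtime(target)
        for d in prereqs:
            if os.path.getmtime(d) > t:
                return 1
        return 0

    def build(target, deps, rules, built=None):
        # deps maps a target to its prerequisites; rules maps it to a callable.
        if built is None:
            built = {}
        if target in built:
            return
        built[target] = 1
        prereqs = deps.get(target, [])
        for d in prereqs:
            build(d, deps, rules, built)     # bring prerequisites up to date
        if target in rules and out_of_date(target, prereqs):
            rules[target]()                  # run the rule only when stale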

--
Greg Ball
 






From pyth@devel.trillke.net  Thu Jun 13 21:40:04 2002
From: pyth@devel.trillke.net (holger krekel)
Date: Thu, 13 Jun 2002 22:40:04 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15624.62928.845160.407762@12-248-41-177.client.attbi.com>; from skip@pobox.com on Thu, Jun 13, 2002 at 02:43:12PM -0500
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
Message-ID: <20020613224004.F6609@prim.han.de>

Skip Montanaro wrote:
> 
>     >> I wonder if it would be better to have distutils generate the
>     >> appropriate type of makefile and execute that instead...
> 
>     Martin> It was one of the design principles of distutils to not rely on
>     Martin> any other tools but Python and the C compiler.
> 
> Perhaps it's a design principle that needs to be rethought.  If you can
> assume the presence of a C compiler I think you can generally assume the
> presence of a make tool of some sort.

isn't it funny that 'scons' as a *build system* doesn't rely on anything
but python? I've heard rumors they even check dependencies<wink>...

    holger



From martin@v.loewis.de  Thu Jun 13 22:00:15 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Jun 2002 23:00:15 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
Message-ID: <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> Perhaps it's a design principle that needs to be rethought.  If you can
> assume the presence of a C compiler I think you can generally assume the
> presence of a make tool of some sort.

Maybe - although it removes reliability from the build process if you
need to rely on locating another tool. For example, on Solaris, you
could run into either the vendor's make, or GNU make.

Also, it appears that nothing is gained by using make.

Regards,
Martin



From martin@v.loewis.de  Thu Jun 13 22:03:36 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Jun 2002 23:03:36 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15624.64326.946056.280597@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <200206131751.g5DHps801904@odiug.zope.com>
 <15624.64326.946056.280597@12-248-41-177.client.attbi.com>
Message-ID: <m3fzzqyjpz.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> You will get more reliable dependency checking for one thing.

I doubt that. To get that checking, you need to tell make what the
dependencies are - it won't automatically assume anything except that
object files depend on their input sources.

In particular, you will *not* get dependencies on header files - in my
experience, those are the biggest source of build problems. If you add
a scanning procedure for finding header files, you can integrate this
into distutils' dependency mechanisms just as well as you can generate
five different makefile formats from those data.
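
(Something along these lines -- a rough illustration, not distutils code --
would find the local headers a source file pulls in, and that information
could feed the newer-than checks distutils already performs:)

    import re, os

    _include = re.compile(r'^[ \t]*#[ \t]*include[ \t]+"([^"]+)"', re.M)

    def local_headers(source, include_dirs=('.',)):
        # One-level scan for #include "..." lines; <...> headers are ignored.
        deps = []
        for name in _include.findall(open(source).read()):
            for d in include_dirs:
                path = os.path.join(d, name)
                if os.path.exists(path):
                    deps.append(path)
                    break
        return deps

    def needs_recompile(source, obj, include_dirs=('.',)):
        if not os.path.exists(obj):
            return 1
        obj_mtime = os.path.getmtime(obj)
        for f in [source] + local_headers(source, include_dirs):
            if os.path.getmtime(f) > obj_mtime:
                return 1
        return 0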

Regards,
Martin



From skip@pobox.com  Thu Jun 13 22:05:53 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Jun 2002 16:05:53 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <20020613224004.F6609@prim.han.de>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <20020613224004.F6609@prim.han.de>
Message-ID: <15625.2353.853651.314565@12-248-41-177.client.attbi.com>

    holger> isn't it funny that 'scons' as a *build system* doesn't rely on
    holger> anything but python? I've heard rumors they even check
    holger> dependencies<wink>...

Scons would be fine by me.  It doesn't rely on a C compiler, but if you want
to build something that needs to be compiled I suspect you'd need one.

Skip



From skip@pobox.com  Thu Jun 13 22:08:05 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Jun 2002 16:08:05 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15625.2485.994408.888814@12-248-41-177.client.attbi.com>

    Martin> Also, it appears that nothing is gained by using make.

That's incorrect.  Distutils is not a make substitute and I doubt it ever
will be.  What dependency checking it does do is incomplete and this gives
rise to problems that are reported fairly frequently.

Skip



From pyth@devel.trillke.net  Thu Jun 13 22:11:36 2002
From: pyth@devel.trillke.net (holger krekel)
Date: Thu, 13 Jun 2002 23:11:36 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15625.2353.853651.314565@12-248-41-177.client.attbi.com>; from skip@pobox.com on Thu, Jun 13, 2002 at 04:05:53PM -0500
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <20020613224004.F6609@prim.han.de> <15625.2353.853651.314565@12-248-41-177.client.attbi.com>
Message-ID: <20020613231136.H6609@prim.han.de>

Skip Montanaro wrote:
> 
>     holger> isn't it funny that 'scons' as a *build system* doesn't rely on
>     holger> anything but python? I've heard rumors they even check
>     holger> dependencies<wink>...
> 
> Scons would be fine by me.  It doesn't rely on a C compiler, but if you want
> to build something that needs to be compiled I suspect you'd need one.

I didn't mean to include or require scons for distutils.  I was just
making the obvious point that for the dependency task at hand
Python should be powerful enough.  Reusing some code from scons might
be worthwhile, though.

why-use-a-car-when-you-can-beam-ly y'rs, holger



From martin@v.loewis.de  Thu Jun 13 23:13:50 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jun 2002 00:13:50 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
Message-ID: <m3660myggx.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> That's incorrect.  Distutils is not a make substitute and I doubt it ever
> will be.  What dependency checking it does do is incomplete and this gives
> rise to problems that are reported fairly frequently.

Can you provide a specific example to support this criticism? Could
you also explain how generating makefiles would help?

Regards,
Martin




From paul@prescod.net  Fri Jun 14 01:09:28 2002
From: paul@prescod.net (Paul Prescod)
Date: Thu, 13 Jun 2002 17:09:28 -0700
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com>
Message-ID: <3D093438.92B46349@prescod.net>

Skip Montanaro wrote:
> 
>...
> 
> If the extra complexity came with no added benefits I'd agree with you.

I guess most of us don't understand the benefits because we don't see
dependency tracking as necessarily that difficult. It's no harder than
the new method resolution order. ;)

Jeremy says he has already started implementing dependency tracking.
Would switching strategies to using make actually get us anywhere faster
or easier?

> However, most makes actually do support a fairly basic common syntax.  Who
> cares about %-rules and suffix rules?  Those are only there to make it
> easier for humans to maintain Makefiles.  Just generate a brute-force
> low-level makefile.  Distutils will then do the right thing in the face of
> file edits.

Okay, so if we want distutils to handle ".i" files for SWIG
(it does) and ".pyx" files for Pyrex (it should), then we have to
generate rules for those too.

 Paul Prescod



From goodger@users.sourceforge.net  Fri Jun 14 01:37:04 2002
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 13 Jun 2002 20:37:04 -0400
Subject: [Python-Dev] Design Patterns quick reference (was Re: textwrap.py)
Message-ID: <B92EB2F0.2462E%goodger@users.sourceforge.net>

Greg Ward wrote:
> design patterns are great, as long as everyone has a copy of *Design
> Patterns* on their desk.  ;-)

For those of us who don't, here's a free and nearly-complete (20
patterns) quick reference:
http://www.netobjectives.com/dpexplained/download/dpmatrix.pdf .
Printed double-sided, it makes a good memory jogger.

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/




From skip@pobox.com  Fri Jun 14 01:53:52 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Jun 2002 19:53:52 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <m3660myggx.fsf@mira.informatik.hu-berlin.de>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15625.16032.161304.357298@12-248-41-177.client.attbi.com>

    >> That's incorrect.  Distutils is not a make substitute and I doubt it
    >> ever will be.  What dependency checking it does do is incomplete and
    >> this gives rise to problems that are reported fairly frequently.

    Martin> Can you provide a specific example to support this criticism?
    Martin> Could you also explain how generating makefiles would help?

From python-list on June 10 (this is what made me wish yet again for better
dependency checking):

    http://mail.python.org/pipermail/python-list/2002-June/108153.html

It's clear nobody but me wants this, though I find it hard to believe most
of you haven't been burned in the past the same way the above poster was.
Frequently, after executing "cvs up" I see almost all of the Python core
rebuild because some commonly used header file was modified, yet find that
distutils rebuilds nothing.  If a header file is modified which causes most
of Objects and Python to be rebuilt, but nothing in Modules is rebuilt, I'm
immediately suspicious.

Here's a simple test you can perform at home.  Build Python.  Touch
Include/Python.h.  Run make again.  Notice how the core files are all
rebuilt but no modules are.  Touch Modules/dbmmodule.c (or something else
that builds).  Run make again.

I'm simply going to stop worrying about it and just keep deleting all the
stuff distutils builds to make sure I get correct shared libraries.

Skip



From skip@pobox.com  Fri Jun 14 01:54:32 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Jun 2002 19:54:32 -0500
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
In-Reply-To: <3D093438.92B46349@prescod.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <3D08DDD7.BD8573D8@prescod.net>
 <15624.64540.472905.469106@12-248-41-177.client.attbi.com>
 <3D093438.92B46349@prescod.net>
Message-ID: <15625.16072.900596.114938@12-248-41-177.client.attbi.com>

    Paul> I guess most of us don't understand the benefits because we don't
    Paul> see dependency tracking as necessarily that difficult. It's no
    Paul> harder than the new method resolution order. ;)

If it's not that difficult why isn't it being done? <no wink>

Skip



From guido@python.org  Fri Jun 14 03:05:00 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Jun 2002 22:05:00 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Thu, 13 Jun 2002 19:53:52 CDT."
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
Message-ID: <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>

> Here's a simple test you can perform at home.  Build Python.  Touch
> Include/Python.h.  Run make again.  Notice how the core files are all
> rebuilt but no modules are.  Touch Modules/dbmmodule.c (or something else
> that builds).  Run make again.

Most of the time most of the rebuilds of the core are unnecessary.

> I'm simply going to stop worrying about it and just keep deleting all the
> stuff distutils builds to make sure I get correct shared libraries.

Because we are religious about binary backwards compatibility, it is
very rare that a change to a header file requires that extensions be
recompiled.  But when this happens, it is often the last thing we
think of when debugging. :-(

I think the conclusion from this thread is that it's not the checking
of dependencies which is the problem.  (Jeremy just added this to
distutils.)  It is the specification of which files are dependent on
which others that is a pain.  I think that with Jeremy's changes it
would not be hard to add a rule to our setup.py that makes all
extensions dependent on all .h files in the Include directory -- a
reasonable approximation of the rule that the main Makefile uses.
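
For concreteness, such a rule might look roughly like this in setup.py --
a sketch only, assuming the new dependency hook is the 'depends' argument
to Extension and using a made-up module name:

    from glob import glob
    from distutils.core import Extension

    # Coarse but safe: treat every public header as a dependency of
    # every extension, mirroring the main Makefile's behaviour.
    python_headers = glob('Include/*.h')

    ext = Extension('spam', ['spammodule.c'],
                    depends=python_headers)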

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 14 03:05:36 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Jun 2002 22:05:36 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Thu, 13 Jun 2002 19:54:32 CDT."
 <15625.16072.900596.114938@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> <3D093438.92B46349@prescod.net>
 <15625.16072.900596.114938@12-248-41-177.client.attbi.com>
Message-ID: <200206140205.g5E25av27450@pcp02138704pcs.reston01.va.comcast.net>

> If it's not that difficult why isn't it being done? <no wink>

It's done.  Jeremy added it today.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas@python.ca  Fri Jun 14 03:09:51 2002
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 13 Jun 2002 19:09:51 -0700
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15625.16072.900596.114938@12-248-41-177.client.attbi.com>; from skip@pobox.com on Thu, Jun 13, 2002 at 07:54:32PM -0500
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> <3D093438.92B46349@prescod.net> <15625.16072.900596.114938@12-248-41-177.client.attbi.com>
Message-ID: <20020613190951.A30383@glacier.arctrix.com>

Skip Montanaro wrote:
> If it's not that difficult why isn't it being done? <no wink>

I think the hard part is getting the dependency information.  Using it
is trivial.  'make' does not help solve the former problem.  Speaking as
someone who spent time hacking on the Python Makefile, avoid 'make'.
The portable subset is limited and sucky.

  Neil



From jeremy@zope.com  Thu Jun 13 23:26:39 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Thu, 13 Jun 2002 18:26:39 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15624.64326.946056.280597@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <200206131751.g5DHps801904@odiug.zope.com>
 <15624.64326.946056.280597@12-248-41-177.client.attbi.com>
Message-ID: <15625.7199.102456.85532@slothrop.zope.com>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

  SM> I don't see that writing different makefile formats is any
  SM> harder than writing different shell commands.  On those systems
  SM> where you don't have a make-like tool, either distutils already
  SM> writes compile and link commands or it doesn't work at all.  On
  SM> those systems where you do have a make-like facility, I see no
  SM> reason to not use it.  You will get more reliable dependency
  SM> checking for one thing.

Only if distutils grows a way to specify all those dependencies.  Once
you've specified them, I'm not sure why it is difficult to check them
in Python code instead of relying on make.

i'm-probably-naive-ly y'rs,
Jeremy




From skip@pobox.com  Fri Jun 14 04:48:07 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Jun 2002 22:48:07 -0500
Subject: [Python-Dev] Updates to bsddb and dbm module build process
Message-ID: <15625.26487.865171.411458@12-248-41-177.client.attbi.com>

Would someone like to take a look at the diff file attached to this patch?

    http://python.org/sf/553108

I think it's complete except for a note in Misc/NEWS.

I'd assign it to Barry, but I understand he's a bit busy these days.

Skip



From martin@v.loewis.de  Fri Jun 14 08:04:18 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jun 2002 09:04:18 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
Message-ID: <m3660m8hot.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

>     Martin> Can you provide a specific example to support this criticism?
>     Martin> Could you also explain how generating makefiles would help?
> 
> From python-list on June 10 (this is what made me wish yet again for better
> dependency checking):
> 
>     http://mail.python.org/pipermail/python-list/2002-June/108153.html

That does not answer my question: How would generating a makefile have
helped?

Notice that setup.py *will* regenerate nis.so if nis.c changes. The OP
is right that it refused to do so because, meanwhile, he had changed
Setup to build nismodule.so instead.

This is where the real problem lies: building modules via makesetup
generates foomodule.so, whereas building modules via setup.py builds
foo.so. This needs to be fixed, and I feel that setup.py is right
and makesetup is wrong.

> It's clear nobody but me wants this

Hard to tell, since I still don't quite get what "this" is. Generating
makefiles: certainly I don't want this. The reason is not that I think
there are no problems - I think that generating makefiles will not
solve these problems.

> though I find it hard to believe most of you haven't been burned in
> the past the same way the above poster was.

The specific problem comes from building shared modules through
Setup. I never do this, so I have not been burned by that.

> Frequently, after executing "cvs up" I see almost all of the Python core
> rebuild because some commonly used header file was modified, yet find that
> distutils rebuilds nothing.  

That I noticed. It has nothing to do with the article you quote,
though, and I question whether generating makefiles would help.

I routinely rm -rf build when I see that some common header has
changed.

> Here's a simple test you can perform at home.  Build Python.  Touch
> Include/Python.h.  Run make again.  Notice how the core files are all
> rebuilt but no modules are.  Touch Modules/dbmmodule.c (or something else
> that builds).  Run make again.

I can reproduce your observations. I still don't see how generating
makefiles will help to solve this problem.

Regards,
Martin



From martin@v.loewis.de  Fri Jun 14 08:10:04 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jun 2002 09:10:04 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15625.7199.102456.85532@slothrop.zope.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <200206131751.g5DHps801904@odiug.zope.com>
 <15624.64326.946056.280597@12-248-41-177.client.attbi.com>
 <15625.7199.102456.85532@slothrop.zope.com>
Message-ID: <m31yba8hf7.fsf@mira.informatik.hu-berlin.de>

Jeremy Hylton <jeremy@zope.com> writes:

> Only if distutils grows a way to specify all those dependencies.  Once
> you've specified them, I'm not sure why it is difficult to check them
> in Python code instead of relying on make.

I believe people normally want their build process to know
dependencies without any specification of dependencies. Instead, the
build process should know what the dependencies are by looking at the
source files.

For C, there are two ways to do that: you can either scan the sources
yourself for include statements, or you can let the compiler dump
dependency lists into files. 

The latter is only supported for some compilers, but it would help
enormously: when compiling the first time, you know for sure that you
will need to compile. When compiling the second time, you read the
dependency information generated the first time, to determine whether
any of the included headers has changed. If that is not the case, you
can skip rebuilding. If you do rebuild, the dependency information
will be updated automatically (since the change might have been to add
an include).
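
A rough sketch of how the second approach could be consumed from Python
(illustrative only; the file format assumed here is the .d file that
gcc writes when given -MD):

    import os

    def needs_rebuild(obj, depfile):
        # True if the object file is missing, or older than any file
        # recorded in the compiler-generated dependency file.
        if not (os.path.exists(obj) and os.path.exists(depfile)):
            return 1
        # A .d file looks like:  foo.o: foo.c foo.h bar.h \
        #                        baz.h
        text = open(depfile).read().replace('\\\n', ' ')
        deps = text.split(':', 1)[1].split()
        obj_mtime = os.path.getmtime(obj)
        for dep in deps:
            if os.path.exists(dep) and os.path.getmtime(dep) > obj_mtime:
                return 1
        return 0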

Regards,
Martin



From martin@v.loewis.de  Fri Jun 14 08:11:00 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jun 2002 09:11:00 +0200
Subject: [Python-Dev] addressing distutils inability to track file  dependencies
In-Reply-To: <15625.16072.900596.114938@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <3D08DDD7.BD8573D8@prescod.net>
 <15624.64540.472905.469106@12-248-41-177.client.attbi.com>
 <3D093438.92B46349@prescod.net>
 <15625.16072.900596.114938@12-248-41-177.client.attbi.com>
Message-ID: <m3wut272t7.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

>     Paul> I guess most of us don't understand the benefits because we don't
>     Paul> see dependency tracking as necessarily that difficult. It's no
>     Paul> harder than the new method resolution order. ;)
> 
> If it's not that difficult why isn't it being done? <no wink>

You are wrong to assume it is not done: distutils has done dependency
analysis since day 1.

Regards,
Martin




From martin@strakt.com  Fri Jun 14 08:44:58 2002
From: martin@strakt.com (Martin Sjögren)
Date: Fri, 14 Jun 2002 09:44:58 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020614074458.GA31022@strakt.com>

On Thu, Jun 13, 2002 at 10:05:00PM -0400, Guido van Rossum wrote:
> I think the conclusion from this thread is that it's not the checking
> of dependencies which is the problem.  (Jeremy just added this to
> distutils.)  It is the specification of which files are dependent on
> which others that is a pain.  I think that with Jeremy's changes it
> would not be hard to add a rule to our setup.py that makes all
> extensions dependent on all .h files in the Include directory -- a
> reasonable approximation of the rule that the main Makefile uses.

I for one would love to have dependencies in my extension modules; I
usually end up deleting the build directory whenever I've changed a header
file :(

How about something like this:

  Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c':
  ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']})

though there is the problem of backwards compatibility :/


Just my two cents,
Martin

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 7710870       Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html



From mal@lemburg.com  Fri Jun 14 09:04:24 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 14 Jun 2002 10:04:24 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com>
Message-ID: <3D09A388.8080107@lemburg.com>

Martin Sjögren wrote:
> On Thu, Jun 13, 2002 at 10:05:00PM -0400, Guido van Rossum wrote:
> 
>>I think the conclusion from this thread is that it's not the checking
>>of dependencies which is the problem.  (Jeremy just added this to
>>distutils.)  It is the specification of which files are dependent on
>>which others that is a pain.  I think that with Jeremy's changes it
>>would not be hard to add a rule to our setup.py that makes all
>>extensions dependent on all .h files in the Include directory -- a
>>reasonable approximation of the rule that the main Makefile uses.
> 
> 
> I for one would love to have dependencies in my extension modules; I
> usually end up deleting the build directory whenever I've changed a header
> file :(
> 
> How about something like this:
> 
>   Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c':
>   ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']})
> 
> though there is the problem of backwards compatibility :/

Just curious:

distutils, as the name says, is a tool for distributing
source code; that doesn't have much to do with developing code
where dependency analysis is nice to have since it saves compile
time.

When distributing code, the standard setup is that a user unzips
the distutils created archive, types "python setup.py install"
and that's it. Dependency analysis doesn't gain him anything.

Now if you want to use distutils in the development
process then you have a different mindset and therefore
need different tools like e.g. scons or make (+ makedep, etc.).

The question is whether we want distutils to be a development
tool as well, or rather stick to its main purpose: that of
simplifying distribution and installation of software (and
thanks to Greg, it's great at that !).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From aleax@aleax.it  Fri Jun 14 09:42:33 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 14 Jun 2002 10:42:33 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <3D09A388.8080107@lemburg.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com>
Message-ID: <E17Imfl-00046Q-00@mail.python.org>

On Friday 14 June 2002 10:04 am, M.-A. Lemburg wrote:
	...
> distutils, as the name says, is a tool for distributing
> source code; that doesn't have much to do with developing code
> where dependency analysis is nice to have since it saves compile

However, distutils is already today the handiest building environment, 
particularly if your extension needs to support several platforms and/or 
several versions of Python.

> The question is whether we want distutils to be a development
> tool as well, or rather stick to its main purpose: that of
> simplifying distribution and installation of software (and
> thanks to Greg, it's great at that !).

The "problem" (:-) is that it's great at just building extensions, too.

python2.1 setup.py install, python2.2 setup.py install, python2.3 setup.py 
install, and hey pronto, I have my extension built and installed on all 
Python versions I want to support, ready for testing.  Hard to beat!-)


Alex



From fredrik@pythonware.com  Fri Jun 14 10:45:28 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 14 Jun 2002 11:45:28 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org>
Message-ID: <005a01c21388$387cd3e0$0900a8c0@spiff>

alex wrote:

> The "problem" (:-) is that it's great at just building extensions, =
too.
>=20
> python2.1 setup.py install, python2.2 setup.py install, python2.3 =
setup.py=20
> install, and hey pronto, I have my extension built and installed on =
all=20
> Python versions I want to support, ready for testing.  Hard to beat!-)

does your code always work right away?

I tend to use an incremental approach, with lots of edit-compile-run
cycles.  I still haven't found a way to get the damn thing to just build
my extension and copy it to the current directory, so I can run the
test scripts.

does anyone here know how to do that, without having to resort to
ugly wrapper batch files/shell scripts?

(distutils is also a pain to use with a version management system
that marks files in the repository as read-only; distutils copy function
happily copies all the status bits. but the remove function refuses to
remove files that are read-only, even if the files have been created
by distutils itself...)

</F>




From mwh@python.net  Fri Jun 14 10:48:54 2002
From: mwh@python.net (Michael Hudson)
Date: 14 Jun 2002 10:48:54 +0100
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: martin@v.loewis.de's message of "14 Jun 2002 09:10:04 +0200"
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <200206131751.g5DHps801904@odiug.zope.com> <15624.64326.946056.280597@12-248-41-177.client.attbi.com> <15625.7199.102456.85532@slothrop.zope.com> <m31yba8hf7.fsf@mira.informatik.hu-berlin.de>
Message-ID: <2mofeeyyux.fsf@starship.python.net>

martin@v.loewis.de (Martin v. Loewis) writes:

> Jeremy Hylton <jeremy@zope.com> writes:
> 
> > Only if distutils grows a way to specify all those dependencies.  Once
> > you've specified them, I'm not sure why it is difficult to check them
> > in Python code instead of relying on make.
> 
> I believe people normally want their build process to know
> dependencies without any specification of dependencies. Instead, the
> build process should know what the dependencies are by looking at the
> source files.
> 
> For C, there are two ways to do that: you can either scan the sources
> yourself for include statements, or you can let the compiler dump
> dependency lists into files. 
> 
> The latter is only supported for some compilers, but it would help
> enourmously: when compiling the first time, you know for sure that you
> will need to compile. When compiling the second time, you read the
> dependency information generated the first time, to determine whether
> any of the included headers has changed. If that is not the case, you
> can skip rebuilding. If you do rebuild, the dependency information
> will be updated automatically (since the change might have been to add
> an include).

$ cd ~/src/sf/python/dist/src/Lib/distutils/command/
$ ls -l build_dep.py
-rw-rw-r--    1 mwh      mwh           763 Apr 13 11:18 build_dep.py

Had that idea.  Didn't get very far with it, though.  Maybe on the
train to EuroPython...

Cheers,
M.

-- 
  The gripping hand is really that there are morons everywhere, it's
  just that the Americon morons are funnier than average.
                              -- Pim van Riezen, alt.sysadmin.recovery



From mwh@python.net  Fri Jun 14 11:10:46 2002
From: mwh@python.net (Michael Hudson)
Date: 14 Jun 2002 11:10:46 +0100
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: "Fredrik Lundh"'s message of "Fri, 14 Jun 2002 11:45:28 +0200"
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff>
Message-ID: <2m7kl2npax.fsf@starship.python.net>

"Fredrik Lundh" <fredrik@pythonware.com> writes:

> alex wrote:
> 
> > The "problem" (:-) is that it's great at just building extensions, =
> too.
> >=20
> > python2.1 setup.py install, python2.2 setup.py install, python2.3 =
> setup.py=20
> > install, and hey pronto, I have my extension built and installed on =
> all=20
> > Python versions I want to support, ready for testing.  Hard to beat!-)
> 
> does your code always work right away?
> 
> I tend to use an incremental approach, with lots of edit-compile-run
> cycles.  I still haven't found a way to get the damn thing to just build
> my extension and copy it to the current directory, so I can run the
> test scripts.
> 
> does anyone here know how to do that, without having to resort to
> ugly wrapper batch files/shell scripts?

Nope.  I guess 

class install_local(distutils.command.install.install):
    ...

would be one way.  Perhaps it should be built in.

> (distutils is also a pain to use with a version management system
> that marks files in the repository as read-only; distutils copy function
> happily copies all the status bits. but the remove function refuses to
> remove files that are read-only, even if the files have been created
> by distutils itself...)

Yeah, this area sucks.  It interacts v. badly with umask, too.  Maybe
I'll work on this bug instead on my next train journey... installing
shared libraries with something like copy_tree is gross.

Cheers,
M.

-- 
  Counting lines is probably a good idea if you want to print it out
  and are short on paper, but I fail to see the purpose otherwise.
                                        -- Erik Naggum, comp.lang.lisp



From thomas.heller@ion-tof.com  Fri Jun 14 11:56:43 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 14 Jun 2002 12:56:43 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff>
Message-ID: <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook>

> does your code always work right away?
> 
> I tend to use an incremental approach, with lots of edit-compile-run
> cycles.  I still haven't found a way to get the damn thing to just build
> my extension and copy it to the current directory, so I can run the
> test scripts.
> 
> does anyone here know how to do that, without having to resort to
> ugly wrapper batch files/shell scripts?
> 
> (distutils is also a pain to use with a version management system
> that marks files in the repository as read-only; distutils copy function
> happily copies all the status bits. but the remove function refuses to
> remove files that are read-only, even if the files have been created
> by distutils itself...)
> 
> </F>
> 

setup.py install --install-lib=.

Thomas




From aleax@aleax.it  Fri Jun 14 12:03:39 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 14 Jun 2002 13:03:39 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <005a01c21388$387cd3e0$0900a8c0@spiff>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff>
Message-ID: <E17IosG-0001Wd-00@mail.python.org>

On Friday 14 June 2002 11:45 am, Fredrik Lundh wrote:
> alex wrote:
> > The "problem" (:-) is that it's great at just building extensions, too.
> >
> > python2.1 setup.py install, python2.2 setup.py install, python2.3
> > setup.py install, and hey pronto, I have my extension built and installed
> > on all Python versions I want to support, ready for testing.  Hard to
> > beat!-)
>
> does your code always work right away?

Never!  As the tests fail and problems are identified, I edit the sources,
and redo the setup.py install on one or more of the Python versions.

> I tend to use an incremental approach, with lots of edit-compile-run

Me too.  Iterative and incremental is highly productive.

> cycles.  I still haven't found a way to get the damn thing to just build
> my extension and copy it to the current directory, so I can run the
> test scripts.

I haven't even looked for such a way, since going to site-packages is
no problem for me.  If I was developing on a Python installation shared
by several users I'd no doubt feel differently about it.

> (distutils is also a pain to use with a version management system
> that marks files in the repository as read-only; distutils copy function

Many things are.  Fortunately, cvs, for all of its problems, doesn't do the
readonly thing :-).


Alex



From fredrik@pythonware.com  Fri Jun 14 12:20:18 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 14 Jun 2002 13:20:18 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <200206141059.g5EAxLI31419@pythonware.com>
Message-ID: <016901c21395$dc9ee190$0900a8c0@spiff>

alex wrote:
> > cycles.  I still haven't found a way to get the damn thing to just build
> > my extension and copy it to the current directory, so I can run the
> > test scripts.
> 
> I haven't even looked for such a way, since going to site-packages is
> no problem for me.  If I was developing on a Python installation shared
> by several users I'd no doubt feel differently about it.

you only work on a single project too, I assume.

I tend to prefer not to install a broken extension in my machine's
default install, in case I have to switch to another project...  (and
switching between projects is all I seem to do these days ;-)

(and I maintain too many modules to afford to install a separate
python interpreter for each one of them...)

</F>




From fredrik@pythonware.com  Fri Jun 14 12:23:06 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 14 Jun 2002 13:23:06 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook>
Message-ID: <016a01c21395$dca1eed0$0900a8c0@spiff>

thomas wrote:
> > 
> > does anyone here know how to do that, without having to resort to
> > ugly wrapper batch files/shell scripts?
> > 
> > (distutils is also a pain to use with a version management system
> > that marks files in the repository as read-only; distutils copy function
> > happily copies all the status bits. but the remove function refuses to
> > remove files that are read-only, even if the files have been created
> > by distutils itself...)
> 
> setup.py install --install-lib=.

doesn't work: distutils ends up trying to overwrite (readonly)
original source files.

consider PIL, for example: in my source directory, I have the
following files, checked out from a repository:

    setup.py
    _imaging.c
    *.c
    PIL/*.py

I want to be able to run setup.py and end up with an _imaging.pyd
in the same directory.  I don't want distutils to attempt to copy
stuff from PIL/*.py to PIL/*.py, mess up other parts of my source
tree, install any scripts (broken or not) in the Python directory, or
just generally make an ass of itself when failing to copy readonly
files on top of other readonly files.

the following is a bit more reliable (windows version):

    rd /s /q build
    python setup.py build
    rd /s /q install
    python setup.py install --prefix install
    copy install\*.pyd .

if distutils didn't mess up when deleting readonly files it created all
by itself, the following command could perhaps work:

    setup.py install_ext --install-lib=.

but there is no install_ext command in the version I have...

</F>




From fredrik@pythonware.com  Fri Jun 14 12:25:38 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 14 Jun 2002 13:25:38 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
Message-ID: <019201c21396$36c9eb60$0900a8c0@spiff>

> the following is a bit more reliable (windows version):
> 
>     rd /s /q build
>     python setup.py build
>     rd /s /q install
>     python setup.py install --prefix install
>     copy install\*.pyd .

except that the PYD ends up under install\lib\site-packages
in some versions of distutils, of course...

brute-force workaround:

     copy install\*.pyd .
     copy install\lib\site-packages\*.pyd .

</F>




From aleax@aleax.it  Fri Jun 14 12:36:37 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 14 Jun 2002 13:36:37 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <016901c21395$dc9ee190$0900a8c0@spiff>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <200206141059.g5EAxLI31419@pythonware.com> <016901c21395$dc9ee190$0900a8c0@spiff>
Message-ID: <E17IpNl-0005M5-00@mail.python.org>

On Friday 14 June 2002 01:20 pm, Fredrik Lundh wrote:
> alex wrote:
> > > cycles.  I still haven't found a way to get the damn thing to just
> > > build my extension and copy it to the current directory, so I can run
> > > the test scripts.
> >
> > I haven't even looked for such a way, since going to site-packages is
> > no problem for me.  If I was developing on a Python installation shared
> > by several users I'd no doubt feel differently about it.
>
> you only work on a single project too, I assume.

You know what they say about "assume"...?  In all fairness, I would say a 
substantial defect I have is to tend to try and juggle too MANY things at the 
same time.  "a single project", *INDEED*...!

> I tend to prefer not to install a broken extension in my machine's
> default install, in case I have to switch to another project...  (and
> switching between projects is all I seem to do these days ;-)

I guess it comes down to being a well-organized person.  I'm not, and a 
couple of imperfect extensions in various site-packages doesn't importantly 
increase the already-high entropy of my working environment.  Sure, it 
_would_ be even better if the extensions could be in some other 
distinguishable place until they're working -- but "the local directory" 
isn't such a place (no per-Python-version distinction in that case, and 
per-platform is also important -- I do most of my Windows work these days in 
a win4lin setup, same machine, disk, screen, &c, as my main Linux box -- 
ditto for cygwin inside that virtual Windows box inside that Linux box).


Alex



From thomas.heller@ion-tof.com  Fri Jun 14 12:39:29 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 14 Jun 2002 13:39:29 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff>
Message-ID: <0ef701c21398$24fc81c0$e000a8c0@thomasnotebook>

From: "Fredrik Lundh" <fredrik@pythonware.com>
> consider PIL, for example: in my source directory, I have the
> following files, checked out from a repository:
>
>     setup.py
>     _imaging.c
>     *.c
>     PIL/*.py
>
> I want to be able to run setup.py and end up with an _imaging.pyd
> in the same directory.  I don't want distutils to attempt to copy
> stuff from PIL/*.py to PIL/*.py, mess up other parts of my source
> tree, install any scripts (broken or not) in the Python directory, or
> just generally make an ass of itself when failing to copy readonly
> files on top of other readonly files.
>

Then there's Berthold Höllmann's test command he posted to the
distutils sig, which internally runs the 'build' command, then extends
sys.path by build_purelib, build_platlib, and the test-directory, and
finally runs the tests in the test-directory files.

For the readonly file issue, I have a force_remove_tree() function in
one of my setup-scripts (well, actually it is part of the
py2exe distutils extension), cloned from distutils' remove_tree()
function.
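
Roughly, such a helper can be as small as this (a sketch under the
assumption that clearing the read-only bit is all that's needed; the
actual code may differ):

    import os, stat, shutil

    def force_remove_tree(directory):
        # Like remove_tree(), but when a removal fails, make the path
        # writable and retry.
        def retry_writable(func, path, exc_info):
            os.chmod(path, stat.S_IWRITE)
            func(path)
        shutil.rmtree(directory, onerror=retry_writable)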

Thomas





From guido@python.org  Fri Jun 14 12:58:53 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 07:58:53 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "14 Jun 2002 09:04:18 +0200."
 <m3660m8hot.fsf@mira.informatik.hu-berlin.de>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <m3660m8hot.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206141158.g5EBwrw31717@pcp02138704pcs.reston01.va.comcast.net>

> This is where the real problem lies: building modules via makesetup
> generates foomodule.so, whereas building modules via setup.py builds
> foo.so. This needs to be fixed, and I feel that setup.py is right
> and makesetup is wrong.

IMO that's entirely accidental.  You can use Setup to build either
form.  I would assume you can use setup.py to build either form too,
but I'm not sure.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 14 13:05:32 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 08:05:32 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 09:44:58 +0200."
 <20020614074458.GA31022@strakt.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
Message-ID: <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net>

> How about something like this:
> 
>   Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c':
>   ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']})
> 
> though there is the problem of backwards compatability :/

But this is wrong: it's not foo1.c that depends on bar.h, it's foo1.o.

With the latest CVS, on Unix or Linux, try this:

  - Run Make to be sure you are up to date
  - Touch Modules/socketmodule.h
  - Run Make again

The latest setup.py has directives that tell it that the _socket and
_ssl modules depend on socketmodule.h, and this makes it rebuild the
necessary .o and .so files (through the changes to distutils that
Jeremy made).

All we need is for someone to add all the other dependencies to
setup.py.
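
In setup.py terms that presumably boils down to entries along these lines
(a sketch of the idea, not a copy of the actual file):

    from distutils.core import Extension

    ext = Extension('_socket', ['socketmodule.c'],
                    depends=['socketmodule.h'])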

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 14 13:09:15 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 08:09:15 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 10:04:24 +0200."
 <3D09A388.8080107@lemburg.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
Message-ID: <200206141209.g5EC9F931822@pcp02138704pcs.reston01.va.comcast.net>

> The question is whether we want distutils to be a development
> tool as well, or rather stick to its main purpose: that of
> simplifying distribution and installation of software (and
> thanks to Greg, it's great at that !).

Yes.

Much of distutils is concerned with compiling, and that part is also
needed by a development tool.  So I'd say it's a pretty good match.

You have to specify the extension build rules as some kind of script.
We found Modules/Setup + makesetup inadequate, and moved to setup.py +
distutils.  Distutils is the best we got; it knows about many
compilers and platforms; it was pretty easy to add .h file dependency
handling (though not discovery).

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido@python.org  Fri Jun 14 13:13:22 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 08:13:22 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 11:45:28 +0200."
 <005a01c21388$387cd3e0$0900a8c0@spiff>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org>
 <005a01c21388$387cd3e0$0900a8c0@spiff>
Message-ID: <200206141213.g5ECDMT31861@pcp02138704pcs.reston01.va.comcast.net>

> I tend to use an incremental approach, with lots of edit-compile-run
> cycles.  I still haven't found a way to get the damn thing to just build
> my extension and copy it to the current directory, so I can run the
> test scripts.

Funny, I use an edit-compile-run cycle too, but I don't have the need
to copy anything to the current directory.

> does anyone here know how to do that, without having to resort to
> ugly wrapper batch files/shell scripts?
> 
> (distutils is also a pain to use with a version management system
> that marks files in the repository as read-only; distutils copy function
> happily copies all the status bits. but the remove function refuses to
> remove files that are read-only, even if the files have been created
> by distutils itself...)

This smells like a bug.  Maybe it can be fixed rather than used as a
stick to hit the dog.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@strakt.com  Fri Jun 14 13:12:25 2002
From: martin@strakt.com (Martin Sjögren)
Date: Fri, 14 Jun 2002 14:12:25 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020614121225.GA32573@strakt.com>

On Fri, Jun 14, 2002 at 08:05:32AM -0400, Guido van Rossum wrote:
> > How about something like this:
> > 
> >   Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c':
> >   ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']})
> > 
> > though there is the problem of backwards compatibility :/
> 
> But this is wrong: it's not foo1.c that depends on bar.h, it's foo1.o.

You're right.

> With the latest CVS, on Unix or Linux, try this:
> 
>   - Run Make to be sure you are up to date
>   - Touch Modules/socketmodule.h
>   - Run Make again
> 
> The latest setup.py has directives that tell it that the _socket and
> _ssl modules depend on socketmodule.h, and this makes it rebuild the
> necessary .o and .so files (through the changes to distutils that
> Jeremy made).

Cool. But my module consists of several .c files, how do I specify which
.o files depend on which .h files?

Now, it's a shame I have to maintain compatibility with the Python
2.1 and Python 2.2 distributions in my setup.py ;)
I suppose I could try/except...


Regards,
Martin


-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 7710870       Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html



From guido@python.org  Fri Jun 14 13:17:52 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 08:17:52 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 13:23:06 +0200."
 <016a01c21395$dca1eed0$0900a8c0@spiff>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook>
 <016a01c21395$dca1eed0$0900a8c0@spiff>
Message-ID: <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net>

> consider PIL, for example: in my source directory, I have the
> following files, checked out from a repository:
> 
>     setup.py
>     _imaging.c
>     *.c
>     PIL/*.py
> 
> I want to be able to run setup.py and end up with an _imaging.pyd
> in the same directory.  I don't want distutils to attempt to copy
> stuff from PIL/*.py to PIL/*.py, mess up other parts of my source
> tree, install any scripts (broken or not) in the Python directory, or
> just generally make an ass of itself when failing to copy readonly
> files on top of other readonly files.

I thought that the thing to do this was

  python setup.py build_ext -i

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 14 13:21:58 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 08:21:58 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 14:12:25 +0200."
 <20020614121225.GA32573@strakt.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net>
 <20020614121225.GA32573@strakt.com>
Message-ID: <200206141222.g5ECMA406904@pcp02138704pcs.reston01.va.comcast.net>

> Cool. But my module consists of several .c files, how do I specify
> which .o files depend on which .h files?

You can't.  Compared to throwing away the entire build directory
containing all Python extensions, it's still a huge win.  For your
extension, it may not make much of a difference.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin@mems-exchange.org  Fri Jun 14 13:24:36 2002
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 14 Jun 2002 08:24:36 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <2m7kl2npax.fsf@starship.python.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <2m7kl2npax.fsf@starship.python.net>
Message-ID: <20020614122436.GA2791@ute.mems-exchange.org>

On Fri, Jun 14, 2002 at 11:10:46AM +0100, Michael Hudson wrote:
>Yeah, this area sucks.  It interacts v. badly with umask, too.  Maybe
>I'll work on this bug instead on my next train journey... installing
>shared libraries with something like copy_tree is gross.

Out of curiosity, why?  

I've had the patch below sitting in my copy of the tree forever, but
haven't gotten around to checking whether it fixes the umask-related
bug: I think it should, though.

Index: install_lib.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/distutils/command/install_lib.py,v
retrieving revision 1.40
diff -u -r1.40 install_lib.py
--- install_lib.py      4 Jun 2002 20:14:43 -0000       1.40
+++ install_lib.py      14 Jun 2002 12:05:30 -0000
@@ -106,7 +106,8 @@

     def install (self):
         if os.path.isdir(self.build_dir):
-            outfiles = self.copy_tree(self.build_dir, self.install_dir)
+            outfiles = self.copy_tree(self.build_dir, self.install_dir,
+                                      preserve_mode=1)
         else:
             self.warn("'%s' does not exist -- no Python modules to install" %
                       self.build_dir)

--amk



From David Abrahams" <david.abrahams@rcn.com  Fri Jun 14 13:27:36 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 14 Jun 2002 08:27:36 -0400
Subject: [Python-Dev] [development doc updates]
References: <20020614122004.9148F286BC@beowolf.fdrake.net>
Message-ID: <084801c2139e$df0bfae0$6601a8c0@boostconsulting.com>

Hi Fred,

I wasn't aware of how it would be formatted when I submitted my
PyObject_RichCompare patches. I think the different italic "op" usages are
too similar. Probably \emph should be replaced with something that
bold-ifies.

Best,
Dave


----- Original Message -----
From: "Fred L. Drake" <fdrake@acm.org>
To: <python-dev@python.org>; <doc-sig@python.org>; <python-list@python.org>
Sent: Friday, June 14, 2002 8:20 AM
Subject: [Python-Dev] [development doc updates]


> The development version of the documentation has been updated:
>
>     http://www.python.org/dev/doc/devel/
>
> Updated to reflect recent changes.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev




From mwh@python.net  Fri Jun 14 13:46:28 2002
From: mwh@python.net (Michael Hudson)
Date: 14 Jun 2002 13:46:28 +0100
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Andrew Kuchling's message of "Fri, 14 Jun 2002 08:24:36 -0400"
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <2m7kl2npax.fsf@starship.python.net> <20020614122436.GA2791@ute.mems-exchange.org>
Message-ID: <2m4rg6hvtn.fsf@starship.python.net>

Andrew Kuchling <akuchlin@mems-exchange.org> writes:

> On Fri, Jun 14, 2002 at 11:10:46AM +0100, Michael Hudson wrote:
> >Yeah, this area sucks.  It interacts v. badly with umask, too.  Maybe
> >I'll work on this bug instead on my next train journey... installing
> >shared libraries with something like copy_tree is gross.
> 
> Out of curiosity, why?  

Dunno.  It just strikes me as a really bad idea.  If the linker
produces other cruft you'll end up installing that too.  

It's certainly possible that core files could get installed, if you've
run tests in build/lib.foo/.

> I've had the patch below sitting in my copy of the tree forever, but
> haven't gotten around to checking whether it fixes the umask-related
> bug: I think it should, though.

I think I tried that and it didn't work, but can't remember all that
clearly.

Cheers,
M.

-- 
 As it seems to me, in Perl you have to be an expert to correctly make
 a nested data structure like, say, a list of hashes of instances.  In
 Python, you have to be an idiot not  to be able to do it, because you
 just write it down.             -- Peter Norvig, comp.lang.functional



From mwh@python.net  Fri Jun 14 13:49:39 2002
From: mwh@python.net (Michael Hudson)
Date: 14 Jun 2002 13:49:39 +0100
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: "Thomas Heller"'s message of "Fri, 14 Jun 2002 13:39:29 +0200"
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <0ef701c21398$24fc81c0$e000a8c0@thomasnotebook>
Message-ID: <2m1ybahvoc.fsf@starship.python.net>

"Thomas Heller" <thomas.heller@ion-tof.com> writes:

> Then there's Berthold Höllmann's test command he posted to the
> distutils sig, which internally runs the 'build' command, then extends
> sys.path by build_purelib, build_platlib, and the test-directory, and
> finally runs the tests in the test-directory files.

You could have a variant of that that just ran a
code.InteractiveInterpreter.  

$ python setup.py play-around
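
A rough sketch of what such a command could look like (untested; the
command is spelled play_around here to keep the name a plain identifier,
and the build_ext wiring is only a guess at one way to do it):

    # setup.py -- hypothetical "play_around" command
    import code
    from distutils.core import setup, Command

    class PlayAround(Command):
        description = "build extensions in place, then start an interpreter"
        user_options = []

        def initialize_options(self):
            pass

        def finalize_options(self):
            pass

        def run(self):
            # Build extensions next to the sources, like "build_ext -i".
            build_ext = self.reinitialize_command('build_ext')
            build_ext.inplace = 1
            self.run_command('build_ext')
            # Drop into an interactive session with the fresh build importable.
            code.interact(banner="play_around: extensions built in place")

    setup(name='example', cmdclass={'play_around': PlayAround})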

Cheers,
M.

-- 
31. Simplicity does not precede complexity, but follows it.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html



From mgilfix@eecs.tufts.edu  Fri Jun 14 13:55:53 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Fri, 14 Jun 2002 08:55:53 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <016901c21395$dc9ee190$0900a8c0@spiff>; from fredrik@pythonware.com on Fri, Jun 14, 2002 at 01:20:18PM +0200
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <200206141059.g5EAxLI31419@pythonware.com> <016901c21395$dc9ee190$0900a8c0@spiff>
Message-ID: <20020614085552.A4109@eecs.tufts.edu>

On Fri, Jun 14 @ 13:20, Fredrik Lundh wrote:
> alex wrote:
> > > cycles.  I still haven't found a way to get the damn thing to just build
> > > my extension and copy it to the current directory, so I can run the
> > > test scripts.
> > 
> > I haven't even looked for such a way, since going to site-packages is
> > no problem for me.  If I was developing on a Python installation shared
> > by several users I'd no doubt feel differently about it.
> 
> you only work on a single project too, I assume.
> 
> I tend to prefer not to install a broken extension in my machine's
> default install, in case I have to switch to another project...  (and
> switching between projects is all I seem to do these days ;-)
> 
> (and I maintain too many modules to afford to install a separate
> python interpreter for each one of them...)

  Er, do you encase the main routines for your programs in a
sub-directory?  I usually create a 'libapp' directory and put my
sources in there, and then the main application loads main.py from
libapp. That way, setup.py installs libapp into site-packages and I
don't have to worry about multiple projects. That's a definite
help.

  If that doesn't satisfy you, you could always play around with
the install locations and then sys.path.. but that's usually not
necessary.
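
  For what it's worth, a bare-bones sketch of that kind of layout (all
of the names here are made up):

    myproject/
        setup.py
        run_app.py        # thin launcher: from libapp.main import main; main()
        libapp/
            __init__.py
            main.py       # the real application code

    # setup.py
    from distutils.core import setup

    setup(name='myapp',
          version='0.1',
          packages=['libapp'],      # installed into site-packages
          scripts=['run_app.py'])   # the launcher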

                     -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From guido@python.org  Fri Jun 14 14:12:45 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 09:12:45 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "14 Jun 2002 13:46:28 BST."
 <2m4rg6hvtn.fsf@starship.python.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <2m7kl2npax.fsf@starship.python.net> <20020614122436.GA2791@ute.mems-exchange.org>
 <2m4rg6hvtn.fsf@starship.python.net>
Message-ID: <200206141312.g5EDCjl07313@pcp02138704pcs.reston01.va.comcast.net>

> > I've had the patch below sitting in my copy of the tree forever, but
> > haven't gotten around to checking whether it fixes the umask-related
> > bug: I think it should, though.
> 
> I think I tried that and it didn't work, but can't remember all that
> clearly.

Looks like that wouldn't address /F's problem, which is that the
original files are read-only, so distutils makes the copies read-only,
and then refuses to remove them when you ask it to.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy@zope.com  Fri Jun 14 09:25:15 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 14 Jun 2002 04:25:15 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <20020614121225.GA32573@strakt.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net>
 <20020614121225.GA32573@strakt.com>
Message-ID: <15625.43115.50297.18925@slothrop.zope.com>

>>>>> "MS" =3D=3D Martin Sj=F6gren <martin@strakt.com> writes:

  >> But this is wrong: it's not foo1.c that depends on bar.h, it's
  >> foo1.o.

  MS> You're right.

On the other hand, distutils setup scripts don't talk about .o files
directly.  They talk about the .c file and assume there is a
one-to-one correspondence between .c files and .o files.

  >> With the latest CVS, on Unix or Linux, try this:
  >>
  >> - Run Make to be sure you are up to date
  >> - Touch Modules/socketobject.h
  >> - Run Make again
  >>
  >> The latest setup.py has directives that tell it that the _socket
  >> and _ssl modules depend on socketmodule.h, and this makes it
  >> rebuild the necessary .o and .so files (through the changes to
  >> distutils that Jeremy made).

  MS> Cool. But my module consists of several .c files, how do I
  MS> specify which .o files depend on which .h files?

I did something simpler, as Guido mentioned.  I added global
dependencies for an extension.  This has been fine for all the
extensions that I commonly build, because they have only one or a few
source files.  Recompiling a few .c files costs little.

I agree that it would be nice to have fine-grained dependency
tracking, but that costs more in the implementation and to use.
Thomas Heller has a patch on SF (don't recall the number) that handles
per-file dependencies.  I didn't care for the way the dependencies are
spelled in the setup script, but something like the dict that Martin
(the other Martin, right?) suggested seems workable.
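
With the global form, the spelling in a setup script looks roughly like
this (module and file names invented for illustration; it needs a
distutils recent enough to know about depends=):

    from distutils.core import setup, Extension

    # Any change to a file listed in depends= triggers recompilation of
    # all of the extension's .c files.
    ext = Extension('_example',
                    sources=['example1.c', 'example2.c'],
                    depends=['example.h'])

    setup(name='example', ext_modules=[ext])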

  MS> Now, it's a shame I have to maintain compatibility with the
  MS> Python 2.1 and Python 2.2 distributions in my setup.py ;)
  MS> I suppose I could try/except...

We should come up with a good hack to use in setup scripts.  This is
my first try.  It's got too many lines, but it works.

# A hack to determine if Extension objects support the depends keyword arg.
if not "depends" in Extension.__init__.func_code.co_varnames:
    # If it doesn't, create a local replacement that removes depends
    # from the kwargs before calling the regular constructor.
    _Extension = Extension
    class Extension(_Extension):
        def __init__(self, name, sources, **kwargs):
            if "depends" in kwargs:
                del kwargs["depends"]
            _Extension.__init__(self, name, sources, **kwargs)

Jeremy




From Jack.Jansen@cwi.nl  Fri Jun 14 14:11:43 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Fri, 14 Jun 2002 15:11:43 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
Message-ID: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl>

On Thursday, June 13, 2002, at 07:49 , Skip Montanaro wrote:

>
> I wonder if it would be better to have distutils generate the appropriate
> type of makefile and execute that instead of directly building objects and
> shared libraries.  This would finesse some of the dependency tracking
> problems that pop up frequently.

+1

Distutils is very unix-centric in that it expects there to be separate
compile and link steps. While this can be made to work on Windows (at
least for MSVC), where there are such separate compilers if you look hard
enough, it can't be made to work for MetroWerks on the Mac, and also for
MSVC it's a rather funny way to do things.

I would much prefer it if distutils would (optionally) gather all its
knowledge and generate a Makefile or an MW projectfile or an MSVC 
projectfile.

For MW distutils already does this (every step simply remembers 
information, and at the "link" step it writes out a project file and 
builds that) but it would be nice if this way of operation was codified.

Note that for people having an IDE this would also make debugging a lot 
easier: if you have an IDE project you can easily do nifty things like 
turn on debugging, use its class browser, etc.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -




From martin@strakt.com  Fri Jun 14 14:23:00 2002
From: martin@strakt.com (Martin Sjögren)
Date: Fri, 14 Jun 2002 15:23:00 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15625.43115.50297.18925@slothrop.zope.com>
References: <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> <20020614121225.GA32573@strakt.com> <15625.43115.50297.18925@slothrop.zope.com>
Message-ID: <20020614132300.GB712@strakt.com>

On Fri, Jun 14, 2002 at 04:25:15AM -0400, Jeremy Hylton wrote:
>   MS> Cool. But my module consists of several .c files, how do I
>   MS> specify which .o files depend on which .h files?
>
> I did something simpler, as Guido mentioned.  I added global
> dependencies for an extension.  This has been fine for all the
> extensions that I commonly build because they have only one or several
> source files.  Recompiling a few .c files costs little.
>
> I agree that it would be nice to have fine-grained dependency
> tracking, but that costs more in the implementation and to use.
> Thomas Heller has a patch on SF (don't recall the number) that handles
> per-file dependencies.  I didn't care for the way the dependencies are
> spelled in the setup script, but something like the dict that Martin
> (the other Martin, right?) suggested seems workable.

  Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c':
    ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']})

That's what I suggested, is that what you meant?

>   MS> Now, it's a shame I have to maintain compatibility with the
>   MS> Python 2.1 and Python 2.2 distributions in my setup.py ;)
>   MS> I suppose I could try/except...
>
> We should come up with a good hack to use in setup scripts.  This is
> my first try.  It's got too many lines, but it works.
>
> # A hack to determine if Extension objects support the depends keyword arg.
> if not "depends" in Extension.__init__.func_code.co_varnames:
>     # If it doesn't, create a local replacement that removes depends
>     # from the kwargs before calling the regular constructor.
>     _Extension = Extension
>     class Extension(_Extension):
>         def __init__(self, name, sources, **kwargs):
>             if "depends" in kwargs:
>                 del kwargs["depends"]
>             _Extension.__init__(self, name, sources, **kwargs)

Eep :) Looks like it could work, yes, but I think I'll skip that one while
I'm still running Python 2.2. :)


Cheers,
Martin

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 7710870       Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html



From guido@python.org  Fri Jun 14 14:35:40 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 09:35:40 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 15:11:43 +0200."
 <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl>
References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl>
Message-ID: <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net>

> Distutils is very unix-centric in that it expects there to be separate 
> compile and link steps. While this can be made to work on Windows (at 
> least for MSVC) where there are such separate compilers if you look hard 
> enough it can't be made to work for MetroWerks on the Mac, and also for 
> MSVC it's a rather funny way to do things.

Actually, the setup dialogs and general structure of MSVC make you
very aware of the Unixoid structure of the underlying compiler
suite. :-)

But I believe what you say about MW.

> I would much prefer it if distutils would (optionally) gather all its
> knowledge and generate a Makefile or an MW projectfile or an MSVC 
> projectfile.
> 
> For MW distutils already does this (every step simply remembers 
> information, and at the "link" step it writes out a project file and 
> builds that) but it would be nice if this way of operation was codified.

I'm not sure what's to codify -- this is different for each compiler
suite.  When using setup.py with a 3rd party extension on Windows, I
like the fact that I don't have to fire up the GUI to build it.  (I
just wish it were easier to make distutils do the right thing for
debug builds of Python.  This has improved on Unix but I hear it's
still broken on Windows.)

> Note that for people having an IDE this would also make debugging a lot 
> easier: if you have an IDE project you can easily do nifty things like 
> turn on debugging, use its class browser, etc.

That's for developers though, not for people installing extensions
that come with a setup.py script.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jun 14 14:44:07 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 14 Jun 2002 09:44:07 -0400
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
In-Reply-To: <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAOPNAA.tim.one@comcast.net>

[Guido]
> ...
> (I just wish it were easier to make distutils do the right thing for
> debug builds of Python.  This has improved on Unix but I hear it's
> still broken on Windows.)

Hard to say.  "stupid_build.py --debug" works great on Windows in the Zope3
tree.  "setup.py --debug" on Windows in the Zope tree builds the debug stuff
but leaves the results in unusable places.  Since I don't understand
distutils or the Zope build process, I'm not complaining.  There's nothing
that can't be fixed by hand via a mouse, Windows Explorer, and a spare hour
each time around <wink>.




From fdrake@acm.org  Fri Jun 14 14:53:21 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 14 Jun 2002 09:53:21 -0400
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <084801c2139e$df0bfae0$6601a8c0@boostconsulting.com>
References: <20020614122004.9148F286BC@beowolf.fdrake.net>
 <084801c2139e$df0bfae0$6601a8c0@boostconsulting.com>
Message-ID: <15625.62801.794774.266808@grendel.zope.com>

David Abrahams writes:
 > I wasn't aware of how it would be formatted when I submitted my
 > PyObject_RichCompare patches. I think the different italic "op" usages are
 > too similar. Probably \emph should be replaced with something that
 > bold-ifies.

Bold is the wrong thing.  I propose:

- change the argument name to "opid"
- let the "op" stand-in show in the code font so it contrasts with the
  o1 and o2 references in the \samp.

Here's what it looks like:

    http://www.python.org/dev/doc/devel/api/object.html#l2h-170

If that works for you, let me know and I'll update the 2.2.x docs and
commit the changes.

Thanks for your comments!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From David Abrahams" <david.abrahams@rcn.com  Fri Jun 14 15:20:55 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 14 Jun 2002 10:20:55 -0400
Subject: [Python-Dev] [development doc updates]
References: <20020614122004.9148F286BC@beowolf.fdrake.net><084801c2139e$df0bfae0$6601a8c0@boostconsulting.com> <15625.62801.794774.266808@grendel.zope.com>
Message-ID: <08b101c213af$662a2dc0$6601a8c0@boostconsulting.com>

That's great, Fred. You might also want to add the "Return value: New
reference." riff to PyObject_RichCompare; I missed that when looking at the
existing doc for examples.

-Dave

From: "Fred L. Drake, Jr." <fdrake@acm.org>
>
> David Abrahams writes:
>  > I wasn't aware of how it would be formatted when I submitted my
>  > PyObject_RichCompare patches. I think the different italic "op" usages
are
>  > too similar. Probably \emph should be replaced with something that
>  > bold-ifies.
>
> Bold is the wrong thing.  I propose:
>
> - change the argument name to "opid"
> - let the "op" stand-in show in the code font so it contrasts with the
>   o1 and o2 references in the \samp.
>
> Here's what it looks like:
>
>     http://www.python.org/dev/doc/devel/api/object.html#l2h-170
>
> If that works for you, let me know and I'll update the 2.2.x docs and
> commit the changes.
>
> Thanks for your comments!
>
>
>   -Fred
>
> --
> Fred L. Drake, Jr.  <fdrake at acm.org>
> PythonLabs at Zope Corporation
>




From fdrake@acm.org  Fri Jun 14 15:37:56 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 14 Jun 2002 10:37:56 -0400
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <08b101c213af$662a2dc0$6601a8c0@boostconsulting.com>
References: <20020614122004.9148F286BC@beowolf.fdrake.net>
 <084801c2139e$df0bfae0$6601a8c0@boostconsulting.com>
 <15625.62801.794774.266808@grendel.zope.com>
 <08b101c213af$662a2dc0$6601a8c0@boostconsulting.com>
Message-ID: <15625.65476.263702.375912@grendel.zope.com>

David Abrahams writes:
 > That's great, Fred. You might also want to add the "Return value: New
 > reference." riff to PyObject_RichCompare; I missed that when looking at the
 > existing doc for examples.

The mechanics of the refcount data are a bit different.  I've added
that and committed the changes; updates should appear on the site in
the next hour.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From thomas.heller@ion-tof.com  Fri Jun 14 15:38:59 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 14 Jun 2002 16:38:59 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl>  <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <105701c213b1$3853a460$e000a8c0@thomasnotebook>

From: "Guido van Rossum" <guido@python.org>
> I'm not sure what's to codify -- this is different for each compiler
> suite.  When using setup.py with a 3rd party extension on Windows, I
> like the fact that I don't have to fire up the GUI to build it.  (I
Same for me.
> just wish it were easier to make distutils do the right thing for
> debug builds of Python.  This has improved on Unix but I hear it's
> still broken on Windows.)
> 
What do you think is broken with the debug builds?
I use it routinely and have no problems at all...

[Jack]
> > Note that for people having an IDE this would also make debugging a lot 
> > easier: if you have an IDE project you can easily do nifty things like 
> > turn on debugging, use its class browser, etc.

I prefer to insert
#ifdef _DEBUG
    _asm int 3; /* breakpoint */
#endif
into the problematic sections of my code, and whoops,
the MSVC GUI debugger opens just when this code is executed,
even if it was started from the command line.

Thomas




From guido@python.org  Fri Jun 14 15:51:21 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 10:51:21 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 16:38:59 +0200."
 <105701c213b1$3853a460$e000a8c0@thomasnotebook>
References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl> <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net>
 <105701c213b1$3853a460$e000a8c0@thomasnotebook>
Message-ID: <200206141451.g5EEpMg08193@pcp02138704pcs.reston01.va.comcast.net>

> What do you think is broken with the debug builds?
> I use it routinely and have no problems at all...

I was repeating hearsay.

Here's what used to be broken on Unix: if you built a debug Python but
did not install it (assuming a non-debug Python was already
installed), and then used that debug Python to build a 3rd party
extension, the debug Python's configuration would be ignored, and the
extension would be built with the configuration of the installed
Python instead.  Such extensions can't be linked with the debug
Python, which was the whole point of using the debug Python to build
in the first place.

Jeremy recently fixed this for Unix, and I'm very happy.

But I believe that on Windows you still have to add "--debug" to your
setup.py build command to get the same effect.  I think that using the
debug executable should be sufficient to turn on the debug flags.

More generally, I think that when you use a Python executable that
lives in a build directory, the configuration of that build directory
should be used for all extensions you build.  This is what Jeremy did
in his fix.  (As a side effect, building the Python extensions no
longer needs to be special-cased.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas.heller@ion-tof.com  Fri Jun 14 15:59:06 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 14 Jun 2002 16:59:06 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl> <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net>              <105701c213b1$3853a460$e000a8c0@thomasnotebook>  <200206141451.g5EEpMg08193@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <107901c213b4$07e849e0$e000a8c0@thomasnotebook>

From: "Guido van Rossum" <guido@python.org>
> > What do you think is broken with the debug builds?
> > I use it routinely and have no problems at all...
> 
> I was repeating hearsay.

The complaints I remember (mostly from c.l.p) are from
people who want to build debug versions of extensions
while at the same time refusing to build a debug version
of Python from the sources.

> 
> Here's what used to be broken on Unix: if you built a debug Python but
> did not install it (assuming a non-debug Python was already
> installed), and then used that debug Python to build a 3rd party
> extension, the debug Python's configuration would be ignored, and the
> extension would be built with the configuration of the installed
> Python instead.  Such extensions can't be linked with the debug
> Python, which was the whole point of using the debug Python to build
> in the first place.
> 
> Jeremy recently fixed this for Unix, and I'm very happy.
> 
> But I believe that on Windows you still have to add "--debug" to your
> setup.py build command to get the same effect.  I think that using the
> debug executable should be sufficient to turn on the debug flags.
> 
> More generally, I think that when you use a Python executable that
> lives in a build directory, the configuration of that build directory
> should be used for all extensions you build.  This is what Jeremy did
> in his fix.  (As a side effect, building the Python extensions no
> longer needs to be special-cased.)
> 

I don't know anything about building Python (and extensions) on Unix,
but here's how it works on Windows:
You can use either the release or the debug version of Python to build
debug or release extensions with distutils; you have to use the
--debug switch to specify which one you want.
The debug version needs different libraries than the release version: they
all have an _d inserted into the filename just before the filename
extension (but you probably know this already ;-).

I don't know if it even is possible (in Python code) to determine
whether the debug or the release exe is currently running.

With changes I recently made to distutils, you can even do all this
in a 'not installed' version, straight from CVS, for example.

Thomas




From David Abrahams" <david.abrahams@rcn.com  Fri Jun 14 16:01:11 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 14 Jun 2002 11:01:11 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl>  <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <08ed01c213b4$52d866b0$6601a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>


> > Distutils is very unix-centric in that it expects there to be separate
> > compile and link steps. While this can be made to work on Windows (at
> > least for MSVC) where there are such separate compilers if you look
hard
> > enough it can't be made to work for MetroWerks on the Mac, and also for
> > MSVC it's a rather funny way to do things.
>
> Actually, the setup dialogs and general structure of MSVC make you
> very aware of the Unixoid structure of the underlying compiler
> suite. :-)
>
> But I believe what you say about MW.

Well, that really depends on whether you think supporting MacOS 9
development is important. MW supplies regular command-line tools for MacOS
X.

-Dave




From jeremy@zope.com  Fri Jun 14 11:20:04 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 14 Jun 2002 06:20:04 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141451.g5EEpMg08193@pcp02138704pcs.reston01.va.comcast.net>
References: <45FEE4AC-7F98-11D6-BA39-0030655234CE@cwi.nl>
 <200206141335.g5EDZeA07410@pcp02138704pcs.reston01.va.comcast.net>
 <105701c213b1$3853a460$e000a8c0@thomasnotebook>
 <200206141451.g5EEpMg08193@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15625.50004.13771.247686@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  GvR> Jeremy recently fixed this for Unix, and I'm very happy.

Actually, it was Fred.  I expect you're still very happy, and now Fred
is, too.

Jeremy




From tim.one@comcast.net  Fri Jun 14 16:18:22 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 14 Jun 2002 11:18:22 -0400
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
In-Reply-To: <107901c213b4$07e849e0$e000a8c0@thomasnotebook>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEBIPNAA.tim.one@comcast.net>

[Thomas Heller]
> ...
> I don't know if it even is possible (in Python code) to determine
> whether the debug or the release exe is currently running.

FYI, the sys module exposes some debugging tools only in the debug build.
So, e.g.,

def is_debug_build():
    import sys
    return hasattr(sys, "getobjects")

returns the right answer (and, I believe, under all versions of Python).




From Jack.Jansen@cwi.nl  Fri Jun 14 16:30:47 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Fri, 14 Jun 2002 17:30:47 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <105701c213b1$3853a460$e000a8c0@thomasnotebook>
Message-ID: <B33FDFB2-7FAB-11D6-BA39-0030655234CE@cwi.nl>

On Friday, June 14, 2002, at 04:38 , Thomas Heller wrote:
> I prefer to insert
> #ifdef _DEBUG
>     _asm int 3; /* breakpoint */
> #endif
> into the problematic sections of my code, and whoops,
> the MSVC GUI debugger opens just when this code is executed,
> even if it was started from the command line.

Ok, MSVC finally scored a point with me, this is nifty:-)
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -




From thomas.heller@ion-tof.com  Fri Jun 14 16:45:04 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 14 Jun 2002 17:45:04 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <LNBBLJKPBEHFEDALKOLCIEBIPNAA.tim.one@comcast.net>
Message-ID: <10fb01c213ba$743e25f0$e000a8c0@thomasnotebook>

From: "Tim Peters" <tim.one@comcast.net>
> [Thomas Heller]
> > ...
> > I don't know if it even is possible (in Python code) to determine
> > whether the debug or the release exe is currently running.
> 
> FYI, the sys module exposes some debugging tools only in the debug build.
> So, e.g.,
> 
> def is_debug_build():
>     import sys
>     return hasattr(sys, "getobjects")
> 
> returns the right answer (and, I believe, under all versions of Python).
> 
I can (in 2.2) see sys.getobjects() and sys.gettotalrefcount().
I can also guess what gettotalrefcount does, but what does
getobjects() do? Is it documented somewhere?

Thomas




From jepler@unpythonic.net  Fri Jun 14 17:01:22 2002
From: jepler@unpythonic.net (Jeff Epler)
Date: Fri, 14 Jun 2002 11:01:22 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <B33FDFB2-7FAB-11D6-BA39-0030655234CE@cwi.nl>
References: <105701c213b1$3853a460$e000a8c0@thomasnotebook> <B33FDFB2-7FAB-11D6-BA39-0030655234CE@cwi.nl>
Message-ID: <20020614160117.GE30070@unpythonic.net>

On Fri, Jun 14, 2002 at 05:30:47PM +0200, Jack Jansen wrote:
> 
> On Friday, June 14, 2002, at 04:38 , Thomas Heller wrote:
> >I prefer to insert
> >#ifdef _DEBUG
> >    _asm int 3; /* breakpoint */
> >#endif
> >into the problematic sections of my code, and whoops,
> >the MSVC GUI debugger opens just when this code is executed,
> >even if it was started from the command line.
> 
> Ok, MSVC finally scored a point with me, this is nifty:-)

You can "set" a breakpoint this way in x86 Linux too.  Unfortunately,
when this is not run under the debugger, it simply sends a SIGTRAP to
the process.  In theory the standard library could handle SIGTRAP by
invoking the debugger, but 5 minutes fiddling around didn't produce a
very dependable way of doing so.

    (gdb) run
    Starting program: ./a.out 
    a

    Program received signal SIGTRAP, Trace/breakpoint trap.
    main () at bp.c:21
    21          printf("b\n");
    (gdb) cont
    Continuing.
    b

#include <stdio.h>

#define _DEBUG

#ifdef _DEBUG
#if defined(WIN32)
#define BREAKPOINT _asm int 3
#elif defined(__GNUC__) && defined(__i386__)
#define BREAKPOINT __asm__ __volatile__ ("int3")
#else
#warning "BREAKPOINT not defined for this OS / Compiler"
#define BREAKPOINT (void)0
#endif
#else
#define BREAKPOINT (void)0
#endif

int main(void) {
    printf("a\n");
    BREAKPOINT;
    printf("b\n");
    return 0;
}



From oren-py-d@hishome.net  Fri Jun 14 17:15:58 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 14 Jun 2002 19:15:58 +0300
Subject: [Python-Dev] 'new' and 'types'
Message-ID: <20020614191558.A31580@hishome.net>

Patch 568629 removes the built-in module new (with sincere apologies to 
Tommy Burnette ;-) and replaces it with a tiny Python module consisting of a 
single import statement:

"""This module is no longer required except for backward compatibility.
Objects of most types can now be created by calling the type object. """

from types import \
ClassType as classobj, \
CodeType as code, \
FunctionType as function, \
InstanceType as instance, \
MethodType as instancemethod, \
ModuleType as module

These types (as well as buffer and slice) have been made callable. It looks 
like the Python core no longer has any objects that are created by a 
separate factory function (there are still some in the Modules).

Now, what about the types module?  It has been suggested that this module
should be deprecated.  I think it still has some use: we need a place to put
all the types that are not used often enough to be added to the builtins.
I suggest that they be placed in the module 'types' with names matching their
__name__ attribute.  The types module will still have the long MixedCaseType 
names for backward compatibility.  The use of the long names should be 
deprecated, not the types module itself.
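
Under that proposal, client code would look roughly like this (the short
names don't exist yet; this is only what the suggestion amounts to):

    # Long names: still available, but their use would be deprecated.
    from types import FunctionType, ModuleType

    # Proposed short names, matching the types' __name__ attributes.
    from types import function, module

    assert function is FunctionType
    assert module is ModuleType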

    Oren




From tim.one@comcast.net  Fri Jun 14 17:41:03 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 14 Jun 2002 12:41:03 -0400
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
In-Reply-To: <10fb01c213ba$743e25f0$e000a8c0@thomasnotebook>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEBPPNAA.tim.one@comcast.net>

[Thomas Heller]
> I can (in 2.2) see sys.getobjects() and sys.gettotalrefcount().
> I can also guess what gettotalrefcount does, but what does
> getobjects() do? Is it documented somewhere?

Sorry, I don't think any debug-mode-only gimmicks are documented outside of
comments in the source files.

In a debug build, the PyObject layout changes (btw, that's why you can't mix
debug-build modules w/ release-build modules), adding new _ob_next and
_ob_prev pointers at the start of every PyObject.

The pointers form a doubly-linked list, which contains every live object in
existence, except for those statically allocated (the builtin type objects).
The head of the list is in object.c's static refchain vrbl.

sys.getobjects(n) returns that C list of (almost) all live objects, as a
Python list.  Excluded from the list returned are the list itself, and the
objects created to *call* getobjects().  The list of objects is in
allocation order, most-recently allocated at the start (getobjects()[0]).  n
is the maximum number of objects it will return, where n==0 means (of course
<wink>) infinity.

You can also pass it a type after the int, and, if you do, only objects of
that type get returned.

getobjects() is the tool of last resort when trying to track down an excess
of increfs over decrefs.  Python code that's exceedingly careful to account
for its own effects can figure out anything using it.  I once determined
that the compiler was leaking references to the integer 2 this way <wink>.
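
A typical (debug-build-only) hunt might be sketched like this; the
function under test is hypothetical and the accounting deliberately crude:

    import sys

    def count_live(typ):
        # All live objects of the given type; the temporaries created by
        # this call itself are excluded by getobjects.
        return len(sys.getobjects(0, typ))

    before = count_live(int)
    exercise_suspect_code()      # whatever you suspect of leaking
    after = count_live(int)
    print "int objects gained:", after - before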




From tim.one@comcast.net  Fri Jun 14 18:00:06 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 14 Jun 2002 13:00:06 -0400
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEBPPNAA.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECDPNAA.tim.one@comcast.net>

[Tim, on the debug-build sys.getobjects()]
> ...
> You can also pass it a type after the int, and, if you do, only objects of
> that type get returned.

Speaking of which, that became a lot more pleasant in 2.2, as new-style
classes create new types, and most builtin types have builtin names.  You
can pee away delighted weeks pondering the mysteries <wink>.  For example:

Python 2.3a0 (#29, Jun 13 2002, 17:06:59) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
[8285 refs]
>>> sys.getobjects(0, int)
[17, 19, 20, 18, 14, 512, 56, 448, 128, 256, 512, 1024, 2048,
 49152, 40960, 4096, 32768, 24576, 8192, 16384, 61440, 4095,
 9, 7, 6, 5, 10, 32, 16, 64, 4096, 128, 16384, 32768, 512, 1024,
 256, 32767, 511, -4, -1, 15, 11, 8, 22, 4, 21, 23, 503316480, 65535,
 2147483647, 1, 0, 3, 2, 33751201]
[8348 refs]
>>>

Why would the first int Python allocates be 33751201?  The answer is clear
with a little hexification:

>>> hex(_[-1])
'0x20300a1'
[8292 refs]
>>>

Or, if that answer isn't clear, you should unsubscribe from Python-Dev
immediately <wink>.




From martin@v.loewis.de  Fri Jun 14 18:16:20 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jun 2002 19:16:20 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141158.g5EBwrw31717@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <m3660m8hot.fsf@mira.informatik.hu-berlin.de>
 <200206141158.g5EBwrw31717@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3znxxvl0b.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> IMO that's entirely accidental.  You can use Setup to build either
> form.  I would assume you can use setup.py to build either form too,
> but I'm not sure.

Can you please elaborate? I believe that, because of the fragment

			case $objs in
			*$mod.o*)	base=$mod;;
			*)		base=${mod}module;;
			esac

the line

nis nismodule.c -lnsl	# Sun yellow pages -- not everywhere

will always cause makesetup to build nismodule.so - if you want to
build nis.so, you have to rename the source file.

I don't think you can tell setup.py to build nismodule.so.

So what do you propose to do to make the resulting shared library name
consistent regardless of whether it is build through setup.py or
makesetup?

Regards,
Martin



From martin@v.loewis.de  Fri Jun 14 18:18:07 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jun 2002 19:18:07 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <3D09A388.8080107@lemburg.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com>
Message-ID: <m3vg8lvkxc.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> The question is whether we want distutils to be a development
> tool as well, or rather stick to its main purpose: that of
> simplifying distribution and installation of software (and
> thanks to Greg, it's great at that !).

IMO, that's not a question anymore: distutils already *is* a tool used
in build and development environments.

Regards,
Martin




From martin@v.loewis.de  Fri Jun 14 18:19:57 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jun 2002 19:19:57 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <005a01c21388$387cd3e0$0900a8c0@spiff>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com>
 <E17Imfl-00046Q-00@mail.python.org>
 <005a01c21388$387cd3e0$0900a8c0@spiff>
Message-ID: <m3r8j9vkua.fsf@mira.informatik.hu-berlin.de>

"Fredrik Lundh" <fredrik@pythonware.com> writes:

> I tend to use an incremental approach, with lots of edit-compile-run
> cycles.  I still haven't found a way to get the damn thing to just build
> my extension and copy it to the current directory, so I can run the
> test scripts.
> 
> does anyone here know how to do that, without having to resort to
> ugly wrapper batch files/shell scripts?

I usually make a symlink into the build directory. Then, whenever it
is rebuilt, the symlink will still be there.

> (distutils is also a pain to use with a version management system
> that marks files in the repository as read-only; distutils copy function
> happily copies all the status bits. but the remove function refuses to
> remove files that are read-only, even if the files have been created
> by distutils itself...)

That's a bug, IMO.

Regards,
Martin



From martin@v.loewis.de  Fri Jun 14 18:23:43 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jun 2002 19:23:43 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141213.g5ECDMT31861@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com>
 <E17Imfl-00046Q-00@mail.python.org>
 <005a01c21388$387cd3e0$0900a8c0@spiff>
 <200206141213.g5ECDMT31861@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3n0txvko0.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> > I tend to use an incremental approach, with lots of edit-compile-run
> > cycles.  I still haven't found a way to get the damn thing to just build
> > my extension and copy it to the current directory, so I can run the
> > test scripts.
> 
> Funny, I use an edit-compile-run cycle too, but I don't have the need
> to copy anything to the current directory.

That's because Python treats its own build directory specially: it adds
build/something to sys.path when it finds that it is started from the
build directory.

If you are developing a separate package, all your code ends up in
./build/lib.platform, which is not on sys.path.
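
One common workaround is to have the test script prepend that directory
itself, something like the following (the directory name mirrors what
distutils uses for platform-specific builds, so treat it as an
approximation only):

    import sys, os
    from distutils.util import get_platform

    build_lib = os.path.join('build',
                             'lib.%s-%s' % (get_platform(), sys.version[:3]))
    if os.path.isdir(build_lib):
        sys.path.insert(0, build_lib)

    import _mymodule    # the freshly built extension (hypothetical name)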

Regards,
Martin



From skip@pobox.com  Fri Jun 14 18:49:04 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 14 Jun 2002 12:49:04 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <3D09A388.8080107@lemburg.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
Message-ID: <15626.11408.660388.360296@12-248-41-177.client.attbi.com>

    mal> The question is whether we want distutils to be a development tool
    mal> as well, or rather stick to its main purpose: that of simplifying
    mal> distribution and installation of software (and thanks to Greg, it's
    mal> great at that !).

Thanks for elaborating the distinction.  That is exactly what I missed.  I
really want make+makedepend.  I think that's what others have missed as
well.

Skip




From skip@pobox.com  Fri Jun 14 19:02:01 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 14 Jun 2002 13:02:01 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15626.12185.867020.437418@12-248-41-177.client.attbi.com>

    Guido> All we need is for someone to add all the other dependencies to
    Guido> setup.py.

May I humbly propose that this task should be automated?  Tools like
makedepend have been invented and reinvented many times over precisely
because it's too error-prone for humans to maintain that information
manually.  Switching from Make's syntax to Python's syntax won't make that
task substantially easier.

(Yes, I realize that backward compatibility is a strong goal so the layout
of objects tends to change rarely.  I still prefer having correct
dependencies.)
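
A crude first cut at such automation is easy enough to sketch in Python;
the hard part is doing it correctly (conditional includes, include paths,
transitive dependencies).  A naive, untested sketch:

    import re, os

    include_re = re.compile(r'^\s*#\s*include\s+"([^"]+)"')

    def local_deps(c_file):
        # Headers pulled in with #include "..." by this one file; system
        # headers (<...>) and transitive includes are ignored.
        deps = []
        for line in open(c_file):
            m = include_re.match(line)
            if m and os.path.exists(m.group(1)):
                deps.append(m.group(1))
        return deps

    # e.g.  Extension('foo', ['foo.c'], depends=local_deps('foo.c'))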

Skip



From fredrik@pythonware.com  Fri Jun 14 19:03:31 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 14 Jun 2002 20:03:31 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook>              <016a01c21395$dca1eed0$0900a8c0@spiff>  <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <01e701c213cd$d148ae60$ced241d5@hagrid>

guido wrote:
> I thought that the thing to do this was
> 
>   python setup.py build_ext -i

oh, that's definitely close enough.

that's what you get for reading the docs instead of trying
every combination of the available options ;-)

(maybe someone who knows a little more about distutils
could take an hour and add brief overviews of all standard
commands to the reference section(s)?  just having a list
of all commands and command options would have helped
me, for sure...)

thanks /F




From martin@v.loewis.de  Fri Jun 14 19:36:33 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jun 2002 20:36:33 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15626.12185.867020.437418@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net>
 <15626.12185.867020.437418@12-248-41-177.client.attbi.com>
Message-ID: <m38z5hvham.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> May I humbly propose that this task should be automated?  Tools like
> makedepend have been invented and reinvented many times over precisely
> because it's too error-prone for humans to maintain that information
> manually.

They also have been invented and reinvented because the previous tool
would not work, just like the next one wouldn't.

makedepend is particularly bad: you need to reinvoke makedepend
manually whenever you change a file, which is as easy to forget as
updating dependency lists whenever you change a file. In addition,
makedepend has problems finding out the names of header files used.

That said, feel free to contribute patches that automate this task.

Regards,
Martin



From guido@python.org  Fri Jun 14 20:01:59 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 15:01:59 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "14 Jun 2002 19:16:20 +0200."
 <m3znxxvl0b.fsf@mira.informatik.hu-berlin.de>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <m3660m8hot.fsf@mira.informatik.hu-berlin.de> <200206141158.g5EBwrw31717@pcp02138704pcs.reston01.va.comcast.net>
 <m3znxxvl0b.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206141902.g5EJ20e09812@pcp02138704pcs.reston01.va.comcast.net>

> > IMO that's entirely accidental.  You can use Setup to build either
> > form.  I would assume you can use setup.py to build either form too,
> > but I'm not sure.
> 
> Can you please elaborate? I believe that, because of the fragment
> 
> 			case $objs in
> 			*$mod.o*)	base=$mod;;
> 			*)		base=${mod}module;;
> 			esac
> 
> the line
> 
> nis nismodule.c -lnsl	# Sun yellow pages -- not everywhere
> 
> will always cause makesetup to build nismodule.so - if you want to
> build nis.so, you have to rename the source file.

Oops, I was mistaken.

> I don't think you can tell setup.py to build nismodule.so.

Actually, you can.  Just specify "nismodule" as the extension name.
Whether you should, I don't know.

> So what do you propose to do to make the resulting shared library name
> consistent regardless of whether it is build through setup.py or
> makesetup?

I don't know if we need consistency, but if we do, I propose that we
deprecate the "module" part.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 14 20:06:50 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 15:06:50 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 12:49:04 CDT."
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
Message-ID: <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>

> Thanks for elaborating the distinction.  That is exactly what I missed.  I
> really want make+makedepend.  I think that's what others have missed as
> well.

Sorry Skip, but many others pointed out early on in this discussion
that dependency discovery is the important issue.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 14 20:09:24 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 15:09:24 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 13:02:01 CDT."
 <15626.12185.867020.437418@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net>
 <15626.12185.867020.437418@12-248-41-177.client.attbi.com>
Message-ID: <200206141909.g5EJ9Ow09923@pcp02138704pcs.reston01.va.comcast.net>

> May I humbly propose that this task should be automated?  Tools like
> makedepend have been invented and reinvented many times over
> precisely because it's too error-prone for humans to maintain that
> information manually.  Switching from Make's syntax to Python's
> syntax won't make that task substantially easier.

Unfortunately it's also darn tooting hard to do a good job of
discovering dependencies, which is why there is still no standard tool
that does this.  Makedepend tries, but is still hard to use.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 14 20:11:30 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 15:11:30 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 20:03:31 +0200."
 <01e701c213cd$d148ae60$ced241d5@hagrid>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net>
 <01e701c213cd$d148ae60$ced241d5@hagrid>
Message-ID: <200206141911.g5EJBUC09941@pcp02138704pcs.reston01.va.comcast.net>

> > I thought that the thing to do this was
> > 
> >   python setup.py build_ext -i
> 
> oh, that's definitely close enough.
> 
> that's what you get for reading the docs instead of trying
> every combination of the available options ;-)
> 
> (maybe someone who knows a little more about distutils
> could take an hour and add brief overviews of all standard
> commands to the reference section(s)?  just having a list
> of all commands and command options would have helped
> me, for sure...)

Instead of bothering with the (mostly) harmless but also mostly
unhelpful manuals, try the --help feature.  E.g. this has the info you
want:

$ python setup.py build_ext --help
Global options:
  --verbose (-v)  run verbosely (default)
  --quiet (-q)    run quietly (turns verbosity off)
  --dry-run (-n)  don't actually do anything
  --help (-h)     show detailed help message

Options for 'PyBuildExt' command:
  --build-lib (-b)     directory for compiled extension modules
  --build-temp (-t)    directory for temporary files (build by-products)
  --inplace (-i)       ignore build-lib and put compiled extensions into the
                       source directory alongside your pure Python modules
  --include-dirs (-I)  list of directories to search for header files
                       (separated by ':')
  --define (-D)        C preprocessor macros to define
  --undef (-U)         C preprocessor macros to undefine
  --libraries (-l)     external C libraries to link with
  --library-dirs (-L)  directories to search for external C libraries
                       (separated by ':')
  --rpath (-R)         directories to search for shared C libraries at runtime
  --link-objects (-O)  extra explicit link objects to include in the link
  --debug (-g)         compile/link with debugging information
  --force (-f)         forcibly build everything (ignore file timestamps)
  --compiler (-c)      specify the compiler type
  --swig-cpp           make SWIG create C++ files (default is C)
  --help-compiler      list available compilers

usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help

$

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas@python.ca  Fri Jun 14 20:47:22 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 14 Jun 2002 12:47:22 -0700
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141909.g5EJ9Ow09923@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jun 14, 2002 at 03:09:24PM -0400
References: <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net> <15626.12185.867020.437418@12-248-41-177.client.attbi.com> <200206141909.g5EJ9Ow09923@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020614124722.A415@glacier.arctrix.com>

Guido van Rossum wrote:
> Unfortunately it's also darn tooting hard to do a good job of
> discovering dependencies, which is why there is still no standard tool
> that does this.  Makedepend tries, but is still hard to use.

ccache is an interesting solution to the problem.

  Neil



From Steve Holden" <sholden@holdenweb.com  Fri Jun 14 20:43:04 2002
From: Steve Holden" <sholden@holdenweb.com (Steve Holden)
Date: Fri, 14 Jun 2002 15:43:04 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net>              <01e701c213cd$d148ae60$ced241d5@hagrid>  <200206141911.g5EJBUC09941@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <052f01c213db$b4f0a2f0$bc778f41@holdenweb.com>

[Fredrik]
> > (maybe someone who knows a little more about distutils
> > could take an hour and add brief overviews of all standard
> > commands to the reference section(s)?  just having a list
> > of all commands and command options would have helped
> > me, for sure...)
>
[Guido]
> Instead of bothering with the (mostly) harmless but also mostly
> unhelpful manuals, try the --help feature.  E.g. this has the info you
> want:
>
> $ python setup.py build_ext --help

[ ... ]
It seems like a shame that effort was wasted producing "unhelpful"
documentation (and I have to say my experience was similar, but I thought it
was just me). The better the docs, the more module and extension authors
will use distutils.

Is the problem simply too generic for it to be logged as a documentation
bug? (A bit like the famous DEC SIR: "VMS 2.0 does not work". DEC's
response? "Fixed in next release"). Couldn't find anything in SF.

regards
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------







From guido@python.org  Fri Jun 14 20:49:43 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 15:49:43 -0400
Subject: [Python-Dev] 'new' and 'types'
In-Reply-To: Your message of "Fri, 14 Jun 2002 19:15:58 +0300."
 <20020614191558.A31580@hishome.net>
References: <20020614191558.A31580@hishome.net>
Message-ID: <200206141949.g5EJnh610636@pcp02138704pcs.reston01.va.comcast.net>

> Patch 568629 removes the built-in module new (with sincere apologies
> to Tommy Burnette ;-) and replaces it with a tiny Python module
> consisting of a single import statement:

I'm reviewing it now.  It seems it's your patch.  Did you forget to
mention that in this message?

> Now, what about the types module?  It has been suggested that this
> module should be deprecated.  I think it still has some use: we need
> a place to put all the types that are not used often enough to be
> added to the builtins.  I suggest that they be placed in the module
> 'types' with names matching their __name__ attribute.  The types
> module will still have the long MixedCaseType names for backward
> compatibility.  The use of the long names should be deprecated, not
> the types module itself.

Not a bad idea.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip@pobox.com  Fri Jun 14 20:46:36 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 14 Jun 2002 14:46:36 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15626.18460.685973.605098@12-248-41-177.client.attbi.com>

    >> Thanks for elaborating the distinction.  That is exactly what I
    >> missed.  I really want make+makedepend.  I think that's what others
    >> have missed as well.

    Guido> Sorry Skip, but many others pointed out early on in this
    Guido> discussion that dependency discovery is the important issue.

Which distutils doesn't do, but for which make and/or compilers have done
for years.

Skip




From fredrik@pythonware.com  Fri Jun 14 20:48:29 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 14 Jun 2002 21:48:29 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net>              <01e701c213cd$d148ae60$ced241d5@hagrid>  <200206141911.g5EJBUC09941@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <023301c213dc$7d3567a0$ced241d5@hagrid>

Guido wrote:
> Instead of bothering with the (mostly) harmless but also mostly
> unhelpful manuals, try the --help feature.  E.g. this has the info you
> want:

I think I got sidetracked by the --help-commands summary, which
sort of implies that build_ext is just a subvariant of build...

(maybe we could add a --help-commands-long option that lists
both the command names and their descriptions?  my brain clearly
couldn't execute [--help x for x in commands] without adding an
arbitrary if-clause, but I'm sure distutils can do that...)

</F>




From guido@python.org  Fri Jun 14 20:54:55 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 15:54:55 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 14:46:36 CDT."
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
Message-ID: <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>

>     Guido> Sorry Skip, but many others pointed out early on in this
>     Guido> discussion that dependency discovery is the important issue.
> 
> Which distutils doesn't do, but for which make and/or compilers have
> done for years.

That same imprecise language again that got you in trouble before! :-)

Make doesn't do dependency discovery (beyond the trivial .c -> .o).
There may be a few compilers that do this but I don't think it's the
norm.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip@pobox.com  Fri Jun 14 20:55:56 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 14 Jun 2002 14:55:56 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
Message-ID: <15626.19020.867632.353969@12-248-41-177.client.attbi.com>

    Skip> Which distutils doesn't do, but for which make and/or compilers
    Skip> have done for years.

Bad English, sorry.  Should have been "which has been available for make for
years".  

Skip



From skip@pobox.com  Fri Jun 14 20:58:35 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 14 Jun 2002 14:58:35 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15626.19179.464237.382313@12-248-41-177.client.attbi.com>

    Guido> That same imprecise language again that got you in trouble
    Guido> before! :-)

Yes, I realize that.

    Guido> Make doesn't do dependency discovery (beyond the trivial .c ->
    Guido> .o).  There may be a few compilers that do this but I don't think
    Guido> it's the norm.

I also realize that.  Gcc has had good dependency checking for probably ten
years.  Sun's C compiler for a similar length of time.  Larry Wall did a
pretty good job of dependency checking for patch in the mid-80's.  Scons
does it as well.

Skip




From guido@python.org  Fri Jun 14 21:07:02 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 16:07:02 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 14:58:35 CDT."
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
Message-ID: <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>

> Gcc has had good dependency checking for probably ten years.

How do you invoke this?  Maybe we can use this to our advantage.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip@pobox.com  Fri Jun 14 21:12:20 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 14 Jun 2002 15:12:20 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15626.20004.302739.140783@12-248-41-177.client.attbi.com>

    >> Gcc has had good dependency checking for probably ten years.

    Guido> How do you invoke this?  Maybe we can use this to our advantage.

"gcc -M" gives you all dependencies.  "gcc -MM" gives you just the stuff
included via '#include "file"' and omits the headers included via '#include
<file>'.  Programmers use <file> and "file" inconsistently enough that it's
probably better to just use -M and eliminate the files you don't care about
(or leave them in and have Python rebuild automatically after OS upgrades).
There are several other variants as well.  Search the GCC man page for "-M".

It seems to me that distutils' base compiler class could provide a generic
makedepend-like method which could be overridden in subclasses where
specific compilers have better builtin schemes for dependency generation.
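
For what it's worth, a rough sketch of the generic fallback (just my
assumption of how it might look -- shelling out to "gcc -M" rather than
going through the compiler class properly) could be:

    import os

    def find_dependencies(source, include_dirs=[]):
        # Ask gcc for a make-style rule ("foo.o: foo.c foo.h ...") and
        # return the list of files on its right-hand side.
        flags = ' '.join(['-I' + d for d in include_dirs])
        pipe = os.popen('gcc -M %s %s' % (flags, source))
        rule = pipe.read().replace('\\\n', ' ')
        pipe.close()
        return rule.split(':', 1)[1].split()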

Skip




From guido@python.org  Fri Jun 14 21:19:16 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Jun 2002 16:19:16 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Fri, 14 Jun 2002 15:12:20 CDT."
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
Message-ID: <200206142019.g5EKJGV10977@pcp02138704pcs.reston01.va.comcast.net>

> "gcc -M" gives you all dependencies.  "gcc -MM" gives you just the
> stuff included via '#include "file"' and omits the headers included
> via '#include <file>'.  Programmers use <file> and "file"
> inconsistently enough that it's probably better to just use -M and
> eliminate the files you don't care about (or leave them in and have
> Python rebuild automatically after OS upgrades).  There are several
> other variants as well.  Search the GCC man page for "-M".

Cool.

> It seems to me that distutils' base compiler class could provide a generic
> makedepend-like method which could be overridden in subclasses where
> specific compilers have better builtin schemes for dependency generation.

Care to whip up a patch?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas.heller@ion-tof.com  Fri Jun 14 21:23:17 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 14 Jun 2002 22:23:17 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>   <15624.62928.845160.407762@12-248-41-177.client.attbi.com>        <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>        <15625.2485.994408.888814@12-248-41-177.client.attbi.com>        <m3660myggx.fsf@mira.informatik.hu-berlin.de>        <15625.16032.161304.357298@12-248-41-177.client.attbi.com>        <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>        <20020614074458.GA31022@strakt.com>        <3D09A388.8080107@lemburg.com>        <15626.11408.660388.360296@12-248-41-177.client.attbi.com>        <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>        <15626.18460.685973.605098@12-248-41-177.client.attbi.com>        <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>        <15626.19179.464237.382313@12-248-41-177.client.attbi.com>        <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
Message-ID: <00bb01c213e1$528655a0$475afea9@thomasnotebook>

From: "Skip Montanaro" <skip@pobox.com>
> 
>     >> Gcc has had good dependency checking for probably ten years.
> 
>     Guido> How do you invoke this?  Maybe we can use this to our advantage.
> 
> "gcc -M" gives you all dependencies.  "gcc -MM" gives you just the stuff
> included via '#include "file"' and omits the headers included via '#include
> <file>'.  Programmers use <file> and "file" inconsistently enough that it's
> probably better to just use -M and eliminate the files you don't care about
> (or leave them in and have Python rebuild automatically after OS upgrades).
> There are several other variants as well.  Search the GCC man page for "-M".
> 
> It seems to me that distutils' base compiler class could provide a generic
> makedepend-like method which could be overridden in subclasses where
> specific compilers have better builtin schemes for dependency generation.
> 

MSVC could do something similar with the /E or /P flag (preprocess
to standard out or to file). A simple Python filter looking for #line
directives could then collect the dependencies.
Aren't -E and -P also available in any unixish compiler?
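
A tiny filter along those lines (only a sketch; it assumes the
preprocessor emits '# <n> "file"' or '#line <n> "file"' markers on
stdout) might be:

    import re, sys

    # Collect the file names mentioned in preprocessor line markers,
    # e.g. the output of "cl /E foo.c" or "cc -E foo.c" piped in.
    marker = re.compile(r'^#(?:line)?\s+\d+\s+"([^"]+)"')

    def dependencies(lines):
        seen = {}
        for line in lines:
            m = marker.match(line)
            if m:
                seen[m.group(1)] = 1
        return seen.keys()

    if __name__ == '__main__':
        for name in dependencies(sys.stdin.readlines()):
            print(name)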

Thomas




From thomas.heller@ion-tof.com  Fri Jun 14 21:27:22 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 14 Jun 2002 22:27:22 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <LNBBLJKPBEHFEDALKOLCAECDPNAA.tim.one@comcast.net>
Message-ID: <00db01c213e1$e3e21890$475afea9@thomasnotebook>

> [Tim, on the debug-build sys.getobjects()]
Thanks, this is useful info. Seems I have to read the source
more often...

Thomas




From skip@pobox.com  Fri Jun 14 21:33:31 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 14 Jun 2002 15:33:31 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <00bb01c213e1$528655a0$475afea9@thomasnotebook>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
 <00bb01c213e1$528655a0$475afea9@thomasnotebook>
Message-ID: <15626.21275.105157.150399@12-248-41-177.client.attbi.com>

    Thomas> MSVC could do something similar with the /E or /P flag
    Thomas> (preprocess to standard out or to file). A simple python filter
    Thomas> looking for #line directives could then collect the
    Thomas> dependencies.  Aren't -E and -P also available in any unixish
    Thomas> compiler?

Yes.  I believe this is how some makedepend scripts work.

Skip



From ask@valueclick.com  Fri Jun 14 21:47:04 2002
From: ask@valueclick.com (Ask Bjoern Hansen)
Date: Fri, 14 Jun 2002 13:47:04 -0700 (PDT)
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: <j4y9dn12vo.fsf@informatik.hu-berlin.de>
Message-ID: <Pine.LNX.4.44.0206141344300.30919-100000@impatience.valueclick.com>

On 10 Jun 2002, Martin v. Löwis wrote:

> My recommendation would be to disable the script, and remove the
> snapshots, perhaps leaving a page that anybody who wants the snapshots
> should ask at python-dev to re-enable them.

feel free to refer people to;

http://cvs.perl.org/snapshots/python/

I'll keep about half a week's worth of 6-hourly snapshots there, like
we do for parrot at http://cvs.perl.org/snapshots/parrot/


 - ask

-- 
ask bjoern hansen, http://ask.netcetera.dk/         !try; do();




From akuchlin@mems-exchange.org  Fri Jun 14 21:57:44 2002
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 14 Jun 2002 16:57:44 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <052f01c213db$b4f0a2f0$bc778f41@holdenweb.com>
References: <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <0e8d01c21392$2b7bc5c0$e000a8c0@thomasnotebook> <016a01c21395$dca1eed0$0900a8c0@spiff> <200206141217.g5ECHqf31913@pcp02138704pcs.reston01.va.comcast.net> <01e701c213cd$d148ae60$ced241d5@hagrid> <200206141911.g5EJBUC09941@pcp02138704pcs.reston01.va.comcast.net> <052f01c213db$b4f0a2f0$bc778f41@holdenweb.com>
Message-ID: <20020614205744.GA12086@ute.mems-exchange.org>

On Fri, Jun 14, 2002 at 03:43:04PM -0400, Steve Holden wrote:
>It seems like a shame that effort was wasted producing "unhelpful"
>documentation (and I have to say my experience was similar, but I thought it
>was just me). The better the docs, the more module and extension authors
>will use distutils.

Part of it is not having an idea of what tasks people commonly need to
do with Distutils.  

--amk




From tim.one@comcast.net  Fri Jun 14 22:23:31 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 14 Jun 2002 17:23:31 -0400
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
In-Reply-To: <01e701c213cd$d148ae60$ced241d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEDPPNAA.tim.one@comcast.net>

[/F]
> ...
> (maybe someone who knows a little more about distutils
> could take an hour and add brief overviews of all standard
> commands to the reference section(s)?  just having a list
> of all commands and command options would have helped
> me, for sure...)

Me too, except that it still would <wink>.  The docs do a fine job of
explaining the framework, but it turns out every option I actually have to
use gets extracted from one of my coworkers at the point of tears <wink>.




From gmcm@hypernet.com  Fri Jun 14 23:25:43 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 14 Jun 2002 18:25:43 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141911.g5EJBUC09941@pcp02138704pcs.reston01.va.comcast.net>
References: Your message of "Fri, 14 Jun 2002 20:03:31 +0200." <01e701c213cd$d148ae60$ced241d5@hagrid>
Message-ID: <3D0A3527.4320.90867436@localhost>

> $ python setup.py build_ext --help
> Global options:
...
>   --dry-run (-n)  don't actually do anything 

Last time I tried that with a package, it went
ahead and installed itself anyway.

-- Gordon
http://www.mcmillan-inc.com/




From neal@metaslash.com  Fri Jun 14 21:13:12 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 14 Jun 2002 16:13:12 -0400
Subject: [Python-Dev] addressing distutils inability to track file
 dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D0A4E58.760E3D40@metaslash.com>

Guido van Rossum wrote:
> 
> > Gcc has had good dependency checking for probably ten years.
> 
> How do you invoke this?  Maybe we can use this to our advantage.

Here's a bunch of useful options.

gcc --help -v | grep -e -M

  -M                        Generate make dependencies
  -MM                       As -M, but ignore system header files
  -MF <file>                Write dependency output to the given file
  -MG                       Treat missing header file as generated files
  -MP                       Generate phony targets for all headers
  -MQ <target>              Add a MAKE-quoted target
  -MT <target>              Add an unquoted target

  -MD                     Print dependencies to FILE.d
  -MMD                    Print dependencies to FILE.d
  -M                      Print dependencies to stdout
  -MM                     Print dependencies to stdout



From martin@v.loewis.de  Fri Jun 14 23:38:41 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Jun 2002 00:38:41 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
Message-ID: <m3wut1eb9q.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> "gcc -M" gives you all dependencies.  "gcc -MM" gives you just the stuff
> included via '#include "file"' and omits the headers included via '#include
> <file>'.  

Both options are somewhat obsolete. They require a separate invocation
of the compiler to output the dependencies, since the dependencies go
to stdout; the compiler can't do compilation at the same time.

It is much better if compilation of a file updates the dependency
information as a side effect. For that, gcc supports -MD/-MMD since
1989; this generates dependencies in a file obtained by replacing the
.o extension of the target with .d.

SunPRO supports generation of dependency files also as a separate
compiler invocation. It also supports the undocumented environment
variable SUNPRO_DEPENDENCIES, which allows specification of the
dependency file, along with specification of directories.

GCC also supports SUNPRO_DEPENDENCIES, so this is the most effective
and portable way to get dependency file generation.
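
As a rough illustration of that workflow (my sketch, assuming gcc and
the -MD behaviour described above, with the .d file written next to
the object file):

    import os

    def compile_and_read_deps(source):
        # Compile normally; -MD writes "foo.d" alongside "foo.o" as a
        # side effect, containing "foo.o: foo.c foo.h ...".
        if os.system('gcc -c -MD %s' % source) != 0:
            raise RuntimeError('compilation of %s failed' % source)
        dfile = os.path.splitext(source)[0] + '.d'
        rule = open(dfile).read().replace('\\\n', ' ')
        return rule.split(':', 1)[1].split()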

Regards,
Martin



From martin@v.loewis.de  Fri Jun 14 23:45:40 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Jun 2002 00:45:40 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141902.g5EJ20e09812@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <m3660m8hot.fsf@mira.informatik.hu-berlin.de>
 <200206141158.g5EBwrw31717@pcp02138704pcs.reston01.va.comcast.net>
 <m3znxxvl0b.fsf@mira.informatik.hu-berlin.de>
 <200206141902.g5EJ20e09812@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3sn3peay3.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> > I don't think you can tell setup.py to build nismodule.so.
> 
> Actually, you can.  Just specify "nismodule" as the extension name.

That won't work. setup.py tries to import "md5module", which fails
since md5module.so has no function initmd5module.

> I don't know if we need consistency, but if we do, I propose that we
> deprecate the "module" part.

Ok, I'll try to remove the feature that makesetup adds "module".

Regards,
Martin



From nhodgson@bigpond.net.au  Sat Jun 15 02:11:36 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Sat, 15 Jun 2002 11:11:36 +1000
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>              <15626.18460.685973.605098@12-248-41-177.client.attbi.com>  <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <025f01c21409$98c42dd0$3da48490@neil>

Guido:

> Make doesn't do dependency discovery (beyond the trivial .c -> .o).
> There may be a few compilers that do this but I don't think it's the
> norm.

   Borland make does, in conjunction with the compiler, which includes
header dependencies in the object file. Thus there is no need for
dependency generation options like gcc's, and no such options are
provided.

   It's differences in functionality like this that will cause problems
with moving towards greater use of make.

   Neil





From skip@pobox.com  Sat Jun 15 15:51:25 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 15 Jun 2002 09:51:25 -0500
Subject: [Python-Dev] unicode() and its error argument
Message-ID: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>

The unicode() builtin accepts an optional third argument, errors, which
defaults to "strict".  According to the docs if errors is set to "ignore",
decoding errors are silently ignored.  I seem to still get the occasional
UnicodeError exception, however.  I'm still trying to track down an actual
example (it doesn't happen often, and I hadn't wrapped unicode() in a
try/except statement, so all I saw was the error raised, not the input
string value).
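
(For reference, the documented behaviour is that something like this
should never raise -- e.g. with the ascii codec:

    >>> unicode('abc\xff', 'ascii', 'ignore')
    u'abc'

yet every so often I see a UnicodeError anyway.)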

This reminds me, it occurred to me the other day that a plain text version
of cgitb would be useful to use for non-web scripts.  You'd get a lot more
context about the environment in which the exception was raised.

Skip




From phd@phd.pp.ru  Sat Jun 15 15:58:42 2002
From: phd@phd.pp.ru (Oleg Broytmann)
Date: Sat, 15 Jun 2002 18:58:42 +0400
Subject: [Python-Dev] unicode() and its error argument
In-Reply-To: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>; from skip@pobox.com on Sat, Jun 15, 2002 at 09:51:25AM -0500
References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>
Message-ID: <20020615185842.D12705@phd.pp.ru>

On Sat, Jun 15, 2002 at 09:51:25AM -0500, Skip Montanaro wrote:
> The unicode() builtin accepts an optional third argument, errors, which
> defaults to "strict".  According to the docs if errors is set to "ignore",
> decoding errors are silently ignored.  I seem to still get the occasional
> UnicodeError exception, however.

   I got the error very often (but I use encoding conversion much more
often than you). First time I saw it I was very surprised that neither
"ignore" nor "replace" can eliminate the error.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From guido@python.org  Sat Jun 15 16:03:53 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 15 Jun 2002 11:03:53 -0400
Subject: [Python-Dev] unicode() and its error argument
In-Reply-To: Your message of "Sat, 15 Jun 2002 09:51:25 CDT."
 <15627.21613.94336.985634@12-248-41-177.client.attbi.com>
References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>
Message-ID: <200206151503.g5FF3rJ16446@pcp02138704pcs.reston01.va.comcast.net>

> The unicode() builtin accepts an optional third argument, errors,
> which defaults to "strict".  According to the docs if errors is set
> to "ignore", decoding errors are silently ignored.  I seem to still
> get the occasional UnicodeError exception, however.  I'm still
> trying to track down an actual example (it doesn't happen often, and
> I hadn't wrapped unicode() in a try/except statement, so all I saw
> was the error raised, not the input string value).

This is between you and MAL. :-)

> This reminds me, it occurred to me the other day that a plain text
> version of cgitb would be useful to use for non-web scripts.  You'd
> get a lot more context about the environment in which the exception
> was raised.

Not a bad idea.  I think it could live in the traceback module,
possibly as a family of functions named "fancy_traceback" and similar.
Care to do a patch?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jun 15 16:05:12 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 15 Jun 2002 11:05:12 -0400
Subject: [Python-Dev] unicode() and its error argument
In-Reply-To: Your message of "Sat, 15 Jun 2002 18:58:42 +0400."
 <20020615185842.D12705@phd.pp.ru>
References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>
 <20020615185842.D12705@phd.pp.ru>
Message-ID: <200206151505.g5FF5Cr16468@pcp02138704pcs.reston01.va.comcast.net>

>    I got the error very often (but I use encoding conversion much more
> often than you). First time I saw it I was very surprised that neither
> "ignore" nor "replace" can eliminate the error.

Got an example?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From phd@phd.pp.ru  Sat Jun 15 16:04:41 2002
From: phd@phd.pp.ru (Oleg Broytmann)
Date: Sat, 15 Jun 2002 19:04:41 +0400
Subject: [Python-Dev] unicode() and its error argument
In-Reply-To: <200206151505.g5FF5Cr16468@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Sat, Jun 15, 2002 at 11:05:12AM -0400
References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> <20020615185842.D12705@phd.pp.ru> <200206151505.g5FF5Cr16468@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020615190441.E12705@phd.pp.ru>

On Sat, Jun 15, 2002 at 11:05:12AM -0400, Guido van Rossum wrote:
> >    I got the error very often (but I use encoding conversion much more
> > often than you). First time I saw it I was very surprised that neither
> > "ignore" nor "replace" can eliminate the error.
> 
> Got an example?

   Not right now... I'll send it when I get one.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From jon+python-dev@unequivocal.co.uk  Sat Jun 15 16:44:54 2002
From: jon+python-dev@unequivocal.co.uk (Jon Ribbens)
Date: Sat, 15 Jun 2002 16:44:54 +0100
Subject: [Python-Dev] unicode() and its error argument
In-Reply-To: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>; from skip@pobox.com on Sat, Jun 15, 2002 at 09:51:25AM -0500
References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>
Message-ID: <20020615164454.B4842@snowy.squish.net>

Skip Montanaro <skip@pobox.com> wrote:
> This reminds me, it occurred to me the other day that a plain text version
> of cgitb would be useful to use for non-web scripts.  You'd get a lot more
> context about the environment in which the exception was raised.

I have code adapted from cgitb in my jonpy cgi module which
simultaneously does text and optional html fancy tracebacks.

See the function "traceback" in:

  http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/jonpy/jonpy/jon/cgi.py

The 'req.error()' calls are doing the text traceback; simply remove
the stuff that says 'if html' to drop the html traceback, and then do
a search & replace from req.error to out.write() or something.



From tim.one@comcast.net  Sat Jun 15 17:21:03 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 15 Jun 2002 12:21:03 -0400
Subject: [Python-Dev] unicode() and its error argument
In-Reply-To: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEFOPNAA.tim.one@comcast.net>

[Skip Montanaro]
> The unicode() builtin accepts an optional third argument, errors, which
> defaults to "strict".  According to the docs if errors is set to "ignore",
> decoding errors are silently ignored.  I seem to still get the occasional
> UnicodeError exception, however.  I'm still trying to track down an actual
> example (it doesn't happen often, and I hadn't wrapped unicode() in a
> try/except statement, so all I saw was the error raised, not the input
> string value).

Play with this:

"""
def generrors(encoding, errors, maxlen, maxtries):
    from random import choice, randint
    bytes = [chr(i) for i in range(256)]
    paste = ''.join
    for dummy in xrange(maxtries):
        n = randint(1, maxlen)
        raw = paste([choice(bytes) for dummy in range(n)])
        try:
            u = unicode(raw, encoding, errors)
        except UnicodeError, detail:
            print 'fail w/ errors', errors, '- raw data', repr(raw)
            print '    UnicodeError', str(detail)

errors = ('strict', 'replace', 'ignore')

generrors('mac-turkish', errors[2], 10, 1000)
"""

Plug in your favorite encoding and let it do the work of finding examples.
It generates plenty of errors with 'strict', but so far I haven't seen it
generate one with 'replace' or 'ignore'.




From gward@python.net  Sat Jun 15 18:31:14 2002
From: gward@python.net (Greg Ward)
Date: Sat, 15 Jun 2002 13:31:14 -0400
Subject: [Python-Dev] addressing distutils inability to track file  dependencies
In-Reply-To: <m3wut272t7.fsf@mira.informatik.hu-berlin.de>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <3D08DDD7.BD8573D8@prescod.net> <15624.64540.472905.469106@12-248-41-177.client.attbi.com> <3D093438.92B46349@prescod.net> <15625.16072.900596.114938@12-248-41-177.client.attbi.com> <m3wut272t7.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020615173114.GA8981@gerg.ca>

On 14 June 2002, Martin v. Loewis said:
> Skip Montanaro <skip@pobox.com> writes:
> 
> >     Paul> I guess most of us don't understand the benefits because we don't
> >     Paul> see dependency tracking as necessarily that difficult. It's no
> >     Paul> harder than the new method resolution order. ;)
> > 
> > If it's not that difficult why isn't it being done? <no wink>
> 
> You are wrong assuming it is not done. distutils does dependency
> analysis since day 1.

Only insofar as foo.o depends on foo.c.  The header file stuff Jeremy
has been adding sounds like a very useful addition (haven't actually
inspected his patches yet).

        Greg
-- 
Greg Ward - Unix bigot                                  gward@python.net
http://starship.python.net/~gward/
Monday is an awful way to spend one seventh of your life.



From gward@python.net  Sat Jun 15 18:36:39 2002
From: gward@python.net (Greg Ward)
Date: Sat, 15 Jun 2002 13:36:39 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <005a01c21388$387cd3e0$0900a8c0@spiff>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff>
Message-ID: <20020615173639.GB8981@gerg.ca>

On 14 June 2002, Fredrik Lundh said:
> alex wrote:
> 
> > The "problem" (:-) is that it's great at just building extensions, too.
> > 
> > python2.1 setup.py install, python2.2 setup.py install, python2.3 setup.py 
> > install, and hey pronto, I have my extension built and installed on all 
> > Python versions I want to support, ready for testing.  Hard to beat!-)
> 
> does your code always work right away?

If we're talking about a downloaded third party extension -- the main
use case for the Distutils -- one certainly hopes so!  It's only a happy
accident that the Distutils are moderately useful for
building/development.

> I tend to use an incremental approach, with lots of edit-compile-run
> cycles.  I still haven't found a way to get the damn thing to just build
> my extension and copy it to the current directory, so I can run the
> test scripts.

Last time I checked:

  python setup.py build_ext --inplace

> (distutils is also a pain to use with a version management system
> that marks files in the repository as read-only; distutils copy function
> happily copies all the status bits. but the remove function refuses to
> remove files that are read-only, even if the files have been created
> by distutils itself...)

Yeah, that's a stupid situation.  I'm sure there are "XXX" comments in 
the code where I ponder the wisdom of preserving mtime and mode.

        Greg
-- 
Greg Ward - just another Python hacker                  gward@python.net
http://starship.python.net/~gward/



From gmcm@hypernet.com  Sat Jun 15 20:00:06 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Sat, 15 Jun 2002 15:00:06 -0400
Subject: [Python-Dev] SF bug count
Message-ID: <3D0B5676.8419.94F0937E@localhost>

Hi all,

 SF's "tracker" page 
    http://sourceforge.net/tracker/?group_id=5470
 says there are a total of 2581 bugs.

Using a url template of (broken so as to be
readable):
    bugurlfmt = http://sourceforge.net/tracker/index.php
      ?group_id=5470
      &atid=105470
      &set=custom
      &_assigned_to=100
      &_status=100
      &_category=100
      &_group=100
      &order=artifact_id
      &sort=ASC&offset=%d

to get 51 at a time, I get only 636.

Whose bug?

-- Gordon
http://www.mcmillan-inc.com/






From niemeyer@conectiva.com  Sat Jun 15 20:08:31 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sat, 15 Jun 2002 16:08:31 -0300
Subject: [Python-Dev] mkdev, major, st_rdev, etc
Message-ID: <20020615160831.A5440@ibook.distro.conectiva>

After thinking for a while, and doing some research about these
functions, I've changed my mind about the best way to implement
the needed functionality for tarfile. Maybe including major,
minor, and makedev is the best solution. Some of the issues I'm
considering:

- st_rdev was already available in 2.2, so we'd have to introduce
  a new redundant pair attribute to provide a (major, minor) pair.
- mkdev would be able to use the standard posix format, and would
  work regardless of makedev's availability (mkdev is being
  introduced in 2.3).
- more flexible. major, minor, and makedev may be needed in other
  cases, besides st_rdev parsing and mknod device creation.
- TYPES.py is already trying to provide them, but it's broken
  (indeed, it's more broken than that. h2py should use cpp to
  preprocess the files, but that's something for another occasion).
- these "functions" are usually macros, thus should introduce
  little overhead.

A patch providing these functions is available at
http://www.python.org/sf/569139

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From martin@v.loewis.de  Sat Jun 15 22:00:02 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Jun 2002 23:00:02 +0200
Subject: [Python-Dev] addressing distutils inability to track file  dependencies
In-Reply-To: <20020615173114.GA8981@gerg.ca>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <3D08DDD7.BD8573D8@prescod.net>
 <15624.64540.472905.469106@12-248-41-177.client.attbi.com>
 <3D093438.92B46349@prescod.net>
 <15625.16072.900596.114938@12-248-41-177.client.attbi.com>
 <m3wut272t7.fsf@mira.informatik.hu-berlin.de>
 <20020615173114.GA8981@gerg.ca>
Message-ID: <m3k7p0mf59.fsf@mira.informatik.hu-berlin.de>

Greg Ward <gward@python.net> writes:

> > You are wrong assuming it is not done. distutils does dependency
> > analysis since day 1.
> 
> Only insofar as foo.o depends on foo.c.  The header file stuff Jeremy
> has been adding sounds like a very useful addition (haven't actually
> inspected his patches yet).

Certainly true. However, the makefiles that Skip wanted to generate
would not have offered anything beyond "foo.o depends on foo.c". He
then recognized that dependencies are essential, here, and suggested
makedepend...

Regards,
Martin




From martin@v.loewis.de  Sat Jun 15 22:06:06 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Jun 2002 23:06:06 +0200
Subject: [Python-Dev] SF bug count
In-Reply-To: <3D0B5676.8419.94F0937E@localhost>
References: <3D0B5676.8419.94F0937E@localhost>
Message-ID: <m3fzzomev5.fsf@mira.informatik.hu-berlin.de>

"Gordon McMillan" <gmcm@hypernet.com> writes:

> to get 51 at a time, I get only 636.
> 
> Whose bug?

Yours; you count only unassigned reports. assigned_to=0 gives you some
more.

Regards,
Martin



From gmcm@hypernet.com  Sat Jun 15 22:11:40 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Sat, 15 Jun 2002 17:11:40 -0400
Subject: [Python-Dev] SF bug count
In-Reply-To: <3D0B5676.8419.94F0937E@localhost>
Message-ID: <3D0B754C.4256.956906F7@localhost>

On 15 Jun 2002 at 15:00, Gordon McMillan wrote:

Hmmph. For any filterable column *except*
_assigned_to, "100" means "any". For
_assigned_to, the magic number is "0".

>  SF's "tracker" page 
>     http://sourceforge.net/tracker/?group_id=5470
>  says there are a total of 2581 bugs.
> 
> Using a url template of (broken so as to be
> readable):
>     bugurlfmt =
>     http://sourceforge.net/tracker/index.php
>       ?group_id=5470
>       &atid=105470
>       &set=custom
>       &_assigned_to=100
>       &_status=100
>       &_category=100
>       &_group=100
>       &order=artifact_id
>       &sort=ASC&offset=%d
> 
> to get 51 at a time, I get only 636.
> 
> Whose bug?

Their design bug is my implementation bug.

-- Gordon
http://www.mcmillan-inc.com/




From martin@v.loewis.de  Sat Jun 15 22:22:19 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Jun 2002 23:22:19 +0200
Subject: [Python-Dev] mkdev, major, st_rdev, etc
In-Reply-To: <20020615160831.A5440@ibook.distro.conectiva>
References: <20020615160831.A5440@ibook.distro.conectiva>
Message-ID: <m3bsacme44.fsf@mira.informatik.hu-berlin.de>

Gustavo Niemeyer <niemeyer@conectiva.com> writes:

> - mkdev would be able to use the standard posix format, and would
>   work regardless of makedev's availability (mkdev is being
>   introduced in 2.3).

Notice that this interface is *not* part of the Posix spec.

http://www.opengroup.org/onlinepubs/007904975/functions/mknod.html

says that the only portable use of mknod is to create FIFOs; any use
where dev is not null is unspecified. Furthermore, major and minor are
not part of Posix.

> - TYPES.py is already trying to provide them, but it's broken
>   (indeed, it's more broken than that. h2py should use cpp to
>   preprocess the files, but that's something for another occasion).

That cannot work: the preprocessor will eat the macro definition, and
you have no way to find out what its body was.

> A patch providing these functions is available at
> http://www.python.org/sf/569139

I wonder whether the additional TRY_COMPILE test is really
necessary. Isn't it sufficient to restrict attention to systems on
which major and minor are macros, and use

#ifdef major

inside posixmodule.c?

Regards,
Martin



From niemeyer@conectiva.com  Sat Jun 15 23:21:46 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sat, 15 Jun 2002 19:21:46 -0300
Subject: [Python-Dev] mkdev, major, st_rdev, etc
In-Reply-To: <m3bsacme44.fsf@mira.informatik.hu-berlin.de>
References: <20020615160831.A5440@ibook.distro.conectiva> <m3bsacme44.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020615192146.A5978@ibook.distro.conectiva>

Hello Martin!

> > - mkdev would be able to use the standard posix format, and would
> >   work regardless of makedev's availability (mkdev is being
> >   introduced in 2.3).
> 
> Notice that this interface is *not* part of the Posix spec.

Please, notice that what I said is that *mknod* would be able to
use the standard posix format. In other words, instead of

mknod(name, mode, major, minor)

It can become:

mknod(name, mode, device)

Which *is* posix compliant, so your note seems to be another reason
for us to use the new proposed system.
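
For tarfile.py the usage would then be roughly (just a sketch, assuming
the patch lands with os.major/os.minor/os.makedev and the three-argument
os.mknod):

    import os, stat

    # Recreate a character device recorded in an archive as a
    # (major, minor) pair -- e.g. (1, 3) is /dev/null on Linux.
    device = os.makedev(1, 3)
    os.mknod('null', 0600 | stat.S_IFCHR, device)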

> http://www.opengroup.org/onlinepubs/007904975/functions/mknod.html
> 
> says that the only portable use of mknod is to create FIFOs; any use
> where dev is not null is unspecified. Furthermore, major and minor are
> not part of Posix.

Indeed. But I think we need this functionality nevertheless, since
that's the only way to create special devices on Linux and other
systems. Otherwise we can't have modules like tarfile.py, which have
to rebuild such devices. Besides that, os is meant to hold operating
system specific functionality, isn't it?

> > - TYPES.py is already trying to provide them, but it's broken
> >   (indeed, it's more broken than that. h2py should use cpp to
> >   preprocess the files, but that's something for another occasion).
> 
> That cannot work: the preprocessor will eat the macro definition, and
> you have no way to find out what its body was.

That's what I meant:

[niemeyer@ibook dist]$ cpp -dM /usr/include/sys/types.h 
#define __LITTLE_ENDIAN 1234 
#define BYTE_ORDER __BYTE_ORDER 
#define powerpc 1 
#define __linux__ 1 
#define LITTLE_ENDIAN __LITTLE_ENDIAN 
#define FD_SET(fd, fdsetp) __FD_SET (fd, fdsetp) 
[...]

> > A patch providing these functions is available at
> > http://www.python.org/sf/569139
> 
> I wonder whether the additional TRY_COMPILE test is really
> necessary. Isn't it sufficient to restrict attention to systems on
> which major and minor are macros, and use
> 
> #ifdef major
> 
> inside posixmodule.c?

I'm not sure if these functions are macros on every system. If that's
true, and makedev is always available along with major, you're completely
right. My purpose in including a TRY_COMPILE test was to take a safe
approach, based on a review of the way autoconf checks whether makedev
is available.

Martin, thanks for your review!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From guido@python.org  Sun Jun 16 02:23:44 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 15 Jun 2002 21:23:44 -0400
Subject: [Python-Dev] SF bug count
In-Reply-To: Your message of "Sat, 15 Jun 2002 15:00:06 EDT."
 <3D0B5676.8419.94F0937E@localhost>
References: <3D0B5676.8419.94F0937E@localhost>
Message-ID: <200206160123.g5G1NiJ23638@pcp02138704pcs.reston01.va.comcast.net>

>  SF's "tracker" page 
>     http://sourceforge.net/tracker/?group_id=5470
>  says there are a total of 2581 bugs.
> 
> Using a url template of (broken so as to be
> readable):
>     bugurlfmt = http://sourceforge.net/tracker/index.php
>       ?group_id=5470
>       &atid=105470
>       &set=custom
>       &_assigned_to=100
>       &_status=100
>       &_category=100
>       &_group=100
>       &order=artifact_id
>       &sort=ASC&offset=%d
> 
> to get 51 at a time, I get only 636.
> 
> Whose bug?

I assume yours -- I tried manually clicking on the "Next 50" link and
got bored after 20 clicks or over 1000 bugs.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip@pobox.com  Sun Jun 16 02:48:28 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 15 Jun 2002 20:48:28 -0500
Subject: [Python-Dev] addressing distutils inability to track file  dependencies
In-Reply-To: <m3k7p0mf59.fsf@mira.informatik.hu-berlin.de>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <3D08DDD7.BD8573D8@prescod.net>
 <15624.64540.472905.469106@12-248-41-177.client.attbi.com>
 <3D093438.92B46349@prescod.net>
 <15625.16072.900596.114938@12-248-41-177.client.attbi.com>
 <m3wut272t7.fsf@mira.informatik.hu-berlin.de>
 <20020615173114.GA8981@gerg.ca>
 <m3k7p0mf59.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15627.61036.77500.635415@12-248-41-177.client.attbi.com>

    Martin> Certainly true. However, the makefiles that Skip wanted to
    Martin> generate would not have offered anything beyond "foo.o depends
    Martin> on foo.c". He then recognized that dependencies are essential,
    Martin> here, and suggested makedepend...

Please don't put words into my mouth (or thoughts into my brain).  I have
used make+makedepend for a long time and tend to think of them as
inseparable.  I was certainly thinking in terms of .o:.h dependencies.  That
is, after all, what my example demonstrated was missing.

Skip



From skip@pobox.com  Sun Jun 16 05:53:57 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 15 Jun 2002 23:53:57 -0500
Subject: [Python-Dev] unicode() and its error argument
In-Reply-To: <200206151503.g5FF3rJ16446@pcp02138704pcs.reston01.va.comcast.net>
References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>
 <200206151503.g5FF3rJ16446@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15628.6629.292906.585819@12-248-41-177.client.attbi.com>

    >> This reminds me, it occurred to me the other day that a plain text
    >> version of cgitb would be useful to use for non-web scripts.  You'd
    >> get a lot more context about the environment in which the exception
    >> was raised.

    Guido> Not a bad idea.  I think it could live in the traceback module,
    Guido> possibly as a family of functions named "fancy_traceback" and
    Guido> similar.  Care to do a patch?

I just submitted a patch done differently than you suggested.  I simply
added a text() formatting routine to cgitb.py and an extra 'format' argument
to cgitb.enable().  Now, if you want plain text output, just call enable()
like so

    import cgitb
    cgitb.enable(format="text")

I think I muffed the HTML formatting (there was an odd little bit of logic
in there I believe I might have botched).  I'll take another look at that
and submit a revised patch if necessary and include a little doc update.

For the curious, the patch is at

    http://python.org/sf/569574

Guido expressed interest, so I assigned it to him. ;-)

Skip




From mal@lemburg.com  Sun Jun 16 11:48:49 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 16 Jun 2002 12:48:49 +0200
Subject: [Python-Dev] unicode() and its error argument
References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>
Message-ID: <3D0C6D11.5090200@lemburg.com>

Skip Montanaro wrote:
> The unicode() builtin accepts an optional third argument, errors, which
> defaults to "strict".  According to the docs if errors is set to "ignore",
> decoding errors are silently ignored.  I seem to still get the occasional
> UnicodeError exception, however.  I'm still trying to track down an actual
> example (it doesn't happen often, and I hadn't wrapped unicode() in a
> try/except statement, so all I saw was the error raised, not the input
> string value).

The error argument is passed on to the codec you request.
It's the codec that decides how to implement the error handling,
not the unicode() builtin, so if you're seeing errors with 'ignore'
then this is probably the result of some problem in the codec.
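
For example, with the ASCII codec (sketched from memory -- the exact
message and behaviour are up to the codec, as above):

    >>> unicode('abc\xffdef', 'ascii', 'strict')
    Traceback (most recent call last):
      ...
    UnicodeError: ASCII decoding error: ordinal not in range(128)
    >>> unicode('abc\xffdef', 'ascii', 'ignore')
    u'abcdef'
    >>> unicode('abc\xffdef', 'ascii', 'replace')
    u'abc\ufffddef'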

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From skip@pobox.com  Sun Jun 16 15:15:35 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 16 Jun 2002 09:15:35 -0500
Subject: [Python-Dev] unicode() and its error argument
In-Reply-To: <3D0C6D11.5090200@lemburg.com>
References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com>
 <3D0C6D11.5090200@lemburg.com>
Message-ID: <15628.40327.92855.344184@12-248-41-177.client.attbi.com>

    >> According to the docs if errors is set to "ignore", decoding errors
    >> are silently ignored.  I seem to still get the occasional
    >> UnicodeError exception, however.  

    mal> The error argument is passed on to the codec you request ... so if
    mal> you're seeing errors with 'ignore' then this is probably the result
    mal> of some problem in the codec.

Thanks.  I've been running my test program a lot over the past couple of
days.  I think I squelched a couple of bugs in my own code that may have been
causing the problem.  I thought it was because some non-string args were
sneaking into the call, but that appears not to be the case either (the error
message in that case is different from what I was seeing).  Tim's inability to
provoke errors was also suggestive that it was pilot error, not a problem
with the plane.

I'll keep my eye on things and let you know if anything else appears.

Skip

P.S.  Happy Father's Day all you dads out there.



From skip@mojam.com  Sun Jun 16 17:14:25 2002
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 16 Jun 2002 11:14:25 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200206161614.g5GGEPZ24272@12-248-41-177.client.attbi.com>

Bug/Patch Summary
-----------------

254 open / 2582 total bugs (-12)
128 open / 1554 total patches (-3)

New Bugs
--------

tarball to untar into a single dir (2002-06-11)
	http://python.org/sf/567576
whatsnew explains noargs incorrectly (2002-06-11)
	http://python.org/sf/567607
Various Playstation 2 Linux Test Errors (2002-06-12)
	http://python.org/sf/567892
urllib needs 303/307 handlers (2002-06-12)
	http://python.org/sf/568068
asynchat module undocumented (2002-06-12)
	http://python.org/sf/568134
Misleading string constant. (2002-06-12)
	http://python.org/sf/568269
socket module htonl/ntohl bug (2002-06-12)
	http://python.org/sf/568322
minor improvement to Grammar file (2002-06-13)
	http://python.org/sf/568412
__slots__ attribute and private variable (2002-06-14)
	http://python.org/sf/569257
cgi.py broken with xmlrpclib on Python 2 (2002-06-15)
	http://python.org/sf/569316
LINKCC incorrectly set (2002-06-16)
	http://python.org/sf/569668

New Patches
-----------

unicode in sys.path (2002-06-10)
	http://python.org/sf/566999
GetFInfo update (2002-06-11)
	http://python.org/sf/567296
A different patch for python-mode vs gdb (2002-06-11)
	http://python.org/sf/567468
Add param to email.Utils.decode()  (2002-06-12)
	http://python.org/sf/568348
Convert slice and buffer to types (2002-06-13)
	http://python.org/sf/568544
gettext module charset changes (2002-06-13)
	http://python.org/sf/568669
Implementation of major, minor and makedev (2002-06-14)
	http://python.org/sf/569139
names in types module (2002-06-15)
	http://python.org/sf/569328
plain text enhancement for cgitb (2002-06-15)
	http://python.org/sf/569574

Closed Bugs
-----------

tuple __getitem__ limited (2001-09-06)
	http://python.org/sf/459235
str, __getitem__ and slices (2001-10-23)
	http://python.org/sf/473985
string.{starts,ends}with vs slices (2001-12-16)
	http://python.org/sf/493951
HTMLParser fail to handle '&foobar' (2002-01-06)
	http://python.org/sf/500073
markupbase handling of HTML declarations (2002-01-19)
	http://python.org/sf/505747
Recursive class instance "error" (2002-03-20)
	http://python.org/sf/532646
fcntl module with wrong module for ioctl (2002-04-30)
	http://python.org/sf/550777
deepcopy can't handle custom metaclasses (2002-05-26)
	http://python.org/sf/560794
Assertion with very long lists (2002-05-29)
	http://python.org/sf/561858
test_signal.py fails on FreeBSD-4-stable (2002-05-29)
	http://python.org/sf/562188
xmlrpclib.Binary.data undocumented (2002-05-31)
	http://python.org/sf/562878
Clarify documentation for inspect (2002-06-01)
	http://python.org/sf/563273
os.tmpfile should use w+b, not w+ (2002-06-02)
	http://python.org/sf/563750
getttext defaults with unicode (2002-06-03)
	http://python.org/sf/563915
IDLE needs printing (2002-06-06)
	http://python.org/sf/565373
urllib FancyURLopener.__init__ / urlopen (2002-06-06)
	http://python.org/sf/565414
string.replace() can corrupt heap (2002-06-07)
	http://python.org/sf/565993
telnetlib makes Python dump core (2002-06-07)
	http://python.org/sf/566006
rotormodule's set_key calls strlen (2002-06-10)
	http://python.org/sf/566859
Typo in "What's new in Python 2.3" (2002-06-10)
	http://python.org/sf/566869

Closed Patches
--------------

experimental support for extended slicing on lists (2000-07-27)
	http://python.org/sf/400998
getopt with GNU style scanning (2001-10-21)
	http://python.org/sf/473512
AtheOS port of Python 2.2b2 (2001-12-02)
	http://python.org/sf/488073
Silence AIX C Compiler Warnings. (2002-03-21)
	http://python.org/sf/533070
Floating point issues in body of text (2002-04-26)
	http://python.org/sf/548943
Deprecate bsddb (2002-05-06)
	http://python.org/sf/553108
Prevent duplicates in readline history (2002-05-30)
	http://python.org/sf/562492
First patch: start describing types... (2002-05-30)
	http://python.org/sf/562529
posixmodule.c RedHat 6.1 (bug #535545) (2002-06-03)
	http://python.org/sf/563954
error in weakref.WeakKeyDictionary (2002-06-04)
	http://python.org/sf/564549
modulefinder and string methods (2002-06-05)
	http://python.org/sf/564840
fix bug in shutil.rmtree exception case (2002-06-09)
	http://python.org/sf/566517



From tim.one@comcast.net  Sun Jun 16 17:41:23 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 16 Jun 2002 12:41:23 -0400
Subject: [Python-Dev] unicode() and its error argument
In-Reply-To: <15628.40327.92855.344184@12-248-41-177.client.attbi.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHLPNAA.tim.one@comcast.net>

[Skip Montanaro]
> ...
> Tim's inability to provoke errors was also suggestive that it was pilot
> error, not a problem with the plane.

Ya, but what do I know about encodings?  "Nothing" is right -- that's why I
wrote a program to generate stuff at random.

Taking that another step, to generate the encoding at random too, turns up
at least one way to crash Python:  the attached program eventually crashes
when doing a utf7 decode.  It appears to be in this line:

            if ((ch == '-') || !B64CHAR(ch)) {

and ch "is big" when it blows up.  I assume this is because B64CHAR(ch)
expands in part to isalnum(ch), and on Windows the latter is done via array
lookup (and ch is out-of-bounds).

Other failures I've seen out of this are benign, like

>>> unicode('\xf1R\x7f^C\x1e\xd8', 'hex_codec', 'ignore')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\CODE\PYTHON\lib\encodings\hex_codec.py", line 41, in hex_decode
    assert errors == 'strict'
AssertionError
>>>



from random import choice, randint
from traceback import print_exc

bytes = [chr(i) for i in range(256)]
paste = ''.join

def generrors(encoding, errors, maxlen, maxtries):
    # Feed unicode() up to maxtries random byte strings of length <= maxlen;
    # report the first call that raises and stop.
    for dummy in xrange(maxtries):
        n = randint(1, maxlen)
        raw = paste([choice(bytes) for dummy in range(n)])
        try:
            u = unicode(raw, encoding, errors)
        except:
            print 'failure in unicode(%r, %r, %r)' % (raw, encoding, errors)
            print_exc(0)
            return 1
    return 0

from encodings.aliases import aliases
# Reduce the alias table to one entry per distinct codec name.
unique = aliases.values()
unique = dict(zip(unique, unique)).keys()

while unique:
    e = choice(unique)
    print
    print 'Trying', e
    if generrors(e, 'ignore', 10, 1000):
        unique.remove(e)




From mgilfix@eecs.tufts.edu  Sun Jun 16 18:24:45 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Sun, 16 Jun 2002 13:24:45 -0400
Subject: [Python-Dev] test_socket failures
In-Reply-To: <200206122059.g5CKxQa16372@odiug.zope.com>; from guido@python.org on Wed, Jun 12, 2002 at 04:59:26PM -0400
References: <200206122059.g5CKxQa16372@odiug.zope.com>
Message-ID: <20020616132445.C23809@eecs.tufts.edu>

  Ok. I just submitted a patch to SF: http://www.python.org/sf/569697
that fixes the race conditions in test_socket.py (and also documents
ThreadedTest). I've done a make test and it seems to pass the
regression test. So please test this out on Windows for me, Guido.

                            -- Mike

On Wed, Jun 12 @ 16:59, Guido van Rossum wrote:
> I know that there are problem with the two new socket tests:
> test_timeout and test_socket.  The problems are varied: the tests
> assume network access and a working and consistent DNS, they assume
> predictable timing, and there is a number of Windows-specific failures
> that I'm trying to track down.  Also, when the full test suite is run,
> test_socket.py may hang, while in isolation it will work.  (Gosh if
> only we had had these unit tests a few years ago.  They bring up all
> sorts of issues that are good to know about.)
> 
> I'll try to fix these ASAP.

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From aahz@pythoncraft.com  Mon Jun 17 00:45:55 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sun, 16 Jun 2002 19:45:55 -0400
Subject: [Python-Dev] PEP 8: Lists/tuples
Message-ID: <20020616234555.GA3415@panix.com>

I don't really want to open a can of worms here, but as I'm finishing up
my OSCON slides, I remembered a conversation earlier where Guido said
that tuples should be used for heterogeneous items (i.e. lightweight
structs) while lists should be used for homogeneous items.

Should this preference be enshrined in PEP 8?

(I'm -1 myself, but I'd like to know what to tell my class.)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From guido@python.org  Mon Jun 17 01:23:58 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 16 Jun 2002 20:23:58 -0400
Subject: [Python-Dev] PEP 8: Lists/tuples
In-Reply-To: Your message of "Sun, 16 Jun 2002 19:45:55 EDT."
 <20020616234555.GA3415@panix.com>
References: <20020616234555.GA3415@panix.com>
Message-ID: <200206170023.g5H0NxC00733@pcp02138704pcs.reston01.va.comcast.net>

> I don't really want to open a can of worms here, but as I'm finishing up
> my OSCON slides, I remembered a conversation earlier where Guido said
> that tuples should be used for heterogeneous items (i.e. lightweight
> structs) while lists should be used for homogeneous items.
> 
> Should this preference be enshrined in PEP 8?

Yes.

> (I'm -1 myself, but I'd like to know what to tell my class.)

Like it or not, that's what tuples are for. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Mon Jun 17 02:13:21 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 16 Jun 2002 21:13:21 -0400
Subject: [Python-Dev] behavior of inplace operations
In-Reply-To: <018d01c21208$1efadab0$6501a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEIMPNAA.tim.one@comcast.net>

[David Abrahams]
> ...
> The pathological/non-generic cases are the ones that make me think twice
> about using the inplace ops at all. They don't, in fact, "just work", so
> I have to think carefully about what's happening to avoid getting myself
> in trouble.

I didn't understand this thread.  The inplace ops in Python do "just work"
to my eyes, but I expect them to work the way Python defines them to work,
which is quite uniform.  For example,

    e1[e2] += e3

acts like

    t0, t1 = e1, e2
    t0[t1] = t0[t1] + e3

There's no guarantee that e1[e2] as a whole is evaluated at most once, and,
to the contrary, the subscription is performed twice, just like the "acts
like" line implies.  Likewise

    e1.id += e2

acts like

    t0 = e1
    t0.id = t0.id + e2

The way an augmented assignment in Python works is defined by cases, on the
form of the target.  Those were the "subscription" and "attributeref" forms
of target.  There are two other relevant forms of target, "identifier" and
"slicing", and they're wholly analogous.  Note an implication:  in a
"complicated" target, it's only the top-level subscription or attribute
lookup that gets evaluated twice; e.g.,

    e1[e2].e3[e4] += e5

acts like

    t0, t1 = e1[e2].e3, e4
    t0[t1] = t0[t1] + e5
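
You can watch both top-level evaluations happen with a toy container (my
own illustration, not part of any spec):

    class Noisy:
        def __init__(self):
            self.data = {'k': 1}
        def __getitem__(self, key):
            print 'getitem', key
            return self.data[key]
        def __setitem__(self, key, value):
            print 'setitem', key, value
            self.data[key] = value

    n = Noisy()
    n['k'] += 3        # prints "getitem k", then "setitem k 4"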

Note that Python doesn't have a reference-to-lvalue concept.  If you don't
believe "but it should, so I'm going to think as if it does", there's
nothing surprising about augmented assignment in Python.  Indeed, I'm not
even surprised by what this prints <wink>:

>>> a = range(12)
>>> a[2:9] += [666]
>>> a




From greg@cosc.canterbury.ac.nz  Mon Jun 17 02:46:58 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 17 Jun 2002 13:46:58 +1200 (NZST)
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <3D09A388.8080107@lemburg.com>
Message-ID: <200206170146.g5H1kwh27759@oma.cosc.canterbury.ac.nz>

> The question is whether we want distutils to be a development
> tool as well

I'd say yes, we do -- otherwise we have to maintain two
parallel systems for building stuff, which sucks for
what should be obvious reasons.

What's more -- on Windows, distutils is the only
way I know *how* to build extension modules! I once
tried doing it on my own and gave up in disgust.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+




From greg@cosc.canterbury.ac.nz  Mon Jun 17 02:57:51 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 17 Jun 2002 13:57:51 +1200 (NZST)
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206141205.g5EC5Wc31785@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200206170157.g5H1vp027803@oma.cosc.canterbury.ac.nz>

>   Extension('foo', ['foo1.c', 'foo2.c'], dependencies={'foo1.c':
> >   ['bar.h'], 'foo2.c': ['bar.h', 'bar2.h']})
>
> But this is wrong: it's not foo1.c that depends on bar.h, it's foo1.o.

It's not wrong if you read the dependency statement
as "anything which depends on foo1.c also depends
on bar.h" etc.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+




From David Abrahams" <david.abrahams@rcn.com  Mon Jun 17 12:44:16 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Mon, 17 Jun 2002 07:44:16 -0400
Subject: [Python-Dev] behavior of inplace operations
References: <LNBBLJKPBEHFEDALKOLCMEIMPNAA.tim.one@comcast.net>
Message-ID: <009a01c215f4$6481b1e0$6601a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>


> [David Abrahams]
> > ...
> > The pathological/non-generic cases are the ones that make me think twice
> > about using the inplace ops at all. They don't, in fact, "just work", so
> > I have to think carefully about what's happening to avoid getting myself
> > in trouble.
>
> I didn't understand this thread.  The inplace ops in Python do "just work"
> to my eyes, but I expect them to work the way Python defines them to work,
> which is quite uniform.  For example,
>
>     e1[e2] += e3
>
> acts like
>
>     t0, t1 = e1, e2
>     t0[t1] = t0[t1] + e3

But that's not even right, AFAICT. Instead, it's:

    t0, t1 = e1, e2
    t2 = t0[t1]
    t2 += e3        # possible rebinding operation
    t0[t1] = t2

> There's no guarantee that e1[e2] as a whole is evaluated at most once

Actually, that was exactly what I expected. What I didn't expect was that
there's a guarantee that it's evaluated twice, once as part of a getitem
and once as part of a setitem.

> The way an augmented assignment in Python works is defined by cases, on the
> form of the target.  Those were the "subscription" and "attributeref" forms
> of target.  There are two other relevant forms of target, "identifier" and
> "slicing", and they're wholly analogous.  Note an implication:  in a
> "complicated" target, it's only the top-level subscription or attribute
> lookup that gets evaluated twice; e.g.,
>
>     e1[e2].e3[e4] += e5
>
> acts like
>
>     t0, t1 = e1[e2].e3, e4
>     t0[t1] = t0[t1] + e5

I understood that part, but thanks for going to the trouble.

> Note that Python doesn't have a reference-to-lvalue concept.

Never expected it to.

> If you don't
> believe "but it should, so I'm going to think as if it does", there's
> nothing surprising about augmented assignment in Python.

I don't think it should have a reference-to-lvalue. Please, give me a tiny
bit of credit for being able to think Pythonically. I don't see everything
in terms of C++; I just expected Python not to do a potentially expensive
lookup and writeback in the cases where it could be avoided. Other people,
apparently, are also surprised by some of the cases that arise due to the
unconditional write-back operation.

> Indeed, I'm not
> even surprised by what this prints <wink>:
>
> >>> a = range(12)
> >>> a[2:9] += [666]
> >>> a

I guess I am, even if I believed your "as-if" description:

>>> a = range(12)
>>> t0,t1 = a,slice(2,9)
>>> t0[t1] = t0[t1] + [666]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: sequence index must be integer

can-we-stop-beating-this-horse-now-ly y'rs,
dave





From jeremy@zope.com  Mon Jun 17 13:40:30 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 17 Jun 2002 08:40:30 -0400
Subject: [Python-Dev] behavior of inplace operations
In-Reply-To: <009a01c215f4$6481b1e0$6601a8c0@boostconsulting.com>
References: <LNBBLJKPBEHFEDALKOLCMEIMPNAA.tim.one@comcast.net>
 <009a01c215f4$6481b1e0$6601a8c0@boostconsulting.com>
Message-ID: <15629.55486.743231.509983@slothrop.zope.com>

>>>>> "DA" == David Abrahams <david.abrahams@rcn.com> writes:

  DA> I guess I am, even if I believed your "as-if" description:

  >>> a = range(12) 
  >>> t0,t1 = a,slice(2,9) 
  >>> t0[t1] = t0[t1] + [666]
  DA> Traceback (most recent call last):
  DA>   File "<stdin>", line 1, in ?
  DA> TypeError: sequence index must be integer

There seem to be many ways to spell this, all quite similar.  And
different versions of Python have different things to say about it.
Current CVS says:

   >>> a = range(12)
   >>> t0,t1 = a,slice(2, 9)
   >>> t0[t1] = t0[t1] + [666]
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   ValueError: attempt to assign list of size 8 to extended slice of
   size 7

I suspect this is a bug, since I didn't ask for an extended slice.

   >>> a[2:9] = a[2:9] + [666]
   >>> a
   [0, 1, 2, 3, 4, 5, 6, 7, 8, 666, 9, 10, 11]

Jeremy




From guido@python.org  Mon Jun 17 13:59:46 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 08:59:46 -0400
Subject: [Python-Dev] behavior of inplace operations
In-Reply-To: Your message of "Mon, 17 Jun 2002 08:40:30 EDT."
 <15629.55486.743231.509983@slothrop.zope.com>
References: <LNBBLJKPBEHFEDALKOLCMEIMPNAA.tim.one@comcast.net> <009a01c215f4$6481b1e0$6601a8c0@boostconsulting.com>
 <15629.55486.743231.509983@slothrop.zope.com>
Message-ID: <200206171259.g5HCxkB08311@pcp02138704pcs.reston01.va.comcast.net>

> Current CVS says:
> 
>    >>> a = range(12)
>    >>> t0,t1 = a,slice(2, 9)
>    >>> t0[t1] = t0[t1] + [666]
>    Traceback (most recent call last):
>      File "<stdin>", line 1, in ?
>    ValueError: attempt to assign list of size 8 to extended slice of
>    size 7
> 
> I suspect this is a bug, since I didn't ask for an extended slice.
> 
>    >>> a[2:9] = a[2:9] + [666]
>    >>> a
>    [0, 1, 2, 3, 4, 5, 6, 7, 8, 666, 9, 10, 11]
> 
> Jeremy

Yeah, I think this is a bug introduced when MWH added support for
extended slices.  I hope he'll fix it so I won't have to. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mwh@python.net  Mon Jun 17 14:03:24 2002
From: mwh@python.net (Michael Hudson)
Date: Mon, 17 Jun 2002 14:03:24 +0100
Subject: [Python-Dev] (no subject)
Message-ID: <9B37BC74-81F2-11D6-9BA6-0003931DF95C@python.net>

My starship mail currently seems to be broken in and out, so this is my 
first mail sent with Mac OS X's Mail.app.  I hope it comes out plain 
text...

> Current CVS says:
>
>    >>> a = range(12)
>    >>> t0,t1 = a,slice(2, 9)
>    >>> t0[t1] = t0[t1] + [666]
>    Traceback (most recent call last):
>      File "<stdin>", line 1, in ?
>    ValueError: attempt to assign list of size 8 to extended slice of
>    size 7
>
> I suspect this is a bug, since I didn't ask for an extended slice.

Yes you did :)

If you'd tried that with any released Python, you'd have got a 
TypeError.

The trouble is, there's no way to distinguish between

l1[a:b:]
l1[slice(a,b)]

I deliberately made the former the same as l1[a:b:1] (and so subject to the
length restriction on extended-slice assignment) to reduce special-casing
(both for the user and for me).  Do you think I got that wrong?

Cheers,
M.




From guido@python.org  Mon Jun 17 14:39:59 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 09:39:59 -0400
Subject: [Python-Dev] (no subject)
In-Reply-To: Your message of "Mon, 17 Jun 2002 14:03:24 BST."
 <9B37BC74-81F2-11D6-9BA6-0003931DF95C@python.net>
References: <9B37BC74-81F2-11D6-9BA6-0003931DF95C@python.net>
Message-ID: <200206171339.g5HDdxN08737@pcp02138704pcs.reston01.va.comcast.net>

> The trouble is, there's no way to distinguish between
> 
> l1[a:b:]
> l1[slice(a,b)]
> 
> I deliberately made the former be the same as l1[a:b:1] (and so have the 
> restriction on the length of slice) to reduce special-casing (both for 
> the user and me).  Do you think I got that wrong?

Yes I think you got that wrong.  __getslice__ and __setslice__ are
being deprecated (or at least discouraged), so you'll have objects
implementing only __getitem__.  Such objects will get a slice object
passed to __getitem__ even for simple (one-colon) slices.  If such an
object wants to pass the slice on to a list object underlying the
implementation, it should be allowed to.

IOW slice(a, b, None) should be considered equivalent to L[a:b] in all
situations.
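
For instance, a sketch of the kind of delegation I mean (assuming the
underlying list accepts slice objects, which is exactly what's at stake
here):

    class ListView:
        def __init__(self, data):
            self.data = data
        def __getitem__(self, item):
            # item is an int for v[i] and a slice object for v[a:b]
            return self.data[item]

    v = ListView(range(10))
    v[2:5]    # __getitem__ receives slice(2, 5, None) and passes it on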

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mwh@python.net  Mon Jun 17 10:54:51 2002
From: mwh@python.net (Michael Hudson)
Date: 17 Jun 2002 10:54:51 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules unicodedata.c,2.16,2.17
References: <Pine.OS2.4.32.0206142201060.69-100000@tenring.andymac.org> <02e101c213e3$4785cc10$ced241d5@hagrid> <200206142048.g5EKmwC14089@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2m4rg21b84.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> > in my experience, simple names without underscores always conflicts
> > with something on platforms that I don't have access to, while simple
> > names with leading underscores never causes any problems...
> 
> Unfortunately, that's exactly the opposite of what the C
> standardization committee wants you to do.

Huh?  I thought underscore-lowercase character was fine, and
double-underscore and underscore-capital were the verboten
combinations.

Cheers,
M.

-- 
  SPIDER:  'Scuse me. [scuttles off]
  ZAPHOD:  One huge spider.
    FORD:  Polite though.
                   -- The Hitch-Hikers Guide to the Galaxy, Episode 11



From mwh@python.net  Mon Jun 17 10:54:58 2002
From: mwh@python.net (Michael Hudson)
Date: 17 Jun 2002 10:54:58 +0100
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <E17Imfl-00046Q-00@mail.python.org> <005a01c21388$387cd3e0$0900a8c0@spiff> <m3r8j9vkua.fsf@mira.informatik.hu-berlin.de>
Message-ID: <2mznxuz0ul.fsf@starship.python.net>

martin@v.loewis.de (Martin v. Loewis) writes:

> > (distutils is also a pain to use with a version management system
> > that marks files in the repository as read-only; distutils copy function
> > happily copies all the status bits. but the remove function refuses to
> > remove files that are read-only, even if the files have been created
> > by distutils itself...)
> 
> That's a bug, IMO.

And hang on, wasn't it fixed by revision 1.12 of
Lib/distutils/file_util.py?  If not, more details would be appreciated.

Cheers,
M.

-- 
  Now this is what I don't get.  Nobody said absolutely anything
  bad about anything.  Yet it is always possible to just pull
  random flames out of ones ass.
         -- http://www.advogato.org/person/vicious/diary.html?start=60



From jeremy@zope.com  Mon Jun 17 15:30:24 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 17 Jun 2002 10:30:24 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <m3wut1eb9q.fsf@mira.informatik.hu-berlin.de>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
 <m3wut1eb9q.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15629.62080.478763.190468@slothrop.zope.com>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

  MvL> GCC also supports SUNPRO_DEPENDENCIES, so this is the most
  MvL> effective and portable way to get dependency file generation.

Here's a rough strategy for exploiting this feature in distutils.
Does it make sense?  Happily, I can't see any possible use of make.

There is an option to enable dependency tracking.  Not sure how the
option is passed: command line (tedious), setup (not easily customized
by user), does distutils have a user options file of some sort?

Each time distutils compiles a file it passes the -MD flag to generate
a .d file.

On subsequent compilations, it checks for the .d file.  If the .d file
does not exist or is older than the .c file, it recompiles.
Otherwise, it parses the .d file and compares the times for each of
the dependencies.

This doesn't involve make because the only thing make would do for us
is check the dependencies and invoke the compiler.  distutils already
knows how to do both those things.
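
Roughly (a sketch only, untested; the helper name is made up):

    import os

    def needs_rebuild(c_file, d_file, o_file):
        # No dependency info, or info older than the source: recompile.
        if not os.path.exists(d_file):
            return 1
        if os.path.getmtime(d_file) < os.path.getmtime(c_file):
            return 1
        if not os.path.exists(o_file):
            return 1
        # gcc -MD output looks like "foo.o: foo.c bar.h \" possibly
        # continued over several lines; flatten it and keep the names
        # after the colon.
        text = open(d_file).read().replace('\\\n', ' ')
        deps = text.split(':', 1)[1].split()
        o_time = os.path.getmtime(o_file)
        for dep in deps:
            if os.path.getmtime(dep) > o_time:
                return 1
        return 0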

Jeremy




From guido@python.org  Mon Jun 17 15:47:38 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 10:47:38 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Mon, 17 Jun 2002 10:30:24 EDT."
 <15629.62080.478763.190468@slothrop.zope.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <m3wut1eb9q.fsf@mira.informatik.h!
 u-berlin.de>
 <15629.62080.478763.190468@slothrop.zope.com>
Message-ID: <200206171447.g5HElcw09279@pcp02138704pcs.reston01.va.comcast.net>

> Here's a rough strategy for exploiting this feature in distutils.
> Does it make sense?  Happily, I can't see any possible use of make.
> 
> There is an option to enable dependency tracking.  Not sure how the
> option is passed: command line (tedious), setup (not easily customized
> by user), does distutils have a user options file of some sort?

We could make the configure script check for GCC, and if detected, add
-MD to the compiler flags.

> Each time distutils compiles a file it passes the -MD flag to generate
> a .d file.
> 
> On subsequent compilations, it checks for the .d file.  If the .d file
> does not exist or is older than the .c file, it recompiles.
> Otherwise, it parses the .d file and compares the times for each of
> the dependencies.

Sounds good.  It could skip parsing the .d file if the .o file
doesn't exist or is older than the .c file.  If there is no .d file, I
would suggest only recompiling if the .c file is newer than the .o
file (otherwise systems without GCC will see recompilation of
everything all the time -- not a good idea IMO.)

Go ahead and implement this!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jun 17 15:49:23 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 10:49:23 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules unicodedata.c,2.16,2.17
In-Reply-To: Your message of "17 Jun 2002 10:54:51 BST."
 <2m4rg21b84.fsf@starship.python.net>
References: <Pine.OS2.4.32.0206142201060.69-100000@tenring.andymac.org> <02e101c213e3$4785cc10$ced241d5@hagrid> <200206142048.g5EKmwC14089@pcp02138704pcs.reston01.va.comcast.net>
 <2m4rg21b84.fsf@starship.python.net>
Message-ID: <200206171449.g5HEnNe09298@pcp02138704pcs.reston01.va.comcast.net>

> Huh?  I thought underscore-lowercase character was fine, and
> double-underscore and underscore-capital were the verboten
> combinations.

underscore-lowercase with external linkage is also reserved for the C
implementation.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Mon Jun 17 15:53:13 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 17 Jun 2002 10:53:13 -0400
Subject: [Python-Dev] 'new' and 'types'
In-Reply-To: <20020617134905.GA1003@gerg.ca>
References: <20020614191558.A31580@hishome.net> <20020617134905.GA1003@gerg.ca>
Message-ID: <20020617145313.GA6960@hishome.net>

On Mon, Jun 17, 2002 at 09:49:05AM -0400, Greg Ward wrote:
> On 14 June 2002, Oren Tirosh said:
> > Patch 568629 removes the built-in module new (with sincere apologies to 
> > Tommy Burnette ;-) and replaces it with a tiny Python module consisting of a 
> > single import statement:
> [...]
> > Now, what about the types module?  It has been suggested that this module
> > should be deprecated.  I think it still has some use: we need a place to put
> > all the types that are not used often enough to be added to the builtins.
> > I suggest that they be placed in the module 'types' with names matching their
> > __name__ attribute.  The types module will still have the long MixedCaseType 
> > names for backward compatibility.  The use of the long names should be 
> > deprecated, not the types module itself.
> 
> Two great ideas.  I think you should write up a *short* PEP to keep them
> alive -- this feels like one of those small, not-too-controversial
> changes that will slip between the cracks unless/until someone with
> checkin privs takes up the cause.  Don't let that discourage you!

The first is very much alive - my patch for the new module has already been 
checked in by Guido.  No need for a PEP there because it's a transparent 
change.

For the types module I have a pending patch but I guess it won't sneak into
the CVS without a PEP. It requires general agreement that it's a bad thing 
for types to have two names, that the long names should be deprecated and 
that the right place for short names that are not builtins is in the types 
module.

It's PEP time...

	Oren



From guido@python.org  Mon Jun 17 16:01:38 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 11:01:38 -0400
Subject: [Python-Dev] 'new' and 'types'
In-Reply-To: Your message of "Mon, 17 Jun 2002 10:53:13 EDT."
 <20020617145313.GA6960@hishome.net>
References: <20020614191558.A31580@hishome.net> <20020617134905.GA1003@gerg.ca>
 <20020617145313.GA6960@hishome.net>
Message-ID: <200206171501.g5HF1cm09450@pcp02138704pcs.reston01.va.comcast.net>

> For the types module I have a pending patch but I guess it won't
> sneak into the CVS without a PEP. It requires general agreement that
> it's a bad thing for types to have two names, that the long names
> should be deprecated and that the right place for short names that
> are not builtins is in the types module.

FWIW, *I* agree with that.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip@pobox.com  Mon Jun 17 16:18:58 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 17 Jun 2002 10:18:58 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15629.62080.478763.190468@slothrop.zope.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
 <m3wut1eb9q.fsf@mira.informatik.hu-berlin.de>
 <15629.62080.478763.190468@slothrop.zope.com>
Message-ID: <15629.64994.385213.97041@12-248-41-177.client.attbi.com>

    Jeremy> Here's a rough strategy for exploiting this feature in
    Jeremy> distutils.  Does it make sense?  Happily, I can't see any
    Jeremy> possible use of make.

I still don't quite understand what everyone's aversion to make is (yes, I
realize it's not platform-independent, but then neither are C compilers or
linkers and we manage to live with that), but I will let that slide.

Instead, I see a potentially different approach.  Write an scons build file
(typically named SConstruct) and deliver that in the Modules directory.
Most people can safely ignore it.  The relatively few people (mostly on this
list) who care about such things can simply install SCons (it's quite small)
and run it to build the stuff in the Modules directory.

The benefits as I see them are

    * SCons implements portable automatic dependency analysis already

    * Dependencies are based upon file checksums instead of timestamps
      (worthwhile in highly networked development environments)

    * Clearer separation between build/install and edit/compile/test types
      of tasks.

I was able to create a simple SConstruct file over the weekend that builds
many of the extension modules.  I stalled a bit on library/include file
discovery, but hopefully that barrier will be passed soon.

I realize in the short-term there are also several disadvantages to this
idea:

    * There will initially be a lot of overlap between setup.py and SCons.

    * SCons doesn't yet implement a VPATH-like capability so the source and
      build directories can't easily be separated.  One is in the works
      though, planned for initial release in 0.09.  The current version is
      0.07.

Skip



From guido@python.org  Mon Jun 17 16:30:09 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 11:30:09 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Mon, 17 Jun 2002 10:18:58 CDT."
 <15629.64994.385213.97041@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <m3wut1eb9q.fsf@mira.informatik.h!
 u-berlin.de> <15629.62080.478763.190468@slothrop.zope.com>
 <15629.64994.385213.97041@12-248-41-177.client.attbi.com>
Message-ID: <200206171530.g5HFU9X09701@pcp02138704pcs.reston01.va.comcast.net>

[Proposal to use SCons]

Let's not tie ourselves to SCons before it's a lot more mature.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Mon Jun 17 16:29:29 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 17 Jun 2002 11:29:29 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
 <m3wut1eb9q.fsf@mira.informatik.h!
 u-berlin.de>
 <15629.62080.478763.190468@slothrop.zope.com>
 <15629.64994.385213.97041@12-248-41-177.client.attbi.com>
 <200206171530.g5HFU9X09701@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15630.89.990123.822136@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> [Proposal to use SCons]

    GvR> Let's not tie ourselves to SCons before it's a lot more
    GvR> mature.

On the other hand, eating our own dogfood is a great incentive for
quickly making it taste better. :)

-Barry



From jeremy@zope.com  Mon Jun 17 16:36:38 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 17 Jun 2002 11:36:38 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15629.64994.385213.97041@12-248-41-177.client.attbi.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
 <m3wut1eb9q.fsf@mira.informatik.hu-berlin.de>
 <15629.62080.478763.190468@slothrop.zope.com>
 <15629.64994.385213.97041@12-248-41-177.client.attbi.com>
Message-ID: <15630.518.652328.859842@slothrop.zope.com>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

  SM> I still don't quite understand what everyone's aversion to make
  SM> is (yes, I realize it's not platform-independent, but then
  SM> neither are C compilers or linkers and we manage to live with
  SM> that), but I will let that slide.

You didn't let it slide.  You brought it up again.  Many people have
offered many reasons not to use make.  You haven't offered any
rebuttal to their arguments, which comes across as rather cavalier.

  SM> Instead, I see a potentially different approach.  Write an scons
  SM> build file (typically named SConstruct) and deliver that in the
  SM> Modules directory.

I don't care much about the Modules directory actually.  I want this
for third-party extensions that use distutils for distribution,
particularly for my own third-party extensions :-).  

It sounds like you're proposing to drop distutils in favor of SCons,
but not saying so explicitly.  Is that right?  If so, we'd need a
stronger case for dumping distutils than automatic dependency tracking.
If that isn't right, I don't understand how SCons and distutils meet
in the middle.  Would extension writers need to learn distutils and
SCons?

It seems like the primary benefit of SCons is that it does the
dependency analysis for us, while only gcc and MSVC seem to offer
something similar as a compiler builtin.  Since those two compilers
cover all the platforms I ever use, it isn't something that interests
me a lot.

  SM> The benefits as I see them are

  SM> * SCons implements portable automatic dependency analysis
  SM>       already

That's good.

  SM> * Dependencies are based upon file checksums instead of
  SM>       timestamps (worthwhile in highly networked development
  SM>       environments)

That's good, too, although we could do the same for distutils.  Not
too much work, but not my first priority.

  SM> * Clearer separation between build/install and edit/compile/test
  SM>       types of tasks.

I don't know what you mean.  I use the current Python make file for
both tasks, and haven't had much problem.

  SM> I was able to create a simple SConstruct file over the weekend
  SM> that builds many of the extension modules.  I stalled a bit on
  SM> library/include file discovery, but hopefully that barrier will
  SM> be passed soon.

That's cool.

  SM> I realize in the short-term there are also several disadvantages
  SM> to this idea:

  SM> * There will initially be a lot of overlap between setup.py and
  SM>       SCons.

Won't there be a lot of overlap for all time unless Python adopts
SCons as the one true way to build extension modules?  It's not like
setup.py is going to be replaced. 

  SM> * SCons doesn't yet implement a VPATH-like capability so the
  SM>       source and build directories can't easily be separated.
  SM>       One is in the works though, planned for initial release in
  SM>       0.09.  The current version is 0.07.

Absolute requirement for me :-(.  I've got three CVS checkouts of
Python and probably 10 total build directories that I use on a regular
basis -- normal builds, debug builds, profiled builds, etc.

Jeremy




From mwh@python.net  Mon Jun 17 15:26:22 2002
From: mwh@python.net (Michael Hudson)
Date: 17 Jun 2002 15:26:22 +0100
Subject: [Python-Dev] extended slicing again
In-Reply-To: Guido van Rossum's message of "Mon, 17 Jun 2002 09:39:59 -0400"
References: <9B37BC74-81F2-11D6-9BA6-0003931DF95C@python.net> <200206171339.g5HDdxN08737@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2my9de7zht.fsf_-_@starship.python.net>

Email back.

Guido van Rossum <guido@python.org> writes:

> > The trouble is, there's no way to distinguish between
> > 
> > l1[a:b:]
> > l1[slice(a,b)]
> > 
> > I deliberately made the former be the same as l1[a:b:1] (and so have the 
> > restriction on the length of slice) to reduce special-casing (both for 
> > the user and me).  Do you think I got that wrong?
> 
> Yes I think you got that wrong.  __getslice__ and __setslice__ are
> being deprecated (or at least discouraged), so you'll have objects
> implementing only __getitem__.  Such objects will get a slice object
> passed to __getitem__ even for simple (one-colon) slices.  If such an
> object wants to pass the slice on to a list object underlying the
> implementation, it should be allowed to.
> 
> IOW slice(a, b, None) should be considered equivalent to L[a:b] in all
> situations.

OK.  I'll do this soon.  It's not as bad as I thought at first -- only
mutable sequences are affected, so it's only lists and arrays that
need to be tweaked.

Cheers,
M.

-- 
  Our lecture theatre has just crashed. It will currently only
  silently display an unexplained line-drawing of a large dog
  accompanied by spookily flickering lights.
     -- Dan Sheppard, ucam.chat (from Owen Dunn's summary of the year)



From tim.one@comcast.net  Mon Jun 17 17:28:26 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 17 Jun 2002 12:28:26 -0400
Subject: [Python-Dev] behavior of inplace operations
In-Reply-To: <009a01c215f4$6481b1e0$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELCPNAA.tim.one@comcast.net>

[Tim sez]
> The inplace ops in Python do "just work" to my eyes, but I expect them
> to work the way Python defines them to work, which is quite uniform.
> For example,
>
>     e1[e2] += e3
>
> acts like
>
>     t0, t1 = e1, e2
>     t0[t1] = t0[t1] + e3

[David Abrahams]
> But that's not even right, AFAICT. Instead, its:
>
>     t0, t1 = e1, e2
>     t2 = t0[t1]
>     t2 += e3        # possible rebinding operation
>     t0[t1] = t2

That's closer, although the "mystery" in the 3rd line is less mysterious if
the whole shebang is rewritten

    t0, t1 = e1, e2
    t0[t1] = t0[t1].__iadd__(e3)

That makes it clearer that the effect of the final binding is determined by
what the __iadd__ implementation chooses to return.
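
A toy illustration of that choice (mine, not part of the language spec):

    class Accumulator:
        def __init__(self):
            self.total = 0
        def __iadd__(self, other):
            self.total += other
            return self                   # mutate in place, return same object

    class Frozen:
        def __init__(self, total):
            self.total = total
        def __iadd__(self, other):
            return Frozen(self.total + other)   # return a new object instead

Either way, whatever __iadd__ returns is what ends up stored back into
t0[t1].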

> ...
> Actually, that was exactly what I expected. What I didn't expect was that
> there's a guarantee that it's evaluated twice, once as part of a getitem
> and once as part of a setitem.

There is.

> ...
> I don't think it should have a reference-to-lvalue. Please, give me a tiny
> bit of credit for being able to think Pythonically. I don't see everything
> in terms of C++; I just expected Python not to do a potentially expensive
> lookup and writeback in the cases where it could be avoided. Other people,
> apparently, are also surprised by some of the cases that arise due to the
> unconditional write-back operation.

Getting single-evaluation requires picturing reference-to-lvalue, or magical
writeback proxies, or something else equally convoluted:  they're unPythonic
simply because Python doesn't have stuff like that.  A protocol was invented
for supporting both in-place and replacement forms of augmented assignments
uniformly, but it's a Pythonically simple protocol in that it can be
expressed *in* Python with no new concepts beyond that methods like __iadd__
exist.  I don't dispute that it surprises some people some of the time, but
I submit that any other protocol would surprise some people too.  Heck, even
before augmented assignments were introduced, it surprised some people that

    list = list + [list2]

*didn't* extend list inplace.  Overall, "other people" are nuts <0.9 wink>.




From python@rcn.com  Mon Jun 17 18:17:11 2002
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 17 Jun 2002 13:17:11 -0400
Subject: [Python-Dev] PEP 290 - Code Modernization and Migration
Message-ID: <002a01c21622$d2ede3a0$52b53bd0@othello>

The migration guide has been codified in a new informational PEP at
http://www.python.org/peps/pep-0290.html.  Developers with CVS access can
add their contributions or improvements directly to the PEP.  Over time,
it is expected to grow and serve as a repository of collective wisdom
regarding version upgrades.






From jeremy@zope.com  Mon Jun 17 20:22:33 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 17 Jun 2002 15:22:33 -0400
Subject: [Python-Dev] large file support
Message-ID: <15630.14073.764208.284613@slothrop.zope.com>

I've run into a problem with large files using Python 2.1.2 and a
Linux 2.4.9 box.  We've got a large file -- almost 6GB -- that Python
chokes on even though regular shell tools seem to be fine.

In particular, os.stat() of the file fails with EOVERFLOW and open()
of the file fails with EFBIG.  The stat() failure is really bad
because it means os.path.exists() returns false.

strace tells me that other tools open the file passing O_LARGEFILE,
but Python does not.  (They pass it even for small files.) I can't
find any succinct explanation of O_LARGEFILE, but Google turns up all
sorts of pages that mention it.  It looks like the right way to open
large files, but it only seems to be defined in <asm/fcntl.h> on the
Linux box in question.

I haven't had any luck searching for a decent way to invoke stat() and
have it be prepared for a very large file.

I think Python is definitely broken here.  Can anyone offer any clues
or pointers to documentation?  Better yet, a fix.  I'm happy to help
integrate and test it.
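
In the meantime, here is roughly how one might probe a given interpreter
for large file support at run time (a sketch only, untested; the helper
and the 2GB threshold are my own):

    import os

    def has_largefile_support(probe='largefile-probe.tmp'):
        # Assumption: seeking past 2**31 and writing a byte only works
        # when the build uses a 64-bit off_t; the probe file stays
        # sparse on most filesystems.
        f = open(probe, 'wb')
        try:
            try:
                f.seek(2147483649L)
                f.write('x')
                f.flush()
                return os.path.getsize(probe) > 2147483648L
            except (IOError, OverflowError):
                return 0
        finally:
            f.close()
            os.unlink(probe)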

Jeremy




From David Abrahams" <david.abrahams@rcn.com  Mon Jun 17 20:15:28 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Mon, 17 Jun 2002 15:15:28 -0400
Subject: [Python-Dev] behavior of inplace operations
References: <LNBBLJKPBEHFEDALKOLCKELCPNAA.tim.one@comcast.net>
Message-ID: <017c01c21633$d19dad30$6601a8c0@boostconsulting.com>

----- Original Message -----
From: "Tim Peters" <tim.one@comcast.net>

> Getting single-evaluation requires picturing reference-to-lvalue, or magical
> writeback proxies, or something else equally convoluted:

No, sorry, it doesn't. All it requires is a straightforward interpretation
of the name given to these operations. You guys called them "inplace". When
I heard that I thought "oh, it just modifies the value in-place. Wait, what
about immutable objects? I guess it must rebind the thing on the LHS to the
new value." Call that convoluted if you want, but at least a few people
seem to approach it that way.

> they're unPythonic simply because Python doesn't have stuff like that.

Just stop right there and sign it "tautology-of-the-week-ly y'rs", OK?

> A protocol was invented
> for supporting both in-place and replacement forms of augmented assignments
> uniformly, but it's a Pythonically simple protocol in that it can be
> expressed *in* Python with no new concepts beyond that methods like __iadd__
> exist.

Another simple protocol can also be expressed in Python, but since it
involves an "if" statement it might not be considered "Pythonically
simple". But, y'know, I don't care about this issue that much -- I just
don't like leaving the implication that I'm thinking convolutedly
uncontested. IOW, yer pressing my buttons, Tim! I'm happy to drop the
technical issue unless you want to keep trolling me.

> I don't dispute that it surprises some people some of the time, but
> I submit that any other protocol would surprise some people too.

True. I just think that the trade-offs of the chosen protocol are going to
surprise more people in simpler situations than an alternative would have.
It seems as though clean operation with mutable tuple elements (among other
things) was sacrificed for the sake of transparency with persistent
containers. I would've made a different trade-off, but then I'm not BDFL
around here.

like-a-moth-to-a-flame-ly y'rs,
dave




From guido@python.org  Mon Jun 17 20:31:36 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 15:31:36 -0400
Subject: [Python-Dev] large file support
In-Reply-To: Your message of "Mon, 17 Jun 2002 15:22:33 EDT."
 <15630.14073.764208.284613@slothrop.zope.com>
References: <15630.14073.764208.284613@slothrop.zope.com>
Message-ID: <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>

> I've run into a problem with large files using Python 2.1.2 and a
> Linux 2.4.9 box.  We've got a large file -- almost 6GB -- that Python
> chokes on even though regular shell tools seem to be fine.

Was this Python configured for large file support?  I think you have
to turn that on somehow, and then everything is automatic.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas@python.ca  Mon Jun 17 20:45:17 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 17 Jun 2002 12:45:17 -0700
Subject: [Python-Dev] large file support
In-Reply-To: <15630.14073.764208.284613@slothrop.zope.com>; from jeremy@zope.com on Mon, Jun 17, 2002 at 03:22:33PM -0400
References: <15630.14073.764208.284613@slothrop.zope.com>
Message-ID: <20020617124517.A13106@glacier.arctrix.com>

Jeremy Hylton wrote:
>  can't find any succient explanation of O_LARGEFILE, but Google turns
>  up all sorts of pages that mention it.  It looks like the right way
>  to open large files, but it only seems to be defined in <asm/fcntl.h>
>  on the Linux box in question.

Perhaps it is set by libc if the application is compiled with large file
support.

  Neil



From Andreas Jung <andreas@andreas-jung.com>  Mon Jun 17 20:44:54 2002
From: Andreas Jung <andreas@andreas-jung.com> (Andreas Jung)
Date: Mon, 17 Jun 2002 15:44:54 -0400
Subject: [Python-Dev] large file support
In-Reply-To: <15630.14073.764208.284613@slothrop.zope.com>
References: <15630.14073.764208.284613@slothrop.zope.com>
Message-ID: <104250714.1024328694@[10.10.1.2]>

I remember that we had trouble several times compiling
Python 2.1 under Red Hat with LF support. Also, the way described in the docs
did not work in all cases and we had to tweak the sources a bit
(I think it was posixmodule.c).

-aj

--On Monday, June 17, 2002 15:22 -0400 Jeremy Hylton <jeremy@zope.com> 
wrote:

> I've run into a problem with large files using Python 2.1.2 and a
> Linux 2.4.9 box.  We've got a large file -- almost 6GB -- that Python
> chokes on even though regular shell tools seem to be fine.
>
> In particular, os.stat() of the file fails with EOVERFLOW and open()
> of the file fails with EFBIG.  The stat() failure is really bad
> because it means os.path.exists() returns false.
>
> strace tells me that other tools open the file passing O_LARGEFILE,
> but Python does not.  (They pass it even for small files.) I can't
> find any succinct explanation of O_LARGEFILE, but Google turns up all
> sorts of pages that mention it.  It looks like the right way to open
> large files, but it only seems to be defined in <asm/fcntl.h> on the
> Linux box in question.
>
> I haven't had any luck searching for a decent way to invoke stat() and
> have it be prepared for a very large file.
>
> I think Python is definitely broken here.  Can anyone offer any clues
> or pointers to documentation?  Better yet, a fix.  I'm happy to help
> integrate and test it.
>
> Jeremy
>
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev





    ---------------------------------------------------------------------
   -    Andreas Jung                     http://www.andreas-jung.com   -
  -   EMail: andreas at andreas-jung.com                              -
   -            "Life is too short to (re)write parsers"               -
    ---------------------------------------------------------------------




From jeremy@zope.com  Mon Jun 17 20:37:05 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 17 Jun 2002 15:37:05 -0400
Subject: [Python-Dev] large file support
In-Reply-To: <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>
References: <15630.14073.764208.284613@slothrop.zope.com>
 <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15630.14945.184194.434432@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  >> I've run into a problem with large files using Python 2.1.2 and a
  >> Linux 2.4.9 box.  We've got a large file -- almost 6GB -- that
  >> Python chokes on even though regular shell tools seem to be fine.

  GvR> Was this Python configured for large file support?  I think you
  GvR> have to turn that on somehow, and then everything is automatic.

Indeed, I think my message ought to be mostly disregarded :-).  I was
told that Python had been built with large file support, but didn't
test it myself.

However, I'm still unhappy with one thing related to large file
support.  If you've got a Python that doesn't have large file support
and you try os.path.exists() on a large file, it will return false.
This is really bad!  Imagine you've got code that says: if the file
doesn't exist, open it with mode "w+b" :-(.

I'd be happiest if os.path.exists() would work regardless of whether
Python supported large files.  I'd be satisfied with an exception that
at least lets me know something went wrong.

Jeremy




From guido@python.org  Mon Jun 17 21:08:26 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 16:08:26 -0400
Subject: [Python-Dev] large file support
In-Reply-To: Your message of "Mon, 17 Jun 2002 15:37:05 EDT."
 <15630.14945.184194.434432@slothrop.zope.com>
References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>
 <15630.14945.184194.434432@slothrop.zope.com>
Message-ID: <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net>

> However, I'm still unhappy with one thing related to large file
> support.  If you've got a Python that doesn't have large file support
> and you try os.path.exists() on a large file, it will return false.
> This is really bad!  Imagine you've got code that says: if the file
> doesn't exist, open it with mode "w+b" :-(.

Wow, that sucks.

> I'd be happiest if os.path.exists() would work regardless of whether
> Python supported large files.  I'd be satisfied with an exception that
> at least lets me know something went wrong.

Is there an errno we can test for?  stat() for a non-existent file
raises one exception, stat() for a file in a directory you can't read
raises a different one; maybe stat of a large file raises something
else again?  I think os.path.exists() ought to return True in this case.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy@zope.com  Mon Jun 17 21:28:23 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 17 Jun 2002 16:28:23 -0400
Subject: [Python-Dev] large file support
In-Reply-To: <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net>
References: <15630.14073.764208.284613@slothrop.zope.com>
 <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>
 <15630.14945.184194.434432@slothrop.zope.com>
 <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15630.18023.298708.670795@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  >> I'd be happiest if os.path.exists() would work regardless of
  >> whether Python supported large files.  I'd be satisfied with an
  >> exception that at least lets me know something went wrong.

  GvR> Is there an errno we can test for?  stat() for a non-existent
  GvR> file raises one exception, stat() for a file in a directory you
  GvR> can't read raises a different one; maybe stat of a large file
  GvR> raises something else again?  I think os.path.exists() ought to
  GvR> return True in this case.

On the platform I tried (apparently RH 7.1) it raises EOVERFLOW.  I
can extend posixpath to treat that as "file exists" tomorrow.

Jeremy




From bac@OCF.Berkeley.EDU  Mon Jun 17 21:28:22 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 17 Jun 2002 13:28:22 -0700 (PDT)
Subject: [Python-Dev] Python strptime
Message-ID: <Pine.SOL.4.44.0206171314510.19039-100000@death.OCF.Berkeley.EDU>

I have implemented strptime in pure Python (SF patch #474274) as a drop-in
replacement for the time module's version, but there is the issue of the
time module being a C extension.  Any chance of getting a Python module
stub for time (assuming this patch is good enough to be accepted)?

There is also obviously the option of doing something like a time2, but is
there enough other time-manipulating Python code out there to warrant
another module?  It could be used for housing naivetime and any other code
that does not directly stem from some ANSI C function.

-Brett C.




From guido@python.org  Mon Jun 17 21:49:49 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 16:49:49 -0400
Subject: [Python-Dev] large file support
In-Reply-To: Your message of "Mon, 17 Jun 2002 16:28:23 EDT."
 <15630.18023.298708.670795@slothrop.zope.com>
References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net>
 <15630.18023.298708.670795@slothrop.zope.com>
Message-ID: <200206172049.g5HKnn012080@pcp02138704pcs.reston01.va.comcast.net>

> On the platform I tried (apparently RH 7.1) it raises EOVERFLOW.  I
> can extend posixpath to treat that as "file exists" tomorrow.

OK.  Be sure to check that the errno module and the value
errno.EOVERFLOW exist before using them!
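
For illustration, here is a minimal sketch of what such a check could look
like (this is not the actual patch; the helper is made up, and it sticks to
2.1's convention of returning 1/0):

    import os

    def exists(path):
        """Sketch: treat EOVERFLOW from stat() as "the file exists"."""
        try:
            os.stat(path)
        except OSError, err:
            # Guard the errno lookup, as suggested above.
            try:
                import errno
                overflow = errno.EOVERFLOW
            except (ImportError, AttributeError):
                overflow = None
            # A >2GB file on a Python built without large file support
            # fails to stat() with EOVERFLOW, but it is certainly there.
            if overflow is not None and err.errno == overflow:
                return 1
            return 0
        return 1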

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jun 17 21:51:34 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 16:51:34 -0400
Subject: [Python-Dev] Python strptime
In-Reply-To: Your message of "Mon, 17 Jun 2002 13:28:22 PDT."
 <Pine.SOL.4.44.0206171314510.19039-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0206171314510.19039-100000@death.OCF.Berkeley.EDU>
Message-ID: <200206172051.g5HKpYn12103@pcp02138704pcs.reston01.va.comcast.net>

> I have implemented strptime in pure Python (SF patch #474274) as a drop-in
> replacement for the time module's version, but there is the issue of the
> time module being a C extension.  Any chance of getting a Python module
> stub for time (assuming this patch is good enough to be accepted)?
> 
> There is also obviously the option of doing something like a time2, but is
> there enough other time-manipulating Python code out there to warrant
> another module?  It could be used for housing naivetime and any other code
> that does not directly stem from some ANSI C function.

I think this should be done, but I have no time to review your
strptime implementation.  Can you submit (to the same patch item) a
patch for timemodule.c that adds a callout to your Python strptime
code when HAVE_STRPTIME is undefined?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Mon Jun 17 22:25:59 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 17 Jun 2002 23:25:59 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15629.62080.478763.190468@slothrop.zope.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
 <m3wut1eb9q.fsf@mira.informatik.hu-berlin.de>
 <15629.62080.478763.190468@slothrop.zope.com>
Message-ID: <m3n0ttmwbc.fsf@mira.informatik.hu-berlin.de>

Jeremy Hylton <jeremy@zope.com> writes:

> Here's a rough strategy for exploiting this feature in distutils.
> Does it make sense?

Sounds good. Unlike make, it should not choke if it cannot locate one
of the inputs of the dependency file - it may be that the header file
has gone away, and subsequent recompilation would update the
dependency file to show that.
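
For illustration, a hedged sketch of that policy (this is not actual
distutils code; every name below is made up):

    import os, string

    def out_of_date(depfile, obj):
        """Sketch: decide whether obj needs rebuilding from a make-style
        dependency file, without choking when a listed input is gone."""
        if not os.path.exists(depfile) or not os.path.exists(obj):
            return 1
        text = open(depfile).read()
        # "target: dep1 dep2 ..." with backslash-newline continuations
        text = string.replace(text, '\\\n', ' ')
        deps = string.split(string.split(text, ':', 1)[1])
        obj_mtime = os.path.getmtime(obj)
        for dep in deps:
            if not os.path.exists(dep):
                # The header has gone away; force a rebuild, which will
                # also rewrite the dependency file.
                return 1
            if os.path.getmtime(dep) > obj_mtime:
                return 1
        return 0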

If that is done, I'd still encourage use of the SUNPRO_DEPENDENCIES
feature for use with SunPRO (aka Forte, aka Sun ONE). Not that I'm
asking you to implement it, but it would be good if another such
mechanism would be easy to hook into whatever you implement.

Regards,
Martin



From guido@python.org  Mon Jun 17 22:31:35 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 17:31:35 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "17 Jun 2002 23:25:59 +0200."
 <m3n0ttmwbc.fsf@mira.informatik.hu-berlin.de>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <m3wut1eb9q.fsf@mira.informatik.h!
 u-berlin.de> <15629.62080.478763.190468@slothrop.zope.com>
 <m3n0ttmwbc.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206172131.g5HLVZT12712@pcp02138704pcs.reston01.va.comcast.net>

> If that is done, I'd still encourage use of the SUNPRO_DEPENDENCIES
> feature for use with SunPRO (aka Forte, aka Sun ONE). Not that I'm
> asking you to implement it, but it would be good if another such
> mechanism would be easy to hook into whatever you implement.

I don't recall that you explained the meaning of the
SUNPRO_DEPENDENCIES variable, only that it was undocumented and did
something similar to GCC's -M.  That's hardly enough. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From bac@OCF.Berkeley.EDU  Mon Jun 17 22:31:35 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 17 Jun 2002 14:31:35 -0700 (PDT)
Subject: [Python-Dev] Python strptime
In-Reply-To: <200206172051.g5HKpYn12103@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.SOL.4.44.0206171425250.8320-100000@hailstorm.OCF.Berkeley.EDU>

On Mon, 17 Jun 2002, Guido van Rossum wrote:

> > I have implemented strptime in pure Python (SF patch #474274) as a drop-in
> > replacement for the time module's version, but there is the issue of the
> > time module being a C extension.  Any chance of getting a Python module
> > stub for time (assuming this patch is good enough to be accepted)?
> >
> > There is also obviously the option of doing something like a time2, but is
> > there enough other time-manipulating Python code out there to warrant
> > another module?  It could be used for housing naivetime and any other code
> > that does not directly stem from some ANSI C function.
>
> I think this should be done, but I have no time to review your
> strptime implementation.  Can you submit (to the same patch item) a
> patch for timemodule.c that adds a callout to your Python strptime
> code when HAVE_STRPTIME is undefined?

Do you just want a callout to strptime or should I also include my helper
classes and functions?  I have implemented a class that figures out and
stores all locale-specific date info (weekday names, month names, etc.).
I subclass that for another class that creates the regexes used by
strptime.  I also have three functions that calculate missing data (Julian
date from Gregorian date, etc.).
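
For illustration, a hedged sketch of the kind of helper mentioned above,
the day-of-year ("Julian date") calculation; this is not the code from the
patch and the name is made up:

    def julian_day(year, month, day):
        """Sketch: day of the year, 1-366, for a Gregorian date."""
        days_in_month = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
        if year % 4 == 0 and (year % 100 != 0 or year % 400 == 0):
            days_in_month[1] = 29          # leap year
        julian = day
        for m in range(month - 1):
            julian = julian + days_in_month[m]
        return julian

julian_day(2002, 6, 17), for instance, gives 168.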

But one does not need access to any of these things directly if one just
wants to use strptime like in the time module, so they can be left out for
now if you prefer.

>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
>


-Brett C.




From guido@python.org  Mon Jun 17 22:44:28 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 17:44:28 -0400
Subject: [Python-Dev] Python strptime
In-Reply-To: Your message of "Mon, 17 Jun 2002 14:31:35 PDT."
 <Pine.SOL.4.44.0206171425250.8320-100000@hailstorm.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0206171425250.8320-100000@hailstorm.OCF.Berkeley.EDU>
Message-ID: <200206172144.g5HLiSZ12766@pcp02138704pcs.reston01.va.comcast.net>

> Do you just want a callout to strptime or should I also include my helper
> classes and functions?  I have implemented a class that figures out and
> stores all locale-specific date info (weekday names, month names, etc.).
> I subclass that for another class that creates the regexes used by
> strptime.  I also have three functions that calculate missing data (Julian
> date from Gregorian date, etc.).
> 
> But one does not need access to any of these things directly if one just
> wants to use strptime like in the time module, so they can be left out for
> now if you prefer.

If there's a way to get at the extra stuff by importing strptime.py,
that's preferred.  The time module only needs to support the classic
strptime function.  (But as I said I haven't seen your code, so maybe
I misunderstand your question.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Mon Jun 17 22:44:59 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 17 Jun 2002 23:44:59 +0200
Subject: [Python-Dev] large file support
In-Reply-To: <15630.14945.184194.434432@slothrop.zope.com>
References: <15630.14073.764208.284613@slothrop.zope.com>
 <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>
 <15630.14945.184194.434432@slothrop.zope.com>
Message-ID: <m3660hmvfo.fsf@mira.informatik.hu-berlin.de>

jeremy@zope.com (Jeremy Hylton) writes:

> However, I'm still unhappy with one thing related to large file
> support.  If you've got a Python that doesn't have large file support
> and you try os.path.exists() on a large file, it will return false.
> This is really bad!

I believe this is a pilot error. On a system that supports large
files, it is the administrator's job to make sure the Python
installation has large file support enabled; otherwise, strange things
may happen.

So yes, it is bad, but no, it is not really bad. Feel free to fix it,
but be prepared to include work-arounds in many other places, too.

Regards,
Martin




From martin@v.loewis.de  Mon Jun 17 22:49:11 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 17 Jun 2002 23:49:11 +0200
Subject: [Python-Dev] Python strptime
In-Reply-To: <Pine.SOL.4.44.0206171425250.8320-100000@hailstorm.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0206171425250.8320-100000@hailstorm.OCF.Berkeley.EDU>
Message-ID: <m3wusxlgo8.fsf@mira.informatik.hu-berlin.de>

Brett Cannon <bac@OCF.Berkeley.EDU> writes:

> Do you just want a callout to strptime or should I also include my helper
> classes and functions?  I have implemented a class that figures out and
> stores all locale-specific date info (weekday names, month names, etc.).

That sounds terrible. How do you do that, and on what systems does it
work? Do we really want to do that? Does it always work?

Regards,
Martin



From bac@OCF.Berkeley.EDU  Mon Jun 17 22:47:44 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 17 Jun 2002 14:47:44 -0700 (PDT)
Subject: [Python-Dev] Python strptime
In-Reply-To: <200206172144.g5HLiSZ12766@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.SOL.4.44.0206171445230.8504-100000@hailstorm.OCF.Berkeley.EDU>

On Mon, 17 Jun 2002, Guido van Rossum wrote:

> > Do you just want a callout to strptime or should I also include my helper
> > classes and functions?  I have implemented a class that figures out and
> > stores all locale-specific date info (weekday names, month names, etc.).
> > I subclass that for another class that creates the regexes used by
> > strptime.  I also have three functions that calculate missing data (Julian
> > date from Gregorian date, etc.).
> >
> > But one does not need access to any of these things directly if one just
> > wants to use strptime like in the time module, so they can be left out for
> > now if you prefer.
>
> If there's a way to get at the extra stuff by importing strptime.py,
> that's preferred.  The time module only needs to support the classic
> strptime function.  (But as I said I haven't seen your code, so maybe
> I misunderstand your question.)

No, you understood it.  I made all of it importable.  I figured they might
be useful in some other fashion, so there is no munging of names or
explicit omission from __all__ or anything.

So my question is answered.  Now I just need to write the patch.  Might be
a little while since I have never bothered to learn callouts from C to
Python.  Guess I now have my personal project for the week.

-Brett C.




From guido@python.org  Mon Jun 17 22:57:03 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 17:57:03 -0400
Subject: [Python-Dev] large file support
In-Reply-To: Your message of "17 Jun 2002 23:44:59 +0200."
 <m3660hmvfo.fsf@mira.informatik.hu-berlin.de>
References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com>
 <m3660hmvfo.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206172157.g5HLv3f12913@pcp02138704pcs.reston01.va.comcast.net>

> I believe this is a pilot error. On a system that supports large
> files, it is the administrator's job to make sure the Python
> installation has large file support enabled, otherwise, strange things
> may happen.

I'm not sure who to blame, but note that (at least for 2.1.2, which is
the version that Jeremy said he was given to use) large file support
must be configured manually.  So this might be a common problem.
Unfortunately that may mean that it's only worth fixing in 2.1.4...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Mon Jun 17 23:03:12 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 18 Jun 2002 00:03:12 +0200
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206172131.g5HLVZT12712@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
 <m3wut1eb9q.fsf@mira.informatik.hu-berlin.de>
 <15629.62080.478763.190468@slothrop.zope.com>
 <m3n0ttmwbc.fsf@mira.informatik.hu-berlin.de>
 <200206172131.g5HLVZT12712@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3sn3llg0v.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> I don't recall that you explained the meaning of the
> SUNPRO_DEPENDENCIES variable, only that it was undocumented and did
> something similar to GCC's -M.  That's hardly enough. :-)

I see :-) Suppose you have a file x.c, and you invoke

   env SUNPRO_DEPENDENCIES="x.deps build/x.o" gcc -c -o x.o x.c

then a file x.deps is generated, and has, on the left-hand side of the
dependency rule, build/x.o. It works the same way for compilers
identifying themselves as

cc: Sun WorkShop 6 update 1 C 5.2 2000/09/11

when invoked with -V. I can't give a complete list of compilers that
support that feature, but setting the variable can't hurt - the worst
case is that it is ignored.

Regards,
Martin



From martin@v.loewis.de  Mon Jun 17 22:47:09 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 17 Jun 2002 23:47:09 +0200
Subject: [Python-Dev] large file support
In-Reply-To: <104250714.1024328694@[10.10.1.2]>
References: <15630.14073.764208.284613@slothrop.zope.com>
 <104250714.1024328694@[10.10.1.2]>
Message-ID: <m31yb5mvc2.fsf@mira.informatik.hu-berlin.de>

Andreas Jung <andreas@andreas-jung.com> writes:

> I remember that we had trouble several times compiling
> Python 2.1 under Redhat with LF support. Also, the way described in the docs
> did not work in all cases and we had to tweak the sources a bit
> (I think it was posixmodule.c).

For the current 2.1 release, the docs are believed to be correct (the
instructions used to be incorrect, as was the code). For 2.2, it is
believed that no extra configuration is necessary on "most" systems
(Windows, Linux, Solaris).

Regards,
Martin



From bac@OCF.Berkeley.EDU  Mon Jun 17 23:27:23 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 17 Jun 2002 15:27:23 -0700 (PDT)
Subject: [Python-Dev] Python strptime
In-Reply-To: <m3wusxlgo8.fsf@mira.informatik.hu-berlin.de>
Message-ID: <Pine.SOL.4.44.0206171504530.8504-100000@hailstorm.OCF.Berkeley.EDU>

On 17 Jun 2002, Martin v. Loewis wrote:

> Brett Cannon <bac@OCF.Berkeley.EDU> writes:
>
> > Do you just want a callout to strptime or should I also include my helper
> > classes and functions?  I have implemented a class that figures out and
> > stores all locale-specific date info (weekday names, month names, etc.).
>
> That sounds terrible. How do you do that, and on what systems does it
> work? Do we really want to do that? Does it always work?
>

Well, since locale info is not directly accessible for time-specific
things in Python (let alone in C in a standard way), I have to do multiple
calls to strftime to get the names of the weekdays.  As for the strings
representing the locale-specific date, time, and date/time representations,
I have to go through the strftime output and work out its structure in order
to extract the format string that strftime used to create it.  Since it is in
pure Python and relies only on strftime and locale for its info, it works
on all systems.  I have yet to have anyone say it doesn't work for them.

As for whether that is the best solution, I think it is for the situation.
Yes, I could roll all of this into strptime itself and make it a single
monolithic function.  The reason I did this was so that the object (named
LocaleTime) could handle lazy evaluation for that info.  That way you are
not paying the price of having to recalculate the same information
thousands of times (for instance if you are parsing a huge logfile).
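
For illustration, a hedged sketch of the two ideas together -- harvesting a
locale's names through strftime and computing them lazily.  This is an
approximation of the technique, not the actual LocaleTime class:

    import time

    class LazyLocaleNames:
        """Sketch: full weekday names, computed only on first access."""
        def __getattr__(self, name):
            if name == 'f_weekday':
                names = []
                # 1999-03-01 was a Monday; format seven consecutive days.
                for day in range(1, 8):
                    t = time.localtime(
                        time.mktime((1999, 3, day, 12, 0, 0, 0, 0, -1)))
                    names.append(time.strftime('%A', t))
                self.f_weekday = names     # cache on the instance
                return names
            raise AttributeError, name

A single shared instance means that parsing a huge logfile pays for the
strftime calls only once.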

I also think it is helpful to have that info available separately from
strptime since locale does not provide it.

Since the locale information is not accessible any other way that I can
come up with, the only other solution is to have someone enter all the
locale-specific info by hand.  I personally would rather put up with a
more complicated strptime setup than have to worry about entering all of
that info.

-Brett C.




From niemeyer@conectiva.com  Tue Jun 18 00:45:15 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Mon, 17 Jun 2002 20:45:15 -0300
Subject: [Python-Dev] Python strptime
In-Reply-To: <Pine.SOL.4.44.0206171504530.8504-100000@hailstorm.OCF.Berkeley.EDU>
References: <m3wusxlgo8.fsf@mira.informatik.hu-berlin.de> <Pine.SOL.4.44.0206171504530.8504-100000@hailstorm.OCF.Berkeley.EDU>
Message-ID: <20020617204515.A10690@ibook.distro.conectiva>

Brett,

> Well, since locale info is not directly accessible for time-specific
> things in Python (let alone in C in a standard way), I have to do multiple
> calls to strftime to get the names of the weekdays.  As for the strings
> representing the locale-specific date, time, and date/time representations,
> I have to go through the strftime output and work out its structure in order
> to extract the format string that strftime used to create it.  Since it is in
> pure Python and relies only on strftime and locale for its info, it works
> on all systems.  I have yet to have anyone say it doesn't work for them.
[...]
> I also think it is helpful to have that info available separately from
> strptime since locale does not provide it.

What kind of information are you looking for exactly? I'm not sure if
this is available on every platform (it's standardized only by the
"Single UNIX Specification" according to my man page), but if it is,
everything you're looking for is there:

>>> locale.nl_langinfo(locale.D_FMT)
'%m/%d/%y'

You could also try loading a translation catalog on the target system,
but that could be unportable as well.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From bac@OCF.Berkeley.EDU  Tue Jun 18 00:49:05 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 17 Jun 2002 16:49:05 -0700 (PDT)
Subject: [Python-Dev] Python strptime
In-Reply-To: <20020617204515.A10690@ibook.distro.conectiva>
Message-ID: <Pine.SOL.4.44.0206171647100.27620-100000@sandstorm.OCF.Berkeley.EDU>

> What kind of information are you looking for exactly? I'm not sure if
> this is available on every platform (it's standardized only by the
> "Single UNIX Specification" according to my man page), but if it is,
> everything you're looking for is there:
>
> >>> locale.nl_langinfo(locale.D_FMT)
> '%m/%d/%y'
>
> You could also try loading a translation catalog in the target system,
> but that could be unportable as well.

That is the type of info I am looking for, but it is not portable.
Windows does not have this functionality to my knowledge.  If it did, it
would be stupid of it not to have strptime built in as well.  ANSI C,
unfortunately, does not provide a way to get this info directly.  This is
why I have to get it from strftime.

-Brett C.

>
> --
> Gustavo Niemeyer




From niemeyer@conectiva.com  Tue Jun 18 00:53:21 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Mon, 17 Jun 2002 20:53:21 -0300
Subject: [Python-Dev] Python strptime
In-Reply-To: <Pine.SOL.4.44.0206171647100.27620-100000@sandstorm.OCF.Berkeley.EDU>
References: <20020617204515.A10690@ibook.distro.conectiva> <Pine.SOL.4.44.0206171647100.27620-100000@sandstorm.OCF.Berkeley.EDU>
Message-ID: <20020617205321.A11040@ibook.distro.conectiva>

> That is the type of info I am looking for, but it is not portable.
> Windows does not have this functionality to my knowledge.  If it did, it
> would be stupid of it not to have strptime built in as well.  ANSI C,
> unfortunately, does not provide a way to get this info directly.  This is
> why I have to get it from strftime.

Well, providing strftime and not strptime is stupid already, following
your point of view. :-)

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From jason@crash.org  Tue Jun 18 01:14:21 2002
From: jason@crash.org (Jason L. Asbahr)
Date: Mon, 17 Jun 2002 19:14:21 -0500
Subject: [Python-Dev] Playstation 2 and GameCube ports
Message-ID: <EIEFLCFECLLBKGPNJJIMOEOAIIAA.jason@crash.org>

Pythonistas,

During the past year, I did some work that involved porting Python 2.1
to the Sony Playstation 2 (professional developer system) and Nintendo 
GameCube platforms.  Since it involved little code change to Python
itself, I was wondering if there is interest in merging these changes 
into the main trunk?  

This would first involve bringing those ports up to speed with the
current CVS trunk.  But before that, a PEP?

Cheers,

Jason

______________________________________________________________________
Jason Asbahr
jason@asbahr.com

 



From pobrien@orbtech.com  Tue Jun 18 01:30:41 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Mon, 17 Jun 2002 19:30:41 -0500
Subject: [Python-Dev] Playstation 2 and GameCube ports
In-Reply-To: <EIEFLCFECLLBKGPNJJIMOEOAIIAA.jason@crash.org>
Message-ID: <NBBBIOJPGKJEKIECEMCBIEOGNEAA.pobrien@orbtech.com>

[Jason L. Asbahr]
>
> Pythonistas,
>
> During the past year, I did some work that involved porting Python 2.1
> to the Sony Playstation 2 (professional developer system) and Nintendo
> GameCube platforms.  Since it involved little code change to Python
> itself, I was wondering if there is interest in merging these changes
> into the main trunk?
>
> This would first involve bringing those ports up to speed with the
> current CVS trunk.  But before that, a PEP?

That would certainly get my son's attention and might even get him started
in programming. I wouldn't mind seeing your efforts written up in a PEP.
What exactly can you accomplish with Python on one of these boxes?

--
Patrick K. O'Brien
Orbtech
-----------------------------------------------
"Your source for Python software development."
-----------------------------------------------
Web:  http://www.orbtech.com/web/pobrien/
Blog: http://www.orbtech.com/blog/pobrien/
Wiki: http://www.orbtech.com/wiki/PatrickOBrien
-----------------------------------------------




From guido@python.org  Tue Jun 18 02:24:50 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 21:24:50 -0400
Subject: [Python-Dev] Playstation 2 and GameCube ports
In-Reply-To: Your message of "Mon, 17 Jun 2002 19:14:21 CDT."
 <EIEFLCFECLLBKGPNJJIMOEOAIIAA.jason@crash.org>
References: <EIEFLCFECLLBKGPNJJIMOEOAIIAA.jason@crash.org>
Message-ID: <200206180124.g5I1Oos13421@pcp02138704pcs.reston01.va.comcast.net>

> During the past year, I did some work that involved porting Python 2.1
> to the Sony Playstation 2 (professional developer system) and Nintendo 
> GameCube platforms.  Since it involved little code change to Python
> itself, I was wondering if there is interest in merging these changes 
> into the main trunk?  
> 
> This would first involve bringing those ports up to speed with the
> current CVS trunk.  But before that, a PEP?

I don't think this needs a PEP -- you can just submit the changes to
the SF patch manager, assuming they are indeed small.

Our neighbors Qove here in McLean might be interested in your work.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jun 18 02:42:36 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 21:42:36 -0400
Subject: [Python-Dev] Playstation 2 and GameCube ports
In-Reply-To: Your message of "Mon, 17 Jun 2002 19:30:41 CDT."
 <NBBBIOJPGKJEKIECEMCBIEOGNEAA.pobrien@orbtech.com>
References: <NBBBIOJPGKJEKIECEMCBIEOGNEAA.pobrien@orbtech.com>
Message-ID: <200206180142.g5I1gbg13515@pcp02138704pcs.reston01.va.comcast.net>

> That would certainly get my son's attention and might even get him started
> in programming. I wouldn't mind seeing your efforts written up in a PEP.
> What exactly can you accomplish with Python on one of these boxes?

Don't you need a (costly) developers license in order to use this?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jun 18 02:46:35 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 21:46:35 -0400
Subject: [Python-Dev] Python strptime
In-Reply-To: Your message of "Mon, 17 Jun 2002 15:27:23 PDT."
 <Pine.SOL.4.44.0206171504530.8504-100000@hailstorm.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0206171504530.8504-100000@hailstorm.OCF.Berkeley.EDU>
Message-ID: <200206180146.g5I1kZn13557@pcp02138704pcs.reston01.va.comcast.net>

> Well, since locale info is not directly accessible for time-specific
> things in Python (let alone in C in a standard way), I have to do
> multiple calls to strftime to get the names of the weekdays.

I guess so -- the calendar module does the same (and then makes them
available).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jafo@tummy.com  Tue Jun 18 03:50:02 2002
From: jafo@tummy.com (Sean Reifschneider)
Date: Mon, 17 Jun 2002 20:50:02 -0600
Subject: [Python-Dev] large file support
In-Reply-To: <15630.18023.298708.670795@slothrop.zope.com>; from jeremy@zope.com on Mon, Jun 17, 2002 at 04:28:23PM -0400
References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com>
Message-ID: <20020617205002.A13413@tummy.com>

On Mon, Jun 17, 2002 at 04:28:23PM -0400, Jeremy Hylton wrote:
>On the platform I tried (apparently RH 7.1) it raises EOVERFLOW.  I
>can extend posixpath to treat that as "file exists" tomorrow.

How about changing os.path.exists for posix to:

   def exists(path):
      return(os.access(path, os.F_OK))

I haven't done more than a few simple tests, but I believe that this would
provide similar functionality without relying on os.stat not breaking.
Plus, access is faster (on the order of 2x as fast when stat()ing a quarter
million files on my laptop).

Sean
-- 
 I have never been able to conceive how any rational being could propose
 happiness to himself from the exercise of power over others.  -- Jefferson
Sean Reifschneider, Inimitably Superfluous <jafo@tummy.com>
tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python



From guido@python.org  Tue Jun 18 04:02:42 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 23:02:42 -0400
Subject: [Python-Dev] large file support
In-Reply-To: Your message of "Mon, 17 Jun 2002 20:50:02 MDT."
 <20020617205002.A13413@tummy.com>
References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com>
 <20020617205002.A13413@tummy.com>
Message-ID: <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net>

> How about changing os.path.exists for posix to:
> 
>    def exists(path):
>       return(os.access(path, os.F_OK))

NO, NO, NOOOOOOO!

access() does something different.  It checks permissions as they
would be for the real user id, not the effective one.  DO NOT USE
access() TO CHECK FOR
FILE PERMISSIONS UNLESS YOU HAVE A SET-UID MISSION!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jafo@tummy.com  Tue Jun 18 04:07:07 2002
From: jafo@tummy.com (Sean Reifschneider)
Date: Mon, 17 Jun 2002 21:07:07 -0600
Subject: [Python-Dev] large file support
In-Reply-To: <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jun 17, 2002 at 11:02:42PM -0400
References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020617210707.C24702@tummy.com>

On Mon, Jun 17, 2002 at 11:02:42PM -0400, Guido van Rossum wrote:
>> How about changing os.path.exists for posix to:
>> 
>>    def exists(path):
>>       return(os.access(path, os.F_OK))
>
>NO, NO, NOOOOOOO!
>
>access() does something different.  It checks permissions as they

F_OK checks to see if the file exists.  Am I misunderstanding something in
the following test:

   [2] guin:tmp# cd /tmp
   [2] guin:tmp# mkdir test
   [2] guin:tmp# chmod 700 test
   [2] guin:tmp# touch test/exists
   [2] guin:tmp# chmod 700 test/exists
   [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/exists' jafo
   access: 0   exists: 0
   [2] guin:tmp# chmod 111 /tmp/test
   [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/exists' jafo
   access: 1   exists: 1
   [2] guin:tmp# chmod 000 test/exists
   [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/exists' jafo
   access: 1   exists: 1
   [2] guin:tmp# chmod 000 /tmp/test
   [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/exists' jafo
   access: 0   exists: 0
   [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/noexists' jafo
   access: 0   exists: 0
   [2] guin:tmp# chmod 777 /tmp/test
   [2] guin:tmp# su -c '/tmp/showaccess /tmp/test/noexists' jafo
   access: 0   exists: 0

The above is run as root, with the su doing the test as non-root.  The code
in showaccess simply does an os.access and then an os.path.exists and
displays the results:

   [2] guin:tmp# cat /tmp/showaccess 
   #!/usr/bin/env python2

   import os, sys

   print 'access: %d   exists: %d' % ( os.access(sys.argv[1], os.F_OK),
         os.path.exists(sys.argv[1]))

Sean
-- 
 /home is where your .heart is.  -- Sean Reifschneider, 1999
Sean Reifschneider, Inimitably Superfluous <jafo@tummy.com>
tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python



From guido@python.org  Tue Jun 18 04:25:34 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Jun 2002 23:25:34 -0400
Subject: [Python-Dev] large file support
In-Reply-To: Your message of "Mon, 17 Jun 2002 21:07:07 MDT."
 <20020617210707.C24702@tummy.com>
References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net>
 <20020617210707.C24702@tummy.com>
Message-ID: <200206180325.g5I3PYj28659@pcp02138704pcs.reston01.va.comcast.net>

> >>    def exists(path):
> >>       return(os.access(path, os.F_OK))
> >
> >NO, NO, NOOOOOOO!
> >
> >access() does something different.  It checks permissions as they
> 
> F_OK checks to see if the file exists.

It is my understanding that if some directory along the path to the
file is accessible to the effective user (say, root in a set-uid
program) but not to the real user, access() for a file in that
directory might return 0 while exists() would return 1, on some
operating systems.

There's only one rule for access(): only use it if you have a set-uid
mission.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From jeremy@zope.com  Tue Jun 18 05:14:18 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Tue, 18 Jun 2002 00:14:18 -0400
Subject: [Python-Dev] large file support
In-Reply-To: <m3660hmvfo.fsf@mira.informatik.hu-berlin.de>
References: <15630.14073.764208.284613@slothrop.zope.com>
 <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>
 <15630.14945.184194.434432@slothrop.zope.com>
 <m3660hmvfo.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15630.45978.675730.729531@slothrop.zope.com>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

  MvL> jeremy@zope.com (Jeremy Hylton) writes:
  >> However, I'm still unhappy with one thing related to large file
  >> support.  If you've got a Python that doesn't have large file
  >> support and you try os.path.exists() on a large file, it will
  >> return false.  This is really bad!

  MvL> I believe this is a pilot error. On a system that supports
  MvL> large files, it is the administrator's job to make sure the
  MvL> Python installation has large file support enabled, otherwise,
  MvL> strange things may happen.

We sure don't provide much help for such an administrator.  (Happily,
I am not one.)  The instructions for Linux offer a configure recipe
and say "it might work."  If you build without large file support on
a Linux system, the test suite gives no indication that something went
wrong.  So I think it is unreasonable to say the Python install is
broken, despite the fact that it's possible to do better.

  MvL> So yes, it is bad, but no, it is not really bad. Feel free to
  MvL> fix it, but be prepared to include work-arounds in many other
  MvL> places, too.

os.path.exists() is perhaps the most egregious.  I think it's worth
backporting the fix to the 2.1 branch, along with fixes for any other
glaring errors.  We might still see a 2.1.4.

Jeremy





From mgilfix@eecs.tufts.edu  Tue Jun 18 05:26:28 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Tue, 18 Jun 2002 00:26:28 -0400
Subject: [Python-Dev] test_socket failures
In-Reply-To: <20020618000741.F17999@eecs.tufts.edu>; from mgilfix@eecs.tufts.edu on Tue, Jun 18, 2002 at 12:07:41AM -0400
References: <200206122059.g5CKxQa16372@odiug.zope.com> <20020612191355.C10542@eecs.tufts.edu> <200206130042.g5D0gZu10922@pcp02138704pcs.reston01.va.comcast.net> <20020612232545.B12119@eecs.tufts.edu> <200206131657.g5DGv0300386@odiug.zope.com> <20020613140908.E18170@eecs.tufts.edu> <200206131816.g5DIGSS03032@odiug.zope.com> <20020618000741.F17999@eecs.tufts.edu>
Message-ID: <20020618002627.G17999@eecs.tufts.edu>

  Guido, please let me know if you want me to do anything more
regarding the test_socket.py stuff and perhaps some timeout stuff
before I take off (this Wednesday). A note for the interested: I'll be
gone for a month on vacation, so response won't be timely.

             Regards,

                 -- Mike

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html



From martin@v.loewis.de  Tue Jun 18 06:30:15 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 18 Jun 2002 07:30:15 +0200
Subject: [Python-Dev] Python strptime
In-Reply-To: <Pine.SOL.4.44.0206171504530.8504-100000@hailstorm.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0206171504530.8504-100000@hailstorm.OCF.Berkeley.EDU>
Message-ID: <m3adptqhlk.fsf@mira.informatik.hu-berlin.de>

Brett Cannon <bac@OCF.Berkeley.EDU> writes:

> Well, since locale info is not directly accessible for time-specific
> things in Python (let alone in C in a standard way), I have to do multiple
> calls to strftime to get the names of the weekdays.

I wonder what the purpose of having a pure-Python implementation of
strptime is, if you have to rely on strftime. Is this for Windows only?

Regards,
Martin




From jafo@tummy.com  Tue Jun 18 06:32:41 2002
From: jafo@tummy.com (Sean Reifschneider)
Date: Mon, 17 Jun 2002 23:32:41 -0600
Subject: [Python-Dev] large file support
In-Reply-To: <200206180325.g5I3PYj28659@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jun 17, 2002 at 11:25:34PM -0400
References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net> <20020617210707.C24702@tummy.com> <200206180325.g5I3PYj28659@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020617233241.E24702@tummy.com>

On Mon, Jun 17, 2002 at 11:25:34PM -0400, Guido van Rossum wrote:
>It is my understanding that if some directory along the path to the
>file is accessible to root but not to the effective user, access() for
>a file in that directory might return 0 while exists would return 1,

I would be shocked if POSIX allowed a non-root user to probe file entries
under a root/700 directory...

What a paradox -- when I submitted the patch to add F_OK, you said that
exists() did the same thing.  ;-)

Sean
-- 
 "Your documents always look so good."  "That's because I keep my laser-printer
 set on ``stun''."  -- Sean Reifschneider, 1998
Sean Reifschneider, Inimitably Superfluous <jafo@tummy.com>
tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python



From bac@OCF.Berkeley.EDU  Tue Jun 18 06:32:45 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 17 Jun 2002 22:32:45 -0700 (PDT)
Subject: [Python-Dev] Python strptime
In-Reply-To: <20020617205321.A11040@ibook.distro.conectiva>
Message-ID: <Pine.SOL.4.44.0206172231340.15718-100000@death.OCF.Berkeley.EDU>

On Mon, 17 Jun 2002, Gustavo Niemeyer wrote:

> > That is the type of info I am looking for, but it is not portable.
> > Windows does not have this functionality to my knowledge.  If it did, it
> > would be stupid of it not to have strptime built in as well.  ANSI C,
> > unfortunately, does not provide a way to get this info directly.  This is
> > why I have to get it from strftime.
>
> Well, providing strftime and not strptime is stupid already, following
> your point of view. :-)
>

Yeah.  =)  Since it is just reversed it is not that difficult once you
have the locale information.  The most difficult part of this whole module
was trying to come up with a way to get that information.

> --
> Gustavo Niemeyer
>

-Brett C.




From martin@v.loewis.de  Tue Jun 18 06:36:07 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 18 Jun 2002 07:36:07 +0200
Subject: [Python-Dev] large file support
In-Reply-To: <15630.45978.675730.729531@slothrop.zope.com>
References: <15630.14073.764208.284613@slothrop.zope.com>
 <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>
 <15630.14945.184194.434432@slothrop.zope.com>
 <m3660hmvfo.fsf@mira.informatik.hu-berlin.de>
 <15630.45978.675730.729531@slothrop.zope.com>
Message-ID: <m3660hqhbs.fsf@mira.informatik.hu-berlin.de>

jeremy@zope.com (Jeremy Hylton) writes:

> We sure don't provide much help for such an administrator.

We do, but not in 2.1.

> os.path.exists() is perhaps the most egregious.  I think it's worth
> backporting the fix to the 2.1 branch, along with any other glaring
> errors.  We might still see a 2.1.4.

In that case, I recommend backporting the machinery that enables LFS
from 2.2. If this machinery fails to detect LFS support on a system,
there is a good chance that your processing of EOVERFLOW fails on that
system as well.

Regards,
Martin




From bac@OCF.Berkeley.EDU  Tue Jun 18 06:39:37 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 17 Jun 2002 22:39:37 -0700 (PDT)
Subject: [Python-Dev] Python strptime
In-Reply-To: <200206180146.g5I1kZn13557@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.SOL.4.44.0206172232540.15718-100000@death.OCF.Berkeley.EDU>

On Mon, 17 Jun 2002, Guido van Rossum wrote:

> > Well, since locale info is not directly accessible for time-specific
> > things in Python (let alone in C in a standard way), I have to do
> > multiple calls to strftime to get the names of the weekdays.
>
> I guess so -- the calendar module does the same (and then makes them
> available).
>

Perhaps this info is important enough to not be in time but in locale?
I could rework my code that figures out the date info to fit more into
locale.  Maybe have some constants (like A_WEEKDAY, F_WEEKDAY, etc.) that
could be passed to a function that would return a list of the requested
names?  Or I could stay with the way I currently have it and just have a
class that stores all of that info and has named attributes to return the
info?
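
For illustration, a hedged sketch of the constants-plus-function variant of
that idea; none of these names exist in the locale module, they only show
what the interface might feel like:

    import time

    A_WEEKDAY = '%a'    # abbreviated weekday names (hypothetical constant)
    F_WEEKDAY = '%A'    # full weekday names (hypothetical constant)

    def time_names(which):
        """Sketch: the locale's weekday names, Monday first."""
        names = []
        for day in range(1, 8):     # 1999-03-01 was a Monday
            t = time.localtime(
                time.mktime((1999, 3, day, 12, 0, 0, 0, 0, -1)))
            names.append(time.strftime(which, t))
        return names

In the C locale, time_names(F_WEEKDAY) would return ['Monday', 'Tuesday',
..., 'Sunday'].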

> --Guido van Rossum (home page: http://www.python.org/~guido/)

-Brett C.




From bac@OCF.Berkeley.EDU  Tue Jun 18 06:53:58 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 17 Jun 2002 22:53:58 -0700 (PDT)
Subject: [Python-Dev] Python strptime
In-Reply-To: <m3adptqhlk.fsf@mira.informatik.hu-berlin.de>
Message-ID: <Pine.SOL.4.44.0206172245190.15718-100000@death.OCF.Berkeley.EDU>

On 18 Jun 2002, Martin v. Loewis wrote:

> Brett Cannon <bac@OCF.Berkeley.EDU> writes:
>
> > Well, since locale info is not directly accessible for time-specific
> > things in Python (let alone in C in a standard way), I have to do multiple
> > calls to strftime to get the names of the weekdays.
>
> I wonder what the purpose of having a pure-Python implementation of
> strptime is, if you have to rely on strftime. Is this for Windows only?
>

The purpose is that strptime is not common across all platforms.  As it
stands now, it requires that the underlying C library support it.  Since
it is not specified in ANSI C, not all have it.  glibc has it so most UNIX
installs have it.  But Windows doesn't.  It is not meant specifically for
Windows, but it happens to be the major OS that lacks it.

But strftime is guaranteed by Python to be there since it is in ANSI C.
The reason I rely on strftime is that it was the only way
I could think of to reliably gain access to locale information with regard
to time.  If I didn't try to figure out what the names of months were, I
would not need strftime at all.  But since I wanted this to be
a drop-in replacement, I decided it was worth my time to figure out how to
get this locale info when it is not directly accessible.

As for why it is in Python and not C, it's mainly because I prefer Python.
=)  I think it could be done in C, but it would be much more work.  I also
remember Guido saying somewhere that he would like to see modules in
Python when possible.  It was possible, so I did it in Python.

> Regards,
> Martin
>
>

-Brett C.




From pf@artcom-gmbh.de  Tue Jun 18 08:27:13 2002
From: pf@artcom-gmbh.de (Peter Funk)
Date: Tue, 18 Jun 2002 09:27:13 +0200 (CEST)
Subject: [Python-Dev] Python strptime
In-Reply-To: <Pine.SOL.4.44.0206172245190.15718-100000@death.OCF.Berkeley.EDU>
 from Brett Cannon at "Jun 17, 2002 10:53:58 pm"
Message-ID: <m17KDOT-000H5jC@artcom0.artcom-gmbh.de>

Hi,

Brett Cannon:
[...]
> The purpose is that strptime is not common across all platforms.  As it
> stands now, it requires that the underlying C library support it.  Since
> it is not specified in ANSI C, not all have it.  glibc has it so most UNIX
> installs have it.  But Windows doesn't.  It is not meant specifically for
> Windows, but it happens to be the major OS that lacks it.
[...]

There is some relationship between the time module and the calendar
module, which is also a long-time member of the Python standard
library.  If your new code becomes part of the standard library,
please have a look at the calendar module and its documentation.
At least cross-referencing pointers should be added.  Maybe someone
will come up with a "locale-awareness" patch to calendar?
Currently month and weekday names are constants hardcoded in
English in calendar.py.  The stuff you wrote might help here.

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany)




From fredrik@pythonware.com  Tue Jun 18 09:50:05 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 18 Jun 2002 10:50:05 +0200
Subject: [Python-Dev] Python strptime
References: <Pine.SOL.4.44.0206171445230.8504-100000@hailstorm.OCF.Berkeley.EDU>
Message-ID: <01e801c216a5$257eb130$0900a8c0@spiff>

brett wrote:

> Might be a little while since I have never bothered to learn callouts
> from C to Python.  Guess I now have my personal project for the week.

look for the "call" function in Modules/_sre.c (and how it's
used throughout the module).

</F>




From mwh@python.net  Tue Jun 18 12:23:11 2002
From: mwh@python.net (Michael Hudson)
Date: 18 Jun 2002 12:23:11 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects bufferobject.c,2.17,2.18 classobject.c,2.156,2.157 descrobject.c,2.26,2.27 funcobject.c,2.54,2.55 sliceobject.c,2.14,2.15
In-Reply-To: gvanrossum@users.sourceforge.net's message of "Fri, 14 Jun 2002 13:41:18 -0700"
References: <E17Ixsk-0005Ic-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <2m3cvkde5c.fsf@starship.python.net>

gvanrossum@users.sourceforge.net writes:

> Index: classobject.c
[...]
> + static PyObject *
> + new_class(PyObject* unused, PyObject* args)
> + {
> + 	PyObject *name;
> + 	PyObject *classes;
> + 	PyObject *dict;
> +   
> + 	if (!PyArg_ParseTuple(args, "SO!O!:class",
> + 			      &name,
> + 			      &PyTuple_Type, &classes,
> + 			      &PyDict_Type, &dict))
> + 		return NULL;
> + 	return PyClass_New(classes, dict, name);
> + }
> + 

What's this for?  It's not referred to anywhere, so I'm getting
warnings about it.  I'd just hack it out, but it only just got
added...

Cheers,
M.

-- 
  ARTHUR:  Don't ask me how it works or I'll start to whimper.
                   -- The Hitch-Hikers Guide to the Galaxy, Episode 11



From guido@python.org  Tue Jun 18 12:43:56 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Jun 2002 07:43:56 -0400
Subject: [Python-Dev] Python strptime
In-Reply-To: Your message of "18 Jun 2002 07:30:15 +0200."
 <m3adptqhlk.fsf@mira.informatik.hu-berlin.de>
References: <Pine.SOL.4.44.0206171504530.8504-100000@hailstorm.OCF.Berkeley.EDU>
 <m3adptqhlk.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206181144.g5IBi3g29948@pcp02138704pcs.reston01.va.comcast.net>

> I wonder what the purpose of having a pure-Python implementation of
> strptime is, if you have to rely on strftime. Is this for Windows only?

Isn't the problem that strftime() is in the C standard but strptime()
is not?  So strptime() isn't always provided but we can count on
strftime()?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jun 18 12:50:09 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Jun 2002 07:50:09 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects bufferobject.c,2.17,2.18 classobject.c,2.156,2.157 descrobject.c,2.26,2.27 funcobject.c,2.54,2.55 sliceobject.c,2.14,2.15
In-Reply-To: Your message of "18 Jun 2002 12:23:11 BST."
 <2m3cvkde5c.fsf@starship.python.net>
References: <E17Ixsk-0005Ic-00@usw-pr-cvs1.sourceforge.net>
 <2m3cvkde5c.fsf@starship.python.net>
Message-ID: <200206181150.g5IBo9e30012@pcp02138704pcs.reston01.va.comcast.net>

> > Index: classobject.c
> [...]
> > + static PyObject *
> > + new_class(PyObject* unused, PyObject* args)
[...]
> 
> What's this for?  It's not referred to anywhere, so I'm getting
> warnings about it.  I'd just hack it out, but it only just got
> added...

Looks like an experiment by Oren Tirosh that didn't get nuked.  I
think you can safely lose it.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jun 18 12:56:15 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Jun 2002 07:56:15 -0400
Subject: [Python-Dev] Python strptime
In-Reply-To: Your message of "Tue, 18 Jun 2002 09:27:13 +0200."
 <m17KDOT-000H5jC@artcom0.artcom-gmbh.de>
References: <m17KDOT-000H5jC@artcom0.artcom-gmbh.de>
Message-ID: <200206181156.g5IBuFI30101@pcp02138704pcs.reston01.va.comcast.net>

> Currently month and weekday names are constants hardcoded in
> english in calendar.py.

No they're not.  You're a year behind. ;-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jun 18 13:00:26 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Jun 2002 08:00:26 -0400
Subject: [Python-Dev] Python strptime
In-Reply-To: Your message of "Mon, 17 Jun 2002 22:39:37 PDT."
 <Pine.SOL.4.44.0206172232540.15718-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0206172232540.15718-100000@death.OCF.Berkeley.EDU>
Message-ID: <200206181200.g5IC0Qu30146@pcp02138704pcs.reston01.va.comcast.net>

> Perhaps this info is important enough to not be in time but in locale?

Perhaps, if Martin von Loewis agrees.

> I could rework my code that figures out the date info to fit more into
> locale.  Maybe have some constants (like A_WEEKDAY, F_WEEKDAY, etc.) that
> could be passed to a function that would return a list of the requested
> names?  Or could stay with the way I currently have it and just have a
> class that stores all of that info and has named attributes to return the
> info?

I haven't seen your code and have no time to review it until I'm back
from vacation, so can't comment on this bit. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido@python.org  Tue Jun 18 13:01:56 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Jun 2002 08:01:56 -0400
Subject: [Python-Dev] large file support
In-Reply-To: Your message of "18 Jun 2002 07:36:07 +0200."
 <m3660hqhbs.fsf@mira.informatik.hu-berlin.de>
References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <m3660hmvfo.fsf@mira.informatik.hu-berlin.de> <15630.45978.675730.729531@slothrop.zope.com>
 <m3660hqhbs.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206181201.g5IC1uM30164@pcp02138704pcs.reston01.va.comcast.net>

> In that case, I recommend to backport the machinery that enables LFS
> from 2.2. If this machinery fails to detect LFS support on a system,
> there is a good chance that your processing of EOVERFLOW fails on that
> system as well.

That sounds like a good plan, though painful (much configure.in hacking,
and didn't we switch to a newer version of autoconf?).

Can you help?  2.1 is still a popular release, and large files will
become more and more common as it grows older...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jun 18 13:05:35 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Jun 2002 08:05:35 -0400
Subject: [Python-Dev] large file support
In-Reply-To: Your message of "Mon, 17 Jun 2002 23:32:41 MDT."
 <20020617233241.E24702@tummy.com>
References: <15630.14073.764208.284613@slothrop.zope.com> <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net> <20020617210707.C24702@tummy.com> <200206180325.g5I3PYj28659@pcp02138704pcs.reston01.va.comcast.net>
 <20020617233241.E24702@tummy.com>
Message-ID: <200206181205.g5IC5aE30187@pcp02138704pcs.reston01.va.comcast.net>

> I would be shocked if POSIX allowed a non-root user to probe file
> entries under a root/700 directory...

Exactly.  If a program is written to use access(), and subsequently
that program is used in a setuid(root) situation, access() will say
you can't access the file, but exists() will say it exists.  So
access() cannot be used to emulate exists() -- they serve different
purposes, and can return different results.
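
To make that concrete, a minimal sketch (the path below is made up):

    import os

    # Inside a setuid-root program: os.access() checks against the *real*
    # (non-root) uid/gid, while os.path.exists() stats the file with the
    # effective uid (root), so the two can disagree.
    path = "/root/private/data"        # hypothetical file under a root/700 dir
    print(os.access(path, os.F_OK))    # False for the real, unprivileged user
    print(os.path.exists(path))        # True, because the effective uid is root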

> What a paradox -- when I submitted the patch to add F_OK, you said that
> exists() did the same thing.  ;-)

Given the widespread misunderstanding of what access() does, anything
that makes using access() easier is a mistake IMO.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Oleg Broytmann <phd@phd.pp.ru>  Tue Jun 18 15:51:27 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Tue, 18 Jun 2002 18:51:27 +0400
Subject: [Python-Dev] unicode() and its error argument
In-Reply-To: <20020615190441.E12705@phd.pp.ru>; from phd@phd.pp.ru on Sat, Jun 15, 2002 at 07:04:41PM +0400
References: <15627.21613.94336.985634@12-248-41-177.client.attbi.com> <20020615185842.D12705@phd.pp.ru> <200206151505.g5FF5Cr16468@pcp02138704pcs.reston01.va.comcast.net> <20020615190441.E12705@phd.pp.ru>
Message-ID: <20020618185127.R17532@phd.pp.ru>

Hello!

On Sat, Jun 15, 2002 at 07:04:41PM +0400, Oleg Broytmann wrote:
> On Sat, Jun 15, 2002 at 11:05:12AM -0400, Guido van Rossum wrote:
> > >    I got the error very often (but I use encoding conversion much more
> > > often than you). First time I saw it I was very surprized that neither
> > > "ignore" nor "replace" can eliminate the error.
> > 
> > Got an example?
> 
>    Not right now... I'll send it when I get one.

   Sorry for the false alarm. It was my fault. I used to write

s = unicode(s, "cp1251").encode("koi8-r", "replace")

   where I need

s = unicode(s, "cp1251", "replace").encode("koi8-r", "replace")
                         ^^^^^^^^^

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From jafo@tummy.com  Tue Jun 18 15:53:08 2002
From: jafo@tummy.com (Sean Reifschneider)
Date: Tue, 18 Jun 2002 08:53:08 -0600
Subject: [Python-Dev] large file support
In-Reply-To: <200206181205.g5IC5aE30187@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Tue, Jun 18, 2002 at 08:05:35AM -0400
References: <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net> <15630.14945.184194.434432@slothrop.zope.com> <200206172008.g5HK8QA11745@pcp02138704pcs.reston01.va.comcast.net> <15630.18023.298708.670795@slothrop.zope.com> <20020617205002.A13413@tummy.com> <200206180302.g5I32gk28518@pcp02138704pcs.reston01.va.comcast.net> <20020617210707.C24702@tummy.com> <200206180325.g5I3PYj28659@pcp02138704pcs.reston01.va.comcast.net> <20020617233241.E24702@tummy.com> <200206181205.g5IC5aE30187@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020618085308.I24702@tummy.com>

On Tue, Jun 18, 2002 at 08:05:35AM -0400, Guido van Rossum wrote:
>Given the widespread misunderstanding of what access() does, anything
>that makes using access() easier is a mistake IMO.

I obviously need to re-read my POSIX reference.  I've submitted a docstring
and library documentation change for os.access() which should make it clear
what the issues are...

Sean
-- 
 You know you're in Canada when:  You see a flyer advertising a polka-fest
 at the curling rink.
Sean Reifschneider, Inimitably Superfluous <jafo@tummy.com>
tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python



From skip@pobox.com  Tue Jun 18 16:21:07 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 10:21:07 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <200206171530.g5HFU9X09701@pcp02138704pcs.reston01.va.comcast.net>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
 <m3wut1eb9q.fsf@mira.informatik.h!
 u-berlin.de>
 <15629.62080.478763.190468@slothrop.zope.com>
 <15629.64994.385213.97041@12-248-41-177.client.attbi.com>
 <200206171530.g5HFU9X09701@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15631.20451.97640.63785@localhost.localdomain>

    Guido> [Proposal to use SCons]
    Guido> Let's not tie ourselves to SCons before it's a lot more mature.

I wasn't proposing that, at least not for the short-term.  I was suggesting
that distutils be left as is, and a SConstruct file be delivered in
.../Modules, to be used manually by developers to update module .so files.

Skip




From skip@pobox.com  Tue Jun 18 16:31:46 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 10:31:46 -0500
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15630.518.652328.859842@slothrop.zope.com>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
 <m3wut1eb9q.fsf@mira.informatik.hu-berlin.de>
 <15629.62080.478763.190468@slothrop.zope.com>
 <15629.64994.385213.97041@12-248-41-177.client.attbi.com>
 <15630.518.652328.859842@slothrop.zope.com>
Message-ID: <15631.21090.114087.860848@localhost.localdomain>

    SM> Instead, I see a potentially different approach.  Write an scons
    SM> build file (typically named SConstruct) and deliver that in the
    SM> Modules directory.

    Jeremy> I don't care much about the Modules directory actually.  I want
    Jeremy> this for third-party extensions that use distutils for
    Jeremy> distribution, particularly for my own third-party extensions
    Jeremy> :-).

As I think has been hashed out here recently, there are two functions that
need to be addressed.  Distribution/installation of modules is fine with
distutils as it currently sits.

    Jeremy> It sounds like you're proposing to drop distutils in favor of
    Jeremy> SCons, but not saying so explicitly.  Is that right?  

No.  Here, I'll put it in writing:  I am explicitly not suggesting that
distutils be dropped.

I suggested that a SConstruct file be added to the modules directory to be
used by people who need to do more than install modules.  That's it.

    Jeremy> If so, we'd need a stronger case for dumping distutils than
    Jeremy> automatic dependency tracking.  If that isn't right, I don't
    Jeremy> understand how SCons and distutils meet in the middle.  Would
    Jeremy> extension writers need to learn distutils and SCons?

No.  I'm only suggesting that a SConstruct file be added to the Modules
directory.  I don't want it tied into the build process, at least for the
time being.  As Guido indicated, scons is still in its infancy.

Skip



From jeremy@zope.com  Tue Jun 18 16:36:01 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Tue, 18 Jun 2002 11:36:01 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: <15631.21090.114087.860848@localhost.localdomain>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com>
 <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de>
 <15624.62928.845160.407762@12-248-41-177.client.attbi.com>
 <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de>
 <15625.2485.994408.888814@12-248-41-177.client.attbi.com>
 <m3660myggx.fsf@mira.informatik.hu-berlin.de>
 <15625.16032.161304.357298@12-248-41-177.client.attbi.com>
 <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net>
 <20020614074458.GA31022@strakt.com>
 <3D09A388.8080107@lemburg.com>
 <15626.11408.660388.360296@12-248-41-177.client.attbi.com>
 <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net>
 <15626.18460.685973.605098@12-248-41-177.client.attbi.com>
 <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net>
 <15626.19179.464237.382313@12-248-41-177.client.attbi.com>
 <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net>
 <15626.20004.302739.140783@12-248-41-177.client.attbi.com>
 <m3wut1eb9q.fsf@mira.informatik.hu-berlin.de>
 <15629.62080.478763.190468@slothrop.zope.com>
 <15629.64994.385213.97041@12-248-41-177.client.attbi.com>
 <15630.518.652328.859842@slothrop.zope.com>
 <15631.21090.114087.860848@localhost.localdomain>
Message-ID: <15631.21345.360104.903661@slothrop.zope.com>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

  SM> No.  I'm only suggesting that a SConstruct file be added to the
  SM> Modules directory.  I don't want it tied into the build process,
  SM> at least for the time being.  As Guido indicated, scons is still
  SM> in its infancy.

Oh!  That sounds fine with me.

Jeremy




From gcordova@hebmex.com  Tue Jun 18 16:07:45 2002
From: gcordova@hebmex.com (Gustavo Cordova)
Date: Tue, 18 Jun 2002 10:07:45 -0500
Subject: [Python-Dev] Playstation 2 and GameCube ports
Message-ID: <F7DB8D13DB61D511B6FF00B0D0F0623301311E2F@mail.hebmex.com>

> 
> > That would certainly get my son's attention and might even 
> > get him started in programming. I wouldn't mind seeing your
> > efforts written up in a PEP.
> > What exactly can you accomplish with Python on one of these boxes?
> 
> Don't you need a (costly) developers license in order to use this?
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 

Nope, just hack away at it in PS2-Linux, quite nice; you can
distribute your own games.

Now, if there's a PyGame and PyGL for PS2, then I think
that some very cool hacks and demos are gonna start appearing
for the piss-two in a little time. :-)

-gus



From skip@pobox.com  Tue Jun 18 16:44:34 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 10:44:34 -0500
Subject: [Python-Dev] large file support
In-Reply-To: <m3660hmvfo.fsf@mira.informatik.hu-berlin.de>
References: <15630.14073.764208.284613@slothrop.zope.com>
 <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>
 <15630.14945.184194.434432@slothrop.zope.com>
 <m3660hmvfo.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15631.21858.140029.311208@localhost.localdomain>

    >> If you've got a Python that doesn't have large file support and you
    >> try os.path.exists() on a large file, it will return false.  This is
    >> really bad!

    Martin> I believe this is a pilot error. On a system that supports large
    Martin> files, it is the administrator's job to make sure the Python
    Martin> installation has large file support enabled, otherwise, strange
    Martin> things may happen.

What about a networked environment?  If machine A without large file support
mounts an NFS directory from machine B that does support large files, what
should a program running on A see if it attempts to stat a large file?
Sounds like the EOVERFLOW thing would come in handy here.

Skip



From skip@pobox.com  Tue Jun 18 16:47:47 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 10:47:47 -0500
Subject: [Python-Dev] Python strptime
In-Reply-To: <Pine.SOL.4.44.0206171504530.8504-100000@hailstorm.OCF.Berkeley.EDU>
References: <m3wusxlgo8.fsf@mira.informatik.hu-berlin.de>
 <Pine.SOL.4.44.0206171504530.8504-100000@hailstorm.OCF.Berkeley.EDU>
Message-ID: <15631.22051.656345.152081@localhost.localdomain>

Brett,

Have you looked at calendar.py?  It already does locale-specific weekday and
month names.  Consolidating that code in a single place seems like it would
be a good idea, no?

Skip




From guido@python.org  Tue Jun 18 16:49:02 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Jun 2002 11:49:02 -0400
Subject: [Python-Dev] addressing distutils inability to track file dependencies
In-Reply-To: Your message of "Tue, 18 Jun 2002 10:21:07 CDT."
 <15631.20451.97640.63785@localhost.localdomain>
References: <15624.56080.793620.970381@12-248-41-177.client.attbi.com> <m34rg7ynrf.fsf@mira.informatik.hu-berlin.de> <15624.62928.845160.407762@12-248-41-177.client.attbi.com> <m3k7p2yjvk.fsf@mira.informatik.hu-berlin.de> <15625.2485.994408.888814@12-248-41-177.client.attbi.com> <m3660myggx.fsf@mira.informatik.hu-berlin.de> <15625.16032.161304.357298@12-248-41-177.client.attbi.com> <200206140205.g5E250Z27438@pcp02138704pcs.reston01.va.comcast.net> <20020614074458.GA31022@strakt.com> <3D09A388.8080107@lemburg.com> <15626.11408.660388.360296@12-248-41-177.client.attbi.com> <200206141906.g5EJ6o809890@pcp02138704pcs.reston01.va.comcast.net> <15626.18460.685973.605098@12-248-41-177.client.attbi.com> <200206141954.g5EJstj10699@pcp02138704pcs.reston01.va.comcast.net> <15626.19179.464237.382313@12-248-41-177.client.attbi.com> <200206142007.g5EK72c10830@pcp02138704pcs.reston01.va.comcast.net> <15626.20004.302739.140783@12-248-41-177.client.attbi.com> <m3wut1eb9q.fsf@mira.informatik.h!
 ! u-berlin.de> <15629.62080.478763.190468@slothrop.zope.com> <15629.64994.385213.97041@12-248-41-177.client.attbi.com> <200206171530.g5HFU9X09701@pcp02138704pcs.reston01.va.comcast.net>
 <15631.20451.97640.63785@localhost.localdomain>
Message-ID: <200206181549.g5IFn2N01924@odiug.zope.com>

>     Guido> Let's not tie ourselves to SCons before it's a lot more mature.
> 
> I wasn't proposing that, at least not for the short-term.  I was suggesting
> that distutils be left as is, and a SConstruct file be delivered in
> .../Modules, to be used manually by developers to update module .so files.

I don't object to that, but it wouldn't do me any good.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip@pobox.com  Tue Jun 18 16:39:54 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 10:39:54 -0500
Subject: [Python-Dev] Python strptime
In-Reply-To: <200206172051.g5HKpYn12103@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.SOL.4.44.0206171314510.19039-100000@death.OCF.Berkeley.EDU>
 <200206172051.g5HKpYn12103@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15631.21578.51007.71767@localhost.localdomain>

    Guido> Can you submit (to the same patch item) a patch for timemodule.c
    Guido> that adds a callout to your Python strptime code when
    Guido> HAVE_STRPTIME is undefined?

I thought the preferred way to do this would be to implement a Lib/time.py
module that includes Brett's strptime() function, move Modules/timemodule.c
to Modules/_timemodule.c and at the end of Lib/time.py import the symbols
from _time.
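
Roughly, something along these lines (module names as proposed; the body is
only a sketch):

    # Lib/time.py (sketch): pure-Python pieces live here, then the C
    # extension's symbols are layered on top.

    def strptime(string, format='%a %b %d %H:%M:%S %Y'):
        """Pure-Python fallback; Brett's implementation would go here."""
        raise NotImplementedError

    # Re-export everything from the C extension.  On platforms whose C
    # library provides strptime(), the C version replaces the fallback
    # defined above.
    from _time import *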

Skip




From aahz@pythoncraft.com  Tue Jun 18 17:47:35 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 18 Jun 2002 12:47:35 -0400
Subject: [Python-Dev] Quota on sf.net
In-Reply-To: <Pine.LNX.4.44.0206141344300.30919-100000@impatience.valueclick.com>
References: <j4y9dn12vo.fsf@informatik.hu-berlin.de> <Pine.LNX.4.44.0206141344300.30919-100000@impatience.valueclick.com>
Message-ID: <20020618164735.GB5681@panix.com>

On Fri, Jun 14, 2002, Ask Bjoern Hansen wrote:
> On 10 Jun 2002, Martin v. Löwis wrote:
>> 
>> My recommendation would be to disable the script, and remove the
>> snapshots, perhaps leaving a page that anybody who wants the snapshots
>> should ask at python-dev to re-enable them.
> 
> feel free to refer people to;
> 
> http://cvs.perl.org/snapshots/python/
> 
> I'll keep about half a week's worth of 6-hourly snapshots there, like 
> we do for parrot at http://cvs.perl.org/snapshots/parrot/

Thanks!  I've just replaced the link in the Dev Guide with your link, so
the SF snapshots can be blown away any time.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From martin@v.loewis.de  Tue Jun 18 17:56:17 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 18 Jun 2002 18:56:17 +0200
Subject: [Python-Dev] large file support
In-Reply-To: <200206181201.g5IC1uM30164@pcp02138704pcs.reston01.va.comcast.net>
References: <15630.14073.764208.284613@slothrop.zope.com>
 <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>
 <15630.14945.184194.434432@slothrop.zope.com>
 <m3660hmvfo.fsf@mira.informatik.hu-berlin.de>
 <15630.45978.675730.729531@slothrop.zope.com>
 <m3660hqhbs.fsf@mira.informatik.hu-berlin.de>
 <200206181201.g5IC1uM30164@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3bsa8zfta.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Can you help?  2.1 is still a popular release, and large files will
> become more and more common as it grows older...

I can work out a patch, but that may take some time.

Regards,
Martin



From martin@v.loewis.de  Tue Jun 18 18:04:05 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 18 Jun 2002 19:04:05 +0200
Subject: [Python-Dev] large file support
In-Reply-To: <15631.21858.140029.311208@localhost.localdomain>
References: <15630.14073.764208.284613@slothrop.zope.com>
 <200206171931.g5HJVaC11429@pcp02138704pcs.reston01.va.comcast.net>
 <15630.14945.184194.434432@slothrop.zope.com>
 <m3660hmvfo.fsf@mira.informatik.hu-berlin.de>
 <15631.21858.140029.311208@localhost.localdomain>
Message-ID: <m37kkwzfga.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> What about a networked environment?  If machine A without large file support
> mounts an NFS directory from machine B that does support large files, what
> should a program running on A see if it attempts to stat a large file?

I would have to read the specs to answer this question correctly, but
I believe the answer would go like this:

case 1: Machine A only supports NFSv2, which does not support large files.
  When machine A accesses a large file on machine B (through the NFS
  GETATTR operation), it will see a truncated file. Notice that the exact
  behaviour depends on the NFSv2 implementation on machine B.

case 2: Machine A supports NFSv3, and the client NFS implementation
  correctly recognizes the large file. Now, you say "A has no large
  file support". That could either mean that the syscalls don't
  support that, or that the C library doesn't support that. If the
  kernel does not support it, it may be that it does not define
  EOVERFLOW, either. Most likely, you will again see the truncated
  value.

> Sounds like the EOVERFLOW thing would come in handy here.

It's not our choice whether the operating system reports EOVERFLOW, or
a truncated file. My guess is that you likely see a truncated file,
but you would need to specify a precise combination of (client C lib,
client OS, wire NFS version, server OS) to find out what really
happens. 

My guess is that if the system is not aware of large files, it likely
won't work "correctly" when it sees one, with Python having no way to
influence the outcome.

Regards,
Martin



From kbutler@campuspipeline.com  Tue Jun 18 18:20:59 2002
From: kbutler@campuspipeline.com (Kevin Butler)
Date: Tue, 18 Jun 2002 11:20:59 -0600
Subject: [Python-Dev] popen behavior
Message-ID: <3D0F6BFB.5080602@campuspipeline.com>

I've done some work implementing popen* for Jython, and had a couple of questions:

- Should we maintain the os.popen & popen2.popen dual exposure with their 
different argument & return value orders?

The 'os' exposure is newer, so I assume it is preferred.  The calls follow 
these patterns:
	stdin, stdout, stderr = os.popen*( command, mode, bufsize )
	stdout, stdin, stderr = popen2.popen*( command, bufsize, mode )

- Should we maintain the different behavior for lists of arguments vs strings? 
(it does not appear to be documented)

That is, the command can be either a string or a list of strings.  If it is a 
list of strings, it is executed as a new process without a shell. If it is a 
string, CPython's popen2 module attempts to execute it as a shell command-line 
as follows:
         if isinstance(cmd, types.StringTypes):
             cmd = ['/bin/sh', '-c', cmd]
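
To illustrate the first point above, a small sketch (POSIX; assumes 'cat' is
available) showing that the two-pipe variants return their streams in
opposite orders:

    import os, popen2

    # os.popen2() returns (child_stdin, child_stdout) ...
    w, r = os.popen2('cat')
    w.write('hello\n')
    w.close()
    print(r.read())                  # 'hello\n'

    # ... while popen2.popen2() returns (child_stdout, child_stdin).
    r2, w2 = popen2.popen2('cat')
    w2.write('hello\n')
    w2.close()
    print(r2.read())                 # 'hello\n'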

Thanks

kb




From edcjones@erols.com  Tue Jun 18 19:03:45 2002
From: edcjones@erols.com (Edward C. Jones)
Date: Tue, 18 Jun 2002 14:03:45 -0400
Subject: [Python-Dev] MultiDict / Table, suggestion for a new module
Message-ID: <3D0F7601.1040003@erols.com>

I have written a module called MultiDict.py which can be found at 
http://members.tripod.com/~edcjones/MultiDict.py . It contains two 
classes MultiDict and Table.

MultiDict is like a dictionary except that each key can occur more than 
once. It is like the multimap in the C++ Standard Template Library 
except that MultiDicts are hashed rather than sorted.

Table views a 2 dimensional nested list as a "relation" (a set of 
n-tuples). I use it for simple one table databases where the full 
panoply of SQL is not needed.
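
For readers who haven't used a multimap, a rough sketch of the kind of
MultiDict interface I have in mind (the real MultiDict.py at the URL above
may differ in detail):

    class MultiDict:
        """Dictionary-like container in which a key may occur more than once."""
        def __init__(self):
            self._data = {}          # key -> list of values

        def add(self, key, value):
            self._data.setdefault(key, []).append(value)

        def get_all(self, key):
            return list(self._data.get(key, []))

    d = MultiDict()
    d.add('spam', 1)
    d.add('spam', 2)
    print(d.get_all('spam'))         # [1, 2]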

If there is interest in this, I will write a PEP and some documentation.

Thanks,
Edward C. Jones






From bac@OCF.Berkeley.EDU  Tue Jun 18 19:20:00 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Tue, 18 Jun 2002 11:20:00 -0700 (PDT)
Subject: [Python-Dev] Python strptime
In-Reply-To: <200206181144.g5IBi3g29948@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.SOL.4.44.0206181114510.6932-100000@death.OCF.Berkeley.EDU>

On Tue, 18 Jun 2002, Guido van Rossum wrote:

> > I wonder what the purpose of having a pure-Python implementation of
> > strptime is, if you have to rely on strftime. Is this for Windows only?
>
> Isn't the problem that strftime() is in the C standard but strptime()
> is not?  So strptime() isn't always provided but we can count on
> strftime()?
>

Exactly.  For some reason ANSI decided to go to the trouble of requiring
strftime(), and thus all of the locale info for it, but not strptime()
nor a standard way to expose that locale info for the programmer to use.

> --Guido van Rossum (home page: http://www.python.org/~guido/)

-Brett C.




From David Abrahams" <david.abrahams@rcn.com  Tue Jun 18 19:21:23 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Tue, 18 Jun 2002 14:21:23 -0400
Subject: [Python-Dev] Slicing
Message-ID: <05cd01c216f5$00740fc0$6601a8c0@boostconsulting.com>

I did a little experiment to see if I could use a uniform interface for
slicing (from C++):

>>> range(10)[slice(3,5)]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: sequence index must be integer
>>> class Y(object):
...     def __getslice__(self, a, b):
...             print "getslice",a,b
...
>>> y = Y()
>>> y[slice(3,5)]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsubscriptable object
>>> y[3:5]
getslice 3 5

This seems to indicate that I can't, in general, pass a slice object to
PyObject_GetItem in order to do slicing.** Correct?

So I went looking around for alternatives to PyObject_GetItem. I found
PySequence_GetSlice, but that takes int parameters, and AFAIK there's no
rule saying you can't slice on strings, for example.

Further experiments revealed:

>>> y['hi':'there']
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsubscriptable object
>>> class X(object):
...     def __getitem__(self, x):
...             print 'getitem',x
...
>>> X()['hi':'there']
getitem slice('hi', 'there', None)

So I /can/ slice on strings, but only through __getitem__(). And...

>>> class Z(Y):
...     def __getitem__(self, x):
...             print 'getitem',x
...
>>> Z()[3:5]
getslice 3 5
>>> Z()['3':5]
getitem slice('3', 5, None)

So Python is doing some dispatching internally based on the types of the
slice elements, but:

>>> class subint(int): pass
...
>>> subint()
0
>>> Z[subint():5]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsubscriptable object

So it's looking at the concrete type of the slice elements. I'm not sure I
actually understand how this one fails.

I want to make a generalized getslice function in C which can operate on a
triple of arbitrary objects. Here's the python version I came up with:

    def getslice(x,start,finish):
        if (type(start) is type(finish) is int
            and hasattr(type(x), '__getslice__')):
            return x.__getslice__(start, finish)
        else:
            return x.__getitem__(slice(start,finish))

Have I got the logic right here?

Thanks,
Dave


**it seems like a good idea to make it work in the Python core, by
recognizing slice objects and dispatching the elements to __getslice__ if
they are ints and if one is defined. Have I overlooked something?

+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+




From bac@OCF.Berkeley.EDU  Tue Jun 18 19:22:44 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Tue, 18 Jun 2002 11:22:44 -0700 (PDT)
Subject: [Python-Dev] Python strptime
In-Reply-To: <200206181156.g5IBuFI30101@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.SOL.4.44.0206181120120.6932-100000@death.OCF.Berkeley.EDU>

On Tue, 18 Jun 2002, Guido van Rossum wrote:

> > Currently month and weekday names are constants hardcoded in
> > english in calendar.py.
>
> No they're not.  You're a year behind. ;-)
>

Didn't realize that; undocumented feature.  I will change my code to use
calendar for getting the names of the days of the week and months.
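
Something like this, I assume (the names are built with strftime(), so they
track the current locale rather than being hardcoded English strings):

    import calendar

    print(calendar.day_name[0])      # e.g. 'Monday' under the C locale
    print(calendar.month_name[1])    # e.g. 'January'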

I still have my code, though, that figures out the format strings for
date, time, and date/time if Martin wants to use that in locale.

> --Guido van Rossum (home page: http://www.python.org/~guido/)
>

-Brett C.




From guido@python.org  Tue Jun 18 20:52:49 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Jun 2002 15:52:49 -0400
Subject: [Python-Dev] making dbmmodule still broken
Message-ID: <200206181952.g5IJqnL02042@odiug.zope.com>

On my 2yo Mandrake 8.1 (?) system, when I do "make" in the latest CVS
tree, I always get an error from building dbmmodule.c:

[guido@odiug linux]$ make
case $MAKEFLAGS in \
*-s*)  CC='gcc' LDSHARED='gcc -shared' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python -E ../setup.py -q build;; \
*)  CC='gcc' LDSHARED='gcc -shared' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python -E ../setup.py build;; \
esac
running build
running build_ext
building 'dbm' extension
gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -I. -I/home/guido/python/dist/src/./Include -I/usr/local/include -I/home/guido/python/dist/src/Include -I/home/guido/python/dist/src/linux -c /home/guido/python/dist/src/Modules/dbmmodule.c -o build/temp.linux-i686-2.3/dbmmodule.o
/home/guido/python/dist/src/Modules/dbmmodule.c:25: #error "No ndbm.h available!"
running build_scripts
[guido@odiug linux]$

There's an ndbm.h is in /usr/include/db1/ndbm.h

Skip, didn't you change something in this area recently?  I think it's
still busted. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Tue Jun 18 21:44:50 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 18 Jun 2002 22:44:50 +0200
Subject: [Python-Dev] popen behavior
In-Reply-To: <3D0F6BFB.5080602@campuspipeline.com>
References: <3D0F6BFB.5080602@campuspipeline.com>
Message-ID: <m3ofe871vh.fsf@mira.informatik.hu-berlin.de>

Kevin Butler <kbutler@campuspipeline.com> writes:

> - Should we maintain the os.popen & popen2.popen dual exposure with
> their different argument & return value orders?

Certainly. Any change to that will break existing applications.

> - Should we maintain the different behavior for lists of arguments vs
> strings? (it does not appear to be documented)
> 
> That is, the command can be either a string or a list of strings.  If
> it is a list of strings, it is executed as a new process without a
> shell. If it is a string, CPython's popen2 module attempts to execute
> it as a shell command-line as follows:
>          if isinstance(cmd, types.StringTypes):
>              cmd = ['/bin/sh', '-c', cmd]

If you propose to extend argument processing for one of the functions
so that passing the additional arguments in current releases produces
an exception - then adding this extension would be desirable if that
adds consistency.

Regards,
Martin




From barry@zope.com  Tue Jun 18 23:03:08 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 18 Jun 2002 18:03:08 -0400
Subject: [Python-Dev] making dbmmodule still broken
References: <200206181952.g5IJqnL02042@odiug.zope.com>
Message-ID: <15631.44572.417268.622961@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> On my 2yo Mandrake 8.1 (?) system, when I do "make" in the
    GvR> latest CVS tree, I always get an error from building
    GvR> dbmmodule.c:

I just tried building from scratch on my three systems using
"configure --with-pymalloc ; make test"

- RH6.1

    checking ndbm.h usability... no
    checking ndbm.h presence... no
    checking for ndbm.h... no
    checking gdbm/ndbm.h usability... yes
    checking gdbm/ndbm.h presence... yes
    checking for gdbm/ndbm.h... yes

    building 'dbm' extension
    gcc -g -Wall -Wstrict-prototypes -fPIC -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/dbmmodule.c -o build/temp.linux-i686-2.3/dbmmodule.o
    gcc -shared build/temp.linux-i686-2.3/dbmmodule.o -L/usr/local/lib -lndbm -o build/lib.linux-i686-2.3/dbm.so

    test_dbm succeeds

    Note that test_bsddb was skipped.  No attempt was even made to
    compile the bsddb extension.

- RH7.3

    checking ndbm.h usability... no
    checking ndbm.h presence... no
    checking for ndbm.h... no
    checking gdbm/ndbm.h usability... yes
    checking gdbm/ndbm.h presence... yes
    checking for gdbm/ndbm.h... yes

    building 'dbm' extension
    gcc -g -Wall -Wstrict-prototypes -fPIC -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/dbmmodule.c -o build/temp.linux-i686-2.3/dbmmodule.o
    gcc -shared build/temp.linux-i686-2.3/dbmmodule.o -L/usr/local/lib -lndbm -o build/lib.linux-i686-2.3/dbm.so

    test_dbm succeeds, as does test_bsddb

- MD8.1

    checking ndbm.h usability... no
    checking ndbm.h presence... no
    checking for ndbm.h... no
    checking gdbm/ndbm.h usability... no
    checking gdbm/ndbm.h presence... no
    checking for gdbm/ndbm.h... no

    building 'dbm' extension
    gcc -g -Wall -Wstrict-prototypes -fPIC -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/dbmmodule.c -o build/temp.linux-i686-2.3/dbmmodule.o
    /home/barry/projects/python/Modules/dbmmodule.c:25:2: #error "No ndbm.h available!"

    and test_dbm is skipped

    As with Guido, there is an ndbm.h in /usr/include/db1/ndbm.h

    Also, bsddbmodule seems to get build okay (i.e. no errors are
    reported), but test_bsddb gets skipped:

    building 'bsddb' extension
    gcc -g -Wall -Wstrict-prototypes -fPIC -DHAVE_DB_185_H=1 -I/usr/include/db3 -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/bsddbmodule.c -o build/temp.linux-i686-2.3/bsddbmodule.o
    gcc -shared build/temp.linux-i686-2.3/bsddbmodule.o -L/usr/local/BerkeleyDB.3.3/lib -L/usr/local/lib -ldb-3.3 -o build/lib.linux-i686-2.3/bsddb.so
    [...]
    test_bsddb
    test test_bsddb skipped -- No module named bsddb

- I can't at the moment test MD8.2

-Barry



From niemeyer@conectiva.com  Tue Jun 18 23:14:42 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 18 Jun 2002 19:14:42 -0300
Subject: [Python-Dev] mkdev, major, st_rdev, etc
In-Reply-To: <20020615160831.A5440@ibook.distro.conectiva>
References: <20020615160831.A5440@ibook.distro.conectiva>
Message-ID: <20020618191442.A7425@ibook.distro.conectiva>

> After thinking for a while, and doing some research about these
> functions, I've changed my mind about the best way to implement
> the needed functionality for tarfile. Maybe including major,
> minor, and makedev is the best solution. Some of the issues I'm
> considering:
[...]
> A patch providing these functions is available at
> http://www.python.org/sf/569139

Can someone please review it and let me know what I have to change
to get it in, or commit if everything is ok? I'd like to give Lars
some feedback about it, so that he can finish his work on tarfile.py.

Thank you!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From kbutler@campuspipeline.com  Wed Jun 19 01:37:06 2002
From: kbutler@campuspipeline.com (Kevin Butler)
Date: Tue, 18 Jun 2002 18:37:06 -0600
Subject: [Python-Dev] popen behavior
References: <3D0F6BFB.5080602@campuspipeline.com>
 <m3ofe871vh.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D0FD232.8070207@campuspipeline.com>

Martin v. Loewis wrote:
> Kevin Butler <kbutler@campuspipeline.com> writes:
> 
>>- Should we maintain the os.popen & popen2.popen dual exposure with
>>their different argument & return value orders?
> 
> Certainly. Any change to that will break existing applications.

Actually, it would just "fail to enable existing applications that currently 
don't work on Jython".  :-)  But if one or the other form is or will be 
deprecated in CPython, I probably wouldn't expose it in Jython at this point 
(TMTOWTDI, etc.)

>>- Should we maintain the different behavior for lists of arguments vs
>>strings? (it does not appear to be documented)
> 
> If you propose to extend argument processing for one of the functions
> so that passing the additional arguments in current releases produces
> an exception - then adding this extension would be desirable if that
> adds consistency.

I'm not sure what you meant here.

The inconsistency is as follows (Python output below):

On both (all?) platforms:

	popen*( "cmd arg arg" ) executes cmd in a subshell
	popen( ["cmd", "arg", "arg"] ) fails

In win32:
	popen[234]( ["cmd", "arg", "arg"] ) fails

In posix:
	popen[234]( ["cmd", "arg", "arg"] ) runs cmd w/o a subshell

I consider the posix behavior more useful (especially on Java where we can't 
always determine a useful shell for a platform), but since it isn't documented
and isn't supported on win32, I wasn't sure whether I should support it.

I think it would also be useful and more consistent to allow popen() to accept 
an args list, which is currently not supported on either platform.  Should I 
allow this for Java?

Should I spend time to make the Win32 functions accept the args lists?

kb

Python 2.1.1 (#1, Aug 25 2001, 04:19:08)
[GCC 3.0.1] on sunos5
Type "copyright", "credits" or "license" for more information.
 >>> from os import popen, popen2
 >>> out = popen( "echo $USER" )
 >>> out.read()
'kbutler\n'
 >>> out = popen( ["echo", "$USER"] )
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: popen() argument 1 must be string, not list
 >>> in_, out = popen2( ["echo", "$USER"] )
 >>> out.read()
'$USER\n'
 >>> in_, out = popen2( "echo $USER" )
 >>> out.read()
'kbutler\n'
 >>>

Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> from os import popen, popen2
 >>> out = popen( "echo %USERNAME%" )
 >>> out.read()
'kbutler\n'
 >>> out = popen( ["echo", "%USERNAME%"] )
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: popen() argument 1 must be string, not list
 >>> in_, out = popen2( ["echo", "%USERNAME%"] )
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: popen2() argument 1 must be string, not list
 >>> in_, out = popen2( "echo %USERNAME%" )
 >>> out.read()
'kbutler\n'
 >>>




From groups@crash.org  Wed Jun 19 02:00:52 2002
From: groups@crash.org (Jason L. Asbahr)
Date: Tue, 18 Jun 2002 20:00:52 -0500
Subject: [Python-Dev] Playstation 2 and GameCube ports
In-Reply-To: <200206180142.g5I1gbg13515@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <EIEFLCFECLLBKGPNJJIMKEPKIIAA.groups@crash.org>

Patrick K. O'Brien wrote:
> That would certainly get my son's attention and might even get him started
> in programming. I wouldn't mind seeing your efforts written up in a PEP.
> What exactly can you accomplish with Python on one of these boxes?

Guido wrote:
> Don't you need a (costly) developers license in order to use this?

One does in fact need a costly, though very attractive and sleek :-),
developer box from Sony to get the most out of Python on the PS2
as a professional game developer.

However, the hobbyist PS2/Linux upgrade kit for the retail PS2 unit
may be acquired for $200 and Python could be used on that system
as well.  Info at http://playstation2-linux.com

As for what you can accomplish with Python in gaming, check out
my papers at http://www.asbahr.com/papers.html   :-)

Cheers,

Jason




From paul@prescod.net  Wed Jun 19 02:14:50 2002
From: paul@prescod.net (Paul Prescod)
Date: Tue, 18 Jun 2002 18:14:50 -0700
Subject: [Python-Dev] Playstation 2 and GameCube ports
References: <EIEFLCFECLLBKGPNJJIMKEPKIIAA.groups@crash.org>
Message-ID: <3D0FDB0A.EC53656@prescod.net>

"Jason L. Asbahr" wrote:
> 
>...
> 
> However, the hobbyist PS2/Linux upgrade kit for the retail PS2 unit
> may be acquired for $200 and Python could be used on that system
> as well.  Info at http://playstation2-linux.com

What do you lose by going this route? Obviously if this was good enough
there would be no need for developer boxes nor (I'd guess) for a special
port of Python.

 Paul Prescod



From barry@zope.com  Wed Jun 19 03:34:17 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 18 Jun 2002 22:34:17 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
Message-ID: <15631.60841.28978.492291@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    SM> If you build the bsddb module on a Unix-like system (that is,
    SM> you use configure and setup.py to build the interpreter and it
    SM> attempts to build the bsddb module), please give the new patch
    SM> attached to

    SM>     http://python.org/sf/553108

Skip,

Apologies for taking so long to respond to this thread.  I'm still
digging out from my move.

Basically what you have in cvs works great, except for one small
necessary addition.  If you build Berkeley DB from source, it's going
to install it in something like /usr/local/BerkeleyDB.3.3 by default.
Why they chose such a bizarre location, I don't know.

The problem is that unless your sysadmin hacks ld.so.conf to add
/usr/local/BerkeleyDB.X.Y/lib onto your standard ld run path,
bsddbmodule.so won't be linked in such a way that it can actually
resolve the symbols at run time.  I don't think it's reasonable to
require such system hacking to get the bsddb module to link properly,
and I think we can do better.

Here's a small patch to setup.py which should fix things in a portable
way, at least for *nix systems.  It sets the envar LD_RUN_PATH to the
location where it found the Berkeley library, but only if that envar
isn't already set.  I've tested this on RH7.3 and MD8.1, on both of
which I have a from-source install of BerkeleyDB 3.3.11.  Seems to
work well for me.

I'm still having build trouble on my RH6.1 system, but maybe it's just
too old to worry about (I /really/ need to upgrade one of these days
;).

-------------------- snip snip --------------------
building 'bsddb' extension
gcc -g -Wall -Wstrict-prototypes -fPIC -DHAVE_DB_185_H=1 -I/usr/local/BerkeleyDB.3.3/include -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/bsddbmodule.c -o build/temp.linux-i686-2.3/bsddbmodule.o
In file included from /home/barry/projects/python/Modules/bsddbmodule.c:25:
/usr/local/BerkeleyDB.3.3/include/db_185.h:171: parse error before `*'
/usr/local/BerkeleyDB.3.3/include/db_185.h:171: warning: type defaults to `int' in declaration of `__db185_open'
/usr/local/BerkeleyDB.3.3/include/db_185.h:171: warning: data definition has no type or storage class
/home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbhashobject':
/home/barry/projects/python/Modules/bsddbmodule.c:74: warning: assignment from incompatible pointer type
/home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbbtobject':
/home/barry/projects/python/Modules/bsddbmodule.c:124: warning: assignment from incompatible pointer type
/home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbrnobject':
/home/barry/projects/python/Modules/bsddbmodule.c:182: warning: assignment from incompatible pointer type
-------------------- snip snip --------------------

Sigh.

Anyway thanks!  You've improved the situation immensely.
-Barry

-------------------- snip snip --------------------
Index: setup.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/setup.py,v
retrieving revision 1.94
diff -u -r1.94 setup.py
--- setup.py	17 Jun 2002 17:55:30 -0000	1.94
+++ setup.py	19 Jun 2002 01:01:19 -0000
@@ -507,6 +507,13 @@
                             dblibs = [dblib]
                             raise found
         except found:
+            # A default source build puts Berkeley DB in something like
+            # /usr/local/Berkeley.3.3 and the lib dir under that isn't
+            # normally on LD_RUN_PATH, unless the sysadmin has hacked
+            # /etc/ld.so.conf.  Setting the envar should be portable across
+            # compilers and platforms.
+            if 'LD_RUN_PATH' not in os.environ:
+                os.environ['LD_RUN_PATH'] = dblib_dir
             if dbinc == 'db_185.h':
                 exts.append(Extension('bsddb', ['bsddbmodule.c'],
                                       library_dirs=[dblib_dir],




From barry@zope.com  Wed Jun 19 03:38:36 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 18 Jun 2002 22:38:36 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
Message-ID: <15631.61100.561824.480935@anthem.wooz.org>

>>>>> "OB" == Oleg Broytmann <phd@phd.pp.ru> writes:

    OB>    Can I have two different modules simultaneously? For
    OB> example, a module linked with db.1.85 plus a module linked
    OB> with db3.

I still think we may want to pull PyBSDDB into the standard distro, as
a way to provide BDB api's > 1.85.  The question is, what would this
new module be called?  I dislike "bsddb3" -- which I think PyBSDDB
itself uses -- because it links against BDB 4.0.

OTOH, PyBSDDB seems to be quite solid, so I think it's mature enough
to migrate into the Python distro.  I'm cc'ing pybsddb-users for
feedback.

-Barry



From skip@pobox.com  Wed Jun 19 03:03:57 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 21:03:57 -0500
Subject: [Python-Dev] Re: making dbmmodule still broken
Message-ID: <15631.59021.96743.453588@localhost.localdomain>

I wrote:

    ... here's what I propose (and what changes I just made locally):

I forgot about one other change.  In dbmmodule.c I #include <db.h> if
HAVE_BERKDB_H is defined:

    #if defined(HAVE_NDBM_H)
    #include <ndbm.h>
    #if defined(PYOS_OS2) && !defined(PYCC_GCC)
    static char *which_dbm = "ndbm";
    #else
    static char *which_dbm = "GNU gdbm";  /* EMX port of GDBM */
    #endif
    #elif defined(HAVE_GDBM_NDBM_H)
    #include <gdbm/ndbm.h>
    static char *which_dbm = "GNU gdbm";
    #elif defined(HAVE_BERKDB_H)
    #include <db.h>
    static char *which_dbm = "Berkeley DB";
    #else
    #error "No ndbm.h available!"
    #endif

Skip



From skip@pobox.com  Wed Jun 19 02:58:47 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 20:58:47 -0500
Subject: [Python-Dev] Re: making dbmmodule still broken
Message-ID: <15631.58711.213506.701945@localhost.localdomain>

    Guido> On my 2yo Mandrake 8.1 (?) system, when I do "make" in the latest
    Guido> CVS tree, I always get an error from building dbmmodule.c:

Okay, in between ringing up credit card charges I took another look at
building the dbm module.  I can't cvs diff at the moment, but here's what I
propose (and what changes I just made locally):

    * Remove the ndbm.h and gdbm/ndbm.h checks from configure.in and run
      autoconf.

    * Change the block of code in setup.py that checks for dbm libraries and
      include files to

        if platform not in ['cygwin']:
            if (self.compiler.find_library_file(lib_dirs, 'ndbm')
                and find_file("ndbm.h", inc_dirs, []) is not None):
                exts.append( Extension('dbm', ['dbmmodule.c'],
                                       define_macros=[('HAVE_NDBM_H',None)],
                                       libraries = ['ndbm'] ) )
            elif (self.compiler.find_library_file(lib_dirs, 'gdbm')
                  and find_file("gdbm/ndbm.h", inc_dirs, []) is not None):
                exts.append( Extension('dbm', ['dbmmodule.c'],
                                       define_macros=[('HAVE_GDBM_NDBM_H',None)],
                                       libraries = ['gdbm'] ) )
            elif db_incs is not None:
                exts.append( Extension('dbm', ['dbmmodule.c'],
                                       library_dirs=[dblib_dir],
                                       include_dirs=db_incs,
                                       define_macros=[('HAVE_BERKDB_H',None),
                                                      ('DB_DBM_HSEARCH',None)],
                                       libraries=dblibs))

      This does two things.  It removes the else clause which would never
      have worked (no corresponding include files would have been found).
      It also performs the two include file tests I removed from
      configure.in and defines the appropriate macros.

Building after making these changes I get gdbm.  If I mv gdbm/ndbm.h out of
the way or comment out the first elif branch I get Berkeley DB.  I don't
have an ndbm library on my system, so I can't exercise the first branch.

I think it would probably be a good idea to alert the person running make
to which library the module will be linked.  Anyone else agree?

Skip




From skip@pobox.com  Wed Jun 19 00:44:40 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 18 Jun 2002 18:44:40 -0500
Subject: [Python-Dev] Re: making dbmmodule still broken
In-Reply-To: <200206181952.g5IJqnL02042@odiug.zope.com>
References: <200206181952.g5IJqnL02042@odiug.zope.com>
Message-ID: <15631.50664.251930.976552@localhost.localdomain>

    Guido> On my 2yo Mandrake 8.1 (?) system, when I do "make" in the latest
    Guido> CVS tree, I always get an error from building dbmmodule.c:

    Guido> [guido@odiug linux]$ make
    Guido> case $MAKEFLAGS in \
    Guido> *-s*)  CC='gcc' LDSHARED='gcc -shared' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python -E ../setup.py -q build;; \
    Guido> *)  CC='gcc' LDSHARED='gcc -shared' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python -E ../setup.py build;; \
    Guido> esac
    Guido> running build
    Guido> running build_ext
    Guido> building 'dbm' extension
    Guido> gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -I. -I/home/guido/python/dist/src/./Include -I/usr/local/include -I/home/guido/python/dist/src/Include -I/home/guido/python/dist/src/linux -c /home/guido/python/dist/src/Modules/dbmmodule.c -o build/temp.linux-i686-2.3/dbmmodule.o
    Guido> /home/guido/python/dist/src/Modules/dbmmodule.c:25: #error "No ndbm.h available!"
    Guido> running build_scripts
    Guido> [guido@odiug linux]$

    Guido> There's an ndbm.h is in /usr/include/db1/ndbm.h

    Guido> Skip, didn't you change something in this area recently?  I think
    Guido> it's still busted. :-(

Hmmm...  Works on my Mandrake 8.1 system.  I have the db2-devel-2.4.14-5mdk
package installed, which provides /usr/lib/libndbm.{a,so}.

Note that you probably don't want to use /usr/include/db1/ndbm.h because you
will wind up using the broken Berkeley DB 1.85 hash file implementation.
One of the two major goals of the change I checked in recently was to
deprecate BerkeleyDB 1.85.  Do you not have an ndbm or gdbm implementation
on your system?

If you don't have an acceptable set of libraries/include files it shouldn't
try building the module at all.  It looks like the else: branch

            else:
                exts.append( Extension('dbm', ['dbmmodule.c']) )

should probably be removed.

I'll take another look at the problem on Wednesday.  I am offline at the
moment and can't "cvs up" (the modem here at the North Beach Inn shares a
phone line with the credit card machine and it's the dinner hour... :-)

Skip



From gerhard@bigfoot.de  Wed Jun 19 03:48:06 2002
From: gerhard@bigfoot.de (Gerhard Häring)
Date: Wed, 19 Jun 2002 04:48:06 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15631.60841.28978.492291@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org>
Message-ID: <20020619024806.GA7218@lilith.my-fqdn.de>

* Barry A. Warsaw <barry@zope.com> [2002-06-18 22:34 -0400]:
> The problem is that unless your sysadmin hacks ld.so.conf to add
> /usr/local/BerkeleyDB.X.Y/lib onto your standard ld run path,
> bsddbmodule.so won't be linked in such a way that it can actually
> resolve the symbols at run time.
> [...]
> os.environ['LD_RUN_PATH'] = dblib_dir

I may be missing something here, but AFAIC that's what the library_dirs
parameter in the Extension constructor of distutils is for. It basically
sets the runtime library path at compile time using the "-R" linker
option.

Gerhard
-- 
mail:   gerhard <at> bigfoot <dot> de       registered Linux user #64239
web:    http://www.cs.fhm.edu/~ifw00065/    OpenPGP public key id AD24C930
public key fingerprint: 3FCC 8700 3012 0A9E B0C9  3667 814B 9CAA AD24 C930
reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))



From barry@zope.com  Wed Jun 19 04:24:09 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 18 Jun 2002 23:24:09 -0400
Subject: [Python-Dev] Re: making dbmmodule still broken
References: <15631.58711.213506.701945@localhost.localdomain>
Message-ID: <15631.63833.440127.405556@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    SM> I think it would probably be a good idea to alert the person
    SM> running make what library the module will be linked with.
    SM> Anyone else agree?

+1.  The less guessing the builder has to do the better!
-Barry



From barry@zope.com  Wed Jun 19 04:27:28 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 18 Jun 2002 23:27:28 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <20020619024806.GA7218@lilith.my-fqdn.de>
Message-ID: <15631.64032.21870.910289@anthem.wooz.org>

>>>>> "GH" =3D=3D Gerhard H=E4ring <gerhard@bigfoot.de> writes:

    GH> * Barry A. Warsaw <barry@zope.com> [2002-06-18 22:34 -0400]:
    >> The problem is that unless your sysadmin hacks ld.so.conf to
    >> add /usr/local/BerkeleyDB.X.Y/lib onto your standard ld run
    >> path, bsddbmodule.so won't be linked in such a way that it can
    >> actually resolve the symbols at run time.  [...]
    >> os.environ['LD_RUN_PATH'] = dblib_dir

    GH> I may be missing something here, but AFAIC that's what the
    GH> library_dirs parameter in the Extension constructor of
    GH> distutils is for. It basically sets the runtime library path
    GH> at compile time using the "-R" linker option.

Possibly (Greg?), but without setting that envar (or doing a less
portable -Xlinker -Rblah trick) I'd end up with a
build/lib.linux-i686-2.3/bsddb_failed.so which, if you ran ldd over it,
would show no resolution for libdb-3.3.so.  Also, no -R or equivalent
option showed up in the compilation output.

-Barry



From barry@zope.com  Tue Jun 18 23:29:02 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue Jun 18 23:29:11 2002
Subject: [Python-Dev] PEP 292, Simpler String Substitutions

I'm so behind on my email, that the anticipated flamefest will surely
die down before I get around to reading it.  Yet still, here is a new
PEP. :)

-Barry

-------------------- snip snip --------------------
PEP: 292
Title: Simpler String Substitutions
Version: $Revision: 1.2 $
Last-Modified: $Date: 2002/06/19 02:54:22 $
Author: barry@zope.com (Barry A. Warsaw)
Status: Draft
Type: Standards Track
Created: 18-Jun-2002
Python-Version: 2.3
Post-History: 18-Jun-2002


Abstract

    This PEP describes a simpler string substitution feature, also
    known as string interpolation.  This PEP is "simpler" in two
    respects:

    1. Python's current string substitution feature (commonly known as
       %-substitutions) is complicated and error prone.  This PEP is
       simpler at the cost of less expressiveness.

    2. PEP 215 proposed an alternative string interpolation feature,
       introducing a new `$' string prefix.  PEP 292 is simpler than
       this because it involves no syntax changes and has much simpler
       rules for what substitutions can occur in the string.
       

Rationale

    Python currently supports a string substitution (a.k.a. string
    interpolation) syntax based on C's printf() % formatting
    character[1].  While quite rich, %-formatting codes are also quite
    error prone, even for experienced Python programmers.  A common
    mistake is to leave off the trailing format character, e.g. the
    `s' in "%(name)s".

    In addition, the rules for what can follow a % sign are fairly
    complex, while the usual application rarely needs such complexity.


A Simpler Proposal

    Here we propose the addition of a new string method, called .sub()
    which performs substitution of mapping values into a string with
    special substitution placeholders.  These placeholders are
    introduced with the $ character.  The following rules for
    $-placeholders apply:

    1. $$ is an escape; it is replaced with a single $

    2. $identifier names a substitution placeholder matching a mapping
       key of "identifier".  "identifier" must be a Python identifier
       as defined in [2].  The first non-identifier character after
       the $ character terminates this placeholder specification.

    3. ${identifier} is equivalent to $identifier and for clarity,
       this is the preferred form.  It is required when valid
       identifier characters follow the placeholder but are not part of
       the placeholder, e.g. "${noun}ification".

    No other characters have special meaning.

    The .sub() method takes an optional mapping (e.g. dictionary)
    where the keys match placeholders in the string, and the values
    are substituted for the placeholders.  For example:

	'${name} was born in ${country}'.sub({'name': 'Guido',
					      'country': 'the Netherlands'})

    returns

        'Guido was born in the Netherlands'

    The mapping argument is optional; if it is omitted then the
    mapping is taken from the locals and globals of the context in
    which the .sub() method is executed.  For example:

	def birth(self, name):
	    country = self.countryOfOrigin['name']
	    return '${name} was born in ${country}'

	birth('Guido')

    returns

	'Guido was born in the Netherlands'


Reference Implementation

    Here's a Python 2.2-based reference implementation.  Of course the
    real implementation would be in C, would not require a string
    subclass, and would not be modeled on the existing %-interpolation
    feature.

	import sys
	import re

	dre = re.compile(r'(\$\$)|\$([_a-z]\w*)|\$\{([_a-z]\w*)\}', re.I)
	EMPTYSTRING = ''

	class dstr(str):
	    def sub(self, mapping=None):
		# Default mapping is locals/globals of caller
		if mapping is None:
		    frame = sys._getframe(1)
		    mapping = frame.f_globals.copy()
		    mapping.update(frame.f_locals)
		# Escape %'s
		s = self.replace('%', '%%')
		# Convert $name and ${name} to $(name)s
		parts = dre.split(s)
		for i in range(1, len(parts), 4):
		    if parts[i] is not None:
			parts[i] = '$'
		    elif parts[i+1] is not None:
			parts[i+1] = '%(' + parts[i+1] + ')s'
		    else:
			parts[i+2] = '%(' + parts[i+2] + ')s'
		# Interpolate
		return EMPTYSTRING.join(filter(None, parts)) % mapping
    
    And here are some examples:

	s = dstr('${name} was born in ${country}')
	print s.sub({'name': 'Guido',
		     'country': 'the Netherlands'})

	name = 'Barry'
	country = 'the USA'
	print s.sub()

    This will print "Guido was born in the Netherlands" followed by
    "Barry was born in the USA".


Handling Missing Keys

    What should happen when one of the substitution keys is missing
    from the mapping (or the locals/globals namespace if no argument
    is given)?  There are two possibilities:

    - We can simply allow the exception (likely a NameError or
      KeyError) to propagate.

    - We can return the original substitution placeholder unchanged.

    An example of the first is:

        print dstr('${name} was born in ${country}').sub({'name': 'Bob'})

    would raise:

	Traceback (most recent call last):
	  File "sub.py", line 66, in ?
	    print s.sub({'name': 'Bob'})
	  File "sub.py", line 26, in sub
	    return EMPTYSTRING.join(filter(None, parts)) % mapping
	KeyError: country

    An example of the second is:

        print dstr('${name} was born in ${country}').sub({'name': 'Bob'})

    would print:

	Bob was born in ${country}

    The PEP author would prefer the latter interpretation, although a
    case can be made for raising the exception instead.  We could
    almost ignore the issue, since the latter example could be
    accomplished by passing in a "safe-dictionary" instead of a
    normal dictionary, like so:

	class safedict(dict):
	    def __getitem__(self, key):
		try:
		    return dict.__getitem__(self, key)
		except KeyError:
		    return '${%s}' % key

    so that

	d = safedict({'name': 'Bob'})
	print dstr('${name} was born in ${country}').sub(d)

    would print:

	Bob was born in ${country}

    The one place where this won't work is when no arguments are given
    to the .sub() method.  .sub() wouldn't know whether to wrap
    locals/globals in a safedict or not.

    This ambiguity can be solved in several ways:

    - we could have a parallel method called .safesub() which always
      wrapped its argument in a safedict()

    - .sub() could take an optional keyword argument flag which
      indicates whether to wrap the argument in a safedict or not.

    - .sub() could take an optional keyword argument which is a
      callable that would get called with the original mapping and
      return the mapping to be used for the substitution.  By default,
      this callable would be the identity function, but you could
      easily pass in the safedict constructor instead.
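
    As a rough sketch only (not part of the proposal; the subclass name
    is made up), the third option could build on the dstr and safedict
    classes above:

	class wrapsub(dstr):
	    def sub(self, mapping, wrapper=lambda m: m):
	        # apply the caller-supplied wrapper to the mapping first
	        return dstr.sub(self, wrapper(mapping))

	d = {'name': 'Bob'}
	print wrapsub('${name} was born in ${country}').sub(d, wrapper=safedict)
	# -> Bob was born in ${country}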

    BDFL proto-pronouncement: It should always raise a NameError when
    the key is missing.  There may not be sufficient use case for soft
    failures in the no-argument version.


Comparison to PEP 215

    PEP 215 describes an alternate proposal for string interpolation.
    Unlike that PEP, this one does not propose any new syntax for
    Python.  All the proposed new features are embodied in a new
    string method.  PEP 215 proposes a new string prefix
    representation such as $"" which signal to Python that a new type
    of string is present.  $-strings would have to interact with the
    existing r-prefixes and u-prefixes, essentially doubling the
    number of string prefix combinations.

    PEP 215 also allows for arbitrary Python expressions inside the
    $-strings, so that you could do things like:

	import sys
	print $"sys = $sys, sys = $sys.modules['sys']"

    which would return

	sys = <module 'sys' (built-in)>, sys = <module 'sys' (built-in)>
 
    It's generally accepted that the rules in PEP 215 are safe in the
    sense that they introduce no new security issues (see PEP 215,
    "Security Issues" for details).  However, the rules are still
    quite complex, and make it more difficult to see what exactly is
    the substitution placeholder in the original $-string.

    By design, this PEP does not provide as much interpolation power
    as PEP 215, however it is expected that the no-argument version of
    .sub() allows at least as much power with no loss of readability.


References

    [1] String Formatting Operations
        http://www.python.org/doc/current/lib/typesseq-strings.html

    [2] Identifiers and Keywords
	http://www.python.org/doc/current/ref/identifiers.html


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:



From ping@zesty.ca  Wed Jun 19 05:36:51 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Tue, 18 Jun 2002 21:36:51 -0700 (PDT)
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206190329.g5J3TLZ22194@server1.lfw.org>
Message-ID: <Pine.LNX.4.44.0206182131580.17350-100000@ziggy>

On Tue, 18 Jun 2002, Barry A. Warsaw wrote:
> 	def birth(self, name):
> 	    country = self.countryOfOrigin['name']
> 	    return '${name} was born in ${country}'
>
> 	birth('Guido')
>
>     returns
>
> 	'Guido was born in the Netherlands'

I assume you in fact meant

	    return '${name} was born in ${country}'.sub()

for the third line above?

> 	print s.sub({'name': 'Guido',
> 		     'country': 'the Netherlands'})

Have you considered the possibility of accepting keyword arguments
instead?  They would be slightly more pleasant to write:

        print s.sub(name='Guido', country='the Netherlands')

This is motivated because i imagine relative frequencies of use
to be something like this:

    1.  sub()                      [most frequent]
    2.  sub(name=value, ...)       [nearly as frequent]
    3.  sub(dictionary)            [least frequent]

If you decide to use keyword arguments, you can either allow both
keyword arguments and a single dictionary argument, or you can
just accept keyword arguments and people can pass in dictionaries
using **.
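
A rough sketch of the keyword-argument flavour, piggybacking on the dstr
class from the PEP's reference implementation (the subclass name is made
up):

        class kwdstr(dstr):
            def sub(self, **kws):
                # keyword arguments become the substitution mapping
                return dstr.sub(self, kws)

        s = kwdstr('${name} was born in ${country}')
        print s.sub(name='Guido', country='the Netherlands')
        print s.sub(**{'name': 'Barry', 'country': 'the USA'})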


-- ?!ng




From paul@prescod.net  Wed Jun 19 05:46:33 2002
From: paul@prescod.net (Paul Prescod)
Date: Tue, 18 Jun 2002 21:46:33 -0700
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com>
Message-ID: <3D100CA9.999E7B3F@prescod.net>

"Barry A. Warsaw" wrote:
> 
>...
    The mapping argument is optional; if it is omitted then the
>     mapping is taken from the locals and globals of the context in
>     which the .sub() method is executed.  For example:
> 
>         def birth(self, name):
>             country = self.countryOfOrigin['name']
>             return '${name} was born in ${country}'
> 
>         birth('Guido')

You forgot the implicit ".sub()" feature.

>...

>     - We can simply allow the exception (likely a NameError or
>       KeyError) to propagate.

Explicit!

>     - We can return the original substitution placeholder unchanged.

Silently guess???

Overall it isn't bad...it's a little weird to have a method that depends
on sys._getframe(1) (or as they say in Tcl-land, "upvar"). It may set a
bad precedent...

 Paul Prescod



From greg@cosc.canterbury.ac.nz  Wed Jun 19 06:11:37 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 19 Jun 2002 17:11:37 +1200 (NZST)
Subject: [Python-Dev] Negative long literals (was Re: Does Python need a '>>>' operator?)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKELJPLAA.tim.one@comcast.net>
Message-ID: <200206190511.g5J5BbU16855@oma.cosc.canterbury.ac.nz>

> 1xc00 "shows the bits" more clearly even in such
> an easy case.

Except that, if we're thinking in hex, it's not a 1-filled
bit string, it's an F-filled bit string! So it should be
Fxc00 :-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From martin@v.loewis.de  Wed Jun 19 06:30:11 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Jun 2002 07:30:11 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15631.61100.561824.480935@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
Message-ID: <m3660fhm3g.fsf@mira.informatik.hu-berlin.de>

barry@zope.com (Barry A. Warsaw) writes:

> I still think we may want to pull PyBSDDB into the standard distro, as
> a way to provide BDB api's > 1.85.  The question is, what would this
> new module be called?  I dislike "bsddb3" -- which I think PyBSDDB
> itself uses -- because it links against BDB 4.0.

If this is just a question of naming, I recommend bsddb2 - not
indicating the version of the database, but the version of the Python
module.

Regards,
Martin




From martin@v.loewis.de  Wed Jun 19 06:37:08 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Jun 2002 07:37:08 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15631.60841.28978.492291@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
Message-ID: <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>

barry@zope.com (Barry A. Warsaw) writes:

> Basically what you have in cvs works great, except for one small
> necessary addition.  If you build Berkeley DB from source, it's going
> to install it in something like /usr/local/BerkeleyDB.3.3 by default.
> Why they chose such a bizarre location, I don't know.
> 
> The problem is that unless your sysadmin hacks ld.so.conf to add
> /usr/local/BerkeleyDB.X.Y/lib onto your standard ld run path,
> bsddbmodule.so won't be linked in such a way that it can actually
> resolve the symbols at run time.  I don't think it's reasonable to
> require such system hacking to get the bsddb module to link properly,
> and I think we can do better.
> 
> Here's a small patch to setup.py which should fix things in a portable
> way, at least for *nix systems.  It sets the envar LD_RUN_PATH to the
> location that it found the Berkeley library, but only if that envar
> isn't already set.  

I dislike that change.  Setting LD_RUN_PATH is the job of whoever is
invoking the compiler, and should not be done by Python
automatically.  So far, the Python build process avoids adding any -R
linker options, since it requires quite some insight into the specific
installation to determine whether using that option is the right
thing.

If setup.py fails to build an extension correctly, it is the
administrator's job to specify a correct build procedure in
Modules/Setup.  For that reason, I would rather recommend removing the
magic by which setup.py looks in /usr/local/Berkeley*, instead of
adding more magic.

Regards,
Martin



From python@rcn.com  Wed Jun 19 07:45:46 2002
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 19 Jun 2002 02:45:46 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net>
Message-ID: <007f01c2175c$f1e768e0$f4d8accf@othello>

From: "Barry A. Warsaw" <barry@zope.com>
> A Simpler Proposal
>
>     Here we propose the addition of a new string method, called .sub()
>     which performs substitution of mapping values into a string with
>     special substitution placeholders.  These placeholders are
>     introduced with the $ character.  The following rules for
>     $-placeholders apply:
>
>     1. $$ is an escape; it is replaced with a single $

Hmm, some strings (at least in the spam I receive) contain $$$$$$.
How about ${$}?

>     2. $identifier names a substitution placeholder matching a mapping
>        key of "identifier".  "identifier" must be a Python identifier
>        as defined in [2].  The first non-identifier character after
>        the $ character terminates this placeholder specification.

+1

>
>     3. ${identifier} is equivalent to $identifier and for clarity,
>        this is the preferred form.  It is required for when valid
>        identifier characters follow the placeholder but are not part of
>        the placeholder, e.g. "${noun}ification".

+1


> Handling Missing Keys
>
>     What should happen when one of the substitution keys is missing
>     from the mapping (or the locals/globals namespace if no argument
>     is given)?  There are two possibilities:
>
>     - We can simply allow the exception (likely a NameError or
>       KeyError) to propagate.
>
>     - We can return the original substitution placeholder unchanged.

And/Or,
       - Leave placeholder unchanged unless default argument supplied:
                  mystr.sub(mydict, undefined='***')  # Fill unknowns with stars
And/Or,
       - Raise an exception only if specified:
                  mystr.sub(mydict, undefined=NameError)
And/Or
        - Return a count of the number of missed substitutions:
                  nummisses = mystr.sub(mydict)
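
For what it's worth, the "fill with a default" flavour can also be had
today with a small wrapper mapping instead of a new keyword argument;
a rough sketch (the class name is made up):

        class filldict(dict):
            def __init__(self, mapping, default='***'):
                dict.__init__(self, mapping)
                self.default = default
            def __getitem__(self, key):
                # fall back to the default for missing keys
                try:
                    return dict.__getitem__(self, key)
                except KeyError:
                    return self.default

        mystr.sub(filldict(mydict, '***'))   # fill unknowns with stars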

>     BDFL proto-pronouncement: It should always raise a NameError when
>     the key is missing.  There may not be sufficient use case for soft
>     failures in the no-argument version.

I had written a miniature mail-merge program and learned that the NameError
approach is a PITA.  It makes sense if the mapping is defined inside the
program; however, externally supplied mappings (like a mergelist) can be
expected to have "holes", and raising exceptions makes it harder to recover
than having a default behavior.  The best existing Python comparison is the
str.replace() method, which does not bomb out when the target string is not
found.


Raymond Hettinger




From pf@artcom-gmbh.de  Wed Jun 19 07:50:50 2002
From: pf@artcom-gmbh.de (Peter Funk)
Date: Wed, 19 Jun 2002 08:50:50 +0200 (CEST)
Subject: [Python-Dev] Python strptime
In-Reply-To: <200206181156.g5IBuFI30101@pcp02138704pcs.reston01.va.comcast.net>
 from Guido van Rossum at "Jun 18, 2002 07:56:15 am"
Message-ID: <m17KZIo-0071VFC@artcom0.artcom-gmbh.de>

Guido van Rossum:
> > Currently month and weekday names are constants hardcoded in
> > english in calendar.py.
> 
> No they're not.  You're a year behind. ;-)

Oupppsss!  Sorry.  I must admit I looked at the most recent documentation
and not at the most recent source code and so I missed the 
clever patch written by Denis S. Otkidach.  

Obviously there are still some more \versionchanged{} entries missing in
python/dist/src/Doc/lib/libcalendar.tex.
I will see if I can provide another doc patch.

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany)




From fredrik@pythonware.com  Wed Jun 19 08:05:00 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 19 Jun 2002 09:05:00 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>
Message-ID: <003701c2175f$b219c340$ced241d5@hagrid>

Barry wrote:

> def birth(self, name):
>     country = self.countryOfOrigin['name']
>     return '${name} was born in ${country}'.sub()

now explain why the above is a vast improvement over:

    def birth(self, name):
        country = self.countryOfOrigin['name']
        return join(name, ' was born in ', country)

(for extra bonus, explain how sub() can be made to
execute substantially faster than a join() function)

</F>




From oren-py-d@hishome.net  Wed Jun 19 08:36:57 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 19 Jun 2002 03:36:57 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <007f01c2175c$f1e768e0$f4d8accf@othello>
References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net> <007f01c2175c$f1e768e0$f4d8accf@othello>
Message-ID: <20020619073657.GA25541@hishome.net>

On Wed, Jun 19, 2002 at 02:45:46AM -0400, Raymond Hettinger wrote:
> > Handling Missing Keys
> >
> >     What should happen when one of the substitution keys is missing
> >     from the mapping (or the locals/globals namespace if no argument
> >     is given)?  There are two possibilities:
> >
> >     - We can simply allow the exception (likely a NameError or
> >       KeyError) to propagate.
> >
> >     - We can return the original substitution placeholder unchanged.
> 
> And/Or,
>        - Leave placeholder unchanged unless default argument supplied:
>                   mystr.sub(mydict, undefined='***')  # Fill unknowns with
> stars
> And/Or,
>        - Raise an exception only if specified:
>                   mystr.sub(mydict, undefined=NameError)
> And/Or
>         - Return a count of the number of missed substitutions:
>                   nummisses = mystr.sub(mydict)
> 
> >     BDFL proto-pronouncement: It should always raise a NameError when
> >     the key is missing.  There may not be sufficient use case for soft
> >     failures in the no-argument version.
> 
> I had written a minature mail-merge program and learned that the NameError
> approach is a PITA.  It makes sense if the mapping is defined inside the
> program;

Exceptions are *supposed* to be a PITA in order to make sure they are hard 
to ignore.

+1 on optional argument for default value. 
-1 on not raising exception for missing name. 

I think the best approach might be to raise a NameError exception *unless* a
default argument is passed.

The number of misses cannot be returned by this method - it returns the new
string. 

	Oren



From oren-py-d@hishome.net  Wed Jun 19 08:51:21 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 19 Jun 2002 03:51:21 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <003701c2175f$b219c340$ced241d5@hagrid>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid>
Message-ID: <20020619075121.GB25541@hishome.net>

On Wed, Jun 19, 2002 at 09:05:00AM +0200, Fredrik Lundh wrote:
> Barry wrote:
> 
> > def birth(self, name):
> >     country = self.countryOfOrigin['name']
> >     return '${name} was born in ${country}'.sub()
> 
> now explain why the above is a vast improvement over:
> 
>     def birth(self, name):
>         country = self.countryOfOrigin['name']
>         return join(name, ' was born in ', country)

Assuming join = lambda *args: ''.join(map(str, args)) 

1. Friendly for people coming from other languages (Perl/shell). Same reason 
why the != operator was added as an alternative to <>.

2. Less quotes and commas for the terminally lazy.

3. More flexible for data-driven use.  Either the template or the
dictionary can be data rather than hard-wired into the code.

	Oren




From martin@strakt.com  Wed Jun 19 09:33:11 2002
From: martin@strakt.com (Martin Sjögren)
Date: Wed, 19 Jun 2002 10:33:11 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <20020619075121.GB25541@hishome.net>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net>
Message-ID: <20020619083311.GA1011@ratthing-b3cf>

On Wed, Jun 19, 2002 at 03:51:21AM -0400, Oren Tirosh wrote:
> On Wed, Jun 19, 2002 at 09:05:00AM +0200, Fredrik Lundh wrote:
> > Barry wrote:
> >
> > > def birth(self, name):
> > >     country = self.countryOfOrigin['name']
> > >     return '${name} was born in ${country}'.sub()
> >
> > now explain why the above is a vast improvement over:
> >
> >     def birth(self, name):
> >         country = self.countryOfOrigin['name']
> >         return join(name, ' was born in ', country)
> 
> Assuming join = lambda *args: ''.join(map(str, args))
> 
> 1. Friendly for people coming from other languages (Perl/shell). Same reason
> why the != operator was added as an alternative to <>.
> 
> 2. Less quotes and commas for the terminally lazy.
> 
> 3. More flexible for data-driven use.  Either the template or the
> dictionary can be data rather than hard-wired into the code.

But what about

>>> '%(name)s was born in %(country)s' % {'name':'Guido',
  'country':'the Netherlands'}
'Guido was born in the Netherlands'
>>> name = 'Martin'
>>> country = 'Sweden'
>>> '%(name)s was born in %(country)s' % globals()
'Martin was born in Sweden'

What's the advantage of using ${name} and ${country} instead?


/Martin

-- 
Martin Sjögren
  martin@strakt.com              ICQ : 41245059
  Phone: +46 (0)31 7710870       Cell: +46 (0)739 169191
  GPG key: http://www.strakt.com/~martin/gpg.html



From fredrik@pythonware.com  Wed Jun 19 09:53:02 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 19 Jun 2002 10:53:02 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net>
Message-ID: <006501c2176e$b9dbb3e0$0900a8c0@spiff>

oren wrote:

> 1. Friendly for people coming from other languages (Perl/shell).
>
> 2. Less quotes and commas for the terminally lazy.
>
> 3. More flexible for data-driven use.  Either the template or the
> dictionary can be data rather than hard-wired into the code.

combine 1, 2, and 3 with _getframe(), and you have a
feature that crackers are going to love...

</F>




From duncan@rcp.co.uk  Wed Jun 19 10:34:24 2002
From: duncan@rcp.co.uk (Duncan Booth)
Date: Wed, 19 Jun 2002 10:34:24 +0100
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf>
Message-ID: <09342475030690@aluminium.rcp.co.uk>

On 19 Jun 2002, Martin Sjögren <martin@strakt.com> wrote:

> But what about
> 
>>>> '%(name)s was born in %(country)s' % {'name':'Guido',
>   'country':'the Netherlands'}
> 'Guido was born in the Netherlands'
>>>> name = 'Martin'
>>>> country = 'Sweden'
>>>> '%(name)s was born in %(country)s' % globals()
> 'Martin was born in Sweden'
> 
> What's the advantage of using ${name} and ${country} instead?

Presumably it looks more natural to people experienced in shell programming 
or Perl---at the expense of losing the ability to format field widths and 
alignments of course (so should we have regexes delimited by '/' next?). 
Personally I can't see the need for a second form of string interpolation, 
but since it comes up so often somebody must feel it is significantly 
superior to the existing system.

What I really don't understand is why there is such pressure to get an 
alternative interpolation added as methods to str & unicode rather than 
just adding an interpolation module to the library?
e.g.

from interpolation import sub
def birth(self, name):
    country = self.countryOfOrigin['name']
    return sub('${name} was born in ${country}', vars())

I added in the explicit vars() parameter because the idea of a possibly 
unknown template string picking up arbitrary variables is, IMHO, a BAD 
idea.

If it were a library module then it would probably also make sense to 
define a wrapper object constructed from a sequence that would do the 
interpolation when called:
e.g.
>>> message = interpolation.template('${name} was born in ${country}')
>>> print message(name='Duncan', country='Scotland')
Duncan was born in Scotland 

Putting it in a separate module would also give more scope for providing 
minor variations on the theme, for example the default should be to throw a 
NameError for missing variables, but you could have another function 
wrapping the basic one that substituted in a default value instead.
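
For the record, a rough sketch of what such a module might look like
(names and the pattern are only illustrative, and the $$ escape is
ignored for brevity):

    import re

    _pattern = re.compile(r'\$(?:\{([_a-z][_a-z0-9]*)\}|([_a-z][_a-z0-9]*))',
                          re.IGNORECASE)

    def sub(text, mapping):
        # substitute ${name} / $name placeholders from an explicit mapping
        def repl(match, mapping=mapping):
            return str(mapping[match.group(1) or match.group(2)])
        return _pattern.sub(repl, text)

    class template:
        def __init__(self, text):
            self.text = text
        def __call__(self, **kws):
            return sub(self.text, kws)

A missing key simply propagates as a KeyError here; a wrapper that
supplies a default, as described above, would sit on top of this.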

-- 
Duncan Booth                                             duncan@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?



From fredrik@pythonware.com  Wed Jun 19 11:01:36 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 19 Jun 2002 12:01:36 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk>
Message-ID: <018c01c21778$4d374f60$0900a8c0@spiff>

duncan wrote:

> What I really don't understand is why there is such pressure to get an 
> alternative interpolation added as methods to str & unicode rather than 
> just adding an interpolation module to the library?
> e.g.
> 
> from interpolation import sub
> def birth(self, name):
>     country = self.countryOfOrigin['name']
>     return sub('${name} was born in ${country}', vars())

that's too easy, of course ;-)

especially since someone has already added such a method, a
long time ago (os.path.expandvars).

and there's already an interpolation engine in there; barry's loop,
join and format stuff can be replaced with a simple callback, and a
call to sre.sub with the right pattern:

    def sub(string, mapping):
        def repl(m, mapping=mapping):
            return mapping[m.group(m.lastindex)]
        return sre.sub(A_PATTERN, repl, string)

(if you like lambdas, you can turn this into a one-liner)

maybe it would be sufficient to add a number of "right patterns"
to the standard library...
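
for example (just a sketch; pick whatever grouping you prefer, and the
$$ escape would still need handling):

    A_PATTERN = r"(?i)\$(?:([_a-z][_a-z0-9]*)|\{([_a-z][_a-z0-9]*)\})"

with that pattern, m.lastindex picks out whichever alternative matched,
so the repl() callback above works for both $name and ${name}.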

</F>





From walter@livinglogic.de  Wed Jun 19 11:07:36 2002
From: walter@livinglogic.de (Walter Dörwald)
Date: Wed, 19 Jun 2002 12:07:36 +0200
Subject: [Python-Dev] PEP 293, Codec Error Handling Callbacks
Message-ID: <3D1057E8.9090200@livinglogic.de>

Here's another new PEP.

Bye,
    Walter Dörwald

----------------------------------------------------------------------
PEP: 293
Title: Codec Error Handling Callbacks
Version: $Revision: 1.1 $
Last-Modified: $Date: 2002/06/19 03:22:11 $
Author: Walter Dörwald
Status: Draft
Type: Standards Track
Created: 18-Jun-2002
Python-Version: 2.3
Post-History:


Abstract

     This PEP aims at extending Python's fixed codec error handling
     schemes with a more flexible callback based approach.

     Python currently uses fixed error handling schemes for codecs.
     This PEP describes a mechanism which allows Python to
     use function callbacks as error handlers.  With these more
     flexible error handlers it is possible to add new functionality to
     existing codecs by e.g. providing fallback solutions or different
     encodings for cases where the standard codec mapping does not
     apply.


Specification

     Currently the set of codec error handling algorithms is fixed to
     either "strict", "replace" or "ignore" and the semantics of these
     algorithms is implemented separately for each codec.

     The proposed patch will make the set of error handling algorithms
     extensible through a codec error handler registry which maps
     handler names to handler functions.  This registry consists of the
     following two C functions:

         int PyCodec_RegisterError(const char *name, PyObject *error)

         PyObject *PyCodec_LookupError(const char *name)

     and their Python counterparts

         codecs.register_error(name, error)

         codecs.lookup_error(name)

     PyCodec_LookupError raises a LookupError if no callback function
     has been registered under this name.

     Similar to the encoding name registry there is no way of
     unregistering callback functions or iterating through the
     available functions.

     The callback functions will be used in the following way by the
     codecs: when the codec encounters an encoding/decoding error, the
     callback function is looked up by name, the information about the
     error is stored in an exception object and the callback is called
     with this object.  The callback returns information about how to
     proceed (or raises an exception).

     For encoding, the exception object will look like this:

        class UnicodeEncodeError(UnicodeError):
            def __init__(self, encoding, object, start, end, reason):
                UnicodeError.__init__(self,
                    "encoding '%s' can't encode characters " +
                    "in positions %d-%d: %s" % (encoding,
                        start, end-1, reason))
                self.encoding = encoding
                self.object = object
                self.start = start
                self.end = end
                self.reason = reason

     This type will be implemented in C with the appropriate setter and
     getter methods for the attributes, which have the following
     meaning:

       * encoding: The name of the encoding;
       * object: The original unicode object for which encode() has
         been called;
       * start: The position of the first unencodable character;
       * end: (The position of the last unencodable character)+1 (or
         the length of object, if all characters from start to the end
         of object are unencodable);
       * reason: The reason why object[start:end] couldn't be encoded.

     If object has consecutive unencodable characters, the encoder
     should collect those characters for one call to the callback if
     those characters can't be encoded for the same reason.  The
     encoder is not required to implement this behaviour and may call
     the callback for every single character, but implementing the
     collecting behaviour is strongly suggested.

     The callback must not modify the exception object.  If the
     callback does not raise an exception (either the one passed in, or
     a different one), it must return a tuple:

         (replacement, newpos)

     replacement is a unicode object that the encoder will encode and
     emit instead of the unencodable object[start:end] part, newpos
     specifies a new position within object, where (after encoding the
     replacement) the encoder will continue encoding.

     If the replacement string itself contains an unencodable character
     the encoder raises the exception object (but may set a different
     reason string before raising).

     Should further encoding errors occur, the encoder is allowed to
     reuse the exception object for the next call to the callback.
     Furthermore the encoder is allowed to cache the result of
     codecs.lookup_error.

     If the callback does not know how to handle the exception, it must
     raise a TypeError.

     Decoding works similarly to encoding, with the following differences:
     The exception class is named UnicodeDecodeError and the attribute
     object is the original 8bit string that the decoder is currently
     decoding.

     The decoder will call the callback with those bytes that
     constitute one undecodable sequence, even if further sequences
     that are undecodable for the same reason follow directly after the
     first one.  E.g. for the "unicode-escape"
     encoding, when decoding the illegal string "\\u00\\u01x", the
     callback will be called twice (once for "\\u00" and once for
     "\\u01").  This is done to be able to generate the correct number
     of replacement characters.

     The replacement returned from the callback is a unicode object
     that will be emitted by the decoder as-is without further
     processing instead of the undecodable object[start:end] part.

     There is a third API that uses the old strict/ignore/replace error
     handling scheme:

         PyUnicode_TranslateCharmap/unicode.translate

     The proposed patch will enhance PyUnicode_TranslateCharmap, so
     that it also supports the callback registry.  This has the
     additional side effect that PyUnicode_TranslateCharmap will
     support multi-character replacement strings (see SF feature
     request #403100 [1]).

     For PyUnicode_TranslateCharmap the exception class will be named
     UnicodeTranslateError.  PyUnicode_TranslateCharmap will collect
     all consecutive untranslatable characters (i.e. those that map to
     None) and call the callback with them.  The replacement returned
     from the callback is a unicode object that will be put in the
     translated result as-is, without further processing.

     All encoders and decoders are allowed to implement the callback
     functionality themselves, if they recognize the callback name
     (i.e. if it is a system callback like "strict", "replace" and
     "ignore").  The proposed patch will add two additional system
     callback names: "backslashreplace" and "xmlcharrefreplace", which
     can be used for encoding and translating and which will also be
     implemented in-place for all encoders and
     PyUnicode_TranslateCharmap.

     The Python equivalent of these five callbacks will look like this:

         def strict(exc):
             raise exc

         def ignore(exc):
             if isinstance(exc, UnicodeError):
                 return (u"", exc.end)
             else:
                 raise TypeError("can't handle %s" % exc.__name__)

        def replace(exc):
             if isinstance(exc, UnicodeEncodeError):
                 return ((exc.end-exc.start)*u"?", exc.end)
             elif isinstance(exc, UnicodeDecodeError):
                 return (u"\\ufffd", exc.end)
             elif isinstance(exc, UnicodeTranslateError):
                 return ((exc.end-exc.start)*u"\\ufffd", exc.end)
             else:
                 raise TypeError("can't handle %s" % exc.__name__)

        def backslashreplace(exc):
             if isinstance(exc,
                 (UnicodeEncodeError, UnicodeTranslateError)):
                 s = u""
                 for c in exc.object[exc.start:exc.end]:
                    if ord(c)<=0xff:
                        s += u"\\x%02x" % ord(c)
                    elif ord(c)<=0xffff:
                        s += u"\\u%04x" % ord(c)
                    else:
                        s += u"\\U%08x" % ord(c)
                 return (s, exc.end)
             else:
                 raise TypeError("can't handle %s" % exc.__name__)

        def xmlcharrefreplace(exc):
             if isinstance(exc,
                 (UnicodeEncodeError, UnicodeTranslateError)):
                 s = u""
                 for c in exc.object[exc.start:exc.end]:
                    s += u"&#%d;" % ord(c)
                 return (s, exc.end)
             else:
                 raise TypeError("can't handle %s" % exc.__name__)

     These five callback handlers will also be accessible to Python as
     codecs.strict_error, codecs.ignore_error, codecs.replace_error,
     codecs.backslashreplace_error and codecs.xmlcharrefreplace_error.
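
     As an illustration of how the proposed registry would be used from
     Python (the handler name "mark" is arbitrary and only serves as an
     example):

         import codecs

         def mark(exc):
             # replace whatever could not be encoded with a fixed marker
             if isinstance(exc, UnicodeEncodeError):
                 return (u"<?>", exc.end)
             raise TypeError("can't handle %s" % exc.__class__.__name__)

         codecs.register_error("mark", mark)

         u"a\u20acb".encode("ascii", "mark")   # -> 'a<?>b'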


Rationale

     Most legacy encodings do not support the full range of Unicode
     characters.  For these cases many high level protocols support a
     way of escaping a Unicode character (e.g. Python itself supports
     the \x, \u and \U convention, XML supports character references
     via &#xxx; etc.).

     When implementing such an encoding algorithm, a problem with the
     current implementation of the encode method of Unicode objects
     becomes apparent: For determining which characters are unencodable
     by a certain encoding, every single character has to be tried,
     because encode does not provide any information about the location
     of the error(s), so

         # (1)
         us = u"xxx"
         s = us.encode(encoding)

     has to be replaced by

         # (2)
         us = u"xxx"
         v = []
         for c in us:
             try:
                 v.append(c.encode(encoding))
             except UnicodeError:
                 v.append("&#%d;" % ord(c))
         s = "".join(v)

     This slows down encoding dramatically as now the loop through the
     string is done in Python code and no longer in C code.

     Furthermore this solution poses problems with stateful encodings.
     For example UTF-16 uses a Byte Order Mark at the start of the
     encoded byte string to specify the byte order.  Using (2) with
     UTF-16 results in an 8 bit string with a BOM between every
     character.

     To work around this problem, a stream writer - which keeps state
     between calls to the encoding function - has to be used:

         # (3)
         us = u"xxx"
         import codecs, cStringIO as StringIO
         writer = codecs.getwriter(encoding)

         v = StringIO.StringIO()
         uv = writer(v)
         for c in us:
             try:
                 uv.write(c)
             except UnicodeError:
                 uv.write(u"&#%d;" % ord(c))
         s = v.getvalue()

     To compare the speed of (1) and (3) the following test script has
     been used:

         # (4)
         import time
         us = u"äa"*1000000
         encoding = "ascii"
         import codecs, cStringIO as StringIO

         t1 = time.time()

         s1 = us.encode(encoding, "replace")

         t2 = time.time()

         writer = codecs.getwriter(encoding)

         v = StringIO.StringIO()
         uv = writer(v)
         for c in us:
             try:
                 uv.write(c)
             except UnicodeError:
                 uv.write(u"?")
         s2 = v.getvalue()

         t3 = time.time()

         assert(s1==s2)
         print "1:", t2-t1
         print "2:", t3-t2
         print "factor:", (t3-t2)/(t2-t1)

     On Linux this gives the following output (with Python 2.3a0):

         1: 0.274321913719
         2: 51.1284689903
         factor: 186.381278466

     i.e. (3) is 180 times slower than (1).

     Callbacks must be stateless, because as soon as a callback is
     registered it is available globally and can be called by multiple
     encode() calls.  To be able to use stateful callbacks, the errors
     parameter for encode/decode/translate would have to be changed
     from char * to PyObject *, so that the callback could be used
     directly, without the need to register the callback globally.  As
     this requires changes to lots of C prototypes, this approach was
     rejected.

     Currently all encoding/decoding functions have arguments

         const Py_UNICODE *p, int size

     or

         const char *p, int size

     to specify the unicode characters/8bit characters to be
     encoded/decoded.  So in case of an error the codec has to create a
     new unicode or str object from these parameters and store it in
     the exception object.  The callers of these encoding/decoding
     functions extract these parameters from str/unicode objects
     themselves most of the time, so it could speed up error handling
     if these objects were passed directly.  As this again requires
     changes to many C functions, this approach has been rejected.


Implementation Notes

     A sample implementation is available as SourceForge patch #432401
     [2].  The current version of this patch differs from the
     specification in the following way:

       * The error information is passed from the codec to the callback
         not as an exception object, but as a tuple, which has an
         additional entry, state, which can be used for additional
         information the codec might want to pass to the callback.
       * There are two separate registries (one for
         encoding/translating and one for decoding).

     The class codecs.StreamReaderWriter uses the errors parameter for
     both reading and writing.  To be more flexible this should
     probably be changed to two separate parameters for reading and
     writing.

     The errors parameter of PyUnicode_TranslateCharmap is not
     available to Python, which makes testing of the new functionality
     of PyUnicode_TranslateCharmap impossible with Python scripts.  The
     patch should add an optional argument errors to unicode.translate
     to expose the functionality and make testing possible.

     Codecs that do something other than encoding/decoding from/to
     unicode and want to use the new machinery can define their own
     exception classes, and the strict handler will automatically work
     with them.  The other predefined error handlers are unicode
     specific and expect to get a Unicode(Encode|Decode|Translate)Error
     exception object, so they won't work.


Backwards Compatibility

     The semantics of unicode.encode with errors="replace" have changed:
     the old version always stored a ? character in the output string
     even if no character was mapped to ? in the mapping.  With the
     proposed patch, the replacement string from the callback will
     again be looked up in the mapping dictionary.  But as all
     supported encodings are ASCII based, and thus map ? to ?, this
     should not be a problem in practice.


References

     [1] SF feature request #403100
         "Multicharacter replacements in PyUnicode_TranslateCharmap"
         http://www.python.org/sf/403100

     [2] SF patch #432401 "unicode encoding error callbacks"
         http://www.python.org/sf/432401


Copyright

     This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:




From mwh@python.net  Wed Jun 19 11:09:24 2002
From: mwh@python.net (Michael Hudson)
Date: 19 Jun 2002 11:09:24 +0100
Subject: [Python-Dev] Slicing
In-Reply-To: "David Abrahams"'s message of "Tue, 18 Jun 2002 14:21:23 -0400"
References: <05cd01c216f5$00740fc0$6601a8c0@boostconsulting.com>
Message-ID: <2m3cvjshpn.fsf@starship.python.net>

"David Abrahams" <david.abrahams@rcn.com> writes:

> I did a little experiment to see if I could use a uniform interface for
> slicing (from C++):
> 
> >>> range(10)[slice(3,5)]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: sequence index must be integer
> >>> class Y(object):
> ...     def __getslice__(self, a, b):
> ...             print "getslice",a,b
> ...
> >>> y = Y()
> >>> y[slice(3,5)]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: unsubscriptable object
> >>> y[3:5]
> getslice 3 5
> 
> This seems to indicate that I can't, in general, pass a slice object to
> PyObject_GetItem in order to do slicing.** Correct?

No.  The time machine has got you here; update to CVS and try again.

This comes down to the (slightly odd, IMHO) distinction between
sequences and mappings, which doesn't really appear at the Python
level.

type_pointer->tp_as_sequence->sq_item

takes a single int as a parameter

type_pointer->tp_as_mapping->mp_subscr

takes a PyObject*.  Builtin sequences (as of last week) have mp_subscr
methods that handle slices.  I haven't checked, but would be amazed if
PyObject_GetItem can't now be used with sliceobjects.

(PS: I'm not sure I've got all the field names right here.  They're
close).
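
A quick check that should work on an up-to-date CVS build (untested
sketch; operator.getitem goes through PyObject_GetItem):

    >>> import operator
    >>> operator.getitem(range(10), slice(3, 5))
    [3, 4]
    >>> "hello"[slice(1, 3)]
    'el'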

> So I went looking around for alternatives to PyObject_GetItem. I found
> PySequence_GetSlice, but that takes int parameters, and AFAIK there's no
> rule saying you can't slice on strings, for example.
> 
> Further experiments revealed:
> 
> >>> y['hi':'there']
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: unsubscriptable object
> >>> class X(object):
> ...     def __getitem__(self, x):
> ...             print 'getitem',x
> ...
> >>> X()['hi':'there']
> getitem slice('hi', 'there', None)
> 
> So I /can/ slice on strings, but only through __getitem__(). And...
> 
> >>> class Z(Y):
> ...     def __getitem__(self, x):
> ...             print 'getitem',x
> ...
> >>> Z()[3:5]
> getslice 3 5
> >>> Z()['3':5]
> getitem slice('3', 5, None)
> 
> So Python is doing some dispatching internally based on the types of the

This area is very messy.

> slice elements, but:
> 
> >>> class subint(int): pass
> ...
> >>> subint()
> 0
> >>> Z[subint():5]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: unsubscriptable object

This last one is easy: you're trying to subscript the class object!

> So it's looking at the concrete type of the slice elements. I'm not
> sure I actually understand how this one fails.
> 
> I want to make a generalized getslice function in C which can operate on a
> triple of arbitrary objects. Here's the python version I came up with:
> 
>     def getslice(x,start,finish):
>         if (type(start) is type(finish) is int
>             and hasattr(type(x), '__getslice__')):
>             return x.__getslice__(start, finish)
>         else:
>             return x.__getitem__(slice(start,finish))
> 
> Have I got the logic right here?

You can't do this logic from Python, AFAIK.  I think PyObject_GetItem
is your best bet.

Cheers,
M.

-- 
  The only problem with Microsoft is they just have no taste.
              -- Steve Jobs, (From _Triumph of the Nerds_ PBS special)
                         and quoted by Aahz Maruch on comp.lang.python



From pyth@devel.trillke.net  Wed Jun 19 11:21:51 2002
From: pyth@devel.trillke.net (holger krekel)
Date: Wed, 19 Jun 2002 12:21:51 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <018c01c21778$4d374f60$0900a8c0@spiff>; from fredrik@pythonware.com on Wed, Jun 19, 2002 at 12:01:36PM +0200
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <018c01c21778$4d374f60$0900a8c0@spiff>
Message-ID: <20020619122151.I15079@prim.han.de>

Fredrik Lundh wrote:
> duncan wrote:
> 
> > What I really don't understand is why there is such pressure to get an 
> > alternative interpolation added as methods to str & unicode rather than 
> > just adding an interpolation module to the library?
> > e.g.
> > 
> > from interpolation import sub
> > def birth(self, name):
> >     country = self.countryOfOrigin['name']
> >     return sub('${name} was born in ${country}', vars())
> 
> that's too easy, of course ;-)
> 
> especially since someone has already added such a method, a
> long time ago (os.path.expandvars).
> 
> and there's already an interpolation engine in there; barry's loop,
> join and format stuff can replaced with a simple callback, and a
> call to sre.sub with the right pattern:
> 
>     def sub(string, mapping):
>         def repl(m, mapping=mapping):
>             return mapping[m.group(m.lastindex)]
>         return sre.sub(A_PATTERN, repl, string)
> 
> (if you like lambdas, you can turn this into a one-liner)
> 
> maybe it would be sufficient to add a number of "right patterns"
> to the standard library...

FWIW, +1

    holger



From barry@wooz.org  Wed Jun 19 12:10:10 2002
From: barry@wooz.org (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 07:10:10 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <200206190329.g5J3TLZ22194@server1.lfw.org>
 <Pine.LNX.4.44.0206182131580.17350-100000@ziggy>
Message-ID: <15632.26258.371707.408884@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee <ping@zesty.ca> writes:

    KY> I assume you in fact meant

    KY> 	    return '${name} was born in ${country}'.sub()

    KY> for the third line above?

Yup, thanks for the fix.

    >> print s.sub({'name': 'Guido', 'country': 'the Netherlands'})

    KY> Have you considered the possibility of accepting keyword
    KY> arguments instead?

Nope, and it's not a bad idea.  I've added this in an "Open Issues"
section. 
    
    KY> If you decide to use keyword arguments, you can either allow
    KY> both keyword arguments and a single dictionary argument, or
    KY> you can just accept keyword arguments and people can pass in
    KY> dictionaries using **.

I'd prefer the latter, otherwise we'd have to pick a keyword that
would be off-limits as a substitution variable.

Thanks!
-Barry



From barry@zope.com  Wed Jun 19 12:16:52 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 07:16:52 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com>
 <3D100CA9.999E7B3F@prescod.net>
Message-ID: <15632.26660.981170.540633@anthem.wooz.org>

>>>>> "PP" == Paul Prescod <paul@prescod.net> writes:

    PP> You forgot the "implicit .sub"() feature.

Yup, thanks.

    >> ...

    >> - We can simply allow the exception (likely a NameError or
    >> KeyError) to propagate.

    PP> Explicit!

Now you're thinking like Guido! :)

    >> - We can return the original substitution placeholder
    >> unchanged.

    PP> Silently guess???

I'm beginning to agree that the exception should be raised.  I want to
be explicit about it so we can write the safedict wrapper effectively.
I agree that an exception is better when using the no-arg version, but
for the arg-version I /really/ want to be able to suppress all
interpolation exceptions, and just return some string, even if it has
placeholders still in it.

    PP> Overall it isn't bad...it's a little weird to have a method
    PP> that depends on sys._getframe(1) (or as the say in Tcl-land
    PP> "upvar"). It may set a bad precedent...

Noted.

Thanks,
-Barry



From barry@zope.com  Wed Jun 19 12:23:59 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 07:23:59 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net>
 <007f01c2175c$f1e768e0$f4d8accf@othello>
Message-ID: <15632.27087.786103.175959@anthem.wooz.org>

>>>>> "RH" == Raymond Hettinger <python@rcn.com> writes:

    >> 1. $$ is an escape;
    >> it is replaced with a single $

    RH> Hmm, some strings (at least in the spam I receive) contain
    RH> $$$$$$.  How about ${$}?

How often will you be interpolating into spam?  Sounds like it could
get messy. ;)

    >> Handling Missing Keys
    >> What should happen when one of the substitution keys is missing
    >> from the mapping (or the locals/globals namespace if no
    >> argument is given)?  There are two possibilities: - We can
    >> simply allow the exception (likely a NameError or KeyError) to
    >> propagate.  - We can return the original substitution
    >> placeholder unchanged.

    RH> And/Or,
    RH>        - Leave placeholder unchanged unless default argument
    RH> supplied: mystr.sub(mydict, undefined='***') # Fill unknowns
    RH> with
    RH> stars
    RH> And/Or,
    RH>        - Raise an exception only if specified:
    RH>                   mystr.sub(mydict, undefined=NameError)
    RH> And/Or
    RH>         - Return a count of the number of missed substitutions:
    RH>                   nummisses = mystr.sub(mydict)

I /really/ dislike interfaces that raise an exception or don't, based
on an argument to the function.  Returning the number of missed
substitutions doesn't seem useful, but could be done with a regexp.
Filling unknowns with stars could just as easily be done with a
different safedict wrapper.

The specific issue is when using locals+globals.  In that case, it
seems like the problem is clearly a programming bug, so the exception
should be raised.

    >> BDFL proto-pronouncement: It should always raise a NameError
    >> when the key is missing.  There may not be sufficient use case
    >> for soft failures in the no-argument version.

    RH> I had written a minature mail-merge program and learned that
    RH> the NameError approach is a PITA.  It makes sense if the
    RH> mapping is defined inside the program; however, externally
    RH> supplied mappings (like a mergelist) can be expected to have
    RH> "holes" and launching exceptions makes it harder to recover
    RH> than having a default behavior.  The best existing Python
    RH> comparison is the str.replace() method which does not bomb-out
    RH> when the target string is not found.

I agree that certain use cases make the exception problematic.  Think of
a program that uses a template entered remotely through the web.  That
template could have misspellings in the variable substitutions.  In
that case I think you'd like to carry on as best you can, by returning
a string with the bogus placeholders still in the string.

-Barry



From David Abrahams" <david.abrahams@rcn.com  Wed Jun 19 12:57:22 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 19 Jun 2002 07:57:22 -0400
Subject: [Python-Dev] Slicing
References: <05cd01c216f5$00740fc0$6601a8c0@boostconsulting.com> <2m3cvjshpn.fsf@starship.python.net>
Message-ID: <08a201c21788$7afc4760$6601a8c0@boostconsulting.com>

----- Original Message -----
From: "Michael Hudson" <mwh@python.net>

> > This seems to indicate that I can't, in general, pass a slice object to
> > PyObject_GetItem in order to do slicing.** Correct?
>
> No.  The time machine has got you here; update to CVS and try again.

While that result is of interest, I think I need to support 2.2.1, so maybe
it doesn't make too much difference what the current CVS is doing.

> This comes down to the (slightly odd, IMHO) distinction between
> sequences and mappings, which doesn't really appear at the Python
> level.
>
> type_pointer->tp_as_sequence->sq_item
>
> takes a single int as a parameter
>
> type_pointer->tp_as_mapping->mp_subscr
>
> takes a PyObject*.

I know about those details, but of course they aren't really a cause: as
the current CVS shows, it *can* be handled.

> Builtin sequences (as of last week) have mp_subscr
> methods that handle slices.  I haven't checked, but would be amazed if
> PyObject_GetItem can't now be used with sliceobjects.

Good to know.

> > So Python is doing some dispatching internally based on the types of the

> > >>> class subint(int): pass
> > ...
> > >>> subint()
> > 0
> > >>> Z[subint():5]
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> > TypeError: unsubscriptable object
>
> This last one is easy: you're trying to subscript the class object!

Oops, nice catch!

>>> Z()[subint():5]
getslice 0 5

> > I want to make a generalized getslice function in C which can operate
on a
> > triple of arbitrary objects. Here's the python version I came up with:
> >
> >     def getslice(x,start,finish):
> >         if (type(start) is type(finish) is int
> >             and hasattr(type(x), '__getslice__')):
> >             return x.__getslice__(start, finish)
> >         else:
> >             return x.__getitem__(slice(start,finish))
> >
> > Have I got the logic right here?
>
> You can't do this logic from Python, AFAIK.

Why do you say that? Are you saying I should be looking at slots and not
attributes, plus special handling for classic classes (ick)?

> I think PyObject_GetItem is your best bet.

Well, not if I care about versions < 2.2.2. So, I'm modifying my logic
slightly:

     def getslice(x,start,finish):
         if (isinstance(start,int) and isinstance(finish,int)
             and hasattr(type(x), '__getslice__')):
             return x.__getslice__(start, finish)
         else:
             return x.__getitem__(slice(start,finish))
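
For what it's worth, a quick illustrative check of the dispatch above
(the Span class is made up purely for the example):

    print getslice(range(10), 2, 5)     # [2, 3, 4] -- int bounds, via __getslice__

    class Span:
        def __getitem__(self, s):       # receives a slice object
            return (s.start, s.stop)

    print getslice(Span(), 'a', 'z')    # ('a', 'z') -- non-int bounds, via __getitem__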

-Dave






From barry@zope.com  Wed Jun 19 13:10:48 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 08:10:48 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
Message-ID: <15632.29896.521752.346381@anthem.wooz.org>

>>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:

    >> def birth(self, name): country = self.countryOfOrigin['name']
    >> return '${name} was born in ${country}'.sub()

    FL> now explain why the above is a vast improvement over:

    |     def birth(self, name):
    |         country = self.countryOfOrigin['name']
    |         return join(name, ' was born in ', country)

One use case: you can't internationalize that.  You /can/ translate
'${name} was born in ${country}', which might end up in some languages
like '${country} was ${name} born in'.
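
To make the reordering concrete, here is a rough sketch (the catalog,
the _() helper and the regex-based substitution are all stand-ins, not
the PEP's actual machinery):

    import re

    catalog = {
        '${name} was born in ${country}': '${country} was ${name} born in',
    }

    def _(s):
        return catalog.get(s, s)

    def sub(template, mapping):
        return re.sub(r'\$\{([_a-z][_a-z0-9]*)\}',
                      lambda m: str(mapping[m.group(1)]), template)

    print sub(_('${name} was born in ${country}'),
              {'name': 'Guido', 'country': 'the Netherlands'})
    # -> 'the Netherlands was Guido born in'

A join() of fixed pieces hard-codes the word order; named placeholders
let the translator move the pieces around.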

    FL> (for extra bonus, explain how sub() can be made to
    FL> execute substantially faster than a join() function)

All I care about is that it runs as fast as the % operator.
-Barry



From barry@zope.com  Wed Jun 19 13:16:17 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 08:16:17 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
 <20020619075121.GB25541@hishome.net>
 <20020619083311.GA1011@ratthing-b3cf>
Message-ID: <15632.30225.721902.65921@anthem.wooz.org>

>>>>> "MS" =3D=3D Martin Sj=F6gren <martin@strakt.com> writes:

    MS> What's the advantage of using ${name} and ${country} instead?

There's a lot of empirical evidence that %(name)s is quite error
prone.

BTW, you can't use locals() or globals() because you really want
globals()-overridden-with-locals(), i.e.

    d = globals().copy()
    d.update(locals())

vars() doesn't cut it either:

>>> help(vars)
Help on built-in function vars:

vars(...)
    vars([object]) -> dictionary

    Without arguments, equivalent to locals().
    With an argument, equivalent to object.__dict__.

-Barry



From barry@zope.com  Wed Jun 19 13:18:44 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 08:18:44 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
 <20020619075121.GB25541@hishome.net>
 <006501c2176e$b9dbb3e0$0900a8c0@spiff>
Message-ID: <15632.30372.601835.200686@anthem.wooz.org>

>>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:

    FL> combine 1, 2, and 3 with _getframe(), and you have a
    FL> feature that crackers are going to love...

Why?

I've added a note that you should never use no-arg .sub() on strings
that come from untrusted sources.  Are there any other specific
security concerns you can identify?

-Barry



From barry@zope.com  Wed Jun 19 13:29:39 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 08:29:39 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
 <20020619075121.GB25541@hishome.net>
 <20020619083311.GA1011@ratthing-b3cf>
 <09342475030690@aluminium.rcp.co.uk>
Message-ID: <15632.31027.356393.678498@anthem.wooz.org>

>>>>> "DB" == Duncan Booth <duncan@rcp.co.uk> writes:

    DB> What I really don't understand is why there is such pressure
    DB> to get an alternative interpolation added as methods to str &
    DB> unicode rather than just adding an interpolation module to the
    DB> library?  e.g.

Because I don't think there's all that much useful variation, open
issues in this PEP notwithstanding.  A module seems pretty heavy for
such a simple addition.  It might obviate the need for a PEP
though. :)

    | from interpolation import sub
    | def birth(self, name):
    |     country = self.countryOfOrigin['name']
    |     return sub('${name} was born in ${country}', vars())

    DB> I added in the explicit vars() parameter because the idea of a
    DB> possibly unknown template string picking up arbitrary
    DB> variables is, IMHO, a BAD idea.

Only if the template string comes from an untrusted source.  If it's
in your code, there should be no problem, and if there is, it's a
programming bug.

vars() doesn't cut it as mentioned in a previous reply.

-Barry



From barry@zope.com  Wed Jun 19 13:31:56 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 08:31:56 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
 <20020619075121.GB25541@hishome.net>
 <20020619083311.GA1011@ratthing-b3cf>
 <09342475030690@aluminium.rcp.co.uk>
 <018c01c21778$4d374f60$0900a8c0@spiff>
Message-ID: <15632.31164.457058.17493@anthem.wooz.org>

>>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:

    FL> especially since someone has already added such a method, a
    FL> long time ago (os.path.expandvars).

Of course expandvars() takes its mapping from os.environ, but I see
what you're saying.

-Barry



From neal@metaslash.com  Wed Jun 19 13:32:47 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Wed, 19 Jun 2002 08:32:47 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
 <20020619075121.GB25541@hishome.net>
 <20020619083311.GA1011@ratthing-b3cf> <15632.30225.721902.65921@anthem.wooz.org>
Message-ID: <3D1079EF.61AA3E79@metaslash.com>

"Barry A. Warsaw" wrote:
> 
> BTW, you can't use locals() or globals() because you really want
> globals()-overridden-with-locals(), i.e.
> 
>     d = globals().copy()
>     d.update(locals())

What about free/cell vars?  Will these be used?  
If not, is that a problem?

Neal



From aahz@pythoncraft.com  Wed Jun 19 13:40:59 2002
From: aahz@pythoncraft.com (Aahz)
Date: Wed, 19 Jun 2002 08:40:59 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020619124059.GA22356@panix.com>

On Wed, Jun 19, 2002, Martin v. Loewis wrote:
> barry@zope.com (Barry A. Warsaw) writes:
>> 
>> Here's a small patch to setup.py which should fix things in a portable
>> way, at least for *nix systems.  It sets the envar LD_RUN_PATH to the
>> location that it found the Berkeley library, but only if that envar
>> isn't already set.  
> 
> I dislike that change. Setting LD_RUN_PATH is the job of whoever is
> building the compiler, and should not be done by Python
> automatically. So far, the Python build process avoids adding any -R
> linker options, since it requires quite some insight into the specific
> installation to determine whether usage of that option is the right
> thing.
> 
> If setup.py fails to build an extension correctly, it is the
> administrator's job to specify a correct build procedure in
> Modules/Setup. For that reason, I would rather recommend removing the magic
> that setup.py looks in /usr/local/Berkeley*, instead of adding more
> magic.

-1 if it doesn't at least include an error message saying that we found
dbm but couldn't use it.  (That is, I agree with you that explicit is
better than implicit -- but if we can provide info, we should.)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From guido@python.org  Wed Jun 19 13:39:37 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 08:39:37 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: Your message of "Tue, 18 Jun 2002 22:38:36 EDT."
 <15631.61100.561824.480935@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
Message-ID: <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>

> I still think we may want to pull PyBSDDB into the standard distro, as
> a way to provide BDB api's > 1.85.  The question is, what would this
> new module be called?  I dislike "bsddb3" -- which I think PyBSDDB
> itself uses -- because it links against BDB 4.0.

Good idea.  Maybe call it berkeleydb?  That's what Sleepycat calls it
(there's no connection with the BSD Unix distribution AFAICT).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jun 19 13:49:22 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 08:49:22 -0400
Subject: [Python-Dev] Re: making dbmmodule still broken
In-Reply-To: Your message of "Tue, 18 Jun 2002 23:24:09 EDT."
 <15631.63833.440127.405556@anthem.wooz.org>
References: <15631.58711.213506.701945@localhost.localdomain>
 <15631.63833.440127.405556@anthem.wooz.org>
Message-ID: <200206191249.g5JCnM001518@pcp02138704pcs.reston01.va.comcast.net>

>     SM> I think it would probably be a good idea to alert the person
>     SM> running make what library the module will be linked with.
>     SM> Anyone else agree?
> 
> +1.  The less guessing the builder has to do the better!

Just don't start asking questions and reading answers from stdin.  The
Make process is often run unattended.  A new option to allow asking
questions is OK.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin@mems-exchange.org  Wed Jun 19 13:46:05 2002
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 19 Jun 2002 08:46:05 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <09342475030690@aluminium.rcp.co.uk>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk>
Message-ID: <20020619124604.GB31653@ute.mems-exchange.org>

On Wed, Jun 19, 2002 at 10:34:24AM +0100, Duncan Booth wrote:
>What I really don't understand is why there is such pressure to get an 
>alternative interpolation added as methods to str & unicode rather than 
>just adding an interpolation module to the library?

It could live in the new text module, where Greg Ward's word-wrapping
code will be going.  +1 on /F's suggestion of recycling the
os.path.expandvars() code.

(Maybe a syntax-checker for %(...) strings would solve Mailman's
problems, and alleviate the plaintive cries for an alternative
interpolation syntax?)

--amk                                                             (www.amk.ca)
GERTRUDE: The lady doth protest too much, methinks.
    -- _Hamlet_, III, ii



From guido@python.org  Wed Jun 19 13:49:53 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 08:49:53 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Tue, 18 Jun 2002 23:29:20 EDT."
 <200206190329.g5J3TKm19802@smtp.zope.com>
References: <200206190329.g5J3TKm19802@smtp.zope.com>
Message-ID: <200206191249.g5JCnr901530@pcp02138704pcs.reston01.va.comcast.net>

> I'm so behind on my email, that the anticipated flamefest will surely
> die down before I get around to reading it.  Yet still, here is a new
> PEP. :)

No, the flamefest won't start until you post to c.l.py. :)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jun 19 13:52:38 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 08:52:38 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Tue, 18 Jun 2002 21:36:51 PDT."
 <Pine.LNX.4.44.0206182131580.17350-100000@ziggy>
References: <Pine.LNX.4.44.0206182131580.17350-100000@ziggy>
Message-ID: <200206191252.g5JCqcQ01558@pcp02138704pcs.reston01.va.comcast.net>

> Have you considered the possibility of accepting keyword arguments
> instead?  They would be slightly more pleasant to write:
> 
>         print s.sub(name='Guido', country='the Netherlands')
> 
> This is motivated because i imagine relative frequencies of use
> to be something like this:
> 
>     1.  sub()                      [most frequent]
>     2.  sub(name=value, ...)       [nearly as frequent]
>     3.  sub(dictionary)            [least frequent]
> 
> If you decide to use keyword arguments, you can either allow both
> keyword arguments and a single dictionary argument, or you can
> just accept keyword arguments and people can pass in dictionaries
> using **.

I imagine that the most common use case is a situation where the dict
is already prepared.  I think **dict is slower than a positional dict
argument.  I agree that keyword args would be useful in some cases
where you can't trust the string.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jun 19 13:57:07 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 08:57:07 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 02:45:46 EDT."
 <007f01c2175c$f1e768e0$f4d8accf@othello>
References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net>
 <007f01c2175c$f1e768e0$f4d8accf@othello>
Message-ID: <200206191257.g5JCv7V01597@pcp02138704pcs.reston01.va.comcast.net>

> >     1. $$ is an escape; it is replaced with a single $
> 
> Hmm, some strings (at least in the spam I receive) contain $$$$$$.
> How about ${$}?

I don't understand the use case.  Do you want to *output* strings
containing many dollars?  If you want a {} based escape, it should be
${} IMO.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Oleg Broytmann <phd@phd.pp.ru>  Wed Jun 19 13:53:44 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Wed, 19 Jun 2002 16:53:44 +0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Wed, Jun 19, 2002 at 08:39:37AM -0400
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020619165344.V4127@phd.pp.ru>

On Wed, Jun 19, 2002 at 08:39:37AM -0400, Guido van Rossum wrote:
> Good idea.  Maybe call it berkeleydb?  That's what Sleepycat calls it
> (there's no connection with the BSD Unix distribution AFAICT).

+1 on berkeleydb

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From guido@python.org  Wed Jun 19 13:58:54 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 08:58:54 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 09:05:00 +0200."
 <003701c2175f$b219c340$ced241d5@hagrid>
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
Message-ID: <200206191258.g5JCwsp01610@pcp02138704pcs.reston01.va.comcast.net>

> > def birth(self, name):
> >     country = self.countryOfOrigin['name']
> >     return '${name} was born in ${country}'.sub()
> 
> now explain why the above is a vast improvement over:
> 
>     def birth(self, name):
>         country = self.countryOfOrigin['name']
>         return join(name, ' was born in ', country)

One word: I18n.

> (for extra bonus, explain how sub() can be made to
> execute substantially faster than a join() function)

That's not a requirement.  It can obviously be made as fast as the %
operator.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jun 19 14:05:32 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 09:05:32 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 07:23:59 EDT."
 <15632.27087.786103.175959@anthem.wooz.org>
References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net> <007f01c2175c$f1e768e0$f4d8accf@othello>
 <15632.27087.786103.175959@anthem.wooz.org>
Message-ID: <200206191305.g5JD5Xe01727@pcp02138704pcs.reston01.va.comcast.net>

> I agree that certain use cases make the exception problematic.  Think
> a program that uses a template entered remotely through the web.  That
> template could have misspellings in the variable substitutions.  In
> that case I think you'd like to carry on as best you can, by returning
> a string with the bogus placeholders still in the string.

That's a matter of validating the template before accepting it.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jun 19 13:54:56 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 08:54:56 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Tue, 18 Jun 2002 21:46:33 PDT."
 <3D100CA9.999E7B3F@prescod.net>
References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com>
 <3D100CA9.999E7B3F@prescod.net>
Message-ID: <200206191254.g5JCsu301574@pcp02138704pcs.reston01.va.comcast.net>

> >     - We can simply allow the exception (likely a NameError or
> >       KeyError) to propagate.
> 
> Explicit!
> 
> >     - We can return the original substitution placeholder unchanged.
> 
> Silently guess???

I'm strongly in favor of always making missing keys an error.  It
should be a KeyError when a dict is used, and a NameError when
locals/globals are looked up.

> Overall it isn't bad...it's a little weird to have a method that depends
> on sys._getframe(1) (or as they say in Tcl-land "upvar"). It may set a
> bad precedent...

No, the real implementation will be in C.  C functions always have
access to locals and globals.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik@pythonware.com  Wed Jun 19 14:13:25 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 19 Jun 2002 15:13:25 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>              <003701c2175f$b219c340$ced241d5@hagrid>  <200206191258.g5JCwsp01610@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <028c01c21795$a5841d20$0900a8c0@spiff>

guido wrote:


> > > def birth(self, name):
> > >     country = self.countryOfOrigin['name']
> > >     return '${name} was born in ${country}'.sub()
> >
> > now explain why the above is a vast improvement over:
> >
> >     def birth(self, name):
> >         country = self.countryOfOrigin['name']
> >         return join(name, ' was born in ', country)
>
> One word: I18n.

really?  who's doing the localization in:

    return '${name} was born in ${country}'.sub()

maybe barry meant

    return _('${name} was born in ${country}').sub()

but in that case, I completely fail to see why he couldn't
just as well do the substitution inside the "_" function:

    return _('${name} was born in ${country}')

where _ could be defined as:

    def _(string, mapping=None):
        if mapping is None:
            ...
        def repl(m, mapping=mapping):
            return mapping[m.group(m.lastindex)]
        return sre.sub(A_PATTERN, repl, do_translation(string))

instead of

    def _(string, mapping=None):
        if mapping is None:
            ...
        return do_translation(string).sub(mapping)

</F>




From tismer@tismer.com  Wed Jun 19 14:38:46 2002
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 19 Jun 2002 15:38:46 +0200
Subject: [Python-Dev] Tcl adept wanted for Stackless problem
Message-ID: <3D108966.9050402@tismer.com>

Dear Lists,

there is still some problem with Tcl and Stackless hiding.

Is there anybody around who knows the Tcl/Tk sources
about as well as I know Python's?

My big question is:
When does Tcl use C stack entries as globals, which
are passed as function arguments to interpreter calls?
This is of special interest when Python callbacks
are invoked.
The problem is that I use stack slicing, which moves
parts of the C stack away at certain times, and I need
to know when I am forbidden to do that.  It is solved
to some extent, but not completely.

That's unfortunately not all.
A friend is running the Tcl/Tk mainloop in one real thread,
and stackless tasklets are running in another one (of course
calling into Tcl). When tasklet switches are performed
(again, this is moving stack contents around), there appear
to be crashes, too.
This kind of slicing should be allowed IMHO, since these
are different contexts, which shouldn't interact at all.
Are there any structures which are shared between threads
that use Tcl/Tk? Something that may not disappear during
some operations?

Which part of the documentation / the Tcl/Tk source should
I read to find this out without learning everything?

This is really an urgent problem which is blocking me.
If somebody has good knowledge, please show up! :-)

thanks & cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From guido@python.org  Wed Jun 19 15:45:49 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 10:45:49 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 08:29:39 EDT."
 <15632.31027.356393.678498@anthem.wooz.org>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk>
 <15632.31027.356393.678498@anthem.wooz.org>
Message-ID: <200206191445.g5JEjoL01987@pcp02138704pcs.reston01.va.comcast.net>

>     DB> What I really don't understand is why there is such pressure
>     DB> to get an alternative interpolation added as methods to str &
>     DB> unicode rather than just adding an interpolation module to the
>     DB> library?  e.g.
> 
> Because I don't think there's all that much useful variation, open
> issues in this PEP notwithstanding.  A module seems pretty heavy for
> such a simple addition.  It might obviate the need for a PEP
> though. :)

Certainly if we can't agree on the PEP, a module might make sense.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jun 19 15:50:59 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 10:50:59 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 08:32:47 EDT."
 <3D1079EF.61AA3E79@metaslash.com>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <15632.30225.721902.65921@anthem.wooz.org>
 <3D1079EF.61AA3E79@metaslash.com>
Message-ID: <200206191450.g5JEoxC02004@pcp02138704pcs.reston01.va.comcast.net>

> > BTW, you can't use locals() or globals() because you really want
> > globals()-overridden-with-locals(), i.e.
> > 
> >     d = globals().copy()
> >     d.update(locals())
> 
> What about free/cell vars?  Will these be used?  
> If not, is that a problem?

Without compiler support for this construct we have no hope of getting
references to outer non-global scopes right.  E.g.

def f():
    x = 12

    def g():
        return "x is $x".sub()

    return g

Here the compiler has no clue that g references x, so it wouldn't do
the special treatment for x that's needed to make it work.

I see no way to fix this in general without introducing new syntax;
note that the string "x is $x" could have been an argument to g().
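
A small sketch that makes the point visible (just ordinary Python; no
.sub() is actually called here):

    def f():
        x = 12
        def g():
            return "x is $x"        # the hypothetical .sub() would need x here
        return g

    g = f()
    print g.func_code.co_freevars   # () -- the compiler created no cell for x
    print f.func_code.co_cellvars   # () -- so there is nothing for sub() to find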

--Guido van Rossum (home page: http://www.python.org/~guido/)



From niemeyer@conectiva.com  Wed Jun 19 15:40:36 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Wed, 19 Jun 2002 11:40:36 -0300
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <028c01c21795$a5841d20$0900a8c0@spiff>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <200206191258.g5JCwsp01610@pcp02138704pcs.reston01.va.comcast.net> <028c01c21795$a5841d20$0900a8c0@spiff>
Message-ID: <20020619114036.A5586@ibook.distro.conectiva>

[...]
> but in that case, I completely fail to see why he couldn't
> just as well do the substitution inside the "_" function:
> 
>     return _('${name} was born in ${country}')
[...]

That would parse every translated string, which doesn't seem
reasonable. My vote is for a sub()-like function, inside an extension module.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From barry@zope.com  Wed Jun 19 15:58:56 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 10:58:56 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <20020619033036.LOQA688.mta04.mrf.mail.rcn.net@mx03.mrf.mail.rcn.net>
 <007f01c2175c$f1e768e0$f4d8accf@othello>
 <15632.27087.786103.175959@anthem.wooz.org>
 <200206191305.g5JD5Xe01727@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15632.39984.662165.422755@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    >> I agree that certain use cases make the exception problematic.
    >> Think a program that uses a template entered remotely through
    >> the web.  That template could have misspellings in the variable
    >> substitutions.  In that case I think you'd like to carry on as
    >> best you can, by returning a string with the bogus placeholders
    >> still in the string.

    GvR> That's a matter of validating the template before accepting
    GvR> it.

True, which isn't hard to do.  You can write a regexp to extract the
$names and then validate those.  In fact, I think this is what newer
versions of xgettext do for Python code (albeit with the %(name)s
syntax).
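
A rough sketch of that kind of up-front validation (the pattern is a
guess at the PEP 292 placeholder syntax, not taken from the PEP's
implementation):

    import re

    _placeholder = re.compile(r'\$(?:\{([_a-z][_a-z0-9]*)\}|([_a-z][_a-z0-9]*))',
                              re.IGNORECASE)

    def check_template(template, allowed):
        # Return the placeholder names that aren't in the allowed set.
        bad = []
        for m in _placeholder.finditer(template):
            name = m.group(1) or m.group(2)
            if name not in allowed:
                bad.append(name)
        return bad

    print check_template('${name} was born in ${county}', ['name', 'country'])
    # -> ['county'], so the template would be rejected before it is stored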

-Barry



From greg@electricrain.com  Wed Jun 19 19:21:41 2002
From: greg@electricrain.com (Gregory P. Smith)
Date: Wed, 19 Jun 2002 11:21:41 -0700
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <m3660fhm3g.fsf@mira.informatik.hu-berlin.de>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <m3660fhm3g.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020619182141.GA18944@zot.electricrain.com>

On Wed, Jun 19, 2002 at 07:30:11AM +0200, Martin v. Loewis wrote:
> barry@zope.com (Barry A. Warsaw) writes:
> 
> > I still think we may want to pull PyBSDDB into the standard distro, as
> > a way to provide BDB api's > 1.85.  The question is, what would this
> > new module be called?  I dislike "bsddb3" -- which I think PyBSDDB
> > itself uses -- because it links against BDB 4.0.
> 
> If this is just a question of naming, I recommend bsddb2 - not
> indicating the version of the database, but the version of the Python
> module.

If I hadn't made the initial mistake of naming pybsddb's module bsddb3
when I first extended Robin's berkeleydb 2.x module to work with 3.0 I
would agree with that name.  I worry that having a module named bsddb2
might cause endless confusion as bsddb and bsddb3 already exist and did
correlate to the version number.  How about 'berkeleydb'?




From paul@prescod.net  Wed Jun 19 17:43:02 2002
From: paul@prescod.net (Paul Prescod)
Date: Wed, 19 Jun 2002 09:43:02 -0700
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com>
 <3D100CA9.999E7B3F@prescod.net> <200206191254.g5JCsu301574@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D10B496.67B19898@prescod.net>

Guido van Rossum wrote:
> 
>...
> 
> I'm strongly in favor of always making missing keys an error.  It
> should be a KeyError when a dict is used, and a NameError when
> locals/globals are looked up.

I think someone later suggested that maybe a keyword argument could
allow some kind of policy to be expressed. That would be okay for me if
people feel strongly that some strategy for fixing up missing arguments
is necessary.

> > Overall it isn't bad...it's a little weird to have a method that depends
> > on sys._getframe(1) (or as they say in Tcl-land "upvar"). It may set a
> > bad precedent...
> 
> No, the real implementation will be in C.  C functions always have
> access to locals and globals.

I didn't mean it will be a bad precedent because of the implementation.  I
mean that methods do not usually peek into their caller's variables,
even from C. What other methods do that? I'm still "+0" despite being
somewhat uncomfortable with that aspect.

 Paul Prescod



From jeremy@zope.com  Wed Jun 19 16:29:13 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Wed, 19 Jun 2002 11:29:13 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <200206191450.g5JEoxC02004@pcp02138704pcs.reston01.va.comcast.net>
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
 <20020619075121.GB25541@hishome.net>
 <20020619083311.GA1011@ratthing-b3cf>
 <15632.30225.721902.65921@anthem.wooz.org>
 <3D1079EF.61AA3E79@metaslash.com>
 <200206191450.g5JEoxC02004@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15632.41801.526300.360183@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  >> > BTW, you can't use locals() or globals() because you really
  >> > want globals()-overridden-with-locals(), i.e.
  >> >
  >> >     d = globals().copy() d.update(locals())
  >>
  >> What about free/cell vars?  Will these be used?  If not, is that
  >> a problem?

  GvR> Without compiler support for this construct we have no hope of
  GvR> getting references to outer non-global scopes right.  E.g.

  GvR> def f():
  GvR>     x = 12

  GvR>     def g():
  GvR>         return "x is $x".sub()

  GvR>     return g

  GvR> Here the compiler has no clue that g references x, so it
  GvR> wouldn't do the special treatment for x that's needed to make
  GvR> it work.

  GvR> I see no way to fix this in general without introducing new
  GvR> syntax; note that the string "x is $x" could have been an
  GvR> argument to g().

If Python had macros, then we could define the interpolation function
as a macro.  It would expand to explicit references to all the
variables in the block that called the macro.  Then the compiler could
do the right thing.

Of course, we ain't got macros, but whatever.  I think they would
provide cleaner support for interpolation than sys._getframe().

Jeremy




From barry@zope.com  Wed Jun 19 19:44:34 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 14:44:34 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15632.53522.716428.359480@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> Maybe call it berkeleydb?

>>>>> "GPS" == Gregory P Smith <greg@electricrain.com> writes:

    GPS> How about 'berkeleydb'?

Sounds like consensus.  Greg, how do you feel about moving project
management from the pybsddb project to the Python project?

Maybe we should plan a transition on the pybsddb-users list.  I'm
willing to help.

-Barry





From mwh@python.net  Wed Jun 19 17:07:55 2002
From: mwh@python.net (Michael Hudson)
Date: 19 Jun 2002 17:07:55 +0100
Subject: [Python-Dev] extended slicing again
In-Reply-To: Michael Hudson's message of "17 Jun 2002 15:26:22 +0100"
References: <9B37BC74-81F2-11D6-9BA6-0003931DF95C@python.net> <200206171339.g5HDdxN08737@pcp02138704pcs.reston01.va.comcast.net> <2my9de7zht.fsf_-_@starship.python.net>
Message-ID: <2m7kkv9rqc.fsf@starship.python.net>

Michael Hudson <mwh@python.net> writes:

> Guido van Rossum <guido@python.org> writes:
> 
> > IOW slice(a, b, None) should be considered equivalent to L[a:b] in all
> > situations.
> 
> OK.  I'll do this soon.  It's not as bad as I thought at first -- only
> mutable sequences are affected, so it's only lists and arrays that
> need to be tweaked.

That was easy!
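
In other words, a tiny sketch of what the CVS change is supposed to
guarantee for lists (illustrative, not exhaustive):

    L = range(10)
    assert L[slice(2, 5)] == L[2:5]        # reading
    L[slice(2, 5)] = ['a', 'b', 'c']       # writing, same as L[2:5] = ...
    del L[slice(2, 5)]                     # deleting, same as del L[2:5]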

Cheers,
M.

-- 
  I have no disaster recovery plan for black holes, I'm afraid.  Also
  please be aware that if it one looks imminent I will be out rioting
  and setting fire to McDonalds (always wanted to do that) and
  probably not reading email anyway.                     -- Dan Barlow



From guido@python.org  Wed Jun 19 16:07:36 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 11:07:36 -0400
Subject: [Python-Dev] Tcl adept wanted for Stackless problem
In-Reply-To: Your message of "Wed, 19 Jun 2002 15:38:46 +0200."
 <3D108966.9050402@tismer.com>
References: <3D108966.9050402@tismer.com>
Message-ID: <200206191507.g5JF7ag02088@pcp02138704pcs.reston01.va.comcast.net>

> My big question is:
> When does Tcl use C stack entries as globals, which
> are passed as function arguments to interpreter calls?

It's a performance hack, just as stackless :-).

Tcl's interpreter data structure has a return value field which can
receive a string of arbitrary length.  In order to make this
efficient, this is initialized with a pointer to a limited-size array
on the stack of the caller; when the return value is longer, a
malloc()'ed buffer is used.  There is a little dance you have to do to
free the malloc()'ed buffer.  The big win is that most calls return
short strings and hence you save a call to malloc() and one to free()
per invocation.  This is used *all over* the Tcl source, so good luck
getting rid of it.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From tim@zope.com  Wed Jun 19 16:16:44 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 19 Jun 2002 11:16:44 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <20020619124604.GB31653@ute.mems-exchange.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEFMPOAA.tim@zope.com>

[Andrew Kuchling]
> +1 on /F's suggestion of recycling the os.path.expandvars() code.

-1 on that part:  os.path.expandvars() is an ill-defined mess (the core has
more than one of them, varying by platform, and what they do differs in
platform-irrelevant ways).  +1 on making Barry fix expandvars <wink>:

    http://www.python.org/sf/494589




From barry@zope.com  Wed Jun 19 19:31:58 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 14:31:58 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15632.52766.822003.689689@anthem.wooz.org>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    MvL> I dislike that change. Setting LD_RUN_PATH is the job of
    MvL> whoever is building the compiler, and should not be done by
    MvL> Python automatically. So far, the Python build process avoids
    MvL> adding any -R linker options, since it requires quite some
    MvL> insight into the specific installation to determine whether
    MvL> usage of that option is the right thing.

Really?  You know the path for the -R/--rpath flag, so all you need is
the magic compiler-specific incantation, and distutils already knows
(or /should/ already know) that.

    MvL> If setup.py fails to build an extension correctly, it is the
    MvL> administrator's job to specify a correct build procedure in
    MvL> Modules/Setup. For that reason, I would rather recommend removing
    MvL> the magic that setup.py looks in /usr/local/Berkeley*,
    MvL> instead of adding more magic.

I disagree.  While the sysadmin should probably fiddle with
/etc/ld.so.conf when he installs BerkeleyDB, it's not documented in
the Sleepycat docs, so it's entirely possible that they haven't done
it.  That shouldn't stop Python from building a perfectly usable
module, especially because it really can figure out all the necessary
information.

Is there some specific fear you have about compiling in the run-path?

Note I'm not saying setting LD_RUN_PATH is the best approach, but it
seemed like the most portable.  I couldn't figure out if distutils
knew what the right compiler-specific switches are (i.e. "-R dir" on
Solaris cc if memory serves, and "-Xlinker -rpath -Xlinker dir" for
gcc, and who knows what for other Unix or <gasp> Windows compilers).

-Barry




From paul@prescod.net  Wed Jun 19 18:44:14 2002
From: paul@prescod.net (Paul Prescod)
Date: Wed, 19 Jun 2002 10:44:14 -0700
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
 <20020619075121.GB25541@hishome.net>
 <20020619083311.GA1011@ratthing-b3cf>
 <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org>
Message-ID: <3D10C2EE.CE833DB7@prescod.net>

"Barry A. Warsaw" wrote:
> 
>...
> 
> Because I don't think there's all that much useful variation, open
> issues in this PEP notwithstanding.  A module seems pretty heavy for
> such a simple addition. 

I really hate putting things in modules that will be needed in a Python
programmer's second program (the one after "Hello world"). If this is to
be the *simpler* way of doing introspection then getting at it should be
simpler than getting at "%". $ is taught in hour 2, import is taught on
day 2. Some people may never make it to the metaphorical day 2 if they
are doing simple text processing in some kind of embedded-Python
environment.

 Paul Prescod



From greg@electricrain.com  Wed Jun 19 20:31:47 2002
From: greg@electricrain.com (Gregory P. Smith)
Date: Wed, 19 Jun 2002 12:31:47 -0700
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15632.53522.716428.359480@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.53522.716428.359480@anthem.wooz.org>
Message-ID: <20020619193147.GB18944@zot.electricrain.com>

On Wed, Jun 19, 2002 at 02:44:34PM -0400, Barry A. Warsaw wrote:
> 
> >>>>> "GvR" == Guido van Rossum <guido@python.org> writes:
> 
>     GvR> Maybe call it berkeleydb?
> 
> >>>>> "GPS" == Gregory P Smith <greg@electricrain.com> writes:
> 
>     GPS> How about 'berkeleydb'?
> 
> Sounds like consensus.  Greg, how do you feel about moving project
> management from the pybsddb project to the Python project?
> 
> Maybe we should plan a transition on the pybsddb-users list.  I'm
> willing to help.

That sounds like a good idea to me, though I don't know what moving
it entails.  (i assume creating a berkeleydb module directory in the
python project and maintaining the code and documentation from there?).

Technically Robin Dunn is the only project administrator on the pybsddb
SourceForge project, but I've got access to at least modify pybsddb.sf.net,
CVS and file releases.  The project has been relatively idle recently;
I've tried to give it a little of my time every month or two (basic 4.0
support, some bugfixes, accepting patches, etc).  As it is quite stable
I don't believe he's actively working on it anymore.  Robin?

-G




From tismer@tismer.com  Wed Jun 19 16:28:50 2002
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 19 Jun 2002 17:28:50 +0200
Subject: [Python-Dev] Tcl adept wanted for Stackless problem
References: <3D108966.9050402@tismer.com> <200206191507.g5JF7ag02088@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D10A332.7010909@tismer.com>

Guido van Rossum wrote:
>>My big question is:
>>When does Tcl use C stack entries as globals, which
>>are passed as function arguments to interpreter calls?
> 
> 
> It's a performance hack, just as stackless :-).

Where the effect of my hack is slightly bigger.
We can fight that out in Charleroi. :-)

> Tcl's interpreter data structure has a return value field which can
> receive a string of arbitrary length.  In order to make this
> efficient, this is initialized with a pointer to a limited-size array
> on the stack of the caller; when the return value is longer, a
> malloc()'ed buffer is used.  There is a little dance you have to do to
> free the malloc()'ed buffer.  The big win is that most calls return
> short strings and hence you save a call to malloc() and one to free()
> per invocation.  This is used *all over* the Tcl source, so good luck
> getting rid of it.

Thank you! I'd better not try this. Instead, I'd like
not to touch it at all.
I have patched tkinter in a way that it does not slice the stack
while some Tcl stuff is running (maybe I didn't catch everything).
That should mean that the small stack strings are all alive.
That is, in the context of Tcl, I dispensed with the
"stackless" concept.

The remaining problem is switching of tasklets which contain
Tcl invocations. I thought so far that this is no problem,
since these are disjoint contexts, but Jeff Senn reported
problems as well.
I fear I have the problem that Tcl thinks it is still using
the same interp, or it creates a nested one, while the
tasklets are not nested, but seen as independent. Somehow
I need to create a new Tcl frame chain for every tasklet
that uses Tcl.
Can this be the problem?

Still no clue how to do it but thanks - ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From barry@zope.com  Wed Jun 19 15:54:16 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 10:54:16 -0400
Subject: [Python-Dev] Re: making dbmmodule still broken
References: <15631.58711.213506.701945@localhost.localdomain>
 <15631.63833.440127.405556@anthem.wooz.org>
 <200206191249.g5JCnM001518@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15632.39704.752754.245878@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> Just don't start asking questions and reading answers from
    GvR> stdin.  The Make process is often run unattended.  A new
    GvR> option to allow asking questions is OK.

Right, but printing build process calculations is a good thing.  FWIW,
XEmacs's build process tells you exactly what libraries it's linking
with and features it's enabling.  Very helpful in answering questions
like "which Berkeley library did I link against?".

-Barry



From andymac@bullseye.apana.org.au  Wed Jun 19 12:42:14 2002
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Wed, 19 Jun 2002 22:42:14 +1100 (edt)
Subject: [Python-Dev] test_socket failure on FreeBSD
Message-ID: <Pine.OS2.4.32.0206192227090.74-100000@tenring.andymac.org>

Below is the output of test_socket with the -v option, from a CVS tree of
about 1915 UTC June 18.  FreeBSD 4.4, gcc 2.95.3 (-g -O3).

In speaking up now, I'm making the assumption that the non-blocking socket
changes should be complete, modulo bugfixes.  If this is not the case,
please let me know, and I'll wait for the situation to stabilise.

Otherwise, is there any more info I can (attempt to) provide?  I tried
"print"ing the addr variable when running the test, and just get "ERROR"
(sans quotes of course).

I've not yet tried to build the OS/2 port with the current CVS code, so I
don't yet know what the situation is there.

Won't have much time to dig until the weekend...

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  | Snail: PO Box 370
        andymac@pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia

---------- Forwarded message ----------
Date: Wed, 19 Jun 2002 22:13:45 +1000 (EST)
{...}

test_socket
Testing for mission critical constants. ... ok
Testing getservbyname(). ... ok
Testing getsockopt(). ... ok
Testing hostname resolution mechanisms. ... ok
Making sure getnameinfo doesn't crash the interpreter. ... ok
Testing for existance of non-crucial constants. ... ok
Testing reference count for getnameinfo. ... ok
Testing setsockopt(). ... ok
Testing getsockname(). ... ok
Testing that socket module exceptions. ... ok
Testing fromfd(). ... ok
Testing receive in chunks over TCP. ... ok
Testing recvfrom() in chunks over TCP. ... ERROR
Testing large receive over TCP. ... ok
Testing large recvfrom() over TCP. ... ERROR
Testing sendall() with a 2048 byte string over TCP. ... ok
Testing shutdown(). ... ok
Testing recvfrom() over UDP. ... ok
Testing sendto() and Recv() over UDP. ... ok
Testing non-blocking accept. ... FAIL
Testing non-blocking connect. ... ok
Testing non-blocking recv. ... FAIL
Testing whether set blocking works. ... ok
Performing file readline test. ... ok
Performing small file read test. ... ok
Performing unbuffered file read test. ... ok

======================================================================
ERROR: Testing recvfrom() in chunks over TCP.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./Lib/test/test_socket.py", line 359, in testOverFlowRecvFrom
    hostname, port = addr
TypeError: unpack non-sequence

======================================================================
ERROR: Testing large recvfrom() over TCP.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./Lib/test/test_socket.py", line 347, in testRecvFrom
    hostname, port = addr
TypeError: unpack non-sequence

======================================================================
FAIL: Testing non-blocking accept.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./Lib/test/test_socket.py", line 451, in testAccept
    self.fail("Error trying to do non-blocking accept.")
  File "/home/andymac/cvs/python/python-cvs/Lib/unittest.py", line 254, in fail
    raise self.failureException, msg
AssertionError: Error trying to do non-blocking accept.

======================================================================
FAIL: Testing non-blocking recv.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./Lib/test/test_socket.py", line 478, in testRecv
    self.fail("Error trying to do non-blocking recv.")
  File "/home/andymac/cvs/python/python-cvs/Lib/unittest.py", line 254, in fail
    raise self.failureException, msg
AssertionError: Error trying to do non-blocking recv.

----------------------------------------------------------------------
Ran 26 tests in 0.330s

FAILED (failures=2, errors=2)
test test_socket failed -- errors occurred; run in verbose mode for details
1 test failed:
    test_socket




From guido@python.org  Wed Jun 19 21:23:03 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 16:23:03 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 09:43:02 PDT."
 <3D10B496.67B19898@prescod.net>
References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com> <3D100CA9.999E7B3F@prescod.net> <200206191254.g5JCsu301574@pcp02138704pcs.reston01.va.comcast.net>
 <3D10B496.67B19898@prescod.net>
Message-ID: <200206192023.g5JKN3K02971@pcp02138704pcs.reston01.va.comcast.net>

> > > Overall it isn't bad...it's a little weird to have a method that depends
> > > on sys._getframe(1) (or as they say in Tcl-land "upvar"). It may set a
> > > bad precedent...
> > 
> > No, the real implementation will be in C.  C functions always have
> > access to locals and globals.
> 
> I didn't mean it will be a bad precedent because of the implementation. I
> mean that methods do not usually peek into their caller's variables,
> even from C. What other methods do that?

Dunno about methods, but locals(), globals(), vars() and dir() do this
or something like it.

> I'm still "+0" despite being somewhat uncomfortable with that
> aspect.

I think little would be lost if sub() always required a dict (or
perhaps keyword args, although that feels like a YAGNI now).

I think that the key thing here is to set the precedent of using $ and
the specific syntax proposed, not necessarily to have this as a
built-in string method.

Note: posixpath.expandvars() doesn't have $$, which is essential, and
leaves unknown variables alone, which we (mostly) agree is not the
right thing to do.
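
A quick illustration of both complaints (the variable names are made up;
NAME is set explicitly so the example is self-contained):

    import os, posixpath
    os.environ['NAME'] = 'Guido'
    print posixpath.expandvars('$NAME sent $$5 to $NOSUCHVAR')
    # -> 'Guido sent $$5 to $NOSUCHVAR'
    #    no $$ escape, and the unknown variable is silently left in place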

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik@pythonware.com  Wed Jun 19 21:23:55 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 19 Jun 2002 22:23:55 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>	<003701c2175f$b219c340$ced241d5@hagrid>	<20020619075121.GB25541@hishome.net>	<20020619083311.GA1011@ratthing-b3cf>	<09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net>
Message-ID: <01bf01c217cf$407bcec0$ced241d5@hagrid>

paul wrote:
> $ is taught in hour 2, import is taught on day 2.

says who?

I usually mention "import" in the first hour (before methods),
and nobody has ever had any problem with that...

</F>




From guido@python.org  Wed Jun 19 21:27:40 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 16:27:40 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 10:44:14 PDT."
 <3D10C2EE.CE833DB7@prescod.net>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org>
 <3D10C2EE.CE833DB7@prescod.net>
Message-ID: <200206192027.g5JKReA03020@pcp02138704pcs.reston01.va.comcast.net>

> I really hate putting things in modules that will be needed in a Python
> programmer's second program (the one after "Hello world"). If this is to
> be the *simpler* way of doing introspection then getting at it should be
> simpler than getting at "%". $ is taught in hour 2, import is taught on
> day 2. Some people may never make it to the metaphorical day 2 if they
> are doing simple text processing in some kind of embedded-Python
> environment.

This is a good argument for making this a built-in (Barry, please add
to your PEP!).

Though I doubt that string % is taught in hour two -- you can do
everything you want with str() and string concatenation, both of which
*are* taught in hour two.  (And you can do *most* of what you want
with print, which is taught in hour one. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik@pythonware.com  Wed Jun 19 21:29:20 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 19 Jun 2002 22:29:20 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCGEFMPOAA.tim@zope.com>
Message-ID: <01f101c217d0$4e199110$ced241d5@hagrid>

Tim Peters wrote:

> [Andrew Kuchling]
> > +1 on /F's suggestion of recycling the os.path.expandvars() code.
> 
> -1 on that part:  os.path.expandvars() is an ill-defined mess (the core has
> more than one of them, varying by platform, and what they do differs in
> platform-irrelevant ways).  +1 on making Barry fix expandvars <wink>:

I'm pretty sure my plan was to change *path.expandvars to

    def expandvars(string):
        return string.expandvars(string, os.environ)

(and I've already sent SRE-based expandvars code to barry,
so all he has to do is to check it in ;-)

</F>




From gward@python.net  Wed Jun 19 21:33:32 2002
From: gward@python.net (Greg Ward)
Date: Wed, 19 Jun 2002 16:33:32 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <20020619024806.GA7218@lilith.my-fqdn.de>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <15631.60841.28978.492291@anthem.wooz.org> <20020619024806.GA7218@lilith.my-fqdn.de>
Message-ID: <20020619203332.GA9758@gerg.ca>

On 19 June 2002, Gerhard Häring said:
> * Barry A. Warsaw <barry@zope.com> [2002-06-18 22:34 -0400]:
> > The problem is that unless your sysadmin hacks ld.so.conf to add
> > /usr/local/BerkeleyDB.X.Y/lib onto your standard ld run path,
> > bsddbmodule.so won't be linked in such a way that it can actually
> > resolve the symbols at run time.
> > [...]
> > os.environ['LD_RUN_PATH'] = dblib_dir
> 
> I may be missing something here, but AFAIC that's what the library_dirs
> parameter in the Extension constructor of distutils is for. It basically
> sets the runtime library path at compile time using the "-R" linker
> option.

No, library_dirs is for good old -L.  AFAIK it works fine.

For -R (or equivalent) you need runtime_library_dirs.  I'm not sure if
it works (or ever did).  I think it's a question of knowing what magic
options to supply to each compiler.  Probably it works (worked) on
Solaris, since for once Sun got things right and supplied a simple,
obvious, working command-line option -- namely -R.
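
For illustration, a setup.py Extension entry that asks for both would
look something like this (the paths here are made up):

    from distutils.core import Extension

    bsddb_ext = Extension('bsddb',
                          sources=['Modules/bsddbmodule.c'],
                          libraries=['db'],
                          # -L: where to find the library at link time
                          library_dirs=['/usr/local/BerkeleyDB.3.3/lib'],
                          # -R (or equivalent): where to find it at run time
                          runtime_library_dirs=['/usr/local/BerkeleyDB.3.3/lib'])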

        Greg
-- 
Greg Ward - programmer-at-large                         gward@python.net
http://starship.python.net/~gward/
Jesus Saves -- and you can too, by redeeming these valuable coupons!



From guido@python.org  Wed Jun 19 21:37:28 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 16:37:28 -0400
Subject: [Python-Dev] test_socket failure on FreeBSD
In-Reply-To: Your message of "Wed, 19 Jun 2002 22:42:14 +1100."
 <Pine.OS2.4.32.0206192227090.74-100000@tenring.andymac.org>
References: <Pine.OS2.4.32.0206192227090.74-100000@tenring.andymac.org>
Message-ID: <200206192037.g5JKbSj03086@pcp02138704pcs.reston01.va.comcast.net>

> Below is the output of test_socket with the -v option, from a CVS tree of
> about 1915 UTC June 18.  FreeBSD 4.4, gcc 2.95.3 (-g -O3).
> 
> In speaking up now, I'm making the assumption that the non-blocking socket
> changes should be complete, modulo bugfixes.  If this is not the case,
> please let me know, and I'll wait for the situation to stabilise.

This is supposed to work, there's a missing feature but it's not being
tested yet. :-)

> Otherwise, is there any more info I can (attempt to) provide?  I tried
> "print"ing the addr variable when running the test, and just get "ERROR"
> (sans quotes of course).

Try

    print "\n" + repr(addr)

There are probably some differences in the socket semantics.  I'd
appreciate it if you could provide a patch or at least a clue!

(I'll be away from Friday through July 8.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gward@python.net  Wed Jun 19 21:40:17 2002
From: gward@python.net (Greg Ward)
Date: Wed, 19 Jun 2002 16:40:17 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <20020619124604.GB31653@ute.mems-exchange.org>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org>
Message-ID: <20020619204017.GB9758@gerg.ca>

On 19 June 2002, Andrew Kuchling said:
> It could live in the new text module, where Greg Ward's word-wrapping
> code will be going.  +1 on /F's suggestion of recycling the
> os.path.expandvars() code.

No, that's already checked in as textwrap.py.

        Greg



From fredrik@pythonware.com  Wed Jun 19 21:40:40 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 19 Jun 2002 22:40:40 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCGEFMPOAA.tim@zope.com> <01f101c217d0$4e199110$ced241d5@hagrid>
Message-ID: <022901c217d1$a2ab6360$ced241d5@hagrid>

> I'm pretty sure my plan was to change *path.expandvars to
> 
>     def expandvars(string):
>         return string.expandvars(string, os.environ)

should of course have been:

     def expandvars(string):
         return text.expandvars(string, os.environ)
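
For the curious, a rough sketch of what an SRE-based expandvars could
look like (only an illustration, not the code that went to Barry):

    import re

    _var = re.compile(r"\$(?:(\w+)|\{(\w+)\})")

    def expandvars(text, env):
        # expand $name and ${name}; leave unknown names untouched
        def repl(m):
            name = m.group(1) or m.group(2)
            return str(env.get(name, m.group(0)))
        return _var.sub(repl, text)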

</F>




From martin@v.loewis.de  Wed Jun 19 21:43:01 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Jun 2002 22:43:01 +0200
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <20020619182141.GA18944@zot.electricrain.com>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <m3660fhm3g.fsf@mira.informatik.hu-berlin.de>
 <20020619182141.GA18944@zot.electricrain.com>
Message-ID: <m34rfzrodm.fsf@mira.informatik.hu-berlin.de>

"Gregory P. Smith" <greg@electricrain.com> writes:

> If I hadn't made the initial mistake of naming pybsddb's module bsddb3
when I first extended Robin's berkeleydb 2.x module to work with 3.0, I
> would agree with that name.  I worry that having a module named bsddb2
> might cause endless confusion as bsddb and bsddb3 already exist and did
> correlate to the version number.  How about 'berkeleydb'?

What does it have to do with the city of Berkeley (CA)? Perhaps
"sleepycat"?

Regards,
Martin




From guido@python.org  Wed Jun 19 21:52:58 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 16:52:58 -0400
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: Your message of "19 Jun 2002 22:43:01 +0200."
 <m34rfzrodm.fsf@mira.informatik.hu-berlin.de>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <m3660fhm3g.fsf@mira.informatik.hu-berlin.de> <20020619182141.GA18944@zot.electricrain.com>
 <m34rfzrodm.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200206192052.g5JKqw803249@pcp02138704pcs.reston01.va.comcast.net>

> What does it have to do with the city of Berkeley (CA)? Perhaps
> "sleepycat"?

The company Sleepycat calls this particular product Berkeley DB,
that's enough reason for me.  They (may) have other products too, so
Sleepycat is not sufficiently distinctive.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Wed Jun 19 21:49:49 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Jun 2002 22:49:49 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15632.52766.822003.689689@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
 <15632.52766.822003.689689@anthem.wooz.org>
Message-ID: <m3znxrq9ht.fsf@mira.informatik.hu-berlin.de>

barry@zope.com (Barry A. Warsaw) writes:

> Really?  You know the path for the -R/--rpath flag, so all you need is
> the magic compiler-specific incantation, and distutils already (or
> /should/ already) know that.

Yes, but you don't know whether usage of -R is appropriate. If the
installed library is static, -R won't be needed. If then the target
directory recorded with -R happens to be on an unavailable NFS server
at run-time (on a completely different network), you cannot import the
library module anymore, which would otherwise work perfectly fine.

We had big problems with recorded library directories over the years;
at some point, the administrators decided to take the machine that had
/usr/local/lib/gcc-lib/sparc-sun-solaris2.3/2.5.8 on it offline. They
did not know that they would thus make vim inoperable, which happened
to be compiled with LD_RUN_PATH pointing to that directory - even
though no library was ever needed from that directory.

> I disagree.  While the sysadmin should probably fiddle with
> /etc/ld.so.conf when he installs BerkeleyDB, it's not documented in
> the Sleepycat docs, so it's entirely possible that they haven't done
> it.  

I'm not asking the administrator to fiddle with ld.so.conf. Instead,
I'm asking the administrator to fiddle with Modules/Setup.

> Is there some specific fear you have about compiling in the run-path?

Yes, see above.

> Note I'm not saying setting LD_RUN_PATH is the best approach, but it
> seemed like the most portable.  I couldn't figure out if distutils
> knew what the right compiler-specific switches are (i.e. "-R dir" on
> Solaris cc if memory serves, and "-Xlinker -rpath -Xlinker dir" for
> gcc, and who knows what for other Unix or <gasp> Windows compilers).

LD_LIBRARY_PATH won't work for Windows compilers, either. To my
knowledge, there is nothing equivalent on Windows.

Regards,
Martin




From skip@pobox.com  Wed Jun 19 21:52:26 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 19 Jun 2002 15:52:26 -0500
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15632.61194.588186.196532@localhost.localdomain>

    Barry> Here's a small patch to setup.py which should fix things in a
    Barry> portable way, at least for *nix systems.  It sets the envar
    Barry> LD_RUN_PATH to the location that it found the Berkeley library,
    Barry> but only if that envar isn't already set.

    Martin> I dislike that change. Setting LD_RUN_PATH is the jobs of
    Martin> whoever is building the compiler, and should not be done by
    Martin> Python automatically.

Agreed.  Also, is LD_RUN_PATH widely available?

    Martin> If setup.py fails to build an extension correctly, it is the
    Martin> administrator's job to specify a correct build procedure in
    Martin> Modules/Setup. For that reason, I rather recommend to remove the
    Martin> magic that setup.py looks in /usr/local/Berkeley*, instead of
    Martin> adding more magic.

I'm happy with the current setup.  While the /usr/local/BerkeleyN.M location
is a bit odd, Sleepycat is pretty consistent in this regard.  (At least
versions 3 and 4 install this way.)  I'd rather require sysadmins to run
ldconfig or its equivalent.

Most of the time people install packages using the default locations.  In
those situations where they don't, distutils accepts a couple environment
variables which specify alternate search directories for libraries and
include files.  Their names escape me at the moment, and I'm not sure they
accept the usual colon-separated list of directories.  If they don't, they
should be suitably modified.  It should probably be easy to specify these
through some configure command line args.

Skip




From skip@pobox.com  Wed Jun 19 22:10:24 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 19 Jun 2002 16:10:24 -0500
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15631.60841.28978.492291@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
Message-ID: <15632.62272.946354.832044@localhost.localdomain>

    BAW> I'm still having build trouble on my RH6.1 system, but maybe it's
    BAW> just too old to worry about (I /really/ need to upgrade one of
    BAW> these days
    BAW> ;).

    BAW> -------------------- snip snip --------------------
    BAW> building 'bsddb' extension
    BAW> gcc -g -Wall -Wstrict-prototypes -fPIC -DHAVE_DB_185_H=1 -I/usr/local/BerkeleyDB.3.3/include -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/bsddbmodule.c -o build/temp.linux-i686-2.3/bsddbmodule.o
    BAW> In file included from /home/barry/projects/python/Modules/bsddbmodule.c:25:
    BAW> /usr/local/BerkeleyDB.3.3/include/db_185.h:171: parse error before `*'
    BAW> /usr/local/BerkeleyDB.3.3/include/db_185.h:171: warning: type defaults to `int' in declaration of `__db185_open'
    BAW> /usr/local/BerkeleyDB.3.3/include/db_185.h:171: warning: data definition has no type or storage class
    BAW> /home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbhashobject':
    BAW> /home/barry/projects/python/Modules/bsddbmodule.c:74: warning: assignment from incompatible pointer type
    BAW> /home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbbtobject':
    BAW> /home/barry/projects/python/Modules/bsddbmodule.c:124: warning: assignment from incompatible pointer type
    BAW> /home/barry/projects/python/Modules/bsddbmodule.c: In function `newdbrnobject':
    BAW> /home/barry/projects/python/Modules/bsddbmodule.c:182: warning: assignment from incompatible pointer type
    BAW> -------------------- snip snip --------------------

I think you might have to define another CPP macro.  In my post from last
night about building dbmmodule.c I included

                                       define_macros=[('HAVE_BERKDB_H',None),
                                                      ('DB_DBM_HSEARCH',None)],

in the Extension constructor.  Maybe DB_DBM_HSEARCH is also needed for older
bsddb?  I have no trouble building though.

Skip




From skip@pobox.com  Wed Jun 19 22:15:16 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 19 Jun 2002 16:15:16 -0500
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15632.62564.638418.191453@localhost.localdomain>

    BAW> I still think we may want to pull PyBSDDB into the standard distro,
    BAW> as a way to provide BDB api's > 1.85.  The question is, what would
    BAW> this new module be called?  I dislike "bsddb3" -- which I think
    BAW> PyBSDDB itself uses -- because it links against BDB 4.0.

    Guido> Good idea.  Maybe call it berkeleydb?  That's what Sleepycat
    Guido> calls it (there's no connection with the BSD Unix distribution
    Guido> AFAICT).

Why can't it just be called bsddb?  As far as I could tell, it provides
a bsddb-compatible interface at the module level.  The only change at the
bsddb level is the addition of an extra object (db?  I can't recall right
now and have to get offline soon for the credit card machine so I can't
pause to check ;-) which gives the programmer access to all the PyBSDDB
magic.

Skip



From skip@pobox.com  Wed Jun 19 22:19:30 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 19 Jun 2002 16:19:30 -0500
Subject: [Python-Dev] Re: making dbmmodule still broken
In-Reply-To: <200206191249.g5JCnM001518@pcp02138704pcs.reston01.va.comcast.net>
References: <15631.58711.213506.701945@localhost.localdomain>
 <15631.63833.440127.405556@anthem.wooz.org>
 <200206191249.g5JCnM001518@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15632.62818.352443.789431@localhost.localdomain>

    SM> I think it would probably be a good idea to alert the person
    SM> running make what library the module will be linked with.
    SM> Anyone else agree?

    BAW> +1.  The less guessing the builder has to do the better!

    Guido> Just don't start asking questions and reading answers from stdin.

Agreed.  If necessary, I would recommend adding an option to configure.

Skip



From Donald Beaudry <donb@abinitio.com>  Wed Jun 19 22:23:47 2002
From: Donald Beaudry <donb@abinitio.com> (Donald Beaudry)
Date: Wed, 19 Jun 2002 17:23:47 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com> <3D100CA9.999E7B3F@prescod.net> <200206191254.g5JCsu301574@pcp02138704pcs.reston01.va.comcast.net> <3D10B496.67B19898@prescod.net> <200206192023.g5JKN3K02971@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200206192123.g5JLNlS25542@zippy.abinitio.com>

Guido van Rossum <guido@python.org> wrote,
> I think little would be lost if sub() always required a dict (or
> perhaps keyword args, although that feels like a YAGNI now).

Requiring the dict sounds about right to me.  But, now when you
consider that the sub() being discussed is little more than

  import re

  def sub(s, **kws):
      return re.sub(r"\${\w+}", lambda m, d=kws: d[m.group(0)[2:-1]], s)

  print sub("this is ${my} way to ${sub}", my="Don's", sub="do it")

you just have to wonder what the fuss is really all about.  Ease of
use seems to be the issue.  Should this variant of sub() just be added
to the re module?  With such a friendly introduction, it might coax
new users into looking deeper into the power of re.

There seems to be another issue though: the default value for the
substitution dictionary and whether a KeyError or NameError should be
raised when a key doesn't exist.  Why not just define a new mapping
object, returned from a call to namespace() that behaves something
like this (bogus implementation):

  class namespace:
      def __getitem__(s, k):
          return eval(k)

Then,

  print sub("this is ${my} way to ${sub}", **namespace())

Should do the right thing.  The fun here is that the namespace()
mechanism would be available for further abuse.  I see no reason to
lock it up inside a string interpolation function.  Consideration
should even be given to allowing a frame index argument to be passed
to it.  So,

  def sub(s, **kws):
      if not kws:
          kws = namespace(-1)
      return re.sub(r"\${\w+}", lambda m, d=kws: d[m.group(0)[2:-1]], s)

would do the complete job.  But that might be too much like upvar ;)
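
For concreteness, a working namespace() could look something like this
(just a sketch; the frame-depth handling is a guess):

  import sys

  class namespace:
      def __init__(self, depth=0):
          # capture the locals and globals of the frame that created us;
          # a non-zero depth reaches further up the call stack
          frame = sys._getframe(1 + abs(depth))
          self._locals = frame.f_locals
          self._globals = frame.f_globals
      def __getitem__(self, key):
          try:
              return self._locals[key]
          except KeyError:
              return self._globals[key]

  name = "Don"
  print "this is %(name)s's way" % namespace()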

--
Donald Beaudry                                     Ab Initio Software Corp.
                                                   201 Spring Street
donb@abinitio.com                                  Lexington, MA 02421
                          ...So much code...



From greg@electricrain.com  Wed Jun 19 22:25:59 2002
From: greg@electricrain.com (Gregory P. Smith)
Date: Wed, 19 Jun 2002 14:25:59 -0700
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15632.62564.638418.191453@localhost.localdomain>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain>
Message-ID: <20020619212559.GC18944@zot.electricrain.com>

On Wed, Jun 19, 2002 at 04:15:16PM -0500, Skip Montanaro wrote:
> 
>     BAW> I still think we may want to pull PyBSDDB into the standard distro,
>     BAW> as a way to provide BDB api's > 1.85.  The question is, what would
>     BAW> this new module be called?  I dislike "bsddb3" -- which I think
>     BAW> PyBSDDB itself uses -- because it links against BDB 4.0.
> 
>     Guido> Good idea.  Maybe call it berkeleydb?  That's what Sleepycat
>     Guido> calls it (there's no connection with the BSD Unix distribution
>     Guido> AFAICT).
> 
> Why can't it just be called bsddb?  As far as I could tell, it provides
> a bsddb-compatible interface at the module level.  The only change at the
> bsddb level is the addition of an extra object (db?  I can't recall right
> now and have to get offline soon for the credit card machine so I can't
> pause to check ;-) which gives the programmer access to all the PyBSDDB
> magic.
> 
> Skip

Modern berkeleydb uses much different on-disk database formats; glancing
at the docs on sleepycat.com, I don't think it can even read bsddb (1.85)
files.  Existing code using bsddb (1.85) should not automatically start
using a different database library even if we provide a compatibility
interface.  That upgrade can be done manually in code using:

import berkeleydb
bsddb = berkeleydb

(And creating a single bsddb module that used the old 1.85 library for the
old interface and the 3.3/4.0 library for the modern interface would add
bloat to the many applications that don't need both -- if it were even
possible to link them in such a way as to avoid the symbol conflicts.)

Greg

-- 
Some mistakes are too much fun to make only once.



From jeremy@zope.com  Wed Jun 19 22:26:27 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Wed, 19 Jun 2002 17:26:27 -0400
Subject: [Python-Dev] PEP 8: Lists/tuples
In-Reply-To: <200206170023.g5H0NxC00733@pcp02138704pcs.reston01.va.comcast.net>
References: <20020616234555.GA3415@panix.com>
 <200206170023.g5H0NxC00733@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15632.63235.580901.707844@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  >> (I'm -1 myself, but I'd like to know what to tell my class.)

  GvR> Like it or not, that's what tuples are for. :-)

That and storing homogeneous lists in code objects and base classes and
function closures and ...

Jeremy




From martin@v.loewis.de  Wed Jun 19 22:41:53 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Jun 2002 23:41:53 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15632.62564.638418.191453@localhost.localdomain>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
 <15632.62564.638418.191453@localhost.localdomain>
Message-ID: <m3bsa7q732.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> Why can't it just be called bsddb?

If full compatibility is guaranteed, I'm all for it.

Regards,
Martin



From tdelaney@avaya.com  Thu Jun 20 01:31:53 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Thu, 20 Jun 2002 10:31:53 +1000
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A3A1@natasha.auslabs.avaya.com>

> From: barry@zope.com [mailto:barry@zope.com]
>
> >>>>> "MS" == Martin Sjögren <martin@strakt.com> writes:
>
>     MS> What's the advantage of using ${name} and ${country} instead?
>
> There's a lot of empirical evidence that %(name)s is quite error
> prone.

Perhaps an unadorned %(name) should default to %(name)s?

Tim Delaney



From tdelaney@avaya.com  Thu Jun 20 01:35:01 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Thu, 20 Jun 2002 10:35:01 +1000
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A3A2@natasha.auslabs.avaya.com>

> From: barry@zope.com [mailto:barry@zope.com]
> 
> >>>>> "GvR" == Guido van Rossum <guido@python.org> writes:
> 
>     GvR> That's a matter of validating the template before accepting
>     GvR> it.
> 
> True, which isn't hard to do.  You can write a regexp to extract the
> $names and then validate those.  In fact, I think this is what newer
> versions of xgettext do for Python code (albeit with the %(name)s
> syntax).

Of course, once you've validated a string, it's almost no extra work to do
the interpolation in place (at least in source code size).

Tim Delaney



From trentm@ActiveState.com  Thu Jun 20 01:39:42 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Wed, 19 Jun 2002 17:39:42 -0700
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A3A1@natasha.auslabs.avaya.com>; from tdelaney@avaya.com on Thu, Jun 20, 2002 at 10:31:53AM +1000
References: <B43D149A9AB2D411971300B0D03D7E8BF0A3A1@natasha.auslabs.avaya.com>
Message-ID: <20020619173942.B28839@ActiveState.com>

[Delaney, Timothy wrote]
> > From: barry@zope.com [mailto:barry@zope.com]
> >
> > >>>>> "MS" == Martin Sjögren <martin@strakt.com> writes:
> >
> >     MS> What's the advantage of using ${name} and ${country} instead?
> >
> > There's a lot of empirical evidence that %(name)s is quite error
> > prone.
>
> Perhaps an unadorned %(name) should default to %(name)s?

Or:
- get pychecker2 working (the one that does not need to import modules
  that it checks, I *think* that that is one of the pychecker2 features)
- get PyChecker in the core
- provide a python flag to load the pychecker import hook to check your
  code when running it (say, '-w')
- have PyChecker warn about "%(name)"-sans-formatting-character
  instances in strings (if it does not already).

Trent

-- 
Trent Mick
TrentM@ActiveState.com



From guido@python.org  Thu Jun 20 01:50:11 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 20:50:11 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 17:39:42 PDT."
 <20020619173942.B28839@ActiveState.com>
References: <B43D149A9AB2D411971300B0D03D7E8BF0A3A1@natasha.auslabs.avaya.com>
 <20020619173942.B28839@ActiveState.com>
Message-ID: <200206200050.g5K0oBv03714@pcp02138704pcs.reston01.va.comcast.net>

> > > There's a lot of empirical evidence that %(name)s is quite error
> > > prone.
> > 
> > Perhaps an unadorned %(name) should default to %(name)s?

Ambiguous, hence even more error-prone.

> Or:
> - get pychecker2 working (the one that does not need to import modules
>   that it checks, I *think* that that is one of the pychecker2 features)
> - get PyChecker in the core
> - provide a python flag to load the pychecker import hook to check your
>   code when running it (say, '-w')
> - have PyChecker warn about "%(name)"-sans-formatting-character
>   instances in strings (if it does not already).

I'd rather have a notation that's less error-prone than a better way
to check for errors.  (Not that PyChecker 2 isn't a great idea. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Donald Beaudry <donb@abinitio.com>  Thu Jun 20 01:53:38 2002
From: Donald Beaudry <donb@abinitio.com> (Donald Beaudry)
Date: Wed, 19 Jun 2002 20:53:38 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <B43D149A9AB2D411971300B0D03D7E8BF0A3A1@natasha.auslabs.avaya.com>
Message-ID: <200206200053.g5K0rch27979@zippy.abinitio.com>

"Delaney, Timothy" <tdelaney@avaya.com> wrote,
> > From: barry@zope.com [mailto:barry@zope.com]
> > 
> > >>>>> "MS" == Martin Sjögren <martin@strakt.com> writes:
> > 
> >     MS> What's the advantage of using ${name} and ${country} instead?
> > 
> > There's a lot of empirical evidence that %(name)s is quite error
> > prone.
> 
> Perhaps an unadorned %(name) should default to %(name)s?

The problem is knowing that it's unadorned.  Now an unadorned %{name}
could be interpreted as %(name)s.

--
Donald Beaudry                                     Ab Initio Software Corp.
                                                   201 Spring Street
donb@abinitio.com                                  Lexington, MA 02421
                         ...So little time...





From joe@notcharles.ca  Wed Jun 19 22:37:59 2002
From: joe@notcharles.ca (Joe Mason)
Date: Wed, 19 Jun 2002 16:37:59 -0500
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206191252.g5JCqcQ01558@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.LNX.4.44.0206182131580.17350-100000@ziggy> <200206191252.g5JCqcQ01558@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020619213759.GA6092@plover.net>

On Wed, Jun 19, 2002 at 08:52:38AM -0400, Guido van Rossum wrote:
> I imagine that the most common use case is a situation where the dict
> is already prepared.  I think **dict is slower than a positional dict
> argument.  I agree that keyword args would be useful in some cases
> where you can't trust the string.

As Barry noted, this isn't as powerful as PEP 215 (er, was that the
right number?  The earlier $interpolation one, anyway) because it
doesn't allow arbitrary expressions.  I'd imagine a common use case
would be to shortcut an expression without binding it to a local
variable,

  "The length is ${length}".sub(length = len(someString))

In this case it would be handy to use the default environment overridden
by the new bindings, so you could do
  
  "The length of ${someString} is ${length}".sub(length =
    len(someString))

But that could get messy real fast.  The idiom could be

  "The length of ${someString} is ${length}".sub(someString =
    someString, length = len(someString))

But that's ugly.

Joe



From guido@python.org  Thu Jun 20 02:06:27 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 21:06:27 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 16:37:59 CDT."
 <20020619213759.GA6092@plover.net>
References: <Pine.LNX.4.44.0206182131580.17350-100000@ziggy> <200206191252.g5JCqcQ01558@pcp02138704pcs.reston01.va.comcast.net>
 <20020619213759.GA6092@plover.net>
Message-ID: <200206200106.g5K16RD03949@pcp02138704pcs.reston01.va.comcast.net>

> As Barry noted, this isn't as powerful as PEP 215 (er, was that the
> right number?  The earlier $interpolation one, anyway) because it
> doesn't allow arbitrary expressions.

That's intentional.  Trying to put an expression parser in the
interpolation code quickly leads to insanity.

> I'd imagine a common use case
> would be to shortcut an expression without binding it to a local
> variable,
> 
>   "The length is ${length}".sub(length = len(someString))
> 
> In this case it would be handy to use the default environment overridden
> by the new bindings, so you could do
>   
>   "The length of ${someString} is ${length}".sub(length =
>     len(someString))
> 
> But that could get messy real fast.  The idiom could be
> 
>   "The length of ${someString} is ${length}".sub(someString =
>     someString, length = len(someString))
> 
> But that's ugly.

I think you're simply trying to do too much in one line.  Simple is
better than complex.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tdelaney@avaya.com  Thu Jun 20 02:07:05 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Thu, 20 Jun 2002 11:07:05 +1000
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A3A4@natasha.auslabs.avaya.com>

> From: Guido van Rossum [mailto:guido@python.org]
> 
> > > > There's a lot of empirical evidence that %(name)s is quite error
> > > > prone.
> > > 
> > > Perhaps an unadorned %(name) should default to %(name)s?
> 
> Ambiguous, hence even more error-prone.

Fair enough. I couldn't off the top of my head think of an ambiguous case,
but of course there's

'%(thing)s'ly-yours' ...

PyChecker2 should *definitely* include checking format strings IMO,
irrespective of whether $ formatting gets in. But only as a warning (because
of the above case).
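
A deliberately simplistic sketch of such a check (it flags %(name) not
immediately followed by a conversion character, so legal forms using
flags or a width would also trip it -- which is exactly why it can only
be a warning):

    import re

    _suspect = re.compile(r"%\(\w+\)(?![diouxXeEfFgGcrs%])")

    def warn_about_formats(s):
        for m in _suspect.finditer(s):
            print "suspicious format near column %d in %r" % (m.start(), s)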

Tim Delaney



From neal@metaslash.com  Thu Jun 20 02:17:27 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Wed, 19 Jun 2002 21:17:27 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <B43D149A9AB2D411971300B0D03D7E8BF0A3A4@natasha.auslabs.avaya.com>
Message-ID: <3D112D27.BC846E5A@metaslash.com>

"Delaney, Timothy" wrote:

> PyChecker2 should *definitely* include checking format strings IMO,
> irrespective of whether $ formatting gets in. But only as a warning (because
> of the above case).

It does, but there are a few bugs in it right now.

Neal



From tim.one@comcast.net  Thu Jun 20 02:39:35 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 19 Jun 2002 21:39:35 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A3A4@natasha.auslabs.avaya.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEIFPOAA.tim.one@comcast.net>

[Delaney, Timothy]
> Fair enough. I couldn't off the top of my head think of an ambiguous case,

The difficulty is that a blank is a kosher flag modifier in C formats.  So,
e.g.,

    'goofy%(name) dogs'

is a legitimate Python format as is, and

>>> print 'goofy%(name) dogs' % {'name': 666}
goofy 666ogs
>>>

works correctly.

Why *Python* follows these goofy rules in all respects is a question we
don't ask <wink>.




From paul@prescod.net  Thu Jun 20 02:58:26 2002
From: paul@prescod.net (Paul Prescod)
Date: Wed, 19 Jun 2002 18:58:26 -0700
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org>
 <3D10C2EE.CE833DB7@prescod.net> <200206192027.g5JKReA03020@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D1136C2.1A7E0D47@prescod.net>

Guido van Rossum wrote:
> 
>...
> Though I doubt that string % is taught in hour two -- you can do
> everything you want with str() and string concatenation, both of which
> *are* taught in hour two.  

If there were an easy way to do interpolation I might well want to teach
it before any of str() or string concatenation. And I would probably
treat it in preference to the magic and special "," operator of the
print statement. I prefer to teach something that is generally useful
like $ rather than something which they may have to unlearn like "," --
unlearn to the extent that they will naturally expect that commas in
other contexts will do whitespace-generating concatenation and they
hardly ever will.

 Paul Prescod



From guido@python.org  Thu Jun 20 03:15:58 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 22:15:58 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 18:58:26 PDT."
 <3D1136C2.1A7E0D47@prescod.net>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <200206192027.g5JKReA03020@pcp02138704pcs.reston01.va.comcast.net>
 <3D1136C2.1A7E0D47@prescod.net>
Message-ID: <200206200215.g5K2FwJ04359@pcp02138704pcs.reston01.va.comcast.net>

> If there were an easy way to do interpolation I might well want to
> teach it before any of str() or string concatenation.

I'm afraid your students would end up appending a character c to a
string s by writing

    s = "$s$c".sub()

Not exactly good style.

> And I would probably treat it in preference to the magic and special
> "," operator of the print statement.

I object to this insinuation.

> I prefer to teach something that is generally useful like $ rather
> than something which they may have to unlearn like "," -- unlearn to
> the extent that they will naturally expect that commas in other
> contexts will do whitespace-generating concatenation and they hardly
> ever will.

(a) You're making this argument up.  I don't believe for a second that
    you've observed this mistake in an actual student.

(b) I expect that students never even *think* about the space between
    printed items -- it's entirely natural.

(c) Commas are designed to "disappear" in our interpretation of
    things, and they do.  The comma has so many uses where whitespace
    generation is just not one of the things you could possibly think
    about that I find it hard to take this argument seriously.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pinard@iro.umontreal.ca  Thu Jun 20 03:20:49 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 19 Jun 2002 22:20:49 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca>
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca>
Message-ID: <oq3cviitby.fsf@titan.progiciels-bpi.ca>

[Barry A. Warsaw]

>     This PEP describes a simpler string substitution feature, also
>     known as string interpolation.  This PEP is "simpler" in two
>     respects:

>     1. Python's current string substitution feature (commonly known as
>        %-substitutions) is complicated and error prone.  This PEP is
>        simpler at the cost of less expressiveness.

>     2. PEP 215 proposed an alternative string interpolation feature,
>        introducing a new `$' string prefix.  PEP 292 is simpler than
>        this because it involves no syntax changes and has much simpler
>        rules for what substitutions can occur in the string.

For one, I do not like seeing `$' as a string prefix in Python, and wonder
if we could not merely go with `%' as we always did in Python.  At least,
it keeps a kind of clear cut distance between Python and Perl. :-)

>     In addition, the rules for what can follow a % sign are fairly
>     complex, while the usual application rarely needs such complexity.

This premise seems exaggerated to me.  `%' as it stands is not that
complex to understand.  Moreover, many of us use `%' formatting a lot,
so it is not so rare that the current `%' specification is useful.

>     1. $$ is an escape; it is replaced with a single $

Let's suppose we stick with `%', the above rule reduces to something
already known.

>     3. ${identifier} [...]

We could use %{identifier} as meaning `%(identifier)s'.  Clean.  Simple.

>     2. $identifier [...]

This is where the difficulty lies.  Since the PEP already suggests that
${identifier} was to be preferred over $identifier, why not just go a bit
forward, and drop 2. altogether?  Or else, how do you justify that using
it really makes things more legible?

Then, the whole proposal would reduce to adding %{identifier}, and instead
of having `.sub()' methods or whatever, just stick with what we already have.

This would be a mild change instead of a whole new feature, and keep Python
a little more wrapped to itself.  Interpolation proposals I've seen always
looked a bit awkward and foreign so far.

I guess that merely adding %{identifier} would wholly satisfy the given
justifications for the PEP (that is, giving people a means to avoid
the error-prone %()s), with a minimal impact on the current Python
definition, and a bit less of a surprise.  Python does not have to look
like Perl to be useful, you know! :-)

> Handling Missing Keys

This would be a non-issue, by the fact that %(identifier)s behaviour,
for undefined identifier, is already what we want.

> The mapping argument is optional; if it is omitted then the mapping is
> taken from the locals and globals of the context in which the .sub()
> method is executed.

This is an interesting idea.  However, there are other contexts where the
concept of a compound dictionary of all globals and locals would be useful.
Maybe we could have some allvars() similar to globals() and locals(),
and use `... % allvars()' instead of `.sub()'?  So this would serve both
string interpolation and other avenues.
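
Such an allvars() could be tiny; a sketch (sys._getframe is the only
non-obvious part):

    import sys

    def allvars():
        # the caller's globals, overridden by the caller's locals
        frame = sys._getframe(1)
        d = frame.f_globals.copy()
        d.update(frame.f_locals)
        return d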

I hope I have managed to express my feeling that we should try to keep
string interpolation natural to what Python already is.  We should not
carelessly multiply paradigms.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From guido@python.org  Thu Jun 20 03:30:59 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 22:30:59 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "19 Jun 2002 22:20:49 EDT."
 <oq3cviitby.fsf@titan.progiciels-bpi.ca>
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca>
 <oq3cviitby.fsf@titan.progiciels-bpi.ca>
Message-ID: <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net>

> For one, I do not like seeing `$' as a string prefix in Python, and
> wonder if we could not merely go with `%' as we always did in
> Python.  At least, it keeps a kind of clear cut distance between
> Python and Perl. :-)

The $ means "substitution" in so many languages besides Perl that I
wonder where you've been.

> >     In addition, the rules for what can follow a % sign are fairly
> >     complex, while the usual application rarely needs such complexity.
> 
> This premise seems exaggerated to me.  `%' as it stands is not that
> complex to understand.  Moreover, many of us use `%' formatting a lot,
> so it is not so rare that the current `%' specification is useful.

I quite like the positional % substitution.  I think %(...)s was a
mistake -- what we really wanted was ${...}.

> >     1. $$ is an escape; it is replaced with a single $
> 
> Let's suppose we stick with `%', the above rule reduces to something
> already known.
> 
> >     3. ${identifier} [...]
> 
> We could use %{identifier} as meaning `%(identifier)s'.  Clean.  Simple.

Confusing.  The visual difference between () and {} is too small.

> >     2. $identifier [...]
> 
> This is where the difficulty lies.  Since the PEP already suggests that
> ${identifier} was to be preferred over $identifier, why not just go a bit
> forward, and drop 2. altogether?  Or else, how do you justify that using
> it really make things more legible?

Less clutter.  Compare

    "My name is $name, I live in $country"

to

    "My name is ${name}, I live in ${country}"

The {} add nothing but noise.  We're copying this from the shell.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pinard@iro.umontreal.ca  Thu Jun 20 03:43:53 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 19 Jun 2002 22:43:53 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net>
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca>
 <oq3cviitby.fsf@titan.progiciels-bpi.ca>
 <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oqsn3ihdp2.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> The $ means "substitution" in so many languages besides Perl that I
> wonder where you've been.

Of course, I've been elsewhere.  But Python currently uses `%' for driving
interpolation, and on this topic, I've been with Python, if you wonder :-).

> I quite like the positional % substitution.  I think %(...)s was a
> mistake -- what we really wanted was ${...}.

The distinction between %()s and %()r, recently introduced, has been useful.
But with str() and repr(), only one of those is really necessary.  But it
gave the impression that the Python trend is pushing for % to get stronger.
The proposal of using $ as yet another formatting avenue makes it weaker.

> Less clutter.  Compare

>     "My name is $name, I live in $country"

> to

>     "My name is ${name}, I live in ${country}"

> The {} add nothing but noise.  We're copying this from the shell.

Noise decreases legibility.  So, maybe the PEP should not say that ${name}
is to be preferred over $name?  Or else, it should explain why.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From guido@python.org  Thu Jun 20 03:50:51 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 22:50:51 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "19 Jun 2002 22:43:53 EDT."
 <oqsn3ihdp2.fsf@titan.progiciels-bpi.ca>
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> <oq3cviitby.fsf@titan.progiciels-bpi.ca> <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net>
 <oqsn3ihdp2.fsf@titan.progiciels-bpi.ca>
Message-ID: <200206200250.g5K2opM04538@pcp02138704pcs.reston01.va.comcast.net>

> The distinction between %()s and %()r, recently introduced, has been
> useful.  But with str() and repr(), only one of those is really
> necessary.  But it gave the impression that the Python trend is pushing
> for % to get stronger.  The proposal of using $ as yet another
> formatting avenue makes it weaker.

Language evolution doesn't always go into a straight line.

> > Less clutter.  Compare
> 
> >     "My name is $name, I live in $country"
> 
> > to
> 
> >     "My name is ${name}, I live in ${country}"
> 
> > The {} add nothing but noise.  We're copying this from the shell.
> 
> Noise decreases legibility.  So, maybe the PEP should not say that
> ${name} is to be preferred over $name?  Or else, it should explain
> why.

I agree that I see no reason to prefer ${name} (except when followed
by another word character of course).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Thu Jun 20 03:53:32 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 22:53:32 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
 <20020619075121.GB25541@hishome.net>
 <20020619083311.GA1011@ratthing-b3cf>
 <09342475030690@aluminium.rcp.co.uk>
 <20020619124604.GB31653@ute.mems-exchange.org>
Message-ID: <15633.17324.467335.416736@anthem.wooz.org>

>>>>> "AK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:

    AK> (Maybe a syntax-checker for %(...) strings would solve
    AK> Mailman's problems, and alleviate the plaintive cries for an
    AK> alternative interpolation syntax?)

If I had to do it over again, I would have used $name in i18n source
strings from the start.  It would have saved lots of headaches and
broken translations.  People just seem to get $names whereas they get
%(name)s wrong too often.

(Little known MM2.1 fact: you can actually convert your headers and
footers to $name substitutions, but it's a hack.  Later, it might be
required.  One of the outgrowths of experimenting with this was to add
a %(name)s checker and now bogus names in the %(...)s are flagged as
errors, while missing trailing `s's are flagged as warnings and
auto-corrected.)

-Barry



From barry@zope.com  Thu Jun 20 04:00:52 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 23:00:52 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <20020619124604.GB31653@ute.mems-exchange.org>
 <LNBBLJKPBEHFEDALKOLCGEFMPOAA.tim@zope.com>
Message-ID: <15633.17764.219180.975870@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim@zope.com> writes:

    TP> -1 on that part: os.path.expandvars() is an ill-defined mess
    TP> (the core has more than one of them, varying by platform, and
    TP> what they do differs in platform-irrelevant ways).  +1 on
    TP> making Barry fix expandvars <wink>:

    TP>     http://www.python.org/sf/494589

Heck, I've never even /used/ expandvars (I think that's the first time
I've even typed that sequence of letters).  Plus if it works on Unix,
what more could you want?  I'd say Mr. Neal Odd Body is handling that
bug report just fine.

-Barry



From barry@zope.com  Thu Jun 20 04:13:55 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 23:13:55 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190330.g5J3Ubd30622@smtp1.ActiveState.com>
 <3D100CA9.999E7B3F@prescod.net>
 <200206191254.g5JCsu301574@pcp02138704pcs.reston01.va.comcast.net>
 <3D10B496.67B19898@prescod.net>
 <200206192023.g5JKN3K02971@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15633.18547.455980.465004@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> I think little would be lost if sub() always required a dict
    GvR> (or perhaps keyword args, although that feels like a YAGNI
    GvR> now).

IME, doing the locals/globals trick is really helpful, but I might be
willing to let go on that one, because I can wrap that functionality
in my _() function.

The reason for this is that I'd say 80-90% of the time, you have value
you want to interpolate into the string sitting around handy in a
local variable.  And that local variable has the name of the key in
the template string.  So what you (would) end up doing is:

    ...
    name = getNameSomehow()
    ...
    country = getCountryOfOrigin(name)
    ...
    return '$name was born in $country'.sub({'name': name,
                                             'country': country})

Do that a few hundred times and you start wanting to make that a lot
more concise. :)

    GvR> I think that the key thing here is to set the precedent of
    GvR> using $ and the specific syntax proposed, not necessarily to
    GvR> have this as a built-in string method.

I'll note that before this idea gained PEPhood, Guido and I discussed
using an operator, like:

    return '$name was born in $country' / dict

but came around to the current proposal's string method.  I agree with
Guido that it's the use of $strings that is important here, and I
don't care how the interpolation is actually done (builtin, string
method, etc.), though relegating it to a module would, I think, make
this a rarely used syntax.

-Barry



From pobrien@orbtech.com  Thu Jun 20 04:25:34 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Wed, 19 Jun 2002 22:25:34 -0500
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>

[Guido van Rossum]
> 
> I quite like the positional % substitution.  I think %(...)s was a
> mistake -- what we really wanted was ${...}.

What is the advantage of curly braces over parens in this context?

+1 on the allvars() suggestion also.

--
Patrick K. O'Brien
Orbtech
-----------------------------------------------
"Your source for Python software development."
-----------------------------------------------
Web:  http://www.orbtech.com/web/pobrien/ 
Blog: http://www.orbtech.com/blog/pobrien/ 
Wiki: http://www.orbtech.com/wiki/PatrickOBrien 
-----------------------------------------------




From guido@python.org  Thu Jun 20 04:32:37 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 23:32:37 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 22:25:34 CDT."
 <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
Message-ID: <200206200332.g5K3Wbj06062@pcp02138704pcs.reston01.va.comcast.net>

> > I quite like the positional % substitution.  I think %(...)s was a
> > mistake -- what we really wanted was ${...}.
> 
> What is the advantage of curly braces over parens in this context?

Apart from Make, most $ substituters use ${...}, not $(...).

> +1 on the allvars() suggestion also.

I have no idea what you are talking about. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Thu Jun 20 04:31:36 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 23:31:36 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca>
 <oq3cviitby.fsf@titan.progiciels-bpi.ca>
 <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net>
 <oqsn3ihdp2.fsf@titan.progiciels-bpi.ca>
Message-ID: <15633.19608.392087.60416@anthem.wooz.org>

>>>>> "FP" =3D=3D Fran=E7ois Pinard <pinard@iro.umontreal.ca> writes:

    FP> So, maybe the PEP should not say that ${name} is to be
    FP> preferred over $name?

Agreed.
-Barry



From barry@zope.com  Thu Jun 20 04:34:38 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 23:34:38 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca>
 <oq3cviitby.fsf@titan.progiciels-bpi.ca>
Message-ID: <15633.19790.152438.926329@anthem.wooz.org>

>>>>> "FP" =3D=3D Fran=E7ois Pinard <pinard@iro.umontreal.ca> writes:

    FP> However, there are other contexts where the concept of a
    FP> compound dictionary of all globals and locals would be useful.
    FP> Maybe we could have some allvars() similar to globals() and
    FP> locals(), and use `... % allvars()' instead of `.sub()'?  So
    FP> this would serve both string interpolation and other avenues.

Or maybe just make vars() do something more useful when no arguments
are given?

In any event, allvars() or a-different-vars() is out of scope for this
PEP.  We'd use it if it was there, but I think it needs its own PEP,
which someone else will have to champion.

-Barry



From damien.morton@acm.org  Thu Jun 20 04:35:32 2002
From: damien.morton@acm.org (Damien Morton)
Date: Wed, 19 Jun 2002 23:35:32 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
Message-ID: <006201c2180b$883c1120$72976c42@damien>

"I'd rather have a notation that's less error-prone than a better way
to check for errors.  (Not that PyChecker 2 isn't a great idea. :-)"

Percent notation "%s" notation already exists for strings.

Backquote notation is already in python, though I think, little used.

The $ notation reeks of obscure languages such as perl and shell.

Why not simply add backquote notation to Python strings?  I read in a
recent email from Timbot, I think, that the backquote notation was
originally intended for string interpolation too.

name = "guido"
country = "the netherlands"
height = 1.92

"`name` is from `country`".sub() -> "guido is from the netherlands"
"`name.capitalize()` is from `country`" -> "Guido is from the
netherlands"

"`name` is %`height`4.1f meters tall".sub() -> "guido is 1.9 meters
tall"

"`name.capitalize()` can jump `height*1.7` meters".sub() -> "guido can
jump 3.264 meters"

You could probably also compile these interpolation strings as well.

Another thought:

One of the main problems with the "%(name)4.2f" notation is that the
format comes after the variable name. It's easy to forget to add the
actual format specifier after the name.

Why not alter the notation to allow the format specifier to come before
the name part.

"%4.2f(height)" I think would be a whole lot less error prone, and would
allow for the format specifier to default to "s" where omitted.

"%(height)" is also less error prone, though it is ambiguous in the
current scheme.

I've seen a nice class which evaluates the string used in the name part.

class itpl:
	def __getitem__(self, s):
		return eval(s, globals())











From guido@python.org  Thu Jun 20 04:46:33 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 19 Jun 2002 23:46:33 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 23:35:32 EDT."
 <006201c2180b$883c1120$72976c42@damien>
References: <006201c2180b$883c1120$72976c42@damien>
Message-ID: <200206200346.g5K3kXV06637@pcp02138704pcs.reston01.va.comcast.net>

> The $ notation reeks of obscure languages such as perl and shell.

Sigh.  Please grow up.

> Why not simply add backquote notation to python strings. I read in a
> recent email from Timbot, I think,  that the backquote notation was
> originally intended for string interpolation too.

Unfortunately, backquotes are often hard to see, or mistaken for
forward quotes.  I think that disqualifies it.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Thu Jun 20 04:45:32 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 19 Jun 2002 23:45:32 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
 <200206200332.g5K3Wbj06062@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15633.20444.828514.232578@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> Apart from Make, most $ substituters use ${...}, not $(...).

GNU Make allows either braces or parentheses; there's no difference
between the two.  So it's a pretty strong precedent in lots of Unix
tools.  GNU Make also uses the $$ escape.

-Barry



From David Abrahams" <david.abrahams@rcn.com  Thu Jun 20 04:55:44 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 19 Jun 2002 23:55:44 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <006201c2180b$883c1120$72976c42@damien>
Message-ID: <0d3501c2180e$5b6c7b00$6601a8c0@boostconsulting.com>

From: "Damien Morton" <damien.morton@acm.org>

> "`name.capitalize()` can jump `height*1.7` meters".sub() -> "guido can
> jump 3.264 meters"

I love this suggestion. It's the sort of thing you can't do in C++ ;-)
I suspect the arguments against will run to efficiency and complexity,
since you need to compile the backquoted expressions (in some context).

Hmm, here they are... Nope, I'm wrong

-Dave





From pobrien@orbtech.com  Thu Jun 20 05:02:58 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Wed, 19 Jun 2002 23:02:58 -0500
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206200332.g5K3Wbj06062@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <NBBBIOJPGKJEKIECEMCBKEEBNFAA.pobrien@orbtech.com>

[Guido van Rossum]
>
> > +1 on the allvars() suggestion also.
>
> I have no idea what you are talking about. :-(

François Pinard made the following suggestion and I think something along
the lines of allvars() would be very handy, especially with the html stuff
I've been doing lately:

> This is an interesting idea.  However, there are other contexts where the
> concept of a compound dictionary of all globals and locals would
> be useful.
> Maybe we could have some allvars() similar to globals() and locals(),
> and use `... % allvars()' instead of `.sub()'?  So this would serve both
> string interpolation and other avenues.

--
Patrick K. O'Brien
Orbtech
-----------------------------------------------
"Your source for Python software development."
-----------------------------------------------
Web:  http://www.orbtech.com/web/pobrien/
Blog: http://www.orbtech.com/blog/pobrien/
Wiki: http://www.orbtech.com/wiki/PatrickOBrien
-----------------------------------------------




From guido@python.org  Thu Jun 20 05:06:55 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 00:06:55 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 23:55:44 EDT."
 <0d3501c2180e$5b6c7b00$6601a8c0@boostconsulting.com>
References: <006201c2180b$883c1120$72976c42@damien>
 <0d3501c2180e$5b6c7b00$6601a8c0@boostconsulting.com>
Message-ID: <200206200406.g5K46tB06740@pcp02138704pcs.reston01.va.comcast.net>

> I love this suggestion. It's the sort of thing you can't do in C++ ;-)
> I suspect the arguments against will run to efficiency and complexity,
> since you need to compile the backquoted expressions (in some context).

Actually, I had planned a secret feature that skips matching nested
{...} inside ${...}, so that you could write a magic dict whose keys
were eval()'ed in the caller's context.  The %(...) parser does this
(skipping nested (...)) because someone wanted to do that.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From damien.morton@acm.org  Thu Jun 20 05:14:13 2002
From: damien.morton@acm.org (Damien Morton)
Date: Thu, 20 Jun 2002 00:14:13 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <200206200407.g5K47f906751@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <006601c21810$ef52a770$72976c42@damien>

LOL. You're right of course :)

`...` is already in python though.

Better to generalise and/or extend an already existing construct than to
add a new one, I would assert.

You're also right about `...` being less visible than other pairs of
delimiters.

I think, though, that if we're modifying the % notation, the requirement
should be for delimiters that are not parentheses, and ones that
wouldn't interfere with putting expressions between them. This would
allow for eval()ing the content between the delimiters.

> -----Original Message-----
> From: guido@pcp02138704pcs.reston01.va.comcast.net 
> [mailto:guido@pcp02138704pcs.reston01.va.comcast.net] On 
> Behalf Of Guido van Rossum
> Sent: Thursday, 20 June 2002 00:08
> To: Damien Morton
> Subject: Re: [Python-Dev] PEP 292, Simpler String Substitutions
> 
> 
> > I stand by my position though. I've been programming for a long time,
> > and I have rarely come across the $ notation. Mind you, I don't work
> > in unix very often, and I would hazard a guess that the $ substitution
> > comes mainly from unix.
> 
> So does `...`. :-)
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 




From guido@python.org  Thu Jun 20 05:23:26 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 00:23:26 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Thu, 20 Jun 2002 00:14:13 EDT."
 <006601c21810$ef52a770$72976c42@damien>
References: <006601c21810$ef52a770$72976c42@damien>
Message-ID: <200206200423.g5K4NQt06841@pcp02138704pcs.reston01.va.comcast.net>

> `...` is already in python though.


But not in this form.

> Better to generalise and/or extend an already existing construct than to
> add a new one, I would assert.

Only if that construct is successful.  `...` has a bad rep -- people
by and large prefer repr().

> You're also right about `...` being less visible than other pairs of
> delimiters.
> 
> I think, though, that if we're modifying the % notation, the requirement
> should be for delimiters that are not parentheses, and ones that
> wouldn't interfere with putting expressions between them. This would
> allow for eval()ing the content between the delimiters.

You could do that with ${...} if it skipped nested {...} inside, which
I plan to implement as an (initially secret) extension.  That should
be good enough.  If you try to put string literals containing { or }
inside ${...} you deserve what you get. :-)

Still, this wouldn't work:

  x = 12
  y = 14
  print "$x times $y equals ${x*y}".sub()

You'd have to pass a special magic dict (which I *won't* supply).
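
For the curious, such a magic dict plus a toy ${...} substituter can be
sketched in a few lines -- illustration only, nothing I'd put in the
library, and sub() here is a free function, not the proposed string
method:

    import re

    _pat = re.compile(r"\$\{((?:[^{}]|\{[^{}]*\})*)\}|\$(\w+)")

    class EvalDict:
        # keys are eval()'ed in the namespace you hand it
        def __init__(self, namespace):
            self.namespace = namespace
        def __getitem__(self, expr):
            return eval(expr, self.namespace)

    def sub(template, mapping):
        def repl(m):
            return str(mapping[m.group(1) or m.group(2)])
        return _pat.sub(repl, template)

    x = 12
    y = 14
    print sub("$x times $y equals ${x*y}", EvalDict(globals()))
    # prints: 12 times 14 equals 168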

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pobrien@orbtech.com  Thu Jun 20 05:23:10 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Wed, 19 Jun 2002 23:23:10 -0500
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206200332.g5K3Wbj06062@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <NBBBIOJPGKJEKIECEMCBIEEDNFAA.pobrien@orbtech.com>

[Guido van Rossum]
>
> > > I quite like the positional % substitution.  I think %(...)s was a
> > > mistake -- what we really wanted was ${...}.
> >
> > What is the advantage of curly braces over parens in this context?
>
> Apart from Make, most $ substituters use ${...}, not $(...).

I guess what I was really wondering is whether that advantage clearly
outweighs some of the possible disadvantages. I'm not a fan of curly braces
and I'll be sad to see more of them in Python. There's something refreshing
about only having curly braces for dictionaries and parens everywhere else.
And since the existing string substitution uses parens, why shouldn't the
new one?

It won't surprise me if you've already considered all this and are fine
with using curly braces here, but I just had to ask before it is a done
deal. (And I promise I won't go on a boolean crusade and predict that curly
braces will appear everywhere to the demise of the language. <wink>)

--
Patrick K. O'Brien
Orbtech
-----------------------------------------------
"Your source for Python software development."
-----------------------------------------------
Web:  http://www.orbtech.com/web/pobrien/
Blog: http://www.orbtech.com/blog/pobrien/
Wiki: http://www.orbtech.com/wiki/PatrickOBrien
-----------------------------------------------




From guido@python.org  Thu Jun 20 05:28:34 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 00:28:34 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Wed, 19 Jun 2002 23:23:10 CDT."
 <NBBBIOJPGKJEKIECEMCBIEEDNFAA.pobrien@orbtech.com>
References: <NBBBIOJPGKJEKIECEMCBIEEDNFAA.pobrien@orbtech.com>
Message-ID: <200206200428.g5K4SZh06890@pcp02138704pcs.reston01.va.comcast.net>

> I'm not a fan of curly braces and I'll be sad to see more of them in
> Python.

This seems more emotional than anything else.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From pobrien@orbtech.com  Thu Jun 20 05:33:15 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Wed, 19 Jun 2002 23:33:15 -0500
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206200428.g5K4SZh06890@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <NBBBIOJPGKJEKIECEMCBMEEENFAA.pobrien@orbtech.com>

[Guido van Rossum]
>
> > I'm not a fan of curly braces and I'll be sad to see more of them in
> > Python.
>
> This seems more emotional than anything else.

Definitely. And habit. Since I program mostly in Python I'm used to {}
meaning dictionary and I'm used to typing parens everywhere else. Others who
are used to ${} for string substitution in other contexts will be happy that
you copied that syntax. I'm just trying to see if there is anything more
substantial involved. Sounds like there isn't. And that's fine. I'll adapt.
:-)

--
Patrick K. O'Brien
Orbtech
-----------------------------------------------
"Your source for Python software development."
-----------------------------------------------
Web:  http://www.orbtech.com/web/pobrien/
Blog: http://www.orbtech.com/blog/pobrien/
Wiki: http://www.orbtech.com/wiki/PatrickOBrien
-----------------------------------------------




From tdelaney@avaya.com  Thu Jun 20 06:24:48 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Thu, 20 Jun 2002 15:24:48 +1000
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A3A8@natasha.auslabs.avaya.com>

> From: Patrick K. O'Brien [mailto:pobrien@orbtech.com]
> 
> Definitely. And habit. Since I program mostly in Python I'm used to {}
> meaning dictionary and I'm used to typing parens everywhere 

In that case you should be happy. ${} is using a dictionary as its source
...

Actually, for consistency, it should probably be $[] to suggest accessing a
dictionary element. But I won't go down that path ;)

Tim Delaney



From python@rcn.com  Thu Jun 20 06:38:35 2002
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 20 Jun 2002 01:38:35 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <006201c2180b$883c1120$72976c42@damien>
Message-ID: <00cb01c2181c$b97fb320$d2f7a4d8@othello>

From: "Damien Morton" <damien.morton@acm.org>

> Why not simply add backquote notation to python strings. I read in a
> recent email from Timbot, I think,  that the backquote notation was
> originally intended for string interpolation too.
>
> "`name` is from `country`".sub()
> "`name.capitalize()` is from `country
> "`name` is %`height`4.1f meters tall".sub()
> "`name.capitalize()` can jump `height*1.7` meters".sub()

I'll bet this style would be brutal to read with the accented letters in
French.


Raymond Hettinger




From Oleg Broytmann <phd@phd.pp.ru>  Thu Jun 20 09:16:08 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Thu, 20 Jun 2002 12:16:08 +0400
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <m34rfzrodm.fsf@mira.informatik.hu-berlin.de>; from martin@v.loewis.de on Wed, Jun 19, 2002 at 10:43:01PM +0200
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <m3660fhm3g.fsf@mira.informatik.hu-berlin.de> <20020619182141.GA18944@zot.electricrain.com> <m34rfzrodm.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020620121608.H22899@phd.pp.ru>

On Wed, Jun 19, 2002 at 10:43:01PM +0200, Martin v. Loewis wrote:
> What does it have to do with the city of Berkeley (CA)? Perhaps
> "sleepycat"?

   from sleepycat import berkeleydb
   from sleepycat import bsddb2
   from sleepycat import bsddb3
   from sleepycat import bsddb4

      ???

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From oren-py-d@hishome.net  Thu Jun 20 08:18:56 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 20 Jun 2002 03:18:56 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <006201c2180b$883c1120$72976c42@damien>
References: <006201c2180b$883c1120$72976c42@damien>
Message-ID: <20020620071856.GA10497@hishome.net>

On Wed, Jun 19, 2002 at 11:35:32PM -0400, Damien Morton wrote:
> Why not simply add backquote notation to python strings. I read in a
> recent email from Timbot, I think,  that the backquote notation was
> originally intended for string interpolation too.

See http://tothink.com/python/embedpp

	Oren



From fredrik@pythonware.com  Thu Jun 20 11:17:14 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 20 Jun 2002 12:17:14 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> <20020619204017.GB9758@gerg.ca>
Message-ID: <02fd01c21843$a959df30$ced241d5@hagrid>

<F mode="medium rant, insert winks/smileys as necessary ;-)">

Greg wrote:

> No, that's already checked in as textwrap.py.

are you saying that you cannot rename stuff under CVS?
not even delete it, and check it in again under a new name?

(it's a brand new module, after all, so the history
shouldn't matter much)

I'm +1 on adding a text utility module for occasionally useful
stuff like wrapping, getting rid of gremlins, doing various kinds
of substitutions, centering/capitalizing/padding and otherwise
formatting strings, searching/parsing, and other fun stuff that
your average text editor can do (and +0 on using the existing
"string" module for that purpose, but I can live with another
name).

I'm -1 on adding one specialized module, and then rejecting
other things because we have nowhere to put it, or because
adding a useful function to a corner of the standard library is
so much harder than adding a method to a core data type,
or because a single trainer has decided that he doesn't want
to teach his classes to use modules and call functions...

</F>




From tismer@tismer.com  Thu Jun 20 11:27:04 2002
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 20 Jun 2002 12:27:04 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>	<003701c2175f$b219c340$ced241d5@hagrid>	<20020619075121.GB25541@hishome.net>	<20020619083311.GA1011@ratthing-b3cf>	<09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <01bf01c217cf$407bcec0$ced241d5@hagrid>
Message-ID: <3D11ADF8.9090802@tismer.com>

Fredrik Lundh wrote:
> paul wrote:
> 
>>$ is taught in hour 2, import is taught on day 2.
> 
> 
> says who?
> 
> I usually mention "import" in the first hour (before methods),
> and nobody has ever had any problem with that...

Well, same here, but that might change, since the string
module is nearly obsolete. You can show reasonably
powerful stuff(*) without a single import.

(*) and that's what you need to get people interested.

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From fredrik@pythonware.com  Thu Jun 20 11:52:26 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 20 Jun 2002 12:52:26 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>	<003701c2175f$b219c340$ced241d5@hagrid>	<20020619075121.GB25541@hishome.net>	<20020619083311.GA1011@ratthing-b3cf>	<09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <01bf01c217cf$407bcec0$ced241d5@hagrid> <3D11ADF8.9090802@tismer.com>
Message-ID: <038e01c21848$9bef5320$ced241d5@hagrid>

christian wrote:

> > I usually mention "import" in the first hour (before methods),
> > and nobody has ever had any problem with that...
> 
> Well, same here, but that might change, since the string
> module is nearly obsolete. You can show reasonably
> powerful stuff(*) without a single import.
> 
> (*) and that's what you need to get people interested.

I usually start out with something web-oriented (which means
urllib).  how about adding a "get" method to strings?  or an "L"
prefix character that causes Python to wrap it up in a simple
URL container:

    print url"http://www.python.org".read()

:::

but in practice, if you really want people to get interested,
make sure you have a domain-specific library installed on the
training machines.  why care about string fiddling when your
second python program (after print "hello world") can be:

    import noaa
    im = noaa.open("noaa16_20020620_1021")
    im.rectify("euro")
    im.show()

</F>




From tismer@tismer.com  Thu Jun 20 12:05:20 2002
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 20 Jun 2002 13:05:20 +0200
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
Message-ID: <3D11B6F0.5000803@tismer.com>

Patrick K. O'Brien wrote:
> [Guido van Rossum]
> 
>>I quite like the positional % substitution.  I think %(...)s was a
>>mistake -- what we really wanted was ${...}.
> 
> 
> What is the advantage of curly braces over parens in this context?

It unambiguously spells that there is no format suffix char.

> +1 on the allvars() suggestion also.

me too.

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From greg@cosc.canterbury.ac.nz  Thu Jun 20 08:25:34 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 20 Jun 2002 19:25:34 +1200 (NZST)
Subject: [Python-Dev] Weird problem with exceptions raised in extension module
Message-ID: <200206200725.g5K7PYt26747@oma.cosc.canterbury.ac.nz>

I'm getting strange behaviour when raising an
exception in a extension module generated by
Pyrex. The extension module does the equivalent of

  def foo():
    raise TypeError("Test-Exception")

If I invoke it with the following Python code:

  try:
    mymodule.foo()
  except IOError:
    print "blarg"

the following happens:

  Traceback (most recent call last):
    File "<stdin>", line 3, in ?
  SystemError: 'finally' pops bad exception

This only happens when the try-except catches
something *other* than the exception being raised.
If the exception being raised is caught, or
no exception catching is done, the exception
is handled properly.

Also, it only happens when an *instance* is used
as the exception object. If I do this instead:

  raise TypeError, "Test-Exception"

the problem doesn't occur.

The relevant piece of C code generated by
Pyrex is as follows. Can anyone see if I'm
doing anything wrong? (I'm aware that there's
a missing Py_DECREF, but it shouldn't be
causing this sort of thing.)

The Python version I'm using is 2.2.

  __pyx_1 = __Pyx_GetName(__pyx_b, "TypeError"); 
  if (!__pyx_1) goto __pyx_L1;
  __pyx_2 = PyString_FromString(__pyx_k1); 
  if (!__pyx_2) goto __pyx_L1;
  __pyx_3 = PyTuple_New(1); 
  if (!__pyx_3) goto __pyx_L1;
  PyTuple_SET_ITEM(__pyx_3, 0, __pyx_2);
  __pyx_2 = 0;
  __pyx_4 = PyObject_CallObject(__pyx_1, __pyx_3); 
  if (!__pyx_4) goto __pyx_L1;
  Py_DECREF(__pyx_3); 
  __pyx_3 = 0;
  PyErr_SetNone(__pyx_4);
  Py_DECREF(__pyx_4); 
  __pyx_4 = 0;
  goto __pyx_L1;

  /*...*/

  __pyx_L1:;
  Py_XDECREF(__pyx_1);
  Py_XDECREF(__pyx_2);
  Py_XDECREF(__pyx_3);
  Py_XDECREF(__pyx_4);
  __pyx_r = 0;
  __pyx_L0:;
  return __pyx_r;

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From tismer@tismer.com  Thu Jun 20 12:52:06 2002
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 20 Jun 2002 13:52:06 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com>	<003701c2175f$b219c340$ced241d5@hagrid>	<20020619075121.GB25541@hishome.net>	<20020619083311.GA1011@ratthing-b3cf>	<09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <01bf01c217cf$407bcec0$ced241d5@hagrid> <3D11ADF8.9090802@tismer.com> <038e01c21848$9bef5320$ced241d5@hagrid>
Message-ID: <3D11C1E6.4060508@tismer.com>

Fredrik Lundh wrote:
> christian wrote:
> 
> 
>>>I usually mention "import" in the first hour (before methods),
>>>and nobody has ever had any problem with that...
>>
>>Well, same here, but that might change, since the string
>>module is nearly obsolete. You can show reasonably
>>powerful stuff(*) without a single import.
>>
>>(*) and that's what you need to get people interested.
> 
> 
> I usually start out with something web-oriented (which means
> urllib).  how about adding a "get" method to strings?  or an "L"
> prefix character that causes Python to wrap it up in a simple
> URL container:
> 
>     print url"http://www.python.org".read()

*puke*

> but in practice, if you really want people to get interested,
> make sure you have a domain-specific library installed on the
> training machines.  why care about string fiddling when your
> second python program (after print "hello world") can be:

Yes, I know. I didn't want to make a point, just to point
out that it is possible to show neat stuff without import.
Sure, the next thing I show is COM stuff or formatted stock
market reports, using urllib, xml... -- no point.

--- the rest below is not to Fredrik but the whole thread ---

I'd like to express my opinion at this place (which is as good
as any other place in such a much-too-fast growing thread):

The following statements are ordered by increasing hate.
1 - I do hate the idea of introducing a "$" sign at all.
2 - giving "$" special meaning in strings via a module
3 - doing it as a builtin function
4 - allowing it to address local/global variables

Version 4 as worst comes visually quite close to
languages like Perl. In another post, Guido answered
such objection with "grow up". While my emotional
reaction would be to reply with "wake up!", I have some
rational reasons why I don't like this:

I have to read and sometimes write lots of Perl code.
The massive use of "$" gives me true headache. I don't
want Python to remind me of headaches.

One argument was that "$" and the unembraced usage in "$name"
is so common and therefore easy to sell to Python newbies.
Fine, but no reason to adopt this overly abused character.
Instead, I'm happy that exactly "$" is nowhere used in
formatting.
I don't want to make Python similar to something, but to
keep it different in this aspect. Like the triple quotes,
the percent formatting exists rather seldom in other
languages, and I love to use templates for makefiles,
scripts and whatsoever, where I don't have to care too
much about escaping the escapes.
With an upcoming "$" feature, I fear that "%" might get
abandoned in some future, and I lose this benefit.

I agree with any sensible extension/refinement of the "%" sign.
I disagree on using "$" for anything frequent in Python.
I don't want to see variable names as placeholder inside
of strings. Placeholders should be dictionary string keys,
but this dictionary must be obtained explicitly.
I do like the allvars() proposal.

crap-py -ly - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From David Abrahams" <david.abrahams@rcn.com  Thu Jun 20 13:26:02 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 20 Jun 2002 08:26:02 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <006201c2180b$883c1120$72976c42@damien>              <0d3501c2180e$5b6c7b00$6601a8c0@boostconsulting.com>  <200206200406.g5K46tB06740@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0e4c01c21855$d16773e0$6601a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>


> > I love this suggestion. It's the sort of thing you can't do in C++ ;-)
> > I suspect the arguments against will run to efficiency and complexity,
> > since you need to compile the backquoted expressions (in some context).
>
> Actually, I had planned a secret feature that skips matching nested
> {...} inside ${...}, so that you could write a magic dict whose keys
> were eval()'ed in the caller's context.  The %(...) parser does this
> (skipping nested (...)) because someone wanted to do that.

Ooh, magic and secrets! Maybe a little too magical for me to understand
easily. Is the stuff between ${...} allowed to be any valid expression?

harry-potter's-got-nothing-on-you-ly y'rs,
dave




From fredrik@pythonware.com  Thu Jun 20 13:34:57 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 20 Jun 2002 14:34:57 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <006201c2180b$883c1120$72976c42@damien>              <0d3501c2180e$5b6c7b00$6601a8c0@boostconsulting.com>  <200206200406.g5K46tB06740@pcp02138704pcs.reston01.va.comcast.net> <0e4c01c21855$d16773e0$6601a8c0@boostconsulting.com>
Message-ID: <010501c21856$e4950260$0900a8c0@spiff>

David wrote:

> Ooh, magic and secrets! Maybe a little too magical for me to understand
> easily. Is the stuff between ${...} allowed to be any valid expression?

not according to the PEP, but nothing stops you from using
a magic dictionary:

class magic_dict:
    def __getitem__(self, value):
        return str(eval(value))

d = magic_dict()

print "%(__import__('os').system('echo hello'))s" % d
print replacevars("${__import__('os').system('echo hello')}", d)

# for extra fun, replace 'echo hello' with 'rm -rf ~')

</F>




From David Abrahams" <david.abrahams@rcn.com  Thu Jun 20 13:36:12 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 20 Jun 2002 08:36:12 -0400
Subject: [Python-Dev] proposed patch
Message-ID: <0e8801c21857$1212d280$6601a8c0@boostconsulting.com>

I suggested:

"What about making a public interface to apply_slice() and assign_slice()
which I
can call in the future? Perhaps PyObject_Get/Set/DelSlice()?"

If I submitted such a patch would it be likely to be accepted? I don't want
to waste my time on this if it's a bad idea.

-Dave
+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+




From fredrik@pythonware.com  Thu Jun 20 13:37:18 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 20 Jun 2002 14:37:18 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206190322.g5J3M1I07670@pythonware.com><003701c2175f$b219c340$ced241d5@hagrid><20020619075121.GB25541@hishome.net><006501c2176e$b9dbb3e0$0900a8c0@spiff> <15632.30372.601835.200686@anthem.wooz.org>
Message-ID: <012701c21857$37bcb2d0$0900a8c0@spiff>

barry wrote:
> I've added a note that you should never use no-arg .sub() on strings
> that come from untrusted sources.

if adding a note to the specification really helped, my servers'
logs wouldn't be full of findmail.pl requests, and our mail filters
wouldn't catch quite as many outlook worms ;-)

</F>




From aleax@aleax.it  Thu Jun 20 13:38:36 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 20 Jun 2002 14:38:36 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <3D11C1E6.4060508@tismer.com>
References: <200206190322.g5J3M1I07670@pythonware.com> <038e01c21848$9bef5320$ced241d5@hagrid> <3D11C1E6.4060508@tismer.com>
Message-ID: <E17L1DU-0005Rb-00@mail.python.org>

On Thursday 20 June 2002 01:52 pm, Christian Tismer wrote:
	...
> I'd like to express my opinion at this place (which is as good
> as any other place in such a much-too-fast growing thread):

Ditto (particularly since Christian's opinions strike me as
quite sensible here).


> The following statements are ordered by increasing hate.
> 1 - I do hate the idea of introducing a "$" sign at all.

In my case, I detest the idea of using '$' for something
that is similar but subtly different from the job of '%'.  It
reeks of "more than one way to do it".  I can just picture
myself once more having to teach/explain "this, dear bright
Python beginner, is the % way of formatting, with which 
you can do tasks X, Y, Z.  However, for tasks Y, most of Z,
and a good deal of T, there is also the $ way.  It's close 
enough to the % way that you're sure to confuse their 
syntactic details when you try to learn both, but don't 
worry, those differences are arbitrary and bereft of any
mnemonic value, so your eventual confusion is totally inevitable
and you may as well give up.  Cheer up, though -- if you're
well-learned in C (not Java, C#, or C++, but the Real
McCoy) *AND* sh (or bash or perl, and able to keep in
mind exactly what subset of their interpolation syntax
is implemented here), then you can toss a coin each and
every time you want to format/interpolate, given the wide
but not total overlap of tasks best accomplished each way.

There, aren't you happy you've chosen to learn a language
so powerful, simple, regular, and uniform, based on the self
evident principle that there should be one way, and ideally
just one way, to perform each task?"

> 2 - giving "$" special meaning in strings via a module
> 3 - doing it as a builtin function

I agree that having it as a builtin (or string method) would
be even worse than having it as a module.  A module I can
more easily try to "sweep under the carpet" as a side-show
aberration.  Built-in functions, operators, and methods of
built-in object types, are far harder to explain away.

> 4 - allowing it to address local/global variables

Yeah, I can see this smells, too, but IMHO not quite as
bad as the $-formatting - vs - %-formatting task overlap.


> Version 4 as worst comes visually quite close to
> languages like Perl. In another post, Guido answered
> such objection with "grow up". While my emotional
> reaction would be to reply with "wake up!", I have some
> rationale reasons why I don't like this:
>
> I have to read and sometimes write lots of Perl code.
> The massive use of "$" gives me true headache. I don't
> want Python to remind me of headaches.

I don't get this specific point.  As punctuation goes, $ or
% are much of a muchness from my POV (I admit to not
having a very high visual orientation, so I may be missing
some subtle point of graphical rendition?).  A massive use
of one OR the other would be just as bad.  Am I misreading
you or just missing something important?


> One argument was that "$" and the unembraced usage in "$name"
> is so common and therefore easy to sell to Python newbies.

Newbies who come from Windows and have never knowingly used a 
Unix-ish box (probably more numerous today, despite Linux's 
renaissance -- newbies do tend to grow on Microsoft operating
systems, that's what comes bundled with typical PCs today) might
of course be familiar with %name and not $name (%name is what
Microsoft's pitifully weak .BAT "language" uses).  It seems to me
that this "familiariry" argument doesn't cut much ice either way.

Were we designing from scratch, and having to choose one punctuation
character for this purpose, I'd be pretty neutral on ground of looks and
familiarity.  A slight bias against $ because on some terminals or printers
it can come out as some OTHER currency symbol, depending on various
settings, but that's pretty marginal.

But I'd much rather not have both $ and % used in slightly different
contexts for somewhat-similar, overlapping tasks...

> Fine, but no reason to adopt this overly abused character.
> Instead, I'm happy that exactly "$" is nowhere used in
> formatting.

Me, I'm happy (so far) that not BOTH $ and % are used for this,
but just one of them.

> I don't want to make Python similar to something, but to
> keep it different in this aspect. Like the triple quotes,
> the percent formatting exists rather seldom in other
> languages, and I love to use templates for makefiles,
> scripts and whatsoever, where I don't have to care too
> much about escaping the escapes.

Good point -- sometimes being different than most others has
pluses:-).  Of course, if you were templating to MS .BAT files
it would be the other way 'round, but one doesn't do that much:-).

> With an upcoming "$" feature, I fear that "%" might get
> abandoned in some future, and I loose this benefit.

This sounds to me like a FUD/"slippery slope" argument, even
though I'm in broad agreement with you.  I've neither heard nor
suspected anything about plans to introduce the huge code
breakage that abandoning % would entail.

Rather, my fear is exactly that we'll get BOTH approaches to
formatting, in an acute if localized outbreak of morethanonewayitis.


> I agree with any sensible extension/refinement of the "%" sign.

Sure.  The current %-formatting rules aren't perfect, far from it,
and while we must of course keep compatibility when the %
_operator_ is used, it WOULD be nice to have a function or
method that does something simpler and sensible with the
template string when called instead of the % operator.

> I disagree on using "$" for anything frequent in Python.

I'd have no inherent objection to using $ (or @ or ? -- I
think those are the three currently unused ASCII printing
characters) for other tasks that didn't overlap with %'s.

> I don't want to see variable names as placeholder inside
> of strings. Placeholders should be dictionary string keys,
> but this dictionary must be obtained explicitly.

Yes, I see your point.  It's definitely a valid one.

> I do like the allvars() proposal.

Me too BUT.  How would allvars deal with free variables?
eval is currently unable to deal with them very well:

>>> def f():
...   a=1; b=2; c=3
...   def g(x):
...     z = b
...     try: return eval(x)
...     except: return '%s unknown'%x
...   return g
...
>>> K=f()
>>> for Z in 'abc': print K(Z)
...
a unknown
2
c unknown
>>>

'b' is OK because the compiler has seen it used elsewhere
in nested function g, but 'a' and 'c' aren't.  If 'allvars' can't
deal with this problem, then it should not be named 'allvars'
but 'somevarsbutmaybenotalldepending'.


Alex



From martin@v.loewis.de  Thu Jun 20 08:42:27 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 20 Jun 2002 09:42:27 +0200
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <20020619212559.GC18944@zot.electricrain.com>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
 <15632.62564.638418.191453@localhost.localdomain>
 <20020619212559.GC18944@zot.electricrain.com>
Message-ID: <m3k7oujt0c.fsf@mira.informatik.hu-berlin.de>

"Gregory P. Smith" <greg@electricrain.com> writes:

> Modern berkeleydb uses much different on disk database formats, glancing
> at the docs on sleepycat.com i don't even think it can read bsddb (1.85)
> files.  Existing code using bsddb (1.85) should not automatically start
> using a different database library even if we provide a compatibility
> interface.  

The Python bsddb module never guaranteed that you can use it to read
bsddb 1.85 data files. In fact, on many installation, the bsddb module
links with bsddb 2.x or bsddb 3.x, using db_185.h.

So this is no reason not to call the module bsddb.

Regards,
Martin



From gmcm@hypernet.com  Thu Jun 20 14:07:34 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 20 Jun 2002 09:07:34 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206200230.g5K2Uxf04408@pcp02138704pcs.reston01.va.comcast.net>
References: Your message of "19 Jun 2002 22:20:49 EDT." <oq3cviitby.fsf@titan.progiciels-bpi.ca>
Message-ID: <3D119B56.32515.AD6E881A@localhost>

On 19 Jun 2002 at 22:30, Guido van Rossum wrote:

> The $ means "substitution" in so many languages
> besides Perl that I wonder where you've been. 

It doesn't mean anything in any language I *like*.

-- Gordon
http://www.mcmillan-inc.com/




From fredrik@pythonware.com  Thu Jun 20 14:34:02 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 20 Jun 2002 15:34:02 +0200
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: Your message of "19 Jun 2002 22:20:49 EDT." <oq3cviitby.fsf@titan.progiciels-bpi.ca>  <3D119B56.32515.AD6E881A@localhost>
Message-ID: <02e401c2185f$24db5970$0900a8c0@spiff>

gordon wrote:

> > The $ means "substitution" in so many languages
> > besides Perl that I wonder where you've been.
>
> It doesn't mean anything in any language I *like*.

not even in american?

</F>




From gward@python.net  Thu Jun 20 14:49:05 2002
From: gward@python.net (Greg Ward)
Date: Thu, 20 Jun 2002 09:49:05 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <02fd01c21843$a959df30$ced241d5@hagrid>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> <20020619204017.GB9758@gerg.ca> <02fd01c21843$a959df30$ced241d5@hagrid>
Message-ID: <20020620134905.GC13858@gerg.ca>

On 20 June 2002, Fredrik Lundh said:
> > No, that's already checked in as textwrap.py.
> 
> are you saying that you cannot rename stuff under CVS?
> not even delete it, and check it in again under a new name?

Not at all -- I was just correcting Andrew's misunderstanding about
where the text-wrapping code lives (for now).

> I'm +1 on adding a text utility module for occasionally useful
> stuff like wrapping, getting rid of gremlins, doing various kinds
> of substitutions, centering/capitalizing/padding and otherwise
> formatting strings, searching/parsing, and other fun stuff that
> your average text editor can do (and +0 on using the existing
> "string" module for that purpose, but I can live with another
> name).

Sounds like a pretty good idea to me.

Note that textwrap.py is almost 300 lines, which in my worldview is big
enough to warrant its own module.  I don't think that reduces the
desirability of having text.py (or text/__init__.py and friends) for all
the things you mentioned.

        Greg
-- 
Greg Ward - Python bigot                                gward@python.net
http://starship.python.net/~gward/



From oren-py-d@hishome.net  Thu Jun 20 14:49:16 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 20 Jun 2002 09:49:16 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <200206200053.g5K0rch27979@zippy.abinitio.com>
References: <B43D149A9AB2D411971300B0D03D7E8BF0A3A1@natasha.auslabs.avaya.com> <200206200053.g5K0rch27979@zippy.abinitio.com>
Message-ID: <20020620134916.GA53951@hishome.net>

From what I've read on this thread so far my vote would be:

+1 - no new forms of string formatting
+0 - Donald Beaudry's proposal that %{name} would be equivalent to %(name)s
-1 - anything else

	Oren



From pinard@iro.umontreal.ca  Thu Jun 20 14:41:00 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 20 Jun 2002 09:41:00 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <15633.17324.467335.416736@anthem.wooz.org>
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
 <20020619075121.GB25541@hishome.net>
 <20020619083311.GA1011@ratthing-b3cf>
 <09342475030690@aluminium.rcp.co.uk>
 <20020619124604.GB31653@ute.mems-exchange.org>
 <15633.17324.467335.416736@anthem.wooz.org>
Message-ID: <oqptymm5jn.fsf@titan.progiciels-bpi.ca>

[Barry A. Warsaw]

> If I had to do it over again, I would have used $name in i18n source
> strings from the start.  It would have saved lots of headaches and
> broken translations.  People just seem to get $names whereas they get
> %(name)s wrong too often.

There were similar problems in C, you know, that yielded the addition of
diagnostics in GNU `msgfmt', in case of discrepancies between formatting
specifications in the original and the translated string.  The suffering
would not really have existed if `msgfmt' had been made Python aware,
or if Python programs used their own `msgfmt'.

As you wrote in your message, you wrote your own checker.  A solution should
be sought that would easily apply to all Python internationalised programs.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From pinard@iro.umontreal.ca  Thu Jun 20 15:28:16 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 20 Jun 2002 10:28:16 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <15633.19790.152438.926329@anthem.wooz.org>
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca>
 <oq3cviitby.fsf@titan.progiciels-bpi.ca>
 <15633.19790.152438.926329@anthem.wooz.org>
Message-ID: <oqlm9am3cv.fsf@titan.progiciels-bpi.ca>

[Barry A. Warsaw]

>     FP> However, there are other contexts where the concept of a
>     FP> compound dictionary of all globals and locals would be useful.
>     FP> Maybe we could have some allvars() similar to globals() and
>     FP> locals(), and use `... % allvars()' instead of `.sub()'?  So
>     FP> this would serve both string interpolation and other avenues.

> Or maybe just make vars() do something more useful when no arguments
> are given?

I surely had the thought, but changing the meaning of an existing library
function is most probably out of the question.

> In any event, allvars() or a-different-vars() is out of scope for this
> PEP.  We'd use it if it was there, but I think it needs its own PEP,
> which someone else will have to champion.

I do not see myself championing a PEP yet; I'm not sure the Python community
is soft enough for my thin skin (not so thin maybe, but I really had my share
of over-long discussions in other projects, and I want some rest these days).

On the other hand, the allvars() suggestion is right on the point in
my opinion.  It is not a stand-alone suggestion, its goal was to stress
out that `.sub()' is too far from the `%' operator, it looks like a
random addition.  The available formatting paradigms of Python, I mean,
those which are standard, should look a bit more unified, just to preserve
overall elegance.  If we want Python to stay elegant (which is the source
of comfort and pleasure, these being the main goals of using Python after
all), we have to seek elegance in each Python move.

At the risk of looking frenetic and heretical, I guess that `$' would become
more acceptable in view of the preceding paragraph, if we were introducing
an `$' operator for driving `$' substitutions, the same as the `%' operator
currently drives `%' substitutions.  I'm not asserting that this is the
direction to take, but I'm presenting this as an example of a direction
that would be a bit less shocking, and which through some unification,
could somewhat salvage the messy aspect of having two formatting characters.

Saying that PEP 292 rejects an idea because this idea would require another
PEP to be debated and accepted beforehand, and then rushing the acceptance
of PEP 292 as it stands, is probably missing the point of the discussion.
Each time such an argumentation is made, we lose vision and favour the
blossoming of various Python features in random directions, which is not good
in the long term for Python self-consistency and elegance.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From pf@artcom-gmbh.de  Thu Jun 20 15:37:26 2002
From: pf@artcom-gmbh.de (Peter Funk)
Date: Thu, 20 Jun 2002 16:37:26 +0200 (CEST)
Subject: Version Fatigue (was: Re: [Python-Dev] PEP 292, Simpler String Substitutions)
In-Reply-To: <20020620134916.GA53951@hishome.net> from Oren Tirosh at "Jun 20,
 2002 09:49:16 am"
Message-ID: <m17L33u-0075L5C@artcom0.artcom-gmbh.de>

Oren Tirosh:
> From what I've read on this thread so far my vote would be:
> 
> +1 - no new forms of string formatting
> +0 - Donald Beaudry's proposal that %{name} would be equivalent to %(name)s
> -1 - anything else

/. just had a pointer to a feature defining the term "Version Fatigue":
   http://slashdot.org/articles/02/06/20/1223247.shtml?tid=126

"""Version fatigue comes from the accumulated realization that most 
   knowledge gained with regard to any particular version of a product 
   will be useless with regard to future generations of that same product."""

Thinking about that and recent Python development:

<> operator called "obsolescent",  iterators, generators, list
comprehensions, ugly '//' operator introduced for integer division,
deprecating import string, types, possibly adding "$name".sub(),
maybe later deprecating the % operator.  What next?

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany)




From skip@pobox.com  Wed Jun 19 23:27:06 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 19 Jun 2002 17:27:06 -0500
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <20020619212559.GC18944@zot.electricrain.com>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
 <15632.62564.638418.191453@localhost.localdomain>
 <20020619212559.GC18944@zot.electricrain.com>
Message-ID: <15633.1338.367283.257786@localhost.localdomain>

    >> Why can't it just be called bsddb?

    Greg> Modern berkeleydb uses much different on disk database formats,
    Greg> glancing at the docs on sleepycat.com i don't even think it can
    Greg> read bsddb (1.85) files.

That's never stopped us before. ;-) The current bsddb module works with
versions 1, 2, 3, and 4 of Berkeley DB using the 1.85-compatible API that
Sleepycat provides.  It's always been the user's responsibility to run the
appropriate db_dump or db_dump185 commands before using the next version of
Berkeley DB.  Using the library from Python never removed that requirement.

Skip



From fredrik@pythonware.com  Thu Jun 20 16:15:44 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 20 Jun 2002 17:15:44 +0200
Subject: Version Fatigue (was: Re: [Python-Dev] PEP 292, Simpler String Substitutions)
References: <m17L33u-0075L5C@artcom0.artcom-gmbh.de>
Message-ID: <049301c2186d$5d627270$ced241d5@hagrid>

peter wrote:
> <> operator called "obsolescent",  iterators, generators

hey, don't lump generators in with the rest of the stuff.

generators opens a new universe, the rest is more like moving
the furniture around...

</F>




From gmcm@hypernet.com  Thu Jun 20 16:43:03 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 20 Jun 2002 11:43:03 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <02e401c2185f$24db5970$0900a8c0@spiff>
Message-ID: <3D11BFC7.4677.ADFCE47F@localhost>

On 20 Jun 2002 at 15:34, Fredrik Lundh wrote:

> gordon wrote:
> 
> > > The $ means "substitution" in so many languages
> > > besides Perl that I wonder where you've been. 
> > 
> > It doesn't mean anything in any language I *like*.
> 
> not even in american?

Where $ means "dough", which is one letter 
different from "cough" and "tough"[1]?

the-world's-best-language-for-discussing-
the-price-of-oranges-ly y'rs

-- Gordon
http://www.mcmillan-inc.com/

[1] If you're old-fashioned enough, you
can spell "plow" as "plough", too.



From niemeyer@conectiva.com  Thu Jun 20 16:48:07 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 20 Jun 2002 12:48:07 -0300
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <006201c2180b$883c1120$72976c42@damien>
References: <006201c2180b$883c1120$72976c42@damien>
Message-ID: <20020620124807.B1504@ibook.distro.conectiva>

> "`name` is from `country`".sub() -> "guido is from the netherlands"
[...]

But I'm not, thus I'm against any special character that needs two
keystrokes to type on my keyboard.

;-)))

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From jacobs@penguin.theopalgroup.com  Thu Jun 20 16:55:28 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 20 Jun 2002 11:55:28 -0400 (EDT)
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <3D11BFC7.4677.ADFCE47F@localhost>
Message-ID: <Pine.LNX.4.44.0206201154090.15545-100000@penguin.theopalgroup.com>

On Thu, 20 Jun 2002, Gordon McMillan wrote:
> On 20 Jun 2002 at 15:34, Fredrik Lundh wrote:
> > gordon wrote:
> > 
> > > > The $ means "substitution" in so many languages
> > > > besides Perl that I wonder where you've been. 
> > > 
> > > It doesn't mean anything in any language I *like*.
> > 
> > not even in american?
> 
> Where $ means "dough", which is one letter 
> different from "cough" and "tough"[1]?

Shouldn't it be: d'oh!

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From aahz@pythoncraft.com  Thu Jun 20 17:21:50 2002
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 20 Jun 2002 12:21:50 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <oqlm9am3cv.fsf@titan.progiciels-bpi.ca>
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca> <oq3cviitby.fsf@titan.progiciels-bpi.ca> <15633.19790.152438.926329@anthem.wooz.org> <oqlm9am3cv.fsf@titan.progiciels-bpi.ca>
Message-ID: <20020620162150.GB18208@panix.com>

On Thu, Jun 20, 2002, François Pinard wrote:
> [Barry A. Warsaw]
>>     
>> In any event, allvars() or a-different-vars() is out of scope for this
>> PEP.  We'd use it if it was there, but I think it needs its own PEP,
>> which someone else will have to champion.
> 
> On the other hand, the allvars() suggestion is right on the point
> in my opinion.  It is not a stand-alone suggestion, its goal was to
> stress out that `.sub()' is too far from the `%' operator, it looks
> like a random addition.  The available formatting paradigms of Python,
> I mean, those which are standard, should look a bit more unified,
> just to preserve overall elegance.  If we want Python to stay elegant
> (which is the source of comfort and pleasure, these being the main
> goals of using Python after all), we have to seek elegance in each
> Python move.

+1
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From aahz@pythoncraft.com  Thu Jun 20 17:26:14 2002
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 20 Jun 2002 12:26:14 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <01bf01c217cf$407bcec0$ced241d5@hagrid>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <15632.31027.356393.678498@anthem.wooz.org> <3D10C2EE.CE833DB7@prescod.net> <01bf01c217cf$407bcec0$ced241d5@hagrid>
Message-ID: <20020620162613.GC18208@panix.com>

On Wed, Jun 19, 2002, Fredrik Lundh wrote:
> Paul Prescod wrote:
>>
>> $ is taught in hour 2, import is taught on day 2.
> 
> says who?
> 
> I usually mention "import" in the first hour (before methods),
> and nobody has ever had any problem with that...

Same here.  Note that there's a big difference between introducing
import (which pretty much is essential somewhere in the first or third
hour if you want to teach anything interesting) and giving a full
explanation of how import works (which would indeed be day 2 or 3).
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From aahz@pythoncraft.com  Thu Jun 20 17:10:57 2002
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 20 Jun 2002 12:10:57 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <02fd01c21843$a959df30$ced241d5@hagrid>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> <20020619204017.GB9758@gerg.ca> <02fd01c21843$a959df30$ced241d5@hagrid>
Message-ID: <20020620161057.GA18208@panix.com>

On Thu, Jun 20, 2002, Fredrik Lundh wrote:
>
> I'm +1 on adding a text utility module for occasionally useful
> stuff like wrapping, getting rid of gremlins, doing various kinds
> of substitutions, centering/capitalizing/padding and otherwise
> formatting strings, searching/parsing, and other fun stuff that
> your average text editor can do (and +0 on using the existing
> "string" module for that purpose, but I can live with another
> name).

+1

(And I said so in the original thread on text wrapping.)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From aahz@pythoncraft.com  Thu Jun 20 17:43:23 2002
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 20 Jun 2002 12:43:23 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
Message-ID: <20020620164323.GA27422@panix.com>

I'm about a pure 0 on this proposal in the aggregate.

I'm -1 on .sub() as the name; I'd rather it be called .interp() (this
mainly due to confusion with the existing re.sub() and str.replace())

I'm +0 on putting this functionality in the text module instead of
adding a string method

I'm +0 on trying to find a solution that uses % instead of $ (There's
Only One Way)

I'm -1 on ${name}; there's no reason not to at least use $(name) for
consistency
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From guido@python.org  Thu Jun 20 18:10:36 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 13:10:36 -0400
Subject: Version Fatigue (was: Re: [Python-Dev] PEP 292, Simpler String Substitutions)
In-Reply-To: Your message of "Thu, 20 Jun 2002 16:37:26 +0200."
 <m17L33u-0075L5C@artcom0.artcom-gmbh.de>
References: <m17L33u-0075L5C@artcom0.artcom-gmbh.de>
Message-ID: <200206201710.g5KHAaO03970@odiug.zope.com>

> """Version fatigue comes from the accumulated realization that most 
>    knowledge gained with regard to any particular version of a product 
>    will be useless with regard to future generations of that same product."""
> 
> Thinking about that and recent Python development:

This is highly exaggerated.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun 20 18:30:47 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 13:30:47 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Thu, 20 Jun 2002 03:18:56 EDT."
 <20020620071856.GA10497@hishome.net>
References: <006201c2180b$883c1120$72976c42@damien>
 <20020620071856.GA10497@hishome.net>
Message-ID: <200206201730.g5KHUlP04117@odiug.zope.com>

> See http://tothink.com/python/embedpp

How come you never submitted this PEP to the PEPmeister?  I can't
comment on what I don't know.  It certainly comes closest to the
original ABC feature.  (The main problem with `...` is that many people
can't distinguish between ` and ', as user testing has shown.)


--Guido van Rossum (home page: http://www.python.org/~guido/)



From damien.morton@acm.org  Thu Jun 20 18:45:03 2002
From: damien.morton@acm.org (Damien Morton)
Date: Thu, 20 Jun 2002 13:45:03 -0400
Subject: [Python-Dev] FW: PEP 292, Simpler String Substitutions
Message-ID: <008301c21882$357e25f0$72976c42@damien>

You're right. I only threw that out there as a talking point rather than
a serious suggestion.

I take it you agree with my assertion that putting the format string
before the variable would be less error prone? (if it didn't destroy the
current usage).

Given that the $ notation is all-new, perhaps prefixing with the format
string should be considered

as in:

"$4.2f{height}"

In fact, if we are going to revisit format strings, why not ditch the
format character and keep only the numeric specifier?  Determine the
format character from the type of the variable.

For x =3D "hello", "$4.2{x}" =3D=3D "$4s{x}" -> "hell"
For x =3D 3.7865, "$4.2{x}" =3D=3D "$4.2f{x}" -> "3.78"
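
To make that concrete, a rough sketch of such a substituter (just a
talking point again -- the "$spec{name}" syntax is my own invention here,
and picking the format character from the value's type is only a
heuristic):

    import re

    _pat = re.compile(r"\$([0-9.]*)\{([^}]+)\}")

    def subst(template, names):
        def repl(m):
            spec, name = m.groups()
            value = names[name]
            char = isinstance(value, float) and "f" or "s"
            return ("%" + spec + char) % value
        return _pat.sub(repl, template)

    print subst("$4.2{height}", {"height": 1.75})   # -> 1.75
    print subst("$.4{name}", {"name": "hello"})     # -> hell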


> -----Original Message-----
> From: pinard@titan.progiciels-bpi.ca
> [mailto:pinard@titan.progiciels-bpi.ca] On Behalf Of François Pinard
> Sent: Thursday, 20 June 2002 10:39
> To: Damien Morton
> Subject: Re: PEP 292, Simpler String Substitutions
> 
> 
> [Damien Morton]
> 
> > Why not alter the notation to allow the format specifier to come
> > before the name part.  "%4.2f(height)" I think would be a whole lot
> > less error prone, and would allow for the format specifier to default
> > to "s" where omitted.
> 
> Hello, Damien.
> 
> "%4.2f(height)" already has the meaning of "%4.2f", which is
> complete in itself, and then "(height)", which is a constant
> string -- you understand what I mean.  Altering the notation
> as you suggest would undoubtedly break many, many
> applications, so we should guess it is not acceptable.
> 
> -- 
> François Pinard   http://www.iro.umontreal.ca/~pinard
> 




From guido@python.org  Thu Jun 20 18:45:18 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 13:45:18 -0400
Subject: [Python-Dev] Weird problem with exceptions raised in extension module
In-Reply-To: Your message of "Thu, 20 Jun 2002 19:25:34 +1200."
 <200206200725.g5K7PYt26747@oma.cosc.canterbury.ac.nz>
References: <200206200725.g5K7PYt26747@oma.cosc.canterbury.ac.nz>
Message-ID: <200206201745.g5KHjI604158@odiug.zope.com>

> I'm getting strange behaviour when raising an
> exception in a extension module generated by
> Pyrex. The extension module does the equivalent of
> 
>   def foo():
>     raise TypeError("Test-Exception")
> 
> If I invoke it with the following Python code:
> 
>   try:
>     mymodule.foo()
>   except IOError:
>     print "blarg"
> 
> the following happens:
> 
>   Traceback (most recent call last):
>     File "<stdin>", line 3, in ?
>   SystemError: 'finally' pops bad exception
> 
> This only happens when the try-except catches
> something *other* than the exception being raised.
> If the exception being raised is caught, or
> no exception catching is done, the exception
> is handled properly.
> 
> Also, it only happens when an *instance* is used
> as the exception object. If I do this instead:
> 
>   raise TypeError, "Test-Exception"
> 
> the problem doesn't occur.
> 
> The relevant piece of C code generated by
> Pyrex is as follows. Can anyone see if I'm
> doing anything wrong? (I'm aware that there's
> a missing Py_DECREF, but it shouldn't be
> causing this sort of thing.)
> 
> The Python version I'm using is 2.2.
> 
>   __pyx_1 = __Pyx_GetName(__pyx_b, "TypeError"); 
>   if (!__pyx_1) goto __pyx_L1;
>   __pyx_2 = PyString_FromString(__pyx_k1); 
>   if (!__pyx_2) goto __pyx_L1;
>   __pyx_3 = PyTuple_New(1); 
>   if (!__pyx_3) goto __pyx_L1;
>   PyTuple_SET_ITEM(__pyx_3, 0, __pyx_2);
>   __pyx_2 = 0;
>   __pyx_4 = PyObject_CallObject(__pyx_1, __pyx_3); 
>   if (!__pyx_4) goto __pyx_L1;
>   Py_DECREF(__pyx_3); 
>   __pyx_3 = 0;
>   PyErr_SetNone(__pyx_4);
>   Py_DECREF(__pyx_4); 
>   __pyx_4 = 0;
>   goto __pyx_L1;
> 
>   /*...*/

It seems that this is just for

   raise TypeError, "Test-Exception"

Shouldn't you show the code for the try/except and for the function
call/return too?

But I think that you shouldn't be calling PyErr_SetNone() here -- I
think you should call PyErr_SetObject(__pyx_1, __pyx_2).

For details see do_raise() in ceval.c.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun 20 18:46:58 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 13:46:58 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Thu, 20 Jun 2002 13:05:20 +0200."
 <3D11B6F0.5000803@tismer.com>
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
 <3D11B6F0.5000803@tismer.com>
Message-ID: <200206201746.g5KHkwH04175@odiug.zope.com>

Christian,

you seem to be contradicting yourself.  First:

[someone]
> > +1 on the allvars() suggestion also.

[Christian]
> me too.

and later:

[Christian]
> The following statements are ordered by increasing hate.
> 1 - I do hate the idea of introducing a "$" sign at all.
> 2 - giving "$" special meaning in strings via a module
> 3 - doing it as a builtin function
> 4 - allowing it to address local/global variables

Doesn't 4 contradict your +1 on allvars()?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik@pythonware.com  Thu Jun 20 18:48:12 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 20 Jun 2002 19:48:12 +0200
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <006201c2180b$883c1120$72976c42@damien>              <20020620071856.GA10497@hishome.net>  <200206201730.g5KHUlP04117@odiug.zope.com>
Message-ID: <00d901c21882$a9258a20$ced241d5@hagrid>

guido wrote:    


> > See http://tothink.com/python/embedpp
> 
> How come you never submitted this PEP to the PEPmeister?

iirc, that's because Oren did the 

    why would

        e"X=`x`, Y=`calc_y(x)`."

    be a vast improvement over:

        e("X=", x, ", Y=", calc_y(x), ".")

    test

and his answer was not "I18N" (for obvious reasons ;-)

(but I think we called the function "I" at that time)

</F>




From oren-py-d@hishome.net  Thu Jun 20 19:23:19 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 20 Jun 2002 21:23:19 +0300
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <200206201730.g5KHUlP04117@odiug.zope.com>; from guido@python.org on Thu, Jun 20, 2002 at 01:30:47PM -0400
References: <006201c2180b$883c1120$72976c42@damien> <20020620071856.GA10497@hishome.net> <200206201730.g5KHUlP04117@odiug.zope.com>
Message-ID: <20020620212319.A17467@hishome.net>

On Thu, Jun 20, 2002 at 01:30:47PM -0400, Guido van Rossum wrote:
> > See http://tothink.com/python/embedpp
> 
> How come you never submitted this PEP to the PEPmeister?  I can't
> comment on what I don't know.  It certainly comes closest to the
> original ABC feature.  (The main problem with `...` is that many people
> can't distinguish between ` and ', as user testing has shown.)

I guess I got a bit discouraged by the response on python-list back then. 
Now I know better :-)

	Oren



From tismer@tismer.com  Thu Jun 20 19:28:43 2002
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 20 Jun 2002 20:28:43 +0200
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>              <3D11B6F0.5000803@tismer.com> <200206201746.g5KHkwH04175@odiug.zope.com>
Message-ID: <3D121EDB.6070501@tismer.com>

Guido van Rossum wrote:
> Christian,
> 
> you seem to be contradicting yourself.  First:
> 
> [someone]
> 
>>>+1 on the allvars() suggestion also.
>>
> 
> [Christian]
> 
>>me too.
> 
> 
> and later:
> 
> [Christian]
> 
>>The following statements are ordered by increasing hate.
>>1 - I do hate the idea of introducing a "$" sign at all.
>>2 - giving "$" special meaning in strings via a module
>>3 - doing it as a builtin function
>>4 - allowing it to address local/global variables
> 
> 
> Doesn't 4 contradict your +1 on allvars()?

By no means.
allvars() is something like locals() or globals(), just
an explicit way to produce a dictionary of variables.

What I want to preserve is the distinction between
arbitrary "%(name)s" or maybe "${name}" names and
my local variables.
Using locals() or allvars(), I can decide to *feed*
the formatting expression with variable names.
But the implementation of .sub() should not know
anything about variables, the same way as % doesn't know
about variables.
Formatting is "by value", IMHO.
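
For concreteness, a small example of the "feed it explicitly" style meant
here; a hypothetical "${name}".sub(mapping) would behave the same way, in
that the substitution only ever sees a mapping, never the caller's variables:

    name = "world"
    count = 3
    # The formatting operation receives values; it does not reach into
    # any namespace by itself.
    greeting = "%(name)s appears %(count)d times" % locals()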

Furthermore I'd like to thank Alex for his opinions,
additions and adjustments to my post. I have to say that
I always *am* emotional with such stuff, although I'm
trying hard not to be.  But he hits the nail on the head better than I do.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From paul@prescod.net  Thu Jun 20 19:29:33 2002
From: paul@prescod.net (Paul Prescod)
Date: Thu, 20 Jun 2002 11:29:33 -0700
Subject: [Python-Dev] *Simpler* string substitutions
Message-ID: <3D121F0D.E3B60865@prescod.net>

We will never come to a solution unless we agree on what, if any, the
problem is.

Here is my sense of the "interpolation" problem (based entirely on the
code I see):

 * 95% of all scripts (or modules) need to do string interpolation

 * 5% of all scripts want to be explicit about the types

 * 10% of all scripts want to submit a dictionary rather than the
current namespace

 * 5% of all scripts want to do printf-style formatting tricks

Which means that if we do the math in a simplistic way, 20% of
modules/scripts need these complicated features but the other 75% pay
for these features that they are not using. They pay through having to
use "% locals()" (which uses two advanced features of Python, operator
overloading and the local namespace). They pay through counting the
lengths of their %-tuples (in my case, usually miscounting). They pay
through adding (or forgetting to add) the format specifier after
"%(...)". They pay through having harder to read strings where they have
to go back and forth to figure out what various positional variables
mean. They through having to remember the special case for singletons --
except for singleton tuples!

Of course the syntax is flexible: you get to choose HOW you pay
(shifting from positional to name) and thus reduce some costs while you
incur others, but you can't choose simply NOT to pay, as you can in
every other scripting language I know. 

And remember that Python is a language that *encourages* readability.
But this kind of code is common:

 * exception.append('\n<br>%s%s&nbsp;=\n%s' % (indent, name, value))

whereas it could be just:

 * exception.append('\n<br>${indent}${name}&nbsp;=\n${value}')

Which is shorter, uses fewer concepts, and keeps variables close to
where they are used. We could argue that the programmer here made the
wrong choice (versus using % locals()) but the point is that Python
itself favoured the wrong choice by making the wrong choice shorter and
simpler. Usually Python favours the right choice.
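
For illustration only, a rough sketch of the kind of helper being argued
for here; interp() is a made-up name, not any PEP's actual API, and it
expands ${name} from the caller's namespace:

    import re, sys

    def interp(template):
        # Merge the caller's globals and locals, then expand ${name}.
        frame = sys._getframe(1)
        namespace = dict(frame.f_globals)
        namespace.update(frame.f_locals)
        return re.sub(r"\$\{(\w+)\}",
                      lambda m: str(namespace[m.group(1)]), template)

    indent, name, value = "  ", "x", 42
    exception = []
    exception.append(interp('\n<br>${indent}${name}&nbsp;=\n${value}'))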

The tax is small but it is collected on almost every script, almost
every beginner and almost every programmer almost every day. So it adds
up.

If we put this new feature in a module (whether "text", "re", or
"string"), then we are just devising another way to make people pay. At
that point it becomes a negative feature, because it will clutter up the
standard library without getting used.  As long as you are agreeing to pay
some tax, "%" is a smaller tax (at least at first) because it does not
require you to interrupt your workflow to insert an import statement.

In my mind, this feature is only worth adding if we agree that it is now
the standard string interpolation feature and "%" becomes a quaint
historical feature -- a bad experiment in operator overloading gone
wrong. "%" could be renamed "text.printf" and would actually become more
familiar to its core constituency and less of a syntactic aberration.
"interp" could be a built-in and thus similarly simple syntactically.

But I am against adding "$" if half of Python programmers are going to
use that and half are going to use %. $ needs to be a replacement. There
should be one obvious way to solve simple problems like this, not two. I
am also against adding it as a useless function buried in a module that
nobody will bother to import.

 Paul Prescod



From guido@python.org  Thu Jun 20 19:32:39 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 14:32:39 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Thu, 20 Jun 2002 12:10:57 EDT."
 <20020620161057.GA18208@panix.com>
References: <200206190322.g5J3M1I07670@pythonware.com> <003701c2175f$b219c340$ced241d5@hagrid> <20020619075121.GB25541@hishome.net> <20020619083311.GA1011@ratthing-b3cf> <09342475030690@aluminium.rcp.co.uk> <20020619124604.GB31653@ute.mems-exchange.org> <20020619204017.GB9758@gerg.ca> <02fd01c21843$a959df30$ced241d5@hagrid>
 <20020620161057.GA18208@panix.com>
Message-ID: <200206201832.g5KIWd904912@odiug.zope.com>

> > I'm +1 on adding a text utility module for occasionally useful
> > stuff like wrapping, getting rid of gremlins, doing various kinds
> > of substitutions, centering/capitalizing/padding and otherwise
> > formatting strings, searching/parsing, and other fun stuff that
> > your average text editor can do (and +0 on using the existing
> > "string" module for that purpose, but I can live with another
> > name).

+1 on (e.g.) a text module.

-1 on reusing the string module.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tismer@tismer.com  Thu Jun 20 19:47:48 2002
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 20 Jun 2002 20:47:48 +0200
Subject: Version Fatigue (was: Re: [Python-Dev] PEP 292, Simpler String
 Substitutions)
References: <m17L33u-0075L5C@artcom0.artcom-gmbh.de> <200206201710.g5KHAaO03970@odiug.zope.com>
Message-ID: <3D122354.8040308@tismer.com>

Guido van Rossum wrote:
>>"""Version fatigue comes from the accumulated realization that most 
>>   knowledge gained with regard to any particular version of a product 
>>   will be useless with regard to future generations of that same product."""
>>
>>Thinking about that and recent Python development:
> 
> 
> This is highly exaggerated.

Guido, I'm not sure that you are always aware of what
people actually like about Python and what they dislike.
I have heard such complaints from so many people,
that I think there are reasonably many who don't share
your judgement.
Personally, I belong to the more conservatives, too.
(Stunned? No, really, I like the minimum, most orthogonal
set of features, since I'm running low on brain cells).


Don't take me as negative. This has to be said, once:

I like the new generators very much. They
have a lot of elegance and power.
I am absolutely amazed by the solution to
the type/class dichotomy, and I'm completely
excited about the metaclass stuff. Great!

Much more valuable memorizing than list comprehensions,
booleans and hopefully no new formatting syntax.

All in all, Python is evolving well. Maybe we could
slow down a little, please?

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From pinard@iro.umontreal.ca  Thu Jun 20 20:15:41 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 20 Jun 2002 15:15:41 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206201832.g5KIWd904912@odiug.zope.com>
References: <200206190322.g5J3M1I07670@pythonware.com>
 <003701c2175f$b219c340$ced241d5@hagrid>
 <20020619075121.GB25541@hishome.net>
 <20020619083311.GA1011@ratthing-b3cf>
 <09342475030690@aluminium.rcp.co.uk>
 <20020619124604.GB31653@ute.mems-exchange.org>
 <20020619204017.GB9758@gerg.ca>
 <02fd01c21843$a959df30$ced241d5@hagrid>
 <20020620161057.GA18208@panix.com>
 <200206201832.g5KIWd904912@odiug.zope.com>
Message-ID: <oqhejxlq1u.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> +1 on (e.g.) a text module.
> -1 on reusing the string module.

_A_ text module is OK.  But please, avoid naming such a module "text".
Coming to Python, I had to lose the habit of naming the usual string
work-variable "string", because it would conflict with the module name.
(Some use `s', but that is closer to algebra than to programming; for
programming, clear names are usually better.)  So, I have tons of programs
and scripts using "text" instead for the usual string work-variable.
It would be a pain to revise all this, making room for "import text".

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From guido@python.org  Thu Jun 20 20:54:29 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 15:54:29 -0400
Subject: [Python-Dev] Re: Version Fatigue
In-Reply-To: Your message of "Thu, 20 Jun 2002 20:47:48 +0200."
 <3D122354.8040308@tismer.com>
References: <m17L33u-0075L5C@artcom0.artcom-gmbh.de> <200206201710.g5KHAaO03970@odiug.zope.com>
 <3D122354.8040308@tismer.com>
Message-ID: <200206201954.g5KJsTt05302@odiug.zope.com>

> Guido, I'm not sure that you are always aware of what
> people actually like about Python and what they dislike.
> I have heard such complaints from so many people,
> that I think there are reasonably many who don't share
> your judgement.

Tough.  People used to like it because they trusted my judgement.
Maybe I should stop listening to others. :-)

Seriously, the community is large enough that we can't expect
everybody to like the same things.  There are reasonably many who
still do share my judgement.

> Personally, I belong to the more conservatives, too.
> (Stunned? No, really, I like the minimum, most orthogonal
> set of features, since I'm running low on brain cells).
> 
> 
> Don't take me as negative. This has to be said, once:
> 
> I like the new generators very much. They
> have a lot of elegance and power.
> I am absolutely amazed by the solution to
> the type/class dichotomy, and I'm completely
> excited about the metaclass stuff. Great!

No surprise that you, always the mathematician, like the most
brain-exploding features. :-)

And note the contradiction, which you share with everybody else: you
don't want new features, except the three that you absolutely need to
have.  And you see nothing wrong with this contradiction.

> Much more valuable memorizing than list comprehensions,
> booleans and hopefully no new formatting syntax.

For many people it's just the other way around though.

> All in all, Python is evolving well. Maybe we could
> slow down a little, please?

I'm trying.  I'm really trying.  Please give me some credit.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From niemeyer@conectiva.com  Thu Jun 20 21:00:25 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 20 Jun 2002 17:00:25 -0300
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <3D121F0D.E3B60865@prescod.net>
References: <3D121F0D.E3B60865@prescod.net>
Message-ID: <20020620170025.A25014@ibook.distro.conectiva>

> Here is my sense of the "interpolation" problem (based entirely on the
> code I see):
> 
>  * 95% of all scripts (or modules) need to do string interpolation
> 
>  * 5% of all scripts want to be explicit about the types
> 
>  * 10% of all scripts want to submit a dictionary rather than the
> current namespace
> 
>  * 5% of all scripts want to do printf-style formatting tricks
> 
> Which means that if we do the math in a simplistic way, 20%
> modules/scripts need these complicated features but the other 75% pay
[...]

I'm curious.. where did you get this from? Have you counted?

I think 99% of the statistics are forged to enforce an opinion. :-)

[...]
> Of course the syntax is flexible: you get to choose HOW you pay
> (shifting from positional to name) and thus reduce some costs while you
> incur others, but you can't choose simply NOT to pay, as you can in
> every other scripting language I know. 
> 
> And remember that Python is a language that *encourages* readability.
> But this kind of code is common:
> 
>  * exception.append('\n<br>%s%s&nbsp;=\n%s' % (indent, name, value))
> 
> whereas it could be just:
> 
>  * exception.append('\n<br>${ident}${name}&nbsp;=\n${value}')

That's the usual Perl way of string interpolation. I've used Perl in
some large projects before becoming a Python adept, and I must confess I
don't miss this feature. Maybe it's my C background, but I don't like
to mix code and strings. Think about these real examples, taken
from *one* single module (BaseHTTPServer):

"%s %s %s\r\n" % (self.protocol_version, str(code), message)

"%s - - [%s] %s\n" % (self.address_string(),
		      self.log_date_time_string(),
		      format%args))

"%s, %02d %3s %4d %02d:%02d:%02d GMT" % (self.weekdayname[wd],
		                         day,
					 self.monthname[month], year,
				         hh, mm, ss)

"Serving HTTP on", sa[0], "port", sa[1], "..."

"Bad HTTP/0 .9 request type (%s)" % `command`

"Unsupported method (%s)" % `self.command`

"Bad request syntax (%s)" % `requestline`

"Bad request version (%s)" % `version`


> Which is shorter, uses fewer concepts, and keeps variables close to
> where they are used. We could argue that the programmer here made the
[...]

Please, show me that with one of the examples above.

> The tax is small but it is collected on almost every script, almost
> every beginner and almost every programmer almost every day. So it adds
> up.

That seems like an excessive generalization of a personal opinion.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From niemeyer@conectiva.com  Thu Jun 20 21:13:17 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 20 Jun 2002 17:13:17 -0300
Subject: Version Fatigue (was: Re: [Python-Dev] PEP 292, Simpler String Substitutions)
In-Reply-To: <m17L33u-0075L5C@artcom0.artcom-gmbh.de>
References: <20020620134916.GA53951@hishome.net> <m17L33u-0075L5C@artcom0.artcom-gmbh.de>
Message-ID: <20020620171317.B25014@ibook.distro.conectiva>

> <> operator called "obsolescent",  iterators, generators, list
> comprehensions, ugly '//' operator introduced for integer division,
> deprecating import string, types, possibly adding "$name".sub(),
> maybe later deprecating the % operator.  What next?

major, minor, and makedev.. I hope...

/me runs..

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From pinard@iro.umontreal.ca  Thu Jun 20 21:17:38 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 20 Jun 2002 16:17:38 -0400
Subject: [Python-Dev] Re: Version Fatigue (was: Re: PEP 292, Simpler String Substitutions)
In-Reply-To: <3D122354.8040308@tismer.com>
References: <m17L33u-0075L5C@artcom0.artcom-gmbh.de>
 <200206201710.g5KHAaO03970@odiug.zope.com>
 <3D122354.8040308@tismer.com>
Message-ID: <oq8z59ln6l.fsf@titan.progiciels-bpi.ca>

[Christian Tismer]

> >> """Version fatigue comes from the accumulated realization that most
> >> knowledge gained with regard to any particular version of a product   will
> >> be useless with regard to future generations of that same product."""
> >>
> >>Thinking about that and recent Python development:

> Guido van Rossum wrote:
> > This is highly exaggerated.

Exactly as stated, yes, I agree.

There is another kind of fatigue which may apply to Python, by which a
language becomes so featured over time that people may naturally come to
limit themselves to a sufficient subset of the language and be perfectly
happy, until they have to read the code written by guys speaking another
subset of the language possibilities.  Legibility becomes subjective and
questionable.  In the past, this has been true for some comprehensive
implementations of LISP, and as I heard (but did not experience) for PL/I.

There was a time, not so long ago, when there was only one way to do it
in Python, and this one way was the good way, necessarily.  This is not
true anymore, and we ought to recognise that this impacts legibility.

Fredrik wrote:

> Hey, don't lump generators in with the rest of the stuff.  Generators opens
> a new universe, the rest is more like moving the furniture around...

Indeed, there are very nice additions, that really bring something new,
and generators are of this kind.  Even moving the furniture around may be
very good, like for when the underlying mechanics get revised, acquiring
power and expressiveness on the road, while keeping the same surface aspect.
At least, people recognise the furniture, and could appreciate the new order.

Where it might hurt, however, is when the Python place gets crowded with
furniture, that is, when Python gets new syntaxes and functions above those
which already exist, while keeping the old stuff around more or less forever
for compatibility reasons, with no firm intention or plan for deprecation,
and no tools to help users at switching from a paradigm to its replacement.
This ends up messy, as each programmer then uses some preferred subset.

The `$' PEP is a typical example of this.  Either the PEP should contain
a serious and detailed study about how `%' is going to become deprecated,
or the PEP should design `$' so nicely that it appears to be a born-twin of
the current `%' format, long lost then recently rediscovered.  The PEP should
convince us that it would be heart-breaking to separate such loving brothers.
Now, it looks like these two do not much belong to the same family; they
just randomly met in Python and are not especially suited to one another.

Just the expression of a feeling, of course. :-)

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From paul@prescod.net  Thu Jun 20 21:26:03 2002
From: paul@prescod.net (Paul Prescod)
Date: Thu, 20 Jun 2002 13:26:03 -0700
Subject: [Python-Dev] *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva>
Message-ID: <3D123A5B.EB7389AA@prescod.net>

Gustavo Niemeyer wrote:
> 
>...
> 
> I'm curious.. where did you get this from? Have you counted?

No.

> I think 99% of the statistics are forged to enforce an opinion. :-)

I said it was based only on my experience!

> ... Think about these real examples, taken
> from *one* single module (BaseHTTPServer):
> 
> "%s %s %s\r\n" % (self.protocol_version, str(code), message)

Let's presume a "sub" method with the features of Ping's string
interpolation PEP. This would look like:

"${self.protocol_version}, $code, $message\r\n".sub()

Shorter and simpler.

> "%s - - [%s] %s\n" % (self.address_string(),
>                       self.log_date_time_string(),
>                       format%args))

"${self.address_string()} - - [${self.log_date_time_string()}]
${format.sub(args)}".sub()

But I would probably clarify that:

addr = self.address_string()
time = self.log_date_time_string()
command = format.sub(args)

"$addr - - [$time] $command\n".sub()

> "%s, %02d %3s %4d %02d:%02d:%02d GMT" % (self.weekdayname[wd],
>                                          day,
>                                          self.monthname[month], year,
>                                          hh, mm, ss)

This one is part of the small percent that uses formatting codes. It
wouldn't be rocket science to integrate formatting codes with the "$"
notation $02d{day} but it would also be fine if this involved a call to
textutils.printf()

> "Serving HTTP on", sa[0], "port", sa[1], "..."

This doesn't use "%" to start with, but it is still clearer (IMO) in the
new notation:

"Serving HTTP on ${sa[0]} port ${sa[1]} ..."

> "Bad HTTP/0 .9 request type (%s)" % `command`

"Bad HTTP/0 .9 request type ${`command`}"

etc.

 Paul Prescod



From jmiller@stsci.edu  Thu Jun 20 21:29:18 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Thu, 20 Jun 2002 16:29:18 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__
Message-ID: <3D123B1E.6050600@stsci.edu>

There has been some recent interest in the Numeric/numarray community 
for using array objects as indices
for builtin sequences.  I know this has come up before, but to make 
myself clear, the basic idea is to make the
following work:

class C:
    def __int__(self):
          return 5

object = C()

l = "Another feature..."

print l[object]
"h"

Are there any plans (or interest) for developing Python in this direction?
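
For reference, a hedged sketch of the explicit workaround that is available
today; asindex() is a name invented purely for this illustration:

    def asindex(obj):
        # Coerce an index-like object explicitly; the request above is for
        # the interpreter to do this implicitly for sequence indexing.
        return obj.__int__()

    class C:
        def __int__(self):
            return 5

    l = "Another feature..."
    ch = l[asindex(C())]      # equivalent to l[5]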

Todd






From greg@electricrain.com  Thu Jun 20 21:50:41 2002
From: greg@electricrain.com (Gregory P. Smith)
Date: Thu, 20 Jun 2002 13:50:41 -0700
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15633.1338.367283.257786@localhost.localdomain>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain>
Message-ID: <20020620205041.GD18944@zot.electricrain.com>

On Wed, Jun 19, 2002 at 05:27:06PM -0500, Skip Montanaro wrote:
> 
>     >> Why can't it just be called bsddb?
> 
>     Greg> Modern berkeleydb uses much different on disk database formats,
>     Greg> glancing at the docs on sleepycat.com i don't even think it can
>     Greg> read bsddb (1.85) files.
> 
> That's never stopped us before. ;-) The current bsddb module works with
> versions 1, 2, 3, and 4 of Berkeley DB using the 1.85-compatible API that
> Sleepycat provides.  It's always been the user's responsibility to run the
> appropriate db_dump or db_dump185 commands before using the next version of
> Berkeley DB.  Using the library from Python never removed that requirement.

Good point.  I was ignorant of the original bsddb 1.85 module's workings as
I never used it.  Pybsddb backwards compatibility was implemented (not
by me) with the intention that it could be used as a replacement for
the existing bsddb module.  It passes the simplistic test_bsddb.py that
is included with Python today as well as pybsddb's own test_compat.py
to test the compatibility layer.

If we replace the existing bsddb with pybsddb (bsddb3), it should work.
If there are hidden bugs, that's what the alpha/beta periods are for.
However, linking against berkeleydb versions less than 3.2 will no longer
be supported; should we keep the existing bsddb around as oldbsddb for
users in that situation?

-G




From guido@python.org  Thu Jun 20 21:53:12 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 16:53:12 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__
In-Reply-To: Your message of "Thu, 20 Jun 2002 16:29:18 EDT."
 <3D123B1E.6050600@stsci.edu>
References: <3D123B1E.6050600@stsci.edu>
Message-ID: <200206202053.g5KKrCA05552@odiug.zope.com>

> There has been some recent interest in the Numeric/numarray community 
> for using array objects as indices
> for builtin sequences.  I know this has come up before, but to make 
> myself clear, the basic idea is to make the
> following work:
> 
> class C:
>     def __int__(self):
>           return 5
> 
> object = C()
> 
> l = "Another feature..."
> 
> print l[object]
> "h"
> 
> Are there any plans (or interest) for developing Python in this direction?

I'm concerned that this will also make floats acceptable as indices
(since they have an __int__ method) and this would cause atrocities
like

print "hello"[3.5]

to work.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tismer@tismer.com  Thu Jun 20 21:59:43 2002
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 20 Jun 2002 22:59:43 +0200
Subject: [Python-Dev] Re: Version Fatigue (was: Re: PEP 292, Simpler String
 Substitutions)
References: <m17L33u-0075L5C@artcom0.artcom-gmbh.de>	<200206201710.g5KHAaO03970@odiug.zope.com>	<3D122354.8040308@tismer.com> <oq8z59ln6l.fsf@titan.progiciels-bpi.ca>
Message-ID: <3D12423F.3010604@tismer.com>

François Pinard wrote:
> [Christian Tismer]
> 
> 
>>>>"""Version fatigue comes from the accumulated realization that most
>>>>knowledge gained with regard to any particular version of a product will
>>>>be useless with regard to future generations of that same product."""

Thanks.
This is a false quote, but I could have said it. :-)

--
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From niemeyer@conectiva.com  Thu Jun 20 22:20:43 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 20 Jun 2002 18:20:43 -0300
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <3D123A5B.EB7389AA@prescod.net>
References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva> <3D123A5B.EB7389AA@prescod.net>
Message-ID: <20020620182043.A4252@ibook.distro.conectiva>

> Let's presume a "sub" method with the features of Ping's string
> interpolation PEP. This would look like:

That's not the PEP being discussed, and if it was, it can't replace
the % mapping. Read the Security Issues.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From barry@barrys-emacs.org  Thu Jun 20 22:21:02 2002
From: barry@barrys-emacs.org (Barry Scott)
Date: Thu, 20 Jun 2002 22:21:02 +0100
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <3D121F0D.E3B60865@prescod.net>
Message-ID: <001901c218a0$6158d1c0$070210ac@LAPDANCE>

If I'm going to move from %(name)fmt to ${name} I need a place for
the fmt format, given how error prone it is to write %(name) when it
should have been %(name)s.

How about adding the format inside the {}, for example:

    ${name:format}

You can then have

    $name
    ${name}
    ${name:s}

$name and ${name} work as you have already decided. ${name:format} allows
the format to control the substitution.
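
A quick sketch of how that could look, for illustration only; the regex,
the default "s" format, and the explicit mapping are assumptions:

    import re

    _pat = re.compile(r"\$\{(\w+)(?::([^}]+))?\}")

    def expand(template, mapping):
        # ${name} defaults to "s"; ${name:6.2f} applies that %-format.
        def repl(match):
            name, fmt = match.group(1), match.group(2) or "s"
            return ("%" + fmt) % mapping[name]
        return _pat.sub(repl, template)

    # expand("${name} is ${height:6.2f}m", {"name": "Ann", "height": 1.75})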

    Barry





From guido@python.org  Thu Jun 20 22:21:24 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 17:21:24 -0400
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: Your message of "Thu, 20 Jun 2002 11:29:33 PDT."
 <3D121F0D.E3B60865@prescod.net>
References: <3D121F0D.E3B60865@prescod.net>
Message-ID: <200206202121.g5KLLPT05634@odiug.zope.com>

[Paul]
> We will never come to a solution unless we agree on what, if any,
> the problem is.

[...eloquent argument, ending in...]

> But I am against adding "$" if half of Python programmers are going
> to use that and half are going to use %. $ needs to be a
> replacement. There should be one obvious way to solve simple
> problems like this, not two. I am also against adding it as a
> useless function buried in a module that nobody will bother to
> import.

Well argued.  Alex said roughly the same thing: let's not add $
while keeping %.

Adding a function for $-interpolation to a module would certainly keep
some projects (like web templating) from reinventing the wheel -- but
/F has shown that this particular wheel isn't hard to recreate.  I
would certainly recommend any project that offers substitution in
templates that are edited by non-programmers to use the $-based syntax
from Barry's PEP rather than Python's %(name)s syntax.  (In particular
I hope Python's i18n projects will use $ interpolation.)

Oren made a good point that Paul emphasized: the most common use case
needs interpolation from the current namespace in a string literal,
and expressions would be handy.  Oren also made the point that the
necessary parsing could (should?) be done at compile time.

We currently already have many ways to do this:

- In some cases print is appropriate:

  def f(x, y):
      print "The sum of", x, "and", y, "is", x+y

- You can use string concatenation:

  def f(x, y):
      return "The sum of " + str(x) + " and " + str(y) + " is " + str(x+y)

- You can use % interpolation (with two variants: positional and
  by-name).  A problem is that you have to specify an explicit tuple
  or dict of values.

  def f(x, y):
      return "The sum of %s and %s is %s" % (x, y, x+y)

Note that the print version is the shortest, and IMO also the easiest
to read.  (Though some people might disagree and prefer the % version
because it separates the template from the data; it's not much
longer.)

- You could have an interpolation helper function:

  def i(*a):
      return "".join(map(str, a))

  so you could write this:

  def f(x, y):
      return i("The sum of ", x, " and ", y, " is ", x+y)

This comes closer in length to the print version.

IMO the attraction of the $ version is that it reduces the amount of
punctuation so that it becomes even shorter and clearer.  While I said
"shorter" several times above when comparing styles, I really meant
that as a shorthand for "shorter and clearer".  Even the print example
suffers from the fact that every interpolated value is separated from
the surrounding template by a comma and a string quote on both sides
-- that's a lot of visual clutter (not to mention stuff to type).

Maybe in Python 3.0 we will be able to write:

  def f(x, y):
      return "The sum of $x and $y is $(x+y)"

To me, it's a toss-up whether this looks better or worse than the ABC
version:

  def f(x, y):
      return "The sum of `x` and `y` is `x+y`"

but I do know that backticks have a poor reputation for being hard to
find on the keyboard (newbies don't even know they have it), hard to
distinguish in some fonts, and publishers often turn 'foo' into `foo',
making it hard to publish accurate documentation.  I think on some
European keyboards ` is a dead key, making it even harder to type.
Additionally, it's a symmetric operator, which makes it harder to
parse complex examples.

Now, how to get there (or somewhere similar) in Python 2.3?

PEP 215 solves it by using (yet) another string prefix character.  It
uses $, which to me looks a bit ugly; in this thread, someone proposed
using e, so you can do:

  def f(x, y):
      return e"The sum of $x and $y is $(x+y)"

That looks OK to me, especially if it can be combined with u and r to
create unicode and raw strings.

There are other possibilities:

  def f(x, y):
      return "The sum of \$x and \$y is \$(x+y)"

Alas, it's not 100% backwards compatible, and the \$ looks pretty bad.

Another one:

  def f(x, y):
      return "The sum of \(x) and \(y) is \(x+y)"

Still not 100% compatible, looks perhaps a bit better, but notice how
now every interpolation needs three punctuation characters: almost as
many as the print example.

Assuming that interpolating simple variables is relatively common, I
still like plain $ with something to tag the string as an
interpolation best.

PEP 292 is an attempt to do this *without* involving the parser:

  def f(x, y):
      return "The sum of $x and $y is $(x+y)".sub()

Downsides are that it invites using non-literals as formats, with all
the security aspects, and that its parsing happens at run-time (no big
deal IMO).

Now back to $ vs. %.  I think I can defend having both in the
language, but only if % is reduced to the positional version (classic
printf).  This would be used mostly to format numerical data with
fixed column width.  There would be very little overlap in use cases:
% always requires you to specify explicit values, while $ is always
followed by a variable name.

(Yet another variant is from Tcl, which uses $variable but also
[expression].  In Python 3.0 this would become:

  def f(x, y):
      return "The sum of $x and $y is [x+y]"

But now you have three characters that need quoting, and we might as
well use \$ to quote a literal $ instead of $$.)

All options are still open.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun 20 22:35:41 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 17:35:41 -0400
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: Your message of "Thu, 20 Jun 2002 13:26:03 PDT."
 <3D123A5B.EB7389AA@prescod.net>
References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva>
 <3D123A5B.EB7389AA@prescod.net>
Message-ID: <200206202135.g5KLZf705731@odiug.zope.com>

> > "%s, %02d %3s %4d %02d:%02d:%02d GMT" % (self.weekdayname[wd],
> >                                          day,
> >                                          self.monthname[month], year,
> >                                          hh, mm, ss)
> 
> This one is part of the small percent that uses formatting codes. It
> wouldn't be rocket science to integrate formatting codes with the "$"
> notation $02d{day} but it would also be fine if this involved a call to
> textutils.printf()

But if you support $02d{day} you should also support $d{day}, but that
already means something different (the variable 'd' followed by '{day}').

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jun 20 22:37:39 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 17:37:39 -0400
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: Your message of "Thu, 20 Jun 2002 22:21:02 BST."
 <001901c218a0$6158d1c0$070210ac@LAPDANCE>
References: <001901c218a0$6158d1c0$070210ac@LAPDANCE>
Message-ID: <200206202137.g5KLbd505742@odiug.zope.com>

> If I'm going to move from %(name)fmt to ${name} I need a place for
> the fmt format, given how error prone it is to write %(name) when it
> should have been %(name)s.
> 
> How about adding the format inside the {}, for example:
> 
>     ${name:format}
> 
> You can then have
> 
>     $name
>     ${name}
>     ${name:s}
> 
> $name and ${name} work as you have already decided. ${name:format} allows
> the format to control the substitution.

Not bad.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From paul@prescod.net  Thu Jun 20 23:08:32 2002
From: paul@prescod.net (Paul Prescod)
Date: Thu, 20 Jun 2002 15:08:32 -0700
Subject: [Python-Dev] *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva> <3D123A5B.EB7389AA@prescod.net> <20020620182043.A4252@ibook.distro.conectiva>
Message-ID: <3D125260.8559CAD8@prescod.net>

Gustavo Niemeyer wrote:
> 
> > Let's presume a "sub" method with the features of Ping's string
> > interpolation PEP. This would look like:
> 
> That's not the PEP being discussed, and if it was, it can't replace
> the % mapping. Read the Security Issues.

That's true. I didn't mean to endorse any particular solution but rather
to clarify the problem. I believe that only one of your examples
required a feature (runtime provision of the format string) that was not
in Ping's PEP. If another PEP is a better solution to the problem than
the current one then fine. My point is that there *is* a problem!

 Paul Prescod



From skip@pobox.com  Thu Jun 20 23:17:56 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 20 Jun 2002 17:17:56 -0500
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <200206202121.g5KLLPT05634@odiug.zope.com>
References: <3D121F0D.E3B60865@prescod.net>
 <200206202121.g5KLLPT05634@odiug.zope.com>
Message-ID: <15634.21652.224578.240799@beluga.mojam.com>

    Guido> Alex said roughly the same thing: let's not add $ while keeping
    Guido> %.

Then let's not add $ at all. ;-)

Seriously, I'm not keen on having to modify all my %-formatted strings for
something I perceive as a negligible improvement.  I've seen nothing to
suggest that any $-format proposals I've read were knock-my-socks-off better
than the current %-format implementation.

Skip



From jmiller@stsci.edu  Thu Jun 20 23:19:17 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Thu, 20 Jun 2002 18:19:17 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply
 __int__
References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com>
Message-ID: <3D1254E5.6010007@stsci.edu>

Guido van Rossum wrote:

>>There has been some recent interest in the Numeric/numarray community 
>>for using array objects as indices
>>for builtin sequences.  I know this has come up before, but to make 
>>myself clear, the basic idea is to make the
>>following work:
>>
>>class C:
>>    def __int__(self):
>>          return 5
>>
>>object = C()
>>
>>l = "Another feature..."
>>
>>print l[object]
>>"h"
>>
>>Are there any plans (or interest) for developing Python in this direction?
>>
>
>I'm concerned that this will also make floats acceptable as indices
>(since they have an __int__ method) and this would cause atrocities
>like
>
>print "hello"[3.5]
>
>to work.
>
>--Guido van Rossum (home page: http://www.python.org/~guido/)
>
>
>_______________________________________________
>Python-Dev mailing list
>Python-Dev@python.org
>http://mail.python.org/mailman/listinfo/python-dev
>
That makes sense.  What if we specifically excluded Float objects from
the conversion?  Are there any types that need to be excluded?  If
there's a chance of getting a patch for this accepted, STSCI is willing
to do the work.

Todd

-- 
Todd Miller 			jmiller@stsci.edu
STSCI / SSG			(410) 338 4576






From niemeyer@conectiva.com  Thu Jun 20 23:20:15 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 20 Jun 2002 19:20:15 -0300
Subject: [Python-Dev] h2py
Message-ID: <20020620192014.A5111@ibook.distro.conectiva>

Hi everyone!

I was thinking about working a little bit on the h2py tool. But
first, I'd like to understand the current position on its
usefulness. Should I worry about it, or are this tool and its generated
files something to be obsoleted soon?

Thanks!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From niemeyer@conectiva.com  Thu Jun 20 23:32:44 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Thu, 20 Jun 2002 19:32:44 -0300
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <3D125260.8559CAD8@prescod.net>
References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva> <3D123A5B.EB7389AA@prescod.net> <20020620182043.A4252@ibook.distro.conectiva> <3D125260.8559CAD8@prescod.net>
Message-ID: <20020620193244.B5111@ibook.distro.conectiva>

> That's true. I didn't mean to endorse any particular solution but rather
> to clarify the problem. I believe that only one of your examples
> required a feature (runtime provision of the format string) that was not
> in Ping's PEP. If another PEP is a better solution to the problem than
> the current one then fine. My point is that there *is* a problem!

Agreed. I feel relieved to know that the problem is in a PEP, and
that there are a lot of smart people discussing its implementation.
Don't worry, it won't get into Python before there's a minimum
consensus on the solution. Of course, voicing your opinion is
important to defining the minimum consensus. ;-)

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From ping@zesty.ca  Thu Jun 20 23:48:52 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Thu, 20 Jun 2002 15:48:52 -0700 (PDT)
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <20020620071856.GA10497@hishome.net>
Message-ID: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>

On Thu, 20 Jun 2002, Oren Tirosh wrote:
>
> See http://tothink.com/python/embedpp

Hi Oren,

Your proposal brings up some valid concerns with PEP 215:

    1. run-time vs. compile-time parsing
    2. how to decide what's an expression
    3. balanced quoting instead of $

PEP 215 actually agrees with you on point #1.  That is, the intent
(though poorly explained) was that the interpolated strings would be
turned into bytecode by the compiler.  That is why the PEP insists on
having the interpolated expressions in the literal itself -- they
can be taken apart at compile time.

However, i don't necessarily agree with PEP 215.  (I mentioned this
once before, but it might not hurt to reiterate that i didn't write
the PEP because i desperately wanted string interpolation.  I wrote
it because i wanted to try to get one local optimum written down in
a PEP, so there would be something for discussion.)

Using compile-time parsing, as in PEP 215, has the advantage that it
avoids any possible security problems; but it also eliminates the
possibility of using this for internationalization.  I see this as
the key tension in the string interpolation issue (aside from all
the syntax stuff -- which is naturally controversial).


-- ?!ng

"Computers are useless.  They can only give you answers."
    -- Pablo Picasso




From Donald Beaudry <donb@abinitio.com>  Thu Jun 20 23:50:51 2002
From: Donald Beaudry <donb@abinitio.com> (Donald Beaudry)
Date: Thu, 20 Jun 2002 18:50:51 -0400
Subject: [Python-Dev] *Simpler* string substitutions
References: <001901c218a0$6158d1c0$070210ac@LAPDANCE>
Message-ID: <200206202250.g5KMops08100@zippy.abinitio.com>

"Barry Scott" <barry.alan.scott@ntlworld.com> wrote,
> How about adding the format inside the {}, for example:
> 
>     ${name:format}

Considering that the $ is supposed to be familiar to folks who use
other tools, the colon used this way might undo much of that good
will.  On the other hand,

    %{name:format}

might be just the right thing.

--
Donald Beaudry                                     Ab Initio Software Corp.
                                                   201 Spring Street
donb@abinitio.com                                  Lexington, MA 02421
                          ...So much code...



From aahz@pythoncraft.com  Fri Jun 21 00:00:11 2002
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 20 Jun 2002 19:00:11 -0400
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <20020620170025.A25014@ibook.distro.conectiva>
References: <3D121F0D.E3B60865@prescod.net> <20020620170025.A25014@ibook.distro.conectiva>
Message-ID: <20020620230011.GA18327@panix.com>

On Thu, Jun 20, 2002, Gustavo Niemeyer wrote:
>
> "Serving HTTP on", sa[0], "port", sa[1], "..."

This is where current string handling comes up short.  What's the
correct way to internationalize this string?  What if the person
handling I18N isn't a Python programmer?

I'm sort of caught in the middle here.  I can see that in some ways what
we currently have isn't ideal, but we've already got problems with
strings violating the Only One Way stricture (largely due to immutability
vs. "+" combined with .join() vs. % -- fortunately, the use cases for
.join() and % are different, so people mostly use them appropriately).

It seems to me that fixing the problems with % formatting for newbie
Python programmers just isn't worth the pain.  It also seems to me that
getting better/simpler interpolation support for I18N and similar
templating situations is also a requirement.

I vote for two things:

* String template class for the text module/package that does
more-or-less what PEP 292 suggests.  I think standardizing string
templating would be a Good Thing.  I recommend that only one
interpolation form be supported; if we're following PEP 292, it should 
be ${var}.  This makes it visually easy for translators to find the
variables.

* No changes to current string interpolation features unless it's made
compatible with % formatting.

I don't think I can support dropping % formatting even in Python 3.0;
it's not just source code that will have string formats, but also config
files and databases.
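
To make the template class suggestion above concrete, here is a rough
sketch; the class name, the method name, and the details are assumptions,
not PEP 292's actual specification:

    import re

    class Template:
        # Minimal sketch: only ${var} is recognised; values come from an
        # explicit mapping, so nothing reaches into the caller's namespace.
        _pattern = re.compile(r"\$\{(\w+)\}")

        def __init__(self, text):
            self.text = text

        def substitute(self, mapping):
            return self._pattern.sub(
                lambda m: str(mapping[m.group(1)]), self.text)

    # Template("Serving HTTP on ${host} port ${port} ...").substitute(
    #     {"host": "0.0.0.0", "port": 8000})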
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From pinard@iro.umontreal.ca  Fri Jun 21 00:40:23 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 20 Jun 2002 19:40:23 -0400
Subject: [Python-Dev] Re: *Simpler* string substitutions
In-Reply-To: <200206202121.g5KLLPT05634@odiug.zope.com>
References: <3D121F0D.E3B60865@prescod.net>
 <200206202121.g5KLLPT05634@odiug.zope.com>
Message-ID: <oqznxpjz88.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> [...] All options are still open.

Thanks, Guido, for synthesising a summary of the various avenues.

These two points are worth underlining:

1) let's not add $ while keeping %.  [...] having both in the language, but
   only if % is reduced to the positional version

2) the necessary parsing could (should?) be done at compile time.

Here are other comments, some of which are related to internationalisation.

>       return "The sum of " + str(x) + " and " + str(y) + " is " + str(x+y)
>       return i("The sum of ", x, " and ", y, " is ", x+y)
>       print "The sum of", x, "and", y, "is", x+y

> Note that the print version is the shortest, and IMO also the easiest
> to read.

These are good for quick programs, and `print' is good for debugging.
But they are less appropriate whenever internationalisation is in the
picture, because it is handier and more precise for translators to handle
a wider context at once than individual sentence fragments.

> [...] % interpolation (with two variants: positional and by-name).

The advantage of by-name interpolation for internationalisation is the
flexibility it gives for translators to reorganise the inserts.

>       return "The sum of `x` and `y` is `x+y`"
>       return "The sum of $x and $y is $(x+y)"
>       return "The sum of $x and $y is [x+y]"

Those three above might be a little too magical for Python.  Python ought
not to have interpolation on all double-quoted strings like shells
or Perl (and it should probably avoid deciding interpolability on the
delimiter being a single or double quote, even if shells or Perl do this).

>       return "The sum of \(x) and \(y) is \(x+y)"
>       return "The sum of \$x and \$y is \$(x+y)"
>       return e"The sum of $x and $y is $(x+y)"

> [...] I still like plain $ with something to tag the string as an
> interpolation best.

Those three are interesting, because they build on the escape syntax,
or prefix letters, which Python already has.  All these notations would
naturally accept `ur' prefix letters.  The shortest notation in the above is
the third, using the `e' prefix, because this is the one requiring the least
number of supplementary characters per interpolation.  This is really a big
advantage.  (A detail about the letter `e': is it the best letter to use?)

I also like the hidden suggestion that round parentheses are more readable
than braces, something that was already granted in Python through the
current %-by-name syntax.  In fact, `${name}' would be more acceptable
if Python also got at the same time `$(name)' as equivalent, and _also_
`%{name}format' as equivalent for %(name)format'.  The simplest is surely
to avoid braces completely, not introducing them.

As long as Python does not fully get rid of `%', I wonder if the last two
examples above could not be rewritten:

       return "The sum of \%x and \%y is \%(x+y)"
       return e"The sum of %x and %y is %(x+y)"

That would avoid introducing `$' while we already have `%'.  On the other
hand, it might be confusing to overload `%' too much, if one wants to mix
everything like in:

       return e"The sum of %x and %y is %%d" % (x+y)

This is debatable, and delicate.  Users already have to deal with how to
quote `\' and `%'.  Having to deal with `$' as well, in all combinations and
exceptional cases, makes a lot of things to consider.  Most of us easily
write shell scripts, yet we have difficulty properly writing or deciphering
a shell line that uses many quoting devices at once.  Python is progressively
climbing the same road.  It should stay simpler, all things considered.

But I think the main problem in all these suggestions is how they interact
with internationalisation.  Surely:

       return _(e"The sum of %x and %y is %(x+y)")

cannot be right.  Interpolation has to be delayed until after translation, not
before, because you agree that translators just cannot produce a translation
for all possible inserts.  I do not know what the solution is, and what kind
of elegant magic may be invented to give programmers all the flexibility
they still need in that area.  It is worth a good thought, and we should
not rush into a decision before this aspect has been carefully analysed.
If other PEPs are necessary for addressing interactions between interpolation
and translation, these PEPs should be fully resolved before or concurrently
with the PEP on interpolation, and not pictured as independent issues.

> [...]  There would be very little overlap in use cases: % always
> requires you to specify explicit values, while $ is always followed
> by a variable name.

Yes, the suggestion of using `$(name:format)', whenever needed, is a good
one that should be retained, maybe as `%(name:format)', or maybe with `$'.
It means that the overlap would not be so little, after all.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From guido@python.org  Fri Jun 21 02:29:30 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 21:29:30 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__
In-Reply-To: Your message of "Thu, 20 Jun 2002 18:19:17 EDT."
 <3D1254E5.6010007@stsci.edu>
References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com>
 <3D1254E5.6010007@stsci.edu>
Message-ID: <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net>

[Todd Miller]
> >>There has been some recent interest in the Numeric/numarray community 
> >>for using array objects as indices
> >>for builtin sequences.  I know this has come up before, but to make 
> >>myself clear, the basic idea is to make the
> >>following work:
> >>
> >>class C:
> >>    def __int__(self):
> >>          return 5
> >>
> >>object = C()
> >>
> >>l = "Another feature..."
> >>
> >>print l[object]
> >>"h"
> >>
> >>Are there any plans (or interest) for developing Python in this direction?

[Guido]
> >I'm concerned that this will also make floats acceptable as indices
> >(since they have an __int__ method) and this would cause atrocities
> >like
> >
> >print "hello"[3.5]
> >
> >to work.

[Todd]
> That makes sense.    What if we specifically excluded Float objects from 
> the conversion?   Are there any types that need to be excluded?    If 
> there's a chance of getting a patch for this accepted,  STSCI is willing 
> to do the work.

Hm, an exception for a specific type seems ugly.  What if a user
defines a UserFloat type, or a Rational type, or a FixedPoint type,
with an __int__ conversion?

This points to an unfortunate early design flaw in Python (inherited
from C casts): __int__ has two different meanings -- sometimes it
converts the type, sometimes it also truncates the value.
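
A small illustration of that double duty (my example, not from the original
message): for an integer-valued class __int__ is a pure conversion, while for
float it also truncates.

    class Exact:
        """An integer-like type whose __int__ is a pure type conversion."""
        def __init__(self, value):
            self.value = value
        def __int__(self):
            return self.value      # no information is lost here

    print(int(Exact(5)))   # 5 -- conversion only
    print(int(3.9))        # 3 -- conversion *and* truncation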

I suppose you could hack something where you extract x.__int__() and
x.__float__() and compare the two, but that could lead to a lot of
overhead.

I hesitate to propose a new special method, but that may be the only
solution. :-(

What's your use case?  Why do you need this?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 21 02:31:11 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 21:31:11 -0400
Subject: [Python-Dev] h2py
In-Reply-To: Your message of "Thu, 20 Jun 2002 19:20:15 -0300."
 <20020620192014.A5111@ibook.distro.conectiva>
References: <20020620192014.A5111@ibook.distro.conectiva>
Message-ID: <200206210131.g5L1VBe09370@pcp02138704pcs.reston01.va.comcast.net>

> I was thinking about working a little bit in the h2py tool. But
> first, I'd like to understand what's the current position of its
> utility. Should I worry about it, or this tool and its generated files,
> are something to be obsoleted soon?

It's a poor hack.  We're trying to get away from having any header
files generated by this tool, because it turns out there is always
some platform where it misses some symbols.  (E.g. I recall we had a
case where a particular set of important constants was defined as an
enum instead of as #defines.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 21 02:41:13 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Jun 2002 21:41:13 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: Your message of "Thu, 20 Jun 2002 15:48:52 PDT."
 <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>
References: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>
Message-ID: <200206210141.g5L1fDv09800@pcp02138704pcs.reston01.va.comcast.net>

> Using compile-time parsing, as in PEP 215, has the advantage that it
> avoids any possible security problems;

It is also the only way to properly support nested scopes.  It would
be confusing and inconsistent if you can use a variable from a nested
scope in an expression but not in a "string display" (which I think is
a cute name for strings with embedded expressions).

> but it also eliminates the possibility of using this for
> internationalization.  I see this as the key tension in the string
> interpolation issue (aside from all the syntax stuff -- which is
> naturally controversial).

Yes, I believe that Barry's main purpose is i18n.  But I think i18n
should not be approached in a cavalier way.  If you need i18n of your
application, you have to be very disciplined anyway.  I think
collecting the variables available for interpolation in a dict and
passing them explicitly to an interpolation function is the way to go
here.
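
For illustration, a minimal sketch of what such an explicit-dict interpolation
helper might look like.  The function name and the `$name'/`${name}' syntax
here are mine, not Barry's implementation:

    import re

    def interpolate(template, mapping):
        # Substitute $name or ${name} from an explicit dictionary; no implicit
        # access to local or nested scopes, which keeps the i18n case predictable.
        def repl(match):
            name = match.group(1) or match.group(2)
            return str(mapping[name])
        return re.sub(r"\$(?:(\w+)|\{(\w+)\})", repl, template)

    print(interpolate("$thing has $count items", {"thing": "queue", "count": 3}))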

Also, in i18n the interpolation syntax must be usable for translators
who are not necessarily programmers.  I believe the $ notation with
only simple variables is entirely adequate for that purpose -- and
Barry can implement it in a few lines.  (We just adopted this for
Zope3, and while there are all sorts of open issues, $ interpolation
is not one of them.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From paul@prescod.net  Fri Jun 21 04:05:17 2002
From: paul@prescod.net (Paul Prescod)
Date: Thu, 20 Jun 2002 20:05:17 -0700
Subject: [Python-Dev] String substitution: compile-time versus runtime
References: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy> <200206210141.g5L1fDv09800@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D1297ED.3990C30F@prescod.net>

Guido van Rossum wrote:
> 
> > Using compile-time parsing, as in PEP 215, has the advantage that it
> > avoids any possible security problems;
> 
> It is also the only way to properly support nested scopes.  It would
> be confusing and inconsistent if you can use a variable from a nested
> scope in an expression but not in a "string display" (which I think is
> a cute name for strings with embedded expressions).
>...
> Yes, I believe that Barry's main purpose is i18n.  But I think i18n
> should not be approached in a cavalier way.  If you need i18n of your
> application, you have to be very disciplined anyway.  I think
> collecting the variable available for interpolation in a dict and
> passing them explicitly to an interpolation function is the way to go
> here.

I think that what I hear you saying is that interpolation should ideally
be done at a compile time for simple uses and at runtime for i18n. The
compile-time version should have the ability to do full expressions
(array indexes and self.members at the very least) and will have access
to nested scopes. The runtime version should only work with
dictionaries.

I think you also said that they should both use named parameters instead
of positional parameters. And presumably just for simplicity they would
use similar syntax although one would be triggered at compile time and
one at runtime.

If "%" survives, it would be used for positional parameters, instead of
named parameters.

Is that your current thinking on the matter? 

I think we are making progress if we're coming to understand that the
two different problem domains (simple scripts versus i18n) have
different needs and that there is probably no one solution that fits
both.

 Paul Prescod



From niemeyer@conectiva.com  Fri Jun 21 06:07:25 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Fri, 21 Jun 2002 02:07:25 -0300
Subject: [Python-Dev] Behavior of matching backreferences
Message-ID: <20020621020725.A9565@ibook.distro.conectiva>

Hi everyone!

I was studying the sre module, when I came up with the following
regular expression:

re.compile("^(?P<a>a)?(?P=a)$").match("ebc").groups()

The (?P=a) matches whatever was matched by the "a" group. If
"a" is optional and doesn't match, it seems to make sense that
(?P=a) becomes optional as well, instead of failing. Otherwise the
regular expression above will always fail if the first group
fails, even though that group is optional.

One could argue that to make it a valid regular expression, it should
become "^(?P<a>a)?(?P=a)?". But that's a different regular expression,
since it would match "a", while the regular expression above would
match "aa" or "", but not "a".

This kind of pattern is useful, for example, to match a string which
could be optionally surrounded by quotes, like shell variables. Here's
an example of such pattern: r"^(?P<a>')?((?:\\'|[^'])*)(?P=a)$".
This pattern matches "'a'", "\'a", "a\'a", "'a\'a'" and all such
variants, but not "'a", "a'", or "a'a".

I've submitted a patch to make this work to http://python.org/sf/571976

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From niemeyer@conectiva.com  Fri Jun 21 06:08:15 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Fri, 21 Jun 2002 02:08:15 -0300
Subject: [Python-Dev] h2py
In-Reply-To: <200206210131.g5L1VBe09370@pcp02138704pcs.reston01.va.comcast.net>
References: <20020620192014.A5111@ibook.distro.conectiva> <200206210131.g5L1VBe09370@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020621020815.B9565@ibook.distro.conectiva>

Guido,

> It's a poor hack.  We're trying to get away from having any header
> files generated by this tool, because it turns out there is always
> some platform where it misses some symbols.  (E.g. I recall we had a
> case where a particular set of important constants was defined as an
> enum instead of as #defines.)

Ok, I'll leave it as is then.

Thank you!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From bac@OCF.Berkeley.EDU  Fri Jun 21 06:23:25 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Thu, 20 Jun 2002 22:23:25 -0700 (PDT)
Subject: [Python-Dev] strptime recapped
Message-ID: <Pine.SOL.4.44.0206201622310.11785-100000@death.OCF.Berkeley.EDU>

I have written the callout to strptime.strptime (strptime is SF patch
#474274) as Guido asked.  Since that was the current hold-up and the
thread has gone dormant, I figured I should summarize the discussion up to
this point.

1) what is the need?:
The question was raised why this was done.  The answer is that because time
is just a wrapper around time.h, strptime is not guaranteed to exist, since
it is not a part of ANSI C.  Some ANSI C libraries include it, though (like
glibc), because it is so useful.  Unfortunately Windows and OS X do not
have it.  Having it in Python means it is completely portable and no
longer reliant on the ANSI C library being kind enough to provide it.

2) strftime dependence:
Some people worried about the dependence upon strftime for calculating
some info.  But since strftime is guaranteed to be there by Python (since
it is a part of ANSI C), the dependence is not an issue.

3) locale info for dates:
Skip and Guido pointed out that calendar.py now generates the names of
the weekdays and months on the fly similar to my solution.  So I did go
ahead and use it.  But Skip pointed out that perhaps we should centralize
any code that calculates locale info for dates (calendar.py's names and my
code for figuring out format for date/time).  I had suggested adding it to
the locale module and Guido responded that Martin had to ok that.  Martin
hasn't responded to that idea.

4) location of strptime:
Skip asked why Guido was having me write the callout patch to
timemodule.c.  He wondered why a Lib/time.py wasn't just created to hold my
code, with timemodule.c renamed to _timemodule.c and imported at
the end of time.py.  No response has been given thus far on that.

I also suggested a possible time2 where things like strptime, my helper
fxns (calculate the Julian date from the Gregorian date, etc.), and things
such as naivetime could be kept.  That would allow time to stay as a
straight wrapper to time.h while all bonus code gets relegated to time2.
Guido said it might be a good idea but would have to wait until he got
back from vacation.


That pretty much sums up everything to this point; hope I got it right and
didn't miss anything.

-Brett C.




From tim.one@comcast.net  Fri Jun 21 06:49:02 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 21 Jun 2002 01:49:02 -0400
Subject: [Python-Dev] RE: [Patches] [ python-Patches-566100 ] Rationalize DL_IMPORT and
 DL_EXPORT
In-Reply-To: <E17LGq1-0006Jo-00@usw-sf-web3.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEAEPPAA.tim.one@comcast.net>

In case anyone wants prettier Python source code, check out Mark Hammond's
patch.  It deserves more consideration than 50 identical ways to spell
string interpolation <wink>:

    http://www.python.org/sf/566100




From python@rcn.com  Fri Jun 21 07:18:59 2002
From: python@rcn.com (Raymond Hettinger)
Date: Fri, 21 Jun 2002 02:18:59 -0400
Subject: [Python-Dev] Re: *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net><200206202121.g5KLLPT05634@odiug.zope.com> <oqznxpjz88.fsf@titan.progiciels-bpi.ca>
Message-ID: <00a401c218eb$8872c7c0$87d8accf@othello>

+1 for $(name) instead of ${name}
           because it is closer to existing formatting spec
           because my tastes like it better

-1 for $(x+y)
            because z=x+y; '$z'.sub() works fine
            because general expressions are harder to pick out
   
+1 for $(name:fmt)
            because the style is powerful and elegant

+1 for \$ instead of $$        
            because \ is already an escape character
            because $$ is more likely to occur in actual string samples

+1 for 'istring'.sub()  instead of e'istring'
            because sub allows a particular mapping to be specified

+1 for not being a separate module
            so the feature gets used

+1 for leaving %()s alone
           because formats may have been stored external to programs

+1 for not using back-quotes
           because they are hard to read in languages with accents
           because the open and close back-quotes are not distinct


'regnitteh dnomyar'[::-1] 





From greg@cosc.canterbury.ac.nz  Fri Jun 21 07:36:23 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 21 Jun 2002 18:36:23 +1200 (NZST)
Subject: [Python-Dev] Weird problem with exceptions raised in extension module
In-Reply-To: <200206201745.g5KHjI604158@odiug.zope.com>
Message-ID: <200206210636.g5L6aNU06187@oma.cosc.canterbury.ac.nz>

Guido:

> It seems that this is just for
> 
>    raise TypeError, "Test-Exception"

Actually, it's

   raise TypeError("Test-Exception")

> But I think that you shouldn't be calling PyErr_SetNone() here -- I
> think you should call PyErr_SetObject(__pyx_1, __pyx_2).
>
> For details see do_raise() in ceval.c.

Hmmm. Having studied this routine *very* carefully,
I think I can see where things are going wrong.
Reading the C API docs led me to believe that the
equivalent of the Python statement

   raise x

would be

  PyErr_SetNone(x)

But it appears that is not the case, and what I
should actually be doing is

  PyErr_SetObject(
    ((PyInstanceObject*)x)->in_class, x)

This is... um... not very intuitive. Perhaps the
C API docs could be amended to mention this?

Also, it looks as if exceptions have to be
old-style instances, not new-style ones. Is
that correct?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From aleax@aleax.it  Fri Jun 21 08:38:05 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 21 Jun 2002 09:38:05 +0200
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <200206202121.g5KLLPT05634@odiug.zope.com>
References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com>
Message-ID: <E17LJ09-0007B0-00@mail.python.org>

On Thursday 20 June 2002 11:21 pm, Guido van Rossum wrote:
	...
> Now back to $ vs. %.  I think I can defend having both in the
> language, but only if % is reduced to the positional version (classic
> printf).  This would be used mostly to format numerical data with
> fixed column width.  There would be very little overlap in use cases:

I think you're right: in a "greenfield" language design (a hypothetical one 
starting from scratch with no constraints of backwards compatibility) you can 
indeed defend using both % and $ for these two tasks, net of the issues of
what feature set to give $ formatting -- implicit vs nonimplicit access to
variables, including the very delicate case of access to free variables (HOW
to give access to free variables if the formatstring isn't a literal?); 
ability to use expressions and not just identifiers; ability to pass a 
mapping; what format control should be allowed in $ formatting -- and 
what syntax to use to give access to those features.

If %(name)s is to be deprecated moving towards Python-3000 (surely it
can't be _removed_ before then), $-formatting needs a very rich feature set; 
otherwise it can't _replace_ %-formatting.  It seems to me that (assuming
$ formatting IS destined to get into Python) $ formatting should then be
introduced with all or most of the formatting power it eventually needs, so
that those who want to make their programs Py3K-ready can use $ formatting
to replace all their uses of %(name)s formatting.

The "transition" period will thus inevitably offer different ways to perform 
the same tasks -- we can never get out of this bind, any time we move to
deprecate an "old way" to perform a task, since the old way and the new
way MUST both work together for a good while to allow migration.  This
substantial cost is of course worth paying only if the new way is a huge win
over the old one -- not just "somewhat" better, but ENORMOUSLY better.
But that's OK, and exactly the kind of delicate trade-off which you DO have
such a good track record at getting right in the past:-).


> All options are still open.

Thanks for clarifying this.  To me personally it seems that the gain of 
introducing $ formatting, if gain it be, is small enough not to be worth the
transition cost, but that's just opinion, hard to back up with any substance.

So I offer a real-life anecdote instead.  A colleague at Strakt (a wizard at
various communication and storage programming issues) had no previous
exposure to Python at all, his recent background being mostly with Plan-9,
Inferno, and Limbo (previously, other Bell Labs technologies, centered on
Unix and C).  He picked up Python on the job over the last few months --
basically from Python's own docs, our existing code base, and discussions
with colleagues, me included -- and didn't take long to become productive
with it.  He still has some issues.  Some are very understandable considering
his background -- e.g., he's still not fully _comfortable_ with dynamic 
typing (I predict he'll grow to like it, but Rome wasn't built in one day). 
Overall, what I would call a pretty good scenario and an implicit tribute to 
Python's simplicity / ease / power.  He may pine for Limbo, but in fact 
produces a lot of excellent Python code day in day out.

But his biggest remaining "general peeve" struck me hard the other day, 
exactly because that's not something he "heard", but an observation he
came up with all by himself, by reasonably unbiased examination of "Python as 
she's spoken".  "I wouldn't mind Python so much" (I'm paraphrasing, but that 
IS the kind of grudging-compliment understatement he did use:-) "except that 
there's always so MANY deuced ways to do everything -- can't they just pick
one and STICK with it?!".  In the widespread subtext of most Python discourse
this might sound like irony, but in his case, it was just an issue of fact 
(compared, remember, with SMALL languages such as Limbo -- bloated
ones such as, e.g., C++, are totally *outside* his purview and experience) -- a
bewildering array of possible variations.  Surely inevitable when viewed 
diachronically (==as an evolution over time), but his view, like that of 
anybody who comes to Python anew today, is synchronic (==a snapshot at one 
moment).

I don't think there's anything we can do to AVOID this phenomenon, of course,
but right now I'm probably over-sensitized to the "transition costs" of 
introducing "yet one more way to do it" by this recent episode.  So, it 
appears to me that REDUCING the occurrence of such perceptions is important.


Alex



From mal@lemburg.com  Fri Jun 21 08:57:57 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 21 Jun 2002 09:57:57 +0200
Subject: [Python-Dev] *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com> <E17LJ09-0007B0-00@mail.python.org>
Message-ID: <3D12DC85.6040501@lemburg.com>

Alex Martelli wrote:
> If %(name)s is to be deprecated moving towards Python-3000 (surely it
> can't be _removed_ before then), $-formatting needs a very rich feature set; 
> otherwise it can't _replace_ %-formatting.  It seems to me that (assuming
> $ formatting IS destined to get into Python) $ formatting should then be
> introduced with all or most of the formatting power it eventually needs, so
> that those who want to make their programs Py3K-ready can use $ formatting
> to replace all their uses of %(name)s formatting.

I haven't jumped into this discussion since I thought that
you were only discussing some new feature which I don't have
a need for.

Now if you want to deprecate %(name)s formatting,
the situation is different: my tie would start jumping up
and down, making funny noises :-)

So just this comment from me: please don't deprecate %(name)s
formatting. For the rest: I don't really care.

Thanks,
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From aleax@aleax.it  Fri Jun 21 09:10:03 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 21 Jun 2002 10:10:03 +0200
Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__
In-Reply-To: <3D1254E5.6010007@stsci.edu>
References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu>
Message-ID: <E17LJV5-0000Ls-00@mail.python.org>

On Friday 21 June 2002 12:19 am, Todd Miller wrote:
	...
> >I'm concerned that this will also make floats acceptable as indices
> >(since they have an __int__ method) and this would cause atrocities
> >like
> >
> >print "hello"[3.5]
	...
> That makes sense.    What if we specifically excluded Float objects from
> the conversion?   Are there any types that need to be excluded?    If

"Any type that's float-like", and that's a very hard set to pin down.

Consider a user-written class that implements (e.g.) a number in decimal
form (maybe BCD), carefully crafted to "look&feel just like float" except for
its specifics (such as different rounding behavior).  How would you tell that
this class is NOT acceptable as a sequence index even though it has an
__int__ method while another class with an __int__ method IS OK?

It seems to me that one solution would be to add an attribute that is to
be exposed by types / classes that WANT to be usable as indices in this
way.  If, say, the object exposes an attribute _usable_as_sequence_index, then
the indexing code could proceed, otherwise, TypeError.

It's quite sad that a lot of ad-hoc approaches such as this one have to be
devised in each and every similar case, when PEP 246, gathering dust in
the PEP repository, offers such a simple, elegant architecture for them all.

Basically, PEP 246 lets you ask a "central mechanism", given an object X and
a "protocol" Y, to yield a Z (where Z is X if feasible, but in many cases 
might be a "version of X which is Y-fied without loss of information") such
that Z is "X or a version of X that satisfies protocol Y".  "Adaptation" is 
the name commonly used for this approach (also in PEP 246).  When X can't be 
adapted to Y, an exception gets raised.

Here, indexing code could ask for an adaptation of X to the "sequence index
protocol" and get either "a version of X usable as sequence index" or an
exception.  "A protocol" is normally a type or class, and "Z satisfies 
protocol Y" may then be roughly equated to "Z is an instance of Y", but the
concept is more general.  If Python had a formal concept of 'interface', a
protocol might also be an interface -- this is apparently what's holding up
PEP 246, waiting for such 'interfaces' to appear.  But "a protocol" may in
fact be any object at all and the concept of "satisfying" it is really a 
matter of convention between the code that requests adaptation and the
code that _provides_ adaptation.  The latter may live in X's type, or in the
Y protocol, or *outside of both* and get added to the "central mechanism"
dynamically -- so you get a chance to adapt two separately developed
frameworks without as much blood, sweat and tears as currently needed.
(The compile-time equivalent of this is in Haskell's "typeclass" mechanism,
but of course Python moves it to runtime instead.)


Back to your specific issue.  "An integer" is too BROAD a concept.  When
some client-code has an object X and "wants an integer equivalent of X"
it may have SEVERAL different purposes in mind.  int(X) can't guess and
so provides only ONE way -- for example, truncating the fractional part if
X is a float.  If the client-code could ask more precisely for "give me a
version of X to be used as a sequence index" it would still get back either
an int OR an exception, BUT, the int result would only be supplied if "it
was known" that X is indeed "usable without loss of information" for the
specific purpose of indexing a sequence.

The "it was known" part could reside in any one of three places:
a. the SequenceIndexing protocol could 'know' that e.g. every int X is OK
    as a sequence index, and immediately return such an X if asked for
    adaptation of it;
b. a type could 'know' its instances are OK as sequence indices, and
    supply the equivalent-for-THAT-purpose int on request;
c. a "third-party" adapter could know that, for this application, instances of
    type A are OK to use as sequence indices: the third-party adapter would
    be installed at application startup, get invoked upon such adaptation
    requests when X is an instance of type A, and provide the needed int.
See PEP 246 for one possible mechanism (at Python-level) to support
this, but the mechanism is of course fully negotiable.  The point is that we
NEED something like PEP 246 each and every time we want to perform
any task of this ilk.  Almost every time I see type-testing (as implicit in 
the idea "but do something different if X is a float", for example), I see a 
need for PEP 246 that stays unmet because PEP 246 is waiting...
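
To make the idea a bit more concrete, here is a deliberately toy sketch of an
adaptation registry in the spirit of PEP 246.  The helper names and registry
mechanics are mine; see the PEP for the real proposal:

    _adapters = {}

    def register_adapter(klass, protocol, adapter):
        _adapters[(klass, protocol)] = adapter

    def adapt(obj, protocol):
        # 1. The object itself may know how to conform to the protocol.
        conform = getattr(obj, "__conform__", None)
        if conform is not None:
            result = conform(protocol)
            if result is not None:
                return result
        # 2. Otherwise, look for an externally registered adapter.
        for klass, registered_protocol in _adapters:
            if registered_protocol is protocol and isinstance(obj, klass):
                return _adapters[(klass, registered_protocol)](obj)
        raise TypeError("cannot adapt %r to %r" % (obj, protocol))

    # Example: only ints are blessed as sequence indices.
    SequenceIndex = "sequence-index protocol"
    register_adapter(int, SequenceIndex, lambda x: x)
    print("Another feature..."[adapt(4, SequenceIndex)])   # 'h', index 4
    # adapt(3.5, SequenceIndex) would raise TypeError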


Alex



From aleax@aleax.it  Fri Jun 21 09:20:02 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 21 Jun 2002 10:20:02 +0200
Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__
In-Reply-To: <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net>
References: <3D123B1E.6050600@stsci.edu> <3D1254E5.6010007@stsci.edu> <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17LJeI-0001mf-00@mail.python.org>

On Friday 21 June 2002 03:29 am, Guido van Rossum wrote:
	...
> This points to an unfortunate early design flaw in Python (inherited
> from C casts): __int__ has two different meanings -- sometimes it
> converts the type, sometimes it also truncates the value.

That's inherent in any conversion to a type which has multiple purposes.  I 
wouldn't call it a "design flaw" -- it's a "flaw" (?) in the underlying 
reality:-).

> I hesitate to propose a new special method, but that may be the only
> solution. :-(

PEP 246...


Alex



From martin@v.loewis.de  Fri Jun 21 09:34:00 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 21 Jun 2002 10:34:00 +0200
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <20020620205041.GD18944@zot.electricrain.com>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
 <15632.62564.638418.191453@localhost.localdomain>
 <20020619212559.GC18944@zot.electricrain.com>
 <15633.1338.367283.257786@localhost.localdomain>
 <20020620205041.GD18944@zot.electricrain.com>
Message-ID: <m34rfxowsn.fsf@mira.informatik.hu-berlin.de>

"Gregory P. Smith" <greg@electricrain.com> writes:

> However linking against berkeleydb versions less than 3.2 will no longer
> be supported; should we keep the existing bsddb around as oldbsddb for
> users in that situation?

I don't think so; users could always extract the module from older
distributions if they want to.

Instead, if there are complaints, I think we should try to extend
source support a little further back.

Regards,
Martin




From martin@v.loewis.de  Fri Jun 21 09:36:14 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 21 Jun 2002 10:36:14 +0200
Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__
In-Reply-To: <3D1254E5.6010007@stsci.edu>
References: <3D123B1E.6050600@stsci.edu>
 <200206202053.g5KKrCA05552@odiug.zope.com>
 <3D1254E5.6010007@stsci.edu>
Message-ID: <m3znxpni4h.fsf@mira.informatik.hu-berlin.de>

Todd Miller <jmiller@stsci.edu> writes:

> That makes sense.    What if we specifically excluded Float objects
> from the conversion?   Are there any types that need to be excluded?
> If there's a chance of getting a patch for this accepted,  STSCI is
> willing to do the work.

Perhaps an __index__ conversion could work?
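
For concreteness, a rough sketch of the kind of hook being suggested
(purely hypothetical at this point; names chosen for illustration):

    class ZeroDimArray:
        """Toy stand-in for a zero-dimensional integer array."""
        def __init__(self, value):
            self.value = value
        def __index__(self):
            return self.value        # defined only by integer-like types

    def as_sequence_index(obj):
        # What the sequence-indexing code might do: accept real ints, or
        # anything that explicitly declares itself usable via __index__.
        if isinstance(obj, int):
            return obj
        hook = getattr(obj, "__index__", None)
        if hook is None:
            raise TypeError("object cannot be used as a sequence index")
        return hook()

    print("Another feature..."[as_sequence_index(ZeroDimArray(4))])   # 'h'
    # as_sequence_index(3.5) raises TypeError: floats stay excluded.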

Regards,
Martin



From piers@cs.su.oz.au  Fri Jun 21 09:41:25 2002
From: piers@cs.su.oz.au (Piers Lauder)
Date: Fri, 21 Jun 2002 18:41:25 +1000
Subject: [Python-Dev] unifying read method semantics
Message-ID: <1024648887.89.481975932@cs.su.oz.au>

A user of imaplib's IMAP4_SSL class has complained that the "read" and
"write" methods don't behave correctly, sometimes failing to handle all
of the requested data. This is a bug - I should have noticed this common
misconception when installing the submitted sub-class into imaplib.

However, this is a common enough gotcha for python programmers that I
wondered if it is worthwhile fixing it once and for all. Ie: mandate
that core python modules providing read/write methods guarantee that
all the data is sent by write() (or exception), and all the requested
data is read() (or exception).
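
For what it's worth, here is a sketch of the loop that every caller otherwise
has to write by hand to get that guarantee (the helper name is mine):

    def read_exactly(stream, n):
        """Read exactly n bytes from stream.read(), or raise on early EOF."""
        pieces = []
        remaining = n
        while remaining > 0:
            piece = stream.read(remaining)
            if not piece:
                raise EOFError("connection closed with %d bytes still expected"
                               % remaining)
            pieces.append(piece)
            remaining -= len(piece)
        return "".join(pieces)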

The last time this came up the socketmodule code got a "sendall" method.
However, this doesn't exist in the ssl portion of socketmodule.c.

And while I'm on the topic - please could we always support "readline"
(or "makefile") methods in C modules?  Surely the following code now
necessary in imaplib must make CPU-time conscious programmers wince:

    def readline(self):
        """Read line from remote."""
        line = ""
        while 1:
            char = self.sslobj.read(1)
            line += char
            if char == "\n": return line

:-)

Piers Lauder.






From martin@v.loewis.de  Fri Jun 21 09:46:52 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 21 Jun 2002 10:46:52 +0200
Subject: [Python-Dev] unifying read method semantics
In-Reply-To: <1024648887.89.481975932@cs.su.oz.au>
References: <1024648887.89.481975932@cs.su.oz.au>
Message-ID: <m3ofe5nhmr.fsf@mira.informatik.hu-berlin.de>

Piers Lauder <piers@cs.su.oz.au> writes:

> And while I'm on the topic - please could we always support "readline"
> (or "makefile") methods in C modules?  

I don't think this is feasible.

> Surely the following code now
> necessary in imaplib must make CPU-time conscious programmers wince:
> 
>     def readline(self):
>         """Read line from remote."""
>         line = ""
>         while 1:
>             char = self.sslobj.read(1)
>             line += char
>             if char == "\n": return line

Moving this algorithm to another location won't essentially change CPU
consumption...

Regards,
Martin




From Paul.Moore@atosorigin.com  Fri Jun 21 11:22:27 2002
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Fri, 21 Jun 2002 11:22:27 +0100
Subject: [Python-Dev] Re: *Simpler* string substitutions
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B3B2@UKRUX002.rundc.uk.origin-it.com>

Some points on the current thread, in no particular order...

1. While I agree that "$" is better known as an interpolation
   character than "%", it shouldn't be forgotten that "%" is the
   interpolation character in DOS/Windows shells. Some recent
   examples which showed "%" used ("The sum of %x and %y is
   %(x+y)") looked entirely natural to me (I use Windows more
   than Unix) - in fact, more so than "$"!!

2. The internationalisation issue is clearly important. However,
   it has very different characteristics insofar as the template
   string is (of necessity) handled at runtime, so issues of
   compilation and security become relevant. I'm no I18N expert,
   so I can't comment on details, but I *do* think it's worth
   separating out the I18N issues from the "simple interpolation"
   issues...

3. I feel that the existing % formatting operator cannot
   realistically be removed. Tidying up some of its warts may be
   possible, and even sensible, but there's too much code using
   it (and as was pointed out, template strings may not even be
   stored in code files) to make major changes.

4. Access to variables is also problematic. Without compile-time
   support, access to nested scopes is impossible (AIUI). But on
   the other hand, a scheme with subtle limitations such as lack
   of such access may not realistically count as "simple"...

5. (Personal opinion here!) I believe that formatting specifiers
   do not belong in a "simple" scheme - leave them for the
   "advanced" version (the existing % operator). On the other hand,
   I feel that expression interpolation, within limits, *is*
   suitable. It's the user's responsibility not to go overboard,
   though...

Sorry for butting into an already long thread. I hope the summary
is useful, at least...

Paul.



From tismer@tismer.com  Fri Jun 21 12:09:33 2002
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 21 Jun 2002 13:09:33 +0200
Subject: [Python-Dev] *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com> <E17LJ09-0007B0-00@mail.python.org> <3D12DC85.6040501@lemburg.com>
Message-ID: <3D13096D.9030803@tismer.com>

M.-A. Lemburg wrote:
> Alex Martelli wrote:
> 
>> If %(name)s is to be deprecated moving towards Python-3000 (surely it
>> can't be _removed_ before then), $-formatting needs a very rich 
>> feature set; otherwise it can't _replace_ %-formatting.  It seems to 
>> me that (assuming
>> $ formatting IS destined to get into Python) $ formatting should then be
>> introduced with all or most of the formatting power it eventually 
>> needs, so
>> that those who want to make their programs Py3K-ready can use $ 
>> formatting
>> to replace all their uses of %(name)s formatting.
> 
> 
> I haven't jumped into this discussion since I thought that
> you were only discussing some new feature which I don't have
> a need for.
> 
> Now if you want to deprecate %(name)s formatting,
> the situation is different: my tie would start jumping up
> and down, doing funny noises :-)
> 
> So just this comment from me: please don't deprecate %(name)s
> formatting. For the rest: I don't really care.

Yes, please don't!

Besides the proposals so far, I'd like to add one,
which I really like a bit, since I used it for
years in an institute with a row of macro languages:

How about

name = "Guido" ; land = "The Netherlands"
"His name is <<name>> and he comes from <<land>>.".sub(locals())

I always found this notation very sharp and readable,
maybe this is just me.
I like to have a notation that is easily parsed, has unique
start and stop strings, and no punctuation/whitespace rules
at all.
Any kind of extra stuff like format specifiers, default
values or expressions (if you really must) can be added
with ease.
If people like to use different delimiters, why not:

"His name is <$name$> and he comes from <$land$>.".sub(locals(), 
delimiters=("<$","$>") )
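
For what it's worth, a small sketch of how such a delimiter-based sub could be
prototyped today as a free function (the str-method form would of course need
interpreter support; names are illustrative only):

    import re

    def sub(template, mapping, delimiters=("<<", ">>")):
        # Replace <<name>> (or custom delimiters) with str(mapping["name"]).
        start, stop = map(re.escape, delimiters)
        pattern = re.compile(start + r"(\w+)" + stop)
        return pattern.sub(lambda m: str(mapping[m.group(1)]), template)

    name = "Guido" ; land = "The Netherlands"
    print(sub("His name is <<name>> and he comes from <<land>>.", locals()))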

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From tismer@tismer.com  Fri Jun 21 12:48:18 2002
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 21 Jun 2002 13:48:18 +0200
Subject: [Python-Dev] Re: Version Fatigue
References: <m17L33u-0075L5C@artcom0.artcom-gmbh.de> <200206201710.g5KHAaO03970@odiug.zope.com>              <3D122354.8040308@tismer.com> <200206201954.g5KJsTt05302@odiug.zope.com>
Message-ID: <3D131282.4030400@tismer.com>

Guido van Rossum wrote:
>>Guido, I'm not sure that you are always aware what
>>people actually like about Python and what they dislike.
>>I have heard such complaints from so many people,
>>that I think there are reasonably many who don't share
>>your judgement.
> 
> 
> Tough.  People used to like it because they trusted my judgement.
> Maybe I should stop listening to others. :-)

I thought you did already? :-)

> Seriously, the community is large enough that we can't expect
> everybody to like the same things.  There are reasonably many who
> still do share my judgement.

As the community grows, your audience also seems to
change. Newbies are more comfortable with new features.
The crowd of people like me who have become accustomed
to slow and resistant motion in Python over the years
are now no longer the main target of Python's development.
I think I could name about 20 "oldtimers" off the top of
my head who might have similar feelings.

[type, class, metaclass]

> No surprise that you, always the mathematician, like the most
> brain-exploding features. :-)

It is not for the features. It is for the elegance of
the concept, the great backward compatibility, the
nice convergence of concepts that seemed to be impossible
to get married. This is real "Kunst", whether I'm a math
guy or a musician.

> And note the contradiction, which you share with everybody else: you
> don't want new features, except the three that you absolutely need to
> have.  And you see nothing wrong with this contradiction.

Many people share this, but not me, sorry. I haven't
requested any change to Python in years. I don't need
any of the recent changes, but I can of course use them.

>>All in all Python is evolving good. Maybe we could
>>slow a little down, please?
> 
> I'm trying.  I'm really trying.  Please give me some credit.

I checked your soundness. How much do you want?

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From s_lott@yahoo.com  Fri Jun 21 13:27:22 2002
From: s_lott@yahoo.com (Steven Lott)
Date: Fri, 21 Jun 2002 05:27:22 -0700 (PDT)
Subject: [Python-Dev] strptime recapped
In-Reply-To: <Pine.SOL.4.44.0206201622310.11785-100000@death.OCF.Berkeley.EDU>
Message-ID: <20020621122722.44222.qmail@web9601.mail.yahoo.com>

time2 might be a good place to include the nifty
Reingold-Dershowitz "rata die" date numbering; it can be
converted back and forth between a vast number of widely used
calendars: Julian, Gregorian, Hebrew, old and new Hindu, Chinese
(given enough floating-point accuracy), Astronomical Julian, and
several others.

Generally, "Julian" dates are really just the day number within
a given year; this is a simple special case of the more general
(and more useful) approach that R-D use.

See
http://emr.cs.iit.edu/home/reingold/calendar-book/index.shtml

for more information.
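
As a taste of it, here is a rough sketch of the Gregorian-to-fixed-day
("rata die") computation as I remember it from the book; double-check
against the reference before relying on it:

    def gregorian_leap_year(year):
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

    def fixed_from_gregorian(year, month, day):
        """Day count where day 1 = January 1 of year 1 (proleptic Gregorian)."""
        prior_years = year - 1
        result = (365 * prior_years
                  + prior_years // 4
                  - prior_years // 100
                  + prior_years // 400
                  + (367 * month - 362) // 12
                  + day)
        if month > 2:
            # adjust the month estimate for February being short
            if gregorian_leap_year(year):
                result -= 1
            else:
                result -= 2
        return result

    print(fixed_from_gregorian(2000, 1, 1))    # 730120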


--- Brett Cannon <bac@OCF.Berkeley.EDU> wrote:
> I have written the callout to strptime.strptime (strptime is
> SF patch
> #474274) as Guido asked.  Since that was the current hold-up
> and the
> thread has gone dormant, I figured I should summarize the
> discussion up to
> this point.
> 
> 1) what is the need?:
> The question was raised why this was done.  The answer was
> that since time
> is just a wrapper around time.h, strptime was not guaranteed
> since it is
> not a part of ANSI C.  Some ANSI C libraries include it,
> though (like
> glibc), because it is so useful.  Unfortunately Windows and OS
> X do not
> have it.  Having it in Python means it is completely portable
> and no
> longer reliant on the ANSI C library being kind enough to
> provide it.
> 
> 2) strftime dependence:
> Some people worried about the dependence upon strftime for
> calculating
> some info.  But since strftime is guaranteed to be there by
> Python (since
> it is a part of ANSI C), the dependence is not an issue.
> 
> 3) locale info for dates:
> Skip and Guido pointed out that calendar.py now generates the
> names of
> the weekdays and months on the fly similar to my solution.  So
> I did go
> ahead and use it.  But Skip pointed out that perhaps we should
> centralize
> any code that calculates locale info for dates (calendar.py's
> names and my
> code for figuring out format for date/time).  I had suggested
> adding it to
> the locale module and Guido responded that Martin had to ok
> that.  Martin
> hasn't responded to that idea.
> 
> 4) location of strptime:
> Skip asked why Guido was having me write the callout patch to
> timemodule.c.  He wondered why Lib/time.py wasn't just created
> holding my
> code and then renaming timemodule.c to _timemodule.c and
> importing it at
> the end of time.py.  No response has been given thus far for
> that.
> 
> I also suggested a possible time2 where things like strptime,
> my helper
> fxns (calculate the Julian date from the Gregorian date,
> etc.), and things
> such as naivetime could be kept.  That would allow time to
> stay as a
> straight wrapper to time.h while all bonus code gets relegated
> to time2.
> Guido said it might be a good idea but would have to wait
> until he got
> back from vacation.
> 
> 
> That pretty much sums up everything to this point; hope I got
> it right and
> didn't miss anything.
> 
> -Brett C.
> 
> 
> 


=====
--
S. Lott, CCP :-{)
S_LOTT@YAHOO.COM
http://www.mindspring.com/~slott1
Buccaneer #468: KaDiMa

Macintosh user: drinking upstream from the herd.




From guido@python.org  Fri Jun 21 13:43:28 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Jun 2002 08:43:28 -0400
Subject: [Python-Dev] Vacation
Message-ID: <200206211243.g5LChT524426@pcp02138704pcs.reston01.va.comcast.net>

I should mention that I'm going on vacation.  I'm not going to run a
vacation program -- I've had poor results with those in the past, and
I expect to be sporadically checking email.  But I don't expect to be
responding to python-dev mail until I'm back.  I'm leaving later today
and plan to return Monday July 8, back at work the 9th.

Of course, I'll see a bunch of you all at EuroPython!  I'm looking
forward to it -- it sounds like it's gonna be a great conference!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 21 13:50:54 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Jun 2002 08:50:54 -0400
Subject: [Python-Dev] Weird problem with exceptions raised in extension module
In-Reply-To: Your message of "Fri, 21 Jun 2002 18:36:23 +1200."
 <200206210636.g5L6aNU06187@oma.cosc.canterbury.ac.nz>
References: <200206210636.g5L6aNU06187@oma.cosc.canterbury.ac.nz>
Message-ID: <200206211250.g5LCosp24749@pcp02138704pcs.reston01.va.comcast.net>

> Reading the C API docs led me to believe that the
> equivalent of the Python statement
> 
>    raise x
> 
> would be
> 
>   PyErr_SetNone(x)
> 
> But it appears that is not the case, and what I
> should actually be doing is
> 
>   PyErr_SetObject(
>     ((PyInstanceObject*)x)->in_class, x)
> 
> This is... um... not very intuitive. Perhaps the
> C API docs could be amended to mention this?

I guess so.  The rule is that all PyErr_SetXXX functions correspond
to a raise statement with a class as first argument.  raise with an
instance first argument is a shortcut.

> Also, it looks as if exceptions have to be
> old-style instances, not new-style ones. Is
> that correct?

Unfortunately so in the current code base.  I'm not sure if/when we
should lift this restriction.  I'm also not sure if, when we lift it,
we should make Exception and all other built-in exceptions new-style
classes.  New-style and classic classes aren't 100% compatible and I
don't like to break people's code who have subclassed a built-in
exception class and did something that doesn't work the same in
new-style classes.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 21 13:59:47 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Jun 2002 08:59:47 -0400
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: Your message of "Fri, 21 Jun 2002 09:38:05 +0200."
 <E17LJ0C-0007G7-00@mail.python.org>
References: <3D121F0D.E3B60865@prescod.net> <200206202121.g5KLLPT05634@odiug.zope.com>
 <E17LJ0C-0007G7-00@mail.python.org>
Message-ID: <200206211259.g5LCxmx24769@pcp02138704pcs.reston01.va.comcast.net>

> But his biggest remaining "general peeve" struck me hard the other
> day, exactly because that's not something he "heard", but an
> observation he came up with all by himself, by reasonably unbiased
> examination of "Python as she's spoken".  "I wouldn't mind Python so
> much" (I'm paraphrasing, but that IS the kind of grudging-compliment
> understatement he did use:-) "except that there's always so MANY
> deuced ways to do everything -- can't they just pick one and STICK
> with it?!".  In the widespread subtext of most Python discourse this
> might sound like irony, but in his case, it was just an issue of
> fact (compared, remember, with SMALL languages such as Limbo --
> bloated ones such as, e.g., C++, are totally *outside* his purview
> and experience) -- a bewildering array of possible variations.
> Surely inevitable when viewed diachronically (==as an evolution over
> time), but his view, like that of anybody who comes to Python anew
> today, is synchronic (==a snapshot at one moment).
> 
> I don't think there's anything we can do to AVOID this phenomenon,
> of course, but right now I'm probably over-sensitized to the
> "transition costs" of introducing "yet one more way to do it" by
> this recent episode.  So, it appears to me that REDUCING the
> occurrence of such perceptions is important.

AFAIK Limbo has a very small user base (and its key designer is much
more arrogant than your average BDFL even :-).  It's much easier to
withstand the pressure to add features in that case.  And lately, most
new features have been better ways to do things you could already do,
but only clumsily.  That would add to his impressions.  Plus,
inevitably, that not everybody at Strakt uses the same coding style.

I understand the sentiment, but users are like this: they all want you
to stop adding features except the one thing they absolutely need.
(Myhrvold)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pinard@iro.umontreal.ca  Fri Jun 21 14:35:36 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 21 Jun 2002 09:35:36 -0400
Subject: [Python-Dev] Re: String substitution: compile-time versus runtime
In-Reply-To: <3D1297ED.3990C30F@prescod.net>
References: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>
 <200206210141.g5L1fDv09800@pcp02138704pcs.reston01.va.comcast.net>
 <3D1297ED.3990C30F@prescod.net>
Message-ID: <oq4rfwkb4n.fsf@titan.progiciels-bpi.ca>

[Paul Prescod]

> I think that what I hear you saying is that interpolation should ideally
> be done at a compile time for simple uses and at runtime for i18n.
> [...]  If "%" survives, it would be used for positional parameters,
> instead of named parameters.  [...]  I think we are making progress
> if we're coming to understand that the two different problem domains
> (simple scripts versus i18n) have different needs and that there is
> probably no one solution that fits both.

[Moore, Paul]

> The internationalisation issue is clearly important.  However, it has
> very different characteristics insofar as the template string is (of
> necessity) handled at runtime, so issues of compilation and security
> become relevant.  I'm no I18N expert, so I can't comment on details,
> but I *do* think it's worth separating out the I18N issues from the
> "simple interpolation" issues...

You know, the ultimate goal of internationalisation, for a non English
speaking user and even programmer, is to see his/her own language all over
the screen.  This means from the shell, from the system libraries, from
all applications, big or small, everything.  For what is provided by other
programmers or maintainers, this may occur sooner or later, depending on
the language, the interest of the maintainer, and the development dynamic.
The far-reaching hope is that it will eventually occur.

As for the little things a user/programmer writes himself/herself, and this
is where Python pops up, there are two ways.  The simplest is to write all
strings in native language.  The other way, meant to help exchange with
various friends or get feedback from a wider community, is to do things
properly, and internationalise even small scripts from the start.  It is
easy to develop such an attitude, yet currently, examples do not abound.

I surely had it for a few languages, even though it was rather demanding on me;
at the time `gettext' was not yet available -- and in fact, my work was
used to benchmark various ideas before `gettext' was first written.

The mantra I repeated all along had two key points:

1) internationalisation will only be successful if designed to be unobtrusive,
   otherwise average maintainers and implementors will resist it.

2) programmer duties and translation duties are to be kept separate, so these
   activities could be done asynchronously from one another.[1]

I really, really think that with enough and proper care, Python could be set
so internationalisation of Python scripts is just unobtrusive routine.  There
should not be one way to write Python when one does not internationalise,
and another different way to use it when one internationalises.  The full
power and facilities of Python should be available at all times, unrelated
to internationalisation intents.  Non-English people should not have to pay a
penalty, or if they do, the penalty should be minimised As Much As Possible.

Our BDFL, Guido, should favour internationalisation as a principle in the
evolution of the language, that is, as more than a random negligible feature.
I sincerely hope he will.  For many people, internationalisation issues
cannot be separated out that simply, or otherwise dismissed.  We should
rather learn to collaborate on properly addressing and solving them at
each evolutionary step, so that Python really remains a language for everybody.

--------------------
[1] In practice, we've met those two goals only partly.  For C programs,
the character overhead per localised string is low -- the three characters
"_()", while exceptionally _not_ obeying the GNU standard about a space
before the opening parenthesis.  The glue code is still small -- yet not
as small as I would have wanted.  I wrote the Emacs PO mode so marking
strings in a C project can be done rather quickly by maintainers, and so
translators can do their job alone.  These are on the positive side.

But I think we failed at the level of release engineering, as the combined
complexity of Automake, Autoconf, Libtool and Gettext installation scripts
is simply frightening, and very discouraging for the casual user.  There were
reasons behind "releng" choices, but they would make a long story. :-) Also,
people involved in the development allowed more fundamental unneeded
complexities, which had the sad effect of anchoring the original plans to the
point of being stuck.  On the other hand, people who do not understand where we
were aiming are happily unaware of what we are missing.  (Maintainers may become
incredibly stubborn, when having erections. :-) Eh, that's life...  Sigh!

Python can do better on _all_ fronts.  By the way, I hope that `distutils'
can be adapted to address internationalisation-related release engineering
difficulties, so these merely vanish in practice for Python lovers.
We could also have other standard helper tools for non-installed scripts.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From aleax@aleax.it  Fri Jun 21 14:38:46 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 21 Jun 2002 15:38:46 +0200
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <200206211259.g5LCxmx24769@pcp02138704pcs.reston01.va.comcast.net>
References: <3D121F0D.E3B60865@prescod.net> <E17LJ0C-0007G7-00@mail.python.org> <200206211259.g5LCxmx24769@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17LOdG-0003V6-00@mail.python.org>

On Friday 21 June 2002 02:59 pm, Guido van Rossum wrote:
	...
> AFAIK Limbo has a very small user base (and its key designer is much
> more arrogant than your average BDFL even :-).  It's much easier to
> withstand the pressure to add features in that case.  And lately, most

This is probably correct (both arrogance and a small user base help:-).
However, see at end as for how a largish user base (of a certain kind)
may actually HELP.

But I'm not particularly concerned with comparisons of Python to Limbo,
but rather with the general issue of how Python can be perceived (quite
apart from any issue of spin or how to present it - by somebody who's
not inclined to listen to spin or presentation but prefers to see things
for himself).

> new features have been better ways to do things you could already do,
> but only clumsily.  

Yes, to some extent that's inevitable.  Once a language is (net of
finite-storage issues) Turing-complete, in a very real sense EVERYTHING 
is one of those "things you could already do".

> That would add to his impressions.  Plus,
> inevitably, that not everybody at Strakt uses the same coding style.

We try hard to avoid such "code ownership" issues, with pair programming,
frequent refactoring, strong consensus-based coding-style guidelines, and so 
on.  But of course we can't get them down to 0.

> I understand the sentiment, but users are like this: they all want you
> to stop adding features except the one thing they absolutely need.
> (Myhrvold)

Actually, I believe a lot of users don't particularly mind there being lots 
of redundant features around, but presumably THAT sort tends to be 
selected-against wrt Python-Dev (and Strakt employment), while (e.g.) Perl 
(or MS employment) might draw them more.  Still, as long as you keep in your 
field of vision the reality that (excluding the selected-against crowd) every 
new feature you DO add is perceived as a negative by MOST users, I trust 
you'll keep being extremely selective in deciding what IS truly "absolutely" 
needed.

This is the point I mentioned at the start about effects of user base.  Given
that the user base is largish AND biased AGAINST featuritis, it should HELP
you "withstand the pressure to add features"... if you WANT to withstand it.
I.e., you'll mostly get strong support for any stance of "let's NOT add 
this".  You may dislike that when you WANT to add a feature, but surely
not when it's about "withstanding the pressure".


Alex



From jmiller@stsci.edu  Fri Jun 21 14:42:02 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Fri, 21 Jun 2002 09:42:02 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply
 __int__
References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com>              <3D1254E5.6010007@stsci.edu> <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D132D2A.7080801@stsci.edu>

  Guido van Rossum wrote:

>[Todd Miller]
>
>>>>There has been some recent interest in the Numeric/numarray community 
>>>>for using array objects as indices
>>>>for builtin sequences.  I know this has come up before, but to make 
>>>>myself clear, the basic idea is to make the
>>>>following work:
>>>>
>>>>class C:
>>>>   def __int__(self):
>>>>         return 4
>>>>
>>>>object = C()
>>>>
>>>>l = "Another feature..."
>>>>
>>>>print l[object]
>>>>"h"
>>>>
>>>>Are there any plans (or interest) for developing Python in this direction?
>>>>
>
>[Guido]
>
>>>I'm concerned that this will also make floats acceptable as indices
>>>(since they have an __int__ method) and this would cause atrocities
>>>like
>>>
>>>print "hello"[3.5]
>>>
>>>to work.
>>>
>
>[Todd]
>
>>That makes sense.    What if we specifically excluded Float objects from 
>>the conversion?   Are there any types that need to be excluded?    If 
>>
^ other types

>>
>>there's a chance of getting a patch for this accepted,  STSCI is willing 
>>to do the work.
>>
>
>Hm, an exception for a specific type seems ugly.  What if a user
>
I agree.

>
>defines a UserFloat type, or a Rational type, or a FixedPoint type,
>with an __int__ conversion?
>
Perry actually suggested excluding instances of any subclass of Float. I 
see now that there is
also the related problem of excluding instances which act like Floats.

>
>This points to an unfortunate early design flaw in Python (inherited
>from C casts): __int__ has two different meanings -- sometimes it
>converts the type, sometimes it also truncates the value.
>
>I suppose you could hack something where you extract x.__int__() and
>x.__float__() and compare the two, but that could lead to a lot of
>overhead.
>
Sounds too tricky to me. I'd hate to explain it.

>
>
>I hesitate to propose a new special method, but that may be the only
>solution. :-(
>
I liked MvL's __index__ method. I understand your hesitancy. It's pretty 
tough exploring new features side-by-side with the "version fatigue" thread :)

>
>
>What's your use case?  Why do you need this?
>
This might settle it :) Numeric/numarray arrays are sometimes used in 
reduction operations (e.g. max) which eliminate one dimension. Sometimes 
the result is a zero dimensional array, which is currently converted to 
a Python scalar in both Numeric and numarray. The conversion to scalar 
enables integer zero dimensional results to be used as indices, but 
causes other problems since any auxiliary information in the array 
(e.g. type = Int8) is lost. Adding some form of implicit conversion to 
index value might permit us to retain zero dimensional objects as arrays.
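
To make the idea concrete, here is a rough sketch of how such a hook might
look from the user's side -- the method name __index__ and the helper below
are purely illustrative; nothing like this exists in the core today:

    class ZeroDimArray:
        """Toy stand-in for a 0-D integer array."""
        def __init__(self, value, typecode='Int8'):
            self.value = value
            self.typecode = typecode      # auxiliary info we don't want to lose
        def __index__(self):              # hypothetical new special method
            return self.value

    def sequence_index(obj):
        # Roughly what a builtin sequence could do internally: accept real
        # ints, or anything that explicitly declares itself index-like.
        if isinstance(obj, int):
            return obj
        if hasattr(obj, '__index__'):
            return obj.__index__()
        raise TypeError("sequence index must be an integer")

    i = ZeroDimArray(4)
    print "hello world"[sequence_index(i)]    # -> o
    # A float would still be rejected: it has __int__ but would not grow __index__.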

>
>--Guido van Rossum (home page: http://www.python.org/~guido/)
>
>
>_______________________________________________
>Python-Dev mailing list
>Python-Dev@python.org
>http://mail.python.org/mailman/listinfo/python-dev
>







From Jack.Jansen@cwi.nl  Fri Jun 21 14:48:10 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Fri, 21 Jun 2002 15:48:10 +0200
Subject: [Python-Dev] Python strptime
In-Reply-To: <m3adptqhlk.fsf@mira.informatik.hu-berlin.de>
Message-ID: <85F02CFE-851D-11D6-8310-0030655234CE@cwi.nl>

On Tuesday, June 18, 2002, at 07:30 , Martin v. Loewis wrote:

> I wonder what the purpose of having a pure-Python implementation of
> strptime is, if you have to rely on strftime. Is this for Windows only?

MacPython would benefit as well: it also has strftime() but not 
strptime(). There's currently a pure-python implementation of strptime() 
in the Contrib folder, but Brett's solution would be better (the Contrib 
strptime can't be incorporated because it's under a GPL license, and the 
automatic callout to Python would be really nice as it makes this 
invisible to the user).
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -




From guido@python.org  Fri Jun 21 14:56:27 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Jun 2002 09:56:27 -0400
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: Your message of "Fri, 21 Jun 2002 15:38:46 +0200."
 <E17LOdG-0003V6-00@mail.python.org>
References: <3D121F0D.E3B60865@prescod.net> <E17LJ0C-0007G7-00@mail.python.org> <200206211259.g5LCxmx24769@pcp02138704pcs.reston01.va.comcast.net>
 <E17LOdG-0003V6-00@mail.python.org>
Message-ID: <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net>

> This is the point I mentioned at the start about effects of user
> base.  Given that the user base is largish AND biased AGAINST
> featuritis, it should HELP you "withstand the pressure to add
> features"... if you WANT to withstand it.  I.e., you'll mostly get
> strong support for any stance of "let's NOT add this".  You may
> dislike that when you WANT to add a feature, but surely not when
> it's about "withstanding the pressure".

I really have to start packing :-), but I've got one more thing to
say.

You say that the user base is biased against featuritis.  Yet the user
base is the largest source of new feature requests and proposals.  How
do you reconcile these?  You yourself pleaded for PEP 246 just an hour
ago.  Surely that's a big honking new feature!

For the user base as a whole, the Myhrvold quote is even more true.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jun 21 14:59:50 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Jun 2002 09:59:50 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__
In-Reply-To: Your message of "Fri, 21 Jun 2002 09:42:02 EDT."
 <3D132D2A.7080801@stsci.edu>
References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu> <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net>
 <3D132D2A.7080801@stsci.edu>
Message-ID: <200206211359.g5LDxoF25028@pcp02138704pcs.reston01.va.comcast.net>

> I liked MvL's __index__ method. I understand your hesitancy. It's
> pretty tough exploring new features side-by-side the "version
> fatigue" thread :)

Yes.

> >What's your use case?  Why do you need this?

> This might settle it :) Numeric/numarray arrays are sometimes used in 
> reduction operations (e.g. max) which eliminate one dimension. Sometimes 
> the result is a zero dimensional array, which is currently converted to 
> a Python scalar in both Numeric and numarray. The conversion to scalar 
> enables integer zero dimensional results to be used as indices, but 
> causes other problems since any auxilliary information in the array 
> (e.g. type = Int8) is lost. Adding some form of implicit conversion to 
> index value might permit us to retain zero dimensional objects as arrays.

But when you're indexing Numeric/numarray arrays, you have full
control over the interpretation of indices, so you can do this
yourself.  Do you really need to be able to index Python sequences
(lists, tuples) with your 0-dimensional arrays?  Could you live with
having to call int() in those cases?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pinard@iro.umontreal.ca  Fri Jun 21 14:58:37 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 21 Jun 2002 09:58:37 -0400
Subject: [Python-Dev] Re: String substitution: compile-time versus runtime
In-Reply-To: <3D1297ED.3990C30F@prescod.net>
References: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>
 <200206210141.g5L1fDv09800@pcp02138704pcs.reston01.va.comcast.net>
 <3D1297ED.3990C30F@prescod.net>
Message-ID: <oqznxoivhu.fsf@titan.progiciels-bpi.ca>

[Alex Martelli]

> If %(name)s is to be deprecated moving towards Python-3000 (surely it
> can't be _removed_ before then), $-formatting needs a very rich feature
> set; otherwise it can't _replace_ %-formatting.  [...]  The "transition"
> period will thus inevitably offer different ways to perform the same
> tasks [...] the old way and the new way MUST both work together for a
> good while to allow migration.

[Moore, Paul]

> I feel that the existing % formatting operator cannot realistically
> be removed.

I too, like Alex and Paul, have a hard time believing that `%' will
effectively fade out in favour of `$'.  As a few people tried to stress
(Alex did very well with his anecdote), changes in Python are welcome when
they add real new capabilities, but they are less welcome when they merely
add diversity over old substance: the language is then hurt each time,
losing bits of simplicity (and even legibility, through the development
of Python subsets in user habits).  Each individual loss may be seen as
insignificant when discussed separately[1], but when the pace of change
is high, the losses accumulate, especially if the cleanup does not occur.

This is why any change in current string interpolation should be crafted
so it fits _very_ naturally with what already exists, and does not look
like another feature patched over other features.  A forever "transition"
period between two interpolation paradigms, foreign to one another, might
give exactly that bad impression.

--------------------
[1] This is one of the drawbacks of the PEP system.  By concentrating on
individual features, we lose the vision of all features taken together.
Only Guido has a global vision. :-)

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From aleax@aleax.it  Fri Jun 21 15:19:50 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 21 Jun 2002 16:19:50 +0200
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net>
References: <3D121F0D.E3B60865@prescod.net> <E17LOdG-0003V6-00@mail.python.org> <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17LPGw-0002Yf-00@mail.python.org>

On Friday 21 June 2002 03:56 pm, Guido van Rossum wrote:
	...
> You say that the use base is biased against featuritis.  Yet the user
> base is the largest source of new feature requests and proposals.  How
> do you reconcile these?  

I hypothesized that, because of self-selection effects, Python's user base 
(particularly on Python-Dev and in Python-only firms) is biased against 
featuritis _when compared_ to the general population, which (I opine)
includes a wider proportion of people who don't particularly mind a language
having many redundant features.  There is obviously nothing that needs to be 
"reconciled" between this hypothesized sample-bias and the observation
that requests for features come more from the user base than from anywhere
else (who else would you expect them to come FROM -- people who've never
HEARD of Python...?-).  So, you're either joking or subject to a common and 
quite understandable "statistical fallacy".  Never forget Bayes's 
Theorem...!-)

> You yourself pleaded for PEP 246 just an hour
> ago.  Surely that's a big honking new feature!

I prefer to think of it as a framework that lets most type-casts, type-tests, 
special purpose type-conversion methods, and the like, be avoided WITHOUT
adding a zillion little ad-hoc features.  But of course you could choose to
view it differently:-).


Alex



From skip@pobox.com  Fri Jun 21 15:26:35 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 21 Jun 2002 09:26:35 -0500
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <m34rfxowsn.fsf@mira.informatik.hu-berlin.de>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
 <15632.62564.638418.191453@localhost.localdomain>
 <20020619212559.GC18944@zot.electricrain.com>
 <15633.1338.367283.257786@localhost.localdomain>
 <20020620205041.GD18944@zot.electricrain.com>
 <m34rfxowsn.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15635.14235.79608.390983@beluga.mojam.com>

    Greg> should we keep the existing bsddb around as oldbsddb for users in
    Greg> that situation?

    Martin> I don't think so; users could always extract the module from
    Martin> older distributions if they want to.

I would prefer the old version be moved to lib-old (or Modules-old?).  For
people still running DB 2.x it shouldn't be a major headache to retrieve.

Skip



From guido@python.org  Fri Jun 21 15:45:57 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Jun 2002 10:45:57 -0400
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: Your message of "Fri, 21 Jun 2002 16:19:50 +0200."
 <E17LPGw-0002Yf-00@mail.python.org>
References: <3D121F0D.E3B60865@prescod.net> <E17LOdG-0003V6-00@mail.python.org> <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net>
 <E17LPGw-0002Yf-00@mail.python.org>
Message-ID: <200206211445.g5LEjvh25287@pcp02138704pcs.reston01.va.comcast.net>

> I hypothesized that, because of self-selection effects, Python's
> user based (particularly on Python-Dev and in Python-only firms) is
> biased against featurities _when compared_ to the general
> population, which (I opine) includes a wider proportion of people
> who don't particularly mind a language having many redundant
> features.  There is obviously nothing that needs to be "reconciled"
> between this hypothesized sample-bias and the observation that
> requests for features come more from the user base than from (who
> else would you expect them to come FROM -- people who've never HEARD
> about Python...?-).  So, you're either joking or subject to a common
> and quite understandable "statistical fallacy".  Never forget
> Bayes's Theorem...!-)

I dunno.  Most feature proposals come from the c.l.py crowd, and
that's also the place where the loudest clamor for a stop to the
featuritis was heard.

And I believe that even those who consider themselves strongly
anti-featuritis still have one or two pet features that they really
need (even self-proclaimed arch-conservative Christian Tismer, who
went so far as to develop his own version of the language because he
couldn't get his pet feature adopted).

> > You yourself pleaded for PEP 246 just an hour
> > ago.  Surely that's a big honking new feature!
> 
> I prefer to think of it as a framework that lets most type-casts,
> type-tests, special purpose type-conversion methods, and the like,
> be avoided WITHOUT adding a zillion little ad-hoc features.  But of
> course you could choose to view it differently:-).

Surely it would be a dramatic change, probably deeper than new-style
classes and generators together.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tismer@tismer.com  Fri Jun 21 16:10:10 2002
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 21 Jun 2002 17:10:10 +0200
Subject: [Python-Dev] *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net> <E17LOdG-0003V6-00@mail.python.org> <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net>              <E17LPGw-0002Yf-00@mail.python.org> <200206211445.g5LEjvh25287@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D1341D2.9020805@tismer.com>

Guido van Rossum wrote:
...

> And I believe that even those who consider themselves strongly
> anti-featuritis still have one or two pet features that they really
> need (even self-proclaimed arch-conservative Christian Tismer, who
> went so far as to develop his own version of the language because he
> couldn't get his pet feature adopted).

ROTFLMAO yes, how could I forget. :-)

Well, the real story is this: I created some problems
and some solutions, made people dependent on them,
and now I make my living from maintenance work.

>>I prefer to think of it as a framework that lets most type-casts,
>>type-tests, special purpose type-conversion methods, and the like,
>>be avoided WITHOUT adding a zillion little ad-hoc features.  But of
>>course you could choose to view it differently:-).
> 
> Surely it would be a dramatic change, probably deeper than new-style
> classes and generators together.

Sounds as if this would both be very powerful and
might shrink the code base at the same time.
One of the things I like the best.

have-to-start-packing,-too - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From guido@python.org  Fri Jun 21 16:19:32 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Jun 2002 11:19:32 -0400
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: Your message of "Fri, 21 Jun 2002 17:10:10 +0200."
 <3D1341D2.9020805@tismer.com>
References: <3D121F0D.E3B60865@prescod.net> <E17LOdG-0003V6-00@mail.python.org> <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net> <E17LPGw-0002Yf-00@mail.python.org> <200206211445.g5LEjvh25287@pcp02138704pcs.reston01.va.comcast.net>
 <3D1341D2.9020805@tismer.com>
Message-ID: <200206211519.g5LFJWl25427@pcp02138704pcs.reston01.va.comcast.net>

> > Surely it would be a dramatic change, probably deeper than new-style
> > classes and generators together.
> 
> Sounds as if this would both be very powerful and
> might shrink the code base at the same time.
> One of the things I like the best.

See?  Given sufficiently clever presentation, even the most
conservative users can be made to want new things.  Advertisers know
this, of course. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jun 21 16:28:42 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 21 Jun 2002 11:28:42 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <200206201730.g5KHUlP04117@odiug.zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBJPPAA.tim.one@comcast.net>

[Guido]
> ...
> (The main problem with `...` is that many people can't distinguish
> between ` and ', as user testing has shown.)

Including Tim testing, which is dear to my heart.  The editor I usually use
allows defining styles (font, size, color, etc) for syntactic elements, and
for Python files I set it up so that the backtick has its own style, 1.5x
bigger than all other characters.  This makes it very easy to see the
backticks as such, but mostly(!) because it forces extra vertical space
above a line containing one.  That's more emphasis than even a Tim needs.
OTOH, I have no trouble seeing lowercase "x" <wink>.

xabcx==repr(abc)-ly y'rs  - tim




From tim.one@comcast.net  Fri Jun 21 16:33:14 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 21 Jun 2002 11:33:14 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <200206201746.g5KHkwH04175@odiug.zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEBKPPAA.tim.one@comcast.net>

[Guido, quotes Christian]
>> The following statements are ordered by increasing hate.
>> 1 - I do hate the idea of introducing a "$" sign at all.
>> 2 - giving "$" special meaning in strings via a module
>> 3 - doing it as a builtin function
>> 4 - allowing it to address local/global variables

[and adds]
> Doesn't 4 contradict your +1 on allvars()?

Since Christian's reply only increased the apparent contradiction, allow me
to channel:  they are ordered by increasing hate, but starting at the
bottom.  s/increasing/decreasing/ in his original, or s/hate/love/, and you
can continue to read it in the top-down Dutch way <wink>.




From tismer@tismer.com  Fri Jun 21 16:53:36 2002
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 21 Jun 2002 17:53:36 +0200
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCGEBKPPAA.tim.one@comcast.net>
Message-ID: <3D134C00.2090205@tismer.com>

Tim Peters wrote:
> [Guido, quotes Christian]
> 
>>>The following statements are ordered by increasing hate.
>>>1 - I do hate the idea of introducing a "$" sign at all.
>>>2 - giving "$" special meaning in strings via a module
>>>3 - doing it as a builtin function
>>>4 - allowing it to address local/global variables
>>
> 
> [and adds]
> 
>>Doesn't 4 contradict your +1 on allvars()?
> 
> 
> Since Christian's reply only increased the apparent contradiction, allow me
> to channel:  they are ordered by increasing hate, but starting at the
> bottom.  s/increasing/decreasing/ in his original, or s/hate/love/, and you
> can continue to read it in the top-down Dutch way <wink>.

Huh?
Reading from top to bottom, as I usually do, I see increasing
numbers, which are in the same order as the "increasing hate"
(not a linear function, but the same ordering).

4 - allowing it to address local/global variables
is what I hate the most.
This is in no contradiction to allvars(), which is simply
a function that puts some variables into a dict, thereby
decoupling the interpolation from variable access.
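
For what it's worth, a minimal sketch of what such an allvars() helper could
look like -- the name is from the earlier thread, and the body below is just
one possible reading of it, not an existing function:

    import sys

    def allvars(depth=1):
        # Snapshot the caller's globals and locals into one ordinary dict;
        # any interpolation then works only on what you explicitly hand it.
        frame = sys._getframe(depth)
        d = frame.f_globals.copy()
        d.update(frame.f_locals)
        return d

    name = 'world'
    print 'hello %(name)s' % allvars()    # works today with %-interpolation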

Where is the problem, please?

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From aleax@aleax.it  Fri Jun 21 16:55:34 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 21 Jun 2002 17:55:34 +0200
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <200206211445.g5LEjvh25287@pcp02138704pcs.reston01.va.comcast.net>
References: <3D121F0D.E3B60865@prescod.net> <E17LPGw-0002Yf-00@mail.python.org> <200206211445.g5LEjvh25287@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17LQl9-0003Bh-00@mail.python.org>

On Friday 21 June 2002 04:45 pm, Guido van Rossum wrote:
	...[re PEP 246]...
> Surely it would be a dramatic change, probably deeper than new-style
> classes and generators together.

Rarely does one catch Guido (or most any Dutch, I believe) in such
a wild overbid.  Heat getting to you?-)

Protocol-Adaptation is (I believe) a nice idea, but somewhat of a
marginal one when compared e.g. to new-style classes (a change
whose consequences still haven't finished propagating through the
language -- witness the recent issues about making exception
classes new-style vs keeping them classic) -- and most evidently
so if you ALSO add generators to that side of the scales!-)


Alex



From jmiller@stsci.edu  Fri Jun 21 17:00:27 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Fri, 21 Jun 2002 12:00:27 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply
 __int__
References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu> <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net>              <3D132D2A.7080801@stsci.edu> <200206211359.g5LDxoF25028@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D134D9B.7030601@stsci.edu>

Guido van Rossum wrote:

>>I liked MvL's __index__ method. I understand your hesitancy. It's
>>pretty tough exploring new features side-by-side the "version
>>fatigue" thread :)
>>
>
>Yes.
>
>>>What's your use case?  Why do you need this?
>>>
>
>>This might settle it :) Numeric/numarray arrays are sometimes used in 
>>reduction operations (e.g. max) which eliminate one dimension. Sometimes 
>>the result is a zero dimensional array, which is currently converted to 
>>a Python scalar in both Numeric and numarray. The conversion to scalar 
>>enables integer zero dimensional results to be used as indices, but 
>>causes other problems since any auxilliary information in the array 
>>(e.g. type = Int8) is lost. Adding some form of implicit conversion to 
>>index value might permit us to retain zero dimensional objects as arrays.
>>
>
>But when you're indexing Numeric/numarray arrays, you have full
>control over the interpretation of indices, so you can do this
>yourself.  
>
Yes.  We do this now in numarray.

>Do you really need to be able to index Python sequences
>(lists, tuples) with your 0-dimensional arrays?
>
We want to.  Here's why:

1) Currently, when you fully reduce or subscript a numarray, you get 
back a python scalar.  This has the disadvantages that:

a. information (type=Int8) is lost
b. precision (Float128 --> types.FloatType) can be lost.
c. subsequent code must handle multiple types:
 
   result = some_array_operation()
   if type(result) in PythonScalarTypes:
      do_it_the_scalar_way()
   else:
      do_it_the_array_way()

2) 0-D arrays can be represented as simple scalars using __repr__.  This 
creates a convenient illusion that a 0-D array is just a number.  0-D 
arrays solve all of the problems cited in 1.  But 0-D arrays introduce 
one new problem:

a. 0-D arrays don't work as builtin sequence indices, destroying the 
illusion that what lies at the bottom of an array is just a number.  If 
a fix for this was conceivable,  we'd be willing to do the frontend work 
to make it happen.

>Could you live with
>having to call int() in those cases?
>
Yes and no.  I think there are two disadvantages:

a. There is a small notational overhead, and the need to remember it. 
 In terms of the illusion that a 0-D array is a number, this will be a 
point of confusion.

b. Once int() is added, the semantics of the code are narrower than they 
used to be.  The same code, called with an array as the sequence, might 
otherwise accept an array as the index.  Once int() is used, this can 
no longer reasonably happen, since int(multi_valued_array) should raise 
an exception.

Thanks for your attention on this thread.  Have a nice vacation!

Todd

>
>--Guido van Rossum (home page: http://www.python.org/~guido/)
>
>
>_______________________________________________
>Python-Dev mailing list
>Python-Dev@python.org
>http://mail.python.org/mailman/listinfo/python-dev
>


-- 
Todd Miller 			jmiller@stsci.edu
STSCI / SSG			(410) 338 4576






From python@rcn.com  Fri Jun 21 16:59:05 2002
From: python@rcn.com (Raymond Hettinger)
Date: Fri, 21 Jun 2002 11:59:05 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCGEBKPPAA.tim.one@comcast.net>
Message-ID: <002f01c2193c$925a0360$a5f8a4d8@othello>

> [Guido, quotes Christian]
> >> The following statements are ordered by increasing hate.
> >> 1 - I do hate the idea of introducing a "$" sign at all.
> >> 2 - giving "$" special meaning in strings via a module
> >> 3 - doing it as a builtin function
> >> 4 - allowing it to address local/global variables
>
> [and adds]
> > Doesn't 4 contradict your +1 on allvars()?
>
[Tim]
> Since Christian's reply only increased the apparent contradiction, allow
me
> to channel:  they are ordered by increasing hate, but starting at the
> bottom.  s/increasing/decreasing/ in his original, or s/hate/love/, and
you
> can continue to read it in the top-down Dutch way <wink>.

template = [
  '$linenum - I do $feeling the idea of introducing the "$$" sign at all.',
  '$linenum - giving "$$" special meaning in strings via a module',
  '$linenum - doing it as a builtin function',
  '$linenum - allowing it to address local/global variables'
]

feeling = 'hate'
if 'Dutch' in options:
    feeling = 'love'
    template = template[::-1]        # cool new feature
print 'The following statements are ordered by increasing $feeling.'.sub()
for cnt, line in enumerate(template):   # cool new feature
    linenum = cnt+1  # still wish enumerate had an optional start arg
    print line.sub()        # aspiring cool new feature


'regnitteh dnomyar'[::-1]





From mclay@nist.gov  Fri Jun 21 16:57:50 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 21 Jun 2002 11:57:50 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>
References: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>
Message-ID: <200206211157.50972.mclay@nist.gov>

On Thursday 20 June 2002 06:48 pm, Ka-Ping Yee wrote:
> On Thu, 20 Jun 2002, Oren Tirosh wrote:
> > See http://tothink.com/python/embedpp
>
> Hi Oren,
>
> Your proposal brings up some valid concerns with PEP 215:
>
>     1. run-time vs. compile-time parsing
>     2. how to decide what's an expression
>     3. balanced quoting instead of $
>

I like Oren's PEP as a replacement for PEP 292. But there is one major problem 
with his notation. I would change the "`" character to something more 
readable. I tried examples with "@", "$", "%", "!", and "?". My preference 
was "?", "@", or "$". (The choice should consider the easy of typing on 
international keyboards.)  The "?" seems like a good choice because the 
replacement expresssion will answer the question of what will appear in the 
string at that location. Here is Oren's example using the "?" to quote the 
expression.

print e"X=?x?, Y=?calc_y(x)?."

The following example is provided for contrast. It has a larger text to 
variable substitution ratio.

p = e"""A new character prefix "e" is defined for strings.  This prefix
    precedes the 'u' and 'r' prefixes, if present. Capital 'E' is also
    acceptable. Within an e-string any ?expressions? enclosed in
    backquotes are evaluated, converted to strings using the
    equivalent of the ?str()? function and embedded in-place into the
    e-string."""

In the larger body of text the "?" is clearly visible. I'm not so sure I like 
the "?" in the smaller example. It may be because the "?" looks too much like 
letters that can appear in a variable name.

The "@" stands out a bit better than "?". This is probably because there are 
more pixels turned on and the character is fatter.

print e"X=@x@, Y=@calc_y(x)@."

p = e"""A new character prefix "e" is defined for strings.  This prefix
    precedes the 'u' and 'r' prefixes, if present. Capital 'E' is also
    acceptable. Within an e-string any @expressions@ enclosed in
    backquotes are evaluated, converted to strings using the
    equivalent of the @str()@ function and embedded in-place into the
    e-string."""

The function of the "$" would be recognizable to people migrating from other 
languages, but it would be used as a balanced quote, rather than as a 
starting character in a variable that will be substituted. (Is this character 
easy to type on non-US keyboards? I thought the "$" was one of the characters 
that are replaced on European keyboards.)  If the "@" is available on 
international keyboards then I think it would be a better choice.

print e"X=$x$, Y=$calc_y(x)$."

p = e"""A new character prefix "e" is defined for strings.  This prefix
    precedes the 'u' and 'r' prefixes, if present. Capital 'E' is also
    acceptable. Within an e-string any $expressions$ enclosed in
    backquotes are evaluated, converted to strings using the
    equivalent of the $str()$ function and embedded in-place into the
    e-string."""





From skip@pobox.com  Fri Jun 21 17:36:26 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 21 Jun 2002 11:36:26 -0500
Subject: [Python-Dev] strptime recapped
In-Reply-To: <Pine.SOL.4.44.0206201622310.11785-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0206201622310.11785-100000@death.OCF.Berkeley.EDU>
Message-ID: <15635.22026.914241.398242@beluga.mojam.com>

    Brett> 4) location of strptime:
    Brett> Skip asked why Guido was having me write the callout patch to
    Brett> timemodule.c.  He wondered why Lib/time.py wasn't just created
    Brett> holding my code and then renaming timemodule.c to _timemodule.c
    Brett> and importing it at the end of time.py.  No response has been
    Brett> given thus far for that.

This is what's keeping me from going further.  I did run the test suite
against the latest version with no problem.  I think making the current time
module call out to a new strptime module is the wrong way to do things,
especially given past practice (socket/_socket, string/strop, etc).  I would
prefer a time.py module be created to hold Brett's strptime function.  On
import, the last thing it would do is import * from _time, which
would obliterate Brett's Python version if the platform supports strptime().
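
Roughly, the layout I have in mind (a sketch only; it assumes the C module
really is renamed to _time, and the fallback body is just a placeholder):

    # Lib/time.py (sketch)

    def strptime(string, format='%a %b %d %H:%M:%S %Y'):
        """Pure-Python fallback; Brett's implementation would live here."""
        raise NotImplementedError("pure-Python strptime goes here")

    # Pull in everything the C module provides *last*, so a platform
    # strptime() (plus time(), localtime(), etc.) overrides the fallback.
    from _time import *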

    Brett> I also suggested a possible time2 where things like strptime, my
    Brett> helper fxns (calculate the Julian date from the Gregorian date,
    Brett> etc.), and things such as naivetime could be kept.  

That's well beyond the scope of this patch.  I'd rather not address it at
this point (at least not on this thread).  I'd prefer to just focus on how
best to add strptime() to platforms without a libc version.

Skip



From skip@pobox.com  Fri Jun 21 17:42:52 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 21 Jun 2002 11:42:52 -0500
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <E17LJ09-0007B0-00@mail.python.org>
References: <3D121F0D.E3B60865@prescod.net>
 <200206202121.g5KLLPT05634@odiug.zope.com>
 <E17LJ09-0007B0-00@mail.python.org>
Message-ID: <15635.22412.414350.293111@beluga.mojam.com>

    >> Now back to $ vs. %.  I think I can defend having both in the
    >> language, but only if % is reduced to the positional version (classic
    >> printf).  This would be used mostly to format numerical data with
    >> fixed column width.  There would be very little overlap in use cases:

Overlap or not, you wind up with two things that look very much alike doing
nearly identical things.

-1...

Skip



From guido@python.org  Fri Jun 21 17:58:11 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Jun 2002 12:58:11 -0400
Subject: [Python-Dev] strptime recapped
In-Reply-To: Your message of "Fri, 21 Jun 2002 11:36:26 CDT."
 <15635.22026.914241.398242@beluga.mojam.com>
References: <Pine.SOL.4.44.0206201622310.11785-100000@death.OCF.Berkeley.EDU>
 <15635.22026.914241.398242@beluga.mojam.com>
Message-ID: <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net>

> This is what's keeping me from going further.  I did run the test
> suite against the latest version with no problem.  I think making
> the current time module call out to a new strptime module is the
> wrong way to do things, especially given past practice
> (socket/_socket, string/strop, etc).  I would prefer a time.py
> module be created to hold Brett's strptime function.  On import, the
> last thing it would try doing is to import * from _time, which would
> obliterate Brett's Python version if the platform supports
> strptime().

That's only a good idea if Brett's Python code has absolutely no
features beyond the C version.

I'm -0 on the time.py idea -- it seems it would churn things around
more than absolutely necessary.  But you're right about the socket
precedent.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From python@rcn.com  Fri Jun 21 18:16:49 2002
From: python@rcn.com (Raymond Hettinger)
Date: Fri, 21 Jun 2002 13:16:49 -0400
Subject: [Python-Dev] Behavior of buffer()
Message-ID: <005001c21947$6eb31e00$adb53bd0@othello>

I would like to solicit py-dev's thoughts on the best way to resolve a bug,
www.python.org/sf/546434 .

The root problem is that mybuf[:] returns a buffer type and mybuf[2:4]
returns a string type.  A similar issue exists for buffer repetition.
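
A quick interactive sketch of the inconsistency (this just restates what the
tracker item reports, in session form):

    >>> mybuf = buffer('abcdefgh')
    >>> type(mybuf[:])          # the full slice hands back the buffer itself
    <type 'buffer'>
    >>> mybuf[:] is mybuf
    1
    >>> type(mybuf[2:4])        # a partial slice comes back as a string
    <type 'str'>
    >>> mybuf[2:4]
    'cd'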

One way to go is to have the slices always return a string.  If code
currently relies on the type of a buffer slice, it is more likely to be
relying on it being a string as in:  print mybuf[:4].  This is an intuitive
guess because I can't find empirical evidence.  Another reason to choose a
string return type is that buffer() appears to have been designed to be as
stringlike as possible so that it can be easily substituted in code
originally designed for strings.

The other way to go is to return a buffer object every time.  Slices usually,
but not always (see subclasses of list), return the same type that was being
sliced.  If we choose this route, another issue remains -- mybuf[:] returns
self instead of a new buffer.  I think that behavior is also a bug and
should be changed to be consistent with the Python idiom where:
  b = a[:]
  assert id(a) != id(b)

Incidental to the above, GvR had a thought that slice repetition ought to
always return an error.  Though I don't see any use cases for buffer
repetition, buffer objects do implement all other sequence behaviors and I
think it would be weird to nullify the sq_repeat slot.

I appreciate your thoughts on the best way to proceed.

fixing-bugs-is-easier-than-deciding-appropriate-behavior-ly yours,


'regnitteh dnomyar'[::-1]

















From skip@pobox.com  Fri Jun 21 18:29:59 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 21 Jun 2002 12:29:59 -0500
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <E17LOdG-0003V6-00@mail.python.org>
References: <3D121F0D.E3B60865@prescod.net>
 <E17LJ0C-0007G7-00@mail.python.org>
 <200206211259.g5LCxmx24769@pcp02138704pcs.reston01.va.comcast.net>
 <E17LOdG-0003V6-00@mail.python.org>
Message-ID: <15635.25239.56486.554175@beluga.mojam.com>

    Alex> This is the point I mentioned at the start about effects of user
    Alex> base.  Given that the user base is largish AND biased AGAINST
    Alex> featuritis, it should HELP you "withstand the pressure to add
    Alex> features"... if you WANT to withstand it.  I.e., you'll mostly get
    Alex> strong support for any stance of "let's NOT add this".  You may
    Alex> dislike that when you WANT to add a feature, but surely not when
    Alex> it's about "withstanding the pressure".

Alex,

I think you're missing one point.  As the Python user base grows, even
though the majority of people are comfortable with the status quo (and
silent most of the time), more people who do want some changes are added
to the mix, and more strident voices clamoring for change appear.  Guido
isn't cloned to keep up with the increasing user base, however.

(I'm obviously picking numbers out of thin air in what follows.) If you go
from 100,000 users, 100 of whom would like their favorite bit from the last
language they used added to Python, and 1 of whom is a crackpot who just
won't take "no" for an answer, to 1,000,000 users, you probably have 10
crackpots and 1,000 less strident voices now clamoring for change.  You also
probably have multiple proposals for similar changes (like string
interpolation - everybody has their favorite scheme, whether it's $name,
${name}, %(name)s, or <<name>>).  You still have just one BDFL, however.  He
has more inputs to consider, and has to figure out who among the much larger
masses are the crackpots.  And some of the arguments, whether they come from
crackpots or not, are fairly convincing.  Makes it tougher to resist change.

Skip



From joe@notcharles.ca  Fri Jun 21 18:54:37 2002
From: joe@notcharles.ca (Joe Mason)
Date: Fri, 21 Jun 2002 12:54:37 -0500
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
Message-ID: <20020621175437.GA1213@plover.net>

Tim wrote:
> [Guido, quotes Christian]
> >> The following statements are ordered by increasing hate.
> >> 1 - I do hate the idea of introducing a "$" sign at all.
> >> 2 - giving "$" special meaning in strings via a module
> >> 3 - doing it as a builtin function
> >> 4 - allowing it to address local/global variables
> 
> [and adds]
> > Doesn't 4 contradict your +1 on allvars()?
> 
> Since Christian's reply only increased the apparent contradiction, allow me
> to channel:  they are ordered by increasing hate, but starting at the
> bottom.  s/increasing/decreasing/ in his original, or s/hate/love/, and you
> can continue to read it in the top-down Dutch way <wink>.

If you'll allow me to counter-channel:

Christian hates giving this special syntax form access to local/global
variables, since it's a security risk that's not apparent unless you
know what you're looking for.  He prefers to use allvars() to achieve
the same end, since it's explicit.

He's not opposed to variable access in general.  Write-only variables
don't tend to find much use.

Joe



From David Abrahams" <david.abrahams@rcn.com  Fri Jun 21 18:52:10 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 21 Jun 2002 13:52:10 -0400
Subject: [Python-Dev] *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net> <E17LOdG-0003V6-00@mail.python.org> <200206211356.g5LDuRj25007@pcp02138704pcs.reston01.va.comcast.net> <E17LPGw-0002Yf-00@mail.python.org>
Message-ID: <145001c2194c$ac570c30$6601a8c0@boostconsulting.com>

From: "Alex Martelli" <aleax@aleax.it>

> > You yourself pleaded for PEP 246 just an hour
> > ago.  Surely that's a big honking new feature!
>
> I prefer to think of it as a framework that lets most type-casts,
type-tests,
> special purpose type-conversion methods, and the like, be avoided WITHOUT
> adding a zillion little ad-hoc features.

Such a strong endorsement from you made me go take a cursory look; I think
I'd be -1 on this in its current form. It seems like an intrusive mechanism
in that it forces the adapter or the adaptee to know how to do the job.
Given libraries A and B, can I do something to allow them to interoperate
without modifying them?

Conversely, is there a reasonably "safe" way to add adaptations to an
existing type from the outside? I'm thinking of some analogy to
specialization of traits in C++, here.
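
To make the question concrete, here is a toy of the kind of external registry
I have in mind -- my own sketch, not what PEP 246 actually specifies, and the
names in the commented example are hypothetical:

    # Non-intrusive adaptation glue: neither library A's classes nor
    # library B's protocols need to be modified.
    _adapters = {}

    def register_adapter(klass, protocol, factory):
        _adapters[(klass, protocol)] = factory

    def adapt(obj, protocol):
        factory = _adapters.get((obj.__class__, protocol))
        if factory is None:
            raise TypeError("no adapter registered for %r -> %r"
                            % (obj.__class__, protocol))
        return factory(obj)

    # e.g.  register_adapter(liba.Rope, "string-like", RopeAsString)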

-Dave





From oren-py-d@hishome.net  Fri Jun 21 19:09:03 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 21 Jun 2002 14:09:03 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <200206211157.50972.mclay@nist.gov>
References: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy> <200206211157.50972.mclay@nist.gov>
Message-ID: <20020621180903.GA66506@hishome.net>

On Fri, Jun 21, 2002 at 11:57:50AM -0400, Michael McLay wrote:
> On Thursday 20 June 2002 06:48 pm, Ka-Ping Yee wrote:
> > On Thu, 20 Jun 2002, Oren Tirosh wrote:
> > > See http://tothink.com/python/embedpp
> >
> > Hi Oren,
> >
> > Your proposal brings up some valid concerns with PEP 215:
> >
> >     1. run-time vs. compile-time parsing
> >     2. how to decide what's an expression
> >     3. balanced quoting instead of $
> >
> 
> I like Oren's PEP as a replacement for PEP 292. But there is one major 
> problem with his notation. I would change the "`" character to something 
> more readable. 

Expression embedding, unlike interpolation, is done at compile time. This
would make it natural to use the same prefix used for inserting other kinds
of special stuff into strings at compile-time - the backslash.

print "X=\(x), Y=\(calc_y(x))."

No need for double backslash. No need for a special string prefix either 
because \( currently has no meaning.
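
For comparison, the same line spelled with today's %-formatting (the baseline
any new notation has to beat; x and calc_y are just the names from the
example above):

    print "X=%s, Y=%s." % (x, calc_y(x))
    # or keyed by name (expressions then have to be precomputed):
    print "X=%(x)s, Y=%(y)s." % {'x': x, 'y': calc_y(x)}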

	Oren




From skip@pobox.com  Fri Jun 21 19:19:12 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 21 Jun 2002 13:19:12 -0500
Subject: [Python-Dev] strptime recapped
In-Reply-To: <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.SOL.4.44.0206201622310.11785-100000@death.OCF.Berkeley.EDU>
 <15635.22026.914241.398242@beluga.mojam.com>
 <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15635.28192.396040.21104@beluga.mojam.com>

I am getting more and more frustrated with the way things are going here
lately.  At this point I would be more than happy to pass off anything
that's assigned to me to other people and just unsubscribe from python-dev.
I feel like there are enormous contradictions in the way different changes
to Python are being addressed.  If you want to take over any of these bugs
or patches, feel free:

    411881 Use of "except:" in modules
    569574 plain text enhancement for cgitb
    542562 clean up trace.py
    474274 Pure Python strptime() (PEP 42)
    541694 whichdb unittest


    >> I would prefer a time.py module be created to hold Brett's strptime
    >> function.  On import, the last thing it would try doing is to import
    >> * from _time, which would obliterate Brett's Python version if the
    >> platform supports strptime().

    Guido> That's only a good idea if Brett's Python code has absolutely no
    Guido> features beyond the C version.

I don't understand what you mean.  Guido's probably gone by now.  Perhaps
someone can channel him.  I am clearly missing something obvious, but I
don't see any support for the argument that having the existing time module
call out to a separate Python module makes a lot of sense.  (Other than the
fact that it comes from the BDFL, of course.)

If we put Brett's changes into time.py (I'd argue that initially all we want
is strptime(), but can live with the other stuff assuming it's tested -
after all, it has to go somewhere), then

    from _time import *

at the bottom, the only thing to be eliminated would be his version of
strptime(), and only if the platform libc didn't support it.  The
Gregorian/Julian date stuff would remain.  If you don't want them exposed in
time, just prefix them with underscores and don't add them to time.__all__.

The original patch

    http://python.org/sf/474274

was just about adding strptime() to the time module.  All PEP 42 asked for
was

    Add a portable implementation of time.strptime() that works in clearly
    defined ways on all platforms.

All the other stuff is secondary in my mind to making time.strptime()
universally available and should be dealt with separately.

If performance isn't a big issue (I doubt it will be most of the time), I
can see dropping the C version of time.strptime altogether.  I still think
the best way to add new stuff which is written in Python to the time module
is to have time.py be the front-end module and have it import other stuff
from a C-based _time module.

Skip



From guido@python.org  Fri Jun 21 19:49:19 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Jun 2002 14:49:19 -0400
Subject: [Python-Dev] strptime recapped
In-Reply-To: Your message of "Fri, 21 Jun 2002 13:19:12 CDT."
 <15635.28192.396040.21104@beluga.mojam.com>
References: <Pine.SOL.4.44.0206201622310.11785-100000@death.OCF.Berkeley.EDU> <15635.22026.914241.398242@beluga.mojam.com> <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net>
 <15635.28192.396040.21104@beluga.mojam.com>
Message-ID: <200206211849.g5LInJg26550@pcp02138704pcs.reston01.va.comcast.net>

(I'm still here, for maybe another hour.)

> I am getting more and more frustrated with the way things are going here
> lately.  At this point I would be more than happy to pass off anything
> that's assigned to me to other people and just unsubscribe from python-dev.
> I feel like there are enormous contradictions in the way different changes
> to Python are being addressed.

This sounds like a reference to something I've said but I don't get it.

> If you want to take over any of these bugs or patches, feel free:
> 
>     411881 Use of "except:" in modules
>     569574 plain text enhancement for cgitb
>     542562 clean up trace.py
>     474274 Pure Python strptime() (PEP 42)
>     541694 whichdb unittest

From time to time we all get frustrated.  I, too, wish things would
move along quicker.  One thing that may not be obvious is that most of
PythonLabs' resources (myself to some extent excluded) have been
consumed by Zope projects recently, significantly reducing the time we
can spend on moving Python projects along.  This is part of our deal
with Zope Corp: they pay our salaries, we have to spend over half our
time on Zope Corp projects.  That's on average: sometimes we spend
weeks or more exclusively on Python stuff, other times we spend weeks
working on Zope Corp stuff nearly full time.

>     >> I would prefer a time.py module be created to hold Brett's strptime
>     >> function.  On import, the last thing it would try doing is to import
>     >> * from _time, which would obliterate Brett's Python version if the
>     >> platform supports strptime().
> 
>     Guido> That's only a good idea if Brett's Python code has absolutely no
>     Guido> features beyond the C version.
> 
> I don't understand what you mean.  Guido's probably gone by now.  Perhaps
> someone can channel him.  I am clearly missing something obvious, but I
> don't see any support for the argument that having the existing time module
> call out to a separate Python module makes a lot of sense.  (Other than the
> fact that it comes from the BDFL, of course.)

I meant that if Brett's code has useful features not found in the
standard strptime, it should be available explicitly (for those who
want the extra features) and not be overwritten by the C version even
if it exists.

I'm not sure what your objection is against calling out to Python from
C.  We do it all the time, e.g. in PyErr_Warn().

I guess my objection (of -0 strength) against renaming time to _time
is that you'd have to fix a dozen or so build recipes for all sorts of
exotic platforms.  The last time something like this was done (for
new, a much less popular module than time) the initial change set
broke the Windows build, and I think I saw Mac build changes for this
issue checked in today or yesterday.

We can avoid all that if the time module calls out to strptime.py.

> If we put Brett's changes into time.py (I'd argue that initially all we want
> is strptime(), but can live with the other stuff assuming it's tested -
> after all, it has to go somewhere), then
> 
>     from _time import *
> 
> at the bottom, the only thing to be eliminated would be his version of
> strptime(), and only if the platform libc didn't support it.  The
> Gregorian/Julian date stuff would remain.  If you don't want them exposed in
> time, just prefix them with underscores and don't add them to time.all.

It's not that I don't want to expose them.  I haven't seen them, so I
don't know how useful they are.

However (as I have tried to point out a few times now in response to
proposed changes to calendar.py) I plan to introduce the new datetime
type that's currently living in nondist/sandbox/datetime/, either the
Python version or the C version if we find time to finish it.  This
has all the date/time calculations you want, can represent years from
AD 0 till 9999 (we can easily extend it if that's not enough :-) and I
would like all code in need of date/time calculations to be based on
this rather than grow more ad-hoc approaches to doing essentially the
same.

> The original patch
> 
>     http://python.org/sf/474274
> 
> was just about adding strptime() to the time module.  All PEP 42 asked for
> was
> 
>     Add a portable implementation of time.strptime() that works in clearly
>     defined ways on all platforms.
> 
> All the other stuff is secondary in my mind to making time.strptime()
> universally available and should be dealt with separately.

Correct.

> If performance isn't a big issue (I doubt it will be most of the time), I
> can see dropping the C version of time.strptime altogether.  I still think
> the best way to add new stuff which is written in Python to the time module
> is to have time.py be the front-end module and have it import other stuff
> from a C-based _time module.

I hope I can dissuade you from this.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip@pobox.com  Fri Jun 21 20:06:47 2002
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 21 Jun 2002 14:06:47 -0500
Subject: [Python-Dev] strptime recapped
In-Reply-To: <200206211849.g5LInJg26550@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.SOL.4.44.0206201622310.11785-100000@death.OCF.Berkeley.EDU>
 <15635.22026.914241.398242@beluga.mojam.com>
 <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net>
 <15635.28192.396040.21104@beluga.mojam.com>
 <200206211849.g5LInJg26550@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15635.31047.68516.959914@beluga.mojam.com>

    Guido> I guess my objection (of -0 strength) against renaming time to
    Guido> _time is that you'd have to fix a dozen or so build recipes for
    Guido> all sorts of exotic platforms.  The last time something like this
    Guido> was done (for new, a much less popular module than time) the
    Guido> initial change set broke the Windows build, and I think I saw Mac
    Guido> build changes for this issue checked in today or yesterday.

Okay, I can understand that issue.  Still, that is a mere ripple compared to
the longer term ramifications of taking a wrong turn by adding a strptime
module that might turn out to be more-or-less an orphan.  You have to
consider:

    * Is strptime even the right name for it?  I doubt it.  Only us C-heads
      would think that was a good name.

    * If you create a strptime (or timeparse or parsedate) module should it
      really have exposed functions named julianFirst, julianToGreg or
      gregToJulian?  Ignore the studly caps issue (sorry Brett, I don't
      think they fit in with normal naming practice in the Python core
      library) and just consider the functionality.

    Guido> We can avoid all that if the time module calls out to
    Guido> strptime.py.

But it seems to me that it would be an even bigger step to add a new module
to Lib, which, as it now sits, would probably only provide a single useful
function.

    >> If we put Brett's changes into time.py (I'd argue that initially all
    >> we want is strptime(), but can live with the other stuff assuming
    >> it's tested - after all, it has to go somewhere), then
    >> 
    >> from _time import *
    >> 
    >> at the bottom, the only thing to be eliminated would be his version
    >> of strptime(), and only if the platform libc didn't support it.  The
    >> Gregorian/Julian date stuff would remain.  If you don't want them
    >> exposed in time, just prefix them with underscores and don't add them
    >> to time.all.

    Guido> It's not that I don't want to expose them.  I haven't seen them, so I
    Guido> don't know how useful they are.

    Guido> However (as I have tried to point out a few times now in response
    Guido> to proposed changes to calendar.py) I plan to introduce the new
    Guido> datetime type that's currently living in
    Guido> nondist/sandbox/datetime/, either the Python version or the C
    Guido> version if we find time to finish it.  

Right.  Which is another reason I think we shouldn't just plop a strptime
module into Lib.  There is more going on with time issues than just adding
time.strptime().  Creating a Python-based time module seems less intrusive
to me at the user level than creating a new module you will wind up supporting
for a long time.

    >> If performance isn't a big issue (I doubt it will be most of the time), I
    >> can see dropping the C version of time.strptime altogether.  I still
    >> think the best way to add new stuff which is written in Python to the
    >> time module is to have time.py be the front-end module and have it
    >> import other stuff from a C-based _time module.

    Guido> I hope I can dissuade you from this.

Likewise. ;-) It's clear that there is a lot of semi-related stuff going on
related to timekeeping and time calculations.  Maybe the best course is
simply to hold off on Brett's patch for the time being and consider it in
the context of all the other stuff (your datetime object, Brett's
Gregorian/Julian functions, etc).

Skip




From guido@python.org  Fri Jun 21 20:26:55 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Jun 2002 15:26:55 -0400
Subject: [Python-Dev] strptime recapped
In-Reply-To: Your message of "Fri, 21 Jun 2002 14:06:47 CDT."
 <15635.31047.68516.959914@beluga.mojam.com>
References: <Pine.SOL.4.44.0206201622310.11785-100000@death.OCF.Berkeley.EDU> <15635.22026.914241.398242@beluga.mojam.com> <200206211658.g5LGwBr26143@pcp02138704pcs.reston01.va.comcast.net> <15635.28192.396040.21104@beluga.mojam.com> <200206211849.g5LInJg26550@pcp02138704pcs.reston01.va.comcast.net>
 <15635.31047.68516.959914@beluga.mojam.com>
Message-ID: <200206211926.g5LJQuC26703@pcp02138704pcs.reston01.va.comcast.net>

>     Guido> I guess my objection (of -0 strength) against renaming
>     Guido> time to _time is that you'd have to fix a dozen or so
>     Guido> build recipes for all sorts of exotic platforms.  The
>     Guido> last time something like this was done (for new, a much
>     Guido> less popular module than time) the initial change set
>     Guido> broke the Windows build, and I think I saw Mac build
>     Guido> changes for this issue checked in today or yesterday.

[Skip]
> Okay, I can understand that issue.  Still, that is a mere ripple
> compared to the longer term ramifications of taking a wrong turn by
> adding a strptime module that might turn out to be more-or-less an
> orphan.  You have to consider:
> 
>     * Is strptime even the right name for it?  I doubt it.  Only us
>       C-heads would think that was a good name.

It's already called strptime in the time module. :-)

>     * If you create a strptime (or timeparse or parsedate) module
>       should it really have exposed functions named julianFirst,
>       julianToGreg or gregToJulian?  Ignore the studly caps issue
>       (sorry Brett, I don't think they fit in with normal naming
>       practice in the Python core library) and just consider the
>       functionality.

I think it shouldn't, see my argument about the datetime type.

>     Guido> We can avoid all that if the time module calls out to
>     Guido> strptime.py.
> 
> But it seems to me that it would be an even bigger step to add a new
> module to Lib, which, as it now sits, would probably only provide a
> single useful function.

IMO a new module in Lib is a much smaller step than renaming an
existing built-in module.  New modules get added all the time.

>     >> If we put Brett's changes into time.py (I'd argue that
>     >> initially all we want is strptime(), but can live with the
>     >> other stuff assuming it's tested - after all, it has to go
>     >> somewhere), then
>     >> 
>     >> from _time import *
>     >> 
>     >> at the bottom, the only thing to be eliminated would be his
>     >> version of strptime(), and only if the platform libc didn't
>     >> support it.  The Gregorian/Julian date stuff would remain.
>     >> If you don't want them exposed in time, just prefix them with
>     >> underscores and don't add them to time.all.
> 
>     Guido> It's not that I don't want to expose them.  I haven't
>     Guido> seen them, so I don't know how useful they are.
> 
>     Guido> However (as I have tried to point out a few times now in
>     Guido> response to proposed changes to calendar.py) I plan to
>     Guido> introduce the new datetime type that's currently living
>     Guido> in nondist/sandbox/datetime/, either the Python version
>     Guido> or the C version if we find time to finish it.
> 
> Right.  Which is another reason I think we shouldn't just plop a
> strptime module into Lib.  There is more going on with time issues
> than just adding time.strptime().  Creating a Python-based time
> module seems less intrusive to me at the user level than creating
> a new module that you will wind up supporting for a long time.

I guess we just disagree.  The datetime type does *not* have parsing
capability, so we still need a strptime.

>     >> If performance isn't a big issue (I doubt it will be most of
>     >> the time), I can see dropping the C version of time.strptime
>     >> altogether.  I still think the best way to add new stuff
>     >> which is written in Python to the time module is to have
>     >> time.py be the front-end module and have it import other
>     >> stuff from a C-based _time module.
> 
>     Guido> I hope I can dissuade you from this.
> 
> Likewise. ;-) It's clear that there is a lot of semi-related stuff
> going on related to timekeeping and time calculations.  Maybe the
> best course is simply to hold off on Brett's patch for the time
> being and consider it in the context of all the other stuff
> (your datetime object, Brett's Gregorian/Julian functions, etc).

Yes, holding off until I have the time to work on datetime and review
Brett's patch seems wise.  Apologies to Brett.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From hbl@st-andrews.ac.uk  Fri Jun 21 20:31:27 2002
From: hbl@st-andrews.ac.uk (Hamish Lawson)
Date: Fri, 21 Jun 2002 20:31:27 +0100
Subject: [Python-Dev] Provide a Python wrapper for any new C extension
Message-ID: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk>

One of the arguments put forward against renaming the existing time module 
to _time (as part of incorporating a pure-Python strptime function) is that 
it could break some builds. Therefore I'd suggest that it could be a useful 
principle for any C extension added in the future to the standard library 
to have an accompanying pure-Python wrapper that would be the one that 
client code would usually import.
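
For concreteness, the sort of arrangement I have in mind would look roughly
like this (just a sketch; the _time name and the fallback's signature are
only illustrative, not an existing layout):

    # time.py -- thin pure-Python wrapper that client code imports
    from _time import *             # the C extension, renamed with an underscore

    try:
        strptime                    # did the platform libc provide one?
    except NameError:
        def strptime(string, format='%a %b %d %H:%M:%S %Y'):
            # a pure-Python fallback implementation would go here
            raise NotImplementedError("strptime not available on this platform")

New functionality written in Python could then simply be added to the
wrapper module.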

Hamish Lawson




From mal@lemburg.com  Fri Jun 21 20:41:00 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 21 Jun 2002 21:41:00 +0200
Subject: [Python-Dev] Provide a Python wrapper for any new C extension
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk>
Message-ID: <3D13814C.2040708@lemburg.com>

Hamish Lawson wrote:
> One of the arguments put forward against renaming the existing time 
> module to _time (as part of incorporating a pure-Python strptime 
> function) is that it could break some builds. Therefore I'd suggest that 
> it could be a useful principle for any C extension added in the future 
> to the standard library to have an accompanying pure-Python wrapper that 
> would be the one that client code would usually import.

Sounds like a plan :-)

BTW, this reminds me of the old idea of moving the standard
lib into a package, e.g. 'python'...

from python import time.

We should at least reserve such a name RSN so that we don't
run into problems later on.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From paul@prescod.net  Fri Jun 21 20:54:53 2002
From: paul@prescod.net (Paul Prescod)
Date: Fri, 21 Jun 2002 12:54:53 -0700
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy> <200206211157.50972.mclay@nist.gov> <20020621180903.GA66506@hishome.net>
Message-ID: <3D13848D.345119F7@prescod.net>

Oren Tirosh wrote:
> 
>...
> No need for double backslash. No need for a special string prefix either
> because \( currently has no meaning.

I like this idea but note that \( does have a current meaning:

>>> "\("
'\\('
>>> "\(" =="\\("
1

I think this is weird but it is inherited from C... So it would take
time to phase this in. First we have to warn about \( and then give
people time to find instances of it and change them to \\(. Then we
could introduce a new meaning for it.

 Paul Prescod



From smurf@noris.de  Fri Jun 21 21:37:26 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Fri, 21 Jun 2002 22:37:26 +0200
Subject: [Python-Dev] *Simpler* string substitutions
Message-ID: <p05111707b9393a26b1f1@[192.109.102.36]>

Guido:
>  publishers often turn 'foo' into `foo'

It gets worse. The opposite of ` isn't ' -- it's ´.
Besides, these are apostrophes and not quotes. _Real_ symmetric
quotes are " " or « » or “ " or ' ' or ’ ' or ..., but you can't use
any of these with just ASCII. Apple's MPW Shell language played with
some of these.

Anyway, I agree that real languages use ${} or $WORD and that
formatting is best done with ${NAME:format}.

Personally, the "${foo}".sub(foo="bar") syntax (using keyword
arguments) looks good and works reasonably well for i18n. A possible
simplification would be to use the local+global variables if no
arguments are given.

>    def f(x, y):
>        return e"The sum of $x and $y is $(x+y)"

How would that work with i18n?

Proposal: The compiler should translate

	_"foo is ${foo}"

to

	_("foo is ${foo}")

and

	e"foo is ${foo}"

to

	"foo is ${foo}".sub()

>That looks OK to me, especially if it can be combined with u and r to
>  create unicode and raw strings.
>
Exactly.

>  PEP 292 is an attempt to do this *without* involving the parser:
>
>    def f(x, y):
>        return "The sum of $x and $y is $(x+y)".sub()
>
>  Downsides are that it invites using non-literals as formats, with all
>  the security aspects, and that its parsing happens at run-time (no big
>  deal IMO).
>
You can't do it any other way if you want to use i18nalized strings
and formats.

Note that some sentences cannot be internationalized without
rearranging some parameters...

-- 
Matthias Urlichs



From aleax@aleax.it  Fri Jun 21 21:37:21 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 21 Jun 2002 22:37:21 +0200
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <145001c2194c$ac570c30$6601a8c0@boostconsulting.com>
References: <3D121F0D.E3B60865@prescod.net> <E17LPGw-0002Yf-00@mail.python.org> <145001c2194c$ac570c30$6601a8c0@boostconsulting.com>
Message-ID: <E17LVAa-00020u-00@mail.python.org>

On Friday 21 June 2002 07:52 pm, David Abrahams wrote:
	...
> Such a strong endorsement from you made me go take a cursory look; I think
> I'd be -1 on this in its current form. It seems like an intrusive mechanism
> in that it forces the adapter or the adaptee to know how to do the job.

That's point (e) in the Requirements of the PEP:

"""
e) When the context knows about the object and the protocol and
       knows how to adapt the object so that the required protocol is
       satisfied.  This could use an adapter registry or similar
       method.
"""

but the reference implementation doesn't go as far as that -- the only
thing the PEP has to say about how to implement "third-party, non-invasive
adaptation" is this vague mention of an adapter registry.  As I recall
(the PEP's been stagnant for over a year so my memory of it is getting
fuzzy), I had gently nudged Clark Evans at the time about committing to
SOME specific form of registry, even if a minimal one (e.g., a dictionary
of callables keyed by the pair (protocol, typeofadaptee)).
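
Even something as small as this would do, I think (a sketch of one possible
shape for such a registry, not anything the PEP actually specifies):

    # minimal global registry: adapter callables keyed by (protocol, type)
    _adapter_registry = {}

    def register_adapter(protocol, klass, adapter):
        _adapter_registry[(protocol, klass)] = adapter

    def adapt_via_registry(obj, protocol):
        adapter = _adapter_registry.get((protocol, type(obj)))
        if adapter is None:
            raise TypeError("no adapter registered for this pair")
        return adapter(obj, protocol)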

However, do notice that even in its present form it's WAY less invasive
than C++'s dynamic_cast<>, which ONLY allows the _adaptee_ to solve things --
and in a very inflexible way, too.  With dynamic_cast there's no way the 
"protocol" can noninvasively "adopt" existing objects, nor can an object
have any say about it (e.g. how to disambiguate between multiple
inheritance cases).  QueryInterface does let the adaptee have an explicit
say, but still, the adaptee is the only party consulted.

Only Haskell's typeclass, AFAIK, has (among widely used languages and
objectmodels) a smooth way to allow noninvasive 3rd party post-facto 
adaptation (and another couple of small gems too), but I guess it has an 
easier life because it's compile-time rather than runtime.  Reussner et al
(http://hera.csse.monash.edu.au/dsse/seminars/2000-12-07-reussner.html)
and of course Yellin and Strom (ACM Transactions on Programming Languages and 
Systems, 19(2):292-333, 1997) may have even better stories, but I think their
work (particularly the parts on _automatic_ discovery/adaptation) must
still count as research, not yet suitable as a foundation for a language to
be widely deployed.  So let's not count those here.

> Given libraries A and B, can I do something to allow them to interoperate
> without modifying them?

With the reference implementation proposed in PEP 246 you'd only have
a few more strings to your bow than in C++ or COM -- enough to solve
many cases, but not all.  The adapter registry, even in its simplest form,
would, I think, solve all of these cases.

> Conversely, is there a reasonably "safe" way to add adaptations to an
> existing type from the outside? I'm thinking of some analogy to
> specialization of traits in C++, here.

If a type is DESIGNED to let itself be extended in this way, no problem,
even with the reference implementation.  Remember that one of the
steps of function adapt is to call the type's __conform__ method, if
any, with the protocol as its argument.  If the PEP was adopted in its
current reference-form (_without_ a global adapter registry), there would
still be nothing stopping a clever type from letting 3rd parties extend its
adaptability by enriching its __conform__ -- most simply for example by
having __conform__, as a last step, check a type-specific registry of
adapters.  Code may be clearer...:

class DummyButExtensibleType(object):
    _adapters = {}
    def addadapter(cls, protocol, adapter): 
        cls._adapters[protocol] = adapter
    addadapter = classmethod(addadapter)
    def __conform__(self, protocol):
        adapter = self._adapters.get(protocol)
        if adapter: return adapter(self, protocol)
        raise TypeError
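
A third party might then plug in roughly like this (a purely hypothetical
usage sketch, names invented):

    class IPrintable:
        "Stand-in 'protocol' object, purely for illustration."

    def printable_adapter(obj, protocol):
        return "printable view of %r" % (obj,)

    DummyButExtensibleType.addadapter(IPrintable, printable_adapter)
    obj = DummyButExtensibleType()
    print obj.__conform__(IPrintable)   # what adapt() would end up calling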

BTW, I think the reference implementation's "# try to use the object's 
adapting mechanism" section is flawed -- it wouldn't let __conform__
return the object as being conformant to the protocol if the object
happened to be false in a boolean context.  I think TypeError must
be the normal way for a __conform__ method (or an __adapt__ one)
to indicate failure -- we can't reject conformant objects that happen
to evaluate as false, it seems to me.


Alex



From smurf@noris.de  Fri Jun 21 21:46:52 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Fri, 21 Jun 2002 22:46:52 +0200
Subject: [Python-Dev] *Simpler* string substitutions
Message-ID: <p05111708b9393facfd4e@[192.109.102.36]>

Aahz:
>  On Thu, Jun 20, 2002, Gustavo Niemeyer wrote:
>  >
>  > "Serving HTTP on", sa[0], "port", sa[1], "..."
>
_("Serving HTTP on ${addr} port ${port}").sub(addr=sa[0], port=sa[1])
_("Serving HTTP on ${sa[0]} port ${sa[1]}").sub()

or some equivalent syntax.

Note that the second way, above, has a distinct disadvantage for the 
translating person. He or she would probably know what "addr" stands 
for, but what is a "sa[0]" ???

>  This is where current string handling comes up short.  What's the
>  correct way to internationalize this string?

Currently, you do the same thing but use % and a dictionary. As I 
said, equivalent syntax.
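
I.e., roughly this (a sketch; sa stands for the server-address pair from the
quoted code, and _ is a stand-in for the gettext lookup):

    sa = ('0.0.0.0', 8000)

    def _(s):                       # stand-in for the gettext lookup
        return s

    print _("Serving HTTP on %(addr)s port %(port)s ...") % \
          {'addr': sa[0], 'port': sa[1]}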

>   What if the person
>  handling I18N isn't a Python programmer?
>
The person gets frustrated and bitches at the programmer until the 
programmer fixes the code...
-- 
Matthias Urlichs



From bac@OCF.Berkeley.EDU  Fri Jun 21 22:06:36 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Fri, 21 Jun 2002 14:06:36 -0700 (PDT)
Subject: [Python-Dev] strptime recapped
In-Reply-To: <200206211926.g5LJQuC26703@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.SOL.4.44.0206211350040.13283-100000@death.OCF.Berkeley.EDU>

After reading all the email up to this point,


[Guido van Rossum]

> [Skip]

[snip]
> >     * Is strptime even the right name for it?  I doubt it.  Only us
> >       C-heads would think that was a good name.
>
> It's already called strptime in the time module. :-)
>

I have to agree with Guido on this one.  It might only make sense to
people who come from C, but it has always been named this in Python.

If the decision is made to go with another module for this code, though,
then that is a different story.


> >     * If you create a strptime (or timeparse or parsedate) module
> >       should it really have exposed functions named julianFirst,
> >       julianToGreg or gregToJulian?  Ignore the studly caps issue
> >       (sorry Brett, I don't think they fit in with normal naming
> >       practice in the Python core library) and just consider the

I think you're right.  I wrote this code originally after my last final
ever in college as an undergraduate, so I was just more interested in
relaxing and churning out some good code than in being overly proper in
function naming. =)  I will go through and read the Python coding style PEP and
clean up my code.


[snip]
[A big discussion ensued on whether to add a new module in Lib or just the
callout to my Python code from timemodule.c; it is beyond my comment since
I am so new to this list]

> Yes, holding off until I have the time to work on datetime and review
> Brett's patch seems wise.  Apologies for Brett.

It's quite fine with me.  I want to see this done right just like everyone
else who cares about Python's development.  Personally, I am just ecstatic
that I am getting to help out in some way.  I feel more like a giddy
little kid who is helping out some grown-ups with some important project
than a recent college graduate.  =)

Enjoy your vacation, Guido.

And don't leave us, Skip!  I know I have greatly appreciated both your help
on my patch and your input into all the other threads that have been
going on of late here.

-Brett C.




From bac@OCF.Berkeley.EDU  Fri Jun 21 22:18:39 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Fri, 21 Jun 2002 14:18:39 -0700 (PDT)
Subject: [Python-Dev] Provide a Python wrapper for any new C extension
In-Reply-To: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk>
Message-ID: <Pine.SOL.4.44.0206211407280.13283-100000@death.OCF.Berkeley.EDU>

[Hamish Lawson]

> One of the arguments put forward against renaming the existing time module
> to _time (as part of incorporating a pure-Python strptime function) is that
> it could break some builds. Therefore I'd suggest that it could be a useful
> principle for any C extension added in the future to the standard library
> to have an accompanying pure-Python wrapper that would be the one that
> client code would usually import.

I am for that, but then again I am biased in this situation.  =)

But it seems reasonable.  I would think everyone who makes any major
contribution of code to Python would much rather code it up in Python than
in C.  It would probably help to get more code accepted, since I know I felt
a little daunted having to write that callout for strptime.

The only obvious objection I can see to this is a performance hit for
having to go through the Python stub to call the C extension.  But I just
did a very simple test of calling strftime('%c') 25,000 times from time
directly and through a Python stub, and it took .470 and .490 secs total
respectively according to profile.run().
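
Roughly the kind of comparison I mean (a rough sketch with made-up names,
not the exact test I ran):

    import profile, time

    def strftime_stub(fmt):
        # hypothetical pure-Python stub that just forwards to the C extension
        return time.strftime(fmt)

    def direct():
        for i in xrange(25000):
            time.strftime('%c')

    def via_stub():
        for i in xrange(25000):
            strftime_stub('%c')

    profile.run('direct()')
    profile.run('via_stub()')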

The other objection I can see is that this would promote coding everything
in Python when possible, and that might not always be the best solution.
Some things should just be coded in C, period.  But I think in such
situations the person writing the code would most likely recognize
that fact.

Or maybe I am wrong in all of this.  I don't know the exact process of how
a C extension module gets accepted, or what currently leads to an extension
module getting a stub.  I (and I am sure anyone else new to the
list) would really appreciate someone explaining it to me, since I would
like to know.

-Brett C.




From oren-py-d@hishome.net  Fri Jun 21 22:27:37 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sat, 22 Jun 2002 00:27:37 +0300
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <3D13848D.345119F7@prescod.net>; from paul@prescod.net on Fri, Jun 21, 2002 at 12:54:53PM -0700
References: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy> <200206211157.50972.mclay@nist.gov> <20020621180903.GA66506@hishome.net> <3D13848D.345119F7@prescod.net>
Message-ID: <20020622002737.A31767@hishome.net>

On Fri, Jun 21, 2002 at 12:54:53PM -0700, Paul Prescod wrote:
> > No need for double backslash. No need for a special string prefix either
> > because \( currently has no meaning.
> 
> I like this idea but note that \( does have a current meaning:
> 
> >>> "\("
> '\\('
> >>> "\(" =="\\("
> 1

"""Unlike Standard C, all unrecognized escape sequences are left in the 
string unchanged, i.e., the backslash is left in the string. (This behavior 
is useful when debugging: if an escape sequence is mistyped, the resulting 
output is more easily recognized as broken.) """

In other words, programs that rely on this behaviour are broken.

	Oren




From David Abrahams" <david.abrahams@rcn.com  Fri Jun 21 22:39:29 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 21 Jun 2002 17:39:29 -0400
Subject: [Python-Dev] *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net> <E17LPGw-0002Yf-00@mail.python.org> <145001c2194c$ac570c30$6601a8c0@boostconsulting.com> <E17LV9o-0005IP-00@mx05.mrf.mail.rcn.net>
Message-ID: <154001c2196c$c411f9f0$6601a8c0@boostconsulting.com>

From: "Alex Martelli" <aleax@aleax.it>


> On Friday 21 June 2002 07:52 pm, David Abrahams wrote:
> ...
> > Such a strong endorsement from you made me go take a cursory look; I think
> > I'd be -1 on this in its current form. It seems like an intrusive mechanism
> > in that it forces the adapter or the adaptee to know how to do the job.
>
> That's point (e) in the Requirements of the PEP:
>
> """
> e) When the context knows about the object and the protocol and
>        knows how to adapt the object so that the required protocol is
>        satisfied.  This could use an adapter registry or similar
>        method.
> """

Oh, sorry I missed that.

> However, do notice that even in its present form it's WAY less invasive
> than C++'s dynamic_cast<>, which ONLY allows the _adaptee_ to solve things --
> and in a very inflexible way, too.  With dynamic_cast there's no way the
> "protocol" can noninvasively "adopt" existing objects, nor can an object
> have any say about it (e.g. how to disambiguate between multiple
> inheritance cases).  QueryInterface does let the adaptee have an explicit
> say, but still, the adaptee is the only party consulted.

I wasn't trying to spark a comparison with C++ here, nor was I talking
about runtime-dispatched stuff in C++. I'm not even sure I would call
dynamic_cast<> a candidate for this kind of job, at least not by itself.
I was thinking of the use of template specialization to describe the
relationship of a type to a library, e.g. specialization of
std::iterator_traits<libA::some_class> by libB, which makes
libA::some_class available for use as an iterator with the standard library
(assuming it has some appropriate interface).

> Only Haskell's typeclass, AFAIK, has (among widely used languages and
> objectmodels) a smooth way to allow noninvasive 3rd party post-facto
> adaptation (and another couple of small gems too), but I guess it has an
> easier life because it's compile-time rather than runtime.

IIUC the same kind of thing can be implemented in C++ templates, if you
know where to look. There's been a lot of discussion of how to build
variant types lately.

-Dave







From greg@electricrain.com  Fri Jun 21 22:54:44 2002
From: greg@electricrain.com (Gregory P. Smith)
Date: Fri, 21 Jun 2002 14:54:44 -0700
Subject: [Python-Dev] Re: replacing bsddb with pybsddb's bsddb3 module
In-Reply-To: <15635.14235.79608.390983@beluga.mojam.com>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com> <20020611203906.V6026@phd.pp.ru> <15631.61100.561824.480935@anthem.wooz.org> <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net> <15632.62564.638418.191453@localhost.localdomain> <20020619212559.GC18944@zot.electricrain.com> <15633.1338.367283.257786@localhost.localdomain> <20020620205041.GD18944@zot.electricrain.com> <m34rfxowsn.fsf@mira.informatik.hu-berlin.de> <15635.14235.79608.390983@beluga.mojam.com>
Message-ID: <20020621215444.GB30056@zot.electricrain.com>

On Fri, Jun 21, 2002 at 09:26:35AM -0500, Skip Montanaro wrote:
> 
>     Greg> should we keep the existing bsddb around as oldbsddb for users in
>     Greg> that situation?
> 
>     Martin> I don't think so; users could always extract the module from
>     Martin> older distributions if they want to.
> 
> I would prefer the old version be moved to lib-old (or Modules-old?).  For
> people still running DB 2.x it shouldn't be a major headache to retrieve.

This sounds good.  Here's what I see on the plate to be done so far:

1) move the existing Modules/bsddbmodule.c to a new Modules-old (or
   similar) directory.
2) create a new Lib/bsddb directory containing bsddb3/bsddb3/*.py from
   the pybsddb project.
3) create a new Modules/bsddb directory containing bsddb3/src/* from
   the pybsddb project (the files should probably be renamed to
   _bsddbmodule.c and bsddbmoduleversion.h for consistent naming)
4) place the pybsddb setup.py in the Modules/bsddb directory,
   modifying it as needed.  OR  modify the top level setup.py to
   understand how to build the pybsddb module.  (there is code in
   pybsddb's setup.py to locate the berkeleydb install and determine
   appropriate flags that should be cleaned up and carried on)
5) modify the top level python setup.py to build the bsddb module
   as appropriate.
6) "everything else" including integrating documentation and
   pybsddb's large test suite.

Sound correct?

How do we want future bsddb module development to proceed?  I envision
it either taking place 100% under the python project, or taking place
as it is now in the pybsddb project with patches being fed to the python
project as desired.  Any preferences?  [I prefer not to maintain the
code in two places myself (i.e. do it all under the python project)]

Greg




From tim.one@comcast.net  Fri Jun 21 23:48:44 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 21 Jun 2002 18:48:44 -0400
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <E17LQl9-0003Bh-00@mail.python.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDLPPAA.tim.one@comcast.net>

[Guido, re PEP 246]
> Surely it would be a dramatic change, probably deeper than new-style
> classes and generators together.

[Alex Martelli]
> Rarely does one catch Guido (or most any Dutch, I believe) in such
> a wild overbid.  Heat getting to you?-)

Curiously, I don't think Guido was overstating his belief, but he's got his
Python-User's Hat on there, not his Developer-of-Python Hat.  While
new-style classes cut deeply and broadly in the language implementation,
most Python programmers can ignore them (the type/class split bit extension
module authors the hardest, and life can be much more pleasant for them
now).  Protocol adaptation taken seriously would be a fundamental change in
the Python Way of Life for users, from "just try it and see whether it
works", to "you don't *have* to guess anymore".  I think it would make a
dramatic difference in the flavor of day-to-day Python programming -- and
probably for the better, ignoring speed.




From tim.one@comcast.net  Sat Jun 22 00:01:25 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 21 Jun 2002 19:01:25 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <3D13848D.345119F7@prescod.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEDLPPAA.tim.one@comcast.net>

[Paul Prescod]
> I like this idea but note that \( does have a current meaning:
>
> >>> "\("
> '\\('
> >>> "\(" =="\\("
> 1
>
> I think this is weird but it is inherited from C...

C89 doesn't define the effect.  C99 specifically forbids this treatment, and
requires a diagnostic if \( appears.  Guido did this originally to make it
easier to write Emacsish regexps; the later raw strings were a better
solution to that problem, although 99.7% of Python newbies seem to believe
that raw strings are an idiot's attempt to make it easier to embed Windows
file path literals (newbies -- gotta love 'em <wink>).




From tim.one@comcast.net  Sat Jun 22 00:18:12 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 21 Jun 2002 19:18:12 -0400
Subject: [Python-Dev] strptime recapped
In-Reply-To: <15635.31047.68516.959914@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDNPPAA.tim.one@comcast.net>

[Skip Montanaro]
> ...
>     * Is strptime even the right name for it?  I doubt it.  Only
>       us C-heads would think that was a good name.

Given that we're stuck with strftime for date->string, strptime for
string->date is better than just about anything else ('f' for 'format', 'p'
for 'parse').

>     * If you create a strptime (or timeparse or parsedate) module
>       should it really have exposed functions named julianFirst,
>       julianToGreg or gregToJulian?

No, and definitely not at first.  Stick to the original request and this
will be sooooo much easier to resolve.  As you put it earlier,

    All PEP 42 asked for was

        Add a portable implementation of time.strptime() that works in
        clearly defined ways on all platforms.

Cool!  Let's do just that much to start, and don't take it as "a reason" to
rename the time module either (it really is trivial to add another .py file
to Lib!  give the name a leading underscore if you want to imply it's a
helper for something else).




From tim.one@comcast.net  Sat Jun 22 00:31:00 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 21 Jun 2002 19:31:00 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <3D134C00.2090205@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEDOPPAA.tim.one@comcast.net>

[Tim]
>> Since Christian's reply only increased the apparent contradiction,
>> allow me to channel: ...

[Christian Tismer]
> Huh?
> Reading from top to bottom, as I used to, I see increasing
> numbers, which are in the same order as the "increasing hate"
> (not a linear function, but the same ordering).
>
> 4 - allowing it to address local/global variables
> is what I hate the most.
> This is in no contradiction to allvars(), which is simply
> a function that puts some variables into a dict, therefore
> decoupling the interpolation from variable access.
>
> Where is the problem, please?

I was warming up my awesome channeling powers for Guido's impending
vacation, and all I can figure is that I must have left them parked in
reverse the last time he came back.  Nothing a 12-pack of Coke didn't cure,
though!  I channel that you'll graciously accept my apology <wink>.




From tim.one@comcast.net  Sat Jun 22 00:33:38 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 21 Jun 2002 19:33:38 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <002f01c2193c$925a0360$a5f8a4d8@othello>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEDOPPAA.tim.one@comcast.net>

[Raymond Hettinger]
> ...
> 'regnitteh dnomyar'[::-1]

Is there any chance of ripping this out of the language before someone uses
it for real?  If not, strings need to grow a .reversed_title_case() method
too.

it's-bad-enough-we-added-a-reversed_alternating_rot13-method-ly y'rs  - tim




From bac@OCF.Berkeley.EDU  Sat Jun 22 00:47:22 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Fri, 21 Jun 2002 16:47:22 -0700 (PDT)
Subject: [Python-Dev] strptime recapped
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEDNPPAA.tim.one@comcast.net>
Message-ID: <Pine.SOL.4.44.0206211633580.22571-100000@death.OCF.Berkeley.EDU>

[Tim Peters]

> [Skip Montanaro]
[snip]
>
> Cool!  Let's do just that much to start, and don't take it as "a reason" to
> rename the time module either (it really is trivial to add another .py file
> to Lib!  give the name a leading underscore if you want to imply it's a
> helper for something else).

Sounds good to me.  Perhaps this is the best solution for Python 2.3 (goes
beta mid-July, right?).  If we do this should we leave access to the C
version of strptime, or move all calls over to my code?  Personally, I say
leave it, since then any possible differences between their platform's
implementation of strptime and mine won't affect them.  That is not to say
I think there are any, though; I have done my best to make sure there is
no deviation.

There is also a noticeable performance difference between my implementation
and the C version.  I have tried to address that as best I could by making
locale discovery lazy and by letting the re object used for a
format string be returned, so it can be reused instead of having to be
recalculated, but there is still going to be a difference.

So basically, I am agreeing with Tim that my module should just be added
as Lib/_strptime.py and my callout should just be added to timemodule.c.
I will clean up the naming of my helper functions and add an __all__ that
only contains strptime, to keep it simple.  That will get this in for 2.3 and
lets the discussion of where time functions, data types, etc. belong in
Python continue separately.

Who would have thought little old me would spark a Timbot response.  =)

-Brett C.




From tim.one@comcast.net  Sat Jun 22 01:03:01 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 21 Jun 2002 20:03:01 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply
 __int__
In-Reply-To: <3D134D9B.7030601@stsci.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEAPPAA.tim.one@comcast.net>

[Todd Miller, wants to use rank-0 arrays as regular old indices]

Here's a sick idea:  given Python 2.2, you *could* make the type of a rank-0
array a subclass of Python's int type, making sure (if needed) to copy the
value into "the int part" at the start of the struct.  Then a rank-0 array
would act like an integer in almost all contexts requiring a Python int,
including use as a sequence index.

The relevant code in the core is

		if (PyInt_Check(key))
			return PySequence_GetItem(o, PyInt_AsLong(key));

in PyObject_GetItem().  PyInt_Check() says "yup!" for an instance of any
subclass of int, and PyInt_AsLong() extracts "the int part" out of any
instance of any subclass of int.

In return, it shifts the burden onto convincing the rest of numarray that
the thing is still an array too <0.4 wink>.
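
In pure Python the idea looks roughly like this (a toy sketch, not actual
numarray code):

    class Rank0(int):
        "Toy stand-in for a rank-0 array that is also a real int."
        def __new__(cls, value):
            return int.__new__(cls, value)

    seq = ['a', 'b', 'c']
    i = Rank0(1)
    print seq[i]        # prints 'b': PyInt_Check() accepts int subclasses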




From sholden@holdenweb.com  Fri Jun 21 17:52:16 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Fri, 21 Jun 2002 12:52:16 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206182330444.SM01040@mail.python.org>
Message-ID: <002d01c2199b$d7e50150$6300000a@holdenweb.com>

"Barry A. Warsaw" <barry@zope.com> wrote ...
>
> I'm so behind on my email, that the anticipated flamefest will surely
> die down before I get around to reading it.  Yet still, here is a new
> PEP. :)
>
[flamefest-fodder]

Seems to me that the volume of comment might imply that string formatting
isn't the one obvious best way to do it. Now that the flamefest has died down
somewhat, the only thing I can see PEP 292 justifying is better
documentation for the string-formatting operator. "But then I'm an
$ptype".sub({"ptype": "old fogey"})

regards
 Steve
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------




From aleax@aleax.it  Sat Jun 22 08:58:14 2002
From: aleax@aleax.it (Alex Martelli)
Date: Sat, 22 Jun 2002 09:58:14 +0200
Subject: [Python-Dev] *Simpler* string substitutions
In-Reply-To: <154001c2196c$c411f9f0$6601a8c0@boostconsulting.com>
References: <3D121F0D.E3B60865@prescod.net> <E17LV9o-0005IP-00@mx05.mrf.mail.rcn.net> <154001c2196c$c411f9f0$6601a8c0@boostconsulting.com>
Message-ID: <E17Lfmn-0007C6-00@mail.python.org>

On Friday 21 June 2002 11:39 pm, David Abrahams wrote:
	...
> > That's point (e) in the Requirements of the PEP:
> >
> > """
> > e) When the context knows about the object and the protocol and
> >        knows how to adapt the object so that the required protocol is
> >        satisfied.  This could use an adapter registry or similar
> >        method.
> > """
>
> Oh, sorry I missed that.

Easy to miss because the PEP (I think) makes no further reference to
[e], not even to say it's not gonna address it directly.  I think the PEP
could be enhanced about this (as about the reference implementation's
buglet which I already remarked upon).

> I was thinking of the use of template specialization to describe the
> relationship of a type to a library, e.g. specialization of
> std::iterator_traits<libA::some_class> by libB, which makes
> libA::some_class available for use as an iterator with the standard library
> (assuming it has some appropriate interface).

That requires proper design for extensibility in advance -- the standard
library must have had the forethought to define, and use everywhere
appropriate, std::iterator_traits, AND libA must expose classes that
can be plugged into that "slot".

As I tried indicating, if you're willing to require design-in-advance for
such issues, PEP 246 (together with Python's general mechanisms)
already offer what you need.


Allow me to offer an analogy: a Ruby programmer complains to a
Python or C++ programmer "your language ain't flexible enough!  I
have a library X that supplies type X1 and a library Y that consumes
any type Y1 which exposes a method Y2 and I want to just add a
suitable Y2 to the existing X1 but Python/C++ doesn't let me modify
the existing type/class X1".
The Python or C++ programmer replies: "well INHERIT from X1
and add method Y2, that's easy".
The Ruby programmer retorts: "No use, library X does in umpteen
places a 'new X1();' [in C++ terms] so my subclassing won't be
picked up"
The Python or C++ programmer triumphantly concludes: "Ah that's
a design error in X, X should instead use a factory makeX1() and
let you override THAT to make your Y2-enriched X1 instead".

Yeah right.  That's like the airplane manufacturers explaining away
most crashes as "pilot error".  Perrow's "Normal Accidents" (GREAT
book btw) is highly recommended reading, particularly to anybody
who still falls for that line.  *Humans are fallible* and most often in
quite predictable ways: a system that stresses humans just the wrong
way is gonna produce "pilot error" over and over AND over again.
Wishing for a better Human Being Release 2.0 is just silly.  Ain't
gonna come and we couldn't afford the upgrade fee if it did:-).

Yes, factories and such creational patterns ARE a better long-term
answer, BUT there's no denying that Ruby's ability to patch things
up with duct tape (while having its own costs, of course!-) can
be a short-term lifesaver.  "If God had WANTED us to get things
right the first time he wouldn't have created duct tape", after all:-).

End of analogy...


The way I read [e] is more demanding -- allowing some degree of
"impedance matching" WITHOUT requiring special forethought by
the designers of either library, beyond using adapt rather than
typetesting -- just some ingenuity on the third party's part.


> > Only Haskell's typeclass, AFAIK, has (among widely used languages and
> > objectmodels) a smooth way to allow noninvasive 3rd party post-facto
> > adaptation (and another couple of small gems too), but I guess it has an
> > easier life because it's compile-time rather than runtime.
>
> IIUC the same kind of thing can be implemented in C++ templates, if you
> know where to look. There's been a lot of discussion of how to build
> variant types lately.

I don't think you can do it without some degree of design forethought,
but admittedly I'm starting to get very slightly rusty (haven't designed a
dazzling new C++ template in almost six months, after all:-).


Alex



From oren-py-d@hishome.net  Sat Jun 22 11:44:59 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sat, 22 Jun 2002 13:44:59 +0300
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
In-Reply-To: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>; from ping@zesty.ca on Thu, Jun 20, 2002 at 03:48:52PM -0700
References: <20020620071856.GA10497@hishome.net> <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>
Message-ID: <20020622134459.A6918@hishome.net>

On Thu, Jun 20, 2002 at 03:48:52PM -0700, Ka-Ping Yee wrote:
> Using compile-time parsing, as in PEP 215, has the advantage that it
> avoids any possible security problems; but it also eliminates the
> possibility of using this for internationalization.  

Compile-time parsing may eliminate the possibility of using the same 
mechanism for internationalization, but not the possibility of using the
same syntax. A module may provide a function that interprets the same 
notation at runtime.  The runtime version probably shouldn't support full 
expression embedding - just simple name substitution.

> I see this as the key tension in the string interpolation issue (aside 
> from all the syntax stuff -- which is naturally controversial).

And the security vs. ease-of-use issue.

	Oren




From mwh@python.net  Sat Jun 22 12:10:42 2002
From: mwh@python.net (Michael Hudson)
Date: 22 Jun 2002 12:10:42 +0100
Subject: [Python-Dev] strptime recapped
In-Reply-To: Brett Cannon's message of "Fri, 21 Jun 2002 14:06:36 -0700 (PDT)"
References: <Pine.SOL.4.44.0206211350040.13283-100000@death.OCF.Berkeley.EDU>
Message-ID: <2madpna7rh.fsf@starship.python.net>

Brett Cannon <bac@OCF.Berkeley.EDU> writes:

> It's quite fine with me.  I want to see this done right just like
> everyone else who cares about Python's development.  Personally, I
> am just ecstatic that I am getting to help out in some way.  I feel
> more like a giddy little kid who is helping out some grown-ups with
> some important project than a recent college graduate.  =)

Feel like being release manager for 2.2.2? <wink>

Cheers,
M.

-- 
  at any rate, I'm satisfied that not only do they know which end of
  the pointy thing to hold, but where to poke it for maximum effect.
                                  -- Eric The Read, asr, on google.com



From David Abrahams" <david.abrahams@rcn.com  Sat Jun 22 12:35:03 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Sat, 22 Jun 2002 07:35:03 -0400
Subject: [Python-Dev] *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net> <E17LV9o-0005IP-00@mx05.mrf.mail.rcn.net> <154001c2196c$c411f9f0$6601a8c0@boostconsulting.com> <E17Lfmi-0004Dq-00@mx05.mrf.mail.rcn.net>
Message-ID: <164501c219e0$dacf69b0$6601a8c0@boostconsulting.com>

----- Original Message -----
From: "Alex Martelli" <aleax@aleax.it>

> > I was thinking of the use of template specialization to describe the
> > relationship of a type to a library, e.g. specialization of
> > std::iterator_traits<libA::some_class> by libB, which makes
> > libA::some_class available for use as an iterator with the standard library
> > (assuming it has some appropriate interface).
>
> That requires proper design for extensibility in advance -- the standard
> library must have had the forethought to define, and use everywhere
> appropriate, std::iterator_traits, AND libA must expose classes that
> can be plugged into that "slot".

Very true.

> As I tried indicating, if you're willing to require design-in-advance for
> such issues, PEP 246 (together with Python's general mechanisms)
> already offer what you need.

Super! +1

> Yes, factories and such creational patterns ARE a better long-term
> answer, BUT there's no denying that Ruby's ability to patch things
> up with duct tape (while having its own costs, of course!-) can
> be a short-term lifesaver.  "If God had WANTED us to get things
> right the first time he wouldn't have created duct tape", after all:-).

In Alaska, where my wife grew up, they call it "100-mile-an-hour tape" --
good for any use up to 100 mph.
[apparently not for ducts, though, even if they're sitting still :(]


> > > Only Haskell's typeclass, AFAIK, has (among widely used languages and
> > > objectmodels) a smooth way to allow noninvasive 3rd party post-facto
> > > adaptation (and another couple of small gems too), but I guess it has an
> > > easier life because it's compile-time rather than runtime.
> >
> > IIUC the same kind of thing can be implemented in C++ templates, if you
> > know where to look. There's been a lot of discussion of how to build
> > variant types lately.
>
> I don't think you can do it without some degree of design forethought,
> but admittedly I'm starting to get very slightly rusty (haven't designed a
> dazzling new C++ template in almost six months, after all:-).

Well, I have to admit that I don't have the time to say anything backed up
by any research at this point ...I'm currently stuck in a Microsoft gravity
well trying to survive the descent... but thanks as always for your
educational and broad perspective!

-Dave




From pinard@iro.umontreal.ca  Sat Jun 22 14:52:07 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 22 Jun 2002 09:52:07 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <20020622134459.A6918@hishome.net>
References: <20020620071856.GA10497@hishome.net>
 <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>
 <20020622134459.A6918@hishome.net>
Message-ID: <oqlm97v2t4.fsf@titan.progiciels-bpi.ca>

[Oren Tirosh]

> On Thu, Jun 20, 2002 at 03:48:52PM -0700, Ka-Ping Yee wrote:
> > Using compile-time parsing, as in PEP 215, has the advantage that it
> > avoids any possible security problems; but it also eliminates the
> > possibility of using this for internationalization.  

> Compile-time parsing may eliminate the possibility of using the same 
> mechanism for internationalization, but not the possibility of using the
> same syntax.

Parsing must be done at some time.  Maybe the solution lies in finding some
way for Python to lazily delay the "compilation" of the string until after its
translation (at run-time), when it is known beforehand that a given string
is internationalised.  The `.pyc' would contain byte-code and a data slot for
driving the laziness.  The translation and compilation should occur only
once for a particular string, of course, as the internationalised string
may appear within a loop, or within a function which gets called often.
In threaded contexts, if we allow for spurious re-compilations once in a
long while, and with a simple bit of care, locks could be fully avoided.[1]

The good in the above approach is that people would write Python about the
same way regardless of whether internationalisation is in the picture or
not, and would not have to suffer the complexities of "hand" optimisation
of string interpolation in an internationalised context.  It would be simple
for _everybody_, on the road meant to make internationalisation a breeze.

For Python to know at initial compile time whether a string is going to be
internationalised or not, it has to be modified, but a positive side of this
effort is that internationalisation becomes part of the language design.
A possible way towards this (suggested a long while ago) could be to use,
beside `eru', some `t' prefix letter asking for translation.

Two problems are still to be solved, however.  First, going from `_("TEXT")'
to `t"TEXT"', the translation function (`_' here) and textual domain should
have proper defaults, while offering a way to override them for bigger
applications needing finer control or tuning.  A simple solution might lie,
here, in inventing some special module attribute for that purpose.

Second, some applications accept switching national language at run-time.
So a mechanism is needed to invalidate lazily-compiled strings when such a
switch occurs.  An avenue would be to use the national language string code
as the "done" flag in the lazy compilation process, allowing recompilation
to occur on the fly, as needed.
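
A rough runtime sketch of the caching part of this idea (purely illustrative;
a plain dictionary stands in for the real translation machinery):

    _catalog = {('fr', 'Hello ${name}'): 'Bonjour ${name}'}
    _cache = {}
    _language = ['fr']              # current national language code

    def lazy_translate(text):
        # the language code doubles as the "done" flag: switching
        # languages simply misses the cache and re-translates
        key = (_language[0], text)
        if key not in _cache:
            _cache[key] = _catalog.get(key, text)
        return _cache[key]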

--------------------
[1] Temporarily switching locale-related environment variables in threaded
contexts may yield pretty surprising results, this is well-known already.
It only stresses, in my opinion, that the design has been frozen without
having all the vision it would have taken.  Many internationalisation devices
implement half-hearted solutions for half-thought problems.  I'm not at all
asserting that it is possible to foresee everything in advance.  Yet, we
could be more productive by _not_ slavishly sticking to actual "standards".

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From aahz@pythoncraft.com  Sat Jun 22 15:00:16 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sat, 22 Jun 2002 10:00:16 -0400
Subject: [Python-Dev] strptime recapped
In-Reply-To: <2madpna7rh.fsf@starship.python.net>
References: <Pine.SOL.4.44.0206211350040.13283-100000@death.OCF.Berkeley.EDU> <2madpna7rh.fsf@starship.python.net>
Message-ID: <20020622140016.GA362@panix.com>

On Sat, Jun 22, 2002, Michael Hudson wrote:
> Brett Cannon <bac@OCF.Berkeley.EDU> writes:
>> 
>> It's quite fine with me.  I want to see this done right just like
>> everyone else who cares about Python's development.  Personally, I
>> am just ecstatic that I am getting to help out in some way.  I feel
>> more like a giddy little kid who is helping out some grown-ups with
>> some important project than a recent college graduate.  =)
> 
> Feel like being release manager for 2.2.2? <wink>

Tsk, tsk, let's not burn out Brett before we get some useful code from
him.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From bac@OCF.Berkeley.EDU  Sat Jun 22 20:13:12 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Sat, 22 Jun 2002 12:13:12 -0700 (PDT)
Subject: [Python-Dev] strptime recapped
In-Reply-To: <20020622140016.GA362@panix.com>
Message-ID: <Pine.SOL.4.44.0206221209230.22637-100000@death.OCF.Berkeley.EDU>

[Aahz]

> On Sat, Jun 22, 2002, Michael Hudson wrote:
> > Brett Cannon <bac@OCF.Berkeley.EDU> writes:
> >>
> >> It's quite fine with me.  I want to see this done right just like
> >> everyone else who cares about Python's development.  Personally, I
> >> am just ecstatic that I am getting to help out in some way.  I feel
> >> more like a giddy little kid who is helping out some grown-ups with
> >> some important project than a recent college graduate.  =)
> >
> > Feel like being release manager for 2.2.2? <wink>
>
> Tsk, tsk, let's not burn out Brett before we get some useful code from
> him.

Thanks for watching out for me, Aahz.  =)

Actually, I think it might be a cool thing to do.  I do have the time
(taking a year off before I apply to grad school and thus I am
unemployed).  Trouble is that beyond some light reading of timemodule.c, I
have no experience with Python's C code, let alone writing my own
extensions.  Maybe next time.  =)

Wouldn't mind learning, though.  Who knows, maybe I will get sucked into
all of this enough to do my master's or PhD thesis on something
Python-related (assuming I get into grad school).  =)

-Brett C.




From barry@barrys-emacs.org  Sat Jun 22 20:39:19 2002
From: barry@barrys-emacs.org (Barry Scott)
Date: Sat, 22 Jun 2002 20:39:19 +0100
Subject: [Python-Dev] Behavior of matching backreferences
In-Reply-To: <20020621020725.A9565@ibook.distro.conectiva>
Message-ID: <001001c21a24$80a8dbd0$070210ac@LAPDANCE>

I think the re module worked correctly.

If you write your expression without the ambiguity:

yours: "^(?P<a>a)?(?P=a)$"
re-1a: "^((?P<a>a)(?P=a))?$"
re-2a: "^(?P<a>a?)(?P=a)$"

your test data 'ebc' will not match; the expression matches either 'aa'
or ''. Try removing the $ so that it will match '' at the start of the string.

re-1b: "^((?P<a>a)(?P=a))?"
re-2b: "^(?P<a>a?)(?P=a)"

I think the re-2b form is the way to deal with the optional quotes.
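
For example (a quick sketch applying the re-2b idea to the quote pattern from
the original message, and assuming the current engine, where a backreference
to a group that didn't participate fails):

    import re

    original = re.compile(r"^(?P<a>')?((?:\\'|[^'])*)(?P=a)$")  # Gustavo's form
    re_2b    = re.compile(r"^(?P<a>'?)((?:\\'|[^'])*)(?P=a)$")  # group always matches

    print re_2b.match("'abc'").group(2)   # abc
    print re_2b.match("abc").group(2)     # abc
    print original.match("abc")           # None: the backreference fails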

I'm not sure a patch is needed for this.

		BArry



-----Original Message-----
From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On
Behalf Of Gustavo Niemeyer
Sent: 21 June 2002 06:07
To: python-dev@python.org
Subject: [Python-Dev] Behavior of matching backreferences


Hi everyone!

I was studying the sre module, when I came up with the following
regular expression:

re.compile("^(?P<a>a)?(?P=a)$").match("ebc").groups()

The (?P=a) matches whatever was matched by the "a" group. If
"a" is optional and doesn't match, it seems to make sense that
(?P=a) becomes optional as well, instead of failing. Otherwise the
regular expression above will always fail if the first group
fails, even though it is optional.

One could argue that to make it a valid regular expression, it should
become "^(?P<a>a)?(?P=a)?". But that's a different regular expression,
since it would match "a", while the regular expression above would
match "aa" or "", but not "a".

This kind of pattern is useful, for example, to match a string which
could be optionally surrounded by quotes, like shell variables. Here's
an example of such a pattern: r"^(?P<a>')?((?:\\'|[^'])*)(?P=a)$".
This pattern matches "'a'", "\'a", "a\'a", "'a\'a'" and all such
variants, but not "'a", "a'", or "a'a".

I've submitted a patch to make this work to http://python.org/sf/571976

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev





From niemeyer@conectiva.com  Sat Jun 22 21:10:36 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sat, 22 Jun 2002 17:10:36 -0300
Subject: [Python-Dev] Behavior of matching backreferences
In-Reply-To: <001001c21a24$80a8dbd0$070210ac@LAPDANCE>
References: <20020621020725.A9565@ibook.distro.conectiva> <001001c21a24$80a8dbd0$070210ac@LAPDANCE>
Message-ID: <20020622171035.A6004@ibook>

> I think the re module worked correctly.
> 
> If you write your expression without the ambiguity:

I must confess I see no ambiguity in my expression.

> yours: "^(?P<a>a)?(?P=a)$"
> re-1a: "^((?P<a>a)(?P=a))?$"

Using "aa" was just an example, of course. If I wanted to match "aa" or
"", I wouldn't use this at all.

> re-2a: "^(?P<a>a?)(?P=a)$"
> 
> your test data ebc will does not match either 'aa' or ''. Try removing
> the $ so that it will match '' at the start of the string.

Sorry, I took the wrong test to paste into the message.

> re-1b: "^((?P<a>a)(?P=a))?"
> re-2b: "^(?P<a>a?)(?P=a)"
> 
> I think the re-2b form is the way to deal with the optional quotes.
> 
> I'm not sure a patch is needed for this.

If you think about a match with more characters, you'll end up with
something like "^(?P<a>(abc)?)(?P=a)" instead of "^(?P<a>abc)?(?P=a)".
Besides the little difference in their meanings (in the first,
m.group(1) is '', and in the second it is None), it looks like you're
working around an existing problem, but you may argue that this opinion
is something personal.

Thus, my main point here is that using the second regular expression will
never work as expected, and there is no point in not fixing it, if that's
possible and has already been done.

If you find an example where it *should* fail, working as it is now, I
promise I'll shut up, and withdraw myself. :-)

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From tismer@tismer.com  Sat Jun 22 23:57:52 2002
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 23 Jun 2002 00:57:52 +0200
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCIEDOPPAA.tim.one@comcast.net>
Message-ID: <3D1500F0.708@tismer.com>

Tim Peters wrote:
> [Tim]
> 
>>>Since Christian's reply only increased the apparent contradiction,
>>>allow me to channel: ...
>>
> 
> [Christian Tismer]
> 
>>Huh?
>>Reading from top to bottom, as I used to, I see increasing
>>numbers, which are in the same order as the "increasing hate"
>>(not a linear function, but the same ordering).
>>
>>4 - allowing it to address local/global variables
>>is what I hate the most.
>>This is in no contradiction to allvars(), which is simply
>>a function that puts some variables into a dict, therefore
>>decoupling the interpolation from variable access.
>>
>>Where is the problem, please?
> 
> 
> I was warming up my awesome channeling powers for Guido's impending
> vacation, and all I can figure is that I must have left them parked in
> reverse the last time he came back.  Nothing a 12-pack of Coke didn't cure,
> though!  I channel that you'll graciously accept my apology <wink>.

Whow! A TPA. Will stick it next to my screen :-)

Well, the slightly twisted content of that message obscured
its correct logic, maybe.

Meanwhile, I'd like to drop that hate stuff and replace
it with a little reasoning:

Let's name locals/globals/whatever as "program variables".

If there are program variables directly accessible inside
strings to be interpolated, then I see possible abuse,
if abusers manage to supply such a string in an unforeseen way.
For that reason, I wanted to enforce that an explicit
dictionary has to be passed as an argument, to remind the
programmer that she is responsible for providing access.

But at that time, I wasn't considering compile time string
parsing. Compile time means the strings containing variable
names are evaluated only once, and they behave like constants and
cannot be passed in by a later intruder. That sounds pretty
cool, although I don't see how this fits with I18n, which needs
to change strings at runtime?
Maybe it is possible to parse variable names out, replace
them with some placeholders, and to do the internationalization
after that, still not giving variable access to the final
product.

Example (now also allowing functions):

name1 = "Felix"
age1 = 8
name2 = "Hannes"
age2 = 17

"My little son $name1 is $age1. $name2 is $(age2-age1) years older.".sub()

--> "My little son Felix is 8. Hannes is 9 years older."

This string might be translated under the hood into:
_ipol = {
   'x1': name1, 'x2': age1,
   'x3': name2, 'x4': (age2-age1)
}

"My little son $x1 is $x2. $x3 is $x4 years older.".sub(_ipol)

This string is now safe for further processing.

Maybe the two forms should be syntactically different,
but what I mean is a compile time transformation, that
removes all real variables names in the first place.

interpolation-is-by-value-not-by-name - ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From barry@barrys-emacs.org  Sun Jun 23 00:40:25 2002
From: barry@barrys-emacs.org (Barry Scott)
Date: Sun, 23 Jun 2002 00:40:25 +0100
Subject: [Python-Dev] Behavior of matching backreferences
In-Reply-To: <20020622171035.A6004@ibook>
Message-ID: <000501c21a46$2ee38210$070210ac@LAPDANCE>

I think your re has a bug in it; the Python equivalent would be

    if cond:
        a = 1
    print a

Python will give an error if cond is false.

An re that defines a group conditionally, as yours does, is I think
the same programming error. That's the ambiguity I am
referring to: is the named group defined or not?

> If you think about a match with more characters, you'll end up in
> something like "^(?P<a>(abc)?)(?P=a)", instead of "^(?P<a>abc)?(?P=a)".
> Besides having a little difference in their meanings (the first
> m.group(1) is '', and the second is None), it looks like you're
> workarounding an existant problem, but you may argue that this opinion
> is something personal.

You can prevent groups from being remembered by using the (?:...) syntax
if you need to preserve the group index. So you need:

    "^(?P<a>(?:abc)?)(?P=a)"

I'm not convinced you have found a bug in the engine that needs
fixing; I think it's your re that needs changing. I want the re engine
to report an error for res that are illogical.
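
For what it's worth, a quick interactive check of the two spellings,
showing how the current sre engine treats them (which is exactly the
behaviour under discussion):

    import re

    always_defined = re.compile(r"^(?P<a>(?:abc)?)(?P=a)")   # group always participates
    maybe_undefined = re.compile(r"^(?P<a>abc)?(?P=a)")      # group may not participate

    # Group <a> matches '' when 'abc' is absent, so the backreference can too:
    print always_defined.match("") is not None         # True
    print always_defined.match("abcabc") is not None   # True

    # A backreference to a group that did not participate in the match fails:
    print maybe_undefined.match("") is not None        # False
    print maybe_undefined.match("abcabc") is not None  # True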

	Barry





From barry@zope.com  Sun Jun 23 01:37:23 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 22 Jun 2002 20:37:23 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <200206200332.g5K3Wbj06062@pcp02138704pcs.reston01.va.comcast.net>
 <NBBBIOJPGKJEKIECEMCBIEEDNFAA.pobrien@orbtech.com>
Message-ID: <15637.6211.441850.925511@anthem.wooz.org>

>>>>> "PKO" == Patrick K O'Brien <pobrien@orbtech.com> writes:

    PKO> I guess what I was really wondering is whether that advantage
    PKO> clearly outweighs some of the possible disadvantages. I'm not a
    PKO> fan of curly braces and I'll be sad to see more of them in
    PKO> Python. There's something refreshing about only having curly
    PKO> braces for dictionaries and parens everywhere else.  And
    PKO> since the existing string substitution uses parens, why
    PKO> shouldn't the new one?

Personally, I wouldn't mind it if this syntax took a cue from the make
program and accepted both $(name) and ${name} as alternatives to $name
(with nested parenthesis/brace matching).

-Barry



From barry@zope.com  Sun Jun 23 01:56:57 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 22 Jun 2002 20:56:57 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
 <3D11B6F0.5000803@tismer.com>
 <200206201746.g5KHkwH04175@odiug.zope.com>
 <3D121EDB.6070501@tismer.com>
Message-ID: <15637.7385.966341.14847@anthem.wooz.org>

>>>>> "CT" == Christian Tismer <tismer@tismer.com> writes:

    CT> By no means.  allvars() is something like locals() or
    CT> globals(), just an explicit way to produce a dictionary of
    CT> variables.

I'd be ok with something like allvars() and requiring a dictionary to
the .sub() method, /if/ allvars() were a method on a frame object.  I
really, really do want to write in my i18n programs:

    def whereBorn(name):
        country = countryOfOrigin(name)
        return _('$name was born in $country')

I'd be fine if the definition of _() could reach into the frame of
whereBorn() and give me a list of all variables, including ones in
nested scopes.  Actually, that'd be a lot better than what I do now
(although truth be told, losing access to nested scoped variables is
only a hypothetical limitation in the code I've written).

The feature would be useless to me if I had to pass some explicit
dictionary into the _() method.  It makes writing i18n code extremely
tedious.  Invariably, the unsafeness of an implicit dictionary shows up
when strings come from untrusted sources, and your .py file can hardly be
considered an untrusted source.  In those cases, creating an explicit
dictionary for interpolation is fine, but those cases also tend not to
overlap with i18n much.
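
For the record, here is a rough sketch of the kind of _() I mean, using
sys._getframe() to peek at the caller.  The helper itself, the "Canada"
stand-in for countryOfOrigin(), and the $name handling are all invented
for illustration; a real _() would of course run the message catalog
lookup on the string first:

    import re
    import sys

    def _(msg):
        # Build the interpolation namespace from the caller's globals and
        # locals.  Variables from enclosing (nested) scopes are exactly the
        # part this trick does *not* see.
        frame = sys._getframe(1)
        namespace = frame.f_globals.copy()
        namespace.update(frame.f_locals)
        # ... message catalog lookup would happen here ...
        return re.sub(r"\$(\w+)", lambda m: str(namespace[m.group(1)]), msg)

    def whereBorn(name):
        country = "Canada"    # stand-in for countryOfOrigin(name)
        return _('$name was born in $country')

    print whereBorn("Anna")   # -> 'Anna was born in Canada'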

-Barry



From barry@zope.com  Sun Jun 23 02:02:14 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 22 Jun 2002 21:02:14 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <200206190329.g5J3TKBa006071@mercure.iro.umontreal.ca>
 <oq3cviitby.fsf@titan.progiciels-bpi.ca>
 <15633.19790.152438.926329@anthem.wooz.org>
 <oqlm9am3cv.fsf@titan.progiciels-bpi.ca>
Message-ID: <15637.7702.841779.383698@anthem.wooz.org>

>>>>> "FP" =3D=3D Fran=E7ois Pinard <pinard@iro.umontreal.ca> writes:

    FP> Saying that PEP 292 rejects an idea because this idea would
    FP> require another PEP to be debated and accepted beforehand, and
    FP> then rushing the acceptance of PEP 292 as it stands, is
    FP> probably missing the point of the discussion.

I don't think there's /any/ danger of rushing acceptance of PEP 292.
It may not even be accepted at all.

still-slogging-through-50-some-odd-messages-ly y'rs,
-Barry



From barry@zope.com  Sun Jun 23 02:12:22 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 22 Jun 2002 21:12:22 -0400
Subject: [Python-Dev] *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net>
 <001901c218a0$6158d1c0$070210ac@LAPDANCE>
Message-ID: <15637.8310.961687.468635@anthem.wooz.org>

>>>>> "BS" == Barry Scott <barry.alan.scott@ntlworld.com> writes:

    BS> If I'm going to move from %(name)fmt to ${name} I need a place
    BS> for the fmt format.

One of the reasons I added "simpler" to the PEP is that I
didn't want to support formatting characters in the specification.
While admittedly handy for some applications, I submit that most
string interpolation simply uses %s or %(name)s, and there should be a
simpler, less error-prone way of writing that.

-Barry



From paul@prescod.net  Sun Jun 23 02:12:57 2002
From: paul@prescod.net (Paul Prescod)
Date: Sat, 22 Jun 2002 18:12:57 -0700
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <20020620071856.GA10497@hishome.net> <Pine.LNX.4.44.0206201537410.1419-100000@ziggy> <20020622134459.A6918@hishome.net>
Message-ID: <3D152099.1C1E73FA@prescod.net>

Oren Tirosh wrote:
> 
>...
> 
> Compile-time parsing may eliminate the possibility of using the same
> mechanism for internationalization, but not the possibility of using the
> same syntax. A module may provide a function that interprets the same
> notation at runtime.  The runtime version probably shouldn't support full
> expression embedding - just simple name substitution.

I think that there are enough benefits for each form (compile time with
expressions, runtime without) that we should expect any final solution
to support both. Maybe you guys should merge your PEPs!

 Paul Prescod



From jmiller@stsci.edu  Sun Jun 23 02:22:52 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Sat, 22 Jun 2002 21:22:52 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply
 __int__
References: <LNBBLJKPBEHFEDALKOLCGEEAPPAA.tim.one@comcast.net>
Message-ID: <3D1522EC.1070001@stsci.edu>

Tim Peters wrote:

>[Todd Miller, wants to use rank-0 arrays as regular old indices]
>
>Here's a sick idea:  given Python 2.2, you *could* make the type of a rank-0
>
Right now, numarray is a subclass of object for Python 2.2, in order to
get properties so that it can emulate some of Numeric's attributes. I'm
wondering what I'd lose from object in order to pick up int's indexing.
I'm also wondering how to make a rank-0 Float array fail as an index.
I might try it just to see where it breaks...  Thanks!

>
>array a subclass of Python's int type, making sure (if needed) to copy the
>value into "the int part" at the start of the struct.  Then a rank-0 array
>would act like an integer in almost all contexts requiring a Python int,
>including use as a sequence index.
>
>The relevant code in the core is
>
>		if (PyInt_Check(key))
>
>
>			return PySequence_GetItem(o, PyInt_AsLong(key));
>
>in PyObject_GetItem().  PyInt_Check() says "yup!" for an instance of any
>subclass of int, and PyInt_AsLong() extracts "the int part" out of any
>instance of any subclass of int.
>
>In return, it shifts the burden onto convincing the rest of numarray that
>the thing is still an array too <0.4 wink>.
>






From barry@zope.com  Sun Jun 23 02:20:54 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 22 Jun 2002 21:20:54 -0400
Subject: [Python-Dev] *Simpler* string substitutions
References: <3D121F0D.E3B60865@prescod.net>
 <200206202121.g5KLLPT05634@odiug.zope.com>
Message-ID: <15637.8822.201643.67822@anthem.wooz.org>

    GvR> Oren made a good point that Paul emphasized: the most common
    GvR> use case needs interpolation from the current namespace in a
    GvR> string literal, and expressions would be handy.  Oren also
    GvR> made the point that the necessary parsing could (should?) be
    GvR> done at compile time.

I'll point out that in my experience, while expressions are (very)
occasionally handy, you wouldn't necessarily need /arbitrary/
expressions.  Something as simple as allowing dotted names only would
solve probably 90% of uses, e.g.

    person = getPerson()
    print '${person.name} was born in ${person.country}'

Not that this can't execute arbitrary code, of course, so the security
implications of that would still need to be examined.
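
A hedged sketch of what a dotted-names-only substitution could look like;
the helper name, the ${...} handling, and the Person class are all made up
here, and note the comment about why this is still not automatically safe:

    import re

    def sub_dotted(template, namespace):
        # Resolve ${a.b.c}: look up 'a' in the given namespace, then follow
        # plain attribute access.  No arbitrary expressions -- but beware,
        # attribute access can still run arbitrary code via properties or
        # __getattr__, which is the caveat mentioned above.
        def repl(match):
            parts = match.group(1).split(".")
            obj = namespace[parts[0]]
            for attr in parts[1:]:
                obj = getattr(obj, attr)
            return str(obj)
        return re.sub(r"\$\{([\w.]+)\}", repl, template)

    class Person:
        name = "Guido"
        country = "the Netherlands"

    print sub_dotted('${person.name} was born in ${person.country}',
                     {'person': Person()})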

-Barry



From barry@zope.com  Sun Jun 23 02:36:31 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 22 Jun 2002 21:36:31 -0400
Subject: [Python-Dev] Re: *Simpler* string substitutions
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B3B2@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <15637.9759.111784.481102@anthem.wooz.org>

>>>>> "PM" == Paul Moore <Paul.Moore@atosorigin.com> writes:

    PM> 4. Access to variables is also problematic. Without
    PM> compile-time support, access to nested scopes is impossible
    PM> (AIUI).

Is this really true?  I think it was two IPC's ago that Jeremy and I
discussed the possibility of adding a method to frame objects that
would basically yield you the equivalent of globals+freevars+locals.

-Barry



From barry@zope.com  Sun Jun 23 02:40:22 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 22 Jun 2002 21:40:22 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <200206182330444.SM01040@mail.python.org>
 <002d01c2199b$d7e50150$6300000a@holdenweb.com>
Message-ID: <15637.9990.703227.618127@anthem.wooz.org>

>>>>> "SH" == Steve Holden <sholden@holdenweb.com> writes:

    SH> Seems to me that the volume of comment might imply that string
    SH> formatting isn't the one obvious best way to do it. Now that the
    SH> flamefest has died down somewhat, the only thing I can see PEP
    SH> 292 justifying is better documentation for the
    SH> string-formatting operator. "But then I'm an
    SH> $ptype".sub({"ptype": "old fogey"})

I will soon do an update of the PEP to add a bunch more open issues
<wink> based on these threads, which I /think/ I've mostly slogged
through.

-Barry



From barry@zope.com  Sun Jun 23 02:45:42 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 22 Jun 2002 21:45:42 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCIEDOPPAA.tim.one@comcast.net>
 <3D1500F0.708@tismer.com>
Message-ID: <15637.10310.131724.556831@anthem.wooz.org>

>>>>> "CT" == Christian Tismer <tismer@tismer.com> writes:

    CT> If there are program variables directly accessible inside
    CT> strings to be interpolated, then I see possible abuse, if
    CT> abusers manage to supply such a string in an unforeseen way.

For literal strings in .py files, the only way that's going to happen
is if someone you don't trust is hacking your source code, /or/ if you
have evil translators sneaking in bogus translation strings.  The
latter can be solved with a verification step over your message
catalogs, while the former I leave as an exercise for the reader. :)

So still, I trust automatic interpolation of program vars for literal
strings, but for strings coming from some other source (e.g. a web
form), then yes, you obviously want to be explicit about the
interpolation dictionary.

-Barry



From barry@zope.com  Sun Jun 23 02:50:47 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 22 Jun 2002 21:50:47 -0400
Subject: [Python-Dev] PEP 292, Simpler String Substitutions
References: <20020620071856.GA10497@hishome.net>
 <Pine.LNX.4.44.0206201537410.1419-100000@ziggy>
 <20020622134459.A6918@hishome.net>
 <3D152099.1C1E73FA@prescod.net>
Message-ID: <15637.10615.563601.808178@anthem.wooz.org>

>>>>> "PP" == Paul Prescod <paul@prescod.net> writes:

    PP> I think that there are enough benefits for each form (compile
    PP> time with expressions, runtime without) that we should expect
    PP> any final solution to support both. Maybe you guys should
    PP> merge your PEPs!

Only two of them are official PEPs currently <294 winks to Oren>.

-Barry



From tismer@tismer.com  Sun Jun 23 03:04:59 2002
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 23 Jun 2002 04:04:59 +0200
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCIEDOPPAA.tim.one@comcast.net>	<3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org>
Message-ID: <3D152CCB.6010000@tismer.com>

Barry A. Warsaw wrote:
>>>>>>"CT" == Christian Tismer <tismer@tismer.com> writes:
>>>>>
> 
>     CT> If there are program variables directly accessible inside
>     CT> strings to be interpolated, then I see possible abuse, if
>     CT> abusers manage to supply such a string in an unforeseen way.
> 
> For literal strings in .py files, the only way that's going to happen
> is if someone you don't trust is hacking your source code, /or/ if you
> have evil translators sneaking in bogus translation strings.  The
> latter can be solved with a verification step over your message
> catalogs, while the former I leave as an exercise for the reader. :)
> 
> So still, I trust automatic interpolation of program vars for literal
> strings, but for strings coming from some other source (e.g. a web
> form), then yes, you obviously want to be explicit about the
> interpolation dictionary.

 From another reply:
 >
 >     def whereBorn(name):
 > 	country = countryOfOrigin(name)
 > 	return _('$name was born in $country')

Ok, I'm all for it.
For the last couple of hours, I've been riding the following horse:

- $name, $(name), $(any expr)  is just fine
- all of this is compile-time stuff

The idea is:
Resolve the variables at compile time.
Don't provide the feature at runtime.

Here's a simple approach (I'm working on a more complicated one, too),
assuming an "e" string prefix triggers the expression extraction:

     def whereBorn(name):
         country = countryOfOrigin(name)
         return _(e'$name was born in $country')

is accepted by the grammar, but turned into the
equivalent of:

     def whereBorn(name):
         country = countryOfOrigin(name)
         return _('%(x1)s was born in %(x2)s') % {
           "x1": name, "x2": country}

That is: the $ stuff is extracted, turning the format
string into something anonymous. Your _() processes
it, then the variables are formatted in.
This turns the $ stuff completely into syntactic
sugar. Any Python expression inside $() is allowed;
it is compiled as if it were sitting inside the dict.
I also believe it is a good idea to apply _() to
the unexpanded string (as shown), since the interpolated
values are most probably not translatable anyway.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From sjmachin@lexicon.net  Sun Jun 23 03:15:22 2002
From: sjmachin@lexicon.net (John Machin)
Date: Sun, 23 Jun 2002 12:15:22 +1000
Subject: "Julian" ambiguity (was Re: [Python-Dev] strptime recapped)
In-Reply-To: <20020621122722.44222.qmail@web9601.mail.yahoo.com>
Message-ID: <DBUQJHQL1W83B032XUP4Y62A9WTPNM.3d152f3a@Egil>

21/06/2002 10:27:22 PM, Steven Lott <s_lott@yahoo.com> wrote:

>
>Generally, "Julian" dates are really just the day number within
>a given year; this is a simple special case of the more general
>(and more useful) approach that R-D use.
>
>See
>http://emr.cs.iit.edu/home/reingold/calendar-book/index.shtml
>
>for more information.
>

AFAICT from perusing their book, R-D use the term "julian-date" to mean a tuple (year, month, day) in the Julian calendar.
The International Astro. Union uses "Julian date" to mean an instant in time measured in days (and fraction therof) since noon on 1 January -4712 (Julian ("proleptic") calendar). See for example 
http://maia.usno.navy.mil/iauc19/iaures.html#B1

A "Julian day number" (or "JDN") is generally used to mean an ordinal day number counting day 0 as Julian_calendar(-4712, 1, 1) as above. Some folks use JDN to include the IAU's instant-in-time.

Some folks use "julian day" to mean a day within a year (range 0-365 *or* 1-366 (all inclusive)). This terminology IMO should be severely deprecated. The concept is best described as something like "day of year", with a 
specification of the origin (0 or 1) when appropriate. 

It is not clear from the first of your sentences quoted above exactly what you are calling a "Julian date": (a) the tuple (given_year, day_of_year) with calendar not specified, or (b) just day_of_year. However, either answer seems to me
an inappropriate addition to the terminology confusion.
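
Just to pin down the "day of year" reading in code (this is only meant to
illustrate that meaning, not to endorse calling it "julian"): the time
module's tm_yday field, index 7 of a time tuple, counts 1 January as day 1.

    import time

    # 23 June 2002, noon; mktime ignores the wday/yday fields we pass as 0.
    secs = time.mktime((2002, 6, 23, 12, 0, 0, 0, 0, -1))
    print time.localtime(secs)[7]    # -> 174, the 174th day of 2002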

Cheers,
John






From niemeyer@conectiva.com  Sun Jun 23 05:39:36 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sun, 23 Jun 2002 01:39:36 -0300
Subject: [Python-Dev] Behavior of matching backreferences
In-Reply-To: <000501c21a46$2ee38210$070210ac@LAPDANCE>
References: <20020622171035.A6004@ibook> <000501c21a46$2ee38210$070210ac@LAPDANCE>
Message-ID: <20020623013936.A6543@ibook>

> I think your re has a bug in it that in python would be
> 
>     if cond:
>         a = 1
>     print a
> 
> python will give an error is cond is false.
> 
> An re that defines a group conditionally as yours does I think
> is the same programming error. That's the ambiguity I am
> referring to, is or is not the named group defined?

Sorry Barry, but I don't see your point here. There's no change in
the naming semantics. In sre that's totally valid and used in a
lot of code:

>>> `re.compile("(?P<a>a)?").match("b").group("a")`
'None'
>>> `re.compile("(?P<a>a)?").match("a").group("a")`
"'a'"
>>>

[...]
> You can prevent groups being remember using the (?:...) syntax
> if you need to preserve the group index. So you need:
> 
>     "^(?P<a>(?:abc)?)(?P=a)"

Again, you may do regular expressions in many ways, the point I'm
still raising is that there's one way that doesn't work as expected.

> I'm not convinced you have found a bug in the engine that needs
> fixing, I think its your re needs changing. I want the re engine
> to report the error for re that are illogical.

The re won't report anything when somebody uses this syntax. It will
just don't work as expected. If you think this re is illogical, don't
use it. But I see no point in denying others to use it.

I'm not planning to discuss much more about this. My intentions and the
issue are clear enough. I'd like to hear the opinion of Fredrik about
this, though.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From xscottg@yahoo.com  Sun Jun 23 06:05:06 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Sat, 22 Jun 2002 22:05:06 -0700 (PDT)
Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__
In-Reply-To: <3D1522EC.1070001@stsci.edu>
Message-ID: <20020623050506.37311.qmail@web40106.mail.yahoo.com>

--- Todd Miller:
> Right now, numarray is a subclass of object for Python-2.2 in order to 
> get properties in order to emulate some of Numeric's attributes. I'm 
> wondering what I'd lose from object in order to pick up int's indexing.

Since int is also a subclass of object, you'd still get the benefits of new
style classes...

>  I'm also wondering how to make a rank-0 Float array fail as an index. 

Raise a TypeError and it would match the standard behavior.




__________________________________________________
Do You Yahoo!?
Yahoo! - Official partner of 2002 FIFA World Cup
http://fifaworldcup.yahoo.com



From nhodgson@bigpond.net.au  Sun Jun 23 11:32:24 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Sun, 23 Jun 2002 20:32:24 +1000
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com><3D11B6F0.5000803@tismer.com><200206201746.g5KHkwH04175@odiug.zope.com><3D121EDB.6070501@tismer.com> <15637.7385.966341.14847@anthem.wooz.org>
Message-ID: <000a01c21aa1$438bfde0$3da48490@neil>

Barry A. Warsaw:

>     def whereBorn(name):
>         country = countryOfOrigin(name)
>         return _('$name was born in $country')
> ...
> The feature would be useless to me if I had to pass some explicit
> dictionary into the _() method.  It makes writing i18n code extremely
> tedious.

   I think you are overstating the problem here. The explicit bindings are a
small increase over your current code as you are already creating an extra
variable just to use the automatic binding. With explicit bindings:

def whereBorn(name):
   return _('$name was born in $country',
        name=name, country=countryOfOrigin(name))

   The protection provided is not just against untrustworthy translators but
also allows checking the initial language code. You can ensure all the
interpolations are provided with values and all the provided values are
used. It avoids exposing implementation details such as the names of local
variables and can ensure that a more meaningful identifier in the local
context of the string is available to the translator. For example, I may
have some code that processes a command line argument which has multiple
uses on different execution paths:
_('$moduleName already exists', moduleName = arg)
_('$searchString can not be found', searchString = arg)

   Not making bindings explicit may mean that translators use other
variables available at the translation point, leading to unexpected failures
when internal details are changed.

   Neil





From skip@mojam.com  Sun Jun 23 13:00:13 2002
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 23 Jun 2002 07:00:13 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200206231200.g5NC0DU02192@12-248-11-90>

Bug/Patch Summary
-----------------

254 open / 2603 total bugs (-3)
128 open / 1565 total patches (no change)

New Bugs
--------

ConfigParser code cleanup (2002-04-17)
	http://python.org/sf/545096
odd index entries (2002-06-17)
	http://python.org/sf/570003
"python -u" not binary on cygwin (2002-06-17)
	http://python.org/sf/570044
Broken pre.subn() (and pre.sub()) (2002-06-17)
	http://python.org/sf/570057
inspect.getmodule symlink-related failur (2002-06-17)
	http://python.org/sf/570300
.PYO files not imported unless -OO used (2002-06-18)
	http://python.org/sf/570640
bdist_rpm and the changelog option (2002-06-18)
	http://python.org/sf/570655
CGIHTTPServer flushes read-only file. (2002-06-18)
	http://python.org/sf/570678
glob() fails for network drive in cgi (2002-06-19)
	http://python.org/sf/571167
imaplib fetch is broken (2002-06-19)
	http://python.org/sf/571334
Mixing framework and static Pythons (2002-06-19)
	http://python.org/sf/571343
Numeric Literal Anomoly (2002-06-19)
	http://python.org/sf/571382
test_import crashes/hangs for MacPython (2002-06-20)
	http://python.org/sf/571845
Segmentation fault in Python 2.3 (2002-06-20)
	http://python.org/sf/571885
python-mode IM parses code in docstrings (2002-06-21)
	http://python.org/sf/572341
Memory leak in object comparison (2002-06-22)
	http://python.org/sf/572567

New Patches
-----------

Remove support for Win16 (2002-06-16)
	http://python.org/sf/569753
Fix bug in encodings.search_function (2002-06-20)
	http://python.org/sf/571603
Changes (?P=) with optional backref (2002-06-20)
	http://python.org/sf/571976
AUTH method LOGIN for smtplib (2002-06-21)
	http://python.org/sf/572031
Remove import string in Tools/ directory (2002-06-21)
	http://python.org/sf/572113
opt. timeouts for Queue.put() and .get() (2002-06-22)
	http://python.org/sf/572628

Closed Bugs
-----------

Incorporate timeoutsocket.py into core (2001-08-30)
	http://python.org/sf/457114
ext call doco warts (2001-12-14)
	http://python.org/sf/493243
It's the future for generators (2001-12-21)
	http://python.org/sf/495978
PyModule_AddObject doesn't set exception (2002-02-27)
	http://python.org/sf/523473
Incomplete list of escape sequences (2002-03-06)
	http://python.org/sf/526390
range() description: rewording suggested (2002-03-11)
	http://python.org/sf/528748
Popen3 might cause dead lock (2002-03-16)
	http://python.org/sf/530637
6.9 The raise statement is confusing (2002-03-20)
	http://python.org/sf/532467
cut-o/paste-o in Marshalling doc: 2.2.1 (2002-03-22)
	http://python.org/sf/533735
mimify.mime_decode_header only latin1 (2002-05-03)
	http://python.org/sf/551912
[RefMan] Special status of "as" (2002-05-07)
	http://python.org/sf/553262
Expat improperly described in setup.py (2002-05-15)
	http://python.org/sf/556370
\verbatiminput and name duplication (2002-05-20)
	http://python.org/sf/558279
Missing operator docs (2002-06-02)
	http://python.org/sf/563530
PyUnicode_Find() returns wrong results (2002-06-09)
	http://python.org/sf/566631
__slots__ attribute and private variable (2002-06-14)
	http://python.org/sf/569257

Closed Patches
--------------

Pure python version of calendar.weekday (2001-11-20)
	http://python.org/sf/483864
Janitoring in ConfigParser (2002-04-17)
	http://python.org/sf/545096
add support for HtmlHelp output (2002-05-06)
	http://python.org/sf/552835
texi2html.py - add support for HTML Help (2002-05-06)
	http://python.org/sf/552837
OSX build -- make python.app (2002-05-18)
	http://python.org/sf/557719
unicode in sys.path (2002-06-10)
	http://python.org/sf/566999



From jmiller@stsci.edu  Sun Jun 23 14:37:12 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Sun, 23 Jun 2002 09:37:12 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply
 __int__
References: <20020623050506.37311.qmail@web40106.mail.yahoo.com>
Message-ID: <3D15CF08.2020506@stsci.edu>

Scott Gilbert wrote:

>--- Todd Miller:
>
>>Right now, numarray is a subclass of object for Python-2.2 in order to 
>>get properties in order to emulate some of Numeric's attributes. I'm 
>>wondering what I'd lose from object in order to pick up int's indexing.
>>
>
>Since int is also a subclass of object, you'd still get the benefits of new
>style classes...
>
Well, that's excellent!

>
>> I'm also wondering how to make a rank-0 Float array fail as an index. 
>>
>
>Raise a TypeError and it would match the standard behavior.
>
Raise TypeError where?   I was thinking I'd have to either inherit from 
int, or not, depending on the type of the array.   It still might work 
out though...

>
>
>
>__________________________________________________
>Do You Yahoo!?
>Yahoo! - Official partner of 2002 FIFA World Cup
>http://fifaworldcup.yahoo.com
>






From barry@zope.com  Sun Jun 23 16:33:18 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sun, 23 Jun 2002 11:33:18 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
 <3D11B6F0.5000803@tismer.com>
 <200206201746.g5KHkwH04175@odiug.zope.com>
 <3D121EDB.6070501@tismer.com>
 <15637.7385.966341.14847@anthem.wooz.org>
 <000a01c21aa1$438bfde0$3da48490@neil>
Message-ID: <15637.59966.161957.754620@anthem.wooz.org>

>>>>> "NH" == Neil Hodgson <nhodgson@bigpond.net.au> writes:

    >> The feature would be useless to me if I had to
    >> pass some explicit dictionary into the _() method.  It makes
    >> writing i18n code extremely tedious.

    NH>    I think you are overstating the problem here.

Trust me, I'm not.  Then again, maybe it's just me, or my limited
experience w/ i18n'd source code, but being forced to pass in the
explicit bindings is a big burden in terms of maintainability and
readability.
    
    NH> The explicit bindings are a small increase over your current
    NH> code as you are already creating an extra variable just to use
    NH> the automatic binding. With explicit bindings:

    NH> def whereBorn(name):
    |    return _('$name was born in $country',
    |         name=name, country=countryOfOrigin(name))

More often than not, you already have the values you want to
interpolate sitting in local variables for other uses inside the
function.  Notice how you've written `name' 5 times there?  Try that
with every other line of code and see if it doesn't get tedious. ;)

    NH>    The protection provided is not just against untrustworthy
    NH> translators but also allows checking the initial language
    NH> code. You can ensure all the interpolations are provided with
    NH> values and all the provided values are used.

Yes, you could do that.  Note that the actual interpolation function
/does/ have access to a dictionary; it might have more stuff in it than you
want (making the second check impossible), but the first check could
be done.
    
    NH> It avoids exposing implementation details such as the names of
    NH> local variables

This isn't an issue from a security standpoint, if the code is open
source.  And you should be picking meaningful local variable names
anyway!  Mine tend to be stuff like `subject', `listname',
`realname'.  I've yet to get a question about the meaning of an
interpolation variable.

Actually, translators really need access to the source code anyway;
.po files usually contain references to the file and line number
of the source string, and po-mode makes it easy for translators to
locate the context and purpose of the translation.

    NH> and can ensure that a more meaningful identifier in the local
    NH> context of the string is available to the translator. For
    NH> example, I may have some code that processes a command line
    NH> argument which has multiple uses on different execution paths:
    NH> _('$moduleName already exists', moduleName = arg)
    NH> _('$searchString can not be found', searchString = arg)

+1 on using explicit bindings or a dictionary when it improves
clarity!

    NH>    Not making bindings explicit may mean that translators use
    NH> other variables available at the translation point leading to
    NH> unexpected failures when internal details are changed.

I18n'ing a program means you have to worry about a lot more things.
If some local variable changed, I'd consider using an explicit binding
to preserve the original source string, a change to which would force
updated translations.  Then again, you tend to get paranoid about
changing /any/ source string, say to remove a comma, adjust
whitespace, or fix a preposition.  Any change means a dozen language
teams have a new message they must translate (unless you can
mechanically fix them for them).

Another i18n approach altogether uses explicit message ids instead of
using the source string as the implicit message id, but that has a
whole 'nuther set of issues.

multi-lingual-ly y'rs,
-Barry



From rnd@onego.ru  Sun Jun 23 17:10:43 2002
From: rnd@onego.ru (Roman Suzi)
Date: Sun, 23 Jun 2002 20:10:43 +0400 (MSD)
Subject: [Python-Dev] Behavior of matching backreferences
In-Reply-To: <20020623013936.A6543@ibook>
Message-ID: <Pine.LNX.4.44.0206232001240.2233-100000@rnd.onego.ru>

On Sun, 23 Jun 2002, Gustavo Niemeyer wrote:

I do not agree with either of you. I think re should give an error at compile
time (as it does in cases like (?<=REGEXP), where only a fixed length is
allowed):

>>> re.compile("(?<=R*)")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.2/sre.py", line 178, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python2.2/sre.py", line 228, in _compile
    raise error, v # invalid expression
sre_constants.error: look-behind requires fixed-width pattern


Why? Because there is no sense in matching against a non-existent group.
It's simply incorrect. So, instead of leaving around time-bombs like the one
Gustavo found, it's better to check at re compile time.

>Sorry Barry, but I don't see your point here. There's no change in
>the naming semantics. In sre that's totally valid and used in a
>lot of code:
>
>>>> `re.compile("(?P<a>a)?").match("b").group("a")`
>'None'
>>>> `re.compile("(?P<a>a)?").match("a").group("a")`
>"'a'"
>>>>

This is quite different. None has the sense of a meta-value
which indicates that the group was not used while matching.
There is no way to use it consistently within the re itself.

(Well, probably some syntax could be invented for it,
like 'match only if the group exists', etc. But that is too subtle
and hardly needed.)

Sincerely yours, Roman Suzi
-- 
rnd@onego.ru =\= My AI powered by Linux RedHat 7.2




From lalo@laranja.org  Sun Jun 23 19:16:30 2002
From: lalo@laranja.org (Lalo Martins)
Date: Sun, 23 Jun 2002 15:16:30 -0300
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
Message-ID: <20020623181630.GN25927@laranja.org>

For a moment, please remove from your mind your experience of C and printf.
Meditate with me and picture yourself in a happy world of object-orientation
and code readability, where everything cryptic and obscure is banished.

Just to help you do that, I'll avoid the notation chosen by the PEP. Let's
use, for the duration of this post, the hypothetical notation suggested by
some other reader: "<<name>> is from <<country>>".

Now, this thing we're talking about is replacing parts of the string with
other strings. These strings may be the result of running some non-string
objects through str(foo) - but we are making no assumptions about those
objects, just that str(foo) is somehow meaningful. And, to my knowledge,
there are no Python objects for which str(foo) doesn't work.

So, string substitution is non-intrusive.

Also, if you keep your templates (let's call a string containing
substitution markup a template, shall we?) outside your source code, as is
the case with i18n, pure substitution doesn't require the people who edit
them (for example, translators) to know anything about python *or* even
programming.

String substitution only depends on an identifier ('name' or 'country'), no
sick abbreviations like 's' or 'd' or 'f' or 'r' or 'x' that you have to
keep a table for.

So, string substitution is readable and non-cryptic.
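
To make that concrete, here is a minimal sketch of pure name substitution in
the <<name>> notation used in this post (the helper and the sample data are
invented; the only assumption about the values is that str() works on them):

    import re

    def sub(template, mapping):
        # Replace each <<name>> with str(mapping[name]); no format codes,
        # no other assumptions about the substituted objects.
        return re.sub(r"<<(\w+)>>",
                      lambda m: str(mapping[m.group(1)]),
                      template)

    print sub("<<name>> is from <<country>>",
              {"name": "Lalo", "country": "Brazil"})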



Now, data formatting is another animal entirely. It's a way to request one
specific representation of a piece of data.

But there is a catch. When you do '%8.3f' % foo you are *expecting* that foo
is a floating-point number, and you know you'll get a TypeError otherwise.
This is, IMO, invasive. In my ideal OO paradise I would rather have something
like foo.format(8, 3) (THIS IS NOT A PEP!).

IMO, if you, as I asked in the first paragraph, pretend you don't know C and
printf and python's % operator and then pretend you're having your first
contact with it, while already having some experience with python's
readability, it's hard not to be shocked. And I bet you'd go to great
lengths to avoid using the "feature".



Conclusion: I think string formatting is a cryptic and obscure misfeature
inherited from C that should be deprecated in favour of something less
invasive and more readable/explicit.

Moreover, I'm completely opposed to "<<name>> is <<age:.0d>> years old" because
it's still cryptic and invasive. This should instead read something like
"<<name>> is <<age>> years old".sub({'name': x.name, 'age': x.age.format(None, 0)})



Guido, can you please, for our enlightenment, tell us what are the reasons
you feel %(foo)s was a mistake?

[]s,
                                               |alo
                                               +----
--
  It doesn't bother me that people say things like
   "you'll never get anywhere with this attitude".
   In a few decades, it will make a good paragraph
      in my biography. You know, for a laugh.
--
http://www.laranja.org/                mailto:lalo@laranja.org
         pgp key: http://www.laranja.org/pessoal/pgp

Python Foundry Guide http://www.sf.net/foundry/python-foundry/



From tim.one@comcast.net  Sun Jun 23 19:28:53 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 23 Jun 2002 14:28:53 -0400
Subject: [Python-Dev] Behavior of matching backreferences
In-Reply-To: <20020623013936.A6543@ibook>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHJPPAA.tim.one@comcast.net>

[Gustavo Niemeyer, on the behavior of
    re.compile("^(?P<a>a)?(?P=a)$").match("ebc").groups()
]

Python and Perl work exactly the same way for the equivalent (but spellable
in Perl) regexp

    ^(a)?\1$

matching the two strings

    a
and
    aa

and nothing else.  That's what I expected.  You didn't give a concrete
example of what you think it should do instead.  It may have been your
intent to say that you believe the regexp *should* match the string

    ebc

but you didn't really say so one way or the other.  Regardless, neither
Python nor Perl matches ebc in this case, and that's intended.

The Rule, in vague English, is that a backreference matches the same text as
was matched by the referenced group; if the referenced group didn't match
any text, then the backreference can't match either.  Note that whether the
referenced group matched any text is a different question than whether the
referenced group is *used* in the match.  This is a subtle point I suspect
you're missing.

> Otherwise the regular expression above will always fail if the first
> group fails,

Yes.

> even being optional

There's no such beast as "an optional group".  The

    ^(a)

part *must* match or the entire regexp fails, period, regardless of whether
or not backreferences appear later.  The question mark following doesn't
change this requirement.

    (a)?

says

    'a' must match
    but the overall pattern can choose to use this match or not

That's why the regexp as a whole matches the string

    a

The

    (a)

part does match 'a', the ? chooses not to use this match, and then the
backreference matches the 'a' that the first group matched.  Study the
output of this and it may be clearer:

import re
pat = re.compile(r"^((a)?)(\2)$")
print pat.match('a').groups()
print pat.match('aa').groups()


> ...
> while the regular expression above would match "aa" or "", but not "a".

As above, Python and Perl disagree with you:  they match "aa" and "a" but
not "".

> ...
> My intentions and the issue are clear enough.

Sorry, your intentions weren't clear to me.  The issue is, though <wink>.




From paul@prescod.net  Sun Jun 23 19:32:21 2002
From: paul@prescod.net (Paul Prescod)
Date: Sun, 23 Jun 2002 11:32:21 -0700
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCIEDOPPAA.tim.one@comcast.net>	<3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> <3D152CCB.6010000@tismer.com>
Message-ID: <3D161435.9D154EE0@prescod.net>

Christian Tismer wrote:
> 
>...
> 
> Ok, I'm all with it.
> Since a couple of hours, I'm riding the following horse:
> 
> - $name, $(name), $(any expr)  is just fine
> - all of this is compile-time stuff
> ....

I think you just described PEP 215. But what you're missing is that we
need a compile time facility for its flexibility and simplicity but we
also need a runtime facility to allow I18N.

> I also believe it is a good idea to do the _() on
> the unexpanded string (as shown), since the submitted
> values are most probably hard to translate at all.

_ runs at runtime. If the interpolation is done at compile time then "_"
is executed too late.

 Paul Prescod



From paul@prescod.net  Sun Jun 23 19:38:43 2002
From: paul@prescod.net (Paul Prescod)
Date: Sun, 23 Jun 2002 11:38:43 -0700
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com><3D11B6F0.5000803@tismer.com><200206201746.g5KHkwH04175@odiug.zope.com><3D121EDB.6070501@tismer.com> <15637.7385.966341.14847@anthem.wooz.org> <000a01c21aa1$438bfde0$3da48490@neil>
Message-ID: <3D1615B3.D9F382F@prescod.net>

Neil Hodgson wrote:
> 
>...
> 
>    Not making bindings explicit may mean that translators use other
> variables available at the translation point leading to unexpected failures
> when internal details are changed.

Actually, I don't think that is the case. I think that the security
implications of "_" are overstated.

name = "Paul"
country = "Canada"
password = "jfoiejw"
_('${name} was born in ${country}')

The "_" function can use a regular expression to determine that the
original code used only "${name}" and "${country}". Then it can disallow
access to ${password}.

def _(origstring):
    orig_substitutions = get_substitutions(origstring)
    translation = lookup_translation(origstring)
    translation_substitutions = get_substitutions(translation)
    assert translation_substitutions == orig_substitutions
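
And a possible get_substitutions(), assuming the ${name} spelling from the
example above (purely illustrative):

    import re

    def get_substitutions(s):
        # The names referenced as ${name} in the string, in a canonical
        # order so that two strings can be compared with ==.
        names = re.findall(r"\$\{(\w+)\}", s)
        names.sort()
        return names

    print get_substitutions('${name} was born in ${country}')
    # -> ['country', 'name']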

 Paul Prescod



From tim.one@comcast.net  Sun Jun 23 19:45:30 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 23 Jun 2002 14:45:30 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply
 __int__
In-Reply-To: <3D1522EC.1070001@stsci.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHLPPAA.tim.one@comcast.net>

[Todd Miller, wants to use rank-0 arrays as regular old indices]

[Tim]
> Here's a sick idea:  given Python 2.2, you *could* make the type
> of a rank-0 array a subclass of Python's int type

[Todd]
> Right now, numarray is a subclass of object for Python-2.2 in order to
> get properties in order to emulate some of Numeric's attributes. I'm
> wondering what I'd lose from object in order to pick up int's indexing.

All types in 2.2 inherit from object, including int.

>>> class IntWithA(int):
...     def seta(self, value):
...         self._a = value
...     def geta(self):
...         return self._a * 2
...     a = property(geta, seta)
...
>>> i = IntWithA(42)
>>> i
42
>>> i.a = 333
>>> i.a
666
>>> range(50)[i]
42
>>>

So, e.g., adding arbitrary properties should be a crawl in the park.

>  I'm also wondering how to make a rank-0 Float array fail as an index.

Quit while you're ahead <wink>.  The obvious idea is to make a
Rank0FloatArray type which is not a subclass of int.
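
A toy sketch of that split, with stand-in classes rather than real numarray
types: an int subclass indexes fine, while a float-based rank-0 type is
rejected by the sequence protocol itself.

    class Rank0Int(int):
        """Toy stand-in for a rank-0 integer array."""

    class Rank0Float(float):
        """Toy stand-in for a rank-0 float array: deliberately *not* an int
        subclass, so PyInt_Check() rejects it as a sequence index."""

    print range(50)[Rank0Int(42)]        # -> 42
    try:
        range(50)[Rank0Float(3.0)]
    except TypeError:
        print "rank-0 float fails as an index, as desired"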




From tim.one@comcast.net  Sun Jun 23 20:00:25 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 23 Jun 2002 15:00:25 -0400
Subject: [Python-Dev] Behavior of matching backreferences
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEHJPPAA.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHMPPAA.tim.one@comcast.net>

[Tim]
> ..
> There's no such beast as "an optional group".  The
>
>     ^(a)
>
> part *must* match or the entire regexp fails, period, regardless
> of whether or not backreferences appear later.  The question mark
> following doesn't change this requirement. ...

Wow, yesterday's drugs haven't worn off yet <wink>.  The details of this
explanation were partly full of beans.  Let's consider a different regexp:

    ^(a)?b\1$

Should that match

    b

or not?  Python and Perl say "no" today, because \1 refers to a group that
didn't match.  It remains unclear to me whether Gustavo is saying it should,
but, if he is, that's too big a change, and

    ^(a?)b\1$

is the intended way to spell it.
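
The difference, in code (this is how both engines behave today, per the
description above):

    import re

    # Backreference to a group that did not participate in the match: no match.
    print re.match(r"^(a)?b\1$", "b")                   # -> None
    print re.match(r"^(a)?b\1$", "aba") is not None     # True

    # Group that always participates, possibly matching '': both strings match.
    print re.match(r"^(a?)b\1$", "b") is not None       # True
    print re.match(r"^(a?)b\1$", "aba") is not None     # True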




From python@rcn.com  Sun Jun 23 20:03:08 2002
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 23 Jun 2002 15:03:08 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
Message-ID: <002001c21ae8$9d687a40$bbb53bd0@othello>

GvR thought you guys might have some ideas on this one for me.

If I don't get any replies, I may have to rely on my own instincts and
judgment and no one knows what follies might ensue ;)


Raymond Hettinger


----- Original Message -----
From: "Raymond Hettinger" <python@rcn.com>
To: <python-dev@python.org>
Sent: Friday, June 21, 2002 1:16 PM
Subject: Behavior of buffer()


> I would like to solicit py-dev's thoughts on the best way to resolve a
> bug, www.python.org/sf/546434 .
>
> The root problem is that mybuf[:] returns a buffer type and mybuf[2:4]
> returns a string type.  A similar issue exists for buffer repetition.
>
> One way to go is to have the slices always return a string.  If code
> currently relies on the type of a buffer slice, it is more likely to be
> relying on it being a string as in:  print mybuf[:4].  This is an intuitive
> guess because I can't find empirical evidence. Another reason to choose a
> string return type is that buffer() appears to have been designed to be as
> stringlike as possible so that it can be easily substituted in code
> originally designed for strings.
>
> The other way to go is to return a buffer object every time.  Slices usually,
> but not always (see subclasses of list), return the same type that was being
> sliced.  If we choose this route, another issue remains -- mybuf[:] returns
> self instead of a new buffer.  I think that behavior is also a bug and
> should be changed to be consistent with the Python idiom where:
>   b = a[:]
>   assert id(a) != id(b)
>
> Incidental to the above, GvR had a thought that slice repetition ought to
> always return an error.  Though I don't see any use cases for buffer
> repetition, bufferobjects do implement all other sequence behaviors and I
> think it would be weird to nullify the sq_repeat slot.
>
> I appreciate your thoughts on the best way to proceed.
>
> fixing-bugs-is-easier-than-deciding-appropriate-behavior-ly yours,
>
>
> 'regnitteh dnomyar'[::-1]




From pinard@iro.umontreal.ca  Sun Jun 23 20:05:45 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 23 Jun 2002 15:05:45 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <15637.59966.161957.754620@anthem.wooz.org>
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
 <3D11B6F0.5000803@tismer.com>
 <200206201746.g5KHkwH04175@odiug.zope.com>
 <3D121EDB.6070501@tismer.com>
 <15637.7385.966341.14847@anthem.wooz.org>
 <000a01c21aa1$438bfde0$3da48490@neil>
 <15637.59966.161957.754620@anthem.wooz.org>
Message-ID: <oq4rftvmra.fsf@titan.progiciels-bpi.ca>

[Barry A. Warsaw]
> "NH" == Neil Hodgson <nhodgson@bigpond.net.au>

> Another i18n approach altogether uses explicit message ids instead of
> using the source string as the implicit message id, but that has a
> whole 'nuther set of issues.

The `catgets' approach, by opposition to the `gettext' approach.  I've seen
some people having religious feelings in either direction.

Roughly said, `catgets' is faster, as you directly index the translation
string without having to hash the original string first.  It is also easier
to translate single words or strings offering little translation context,
as English ambiguities are resolved by using different message ids for
the same text fragment.

On the other hand, `gettext' can be made nearly as fast as `catgets', only
_if_ we use efficient hashing combined with proper caching.  But the real
advantage of `gettext' is that internationalised sources are more legible
and easier to maintain, since the original string is shown in clear exactly
where it is meant to be used.

A problem with both is that the implementations bundled with various systems
are often weak or buggy, provided they exist at all.  Portability is
notoriously difficult.  Linux and GNU `gettext' rate rather nicely.
But nothing is perfect.

> [...] you tend to get paranoid about changing /any/ source string, say
> to remove a comma, adjust whitespace, or fix a preposition.  Any change
> means a dozen language teams have a new message they must translate
> (unless you can mechanically fix them for them).

This is why the responsibilities of maintainers and translators ought
to be well split.  If the maintainer feels responsible for the work that
string changes induce on the translation teams, comfort is lost.
The maintainer should work in full freedom, and the problem of
later reflecting tiny editorial changes into PO `msgstr' entries pertains
entirely to translators, with the possible help of automatic tools.
Translators should be prepared for such changes.  If this split of
responsibilities is not fully understood and accepted, internationalisation
becomes much heavier, in practice, than it has to be.

>     >> The feature would be useless to me if I had to pass some explicit
>     >> dictionary into the _() method.  It makes writing i18n code
>     >> extremely tedious.

>     NH>    I think you are overstating the problem here.

> Trust me, I'm not.  [...] being forced to pass in the explicit bindings
> is a big burden in terms of maintainability and readability.

>     NH> Not making bindings explicit may mean that translators use
>     NH> other variables available at the translation point leading to
>     NH> unexpected failures when internal details are changed.

> I18n'ing a program means you have to worry about a lot more things.  [...]

Internationalisation should not place a significant burden on the programmer.
I mean, if there is something cumbersome in the internationalisation of
a string, then there is something cumbersome in that string outside any
internationalisation context.

If internationalisation really adds a significant burden, this is a
signal that internationalisation has not been implemented well enough in
the underlying language, or else, that it is not getting used correctly.
I really think that the internationalisation of strings should be designed so
that it is a light activity and a negligible burden for the maintainer.  (And of
course, translators should also get help in the form of proper files and tools.)

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From tismer@tismer.com  Sun Jun 23 20:24:16 2002
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 23 Jun 2002 21:24:16 +0200
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCIEDOPPAA.tim.one@comcast.net>	<3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> <3D152CCB.6010000@tismer.com> <3D161435.9D154EE0@prescod.net>
Message-ID: <3D162060.9030101@tismer.com>

Paul Prescod wrote:
> Christian Tismer wrote:
> 
>>...
>>
>>Ok, I'm all with it.
>>Since a couple of hours, I'm riding the following horse:
>>
>>- $name, $(name), $(any expr)  is just fine
>>- all of this is compile-time stuff
>>....
> 
> 
> I think you just described PEP 215. But what you're missing is that we
> need a compile time facility for its flexibility and simplicity but we
> also need a runtime facility to allow I18N.

Are you sure you got what I meant?
I want to compile the variable references away at compile
time, resulting in an ordinary format string.
This string is wrapped by the runtime _(), and
the result is then interpolated with a dict.

>>I also believe it is a good idea to do the _() on
>>the unexpanded string (as shown), since the submitted
>>values are most probably hard to translate at all.
> 
> 
> _ runs at runtime. If the interpolation is done at compile time then "_"
> is executed too late.

Compile time does no interpolation but a translation
of the string into a different one, which is interpolated
at runtime.

will-read-PEP215-anyway - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From skip@pobox.com  Sun Jun 23 20:28:20 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 23 Jun 2002 14:28:20 -0500
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
In-Reply-To: <20020623181630.GN25927@laranja.org>
References: <20020623181630.GN25927@laranja.org>
Message-ID: <15638.8532.380440.63318@beluga.mojam.com>

    Lalo> These strings may be the result of running some non-string objects
    Lalo> through str(foo) - but, we are making no assumptions about these
    Lalo> objects. Just that str(foo) is somehow meaningful. And, to my
    Lalo> knowledge, there are no python objects for which str(foo) doesn't
    Lalo> work.

Unicode objects can't always be passed to str():

    >>> str(u"abc")
    'abc'
    >>> p = u'Scr\xfcj MacDuhk'
    >>> str(p)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeError: ASCII encoding error: ordinal not in range(128)

(My default encoding is "ascii".)

You need to encode Unicode objects using the appropriate charset, which may
not always be the default.

Skip



From paul@prescod.net  Sun Jun 23 20:36:34 2002
From: paul@prescod.net (Paul Prescod)
Date: Sun, 23 Jun 2002 12:36:34 -0700
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCIEDOPPAA.tim.one@comcast.net>	<3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> <3D152CCB.6010000@tismer.com> <3D161435.9D154EE0@prescod.net> <3D162060.9030101@tismer.com>
Message-ID: <3D162342.BBDC07B3@prescod.net>

Christian Tismer wrote:
> 
>...
> 
> Are you sure you got what I meant?
> I want to compile the variable references away at compile
> time, resulting in an ordinary format string.
> This string is wraped by the runtime _(), and
> the result is then interpolated with a dict.

How can that be?

Original expression:

_($"$foo")

Expands to:

_("%(x1)s"%{"x1": foo})

Standard Python order of operations will do the %-interpolation before
the method call! You say that it could instead be 

_("%(x1)s")%{"x1": foo}

But how would Python know to do that? "_" is just another function.
There is nothing magical about it. What if the function was instead
re.compile? In that case I would want to do the interpolation *before*
the compilation, not after!

Are you saying that the "_" function should be made special and
recognized by the compiler?

 Paul Prescod



From lalo@laranja.org  Sun Jun 23 20:38:41 2002
From: lalo@laranja.org (Lalo Martins)
Date: Sun, 23 Jun 2002 16:38:41 -0300
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
In-Reply-To: <15638.8532.380440.63318@beluga.mojam.com>
References: <20020623181630.GN25927@laranja.org> <15638.8532.380440.63318@beluga.mojam.com>
Message-ID: <20020623193841.GO25927@laranja.org>

On Sun, Jun 23, 2002 at 02:28:20PM -0500, Skip Montanaro wrote:
> 
>     Lalo> These strings may be the result of running some non-string objects
>     Lalo> trough str(foo) - but, we are making no assumptions about these
>     Lalo> objects. Just that str(foo) is somehow meaningful. And, to my
>     Lalo> knowledge, there are no python objects for which str(foo) doesn't
>     Lalo> work.
> 
> Unicode objects can't always be passed to str():
> 
>     >>> str(u"abc")
>     'abc'
>     >>> p = u'Scr\xfcj MacDuhk'
>     >>> str(p)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     UnicodeError: ASCII encoding error: ordinal not in range(128)
> 
> (My default encoding is "ascii".)
> 
> You need to encode Unicode objects using the appropriate charset, which may
> not always be the default.

Valid point but completely unrelated to my argument - just s/str/unicode/
where necessary. '%s' already handles this:

>>> '-%s-' % u'Scr\xfcj MacDuhk'
u'-Scr\xfcj MacDuhk-'

[]s,
                                               |alo
                                               +----
--
  It doesn't bother me that people say things like
   "you'll never get anywhere with this attitude".
   In a few decades, it will make a good paragraph
      in my biography. You know, for a laugh.
--
http://www.laranja.org/                mailto:lalo@laranja.org
         pgp key: http://www.laranja.org/pessoal/pgp

Eu jogo RPG! (I play RPG)         http://www.eujogorpg.com.br/
Python Foundry Guide http://www.sf.net/foundry/python-foundry/



From tismer@tismer.com  Sun Jun 23 21:51:41 2002
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 23 Jun 2002 22:51:41 +0200
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCIEDOPPAA.tim.one@comcast.net>	<3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> <3D152CCB.6010000@tismer.com> <3D161435.9D154EE0@prescod.net> <3D162060.9030101@tismer.com> <3D162342.BBDC07B3@prescod.net>
Message-ID: <3D1634DD.8060101@tismer.com>

Paul Prescod wrote:
> Christian Tismer wrote:
> 
>>...
>>
>>Are you sure you got what I meant?
>>I want to compile the variable references away at compile
>>time, resulting in an ordinary format string.
>>This string is wrapped by the runtime _(), and
>>the result is then interpolated with a dict.
> 
> 
> How can that be?
> 
> Original expression:
> 
> _($"$foo")
> 
> Expands to:
> 
> _("%(x1)s"%{"x1": foo})
> 
> Standard Python order of operations will do the %-interpolation before
> the method call! You say that it could instead be 
> 
> _("%(x1)s")%{"x1": foo}
> 
> But how would Python know to do that? "_" is just another function.
> There is nothing magical about it. What if the function was instead
> re.compile? In that case I would want to do the interpolation *before*
> the compilation, not after!
> 
> Are you saying that the "_" function should be made special and
> recognized by the compiler?

As you say it, it looks a little as if something special
would be needed, right.
I have no concrete idea.
Somehow I'd want to express that a function is
applied after compile time substitution, but before
runtime interpolation.

Here's a simple idea; it's not very nice, but it could work:

Assume a "$" prefix, which does the interpolation in the
way you said.
Assume further a "%" prefix, which does it only halfway,
returning a tuple: (modified string, dict).
This tuple would be passed to _(),
and it is _()'s decision to work this way:

def _(s):
    # accept either a plain format string or a (string, dict) tuple
    if isinstance(s, tuple):
        s, args = s
    else:
        args = None

    # ... processing s ...

    if args:
        return s % args
    else:
        return s
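
For instance, under the hypothetical "%" prefix, %"$name likes $thing"
would hand _() a tuple along the lines of

    ("%(x1)s likes %(x2)s", {"x1": name, "x2": thing})

instead of an already-interpolated string.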

But this is a minor issue, I just wanted to tell what
I think should happen, without giving an exact
solution.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From xscottg@yahoo.com  Sun Jun 23 22:05:08 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Sun, 23 Jun 2002 14:05:08 -0700 (PDT)
Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__
In-Reply-To: <3D15CF08.2020506@stsci.edu>
Message-ID: <20020623210508.60531.qmail@web40105.mail.yahoo.com>

--- Todd Miller <jmiller@stsci.edu> wrote:
> >
> >Raise a TypeError and it would match the standard behavior.
> >
> Raise TypeError where?   I was thinking I'd have to either inherit from 
> int, or not, depending on the type of the array.   It still might work 
> out though...
> 

You're right.  You'd have to raise the TypeError from whatever object was
being subscripted.  I'm not sure what I was thinking...












From martin@v.loewis.de  Sun Jun 23 23:19:50 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 24 Jun 2002 00:19:50 +0200
Subject: [Python-Dev] Re: replacing bsddb with pybsddb's bsddb3 module
In-Reply-To: <20020621215444.GB30056@zot.electricrain.com>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
 <15632.62564.638418.191453@localhost.localdomain>
 <20020619212559.GC18944@zot.electricrain.com>
 <15633.1338.367283.257786@localhost.localdomain>
 <20020620205041.GD18944@zot.electricrain.com>
 <m34rfxowsn.fsf@mira.informatik.hu-berlin.de>
 <15635.14235.79608.390983@beluga.mojam.com>
 <20020621215444.GB30056@zot.electricrain.com>
Message-ID: <m34rftljsp.fsf@mira.informatik.hu-berlin.de>

"Gregory P. Smith" <greg@electricrain.com> writes:

> Sound correct?

Yes, please go ahead.

> How do we want future bsddb module development to proceed?  I envision
> it either taking place 100% under the python project, or taking place
> as it is now in the pybsddb project with patches being fed to the python
> project as desired?  Any preferences?  [i prefer to not maintain the
> code in two places myself (ie: do it all under the python project)]

It's your choice. If people want to maintain pybsddb3 for older Python
releases, it would be necessary to synchronize the two code bases
regularly. That would be the task of whoever is interested in
providing older Python releases with newer code. From the viewpoint
of the Python distribution, this support is not interesting.

Regards,
Martin



From xscottg@yahoo.com  Sun Jun 23 23:22:09 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Sun, 23 Jun 2002 15:22:09 -0700 (PDT)
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <002001c21ae8$9d687a40$bbb53bd0@othello>
Message-ID: <20020623222209.62675.qmail@web40105.mail.yahoo.com>

--- Raymond Hettinger <python@rcn.com> wrote:
> GvR thought you guys might have some ideas on this one for me.
> 
> If I don't get any replies, I may have to rely on my own instincts and
> judgment and no one knows what follies might ensue ;)
> 
> [...]

I think buffers have a weird duality that they don't really want.  In one
case, the buffer object acts as a low level way to inspect some other
object's PyBufferProcs.  I'll call this BufferInspector.  In the other
case, the buffer object just acts like an array of bytes.  I'll call this
ByteArray.

So for a BufferInspector, you'd want slices to return new "views" into the
same object, and repetition doesn't make any sense.  If you wanted to copy
the data out of the object you're mucking with, you'd be explicit about it
- either creating a new string, or a new ByteArray.

For a ByteArray, I think you'd want slices to have copy behaviour and
return a new ByteArray.  Repetition also makes perfect sense.

Of course this all gets screwy when the object being inspected (in the
BufferInspector sense) is created solely to provide a ByteArray.  I see this
as an ugly workaround for arraymodule.c not allowing one to supply a
pointer/destructor when creating arrays.

The fact that either of these pretends to be a string is really convenient,
but I don't think it has much to do with the weirdness.  The fact that
either of these returns strings for any operation is somewhat weird.  For
the ByteArray sense of the buffer object, it's analogous to a list
slice/repetition returning a tuple.
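
For instance, on a 2.x interpreter (a quick check, not an exhaustive one):

    >>> b = buffer('spam and eggs')
    >>> b[5:8]
    'and'
    >>> type(b[5:8])
    <type 'str'>

so a slice of a buffer comes back as a str, not another buffer.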


Since the array module already has a way to create a ByteArray (and a
ShortArray, and...), buffer objects don't really need to duplicate that
effort.  Except creating an array from your own "special memory" (mmap,
DMA, third party API), and backwards compatibility in general.  :-)



BTW: I chuckled when I saw you post this the first time.  This topic seems
to draw a lot of silence.

I know that I would suggest deprecating the PyBufferObject to just being a
BufferInspector, and taking what little extra functionality was in there
and stuffing it into arraymodule.c.  Another solution would be to factor
PyBufferObject into PyBufferInspector and a "bytes" object.  A few months
ago, I was tempted to submit a PEP saying as much, but I think that would
have quietly fallen to the floor.  Nobody seems to like this topic too
much...

If you do go in and make changes to bufferobject.c, I've already submitted
two patches (fallen quietly to the floor) that fix some other classic
"buffer problems".  You might want to look at them.  Or not :-)



Cheers,
    -Scott





From aahz@pythoncraft.com  Sun Jun 23 23:35:26 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sun, 23 Jun 2002 18:35:26 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <20020623222209.62675.qmail@web40105.mail.yahoo.com>
References: <002001c21ae8$9d687a40$bbb53bd0@othello> <20020623222209.62675.qmail@web40105.mail.yahoo.com>
Message-ID: <20020623223526.GA2570@panix.com>

On Sun, Jun 23, 2002, Scott Gilbert wrote:
>
> I know that I would suggest deprecating the PyBufferObject to just being a
> BufferInspector, and taking what little extra functionality was in there
> and stuffing it into arraymodule.c.  Another solution would be to factor
> PyBufferObject into PyBufferInspector and a "bytes" object.  A few months
> ago, I was tempted to submit a PEP saying as much, but I think that would
> have quietly fallen to the floor.  Nobody seems to like this topic too
> much...

OTOH, for PEPs, silence may be construed as consent.  Just don't be too
surprised if an actual PEP generates a lot of noise.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From niemeyer@conectiva.com  Mon Jun 24 00:30:06 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sun, 23 Jun 2002 20:30:06 -0300
Subject: [Python-Dev] Behavior of matching backreferences
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEHMPPAA.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOEHJPPAA.tim.one@comcast.net> <LNBBLJKPBEHFEDALKOLCOEHMPPAA.tim.one@comcast.net>
Message-ID: <20020623203006.A9783@ibook>

Hello Tim!

> Wow, yesterday's drugs haven't worn off yet <wink>.  The details of this
> explanation were partly full of beans.  Let's consider a different regexp:
[...]

Thanks for explaining again, in a way I could understand. :-)

>     ^(a)?b\1$
> 
> Should that match
> 
>     b
>
> or not?  Python and Perl say "no" today, because \1 refers to a group that
> didn't match.  Ir remains unclear to me whether Gustavo is saying it should,
> but, if he is, that's too big a change, and
> 
>     ^(a?)b\1$
[...]

I still think it should, because otherwise the "^(a)?b\1$" can never be
used, and this expression will become "^((a)?)b\1$" if more than one
character is needed. But since nobody agrees with me, and both languages
are doing it that way, I give up. :-)

Could you please reject the patch at SF?

Thank you!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From barry@zope.com  Mon Jun 24 01:03:37 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sun, 23 Jun 2002 20:03:37 -0400
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
References: <20020623181630.GN25927@laranja.org>
Message-ID: <15638.25049.352408.232831@anthem.wooz.org>

>>>>> "LM" == Lalo Martins <lalo@laranja.org> writes:

    LM> Also, if you keep your templates (let's call a string
    LM> containing substitution markup a template, shall we?) outside
    LM> your source code, as is the case with i18n, pure substitution
    LM> doesn't require the people who edit them (for example,
    LM> translators) to know anything about python *or* even
    LM> programming.

It isn't always done that way though.  See Francois's very good
followup describing gettext vs. catgets.

    LM> Now, data formatting is another animal entirely. It's a way to
    LM> request one specific representation of a piece of data.

I agree!
-Barry



From barry@zope.com  Mon Jun 24 01:12:26 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sun, 23 Jun 2002 20:12:26 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
 <3D11B6F0.5000803@tismer.com>
 <200206201746.g5KHkwH04175@odiug.zope.com>
 <3D121EDB.6070501@tismer.com>
 <15637.7385.966341.14847@anthem.wooz.org>
 <000a01c21aa1$438bfde0$3da48490@neil>
 <15637.59966.161957.754620@anthem.wooz.org>
 <oq4rftvmra.fsf@titan.progiciels-bpi.ca>
Message-ID: <15638.25578.254353.531473@anthem.wooz.org>

>>>>> "FP" =3D=3D Fran=E7ois Pinard <pinard@iro.umontreal.ca> writes:

    FP> This is why the responsibilities between maintainers and
    FP> programmers ought to be well split.  If the maintainer feels
    FP> responsible for the work that is induced on the translation
    FP> teams by string changes, comfort is lost.  The maintainer
    FP> should do its work in all freedom, and the problem of later
    FP> reflecting tiny editorial changes into PO `msgstr' fully
    FP> pertains to translators, with the possible help of automatic
    FP> tools.  Translators should be prepared to such changes.  If
    FP> the split of responsibilities is not fully understood and
    FP> accepted, internationalisation becomes much heavier, in
    FP> practice, than it has to be.

Unfortunately, sometimes one person has to wear both hats and then we
see the tension between the roles.

    >> I18n'ing a program means you have to worry about a lot more
    >> things.  [...]

    FP> Internationalisation should not add a significant burden on
    FP> the programmer.  I mean, if there is something cumbersome in
    FP> the internationalisation of a string, then there is something
    FP> cumbersome in that string outside any internationalisation
    FP> context.

It may not be a significant burden, once the infrastructure is in
place and a rhythm is established, but it is still non-zero.
Little issues crop up all the time, like the fact that a message might
have the same English phrase but need to be distinguished for proper
translation in some other languages (gettext vs. catgets), or that the
translation is slightly different depending on where the message is
output (email, web, console), or dealing with localized formatting of
numbers, dates, and other values.  It's just stuff you have to keep in
mind and deal with, but it's not insurmountable.

I think the current Python tools for i18n'ing are pretty good, and the
bright side is that I'd still rather be developing an i18n'd program
in Python than in just about any other language.  One area that I
think we could do better in is in support of localizing dates,
currency, etc.  Here, Stephan Richter is laying some groundwork in the
Zope3 I18n project, possibly integrating IBM's ICU library into Python.


    http://www-124.ibm.com/icu/

-Barry



From groups@crash.org  Mon Jun 24 01:23:50 2002
From: groups@crash.org (Jason L. Asbahr)
Date: Sun, 23 Jun 2002 19:23:50 -0500
Subject: [Python-Dev] Playstation 2 and GameCube ports
In-Reply-To: <3D0FDB0A.EC53656@prescod.net>
Message-ID: <EIEFLCFECLLBKGPNJJIMOEGAIJAA.groups@crash.org>

Paul,

The PS2 Linux FAQ has a great answer to this:

What are the differences between the Linux (for PlayStation 2) development
environment and that used by professional game developers?

Professional game developers get access to a special version of the
PlayStation 2 hardware which contains more memory and extra debug
facilities. This hardware, known as the T10K, is a lot more expensive than a
commercial PlayStation 2 and is only available to licensed game developers.
If you are seriously interested in becoming a licensed game developer,
please see this link for North America and this link for Europe and
Australasia . In addition to the T10K, licensed game developers get
additional support which is part of the reason that the T10K is so much more
expensive than a PlayStation 2 console.

In terms of access to the PlayStation 2 hardware and libraries, Linux (for
PlayStation 2) offers an almost identical set of functionality to that
provided to licensed game developers. In fact the system manuals provided
with the Linux kit have identical content to 6 of the 7 system manuals
provided  to licensed developers. The missing information which is provided
to licensed developers and not to users of Linux (for PlayStation 2)
describes the hardware that controls the CD/DVD-ROM, SPU2 Audio chip and
other IO peripheral control hardware. This hardware functionality is still
available for use with the linux kit through a software interface called the
Runtime Environment.

The final major difference between the two is the operating system. A
licensed developer creates games for the PlayStation 2 which use a light
weight proprietary operating system kernel. This kernel offers much less
functionality than Linux, but has the advantage of offering slightly faster
access to the hardware.

In most cases, it is possible to get almost the same performance with Linux
(for PlayStation 2) and the professional game development tools.



-----Original Message-----
From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On
Behalf Of Paul Prescod
Sent: Tuesday, June 18, 2002 8:15 PM
To: Jason L. Asbahr
Cc: python-dev@python.org
Subject: Re: [Python-Dev] Playstation 2 and GameCube ports


"Jason L. Asbahr" wrote:
>
>...
>
> However, the hobbyist PS2/Linux upgrade kit for the retail PS2 unit
> may be acquired for $200 and Python could be used on that system
> as well.  Info at http://playstation2-linux.com

What do you lose by going this route? Obviously if this was good enough
there would be no need for developer boxes nor (I'd guess) for a special
port of Python.

 Paul Prescod


_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev




From tim.one@comcast.net  Mon Jun 24 03:24:42 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 23 Jun 2002 22:24:42 -0400
Subject: [Python-Dev] Behavior of matching backreferences
In-Reply-To: <20020623203006.A9783@ibook>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEJCPPAA.tim.one@comcast.net>

[Gustavo Niemeyer]
> I still think it should, because otherwise the "^(a)?b\1$" can never be
> used, and this expression will become "^((a)?)b\1$" if more than one
> character is needed.

Is that a real concern?  I mean that in the sense of whether you have an
actual application requiring that some multi-character bracketing string
either does or doesn't appear on both ends of a thing, and typing another
set of parens is a burden.  Both parts of that seem strained.

> But since nobody agrees with me, and both languages are doing it that
> way, I give up. :-)

That's wise <wink>.  It's not just Python and Perl, I expect you're going to
find this in every careful regexp package.  There's a painful discussion
buried here:

<http://standards.ieee.org/reading/ieee/interp/1003-2-92_int/pasc-1003.2-43.html>

wherein the POSIX committee debated their own ambiguous wording about
backreferences.  Their specific example is:  what should the regexp (in
Python notation, not POSIX)

    ^((.)*\2#)*

match in

    xx#yy##

?  Your example is hiding in there, on the "third iteration of the outer
loop".  The official POSIX interpretation was that it should match just the
first 6 characters, and not the trailing #,

    because in a third iteration of the outer subexpression, . would match
    nothing (as distinct from matching a null string) and hence \2 would
    match nothing.

Python and Perl agree, which wouldn't surprise you if you first implemented
a regexp engine with stinking backreferences <0.9 wink>.  The distinction
between "matched an empty string" and "didn't match anything" is night-&-day
inside an engine, and people skating on the edge (meaning using
backreferences at all!) quickly rely on the exact behavior this implies.
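
A quick interactive check of what the current engine does, POSIX example
included:

    >>> import re
    >>> print re.match(r'^(a)?b\1$', 'b')       # \1 names a group that matched nothing
    None
    >>> re.match(r'^((a)?)b\1$', 'b').group(0)  # outer group matched an empty string
    'b'
    >>> re.match(r'((.)*\2#)*', 'xx#yy##').group(0)
    'xx#yy#'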

> Could you please reject the patch at SF?

I'm not sure which one you mean, so on your authority I'm going to reject
all patches at SF.  Whew!  This makes our job much easier <wink>.




From niemeyer@conectiva.com  Mon Jun 24 04:04:58 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Mon, 24 Jun 2002 00:04:58 -0300
Subject: [Python-Dev] Behavior of matching backreferences
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEJCPPAA.tim.one@comcast.net>
References: <20020623203006.A9783@ibook> <LNBBLJKPBEHFEDALKOLCIEJCPPAA.tim.one@comcast.net>
Message-ID: <20020624000458.A12114@ibook>

> > I still think it should, because otherwise the "^(a)?b\1$" can never be
> > used, and this expression will become "^((a)?)b\1$" if more than one
> > character is needed.
> 
> Is that a real concern?  I mean that in the sense of whether you have an
> actual application requiring that some multi-character bracketing string
> either does or doesn't appear on both ends of a thing, and typing another
> set of parens is a burden.  Both parts of that seem strained.

No, it isn't.  Partly because there is some way to implement this,
as Barry and you have shown, and partly because now *I* know it doesn't
work as I'd expect. :-))

Indeed, I've found it while implementing another feature which in my
opinion is really useful, and can't be easily achieved. But that's
something for another thread, another day.

[...]
> ?  Your example is hiding in there, on the "third iteration of the outer
> loop".  The official POSIX interpretation was that it should match just the
> first 6 characters, and not the trailing #,
> 
>     because in a third iteration of the outer subexpression, . would match
>     nothing (as distinct from matching a null string) and hence \2 would
>     match nothing.
[...]

Thanks for giving me a strong and detailed reason. I understand that
small issues can end up in endless discussions and different
implementations. I'm happy that the POSIX people thought about that
before me <2.0 wink>.

> > Could you please reject the patch at SF?
> 
> I'm not sure which one you mean, so on your authority I'm going to reject
> all patches at SF.  Whew!  This makes our job much easier <wink>.

That's good! You'll take back the time you wasted with me. ;-))

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From misa@redhat.com  Mon Jun 24 04:06:10 2002
From: misa@redhat.com (Mihai Ibanescu)
Date: Sun, 23 Jun 2002 23:06:10 -0400 (EDT)
Subject: [Python-Dev] Added SSL through HTTP proxies support to httplib.py
Message-ID: <Pine.LNX.4.44.0206232303190.11717-100000@roadrunner.devel.redhat.com>

Hello,

Can somebody please verify if the following patch makes enough sense to be 
accepted?

http://sourceforge.net/tracker/index.php?func=detail&aid=515003&group_id=5470&atid=305470

Thanks,
Misa




From neal@metaslash.com  Mon Jun 24 04:16:20 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sun, 23 Jun 2002 23:16:20 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
 <3D11B6F0.5000803@tismer.com>
 <200206201746.g5KHkwH04175@odiug.zope.com>
 <3D121EDB.6070501@tismer.com>
 <15637.7385.966341.14847@anthem.wooz.org>
 <000a01c21aa1$438bfde0$3da48490@neil>
 <15637.59966.161957.754620@anthem.wooz.org>
 <oq4rftvmra.fsf@titan.progiciels-bpi.ca> <15638.25578.254353.531473@anthem.wooz.org>
Message-ID: <3D168F04.CE03AD66@metaslash.com>

I'm pretty negative on string interpolation, I don't see it
as that useful or %()s as that bad.  But obviously, many others
do feel there is a problem.

I don't like the schism that $ vs. % would create.  Nor do
I like many other proposals.  So here is yet another proposal:

 * Add new builtin function interp() or some other name:
     def interp(format, uselocals=True, useglobals=True, dict={}, **kw)
 * use % as the format character and allow optional () or {}
   around the name
 * if this is acceptable, {name:format_modifiers} 
   could be added in the future

Code would then look like this:

	>>> x = 5
	>>> print interp('x = %x')
	x = 5
	>>> print interp('x = %(x)')
	x = 5
	>>> print interp('x = %{x}')
	x = 5
	>>> print interp('y = %y')
	NameError: name 'y' is not defined
	>>> print interp('y = %y', dict={'y': 10})
	y = 10
	>>> print interp('y = %y', y=10)
	y = 10

This form:
 * eliminates any hint of $
 * is similar to current % handling, 
   but hopefully fixes the current deficiencies
 * allows locals and/or globals to be used
 * allows any dictionary/mapping to be used
 * allows keywords
 * is extensible to allow for formatting in the future
 * doesn't require much extra typing or thought
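
For concreteness, here's one rough pure-Python sketch of what I have in
mind (the sys._getframe trick and the regex are just one way to fake it,
not a finished implementation):

    import re
    import sys

    def interp(format, uselocals=True, useglobals=True, dict={}, **kw):
        frame = sys._getframe(1)          # the caller's namespaces
        names = {}
        if useglobals:
            names.update(frame.f_globals)
        if uselocals:
            names.update(frame.f_locals)
        names.update(dict)
        names.update(kw)

        def repl(match):
            name = match.group(1) or match.group(2) or match.group(3)
            if name not in names:
                raise NameError("name %r is not defined" % name)
            return str(names[name])

        # accept %name, %(name) or %{name}
        return re.sub(r'%(?:\((\w+)\)|\{(\w+)\}|(\w+))', repl, format)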

Now I'm sure everyone will tell me how awful this is. :-)

Neal

PS  I'm -0 on this proposal.  And I dislike the name interp.



From pinard@iro.umontreal.ca  Mon Jun 24 05:02:35 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 24 Jun 2002 00:02:35 -0400
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
In-Reply-To: <15638.25578.254353.531473@anthem.wooz.org>
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
 <3D11B6F0.5000803@tismer.com>
 <200206201746.g5KHkwH04175@odiug.zope.com>
 <3D121EDB.6070501@tismer.com>
 <15637.7385.966341.14847@anthem.wooz.org>
 <000a01c21aa1$438bfde0$3da48490@neil>
 <15637.59966.161957.754620@anthem.wooz.org>
 <oq4rftvmra.fsf@titan.progiciels-bpi.ca>
 <15638.25578.254353.531473@anthem.wooz.org>
Message-ID: <oqn0tlz5lw.fsf@titan.progiciels-bpi.ca>

[Barry A. Warsaw]

>     FP> This is why the responsibilities between maintainers and
>     FP> programmers ought to be well split.

> Unfortunately, sometimes one person has to wear both hats and then we
> see the tension between the roles.

I have the same experience, having been for a good while the assigned
French translator for the packages I was maintaining.  But I was splitting
my roles rather carefully, with the precise purpose of seeing where the
tensions and problems lay, and then working at improving how interactions
go between the involved parties.

>     >> I18n'ing a program means you have to worry about a lot more
>     >> things.  [...]

>     FP> Internationalisation should not add a significant burden on
>     FP> the programmer.

> It may not be a significant burden, once the infrastructure is in
> place and a rhythm is established, but it is still non-zero.

The Mailman effort has been especially courageous, as it ought to address
many problems on which we did not accumulate much experience yet, but which
are inescapable in the long run.  For example, I guess you had to take care
of translating external HTML templates, considering some input aspects,
allowing on-the-fly language selection, and of course, looking into more
prosaic non-message "locale" concerns.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From barry@zope.com  Mon Jun 24 05:28:50 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 24 Jun 2002 00:28:50 -0400
Subject: [Python-Dev] I18n'ing a Python program (was Re: PEP 292, Simpler String Substitutions)
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com>
 <3D11B6F0.5000803@tismer.com>
 <200206201746.g5KHkwH04175@odiug.zope.com>
 <3D121EDB.6070501@tismer.com>
 <15637.7385.966341.14847@anthem.wooz.org>
 <000a01c21aa1$438bfde0$3da48490@neil>
 <15637.59966.161957.754620@anthem.wooz.org>
 <oq4rftvmra.fsf@titan.progiciels-bpi.ca>
 <15638.25578.254353.531473@anthem.wooz.org>
 <oqn0tlz5lw.fsf@titan.progiciels-bpi.ca>
Message-ID: <15638.40962.721153.934484@anthem.wooz.org>

>>>>> "FP" =3D=3D Fran=E7ois Pinard <pinard@iro.umontreal.ca> writes:

    >> It may not be a significant burden, once the infrastructure is
    >> in place and a rhythm is established, but it is still
    >> non-zero.

    FP> The Mailman effort has been especially courageous, as it ought
    FP> to address many problems on which we did not accumulate much
    FP> experience yet, but which are inescapable in the long run.
    FP> For example, I guess you had to take care of translating
    FP> external HTML templates, considering some input aspects,
    FP> allowing on-the-fly language selection, and of course, looking
    FP> into more prosaic non-message "locale" concerns.

Thanks, I think it's been valuable experience -- I certainly have
learned a lot!

One of the most painful areas has in fact been the translating of HTML
templates specifically because a template file is far too coarse a
granularity.  When I want to add a new widget to a template, I can
usually figure out where to add it in say, the Spanish or French
version, but it's nearly hopeless to try to add it to the Japanese
version. :)

Here, I hope Fred, Stephan Richter, and my efforts at i18n'ing Zope3's
Page Templates will greatly improve things.  It's early going but it
feels right.  It would mean you essentially have one version of the
template but you'd mark it up to designate the translatable messages,
and I think you'd end up integrating those with your Python source
catalogs (but maybe in a different domain?).  I'm not quite sure how
that would translate to plaintext templates (e.g. for email messages).

Input aspects are something neither MM nor Zope has (yet) adequately
addressed.  What I'm thinking of here are message footers in multiple
languages or say, a job description in multiple languages.  We'll have
to address these down the road.

I've already mentioned about efforts in Zopeland for localizing
non-message issues.  On-the-fly language selection is something that I
have had to deal with in MM, and Python's class-based gettext API is
essential here, and works well.  Zope3 and MM take slightly different
u/i tacks, with Zope3 doing better browser language negotiation and MM
allowing for explicit overrides in forms.  Some combination of the two
is probably where web-based applications want to head.
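
(For the archives, by the class-based API I mean roughly this kind of
thing -- the domain and locale directory here are made-up examples:)

    import gettext

    def set_language(language):
        # pick a translation at run time; fall back to identity if none found
        try:
            t = gettext.translation('myapp', '/usr/share/locale',
                                    languages=[language])
        except IOError:
            t = gettext.NullTranslations()
        return t.ugettext

    _ = set_language('fr')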

now-to-make-time-to-finish-MM2.1-ly y'rs,
-Barry



From nhodgson@bigpond.net.au  Mon Jun 24 09:22:14 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Mon, 24 Jun 2002 18:22:14 +1000
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <NBBBIOJPGKJEKIECEMCBKEDONFAA.pobrien@orbtech.com><3D11B6F0.5000803@tismer.com><200206201746.g5KHkwH04175@odiug.zope.com><3D121EDB.6070501@tismer.com><15637.7385.966341.14847@anthem.wooz.org><000a01c21aa1$438bfde0$3da48490@neil> <15637.59966.161957.754620@anthem.wooz.org>
Message-ID: <005201c21b58$3f6d39b0$3da48490@neil>

[Doh! Forgot to send to the list as well - shouldn't try to use a computer
when I have a cold]

Barry A. Warsaw:

> Trust me, I'm not.  Then again, maybe it's just me, or my limited
> experience w/ i18n'd source code, but being forced to pass in the
> explicit bindings is a big burden in terms of maintainability and
> readability.

   My main experience in internationalization has been in GUI apps where
there is often a strong separation between the localizable static text and
the variable text. In dialogs you often have:

Static localized description: [Editable variable]

   In my editor SciTE, which currently has about 15 translations, of the 177
localizable strings, only 9 are messages that require insertion of variables
and all of those require only one variable. Most of the strings are menu or
dialog items. Maybe I'm just stingy with messages :-)

   On the largest sensibly internationalized project I have worked on (7
years old and with a maximum of 20 research/design/develop/test staff when
I left), I would estimate that less than 50 messages required variable
substitution.

   The amount of effort that went into ensuring that the messages were
accurate, meaningful and understandable outweighed by several orders of
magnitude any typing or reading work.

   Neil





From s_lott@yahoo.com  Mon Jun 24 15:40:49 2002
From: s_lott@yahoo.com (Steven Lott)
Date: Mon, 24 Jun 2002 07:40:49 -0700 (PDT)
Subject: "Julian" ambiguity (was Re: [Python-Dev] strptime recapped)
In-Reply-To: <DBUQJHQL1W83B032XUP4Y62A9WTPNM.3d152f3a@Egil>
Message-ID: <20020624144049.9418.qmail@web9603.mail.yahoo.com>

Thanks for the amplification - that was precisely my point.  
When proposing that strptime() parse "Julian" dates, some more
precise definition of Julian is required.


--- John Machin <sjmachin@lexicon.net> wrote:
> 21/06/2002 10:27:22 PM, Steven Lott <s_lott@yahoo.com> wrote:
> 
> >
> >Generally, "Julian" dates are really just the day number
> within
> >a given year; this is a simple special case of the more
> general
> >(and more useful) approach that R-D use.
> >
> >See
> >http://emr.cs.iit.edu/home/reingold/calendar-book/index.shtml
> >
> >for more information.
> >
> 
> AFAICT from perusing their book, R-D use the term
> "julian-date" to mean a tuple (year, month, day) in the Julian
> calendar.
> The International Astro. Union uses "Julian date" to mean an
> instant in time measured in days (and fraction therof) since
> noon on 1 January -4712 (Julian ("proleptic") calendar). See
> for example 
> http://maia.usno.navy.mil/iauc19/iaures.html#B1
> 
> A "Julian day number" (or "JDN") is generally used to mean an
> ordinal day number counting day 0 as Julian_calendar(-4712, 1,
> 1) as above. Some folks use JDN to include the IAU's
> instant-in-time.
> 
> Some folks use "julian day" to mean a day within a year (range
> 0-365 *or* 1-366 (all inclusive)). This terminology IMO should
> be severely deprecated. The concept is best described as
> something like "day of year", with a 
> specification of the origin (0 or 1) when appropriate. 
> 
> It is not clear from the first of your sentences quoted above
> exactly what you are calling a "Julian date": (a) the tuple
> (given_year, day_of_year) with calendar not specified or (b)
> just day_of_year. However either answer seems 
> IMO to be an inappropriate addition to the terminology
> confusion.
> 
> Cheers,
> John
> 
> 
> 
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev


=====
--
S. Lott, CCP :-{)
S_LOTT@YAHOO.COM
http://www.mindspring.com/~slott1
Buccaneer #468: KaDiMa

Macintosh user: drinking upstream from the herd.




From lalo@laranja.org  Mon Jun 24 17:07:19 2002
From: lalo@laranja.org (Lalo Martins)
Date: Mon, 24 Jun 2002 13:07:19 -0300
Subject: [Python-Dev] New subscriber
In-Reply-To: <20020624145315.8773.62199.Mailman@mail.python.org>
References: <20020624145315.8773.62199.Mailman@mail.python.org>
Message-ID: <20020624160719.GS25927@laranja.org>

On Mon, Jun 24, 2002 at 10:53:15AM -0400, python-dev-request@python.org wrote:
> If you are a new subscriber, please take the time to introduce yourself
> briefly in your first post.

Hmm, ok.

My name is Fernando Martins, known as Lalo. I'm currently 27 and I live in
Brazil. I've been a Python advocate since Bruce Perens introduced me to it
in, what, '96.

I've been working professionally with Zope - ranging from site building to
training, from infrastructure hacking to consulting - since mid-99, when I
selected it from a range of options due to the fact that it was in Python.

In the course of zope training and consulting, I take every opportunity to
give python courses and talks.

I never subscribed to python-dev before because I was very involved in the
Zope community and preferred to keep my mind out of lower-level stuff, but
now I find there are lots of interesting things going on and I'd prefer to
be a part of it. ;-)

(Also, Zope is very cool but the web marketing can get tiresome - if I can
find a way, I'm planning to retire from it at least in part and spend more
time doing plain Python.)

[]s,
                                               |alo
                                               +----
--
  It doesn't bother me that people say things like
   "you'll never get anywhere with this attitude".
   In a few decades, it will make a good paragraph
      in my biography. You know, for a laugh.
--
http://www.laranja.org/                mailto:lalo@laranja.org
         pgp key: http://www.laranja.org/pessoal/pgp

Eu jogo RPG! (I play RPG)         http://www.eujogorpg.com.br/
Python Foundry Guide http://www.sf.net/foundry/python-foundry/



From barry@zope.com  Mon Jun 24 19:04:31 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 24 Jun 2002 14:04:31 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <20020619024806.GA7218@lilith.my-fqdn.de>
 <20020619203332.GA9758@gerg.ca>
Message-ID: <15639.24367.371777.509082@anthem.wooz.org>

>>>>> "GW" == Greg Ward <gward@python.net> writes:

    GW> No, library_dirs is for good old -L.  AFAIK it works fine.

    GW> For -R (or equivalent) you need runtime_library_dirs.  I'm not
    GW> sure if it works (or ever did).  I think it's a question of
    GW> knowing what magic options to supply to each compiler.
    GW> Probably it works (worked) on Solaris, since for once Sun got
    GW> things right and supplied a simple, obvious, working
    GW> command-line option -- namely -R.

runtime_library_dirs works perfectly for Linux and gcc, thanks.

-Barry



From barry@zope.com  Mon Jun 24 19:29:07 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 24 Jun 2002 14:29:07 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
 <15632.52766.822003.689689@anthem.wooz.org>
 <m3znxrq9ht.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15639.25843.562043.559385@anthem.wooz.org>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    MvL> barry@zope.com (Barry A. Warsaw) writes:

    >> Really?  You know the path for the -R/--rpath flag, so all you
    >> need is the magic compiler-specific incantation, and distutils
    >> already (or /should/ already) know that.

    MvL> Yes, but you don't know whether usage of -R is appropriate.
    MvL> If the installed library is static, -R won't be needed.

And shouldn't hurt.
    
    MvL> If then the target directory recorded with -R happens to be
    MvL> on an unavailable NFS server at run-time (on a completely
    MvL> different network), you cannot import the library module
    MvL> anymore, which would otherwise work perfectly fine.

Do people still use NFS servers to share programs?  I thought big
cheap disks and RPMs did away with all that. :)

I believe that -R/-rpath adds directories to runtime search paths so
if the NFS directory was unmounted, ld.so should still be able to
locate the shared library through fallback means.  That may fail too,
but oh well.

One issue on Solaris may be that -- according to the GNU ld docs --
the runtime search path will be built from the -L options which we're
already passing, /unless/ -rpath is given, and this seems to be added
to help with NFS mounted directories on the -L specified path.  But
since I'm proposing that the -rpath directory be the same as the -L
path, I don't think it will make matters worse.

    MvL> We had big problems with recorded library directories over
    MvL> the years; at some point, the administrators decided to take
    MvL> the machine that had
    MvL> /usr/local/lib/gcc-lib/sparc-sun-solaris2.3/2.5.8 on it
    MvL> offline. They did not knew that they would thus make vim
    MvL> inoperable, which happened to be compiled with LD_RUN_PATH
    MvL> pointing to that directory - even though no library was ever
    MvL> needed from that directory.

Hmm.  Was the problem that the NFS server was unresponsive, or that
the directory was unmounted, but still searched?  If the former, then
maybe you do have a problem.  I've experienced hangs over the years
when NFS servers have been unresponsive (because the host was down and
the nfs mounts options weren't given to make this a soft error).  I
haven't used NFS in years though so my memory is rusty on the details.

    >> I disagree.  While the sysadmin should probably fiddle with
    >> /etc/ld.so.conf when he installs BerkeleyDB, it's not
    >> documented in the Sleepycat docs, so it's entirely possible
    >> that they haven't done it.

    MvL> I'm not asking for the administrator fiddle with
    MvL> ld.so.conf. Instead, I'm asking the administrator fiddle with
    MvL> Modules/Setup.

We've made it so easy to build a batteries-included Python that I
think it would be unfortunately not to do better just because we fear
that things /might/ go wrong in some strange environments.  I think
it's largely unnecessary to edit Modules/Setup these days, and since
we /know/ that BerkeleyDB is installed in a funky location not usually
on your ld.so path, I think we can take advantage of that to not
require editing Modules/Setup in this case too.

Our failure mode for bsddbmodule is so cryptic that it's very
difficult to figure out why it's not available.  I think this simple
change to setup.py[1] would improve the life for the average Python
programmer.  I'd be happy with a command line switch or envar to
disable the -R/--rpath addition.

Here's a compromise.  If LD_RUN_PATH is set at all (regardless of
value), don't add -R/--rpath.  Or add a --without-rpath switch to
configure.

    >> Note I'm not saying setting LD_RUN_PATH is the best approach,
    >> but it seemed like the most portable.  I couldn't figure out if
    >> distutils knew what the right compiler-specific switches are
    >> (i.e. "-R dir" on Solaris cc if memory serves, and "-Xlinker
    >> -rpath -Xlinker dir" for gcc, and who knows what for other Unix
    >> or <gasp> Windows compilers).

    MvL> LD_LIBRARY_PATH won't work for Windows compilers, either. To
    MvL> my knowledge, there is nothign equivalent on Windows.

Someone else will have to figure out the problems for Windows source
builders <wink>.  I'd like to make life just a little easier for Linux
and Unix users.  I think this change will do that.

-Barry

[1]

Index: setup.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/setup.py,v
retrieving revision 1.95
diff -u -r1.95 setup.py
--- setup.py	21 Jun 2002 14:48:38 -0000	1.95
+++ setup.py	24 Jun 2002 18:03:06 -0000
@@ -510,12 +510,14 @@
             if dbinc == 'db_185.h':
                 exts.append(Extension('bsddb', ['bsddbmodule.c'],
                                       library_dirs=[dblib_dir],
+                                      runtime_library_dirs=[dblib_dir],
                                       include_dirs=db_incs,
                                       define_macros=[('HAVE_DB_185_H',1)],
                                       libraries=[dblib]))
             else:
                 exts.append(Extension('bsddb', ['bsddbmodule.c'],
                                       library_dirs=[dblib_dir],
+                                      runtime_library_dirs=[dblib_dir],
                                       include_dirs=db_incs,
                                       libraries=[dblib]))
         else:



From barry@zope.com  Mon Jun 24 19:31:50 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 24 Jun 2002 14:31:50 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <15632.62272.946354.832044@localhost.localdomain>
Message-ID: <15639.26006.812728.291668@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    BAW> I'm still having build trouble on my RH6.1 system, but maybe
    BAW> it's just too old to worry about (I /really/ need to upgrade
    BAW> one of these days ;).

    [errors snipped]

    SM> I think you might have to define another CPP macro.  In my
    SM> post from last night about building dbmmodule.c I included

    |                                  define_macros=[('HAVE_BERKDB_H',None),
    |                                                 ('DB_DBM_HSEARCH',None)],

    SM> in the Extension constructor.  Maybe DB_DBM_HSEARCH is also
    SM> needed for older bsddb?  I have no trouble building though.

It must be an issue with this ancient RH6.1 system.  It builds fine on
Mandrake 8.1, and the time to dig into this would probably be better
spent upgrading to a more modern Linux distro.

But thanks,
-Barry



From barry@zope.com  Mon Jun 24 19:34:15 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 24 Jun 2002 14:34:15 -0400
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
 <15632.62564.638418.191453@localhost.localdomain>
 <20020619212559.GC18944@zot.electricrain.com>
 <15633.1338.367283.257786@localhost.localdomain>
 <20020620205041.GD18944@zot.electricrain.com>
 <m34rfxowsn.fsf@mira.informatik.hu-berlin.de>
 <15635.14235.79608.390983@beluga.mojam.com>
Message-ID: <15639.26151.752521.415108@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    Greg> should we keep the existing bsddb around as oldbsddb for
    Greg> users in that situation?

    Martin> I don't think so; users could always extract the module
    Martin> from older distributions if they want to.

    SM> I would prefer the old version be moved to lib-old (or
    SM> Modules-old?).  For people still running DB 2.x it shouldn't
    SM> be a major headache to retrieve.

Modules/old/ probably.  We wouldn't do anything with that directory
except use it as a placeholder for old extension source, right?

Do we care about preserving the cvs history for the current
bsddbmodule.c?  If so, we'll have to ask SF to do a cvs dance for us.
It may not be worth it.

-Barry



From barry@zope.com  Mon Jun 24 19:47:21 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 24 Jun 2002 14:47:21 -0400
Subject: [Python-Dev] Re: replacing bsddb with pybsddb's bsddb3 module
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
 <15632.62564.638418.191453@localhost.localdomain>
 <20020619212559.GC18944@zot.electricrain.com>
 <15633.1338.367283.257786@localhost.localdomain>
 <20020620205041.GD18944@zot.electricrain.com>
 <m34rfxowsn.fsf@mira.informatik.hu-berlin.de>
 <15635.14235.79608.390983@beluga.mojam.com>
 <20020621215444.GB30056@zot.electricrain.com>
Message-ID: <15639.26937.866896.917152@anthem.wooz.org>

>>>>> "GPS" == Gregory P Smith <greg@electricrain.com> writes:

    GPS> This sounds good.  Here's what i see on the plate to be done
    GPS> so far:

    GPS> 1) move the existing Modules/bsddbmodule.c to a new
    GPS> Modules-old or directory.

mkdir Modules/old (or Modules/extensions-old)
mv Modules/bsddbmodule.c Modules/old
    
    GPS> 2) create a new Lib/bsddb
    GPS> directory containing bsddb3/bsddb3/*.py from the pybsddb
    GPS> project.

+1
    
    GPS> 3) create a new Modules/bsddb directory containing
    GPS> bsddb3/src/* from the pybsddb project (the files should
    GPS> probably be renamed to _bsddbmodule.c and
    GPS> bsddbmoduleversion.h for consistent naming)

I don't think you need to create a new directory under Modules for
this; it's just two files.  Probably Modules/_bsddbmodule.c and
Modules/_bsddbversion.h are fine.

Also, for backwards compatibility, won't Lib/bsddb/__init__.py need to
do "from _bsddb import btopen, error, hashopen, rnopen"?

    GPS> 4) place the pybsddb setup.py in the Modules/bsddb directory,
    GPS>    modifying it as needed.  OR modify the top level setup.py
    GPS> to understand how to build the pybsddb module.  (there is
    GPS> code in pybsddb's setup.py to locate the berkeleydb install
    GPS> and determine appropriate flags that should be cleaned up and
    GPS> carried on)

How much of Skip's recent changes to setup.py can be retargeted for
pybsddb?

    GPS> 5) modify the top level python setup.py to build the bsddb
    GPS> module as appropriate.  6) "everything else" including
    GPS> integrating documentation and pybsddb's large test suite.

What to do about the test suite is a good question.  pybsddb's is
/much/ more extensive, and I wouldn't want to lose that, but I'm also
not sure I'd want it to run during a normal regrtest.

Here's an idea: leave test_bsddb as is and add pybsddb's as
test_all_bsddb.py.  Then add a "-u all_bsddb" to regrtest's resource
flags.

    GPS> How do we want future bsddb module development to proceed?  I
    GPS> envision it either taking place 100% under the python
    GPS> project, or taking place as it is now in the pybsddb project
    GPS> with patches being fed to the python project as desired?  Any
    GPS> preferences?  [i prefer to not maintain the code in two
    GPS> places myself (ie: do it all under the python project)]

I'd like to see one more official release from the pybsddb project,
since its cvs has some very useful additions (important bug fixes plus
support for BerkeleyDB 4).  Then move all development over to the
Python project and let interested volunteers port critical patches
back to the pybsddb project.  If you add me as a developer on
pybsddb.sf.net, I'll volunteer to help.

-Barry



From oren-py-l@hishome.net  Mon Jun 24 21:01:40 2002
From: oren-py-l@hishome.net (Oren Tirosh)
Date: Mon, 24 Jun 2002 23:01:40 +0300
Subject: [Python-Dev] PEP 294: Type Names in the types Module
Message-ID: <20020624230140.B3555@hishome.net>

PEP: 294
Title: Type Names in the types Module
Version: $Revision: 1.1 $
Last-Modified: $Date: 2002/06/23 23:52:19 $
Author: oren at hishome.net (Oren Tirosh)
Status: Draft
Type: Standards track
Created: 19-Jun-2002
Python-Version: 2.3
Post-History:


Abstract

    This PEP proposes that symbols matching the type name should be
    added to the types module for all basic Python types in the types
    module:

        types.IntegerType -> types.int
        types.FunctionType -> types.function
        types.TracebackType -> types.traceback
         ...    

    The long capitalized names currently in the types module will be
    deprecated.

    With this change the types module can serve as a replacement for
    the new module.  The new module shall be deprecated and listed in
    PEP 4.


Rationale

    Using two sets of names for the same objects is redundant and
    confusing.

    In Python versions prior to 2.2 the symbols matching many type
    names were taken by the factory functions for those types.  Now
    all basic types have been unified with their factory functions and
    therefore the type names are available to be consistently used to
    refer to the type object.

    Most types are accessible as either builtins or in the new module
    but some types such as traceback and generator are only accessible
    through the types module under names which do not match the type
    name.  This PEP provides a uniform way to access all basic types
    under a single set of names.


Specification

    The types module shall pass the following test:

        import types
        for t in vars(types).values():
            if type(t) is type:
                assert getattr(types, t.__name__) is t

    The types 'class', 'instance method' and 'dict-proxy' have already
    been renamed to the valid Python identifiers 'classobj',
    'instancemethod' and 'dictproxy', making this possible.


Backward compatibility

    Because of their widespread use it is not planned to actually
    remove the long names from the types module in some future
    version.  However, the long names should be changed in
    documentation and library sources to discourage their use in new
    code.


Reference Implementation
 
    A reference implementation is available in SourceForge patch
    #569328: http://www.python.org/sf/569328
  

Copyright

    This document has been placed in the public domain.




From martin@v.loewis.de  Mon Jun 24 21:05:13 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 24 Jun 2002 22:05:13 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15639.25843.562043.559385@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
 <15632.52766.822003.689689@anthem.wooz.org>
 <m3znxrq9ht.fsf@mira.informatik.hu-berlin.de>
 <15639.25843.562043.559385@anthem.wooz.org>
Message-ID: <m3bsa0xx1i.fsf@mira.informatik.hu-berlin.de>

barry@zope.com (Barry A. Warsaw) writes:

>     MvL> If then the target directory recorded with -R happens to be
>     MvL> on an unavailable NFS server at run-time (on a completely
>     MvL> different network), you cannot import the library module
>     MvL> anymore, which would otherwise work perfectly fine.
> 
> Do people still use NFS servers to share programs?  I thought big
> cheap disks and RPMs did away with all that. :)

This was on Solaris, so no RPMs.

> I believe that -R/-rpath adds directories to runtime search paths so
> if the NFS directory was unmounted, ld.so should still be able to
> locate the shared library through fallback means.  That may fail too,
> but oh well.

Yes, but the startup time for the program increases dramatically - it
has to wait for the dead NFS server to timeout.

> One issue on Solaris may be that -- according to the GNU ld docs --
> the runtime search path will be built from the -L options which we're
> already passing, /unless/ -rpath is given, and this seems to be added

Where do the docs say that? I don't think this is the case, or ever
was ...

> to help with NFS mounted directories on the -L specified path.  But
> since I'm proposing that the -rpath directory be the same as the -L
> path, I don't think it will make matters worse.

Indeed, it wouldn't.

> Hmm.  Was the problem that the NFS server was unresponsive, or that
> the directory was unmounted, but still searched?  If the former, then
> maybe you do have a problem.  

Yes, that was the problem. Even with soft mounting, it will still take
time to timeout.

> We've made it so easy to build a batteries-included Python that I
> think it would be unfortunate not to do better just because we fear
> that things /might/ go wrong in some strange environments.  

That is a reasonable argument, and I've been giving similar arguments
in other cases, too, so I guess I should just stop complaining.

> Here's a compromise.  If LD_RUN_PATH is set at all (regardless of
> value), don't add -R/--rpath.  Or add a --without-rpath switch to
> configure.

I guess we don't need to compromise, and that approach is *very* cryptic,
so I'd rather avoid it.

It looks like the current bsddb module is going to go away, anyway, so
there is no need to tweak the current configuration that much. I don't
know what the bsddb3 build procedure is, but any approach you come up
with now probably needs to be redone after pybsddb3 integration.

Regards,
Martin



From barry@zope.com  Mon Jun 24 21:24:27 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 24 Jun 2002 16:24:27 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
 <15632.52766.822003.689689@anthem.wooz.org>
 <m3znxrq9ht.fsf@mira.informatik.hu-berlin.de>
 <15639.25843.562043.559385@anthem.wooz.org>
 <m3bsa0xx1i.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15639.32763.103711.902632@anthem.wooz.org>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    >> Do people still use NFS servers to share programs?  I thought
    >> big cheap disks and RPMs did away with all that. :)

    MvL> This was on Solaris, so no RPMs.

I know, I was kind of joking.  But even Solaris has pkg, though I
don't know if it's in nearly as widespread use as Linux packages.

    >> I believe that -R/-rpath adds directories to runtime search
    >> paths so if the NFS directory was unmounted, ld.so should still
    >> be able to locate the shared library through fallback means.
    >> That may fail too, but oh well.

    MvL> Yes, but the startup time for the program increases
    MvL> dramatically - it has to wait for the dead NFS server to
    MvL> timeout.

Yeah that would suck.  I wonder if that would only affect imports of
bsddb though since the Python executable itself wouldn't be linked
w/-R.

    >> One issue on Solaris may be that -- according to the GNU ld
    >> docs -- the runtime search path will be built from the -L
    >> options which we're already passing, /unless/ -rpath is given,
    >> and this seems to be added

    MvL> Where do the docs say that? I don't think this is the case,
    MvL> or ever was ...

It's in the GNU ld info page under Options:

`-rpath DIR'
     [...]

     The `-rpath' option may also be used on SunOS.  By default, on
     SunOS, the linker will form a runtime search patch out of all the
     `-L' options it is given.  If a `-rpath' option is used, the
     runtime search path will be formed exclusively using the `-rpath'
     options, ignoring the `-L' options.  This can be useful when using
     gcc, which adds many `-L' options which may be on NFS mounted
     filesystems.

Reading it again now, it's not clear if "SunOS" also means "Solaris".

    >> We've made it so easy to build a batteries-included Python that
    >> I think it would be unfortunate not to do better just because
    >> we fear that things /might/ go wrong in some strange
    >> environments.

    MvL> That is a reasonable argument, and I've been giving similar
    MvL> arguments in other cases, too, so I guess I should just stop
    MvL> complaining.

    >> Here's a compromise.  If LD_RUN_PATH is set at all (regardless
    >> of value), don't add -R/--rpath.  Or add a --without-rpath
    >> switch to configure.

    MvL> I guess we don't need to compromise, and that approach is *very*
    MvL> cryptic, so I'd rather avoid it.

Cool.  I'll commit the change.

    MvL> It looks like the current bsddb module is going to go away,
    MvL> anyway, so there is no need to tweak the current
    MvL> configuration that much. I don't know what the bsddb3 build
    MvL> procedure is, but any approach you come up with now probably
    MvL> needs to be redone after pybsddb3 integration.

I suspect we'll need /something/ like this once pybsddb's integrated,
but I'll definitely be testing it once Greg does the integration.  I
doubt pybsddb's build process is going to just drop into place, and I
suspect it'll actually be easier.

Thanks,
-Barry



From bac@OCF.Berkeley.EDU  Mon Jun 24 22:02:27 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 24 Jun 2002 14:02:27 -0700 (PDT)
Subject: "Julian" ambiguity (was Re: [Python-Dev] strptime recapped)
In-Reply-To: <20020624144049.9418.qmail@web9603.mail.yahoo.com>
Message-ID: <Pine.SOL.4.44.0206241358540.24327-100000@death.OCF.Berkeley.EDU>



[Steven Lott]

> Thanks for the amplification - that was precisely my point.
> When proposing that strptime() parse "Julian" dates, some more
> precise definition of Julian is required.

[snip]

strptime just follows strftime's definition of a Julian day, which is the
day of the year (Jan. 1 counts as day 1).  The definition of what type of
Julian value strptime parses is out of my hands, since I just follow the
format directives that strftime uses.
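
A quick interactive check of that definition, using nothing but the time
module (the date is arbitrary; tm_yday is exactly the value that strftime's
%j directive reports):

    >>> import time
    >>> t = time.mktime((2002, 6, 24, 12, 0, 0, 0, 0, -1))
    >>> time.localtime(t)[7]     # tm_yday: Jan. 1 counts as day 1
    175
    >>> time.strftime("%j", time.localtime(t))
    '175'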

But when Guido implements the new datetime type this argument will become
moot, since neither of the versions he is considering includes any Julian
days or dates.  There could be functions, though (mine could actually be
used), that calculate various Julian values, and those can be abundantly
clear about what they return.

-Brett




From bac@OCF.Berkeley.EDU  Mon Jun 24 22:17:08 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 24 Jun 2002 14:17:08 -0700 (PDT)
Subject: [Python-Dev] New Subscriber Introduction
Message-ID: <Pine.SOL.4.44.0206241405010.24327-100000@death.OCF.Berkeley.EDU>

Uh, I realize this is a little late, but I didn't thoroughly read the
intro email for the list and so I didn't realize this was requested until
lalo sent his email.  Anyway, better late than never.

So my name is Brett Cannon.  I am a recent graduate of the philosophy
program here at UC Berkeley.  I am taking a year off from school (and
apparently employment =P) while I apply to grad school in hopes of
pursuing a masters or doctorate in CS.  I also hope to discover an area
of CS that I love above all else during this year so that I can stop
jumping between different areas of programming (I have the slight issue of
wanting to be the best that I can be at everything, and this jumping around
is not helping with that =).

I discovered Python in Fall 2000.  I was trying to choose a language to use
to teach myself OOP before I took my first CS course here at Cal.  I have
been using Python pretty exclusively since then, except for CS coursework;
Python spoiled me and didn't help when I had to use Scheme, Lisp, or Java
in my classes.

The only grumblings anyone has heard out of me as yet on this list are over
my Python implementation of strptime.  I do plan to stay on this list,
though, even after this is resolved, and to be as involved as I can
(which is going to be limited until I get off my rear and really dive
into the C source).

-Brett C.




From skip@pobox.com  Mon Jun 24 23:48:44 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 24 Jun 2002 17:48:44 -0500
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15639.25843.562043.559385@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
 <15632.52766.822003.689689@anthem.wooz.org>
 <m3znxrq9ht.fsf@mira.informatik.hu-berlin.de>
 <15639.25843.562043.559385@anthem.wooz.org>
Message-ID: <15639.41420.988942.868137@12-248-11-90.client.attbi.com>

Just a quick note to let you all know you've completely lost me with all
this -R stuff.  If someone would like to implement this, the now-closed
patch is at

    http://python.org/sf/553108

Just reopen it and assign it to yourself.  A quick summary of this thread
added to the bug report would probably be a good idea.

Skip



From skip@pobox.com  Mon Jun 24 23:50:37 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 24 Jun 2002 17:50:37 -0500
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15639.26151.752521.415108@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
 <15632.62564.638418.191453@localhost.localdomain>
 <20020619212559.GC18944@zot.electricrain.com>
 <15633.1338.367283.257786@localhost.localdomain>
 <20020620205041.GD18944@zot.electricrain.com>
 <m34rfxowsn.fsf@mira.informatik.hu-berlin.de>
 <15635.14235.79608.390983@beluga.mojam.com>
 <15639.26151.752521.415108@anthem.wooz.org>
Message-ID: <15639.41533.776854.272767@12-248-11-90.client.attbi.com>

    SM> I would prefer the old version be moved to lib-old (or
    SM> Modules-old?).  For people still running DB 2.x it shouldn't be a
    SM> major headache to retrieve.

    BAW> Modules/old/ probably.  We wouldn't do anything with that directory
    BAW> except use it as a placeholder for old extension source, right?

Sounds good to me.

    BAW> Do we care about preserving the cvs history for the current
    BAW> bsddbmodule.c?  If so, we'll have to ask SF to do a cvs dance for
    BAW> us.  It may not be worth it.

I think it would be worthwhile.  Alternatively, you could cvs remove it, then
add it to Modules/old with a note to check the Attic for older revision
notes.

Skip




From barry@zope.com  Mon Jun 24 23:58:05 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 24 Jun 2002 18:58:05 -0400
Subject: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
 <15632.52766.822003.689689@anthem.wooz.org>
 <m3znxrq9ht.fsf@mira.informatik.hu-berlin.de>
 <15639.25843.562043.559385@anthem.wooz.org>
 <15639.41420.988942.868137@12-248-11-90.client.attbi.com>
Message-ID: <15639.41981.672105.879289@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    SM> Just a quick note to let you all know you've completely lost
    SM> me with all this -R stuff.  If someone would like to implement
    SM> this, the now-closed patch is at

    SM>     http://python.org/sf/553108

    SM> Just reopen it and assign it to yourself.  A quick summary of
    SM> this thread added to the bug report would probably be a good
    SM> idea.

Actually, I think we're now good to go, although we'll need to revisit
this once Greg starts w/ the integration of pybsddb.

Thanks Skip,
-Barry



From barry@zope.com  Mon Jun 24 23:59:24 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 24 Jun 2002 18:59:24 -0400
Subject: [pybsddb] Re: [Python-Dev] Please give this patch for building bsddb a try
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <20020611203906.V6026@phd.pp.ru>
 <15631.61100.561824.480935@anthem.wooz.org>
 <200206191239.g5JCdbk01466@pcp02138704pcs.reston01.va.comcast.net>
 <15632.62564.638418.191453@localhost.localdomain>
 <20020619212559.GC18944@zot.electricrain.com>
 <15633.1338.367283.257786@localhost.localdomain>
 <20020620205041.GD18944@zot.electricrain.com>
 <m34rfxowsn.fsf@mira.informatik.hu-berlin.de>
 <15635.14235.79608.390983@beluga.mojam.com>
 <15639.26151.752521.415108@anthem.wooz.org>
 <15639.41533.776854.272767@12-248-11-90.client.attbi.com>
Message-ID: <15639.42060.171745.132635@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    BAW> Do we care about preserving the cvs history for the current
    BAW> bsddbmodule.c?  If so, we'll have to ask SF to do a cvs dance
    BAW> for us.  It may not be worth it.

    SM> I think it would be worthwhile.  Alternatively, you could cvs
    SM> remove it, then add it to Modules/old with a note to check the
    SM> Attic for older revision notes.

That would be fine with me.
-Barry



From mwh@python.net  Tue Jun 25 00:09:38 2002
From: mwh@python.net (Michael Hudson)
Date: 25 Jun 2002 00:09:38 +0100
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: Oren Tirosh's message of "Mon, 24 Jun 2002 23:01:40 +0300"
References: <20020624230140.B3555@hishome.net>
Message-ID: <2mvg888ea5.fsf@starship.python.net>

Oren Tirosh <oren-py-l@hishome.net> writes:

> Abstract
> 
>     This PEP proposes that symbols matching the type name should be
>     added to the types module for all basic Python types in the types
>     module:
> 
>         types.IntegerType -> types.int
>         types.FunctionType -> types.function
>         types.TracebackType -> types.traceback
>          ...    
> 
>     The long capitalized names currently in the types module will be
>     deprecated.

Um, can I be a little confused?  If you are writing code that you know
will be run in 2.2 and later, you write

   isinstance(obj, int)

If you want to support 2.1 and so on, you write 

   isinstance(obj, types.IntType)

What would writing 

   isinstance(obj, types.int)

ever gain you except restricting execution to 2.3+?

I mean, I don't have any real opinion *against* this pep, I just don't
really see why anyone would care...

Cheers,
M.

-- 
  it's not that perl programmers are idiots, it's that the language
  rewards idiotic behavior in a  way that no other language or tool
  has ever done                         -- Erik Naggum, comp.lang.lisp



From lalo@laranja.org  Tue Jun 25 00:18:45 2002
From: lalo@laranja.org (Lalo Martins)
Date: Mon, 24 Jun 2002 20:18:45 -0300
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <2mvg888ea5.fsf@starship.python.net>
References: <20020624230140.B3555@hishome.net> <2mvg888ea5.fsf@starship.python.net>
Message-ID: <20020624231845.GT25927@laranja.org>

Check the rationale:

| Most types are accessible as either builtins or in the new module but some
| types such as traceback and generator are only accessible through the types
| module under names which do not match the type name.  This PEP provides a
| uniform way to access all basic types under a single set of names.


[]s,
                                               |alo
                                               +----
--
  It doesn't bother me that people say things like
   "you'll never get anywhere with this attitude".
   In a few decades, it will make a good paragraph
      in my biography. You know, for a laugh.
--
http://www.laranja.org/                mailto:lalo@laranja.org
         pgp key: http://www.laranja.org/pessoal/pgp

Eu jogo RPG! (I play RPG)         http://www.eujogorpg.com.br/
Python Foundry Guide http://www.sf.net/foundry/python-foundry/



From greg@cosc.canterbury.ac.nz  Tue Jun 25 02:07:15 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 25 Jun 2002 13:07:15 +1200 (NZST)
Subject: [Python-Dev] Re: String substitution: compile-time versus runtime
In-Reply-To: <oq4rfwkb4n.fsf@titan.progiciels-bpi.ca>
Message-ID: <200206250107.NAA08919@s454.cosc.canterbury.ac.nz>

pinard@iro.umontreal.ca:

> I really, really think that with enough and proper care, Python
> could be set so internationalisation of Python scripts is just
> unobtrusive routine.  There should not be one way to write Python when
> one does not internationalise, and another different way to use it
> when one internationalises.

As long as you have a Turing-complete programming language
available for constructing strings, there will always be
ways to write code that defies any straightforward means
of internationalisation.

Or in other words, if internationalisation is a goal, you'll
always have to keep it in mind when coding, one way
or another.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From kevin@koconnor.net  Tue Jun 25 02:33:18 2002
From: kevin@koconnor.net (Kevin O'Connor)
Date: Mon, 24 Jun 2002 21:33:18 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
Message-ID: <20020624213318.A5740@arizona.localdomain>

I often find myself needing priority queues in python, and I've finally
broken down and written a simple implementation.  Previously I've used
sorted lists (via bisect) to get the job done, but the heap code
significantly improves performance.  There are C-based implementations, but
the effort of compiling in an extension often isn't worth it.  I'm
including the code here for everyone's amusement.

Any chance something like this could make it into the standard python
library?  It would save a lot of time for lazy people like myself.  :-)

Cheers,
-Kevin


def heappush(heap, item):
    pos = len(heap)
    heap.append(None)
    while pos:
        parentpos = (pos - 1) / 2
        parent = heap[parentpos]
        if item <= parent:
            break
        heap[pos] = parent
        pos = parentpos
    heap[pos] = item

def heappop(heap):
    endpos = len(heap) - 1
    if endpos <= 0:
        return heap.pop()
    returnitem = heap[0]
    item = heap.pop()
    pos = 0
    while 1:
        child2pos = (pos + 1) * 2
        child1pos = child2pos - 1
        if child2pos < endpos:
            child1 = heap[child1pos]
            child2 = heap[child2pos]
            if item >= child1 and item >= child2:
                break
            if child1 > child2:
                heap[pos] = child1
                pos = child1pos
                continue
            heap[pos] = child2
            pos = child2pos
            continue
        if child1pos < endpos:
            child1 = heap[child1pos]
            if child1 > item:
                heap[pos] = child1
                pos = child1pos
        break
    heap[pos] = item
    return returnitem



-- 
 ------------------------------------------------------------------------
 | Kevin O'Connor                     "BTW, IMHO we need a FAQ for      |
 | kevin@koconnor.net                  'IMHO', 'FAQ', 'BTW', etc. !"    |
 ------------------------------------------------------------------------



From skip@pobox.com  Tue Jun 25 02:53:49 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 24 Jun 2002 20:53:49 -0500
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error
Message-ID: <15639.52525.481846.601961@12-248-8-148.client.attbi.com>

I just noticed in the development docs that when a timeout on a socket
occurs, socket.error is raised.  I rather liked the idea that a different
exception was raised for timeouts (I used Tim O'Malley's timeout_socket
module).  Making a TimeoutError exception a subclass of socket.error would
be fine so you can catch it with existing code, but I could see recovering
differently for a timeout as opposed to other possible errors:

    sock.settimeout(5.0)
    try:
        data = sock.recv(8192)
    except socket.TimeoutError:
        # maybe requeue the request
        ...
    except socket.error, codes:
        # some more drastic solution is needed
        ...

Skip



From skip@pobox.com  Tue Jun 25 03:00:49 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 24 Jun 2002 21:00:49 -0500
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020624213318.A5740@arizona.localdomain>
References: <20020624213318.A5740@arizona.localdomain>
Message-ID: <15639.52945.388250.264216@12-248-8-148.client.attbi.com>

    Kevin> I often find myself needing priority queues in python, and I've
    Kevin> finally broken down and written a simple implementation.

Hmmm...  I don't see a priority associated with items when you push them
onto the queue in heappush().  This seems somewhat different than my notion
of a priority queue.

Seems to me that you could implement the type of priority queue I'm thinking of
rather easily using a class that wraps a list of Queue.Queue objects.  Am I
missing something obvious?

-- 
Skip Montanaro
skip@pobox.com
consulting: http://manatee.mojam.com/~skip/resume.html



From kevin@koconnor.net  Tue Jun 25 03:59:41 2002
From: kevin@koconnor.net (Kevin O'Connor)
Date: Mon, 24 Jun 2002 22:59:41 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <15639.52945.388250.264216@12-248-8-148.client.attbi.com>; from skip@pobox.com on Mon, Jun 24, 2002 at 09:00:49PM -0500
References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com>
Message-ID: <20020624225941.A5798@arizona.localdomain>

On Mon, Jun 24, 2002 at 09:00:49PM -0500, Skip Montanaro wrote:
> 
>     Kevin> I often find myself needing priority queues in python, and I've
>     Kevin> finally broken down and written a simple implementation.
> 
> Hmmm...  I don't see a priority associated with items when you push them
> onto the queue in heappush().  This seems somewhat different than my notion
> of a priority queue.

Hi Skip,

I should have included a basic usage in my original email:

>>> t = []; heappush(t, 10); heappush(t, 20); heappush(t, 15); heappush(t, 5)
>>> print heappop(t), heappop(t), heappop(t), heappop(t)
20 15 10 5

The binary heap has the property that pushing takes O(log n) time and
popping takes O(log n) time.  One may push in any order and a pop() always
returns the greatest item in the list.

I don't explicitly associate a priority with every item in the queue -
instead I rely on the user having a __cmp__ operator defined on the items
(if the default does not suffice).

The same behavior can be obtained using sorted lists:
>>> from bisect import insort
>>> t = []; insort(t, 10); insort(t, 20); insort(t, 15); insort(t, 5)
>>> print t.pop(), t.pop(), t.pop(), t.pop()
20 15 10 5

But insort takes a lot more overhead on large lists.


> Seems to me that you could implement the type of priority queue I'm thinking of
> rather easily using a class that wraps a list of Queue.Queue objects.  Am I
> missing something obvious?

Perhaps I am, because I do not see how one would use Queue.Queue
efficiently for this task.

Cheers,
-Kevin

-- 
 ------------------------------------------------------------------------
 | Kevin O'Connor                     "BTW, IMHO we need a FAQ for      |
 | kevin@koconnor.net                  'IMHO', 'FAQ', 'BTW', etc. !"    |
 ------------------------------------------------------------------------



From zack@codesourcery.com  Tue Jun 25 04:06:09 2002
From: zack@codesourcery.com (Zack Weinberg)
Date: Mon, 24 Jun 2002 20:06:09 -0700
Subject: [Python-Dev] Improved tmpfile module
Message-ID: <20020625030609.GD13729@codesourcery.com>

Attached please find a rewritten and improved tmpfile.py.  The major
change is to make the temporary file names significantly harder to
predict.  This foils denial-of-service attacks, where a hostile
program floods /tmp with files named @12345.NNNN to prevent process
12345 from creating any temp files.  It also makes the race condition
inherent in tmpfile.mktemp() somewhat harder to exploit.

I also implemented three new interfaces:

(fd, name) = mkstemp(suffix="", binary=1): Creates a temporary file,
returning both an OS-level file descriptor open on it and its name.
This is useful in situations where you need to know the name of the
temporary file, but can't risk the race in mktemp.

name = mkdtemp(suffix=""): Creates a temporary directory, without
race.

file = NamedTemporaryFile(mode='w+b', bufsize=-1, suffix=""): This is
just the non-POSIX version of tmpfile.TemporaryFile() made available
on all platforms, and with the .path attribute documented.  It
provides a convenient way to get a temporary file with a name that
will be automatically deleted on close, and with a high-level file
object associated with it.
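
To make the new interfaces concrete, here is a minimal usage sketch (just an
illustration, assuming the attached file is saved and imported as tmpfile;
it only uses the signatures described above):

    import os, tmpfile

    # Low-level interface: we get the descriptor *and* the name, with no
    # mktemp()-style race between picking the name and creating the file.
    fd, name = tmpfile.mkstemp(suffix=".txt")
    os.write(fd, "scratch data\n")
    os.close(fd)
    os.unlink(name)

    # Race-free temporary directory.
    d = tmpfile.mkdtemp()
    os.rmdir(d)

    # High-level interface: a named file object, removed when closed.
    f = tmpfile.NamedTemporaryFile(suffix=".log")
    f.write("hello\n")
    print f.path        # the name is exposed as the .path attribute
    f.close()           # the file is unlinked here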

Finally, I tore out a lot of the posix/not-posix conditionals, relying
on the os module to provide open() and O_EXCL -- this should make all
the recommended interfaces race-safe on non-posix systems, which they
were not before.

Comments?  I would very much like to see something along these lines
in 2.3; I have an application that needs to be reliable in the face of
the aforementioned denial of service.

Please note that I wound up removing all the top-level 'del foo'
statements (cleaning up the namespace) as I could not figure out how
to do them properly.  I'm not a python guru.

zw

"""Temporary files and filenames."""

import os
from errno import EEXIST
from random import Random

__all__ = [
     "TemporaryFile", "NamedTemporaryFile",  # recommended (high level)
     "mkstemp", "mkdtemp",                   # recommended (low level)
     "mktemp", "gettempprefix",              # deprecated
     "tempdir", "template"                   # control
     ]

### Parameters that the caller may set to override the defaults.
tempdir = None

# _template contains an appropriate pattern for the name of each
# temporary file.

if os.name == 'nt':
    _template = '~%s~'
elif os.name in ('mac', 'riscos'):
    _template = 'Python-Tmp-%s'
else:
    _template = 'pyt%s' # better ideas?

### Recommended, user-visible interfaces.

_text_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL
if os.name == 'posix':
    _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL
else:
    _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL | os.O_BINARY

def mkstemp(suffix="", binary=1):
    """Function to create a named temporary file, with 'suffix' for
    its suffix.  Returns an OS-level handle to the file and the name,
    as a tuple.  If 'binary' is 1, the file is opened in binary mode,
    otherwise text mode (if this is a meaningful concept for the
    operating system in use).  In any case, the file is readable and
    writable only by the creating user, and executable by no one."""

    if binary: flags = _bin_openflags
    else: flags = _text_openflags

    while 1:
        name = _candidate_name(suffix)
        try:
            fd = os.open(name, flags, 0600)
            return (fd, name)
        except OSError, e:
            if e.errno == EEXIST:
                continue # try again
            raise

def mkdtemp(suffix=""):
    """Function to create a named temporary directory, with 'suffix'
    for its suffix.  Returns the name of the directory.  The directory
    is readable, writable, and searchable only by the creating user."""

    while 1:
        name = _candidate_name(suffix)
        try:
            os.mkdir(name, 0700)
            return name
        except OSError, e:
            if e.errno == EEXIST:
                continue # try again
            raise

class _TemporaryFileWrapper:
    """Temporary file wrapper

    This class provides a wrapper around files opened for temporary use.
    In particular, it seeks to automatically remove the file when it is
    no longer needed.
    """

    # Cache the unlinker so we don't get spurious errors at shutdown
    # when the module-level "os" is None'd out.  Note that this must
    # be referenced as self.unlink, because the name TemporaryFileWrapper
    # may also get None'd out before __del__ is called.
    unlink = os.unlink

    def __init__(self, file, path):
        self.file = file
        self.path = path
        self.close_called = 0

    def close(self):
        if not self.close_called:
            self.close_called = 1
            self.file.close()
            self.unlink(self.path)

    def __del__(self):
        self.close()

    def __getattr__(self, name):
        file = self.__dict__['file']
        a = getattr(file, name)
        if type(a) != type(0):
            setattr(self, name, a)
        return a

def NamedTemporaryFile(mode='w+b', bufsize=-1, suffix=""):

    """Create a named temporary file, with 'suffix' for its suffix.
    It will automatically be deleted when it is closed.  Pass 'mode'
    and 'bufsize' to fdopen.  Returns a file object; the name of the
    file is accessible as file.path."""

    if 'b' in mode: binary = 1
    else: binary = 0

    (fd, name) = mkstemp(suffix, binary)
    file = os.fdopen(fd, mode, bufsize)
    return _TemporaryFileWrapper(file, name)

if os.name != 'posix':
    # A file cannot be unlinked while open, so TemporaryFile
    # degenerates to NamedTemporaryFile.
    TemporaryFile = NamedTemporaryFile
else:
    def TemporaryFile(mode='w+b', bufsize=-1, suffix=""):
        """Create a temporary file.  It has no name and will not
        survive being closed; the 'suffix' argument is ignored. Pass
        'mode' and 'bufsize' to fdopen.  Returns a file object."""

        if 'b' in mode: binary = 1
        else: binary = 0

        (fd, name) = mkstemp(binary=binary)
        file = os.fdopen(fd, mode, bufsize)
        os.unlink(name)
        return file

### Deprecated, user-visible interfaces.

def mktemp(suffix=""):
    """User-callable function to return a unique temporary file name."""
    while 1:
        name = _candidate_name(suffix)
        if not os.path.exists(name):
            return name

def gettempprefix():
    """Function to calculate a prefix of the filename to use.

    This incorporates the current process id on systems that support such a
    notion, so that concurrent processes don't generate the same prefix.
    """

    global _template
    return (_template % `os.getpid()`) + '.'

### Threading gook.

try:
    from thread import allocate_lock
except ImportError:
    class _DummyMutex:
        def acquire(self): pass
        release = acquire
    def allocate_lock():
        return _DummyMutex()
    del _DummyMutex

_init_once_lock = allocate_lock()

def _init_once(var, constructor):
    """If 'var' is not None, initialize it to the return value from
    'constructor'.  Do this exactly once, no matter how many threads
    call this routine.

    FIXME: How would I cause 'var' to be passed by reference to this
    routine, so that the caller can write simply
        _init_once(foo, make_foo)
    instead of
        foo = _init_once(foo, make_foo)
    ?"""

    # Check once outside the lock, so we can avoid acquiring it if
    # the variable has already been initialized.
    if var is not None:
        return var

    try:
        _init_once_lock.acquire()
        # Check again inside the lock, in case someone else got
        # here first.
        if var is None:
            var = constructor()
    finally:
        _init_once_lock.release()
    return var

### Internal routines and data.

_seq = None

def _candidate_name(suffix):
    """Return a candidate temporary name in 'tempdir' (global) ending
    with 'suffix'."""

    # We have to make sure that _seq and tempdir are initialized only
    # once, even in the presence of multiple threads of control.
    global _seq
    global tempdir
    _seq = _init_once(_seq, _RandomFilenameSequence)
    tempdir = _init_once(tempdir, _gettempdir)

    # Most of the work is done by _RandomFilenameSequence.
    return os.path.join(tempdir, _seq.get()) + suffix

class _RandomFilenameSequence:
    characters = (  "abcdefghijklmnopqrstuvwxyz"
                  + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                  + "0123456789-_")

    def __init__(self):
        self.mutex = allocate_lock()
        self.rng = Random()

    def get(self):
        global _template

        # Only one thread can call into the RNG at a time.
        self.mutex.acquire()

        c = self.characters
        r = self.rng

        letters = ''.join([r.choice(c), r.choice(c), r.choice(c),
                           r.choice(c), r.choice(c), r.choice(c)])
        self.mutex.release()

        return (_template % letters)

# XXX This tries to be not UNIX specific, but I don't know beans about
# how to choose a temp directory or filename on MS-DOS or other
# systems so it may have to be changed...

# _gettempdir deduces whether a candidate temp dir is usable by
# trying to create a file in it, and write to it.  If that succeeds,
# great, it closes the file and unlinks it.  There's a race, though:
# the *name* of the test file it tries is the same across all threads
# under most OSes (Linux is an exception), and letting multiple threads
# all try to open, write to, close, and unlink a single file can cause
# a variety of bogus errors (e.g., you cannot unlink a file under
# Windows if anyone has it open, and two threads cannot create the
# same file in O_EXCL mode under Unix).  The simplest cure is to serialize
# calls to _gettempdir, which is done above in _candidate_name().

def _gettempdir():
    """Function to calculate the directory to use."""
    try:
        pwd = os.getcwd()
    except (AttributeError, os.error):
        pwd = os.curdir
    attempdirs = ['/tmp', '/var/tmp', '/usr/tmp', pwd]
    if os.name == 'nt':
        attempdirs.insert(0, 'C:\\TEMP')
        attempdirs.insert(0, '\\TEMP')
    elif os.name == 'mac':
        import macfs, MACFS
        try:
            refnum, dirid = macfs.FindFolder(MACFS.kOnSystemDisk,
                                             MACFS.kTemporaryFolderType, 1)
            dirname = macfs.FSSpec((refnum, dirid, '')).as_pathname()
            attempdirs.insert(0, dirname)
        except macfs.error:
            pass
    elif os.name == 'riscos':
        scrapdir = os.getenv('Wimp$ScrapDir')
        if scrapdir:
            attempdirs.insert(0, scrapdir)
    for envname in 'TMPDIR', 'TEMP', 'TMP':
        if os.environ.has_key(envname):
            attempdirs.insert(0, os.environ[envname])
    testfile = gettempprefix() + 'test'
    for dir in attempdirs:
        try:
            filename = os.path.join(dir, testfile)
            fd = os.open(filename,
                         os.O_RDWR | os.O_CREAT | os.O_EXCL, 0700)
            fp = os.fdopen(fd, 'w')
            fp.write('blat')
            fp.close()
            os.unlink(filename)
            del fp, fd
            return dir
        except (IOError, OSError):
            pass

    msg = "Can't find a usable temporary directory amongst " + `attempdirs`
    raise IOError, msg



From skip@pobox.com  Tue Jun 25 05:09:32 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 24 Jun 2002 23:09:32 -0500
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020624225941.A5798@arizona.localdomain>
References: <20020624213318.A5740@arizona.localdomain>
 <15639.52945.388250.264216@12-248-8-148.client.attbi.com>
 <20020624225941.A5798@arizona.localdomain>
Message-ID: <15639.60668.591252.466454@12-248-8-148.client.attbi.com>

    Kevin> I don't explicitly associate a priority with every item in the
    Kevin> queue - instead I rely on the user having a __cmp__ operator
    Kevin> defined on the items (if the default does not suffice).

That's what I missed.

    >> Seems to me that you could implement the type of priority queue I'm
    >> thinking of rather easily using a class that wraps a list of Queue.Queue
    >> objects.  Am I missing something obvious?

    Kevin> Perhaps I am, because I do not see how one would use Queue.Queue
    Kevin> efficiently for this task.

I don't know how efficient it would be, but I usually think that most
applications have a small, fixed set of possible priorities, like ("low",
"medium", "high") or ("info", "warning", "error", "fatal").  In this sort of
situation my initial inclination would be to implement a dict of Queue
instances which corresponds to the fixed set of priorities, something like:

    import Queue

    class PriorityQueue:
        def __init__(self, priorities):
            self.queues = {}
            self.marker = Queue.Queue()
            self.priorities = priorities
            for p in priorities:
                self.queues[p] = Queue.Queue()

        def put(self, obj, priority):
            self.queues[priority].put(obj)
            self.marker.put(None)

        def get(self):
            dummy = self.marker.get()
            # at this point we know one of the queues has an entry for us
            for p in self.priorities:
                try:
                    return self.queues[p].get_nowait()
                except Queue.Empty:
                    pass

    if __name__ == "__main__":
        q = PriorityQueue(("low", "medium", "high"))
        q.put(12, "low")
        q.put(13, "high")
        q.put(14, "medium")
        print q.get()
        print q.get()
        print q.get()

Obviously this won't work if your set of priorities isn't fixed at the
outset, but I think it's pretty straightforward, and it should work in
multithreaded applications.  It will also work if for some reason you want
to queue up objects for which __cmp__ doesn't make sense.

Skip



From oren-py-d@hishome.net  Tue Jun 25 06:02:10 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 25 Jun 2002 01:02:10 -0400
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <2mvg888ea5.fsf@starship.python.net>
References: <20020624230140.B3555@hishome.net> <2mvg888ea5.fsf@starship.python.net>
Message-ID: <20020625050210.GA14749@hishome.net>

On Tue, Jun 25, 2002 at 12:09:38AM +0100, Michael Hudson wrote:
> Oren Tirosh <oren-py-l@hishome.net> writes:
> 
> > Abstract
> > 
> >     This PEP proposes that symbols matching the type name should be
> >     added to the types module for all basic Python types in the types
> >     module:
> > 
> >         types.IntegerType -> types.int
> >         types.FunctionType -> types.function
> >         types.TracebackType -> types.traceback
> >          ...    
> > 
> >     The long capitalized names currently in the types module will be
> >     deprecated.
> 
> Um, can I be a little confused?  If you are writing code that you know
> will be run in 2.2 and later, you write
> 
>    isinstance(obj, int)
> 
> If you want to support 2.1 and so on, you write 
> 
>    isinstance(obj, types.IntType)
> 
> What would writing 
> 
>    isinstance(obj, types.int)
> 
> ever gain you except restricting execution to 2.3+?

It's like asking what you gain by using string methods instead of the 
string module.  It's part of a slow, long-term effort to clean up the 
language while trying to minimize the impact on existing code.

	Oren




From greg@cosc.canterbury.ac.nz  Tue Jun 25 06:28:36 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 25 Jun 2002 17:28:36 +1200 (NZST)
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <15639.60668.591252.466454@12-248-8-148.client.attbi.com>
Message-ID: <200206250528.RAA08943@s454.cosc.canterbury.ac.nz>

Skip Montanaro <skip@pobox.com>:

> I don't know how efficient it would be, but I usually think that most
> applications have a small, fixed set of possible priorities

Some applications of priority queues are like that,
but others aren't -- e.g. an event queue in a discrete
event simulation, where events are ordered by time.
I expect that's the sort of application Kevin had
in mind.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From fredrik@pythonware.com  Tue Jun 25 07:27:41 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 25 Jun 2002 08:27:41 +0200
Subject: [Python-Dev] PEP 294: Type Names in the types Module
References: <20020624230140.B3555@hishome.net> <2mvg888ea5.fsf@starship.python.net> <20020625050210.GA14749@hishome.net>
Message-ID: <008c01c21c11$6b529070$ced241d5@hagrid>

Oren Tirosh wrote:

> > What would writing 
> > 
> >    isinstance(obj, types.int)
> > 
> > ever gain you except restricting execution to 2.3+?
> 
> It's like asking what do you gain by using string methods instead of the 
> string module.

no, it's not.  it's not like that at all.

as michael pointed out, we've already added a *third* way to access
type objects in 2.2.  you're adding a *fourth* way.

string methods were added at a time when Python went from one to
two different string types; they solved a real implementation problem.
reducing/eliminating the need for the string module was a side effect.

> It's part of a slow, long-term effort to clean up the language
> while trying to minimize the impact on existing code.

or, as likely, part of a slow, long-term effort to make Python
totally unusable for any serious software engineering...

"who cares about timtowtdi? we add a new one every week!"

"we know what's better for you.  you don't."

"deprecation guaranteed!"

(etc)




From oren-py-d@hishome.net  Tue Jun 25 07:52:03 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 25 Jun 2002 02:52:03 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020624213318.A5740@arizona.localdomain>
References: <20020624213318.A5740@arizona.localdomain>
Message-ID: <20020625065203.GA27183@hishome.net>

On Mon, Jun 24, 2002 at 09:33:18PM -0400, Kevin O'Connor wrote:
> I often find myself needing priority queues in python, and I've finally
> broken down and written a simple implementation.  Previously I've used
> sorted lists (via bisect) to get the job done, but the heap code
> significantly improves performance.  There are C-based implementations, but
> the effort of compiling in an extension often isn't worth it.  I'm
> including the code here for everyone's amusement.
> 
> Any chance something like this could make it into the standard python
> library?  It would save a lot of time for lazy people like myself.  :-)

A sorted list is a much more general-purpose data structure than a priority
queue and can be used to implement a priority queue. It offers almost the same 
asymptotic performance:

sorted list using splay tree (amortized):
  insert: O(log n)
  pop: O(log n)
  peek: O(log n)

priority queue using binary heap:
  insert: O(log n)
  pop: O(log n)
  peek: O(1)

The only advantage of a heap is O(1) peek which doesn't seem so critical. 
It may also have somewhat better performance by a constant factor because
it uses an array rather than allocating node structures.  But the internal 
order of a heap-based priority queue is very non-intuitive and quite useless 
for other purposes while a sorted list is, umm..., sorted!

	Oren




From martin@v.loewis.de  Tue Jun 25 08:04:38 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Jun 2002 09:04:38 +0200
Subject: [Python-Dev] Please give this patch for building bsddb a try
In-Reply-To: <15639.32763.103711.902632@anthem.wooz.org>
References: <15622.9136.131945.699747@12-248-41-177.client.attbi.com>
 <15631.60841.28978.492291@anthem.wooz.org>
 <m31yb3hlrv.fsf@mira.informatik.hu-berlin.de>
 <15632.52766.822003.689689@anthem.wooz.org>
 <m3znxrq9ht.fsf@mira.informatik.hu-berlin.de>
 <15639.25843.562043.559385@anthem.wooz.org>
 <m3bsa0xx1i.fsf@mira.informatik.hu-berlin.de>
 <15639.32763.103711.902632@anthem.wooz.org>
Message-ID: <m3k7ong7p5.fsf@mira.informatik.hu-berlin.de>

barry@zope.com (Barry A. Warsaw) writes:

> `-rpath DIR'
>      [...]
> 
>      The `-rpath' option may also be used on SunOS.  By default, on
>      SunOS, the linker will form a runtime search patch out of all the
>      `-L' options it is given.  If a `-rpath' option is used, the
>      runtime search path will be formed exclusively using the `-rpath'
>      options, ignoring the `-L' options.  This can be useful when using
>      gcc, which adds many `-L' options which may be on NFS mounted
>      filesystems.
> 
> Reading it again now, it's not clear if "SunOS" also means "Solaris".

I see. This is indeed SunOS 4 only.

Regards,
Martin



From martin@v.loewis.de  Tue Jun 25 08:15:12 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Jun 2002 09:15:12 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020624213318.A5740@arizona.localdomain>
References: <20020624213318.A5740@arizona.localdomain>
Message-ID: <m3fzzbg77j.fsf@mira.informatik.hu-berlin.de>

"Kevin O'Connor" <kevin@koconnor.net> writes:

> Any chance something like this could make it into the standard python
> library?  It would save a lot of time for lazy people like myself.  :-)

I think this deserves a library PEP. I would also recommend having
separate heap and priority queue APIs, to avoid the kind of confusion
that Skip ran into. Something like the C++ STL API might be
appropriate: the heap functions take a comparator function, on top of
which you offer both heapsort and priority queues.
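
A rough sketch of how such a split could look (purely illustrative -- the
names, the default comparator and the smallest-first orientation are my
assumptions, not a proposed API):

    def _siftup(heap, pos, lt):
        # Move heap[pos] towards the root until its parent is not
        # "greater" under lt().
        item = heap[pos]
        while pos > 0:
            parentpos = (pos - 1) // 2
            if not lt(item, heap[parentpos]):
                break
            heap[pos] = heap[parentpos]
            pos = parentpos
        heap[pos] = item

    def heap_push(heap, item, lt=lambda a, b: a < b):
        heap.append(item)
        _siftup(heap, len(heap) - 1, lt)

    def heap_pop(heap, lt=lambda a, b: a < b):
        # Remove and return the root item (the smallest under lt()).
        last = heap.pop()
        if not heap:
            return last
        top = heap[0]
        heap[0] = last
        pos, end = 0, len(heap)
        while 1:
            child = 2 * pos + 1
            if child >= end:
                break
            if child + 1 < end and lt(heap[child + 1], heap[child]):
                child = child + 1
            if not lt(heap[child], heap[pos]):
                break
            heap[pos], heap[child] = heap[child], heap[pos]
            pos = child
        return top

    def heapsort(seq, lt=lambda a, b: a < b):
        heap = []
        for x in seq:
            heap_push(heap, x, lt)
        return [heap_pop(heap, lt) for i in range(len(heap))]

    class PriorityQueue:
        def __init__(self, lt=lambda a, b: a < b):
            self.heap = []
            self.lt = lt
        def put(self, item):
            heap_push(self.heap, item, self.lt)
        def get(self):
            return heap_pop(self.heap, self.lt)

With that split, heapsort([5, 3, 8, 1]) gives [1, 3, 5, 8], and
PriorityQueue(lt=lambda a, b: a > b) behaves like Kevin's largest-first
queue.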

Technical issues aside, the main purpose of a library PEP is
to record a commitment from the author to maintain the module, with
the option of removing the module if the author runs away, and nobody
takes over.

Regards,
Martin



From martin@v.loewis.de  Tue Jun 25 08:23:43 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Jun 2002 09:23:43 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020625065203.GA27183@hishome.net>
References: <20020624213318.A5740@arizona.localdomain>
 <20020625065203.GA27183@hishome.net>
Message-ID: <m3bs9zg6tc.fsf@mira.informatik.hu-berlin.de>

Oren Tirosh <oren-py-d@hishome.net> writes:

> The only advantage of a heap is O(1) peek which doesn't seem so
> critical.  It may also have somewhat better performance by a
> constant factor because it uses an array rather than allocating node
> structures.  But the internal order of a heap-based priority queue
> is very non-intuitive and quite useless for other purposes while a
> sorted list is, umm..., sorted!

I think the fact that heaps don't allocate additional memory is a valuable
property, more valuable than the asymptotic complexity (which is also
quite good). If you don't want to build priority queues, you can still
use heaps to sort a list.
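
For instance, with the heappush/heappop posted earlier in this thread (which
keep the greatest item on top), sorting is just a matter of pushing
everything and then popping everything:

    >>> data = [15, 5, 20, 10]
    >>> heap = []
    >>> for item in data:
    ...     heappush(heap, item)
    ...
    >>> [heappop(heap) for i in range(len(heap))]
    [20, 15, 10, 5]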

IMO, heaps are so standard as an algorithm that they belong in the
Python library, in some form. It is then the user's choice to use that
algorithm or not.

Regards,
Martin




From aleax@aleax.it  Tue Jun 25 08:30:43 2002
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 25 Jun 2002 09:30:43 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <15639.60668.591252.466454@12-248-8-148.client.attbi.com>
References: <20020624213318.A5740@arizona.localdomain> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com>
Message-ID: <E17MknD-0007H6-00@mail.python.org>

On Tuesday 25 June 2002 06:09 am, Skip Montanaro wrote:
	...
> I don't know how efficient it would be, but I usually think that most
> applications have a small, fixed set of possible priorities, like ("low",
> "medium", "high") or ("info", "warning", "error", "fatal").  In this sort

Then you do "bin sorting", of course -- always worth considering
when you know the sort key can only take a small number of
different values (as is the more general "radix sorting" when you
have a few such keys, or a key that easily breaks down that way).
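
For example, a bare-bones sketch of bin sorting over a fixed, known set of
priorities (the priority names are invented for the illustration):

    PRIORITIES = ("high", "medium", "low")

    def bin_sort(items):
        # items is a sequence of (priority, payload) pairs; items within
        # the same priority keep their relative order (the sort is stable).
        bins = {}
        for p in PRIORITIES:
            bins[p] = []
        for priority, payload in items:
            bins[priority].append(payload)
        result = []
        for p in PRIORITIES:
            result.extend(bins[p])
        return result

    >>> bin_sort([("low", 12), ("high", 13), ("medium", 14), ("high", 15)])
    [13, 15, 14, 12]

Each item is touched a constant number of times, so this is O(n) no matter
how the payloads compare.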

But it IS rather a special case, albeit an important one (and quite
possibly frequently occurring in some application areas).


Alex



From oren-py-d@hishome.net  Tue Jun 25 09:09:29 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 25 Jun 2002 04:09:29 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <m3bs9zg6tc.fsf@mira.informatik.hu-berlin.de>
References: <20020624213318.A5740@arizona.localdomain> <20020625065203.GA27183@hishome.net> <m3bs9zg6tc.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020625080929.GA39304@hishome.net>

On Tue, Jun 25, 2002 at 09:23:43AM +0200, Martin v. Loewis wrote:
> Oren Tirosh <oren-py-d@hishome.net> writes:
> 
> > The only advantage of a heap is O(1) peek which doesn't seem so
> > critical.  It may also have somewhat better performance by a
> > constant factor because it uses an array rather than allocating node
> > structures.  But the internal order of a heap-based priority queue
> > is very non-intuitive and quite useless for other purposes while a
> > sorted list is, umm..., sorted!
> 
> I think that heaps don't allocate additional memory is a valuable
> property, more valuable than the asymptotic complexity (which is also
> quite good). If you don't want to build priority queues, you can still
> use heaps to sort a list.

When I want to sort a list I just use .sort(). I don't care which algorithm
is used. I don't care whether dictionaries are implemented using hash tables, 
some kind of tree structure or magic smoke.  I just trust Python to use a
reasonably efficient implementation.

I always find it funny when C++ or Perl programmers refer to an associative 
array as a "hash".
 
> IMO, heaps are so standard as an algorithm that they belong into the
> Python library, in some form. It is then the user's choice to use that
> algorithm or not.

Heaps are a "standard algorithm" only from a CS point of view.  It doesn't
have much to do with everyday programming.  

Let's put it this way: if Python had an extension module in the standard
library implementing a sorted list, would you care enough about the
specific binary heap implementation to go and write one or would you just
use what you had in the library for a priority queue? ;-)

	Oren




From mal@lemburg.com  Tue Jun 25 09:12:00 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 25 Jun 2002 10:12:00 +0200
Subject: [Python-Dev] New Subscriber Introduction
References: <Pine.SOL.4.44.0206241405010.24327-100000@death.OCF.Berkeley.EDU>
Message-ID: <3D1825D0.2070309@lemburg.com>

Brett Cannon wrote:
> The only grumblings anyone has heard out of me as yet on this list are over
> my Python implementation of strptime.  I do plan to stay on this list,
> though, even after this is resolved and be as involved as I can on the
> list (which is going to be limited until I get off my rear and really dive
> into the C source).

Just curious: have you taken a look at the mxDateTime parser ?

It has a slightly different approach than strptime() but also
takes a lot of load from the programmer in terms of not requiring
a predefined format.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From pf@artcom-gmbh.de  Tue Jun 25 09:56:47 2002
From: pf@artcom-gmbh.de (Peter Funk)
Date: Tue, 25 Jun 2002 10:56:47 +0200 (CEST)
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <2mvg888ea5.fsf@starship.python.net> from Michael Hudson at "Jun
 25, 2002 00:09:38 am"
Message-ID: <m17Mm7z-0076c4C@artcom0.artcom-gmbh.de>

Hi,

> Oren Tirosh <oren-py-l@hishome.net> writes:
[...]
> >         types.IntegerType -> types.int
> >         types.FunctionType -> types.function
> >         types.TracebackType -> types.traceback
> >          ...    
> > 
> >     The long capitalized names currently in the types module will be
> >     deprecated.
 
Michael Hudson:
[...]
> I mean, I don't have any real opinion *against* this pep, I just don't
> really see why anyone would care...

I care, and I have a strong opinion against this PEP and any other
so-called "enhancement" which makes it harder or impossible to write
Python code *NOW* that covers a certain range of Python language
implementations.

The Python documentation advertises the 'types' module with the following 
wording:

  """This module defines names for all object types that are used by 
     the standard Python interpreter, [...]
     It is safe to use "from types import *" -- the module does not 
     export any names besides the ones listed here. New names exported 
     by future versions of this module will all end in "Type".  """

This makes promises about future versions of this module and the
Python language.  Breaking promises is in general a very bad idea
and will do serious harm to trustworthiness.

At the time of this writing the oldest Python version I have to
support is Python 1.5.2 and this will stay so until at least the end
of year 2004.

So any attempt to deprecate often-used language features does no
good other than discouraging people from starting to use Python.

It would be possible to change the documentation of the types module now
and start telling users that the Python development team made up
their mind.  That would open up the possibility of really deprecating
the module or changing the type names later (but only much, much later!),
without causing the effect I called "version fatigue" lately here.

A look at http://www.python.org/dev/doc/devel/lib/module-types.html
showed that this hasn't happened yet.  Sigh!

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany)




From fredrik@pythonware.com  Tue Jun 25 11:16:18 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 25 Jun 2002 12:16:18 +0200
Subject: [Python-Dev] New Subscriber Introduction
References: <Pine.SOL.4.44.0206241405010.24327-100000@death.OCF.Berkeley.EDU> <3D1825D0.2070309@lemburg.com>
Message-ID: <003f01c21c31$59cae6c0$0900a8c0@spiff>

mal wrote:

> Just curious: have you taken a look at the mxDateTime parser ?

is that an extension of the rfc822.parsedate approach?

> It has a slightly different approach than strptime() but also
> takes a lot of load from the programmer in terms of not requiring
> a predefined format.

if you're asking me, strptime is mostly useless in 99% of all practical
cases (even more useless than scanf).  but luckily, it's mostly harmless
as well...

</F>




From mal@lemburg.com  Tue Jun 25 11:21:31 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 25 Jun 2002 12:21:31 +0200
Subject: [Python-Dev] New Subscriber Introduction
References: <Pine.SOL.4.44.0206241405010.24327-100000@death.OCF.Berkeley.EDU> <3D1825D0.2070309@lemburg.com> <003f01c21c31$59cae6c0$0900a8c0@spiff>
Message-ID: <3D18442B.30609@lemburg.com>

Fredrik Lundh wrote:
> mal wrote:
> 
> 
>>Just curious: have you taken a look at the mxDateTime parser ?
> 
> 
> is that an extension of the rfc822.parsedate approach?

Yes, but it goes far beyond RFC822 style dates and times.

>>It has a slightly different approach than strptime() but also
>>takes a lot of load from the programmer in terms of not requiring
>>a predefined format.
> 
> 
> if you're asking me, strptime is mostly useless in 99% of all practical
> cases (even more useless than scanf).  but luckily, it's mostly harmless
> as well...

Agreed.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From Oleg Broytmann <phd@phd.pp.ru>  Tue Jun 25 11:40:31 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Tue, 25 Jun 2002 14:40:31 +0400
Subject: [Python-Dev] mxDatTime parser (was: New Subscriber Introduction)
In-Reply-To: <3D18442B.30609@lemburg.com>; from mal@lemburg.com on Tue, Jun 25, 2002 at 12:21:31PM +0200
References: <Pine.SOL.4.44.0206241405010.24327-100000@death.OCF.Berkeley.EDU> <3D1825D0.2070309@lemburg.com> <003f01c21c31$59cae6c0$0900a8c0@spiff> <3D18442B.30609@lemburg.com>
Message-ID: <20020625144031.A11513@phd.pp.ru>

On Tue, Jun 25, 2002 at 12:21:31PM +0200, M.-A. Lemburg wrote:
> >>Just curious: have you taken a look at the mxDateTime parser ?
> > 
> > is that an extension of the rfc822.parsedate approach?
> 
> Yes, but it goes far beyond RFC822 style dates and times.

>>> from mx import DateTime
>>> dt = DateTime.DateTimeFrom("21/12/2002")
>>> dt
<DateTime object for '2002-06-25 20:02:00.00' at 819d7f8>
>>> dt = DateTime.DateTimeFrom("21/08/2002")
>>> dt
<DateTime object for '2002-06-25 20:02:00.00' at 819b860>
>>> dt = DateTime.DateTimeFrom("21-08-2002")
>>> dt
<DateTime object for '2021-08-20 00:00:00.00' at 819d7f8>

   I am not sure I understand the logic. Because of this I always use ISO
date format (2002-08-21).

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From mal@lemburg.com  Tue Jun 25 11:50:32 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 25 Jun 2002 12:50:32 +0200
Subject: [Python-Dev] mxDatTime parser (was: New Subscriber Introduction)
References: <Pine.SOL.4.44.0206241405010.24327-100000@death.OCF.Berkeley.EDU> <3D1825D0.2070309@lemburg.com> <003f01c21c31$59cae6c0$0900a8c0@spiff> <3D18442B.30609@lemburg.com> <20020625144031.A11513@phd.pp.ru>
Message-ID: <3D184AF8.8090300@lemburg.com>

Oleg Broytmann wrote:
> On Tue, Jun 25, 2002 at 12:21:31PM +0200, M.-A. Lemburg wrote:
> 
>>>>Just curious: have you taken a look at the mxDateTime parser ?
>>>
>>>is that an extension of the rfc822.parsedate approach?
>>
>>Yes, but it goes far beyond RFC822 style dates and times.
> 
> 
>>>>from mx import DateTime
>>>>dt = DateTime.DateTimeFrom("21/12/2002")
>>>>dt
>>>
> <DateTime object for '2002-06-25 20:02:00.00' at 819d7f8>
> 
>>>>dt = DateTime.DateTimeFrom("21/08/2002")
>>>>dt
>>>
> <DateTime object for '2002-06-25 20:02:00.00' at 819b860>
> 
>>>>dt = DateTime.DateTimeFrom("21-08-2002")
>>>>dt
>>>
> <DateTime object for '2021-08-20 00:00:00.00' at 819d7f8>
> 
>    I am not sure I understand the logic. Because of this I always use ISO
> date format (2002-08-21).

The problem with the first two is that the parser
parses date *and* time (it defaults to today for entries
which are not found in the string; this can be changed
though).

The last one is parsed as an ISO date (21-08-20); the trailing
02 is dropped.

As you can see date parsing is very difficult, and even though
the mxDateTime parser already recognizes tons of different
formats, it doesn't always work. It is getting better with
each release, though :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From Oleg Broytmann <phd@phd.pp.ru>  Tue Jun 25 11:55:54 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Tue, 25 Jun 2002 14:55:54 +0400
Subject: [Python-Dev] mxDatTime parser (was: New Subscriber Introduction)
In-Reply-To: <3D184AF8.8090300@lemburg.com>; from mal@lemburg.com on Tue, Jun 25, 2002 at 12:50:32PM +0200
References: <Pine.SOL.4.44.0206241405010.24327-100000@death.OCF.Berkeley.EDU> <3D1825D0.2070309@lemburg.com> <003f01c21c31$59cae6c0$0900a8c0@spiff> <3D18442B.30609@lemburg.com> <20020625144031.A11513@phd.pp.ru> <3D184AF8.8090300@lemburg.com>
Message-ID: <20020625145553.B11513@phd.pp.ru>

On Tue, Jun 25, 2002 at 12:50:32PM +0200, M.-A. Lemburg wrote:
> As you can see date parsing is very difficult, and even though

   Too true, and that's why I never sent a complaint.

> the mxDateTime parser already recognizes tons of different
> formats, it doesn't always work. It is getting better with
> each release, though :-)

   Thank you for the work!

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From oren-py-d@hishome.net  Tue Jun 25 11:58:39 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 25 Jun 2002 06:58:39 -0400
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <m17Mm7z-0076c4C@artcom0.artcom-gmbh.de>
References: <2mvg888ea5.fsf@starship.python.net> <m17Mm7z-0076c4C@artcom0.artcom-gmbh.de>
Message-ID: <20020625105839.GA58813@hishome.net>

On Tue, Jun 25, 2002 at 10:56:47AM +0200, Peter Funk wrote:
>   """This module defines names for all object types that are used by 
>      the standard Python interpreter, [...]
>      It is safe to use "from types import *" -- the module does not 
>      export any names besides the ones listed here. New names exported 
>      by future versions of this module will all end in "Type".  """

Thanks for pointing this out!

> It would be possible to change the documentation of types module now
> and start telling users that the Python development team made up
> their mind.  That would open up the possibility to really deprecate
> the module or change the type names later (but only much much later!),
> without causing the effect I called "version fatigue" lately here.

I don't understand exactly what you are suggesting here. Would you care to
explain it more clearly?

	Oren




From pf@artcom-gmbh.de  Tue Jun 25 13:08:54 2002
From: pf@artcom-gmbh.de (Peter Funk)
Date: Tue, 25 Jun 2002 14:08:54 +0200 (CEST)
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <20020625105839.GA58813@hishome.net> from Oren Tirosh at "Jun 25,
 2002 06:58:39 am"
Message-ID: <m17Mp7u-0079QBC@artcom0.artcom-gmbh.de>

Hi,

Oren Tirosh:
> On Tue, Jun 25, 2002 at 10:56:47AM +0200, Peter Funk wrote:
> >   """This module defines names for all object types that are used by 
> >      the standard Python interpreter, [...]
> >      It is safe to use "from types import *" -- the module does not 
> >      export any names besides the ones listed here. New names exported 
> >      by future versions of this module will all end in "Type".  """
> 
> Thanks for pointing this out!
> 
> > It would be possible to change the documentation of types module now
> > and start telling users that the Python development team made up
> > their mind.  That would open up the possibility to really deprecate
> > the module or change the type names later (but only much much later!),
> > without causing the effect I called "version fatigue" lately here.
> 
> I don't understand exactly what you are suggesting here. Would you care to
> explain it more clearly?

A recent thread here on python-dev came to the conclusion to
"silently deprecate" the standard library modules 'string' and 'types'.
This silent deprecation nevertheless means that these modules will
go away at some future point in time.  I don't like this decision,
but I understand the reasoning and can now only hope that this
point in time lies very, very far away in the future.

It is a reasonable expectation that source code written for a
certain version of a serious programming language remains valid for a
*LONG* period of time.  Backward compatibility is absolutely essential.

What I was trying to suggest is to change the documentation of the
Python language and library as early as possible, so that programmers
get a reasonable chance to become familiar with any upcoming new situation.

Unfortunately this will not help for software, which has already been
written and is in production.  If in 2004 certain Python programs
written in 2000 or earlier would start raising ImportError exceptions
on 'from types import *' after upgrading to a new system which may come
with the latest version of Python, this will certainly cause damage.

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen, Germany)




From oren-py-d@hishome.net  Tue Jun 25 14:24:24 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 25 Jun 2002 16:24:24 +0300
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <m17Mp7u-0079QBC@artcom0.artcom-gmbh.de>; from pf@artcom-gmbh.de on Tue, Jun 25, 2002 at 02:08:54PM +0200
References: <20020625105839.GA58813@hishome.net> <m17Mp7u-0079QBC@artcom0.artcom-gmbh.de>
Message-ID: <20020625162424.A5762@hishome.net>

On Tue, Jun 25, 2002 at 02:08:54PM +0200, Peter Funk wrote:
> A recent thread here on python-dev came to the conclusion to
> "silently deprecate" the standard library modules 'string' and 'types'.
> This silent deprecation nevertheless means that these modules will
> go away at some future point in time.  I don't like this decision,
> but I understand the reasoning and can now only hope that this
> point in time lies very, very far away in the future.

I don't like it very much either.  I prefer the string module to be silently
deprecated "forever" without any specific schedule for removal.

That's why the Backward Compatibility section of this PEP says that "it is 
not planned to actually remove the long names from the types module in some 
future version."

I think that actually breaking backward compatibility should be reserved for 
really obscure modules that virtually nobody uses any more.  Another case is 
when the programs that will be broken were using somewhat questionable 
programming practices in the first place (e.g. lst.append(x,y) instead of 
lst.append((x,y)) or assignment to __class__).

> Unfortunately this will not help for software, which has already been
> written and is in production.  If in 2004 certain Python programs
> written in 2000 or earlier would start raising ImportError exceptions
> on 'from types import *' after upgrading to a new system which may come
> with the latest version of Python, this will certainly cause damage.

In fact, reusing the types module instead of deprecating and eventually
removing it will ensure that no ImportError will be raised.  The new types
module will also serve as a retirement home for the long type names, where
they can live comfortably and still be of some use to old code instead of
being evicted.

There is a problem, though: "from types import *" would import the short
names too, overriding the builtins.  If you redefine int or str you probably
deserve it :-) but if you have an innocent variable called "function"
somewhere in your module it will get clobbered.  The solution might be to
include only the long type names in __all__.
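
A rough sketch of what such a reworked types module could look like
(a hypothetical excerpt, not the real module):

    # hypothetical types.py excerpt -- the short names stay available as
    # attributes, but "from types import *" only picks up the long names
    IntType      = int
    StringType   = str
    FunctionType = type(lambda: None)

    __all__ = ['IntType', 'StringType', 'FunctionType']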

	Oren




From aahz@pythoncraft.com  Tue Jun 25 14:38:07 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 25 Jun 2002 09:38:07 -0400
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <m17Mp7u-0079QBC@artcom0.artcom-gmbh.de>
References: <20020625105839.GA58813@hishome.net> <m17Mp7u-0079QBC@artcom0.artcom-gmbh.de>
Message-ID: <20020625133807.GA21633@panix.com>

On Tue, Jun 25, 2002, Peter Funk wrote:
>
> Unfortunately this will not help for software, which has already been
> written and is in production.  If in 2004 certain Python programs
> written in 2000 or earlier would start raising ImportError exceptions
> on 'from types import *' after upgrading to a new system which may come
> with the latest version of Python, this will certainly cause damage.

This can be solved by a combination of changing the documentation and
using __all__ (which I think is in part precisely the point of creating
__all__).  (To save people time, __all__ controls what names import *
uses; I think it was introduced in Python 2.1, but I'm not sure.)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From pobrien@orbtech.com  Tue Jun 25 14:53:28 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Tue, 25 Jun 2002 08:53:28 -0500
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
In-Reply-To: <20020623181630.GN25927@laranja.org>
Message-ID: <NBBBIOJPGKJEKIECEMCBKEOINFAA.pobrien@orbtech.com>

[Lalo Martins]
>
> Now, this thing we're talking about is replacing parts of the string with
> other strings. These strings may be the result of running some non-string
> objects through str(foo) - but we are making no assumptions about these
> objects. Just that str(foo) is somehow meaningful. And, to my knowledge,
> there are no python objects for which str(foo) doesn't work.

I guess it depends on your definition of "work".  This can fail if foo is an
instance of a class whose __str__ (or __repr__) has a bug or raises an
exception.  If foo is your own code you probably want it to fail.  If foo is
someone else's code you may have no choice but to work around it. :-(
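
Something as small as this will do it (a contrived sketch, of course):

    class Broken:
        def __str__(self):
            raise ValueError("no string form today")

    try:
        print str(Broken())
    except ValueError, e:
        print "str() blew up:", e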

--
Patrick K. O'Brien
Orbtech
-----------------------------------------------
"Your source for Python software development."
-----------------------------------------------
Web:  http://www.orbtech.com/web/pobrien/
Blog: http://www.orbtech.com/blog/pobrien/
Wiki: http://www.orbtech.com/wiki/PatrickOBrien
-----------------------------------------------




From aahz@pythoncraft.com  Tue Jun 25 14:45:04 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 25 Jun 2002 09:45:04 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <m3bs9zg6tc.fsf@mira.informatik.hu-berlin.de>
References: <20020624213318.A5740@arizona.localdomain> <20020625065203.GA27183@hishome.net> <m3bs9zg6tc.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020625134504.GB21633@panix.com>

On Tue, Jun 25, 2002, Martin v. Loewis wrote:
>
> IMO, heaps are so standard as an algorithm that they belong into the
> Python library, in some form. It is then the user's choice to use that
> algorithm or not.

Should this PEP be split in two, then?  One for a new "AbstractData"
package (that would include the heap algorithm) and one for an update to
Queue that would use some algorithm from AbstractData.  The latter might
not even need a PEP.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From skip@pobox.com  Tue Jun 25 15:23:20 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 25 Jun 2002 09:23:20 -0500
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <20020625133807.GA21633@panix.com>
References: <20020625105839.GA58813@hishome.net>
 <m17Mp7u-0079QBC@artcom0.artcom-gmbh.de>
 <20020625133807.GA21633@panix.com>
Message-ID: <15640.31960.526262.104242@12-248-8-148.client.attbi.com>

    >> If in 2004 certain Python programs written in 2000 or earlier would
    >> start raising ImportError exceptions on 'from types import *' after
    >> upgrading to a new system which may come with the latest version of
    >> Python, this will certainly cause damage.

    aahz> This can be solved by a combination of changing the documentation
    aahz> and using __all__ (which I think is in part precisely the point of
    aahz> creating __all__). 

I don't think __all__ would help here.  The problem as I see it is that the
docs say "from types import *" is safe.  If you add new names to the types
module, they would presumably be added to __all__ as well, and then "from
types import *" could clobber local variables or hide globals or builtins
the programmer didn't anticipate.

So, if we add an object named "function" to the types module and Peter's
stable code has a variable of the same name, it's possible that running on a
new version of Python will introduce a bug.

Still, I have to quibble with Peter's somewhat extreme example.  If you take
a stable system of the complexity of perhaps Linux or Windows and upgrade it
four years later, Python compatibility will probably only be one of many
problems raised by the upgrade.  If you have a stable program, you try to
leave it alone.  That means not upgrading it.  If you modify the environment
the program runs in, you need to retest it.  If you write in C you can
minimize these problems through static linkage, but the problem with Python
is no different than that of a program written in C which uses shared
libraries.  Names can move around (from one library to another) or new names
can be added, giving rise to name conflicts.  I seem to recall someone
reporting recently about another shared library which defined an external
symbol named "socket_init".

Skip



From aahz@pythoncraft.com  Tue Jun 25 15:53:57 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 25 Jun 2002 10:53:57 -0400
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <15640.31960.526262.104242@12-248-8-148.client.attbi.com>
References: <20020625105839.GA58813@hishome.net> <m17Mp7u-0079QBC@artcom0.artcom-gmbh.de> <20020625133807.GA21633@panix.com> <15640.31960.526262.104242@12-248-8-148.client.attbi.com>
Message-ID: <20020625145357.GA6652@panix.com>

On Tue, Jun 25, 2002, Skip Montanaro wrote:
> 
>     >> If in 2004 certain Python programs written in 2000 or earlier would
>     >> start raising ImportError exceptions on 'from types import *' after
>     >> upgrading to a new system which may come with the latest version of
>     >> Python, this will certainly cause damage.
> 
>     aahz> This can be solved by a combination of changing the documentation
>     aahz> and using __all__ (which I think is in part precisely the point of
>     aahz> creating __all__). 
> 
> I don't think __all__ would help here.  The problem as I see it is that the
> docs say "from types import *" is safe.  If you add new names to the types
> module, they would presumably be added to __all__ as well, and then "from
> types import *" could clobber local variables or hide globals or builtins
> the programmer didn't anticipate.

The point is that we could change the docs -- but Peter would still have
his problem with import * unless we also used __all__ to retain the old
behavior.  Overall, I agree with your point about upgrading applications
four years old; I'm just suggesting a possible mechanism for minimizing
damage.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From gward@python.net  Tue Jun 25 16:08:16 2002
From: gward@python.net (Greg Ward)
Date: Tue, 25 Jun 2002 11:08:16 -0400
Subject: [Python-Dev] Improved tmpfile module
In-Reply-To: <20020625030609.GD13729@codesourcery.com>
References: <20020625030609.GD13729@codesourcery.com>
Message-ID: <20020625150816.GA3660@gerg.ca>

On 24 June 2002, Zack Weinberg said:
> Attached please find a rewritten and improved tmpfile.py.  The major
> change is to make the temporary file names significantly harder to
> predict.  This foils denial-of-service attacks, where a hostile
> program floods /tmp with files named @12345.NNNN to prevent process
> 12345 from creating any temp files.  It also makes the race condition
> inherent in tmpfile.mktemp() somewhat harder to exploit.

Oh, good!  I've long wished that there was a tmpfile module written by
someone who understands the security issues involved in generating
temporary filenames and files.  I hope you do... ;-)

> (fd, name) = mkstemp(suffix="", binary=1): Creates a temporary file,
> returning both an OS-level file descriptor open on it and its name.
> This is useful in situations where you need to know the name of the
> temporary file, but can't risk the race in mktemp.

+1 except for the name.  What does the "s" stand for?  Unfortunately, I
can't think of a more descriptive name offhand.
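
For reference, using it would presumably look something like this (a
sketch based on the interface described above, assuming it lands in the
rewritten tempfile):

    import os
    from tempfile import mkstemp    # sketch only; proposed interface

    fd, name = mkstemp(suffix=".txt")
    try:
        os.write(fd, "scratch data\n")
    finally:
        os.close(fd)
        os.unlink(name)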

> name = mkdtemp(suffix=""): Creates a temporary directory, without
> race.

How about calling this one mktempdir() ?

> file = NamedTemporaryFile(mode='w+b', bufsize=-1, suffix=""): This is
> just the non-POSIX version of tmpfile.TemporaryFile() made available
> on all platforms, and with the .path attribute documented.  It
> provides a convenient way to get a temporary file with a name, that
> will be automatically deleted on close, and with a high-level file
> object associated with it.

I've scanned your code and the existing tempfile.py.  I don't understand
why you rearranged things.  Please explain why your arrangement of
_TemporaryFileWrapper/TemporaryFile/NamedTemporaryFile is better than
what we have.

A few minor comments on the code...

> if os.name == 'nt':
>     _template = '~%s~'
> elif os.name in ('mac', 'riscos'):
>     _template = 'Python-Tmp-%s'
> else:
>     _template = 'pyt%s' # better ideas?

Why reveal the implementation language of the application creating these
temporary names?  More importantly, why do it on certain platforms, but not
others?

> ### Recommended, user-visible interfaces.
> 
> _text_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL
> if os.name == 'posix':
>     _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL

Why not just "_bin_openflags = _text_openflags" ?  That clarifies their
equality on Unix.

> else:
>     _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL | os.O_BINARY

Why not "_bin_openflags = _text_openflags | os.O_BINARY" ?

> def mkstemp(suffix="", binary=1):
>     """Function to create a named temporary file, with 'suffix' for
>     its suffix.  Returns an OS-level handle to the file and the name,
>     as a tuple.  If 'binary' is 1, the file is opened in binary mode,
>     otherwise text mode (if this is a meaningful concept for the
>     operating system in use).  In any case, the file is readable and
>     writable only by the creating user, and executable by no one."""

"Function to" is redundant.  That docstring should probably look
something like this:

    """Create a named temporary file.

    Create a named temporary file with 'suffix' for its suffix.  Return
    a tuple (fd, name) where 'fd' is an OS-level handle to the file, and
    'name' is the complete path to the file.  If 'binary' is true, the
    file is opened in binary mode, otherwise text mode (if this is a
    meaningful concept for the operating system in use).  In any case,
    the file is readable and writable only by the creating user, and
    executable by no one (on platforms where that makes sense).
    """

Hmmm: if suffix == ".bat", the file is executable on some platforms.
That last sentence still needs work.

>     if binary: flags = _bin_openflags
>     else: flags = _text_openflags

I dunno if the Python coding standards dictate this, but I prefer

    if binary:
        flags = _bin_openflags
    else:
        flags = _text_openflags


> class _TemporaryFileWrapper:
>     """Temporary file wrapper
> 
>     This class provides a wrapper around files opened for temporary use.
>     In particular, it seeks to automatically remove the file when it is
>     no longer needed.
>     """

Here's where I started getting confused.  I don't dispute that the
existing code could stand some rearrangement, but I don't understand why
you did it the way you did.  Please clarify!

> ### Deprecated, user-visible interfaces.
> 
> def mktemp(suffix=""):
>     """User-callable function to return a unique temporary file name."""
>     while 1:
>         name = _candidate_name(suffix)
>         if not os.path.exists(name):
>             return name

The docstring for mktemp() should state *why* it's bad to use this
function -- otherwise people will say, "oh, this looks like it does what
I need" and use it in ignorance.  So should the library reference
manual.

Overall I'm +1 on the idea of improving tempfile with an eye to
security.  +0 on implementation, mainly because I don't understand how
your arrangement of TemporaryFile and friends is better than what we
have.

        Greg
-- 
Greg Ward - geek                                        gward@python.net
http://starship.python.net/~gward/
What the hell, go ahead and put all your eggs in one basket.



From niemeyer@conectiva.com  Tue Jun 25 16:09:51 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 25 Jun 2002 12:09:51 -0300
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <15639.60668.591252.466454@12-248-8-148.client.attbi.com>
References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com>
Message-ID: <20020625120951.B2207@ibook.distro.conectiva>

> I don't know how efficient it would be, but I usually think that most
> applications have a small, fixed set of possible priorities, like ("low",
> "medium", "high") or ("info", "warning", "error", "fatal").  In this sort of
> situation my initial inclination would be to implement a dict of Queue
> instances which corresponds to the fixed set of priorities, something like:

If priority queues were to be included, I'd rather add the necessary
support in Queue to easily attach priority handling, if that's not
already possible. Maybe adding a generic **kw parameter, and passing it
to _put() could help a bit.

The priority-queue applications I've written so far weren't able
to use your approach.  OTOH, there are many cases where you're right, and
we could benefit from this.

If there's general agreement that priority queues are that useful, we should
probably add one or two subclasses of Queue to the Queue module (one
with your approach and one with the more generic one).  Otherwise,
subclassing Queue is already easy enough, IMO (adding the **kw
suggestion would avoid having to override put(), and seems reasonable to me).

Thanks!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From tim.one@comcast.net  Tue Jun 25 16:23:28 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 25 Jun 2002 11:23:28 -0400
Subject: [Python-Dev] Improved tmpfile module
In-Reply-To: <20020625150816.GA3660@gerg.ca>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEEODFAA.tim.one@comcast.net>

[Greg Ward, to Zack Weinberg]
> ../
> Overall I'm +1 on the idea of improving tempfile with an eye to
> security.  +0 on implementation, mainly because I don't understand how
> your arrangement of TemporaryFile and friends is better than what we
> have.

-1 on the implementation here, because it didn't start with current CVS, so
is missing important work that went into improving this module on Windows
for 2.3.  Whether spawned/forked processes inherit descriptors for "temp
files" is also a security issue that's addressed in current CVS but seemed
to have gotten dropped on the floor here.

A note on UI:  for many programmers, "it's a feature" that temp file names
contain the pid.  I don't think we can get away with taking that away no
matter how stridently someone claims it's bad for us <wink>.




From fredrik@pythonware.com  Tue Jun 25 16:26:57 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 25 Jun 2002 17:26:57 +0200
Subject: [Python-Dev] Improved tmpfile module
References: <20020625030609.GD13729@codesourcery.com> <20020625150816.GA3660@gerg.ca>
Message-ID: <003f01c21c5c$c8d8de20$ced241d5@hagrid>

Greg wrote:

> > (fd, name) = mkstemp(suffix="", binary=1): Creates a temporary file,
> > returning both an OS-level file descriptor open on it and its name.
> > This is useful in situations where you need to know the name of the
> > temporary file, but can't risk the race in mktemp.
> 
> +1 except for the name.  What does the "s" stand for?

"safe"?  or at least "safer"?  unix systems usually have both "mktemp"
and "mkstemp", but I think they're both deprecated under SUSv2 (use
"tmpfile" instead).

</F>




From fredrik@pythonware.com  Tue Jun 25 16:33:55 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 25 Jun 2002 17:33:55 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com> <20020625120951.B2207@ibook.distro.conectiva>
Message-ID: <004d01c21c5d$b96c6460$ced241d5@hagrid>

Gustavo Niemeyer wrote:

> If priority queues were to be included, I'd rather add the necessary
> support in Queue to easily attach priority handling, if that's not
> already possible.

it takes a whopping four lines of code, if you're a pragmatic
programmer:

#
# implementation

import Queue, bisect

class PriorityQueue(Queue.Queue):
    def _put(self, item):
        bisect.insort(self.queue, item)

#
# usage

queue = PriorityQueue(0)

queue.put((2, "second"))
queue.put((1, "first"))
queue.put((3, "third"))

priority, value = queue.get()

</F>




From bernie@3captus.com  Tue Jun 25 16:29:34 2002
From: bernie@3captus.com (Bernard Yue)
Date: Tue, 25 Jun 2002 09:29:34 -0600
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises
 socket.error
References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com>
Message-ID: <3D188C5D.D519DD90@3captus.com>


Skip Montanaro wrote:
> 
> I just noticed in the development docs that when a timeout on a socket
> occurs, socket.error is raised.  I rather liked the idea that a different
> exception was raised for timeouts (I used Tim O'Malley's timeout_socket
> module).  Making a TimeoutError exception a subclass of socket.error would
> be fine so you can catch it with existing code, but I could see recovering
> differently for a timeout as opposed to other possible errors:
> 
>     sock.settimeout(5.0)
>     try:
>         data = sock.recv(8192)
>     except socket.TimeoutError:
>         # maybe requeue the request
>         ...
>     except socket.error, codes:
>         # some more drastic solution is needed
>         ...
> 

+1 on your suggestion.  Anyway, under Windows, the current
implementation returns an incorrect socket.error code for timeouts.  I am
working on the test suite as well as a fix for the problems found.  Once the
code is bug-free, maybe we can put the TimeoutError in.

I will leave the approval of the change to Guido when he comes back
from his holiday.


Bernie


> Skip
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev



From niemeyer@conectiva.com  Tue Jun 25 17:02:16 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 25 Jun 2002 13:02:16 -0300
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <004d01c21c5d$b96c6460$ced241d5@hagrid>
References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com> <20020625120951.B2207@ibook.distro.conectiva> <004d01c21c5d$b96c6460$ced241d5@hagrid>
Message-ID: <20020625130216.B1837@ibook.distro.conectiva>

> it takes a whopping four lines of code, if you're a pragmatic
> programmer:

Indeed. Using a tuple directly was a nice idea! I was thinking about a
priority parameter (maybe I'm not that pragmatic? ;-), which is not hard
either, but one has to override the put method to pass the
priority parameter.

import Queue, bisect

class PriorityQueue(Queue.Queue):
    def __init__(self, maxsize=0, defaultpriority=0):
        self.defaultpriority = defaultpriority
        Queue.Queue.__init__(self, maxsize)

    def put(self, item, block=1, **kw):
        if block:
            self.fsema.acquire()
        elif not self.fsema.acquire(0):
            raise Queue.Full
        self.mutex.acquire()
        was_empty = self._empty()
        # <- Priority could be handled here as well.
        self._put(item, **kw)
        if was_empty:
            self.esema.release()
        if not self._full():
            self.fsema.release()
        self.mutex.release()

    def _put(self, item, **kw):
        # <- But here seems better
        priority = kw.get("priority", self.defaultpriority)
        bisect.insort(self.queue, (priority, item))

    def _get(self):
        return self.queue.pop(0)[1]
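
Usage would then look something like this (a quick sketch; with
bisect.insort, lower priority values come out first):

    q = PriorityQueue()
    q.put("routine")                  # gets defaultpriority (0)
    q.put("urgent", priority=-1)
    q.put("can wait", priority=5)
    print q.get()                     # -> urgent
    print q.get()                     # -> routine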

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From sholden@holdenweb.com  Tue Jun 25 17:21:48 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Tue, 25 Jun 2002 12:21:48 -0400
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error
References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com>
Message-ID: <0b0d01c21c64$6a17b0c0$6300000a@holdenweb.com>

----- Original Message -----
From: "Skip Montanaro" <skip@pobox.com>
To: <python-dev@python.org>
Sent: Monday, June 24, 2002 9:53 PM
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises
socket.error


>
> I just noticed in the development docs that when a timeout on a socket
> occurs, socket.error is raised.  I rather liked the idea that a different
> exception was raised for timeouts (I used Tim O'Malley's timeout_socket
> module).  Making a TimeoutError exception a subclass of socket.error would
> be fine so you can catch it with existing code, but I could see recovering
> differently for a timeout as opposed to other possible errors:
>
>     sock.settimeout(5.0)
>     try:
>         data = sock.recv(8192)
>     except socket.TimeoutError:
>         # maybe requeue the request
>         ...
>     except socket.error, codes:
>         # some more drastic solution is needed
>         ...
>

This seems logical: the timeout is inherently different, so a separate
"except" seems better than having to analyze the reason for the socket error.

regards
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------






From fredrik@pythonware.com  Tue Jun 25 18:03:41 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 25 Jun 2002 19:03:41 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com> <20020625120951.B2207@ibook.distro.conectiva> <004d01c21c5d$b96c6460$ced241d5@hagrid> <20020625130216.B1837@ibook.distro.conectiva>
Message-ID: <016e01c21c6a$4402cae0$ced241d5@hagrid>

Gustavo wrote:

>     def put(self, item, block=1, **kw):
>         if block:
>             self.fsema.acquire()
>         elif not self.fsema.acquire(0):
>             raise Full
>         self.mutex.acquire()
>         was_empty = self._empty()
> # <- Priority could be handled here as well.
>         self._put(item, **kw)
>         if was_empty:
>             self.esema.release()
>         if not self._full():
>             self.fsema.release()
>         self.mutex.release()
> 
>     def _put(self, item, **kw):
> # <- But here seems better
>         priority = kw.get("priority", self.defaultpriority)
>         bisect.insort(self.queue, (priority, item))

or better:

    def put(self, item, block=1, priority=None):
        if priority is None:
            priority = self.defaultpriority
        Queue.Queue.put(self, (priority, item), block)

</F>




From martin@v.loewis.de  Tue Jun 25 19:18:02 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Jun 2002 20:18:02 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020625080929.GA39304@hishome.net>
References: <20020624213318.A5740@arizona.localdomain>
 <20020625065203.GA27183@hishome.net>
 <m3bs9zg6tc.fsf@mira.informatik.hu-berlin.de>
 <20020625080929.GA39304@hishome.net>
Message-ID: <m3elevxlwl.fsf@mira.informatik.hu-berlin.de>

Oren Tirosh <oren-py-d@hishome.net> writes:

> When I want to sort a list I just use .sort(). I don't care which
> algorithm is used. I don't care whether dictionaries are implemented
> using hash tables, some kind of tree structure or magic smoke.  I
> just trust Python to use a reasonably efficient implementation.

And nobody says you should think differently.

> I always find it funny when C++ or Perl programmers refer to an
> associative array as a "hash".

I agree.

> Heaps are a "standard algorithm" only from a CS point of view.  It doesn't
> have much to do with everyday programming.  

This has many different reasons: In the case of Python, the standard
.sort is indeed good for most applications. In general (including
Python), usage of heapsort is rare since it is difficult to implement
and not part of the standard library. Likewise, the naive priority
queue implementation is good in most cases.

If it were easier to use, I assume it would be used more often.

> Let's put it this way: If Python has an extension module in the standard 
> library implementing a sorted list, would you care enough about the
> specific binary heap implementation to go and write one or would you just
> use what you had in the library for a priority queue? ;-)

I don't understand this question: Why do I have to implement anything?
Having heapsort in the library precisely means that I do not have to
write an implementation.

Regards,
Martin



From martin@v.loewis.de  Tue Jun 25 19:19:23 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 25 Jun 2002 20:19:23 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020625134504.GB21633@panix.com>
References: <20020624213318.A5740@arizona.localdomain>
 <20020625065203.GA27183@hishome.net>
 <m3bs9zg6tc.fsf@mira.informatik.hu-berlin.de>
 <20020625134504.GB21633@panix.com>
Message-ID: <m3adpjxluc.fsf@mira.informatik.hu-berlin.de>

Aahz <aahz@pythoncraft.com> writes:

> Should this PEP be split in two, then?  One for a new "AbstractData"
> package (that would include the heap algorithm) and one for an update to
> Queue that would use some algorithm from AbstractData.  The latter might
> not even need a PEP.

I don't know. The author of the PEP would have the freedom to propose
anything initially. Depending on the proposal, people will comment, then
reorganizations might be necessary.

Regards,
Martin




From bac@OCF.Berkeley.EDU  Tue Jun 25 19:43:45 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Tue, 25 Jun 2002 11:43:45 -0700 (PDT)
Subject: [Python-Dev] New Subscriber Introduction
In-Reply-To: <3D1825D0.2070309@lemburg.com>
Message-ID: <Pine.SOL.4.44.0206251133130.12420-100000@death.OCF.Berkeley.EDU>

[M.-A. Lemburg]

> Just curious: have you taken a look at the mxDateTime parser ?
>
> It has a slightly different approach than strptime() but also
> takes a lot of load from the programmer in terms of not requiring
> a predefined format.

No.  I originally wrote strptime a year ago, and it was initially just a
hack.  I have just been fleshing it out over the past year.  Just last
month, after having taken a break from it, I realized how I could figure
out all the locale info on my own.  I also wanted to avoid any possible
license issues, so I did it completely from scratch.

As for your comment about not requiring a predefined format, I don't quite
follow what you mean.  Looking at mxDateTime's strptime, the only
difference in the possible parameters is the optional default for
mxDateTime.  Otherwise both mxDateTime's and my implementation have
exactly the same parameter requirements:
mxDateTime.strptime(string,format_string[,default])
strptime.strptime(data_string, format)

with string == data_string and format_string == format.

-Brett C.




From mal@lemburg.com  Tue Jun 25 19:56:04 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 25 Jun 2002 20:56:04 +0200
Subject: [Python-Dev] New Subscriber Introduction
References: <Pine.SOL.4.44.0206251133130.12420-100000@death.OCF.Berkeley.EDU>
Message-ID: <3D18BCC4.4020407@lemburg.com>

Brett Cannon wrote:
> [M.-A. Lemburg]
> 
> 
>>Just curious: have you taken a look at the mxDateTime parser ?
>>
>>It has a slightly different approach than strptime() but also
>>takes a lot of load from the programmer in terms of not requiring
>>a predefined format.
> 
> 
> No.  I originally wrote strptime a year ago, and it was initially just a
> hack.  I have just been fleshing it out over the past year.  Just last
> month, after having taken a break from it, I realized how I could figure
> out all the locale info on my own.  I also wanted to avoid any possible
> license issues, so I did it completely from scratch.

mxDateTime is part of egenix-mx-base, which is covered
by an open source license similar to that of Python (with less
fuss, though :-).

> As for your comment about not requiring a predefined format, I don't quite
> follow what you mean.  Looking at mxDateTime's strptime, the only
> difference in the possible parameters is the optional default for
> mxDateTime.  Otherwise both mxDateTime's and my implementation have
> exactly the same parameter requirements:
> mxDateTime.strptime(string,format_string[,default])
> strptime.strptime(data_string, format)
> 
> with string == data_string and format_string == format.

That's correct. I was referring to the mx.DateTime.Parser
module, which implements several different date/time parsers.

The basic interface is mx.DateTime.DateTimeFrom(string). No format
string is required.
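
Side by side, the two interfaces look roughly like this (a sketch,
assuming both egenix-mx-base and Brett's strptime module are importable):

    from mx import DateTime
    import strptime                      # Brett's pure-Python module

    dt = DateTime.DateTimeFrom("2002-08-21 14:30")                # no format needed
    tm = strptime.strptime("2002-08-21 14:30", "%Y-%m-%d %H:%M")  # explicit format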

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/
Meet us at EuroPython 2002:                 http://www.europython.org/




From bac@OCF.Berkeley.EDU  Tue Jun 25 20:06:26 2002
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Tue, 25 Jun 2002 12:06:26 -0700 (PDT)
Subject: [Python-Dev] New Subscriber Introduction
In-Reply-To: <3D18BCC4.4020407@lemburg.com>
Message-ID: <Pine.SOL.4.44.0206251202000.12420-100000@death.OCF.Berkeley.EDU>

[M.-A. Lemburg]

> mxDateTime is part of egenix-mx-base which is covered
> by an open source license similar to that of Python (with less
> fuzz, though :-).
>

Good to know.

> That's correct. I was refering to the mx.DateTime.Parser
> module, which implements several different date/time parsers.
>
> The basic interface is mx.DateTime.DateTimeFrom(string). No format
> string is required.

Ah, OK.  Well, that is handy, but since this is meant to be a drop-in
replacement for strptime, I don't think it is warranted here.  Perhaps
something like that could be put into Python when Guido starts putting in
new functions for the forthcoming new datetime type?

And I do agree that strptime is not needed most of the time.  But it is
there, so we might as well fix that non-portable wart.
-Brett C.




From kevin@koconnor.net  Tue Jun 25 23:07:59 2002
From: kevin@koconnor.net (Kevin O'Connor)
Date: Tue, 25 Jun 2002 18:07:59 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020625065203.GA27183@hishome.net>; from oren-py-d@hishome.net on Tue, Jun 25, 2002 at 02:52:03AM -0400
References: <20020624213318.A5740@arizona.localdomain> <20020625065203.GA27183@hishome.net>
Message-ID: <20020625180759.B5798@arizona.localdomain>

On Tue, Jun 25, 2002 at 02:52:03AM -0400, Oren Tirosh wrote:
> > Any chance something like this could make it into the standard python
> > library?  It would save a lot of time for lazy people like myself.  :-)
> 
> A sorted list is a much more general-purpose data structure than a priority
> queue and can be used to implement a priority queue. It offers almost the same 
> asymptotic performance:

Hi Oren,

I agree that some form of balanced tree object would be more useful, but
unfortunately one doesn't exist natively.  A pure Python implementation of
heaps is a pretty straightforward addition.
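
The core really is small; a rough sketch of the idea (not the code I
posted, just the usual sift-up/sift-down on a plain list):

def heappush(heap, item):
    # append at the end, then sift up toward the root
    heap.append(item)
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= heap[i]:
            break
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent

def heappop(heap):
    # pop the smallest item, then sift the last item down from the root
    last = heap.pop()
    if not heap:
        return last
    smallest, heap[0] = heap[0], last
    i, n = 0, len(heap)
    while 1:
        child = 2 * i + 1
        if child >= n:
            break
        if child + 1 < n and heap[child + 1] < heap[child]:
            child = child + 1
        if heap[i] <= heap[child]:
            break
        heap[i], heap[child] = heap[child], heap[i]
        i = child
    return smallest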

If, however, one were to consider adding C code then I would agree a tree
object would be more valuable.  As you surmised later, I wouldn't have
bothered with a heap if trees were available.

In fact, I've always wondered why Python dictionaries use the hash
algorithm instead of the more general binary tree algorithm.  :-}

-Kevin

-- 
 ------------------------------------------------------------------------
 | Kevin O'Connor                     "BTW, IMHO we need a FAQ for      |
 | kevin@koconnor.net                  'IMHO', 'FAQ', 'BTW', etc. !"    |
 ------------------------------------------------------------------------



From kevin@koconnor.net  Tue Jun 25 23:26:06 2002
From: kevin@koconnor.net (Kevin O'Connor)
Date: Tue, 25 Jun 2002 18:26:06 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <15639.60668.591252.466454@12-248-8-148.client.attbi.com>; from skip@pobox.com on Mon, Jun 24, 2002 at 11:09:32PM -0500
References: <20020624213318.A5740@arizona.localdomain> <15639.52945.388250.264216@12-248-8-148.client.attbi.com> <20020624225941.A5798@arizona.localdomain> <15639.60668.591252.466454@12-248-8-148.client.attbi.com>
Message-ID: <20020625182606.C5798@arizona.localdomain>

On Mon, Jun 24, 2002 at 11:09:32PM -0500, Skip Montanaro wrote:
> I don't know how efficient it would be, but I usually think that most
> applications have a small, fixed set of possible priorities, like ("low",
> "medium", "high") or ("info", "warning", "error", "fatal").  In this sort of
> situation my initial inclination would be to implement a dict of Queue
> instances which corresponds to the fixed set of priorities, something like:

Hi Skip,

The application I had in mind stored between 100,000 and 1,000,000 objects
with priorities between 0 and 150.  I found that moving from bisect to a heap
improved the performance of the entire program by about 25%.

>It will also work if for some reason you want
> to queue up objects for which __cmp__ doesn't make sense.

I just assumed the user would use the (priority, data) tuple trick at the
start (it does make the algorithm simpler).  In a way, the code is very
similar to the way the bisect module is implemented.

-Kevin

-- 
 ------------------------------------------------------------------------
 | Kevin O'Connor                     "BTW, IMHO we need a FAQ for      |
 | kevin@koconnor.net                  'IMHO', 'FAQ', 'BTW', etc. !"    |
 ------------------------------------------------------------------------



From tim.one@comcast.net  Wed Jun 26 04:58:24 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 25 Jun 2002 23:58:24 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020625180759.B5798@arizona.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCGECIAAAB.tim.one@comcast.net>

[Kevin O'Connor]
> ...
> In fact, I've always wondered why Python dictionaries use the hash
> algorithm instead of the more general binary tree algorithm.  :-}

Speed.  Zope and StandaloneZODB have a BTree package, which I've recently
spent a good amount of time optimizing.  Here's a timing driver:

"""
from time import clock as now

N = 1000000
indices = range(N)

def doit(constructor):
    d = constructor()
    t1 = now()
    for i in indices:
        d[i] = i
    t2 = now()
    for i in indices:
        assert d[i] == i
    t3 = now()
    for i in indices:
        del d[i]
    t4 = now()
    return t2-t1, t3-t2, t4-t3

def drive(constructor, n):
    print "Using", constructor.__name__, "on", N, "entries"
    for i in range(n):
        d1, d2, d3 = doit(constructor)
        print "construct %6.3f" % d1
        print "query     %6.3f" % d2
        print "remove    %6.3f" % d3

def dict():
    return {}

from BTrees.OOBTree import OOBTree
drive(OOBTree, 3)
drive(dict, 3)
"""

This is a little strained because I'm running it under Python 2.1.3.  This
favors the BTrees, because I also spent a lot of time optimizing Python's
dicts for the Python 2.2 release; 2.1 doesn't have that stuff.  OOBTrees are
most similar to Python's dicts, mapping objects to objects.  Here's a run:

Using OOBTree on 1000000 entries
construct  5.376
query      5.571
remove     4.065
construct  5.349
query      5.610
remove     4.211
construct  5.363
query      5.585
remove     4.374
Using dict on 1000000 entries
construct  1.411
query      1.336
remove     0.780
construct  1.382
query      1.335
remove     0.781
construct  1.376
query      1.334
remove     0.778

There's just no contest here.  BTrees have many other virtues, like
supporting range searches, and automatically playing nice with ZODB
persistence, but they're plain sluggish compared to dicts.  To be completely
fair and unfair at the same time <wink>, there are also 4 other flavors of
Zope BTree, purely for optimization reasons.  In particular, the IIBTree
maps Python ints to Python ints, and does so by avoiding Python int objects
altogether, storing C longs directly and comparing them at native "compare a
long to a long" C speed.  That's *almost* as fast as Python 2.1 int->int
dicts (which endure all-purpose Python object comparison), except for
deletion (the BTree spends a lot of time tearing apart all the tree pointers
again).

Now that's a perfectly height-balanced search tree that "chunks up" blocks
of keys for storage and speed efficiency, and rarely needs more than a
simple local adjustment to maintain balance.  I expect that puts it at the
fast end of what can be achieved with a balanced tree scheme.

The ABC language (which Guido worked on before Python) used AVL trees for
just about everything under the covers.  It's not a coincidence that Python
doesn't use balanced trees for anything <wink>.




From tim.one@comcast.net  Wed Jun 26 05:30:36 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 26 Jun 2002 00:30:36 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020625134504.GB21633@panix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECLAAAB.tim.one@comcast.net>

[Aahz]
> Should this PEP be split in two, then?  One for a new "AbstractData"
> package (that would include the heap algorithm) and one for an update to
> Queue that would use some algorithm from AbstractData.  The latter might
> not even need a PEP.

I'm chuckling, but to myself <wink>.  By the time you add all the bells and
whistles everyone may want out of "a priority queue", the interface gets so
frickin' complicated that almost everyone will ignore the library and call
bisect.insort() themself.

/F gives me a thread-safe Queue when I don't want to pay overheads for
enforcing mutual exclusion in 99% of my priority-queue apps.  Schemes that
store (priority, object) tuples to exploit lexicographic comparison are
convenient to code but a nightmare if priorities can ever be equal, and
object comparison can raise exceptions, or object comparison can be
expensive.  Sometimes I want a min-queue, other times a max-queue.
Sometimes I need efficient access to both ends.  About a month ago I needed
to write a priority queue that was especially efficient at adding thousands
of new entries in one gulp.

And so on.  It's easier to write appropriate code from scratch in Python
than to figure out how to *use* a package profligate enough to contain
canned solutions for all common and reasonable use cases.  People have been
known to gripe at the number of methods Python's simple little lists and
dicts have sprouted -- heh heh.

BTW, the Zope BTree may be a good candidate to fold into Python.  I'm not
sure.  It's a mountain of fairly sophisticated code with an interface so
rich that it's genuinely hard to learn how to use it as intended -- the
latter especially should appeal to just about everyone <wink>.




From aahz@pythoncraft.com  Wed Jun 26 05:40:20 2002
From: aahz@pythoncraft.com (Aahz)
Date: Wed, 26 Jun 2002 00:40:20 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCECLAAAB.tim.one@comcast.net>
References: <20020625134504.GB21633@panix.com> <LNBBLJKPBEHFEDALKOLCCECLAAAB.tim.one@comcast.net>
Message-ID: <20020626044020.GB11161@panix.com>

On Wed, Jun 26, 2002, Tim Peters wrote:
> [Aahz]
>>
>> Should this PEP be split in two, then?  One for a new "AbstractData"
>> package (that would include the heap algorithm) and one for an update to
>> Queue that would use some algorithm from AbstractData.  The latter might
>> not even need a PEP.
> 
> I'm chuckling, but to myself <wink>.  By the time you add all the
> bells and whistles everyone may want out of "a priority queue", the
> interface gets so frickin' complicated that almost everyone will
> ignore the library and call bisect.insort() themself.

Fair enough -- but I didn't really know about bisect myself.  Looking at
the docs for bisect, it says that the code might be best used as a
source code example.  I think that having a package to dump similar
kinds of code might be a good idea.  It's not a substitute for a CS
course, but...

> And so on.  It's easier to write appropriate code from scratch in Python
> than to figure out how to *use* a package profligate enough to contain
> canned solutions for all common and reasonable use cases.  People have been
> known to gripe at the number of methods Python's simple little lists and
> dicts have sprouted -- heh heh.

Actually, I was expecting that the Queue PEP would be dropped once the
AbstractData package got some momentum behind it.  I was just trying to be
a tiny bit subtle.  <wink>
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From skip@pobox.com  Wed Jun 26 06:23:13 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 26 Jun 2002 00:23:13 -0500
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <20020626044020.GB11161@panix.com>
References: <20020625134504.GB21633@panix.com>
 <LNBBLJKPBEHFEDALKOLCCECLAAAB.tim.one@comcast.net>
 <20020626044020.GB11161@panix.com>
Message-ID: <15641.20417.403770.873433@12-248-8-148.client.attbi.com>

    aahz> Fair enough -- but I didn't really know about bisect myself.
    aahz> Looking at the docs for bisect, it says that the code might be
    aahz> best used as a source code example.  

I always forget about it as well.  I just added /F's four-line PriorityQueue
class as an example in the bisect docs, and added a "seealso" in the Queue
module docs pointing at the bisect docs.

Skip



From python@rcn.com  Wed Jun 26 07:37:17 2002
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 26 Jun 2002 02:37:17 -0400
Subject: [Python-Dev] Xrange and Slices
Message-ID: <000d01c21cdb$eb03b720$91d8accf@othello>

Wild idea of the day:
Merge the code for xrange() into slice().
So that old code will work, make the word 'xrange' a synonym for 'slice'

>>> x = xrange(0,10,2)
>>> s = slice(0,10,2)
>>> [m for m in dir(x) if m not in dir(s)]
['__getitem__', '__iter__', '__len__']  
>>> [m for m in dir(s) if m not in dir(x)]
['__cmp__', 'start', 'step', 'stop']


Raymond Hettinger
'regnitteh dnomyar'[::-1]





From python@rcn.com  Wed Jun 26 08:36:21 2002
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 26 Jun 2002 03:36:21 -0400
Subject: [Python-Dev] Dict constructor
Message-ID: <008101c21ce4$2b504fc0$91d8accf@othello>

Second wild idea of the day:

The dict constructor currently accepts sequences where each element has
length 2, interpreted as a key-value pair.

Let's have it also accept sequences with elements of length 1, interpreted
as a key:None pair.

The benefit is that it provides a way to rapidly construct sets:

lowercase = dict('abcdefghijklmnopqrstuvwxyz')
if char in lowercase: ...

dict([key1, key2, key3, key1]).keys()  # eliminate duplicate keys
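
Until the constructor learns that trick, the same effect takes only a
few lines today (a sketch; set_from_keys is just a throwaway helper, not
a library function):

def set_from_keys(keys):
    d = {}
    for k in keys:
        d[k] = None
    return d

lowercase = set_from_keys('abcdefghijklmnopqrstuvwxyz')
uniq = set_from_keys(['spam', 'eggs', 'spam']).keys()   # duplicates gone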


Raymond Hettinger
'regnitteh dnomyar'[::-1]





From lellinghaus@yahoo.com  Wed Jun 26 09:21:35 2002
From: lellinghaus@yahoo.com (Lance Ellinghaus)
Date: Wed, 26 Jun 2002 01:21:35 -0700 (PDT)
Subject: [Python-Dev] posixmodule.c diffs for working forkpty() and openpty() under Solaris 2.8
Message-ID: <20020626082135.16733.qmail@web20905.mail.yahoo.com>

Hello everyone!
I had to get forkpty() and openpty() working under Solaris 2.8 for a
project I am working on.
Here are the diffs to the 2.2.1 source file.

Please let me know if anyone has any problems with this!

Lance Ellinghaus


=====
--
Lance Ellinghaus

[Attachment: posixmodule.c.diff]

*** Python-2.2.1/Modules/posixmodule.c	Tue Mar 12 16:38:31 2002
--- Python-2.2.1.new/Modules/posixmodule.c	Tue May 21 01:16:29 2002
***************
*** 1904,1910 ****
  }
  #endif
  
! #if defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY)
  #ifdef HAVE_PTY_H
  #include <pty.h>
  #else
--- 1904,1913 ----
  }
  #endif
  
! #if defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || defined(sun)
! #ifdef sun
! #include <sys/stropts.h>
! #endif
  #ifdef HAVE_PTY_H
  #include <pty.h>
  #else
***************
*** 1914,1920 ****
  #endif /* HAVE_PTY_H */
  #endif /* defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) */
  
! #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY)
  static char posix_openpty__doc__[] =
  "openpty() -> (master_fd, slave_fd)\n\
  Open a pseudo-terminal, returning open fd's for both master and slave end.\n";
--- 1917,1923 ----
  #endif /* HAVE_PTY_H */
  #endif /* defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) */
  
! #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(sun)
  static char posix_openpty__doc__[] =
  "openpty() -> (master_fd, slave_fd)\n\
  Open a pseudo-terminal, returning open fd's for both master and slave end.\n";
***************
*** 1925,1932 ****
  	int master_fd, slave_fd;
  #ifndef HAVE_OPENPTY
  	char * slave_name;
  #endif
! 
  	if (!PyArg_ParseTuple(args, ":openpty"))
  		return NULL;
  
--- 1928,1941 ----
  	int master_fd, slave_fd;
  #ifndef HAVE_OPENPTY
  	char * slave_name;
+ #ifdef sun
+         void *sig_saved;
  #endif
! #endif
! #if !defined(HAVE_OPENPTY) && !defined(HAVE__GETPTY) && defined(sun)
!         extern char *ptsname();
! #endif
!         
  	if (!PyArg_ParseTuple(args, ":openpty"))
  		return NULL;
  
***************
*** 1933,1939 ****
  #ifdef HAVE_OPENPTY
  	if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0)
  		return posix_error();
! #else
  	slave_name = _getpty(&master_fd, O_RDWR, 0666, 0);
  	if (slave_name == NULL)
  		return posix_error();
--- 1942,1948 ----
  #ifdef HAVE_OPENPTY
  	if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0)
  		return posix_error();
! #elif HAVE__GETPTY
  	slave_name = _getpty(&master_fd, O_RDWR, 0666, 0);
  	if (slave_name == NULL)
  		return posix_error();
***************
*** 1941,1946 ****
--- 1950,1966 ----
  	slave_fd = open(slave_name, O_RDWR);
  	if (slave_fd < 0)
  		return posix_error();
+ #else
+         master_fd = open("/dev/ptmx", O_RDWR|O_NOCTTY);  /* open master */
+         sig_saved = signal(SIGCHLD, SIG_DFL);
+         grantpt(master_fd);                     /* change permission of   slave */
+         unlockpt(master_fd);                    /* unlock slave */
+         signal(SIGCHLD,sig_saved);
+         slave_name = ptsname(master_fd);         /* get name of slave */
+         slave_fd = open(slave_name, O_RDWR);    /* open slave */
+         ioctl(slave_fd, I_PUSH, "ptem");       /* push ptem */
+         ioctl(slave_fd, I_PUSH, "ldterm");     /* push ldterm*/
+         ioctl(slave_fd, I_PUSH, "ttcompat");     /* push ttcompat*/
  #endif /* HAVE_OPENPTY */
  
  	return Py_BuildValue("(ii)", master_fd, slave_fd);
***************
*** 1948,1954 ****
  }
  #endif /* defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) */
  
! #ifdef HAVE_FORKPTY
  static char posix_forkpty__doc__[] =
  "forkpty() -> (pid, master_fd)\n\
  Fork a new process with a new pseudo-terminal as controlling tty.\n\n\
--- 1968,1974 ----
  }
  #endif /* defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) */
  
! #if defined(HAVE_FORKPTY) || defined(sun)
  static char posix_forkpty__doc__[] =
  "forkpty() -> (pid, master_fd)\n\
  Fork a new process with a new pseudo-terminal as controlling tty.\n\n\
***************
*** 1959,1968 ****
--- 1979,2067 ----
  posix_forkpty(PyObject *self, PyObject *args)
  {
  	int master_fd, pid;
+ #if defined(sun)
+         int slave;
+ 	char * slave_name;
+         void *sig_saved;
+         int fd;
+ #endif
  	
  	if (!PyArg_ParseTuple(args, ":forkpty"))
  		return NULL;
+ #if defined(sun)
+         master_fd = open("/dev/ptmx", O_RDWR|O_NOCTTY);  /* open master */
+         sig_saved = signal(SIGCHLD, SIG_DFL);
+         grantpt(master_fd);                     /* change permission of   slave */
+         unlockpt(master_fd);                    /* unlock slave */
+         signal(SIGCHLD,sig_saved);
+         slave_name = ptsname(master_fd);         /* get name of slave */
+         slave = open(slave_name, O_RDWR);    /* open slave */
+         ioctl(slave, I_PUSH, "ptem");       /* push ptem */
+         ioctl(slave, I_PUSH, "ldterm");     /* push ldterm*/
+         ioctl(slave, I_PUSH, "ttcompat");     /* push ttcompat*/
+         if (master_fd < 0 || slave < 0)
+         {
+             return posix_error();
+         }
+ 	switch (pid = fork()) {
+ 	case -1:	
+             return posix_error();
+ 	case 0:
+             /* First disconnect from the old controlling tty. */
+ #ifdef TIOCNOTTY
+             fd = open("/dev/tty", O_RDWR | O_NOCTTY);
+             if (fd >= 0) {
+ 		(void) ioctl(fd, TIOCNOTTY, NULL);
+ 		close(fd);
+             }
+ #endif /* TIOCNOTTY */
+             if (setsid() < 0)
+ 		return posix_error();
+             
+             /*
+              * Verify that we are successfully disconnected from the controlling
+              * tty.
+              */
+             fd = open("/dev/tty", O_RDWR | O_NOCTTY);
+             if (fd >= 0) {
+ 		close(fd);
+ 		return posix_error();
+             }
+             /* Make it our controlling tty. */
+ #ifdef TIOCSCTTY
+             if (ioctl(slave, TIOCSCTTY, NULL) < 0)
+ 		return posix_error();
+ #endif /* TIOCSCTTY */
+             fd = open(slave_name, O_RDWR);
+             if (fd < 0) {
+ 		return posix_error();
+             } else {
+ 		close(fd);
+             }
+             /* Verify that we now have a controlling tty. */
+             fd = open("/dev/tty", O_WRONLY);
+             if (fd < 0)
+ 		return posix_error();
+             else {
+ 		close(fd);
+             }
+             (void) close(master_fd);
+             (void) dup2(slave, 0);
+             (void) dup2(slave, 1);
+             (void) dup2(slave, 2);
+             if (slave > 2)
+                 (void) close(slave);
+             pid = 0;
+             break;
+           default:
+             /*
+              * parent
+              */
+             (void) close(slave);
+ 	}
+ #else
  	pid = forkpty(&master_fd, NULL, NULL, NULL);
+ #endif
  	if (pid == -1)
  		return posix_error();
  	if (pid == 0)
***************
*** 5607,5616 ****
  #ifdef HAVE_FORK
  	{"fork",	posix_fork, METH_VARARGS, posix_fork__doc__},
  #endif /* HAVE_FORK */
! #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY)
  	{"openpty",	posix_openpty, METH_VARARGS, posix_openpty__doc__},
  #endif /* HAVE_OPENPTY || HAVE__GETPTY */
! #ifdef HAVE_FORKPTY
  	{"forkpty",	posix_forkpty, METH_VARARGS, posix_forkpty__doc__},
  #endif /* HAVE_FORKPTY */
  #ifdef HAVE_GETEGID
--- 5706,5715 ----
  #ifdef HAVE_FORK
  	{"fork",	posix_fork, METH_VARARGS, posix_fork__doc__},
  #endif /* HAVE_FORK */
! #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(sun)
  	{"openpty",	posix_openpty, METH_VARARGS, posix_openpty__doc__},
  #endif /* HAVE_OPENPTY || HAVE__GETPTY */
! #if defined(HAVE_FORKPTY) || defined(sun)
  	{"forkpty",	posix_forkpty, METH_VARARGS, posix_forkpty__doc__},
  #endif /* HAVE_FORKPTY */
  #ifdef HAVE_GETEGID

--0-68967167-1025079695=:16014--



From sholden@holdenweb.com  Wed Jun 26 11:12:54 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Wed, 26 Jun 2002 06:12:54 -0400
Subject: [Python-Dev] Asyncore/asynchat
Message-ID: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com>

I thought I might try to add appropriate module documentation for asynchat.
This effective code doesn't get enough recognition (IMHO), partly because
you are forced to read the code to understand how to use it.

I notice that Sam Rushing's code tends to use spaces before the parentheses
around argument lists. Should I think about cleaning up the code at the same
time, or are we best letting sleeping dogs lie?

regards
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------






From barry@zope.com  Wed Jun 26 14:21:24 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 26 Jun 2002 09:21:24 -0400
Subject: [Python-Dev] Dict constructor
References: <008101c21ce4$2b504fc0$91d8accf@othello>
Message-ID: <15641.49108.839568.721853@anthem.wooz.org>

>>>>> "RH" == Raymond Hettinger <python@rcn.com> writes:

    RH> Second wild idea of the day:

    RH> The dict constructor currently accepts sequences where each
    RH> element has length 2, interpreted as a key-value pair.

    RH> Let's have it also accept sequences with elements of length 1,
    RH> interpreted as a key:None pair.

None might be an unfortunate choice because it would make dict.get()
less useful.  I'd prefer key:1

But of course it's fairly easy to construct either with a list
comprehension:

Python 2.2.1 (#1, May 31 2002, 18:34:35) 
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import string
>>> abc = string.letters[:26]
>>> dict([(c, 1) for c in abc])
{'a': 1, 'c': 1, 'b': 1, 'e': 1, 'd': 1, 'g': 1, 'f': 1, 'i': 1, 'h': 1, 'k': 1, 'j': 1, 'm': 1, 'l': 1, 'o': 1, 'n': 1, 'q': 1, 'p': 1, 's': 1, 'r': 1, 'u': 1, 't': 1, 'w': 1, 'v': 1, 'y': 1, 'x': 1, 'z': 1}
>>> dict([(c, None) for c in abc])
{'a': None, 'c': None, 'b': None, 'e': None, 'd': None, 'g': None, 'f': None, 'i': None, 'h': None, 'k': None, 'j': None, 'm': None, 'l': None, 'o': None, 'n': None, 'q': None, 'p': None, 's': None, 'r': None, 'u': None, 't': None, 'w': None, 'v': None, 'y': None, 'x': None, 'z': None}

pep-274-ly y'rs,
-Barry



From oren-py-d@hishome.net  Wed Jun 26 14:27:18 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 26 Jun 2002 09:27:18 -0400
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <000d01c21cdb$eb03b720$91d8accf@othello>
References: <000d01c21cdb$eb03b720$91d8accf@othello>
Message-ID: <20020626132718.GA57665@hishome.net>

On Wed, Jun 26, 2002 at 02:37:17AM -0400, Raymond Hettinger wrote:
> Wild idea of the day:
> Merge the code for xrange() into slice().
> So that old code will work, make the word 'xrange' a synonym for 'slice'

Nice idea.  Since xrange is the one more commonly used in everyday 
programming I'd say that slice should be an alias to xrange, not the other
way around.  The start, stop and step attributes to xrange would have to be
revived (what was the idea behind removing them in the first place?)

This would make it trivial to implement a __getitem__ that fully supports 
extended slice notation:

class Spam:
    def __getitem__(self, index):
        if isinstance(index, xrange):
            return [self[i] for i in index]
        else:
            pass  # ...handle an integer index here

Two strange things about xrange objects:
>>> xrange(1,100,2)
xrange(1, 101, 2)

It's been there since at least Python 2.0.  Hasn't anyone noticed this
bug before?

>>> dir(x)
[]
Shouldn't it have at least __class__, __repr__, etc and everything else
that object has?

	Oren



From pobrien@orbtech.com  Wed Jun 26 14:49:26 2002
From: pobrien@orbtech.com (Patrick K. O'Brien)
Date: Wed, 26 Jun 2002 08:49:26 -0500
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <20020626132718.GA57665@hishome.net>
Message-ID: <NBBBIOJPGKJEKIECEMCBMEACNGAA.pobrien@orbtech.com>

[Oren Tirosh]
>
> Two strange things about xrange objects:
> >>> xrange(1,100,2)
> xrange(1, 101, 2)
>
> It's been there since at least Python 2.0.  Hasn't anyone noticed this
> bug before?
>
> >>> dir(x)
> []
> Shouldn't it have at least __class__, __repr__, etc and everything else
> that object has?

What is x in your example? Assuming x == xrange, I get this with Python
2.2.1:

>>> dir(xrange)
['__call__', '__class__', '__cmp__', '__delattr__', '__doc__',
'__getattribute__', '__hash__', '__init__', '__name__', '__new__',
'__reduce__', '__repr__', '__self__', '__setattr__', '__str__']

Assuming x == xrange(1, 100, 2):

>>> x = xrange(1, 100, 2)
>>> dir(x)
PyCrust-Shell:1: DeprecationWarning: xrange object's 'start', 'stop' and
'step' attributes are deprecated
['start', 'step', 'stop', 'tolist']

--
Patrick K. O'Brien
Orbtech
-----------------------------------------------
"Your source for Python software development."
-----------------------------------------------
Web:  http://www.orbtech.com/web/pobrien/
Blog: http://www.orbtech.com/blog/pobrien/
Wiki: http://www.orbtech.com/wiki/PatrickOBrien
-----------------------------------------------




From walter@livinglogic.de  Wed Jun 26 14:55:38 2002
From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 26 Jun 2002 15:55:38 +0200
Subject: [Python-Dev] Dict constructor
References: <008101c21ce4$2b504fc0$91d8accf@othello> <15641.49108.839568.721853@anthem.wooz.org>
Message-ID: <3D19C7DA.3050509@livinglogic.de>

Barry A. Warsaw wrote:

>>>>>>"RH" == Raymond Hettinger <python@rcn.com> writes:
>>>>>
> 
>     RH> Second wild idea of the day:
> 
>     RH> The dict constructor currently accepts sequences where each
>     RH> element has length 2, interpreted as a key-value pair.
> 
>     RH> Let's have it also accept sequences with elements of length 1,
>     RH> interpreted as a key:None pair.
> 
> None might be an unfortunate choice because it would make dict.get()
> less useful.  I'd prefer key:1

How about key:True ?

Bye,
    Walter Dörwald




From thomas.heller@ion-tof.com  Wed Jun 26 15:19:37 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 26 Jun 2002 16:19:37 +0200
Subject: [Python-Dev] Dict constructor
References: <008101c21ce4$2b504fc0$91d8accf@othello>
Message-ID: <029a01c21d1c$80ad0e30$e000a8c0@thomasnotebook>

> Second wild idea of the day:
> 
> The dict constructor currently accepts sequences where each element has
> length 2, interpreted as a key-value pair.
> 
> Let's have it also accept sequences with elements of length 1, interpreted
> as a key:None pair.
> 
> The benefit is that it provides a way to rapidly construct sets:
> 

The downside is that it's another way to write programs
incompatible with 2.2.

Thomas




From David Abrahams" <david.abrahams@rcn.com  Wed Jun 26 15:14:25 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 26 Jun 2002 10:14:25 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
References: <20020624213318.A5740@arizona.localdomain><20020625065203.GA27183@hishome.net> <m3bs9zg6tc.fsf@mira.informatik.hu-berlin.de>
Message-ID: <1c4901c21d1b$e0511d50$6601a8c0@boostconsulting.com>

Also, in case nobody has said so, worst-case performance for insertion into
a large heap (log N) is much better than for insertion into a sorted list
(N). Of course, in practice, it takes a really large heap to notice these
effects.
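
To make the asymptotics concrete, here's a throwaway sketch (plain Python,
not the priority-queue code under discussion; the names are made up):

    import bisect

    def heap_push(heap, item):
        # Append at the end, then sift the new item up toward the root:
        # at most log2(len(heap)) comparisons and moves.
        heap.append(item)
        i = len(heap) - 1
        while i > 0:
            parent = (i - 1) // 2
            if heap[parent] <= item:
                break
            heap[i] = heap[parent]
            i = parent
        heap[i] = item

    def sorted_insert(lst, item):
        # The search is O(log N), but insort still shifts up to N
        # elements to make room, so each insert is O(N) overall.
        bisect.insort(lst, item)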

-Dave

From: "Martin v. Loewis" <martin@v.loewis.de>


> Oren Tirosh <oren-py-d@hishome.net> writes:
>
> > The only advantage of a heap is O(1) peek which doesn't seem so
> > critical.  It may also have somewhat better performance by a
> > constant factor because it uses an array rather than allocating node
> > structures.  But the internal order of a heap-based priority queue
> > is very non-intuitive and quite useless for other purposes while a
> > sorted list is, umm..., sorted!
>
> I think that heaps don't allocate additional memory is a valuable
> property, more valuable than the asymptotic complexity (which is also
> quite good). If you don't want to build priority queues, you can still
> use heaps to sort a list.
>
> IMO, heaps are so standard as an algorithm that they belong into the
> Python library, in some form. It is then the user's choice to use that
> algorithm or not.
>
> Regards,
> Martin





From barry@zope.com  Wed Jun 26 15:54:38 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 26 Jun 2002 10:54:38 -0400
Subject: [Python-Dev] Dict constructor
References: <008101c21ce4$2b504fc0$91d8accf@othello>
 <15641.49108.839568.721853@anthem.wooz.org>
 <3D19C7DA.3050509@livinglogic.de>
Message-ID: <15641.54702.652214.551556@anthem.wooz.org>

>>>>> "WD" =3D=3D Walter D=F6rwald <walter@livinglogic.de> writes:

    WD> How about key:True ?

Kids today, always with the newfangled gadgets.  :)
-Barry



From David Abrahams" <david.abrahams@rcn.com  Wed Jun 26 16:23:32 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 26 Jun 2002 11:23:32 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
References: <LNBBLJKPBEHFEDALKOLCGECIAAAB.tim.one@comcast.net>
Message-ID: <1d9601c21d25$d32a85d0$6601a8c0@boostconsulting.com>

This is really interesting. When I was at Dragon (well, actually, after Tim
left and it became L&H), I ported my natural language parsing/understanding
system from Python to C++ so it could run quickly enough for embedded
devices. The core of this system was an associative container, so I knew
that its performance would be crucial. I used C++ generics which made it
really easy to swap in different associative container implementations, and
I tried lots, including the red-black tree containers built into most C++
implementations, and hash tables. My experience was that trying to come up
with a hash function that would give a significant speed increase over the
tree containers was extremely difficult, because it was really hard to come
up with a good hash function. Furthermore, even if I succeeded, it was like
black magic: it was inconsistent across my test cases and there was no way
to understand why it worked well, and to get a feeling for how it would
scale to problems outside those cases. I ended up hand-coding a two-level
scheme based on binary searches in contiguous arrays which blew away
anything I'd been able to do with a hash table. My conclusion was that for
general-purpose use, the red-black tree was pretty good, despite its
relatively high memory overhead of 3 pointers per node: it places easy
requirements on the user (supply a strict weak ordering) and provides
predictable and smooth performance even asymptotically. On the other hand,
hashing requires that the user supply both a hash function and an equality
detector which must agree with one-another, requires hand-tuning of the
hash function for performance, and is rather more unpredictable. We've been
talking about adding hash-based containers to the C++ standard library but
I'm reluctant on these grounds. It seems to me that when you really care
about speed, some kind of hand-coded solution might be a better investment
than trying to come up with a good hash function.

I'm ready to believe that hashing is the most appropriate choice for
Python, but I wonder what makes the difference?

-Dave

From: "Tim Peters" <tim.one@comcast.net>

> There's just no contest here.  BTrees have many other virtues, like
> supporting range searches, and automatically playing nice with ZODB
> persistence, but they're plain sluggish compared to dicts.  To be completely
> fair and unfair at the same time <wink>, there are also 4 other flavors of
> Zope BTree, purely for optimization reasons.  In particular, the IIBTree
> maps Python ints to Python ints, and does so by avoiding Python int objects
> altogether, storing C longs directly and comparing them at native "compare a
> long to a long" C speed.  That's *almost* as fast as Python 2.1 int->int
> dicts (which endure all-purpose Python object comparison), except for
> deletion (the BTree spends a lot of time tearing apart all the tree pointers
> again).
>
> Now that's a perfectly height-balanced search tree that "chunks up" blocks
> of keys for storage and speed efficiency, and rarely needs more than a
> simple local adjustment to maintain balance.  I expect that puts it at the
> fast end of what can be achieved with a balanced tree scheme.
>
> The ABC language (which Guido worked on before Python) used AVL trees for
> just about everything under the covers.  It's not a coincidence that Python
> doesn't use balanced trees for anything <wink>.
>
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
>




From skip@pobox.com  Wed Jun 26 17:32:16 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 26 Jun 2002 11:32:16 -0500
Subject: [Python-Dev] Asyncore/asynchat
In-Reply-To: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com>
References: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com>
Message-ID: <15641.60560.844503.239136@12-248-8-148.client.attbi.com>

    Steve> I thought I might try to add appropriate module documentation for
    Steve> asynchat.  This effective code doesn't get enough recognition
    Steve> (IMHO), partly because you are forced to read the code to
    Steve> understand how to use it.

That would be a great idea.  Once I actually tried it, it was easy to work
with, but the lack of documentation does steepen the initial learning curve
a bit.

    Steve> I notice that Sam Rushing's code tends to use spaces before the
    Steve> parentheses around argument lists. Should I think about cleaning
    Steve> up the code at the same time, or are we best letting sleeping
    Steve> dogs lie?

I would let this particular sleeping dog lie.  I think the code in Python is
occasionally sync'd with Sam's code.  Changing the spacing would just add a
bunch of spurious differences and thus make that task more difficult.

Skip



From tim@zope.com  Wed Jun 26 17:36:16 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 26 Jun 2002 12:36:16 -0400
Subject: [Python-Dev] Asyncore/asynchat
In-Reply-To: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAEHODFAA.tim@zope.com>

[Steve Holden]
> I thought I might try to add appropriate module documentation for
> asynchat.

Cool!

> This effective code doesn't get enough recognition (IMHO), partly because
> you are forced to read the code to understand how to use it.
>
> I notice that Sam Rushing's code tends to use spaces before the
> parentheses around argument lists. Should I think about cleaning up the
> code at the same time, or are we best letting sleeping dogs lie?

You should feel free to clean it up, but not at the same time:  clean the
spaces in a distinct checkin dedicated to just that much, with a checkin
comment like "Whitespace normalization".




From jeremy@zope.com  Wed Jun 26 17:36:23 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Wed, 26 Jun 2002 12:36:23 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: <008101c21ce4$2b504fc0$91d8accf@othello>
References: <008101c21ce4$2b504fc0$91d8accf@othello>
Message-ID: <15641.60807.674692.84314@slothrop.zope.com>

>>>>> "RH" == Raymond Hettinger <python@rcn.com> writes:

  RH> Second wild idea of the day: The dict constructor currently
  RH> accepts sequences where each element has length 2, interpreted
  RH> as a key-value pair.

  RH> Let's have it also accept sequences with elements of length 1,
  RH> interpreted as a key:None pair.

That seems a little too magical to me.

  RH> Raymond Hettinger 'regnitteh dnomyar'[::-1]

Then again it seems like you like magic!

Jeremy




From python@rcn.com  Wed Jun 26 18:38:09 2002
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 26 Jun 2002 13:38:09 -0400
Subject: [Python-Dev] Dict constructor
References: <008101c21ce4$2b504fc0$91d8accf@othello> <15641.60807.674692.84314@slothrop.zope.com>
Message-ID: <001801c21d38$3d4e2220$56ec7ad1@othello>

From: "Jeremy Hylton" <jeremy@zope.com>
>   RH> Second wild idea of the day: The dict constructor currently
>   RH> accepts sequences where each element has length 2, interpreted
>   RH> as a key-value pair.
> 
>   RH> Let's have it also accept sequences with elements of length 1,
>   RH> interpreted as a key:None pair.
> 
> That seems a little too magical to me.

Fair enough.

> 
>   RH> Raymond Hettinger 'regnitteh dnomyar'[::-1]
> Then again it seems like you like magic!

While I'm a fan of performance magic, a la the Magic Castle,
the root of this suggestion is more mundane.  There are too
many pieces of code that test membership with 'if elem in container'
where the container is not a dictionary.  This results in O(n) 
performance rather than O(1).  To fix it, I found myself 
writing the same code over and over again:

  def _toset(container):
       return dict([(elem, True) for elem in container])
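
A throwaway usage sketch (names invented, just to show the payoff):

  >>> seen = _toset(['spam', 'eggs', 'spam'])
  >>> 'spam' in seen        # now an O(1) dictionary lookup
  True
  >>> 'ham' in seen
  False
  >>> len(seen)             # duplicates collapse, so it uniquifies too
  2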

This repeated dictionary construction exercise occurs in so many
guises that it would be worthwhile to provide a fast, less magical
looking approach.  Being able to construct dictionaries with
default values isn't exactly the most exotic idea ever proposed.

IMO, it's clearer, faster, commonly needed, and easy to implement.

'nuff said,


Raymond Hettinger





From oren-py-d@hishome.net  Wed Jun 26 19:30:59 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 26 Jun 2002 21:30:59 +0300
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <NBBBIOJPGKJEKIECEMCBMEACNGAA.pobrien@orbtech.com>; from pobrien@orbtech.com on Wed, Jun 26, 2002 at 08:49:26AM -0500
References: <20020626132718.GA57665@hishome.net> <NBBBIOJPGKJEKIECEMCBMEACNGAA.pobrien@orbtech.com>
Message-ID: <20020626213059.A7500@hishome.net>

On Wed, Jun 26, 2002 at 08:49:26AM -0500, Patrick K. O'Brien wrote:
> What is x in your example? Assuming x == xrange, I get this with Python
> 2.2.1:
> 
> >>> dir(xrange)
> ['__call__', '__class__', '__cmp__', '__delattr__', '__doc__',
> '__getattribute__', '__hash__', '__init__', '__name__', '__new__',
> '__reduce__', '__repr__', '__self__', '__setattr__', '__str__']
> 
> Assuming x == xrange(1, 100, 2):
> 
> >>> x = xrange(1, 100, 2)
> >>> dir(x)
> PyCrust-Shell:1: DeprecationWarning: xrange object's 'start', 'stop' and
> 'step' attributes are deprecated
> ['start', 'step', 'stop', 'tolist']

It's the latter (xrange instance, not the type).

I'm getting an empty dir() in the latest CVS version.  The result you got 
is what happens in 2.2 and 2.2.1.

	Oren



From tim@zope.com  Wed Jun 26 20:18:52 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 26 Jun 2002 15:18:52 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: <008101c21ce4$2b504fc0$91d8accf@othello>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOEIIDFAA.tim@zope.com>

[Raymond Hettinger]
> Second wild idea of the day:
>
> The dict constructor currently accepts sequences where each element has
> length 2, interpreted as a key-value pair.
>
> Let's have it also accept sequences with elements of length 1,
> interpreted as a key:None pair.

-1 because of ambiguity.  Is this trying to build a set with the single
element (42, 666), or a mapping of 42 to 666?

    dict([(42, 666)])

The same dilemma but perhaps subtler:

    dict(["ab", "cd", "ef"])


> The benefit is that it provides a way to rapidly construct sets:

I've got nothing against sets, but don't think we should push raw dicts any
closer to supporting them directly than they already are.  Better for
someone to take over Greg Wilson's PEP to add a new set module; I also note
that Zope/ZODB's BTree package supports BTree-based sets directly as a
distinct (from BTree-based mappings) datatype.




From python@rcn.com  Wed Jun 26 20:45:09 2002
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 26 Jun 2002 15:45:09 -0400
Subject: [Python-Dev] Dict constructor
References: <BIEJKCLHCIOIHAGOKOLHOEIIDFAA.tim@zope.com>
Message-ID: <009601c21d49$fb2acee0$56ec7ad1@othello>

From: "Tim Peters" <tim@zope.com>
> -1 because of ambiguity.  Is this trying to build a set with the single
> element (42, 666), or a mapping of 42 to 666?
>
>     dict([(42, 666)])

I've been thinking about this and the unambiguous explicit solution is to 
specify a value argument like dict.get().

>>> dict([(42, 666)])           # current behavior unchanged
{42: 666}

>>> dict([(42, 666)], True)
{(42, 666): True}

>>> dict( '0123456789abcdef', True)
{'a': True, 'c': True, 'b': True, 'e': True, 'd': True, 'f': True, '1':
True, '0': True, '3': True, '2': True, '5': True, '4': True, '7': True, '6':
True, '9': True, '8': True}

>>> dict('0123456789abcdef')    # current behavior unchanged
ValueError: dictionary update sequence element #0 has length 1; 2 is
required



The goal is not to provide full set behavior but to facilitate the common
task of building dictionaries with a constant value.  It comes up in
membership testing and in uniquifying sequences.  The task of dict() is to
construct dictionaries and this is a reasonably common construction.
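
In the meantime the proposed spelling is easy to fake with a tiny helper
(purely illustrative; the name is made up):

    def dictfill(seq, value=True):
        # roughly what dict(seq, value) would do under this proposal
        return dict([(k, value) for k in seq])

e.g. dictfill('0123456789abcdef') builds the table shown above.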


Raymond Hettinger





From tim@zope.com  Wed Jun 26 21:07:03 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 26 Jun 2002 16:07:03 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: <009601c21d49$fb2acee0$56ec7ad1@othello>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEINDFAA.tim@zope.com>

[Raymond Hettinger]
> I've been thinking about this and the unambiguous explicit solution is to
> specify a value argument like dict.get().
>
> >>> dict([(42, 666)])           # current behavior unchanged
> {42: 666}
>
> >>> dict([(42, 666)], True)
> {(42, 666): True}
>
> >>> dict( '0123456789abcdef', True)
> {'a': True, 'c': True, 'b': True, 'e': True, 'd': True, 'f': True, '1':
> True, '0': True, '3': True, '2': True, '5': True, '4': True, '7':
> True, '6': True, '9': True, '8': True}
>
> >>> dict('0123456789abcdef')    # current behavior unchanged
> ValueError: dictionary update sequence element #0 has length 1; 2 is
> required

That's better -- but I'd still rather see a set.

> The goal is not to provide full set behavior but to facilitate the
> common task of building dictionaries with a constant value.

The only dicts with constant values I've ever seen are simulating sets.

> It comes up in membership testing and in uniquifying sequences.

Those are indeed two common examples of using dicts to get at set
functionality.

> The task of dict() is to construct dictionaries and this is a
> reasonably common construction.

But only because there isn't a set type.




From gsw@agere.com  Wed Jun 26 21:22:06 2002
From: gsw@agere.com (Gerald S. Williams)
Date: Wed, 26 Jun 2002 16:22:06 -0400
Subject: [Python-Dev] List comprehensions
In-Reply-To: <20020626153218.1766.44879.Mailman@mail.python.org>
Message-ID: <GBEGLOMMCLDACBPKDIHFMEDCCJAA.gsw@agere.com>

Has anyone summarized the list comprehension design discussions? I found
references to "lots of discussion" about it but haven't yet found the
discussions themselves.

I don't want to rehash any old discussions, but I came across a surprise
recently while converting constructs like "map(lambda x:x+1,x)" and just
wanted to see the rationale behind not creating a local scope for list
comprehension variables.
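
For concreteness, here's the kind of surprise I mean (a throwaway 2.2
session, not code from anything real):

>>> x = [1, 2, 3]
>>> map(lambda x: x + 1, x)     # the lambda's x is its own local
[2, 3, 4]
>>> x
[1, 2, 3]
>>> [x + 1 for x in x]          # the list comprehension's x is not
[2, 3, 4]
>>> x
3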

Any pointer would be appreciated.

Thanks,

-Jerry



From python@rcn.com  Wed Jun 26 21:48:23 2002
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 26 Jun 2002 16:48:23 -0400
Subject: [Python-Dev] List comprehensions
References: <GBEGLOMMCLDACBPKDIHFMEDCCJAA.gsw@agere.com>
Message-ID: <00d901c21d52$d13488c0$56ec7ad1@othello>

From: "Gerald S. Williams" <gsw@agere.com>

> I don't want to rehash any old discussions, but I came across a surprise
> recently while converting constructs like "map(lambda x:x+1,x)" and just
> wanted to see the rationale behind not creating a local scope for list
> comprehension variables.

The idea was to make a = [expr(i) for i in seqn]; print i behave the same
as:

a = []
for i in seqn:
    a.append(expr(i))
print i  # i is in locals in its final loop state


Raymond Hettinger






From Jack.Jansen@oratrix.com  Wed Jun 26 21:30:43 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Wed, 26 Jun 2002 22:30:43 +0200
Subject: [Python-Dev] Dict constructor
In-Reply-To: <001801c21d38$3d4e2220$56ec7ad1@othello>
Message-ID: <969E2908-8943-11D6-A9BF-003065517236@oratrix.com>

On woensdag, juni 26, 2002, at 07:38 , Raymond Hettinger wrote:
>  To fix it, I found myself
> writing the same code over and over again:
>
>   def _toset(container):
>        return dict([(elem, True) for elem in container])
>
> This repeated dictionary construction exercise occurs in so many
> guises that it would be worthwhile to provide a fast, less magical
> looking approach.

I disagree on this being "magical", I tend to think of it as 
"Pythonic". If there is a reasonably easy to remember construct 
(such as this one: if you've seen it once you'll remember it) 
just use that, instead of adding extra layers of functionality. 
Moreover, this construct has lots of slight modifications that 
are useful in slightly different situations (i.e. don't put True 
in the value but something else), and people will "magically" 
understand these if they've seen this one.

What I could imagine would be nice is a warning if you're doing 
inefficient "in" operations. But I guess this would have to be 
done in the interpreter itself (I don't think pychecker could do 
this, or could it?), and the definition of "inefficient" is 
going to be difficult ("if your program has done more than N1 in 
operations on a data structure with more than N2 items in it and 
these took an average of O(N1*N2/2) compares", and keep that 
information per object).
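
Outside the interpreter you could get a rough per-object approximation,
something like this (pure illustration, thresholds picked out of thin air):

import warnings

class NoisyList(list):
    # A list that warns after "too many" linear membership tests.
    def __init__(self, *args):
        list.__init__(self, *args)
        self._in_count = 0
    def __contains__(self, item):
        self._in_count += 1
        if self._in_count * len(self) > 1000000:    # the N1*N2 guess
            warnings.warn("lots of O(n) 'in' tests; consider a dict")
        return list.__contains__(self, item)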
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- 
Emma Goldman -




From fredrik@pythonware.com  Wed Jun 26 22:20:48 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 26 Jun 2002 23:20:48 +0200
Subject: [Python-Dev] Dict constructor
References: <969E2908-8943-11D6-A9BF-003065517236@oratrix.com>
Message-ID: <001f01c21d57$5a5782c0$ced241d5@hagrid>

jack wrote:
> I disagree on this being "magical", I tend to think of it as 
> "Pythonic". If there is a reasonably easy to remember construct 
> (such as this one: if you've seen it once you'll remember it) 
> just use that, in stead of adding extra layers of functionality. 

to quote a certain bot (guess raymond wasn't following that
thread):

    "It's easier to write appropriate code from scratch in Python
    than to figure out how to *use* a package profligate enough
    to contain canned solutions for all common and reasonable
    use cases."

time to add a best_practices module to Python 2.3?

KeyError:-profligate-ly yrs /F




From gsw@agere.com  Wed Jun 26 22:38:22 2002
From: gsw@agere.com (Gerald S. Williams)
Date: Wed, 26 Jun 2002 17:38:22 -0400
Subject: [Python-Dev] List comprehensions
In-Reply-To: <00d901c21d52$d13488c0$56ec7ad1@othello>
Message-ID: <GBEGLOMMCLDACBPKDIHFCEDDCJAA.gsw@agere.com>

Raymond Hettinger wrote:
> The idea was to make a = [expr(i) for i in seqn]; print i behave ...

No problem. As long as it was decided that there's a use for
the current behavior, I won't question it.

-Jerry



From skip@pobox.com  Wed Jun 26 22:57:53 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 26 Jun 2002 16:57:53 -0500
Subject: [Python-Dev] List comprehensions
In-Reply-To: <GBEGLOMMCLDACBPKDIHFMEDCCJAA.gsw@agere.com>
References: <20020626153218.1766.44879.Mailman@mail.python.org>
 <GBEGLOMMCLDACBPKDIHFMEDCCJAA.gsw@agere.com>
Message-ID: <15642.14561.397461.86661@12-248-8-148.client.attbi.com>

    Jerry> Has anyone summarized the list comprehension design discussions?
    Jerry> I found references to "lots of discussion" about it but haven't
    Jerry> yet found the discussions themselves.

Jerry,

I think list comprehensions were just about the last major feature to be
added to the language before PEPs became the absolute way to hash stuff out.
Here's a comment from Guido dated 2000-08-11:

    Go for it! (This must be unique -- the PEP still hasn't been finished,
    and the code is already accepted. :-)

Consequently, the PEP (202) never did really get fleshed out.

As I recall, it went something like:

    1. Buncha discussion in c.l.py.  Check out this thread begun by Greg
       Ewing from August 1998:

        http://groups.google.com/groups?dq=&hl=en&lr=&ie=UTF-8&oe=utf-8&threadm=35C7E33C.4B14%40cosc.canterbury.ac.nz&rnum=1&prev=/groups%3Fq%3Dg:thl4020484492d%26dq%3D%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3Dutf-8%26selm%3D35C7E33C.4B14%2540cosc.canterbury.ac.nz

       With a little more aggressive use of the time machine, Tim & Greg
       could maybe have snuck them into 1.5.2!

    2. Greg implemented them as a proof of concept and then they
       languished. 

    3. I picked them up in mid-2000 and got the ball rolling on
       getting them into 2.0.

    4. They got accepted in August 2000.

The last couple steps happened while the 1.6/2.0/CNRI/BeOpen stuff was going
on, so I'm pretty sure no summary of the discussions took place.  If you're
looking for a significant thread, I'd start in the python-dev archives
around April or May 2000.  You might also want to check the comments in the
patch:

    http://python.org/sf/400654

I don't believe the issue of variable scope ever came up until after 2.0 was
released.  I certainly thought of them as just shorthand notation for for
loops.  (Maybe it was discussed in the 1998 thread, but I'm not about to
read all 103 articles. ;-)

Skip



From fredrik@pythonware.com  Wed Jun 26 23:05:29 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 27 Jun 2002 00:05:29 +0200
Subject: [Python-Dev] List comprehensions
References: <20020626153218.1766.44879.Mailman@mail.python.org>        <GBEGLOMMCLDACBPKDIHFMEDCCJAA.gsw@agere.com> <15642.14561.397461.86661@12-248-8-148.client.attbi.com>
Message-ID: <002901c21d5d$96f1aca0$ced241d5@hagrid>

skip wrote:

> Consequently, the PEP (202) never did really get fleshed out.

despite the fact that PEP 202 is marked as final, maybe relevant
portions of this thread could be added to it?

(so we can reply RTFP the next time someone stumbles upon this)

>     1. Buncha discussion in c.l.py.  Check out this thread begun by Greg
>        Ewing from August 1998:

in case someone would like to add this to the PEP, that URL can
be shortened to:

    http://groups.google.com/groups?threadm=35C7E33C.4B14%40cosc.canterbury.ac.nz

</F>




From tim@zope.com  Wed Jun 26 23:09:46 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 26 Jun 2002 18:09:46 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <1d9601c21d25$d32a85d0$6601a8c0@boostconsulting.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHMEJGDFAA.tim@zope.com>

[David Abrahams]
> This is really interesting. When I was at Dragon (well, actually,
> after Tim left and it became L&H), I ported my natural language
> parsing/understanding system from Python to C++ so it could run
> quickly enough for embedded devices.  The core of this system was an
> associative container, so I knew that its performance would be
> crucial.  I used C++ generics which made it really easy to swap in
> different associative container implementations, and I tried lots,
> including the red-black tree containers built into most C++
> implementations, and hash tables. My experience was that trying to
> come up with a hash function that would give a significant speed
> increase over the tree containers was extremely difficult, because it
> was really hard to come up with a good hash function.

There's more to a speedy hash implementation than just the hash function, of
course.

> Furthermore, even if I succeeded, it was like black magic: it was
> inconsistent across my test cases and there was no way to understand
> why it worked well, and to get a feeling for how it would scale to
> problems outside those cases.

Python's dictobject.c and .h have extensive comments about how Python's
dicts work.  Their behavior isn't mysterious, at least not after 10 years of
thinking about it <0.9 wink>.  Python's dicts also use tricks that I've
never seen in print -- many people have contributed clever tricks.

> I ended up hand-coding a two-level scheme based on binary searches in
> contiguous arrays which blew away anything I'd been able to do with a
> hash table. My conclusion was that for general-purpose use, the red-
> black tree was pretty good, despite its relatively high memory overhead
> of 3 pointers per node:

The example I posted built a mapping with a million entries.  A red-black
tree of that size needs to chase between 20 and 40 pointers to determine
membership.  By sheer instruction count alone, that's way more instructions
than the Python dict usually has to do, although it's comparable to the
number the BTree had to do.  The BTree probably has better cache behavior
than a red-black tree; for example, all the keys in the million-element
example were on the third level, and a query required exactly two
pointer-chases to get to the right leaf bucket.  All the rest is binary
search on 60-120 element contiguous vectors (in fact, sounds a lot like your
custom "two-level scheme")

> > it places easy requirements on the user (supply a strict weak ordering)
> and provides predictable and smooth performance even asymptotically.

OTOH, it can be very hard to write an efficient, correct "<" ordering, while
testing just "equal or not?" can be easier and run quicker than that.  Dict
comparison is a good example from the Python core:  computing "<" for dicts
is a nightmare, but computing "==" for dicts is easy (contrast the
straightforward dict_equal() with the brain-busting dict_compare() +
characterize() pair).  This was one of the motivations for introducing "rich
comparisons".

> On the other hand, hashing requires that the user supply both a hash
> function and an equality detector which must agree with one-another,

I've rarely found this to be a challenge.  For example, for sets that
contain sets as elements, a suitable hash function can simply xor the hash
codes of the set elements.  Since equal sets have equal elements, such a
scheme delivers equal hash codes for equal sets, and independent of the
order in which set elements get enumerated.  In contrast, defining a
sensible *total* ordering on sets is a delicate undertaking (yes, I know
that "strict weak ordering" is weaker than "total", but in real life you
couldn't buy a hot dog with the difference <wink>).
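
In code, the whole trick is just a few lines (a sketch, treating a set as
any iterable of hashable elements):

    def hash_set(s):
        # xor is commutative and associative, so the enumeration order
        # of the elements can't affect the result.
        h = 0
        for elem in s:
            h ^= hash(elem)
        return h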

> requires hand-tuning of the hash function for performance, and is rather
> more unpredictable.  We've been talking about adding hash-based
> containers to the C++ standard library but I'm reluctant on these
> grounds. It seems to me that when you really care about speed, some kind
> of hand-coded solution might be a better investment than trying to come
> up with a good hash function.
>
> I'm ready to believe that hashing is the most appropriate choice for
> Python, but I wonder what makes the difference?

Well, I'm intimately familiar with the details of how Python dicts and Zope
BTrees are implemented, down to staring at the machine instructions
generated, and there's no mystery here to me.  I'm not familiar with any of
the details of what you tried.  Understanding speed differences at this
level isn't a "general principles" kind of discussion.

I should note that Zope's BTrees pay a lot for playing nice with
persistence, about a factor of two:  upon visiting and leaving each BTree
node, there are messy test+branch sequences ensuring that the object isn't a
ghost, notifying the persistence machinery that fields have been accessed
and/or changed, and telling the persistence machinery when the object is no
longer in active use.  Most of these bookkeeping operations can fail too, so
there's another layer of "and did that fail?" test+branches around all that.
The saving grace for BTrees (why this doesn't cost a factor of, say, 10) is
that each BTree node contains a fair amount of "stuff", so that the guts of
each function can do a reasonable amount of useful work.  The persistence
overhead could be a disaster if visiting an object only moved one bit closer
to the result.

But Python's dicts aren't aware of persistence at all, and that did give
dicts an ~= factor-of-2 advantage in the example.  While they're still not
as zippy as dicts after factoring that out, B-Trees certainly aren't pigs.

BTW, note that Python's dicts originally catered only to string keys, as
they were the implementation of Python's namespaces, and dicts remain highly
optimized for that specific purpose.  Indeed, there's a distinct dict lookup
routine dedicated to dicts with string keys.  Namespaces have no compelling
use for range search or lexicographic traversal, just association, and peak
lookup speed was the design imperative.




From tim.one@comcast.net  Thu Jun 27 00:29:42 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 26 Jun 2002 19:29:42 -0400
Subject: [Python-Dev] List comprehensions
In-Reply-To: <GBEGLOMMCLDACBPKDIHFCEDDCJAA.gsw@agere.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEFOAAAB.tim.one@comcast.net>

[Gerald S. Williams, on listcomp (non)scopes]
> No problem. As long as it was decided that there's a use for
> the current behavior, I won't question it.

I'm not sure there's a use for it, but I am sure I'd shoot any coworker who
found one and relied on it <wink>.  Python didn't have lexical scoping at
the time listcomps were getting hammered out, and it would have been nuts to
introduce a "local scope" for a single, isolated construct.  Then and now,
the semantics of listcomps can be exactly explained via a straightforward
transformation to a for-loop.  Now that we have lexical scoping, a local
index vrbl would also be easy to explain -- but it would be "a change".




From David Abrahams" <david.abrahams@rcn.com  Thu Jun 27 00:20:21 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Wed, 26 Jun 2002 19:20:21 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
References: <BIEJKCLHCIOIHAGOKOLHMEJGDFAA.tim@zope.com>
Message-ID: <222401c21d68$bc1f3000$6601a8c0@boostconsulting.com>

From: "Tim Peters" <tim@zope.com>

> There's more to a speedy hash implementation than just the hash function,
> of course.

'course.

> > Furthermore, even if I succeeded, it was like black magic: it was
> > inconsistent across my test cases and there was no way to understand
> > why it worked well, and to get a feeling for how it would scale to
> > problems outside those cases.
>
> Python's dictobject.c and .h have extensive comments about how Python's
> dicts work.  Their behavior isn't mysterious, at least not after 10 years
> of thinking about it <0.9 wink>.  Python's dicts also use tricks that I've
> never seen in print -- many people have contributed clever tricks.

I noticed that, and I think the next time I try hashing I'm going to steal
as much as possible from Python's implementation to get a head start.

Noticing that also left me with a question: how come everybody in the world
hasn't stolen as much as possible from the Python hashing implementation?
Are there a billion such 10-years'-tweaked implementations lying around
which all perform comparably well?

> The example I posted built a mapping with a million entries.  A red-black
> tree of that size needs to chase between 20 and 40 pointers to determine
> membership.  By sheer instruction count alone, that's way more instructions
> than the Python dict usually has to do, although it's comparable to the
> number the BTree had to do.  The BTree probably has better cache behavior
> than a red-black tree; for example, all the keys in the million-element
> example were on the third level, and a query required exactly two
> pointer-chases to get to the right leaf bucket.  All the rest is binary
> search on 60-120 element contiguous vectors (in fact, sounds a lot like your
> custom "two-level scheme")

Yeah, I think it ended up being something like that. Of course, the
container I ended up with used domain-specific knowledge which would have
been inappropriate for general-purpose use.

> > it places easy requirements on the user (supply a strict weak ordering)
> > and provides predictable and smooth performance even asymptotically.
>
> OTOH, it can be very hard to write an efficient, correct "<" ordering, while
> testing just "equal or not?" can be easier and run quicker than that.  Dict
> comparison is a good example from the Python core:  computing "<" for dicts
> is a nightmare, but computing "==" for dicts is easy (contrast the
> straightforward dict_equal() with the brain-busting dict_compare() +
> characterize() pair).

Well, OK, ordering hash tables is hard, unless the bucket count is a
deterministic function of the element count. If they were sorted
containers, of course, < would be a simple matter. And I assume that
testing equality still involves a lot of hashing...

Hmm, looking at the 3 C++ implementations of hashed containers that I have
available to me, only one provides operator<(), which is rather strange
since the other two implement operator== by first comparing sizes, then
iterating through consecutive elements of each set looking for a
difference. The implementation supplying operator<() uses a (IMO misguided)
design that rehashes incrementally, but it seems to me that if the more
straightforward approaches can implement operator==() as described,
operator<() shouldn't have to be a big challenge for an everyday hash
table.

I'm obviously missing something, but what...?

> This was one of the motivations for introducing "rich
> comparisons".

I don't see how that helps. Got a link? Or a clue?

> > On the other hand, hashing requires that the user supply both a hash
> > function and an equality detector which must agree with one-another,
>
> I've rarely found this to be a challenge.  For example, for sets that
> contain sets as elements, a suitable hash function can simply xor the hash
> codes of the set elements.  Since equal sets have equal elements, such a
> scheme delivers equal hash codes for equal sets, and independent of the
> order in which set elements get enumerated.  In contrast, defining a
> sensible *total* ordering on sets is a delicate undertaking (yes, I know
> that "strict weak ordering" is weaker than "total", but in real life you
> couldn't buy a hot dog with the difference <wink>).

I don't know what that means. If you represent your sets as sorted
containers, getting a strict weak ordering on sets is trivial; you just do
it with a lexicographical comparison of the two sequences.

> > I'm ready to believe that hashing is the most appropriate choice for
> > Python, but I wonder what makes the difference?
>
> Well, I'm intimately familiar with the details of how Python dicts and Zope
> BTrees are implemented, down to staring at the machine instructions
> generated, and there's no mystery here to me.  I'm not familiar with any of
> the details of what you tried.  Understanding speed differences at this
> level isn't a "general principles" kind of discussion.

No, I suppose not. But python's dicts are general-purpose containers, and
you can put any key you like in there. It's still surprising to me given my
(much less than 10 years') experience with hash implementations that you
can design something that performs well over all those different cases.

> I should note that Zope's BTrees pay a lot for playing nice with
> persistence, about a factor of two:  upon visiting and leaving each BTree
> node, there are messy test+branch sequences ensuring that the object isn't a
> ghost, notifying the persistence machinery that fields have been accessed
> and/or changed, and telling the persistence machinery when the object is no
> longer in active use.  Most of these bookkeeping operations can fail too, so
> there's another layer of "and did that fail?" test+branches around all that.

Aww, heck, you just need a good C++ exception-handling implementation to
get rid of the error-checking overheads ;-)

> The saving grace for BTrees (why this doesn't cost a factor of, say, 10) is
> that each BTree node contains a fair amount of "stuff", so that the guts of
> each function can do a reasonable amount of useful work.  The persistence
> overhead could be a disaster if visiting an object only moved one bit closer
> to the result.
>
> But Python's dicts aren't aware of persistence at all, and that did give
> dicts an ~= factor-of-2 advantage in the example.  While they're still not
> as zippy as dicts after factoring that out, B-Trees certainly aren't pigs.
>
> BTW, note that Python's dicts originally catered only to string keys, as
> they were the implementation of Python's namespaces, and dicts remain highly
> optimized for that specific purpose.  Indeed, there's a distinct dict lookup
> routine dedicated to dicts with string keys.  Namespaces have no compelling
> use for range search or lexicographic traversal, just association, and peak
> lookup speed was the design imperative.

Thanks for the perspective!

still-learning-ly y'rs,
dave





From greg@cosc.canterbury.ac.nz  Thu Jun 27 03:38:12 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 27 Jun 2002 14:38:12 +1200 (NZST)
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <20020626132718.GA57665@hishome.net>
Message-ID: <200206270238.g5R2cC825570@oma.cosc.canterbury.ac.nz>

Oren Tirosh <oren-py-d@hishome.net>:

> Since xrange is the one more commonly used in everyday 
> programming I'd say that slice should be an alias to xrange, not the other
> way around.

I was about to yell "No, don't do that, slice is a type!"
when I decided I'd better make sure that's true...
and found that it's NOT!

Python 2.2 (#14, May 28 2002, 14:11:27) 
[GCC 2.95.2 19991024 (release)] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> slice
<built-in function slice>
>>> s = slice(1,2,3)
>>> s.__class__
<type 'slice'>
>>> 

So... why *isn't* slice == <type 'slice'>?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From skip@pobox.com  Thu Jun 27 03:50:48 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 26 Jun 2002 21:50:48 -0500
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <200206270238.g5R2cC825570@oma.cosc.canterbury.ac.nz>
References: <20020626132718.GA57665@hishome.net>
 <200206270238.g5R2cC825570@oma.cosc.canterbury.ac.nz>
Message-ID: <15642.32136.631168.24453@12-248-8-148.client.attbi.com>

    Greg> So... why *isn't* slice == <type 'slice'>?

I suspect nobody at PythonLabs currently has any spare round tuits.

Skip



From oren-py-d@hishome.net  Thu Jun 27 05:37:47 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 27 Jun 2002 00:37:47 -0400
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <200206270238.g5R2cC825570@oma.cosc.canterbury.ac.nz>
References: <20020626132718.GA57665@hishome.net> <200206270238.g5R2cC825570@oma.cosc.canterbury.ac.nz>
Message-ID: <20020627043747.GA80339@hishome.net>

On Thu, Jun 27, 2002 at 02:38:12PM +1200, Greg Ewing wrote:
> Oren Tirosh <oren-py-d@hishome.net>:
> 
> > Since xrange is the one more commonly used in everyday 
> > programming I'd say that slice should be an alias to xrange, not the other
> > way around.
> 
> I was about to yell "No, don't do that, slice is a type!"
> when I decided I'd better make sure that's true...
> and found that it's NOT!

It is in the latest CVS and so is xrange.

	Oren




From tim.one@comcast.net  Thu Jun 27 05:44:36 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 27 Jun 2002 00:44:36 -0400
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <20020626132718.GA57665@hishome.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEGOAAAB.tim.one@comcast.net>

[Oren Tirosh]
> ...
> The start, stop and step attributes to xrange would have to be
> revived (what was the idea behind removing them in the first place?)

A futile attempt at bloat reduction.  At the time, there was more code in
Python to support unused xrange embellishments than there was to support
generators.

> ...
> >>> xrange(1,100,2)
> xrange(1, 101, 2)
>
> It's been there since at least Python 2.0.  Hasn't anyone noticed this
> bug before?

It's been that way since xrange() was introduced, but nobody *called* it a
bug before.  The two expressions are equivalent:

>>> list(xrange(1, 100, 2)) == list(xrange(1, 101, 2))
True
>>>

[Greg Ewing]
> ...
> So... why *isn't* slice == <type 'slice'>?

It is in current CVS Python, but still range != <type 'range'>, and won't
until someone cares enough to change it.




From oren-py-d@hishome.net  Thu Jun 27 08:00:53 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 27 Jun 2002 03:00:53 -0400
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEGOAAAB.tim.one@comcast.net>
References: <20020626132718.GA57665@hishome.net> <LNBBLJKPBEHFEDALKOLCAEGOAAAB.tim.one@comcast.net>
Message-ID: <20020627070053.GA96670@hishome.net>

On Thu, Jun 27, 2002 at 12:44:36AM -0400, Tim Peters wrote:
> > >>> xrange(1,100,2)
> > xrange(1, 101, 2)
> >
> > It's been there since at least Python 2.0.  Hasn't anyone noticed this
> > bug before?
> 
> It's been that way since xrange() was introduced, but nobody *called* it a
> bug before.  The two expressions are equivalent:
> 
> >>> list(xrange(1, 100, 2)) == list(xrange(1, 101, 2))
> True

I found that seconds after hitting 'y'...

> [Greg Ewing]
> > ...
> > So... why *isn't* slice == <type 'slice'>?
> 
> It is in current CVS Python, but still range != <type 'range'>, and won't
> until someone cares enough to change it.

There is no spoo^H^H^H^H <type 'range'>.  
xrange is <type 'xrange'> and range is <built-in function range>.

	Oren




From oren-py-d@hishome.net  Thu Jun 27 08:06:03 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 27 Jun 2002 03:06:03 -0400
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <000d01c21cdb$eb03b720$91d8accf@othello>
References: <000d01c21cdb$eb03b720$91d8accf@othello>
Message-ID: <20020627070603.GB96670@hishome.net>

On Wed, Jun 26, 2002 at 02:37:17AM -0400, Raymond Hettinger wrote:
> Wild idea of the day:
> Merge the code for xrange() into slice().
> So that old code will work, make the word 'xrange' a synonym for 'slice'

It looks possible, but it will hurt the performance of xrange.  Internally, 
xrange uses C longs while slice uses python objects with all the associated 
overhead.  

	Oren



From gmcm@hypernet.com  Thu Jun 27 12:06:29 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Thu, 27 Jun 2002 07:06:29 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <222401c21d68$bc1f3000$6601a8c0@boostconsulting.com>
Message-ID: <3D1AB975.3954.1FBF483@localhost>

On 26 Jun 2002 at 19:20, David Abrahams wrote:

[Python's hashing]

> Noticing that also left me with a question: how
> come everybody in the world hasn't stolen as much as
> possible from the Python hashing implementation?
> Are there a billion such 10-years'-tweaked
> implementations lying around which all perform
> comparably well? 

Jean-Claude Wippler and Christian Tismer did some
benchmarks against other implementations. IIRC, the
only one in the same ballpark was Lua's (which, IIRC,
was faster at least under some conditions).

-- Gordon
http://www.mcmillan-inc.com/




From sholden@holdenweb.com  Thu Jun 27 12:15:32 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Thu, 27 Jun 2002 07:15:32 -0400
Subject: [Python-Dev] Dict constructor
References: <BIEJKCLHCIOIHAGOKOLHOEIIDFAA.tim@zope.com> <009601c21d49$fb2acee0$56ec7ad1@othello>
Message-ID: <12c301c21dcb$f420ebc0$6300000a@holdenweb.com>

----- Original Message -----
From: "Raymond Hettinger" <python@rcn.com>
To: <python-dev@python.org>
Sent: Wednesday, June 26, 2002 3:45 PM
Subject: Re: [Python-Dev] Dict constructor


> From: "Tim Peters" <tim@zope.com>
> > -1 because of ambiguity.  Is this trying to build a set with the single
> > element (42, 666), or a mapping of 42 to 666?
> >
> >     dict([(42, 666)])
>
> I've been thinking about this and the unambiguous explicit solution is to
> specify a value argument like dict.get().
>
> >>> dict([(42, 666)])           # current behavior unchanged
> {42: 666}
>
> >>> dict([(42, 666)], True)
> {(42, 666): True}
>
> >>> dict( '0123456789abcdef', True)
> {'a': True, 'c': True, 'b': True, 'e': True, 'd': True, 'f': True, '1':
> True, '0': True, '3': True, '2': True, '5': True, '4': True, '7': True, '6':
> True, '9': True, '8': True}
>
> >>> dict('0123456789abcdef')    # current behavior unchanged
> ValueError: dictionary update sequence element #0 has length 1; 2 is
> required
>
>
>
> The goal is not to provide full set behavior but to facilitate the common
> task of building dictionaries with a constant value.  It comes up in
> membership testing and in uniquifying sequences.  The task of dict() is to
> construct dictionaries and this is a reasonably common construction.
>
But is it really common enough to merit special-casing what can anyway be
spelt very simply:

adict = {}
for k in asequence:
    adict[k] = sentinel

?
regards
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------






From tack@cscs.ch  Thu Jun 27 13:01:22 2002
From: tack@cscs.ch (Davide Tacchella)
Date: Thu, 27 Jun 2002 14:01:22 +0200
Subject: [Python-Dev] Help, Compile / debug Python 2.2.1 64 bit on AIX
Message-ID: <20020627140122.07b44f43.tack@cscs.ch>

I'm trying to build Python with 64-bit support on AIX.  So far I've encountered two problems:
dynload_aix.c is not 100% 64-bit compliant (it includes some casts from pointer to (int));
this was causing Python to SEGV when building extensions.  After changing the casts from int to long,
the error is now SIGILL, with a call through a NULL pointer:
The call stack from the debugger is:
0x000000
initstruct (structmodule.c - line 1508)
_PyImport_LoadDynamicModule ( importdl.c - line 53)
load_module (import.c - line 1365)
import_submodule (import.c - line 1895)
load_next (import.c - line 1751)
import_module_ex (import.c - line 1602)
PyImport_ImportModuleEx (import.c - line 1643)
builtin___import__ (bltinmodule.c - line 40)
PyCFunction_Call (methodobject.c - line 80)
eval_frame (ceval.c - line 2004)
....

Any ideas?
Any help is always welcome -- can anybody help me out?

Davide



From fredrik@pythonware.com  Thu Jun 27 15:29:45 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 27 Jun 2002 16:29:45 +0200
Subject: [Python-Dev] SF task tracker confusion
Message-ID: <04f801c21de7$168326e0$0900a8c0@spiff>

on my "my sf.net" page, there are a couple of development
tasks listed for python 2.1 (!).

however, if I click on one of the links, e.g.

https://sourceforge.net/pm/task.php?func=detailtask&project_task_id=25031&group_id=5470&group_project_id=4564

all I get is a page saying that:

    Permission Denied

    This project's administrator will have to grant
    you permission to view this page.

any ideas?  maybe the project's administrator could remove
the tasks for me?

</F>




From David Abrahams" <david.abrahams@rcn.com  Thu Jun 27 15:57:10 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 27 Jun 2002 10:57:10 -0400
Subject: [Python-Dev] Dict constructor
References: <BIEJKCLHCIOIHAGOKOLHOEIIDFAA.tim@zope.com> <009601c21d49$fb2acee0$56ec7ad1@othello> <12c301c21dcb$f420ebc0$6300000a@holdenweb.com>
Message-ID: <239801c21deb$94528030$6601a8c0@boostconsulting.com>

From: "Steve Holden" <sholden@holdenweb.com>

> But is it really common enough to merit special-casing what can anyway be
> spelt very simply:
>
> adict = {}
> for k in asequence:
>     dict[k] = sentinel
>
> ?

Yep.

-Dave





From tim@zope.com  Thu Jun 27 17:19:09 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 27 Jun 2002 12:19:09 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <3D1AB975.3954.1FBF483@localhost>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAEKODFAA.tim@zope.com>

[David Abrahams]
> Noticing that also left me with a question: how
> come everybody in the world hasn't stolen as much as
> possible from the Python hashing implementation?
> Are there a billion such 10-years'-tweaked
> implementations lying around which all perform
> comparably well?

[Gordon McMillan]
> Jean-Claude Wippler and Christian Tismer did some
> benchmarks against other implementations. IIRC, the
> only one in the same ballpark was Lua's (which, IIRC,
> was faster at least under some conditions).

I'd like to see the benchmark.  Like Python, Lua uses a power-of-2 table
size, but unlike Python uses linked lists for collisions instead of open
addressing.  This appears to leave it very vulnerable to bad cases (like
using

     [i << 16 for i in range(20000)]

as a set of keys -- Python and Lua both grab the last 15 bits of the ints as
their hash codes, which means every key maps to the same hash bucket.  Looks
like Lua would chain them all together.  Python breaks the ties quickly via
its collision resolution scrambling.).
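
A quick way to see the pile-up (the 15-bit mask just models indexing into a
2**15-slot power-of-2 table; illustrative only):

    keys = [i << 16 for i in range(20000)]
    buckets = {}
    for k in keys:
        buckets[hash(k) & 0x7fff] = 1
    print len(buckets)    # prints 1 -- every key lands in the same slot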

The Lua string hash appears systematically vulnerable:

static unsigned long hash_s (const char *s, size_t l) {
  unsigned long h = l;  /* seed */
  size_t step = (l>>5)|1;  /* if string is too long, don't hash all its chars */
  for (; l>=step; l-=step)
    h = h ^ ((h<<5)+(h>>2)+(unsigned char)*(s++));
  return h;
}

That hash function would be weak even if it didn't ignore up to 97% of the
input characters.  OTOH, if it happens not to collide, ignoring up to 97% of
the characters eliminates up to 97% of the expense of computing a hash.
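
To make the 97% concrete, here's a Python transcription of the step logic
above (illustrative only -- it just counts how many characters the loop
actually reads):

    def chars_hashed(l):
        step = (l >> 5) | 1
        count = 0
        while l >= step:
            l -= step
            count += 1
        return count

    for n in (10, 100, 1000, 10000):
        print n, chars_hashed(n)   # caps out around 32-33 characters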

Etc.

Lua's hashes do appear to get a major benefit from lacking a Python feature:
user-defined comparisons can (a) raise exceptions, and (b) mutate the hash
table *while* you're looking for a key in it.  Those cause the Python
implementation lots of expensive pain (indeed, the main reason Python has a
distinct lookup function for string-keyed dicts is that it doesn't have to
choke itself worrying about #a or #b for builtin strings).

There's a lovely irony here.  Python's dicts are fast because they've been
optimized to death.  When Lua's dicts are fast, it seems more the case it's
because they don't worry much about bad cases.  That's *supposed* to be
Python's trick <wink>.




From tim@zope.com  Thu Jun 27 19:54:50 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 27 Jun 2002 14:54:50 -0400
Subject: [Python-Dev] SF task tracker confusion
In-Reply-To: <04f801c21de7$168326e0$0900a8c0@spiff>
Message-ID: <BIEJKCLHCIOIHAGOKOLHGELHDFAA.tim@zope.com>

[/F]
> on my "my sf.net" page, there are a couple of development
> tasks listed for python 2.1 (!).

So finish them already <wink>.

> however, if I click on one of the links, e.g.
>
> https://sourceforge.net/pm/task.php?func=detailtask&project_task_i
> d=25031&group_id=5470&group_project_id=4564
>
> all I get is a page saying that:
>
>     Permission Denied
>
>     This project's administrator will have to grant
>     you permission to view this page.
>
> any ideas?  maybe the project's administrator could remove
> the tasks for me?

I got the same error page.  Looks like someone tried to disable use of the
task manager, and delete the old tasks, without closing the subtasks first.
It took a lot of fiddling but I believe I've done all I can to get those
tasks off your page.  You were the only one who still had a task assigned to
them.  If they still show up on your page, let me know; I expect we'll have
to elevate it to an SF support request then.




From fredrik@pythonware.com  Thu Jun 27 19:56:49 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 27 Jun 2002 20:56:49 +0200
Subject: [Python-Dev] SF task tracker confusion
References: <04f801c21de7$168326e0$0900a8c0@spiff>
Message-ID: <028301c21e0c$66427f30$ced241d5@hagrid>

> maybe the project's administrator could remove
> the tasks for me?

    "You have no open tasks assigned to you."

thanks! /F




From fredrik@pythonware.com  Thu Jun 27 20:12:41 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 27 Jun 2002 21:12:41 +0200
Subject: [Python-Dev] pre.sub's broken under 2.2
Message-ID: <02ad01c21e0e$b3737960$ced241d5@hagrid>

just for the record, one of those "let's change a lot of
code that we don't understand, just because we can"
things broke the "pre" module in 2.2.

someone changed:

            try:
                repl = pcre_expand(_Dummy, repl)
            except:
                m = MatchObject(self, source, 0, end, [])

to

            try:
                repl = pcre_expand(_Dummy, repl)
            except error:
                m = MatchObject(self, source, 0, end, [])

but in the most common use case (replacement strings
containing group references), the pcre_expand function
raises a TypeError exception...

</F>




From tim@zope.com  Thu Jun 27 20:30:00 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 27 Jun 2002 15:30:00 -0400
Subject: [Python-Dev] pre.sub's broken under 2.2
In-Reply-To: <02ad01c21e0e$b3737960$ced241d5@hagrid>
Message-ID: <BIEJKCLHCIOIHAGOKOLHGELJDFAA.tim@zope.com>

[/F]
> just for the record, one of those "let's change a lot of
> code that we don't understand, just because we can"

In the case of try + bare-except, it was more a case of "let's change code
we don't understand because it's impossible to guess its intent and that's
bad for future maintenance".

> things broke the "pre" module in 2.2.
>
> someone changed:
>
>             try:
>                 repl = pcre_expand(_Dummy, repl)
>             except:
>                 m = MatchObject(self, source, 0, end, [])
>
> to
>
>             try:
>                 repl = pcre_expand(_Dummy, repl)
>             except error:
>                 m = MatchObject(self, source, 0, end, [])
>
> but in the most common use case (replacement strings
> containing group references), the pcre_expand function
> raises a TypeError exception...

Like I said <wink>.  The except clause should list the exceptions it
specifically intends to silence, and something as obscure as this case
deserves a comment to boot.
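
Something along these lines, assuming TypeError really is the only other
exception worth silencing there:

            try:
                repl = pcre_expand(_Dummy, repl)
            except (error, TypeError):
                # pcre_expand raises TypeError when the replacement string
                # contains group references; fall back to the slow path.
                m = MatchObject(self, source, 0, end, [])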

I also note that if this passed the tests, then the test suite wasn't even
trying "the most common use case".

there's-more-than-one-kind-of-breakage-illustrated-here-ly y'rs  - tim




From fredrik@pythonware.com  Thu Jun 27 20:38:47 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 27 Jun 2002 21:38:47 +0200
Subject: [Python-Dev] pre.sub's broken under 2.2
References: <BIEJKCLHCIOIHAGOKOLHGELJDFAA.tim@zope.com>
Message-ID: <030901c21e12$443ce000$ced241d5@hagrid>

tim wrote:

> I also note that if this passed the tests, then the test suite wasn't even
> trying "the most common use case".

sure.  but who should make sure that the regression test suite
covers the code being changed: the person changing it, or the
end user?

</F>




From barry@zope.com  Thu Jun 27 20:13:05 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 27 Jun 2002 15:13:05 -0400
Subject: [Python-Dev] Building Python cvs w/ gcc 3.1
Message-ID: <15643.25537.767831.983206@anthem.wooz.org>

File this under "you just can't win".

I'm building Python cvs w/gcc 3.1 and I get warnings for every
extension, e.g.:

building 'zlib' extension
gcc -g -Wall -Wstrict-prototypes -fPIC -I. -I/home/barry/projects/python/./Include -I/usr/local/include -I/home/barry/projects/python/Include -I/home/barry/projects/python -c /home/barry/projects/python/Modules/zlibmodule.c -o build/temp.linux-i686-2.3/zlibmodule.o
cc1: warning: changing search order for system directory "/usr/local/include"
cc1: warning:   as it has already been specified as a non-system directory
gcc -shared build/temp.linux-i686-2.3/zlibmodule.o -L/usr/local/lib -lz -o build/lib.linux-i686-2.3/zlib.so

The problem is the inclusion of -I/usr/local/include because that's a
directory on gcc's system include path.  Adding such directories can
cause gcc headaches because it likes to treat system include dirs
specially, doing helpful things like fixing bugs in vendors' header files
and sticking them in special locations.  -I apparently overrides the
special treatment of system include dirs, so the warnings are gcc's
way of helpfully reminding us not to do that.

Unfortunately, it seems difficult to fix this in a principled way.  I
can't figure out a way to reliably ask gcc what its system include
dirs are.  -v doesn't give you the information.

There's no switch to turn off these warnings.

You could ask cpp ("cpp -v") which does provide output that could be
grep'd for the system include dirs, but that just seems way too
fragile.  Besides, that doesn't play well with distutils because it
only wants to invoke the preprocessor using "gcc -E" and /that/
interprets -v as one of its options.  You could use
"gcc -E -Wp,-v dummyfile.c" but then you'd have to redirect stderr,
capture that output, and grep it.  Blech, blech, blech.
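
For the record, a rough sketch of what that grepping would look like (the
marker strings are from memory and may vary across gcc versions, which is
part of what makes this so fragile):

import os
# capture stdout+stderr; -xc /dev/null stands in for the dummy file
pipe = os.popen4('gcc -E -Wp,-v -xc /dev/null')[1]
lines = pipe.read().splitlines()
start = lines.index('#include <...> search starts here:') + 1
end = lines.index('End of search list.')
system_dirs = [d.strip() for d in lines[start:end]]
print system_dirs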

If I comment out the line in setup.py which adds /usr/local/include to
self.compiler.include_dirs, it takes care of the problem, but that
might break other builds, so I'm loath to do that.

The other option is to ignore the warnings since I don't think gcc 3.1
(3.x?) is distributed as the default compiler for very many distros
yet.  OTOH, it /will/ at some point and then it will be a PITA for
support <wink>.  OTTH, from some quick googling, I gather that this
warning is somewhat controversial inside the gcc community, and other
projects have dealt with it in heavyhanded ways (just don't
-I/usr/local/include), so if we ignore it long enough, the problem
might just go away.

Sigh, I'm done with this for now, but wanted to get it into the
archives for future reference.

-Barry



From tim@zope.com  Thu Jun 27 20:56:16 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 27 Jun 2002 15:56:16 -0400
Subject: [Python-Dev] pre.sub's broken under 2.2
In-Reply-To: <030901c21e12$443ce000$ced241d5@hagrid>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAELMDFAA.tim@zope.com>

[/F]
> sure.  but who should make sure that the regression test suite
> covers the code being changed: the person changing it, or the
> end user?

We could ask a lot of "who should have?" questions here.  As it turns out,
an end user finished everyone's job here.

learn-&-move-on-ly y'rs  - tim




From zack@codesourcery.com  Thu Jun 27 23:12:28 2002
From: zack@codesourcery.com (Zack Weinberg)
Date: Thu, 27 Jun 2002 15:12:28 -0700
Subject: [Python-Dev] Improved tmpfile module
Message-ID: <20020627221228.GB9371@codesourcery.com>

I'm not subscribed to python-dev.  Please cc: me directly on replies.
I'm going to respond to all the comments at once.

Greg Ward wrote:
> > Attached please find a rewritten and improved tmpfile.py.  The major
> > change is to make the temporary file names significantly harder to
> > predict.  This foils denial-of-service attacks, where a hostile
> > program floods /tmp with files named @12345.NNNN to prevent process
> > 12345 from creating any temp files.  It also makes the race condition
> > inherent in tmpfile.mktemp() somewhat harder to exploit.
> 
> Oh, good!  I've long wished that there was a tmpfile module written by
> someone who understands the security issues involved in generating
> temporary filenames and files.  I hope you do... ;-)

Well, I wrote the analogous code in the GNU C library (using basically
the same algorithm).  I'm confident it is safe on a Unix-based system.
On Windows and others, I am relying on os.open(..., os.O_EXCL) to do
what it claims to do; assuming it does, the code should be safe there too.

> > (fd, name) = mkstemp(suffix="", binary=1): Creates a temporary file,
> > returning both an OS-level file descriptor open on it and its name.
> > This is useful in situations where you need to know the name of the
> > temporary file, but can't risk the race in mktemp.
> 
> +1 except for the name.  What does the "s" stand for?  Unfortunately, I
> can't think of a more descriptive name offhand.

Fredrik Lundh's suggestion that it is for "safer" seems plausible, but
I do not actually know.  I chose the names mkstemp and mkdtemp to
match the functions of the same name in most modern Unix C libraries.
Since they don't take the same "template" parameter that those
functions do, that was probably a bad idea.

[Note to Fredrik: at the C level, mkstemp is not deprecated in favor
of tmpfile, as they do very different things - tmpfile(3) is analogous
to tmpfile.TemporaryFile(), you don't get the file name back.]

I'm open to suggestions for a better routine name; I can't think of a
good one myself.

> > name = mkdtemp(suffix=""): Creates a temporary directory, without
> > race.
> 
> How about calling this one mktempdir() ?

Sure.

> I've scanned your code and the existing tempfile.py.  I don't
> understand why you rearranged things.  Please explain why your
> arrangement of _TemporaryFileWrapper/TemporaryFile/
> NamedTemporaryFile is better than what we have.

I was trying to get all the user-accessible interfaces to be at the
top of the file.  Also, I do not understand the bits in the existing
file that delete names out of the module namespace after we're done
with them, so I wound up taking all of that out to get it to work.  I
think the existing file's organization was largely determined by those
'del' statements.

I'm happy to organize the file any way y'all like -- I'm kind of new
to Python and I don't know the conventions yet.


> A few minor comments on the code...
> 
> > if os.name == 'nt':
> >     _template = '~%s~'
> > elif os.name in ('mac', 'riscos'):
> >     _template = 'Python-Tmp-%s'
> > else:
> >     _template = 'pyt%s' # better ideas?
> 
> Why reveal the implementation language of the application creating these
> temporary names?  More importantly, why do it certain platforms, but not
> others?

This is largely as it was in the old file.  I happen to know that ~%s~
is conventional for temporary files on Windows.  I changed 'tmp%s' to
'pyt%s' for Unix to make it consistent with Mac/RiscOS.

Ideally one would allow the calling application to control the prefix, but
I'm not sure what the right interface is.  Maybe

 tmpfile.mkstemp(prefix="", suffix="")

where if one argument is provided it gets treated as the suffix, but
if two are provided the prefix comes first, a la range()?  Is there a
way to express that in the prototype?
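
A rough sketch of that argument shuffling, purely illustrative (the helper
name is made up, and as far as I can tell it can't be expressed with plain
default arguments in the def line -- you need *args dispatch):

    def _split_prefix_suffix(*args):
        # one positional argument means it's the suffix; two means
        # (prefix, suffix), a la range()
        if len(args) == 0:
            return '', ''
        if len(args) == 1:
            return '', args[0]
        if len(args) == 2:
            return args
        raise TypeError("expected at most 2 arguments")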


> > ### Recommended, user-visible interfaces.
> > 
> > _text_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL
> > if os.name == 'posix':
> >     _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL
> 
> Why not just "_bin_openflags = _text_openflags" ?  That clarifies their
> equality on Unix.
> 
> > else:
> >     _bin_openflags = os.O_RDWR | os.O_CREAT | os.O_EXCL | os.O_BINARY
> 
> Why not "_bin_openflags = _text_openflags | os.O_BINARY" ?

*shrug* Okay.


> 
> > def mkstemp(suffix="", binary=1):
> >     """Function to create a named temporary file, with 'suffix' for
> >     its suffix.  Returns an OS-level handle to the file and the name,
> >     as a tuple.  If 'binary' is 1, the file is opened in binary mode,
> >     otherwise text mode (if this is a meaningful concept for the
> >     operating system in use).  In any case, the file is readable and
> >     writable only by the creating user, and executable by no one."""
> 
> "Function to" is redundant.

I didn't change much of this text from the old file.  Where are
docstring conventions documented?

>     """Create a named temporary file.
> 
>     Create a named temporary file with 'suffix' for its suffix.  Return
>     a tuple (fd, name) where 'fd' is an OS-level handle to the file, and
>     'name' is the complete path to the file.  If 'binary' is true, the
>     file is opened in binary mode, otherwise text mode (if this is a
>     meaningful concept for the operating system in use).  In any case,
>     the file is readable and writable only by the creating user, and
>     executable by no one (on platforms where that makes sense).
>     """

Okay.

> Hmmm: if suffix == ".bat", the file is executable on some platforms.
> That last sentence still needs work.

   ... In any case, the file is readable and writable only by the
   creating user.  On platforms where the file's permission bits
   control whether it can be executed as a program, no one can execute it.  Other
   platforms have other ways of controlling this: for instance, under
   Windows, the suffix determines whether the file can be executed.

How's that?

> > class _TemporaryFileWrapper:
> >     """Temporary file wrapper
> > 
> >     This class provides a wrapper around files opened for temporary use.
> >     In particular, it seeks to automatically remove the file when it is
> >     no longer needed.
> >     """
> 
> Here's where I started getting confused.  I don't dispute that the
> existing code could stand some rearrangement, but I don't understand why
> you did it the way you did.  Please clarify!

See above.  What would you consider a sensible arrangement?

> 
> > ### Deprecated, user-visible interfaces.
> > 
> > def mktemp(suffix=""):
> >     """User-callable function to return a unique temporary file name."""
> >     while 1:
> >         name = _candidate_name(suffix)
> >         if not os.path.exists(name):
> >             return name
> 
> The docstring for mktemp() should state *why* it's bad to use this
> function -- otherwise people will say, "oh, this looks like it does what
> I need" and use it in ignorance.  So should the library reference
> manual.

Good point.

   """Suggest a name to be used for a temporary file.

   This function returns a file name, with 'suffix' for its suffix,
   which did not correspond to any file at some point in the past.  By
   the time you get the return value of this function, a file may have
   already been created with that name.  It is therefore unsafe to use
   this function for any purpose.  It is deprecated and may be removed
   in a future version of Python."""

and corresponding text in the library manual?

Tim Peters wrote:
> 
> -1 on the implementation here, because it didn't start with current CVS, so
> is missing important work that went into improving this module on Windows
> for 2.3.  Whether spawned/forked processes inherit descriptors for "temp
> files" is also a security issue that's addressed in current CVS but seemed
> to have gotten dropped on the floor here.

I'll get my hands on a copy of current CVS and rework my changes
against that.

> A note on UI:  for many programmers, "it's a feature" that temp file names
> contain the pid.  I don't think we can get away with taking that away no
> matter how stridently someone claims it's bad for us <wink>.

GNU libc took that away from C programmers about four years ago and no
one even noticed.  FreeBSD libc, ditto, although I'm not sure when it
happened.

zw



From fredrik@pythonware.com  Thu Jun 27 23:41:06 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 28 Jun 2002 00:41:06 +0200
Subject: [Python-Dev] Improved tmpfile module
References: <20020627221228.GB9371@codesourcery.com>
Message-ID: <05ea01c21e2b$bcab5fd0$ced241d5@hagrid>

zack wrote:

> [Note to Fredrik: at the C level, mkstemp is not deprecated in favor
> of tmpfile, as they do very different things - tmpfile(3) is analogous
> to tmpfile.TemporaryFile(), you don't get the file name back.]

I quoted the SUSv2 spec from memory.   shouldn't have
done that: it says "preferred for portability reasons", not
deprecated.

</F>




From David Abrahams" <david.abrahams@rcn.com  Fri Jun 28 00:43:13 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 27 Jun 2002 19:43:13 -0400
Subject: [Python-Dev] list.extend
Message-ID: <020201c21e34$6905b6b0$6501a8c0@boostconsulting.com>

I just submitted a patch to the list.extend docstring, to reflect the fact
that x.extend(xrange(10)) and x.extend((2,3)) both work when x is a list.
Then I went to look at the documentation and noticed it says at
http://www.python.org/dev/doc/devel/lib/typesseq-mutable.html:

s.extend(x)    same as s[len(s):len(s)] = x    (2)
...
(2) Raises an exception when x is not a list object. The extend() method is
experimental and not supported by mutable sequence types other than lists.


Now I'm wondering what all this means. It is /not/ equivalent to the slice
assignment, because list slice assignment requires a list rhs. What does
this "experimental" label mean? Is my patch to the docstring wrong, in the
sense that it suggests exploiting undefined behavior in the same way that
the old append-multiple-items behavior was undefined?

Also, I note that the table referenced above seems to be missing some right
parentheses, at least on the .pop and .sort method descriptions.

-Dave


+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+




From David Abrahams" <david.abrahams@rcn.com  Fri Jun 28 00:47:44 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 27 Jun 2002 19:47:44 -0400
Subject: [Python-Dev] list.extend
References: <020201c21e34$6905b6b0$6501a8c0@boostconsulting.com>
Message-ID: <021801c21e35$08fb1d90$6501a8c0@boostconsulting.com>

----- Original Message -----
From: "David Abrahams" <david.abrahams@rcn.com>
To: <python-dev@python.org>
Sent: Thursday, June 27, 2002 7:43 PM
Subject: [Python-Dev] list.extend


> I just submitted a patch to the list.extend docstring, to reflect the fact
> that x.extend(xrange(10)) and x.extend((2,3)) both work when x is a list.
> Then I went to look at the documentation and noticed it says at
> http://www.python.org/dev/doc/devel/lib/typesseq-mutable.html:
>
> s.extend(x)    same as s[len(s):len(s)] = x    (2)
> ...
> (2) Raises an exception when x is not a list object. The extend() method is
> experimental and not supported by mutable sequence types other than lists.
>
>
> Now I'm wondering what all this means. It is /not/ equivalent to the slice
> assignment, because list slice assignment requires a list rhs. What does
> this "experimental" label mean? Is my patch to the docstring wrong, in the
> sense that it suggests exploiting undefined behavior in the same way that
> the old append-multiple-items behavior was undefined?

Looking again, I note that even if my patch is wrong, either the doc or the
implementation must be fixed since it currently lies about throwing an
exception when x is not a list. If someone can channel me the right state
of affairs I'll submit another patch.

-Dave




From tim@zope.com  Fri Jun 28 04:40:00 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 27 Jun 2002 23:40:00 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <020201c21e34$6905b6b0$6501a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKFAAAB.tim@zope.com>

[David Abrahams]
> I just submitted a patch to the list.extend docstring, to reflect the fact
> that x.extend(xrange(10)) and x.extend((2,3)) both work when x is a list.
> Then I went to look at the documentation and noticed it says at
> http://www.python.org/dev/doc/devel/lib/typesseq-mutable.html:
>
> s.extend(x)    same as s[len(s):len(s)] = x    (2)

Ya, that's no longer true.

> ...
> (2) Raises an exception when x is not a list object.

That's true of s[len(s):len(s)] = x, but not of s.extend(x).

> The extend() method is experimental

"experimental" doesn't mean anything, so neutral on that <wink>.

> and not supported by mutable sequence types other than lists.

That's not true anymore either; for example, arrays (from the array module)
have since grown .extend() methods.

> Now I'm wondering what all this means.

Just that the docs are, as you suspect, out of date.

> It is /not/ equivalent to the slice assignment, because list slice
> assignment requires a list rhs.

Right.  list.extend(x) actually requires that x be an iterable object.  Even

    list.extend(open('some file'))

works fine (and appends the lines of the file to the list).

> What does this "experimental" label mean?

I'm not sure.  Guido slaps that label on new features from time to time,
with the implication that they may go away in the following release.
However, no *advertised* experimental feature has ever gone away, and I
doubt one ever will.  We should drop the "experimental" on this one for sure
now, as lots of code uses list.extend().

> Is my patch to the docstring wrong, in the sense that it suggests
> exploiting undefined behavior in the same way that the old append
> -multiple-items behavior was undefined?

I haven't looked at the patch because you didn't include a handy link.  It's
definitely intended that list.extend() accept iterable objects now.

> Also, I note that the table referenced above seems to be missing
> some right parentheses, at least on the .pop and .sort method
> descriptions.

Yup, and they used to be there.

Thanks for the loan of the eyeballs!




From python@rcn.com  Fri Jun 28 04:53:11 2002
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 27 Jun 2002 23:53:11 -0400
Subject: [Python-Dev] list.extend
References: <LNBBLJKPBEHFEDALKOLCGEKFAAAB.tim@zope.com>
Message-ID: <002401c21e57$548b5280$19d8accf@othello>

> > Also, I note that the table referenced above seems to be missing
> > some right parentheses, at least on the .pop and .sort method
> > descriptions.
>
> Yup, and they used to be there.

Hmmph!  This is occurring throughout the docs (see also dict.get() and
dict.setdefault()).   It looks like a flaw in the doc gen process or in the
interaction of the TeX macro for methods with optional arguments.


Raymond Hettinger




From David Abrahams" <david.abrahams@rcn.com  Fri Jun 28 05:01:34 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 28 Jun 2002 00:01:34 -0400
Subject: [Python-Dev] list.extend
References: <LNBBLJKPBEHFEDALKOLCGEKFAAAB.tim@zope.com>
Message-ID: <031801c21e58$7f3f37c0$6501a8c0@boostconsulting.com>

From: "Tim Peters" <tim@zope.com>

> Thanks for the loan of the eyeballs!

As long as I'm eyeballin' (and you're thankin'), I notice in PyInt_AsLong:

 if (op == NULL || (nb = op->ob_type->tp_as_number) == NULL ||
     nb->nb_int == NULL) {
     PyErr_SetString(PyExc_TypeError, "an integer is required");
     return -1;
 }

But really, an integer isn't required; any type with a tp_as_number section
and a conversion to int will do. Should the error say "a numeric type
convertible to int is required"?

-Dave





From tim.one@comcast.net  Fri Jun 28 06:01:41 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 01:01:41 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <222401c21d68$bc1f3000$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEKJAAAB.tim.one@comcast.net>

[David Abrahams]
> ...
> Noticing that also left me with a question: how come everybody in
> the world hasn't stolen as much as possible from the Python hashing
> implementation?  Are there a billion such 10-years'-tweaked
> implementations lying around which all perform comparably well?

It's a Mystery, and in all directions.  Python has virtually no code from,
say, Tcl or Perl either, and the latter certainly do some things better than
Python does them.  I've studied all their hash implementations, but didn't
find anything worth stealing <wink>; OTOH, multiple attempts by multiple
groups to steal Perl's regexp engine years ago fizzled out in a tarpit of
frustration.

Curious:  Python independently developed a string hash *very* similar to
what later became "the standard" Fowler-Noll-Vo string hash:

    http://www.isthe.com/chongo/tech/comp/fnv/

The multiplier is different, and the initial value, but that's it.  I'm sure
there was no communication in either direction.  So ya, given enough time, a
billion other projects will rediscover it too.

>> OTOH, it can be very hard to write an efficient, correct "<" ordering,
>> while testing just "equal or not?" can be easier and run quicker than
>> that.  Dict comparison is a good example from the Python core:
>> computing "<" for dicts is a nightmare, but computing "==" for dicts is
>> easy (contrast the straightforward dict_equal() with the brain-busting
>> dict_compare() + characterize() pair).

> Well, OK, ordering hash tables is hard, unless the bucket count is a
> deterministic function of the element count.

I don't know how the latter could help; for that matter, I'm not even sure
what it means.

> If they were sorted containers, of course, < would be a simple matter.

Yes.

> And I assume that testing equality still involves a lot of hashing...

No more times than the common length of the two dicts.  It's just:

def dict_equal(dict1, dict2):
    if len(dict1) != len(dict2):
        return False
    for key, value in dict1.iteritems():
        if key not in dict2 or not value == dict2[key]:
             return False
    return True

Searching dict2 for key *may* involve hashing key again (or it may not; for
example, Python string objects are immutable and cache their 32-bit hash in
the string object the first time it's computed).

There's a world of pain involved in the "==" there, though, as a dict can
very well have itself as a value in itself, and the time required for
completion appears to be exponential in some pathological cases of that kind
(Python does detect the unbounded recursion in such cases -- eventually).

> Hmm, looking at the 3 C++ implementations of hashed containers that I have
> available to me, only one provides operator<(), which is rather strange
> since the other two implement operator == by first comparing sizes, then
> iterating through consecutive elements of each set looking for a
> difference. The implementation supplying operator<() uses a (IMO
> misguided) design that rehashes incrementally, but it seems to me that if
> the more straightforward approaches can implement operator==() as
> described, operator<() shouldn't have to be a big challenge for an
> everyday hash table.
>
> I'm obviously missing something, but what...?

I don't know, but I didn't follow what you were saying (like, "rehashes
incrementally" doesn't mean anything to me).  If there's a simpler way to
get "the right" answer, I'd love to see it.  I once spent two hours trying
to prove that the dict_compare() + characterize() pair in Python was
correct, but gave up in a mushrooming forest of end cases.

In The Beginning, Python implemented dict comparison by materializing the
.items(), sorting both, and then doing list comparison.  The correctness of
that was easy to show.  But it turned out that in real life all anyone ever
used was == comparison on dicts, and sorting was enormously expensive
compared to what was possible.  characterize() is a very clever hack Guido
dreamt up to get the same result in no more than two passes -- but I've
never been sure it's a thoroughly correct hack.  OTOH, since nobody appears
to care about "<" for dicts, if it's wrong we may never know that.
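
For reference, the original approach amounted to something like this (a
sketch in Python, not the actual C code):

    def old_dict_compare(d1, d2):
        # materialize and sort the items, then fall back on list comparison
        items1 = d1.items()
        items2 = d2.items()
        items1.sort()
        items2.sort()
        return cmp(items1, items2)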

>> This was one of the motivations for introducing "rich comparisons".

> I don't see how that helps. Got a link? Or a clue?

Sorry, I don't understand the question.  When Python funneled all
comparisons through cmp(), it wasn't possible for a type implementation to
do anything faster for, say, "==", because it had no idea why cmp() was
being called.  Allowing people to ask for the specific comparison they
wanted is part of what "rich comparisons" was about, and speed was one of
the reasons for adopting it.  Comparing strings for equality/inequality
alone is also done faster than needing to resolve string ordering.  And
complex numbers have no accepted "<" relation at all.  So comparing dicts
isn't the only place it's easier and quicker to restrict the burden on the
type to implementing equality testing.  For user-defined types, I've often
found it *much* easier.  For example, I can easily tell whether two
chessboards are equal (do they have the same pieces on the same squares?),
but a concept of "<" for chessboards is strained.

> I don't know what that means.

There's too much of that on both sides here, so I declare this mercifully
ended now <0.9 wink>.

> If you represent your sets as sorted containers, getting a strict weak
> ordering on sets is trivial; you just do it with a lexicographical
> comparison of the two sequences.

And if you don't, that conclusion doesn't follow.

> ...
> No, I suppose not. But python's dicts are general-purpose containers, and
> you can put any key you like in there. It's still surprising to
> me given my (much less than 10 years') experience with hash
> implementations that you can design something that performs well over
> all those different cases.

You probably can't a priori, but after a decade people stumble into all the
cases that don't work well, and you eventually fiddle the type-specific hash
functions and the general implementation until surprises appear to stop.  It
remains a probabilistic method, though, and there are no guarantees.  BTW, I
believe that of all Python's builtin types, only the hash function for
integers remains in its original form (hash(i) == i).  So even if I don't
want to, I'm forced to agree that finding a good hash function isn't
trivial.

[on Zope's B-Trees]
> Aww, heck, you just need a good C++ exception-handling implementation to
> get rid of the error-checking overheads ;-)

I'd love to use C++ for this.  This is one of those things that defines 5
families of 4 related data structures each via a pile of .c and .h files
that get #include'd and recompiled 5 times after #define'ing a pile of
macros.  It would be *so* much more pleasant using templates.

> ...
> Thanks for the perspective!
>
> still-learning-ly y'rs,

You're too old for that now -- start making money instead <wink>.

the-psf-will-put-it-to-good-use-ly y'rs  - tim




From tim.one@comcast.net  Fri Jun 28 06:20:17 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 01:20:17 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <031801c21e58$7f3f37c0$6501a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEKKAAAB.tim.one@comcast.net>

[David Abrahams]
> As long as I'm eyeballin' (and you're thankin'), I notice in PyInt_AsLong:
>
>  if (op == NULL || (nb = op->ob_type->tp_as_number) == NULL ||
>      nb->nb_int == NULL) {
>     PyErr_SetString(PyExc_TypeError, "an integer is required");
>   return -1;
>  }
>
> But really, an integer isn't required; Any type with a
> tp_as_number section and a conversion to int will do. Should the
> error say "a numeric type convertible to int is required"?

I'll leave it up to Fred, but I don't think so.  The suggestion is wordier,
would be wordier still if converted to the more accurate "an object of a
numeric type convertible to int is required", and even then is not, IMO,
more likely to be of real help when this error triggers.  If you want to
change it, be sure to hunt down all the related ones too; e.g.,

>>> class C: pass
>>> range(12)[C()]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: list indices must be integers
>>>

BTW, most places that call PyInt_AsLong() either do so conditionally upon
the success of a PyInt_Check(), or replace the exception raised when it
returns -1 with an error.  Offhand I wasn't even able to provoke the msg in
question.




From David Abrahams" <david.abrahams@rcn.com  Fri Jun 28 06:22:05 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 28 Jun 2002 01:22:05 -0400
Subject: [Python-Dev] list.extend
References: <LNBBLJKPBEHFEDALKOLCCEKKAAAB.tim.one@comcast.net>
Message-ID: <035201c21e63$bfbdfc90$6501a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>

> If you want to
> change it, be sure to hunt down all the related ones too; e.g.,

I wouldn't know where to start with that project. Do you think it would be
a bad idea to make one of many error messages more accurate?

> BTW, most places that call PyInt_AsLong() either do so conditionally upon
> the success of a PyInt_Check(), or replace the exception raised when it
> returns -1 with an error.  Offhand I wasn't even able to provoke the msg in
> question.

We extension writers like to use it too, though, and usually without an
extra layer of error processing.

-Dave





From tim.one@comcast.net  Fri Jun 28 06:48:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 01:48:11 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <035201c21e63$bfbdfc90$6501a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEKLAAAB.tim.one@comcast.net>

[Tim]
>> If you want to change it, be sure to hunt down all the related ones
>> too; e.g.,

[David]
> I wouldn't know where to start with that project. Do you think it would be
> a bad idea to make one of many error messages more accurate?

Increasing accuracy isn't necessarily helpful.  In any context where
PyInt_AsLong is called, an int most certainly is required *in the end*.
Spelling out that the implementation may satisfy this requirement by asking
a non-int type whether it knows how to convert instances of itself to an int
doesn't seem helpful to me as a user.  I'm not thinking that much about the
internal implementation, and "of course" if an int is required Python will
accept an object of a type that knows how to convert itself to an int.  But
I suppose you don't like seeing

    SyntaxError: invalid syntax

at the end of a 7-line statement either <wink>.




From David Abrahams" <david.abrahams@rcn.com  Fri Jun 28 06:56:23 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 28 Jun 2002 01:56:23 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
References: <LNBBLJKPBEHFEDALKOLCIEKJAAAB.tim.one@comcast.net>
Message-ID: <036101c21e68$8abed730$6501a8c0@boostconsulting.com>

----- Original Message -----
From: "Tim Peters" <tim.one@comcast.net>
To: "David Abrahams" <david.abrahams@rcn.com>
Cc: <python-dev@python.org>
Sent: Friday, June 28, 2002 1:01 AM
Subject: RE: [Python-Dev] Priority queue (binary heap) python code


> [David Abrahams]
> > ...
> > Noticing that also left me with a question: how come everybody in
> > the world hasn't stolen as much as possible from the Python hashing
> > implementation?  Are there a billion such 10-years'-tweaked
> > implementations lying around which all perform comparably well?
>
> It's a Mystery, and in all directions.  Python has virtually no code from,
> say, Tcl or Perl either, and the latter certainly do some things better than
> Python does them.  I've studied all their hash implementations, but didn't
> find anything worth stealing <wink>;

Well of course not!

> OTOH, multiple attempts by multiple
> groups to steal Perl's regexp engine years ago fizzled out in a tarpit of
> frustration.

Oh, I had the impression that Python's re *was* pilfered from Perl.

> Curious:  Python independently developed a string hash *very* similar to
> what later became "the standard" Fowler-Noll-Vo string hash:
>
>     http://www.isthe.com/chongo/tech/comp/fnv/
>
> The multiplier is different, and the initial value, but that's it.  I'm sure
> there was no communication in either direction.  So ya, given enough time, a
> billion other projects will rediscover it too.

Nifty.

> > Well, OK, ordering hash tables is hard, unless the bucket count is a
> > deterministic function of the element count.
>
> I don't know how the latter could help; for that matter, I'm not even sure
> what it means.

I know what I meant, but I was wrong. My brain cell musta jammed. Ordering
hash tables is hard if collisions are possible.

> > And I assume that testing equality still involves a lot of hashing...
>
> No more times than the common length of the two dicts.

Of course.

> It's just:
>
> def dict_equal(dict1, dict2):
>     if len(dict1) != len(dict2):
>         return False
>     for key, value in dict1.iteritems():
>         if key not in dict2 or not value == dict2[key]:
>              return False
>     return True
>
> Searching dict2 for key *may* involve hashing key again (or it may not; for
> example, Python string objects are immutable and cache their 32-bit hash in
> the string object the first time it's computed).

Tricky. I guess a C++ object could be designed to cooperate with hash
tables in that way also.

> There's a world of pain involved in the "==" there, though, as a dict can
> very well have itself as a value in itself, and the time required for
> completion appears to be exponential in some pathological cases of that kind
> (Python does detect the unbounded recursion in such cases -- eventually).

Yuck. I wouldn't expect any C++ implementation to handle that issue.

> > Hmm, looking at the 3 C++ implementations of hashed containers that I have
> > available to me, only one provides operator<(), which is rather strange
> > since the other two implement operator == by first comparing sizes, then
> > iterating through consecutive elements of each set looking for a
> > difference. The implementation supplying operator<() uses a (IMO
> > misguided) design that rehashes incrementally, but it seems to me that if
> > the more straightforward approaches can implement operator==() as
> > described, operator<() shouldn't have to be a big challenge for an
> > everyday hash table.
> >
> > I'm obviously missing something, but what...?
>
> I don't know, but I didn't follow what you were saying (like, "rehashes
> incrementally" doesn't mean anything to me).

Get ahold of MSVC7 and look at the hash_set implementation. IIRC how
Plauger described it, it is constantly maintaining the load factor across
insertions, so there's never a big cost to grow the table. It also keeps
the items in each bucket sorted, so hash table comparisons are a lot
easier. My gut tells me that this isn't worth what you pay for it, but so
far my gut hasn't had very much of any value to say about hashing...

The other implementations seem to implement equality as something like:

template <class T, class Hash, class Compare, class Allocator>
inline
bool
operator==(const hash_set<T, Hash, Compare, Allocator>& x,
           const hash_set<T, Hash, Compare, Allocator>& y)
{
 return x.size() == y.size() && std::equal(x.begin(), x.end(), y.begin());
}

Which has to be a bug unless they've got a very strange way of defining
equality, or some kind of ordering built into the iterators.

> If there's a simpler way to
> get "the right" answer, I'd love to see it.  I once spent two hours
trying
> to prove that the dict_compare() + characterize() pair in Python was
> correct, but gave up in a mushrooming forest of end cases.

I think it's a tougher problem in Python than in languages with value
semantics, where an object can't actually contain itself.

> In The Beginning, Python implemented dict comparison by materializing the
> .items(), sorting both, and then doing list comparison.  The correctness of
> that was easy to show.  But it turned out that in real life all anyone ever
> used was == comparison on dicts, and sorting was enormously expensive
> compared to what was possible.  characterize() is a very clever hack Guido
> dreamt up to get the same result in no more than two passes -- but I've
> never been sure it's a thoroughly correct hack.

??? I can't find characterize() described anywhere, nor can I find it on my
trusty dict objects:

>>> help({}.characterize)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'dict' object has no attribute 'characterize'

> OTOH, since nobody appears
> to care about "<" for dicts, if it's wrong we may never know that.

As long as the Python associative world is built around hash + ==, you're
probably OK.

> >> This was one of the motivations for introducing "rich comparisons".
>
> > I don't see how that helps. Got a link? Or a clue?
>
> Sorry, I don't understand the question.

Well, you answered it pretty damn well anyway...

> Comparing strings for equality/inequality
> alone is also done faster than needing to resolve string ordering.  And
> complex numbers have no accepted "<" relation at all.

Yeah, good point. C++ has a less<T>/operator< dichotomy mostly to
accommodate pointer types in segmented memory models, but there's no such
accommodation for complex<T>.

> So comparing dicts
> isn't the only place it's easier and quicker to restrict the burden on the
> type to implementing equality testing.  For user-defined types, I've often
> found it *much* easier.  For example, I can easily tell whether two
> chessboards are equal (do they have the same pieces on the same squares?),
> but a concept of "<" for chessboards is strained.

Strained, maybe, but easy. You can do a lexicographic comparison of the
square contents.

> > I don't know what that means.
>
> There's too much of that on both sides here, so I declare this mercifully
> ended now <0.9 wink>.

I, of course, will drag it on to the bitter end.

>
> [on Zope's B-Trees]
> > Aww, heck, you just need a good C++ exception-handling implementation to
> > get rid of the error-checking overheads ;-)
>
> I'd love to use C++ for this.  This is one of those things that defines 5
> families of 4 related data structures each via a pile of .c and .h files
> that get #include'd and recompiled 5 times after #define'ing a pile of
> macros.  It would be *so* much more pleasant using templates.

I have *just* the library for you. Works with 'C', too!
http://www.boost.org/libs/preprocessor/doc/
Believe it or not, people are still pushing this technology to improve
compilation times and debuggability of the result.

> > ...
> > Thanks for the perspective!
> >
> > still-learning-ly y'rs,
>
> You're too old for that now -- start making money instead <wink>.

Sorry, I'll try hard to grow up now.

-Dave




From David Abrahams" <david.abrahams@rcn.com  Fri Jun 28 07:05:22 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 28 Jun 2002 02:05:22 -0400
Subject: [Python-Dev] list.extend
References: <LNBBLJKPBEHFEDALKOLCEEKLAAAB.tim.one@comcast.net>
Message-ID: <039c01c21e6a$335ee960$6501a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>

> Increasing accuracy isn't necessarily helpful.  In any context where
> PyInt_AsLong is called, an int most certainly is required *in the end*.
> Spelling out that the implementation may satisfy this requirement by asking
> a non-int type whether it knows how to convert instances of itself to an int
> doesn't seem helpful to me as a user.  I'm not thinking that much about the
> internal implementation, and "of course" if an int is required Python will
> accept an object of a type that knows how to convert itself to an int.

OK. Explicit is better than implicit, except when it's obvious what GvR
really meant ;-)

> But I suppose you don't like seeing
>
>     SyntaxError: invalid syntax
>
> at the end of a 7-line statement either <wink>.

I never like seeing that, but I don't know what you're getting at.

maybe-you-need-to-<wink>-harder-ly y'rs,
dave





From aahz@pythoncraft.com  Fri Jun 28 14:42:54 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 28 Jun 2002 09:42:54 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <036101c21e68$8abed730$6501a8c0@boostconsulting.com>
References: <LNBBLJKPBEHFEDALKOLCIEKJAAAB.tim.one@comcast.net> <036101c21e68$8abed730$6501a8c0@boostconsulting.com>
Message-ID: <20020628134254.GA14414@panix.com>

On Fri, Jun 28, 2002, David Abrahams wrote:
> From: "Tim Peters" <tim.one@comcast.net>
>>
>> OTOH, multiple attempts by multiple
>> groups to steal Perl's regexp engine years ago fizzled out in a tarpit of
>> frustration.
> 
> Oh, I had the impression that Python's re *was* pilfered Perl.

Thank Fredrik for a brilliant job of re-implementing Perl's regex syntax
into something that I assume is maintainable (haven't looked at the code
myself) *and* Unicode compliant.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From David Abrahams" <david.abrahams@rcn.com  Fri Jun 28 14:48:43 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 28 Jun 2002 09:48:43 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
References: <LNBBLJKPBEHFEDALKOLCIEKJAAAB.tim.one@comcast.net> <036101c21e68$8abed730$6501a8c0@boostconsulting.com> <20020628134254.GA14414@panix.com>
Message-ID: <04b701c21eaa$89d77e70$6501a8c0@boostconsulting.com>

From: "Aahz" <aahz@pythoncraft.com>

> > Oh, I had the impression that Python's re *was* pilfered Perl.
> 
> Thank Fredrik for a brilliant job of re-implementing Perl's regex syntax
> into something that I assume is maintainable (haven't looked at the code
> myself) *and* Unicode compliant.

Thanks, Fredrik!





From jacobs@penguin.theopalgroup.com  Fri Jun 28 15:44:44 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 28 Jun 2002 10:44:44 -0400 (EDT)
Subject: [Python-Dev] Garbage collector problem
Message-ID: <Pine.LNX.4.44.0206281001200.5875-100000@penguin.theopalgroup.com>

I've found what I consider a major problem with the garbage collector in the
Python 2.3 CVS tree.  Here is a small kernel that demonstrates the problem:

lst = []
for i in range(100000):
  lst.append( (1,) )

The key ingredients are:

  1) A method is called on a container (rather than __setitem__ or
     __setattr__).

  2) A new object is allocated while the method object lives on the Python
     VM stack, as shown by the disassembled bytecodes:

         40 LOAD_FAST                1 (lst)
         43 LOAD_ATTR                3 (append)
         46 LOAD_CONST               2 (1)
         49 BUILD_TUPLE              1
         52 CALL_FUNCTION            1

These ingredients combine in the following way to trigger quadratic-time
behavior in the Python garbage collector:

  * First, the LOAD_ATTR on "lst" for "append" is called, and a PyCFunction
    is returned from this code in descrobject.c:method_get:

        return PyCFunction_New(descr->d_method, obj);

    Thus, a _new_ PyCFunction is allocated every time the method is
    requested.

  * This new method object is added to generation 0 of the garbage
    collector, which holds a reference to "lst".

  * The BUILD_TUPLE call may then trigger a garbage collection cycle.

  * Since the "append" method is in generation 0, the reference traversal
    must also follow all objects within "lst", even if "lst" is in
    generation 1 or 2.  This traversal requires time linear in the number of
    objects in "lst", thus increasing the overall time complexity of the
    code to quadratic in the number of elements in "lst".

Also note that this is a much more general problem than this small example. 
It can affect many types of objects in addition to methods, including
descriptors, iterator objects, and any other object that contains a "back
reference".

So, what can be done about this.... One simple solution would be to not
traverse some "back references" if we are collecting objects in generation
0.  This will avoid traversing virtually all of these ephemeral objects that
will trigger such expensive behavior.  If they live long enough to pass
through to generation one or two, then clearly they should be traversed.
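
A rough timing sketch of the effect (the numbers are build-dependent and
purely illustrative; on an affected build the gc-enabled times grow much
faster than linearly, while the gc-disabled times stay roughly linear):

    import gc, time

    def build(n):
        lst = []
        for i in range(n):
            lst.append( (1,) )
        return lst

    for enabled in (1, 0):
        if enabled:
            gc.enable()
        else:
            gc.disable()
        for n in (25000, 50000, 100000):
            start = time.time()
            build(n)
            print enabled, n, round(time.time() - start, 2)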

So, what do all of you GC gurus think?  Provided that my analysis is sound,
I can rapidly propose a patch to demonstrate this approach if there is
sufficient positive sentiment.

There is a bug open on sourceforge on this issue, so feel free to reply via
python-dev or via the bug -- I read both.  As usual sourceforge is buggered,
so I have not been able to update the bug with the contents of this e-mail.

  http://sourceforge.net/tracker/?func=detail&atid=105470&aid=572567&group_id=5470

Regards,
-Kevin

PS: I have not looked into why this doesn't happen in Python 2.2.x or
    before.  I suspect that it must be related to the recent GC changes in
    methodobject.c.  I'm not motivated to spend much time looking into
    this, because the current GC behavior is technically correct, though
    clearly sub-optimal.

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From jeremy@zope.com  Fri Jun 28 11:37:19 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 28 Jun 2002 06:37:19 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEKFAAAB.tim@zope.com>
References: <020201c21e34$6905b6b0$6501a8c0@boostconsulting.com>
 <LNBBLJKPBEHFEDALKOLCGEKFAAAB.tim@zope.com>
Message-ID: <15644.15455.298184.157605@slothrop.zope.com>

>>>>> "TP" == Tim Peters <tim@zope.com> writes:

  TP> [David Abrahams]
  >> What does this "experimental" label mean?

  TP> I'm not sure.  Guido slaps that label on new features from time
  TP> to time, with the implication that they may go away in the
  TP> following release.  However, no *advertised* experimental
  TP> feature has ever gone away, and I doubt one ever will.  We
  TP> should drop the "experimental" on this one for sure now, as lots
  TP> of code uses list.extend().

The access statement was experimental and went away.  I guess it is
the exception that proves the rule.  It was removed about the time I
started using Python, so I don't know what its intended use was.

Many of the Python 2.2 features are also labeled experimental.  And I
don't expect that they will go away either.

Jeremy




From tim.one@comcast.net  Fri Jun 28 16:50:42 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 11:50:42 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <15644.15455.298184.157605@slothrop.zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAELPAAAB.tim.one@comcast.net>

[Jeremy]
> The access statement was experimental and went away.  I guess it is
> the exception that proves the rule.

There are no exceptions to Guido's channeled rules <wink>:

    no *advertised* experimental feature has ever gone away

and the access stmt was never documented ("advertised").  The closest it got
was its NEWS entry for 0.9.9:

    * There's a new reserved word: "access".  The syntax and semantics
      are still subject of research and debate (as well as undocumented),
      but the parser knows about the keyword so you must not use it as a
      variable, function, or attribute name.

The "debate" mentioned there may have been limited to email between Guido
and (IIRC) Tommy Burnette.

> It was removed about the time I started using Python, so I don't know
> what its intended use was.

    access_stmt: 'access' NAME (',' NAME)* ':' accesstype (',' accesstype)*
    accesstype: NAME+
    # accesstype should be ('public' | 'protected' | 'private')
    #                      ['read'] ['write']
    # but can't be because that would create undesirable reserved words!

So it was for creating attributes that could be written by the public but
read only by class methods <wink>.

> Many of the Python 2.2 features are also labeled experimental.  And I
> don't expect that they will go away either.

Well, at least not the ones we've told people about.  Barry's hack to make

    print << file, '%d' % i

read an int i from file may well go away.




From tim.one@comcast.net  Fri Jun 28 17:18:34 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 12:18:34 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <15644.34975.161808.776825@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEMAAAAB.tim.one@comcast.net>

[Barry, on 'access']
> python-mode.el gained knowledge of it in 1996:
>
> revision 2.81
> date: 1996/09/04 15:21:55;  author: bwarsaw;  state: Exp;  lines: +4 -4
> (python-font-lock-keywords): with Python 1.4 `access' is no a keyword

You're misreading "no" as "now" instead of "not".  This patch removed
'access' from python-font-lock-keywords, and that's exactly what "is no a
keyword" meant to me considering it was BarrySpeak <wink>.

> Which is just before Python 1.4 final.  I've no idea when it went
> away.

According to Misc/HISTORY, the bulk of it vanished in 1.4beta3, with
assorted forgotten pieces removed over the following years.

>     TP> Well, at least not the ones we've told people about.  Barry's
>     TP> hack to make
>
>     TP>     print << file, '%d' % i
>
>     TP> read an int i from file may well go away.
>
> It will last week.  Freakin' time machine erased all evidence
> tomorrow.

Damn -- it's already gone from my disk!  Quick, document it before




From barry@zope.com  Fri Jun 28 17:29:42 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 28 Jun 2002 12:29:42 -0400
Subject: [Python-Dev] list.extend
References: <15644.34975.161808.776825@anthem.wooz.org>
 <LNBBLJKPBEHFEDALKOLCGEMAAAAB.tim.one@comcast.net>
Message-ID: <15644.36598.681690.547336@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:

    TP> You're misreading "no" as "now" instead of "not".  This patch
    TP> removed 'access' from python-font-lock-keywords, and that's
    TP> exactly what "is no a keyword" meant to me considering it was
    TP> BarrySpeak <wink>.

How weird, I never wrote any of that!

I /have/ been playing with NaturallySpeaking for Linux (tm) and all I
did was burp.  Why did it take that sound to mean: cause my XEmacs to
respond to the message, do the cvs log, cut-n-paste, send the message,
without even my knowledge?  Okay, it was a rather, um, soupy burp, but
nonetheless...

You should have seen what it did with the cat's purrs.

i-swear-honey-it-was-the-cat-that-ran-pt.py-ly y'rs,
-Barry



From David Abrahams" <david.abrahams@rcn.com  Fri Jun 28 17:34:33 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 28 Jun 2002 12:34:33 -0400
Subject: [Python-Dev] list.extend
References: <15644.34975.161808.776825@anthem.wooz.org><LNBBLJKPBEHFEDALKOLCGEMAAAAB.tim.one@comcast.net> <15644.36598.681690.547336@anthem.wooz.org>
Message-ID: <05b501c21ec1$afc8a980$6501a8c0@boostconsulting.com>

From: "Barry A. Warsaw" <barry@zope.com>

> >>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:
>
>     TP> You're misreading "no" as "now" instead of "not".  This patch
>     TP> removed 'access' from python-font-lock-keywords, and that's
>     TP> exactly what "is no a keyword" meant to me considering it was
>     TP> BarrySpeak <wink>.
>
> How weird, I never wrote any of that!
>
> I /have/ been playing with NaturallySpeaking for Linux (tm) and all I
> did was burp.  Why did it take that sound to mean: cause my XEmacs to
> respond to the message, do the cvs log, cut-n-paste, send the message,
> without even my knowledge?  Okay, it was a rather, um, soupy burp, but
> nonetheless...

Part of the deal with my natural language system at Dragon was that they
wanted me to work on Dutch translation, but I don't know Dutch so I used
Python and figured that would be enough. It turns out that Dutch sounds a
lot like burping to my ear. I think you can see where this is headed...

-Dave





From barry@zope.com  Fri Jun 28 17:02:39 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 28 Jun 2002 12:02:39 -0400
Subject: [Python-Dev] list.extend
References: <15644.15455.298184.157605@slothrop.zope.com>
 <LNBBLJKPBEHFEDALKOLCAELPAAAB.tim.one@comcast.net>
Message-ID: <15644.34975.161808.776825@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:

    TP> There are no exceptions to Guido's channeled rules <wink>:

    TP>     no *advertised* experimental feature has ever gone away

    TP> and the access stmt was never documented ("advertised").  The
    TP> closest it got was its NEWS entry for 0.9.9:

python-mode.el gained knowledge of it in 1996:

revision 2.81
date: 1996/09/04 15:21:55;  author: bwarsaw;  state: Exp;  lines: +4 -4
(python-font-lock-keywords): with Python 1.4 `access' is no a keyword

Which is just before Python 1.4 final.  I've no idea when it went
away.

    >> Many of the Python 2.2 features are also labeled experimental.
    >> And I don't expect that they will go away either.

    TP> Well, at least not the ones we've told people about.  Barry's
    TP> hack to make

    TP>     print << file, '%d' % i

    TP> read an int i from file may well go away.

It will last week.  Freakin' time machine erased all evidence
tomorrow.

-Barry



From jeremy@zope.com  Fri Jun 28 15:02:20 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 28 Jun 2002 10:02:20 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <Pine.LNX.4.44.0206281001200.5875-100000@penguin.theopalgroup.com>
References: <Pine.LNX.4.44.0206281001200.5875-100000@penguin.theopalgroup.com>
Message-ID: <15644.27756.584393.217271@slothrop.zope.com>

I had a different idea to solve this performance problem, and perhaps
others.  It's only half baked, but I thought it was at least worth
mentioning in an e-mail.  The premise is that the garbage collector
tracks a lot of objects that will never participate in cycles and can
never participate in cycles.  The idea is to avoid tracking objects
until it becomes possible for them to participate in a collectible
cycle.

For example, an object referenced from a local variable will never be
collected until after the frame releases its reference.  So what if we
did not track objects that were stored in local variables?  To make
this work, we would need to change the SETLOCAL macro in ceval to
track the object that it was DECREFing.  There are a lot of little
details that would make this complicated unfortunately.  All new
container objects are tracked, so we would need to untrack ones that
are stored in local variables.  To track objects on DECREF, we would
also need to ask if the object type was GC-enabled.

Another kind of object that is never going to participate in a cycle,
I think, is an object that lives only temporarily on the ceval stack.
For example, a bound method object created on the stack in order to be
called.  If it's never bound to another object as an attribute or
stored in a local variable, it can never participate in a cycle.

How hard would it be to add logic that avoided tracking objects until
it was plausible that they would participate in a cycle?

Jeremy




From tim.one@comcast.net  Fri Jun 28 19:59:54 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 14:59:54 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <Pine.LNX.4.44.0206281001200.5875-100000@penguin.theopalgroup.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEMHAAAB.tim.one@comcast.net>

[Kevin Jacobs, working hard!]

I don't know what causes this.  The little time I've been able to spend on
it ended up finding an obvious buglet in some new-in-2.3 gcmodule code:

	for (i = 0; i <= generation; i++)
		generations[generation].count = 0;

That was certainly intended to index by "i", not by "generation".

Fixing that makes the gc.DEBUG_STATS output less surprising, and cuts down
on the number of collections, but doesn't really cure anything.

Note that bound methods in 2.2 also create new objects, etc; that was good
deduction, but not yet good enough <wink>.




From jacobs@penguin.theopalgroup.com  Fri Jun 28 20:11:13 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 28 Jun 2002 15:11:13 -0400 (EDT)
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEMHAAAB.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0206281505460.9182-100000@penguin.theopalgroup.com>

On Fri, 28 Jun 2002, Tim Peters wrote:
> I don't know what causes this.  The little time I've been able to spend on
> it ended up finding an obvious buglet in some new-in-2.3 gcmodule code:
> 
> 	for (i = 0; i <= generation; i++)
> 		generations[generation].count = 0;
> 
> That was certainly intended to index by "i", not by "generation".

Good catch!  I missed that in spite of reading those lines 20 times.

> Note that bound methods in 2.2 also create new objects, etc; that was good
> deduction, but not yet good enough <wink>.

That is why I added my "PS" about not looking into why it didn't blow up in
Python 2.2.  In reality, I did look, but only for 30 seconds, and then
decided I didn't want to know.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From jacobs@penguin.theopalgroup.com  Fri Jun 28 20:23:04 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 28 Jun 2002 15:23:04 -0400 (EDT)
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <15644.27756.584393.217271@slothrop.zope.com>
Message-ID: <Pine.LNX.4.44.0206281450001.9182-100000@penguin.theopalgroup.com>

On Fri, 28 Jun 2002, Jeremy Hylton wrote:
> I had a different idea to solve this performance problem, and perhaps
> others.  It's only half baked, but I thought it was at least worth
> mentioning in an e-mail.  The premise is that the garbage collector
> tracks a lot of objects that will never participate in cycles and can
> never participate in cycles.  The idea is to avoid tracking objects
> until it becomes possible for them to participate in a collectible
> cycle.

Hi Jeremy,

You have an interesting idea, though I'd state the premise slightly
differently.  How about:

  The premise is that the garbage collector tracks a lot of objects that
  will never participate in collectible cycles, because untraceable
  references are held.  The idea is to avoid tracking these objects until it
  becomes possible for them to participate in a collectible cycle.

(virtually any object _can_ participate in a cycle -- most just never do)

Offhand, I am not sure if my idea of ignoring certain references in
generation 0 or your idea will work better in practice.  Both require adding
more intelligence to the garbage collection system via careful annotations.
I wouldn't be surprised if the optimal approach involved both methods.

> How hard would it be to add logic that avoided tracking objects until
> it was plausible that they would participate in a [collectable] cycle?

I can work up a patch that does this.  Can anyone else think of places where
this makes sense, other than frame objects and the ceval stack?

Also, any thoughts on my approach?  I have a hard time thinking of any
situation that generates enough cyclic garbage where delaying collection
until generation 1 would be a serious problem.

-Kevin

PS: The bug Tim spotted makes a big difference too.

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From python@rcn.com  Fri Jun 28 20:25:13 2002
From: python@rcn.com (Raymond Hettinger)
Date: Fri, 28 Jun 2002 15:25:13 -0400
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
Message-ID: <001f01c21ed9$873f3c00$06ea7ad1@othello>

As far as I can tell, buffer() is one of the least used or known about
Python tools.  What do you guys think about this as a candidate for silent
deprecation (moving out of the primary documentation)?


Raymond Hettinger




From jeremy@zope.com  Fri Jun 28 15:56:29 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 28 Jun 2002 10:56:29 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <Pine.LNX.4.44.0206281450001.9182-100000@penguin.theopalgroup.com>
References: <15644.27756.584393.217271@slothrop.zope.com>
 <Pine.LNX.4.44.0206281450001.9182-100000@penguin.theopalgroup.com>
Message-ID: <15644.31005.110922.650771@slothrop.zope.com>

>>>>> "KJ" == Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes:

  KJ> You have an interesting idea, though I'd state the premise
  KJ> slightly differently.  How about:

  KJ>   The premise is that the garbage collector tracks a lot of
  KJ>   objects that will never participate in collectible cycles,
  KJ>   because untraceable references are held.  The idea is to avoid
  KJ>   tracking these objects until it becomes possible for them to
  KJ>   participate in a collectible cycle.

I guess the untraced reference to the current frame is the untraceable
reference you refer to.  The crucial issue is that local variables of
the current frame can't be collected, so there's little point in
tracking and traversing the objects they refer to.  

I agree, of course, that the concern is collectible cycles.

  KJ> (virtually any object _can_ participate in a cycle -- most just
  KJ> never do)

Right.  There ought to be some way to exploit that.

  KJ> Also, any thoughts on my approach?  I have a hard time thinking
  KJ> of any situation that generates enough cyclic garbage where
  KJ> delaying collection until generation 1 would be a serious
  KJ> problem.

If I take your last statement literally, it sounds like we ought to
avoid doing anything until an object gets to generation 1 <0.7 wink>.

Your suggestion seems to be that we should treat references from older
generations to newer generations as external roots.  So a cycle that
spans generations will not get collected until everything is in the
same generation.  Indeed, that does not seem harmful.

On the other hand, it's hard to reconcile an intuitive notion of
generation with what we're doing by running GC over and over as you
add more elements to your list.  It doesn't seem right that your list
becomes an "old" object just because a single function allocates 100k
young objects.  That is, I wish the notion of generations accommodated
a baby boom in a generation.

Jeremy





From nas@python.ca  Fri Jun 28 21:04:18 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 28 Jun 2002 13:04:18 -0700
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <Pine.LNX.4.44.0206281450001.9182-100000@penguin.theopalgroup.com>; from jacobs@penguin.theopalgroup.com on Fri, Jun 28, 2002 at 03:23:04PM -0400
References: <15644.27756.584393.217271@slothrop.zope.com> <Pine.LNX.4.44.0206281450001.9182-100000@penguin.theopalgroup.com>
Message-ID: <20020628130418.D10441@glacier.arctrix.com>

Another idea would be to exploit the fact that we know most of the root
objects (e.g. sys.modules and the current stack of frames).  I haven't
figured out a good use for this knowledge though.  
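
For what it's worth, the bulk of those roots can even be enumerated from pure
Python (a sketch; sys._getframe is the usual introspection hook, and the
C-level roots are of course richer than this):

    import sys

    roots = [sys.modules]       # every imported module
    frame = sys._getframe()     # plus the active call stack...
    while frame is not None:
        roots.append(frame)     # ...frame by frame
        frame = frame.f_back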

  Neil



From jacobs@penguin.theopalgroup.com  Fri Jun 28 21:07:33 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 28 Jun 2002 16:07:33 -0400 (EDT)
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <15644.31005.110922.650771@slothrop.zope.com>
Message-ID: <Pine.LNX.4.44.0206281544300.9182-100000@penguin.theopalgroup.com>

On Fri, 28 Jun 2002, Jeremy Hylton wrote:
> Your suggestion seems to be that we should treat references from older
> generations to newer generations as external roots.  So a cycle that
> spans generations will not get collected until everything is in the
> same generation.  Indeed, that does not seem harmful.

Not really -- I'm saying that certain types of containers tend to hold
references to other, much larger, containers.  These small containers tend
to be ephemoral -- they appear and disappear quickly -- but sometimes are
unlucky enough to be around when a collection is triggered.  In my example,
the small containers were bound-method objects, which store back-references
to their instance -- in this case a huge list that ends up living in
generation 2 very quickly.

I do not advocate making objects store which generation they belong to, but
rather delaying the traversal of certain containers until after generation
0.  This means that they've been around the block a few times, and may have
fallen in with a bad cyclical crowd.

This annotation should be added to objects that tend to shadow other
containers, like bound-methods, iterators, generators, descriptors, etc.

In some tests using real workloads, I've found that upwards of 99% of these
ephemoral objects never make it to a generation 1 collection anyway.

Haulin' garbage,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From jacobs@penguin.theopalgroup.com  Fri Jun 28 21:11:41 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 28 Jun 2002 16:11:41 -0400 (EDT)
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <20020628130418.D10441@glacier.arctrix.com>
Message-ID: <Pine.LNX.4.44.0206281607540.9182-100000@penguin.theopalgroup.com>

On Fri, 28 Jun 2002, Neil Schemenauer wrote:
> Another idea would be to exploit the fact that we know most of the root
> objects (e.g. sys.modules and the current stack of frames).  I haven't
> figured out a good use for this knowledge though.  

If the root objects cannot be reached by the GC traversal, you get the
approach that Jeremy is suggesting.  (Though I just looked, and frame
objects aren't exempt from tracking.)

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From jeremy@zope.com  Fri Jun 28 16:49:46 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 28 Jun 2002 11:49:46 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <Pine.LNX.4.44.0206281607540.9182-100000@penguin.theopalgroup.com>
References: <20020628130418.D10441@glacier.arctrix.com>
 <Pine.LNX.4.44.0206281607540.9182-100000@penguin.theopalgroup.com>
Message-ID: <15644.34202.289500.35641@slothrop.zope.com>

>>>>> "KJ" == Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes:

  KJ> On Fri, 28 Jun 2002, Neil Schemenauer wrote:
  >> Another idea would be to exploit the fact that we know most of the
  >> root objects (e.g. sys.modules and the current stack of frames).
  >> I haven't figured out a good use for this knowledge though.

  KJ> If the root objects cannot be reached by the GC traversal, you
  KJ> get the approach that Jeremy is suggesting.  (Though I just
  KJ> looked, and frame objects aren't exempt from tracking)

Right.  My suggestion is to not track a set of objects that otherwise
would be tracked -- the current frame and its local variables.

Jeremy




From tim.one@comcast.net  Fri Jun 28 22:49:58 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 17:49:58 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <15644.31005.110922.650771@slothrop.zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMENBAAAB.tim.one@comcast.net>

[Jeremy, to Kevin Jacobs]
> ...
> Your suggestion seems to be that we should treat references from older
> generations to newer generations as external roots.

That's the way it works now:  an object in gen N *is* an external root wrt
any object it references in gen I with I < N.

> So a cycle that spans generations will not get collected until
> everything is in the same generation.

Right, and that's what happens (already).  When gen K is collected, all gens
<= K are smushed into gen K at the start, and all trash cycles are collected
except for those that contain at least one object in gen K+1 or higher.

> Indeed, that does not seem harmful.

It hasn't been so far <wink>, although you can certainly construct cases
where it causes an inconvenient delay in trash collection.

> On the other hand, it's hard to reconcile an intuitive notion of
> generation with what we're doing by running GC over and over as you
> add more elements to your list.  It doesn't seem right that your list
> becomes an "old" object just because a single function allocates 100k
> young objects.  That is, I wish the notion of generations accommodated
> a baby boom in a generation.

I don't think you do.  Pushing the parent object into an older generation is
exactly what's supposed to save us from needing to scan all its children
every time a gen0 collection occurs.

Under 2.2.1, Kevin's test case pushes "the list" into gen2 early on, and
those of the list's children that existed at that time are never scanned
again until another gen2 collection occurs.  For a reason I still haven't
determined, under current CVS "the whole list" is getting scanned by
move_root_reachable() every time a gen0 collection occurs.  It's also
getting scanned by both subtract_refs() and move_root_reachable() every time
a gen1 collection occurs.  I'm not yet sure whether the mystery is why this
happens in 2.3, or why it doesn't happen in 2.2.1 <0.5 wink>.




From gmcm@hypernet.com  Fri Jun 28 23:40:33 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 28 Jun 2002 18:40:33 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <Pine.LNX.4.44.0206281544300.9182-100000@penguin.theopalgroup.com>
References: <15644.31005.110922.650771@slothrop.zope.com>
Message-ID: <3D1CADA1.25053.99DBE9E@localhost>

On 28 Jun 2002 at 16:07, Kevin Jacobs wrote:

> In some tests using real workloads, I've found that
> upwards of 99% of these ephemoral objects never make
> it to a generation 1 collection anyway. 

Um, the objects are ephemeral. You were
probably thinking of Uncle Timmy.

-- Gordon
http://www.mcmillan-inc.com/




From tim.one@comcast.net  Fri Jun 28 23:58:30 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 18:58:30 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMENBAAAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCENFAAAB.tim.one@comcast.net>

[Tim]
> ...
> I'm not yet sure whether the mystery is why this happens in 2.3, or
> why it doesn't happen in 2.2.1 <0.5 wink>.

Knock that down to 0.1 wink <0.3 wink>:  Kevin's problem goes away in
current CVS if I change the guard in visit_decref() to

		if (IS_TRACKED(op) && !IS_MOVED(op))
                               ^^^^^^^^^^^^^^^^  added this

I've no real idea why, as 2.2.1 didn't need this to prevent "the list" from
getting continually pulled back into a younger generation.

Without this change in current CVS, it looks like, in each gen0 collection:

a. The bound method object in gen0 knocks "the list"'s gc_refs down to
   -124 when visit_decref() is called by the bound method object
   traverse via subtract_refs().  Therefore IS_MOVED("the list") is
   no longer true.

b. move_root_reachable() then moves "the list" into the list of
   reachable things, because visit_move's has-always-been-there

		if (IS_TRACKED(op) && !IS_MOVED(op)) {

   guard doesn't believe "the list" has already been moved.  visit_move
   then restores the list's gc_refs to the magic -123.

c. move_root_reachable() goes on to scan all of "the list"'s entries too.

d. "the list" itself gets moved into gen1, just because it's in the
   list of reachable things.

e. The next gen0 collection starts at #a again, and does the same
   stuff all over again.

Adding the new guard in visit_decref() breaks this at #a:  IS_MOVED("the
list") remains true, and so #b doesn't move "the list" into the set of
reachable objects again, and so the list stays in whichever older generation
it was in, and doesn't get scanned again (until the next gen2 traversal).

The mystery to me now is why the a,b,c,d,e loop didn't happen in 2.2.1.




From aahz@pythoncraft.com  Sat Jun 29 00:03:43 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 28 Jun 2002 19:03:43 -0400
Subject: [Python-Dev] Improved tmpfile module
In-Reply-To: <20020627221228.GB9371@codesourcery.com>
References: <20020627221228.GB9371@codesourcery.com>
Message-ID: <20020628230343.GA6262@panix.com>

On Thu, Jun 27, 2002, Zack Weinberg wrote:
>
> This is largely as it was in the old file.  I happen to know that ~%s~
> is conventional for temporary files on Windows.  I changed 'tmp%s' to
> 'pyt%s' for Unix to make it consistent with Mac/RiscOS
> 
> Ideally one would allow the calling application to control the prefix, but
> I'm not sure what the right interface is.  Maybe
> 
>  tmpfile.mkstemp(prefix="", suffix="")
> 
> where if one argument is provided it gets treated as the suffix, but
> if two are provided the prefix comes first, a la range()?  Is there a
> way to express that in the prototype?

The main problem with this is that range() doesn't support keyword
arguments, just positional ones.  In order to get the same effect with
mkstemp, you'd have to do

    def tmpfile.mkstemp(*args):

and raise an exception with more than two arguments.  Otherwise, if you
allow keyword arguments, you get the possibility of:

    tmpfile.mkstemp(prefix="foo")

and you can't distinguish that from

    tmpfile.mkstemp("foo")

unless you change the prototype to

    def tmpfile.mkstemp(*args, **kwargs):

which requires a bit more of a song-and-dance setup routine.

In any event, you probably should not use empty strings as the default
parameters; use None instead.
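
For concreteness, a minimal sketch of that positional-only variant (the
defaults and names here are placeholders, not a proposal):

    def mkstemp(*args):
        # range()-style handling: mkstemp(), mkstemp(suffix), or
        # mkstemp(prefix, suffix) -- anything more is an error.
        if len(args) == 0:
            prefix, suffix = None, None
        elif len(args) == 1:
            prefix, suffix = None, args[0]
        elif len(args) == 2:
            prefix, suffix = args
        else:
            raise TypeError("mkstemp takes at most 2 arguments")
        if prefix is None:
            prefix = "pyt"      # whatever default the module settles on
        if suffix is None:
            suffix = ""
        return prefix, suffix   # a real version would go on to create the file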

(Yeah, this is getting a bit off-topic for python-dev; I'm just
practicing for my book. ;-)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From tim.one@comcast.net  Sat Jun 29 00:25:24 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 28 Jun 2002 19:25:24 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCENFAAAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENHAAAB.tim.one@comcast.net>

[Tim]
> ...
> The mystery to me now is why the a,b,c,d,e loop didn't happen in 2.2.1.

Because 2.2.1 has a bug in PyCFunction_New(), which ends with

	op->m_self = self;
	PyObject_GC_Init(op);
	return (PyObject *)op;

But also in 2.2.1,

/* This is here for the sake of backwards compatibility.  Extensions that
 * use the old GC API will still compile but the objects will not be
 * tracked by the GC. */
#define PyGC_HEAD_SIZE 0
#define PyObject_GC_Init(op)
#define PyObject_GC_Fini(op)
#define PyObject_AS_GC(op) (op)
#define PyObject_FROM_GC(op) (op)

IOW, PyObject_GC_Init(op) is a nop in 2.2.1, and the bound method object
never gets tracked.  Therefore the a,b,c,d,e loop never gets started.

In current CVS, the function ends with

	op->m_self = self;
	_PyObject_GC_TRACK(op);
	return (PyObject *)op;

and a world of fun follows <wink>.




From gward@python.net  Sat Jun 29 00:56:28 2002
From: gward@python.net (Greg Ward)
Date: Fri, 28 Jun 2002 19:56:28 -0400
Subject: [Python-Dev] posixmodule.c diffs for working forkpty() and openpty() under Solaris 2.8
In-Reply-To: <20020626082135.16733.qmail@web20905.mail.yahoo.com>
References: <20020626082135.16733.qmail@web20905.mail.yahoo.com>
Message-ID: <20020628235628.GA2634@gerg.ca>

On 26 June 2002, Lance Ellinghaus said:
> I had to get forkpty() and openpty() working under Solaris 2.8 for a
> project I am working on.
> Here are the diffs to the 2.2.1 source file.

Patches will get lost in the shuffle on python-dev.  You should a) make
the patch relative to the current CVS, b) submit it to SourceForge, and
c) keep your eye on the ball until someone checks it in.

Thanks!

        Greg
-- 
Greg Ward - geek-at-large                               gward@python.net
http://starship.python.net/~gward/
War is Peace; Freedom is Slavery; Ignorance is Knowledge



From aahz@pythoncraft.com  Sat Jun 29 01:21:23 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 28 Jun 2002 20:21:23 -0400
Subject: [Python-Dev] list.extend
In-Reply-To: <05b501c21ec1$afc8a980$6501a8c0@boostconsulting.com>
References: <15644.36598.681690.547336@anthem.wooz.org> <05b501c21ec1$afc8a980$6501a8c0@boostconsulting.com>
Message-ID: <20020629002123.GC18004@panix.com>

On Fri, Jun 28, 2002, David Abrahams wrote:
>
> Part of the deal with my natural language system at Dragon was that they
> wanted me to work on Dutch translation, but I don't know Dutch so I used
> Python and figured that would be enough. It turns out that Dutch sounds a
> lot like burping to my ear. I think you can see where this is headed...

Yeah, it means that Orlijn has a guaranteed job.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From gward@python.net  Sat Jun 29 01:24:55 2002
From: gward@python.net (Greg Ward)
Date: Fri, 28 Jun 2002 20:24:55 -0400
Subject: [Python-Dev] Improved tmpfile module
In-Reply-To: <20020627221228.GB9371@codesourcery.com>
References: <20020627221228.GB9371@codesourcery.com>
Message-ID: <20020629002455.GB2634@gerg.ca>

On 27 June 2002, Zack Weinberg said:
> Well, I wrote the analogous code in the GNU C library (using basically
> the same algorithm).  I'm confident it is safe on a Unix-based system.
> On Windows and others, I am relying on os.open(..., os.O_EXCL) to do
> what it claims to do; assuming it does, the code should be safe there too.

Sounds like good credentials to me.  Welcome to Python-land!  Note that
you'll probably get more positive feedback if you provide a patch to
tmpfile.py rather than a complete rewrite.  And patches will get lost on
python-dev -- you should submit it to SourceForge, and stay on the case
until the patch is accepted or rejected (or maybe deferred).

[me]
> +1 except for the name.  What does the "s" stand for?  Unfortunately, I
> can't think of a more descriptive name offhand.

[Zack]
> Fredrik Lundh's suggestion that it is for "safer" seems plausible, but
> I do not actually know.  I chose the names mkstemp and mkdtemp to
> match the functions of the same name in most modern Unix C libraries.
> Since they don't take the same "template" parameter that those
> functions do, that was probably a bad idea.

Hmmmm... I'm torn here.  When emulating (or wrapping) functionality from
the standard C library or Unix kernel, I think it's generally good to
preserve familiar, long-used names: os.chmod() is better than
os.changemode() (or change_mode(), if I wrote the code).  But mkstemp()
and mkdtemp() are *not* familiar, long-used names.  (At least not to me
-- I program in C very rarely!)  But they will probably become more
familiar over time.

Also, API changes that are just due to fundamental differences between C
and Python (immutable strings, multiple return values) are not really
enough reason to change a name.

It looks like your Python mkstemp() has one big advantage over the glibc
mkstemp() -- you can supply a suffix.  IMHO, the inability to supply a
prefix is a small disadvantage.  But those add up to a noticeably
different API.  I think I'm slightly in favour of a different name for
the Python version.  If you make it act like this:

    mkstemp(template : string = (sensible default),
            suffix : string = "")
    -> (filename : string, fd : int)

(err, I hope my personal type language is comprehensible), then call it
mkstemp() after all.  
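
Spelled as an ordinary Python definition, that signature would look roughly
like this (a sketch only; the default template and the single naming attempt
are placeholders for whatever the real implementation does):

    import os

    def mkstemp(template="pyt", suffix=""):
        """Return (filename, fd); 'template' plays the role of the prefix."""
        # A real implementation would loop over candidate names, set secure
        # permissions, and retry on collisions; one attempt is enough here.
        filename = "%s%d%s" % (template, os.getpid(), suffix)
        fd = os.open(filename, os.O_RDWR | os.O_CREAT | os.O_EXCL)
        return filename, fd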

> [Note to Fredrik: at the C level, mkstemp is not deprecated in favor
> of tmpfile, as they do very different things - tmpfile(3) is analogous
> to tmpfile.TemporaryFile(), you don't get the file name back.]

But the man page for mkstemp() in glibc 2.2.5 (Debian unstable) says:

       Don't  use  this  function, use tmpfile(3) instead.  It is
       better defined and more portable.

BTW, that man page has two "NOTES" sections.

> I was trying to get all the user-accessible interfaces to be at the
> top of the file.  Also, I do not understand the bits in the existing
> file that delete names out of the module namespace after we're done
> with them, so I wound up taking all of that out to get it to work.  I
> think the existing file's organization was largely determined by those
> 'del' statements.
> 
> I'm happy to organize the file any way y'all like -- I'm kind of new
> to Python and I don't know the conventions yet.

If I was starting from scratch, I would *probably* do something like
this:

  if os.name == "posix":
      class TemporaryFile:
         [... define Unix version of TemporaryFile ...]

  elif os.name == "nt":
      class NamedTemporaryFile:
         [...]

      class TemporaryFile:
         [... on top of NamedTemporaryFile ...]

  elif os.name == "macos":
      # beats me

But I don't know the full history of this module.

IMHO you would have a much better chance of success if you prepared a
couple of patches -- eg. one to add mkstemp() and mkdtemp() (possibly
with different names), another to do whatever it is to TemporaryFile
that you want to do.  Possibly a third for general code cleanup, if you
feel some is needed.  (Or do the code cleanup patch first, if that's
what's needed.)

        Greg
-- 
Greg Ward - just another Python hacker                  gward@python.net
http://starship.python.net/~gward/
I hope something GOOD came in the mail today so I have a REASON to live!!



From nas@python.ca  Sat Jun 29 04:01:27 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 28 Jun 2002 20:01:27 -0700
Subject: [Python-Dev] On the topic of garbage collection
Message-ID: <20020628200127.A11344@glacier.arctrix.com>

Seen on the net:

    http://www.ravenbrook.com/project/mps/

    The Memory Pool System is a very general, adaptable, flexible,
    reliable, and efficient memory management system. It permits the
    flexible combination of memory management techniques, supporting
    manual and automatic memory management, in-line allocation,
    finalization, weakness, and multiple concurrent co-operating
    incremental generational garbage collections. It also includes a
    library of memory pool classes implementing specialized memory
    management policies. 

The code is offered under an open source license.

  Neil



From fredrik@pythonware.com  Sat Jun 29 12:02:13 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 29 Jun 2002 13:02:13 +0200
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
References: <001f01c21ed9$873f3c00$06ea7ad1@othello>
Message-ID: <004c01c21f5c$6dcbf2d0$ced241d5@hagrid>

raymond wrote:


> As far as I can tell, buffer() is one of the least used or known about
> Python tools.  What do you guys think about this as a candidate for silent
> deprecation (moving out of the primary documentation)?

+1, in theory.

does anyone have any real-life use cases?  I've never been
able to use it for anything, and cannot recall ever seeing it
being used by anyone else...

(it sure doesn't work for the use cases I thought of when
first learning about the API...)

</F>




From tim.one@comcast.net  Sat Jun 29 13:22:28 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 29 Jun 2002 08:22:28 -0400
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <Pine.LNX.4.44.0206281001200.5875-100000@penguin.theopalgroup.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEONAAAB.tim.one@comcast.net>

[Kevin Jacobs]

Nice job, Kevin!  You learned a lot in a hurry here.  I'll try to fill in
some blanks.

> ...
> lst = []
> for i in range(100000):
>   lst.append( (1,) )
>
> The key ingredients are:
>
>   1) A method is called on a container (rather than __setitem__ or
>      __setattr__).
>
>   2) A new object is allocated while the method object lives on the Python
>      VM stack, as shown by the disassembled bytecodes:
>
>          40 LOAD_FAST                1 (lst)
>          43 LOAD_ATTR                3 (append)
>          46 LOAD_CONST               2 (1)
>          49 BUILD_TUPLE              1
>          52 CALL_FUNCTION            1
>
> These ingredients combine in the following way to trigger quadratic-time
> behavior in the Python garbage collector:
>
>   * First, the LOAD_ATTR on "lst" for "append" is called, and a
>     PyCFunction is returned from this code in descrobject.c:method_get:
>
>         return PyCFunction_New(descr->d_method, obj);
>
>     Thus, a _new_ PyCFunction is allocated every time the method is
>     requested.

In outline, so far all that has been true since 0 AP (After Python).

>   * This new method object is added to generation 0 of the garbage
>     collector, which holds a reference to "lst".

It's a bug in 2.2.1 that the method object isn't getting added to gen0.  It
is added in current CVS.

>   * The BUILD_TUPLE call may then trigger a garbage collection cycle.
>
>   * Since the "append" method is in generation 0,

Yes.

>     the reference traversal must also follow all objects within "lst",
>     even if "lst" is in generation 1 or 2.

According to me, it's a bug that it does so in current CVS, and a bug that's
been in cyclic gc forever.  This kind of gc scheme isn't "supposed to" chase
old objects (there's no point to doing so -- if there is a reclaimable cycle
in the young generation, the cycle is necessarily composed of pure
young->young pointers, so chasing a cross-generation pointer can't yield any
useful info).  It's not a *semantic* error if it chases old objects too, but
it does waste time, and can (as it does here) yank old objects back to a
younger generation.  I attached a brief patch to your bug report that stops
this.

>     This traversal requires time linear in the number of
>     objects in "lst", thus increasing the overall time complexity of the
>     code to quadratic in the number of elements in "lst".

Yes.  Do note that this class of program is quadratic-time anyway, just
because the rare gen2 traversals have to crawl over an ever-increasing lst
too.  BTW, the "range(100000)" in your test program also gets crawled over
every time a gen2 collection occurs!  That's why Neil made them rare <wink>.

> Also note that this is a much more general problem than this
> small example.

Sure, although whether it's still "a real problem" after the patch is open
to cost-benefit ridicule <wink>.

> It can affect many types of objects in addition to methods, including
> descriptors, iterator objects, and any other object that contains a "back
> reference".
>
> So, what can be done about this.... One simple solution would be to not
> traverse some "back references" if we are collecting objects in generation
> 0.
>
> This will avoid traversing virtually all of these ephemeral
> objects that will trigger such expensive behavior.  If they live long
> enough to pass through to generation one or two, then clearly they
> should be traversed.
>
> So, what do all of you GC gurus think?  Provided that my analysis
> is sound, I can rapidly propose a patch to demonstrate this approach if
> there is sufficient positive sentiment.

Seeing a patch is the only way I'd understand your intent.  You can
understand my intent by reading my patch <wink>.

> ...
>
> PS: I have not looked into why this doesn't happen in Python 2.2.x or
>     before.

It's a bug in 2.2.1 (well, two bugs, if Neil accepts my claim that the patch
I put up "fixes a bug" too).  In 2.1, method objects hadn't yet been added
to cyclic gc.




From fredrik@pythonware.com  Sat Jun 29 13:39:07 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 29 Jun 2002 14:39:07 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
References: <LNBBLJKPBEHFEDALKOLCIEKJAAAB.tim.one@comcast.net> <036101c21e68$8abed730$6501a8c0@boostconsulting.com>
Message-ID: <01f201c21f69$f7600f10$ced241d5@hagrid>

david wrote:

> > OTOH, multiple attempts by multiple groups to steal Perl's
> > regexp engine years ago fizzled out in a tarpit of frustration.
> 
> Oh, I had the impression that Python's re *was* pilfered Perl.

Tim warned me that the mere attempt to read sources for
existing RE implementations was a sure way to destroy my
brain, so I avoided that.

SRE is a clean-room implementation, using the Python 1.5.2
docs and the regression test suite as the only reference.  I've
never written a Perl regexp in my life.

</F>




From jacobs@penguin.theopalgroup.com  Sat Jun 29 14:03:32 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Sat, 29 Jun 2002 09:03:32 -0400 (EDT)
Subject: [Python-Dev] Garbage collector problem
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEONAAAB.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0206290824150.12679-100000@penguin.theopalgroup.com>

On Sat, 29 Jun 2002, Tim Peters wrote:
> Nice job, Kevin!  You learned a lot in a hurry here.  I'll try to fill in
> some blanks.

Thanks for the great sleuthing, Tim.  I missed a few critical details about
how the GC system was intended to work.  It was not initially clear that
most GC traversals were not recursive.  i.e., I had assumed that functions
like update_refs and subtract_refs did a DFS through all reachable
references, instead of a shallow 1-level search.  Of course, it all makes
much more sense now.

Here are the results of my test program (attached to the SF bug) with and
without your patch installed (2.3a0+ and 2.3a0-, respectively) and GC enabled:

                                           N
        20000    40000    80000    160000   240000   320000   480000   640000
Ver.   -------- -------- -------- -------- -------- -------- -------- --------
1.5.2  316450/s 345590/s 349609/s 342895/s 351352/s 353734/s 345362/s 350978/s
2.0    183723/s 192671/s 174146/s 151661/s 154592/s 127181/s 114903/s  99469/s
2.2.1  228553/s 234018/s 197809/s 166019/s 171306/s 137840/s 122835/s 105785/s
2.3a0- 164968/s 111752/s  68220/s  38129/s  26098/s  19678/s  13488/s  10396/s
2.3a0+ 291286/s 287168/s 284857/s 233244/s 196731/s 170759/s 135541/s 129851/s

There is still room for improvement, but overall I'm happy with the
performance of 2.3a0+.

> > So, what do all of you GC gurus think?  Provided that my analysis
> > is sound, I can rapidly propose a patch to demonstrate this approach if
> > there is sufficient positive sentiment.
> 
> Seeing a patch is the only way I'd understand your intent.  You can
> understand my intent by reading my patch <wink>.

When functioning correctly, the current garbage collector already does what
I was suggesting (in more generality, to boot).  No need for a patch.

Thanks again, Tim.  It was a lively chase through some of the strange and
twisted innards of my favorite language.

Off to write boring code again,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From Juergen Hermann" <j.her@t-online.de  Sat Jun 29 15:20:57 2002
From: Juergen Hermann" <j.her@t-online.de (Juergen Hermann)
Date: Sat, 29 Jun 2002 16:20:57 +0200
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
In-Reply-To: <004c01c21f5c$6dcbf2d0$ced241d5@hagrid>
Message-ID: <17OJ5m-171Y92C@fwd10.sul.t-online.com>

On Sat, 29 Jun 2002 13:02:13 +0200, Fredrik Lundh wrote:

>does anyone have any real-life use cases?  I've never been
>able to use it for anything, and cannot recall ever seeing it
>being used by anyone else...

We use it for BLOB support in our Python binding to our C++ binding to
Oracle OCI. Oracle allows loading of limited ranges out of BLOBs, and
the buffer interface perfectly fits into that.

Ciao, Jürgen





From mwh@python.net  Sat Jun 29 16:07:46 2002
From: mwh@python.net (Michael Hudson)
Date: 29 Jun 2002 16:07:46 +0100
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
In-Reply-To: j.her@t-online.de's message of "Sat, 29 Jun 2002 16:20:57 +0200"
References: <17OJ5m-171Y92C@fwd10.sul.t-online.com>
Message-ID: <2mlm8yw2bh.fsf@starship.python.net>

j.her@t-online.de (Juergen Hermann) writes:

> On Sat, 29 Jun 2002 13:02:13 +0200, Fredrik Lundh wrote:
> 
> >does anyone have any real-life use cases?  I've never been
> >able to use it for anything, and cannot recall ever seeing it
> >being used by anyone else...
> 
> We use it for BLOB support in our Python binding to our C++ binding to
> Oracle OCI. Oracle allows loading of limited ranges out of BLOBs, and
> the buffer interface perfectly fits into that.

But that's from C, right?  I don't think anyone's suggested removing
the C-level buffer interface.

Cheers,
M.

-- 
  I think perhaps we should have electoral collages and construct
  our representatives entirely of little bits of cloth and papier 
  mache.
                  -- Owen Dunn, ucam.chat, from his review of the year



From tismer@tismer.com  Sat Jun 29 16:48:00 2002
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 29 Jun 2002 17:48:00 +0200
Subject: [Python-Dev] Re: PEP 292, Simpler String Substitutions
References: <LNBBLJKPBEHFEDALKOLCIEDOPPAA.tim.one@comcast.net>	<3D1500F0.708@tismer.com> <15637.10310.131724.556831@anthem.wooz.org> <3D152CCB.6010000@tismer.com> <3D161435.9D154EE0@prescod.net> <3D162060.9030101@tismer.com> <3D162342.BBDC07B3@prescod.net>
Message-ID: <3D1DD6B0.2020603@tismer.com>

Extended proposal at the end:

Paul Prescod wrote:
> Christian Tismer wrote:
> 
>>...
>>
>>Are you sure you got what I meant?
>>I want to compile the variable references away at compile
>>time, resulting in an ordinary format string.
>>This string is wrapped by the runtime _(), and
>>the result is then interpolated with a dict.
> 
> 
> How can that be?
> 
> Original expression:
> 
> _($"$foo")
> 
> Expands to:
> 
> _("%(x1)s"%{"x1": foo})
> 
> Standard Python order of operations will do the %-interpolation before
> the method call! You say that it could instead be 
> 
> _("%(x1)s")%{"x1": foo}
> 
> But how would Python know to do that? "_" is just another function.
> There is nothing magical about it. What if the function was instead
> re.compile? In that case I would want to do the interpolation *before*
> the compilation, not after!
> 
> Are you saying that the "_" function should be made special and
> recognized by the compiler?

My idea has evolved into the following:
Consider an interpolating object with the following
properties (sketched by a class here):

class Interpol:
     def __init__(self, fmt, dic):
         self.fmt = fmt
         self.dic = dic
     def __repr__(self):
         return self.fmt % self.dic

Original expression:

_($"$foo")

Expands at compile time to:

_( Interpol("%(x1)s", {"x1": foo}) )

Having said that, it is now up to the function _()
to test whether its argument is an Interpol or not.
It can do something like this:

def _(arg):
     ...
     if type(arg) is Interpol:
         return _(arg.fmt) % arg.dic

# or, maybe cleaner, leaving the formatting action
# to the Interpol class:

def _(arg):
     ...
     if isinstance(arg, Interpol):
         return arg.__class__(_(arg.fmt), arg.dic)

# which then in turn will return the final string,
# if it is interrogated via str or repr.
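
To make the division of labour concrete, here is a self-contained toy run of
the same scheme (the catalog dict merely stands in for a real gettext lookup,
and the Interpol class is repeated so the snippet runs on its own):

    class Interpol:
        def __init__(self, fmt, dic):
            self.fmt = fmt
            self.dic = dic
        def __repr__(self):
            return self.fmt % self.dic

    catalog = {"hello %(x1)s": "bonjour %(x1)s"}   # toy message catalog

    def _(arg):
        if isinstance(arg, Interpol):
            # translate the format string, keep the captured values
            return arg.__class__(catalog.get(arg.fmt, arg.fmt), arg.dic)
        return catalog.get(arg, arg)

    foo = "world"
    msg = Interpol("hello %(x1)s", {"x1": foo})  # what $"hello $foo" would compile to
    print(repr(msg))     # hello world
    print(repr(_(msg)))  # bonjour world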

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From martin@v.loewis.de  Sat Jun 29 18:59:47 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 29 Jun 2002 19:59:47 +0200
Subject: [Python-Dev] posixmodule.c diffs for working forkpty() and openpty() under Solaris 2.8
In-Reply-To: <20020626082135.16733.qmail@web20905.mail.yahoo.com>
References: <20020626082135.16733.qmail@web20905.mail.yahoo.com>
Message-ID: <m3lm8y9d9o.fsf@mira.informatik.hu-berlin.de>

Lance Ellinghaus <lellinghaus@yahoo.com> writes:

> Please let me know if anyone has any problems with this!

I do. I have the general problem with posting such patches to
python-dev; please put them onto SF instead. For specific problems,
please see below.

> ! #if defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || defined(sun)
> ! #ifdef sun
> ! #include <sys/stropts.h>
> ! #endif

I don't like #if <system> defines. What is the problem, and why can't
it be solved with a HAVE_ test?

Also, are you certain your changes apply to all systems that define sun?

> +         master_fd = open("/dev/ptmx", O_RDWR|O_NOCTTY);  /* open master */
> +         sig_saved = signal(SIGCHLD, SIG_DFL);
> +         grantpt(master_fd);                     /* change permission of   slave */
> +         unlockpt(master_fd);                    /* unlock slave */
> +         signal(SIGCHLD,sig_saved);
> +         slave_name = ptsname(master_fd);         /* get name of slave */
> +         slave_fd = open(slave_name, O_RDWR);    /* open slave */
> +         ioctl(slave_fd, I_PUSH, "ptem");       /* push ptem */
> +         ioctl(slave_fd, I_PUSH, "ldterm");     /* push ldterm*/
> +         ioctl(slave_fd, I_PUSH, "ttcompat");     /* push ttcompat*/

Again, that is a fragment that seems to apply to more systems than
just Solaris. It appears that at least HP-UX has the same API, perhaps
other SysV systems have that as well.

On some of these other systems, ttcompat is not used, see

http://ou800doc.caldera.com/SDK_sysprog/_Pseudo-tty_Drivers_em_ptm_and_p.html

for an example. So I wonder whether it should be used by default -
especially since the Solaris man page says that it can be autopushed
as well.

Regards,
Martin



From martin@v.loewis.de  Sat Jun 29 19:02:29 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 29 Jun 2002 20:02:29 +0200
Subject: [Python-Dev] Asyncore/asynchat
In-Reply-To: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com>
References: <0c4801c21cfa$0b9d11c0$6300000a@holdenweb.com>
Message-ID: <m3hejm9d56.fsf@mira.informatik.hu-berlin.de>

"Steve Holden" <sholden@holdenweb.com> writes:

> I notice that Sam Rushing's code tends to use spaces before the parentheses
> around argument lists. Should I think about cleaning up the code at the same
> time, or are we best letting sleeping dogs lie?

The general principle seems to be that cleanup can be done while the
module is reviewed, anyway. So I think doing these changes together
with the documentation is appropriate.

Regards,
Martin




From martin@v.loewis.de  Sat Jun 29 20:24:16 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 29 Jun 2002 21:24:16 +0200
Subject: [Python-Dev] Building Python cvs w/ gcc 3.1
In-Reply-To: <15643.25537.767831.983206@anthem.wooz.org>
References: <15643.25537.767831.983206@anthem.wooz.org>
Message-ID: <m3fzz599cv.fsf@mira.informatik.hu-berlin.de>

barry@zope.com (Barry A. Warsaw) writes:

> There's no switch to turn off these warnings.

The GCC developers now consider this entire warning a bug in the
compiler. Nobody can recall the rationale for the warning, and it will
likely go away.

Just remove it from your GCC sources if it bothers you too much.

Regards,
Martin



From xscottg@yahoo.com  Sat Jun 29 20:59:20 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Sat, 29 Jun 2002 12:59:20 -0700 (PDT)
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
In-Reply-To: <004c01c21f5c$6dcbf2d0$ced241d5@hagrid>
Message-ID: <20020629195920.64153.qmail@web40104.mail.yahoo.com>

--- Fredrik Lundh <fredrik@pythonware.com> wrote:
> 
> does anyone have any real-life use cases?  I've never been
> able to use it for anything, and cannot recall ever seeing it
> being used by anyone else...
> 

As far as I can tell, it only has two uses - To create a (read only)
subview of some other object without making a copy:

  a = array.array('b', [0])*16*1024*1024
  b = buffer(a, 512, 1024*1024)  # Cheap 1M view of 16M object

Or to add string like qualities to an object which supports the
PyBufferProcs interface, but didn't bother to support a string like
interface.  There don't appear to be any of those in the Python core, so
here is a bogus example:

  l = lazy.slacker()
  b = buffer(l)
  x = b[1024:1032]

>
> (it sure doesn't work for the use cases I thought of when
> first learning about the API...)
> 

I think that's the reason that no one ever fixes its quirks and bugs.  As
soon as you understand what it is, you realize that even if it was fixed it
isn't very useful.


What would be useful is a mutable array of bytes that you could optionally
construct from pointer and destructor, that pickled efficiently (no copy to
string), and that reliably retained its pointer value after letting go of
the GIL (no realloc).







__________________________________________________
Do You Yahoo!?
Yahoo! - Official partner of 2002 FIFA World Cup
http://fifaworldcup.yahoo.com



From lellinghaus@yahoo.com  Sat Jun 29 21:57:05 2002
From: lellinghaus@yahoo.com (Lance Ellinghaus)
Date: Sat, 29 Jun 2002 13:57:05 -0700 (PDT)
Subject: [Python-Dev] posixmodule.c diffs for working forkpty() and openpty() under Solaris 2.8
In-Reply-To: <m3lm8y9d9o.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020629205705.28248.qmail@web20905.mail.yahoo.com>

Martin:
See my comments below please...

--- "Martin v. Loewis" <martin@v.loewis.de> wrote:
> Lance Ellinghaus <lellinghaus@yahoo.com> writes:
> 
> > Please let me know if anyone has any problems with this!
> 
> I do. I have the general problem with posting such patches to
> python-dev; please put them onto SF instead. For specific problems,
> please see below.

I did not think just anyone could post to the python section on SF. My
mistake. The rest of the comments are below...

> > ! #if defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) ||
> defined(sun)
> > ! #ifdef sun
> > ! #include <sys/stropts.h>
> > ! #endif
> 
> I don't like #if <system> defines. What is the problem, and why can't
> it be solved with a HAVE_ test?

The problem is that Solaris (SUN) does NOT have openpty() and does not
have forkpty().  So what HAVE_ test would you suggest? What would I
test for? I guess I could have tested for "grantpt()", but testing for
"sun" works as needed. I understand your PERSONAL problem with testing
for SYSTEMs, but that does not mean it is WRONG.

> Also, are you certain your changes apply to all systems that define
> sun?

Yes. All currently supported Solaris systems will need this patch to
provide openpty() and forkpty() services. Supported Solaris is 2.8.
This should work with 2.9 as well.

> > +         master_fd = open("/dev/ptmx", O_RDWR|O_NOCTTY); /* open master */
> > +         sig_saved = signal(SIGCHLD, SIG_DFL);
> > +         grantpt(master_fd);                  /* change permission of slave */
> > +         unlockpt(master_fd);                 /* unlock slave */
> > +         signal(SIGCHLD, sig_saved);
> > +         slave_name = ptsname(master_fd);     /* get name of slave */
> > +         slave_fd = open(slave_name, O_RDWR); /* open slave */
> > +         ioctl(slave_fd, I_PUSH, "ptem");     /* push ptem */
> > +         ioctl(slave_fd, I_PUSH, "ldterm");   /* push ldterm */
> > +         ioctl(slave_fd, I_PUSH, "ttcompat"); /* push ttcompat */
> 
> Again, that is a fragment that seems to apply to more systems than
> just Solaris. It appears that atleast HP-UX has the same API, perhaps
> other SysV systems have that as well.

This may be the case. I was not coding for these other systems. I was
only coding for Sun Solaris 2.8. If someone wants to test it on those
other systems, then it could be expanded for them.

> On some of these other systems, ttcompat is not used, see
> 
> http://ou800doc.caldera.com/SDK_sysprog/_Pseudo-tty_Drivers_em_ptm_and_p.html
> 

Again, was I coding for other systems? No. Hence the "#if
defined(sun)". Again, many other systems do not need this patch as they
already have forkpty() and openpty() defined.
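
(For reference, a minimal sketch of what the patched-in functions give you
from the Python side, on a build where os.forkpty() is available; /bin/sh
is just an illustrative child program:)

    import os

    pid, master_fd = os.forkpty()
    if pid == 0:
        # child: stdin/stdout/stderr are now the slave side of the pty
        os.execv("/bin/sh", ["sh", "-i"])
    else:
        # parent: drive the child through the master side
        os.write(master_fd, "echo hello\n")
        print os.read(master_fd, 1024)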

> for an example. So I wonder whether it should be used by default -
> especially since the Solaris man page says that it can be autopushed
> as well.

Yes. You can use the autopush feature, but that requires making changes
to the OS-level configuration files. If they have been autopushed, it
will not reload them. You do not want to require changes to the OS-level
configuration files if you can avoid it. BTW: this is how SSH, EMACS,
and other programs do it (YES I LOOKED!).

Lance


=====
--
Lance Ellinghaus




From python@rcn.com  Sat Jun 29 23:04:28 2002
From: python@rcn.com (Raymond Hettinger)
Date: Sat, 29 Jun 2002 18:04:28 -0400
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
References: <001f01c21ed9$873f3c00$06ea7ad1@othello> <004c01c21f5c$6dcbf2d0$ced241d5@hagrid>
Message-ID: <003b01c21fb8$f0e3ce20$ecb53bd0@othello>

From: "Fredrik Lundh" <fredrik@pythonware.com>
> > As far as I can tell, buffer() is one of the least used or known about
> > Python tools.  What do you guys think about this as a candidate for
> > silent deprecation (moving out of the primary documentation)?
>
> +1, in theory.

And perhaps in practice.  No replies were received from my buffer() survey
on comp.lang.py:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&th=e36ac767eb076bf4&rnum=1

> does anyone have any real-life use cases?  I've never been
> able to use it for anything, and cannot recall ever seeing it
> being used by anyone else...

Also, I scanned a few packages (just the ones I thought might use it, like
Gadfly, HTMLgen, Spark, etc.) on the Vaults of Parnassus and found zero
occurrences.

My Google searches turned up empty, and so did a grep of the library.


Raymond Hettinger




From barry@zope.com  Sun Jun 30 00:57:39 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sat, 29 Jun 2002 19:57:39 -0400
Subject: [Python-Dev] Building Python cvs w/ gcc 3.1
References: <15643.25537.767831.983206@anthem.wooz.org>
 <m3fzz599cv.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15646.18803.947941.467257@anthem.wooz.org>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    >> There's no switch to turn off these warnings.

    MvL> The GCC developers now consider this entire warning a bug in
    MvL> the compiler. Nobody can recall the rationale for the
    MvL> warning, and it will likely go away.

That's what I gathered from reading some archives.

    MvL> Just remove it from your GCC sources if it bothers you too
    MvL> much.

It really doesn't.  I was giving gcc 3.1 a shake for something else,
but that didn't turn out to be relevant, so I'll probably just wax it
for now.

-Barry



From tim.one@comcast.net  Sun Jun 30 03:07:43 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 29 Jun 2002 22:07:43 -0400
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
In-Reply-To: <003b01c21fb8$f0e3ce20$ecb53bd0@othello>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEAHABAB.tim.one@comcast.net>

Guido's last essay on the buffer interface is still worth reading:

    http://mail.python.org/pipermail/python-dev/2000-October/009974.html

No progress on the issues discussed has been made since, and, to the
contrary, recent changes go in directions Guido didn't want to go.

Note that he was in favor of both gutting and deprecating the buffer object
(as distinct from the buffer C API) "way back then" already.  The only time
I ever see buffer() used in Python code is in examples of how to crash
Python <wink>.

In practice, the positive way to look at it is that we've been following
Finn Bock's advice with passion:

    Because it is so difficult to look at java storage as a sequence of
    bytes, I think I'm all for keeping the buffer() builtin and buffer
    object as obscure and unknown as possible <wink>.




From python@rcn.com  Sun Jun 30 04:24:35 2002
From: python@rcn.com (Raymond Hettinger)
Date: Sat, 29 Jun 2002 23:24:35 -0400
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
References: <LNBBLJKPBEHFEDALKOLCCEAHABAB.tim.one@comcast.net>
Message-ID: <008e01c21fe5$a9383480$17ea7ad1@othello>

From: "Tim Peters" <tim.one@comcast.net>
> Guido's last essay on the buffer interface is still worth reading:
>
>     http://mail.python.org/pipermail/python-dev/2000-October/009974.html

Thanks for the helpful pointer!  :)

> No progress on the issues discussed has been made since, and, to the
> contrary, recent changes go in directions Guido didn't want to go.

He sent me to you guys for direction.  The change was based on the advice
I got.  The point is moot because a) it's not too late to change course to
returning all buffer objects, b) almost nobody uses it anyway, and c) it
all should probably be deprecated.

> Note that he was in favor of both gutting and deprecating the buffer object
> (as distinct from the buffer C API) "way back then" already.  The only time
> I ever see buffer() used in Python code is in examples of how to crash
> Python <wink>.

Perhaps full deprecation (of the Python API, not the C API) is in order.
It's just one fewer item in the Python concept space.  Besides, mmap()
and iterators have already addressed some of the original need.
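
(For instance, a sliceable, mutable byte window over file-backed data, one
of the things people reached for buffer() for, is already covered by mmap;
a small sketch, where "scratch.bin" is just a made-up scratch file name:)

    import mmap

    f = open("scratch.bin", "w+b")
    f.write("\0" * 4096)
    f.flush()
    m = mmap.mmap(f.fileno(), 4096)
    m[0] = "x"                    # in-place mutation of the mapped bytes
    chunk = m[512:512+1024]       # slicing copies out just this region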

> In practice, the positive way to look at it is that we've been following
> Finn Bock's advice with passion:
>
>     Because it is so difficult to look at java storage as a sequence of
>     bytes, I think I'm all for keeping the buffer() builtin and buffer
>     object as obscure and unknown as possible <wink>.

Sounds almost like silent deprecation to me <winks back>.


Raymond Hettinger






From martin@v.loewis.de  Sun Jun 30 07:55:56 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 30 Jun 2002 08:55:56 +0200
Subject: [Python-Dev] posixmodule.c diffs for working forkpty() and openpty() under Solaris 2.8
In-Reply-To: <20020629205705.28248.qmail@web20905.mail.yahoo.com>
References: <20020629205705.28248.qmail@web20905.mail.yahoo.com>
Message-ID: <m3pty92r2b.fsf@mira.informatik.hu-berlin.de>

Lance Ellinghaus <lellinghaus@yahoo.com> writes:

> The problem is that Solaris (SUN) does NOT have openpty() and does not
> have forkpty().. So what HAVE_ test would you suggest? What would I
> test for? 

For the features you use: HAVE_PTMX, HAVE_GRANTPT, HAVE_SYSV_STREAMS,
... If you know they always come in groups, testing for a single one
would be sufficient.

> I guess I could have tested for "grantpt()", but testing for "sun"
> works as needed.

Does it work on SunOS 4 as well?

> I understand your PERSONAL problem with testing for SYSTEMs.. but
> that does not mean it is WRONG..

It is not just my personal problem; it is a maintenance principle for
Python. Perhaps there should be a section on it in PEP 7.

In this case, it is not only wrong because it is too inclusive (as it
matches SunOS 4 as well). What's worse is that it is too exclusive: it
will force us to produce long lists of tests for other systems that
use the same mechanism.

> > Also, are you certain your changes apply to all systems that define
> > sun?
> 
> Yes. All currently supported Solaris systems will need this patch to
> provide openpty() and forkpty() services. Supported Solaris is 2.8.
> This should work with 2.9 as well.

Besides SunOS 4, are you *sure* it also works on, say, Solaris 2.5?

> This may be the case. I was not coding for these other systems. I was
> only coding for Sun Solaris 2.8. 

But you should be.

> If someone wants to test it on those other systems, then it could be
> expanded for them.

No. Anybody expanding it for other systems will use the same style
that you currently use, and we can look forward to a constant stream
of patches saying "add this, and trust me - I'm the only one who has
such a system". If we later find that the version test was incorrect,
we are at a loss as to what to do.

> Again, was I coding for other systems? No. 

Again, this is my primary concern with that patch.

> Hence the "#if defined(sun)". Again, many other systems do not need
> this patch as they already have forkpty() and openpty() defined.

Right, and autoconf will find out. However, that still leaves quite a
number of systems that follow the STREAMS way of life. If there is a
chance to support them simultaneously, then this should be done.

> Yes. You can use the autopush feature, but that requires making changes
> to the OS level configuration files. If they have been autopushed, it
> will not reload them. You do not want the requirement of making changes
> to the OS level configuration files if you can keep from having to do
> it. BTW: This is how SSH, EMACS, and other programs do it (YES I
> LOOKED!).

That doesn't necessarily make it more right. What happens if you leave
out the ttcompat module?

Regards,
Martin




From pyth@devel.trillke.net  Sun Jun 30 09:12:10 2002
From: pyth@devel.trillke.net (holger krekel)
Date: Sun, 30 Jun 2002 10:12:10 +0200
Subject: [Python-Dev] Re: *Simpler* string substitutions
In-Reply-To: <15637.9759.111784.481102@anthem.wooz.org>; from barry@zope.com on Sat, Jun 22, 2002 at 09:36:31PM -0400
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B3B2@UKRUX002.rundc.uk.origin-it.com> <15637.9759.111784.481102@anthem.wooz.org>
Message-ID: <20020630101210.D20310@prim.han.de>

Barry A. Warsaw wrote:
> 
> >>>>> "PM" == Paul Moore <Paul.Moore@atosorigin.com> writes:
> 
>     PM> 4. Access to variables is also problematic. Without
>     PM> compile-time support, access to nested scopes is impossible
>     PM> (AIUI).
> 
> Is this really true?  I think it was two IPC's ago that Jeremy and I
> discussed the possibility of adding a method to frame objects that
> would basically yield you the equivalent of globals+freevars+locals.

Explicit ways to get at the actual name-obj bindings for any particular
code block are much appreciated. What's currently the best way to 
access lexically scoped names from inside a code block? 
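
(For what it's worth, one way to get at a code block's bindings today is
the CPython-specific sys._getframe(); whether f_locals always reflects
free, i.e. lexically scoped, variables is an assumption here rather than
a guarantee:)

    import sys

    def current_bindings():
        """Merge the caller's global and local name->object bindings."""
        f = sys._getframe(1)          # the caller's frame
        d = f.f_globals.copy()
        d.update(f.f_locals)          # locals shadow globals
        return d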

    holger



From oren-py-d@hishome.net  Sun Jun 30 18:39:03 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sun, 30 Jun 2002 13:39:03 -0400
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <000d01c21cdb$eb03b720$91d8accf@othello>
References: <000d01c21cdb$eb03b720$91d8accf@othello>
Message-ID: <20020630173903.GA37045@hishome.net>

On Wed, Jun 26, 2002 at 02:37:17AM -0400, Raymond Hettinger wrote:
> Wild idea of the day:
> Merge the code for xrange() into slice().

There's a patch pending for this: www.python.org/sf/575515

Some issues related to the change:

xrange currently accepts only integer arguments.  With this change it 
will accept any type and the exception will be raised when iteration is
attempted. Is this a problem? The canonical use of xrange is to use it 
immediately in a for statement so it will probably go unnoticed.

Should xrange be an alias for slice or the other way around? Personally
I think that xrange is the more familiar of the two so the merged object 
should be called xrange.  Its repr should also be like that of xrange, 
suppressing the display of unnecessary None arguments.

One of the differences between slice and xrange is that slices are allowed
to have open-ended ranges such as slice(10, None).  It may be useful (and
probably quite controversial...) to allow open-ended xranges too, defaulting
to INT_MAX or INT_MIN, depending on the sign of the step.  It's useful in
for loops where you know you will bail out with break and also for zip.
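
(Today the break case needs an explicit upper bound; a small sketch with
made-up data, using sys.maxint as a stand-in for the proposed open end:)

  import sys

  data = [3, 1, 4, 1, 5, 9]
  for i in xrange(0, sys.maxint):   # would become xrange(0, None)
    if data[i] == 5:
      break
  print i                           # prints 4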

A possible extension is to add a method iterslice(len) to slice/xrange that 
exposes the functionality of PySlice_GetIndicesEx. With this change the 
following code should work correctly for all forms of slicing:

  def __getitem__(self, index):
    if isinstance(index, xrange):
      return [self[i] for i in index.iterslice(len(self))]
    else:
      pass  # ... implement integer indexing for this container class

This extension, BTW, is independent of whether slice/xrange merging is
accepted or not. 

	Oren




From Oleg Broytmann <phd@phd.pp.ru>  Sun Jun 30 20:48:20 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Sun, 30 Jun 2002 23:48:20 +0400
Subject: [Python-Dev] Infinite recursion in Pickle
Message-ID: <20020630234820.A1006@phd.pp.ru>

Hello!

   Nobody noticed the message in c.l.py, so let me try asking you before I
file a bug report on SF.

----- Forwarded message from Oleg Broytmann <phd@phd.pp.ru> -----
On Thu, Jun 27, 2002 at 08:45:16PM +0000, Bengt Richter wrote:
> >   Recently Python (one program that I am debugging) started to crash.
> >FreeBSD kills it with "Bus error", Linux with "Segmentation fault".
> >
> >   I think the program crashed in the cPickle.dump(file, 1)

   I replaced cPickle.dump with pickle.dump and got infinite recursion. The
traceback is below.

   What's that? Are there any limits that an object to be pickled must
follow? Could it be a tree with loops? (I am pretty sure it could - I used
the program for years, and the data structures were not changed much). Could
it be a "new" Python class? (Recently I changed one of my classes to be
derived from the builtin list instead of UserList).
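
   (For reference, a loop through plain lists or dicts is by itself handled
by pickle's memo table; a minimal sketch:)

    import pickle

    a = []
    a.append(a)                        # a list that contains itself
    b = pickle.loads(pickle.dumps(a))
    print b[0] is b                    # prints 1: the cycle survives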

   Well (or not so well), the traceback:

Traceback (most recent call last):
  File "/home/phd/lib/bookmarks_db/check_urls.py", line 158, in ?
    run()
  File "/home/phd/lib/bookmarks_db/check_urls.py", line 145, in run
    storage.store(root_folder)
  File "bkmk_stpickle.py", line 23, in store
  File "/usr/local/lib/python2.2/pickle.py", line 973, in dump
    Pickler(file, bin).dump(object)
  File "/usr/local/lib/python2.2/pickle.py", line 115, in dump
    self.save(object)
  File "/usr/local/lib/python2.2/pickle.py", line 219, in save
    self.save_reduce(callable, arg_tup, state)
  File "/usr/local/lib/python2.2/pickle.py", line 245, in save_reduce
    save(arg_tup)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)
  File "/usr/local/lib/python2.2/pickle.py", line 374, in save_tuple
    save(element)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)

[about 1000 lines skipped - they are all the same]

  File "/usr/local/lib/python2.2/pickle.py", line 498, in save_inst
    save(stuff)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)
  File "/usr/local/lib/python2.2/pickle.py", line 447, in save_dict
    save(value)
  File "/usr/local/lib/python2.2/pickle.py", line 219, in save
    self.save_reduce(callable, arg_tup, state)
  File "/usr/local/lib/python2.2/pickle.py", line 245, in save_reduce
    save(arg_tup)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)
  File "/usr/local/lib/python2.2/pickle.py", line 374, in save_tuple
    save(element)
  File "/usr/local/lib/python2.2/pickle.py", line 225, in save
    f(self, object)
  File "/usr/local/lib/python2.2/pickle.py", line 414, in save_list
    save(element)
  File "/usr/local/lib/python2.2/pickle.py", line 143, in save
    pid = self.persistent_id(object)
RuntimeError: maximum recursion depth exceeded
----- End forwarded message -----

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From tim.one@comcast.net  Sun Jun 30 20:58:02 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 30 Jun 2002 15:58:02 -0400
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
In-Reply-To: <008e01c21fe5$a9383480$17ea7ad1@othello>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECBABAB.tim.one@comcast.net>

[Tim]
>> Guido's last essay on the buffer interface is still worth reading:
>>
>>   http://mail.python.org/pipermail/python-dev/2000-October/009974.html
>>
>> No progress on the issues discussed has been made since, and, to the
>> contrary, recent changes go in directions Guido didn't want to go.

[Raymond Hettinger]
> He sent me to you guys for direction.

That's only because he forgot he wrote the essay -- it's my job to remember
what he did <wink>.

> The change was based on the advice I got.

Wasn't that an empty set?

> The point is moot because a) it's not too late to change course
> to returning all buffer objects, b) because almost nobody uses it
> anyway, and c) it all should probably be deprecated.

In effect, it's been "silently deprecated" since before Guido wrote the
above.

> ...
> Perhaps full deprecation (of the Python API not the C API) is in order.

Someone will whine if that's done.  Everyone's sick of fighting these
battles.  The buffer object is broken, won't get fixed (if it hasn't been by
now ...), and nobody seems to have a real use for it; but, *because* it's
virtually unused, "don't ask, don't tell" remains a path of small
resistance.

> It's just one fewer item in the Python concept space.  Besides mmap()
> and iterators have already addressed some of the original need.

I don't know what the original need was, but suspect it was never addressed.
IIRC, the real expressed need had something to do with running code objects
directly out of mmap'ed files, presumably on memory-starved platforms.  As
Guido said in his essay, "the reason" for the buffer object's existence
isn't clear, so whether the original need has been met, or could be met in
other ways now, isn't clear either.  Since it remains unused, if there is a
need for it, it's a peculiar meaning for "need".




From tim.one@comcast.net  Sun Jun 30 21:06:36 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 30 Jun 2002 16:06:36 -0400
Subject: [Python-Dev] Infinite recursion in Pickle
In-Reply-To: <20020630234820.A1006@phd.pp.ru>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECCABAB.tim.one@comcast.net>

[Oleg Broytmann]
>    Nobody noted the message in c.l.py, let me try to ask you before I file
> a bug report on SF.

I saw the c.l.py msgs but found nothing to say:  before you reduce this to a
program someone else can run and get the same error, it's going to remain a
mystery.  Mysteries belong on SF, though.  You may want to see whether the
problem persists with a build from current CVS Python (something someone
would have tried last week if they had a program they could run).




From python@rcn.com  Sun Jun 30 21:09:12 2002
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 30 Jun 2002 16:09:12 -0400
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
References: <LNBBLJKPBEHFEDALKOLCMECBABAB.tim.one@comcast.net>
Message-ID: <006801c22072$00c890a0$88e97ad1@othello>

RH> > The change was based on the advice I got.
TP > Wasn't that an empty set?

Not unless Scott Gilbert is a null:

SG > > >  "... So the best bet would be to have it just always return a string..."






From Oleg Broytmann <phd@phd.pp.ru>  Sun Jun 30 21:25:35 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Mon, 1 Jul 2002 00:25:35 +0400
Subject: [Python-Dev] Infinite recursion in Pickle
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEECCABAB.tim.one@comcast.net>; from tim.one@comcast.net on Sun, Jun 30, 2002 at 04:06:36PM -0400
References: <20020630234820.A1006@phd.pp.ru> <LNBBLJKPBEHFEDALKOLCEECCABAB.tim.one@comcast.net>
Message-ID: <20020701002535.A1510@phd.pp.ru>

On Sun, Jun 30, 2002 at 04:06:36PM -0400, Tim Peters wrote:
> I saw the c.l.py msgs but found nothing to say:  before you reduce this to a
> program someone else can run and get the same error, it's going to remain a

   I think I can reduce this, but I am afraid the data structure will still
be large.

> mystery.  Mysteries belong on SF, though.  You may want to see whether the

   That's what I don't want to do - file a mysterious bug report.

> problem persists with a build from current CVS Python (something someone
> would have tried last week if they had a program they could run).

   I'll try.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From tim.one@comcast.net  Sun Jun 30 21:32:35 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 30 Jun 2002 16:32:35 -0400
Subject: [Python-Dev] Infinite recursion in Pickle
In-Reply-To: <20020701002535.A1510@phd.pp.ru>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECEABAB.tim.one@comcast.net>

[Oleg Broytmann]
>    I think I can reduce this, but I am afraid the data structure
> still will be large,

That doesn't matter.  It's the amount of *code* we don't understand and have
to learn that matters.  If you could reduce this to a gigabyte of pickle
input that we only need to feed into pickle, that would be great.

>    That what I don't want to do - file a mysterious bug report.

That's what bug reports are best for!  Now you've got comments about your
bug scattered across comp.lang.python and python-dev, and nobody will be
able to find them again.  Attaching new info to a shared bug report is much
more effective.

if-there-isn't-a-mystery-there-isn't-a-bug<wink>-ly y'rs  - tim