From bjourne at gmail.com  Tue May  1 00:01:37 2007
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Mon, 30 Apr 2007 22:01:37 +0000
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <-3456230403858254882@unknownmsgid>
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
	<1d36917a0704300816ma3bf9c2o4dd674cfcefa9172@mail.gmail.com>
	<-3456230403858254882@unknownmsgid>
Message-ID: <740c3aec0704301501u7df7b5a6uaea854d4716eb87e@mail.gmail.com>

On 4/30/07, Bill Janssen <janssen at parc.com> wrote:
> > On 4/30/07, Raymond Hettinger <python at rcn.com> wrote:
> > > I'm concerned that the current ABC proposal will quickly evolve from optional
> > > to required and create a somewhat Java-esque landscape where
> > > inheritance and full-specification are the order of the day.
> >
> > +1 for preferring simple solutions to complex ones
>
> Me, too.  But which is the simple solution?  I tend to think ABCs are.

Neither, in my opinion. They are both an order of magnitude more
complex than the problem they are designed to solve. Raymond
Hettinger's small list of three example problems earlier in the thread
is the most concrete description of what the problem really is all
about, and I would honestly rather sort them under "minor annoyances"
than "really critical stuff, needs to be fixed asap."

One really wise person wrote a long while ago (I'm paraphrasing) that
each new feature should have to prove itself against the standard
library. That is, a diff should be produced proving that real world
Python code reads better with the proposed feature than without. If no
such diff can be created, the feature probably isn't that useful.


-- 
mvh Björn

From brett at python.org  Tue May  1 00:31:20 2007
From: brett at python.org (Brett Cannon)
Date: Mon, 30 Apr 2007 15:31:20 -0700
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <740c3aec0704301501u7df7b5a6uaea854d4716eb87e@mail.gmail.com>
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
	<1d36917a0704300816ma3bf9c2o4dd674cfcefa9172@mail.gmail.com>
	<-3456230403858254882@unknownmsgid>
	<740c3aec0704301501u7df7b5a6uaea854d4716eb87e@mail.gmail.com>
Message-ID: <bbaeab100704301531k4d194790y849864906eed180b@mail.gmail.com>

On 4/30/07, Björn Lindqvist <bjourne at gmail.com> wrote:
[SNIP]
> One really wise person wrote a long while ago (I'm paraphrasing) that
> each new feature should have to prove itself against the standard
> library. That is, a diff should be produced proving that real world
> Python code reads better with the proposed feature than without. If no
> such diff can be created, the feature probably isn't that useful.

I think it would be a little difficult in this situation, since a
similar mechanism does not currently exist in the stdlib, and so most
code is not written in a way that needs ABCs or roles.  Plus you would
have to find places using both the LBYL and EAFP idioms if you did go
with this.  I guess you could look for files that use isinstance or
catch AttributeError, respectively, but still.

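For concreteness, the two idioms I have in mind look roughly like
this (just a toy object):

    ob = {"answer": 42}

    # LBYL: the kind of isinstance check you could grep for
    if isinstance(ob, dict):
        value = ob["answer"]

    # EAFP: the AttributeError-catching flavor
    try:
        value = ob.read()
    except AttributeError:
        value = None
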
And thanks for calling Raymond "really wise"; that gave me a chuckle
(not because Raymond isn't smart, but because he is not some old-timer
who tells "back in the day" stories, and thus doesn't fit the
stereotypical "wise man" image).

-Brett

From l.mastrodomenico at gmail.com  Tue May  1 00:36:07 2007
From: l.mastrodomenico at gmail.com (Lino Mastrodomenico)
Date: Tue, 1 May 2007 00:36:07 +0200
Subject: [Python-3000] super() PEP
In-Reply-To: <014901c78b6e$d2d66d80$0201a8c0@ryoko>
References: <5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<ca471dc20704301217h1c117375r635bcae0034d593f@mail.gmail.com>
	<011b01c78b6a$72098810$0201a8c0@ryoko>
	<43aa6ff70704301403q5bf557c2wf43148f7a339353d@mail.gmail.com>
	<014901c78b6e$d2d66d80$0201a8c0@ryoko>
Message-ID: <cc93256f0704301536r2ab74894xfe15862a573263b4@mail.gmail.com>

2007/4/30, Tim Delaney <tcdelaney at optusnet.com.au>:
> Fine with me. Calvin - want to send me your latest draft, and I'll do some
> modifications? I think we've got to the point now where we can take this
> off-list.

One more thing: what do people think of modifying super so that, when
it doesn't find a method, instead of raising AttributeError it returns
something like "lambda *args, **kwargs: None"?

Optionally this can be a constant (e.g. default_method) defined
somewhere so, if necessary, it's still possible to detect if the value
of super.meth is a real method or the "fake" default_method.

I think this can be useful when a method *doesn't know* whether it's
the last in the MRO, because that may depend on the inheritance
hierarchy of its subclasses: you can always simply call
super.meth(...), and if the current method is the last one, the call
will be a NOP.

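To illustrate what I mean, today a method has to guard by hand
(a rough sketch; default_method and the class are made up):

    def default_method(*args, **kwargs):
        return None

    class Base(object):
        def setup(self):
            # guard by hand against being the last 'setup' in the MRO
            meth = getattr(super(Base, self), 'setup', default_method)
            return meth()

With the change, the body could simply be super.setup() and the call
would silently be a NOP when there is no next method.
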
-- 
Lino Mastrodomenico
E-mail: l.mastrodomenico at gmail.com

From jimjjewett at gmail.com  Tue May  1 00:37:57 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 30 Apr 2007 18:37:57 -0400
Subject: [Python-3000] Revised PEPs 30XZ: remove implicit string
	concatenation and backslash continuation
Message-ID: <fb6fbf560704301537k3bd30349uf0efd19645b3a3d3@mail.gmail.com>

On 4/30/07, Guido van Rossum <guido at python.org> wrote:
> I think these should be two separate proposals, with more specific
> names (e.g. "remove implicit string concatenation" and "remove
> backslash continuation"). There's no need to mention the octal thing
> if it's already a separate PEP.

Revised versions attached, as David Goodger seemed to prefer attachments.

-jJ
-------------- next part --------------
PEP: 30XZA
Title: Remove Backslash Continuation
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett <JimJJewett at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 29-Apr-2007
Post-History: 29-Apr-2007, 30-Apr-2007


Abstract

    Python initially inherited its parsing from C.  While this has
    been generally useful, there are some remnants which have been
    less useful for Python, and should be eliminated.

    This PEP proposes elimination of terminal "\" as a marker for
    line continuation.

    
Rationale for Removing Explicit Line Continuation

    A terminal "\" indicates that the logical line is continued on the
    following physical line (after whitespace).

    Note that a non-terminal "\" does not have this meaning, even if the
    only additional characters are invisible whitespace.  (Python depends
    heavily on *visible* whitespace at the beginning of a line; it does
    not otherwise depend on *invisible* terminal whitespace.)  Adding
    whitespace after a "\" will typically cause a syntax error rather
    than a silent bug, but it still isn't desirable.

    The reason to keep "\" is that occasionally code looks better with
    a "\" than with a () pair.

        assert True, (
            "This Paren is goofy")

    But realistically, that parenthesis is no worse than a "\".  The
    only advantage of "\" is that it is slightly more familiar to users of
    C-based languages.  These same languages also support line
    continuation with (), so reading code will not be a problem, and
    there will be one less rule to learn for people entirely new to
    programming.
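
    For example, the two spellings below are equivalent today; only the
    parenthesized form would remain legal under this PEP (the variable
    names are purely illustrative):

        first_part, second_part = 1, 2

        # with backslash continuation (to be removed):
        total = first_part + \
                second_part

        # with parentheses (unaffected by this PEP):
        total = (first_part +
                 second_part)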


Alternate proposal

    Several people have suggested alternative ways of marking the line
    end.  Most of these were rejected for not actually simplifying things.

    The one exception was to let any unfinished expression signify a line
    continuation, possibly in conjunction with increased indentation

        assert True,            # comma implies tuple implies continue
            "No goofy parens"

    The objections to this are:

        - The amount of whitespace may be contentious; expression
          continuation should not be confused with opening a new
          suite.

        - The "expression continuation" markers are not as clearly marked
          in Python as the grouping punctuation "(), [], {}" marks are.

              "abc" +   # Plus needs another operand, so it continues
                  "def"

              "abc"       # String ends an expression, so
                  + "def" # this is a syntax error.

        - Guido says so.  [1]  His reasoning is that it may not even be
          feasible.  (See next reason.)

        - As a technical concern, supporting this would require allowing
          INDENT or DEDENT tokens anywhere, or at least in a widely
          expanded (and ill-defined) set of locations.  While this is
          in some sense a concern only for the internal parsing
          implementation, it would be a major new source of complexity.  [1]


References

    [1] PEP 30XZ: Simplified Parsing, van Rossum
        http://mail.python.org/pipermail/python-3000/2007-April/007063.html


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
-------------- next part --------------
PEP: 30xzB
Title: Remove Implicit String Concatenation
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett <JimJJewett at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 29-Apr-2007
Post-History: 29-Apr-2007, 30-Apr-2007


Abstract

    Python initially inherited its parsing from C.  While this has
    been generally useful, there are some remnants which have been
    less useful for Python, and should be eliminated.

    This PEP proposes to eliminate implicit string concatenation
    based on adjacency of literals.

    Instead of

        "abc" "def" == "abcdef"

    authors will need to be explicit, and add the strings

        "abc" + "def" == "abcdef"


Rationale for Removing Implicit String Concatenation

    Implicit string concatenation can lead to confusing, or even
    silent, errors.

        def f(arg1, arg2=None): pass

        f("abc" "def")  # forgot the comma, no warning ...
                        # silently becomes f("abcdef", None)

    or, using the scons build framework,

        sourceFiles = [
        'foo.c'
        'bar.c',
        #...many lines omitted...
        'q1000x.c']

    It's a common mistake to leave off a comma, and then scons complains
    that it can't find 'foo.cbar.c'.  This is pretty bewildering behavior
    even if you *are* a Python programmer, and not everyone here is.  [1]

    Note that in C, the implicit concatenation is more justified; there
    is no other way to join strings without (at least) a function call.

    In Python, strings are objects which support the __add__ operator;
    it is possible to write:

        "abc" + "def"

    Because these are literals, this addition can still be optimized
    away by the compiler.  (The CPython compiler already does.  [2])

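    One quick way to confirm the folding on a given CPython (a sketch
    for illustration only; it is not part of this proposal, and the
    exact bytecode shown varies by version):

        import dis

        def f():
            return "abc" + "def"

        # On CPython this typically shows a single constant 'abcdef'
        # rather than a runtime addition of two string literals.
        dis.dis(f)
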
    Guido indicated [2] that this change should be handled by a PEP,
    because there were a few edge cases with other string operators,
    such as the % operator.  (This assumes that str % stays -- it may be
    eliminated in favor of PEP 3101 -- Advanced String Formatting.  [3] [4])
    
    The resolution is to treat them the same as today.

        ("abc %s def" + "ghi" % var)  # fails like today.
                                      # raises TypeError because of
                                      # precedence.  (% before +)
        
        ("abc" + "def %s ghi" % var)  # works like today; precedence makes
                                      # the optimization more difficult to
                                      # recognize, but does not change the
                                      # semantics.

        ("abc %s def" + "ghi") % var  # works like today, because of
                                      # precedence:  () before % 
                                      # CPython compiler can already 
                                      # add the literals at compile-time.
    
    
References

    [1] Implicit String Concatenation, Jewett, Orendorff
        http://mail.python.org/pipermail/python-ideas/2007-April/000397.html

    [2] Reminder: Py3k PEPs due by April, Hettinger, van Rossum
        http://mail.python.org/pipermail/python-3000/2007-April/006563.html

    [3] PEP 3101, Advanced String Formatting, Talin
        http://www.python.org/peps/pep-3101.html

    [4] ps to question Re: Need help completing ABC pep, van Rossum
        http://mail.python.org/pipermail/python-3000/2007-April/006737.html

Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

From guido at python.org  Tue May  1 00:47:19 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Apr 2007 15:47:19 -0700
Subject: [Python-3000] octal literals PEP
In-Reply-To: <d09829f50704301251r35f05bf8wf7cf5397e3751faa@mail.gmail.com>
References: <fb6fbf560704300749r538c8674lc965f2dfcb67162e@mail.gmail.com>
	<d09829f50704301251r35f05bf8wf7cf5397e3751faa@mail.gmail.com>
Message-ID: <ca471dc20704301547l53495009p5c458ac0d16c181c@mail.gmail.com>

The PEP editors have admitted to being behind on the job. AFAIK PEPs
sent to the PEP editors before the deadline are in, regardless of when
the PEP goes online.

To save the PEP editors the effort, if you send it to me I will assign
it a PEP number and submit it. (Ditto for other PEPs in the same
situation.)

--Guido

On 4/30/07, Patrick Maupin <pmaupin at gmail.com> wrote:
> I sent an email with an initial PEP to the PEP editors a few weeks
> ago.  Never got a reply.  I noticed some traffic about this recently
> but was too busy to follow it really carefully.
>
> Pat
>
> On 4/30/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > On 4/30/07, Guido van Rossum <guido at python.org> wrote:
> > > I think these should be two separate proposals, with more specific
> > > names (e.g. "remove implicit string concatenation" and "remove
> > > backslash continuation"). There's no need to mention the octal thing
> > > if it's already a separate PEP.
> >
> > Patrick
> >
> > Guido had set an Apr 30 deadline for Py3000 PEPs that can't be
> > implemented in pure python.
> >
> > Are you still working on the "Integer literal syntax and radices ",
> > which included the octal literal?  I would much prefer to leave octal
> > literals with the rest of that PEP, (and to let you do it :D),  but I
> > will submit a much-simplified "023 raises SyntaxError" if you have
> > abandoned the rest.
> >
> > -jJ
> >
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Tue May  1 00:54:33 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 30 Apr 2007 18:54:33 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions, Interfaces,
	etc.
Message-ID: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>

This is just the first draft (also checked into SVN), and doesn't include 
the details of how the extension API works (so that third-party interfaces 
and generic functions can interoperate using the same decorators, 
annotations, etc.).

Comments and questions appreciated, as it'll help drive better explanations 
of both the design and rationales.  I'm usually not that good at guessing 
what other people will want to know (or are likely to misunderstand) until 
I get actual questions.


PEP: 3124
Title: Overloading, Generic Functions, Interfaces, and Adaptation
Version: $Revision: 55029 $
Last-Modified: $Date: 2007-04-30 18:48:06 -0400 (Mon, 30 Apr 2007) $
Author: Phillip J. Eby <pje at telecommunity.com>
Discussions-To: Python 3000 List <python-3000 at python.org>
Status: Draft
Type: Standards Track
Requires: 3107, 3115, 3119
Replaces: 245, 246
Content-Type: text/x-rst
Created: 28-Apr-2007
Post-History: 30-Apr-2007


Abstract
========

This PEP proposes a new standard library module, ``overloading``, to
provide generic programming features including dynamic overloading
(aka generic functions), interfaces, adaptation, method combining (ala
CLOS and AspectJ), and simple forms of aspect-oriented programming.

The proposed API is also open to extension; that is, it will be
possible for library developers to implement their own specialized
interface types, generic function dispatchers, method combination
algorithms, etc., and those extensions will be treated as first-class
citizens by the proposed API.

The API will be implemented in pure Python with no C, but may have
some dependency on CPython-specific features such as ``sys._getframe``
and the ``func_code`` attribute of functions.  It is expected that
e.g. Jython and IronPython will have other ways of implementing
similar functionality (perhaps using Java or C#).


Rationale and Goals
===================

Python has always provided a variety of built-in and standard-library
generic functions, such as ``len()``, ``iter()``, ``pprint.pprint()``,
and most of the functions in the ``operator`` module.  However, it
currently:

1. does not have a simple or straightforward way for developers to
    create new generic functions,

2. does not have a standard way for methods to be added to existing
    generic functions (i.e., some are added using registration
    functions, others require defining ``__special__`` methods,
    possibly by monkeypatching), and

3. does not allow dispatching on multiple argument types (except in
    a limited form for arithmetic operators, where "right-hand"
    (``__r*__``) methods can be used to do two-argument dispatch).

In addition, it is currently a common anti-pattern for Python code
to inspect the types of received arguments, in order to decide what
to do with the objects.  For example, code may wish to accept either
an object of some type, or a sequence of objects of that type.

Currently, the "obvious way" to do this is by type inspection, but
this is brittle and closed to extension.  A developer using an
already-written library may be unable to change how their objects are
treated by such code, especially if the objects they are using were
created by a third party.

Therefore, this PEP proposes a standard library module to address
these, and related issues, using decorators and argument annotations
(PEP 3107).  The primary features to be provided are:

* a dynamic overloading facility, similar to the static overloading
   found in languages such as Java and C++, but including optional
   method combination features as found in CLOS and AspectJ.

* a simple "interfaces and adaptation" library inspired by Haskell's
   typeclasses (but more dynamic, and without any static type-checking),
   with an extension API to allow registering user-defined interface
   types such as those found in PyProtocols and Zope.

* a simple "aspect" implementation to make it easy to create stateful
   adapters and to do other stateful AOP.

These features are to be provided in such a way that extended
implementations can be created and used.  For example, it should be
possible for libraries to define new dispatching criteria for
generic functions, and new kinds of interfaces, and use them in
place of the predefined features.  For example, it should be possible
to use a ``zope.interface`` interface object to specify the desired
type of a function argument, as long as the ``zope.interface`` package
registered itself correctly (or a third party did the registration).

In this way, the proposed API simply offers a uniform way of accessing
the functionality within its scope, rather than prescribing a single
implementation to be used for all libraries, frameworks, and
applications.


User API
========

The overloading API will be implemented as a single module, named
``overloading``, providing the following features:


Overloading/Generic Functions
-----------------------------

The ``@overload`` decorator allows you to define alternate
implementations of a function, specialized by argument type(s).  A
function with the same name must already exist in the local namespace.
The existing function is modified in-place by the decorator to add
the new implementation, and the modified function is returned by the
decorator.  Thus, the following code::

     from overloading import overload
     from collections import Iterable

     def flatten(ob):
         """Flatten an object to its component iterables"""
         yield ob

     @overload
     def flatten(ob: Iterable):
         for o in ob:
             for ob in flatten(o):
                 yield ob

     @overload
     def flatten(ob: basestring):
         yield ob

creates a single ``flatten()`` function whose implementation roughly
equates to::

     def flatten(ob):
         if isinstance(ob, basestring) or not isinstance(ob, Iterable):
             yield ob
         else:
             for o in ob:
                 for ob in flatten(o):
                     yield ob

**except** that the ``flatten()`` function defined by overloading
remains open to extension by adding more overloads, while the
hardcoded version cannot be extended.

For example, if someone wants to use ``flatten()`` with a string-like
type that doesn't subclass ``basestring``, they would be out of luck
with the second implementation.  With the overloaded implementation,
however, they can either write this::

     @overload
     def flatten(ob: MyString):
         yield ob

or this (to avoid copying the implementation)::

     from overloading import RuleSet
     RuleSet(flatten).copy_rules((basestring,), (MyString,))

(Note also that, although PEP 3119 proposes that it should be possible
for abstract base classes like ``Iterable`` to allow classes like
``MyString`` to claim subclass-hood, such a claim is *global*,
throughout the application.  In contrast, adding a specific overload
or copying a rule is specific to an individual function, and therefore
less likely to have undesired side effects.)


``@overload`` vs. ``@when``
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``@overload`` decorator is a common-case shorthand for the more
general ``@when`` decorator.  It allows you to leave out the name of
the function you are overloading, at the expense of requiring the
target function to be in the local namespace.  It also doesn't support
adding additional criteria besides the ones specified via argument
annotations.  The following function definitions have identical
effects, except for name binding side-effects (which will be described
below)::

     @overload
     def flatten(ob: basestring):
         yield ob

     @when(flatten)
     def flatten(ob: basestring):
         yield ob

     @when(flatten)
     def flatten_basestring(ob: basestring):
         yield ob

     @when(flatten, (basestring,))
     def flatten_basestring(ob):
         yield ob

The first definition above will bind ``flatten`` to whatever it was
previously bound to.  The second will do the same, if it was already
bound to the ``when`` decorator's first argument.  If ``flatten`` is
unbound or bound to something else, it will be rebound to the function
definition as given.  The last two definitions above will always bind
``flatten_basestring`` to the function definition as given.

Using this approach allows you to both give a method a descriptive
name (often useful in tracebacks!) and to reuse the method later.

Except as otherwise specified, all ``overloading`` decorators have the
same signature and binding rules as ``@when``.  They accept a function
and an optional "predicate" object.

The default predicate implementation is a tuple of types with
positional matching to the overloaded function's arguments.  However,
an arbitrary number of other kinds of predicates can be created and
registered using the `Extension API`_, and will then be usable with
``@when`` and other decorators created by this module (like
``@before``, ``@after``, and ``@around``).


Method Combination and Overriding
---------------------------------

When an overloaded function is invoked, the implementation with the
signature that *most specifically matches* the calling arguments is
the one used.  If no implementation matches, a ``NoApplicableMethods``
error is raised.  If more than one implementation matches, but none of
the signatures are more specific than the others, an ``AmbiguousMethods``
error is raised.

For example, the following pair of implementations are ambiguous, if
the ``foo()`` function is ever called with two integer arguments,
because both signatures would apply, but neither signature is more
*specific* than the other (i.e., neither implies the other)::

     def foo(bar:int, baz:object):
         pass

     @overload
     def foo(bar:object, baz:int):
         pass

In contrast, the following pair of implementations can never be
ambiguous, because one signature always implies the other; the
``int/int`` signature is more specific than the ``object/object``
signature::

     def foo(bar:object, baz:object):
         pass

     @overload
     def foo(bar:int, baz:int):
         pass

A signature S1 implies another signature S2, if whenever S1 would
apply, S2 would also.  A signature S1 is "more specific" than another
signature S2, if S1 implies S2, but S2 does not imply S1.

Although the examples above have all used concrete or abstract types
as argument annotations, there is no requirement that the annotations
be such.  They can also be "interface" objects (discussed in the
`Interfaces and Adaptation`_ section), including user-defined
interface types.  (They can also be other objects whose types are
appropriately registered via  the `Extension API`_.)


Proceeding to the "Next" Method
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the first parameter of an overloaded function is named
``__proceed__``, it will be passed a callable representing the next
most-specific method.  For example, this code::

     def foo(bar:object, baz:object):
         print "got objects!"

     @overload
     def foo(__proceed__, bar:int, baz:int):
         print "got integers!"
         return __proceed__(bar, baz)

will print "got integers!" followed by "got objects!".

If there is no next most-specific method, ``__proceed__`` will be
bound to a ``NoApplicableMethods`` instance.  When called, a new
``NoApplicableMethods`` instance will be raised, with the arguments
passed to the first instance.

Similarly, if the next most-specific methods have ambiguous precedence
with respect to each other, ``__proceed__`` will be bound to an
``AmbiguousMethods`` instance, and if called, it will raise a new
instance.

Thus, a method can either check if ``__proceed__`` is an error
instance, or simply invoke it.  The ``NoApplicableMethods`` and
``AmbiguousMethods`` error classes have a common ``DispatchError``
base class, so ``isinstance(__proceed__, overloading.DispatchError)``
is sufficient to identify whether ``__proceed__`` can be safely
called.
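
For example, a method that wants to handle the "no next method" case
itself, rather than letting the error propagate, might be written like
this (an illustrative sketch; the names are invented, and here the
check is purely defensive since the base case always applies)::

     from overloading import overload, DispatchError

     def describe(ob):
         # base case: least-specific implementation
         return "some object"

     @overload
     def describe(__proceed__, ob: int):
         if isinstance(__proceed__, DispatchError):
             # defensive fallback if no less-specific method applies
             return "just an int"
         # otherwise, delegate to the next most-specific method
         return "int: " + __proceed__(ob)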

(Implementation note: using a magic argument name like ``__proceed__``
could potentially be replaced by a magic function that would be called
to obtain the next method.  A magic function, however, would degrade
performance and might be more difficult to implement on non-CPython
platforms.  Method chaining via magic argument names, however, can be
efficiently implemented on any Python platform that supports creating
bound methods from functions -- one simply recursively binds each
function to be chained, using the following function or error as the
``im_self`` of the bound method.)
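
(A minimal sketch of that chaining technique, independent of the rest
of the API -- the function names here are invented for illustration)::

     def no_method(*args, **kw):
         # stand-in for a NoApplicableMethods instance
         raise LookupError("no applicable methods")

     def chain(methods, fallback=no_method):
         # Bind each function to the next callable, so that the bound
         # method's "self" (the next callable) arrives as __proceed__.
         next_method = fallback
         for func in reversed(methods):
             next_method = func.__get__(next_method)
         return next_method

     def generic(__proceed__, ob):
         return ["generic"]

     def specific(__proceed__, ob):
         return ["specific"] + __proceed__(ob)

     assert chain([specific, generic])(42) == ["specific", "generic"]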


"Before" and "After" Methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In addition to the simple next-method chaining shown above, it is
sometimes useful to have other ways of combining methods.  For
example, the "observer pattern" can sometimes be implemented by adding
extra methods to a function, that execute before or after the normal
implementation.

To support these use cases, the ``overloading`` module will supply
``@before``, ``@after``, and ``@around`` decorators, that roughly
correspond to the same types of methods in the Common Lisp Object
System (CLOS), or the corresponding "advice" types in AspectJ.

Like ``@when``, all of these decorators must be passed the function to
be overloaded, and can optionally accept a predicate as well::

     def begin_transaction(db):
         print "Beginning the actual transaction"


     @before(begin_transaction)
     def check_single_access(db: SingletonDB):
         if db.inuse:
             raise TransactionError("Database already in use")

     @after(begin_transaction)
     def start_logging(db: LoggableDB):
         db.set_log_level(VERBOSE)


``@before`` and ``@after`` methods are invoked either before or after
the main function body, and are *never considered ambiguous*.  That
is, it will not cause any errors to have multiple "before" or "after"
methods with identical or overlapping signatures.  Ambiguities are
resolved using the order in which the methods were added to the
target function.

"Before" methods are invoked most-specific method first, with
ambiguous methods being executed in the order they were added.  All
"before" methods are called before any of the function's "primary"
methods (i.e. normal ``@overload`` methods) are executed.

"After" methods are invoked in the *reverse* order, after all of the
function's "primary" methods are executed.  That is, they are executed
least-specific methods first, with ambiguous methods being executed in
the reverse of the order in which they were added.

The return values of both "before" and "after" methods are ignored,
and any uncaught exceptions raised by *any* methods (primary or other)
immediately end the dispatching process.  "Before" and "after" methods
cannot have ``__proceed__`` arguments, as they are not responsible
for calling any other methods.  They are simply called as a
notification before or after the primary methods.

Thus, "before" and "after" methods can be used to check or establish
preconditions (e.g. by raising an error if the conditions aren't met)
or to ensure postconditions, without needing to duplicate any existing
functionality.


"Around" Methods
~~~~~~~~~~~~~~~~

The ``@around`` decorator declares a method as an "around" method.
"Around" methods are much like primary methods, except that the
least-specific "around" method has higher precedence than the
most-specific "before" or method.

Unlike "before" and "after" methods, however, "Around" methods *are*
responsible for calling their ``__proceed__`` argument, in order to
continue the invocation process.  "Around" methods are usually used
to transform input arguments or return values, or to wrap specific
cases with special error handling or try/finally conditions, e.g.::

     @around(commit_transaction)
     def lock_while_committing(__proceed__, db: SingletonDB):
         with db.global_lock:
             return __proceed__(db)

They can also be used to replace the normal handling for a specific
case, by *not* invoking the ``__proceed__`` function.

The ``__proceed__`` given to an "around" method will either be the
next applicable "around" method, a ``DispatchError`` instance,
or a synthetic method object that will call all the "before" methods,
followed by the primary method chain, followed by all the "after"
methods, and return the result from the primary method chain.

Thus, just as with normal methods, ``__proceed__`` can be checked for
``DispatchError``-ness, or simply invoked.  The "around" method should
return the value returned by ``__proceed__``, unless of course it
wishes to modify or replace it with a different return value for the
function as a whole.


Custom Combinations
~~~~~~~~~~~~~~~~~~~

The decorators described above (``@overload``, ``@when``, ``@before``,
``@after``, and ``@around``) collectively implement what in CLOS is
called the "standard method combination" -- the most common patterns
used in combining methods.

Sometimes, however, an application or library may have use for a more
sophisticated type of method combination.  For example, if you
would like to have "discount" methods that return a percentage off,
to be subtracted from the value returned by the primary method(s),
you might write something like this::

     from overloading import always_overrides, merge_by_default
     from overloading import Around, Before, After, Method, MethodList

     class Discount(MethodList):
         """Apply return values as discounts"""

         def __call__(self, *args, **kw):
             retval = self.tail(*args, **kw)
             for sig, body in self.sorted():
                 retval -= retval * body(*args, **kw)
             return retval

     # merge discounts by priority
     merge_by_default(Discount)

     # discounts have precedence over before/after/primary methods
     always_overrides(Discount, Before)
     always_overrides(Discount, After)
     always_overrides(Discount, Method)

     # but not over "around" methods
     always_overrides(Around, Discount)

     # Make a decorator called "discount" that works just like the
     # standard decorators...
     discount = Discount.make_decorator('discount')

     # and now let's use it...
     def price(product):
         return product.list_price

     @discount(price)
      def ten_percent_off_shoes(product: Shoe):
         return Decimal('0.1')

Similar techniques can be used to implement a wide variety of
CLOS-style method qualifiers and combination rules.  The process of
creating custom method combination objects and their corresponding
decorators is described in more detail under the `Extension API`_
section.

Note, by the way, that the ``@discount`` decorator shown will work
correctly with any new predicates defined by other code.  For example,
if ``zope.interface`` were to register its interface types to work
correctly as argument annotations, you would be able to specify
discounts on the basis of its interface types, not just classes or
``overloading``-defined interface types.

Similarly, if a library like RuleDispatch or PEAK-Rules were to
register an appropriate predicate implementation and dispatch engine,
one would then be able to use those predicates for discounts as well,
e.g.::

     from somewhere import Pred  # some predicate implementation

     @discount(
         price,
         Pred("isinstance(product,Shoe) and"
              " product.material.name=='Blue Suede'")
     )
     def forty_off_blue_suede_shoes(product):
         return Decimal('0.4')

The process of defining custom predicate types and dispatching engines
is also described in more detail under the `Extension API`_ section.


Overloading Inside Classes
--------------------------

All of the decorators above have a special additional behavior when
they are directly invoked within a class body: the first parameter
(other than ``__proceed__``, if present) of the decorated function
will be treated as though it had an annotation equal to the class
in which it was defined.

That is, this code::

     class And(object):
         # ...
         @when(get_conjuncts)
         def __conjuncts(self):
             return self.conjuncts

produces the same effect as this (apart from the existence of a
private method)::

     class And(object):
         # ...

     @when(get_conjuncts)
     def get_conjuncts_of_and(ob: And):
         return ob.conjuncts

This behavior is both a convenience enhancement when defining lots of
methods, and a requirement for safely distinguishing multi-argument
overloads in subclasses.  Consider, for example, the following code::

     class A(object):
         def foo(self, ob):
             print "got an object"

         @overload
         def foo(__proceed__, self, ob:Iterable):
             print "it's iterable!"
             return __proceed__(self, ob)


     class B(A):
         foo = A.foo     # foo must be defined in local namespace

         @overload
         def foo(__proceed__, self, ob:Iterable):
             print "B got an iterable!"
             return __proceed__(self, ob)

Due to the implicit class rule, calling ``B().foo([])`` will print
"B got an iterable!" followed by "it's iterable!", and finally,
"got an object", while ``A().foo([])`` would print only the messages
defined in ``A``.

Conversely, without the implicit class rule, the two "Iterable"
methods would have the exact same applicability conditions, so calling
either ``A().foo([])`` or ``B().foo([])`` would result in an
``AmbiguousMethods`` error.

It is currently an open issue to determine the best way to implement
this rule in Python 3.0.  Under Python 2.x, a class' metaclass was
not chosen until the end of the class body, which means that
decorators could insert a custom metaclass to do processing of this
sort.  (This is how RuleDispatch, for example, implements the implicit
class rule.)

PEP 3115, however, requires that a class' metaclass be determined
*before* the class body has executed, making it impossible to use this
technique for class decoration any more.

At this writing, discussion on this issue is ongoing.


Interfaces and Adaptation
-------------------------

The ``overloading`` module provides a simple implementation of
interfaces and adaptation.  The following example defines an
``IStack`` interface, and declares that ``list`` objects support it::

     from overloading import abstract, Interface

     class IStack(Interface):
         @abstract
          def push(self, ob):
             """Push 'ob' onto the stack"""

         @abstract
         def pop(self):
             """Pop a value and return it"""


     when(IStack.push, (list, object))(list.append)
     when(IStack.pop, (list,))(list.pop)

     mylist = []
     mystack = IStack(mylist)
     mystack.push(42)
     assert mystack.pop()==42

The ``Interface`` class is a kind of "universal adapter".  It accepts
a single argument: an object to adapt.  It then binds all its methods
to the target object, in place of itself.  Thus, calling
``mystack.push(42)`` is the same as calling
``IStack.push(mylist, 42)``.

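(To illustrate the mechanism only -- this is *not* the proposed
``Interface`` implementation -- a toy "universal adapter" could be
written along these lines)::

     import types

     class UniversalAdapter(object):
         """Toy illustration of the 'bind to the adapted object' idea."""
         def __init__(self, ob):
             # Rebind each function defined on the subclass so that its
             # 'self' is the adapted object rather than this adapter.
             for name, func in vars(type(self)).items():
                 if isinstance(func, types.FunctionType):
                     setattr(self, name, func.__get__(ob))

     class ListAsStack(UniversalAdapter):
         def push(self, ob):
             self.append(ob)        # 'self' here is the adapted list
         def pop(self):
             return list.pop(self)

     s = ListAsStack([])
     s.push(42)
     assert s.pop() == 42
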
The ``@abstract`` decorator marks a function as being abstract: i.e.,
having no implementation.  If an ``@abstract`` function is called,
it raises ``NoApplicableMethods``.  To become executable, overloaded
methods must be added using the techniques previously described. (That
is, methods can be added using ``@when``, ``@before``, ``@after``,
``@around``, or any custom method combination decorators.)

In the example above, the ``list.append`` method is added as a method
for ``IStack.push()`` when its arguments are a list and an arbitrary
object.  Thus, ``IStack.push(mylist, 42)`` is translated to
``list.append(mylist, 42)``, thereby implementing the desired
operation.

(Note: the ``@abstract`` decorator is not limited to use in interface
definitions; it can be used anywhere that you wish to create an
"empty" generic function that initially has no methods.  In
particular, it need not be used inside a class.)


Subclassing and Re-assembly
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Interfaces can be subclassed::

     class ISizedStack(IStack):
         @abstract
         def __len__(self):
             """Return the number of items on the stack"""

     # define __len__ support for ISizedStack
     when(ISizedStack.__len__, (list,))(list.__len__)

Or assembled by combining functions from existing interfaces::

     class Sizable(Interface):
         __len__ = ISizedStack.__len__

     # list now implements Sizable as well as ISizedStack, without
     # making any new declarations!

A class can be considered to "adapt to" an interface at a given
point in time, if no method defined in the interface is guaranteed to
raise a ``NoApplicableMethods`` error if invoked on an instance of
that class at that point in time.

In normal usage, however, it is "easier to ask forgiveness than
permission".  That is, it is easier to simply use an interface on
an object by adapting it to the interface (e.g. ``IStack(mylist)``)
or invoking interface methods directly (e.g. ``IStack.push(mylist,
42)``), than to try to figure out whether the object is adaptable to
(or directly implements) the interface.


Implementing an Interface in a Class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is possible to declare that a class directly implements an
interface, using the ``declare_implementation()`` function::

     from overloading import declare_implementation

     class Stack(object):
         def __init__(self):
             self.data = []
         def push(self, ob):
             self.data.append(ob)
         def pop(self):
             return self.data.pop()

     declare_implementation(IStack, Stack)

The ``declare_implementation()`` call above is roughly equivalent to
the following steps::

     when(IStack.push, (Stack,object))(lambda self, ob: self.push(ob))
     when(IStack.pop, (Stack,))(lambda self: self.pop())

That is, calling ``IStack.push()`` or ``IStack.pop()`` on an instance
of any subclass of ``Stack``, will simply delegate to the actual
``push()`` or ``pop()`` methods thereof.

For the sake of efficiency, calling ``IStack(s)`` where ``s`` is an
instance of ``Stack``, **may** return ``s`` rather than an ``IStack``
adapter.  (Note that calling ``IStack(x)`` where ``x`` is already an
``IStack`` adapter will always return ``x`` unchanged; this is an
additional optimization allowed in cases where the adaptee is known
to *directly* implement the interface, without adaptation.)

For convenience, it may be useful to declare implementations in the
class header, e.g.::

     class Stack(metaclass=Implementer, implements=IStack):
         ...

instead of calling ``declare_implementation()`` after the end of the
suite.


Interfaces as Type Specifiers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``Interface`` subclasses can be used as argument annotations to
indicate what type of objects are acceptable to an overload, e.g.::

     @overload
     def traverse(g: IGraph, s: IStack):
         g = IGraph(g)
         s = IStack(s)
         # etc....

Note, however, that the actual arguments are *not* changed or adapted
in any way by the mere use of an interface as a type specifier.  You
must explicitly cast the objects to the appropriate interface, as
shown above.

Note also that other patterns of interface use are possible.
For example, other interface implementations might not support
adaptation, or might require that function arguments already be
adapted to the specified interface.  So the exact semantics of using
an interface as a type specifier are dependent on the interface
objects you actually use.

For the interface objects defined by this PEP, however, the semantics
are as described above.  An interface I1 is considered "more specific"
than another interface I2, if the set of descriptors in I1's
inheritance hierarchy is a proper superset of the descriptors in I2's
inheritance hierarchy.

So, for example, ``ISizedStack`` is more specific than both
``Sizable`` and ``IStack``, irrespective of the inheritance
relationships between these interfaces.  It is purely a question of
what operations are included within those interfaces -- and the
*names* of the operations are unimportant.

Interfaces (at least the ones provided by ``overloading``) are always
considered less-specific than concrete classes.  Other interface
implementations can decide on their own specificity rules, both
between interfaces and other interfaces, and between interfaces and
classes.


Non-Method Attributes in Interfaces
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``Interface`` implementation actually treats all attributes and
methods (i.e. descriptors) in the same way: their ``__get__`` (and
``__set__`` and ``__delete__``, if present) methods are called with
the wrapped (adapted) object as "self".  For functions, this has the
effect of creating a bound method linking the generic function to the
wrapped object.

For non-function attributes, it may be easiest to specify them using
the ``property`` built-in, and the corresponding ``fget``, ``fset``,
and ``fdel`` attributes::

     class ILength(Interface):
         @property
         @abstract
         def length(self):
             """Read-only length attribute"""

     # ILength(aList).length == list.__len__(aList)
     when(ILength.length.fget, (list,))(list.__len__)

Alternatively, methods such as ``_get_foo()`` and ``_set_foo()``
may be defined as part of the interface, and the property defined
in terms of those methods, but this is a bit more difficult for users
to implement correctly when creating a class that directly implements
the interface, as they would then need to match all the individual
method names, not just the name of the property or attribute.
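
A very rough sketch of that alternative, using the same hypothetical
API as above (note the extra accessor name implementors must match)::

     class ILength(Interface):
         @abstract
         def _get_length(self):
             """Return the length"""

         # the public attribute is defined in terms of the accessor
         length = property(_get_length)

     # implementors now register against the private accessor's name
     when(ILength._get_length, (list,))(list.__len__)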


Aspects
-------

The adaptation system provided assumes that adapters are "stateless",
which is to say that adapters have no attributes or storage apart from
those of the adapted object.  This follows the "typeclass/instance"
model of Haskell, and the concept of "pure" (i.e., transitively
composable) adapters.

However, there are occasionally cases where, to provide a complete
implementation of some interface, some sort of additional state is
required.

One possibility, of course, would be to attach monkeypatched "private"
attributes to the adaptee.  But this is subject to name collisions,
and complicates the process of initialization.  It also doesn't work
on objects that don't have a ``__dict__`` attribute.

So the ``Aspect`` class is provided to make it easy to attach extra
information to objects that either:

1. have a ``__dict__`` attribute (so aspect instances can be stored
    in it, keyed by aspect class),

2. support weak referencing (so aspect instances can be managed using
    a global but thread-safe weak-reference dictionary), or

3. implement or can be adapted to the ``overloading.IAspectOwner``
    interface (technically, #1 or #2 implies this)

Subclassing ``Aspect`` creates an adapter class whose state is tied
to the life of the adapted object.

For example, suppose you would like to count all the times a certain
method is called on instances of ``Target`` (a classic AOP example).
You might do something like::

     from overloading import Aspect

     class Count(Aspect):
         count = 0

     @after(Target.some_method)
     def count_after_call(self, *args, **kw):
         Count(self).count += 1

The above code will keep track of the number of times that
``Target.some_method()`` is successfully called (i.e., it will not
count errors).  Other code can then access the count using
``Count(someTarget).count``.

``Aspect`` instances can of course have ``__init__`` methods, to
initialize any data structures.  They can use either ``__slots__``
or dictionary-based attributes for storage.

While this facility is rather primitive compared to a full-featured
AOP tool like AspectJ, persons who wish to build pointcut libraries
or other AspectJ-like features can certainly use ``Aspect`` objects
and method-combination decorators as a base for more expressive AOP
tools.

XXX spec out full aspect API, including keys, N-to-1 aspects, manual
     attach/detach/delete of aspect instances, and the ``IAspectOwner``
     interface.


Extension API
=============

TODO: explain how all of these work

implies(o1, o2)

declare_implementation(iface, class)

predicate_signatures(ob)

parse_rule(ruleset, body, predicate, actiontype, localdict, globaldict)

combine_actions(a1, a2)

rules_for(f)

Rule objects

ActionDef objects

RuleSet objects

Method objects

MethodList objects

IAspectOwner



Implementation Notes
====================

Most of the functionality described in this PEP is already implemented
in the in-development version of the PEAK-Rules framework.  In
particular, the basic overloading and method combination framework
(minus the ``@overload`` decorator) already exists there.  The
implementation of all of these features in ``peak.rules.core`` is 656
lines of Python at this writing.

``peak.rules.core`` currently relies on the DecoratorTools and
BytecodeAssembler modules, but both of these dependencies can be
replaced, as DecoratorTools is used mainly for Python 2.3
compatibility and to implement structure types (which can be done
with named tuples in later versions of Python).  The use of
BytecodeAssembler can be replaced using an "exec" or "compile"
workaround, given a reasonable effort.  (It would be easier to do this
if the ``func_closure`` attribute of function objects was writable.)

The ``Interface`` class has been previously prototyped, but is not
included in PEAK-Rules at the present time.

The "implicit class rule" has previously been implemented in the
RuleDispatch library.  However, it relies on the ``__metaclass__``
hook that is currently eliminated in PEP 3115.

I don't currently know how to make ``@overload`` play nicely with
``classmethod`` and ``staticmethod`` in class bodies.  It's not really
clear if it needs to, however.


Copyright
=========

This document has been placed in the public domain.


From guido at python.org  Tue May  1 00:54:04 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Apr 2007 15:54:04 -0700
Subject: [Python-3000] super(), class decorators, and PEP 3115
In-Reply-To: <5.1.1.6.0.20070430152320.02d31868@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430152320.02d31868@sparrow.telecommunity.com>
Message-ID: <ca471dc20704301554p590ccbedvc5727e346300e6ce@mail.gmail.com>

On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 12:17 PM 4/30/2007 -0700, Guido van Rossum wrote:
> >Assuming class decorators are added, can't you do all of this using a
> >custom metaclass?
>
> The only thing I need for the GF PEP is a way for a method decorator to get
> a callback after the class is created, so that overloading will work
> correctly in cases where overloaded methods are defined in a subclass.

I still don't understand why you can't tell the users "for this to
work, you must use my special magic super-duper metaclass defined
*here*". Surely a sufficiently advanced metaclass can pull of this
kind of magic in its __init__ method? If not a metaclass, then a
super-duper decorator. Or what am I missing?

> In essence, when you define an overloaded method inside a class body, you
> would like to be able to treat it as if it were defined with
> "self:__class__", where __class__ is the enclosing class.  In practice,
> this means that the actual overloading has to wait until the class
> definition is finished.
>
> In Python 2.x, RuleDispatch implements this by temporary tinkering with
> __metaclass__, but if I understand correctly this would not be possible
> with PEP 3115.  I didn't make this connection until I was fleshing out my
> PEP's explanation of how precedence works when you are overloading instance
> methods (as opposed to standalone functions).

Correct. As the word tinkering implies, you'll have to come up with a
different approach.

> If PEP 3115 were changed to restore support for __metaclass__, I could
> continue to use that approach.  Otherwise, some other sort of hook is required.

I'm -1 on augmenting PEP 3115 for this purpose.

> The class decorator thing isn't an issue for the GF PEP as such; it doesn't
> use them directly, only via the __metaclass__ hack.  I just brought it up
> because I was looking for the class decorator PEP when I realized that the
> old way of doing them wouldn't be possible any more.

As long as someone's working on it (which I hear someone is), the
class decorator PEP is secure; the actual discussion was closed
successfully weeks ago.

But I don't understand how a __metaclass__ hack can use a class decorator.

> >I'm not sure that your proposal for implementing an improved super has
> >anything over the currently most-favored proposal by Timothy Delaney.
>
> It's merely another use for the hook, that would save on having another
> special-purpose mechanism strictly for super(); I figured that having other
> uses for it (besides mine) would be a plus.

I'd leave that up to the folks currently discussing super.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jimjjewett at gmail.com  Tue May  1 00:55:36 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 30 Apr 2007 18:55:36 -0400
Subject: [Python-3000] super() PEP
In-Reply-To: <cc93256f0704301536r2ab74894xfe15862a573263b4@mail.gmail.com>
References: <5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<ca471dc20704301217h1c117375r635bcae0034d593f@mail.gmail.com>
	<011b01c78b6a$72098810$0201a8c0@ryoko>
	<43aa6ff70704301403q5bf557c2wf43148f7a339353d@mail.gmail.com>
	<014901c78b6e$d2d66d80$0201a8c0@ryoko>
	<cc93256f0704301536r2ab74894xfe15862a573263b4@mail.gmail.com>
Message-ID: <fb6fbf560704301555g3d5def19j4e9ccddf8c2dbe53@mail.gmail.com>

On 4/30/07, Lino Mastrodomenico <l.mastrodomenico at gmail.com> wrote:

> One more thing: what do people think of modifying super so that when
> it doesn't find a method instead of raising AttributeError it returns
> something like "lambda *args, **kwargs: None"?

To me, the most important change is correctness --
super(__this_class__, self) over super(Name, self).  Anything else is
at least debatable.

But of all the shortcuts mentioned, this particular shortcut is easily
the most valuable to me.  At one point, I had even considered giving
the super object a special method to upcall in this manner.

For what it's worth, in my own code, when I don't know whether or not
the next method exists, I will always be upcalling to the method of
the same name, and passing all my arguments.  Even changing the value
of one argument would be strange enough to count as a special case
worth spelling out.

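To make that concrete (made-up class names):

    class Base(object):
        def save(self, *args, **kwargs):
            return "base saved"

    class LoggingMixin(object):
        def save(self, *args, **kwargs):
            # same name, all arguments forwarded unchanged; whether a
            # "next" save() exists depends on the final class's MRO
            return super(LoggingMixin, self).save(*args, **kwargs)

    class Both(LoggingMixin, Base):
        pass

    Both().save()              # fine: Base.save is next in the MRO
    # LoggingMixin().save()    # AttributeError today; a NOP under the proposal
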
Alas, Guido's recent opinion was "Don't do that".  He suggested, at a
minimum, inheriting from an ABC that provided the Nothing method.

> Optionally this can be a constant (e.g. default_method) defined
> somewhere so, if necessary, it's still possible to detect if the value
> of super.meth is a real method or the "fake" default_method.

http://www.python.org/sf/1673203 is a patch for adding an identity
method; I suspect a Nothing in the builtins would also make sense.

-jJ

From guido at python.org  Tue May  1 00:55:55 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Apr 2007 15:55:55 -0700
Subject: [Python-3000] super(), class decorators, and PEP 3115
In-Reply-To: <43aa6ff70704301403q5bf557c2wf43148f7a339353d@mail.gmail.com>
References: <5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<ca471dc20704301217h1c117375r635bcae0034d593f@mail.gmail.com>
	<011b01c78b6a$72098810$0201a8c0@ryoko>
	<43aa6ff70704301403q5bf557c2wf43148f7a339353d@mail.gmail.com>
Message-ID: <ca471dc20704301555j4b7da755s67e32f4ee87b9183@mail.gmail.com>

On 4/30/07, Collin Winter <collinw at gmail.com> wrote:
> On 4/30/07, Tim Delaney <tcdelaney at optusnet.com.au> wrote:
> > Would you prefer me to work with Calvin to get his existing PEP to match my
> > proposal, or would you prefer a competing PEP?
>
> Please work together with Calvin. One PEP is enough.

And don't worry too much about the exact deadline; at this point super
is not a new proposal, we're just hashing out details (however
violently :-).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry at python.org  Tue May  1 00:56:45 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 30 Apr 2007 18:56:45 -0400
Subject: [Python-3000] octal literals PEP
In-Reply-To: <ca471dc20704301547l53495009p5c458ac0d16c181c@mail.gmail.com>
References: <fb6fbf560704300749r538c8674lc965f2dfcb67162e@mail.gmail.com>
	<d09829f50704301251r35f05bf8wf7cf5397e3751faa@mail.gmail.com>
	<ca471dc20704301547l53495009p5c458ac0d16c181c@mail.gmail.com>
Message-ID: <EC1476D8-D183-4B28-84EF-1914E6B38A67@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Apr 30, 2007, at 6:47 PM, Guido van Rossum wrote:

> The PEP editors have admitted to being behind on the job. AFAIK PEPs
> sent to the PEP editors before the deadline are in, regardless of when
> the PEP goes online.
>
> To save the PEP editors the effort, if you send it to me I will assign
> it a PEP number and submit it. (Ditto for other PEPs in the same
> situation.)

Thanks Guido.

peps at python dot org is now a mailing list and we will soon have  
three additional editors to help out.  Please also see my call for  
junior editors, just posted.

Cheers,
- -Barry


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRjZ0LnEjvBPtnXfVAQLEkwP9Gl4SJtg+H1w91djZ5Bo1Ef+MMTfpwqwM
Rpr6nxgKRCg1Xuzo7Y2aHzrXOvO05r/Lla5djUfHnH7SKsoeP71Kw9+jfGyM4DcL
l3dQ2YCc1vD4fEWB5jp1VwjFGxXaes6fVBF7ERN1G2yTxbmWzk4ugNijcUYkbGiM
rj4koq2YNds=
=pvzf
-----END PGP SIGNATURE-----

From brett at python.org  Tue May  1 00:57:10 2007
From: brett at python.org (Brett Cannon)
Date: Mon, 30 Apr 2007 15:57:10 -0700
Subject: [Python-3000] super() PEP
In-Reply-To: <cc93256f0704301536r2ab74894xfe15862a573263b4@mail.gmail.com>
References: <5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<ca471dc20704301217h1c117375r635bcae0034d593f@mail.gmail.com>
	<011b01c78b6a$72098810$0201a8c0@ryoko>
	<43aa6ff70704301403q5bf557c2wf43148f7a339353d@mail.gmail.com>
	<014901c78b6e$d2d66d80$0201a8c0@ryoko>
	<cc93256f0704301536r2ab74894xfe15862a573263b4@mail.gmail.com>
Message-ID: <bbaeab100704301557m19d99b3dy26be3f83f00b3199@mail.gmail.com>

On 4/30/07, Lino Mastrodomenico <l.mastrodomenico at gmail.com> wrote:
> 2007/4/30, Tim Delaney <tcdelaney at optusnet.com.au>:
> > Fine with me. Calvin - want to send me your latest draft, and I'll do some
> > modifications? I think we've got to the point now where we can take this
> > off-list.
>
> One more thing: what do people think of modifying super so that when
> it doesn't find a method instead of raising AttributeError it returns
> something like "lambda *args, **kwargs: None"?
>

Yuck.  That just smacks of JavaScript and its lax error detection.  There is
a reason they are adding a strict pragma in JS 2.0.


-Brett

From barry at python.org  Tue May  1 01:06:11 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 30 Apr 2007 19:06:11 -0400
Subject: [Python-3000] PEP 3119 - Introducing Abstract Base Classes
In-Reply-To: <fb6fbf560704271010j50a6319et27d519fc468afeb8@mail.gmail.com>
References: <ca471dc20704261150o45a3fa22u16a515b3d4aa14ba@mail.gmail.com>
	<04A4F15C-38C3-4727-875D-82803F4FB974@python.org>
	<fb6fbf560704271010j50a6319et27d519fc468afeb8@mail.gmail.com>
Message-ID: <FE2C6C62-4DDD-4CB1-BADB-95B6ECA9EF95@python.org>

On Apr 27, 2007, at 1:10 PM, Jim Jewett wrote:

> On 4/27/07, Barry Warsaw <barry at python.org> wrote:
>
>> - Attributes.  Interfaces allow you to make assertions about
>> attributes, not just methods, while ABCs necessarily cover only  
>> methods.
>
> Why can't they have data attributes as well?

They can /have/ data attributes, but that's not really the point.   
The point (IMHO) is that such attributes can be documented,  
inspected, and reasoned about.  You could annotate interface  
attributes with type information in order to automatically generate  
database tables or web forms, etc.  Normal Python attributes can't do  
that, although if they were properties, they could.
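
For instance, a sketch (not anyone's actual API, just an illustration) of
attribute specifications carrying enough metadata to drive that kind of
generation:

    class Attr:
        def __init__(self, kind, doc=""):
            self.kind = kind
            self.doc = doc

    class IPerson:
        """Illustrative interface-style specification."""
        name = Attr(str, "Full name")
        age = Attr(int, "Age in years")

    def columns(iface):
        # Derive (name, type) pairs, e.g. for a table schema or a web form.
        return [(k, v.kind.__name__) for k, v in vars(iface).items()
                if isinstance(v, Attr)]

    print(columns(IPerson))    # e.g. [('name', 'str'), ('age', 'int')]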

>> - With interfaces, you can make assertions about individual objects
>> which may be different than what their classes assert.  Interface
>> proponents seem to care a lot about this and it seems there are valid
>> uses cases for it.
>
> Isn't this something that could be handled by overriding isinstance?

It could.

>> Another example of separating inheritance and interface comes up when
>> you want to derive a subclass to share implementation details, but
>> you want to subtly change the semantics, which would invalidate an
>> ABC claim by the base class.  Something like a GrowOnlyDictionary
>> that derived from dict for implementation purposes, but didn't want
>> to implement __delitem__ as required by the MutableMapping ABC.
>
> OK, that makes the issubclass override trickier, so there should be an
> example, but I think it can still be done.
>
>> Finally, I'm concerned with the "weight" of adding ABCs to all the
>> built-in types.
>
> What if the builtin types did not initially derive from any ABC, but
> were added (through an issubclass override) when the abc module was
> imported?

That would allow for some unfortunately global side-effects.  Say I  
happen to import your library that imports abc.  Now all the built-in  
types in my entire application get globally changed.  I'm also not  
sure how you'd implement that.

Cheers,
-Barry


From barry at python.org  Tue May  1 01:08:08 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 30 Apr 2007 19:08:08 -0400
Subject: [Python-3000] PEP 3119 - Introducing Abstract Base Classes
In-Reply-To: <217F4CF1-3CC1-48CE-A635-877C22562C78@PageDNA.com>
References: <ca471dc20704261150o45a3fa22u16a515b3d4aa14ba@mail.gmail.com>
	<04A4F15C-38C3-4727-875D-82803F4FB974@python.org>
	<fb6fbf560704271010j50a6319et27d519fc468afeb8@mail.gmail.com>
	<07Apr27.104005pdt."57996"@synergy1.parc.xerox.com>
	<217F4CF1-3CC1-48CE-A635-877C22562C78@PageDNA.com>
Message-ID: <C6D72C9F-B094-4BB4-9AF8-2A6A32021CE6@python.org>

On Apr 27, 2007, at 2:17 PM, Tony Lownds wrote:

> +0 on abstract attributes. Methods seem to dominate most APIs that  
> make
> use of interfaces, but there are always a few exceptions.

One of the reasons to be able to specify attributes in an ABC or  
interface is so that you can use something more Pythonic than getters  
and setters.
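
That is, the specification can promise a plain attribute (or property)
rather than a pair of accessor methods.  A quick sketch of the contrast
(illustrative names only):

    # The accessor-style API an interface would otherwise have to spell out:
    class Account:
        def __init__(self):
            self._balance = 0
        def get_balance(self):
            return self._balance
        def set_balance(self, value):
            self._balance = value

    # The shape an attribute-aware ABC/interface could promise instead:
    class BetterAccount:
        def __init__(self):
            self._balance = 0
        def _get(self):
            return self._balance
        def _set(self, value):
            self._balance = value
        balance = property(_get, _set)

Client code then just reads and writes acct.balance.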

-Barry


From guido at python.org  Tue May  1 01:10:45 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Apr 2007 16:10:45 -0700
Subject: [Python-3000] super() PEP
In-Reply-To: <cc93256f0704301536r2ab74894xfe15862a573263b4@mail.gmail.com>
References: <5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<ca471dc20704301217h1c117375r635bcae0034d593f@mail.gmail.com>
	<011b01c78b6a$72098810$0201a8c0@ryoko>
	<43aa6ff70704301403q5bf557c2wf43148f7a339353d@mail.gmail.com>
	<014901c78b6e$d2d66d80$0201a8c0@ryoko>
	<cc93256f0704301536r2ab74894xfe15862a573263b4@mail.gmail.com>
Message-ID: <ca471dc20704301610w58086ff2p5a69db953f6bed42@mail.gmail.com>

On 4/30/07, Lino Mastrodomenico <l.mastrodomenico at gmail.com> wrote:
> 2007/4/30, Tim Delaney <tcdelaney at optusnet.com.au>:
> > Fine with me. Calvin - want to send me your latest draft, and I'll do some
> > modifications? I think we've got to the point now where we can take this
> > off-list.
>
> One more thing: what do people think of modifying super so that when
> it doesn't find a method instead of raising AttributeError it returns
> something like "lambda *args, **kwargs: None"?
>
> Optionally this can be a constant (e.g. default_method) defined
> somewhere so, if necessary, it's still possible to detect if the value
> of super.meth is a real method or the "fake" default_method.
>
> I think this can be useful when a method *doesn't know* if it's the
> last in the MRO because it may depend on the inheritance hierarchy of
> its subclasses: you can always simply call super.meth(...) and if the
> current method is the last this will be a NOP.

Most definitely not. If you don't even know whether you're defining or
overriding a method you shouldn't be using super in the first place,
because you're *obviously* not engaged in cooperative MI.

And don't get me started about __init__. Constructors don't do
cooperative MI, period.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May  1 01:13:44 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Apr 2007 16:13:44 -0700
Subject: [Python-3000] PEP 3119 - Introducing Abstract Base Classes
In-Reply-To: <C6D72C9F-B094-4BB4-9AF8-2A6A32021CE6@python.org>
References: <ca471dc20704261150o45a3fa22u16a515b3d4aa14ba@mail.gmail.com>
	<04A4F15C-38C3-4727-875D-82803F4FB974@python.org>
	<fb6fbf560704271010j50a6319et27d519fc468afeb8@mail.gmail.com>
	<217F4CF1-3CC1-48CE-A635-877C22562C78@PageDNA.com>
	<C6D72C9F-B094-4BB4-9AF8-2A6A32021CE6@python.org>
Message-ID: <ca471dc20704301613x4969abe3p6ca27a1ad9e1c2e4@mail.gmail.com>

On 4/30/07, Barry Warsaw <barry at python.org> wrote:
> On Apr 27, 2007, at 2:17 PM, Tony Lownds wrote:
> > +0 on abstract attributes. Methods seem to dominate most APIs that
> > make use of interfaces, but there are always a few exceptions.
>
> One of the reasons to be able to specify attributes in an ABC or
> interface is so that you can use something more Pythonic than getters
> and setters.

Even if support for abstract attributes is not provided by default in
py3k, it shouldn't be hard to add as a pure-python 3rd party add-on,
using a custom metaclass or a class decorator.
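
E.g. something along these lines already does most of the job (a sketch,
not a worked-out proposal):

    def requires_attributes(*names):
        # Class decorator: raise at class-definition time unless the class
        # (or a base class) provides every named attribute.
        def check(cls):
            missing = [n for n in names if not hasattr(cls, n)]
            if missing:
                raise TypeError("%s is missing attributes: %s"
                                % (cls.__name__, ", ".join(missing)))
            return cls
        return check

    @requires_attributes("encoding", "errors")
    class UTF8Codec:
        encoding = "utf-8"
        errors = "strict"      # remove this line and the definition fails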

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry at python.org  Tue May  1 01:16:09 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 30 Apr 2007 19:16:09 -0400
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <740c3aec0704301501u7df7b5a6uaea854d4716eb87e@mail.gmail.com>
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
	<1d36917a0704300816ma3bf9c2o4dd674cfcefa9172@mail.gmail.com>
	<-3456230403858254882@unknownmsgid>
	<740c3aec0704301501u7df7b5a6uaea854d4716eb87e@mail.gmail.com>
Message-ID: <F6F8D381-EC6F-4F6C-95F1-339476B78F47@python.org>

On Apr 30, 2007, at 6:01 PM, BJörn Lindqvist wrote:

> On 4/30/07, Bill Janssen <janssen at parc.com> wrote:
>>> On 4/30/07, Raymond Hettinger <python at rcn.com> wrote:
>>>> I'm concerned that the current ABC proposal will quickly evolve  
>>>> from optional
>>>> to required and create somewhat somewhat java-esque landscape where
>>>> inheritance and full-specification are the order of the day.
>>>
>>> +1 for preferring simple solutions to complex ones
>>
>> Me, too.  But which is the simple solution?  I tend to think ABCs  
>> are.
>
> Neither or. They are both an order of a magnitude more complex than
> the problem they are designed to solve. Raymond Hettingers small list
> of three example problems earlier in the thread, is the most concrete
> description of what the problem really is all about. And I would
> honestly rather sort them under "minor annoyances" than "really
> critical stuff, needs to be fixed asap."

Interfaces and ABCs are really all about Programming in the Really  
Large.  Most Python programs don't need this stuff, and in fact,  
having to deal with them in any way would IMO reduce the elegance of  
Python for small to medium (and even most large) applications.  I  
think the experience of Zope, Twisted, and PEAK have shown though  
that /something/ is necessary to manage the complexity when  
applications become frameworks.

To me, interfaces and/or generic functions strike the right balance.
Such tools are completely invisible to Python programmers who don't
care about them (the vast majority).  They're also essential for a
very small subset of very important Python applications.

If ABCs can walk that same tightrope of utility and invisibility,  
then maybe they'll successfully fill that niche.

-Barry


From barry at python.org  Tue May  1 01:17:14 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 30 Apr 2007 19:17:14 -0400
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <bbaeab100704301531k4d194790y849864906eed180b@mail.gmail.com>
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
	<1d36917a0704300816ma3bf9c2o4dd674cfcefa9172@mail.gmail.com>
	<-3456230403858254882@unknownmsgid>
	<740c3aec0704301501u7df7b5a6uaea854d4716eb87e@mail.gmail.com>
	<bbaeab100704301531k4d194790y849864906eed180b@mail.gmail.com>
Message-ID: <E1873150-8484-4C24-8A06-911FC429468D@python.org>

On Apr 30, 2007, at 6:31 PM, Brett Cannon wrote:

> I think it would be a little difficult in this situation as since a
> similar mechanism does not currently exist in the stdlib and so most
> code is not written so that ABCs or roles are needed.

This is for a reason... they're not! :)

-Barry


From brett at python.org  Tue May  1 01:19:06 2007
From: brett at python.org (Brett Cannon)
Date: Mon, 30 Apr 2007 16:19:06 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
Message-ID: <bbaeab100704301619h47ee67f6wca599136a0845c97@mail.gmail.com>

On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>
> This is just the first draft (also checked into SVN), and doesn't include
> the details of how the extension API works (so that third-party interfaces
> and generic functions can interoperate using the same decorators,
> annotations, etc.).
>
> Comments and questions appreciated, as it'll help drive better
> explanations
> of both the design and rationales.  I'm usually not that good at guessing
> what other people will want to know (or are likely to misunderstand) until
> I get actual questions.
>
>
> PEP: 3124
> Title: Overloading, Generic Functions, Interfaces, and Adaptation
> Version: $Revision: 55029 $
> Last-Modified: $Date: 2007-04-30 18:48:06 -0400 (Mon, 30 Apr 2007) $
> Author: Phillip J. Eby <pje at telecommunity.com>
> Discussions-To: Python 3000 List <python-3000 at python.org>
> Status: Draft
> Type: Standards Track
> Requires: 3107, 3115, 3119
> Replaces: 245, 246
> Content-Type: text/x-rst
> Created: 28-Apr-2007
> Post-History: 30-Apr-2007



[SNIP]



> The ``@overload`` decorator allows you to define alternate
> implementations of a function, specialized by argument type(s).  A
> function with the same name must already exist in the local namespace.
> The existing function is modified in-place by the decorator to add
> the new implementation, and the modified function is returned by the
> decorator.  Thus, the following code::
>
>      from overloading import overload
>      from collections import Iterable
>
>      def flatten(ob):
>          """Flatten an object to its component iterables"""
>          yield ob
>
>      @overload
>      def flatten(ob: Iterable):
>          for o in ob:
>              for ob in flatten(o):
>                  yield ob
>
>      @overload
>      def flatten(ob: basestring):
>          yield ob



Doubt there is a ton of use for it, but any way to use this for pattern
matching ala Standard ML or Haskell?  Would be kind of neat to be able to do
recursive function definitions and choose which specific function
implementation based on the length of an argument.  But I don't see how that
would be possible with this directly.  I guess if a SingularSequence type
was defined that overloaded __isinstance__ properly maybe?  I have not
followed the __isinstance__ discussion closely so I am not sure.

[SNIP]


> Proceeding to the "Next" Method
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> If the first parameter of an overloaded function is named
> ``__proceed__``, it will be passed a callable representing the next
> most-specific method.  For example, this code::
>
>      def foo(bar:object, baz:object):
>          print "got objects!"
>
>      @overload
>      def foo(__proceed__, bar:int, baz:int):
>          print "got integers!"
>          return __proceed__(bar, baz)
>
> Will print "got integers!" followed by "got objects!".
>
> If there is no next most-specific method, ``__proceed__`` will be
> bound to a ``NoApplicableMethods`` instance.  When called, a new
> ``NoApplicableMethods`` instance will be raised, with the arguments
> passed to the first instance.
>
> Similarly, if the next most-specific methods have ambiguous precedence
> with respect to each other, ``__proceed__`` will be bound to an
> ``AmbiguousMethods`` instance, and if called, it will raise a new
> instance.
>
> Thus, a method can either check if ``__proceed__`` is an error
> instance, or simply invoke it.  The ``NoApplicableMethods`` and
> ``AmbiguousMethods`` error classes have a common ``DispatchError``
> base class, so ``isinstance(__proceed__, overloading.DispatchError)``
> is sufficient to identify whether ``__proceed__`` can be safely
> called.
>
> (Implementation note: using a magic argument name like ``__proceed__``
> could potentially be replaced by a magic function that would be called
> to obtain the next method.  A magic function, however, would degrade
> performance and might be more difficult to implement on non-CPython
> platforms.  Method chaining via magic argument names, however, can be
> efficiently implemented on any Python platform that supports creating
> bound methods from functions -- one simply recursively binds each
> function to be chained, using the following function or error as the
> ``im_self`` of the bound method.)



Could you change __proceed__ to be a keyword-only argument?  That way it
would match the precedent of class definitions and the 'metaclass' keyword
introduced by PEP 3115.  I personally would prefer to control what the
default is if __proceed__ is not passed in at the parameter level than have
to do a check if it's NoApplicableMethods.
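
Something like this is what I'm picturing (just a sketch of the calling
convention, not PEP 3124's actual spelling):

    def _no_next(*args, **kwargs):
        return None                      # a caller-chosen default

    def foo(bar: int, baz: int, *, __proceed__=_no_next):
        print("got integers!")
        return __proceed__(bar, baz)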

-Brett

From jimjjewett at gmail.com  Tue May  1 01:29:30 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 30 Apr 2007 19:29:30 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
Message-ID: <fb6fbf560704301629w44d6891aue9743f8a72a0873e@mail.gmail.com>

On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:

> It is currently an open issue to determine the best way to implement
> this rule in Python 3.0.  Under Python 2.x, a class' metaclass was
> not chosen until the end of the class body, which means that
> decorators could insert a custom metaclass to do processing of this
> sort.  (This is how RuleDispatch, for example, implements the implicit
> class rule.)

> PEP 3115, however, requires that a class' metaclass be determined
> *before* the class body has executed, making it impossible to use this
> technique for class decoration any more.

It doesn't say what that metaclass has to do, though.

Is there any reason the metaclass couldn't delegate differently
depending on the value of __my_magic_attribute__ ?

-jJ

From jimjjewett at gmail.com  Tue May  1 01:37:25 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 30 Apr 2007 19:37:25 -0400
Subject: [Python-3000] PEP 3119 - Introducing Abstract Base Classes
In-Reply-To: <FE2C6C62-4DDD-4CB1-BADB-95B6ECA9EF95@python.org>
References: <ca471dc20704261150o45a3fa22u16a515b3d4aa14ba@mail.gmail.com>
	<04A4F15C-38C3-4727-875D-82803F4FB974@python.org>
	<fb6fbf560704271010j50a6319et27d519fc468afeb8@mail.gmail.com>
	<FE2C6C62-4DDD-4CB1-BADB-95B6ECA9EF95@python.org>
Message-ID: <fb6fbf560704301637r6feec868ye6dc3ed2d661142d@mail.gmail.com>

On 4/30/07, Barry Warsaw <barry at python.org> wrote:
> On Apr 27, 2007, at 1:10 PM, Jim Jewett wrote:
> > On 4/27/07, Barry Warsaw <barry at python.org> wrote:

> >> Finally, I'm concerned with the "weight" of adding ABCs to all the
> >> built-in types.

> > What if the builtin types did not initially derive from any ABC, but
> > were added (through an issubclass override) when the abc module was
> > imported?

> That would allow for some unfortunately global side-effects.  Say I
> happen to import your library that imports abc.  Now all the built-in
> types in my entire application get globally changed.  I'm also not
> sure how you'd implement that.

I don't see how these side-effects could ever be detected, except to
the extent that issubclass overrides are inherently dangerous.

I see it something like

# module abc.py

class Integer()...
    ...

Integer.register(int)
Integer.register(long)

After that, int (and long) are changed only by the addition of an
extra reference count; their __bases__ and __mro__ are utterly
unchanged.   But

    isinstance(int, Integer)

is now True.  Yes, this is global -- but the only way to detect it is
to have a reference to Integer, which implies having already relied on
the ABC framework.
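
Fleshed out just a bit, using an ABCMeta-style metaclass with a register()
method, the whole thing fits in a few lines:

    from abc import ABCMeta

    class Integer(metaclass=ABCMeta):
        pass

    Integer.register(int)

    print(issubclass(int, Integer))    # True
    print(isinstance(42, Integer))     # True
    print(Integer in int.__mro__)      # False -- int itself is untouched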

-jJ

From guido at python.org  Tue May  1 01:43:16 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Apr 2007 16:43:16 -0700
Subject: [Python-3000] PEP 3119 - Introducing Abstract Base Classes
In-Reply-To: <fb6fbf560704301637r6feec868ye6dc3ed2d661142d@mail.gmail.com>
References: <ca471dc20704261150o45a3fa22u16a515b3d4aa14ba@mail.gmail.com>
	<04A4F15C-38C3-4727-875D-82803F4FB974@python.org>
	<fb6fbf560704271010j50a6319et27d519fc468afeb8@mail.gmail.com>
	<FE2C6C62-4DDD-4CB1-BADB-95B6ECA9EF95@python.org>
	<fb6fbf560704301637r6feec868ye6dc3ed2d661142d@mail.gmail.com>
Message-ID: <ca471dc20704301643q4c0f0865neefbc07a5ea78410@mail.gmail.com>

> > > On 4/27/07, Barry Warsaw <barry at python.org> wrote:
> > >> Finally, I'm concerned with the "weight" of adding ABCs to all the
> > >> built-in types.

> > On Apr 27, 2007, at 1:10 PM, Jim Jewett wrote:
> > > What if the builtin types did not initially derive from any ABC, but
> > > were added (through an issubclass override) when the abc module was
> > > imported?

> On 4/30/07, Barry Warsaw <barry at python.org> wrote:
> > That would allow for some unfortunately global side-effects.  Say I
> > happen to import your library that imports abc.  Now all the built-in
> > types in my entire application get globally changed.  I'm also not
> > sure how you'd implement that.

On 4/30/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> I don't see how these side-effects could ever be detected, except to
> the extent that issubclass overrides are inherently dangerous.
>
> I see it something like
>
> # module abc.py
>
> class Integer()...
>     ...
>
> Integer.register(int)
> Integer.register(long)
>
> After that, int (and long) are changed only by the addition of an
> extra reference count; their __bases__ and __mro__ are utterly
> unchanged.   But
>
>     isinstance(int, Integer)
>
> is now True.  Yes, this is global -- but the only way to detect it is
> to have a reference to Integer, which implies having already relied on
> the ABC framework.

Right. int (long doesn't exist in py3k!) doesn't change -- the only
thing that "changes" is that the question issubclass(int, Integer) is
answered positively, but since you can't ask that question without
first importing Integer (from abc), there is no way that you can
detect this as a change. Note that you won't find Integer if you
traverse int.__mro__ or int.__bases__.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Tue May  1 01:42:03 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 01 May 2007 11:42:03 +1200
Subject: [Python-3000] [Python-Dev]  Pre-pre PEP for 'super' keyword
In-Reply-To: <76fd5acf0704291801o47733e29u634ffa317d32a0a7@mail.gmail.com>
References: <ca471dc20704181457h9ba0fe1wefcebd52da74e391@mail.gmail.com>
	<d11dcfba0704232122w2685c513g3aa23b8b94d1b5e7@mail.gmail.com>
	<76fd5acf0704240711p22f8060k25d787c0e85b6fb8@mail.gmail.com>
	<d11dcfba0704241035y55f0d111x898ad67898b7a4ae@mail.gmail.com>
	<d11dcfba0704242226s7983b664j5f5a5b78b185a3df@mail.gmail.com>
	<ca471dc20704251133o2baa5053xc8dc24e65ea4155a@mail.gmail.com>
	<002401c78778$75fb7eb0$0201a8c0@ryoko>
	<00b601c78a9f$38ec9390$0201a8c0@ryoko>
	<fb6fbf560704291425o7c84d72cqe2eafb041c43d088@mail.gmail.com>
	<ca471dc20704291630q7108d0ccl80c09baa57f98d6e@mail.gmail.com>
	<76fd5acf0704291801o47733e29u634ffa317d32a0a7@mail.gmail.com>
Message-ID: <46367ECB.3060504@canterbury.ac.nz>

Calvin Spealman wrote:

> I also checked and PyPy does implement a sys._getframe() and
> IronPython currently doesn't, but seems to plan on it (there is a
> placeholder, at present). I am not sure if notes on this belongs in
> the PEP or not.

If this is to have a chance, you really need to
come up with an implementation that doesn't rely on
sys._getframe, even in CPython. It's a hack that
has no place in something intended for routine
use.

--
Greg

From pje at telecommunity.com  Tue May  1 01:50:09 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 30 Apr 2007 19:50:09 -0400
Subject: [Python-3000] super(), class decorators, and PEP 3115
In-Reply-To: <ca471dc20704301554p590ccbedvc5727e346300e6ce@mail.gmail.com>
References: <5.1.1.6.0.20070430152320.02d31868@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430152320.02d31868@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070430192208.02ddaee8@sparrow.telecommunity.com>

At 03:54 PM 4/30/2007 -0700, Guido van Rossum wrote:
>On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>At 12:17 PM 4/30/2007 -0700, Guido van Rossum wrote:
>> >Assuming class decorators are added, can't you do all of this using a
>> >custom metaclass?
>>
>>The only thing I need for the GF PEP is a way for a method decorator to get
>>a callback after the class is created, so that overloading will work
>>correctly in cases where overloaded methods are defined in a subclass.
>
>I still don't understand why you can't tell the users "for this to
>work, you must use my special magic super-duper metaclass defined
>*here*". Surely a sufficiently advanced metaclass can pull of this
>kind of magic in its __init__ method? If not a metaclass, then a
>super-duper decorator. Or what am I missing?

Metaclasses don't mix well.  If the user already has a metaclass, they'll 
have to create a custom subclass, since Python doesn't do auto-combination 
of metaclasses (per the "Putting Metaclasses to Work" book).  This makes 
things messy, especially if the user doesn't *know* they're using a 
metaclass already (e.g., they got one by inheritance).

For the specific use case I'm concerned about, it's like "super()" in that 
a function defined inside a class body needs to know what class it's 
in.  (Actually, it's the decorator that needs to know, and it ideally needs 
to know as soon as the class is defined, rather than waiting until a call 
occurs later.)

As with "super()", this really has nothing to do with the class.  It would 
make about as much sense as having a metaclass or class decorator called 
``SuperUser``; i.e., it would work, but it's just overhead for the user.

So, if there ends up being a general way to access that "containing class" 
from a function decorator, or at least to get a callback once the class is 
defined, that's all I need for this use case that can't reasonably be 
handled by a normal metaclass.

Note, too, that such a hook would also allow you to make classes into 
ABCs through the presence of an @abstractmethod, without also having to 
inherit from Abstract or set an explicit metaclass.  (Unless of course you 
prefer to have the abstractness called out up-front...  but then that 
explicitness goes out the window as soon as you e.g. subclass Sequence from 
Iterable.)


>But I don't understand how a __metaclass__ hack can use a class decorator.

The __metaclass__ hack is used in Python 2.x to dynamically *add* class 
decorators while the class suite is being executed, that will be called 
*after* the class is created.  A function decorator (think of your 
@abstractmethod, for example) would monkeypatch the metaclass so it gets a 
crack at class after it's created, without the user having to explicitly 
set up the metaclass (or merge any inherited metaclasses).


From pje at telecommunity.com  Tue May  1 01:52:48 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 30 Apr 2007 19:52:48 -0400
Subject: [Python-3000] super() PEP
In-Reply-To: <ca471dc20704301610w58086ff2p5a69db953f6bed42@mail.gmail.com>
References: <cc93256f0704301536r2ab74894xfe15862a573263b4@mail.gmail.com>
	<5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<ca471dc20704301217h1c117375r635bcae0034d593f@mail.gmail.com>
	<011b01c78b6a$72098810$0201a8c0@ryoko>
	<43aa6ff70704301403q5bf557c2wf43148f7a339353d@mail.gmail.com>
	<014901c78b6e$d2d66d80$0201a8c0@ryoko>
	<cc93256f0704301536r2ab74894xfe15862a573263b4@mail.gmail.com>
Message-ID: <5.1.1.6.0.20070430195130.04b52328@sparrow.telecommunity.com>

At 04:10 PM 4/30/2007 -0700, Guido van Rossum wrote:
>And don't get me started about __init__. Constructors don't do
>cooperative MI, period.

Actually, metaclass __init__'s do.  In fact, they *have to*.

Right now, we get away with it because the type(name, bases, dict) 
signature is fixed.  Once we add keyword args, though, things will get hairier.


From barry at python.org  Tue May  1 01:53:58 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 30 Apr 2007 19:53:58 -0400
Subject: [Python-3000] PEP 3119 - Introducing Abstract Base Classes
In-Reply-To: <ca471dc20704301643q4c0f0865neefbc07a5ea78410@mail.gmail.com>
References: <ca471dc20704261150o45a3fa22u16a515b3d4aa14ba@mail.gmail.com>
	<04A4F15C-38C3-4727-875D-82803F4FB974@python.org>
	<fb6fbf560704271010j50a6319et27d519fc468afeb8@mail.gmail.com>
	<FE2C6C62-4DDD-4CB1-BADB-95B6ECA9EF95@python.org>
	<fb6fbf560704301637r6feec868ye6dc3ed2d661142d@mail.gmail.com>
	<ca471dc20704301643q4c0f0865neefbc07a5ea78410@mail.gmail.com>
Message-ID: <0840AFD6-909C-43F9-819A-ACF747B313EE@python.org>

On Apr 30, 2007, at 7:43 PM, Guido van Rossum wrote:

> Right. int (long doesn't exist in py3k!) doesn't change -- the only
> thing that "changes" is that the question issubclass(int, Integer) is
> answered positively, but since you can't ask that question without
> first importing Integer (from abc), there is no way that you can
> detect this as a change. Note that you won't find Integer if you
> traverse int.__mro__ or int.__bases__.

Cool.
-Barry


From pje at telecommunity.com  Tue May  1 02:02:47 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 30 Apr 2007 20:02:47 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <bbaeab100704301619h47ee67f6wca599136a0845c97@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070430195413.047ad8d8@sparrow.telecommunity.com>

At 04:19 PM 4/30/2007 -0700, Brett Cannon wrote:
>Doubt there is a ton of use for it, but any way to use this for pattern 
>matching ala Standard ML or Haskell?

Yes.  You have to provide a different dispatching engine though, as will be 
described in the currently non-existent "extension API" section.  :)

Perhaps you saw the part of the PEP with the "Pred('python expression 
here')" example?


>   Would be kind of neat to be able to do recursive function definitions 
> and choose which specific function implementation based on the length of 
> an argument.  But I don't see how that would be possible with this 
> directly.  I guess if a SingularSequence type was defined that overloaded 
> __isinstance__ properly maybe?  I have not followed the __isinstance__ 
> discussion closely so I am not sure.

No, the base engine will only support __issubclass__ overrides and other 
class-based criteria, as it's strictly a type-tuple cache system (ala 
Guido's generic function prototype, previously discussed here and on his 
blog).  However, engines will be pluggable based on predicate type(s).  If 
you use a predicate that's not supported by the engine currently attached 
to a function, it will attempt to "upgrade" to a better engine.

So, PEAK-Rules for example will register a predicate type for arbitrary 
Python expressions, and an engine factory for dispatching on them.


>Could you change __proceed__ to be a keyword-only argument?  That way it 
>would match the precedent of class definitions and the 'metaclass' 
>keyword introduced by PEP 3115.  I personally would prefer to control what 
>the default is if __proceed__ is not passed in at the parameter level than 
>have to do a check if it's NoApplicableMethods.

You would still have to check if it's ``AmbiguousMethods``, though.

My current GF libraries use bound methods for speed, which means that the 
special parameter has to be in the first position.  ``partial`` and other 
ways of constructing a method chain for keyword arguments would be a lot 
slower, just due to the use of keyword arguments.  But it's certainly an 
option to use ``partial`` instead of bound methods, just a slower one.

(My existing GF libraries all target Python 2.3, where ``partial`` didn't 
exist yet, so it wasn't an option I considered.)


From pje at telecommunity.com  Tue May  1 02:03:58 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 30 Apr 2007 20:03:58 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560704301629w44d6891aue9743f8a72a0873e@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070430200255.04b88e10@sparrow.telecommunity.com>

At 07:29 PM 4/30/2007 -0400, Jim Jewett wrote:
>On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>
>>It is currently an open issue to determine the best way to implement
>>this rule in Python 3.0.  Under Python 2.x, a class' metaclass was
>>not chosen until the end of the class body, which means that
>>decorators could insert a custom metaclass to do processing of this
>>sort.  (This is how RuleDispatch, for example, implements the implicit
>>class rule.)
>
>>PEP 3115, however, requires that a class' metaclass be determined
>>*before* the class body has executed, making it impossible to use this
>>technique for class decoration any more.
>
>It doesn't say what that metaclass has to do, though.
>
>Is there any reason the metaclass couldn't delegate differently
>depending on the value of __my_magic_attribute__ ?

Sure -- that's what I suggested in the "super(), class decorators, and PEP 
3115" thread, but Guido voted -1 on adding such a magic attribute to PEP 
3115.  (Actually, I think he -1'd *any* change to 3115 to support this 
feature.)


From guido at python.org  Tue May  1 02:19:27 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Apr 2007 17:19:27 -0700
Subject: [Python-3000] Breakthrough in thinking about ABCs (PEPs 3119 and
	3141)
Message-ID: <ca471dc20704301719y5bc7e83fi1a6eb2a68d9f4e42@mail.gmail.com>

After a couple of whiteboard discussions with Collin Winter and
Jeffrey Yasskin I have a much better grip on where to go next with the
ABC PEPs.

(a) Roles

Collin will continue to develop his Roles PEP. This may or may not end
up providing a viable alternative to ABCs; in either case it will be
refreshing to compare and contrast the two proposals.

(b) Overloading isinstance and issubclass

The idea of overloading isinstance and issubclass is running into some
resistance. I still like it, but if there is overwhelming discomfort,
we can change it so that instead of writing isinstance(x, C) or
issubclass(D, C) (where C overloads these operations), you'd have to
write something like C.hasinstance(x) or C.hassubclass(D), where
hasinstance and hassubclass are defined by some ABC metaclass. I'd
still like to have the spec for hasinstance and hassubclass in the
core language, so that different 3rd party frameworks don't need to
invent different ways of spelling this inquiry.
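
Concretely, the alternative spelling could look something like this sketch
(names not final):

    class ABCCheckMeta(type):
        # A metaclass supplying the two inquiry methods; frameworks would
        # override these, e.g. to consult a registry of register()ed classes.
        def hassubclass(cls, subclass):
            return issubclass(subclass, cls)
        def hasinstance(cls, obj):
            return cls.hassubclass(type(obj))

    class Sizable(metaclass=ABCCheckMeta):
        pass

    class Box(Sizable):
        pass

    print(Sizable.hassubclass(Box))      # True
    print(Sizable.hasinstance(Box()))    # True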

Personally, I still think that the most uniform way of spelling this
is overloading isinstance and issubclass; that has the highest
likelihood of standardizing the spelling for such inquiries. I'd like
to avoid disasters such as Java's String.length vs. Vector.length()
vs. Collection.size(). One observation is that in most cases
isinstance and issubclass are used with a specific, known class as
their second argument, so that the likelihood of breaking code by this
overloading is minimal: the calls can be trusted as much as you trust
the second argument. (I found only 4 uses of isinstance(x, <variable>)
amongst the first 50 hits in Google Code Search.)

However this turns out, it makes me want to reduce the number of ABCs
defined initially in the PEPs, as it will now be easy to define ABCs
representing "less-powerful abstractions" and insert them into the
right place in the ABC hierarchy by overloading either issubclass or
the alternative hassubclass class method.

(c) ABCs as classes vs. ABCs as metaclasses

The original ABC PEPs naively use subclassing from ABCs only. This ran
into trouble when someone observed that if classes C and D both
inherit from TotallyOrdered, that doesn't mean that C() < D() is
defined. (For a quick counterexample, consider that int and str are
both total orders, but 42 < "a" raises TypeError.) Similar for Ring
and other algebraic notions introduced in PEP 3141. The correct
approach is for TotallyOrdered to be a metaclass (is this the
typeclass thing in Haskell?). I expect that we'll leave them out of
the ABC namespace for now and instead just spell out __lt__ and __le__
as operators defined by various classes. If you want TotallyOrdered,
you can easily define it yourself, call TotallyOrdered.register(int)
etc., and then isinstance(int, TotallyOrdered) (or
TotallyOrdered.hasinstance(int)) will return True. OTOH, many of the
classes proposed in PEP 3119 (e.g. Set, Sequence, Mapping) do make
sense as base classes, and I don't expect to turn these into
metaclasses.

(d) Comparing containers

I am retracting the idea of making all sequences comparable; instead,
you can compare only list to list, tuple to tuple, str to str, etc.
Ditto for concatenation. This means that __eq__ and __add__ are not
part of the Sequence spec.

OTOH for sets, I think it makes sense to require all set
implementations to be inter-comparable: an efficient default
implementation can easily be provided, and since sets are a relatively
new addition, there is no prior art of multiple incompatible set
implementations in the core; to the contrary, the two built-in set
types (set and frozenset) are fully interoperable (unlike sequences,
of which there are many, and none of these are interoperable).

For mappings I'm on the fence; while it would be easy to formally
define m1 == m2 <==> set(m1.items()) == set(m2.items()), and that
would be relatively easy to compute using only traversal and
__getitem__, I'm not so sure there is any use for this, and it does
break with tradition (dict can't currently be compared to the various
dbm-style classes).
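
For reference, the candidate default would be something like the sketch
below -- with the catch that set() of the items only works when all the
values are hashable, which mappings don't require:

    class MappingCompareMixin:
        def __eq__(self, other):
            try:
                return set(self.items()) == set(other.items())
            except (AttributeError, TypeError):
                return NotImplemented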

(e) Numeric tower

Jeffrey will write up the detailed specs for the numeric tower.
MonoidUnderPlus, Ring and other algebraic notions are gone. We will
have abstract classes Integer <: Rational <: Real <: Complex <: Number
(*); and concrete classes complex <: Complex, float <: Real,
decimal.Decimal <: Real, int <: Integer. (No concrete implementations
of Rational, but there are many 3rd party ones to choose from.) We came
up with a really clever way whereby the implementations of binary
operations like __add__ and __radd__ in concrete classes should defer
to their counterpart in the abstract base class instead of returning
NotImplemented; the abstract base class can then (at least in most
cases) do the right thing when mixed operations are attempted on two
different concrete subclasses that don't know about each other
(solving a dilemma about which I blabbered yesterday). This does mean
that Integer...Number will be built-in.

(*) D <: C means that D is a subclass of C.
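
A toy version of the __add__/__radd__ trick (grossly simplified; Jeffrey's
write-up will have the real rules):

    class Real:
        # The abstract base class provides a shared fallback that coerces
        # both operands to a common concrete representation.
        @staticmethod
        def _add_fallback(a, b):
            return float(a) + float(b)

    class Fixed(Real):                  # stand-in concrete subclass
        def __init__(self, value):
            self.value = float(value)
        def __float__(self):
            return self.value
        def __add__(self, other):
            if isinstance(other, Fixed):
                return Fixed(self.value + other.value)
            # Defer to the ABC instead of returning NotImplemented:
            return Real._add_fallback(self, other)
        __radd__ = __add__

    print(Fixed(1) + 2.5)               # 3.5, with no special-casing of float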

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From daniel at stutzbachenterprises.com  Tue May  1 02:21:49 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Mon, 30 Apr 2007 19:21:49 -0500
Subject: [Python-3000] Two proposals for a new list-like type: one
	modest, one radical
In-Reply-To: <00c601c786ff$95839700$f101a8c0@RaymondLaptop1>
References: <eae285400704230856n633aff15j9a16ed0811ea4adc@mail.gmail.com>
	<00c601c786ff$95839700$f101a8c0@RaymondLaptop1>
Message-ID: <eae285400704301721l4156fa0dt35007f4a013cc6e3@mail.gmail.com>

On 4/25/07, Raymond Hettinger <python at rcn.com> wrote:
> > There are only a few use-cases (that I can think of) where Python's
> > list() regularly outperforms the BList.  These are:
> >
> > 1. A large LIFO stack, where there are many .append() and .pop(-1)
> > operations.  These are O(1) for a Python list, but O(log n) for the
> > BList().
>
> This is a somewhat important use-case (we devote two methods to it).

I've been thinking about this a bit more.  For the LIFO use case, I
can cache a pointer within the root node to the right-most leaf node,
which will make a sequence of n append and pop operations take O(n)
amortized time (same as a regular list).

The latest version, 0.9.4, on PyPi fixes most of the issues raised by others:

- C++ style comments have been converted to C comments.
- Variable declarations are now always at the beginning of a block.
- Use Py_ssize_t instead of int in all (I think) the appropriate places.
- Cleaned up the debugging code to rely on fewer macros
- Removed all (I think) gcc-isms

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

From guido at python.org  Tue May  1 02:38:25 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Apr 2007 17:38:25 -0700
Subject: [Python-3000] super(), class decorators, and PEP 3115
In-Reply-To: <5.1.1.6.0.20070430192208.02ddaee8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430152320.02d31868@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430192208.02ddaee8@sparrow.telecommunity.com>
Message-ID: <ca471dc20704301738kc01bd30p16decfaff399aca3@mail.gmail.com>

On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 03:54 PM 4/30/2007 -0700, Guido van Rossum wrote:
> >On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> >>At 12:17 PM 4/30/2007 -0700, Guido van Rossum wrote:
> >> >Assuming class decorators are added, can't you do all of this using a
> >> >custom metaclass?
> >>
> >>The only thing I need for the GF PEP is a way for a method decorator to get
> >>a callback after the class is created, so that overloading will work
> >>correctly in cases where overloaded methods are defined in a subclass.
> >
> >I still don't understand why you can't tell the users "for this to
> >work, you must use my special magic super-duper metaclass defined
> >*here*". Surely a sufficiently advanced metaclass can pull of this
> >kind of magic in its __init__ method? If not a metaclass, then a
> >super-duper decorator. Or what am I missing?
>
> Metaclasses don't mix well.  If the user already has a metaclass, they'll
> have to create a custom subclass, since Python doesn't do auto-combination
> of metaclasses (per the "Putting Metaclasses to Work" book).  This makes
> things messy, especially if the user doesn't *know* they're using a
> metaclass already (e.g., they got one by inheritance).
>
> For the specific use case I'm concerned about, it's like "super()" in that
> a function defined inside a class body needs to know what class it's
> in.  (Actually, it's the decorator that needs to know, and it ideally needs
> to know as soon as the class is defined, rather than waiting until a call
> occurs later.)
>
> As with "super()", this really has nothing to do with the class.  It would
> make about as much sense as having a metaclass or class decorator called
> ``SuperUser``; i.e., it would work, but it's just overhead for the user.
>
> So, if there ends up being a general way to access that "containing class"
> from a function decorator, or at least to get a callback once the class is
> defined, that's all I need for this use case that can't reasonably be
> handled by a normal metaclass.
>
> Note, too, that such a hook would also allow you to make classes into
> ABCs through the presence of an @abstractmethod, without also having to
> inherit from Abstract or set an explicit metaclass.  (Unless of course you
> prefer to have the abstractness called out up-front...  but then that
> explicitness goes out the window as soon as you e.g. subclass Sequence from
> Iterable.)
>
>
> >But I don't understand how a __metaclass__ hack can use a class decorator.
>
> The __metaclass__ hack is used in Python 2.x to dynamically *add* class
> decorators while the class suite is being executed, that will be called
> *after* the class is created.  A function decorator (think of your
> @abstractmethod, for example) would monkeypatch the metaclass so it gets a
> crack at the class after it's created, without the user having to explicitly
> set up the metaclass (or merge any inherited metaclasses).

It sounds like you were accessing __metaclass__ via sys._getframe()
from within the decorator, right? That sounds fragile and should not
be the basis of anything proposed for inclusion into the standard
library in a PEP. Perhaps the GF PEP could propose a standard hook
that a class could define to be run after the class is constructed.
The hook could be acquired by regular inheritance.

I think it's entirely reasonable to require that, in order to use an
advanced feature *that is not yet supported by the core language*,
users need to enable the feature not just by importing a module and
using a decorator but also by something they need to do once per
class, like specifying a metaclass, a class decorator, or a magic base
class.

Of course, once the core language adds built-in support for such a
feature, it becomes slightly less advanced, and it is reasonable to
expect that the special functionality be provided by object or type or
some other aspect of the standard class definition machinery (maybe
even a default decorator that's always invoked).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Tue May  1 02:50:45 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 01 May 2007 12:50:45 +1200
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>
	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
Message-ID: <46368EE5.6050409@canterbury.ac.nz>

Patrick Maupin wrote:

> Method calls are deliberately disallowed by the PEP, so that the
> implementation has some hope of being securable.

If attribute access is allowed, arbitrary code can already
be triggered, so I don't see how this makes a difference
to security.

--
Greg

From greg.ewing at canterbury.ac.nz  Tue May  1 02:59:17 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 01 May 2007 12:59:17 +1200
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <07Apr30.141916pdt.57996@synergy1.parc.xerox.com>
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
	<438708814690534630@unknownmsgid>
	<79990c6b0704301001ga0d2429sdaded9ac75fa15c5@mail.gmail.com>
	<07Apr30.141916pdt.57996@synergy1.parc.xerox.com>
Message-ID: <463690E5.6060603@canterbury.ac.nz>

Bill Janssen wrote:
>>On 30/04/07, Bill Janssen <janssen at parc.com> wrote:

>>After 15 years not being able to clearly state what "file-like" or
>>"mapping-like" means to different people, perhaps we should accept
>>that there is no clear-cut answer...?
> 
> And that's a problem -- people are confused.  Instead of throwing up
> our hands, I think we should define what "file-like" means.

That assumes there exists a single definition of
file-like that suits most purposes, and the only
reason for confusion is just that this definition
hasn't thus far been elucidated.

But I don't think there is any such definition, and
the confusion arises because people lazily use the
vague term "file-like" instead of spelling out what
they really mean ("has a read() method", etc.)

Hopefully the new I/O system will help by breaking
the API into more digestible pieces.

--
Greg

From pje at telecommunity.com  Tue May  1 03:08:04 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 30 Apr 2007 21:08:04 -0400
Subject: [Python-3000] Breakthrough in thinking about ABCs (PEPs 3119
 and 3141)
In-Reply-To: <ca471dc20704301719y5bc7e83fi1a6eb2a68d9f4e42@mail.gmail.com>
Message-ID: <5.1.1.6.0.20070430205955.04953100@sparrow.telecommunity.com>

At 05:19 PM 4/30/2007 -0700, Guido van Rossum wrote:
>Collin will continue to develop his Roles PEP. This may or may not end
>up providing a viable alternative to ABCs; in either case it will be
>refreshing to compare and contrast the two proposals.

These should also be interesting to compare with the "interfaces" part of 
PEP 3124, although they need not compete.  (The module proposed in 3124 
should be able to use ABCs or Roles as easily as it does its own Interfaces.)


>Personally, I still think that the most uniform way of spelling this
>is overloading isinstance and issubclass; that has the highest
>likelihood of standardizing the spelling for such inquiries.

A big +1 here.  This is no different than e.g. operator.mul() being able to 
do different things depending on the second argument.


>(is this the typeclass thing in Haskell?).

Yeah; you'd say that in the typeclass "TotallyOrdered a", that "<" is a 
2-argument function taking two "a"'s and returning a boolean.  But that's 
way more parameterized than we can do in Python any time soon.  I don't 
even go that far for PEP 3124, although in principle you could use its 
extension API to do something like that.  Not something I want to even try 
thinking about in detail right now, though.


From pje at telecommunity.com  Tue May  1 03:13:56 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 30 Apr 2007 21:13:56 -0400
Subject: [Python-3000] super(), class decorators, and PEP 3115
In-Reply-To: <ca471dc20704301738kc01bd30p16decfaff399aca3@mail.gmail.com>
References: <5.1.1.6.0.20070430192208.02ddaee8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430152320.02d31868@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430192208.02ddaee8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070430205712.04d6f408@sparrow.telecommunity.com>

At 05:38 PM 4/30/2007 -0700, Guido van Rossum wrote:
>Of course, once the core language adds built-in support for such a
>feature, it becomes slightly less advanced, and it is reasonable to
>expect that the special functionality be provided by object or type or
>some other aspect of the standard class definition machinery (maybe
>even a default decorator that's always invoked).

Yep.  That's precisely it.  I'm suggesting that since GF's, enhanced 
super(), and even potentially @abstractmethod have a use for such a hook, 
that this would be an appropriate hook to provide in object or type or 
whatever.  Or, have the MAKE_CLASS opcode just do something like:

       ...
       cls = mcls(name, bases, prepared_dict)
       for decorator in cls.__decorators__:
           cls = decorator(cls)
       ...

Heck, I'd settle for:

       ...
       cls = mcls(name, bases, prepared_dict)
       for callback in cls.__decorators__:
           callback(cls)
       ...

As this version would still handle all of my use cases; it just wouldn't be 
as useful for things like @abstractmethod that really do want to change the 
metaclass or bases rather than simply be notified of what the class is.


From rrr at ronadam.com  Tue May  1 03:37:12 2007
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 30 Apr 2007 20:37:12 -0500
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
Message-ID: <463699C8.40105@ronadam.com>

Raymond Hettinger wrote:
> [Collin Winter]
>> Put another way, a role is an assertion about a set of capabilities.
>  . . .
>> If there's interest in this, I could probably whip up a PEP before the deadline.
> 
> +100 I'm very interested in seeing a lighter weight alternative to abc.py that:
> 
> 1) is dynamic
> 2) doesn't require inheritance to work
> 3) doesn't require mucking with isinstance or other existing mechanisms
> 4) makes a limited, useful set of assertions rather than broadly covering a whole API.
> 5) that leaves the notion of duck-typing as the rule rather than the exception
> 6) that doesn't freeze all of the key APIs in concrete
> 
> I'm concerned that the current ABC proposal will quickly evolve from optional
> to required and create somewhat somewhat java-esque landscape where
> inheritance and full-specification are the order of the day.

+100 on Raymonds list here.


I am concerned that the effect of most of the proposals will be to encode 
data as code to a greater degree.

I generally try to do the opposite.  That is, I try to make my data and 
code be independent of each other so my data is complete, and my code is 
usable for other things.

There are times I want to pipeline (or assembly line) data and mark it now 
for later dispatching at a point further downstream.  In that case being 
able to temporarily and transparently attach a bit of meta data to the 
object and have it ride along with the data until some later point would be 
useful.  Then to have some nice general purpose dispatcher to initiate the 
work at that point.

A particular use case that I'm finding occurs quite often is that of 
sorting.  Not the sorting of putting things in order, but the sorting as in 
mail sorters or dividing large groups into smaller sub groups.  And of 
course that is a form of dispatching.  So far I haven't seen anything that 
directly addresses these use cases.
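
Something like this little sketch is what I mean by marking and later
dispatching (names made up):

    _handlers = {}

    def handles(tag):
        # Register a handler for items carrying the given tag.
        def register(func):
            _handlers[tag] = func
            return func
        return register

    def mark(item, tag):
        # Attach ride-along metadata without touching the item itself.
        return {"tag": tag, "item": item}

    def dispatch(marked):
        return _handlers[marked["tag"]](marked["item"])

    @handles("invoice")
    def file_invoice(item):
        return "filed invoice for %s" % item["customer"]

    pipeline = [mark({"customer": "ACME"}, "invoice")]
    for entry in pipeline:
        print(dispatch(entry))           # filed invoice for ACME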



> IMHO, the ABC approach is using a cannon to shoot a mosquito.  My day-to-day
> problems are much smaller and could be solved by a metadata attribute or a
> role/trait solution:
> 
> * knowing whether a __getitem__ method implements a mapping or a sequence
> * knowing whether an object can have more than one iterator (i.e. a file has one
>    but a list can have many)
> * knowing whether a sequence, file, cursor, etc is writable or just readonly.
> 
> 
> Raymond
> 


From rrr at ronadam.com  Tue May  1 03:37:12 2007
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 30 Apr 2007 20:37:12 -0500
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
Message-ID: <463699C8.40105@ronadam.com>

Raymond Hettinger wrote:
> [Collin Winter]
>> Put another way, a role is an assertion about a set of capabilities.
>  . . .
>> If there's interest in this, I could probably whip up a PEP before the deadline.
> 
> +100 I'm very interested in seeing a lighter weight alternative to abc.py that:
> 
> 1) is dynamic
> 2) doesn't require inheritance to work
> 3) doesn't require mucking with isinstance or other existing mechansims
> 4) makes a limited, useful set of assertions rather than broadly covering a whole API.
> 5) that leaves the notion of duck-typing as the rule rather than the exception
> 6) that doesn't freeze all of the key APIs in concrete
> 
> I'm concerned that the current ABC proposal will quickly evolve from optional
> to required and create somewhat somewhat java-esque landscape where
> inheritance and full-specification are the order of the day.

+100 on Raymond's list here.


I am concerned that the effect of most of the proposals will be to encode 
data as code to a greater degree.

I generally try to do the opposite.  That is, I try to make my data and 
code be independent of each other so my data is complete, and my code is 
usable for other things.

There are times I want to pipeline (or assembly line) data and mark it now 
for later dispatching at a point further downstream.  In that case being 
able to temporarily and transparently attach a bit of metadata to the 
object and have it ride along with the data until some later point would be 
useful.  Then to have some nice general purpose dispatcher to initiate the 
work at that point.

A particular use case that I'm finding occurs quite often is that of 
sorting.  Not the sorting of putting things in order, but the sorting as in 
mail sorters or dividing large groups into smaller subgroups.  And of 
course that is a form of dispatching.  So far I haven't seen anything that 
directly addresses these use cases.
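
A toy sketch of the kind of binning/dispatch I mean (all names here are
invented purely for illustration):

    # Divide a large group into smaller subgroups, keyed by a piece of
    # metadata that rides along with each item as ordinary data.
    from collections import defaultdict

    def sort_into_bins(items, key):
        bins = defaultdict(list)
        for item in items:
            bins[key(item)].append(item)
        return bins

    mail = [{'to': 'us'}, {'to': 'eu'}, {'to': 'us'}]
    bins = sort_into_bins(mail, key=lambda m: m['to'])
    # bins['us'] -> [{'to': 'us'}, {'to': 'us'}]; bins['eu'] -> [{'to': 'eu'}]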



> IMHO, the ABC approach is using a cannon to shoot a mosquito.  My day-to-day
> problems are much smaller and could be solved by a metadata attribute or a
> role/trait solution:
> 
> * knowing whether a __getitem__ method implements a mapping or a sequence
> * knowing whether an object can have more than one iterator (i.e. a file has one
>    but a list can have many)
> * knowing whether a sequence, file, cursor, etc is writable or just readonly.
> 
> 
> Raymond
> 


From guido at python.org  Tue May  1 04:43:55 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Apr 2007 19:43:55 -0700
Subject: [Python-3000] super(), class decorators, and PEP 3115
In-Reply-To: <5.1.1.6.0.20070430205712.04d6f408@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430142844.03c96240@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430152320.02d31868@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430192208.02ddaee8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430205712.04d6f408@sparrow.telecommunity.com>
Message-ID: <ca471dc20704301943l30c02bf1n5fba117f8cad87c@mail.gmail.com>

On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 05:38 PM 4/30/2007 -0700, Guido van Rossum wrote:
> >Of course, once the core language adds built-in support for such a
> >feature, it becomes slightly less advanced, and it is reasonable to
> >expect that the special functionality be provided by object or type or
> >some other aspect of the standard class definition machinery (maybe
> >even a default decorator that's always invoked).
>
> Yep.  That's precisely it.  I'm suggesting that since GF's, enhanced
> super(), and even potentially @abstractmethod have a use for such a hook,
> that this would be an appropriate hook to provide in object or type or
> whatever.  Or, have the MAKE_CLASS opcode just do something like:
>
>        ...
>        cls = mcls(name, bases, prepared_dict)
>        for decorator in cls.__decorators__:
>            cls = decorator(cls)
>        ...
>
> Heck, I'd settle for:
>
>        ...
>        cls = mcls(name, bases, prepared_dict)
>        for callback in cls.__decorators__:
>            callback(cls)
>        ...
>
> As this version would still handle all of my use cases; it just wouldn't be
> as useful for things like @abstractmethod that really do want to change the
> metaclass or bases rather than simply be notified of what the class is.

OK, put one of those in the PEP (but I still think it's a waste of
time to mention super). Though I think you may have to investigate
exactly what MAKE_CLASS does.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From talin at acm.org  Tue May  1 05:06:06 2007
From: talin at acm.org (Talin)
Date: Mon, 30 Apr 2007 20:06:06 -0700
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <46368EE5.6050409@canterbury.ac.nz>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
	<46368EE5.6050409@canterbury.ac.nz>
Message-ID: <4636AE9E.2020905@acm.org>

Greg Ewing wrote:
> Patrick Maupin wrote:
> 
>> Method calls are deliberately disallowed by the PEP, so that the
>> implementation has some hope of being securable.
> 
> If attribute access is allowed, arbitrary code can already
> be triggered, so I don't see how this makes a difference
> to security.

Not quite. It depends on what you mean by 'arbitrary code'.

Let's take a hypothetical example: Suppose I have a format string which 
I downloaded from the nefarious "evil.org" web site which I suspect may 
contain "evil" formatting fields.

Now, I'd like to be able to use this format string, but I want to be 
able to contain the damage that it can do. For example, if I pass a list 
of integers as the format parameters, there is little harm that can be 
done. Even if my evil string contains things like 
"{0.__class__.__module__}" - in other words, even if it spiders through 
the base class list and the MRO list and everything else, there's little 
damage it can do, because it can't call any functions.
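
As a rough illustration (assuming a str.format along the lines of PEP 3101;
the exact string that comes back is beside the point):

    # Attribute access spiders through the object graph, but it never
    # calls a method explicitly; the worst it does here is leak a name.
    "{0.__class__.__module__}".format(42)    # e.g. 'builtins'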

Now, let's suppose that somewhere in the set of objects that are 
transitively reachable from those parameter values, there's an object 
which has an attribute such that accessing that attribute deletes my 
hard drive or has some other bad effect. Obviously this would be bad. 
Bad because my hard drive was deleted, sure, but even worse because I'm 
an idiot for writing such a stupid class in the first place.

I know that's a bit over the top, but what I mean to say is that in the 
normal course of events, one can assume that attribute accesses are 
either stateless, or should at least *seem* to be stateless from the 
outside. It's considered bad form to go around writing classes where the 
mere access of an attribute has some potentially deleterious effect. 
Anyone who writes a class like that deserves to have their hard drive 
deleted IMHO.

So the judgment was made that it's relatively safe to access attributes 
(even if they can be overloaded), whereas allowing method invocations is 
much less safe.

So yes, theoretically attribute access can indeed run arbitrary code. 
But not in a world with mostly sane people in it.

-- Talin

From alan.mcintyre at gmail.com  Tue May  1 05:31:37 2007
From: alan.mcintyre at gmail.com (Alan McIntyre)
Date: Mon, 30 Apr 2007 23:31:37 -0400
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <740c3aec0704301501u7df7b5a6uaea854d4716eb87e@mail.gmail.com>
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
	<1d36917a0704300816ma3bf9c2o4dd674cfcefa9172@mail.gmail.com>
	<-3456230403858254882@unknownmsgid>
	<740c3aec0704301501u7df7b5a6uaea854d4716eb87e@mail.gmail.com>
Message-ID: <1d36917a0704302031hd34ffcfu2eee879aef426931@mail.gmail.com>

On 4/30/07, BJ?rn Lindqvist <bjourne at gmail.com> wrote:
> On 4/30/07, Bill Janssen <janssen at parc.com> wrote:
> > > +1 for preferring simple solutions to complex ones
> >
> > Me, too.  But which is the simple solution?  I tend to think ABCs are.
>
> Neither or. They are both an order of a magnitude more complex than
> the problem they are designed to solve. Raymond Hettingers small list
> of three example problems earlier in the thread, is the most concrete
> description of what the problem really is all about. And I would
> honestly rather sort them under "minor annoyances" than "really
> critical stuff, needs to be fixed asap."

Disclaimer: I've only tangentially followed this entire ABC
discussion, and I'm commenting off the cuff without having read as
much as I probably should have.  I am not a "power user" of Python (by
which I mean, I've never been tasked to solve a problem using Python
that made me want to use abstract classes, or tinker with
how the isinstance or issubclass functions do their thing).

That said, the impression that I get from some of the discussions here
is that big helpings of complexity might get added to the core of the
language, and that's just a little unsettling to me.  Maybe it's just
that I don't have to solve the sorts of problems that advocates of these
ideas do, or I just don't have an adequate background to understand how
useful these additions would really be.

On 4/30/07, Barry Warsaw <barry at python.org> wrote:
> Interfaces and ABCs are really all about Programming in the Really
> Large.  Most Python programs don't need this stuff, and in fact,
> having to deal with them in any way would IMO reduce the elegance of
> Python for small to medium (and even most large) applications.

I think this is what's generally bugging me: the impression that
there's a push to add features to help out those that program in The
Really Large, or in The Really Mathematical (rings and semigroups and
monoids, oh my!).  I have a nagging concern that these additions will
clutter up the core, and--no matter how hard you try--adding them is
going to have an impact on "run-of-the-mill" users of the language.

My-2-cents'ly yours,
Alan

From jason.orendorff at gmail.com  Tue May  1 06:32:04 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 1 May 2007 00:32:04 -0400
Subject: [Python-3000] Breakthrough in thinking about ABCs (PEPs 3119
	and 3141)
In-Reply-To: <ca471dc20704301719y5bc7e83fi1a6eb2a68d9f4e42@mail.gmail.com>
References: <ca471dc20704301719y5bc7e83fi1a6eb2a68d9f4e42@mail.gmail.com>
Message-ID: <bb8868b90704302132r4094377dm5a94c58652737fdf@mail.gmail.com>

On 4/30/07, Guido van Rossum <guido at python.org> wrote:
> The correct
> approach is for TotallyOrdered to be a metaclass (is this the
> typeclass thing in Haskell?).

Mmmm.  Typeclasses don't *feel* like metaclasses.  Haskell types
aren't objects.

A typeclass is like an interface, but more expressive.  Only an
example has any hope of delivering the "aha!" here:

  -- This defines a typeclass called "Set e".
  -- "e" and "s" here are type variables.
  class Set e s where
      -- here are a few that behave like OO methods...
      size :: s -> Int    -- (size set) returns an Int
      contains :: s -> e -> Bool   -- (contains set element) returns Bool

      -- here are some where the two arguments have to be of the same type
      union :: s -> s -> s
      intersection :: s -> s -> s

      -- here's a constructor!
      fromList :: [e] -> s

      -- and here's a constant... with a default implementation!
      emptySet :: s
      emptySet = fromList []

Suppose someone has written a super-fast data structure for
collections of ints.  If I wanted to "register" that type as a Set, I
would write:

  instance Set Int FastIntSet where
      -- the implementation goes in here
      size self = ...implement this using FastIntSet magic...

More complex relationships among types are surprisingly easy to
express.  See if you can puzzle these out:

  instance Hashable e => Set e (HashSet e) where ...

  instance Cmp e => Set e (TreeSet e) where ...

  class PartialOrd t where
      (<=*) :: t -> t -> Bool

  instance (Set s, Eq s) => PartialOrd s where
      (a <=* b) = (intersection a b == a)

See?  It's nice.  But, eh, this is what typeful languages do
all day; they'd better be good at it.  :)
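
A very loose Python analogue of the "instance Set Int FastIntSet"
registration above, just to show the shape of the idea (every name here is
made up for illustration; this is not a proposal):

    # Implementations live outside the class, keyed by (typeclass, type).
    IMPLS = {}

    def instance_(typeclass, typ, **methods):
        IMPLS[(typeclass, typ)] = methods

    def call(typeclass, name, obj, *args):
        return IMPLS[(typeclass, type(obj))][name](obj, *args)

    class FastIntSet:
        def __init__(self, items):
            self.items = frozenset(items)

    instance_('Set', FastIntSet,
              size=lambda s: len(s.items),
              contains=lambda s, e: e in s.items)

    call('Set', 'size', FastIntSet([1, 2, 3]))    # -> 3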

-j

(Right now on a Haskell mailing list somewhere, a mirror image of me
is trying to explain what's so cool about zipfile.  Python wins.  ;)

From python at rcn.com  Tue May  1 08:25:42 2007
From: python at rcn.com (Raymond Hettinger)
Date: Mon, 30 Apr 2007 23:25:42 -0700
Subject: [Python-3000] PEP:  Information Attributes
Message-ID: <008a01c78bb9$e4283780$f001a8c0@RaymondLaptop1>

Proto-PEP:  Information Attributes (First Draft)

Proposal:

Testing hasattr() is a broadly applicable and flexible technique that works well
whenever the presence of a method has an unambiguous interpretation
 (e.g. __hash__ for hashability, __iter__ for iterability, __len__ for sized
containers); however, there are other methods with ambiguous interpretations
that could be resolved by adding an information attribute.


Motivation:

Information attributes are proposed as a lightweight alternative to abstract base
classes. The essential argument is that duck-typing works fairly well and
needs only minimal augmentation to address a small set of recurring
challenges.  In contrast, the ABC approach imposes an extensive javaesque
framework that cements APIs and limits flexibility.  Real world Python
programming experience has shown little day-to-day need for broad-sweeping API
definitions; instead, there seem to be only a handful of recurring issues that
can easily be addressed by a lightweight list of information attributes.


Use Cases with Ambiguous Interpretations

* The presence of a __getitem__ method is ambiguous in that it can be
  interpreted as either having sequence or mapping behavior.  The ambiguity is
  easily resolved with an attribute claiming either mapping behavior or
  sequence behavior.

* The presence of a rich comparison operator such as __lt__ is ambiguous in that
  it can return a vector or a scalar, the scalar may or may not be boolean,
  and it may be a NotImplemented instance.  Even the boolean case is ambiguous
  because __lt__ may imply a total ordering (as it does for numbers) or it may
  be a partial ordering (as it is for sets where __lt__ means a strict
  subset). That latter ambiguity (sortability) is easily resolved by an
  attribute indicating a total ordering.

* Some methods such as set.__add__ are too restrictive in that they preclude
  interaction with non-sets.  This makes it impossible to achieve set
  interoperability without subclassing from set (a choice which introduces
  other complications such as the inability to override set-to-set
  interactions).  This situation is easily resolved by an attribute like
  obj.__fake__=set which indicates that the object intends to be a set proxy.

* The __iter__ method doesn't tell you whether the object supports
  multiple iteration (such as with files) or single iteration (such as with lists).
  A __singleiterator__ attribute would clear-up the ambiguity.

* While you can test for the presence of a write() method, it would be
   helpful to have a __readonly__ information attribute for file-like objects,
   cursors, immutables, and whatnot.
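
A rough sketch of how client code might consult such attributes (the helper
names are hypothetical; the dunder names are the ones proposed above):

    def supports_multiple_iteration(obj):
        return (hasattr(obj, '__iter__')
                and not getattr(obj, '__singleiterator__', False))

    def is_writable(obj):
        return hasattr(obj, 'write') and not getattr(obj, '__readonly__', False)

    class ReadOnlyCursor:
        __readonly__ = True
        def write(self, data):
            raise IOError("read-only cursor")

    is_writable(ReadOnlyCursor())    # -> False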


Advantages

The attribute approach is dynamic (doesn't require inheritance to work). It
doesn't require mucking with isinstance() or other existing mechanisms.
It restricts itself to making a limited, useful set of assertions rather than
broadly covering a whole API. It leaves the proven pythonic notion of
duck-typing as the rule rather than the exception. It resists the temptation
to freeze all of the key APIs in concrete.

From python at rcn.com  Tue May  1 08:56:32 2007
From: python at rcn.com (Raymond Hettinger)
Date: Mon, 30 Apr 2007 23:56:32 -0700
Subject: [Python-3000] PEP:  Drop Implicit String Concatentation
Message-ID: <009e01c78bbd$da8b71c0$f001a8c0@RaymondLaptop1>

PEP:  Remove Implicit String Concatenation

Motivation

One goal for Python 3000 should be to simplify the language by removing
unnecessary features.  Implicit string concatenation should be dropped in
favor of existing techniques. This will simplify the grammar and simplify a
user's mental picture of Python.  The latter is important for letting the
language "fit in your head".  A large group of current users do not even know
about implicit concatenation.  Of those who do know about it, a large portion
never use it or habitually avoid it. Of those who both know about it and use it,
very few could state with confidence the implicit operator precedence and
under what circumstances it is computed when the definition is compiled versus
when it is run.

Uses and Substitutes

* Multi-line strings:

    s = "In the beginning, " \
        "there was a start."

    s = ("In the beginning," +
         "there was a start")

* Complex regular expressions are sometimes stated in terms of several
  implicitly concatenated strings with each regex component on a different
  line and followed by a comment.  The plus operator can be inserted here but
  it does make the regex harder to read.  One alternative is to use the
  re.VERBOSE option.  Another alternative is to build up the regex with a
  series of += lines:

         r = ('a{20}'  # Twenty A's
              'b{5}'   # Followed by Five B's
              )

         r = '''a{20}  # Twenty A's
                b{5}   # Followed by Five B's
             '''                 # Compiled with the re.VERBOSE flag

         r = 'a{20}'   # Twenty A's
         r += 'b{5}'   # Followed by Five B's

Automatic Substitution

When transitioning to Py3.0, some care should be taken not to blindly drop in
a plus operator and possibly incur a change in semantics due to operator
precedence.  A pair such as:
         "abc" "def"
should be replaced using parentheses:
         ("abc" + "def")

From daniel at stutzbachenterprises.com  Tue May  1 09:00:33 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Tue, 1 May 2007 02:00:33 -0500
Subject: [Python-3000] BList PEP
Message-ID: <eae285400705010000l2af0e890ifc8c2e0de8219961@mail.gmail.com>

PEP: 30XX
Title: BList: A faster list-like type
Version: $Revision$
Last-Modified: $Date$
Author: Daniel Stutzbach <daniel at stutzbachenterprises.com>
Discussions-To: Python 3000 List <python-3000 at python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Apr-2007
Python-Version: 2.6 and/or 3.0
Post-History: 30-Apr-2007


Abstract
========

The common case for list operations is on small lists.  The current
array-based list implementation excels at small lists due to the
strong locality of reference and infrequency of memory allocation
operations.  However, an array takes O(n) time to insert and delete
elements, which can become problematic as the list gets large.

This PEP introduces a new data type, the BList, that has array-like
and tree-like aspects.  It enjoys the same good performance on small
lists as the existing array-based implementation, but offers superior
asymptotic performance for most operations.  This PEP makes two
mutually exclusive proposals for including the BList type in Python:

1. Add it to the collections module, or
2. Replace the existing list type


Motivation
==========

The BList grew out of the frustration of needing to rewrite intuitive
algorithms that worked fine for small inputs but took O(n**2) time for
large inputs due to the underlying O(n) behavior of array-based lists.
The deque type, introduced in Python 2.4, solved the most common
problem of needing a fast FIFO queue.  However, the deque type doesn't
help if we need to repeatedly insert or delete elements from the
middle of a long list.

A wide variety of data structures provide good asymptotic performance
for insertions and deletions, but they either have O(n) performance
for other operations (e.g., linked lists) or have inferior performance
for small lists (e.g., binary trees and skip lists).

The BList type proposed in this PEP is based on the principles of
B+Trees, which have array-like and tree-like aspects.  The BList
offers array-like performance on small lists, while offering O(log n)
asymptotic performance for all insert and delete operations.
Additionally, the BList implements copy-on-write under-the-hood, so
even operations like getslice take O(log n) time.  The table below
compares the asymptotic performance of the current array-based list
implementation with the asymptotic performance of the BList.

========= ================                     ====================
Operation Array-based list                     BList
========= ================                     ====================
Copy      O(n)                                 **O(1)**
Append    **O(1)**                             O(log n)
Insert    O(n)                                 **O(log n)**
Get Item  **O(1)**                             O(log n)
Set Item  **O(1)**                             **O(log n)**
Del Item  O(n)                                 **O(log n)**
Iteration O(n)                                 O(n)
Get Slice O(k)                                 **O(log n)**
Del Slice O(n)                                 **O(log n)**
Set Slice O(n+k)                               **O(log k + log n)**
Extend    O(k)                                 **O(log k + log n)**
Sort      O(n log n)                           O(n log n)
Multiply  O(nk)                                **O(log k)**
========= ================                     ====================

An extensive empirical comparison of Python's array-based list and the
BList are available at [2]_.

Use Case Trade-offs
===================

The BList offers superior performance for many, but not all,
operations.  Choosing the correct data type for a particular use case
depends on which operations are used.  Choosing the correct data type
as a built-in depends on balancing the importance of different use
cases and the magnitude of the performance differences.

For the common use case of small lists, the array-based list and the
BList have similar performance characteristics.

For the slightly less common case of large lists, there are two common
use cases where the existing array-based list outperforms the
existing BList reference implementation.  These are:

1. A large LIFO stack, where there are many .append() and .pop(-1)
   operations.  Each operation is O(1) for an array-based list, but
   O(log n) for the BList.

2. A large list that does not change size.  The getitem and setitem
   calls are O(1) for an array-based list, but O(log n) for the BList.

In performance tests on a 10,000 element list, BLists exhibited a 50%
and 5% increase in execution time for these two use cases,
respectively.

The performance for the LIFO use case could be improved to O(n) time,
by caching a pointer to the right-most leaf within the root node.  For
lists that do not change size, the common case of sequential access
could also be improved to O(n) time via caching in the root node.
However, the performance of these approaches has not been empirically
tested.

Many operations exhibit a tremendous speed-up (O(n) to O(log n)) when
switching from the array-based list to BLists.  In performance tests
on a 10,000 element list, operations such as getslice, setslice, and
FIFO-style insert and deletes on a BList take only 1% of the time
needed on array-based lists.

In light of the large performance speed-ups for many operations, the
small performance costs for some operations will be worthwhile for
many (but not all) applications.

Implementation
==============

The BList is based on the B+Tree data structure.  The BList is a wide,
bushy tree where each node contains an array of up to 128 pointers to
its children.  If the node is a leaf, its children are the
user-visible objects that the user has placed in the list.  If a node is
not a leaf, its children are other BList nodes that are not
user-visible.  If the list contains only a few elements, they will all
be children of a single node that is both the root and a leaf.  Since
a node is little more than an array of pointers, small lists operate in
effectively the same way as an array-based data type and share the
same good performance characteristics.

The BList maintains a few invariants to ensure good (O(log n))
asymptotic performance regardless of the sequence of insert and delete
operations.  The principle invariants are as follows:

1. Each node has at most 128 children.
2. Each non-root node has at least 64 children.
3. The root node has at least 2 children, unless the list contains
fewer than 2 elements.
4. The tree is of uniform depth.

If an insert would cause a node to exceed 128 children, the node
spawns a sibling and transfers half of its children to the sibling.
The sibling is inserted into the node's parent.  If the node is the
root node (and thus has no parent), a new parent is created and the
depth of the tree increases by one.
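
The following is an illustrative sketch only (not the reference
implementation) of the split-on-overflow step just described; LIMIT stands
in for the 128-child bound:

    LIMIT = 128

    class Node:
        def __init__(self, leaf=True, children=None):
            self.leaf = leaf
            self.children = children if children is not None else []

        def insert(self, index, item):
            """Insert item; return a new right sibling if this node split."""
            self.children.insert(index, item)
            if len(self.children) <= LIMIT:
                return None
            half = len(self.children) // 2
            sibling = Node(self.leaf, self.children[half:])
            del self.children[half:]
            return sibling        # the caller inserts this into the parent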

If a deletion would cause a node to have fewer than 64 children, the
node moves elements from one of its siblings if possible.  If both of
its siblings also only have 64 children, then two of the nodes merge
and the empty one is removed from its parent.  If the root node is
reduced to only one child, its single child becomes the new root
(i.e., the depth of the tree is reduced by one).

In addition to tree-like asymptotic performance and array-like
performance on small-lists, BLists support transparent
**copy-on-write**.  If a non-root node needs to be copied (as part of
a getslice, copy, setslice, etc.), the node is shared between multiple
parents instead of being copied.  If it needs to be modified later, it
will be copied at that time.  This is completely behind-the-scenes;
from the user's point of view, the BList works just like a regular
Python list.

Memory Usage
============

In the worst case, the leaf nodes of a BList have only 64 children
each, rather than a full 128, meaning that memory usage is around
twice that of a best-case array implementation.  Non-leaf nodes use up
a negligible amount of additional memory, since there are at least 63
times as many leaf nodes as non-leaf nodes.

The existing array-based list implementation must grow and shrink as
items are added and removed.  To be efficient, it grows and shrinks
only when the list has grown or shrunk exponentially.  In the worst
case, it, too, uses twice as much memory as the best case.

In summary, the BList's memory footprint is not significantly
different from the existing array-based implementation.

Backwards Compatibility
=======================

If the BList is added to the collections module, backwards
compatibility is not an issue.  This section focuses on the option of
replacing the existing array-based list with the BList.  For users of
the Python interpreter, a BList has an identical interface to the
current list-implementation.  For virtually all operations, the
behavior is identical, aside from execution speed.

For the C API, BList has a different interface than the existing
list-implementation.  Due to its more complex structure, the BList
does not lend itself well to poking and prodding by external sources.
Thankfully, the existing list-implementation defines an API of
functions and macros for accessing data from list objects.  Google
Code Search suggests that the majority of third-party modules use the
well-defined API rather than relying on the list's structure
directly.  The table below summarizes the search queries and results:

======================== =================
Search String            Number of Results
======================== =================
PyList_GetItem           2,000
PySequence_GetItem         800
PySequence_Fast_GET_ITEM   100
PyList_GET_ITEM            400
\[^a\-zA\-Z\_\]ob_item          100
======================== =================


Replacing the array-based list could be achieved in one of two ways:

1. Redefine the various accessor functions and macros in listobject.h
   to access a BList instead.  The interface would be unchanged.  The
   functions can easily be redefined.  The macros need a bit more care
   and would have to resort to function calls for large lists.

   The macros would need to evaluate their arguments more than once,
   which could be a problem if the arguments have side effects.  A
   Google Code Search for "PyList_GET_ITEM\(\[^)\]+\(" found only a
   handful of cases where this occurs, so the impact appears to be
   low.

   The few extension modules that use list's undocumented structure
   directly, instead of using the API, would break.  The core code
   itself uses the accessor macros fairly consistently and should be
   easy to port.

2. Deprecate the existing list type, but continue to include it.
   Extension modules wishing to use the new BList type must do so
   explicitly.  The BList C interface can be changed to match the
   existing PyList interface so that a simple search-replace will be
   sufficient for 99% of module writers.

   Existing modules would continue to compile and work without change,
   but they would need to make a deliberate (but small) effort to
   migrate to the BList.

   The downside of this approach is that mixing modules that use
   BLists and array-based lists might lead to slow down if conversions
   are frequently necessary.

Reference Implementation
========================

A reference implementation of the BList is available for CPython at [1]_.

The source package also includes a pure Python implementation,
originally developed as a prototype for the CPython version.
Naturally, the pure Python version is rather slow and the asymptotic
improvements don't win out until the list is quite large.

When compiled with Py_DEBUG, the C implementation checks the
BList invariants when entering and exiting most functions.

An extensive set of test cases is also included in the source package.
The test cases include the existing Python sequence and list test
cases as a subset.  When the interpreter is built with Py_DEBUG, the
test cases also check for reference leaks.

Porting to Other Python Variants
--------------------------------

If the BList is added to the collections module, other Python variants
can support it in one of three ways:

1. Make blist an alias for list.  The asymptotic performance won't be
   as good, but it'll work.
2. Use the pure Python reference implementation.  The performance for
   small lists won't be as good, but it'll work.
3. Port the reference implementation.

Discussion
==========

This proposal has been discussed briefly on the Python-3000 mailing
list [3]_.  Although a number of people favored the proposal, there
were also some objections.  Below summarizes the pros and cons as
observed by posters to the thread.

General comments:

- Pro: Will outperform the array-based list in most cases
- Pro: "I've implemented variants of this ... a few different times"
- Con: Desirability and performance in actual applications is unproven

Comments on adding BList to the collections module:

- Pro: Matching the list-API reduces the learning curve to near-zero
- Pro: Useful for intermediate-level users; won't get in the way of beginners
- Con: Proliferation of data types makes the choices for developers harder.

Comments on replacing the array-based list with the BList:

- Con: Impact on extension modules (addressed in `Backwards
  Compatibility`_)
- Con: The use cases where BLists are slower are important
  (see `Use Case Trade-Offs`_ for how these might be addressed).
- Con: The array-based list code is simple and easy to maintain

To assess the desirability and performance in actual applications,
Raymond Hettinger suggested releasing the BList as an extension module
(now available at [1]_).  If it proves useful, he felt it would be a
strong candidate for inclusion in 2.6 as part of the collections
module.  If widely popular, then it could be considered for replacing
the array-based list, but not otherwise.

Guido van Rossum commented that he opposed the proliferation of data
types, but favored replacing the array-based list if backwards
compatibility could be addressed and the BList's performance was
uniformly better.

On-going Tasks
==============

- Reduce the memory footprint of small lists
- Implement TimSort for BLists, so that best-case sorting is O(n)
  instead of O(log n).
- Implement __reversed__
- Cache a pointer in the root to the rightmost leaf, to make LIFO
  operation O(n) time.

References
==========

.. [1] Reference Implementations for C and Python:
http://www.python.org/pypi/blist/

.. [2] Empirical performance comparison between Python's array-based
list and the blist: http://stutzbachenterprises.com/blist/

.. [3] Discussion on python-3000 starting at post:
http://mail.python.org/pipermail/python-3000/2007-April/006757.html

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From jcarlson at uci.edu  Tue May  1 09:24:30 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 01 May 2007 00:24:30 -0700
Subject: [Python-3000] PEP:  Information Attributes
In-Reply-To: <008a01c78bb9$e4283780$f001a8c0@RaymondLaptop1>
References: <008a01c78bb9$e4283780$f001a8c0@RaymondLaptop1>
Message-ID: <20070501000436.6450.JCARLSON@uci.edu>


"Raymond Hettinger" <python at rcn.com> wrote:
> Proto-PEP:  Information Attributes (First Draft)
> 
> Proposal:
> 
> Testing hasattr() is a broadly applicable and flexible technique that works well
> whenever the presence of a method has an unambiguous interpretation
>  (i.e. __hash__ for hashability, __iter__ for iterability, __len__ for sized
> containers); however, there are other methods with ambiguous interpretations
> that could be resolved by adding an information attribute.

To me, this seems more like traits/roles than ABCs.  Though I haven't
weighed in on either of them, generally I'm with Raymond and others in
the whole "ABCs seem like overkill" perspective.  As such, I'm -1 on
ABCs, but +1 on the general idea of traits/roles - of which I would
consider this PEP to be one.

My concern with Information Attributes is similar to my concern about
ABCs; in order to state the information available from these information
attributes, they need to be part of the class or instance.  On built-in
types, users would not be able to add things to classes or instances, as
is the case with the numpy folks wanting to add 'ring' to integers.

While I've not seen a PEP for offering live traits/roles addition or
removal, I suspect that it would involve weak key dictionaries adding
traits to classes, and only allow hashable instances for single-object
trait additions (depending on the kinds of traits/roles, it could
probably be implemented as a dictionary of weak key sets). I would be +1
in this case, as it would offer most of the benefits of ABCs*, with none
of the pre-implementation drawbacks.


 - Josiah

* Related to ABCs is the __issubclass__ and __isinstance__ stuff that
allows for proxy objects directly.  Traits/roles could be massaged to do
similar things, but using __is...__ directly seems like it would perform
this operation better.  I'm not a real big 


From python at rcn.com  Tue May  1 09:31:21 2007
From: python at rcn.com (Raymond Hettinger)
Date: Tue, 1 May 2007 00:31:21 -0700
Subject: [Python-3000] PEP:  Eliminate __del__
Message-ID: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>

PEP:  Eliminating __del__

Motivation

Historically, __del__ has been one of the more error-laden dark corners
of the language.  From an implementation point of view, it has
proven to be a thorny maintenance problem that grew almost beyond
the range of human comprehension once garbage collection was introduced.

From a user point-of-view, the method is deceptively simple and tends
to lead to fragile designs.  The fragility arises in part because it
is difficult to know when or if an object is going to be deleted, whether
by reference counts going to zero or by garbage collection.  Even if all
the relationships are known at the time a script is written, a subsequent
maintainer may innocently introduce (directly or indirectly) a reference
that prevents the desired finalization code from running.  From a design
perspective, it is almost always better to provide for explicit finalization
(for example, experienced Python programmers have learned to call file.close()
 and connection.close() rather than rely on automatic closing when the
 file or sql connection goes out of scope).  For finalization that needs
to occur only at the end of all operations, users have learned to use atexit()
as the preferred technique.

That leaves a handful of cases where some action does need to be taken
when an object is being collected.  In those cases, users have turned
to __del__ "because it was there".  Through the school of hard-knocks,
they eventually learn to avoid the hazards of accidentally bringing the
object back to life during finalization and possibly leaving the object
in an invalid half-finalized state.  The risks occur because the object
is still alive at the time the arbitrary python code in __del__ is called.

The alternative is to code the automatic finalization steps using
weakref callbacks.  For those used to using __del__, it takes a little
while to learn the idiom, but essentially the technique is to hold a proxy
or ref with a callback to a bound method for finalization:
    self.resource = resource = CreateResource()
    self.callbacks.append(proxy(resource, resource.closedown))
In this manner, all of the object's resources can be freed automatically
when the object is collected.  Note that the callbacks bind only
the resource object and not the client object, so the client object
can already have been collected and the teardown code can be run
without risk of resurrecting the client (with a possibly invalid state).
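
Spelled out as a self-contained sketch (the names here are only
illustrative, not part of any proposed API):

    import weakref

    _finalizers = []      # keeps the weakrefs (and their callbacks) alive

    class Resource(object):
        def closedown(self):
            print("resource torn down")

    class Client(object):
        def __init__(self):
            resource = Resource()
            self.resource = resource
            # The callback closes over the resource's bound method only,
            # never over the client, so the dying client is not resurrected.
            wr = weakref.ref(self, lambda ref, close=resource.closedown: close())
            _finalizers.append(wr)

    c = Client()
    del c        # with reference counting, prints "resource torn down"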


Proposal

The proposal is to eliminate __del__ and thereby eliminate a strong
temptation to code implicit rather than explicit finalization.  The
remaining approaches to teardown procedures such as atexit() and
weakref callbacks are much less problematic and should become the
one way to do it.

From jcarlson at uci.edu  Tue May  1 09:53:51 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 01 May 2007 00:53:51 -0700
Subject: [Python-3000] BList PEP
In-Reply-To: <eae285400705010000l2af0e890ifc8c2e0de8219961@mail.gmail.com>
References: <eae285400705010000l2af0e890ifc8c2e0de8219961@mail.gmail.com>
Message-ID: <20070501002446.6453.JCARLSON@uci.edu>


"Daniel Stutzbach" <daniel at stutzbachenterprises.com> wrote:
> Title: BList: A faster list-like type
> 1. Add it to the collections module, or

+1

> 2. Replace the existing list type

-1

For types that are used every day, I can't help but prefer a simple
implementation.  Among the features of the current Python list
implementation is that small lists (0, 4, 8, 16 elements) use very
little space.  Your current BList implementation uses a fixed size for
the smallest sequence, 128, which would offer worse memory performance
for applications where many small lists are common.


> ========= ================                     ====================
> Operation Array-based list                     BList
> ========= ================                     ====================
> Copy      O(n)                                 **O(1)**
> Append    **O(1)**                             O(log n)
> Insert    O(n)                                 **O(log n)**
> Get Item  **O(1)**                             O(log n)
> Set Item  **O(1)**                             **O(log n)**
what's going on with this pair?                  ^^        ^^

> Del Item  O(n)                                 **O(log n)**
> Iteration O(n)                                 O(n)
> Get Slice O(k)                                 **O(log n)**
> Del Slice O(n)                                 **O(log n)**
> Set Slice O(n+k)                               **O(log k + log n)**
> Extend    O(k)                                 **O(log k + log n)**
> Sort      O(n log n)                           O(n log n)
> Multiply  O(nk)                                **O(log k)**
> ========= ================                     ====================



> The performance for the LIFO use case could be improved to O(n) time,

You probably want to mention "over n appends/pop(-1)s".  You also may
want to update the above chart to take into consideration that you plan
on doing that modification.

Generally, the BList is asymptotically as fast or faster than a list for
everything except random getitem/setitem, where it is O(log n)
rather than O(1).  You may want to state this explicitly in some later
version.


> Implementation
> ==============
> 
> The BList is based on the B+Tree data structure.  The BList is a wide,
> bushy tree where each node contains an array of up to 128 pointers to
> its children.  If the node is a leaf, its children are the
> user-visible objects that the user has placed in the list.  If a node is
> not a leaf, its children are other BList nodes that are not
> user-visible.  If the list contains only a few elements, they will all
> be children of a single node that is both the root and a leaf.  Since
> a node is little more than an array of pointers, small lists operate in
> effectively the same way as an array-based data type and share the
> same good performance characteristics.

In the collections module, there exists a deque type.  This deque type
more or less uses a sequence of 64 pointers, the first two of which are
linked list pointers to the previous and next block of pointers.  I
don't know how much tuning was done to choose this value of 64, but you
may consider reducing the number of pointers to 64 for the same
cache/allocation behavior.


 - Josiah


From martin at v.loewis.de  Tue May  1 11:17:02 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 01 May 2007 11:17:02 +0200
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
Message-ID: <4637058E.2070604@v.loewis.de>

> Historically, __del__ has been one of the more error-laden dark corners
> of the language.  From an implementation point of view, it has
> proven to be a thorny maintenance problem that grew almost beyond
> the range of human comprehension once garbage collection was introduced.

+1

Martin

From martin at v.loewis.de  Tue May  1 12:52:02 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 01 May 2007 12:52:02 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
Message-ID: <46371BD2.7050303@v.loewis.de>

PEP: 31xx
Title: Supporting Non-ASCII Identifiers
Version: $Revision$
Last-Modified: $Date$
Author: Martin v. L?wis <martin at v.loewis.de>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 1-May-2007
Python-Version: 3.0
Post-History:

Abstract
========

This PEP suggests supporting non-ASCII letters (such as accented
characters, Cyrillic, Greek, Kanji, etc.) in Python identifiers.

Rationale
=========

Python code is written by many people in the world who are not
familiar with the English language, or even well-acquainted with the
Latin writing system. Such developers often desire to define classes
and functions with names in their native languages, rather than
having to come up with an (often incorrect) English translation
of the concept they want to name.

For some languages, common transliteration systems exist (in
particular, for the Latin-based writing systems). For other languages,
users have greater difficulty using Latin to write their native
words.

Common Objections
=================

Some objections are often raised against proposals similar to this one.

People claim that they will not be able to use a library if, to do so,
they have to use characters they cannot type on their
keyboards. However, it is the choice of the designer of the library to
decide on various constraints for using the library: people may not be
able to use the library because they cannot get physical access to the
source code (because it is not published), or because licensing
prohibits usage, or because the documentation is in a language they
cannot understand. A developer wishing to make a library widely
available needs to make a number of explicit choices (such as
publication, licensing, language of documentation, and language of
identifiers). It should always be the choice of the author to make
these decisions - not the choice of the language designers.

In particular, projects wishing to have wide usage will probably want
to establish a policy that all identifiers, comments, and
documentation are written in English (see the GNU coding style guide
for an example of such a policy). Restricting the language to
ASCII-only identifiers does not ensure that comments and documentation
are in English, or that the identifiers are actually English words, so an
additional policy is necessary anyway.

Specification of Language Changes
=================================

The syntax of identifiers in Python will be based on the Unicode
standard annex UAX-31 [1]_, with elaboration and changes as defined
below.

Within the ASCII range (U+0001..U+007F), the valid characters for
identifiers are the same as in Python 2.5. This specification only
introduces additional characters from outside the ASCII range. For
other characters, the classification uses the version of the Unicode
Character Database as included in the unicodedata module.

The identifier syntax is <ID_Start> <ID_Continue>\*.

ID_Start is defined as all characters having one of the general
categories uppercase letters (Lu), lowercase letters (Ll), titlecase
letters (Lt), modifier letters (Lm), other letters (Lo), letter
numbers (Nl), plus the underscore (XXX: what are the "stability
extensions" listed in UAX 31?).

ID_Continue is defined as all characters in ID_Start, plus nonspacing
marks (Mn), spacing combining marks (Mc), decimal number (Nd), and
connector punctuations (Pc).

All identifiers are converted into the normal form NFC while parsing;
comparison of identifiers is based on NFC.
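
An illustrative (non-normative) checker for this syntax, using the
unicodedata module mentioned above:

    import unicodedata

    ID_START = set(['Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Nl'])
    ID_CONTINUE = ID_START | set(['Mn', 'Mc', 'Nd', 'Pc'])

    def is_identifier(s):
        s = unicodedata.normalize('NFC', s)   # identifiers compare after NFC
        if not s:
            return False
        first, rest = s[0], s[1:]
        if first != '_' and unicodedata.category(first) not in ID_START:
            return False
        return all(ch == '_' or unicodedata.category(ch) in ID_CONTINUE
                   for ch in rest)

    is_identifier(u'\u00fcber')    # -> True
    is_identifier(u'1abc')         # -> False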

Policy Specification
====================

As an addition to the Python Coding style, the following policy is
prescribed: All identifiers in the Python standard library MUST use
ASCII-only identifiers, and SHOULD use English words wherever
feasible.

As an option, this specification can be applied to Python 2.x.  In
that case, ASCII-only identifiers would continue to be represented as
byte string objects in namespace dictionaries; identifiers with
non-ASCII characters would be represented as Unicode strings.

Implementation
==============

The following changes will need to be made to the parser:

1. If a non-ASCII character is found in the UTF-8 representation of
   the source code, a forward scan is made to find the first ASCII
   non-identifier character (e.g. a space or punctuation character)

2. The entire UTF-8 string is passed to a function to normalize the
   string to NFC, and then verify that it follows the identifier
   syntax. No such callout is made for pure-ASCII identifiers, which
   continue to be parsed the way they are today.

3. If this specification is implemented for 2.x, reflective libraries
   (such as pydoc) must be verified to continue to work when Unicode
   strings appear in __dict__ slots as keys.

References
==========

.. [1] http://www.unicode.org/reports/tr31/


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From rasky at develer.com  Tue May  1 13:32:52 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Tue, 01 May 2007 13:32:52 +0200
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
Message-ID: <f178h4$s0n$1@sea.gmane.org>

On 01/05/2007 9.31, Raymond Hettinger wrote:

> PEP:  Eliminating __del__

*sigh* I'm still -1, but I won't revive the discussion of course.

I would still like the PEP to list the alternative that I and others were 
proposing, that is, changing the semantics of __del__ (or dropping __del__ in 
favor of a new __close__ method with the new semantics), such that:

  1) It is guaranteed to be called only once per object.
  2) In case of circular references, __del__ methods are called in a random order 
on the objects of the cycle, and then the cycle is broken. (The random order is 
acceptable because step #1 fixes the main problem with calling __del__ that way.)

In fact, your PEP concentrates on the problem of implicit finalization, which 
I don't think is generally perceived as *the* problem with __del__. I'm 
still a *strong* proponent of implicit finalization (aka RAII). It always 
worked well for me.

The problem is that __del__ currently *breaks* implicit finalization, causing 
garbage if it's used by objects in a cycle. With the fixes above, I'd use it 
more, not less.
-- 
Giovanni Bajo


From rasky at develer.com  Tue May  1 13:34:47 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Tue, 01 May 2007 13:34:47 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46371BD2.7050303@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de>
Message-ID: <f178kn$s0n$2@sea.gmane.org>

On 01/05/2007 12.52, Martin v. L?wis wrote:

> PEP: 31xx
> Title: Supporting Non-ASCII Identifiers

Isn't this already blacklisted in PEP 3099?
-- 
Giovanni Bajo



From jimjjewett at gmail.com  Tue May  1 16:11:26 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 10:11:26 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070430200255.04b88e10@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430200255.04b88e10@sparrow.telecommunity.com>
Message-ID: <fb6fbf560705010711w3d533820m53b1eeef229c7131@mail.gmail.com>

On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 07:29 PM 4/30/2007 -0400, Jim Jewett wrote:
> >On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:

> >>PEP 3115, however, requires that a class' metaclass be determined
> >>*before* the class body has executed, making it impossible to use this
> >>technique for class decoration any more.

> >It doesn't say what that metaclass has to do, though.

> >Is there any reason the metaclass couldn't delegate differently
> >depending on the value of __my_magic_attribute__ ?

> Sure -- that's what I suggested in the "super(), class decorators, and PEP
> 3115" thread, but Guido voted -1 on adding such a magic attribute to PEP
> 3115.

I don't think we're understanding each other.  Why couldn't you use a
suitably fleshed-out version of:

class _ConditionalMetaclass(type):

    def __init__(cls, name, bases, dct):
        super(_ConditionalMetaclass, cls).__init__(name, bases, dct)
        hooks = [(k, v) for (k, v) in dct.items()
                 if k.startswith("_afterhook_")]
        for k, v in hooks:
            cls = AfterHooksRegistry[k](cls, v)

-jJ

From jimjjewett at gmail.com  Tue May  1 16:31:08 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 10:31:08 -0400
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <4636AE9E.2020905@acm.org>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>
	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
	<46368EE5.6050409@canterbury.ac.nz> <4636AE9E.2020905@acm.org>
Message-ID: <fb6fbf560705010731i719da8efibc14c72e6175053d@mail.gmail.com>

On 4/30/07, Talin <talin at acm.org> wrote:
> Greg Ewing wrote:
> > Patrick Maupin wrote:

> >> Method calls are deliberately disallowed by the PEP, so that the
> >> implementation has some hope of being securable.

> > If attribute access is allowed, arbitrary code can already
> > be triggered, so I don't see how this makes a difference
> > to security.

> Not quite. It depends on what you mean by 'arbitrary code'. ...

If I understood that correctly, then

(1)  The format string cannot run arbitrary code, but
(2)  The formatted objects themselves can.

This is probably a feature, since you can pass proxy objects, but it
should definitely be called out explicitly in the security section
(currently just some text in the Simple and Compound Names section).
Example Text:


Note that while (literal strings used as) format strings are
effectively sandboxed, the formatted objects themselves are not.

    "My name is {0[name]}".format(evil_map)

would still allow evil_map to run arbitrary code.
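
For instance (EvilMap is of course just a made-up stand-in):

    class EvilMap(dict):
        def __getitem__(self, key):
            print("arbitrary code running for key %r" % (key,))
            return "Mallory"

    "My name is {0[name]}".format(EvilMap())
    # -> 'My name is Mallory', after running whatever __getitem__ liked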


-jJ

From jimjjewett at gmail.com  Tue May  1 16:52:18 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 10:52:18 -0400
Subject: [Python-3000] Breakthrough in thinking about ABCs (PEPs 3119
	and 3141)
In-Reply-To: <ca471dc20704301719y5bc7e83fi1a6eb2a68d9f4e42@mail.gmail.com>
References: <ca471dc20704301719y5bc7e83fi1a6eb2a68d9f4e42@mail.gmail.com>
Message-ID: <fb6fbf560705010752y54f88cfctb4577d7d962a565a@mail.gmail.com>

On 4/30/07, Guido van Rossum <guido at python.org> wrote:

> The idea of overloading isinstance and issubclass is running into some
> resistance. I still like it, but if there is overwhelming discomfort,
> we can change it so that instead of writing isinstance(x, C) or
> issubclass(D, C) (where C overloads these operations), you'd have to
> write something like C.hasinstance(x) or C.hassubclass(D), where
> hasinstance and hassubclass are defined by some ABC metaclass. I'd
> still like to have the spec for hasinstance and hassubclass in the
> core language, so that different 3rd party frameworks don't need to
> invent different ways of spelling this inquiry.

Would it help to get away from class/instance entirely, and call them
something like isexample?  (Though class vs instance gets harder then.
 areexamples?)

(And yes, I think it would, but no, I don't yet have the code written
out to explain.)

-jJ

From jimjjewett at gmail.com  Tue May  1 17:18:52 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 11:18:52 -0400
Subject: [Python-3000] PEP: Information Attributes
In-Reply-To: <008a01c78bb9$e4283780$f001a8c0@RaymondLaptop1>
References: <008a01c78bb9$e4283780$f001a8c0@RaymondLaptop1>
Message-ID: <fb6fbf560705010818t4c17f5e3oc85295cadc49bd8e@mail.gmail.com>

On 5/1/07, Raymond Hettinger <python at rcn.com> wrote:
> Use Cases with Ambiguous Interpretations

> * The presence of a __getitem__ method is ambiguous in that it can be
>   interpreted as either having sequence or mapping behavior.  The ambiguity is
>   easily resolved with an attribute claiming either mapping behavior or
>   sequence behavior.

If you're really duck-typing, it doesn't matter; just try the key and
see if it works.  At this level, Sequences *are* mappings which happen
to have (exactly the) integers from 0 to size-1 as keys.

Knowing that the keys are integers won't tell you whether you can push and pop.

The advantage of the ABC variant is that you do know you can push and
pop, because if the object itself didn't provide an implementation,
then python will fall back to the (abstract class' concrete) default
implementation for you.
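
For instance, something along these lines (names invented for illustration;
this isn't the actual ABC machinery):

    class SequenceLike:
        # a subclass supplies __len__ and __getitem__ ...
        def __contains__(self, value):          # ... and gets this for free
            return any(self[i] == value for i in range(len(self)))

    class Tens(SequenceLike):
        def __len__(self): return 3
        def __getitem__(self, i): return i * 10

    20 in Tens()    # -> True, via the inherited default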

> * The presence of a rich comparison operator such as __lt__ is ambiguous in that
>   it can return a vector or a scalar, the scalar may or may not be boolean,
>   and it may be a NotImplemented instance.  Even the boolean case is ambiguous
>   because __lt__ may imply a total ordering (as it does for numbers) or it may
>   be a partial ordering (as it is for sets where __lt__ means a strict
>   subset). That latter ambiguity (sortability) is easily resolved by an
>   attribute indicating a total ordering.

erm... sortability with respect to what?  Only instances of its own
class?  With other string-like things?

> * Some methods such as set.__add__ are too restrictive in that they preclude
>   interaction with non-sets.  This makes it impossible to achieve set
>   interoperability without subclassing from set (a choice which introduces
>   other complications such as the inability to override set-to-set
>   interactions).  This situation is easily resolved by an attribute like
>   obj.__fake__=set which indicates that the object intends to be a set proxy.

How does this improve on registering the object with the abstract Set
class?  If anything, it seems worse, because you need to be able to
modify obj.  ( Josiah suggests a lookaside dictionary -- but that
might as well *be* the ABC.)

> * The __iter__ method doesn't tell you whether the object supports
>   multiple iteration (such as with files) or single iteration (such as with lists).
>   A __singleiterator__ attribute would clear-up the ambiguity.

This seems backwards.  I hope that was just a typo, but *I* can't be
as sure from a single name as I could from a docstringed class.

> * While you can test for the presence of a write() method, it would be
>    helpful to have a __readonly__ information attribute for file-like objects,
>    cursors, immutables, and whatnot.

readonly meaning that I can't modify it, or readonly  meaning that no
one else will?

> The attribute approach is dynamic (doesn't require inheritance to work). It
> doesn't require mucking with isinstance() or other existing mechanisms.

I think a Traits version of ABCs could do that as well, and will try
to get an example coded in the next week or so.

> It restricts itself to making a limited, useful set of assertions rather than
> broadly covering a whole API. It leaves the proven pythonic notion of
> duck-typing as the rule rather than the exception. It resists the temptation
> to freeze all of the key APIs in concrete.

I feel almost the opposite.  Because the attribute is right there on
the object (instead of in a registry I have to import), it is more
tempting to use it; I expect this will cause many more people to
code defensively by adding extra asserts, so that it becomes more
important to support.  Because the object itself is only a single
namespace, it effectively freezes the API that goes out first.

Josiah wrote:
> ... suspect ... weak key dictionaries adding
> traits to classes, and only allow hashable instances for single-object

(I assumed the key would be id(obj), though it would still need a
weakref for data integrity.)

jJ

From collinw at gmail.com  Tue May  1 17:22:45 2007
From: collinw at gmail.com (Collin Winter)
Date: Tue, 1 May 2007 08:22:45 -0700
Subject: [Python-3000] PEP: Drop Implicit String Concatenation
In-Reply-To: <009e01c78bbd$da8b71c0$f001a8c0@RaymondLaptop1>
References: <009e01c78bbd$da8b71c0$f001a8c0@RaymondLaptop1>
Message-ID: <43aa6ff70705010822q1066ab51sa7c58547dd3d18f1@mail.gmail.com>

On 4/30/07, Raymond Hettinger <python at rcn.com> wrote:
> PEP:  Remove Implicit String Concatenation

Jim Jewett has already submitted a PEP that does this, PEP 3126. It's
in SVN but not showing up on PEP 0 for some reason:
http://svn.python.org/view/peps/trunk/pep-3126.txt?rev=55030&view=markup

Collin Winter

From p.f.moore at gmail.com  Tue May  1 17:42:04 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 1 May 2007 16:42:04 +0100
Subject: [Python-3000] BList PEP
In-Reply-To: <eae285400705010000l2af0e890ifc8c2e0de8219961@mail.gmail.com>
References: <eae285400705010000l2af0e890ifc8c2e0de8219961@mail.gmail.com>
Message-ID: <79990c6b0705010842m20f0cfa1o4dd14574fbc8769@mail.gmail.com>

> - Implement TimSort for BLists, so that best-case sorting is O(n)
>  instead of O(log n).

Is that a typo? Why would you want to make best-case sorting worse?
Paul.

From jimjjewett at gmail.com  Tue May  1 17:43:33 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 11:43:33 -0400
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
Message-ID: <fb6fbf560705010843n39a46977o9e29152ac3d92c21@mail.gmail.com>

On 5/1/07, Raymond Hettinger <python at rcn.com> wrote:
> The alternative is to code the automatic finalization steps using
> weakref callbacks.  For those used to using __del__, it takes a little
> while to learn the idiom but essentially the technique is hold a proxy
> or ref with a callback to a boundmethod for finalization:
>     self.resource = resource = CreateResource()
>     self.callbacks.append(proxy(resource, resource.closedown))
> In this manner, all of the object's resources can be freed automatically
> when the object is collected.  Note, that the callbacks only bind
> the resource object and not client object, so the client object
> can already have been collected and the teardown code can be run
> without risk of resurrecting the client (with a possibly invalid state).

That alternative is pretty ugly, and I think we found some cases where
it required major rewriting.  (I don't have them handy, but may end up
searching for them again, if need be.)

A smaller change would be to add __close__ (which covers most use
cases), or even to give __del__ the __close__ semantics.

The key distinction is that __close__ says to go ahead and break the
cycle in an arbitrary location, rather than immortalizing it.
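
A tiny demo of the "immortalizing" behavior, as the collector works today
(Python 2.x semantics; much later interpreters changed this and collect
such cycles, leaving gc.garbage empty):

    import gc

    class Holder(object):
        def __del__(self):
            pass

    a, b = Holder(), Holder()
    a.other, b.other = b, a      # a reference cycle of finalizable objects
    del a, b
    gc.collect()
    print(gc.garbage)            # both Holder instances sit here, uncollected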

-jJ

From collinw at gmail.com  Tue May  1 17:44:15 2007
From: collinw at gmail.com (Collin Winter)
Date: Tue, 1 May 2007 08:44:15 -0700
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46371BD2.7050303@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de>
Message-ID: <43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>

On 5/1/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> Rationale
> =========
>
> Python code is written by many people in the world who are not
> familiar with the English language, or even well-acquainted with the
> Latin writing system.
[snip]

That makes absolutely no sense. You mean to tell me that people write
Python without being able to understand any of the language's
keywords, builtin functions, standard library or documentation?

-?.

Collin Winter

From daniel at stutzbachenterprises.com  Tue May  1 17:46:36 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Tue, 1 May 2007 10:46:36 -0500
Subject: [Python-3000] BList PEP
In-Reply-To: <79990c6b0705010842m20f0cfa1o4dd14574fbc8769@mail.gmail.com>
References: <eae285400705010000l2af0e890ifc8c2e0de8219961@mail.gmail.com>
	<79990c6b0705010842m20f0cfa1o4dd14574fbc8769@mail.gmail.com>
Message-ID: <eae285400705010846u254f709cm15ac223c6633d710@mail.gmail.com>

On 5/1/07, Paul Moore <p.f.moore at gmail.com> wrote:
> > - Implement TimSort for BLists, so that best-case sorting is O(n)
> >  instead of O(log n).
>
> Is that a typo? Why would you want to make best-case sorting worse?

Yes, it should read O(n log n), not O(log n).

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

From collinw at gmail.com  Tue May  1 17:52:13 2007
From: collinw at gmail.com (Collin Winter)
Date: Tue, 1 May 2007 08:52:13 -0700
Subject: [Python-3000] Adding class decorators to PEP 318
Message-ID: <43aa6ff70705010852g112924a2hbf13f31d83631a85@mail.gmail.com>

In talking to Neal Norwitz about this, I don't see a need for a
separate PEP for class decorators; we already have a decorators PEP,
#318. The following is a proposed patch to PEP 318 that adds in class
decorators.

Collin Winter


Index: pep-0318.txt
===================================================================
--- pep-0318.txt	(revision 55034)
+++ pep-0318.txt	(working copy)
@@ -1,5 +1,5 @@
 PEP: 318
-Title: Decorators for Functions and Methods
+Title: Decorators for Functions, Methods and Classes
 Version: $Revision$
 Last-Modified: $Date$
 Author: Kevin D. Smith, Jim Jewett, Skip Montanaro, Anthony Baxter
@@ -9,7 +9,7 @@
 Created: 05-Jun-2003
 Python-Version: 2.4
 Post-History: 09-Jun-2003, 10-Jun-2003, 27-Feb-2004, 23-Mar-2004, 30-Aug-2004,
-              2-Sep-2004
+              2-Sep-2004, 30-Apr-2007


 WarningWarningWarning
@@ -22,24 +22,40 @@
 negatives of each form.


+UpdateUpdateUpdate
+==================
+
+In April 2007, this PEP was updated to reflect the evolution of the Python
+community's attitude toward class decorators.  Though they had previously
+been rejected as too obscure and with limited use-cases, by mid-2006,
+class decorators had come to be seen as the logical next step, with some
+wondering why they hadn't been included originally.  As a result, class
+decorators will ship in Python 2.6.
+
+This PEP has been modified accordingly, with references to class decorators
+injected into the narrative.  While some references to the lack of class
+decorators have been left in place to preserve the historical record, others
+have been removed for the sake of coherence.
+
+
 Abstract
 ========

-The current method for transforming functions and methods (for instance,
-declaring them as a class or static method) is awkward and can lead to
-code that is difficult to understand.  Ideally, these transformations
-should be made at the same point in the code where the declaration
-itself is made.  This PEP introduces new syntax for transformations of a
-function or method declaration.
+The current method for transforming functions, methods and classes (for
+instance, declaring a method as a class or static method) is awkward and
+can lead to code that is difficult to understand.  Ideally, these
+transformations should be made at the same point in the code where the
+declaration itself is made.  This PEP introduces new syntax for
+transformations of a function, method or class declaration.


 Motivation
 ==========

-The current method of applying a transformation to a function or method
-places the actual transformation after the function body.  For large
-functions this separates a key component of the function's behavior from
-the definition of the rest of the function's external interface.  For
+The current method of applying a transformation to a function, method or class
+places the actual transformation after the body.  For large
+code blocks this separates a key component of the object's behavior from
+the definition of the rest of the object's external interface.  For
 example::

     def foo(self):
@@ -69,14 +85,22 @@
 are not as immediately apparent.  Almost certainly, anything which could
 be done with class decorators could be done using metaclasses, but
 using metaclasses is sufficiently obscure that there is some attraction
-to having an easier way to make simple modifications to classes.  For
-Python 2.4, only function/method decorators are being added.
+to having an easier way to make simple modifications to classes.  The
+following is much clearer than the metaclass-based alternative::

+    @singleton
+    class Foo(object):
+        pass

+Because of the greater ease-of-use of class decorators and the symmetry
+with function and method decorators, class decorators will be included in
+Python 2.6.
+
+
 Why Is This So Hard?
 --------------------

-Two decorators (``classmethod()`` and ``staticmethod()``) have been
+Two method decorators (``classmethod()`` and ``staticmethod()``) have been
 available in Python since version 2.2.  It's been assumed since
 approximately that time that some syntactic support for them would
 eventually be added to the language.  Given this assumption, one might
@@ -135,11 +159,16 @@
 .. _gareth mccaughan:
    http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=slrna40k88.2h9o.Gareth.McCaughan%40g.local

-Class decorations seem like an obvious next step because class
+Class decorations seemed like an obvious next step because class
 definition and function definition are syntactically similar,
-however Guido remains unconvinced, and class decorators will almost
-certainly not be in Python 2.4.
+however Guido was not convinced of their usefulness, and class
+decorators were not in Python 2.4.  `The issue was revisited`_ in March 2006
+and sufficient use-cases were found to justify the inclusion of class
+decorators in Python 2.6.

+.. _The issue was revisited:
+   http://mail.python.org/pipermail/python-dev/2006-March/062942.html
+
 The discussion continued on and off on python-dev from February
 2002 through July 2004.  Hundreds and hundreds of posts were made,
 with people proposing many possible syntax variations.  Guido took
@@ -147,8 +176,8 @@
 place.  Subsequent to this, he decided that we'd have the `Java-style`_
 @decorator syntax, and this appeared for the first time in 2.4a2.
 Barry Warsaw named this the 'pie-decorator' syntax, in honor of the
-Pie-thon Parrot shootout which was occured around the same time as
-the decorator syntax, and because the @ looks a little like a pie.
+Pie-thon Parrot shootout which was occurring around the same time as
+the decorator syntax debate, and because the @ looks a little like a pie.
 Guido `outlined his case`_ on Python-dev, including `this piece`_
 on some of the (many) rejected forms.

@@ -250,6 +279,19 @@
 decorators are near the function declaration.  The @ sign makes it clear
 that something new is going on here.

+Python 2.6's class decorators work similarly::
+
+    @dec2
+    @dec1
+    class Foo:
+        pass
+
+This is equivalent to::
+
+    class Foo:
+        pass
+    Foo = dec2(dec1(Foo))
+
 The rationale for the `order of application`_ (bottom to top) is that it
 matches the usual order for function-application.  In mathematics,
 composition of functions (g o f)(x) translates to g(f(x)).  In Python,
@@ -321,7 +363,7 @@
 There have been a number of objections raised to this location -- the
 primary one is that it's the first real Python case where a line of code
 has an effect on a following line.  The syntax available in 2.4a3
-requires one decorator per line (in a2, multiple decorators could be
+requires one decorator per line (in 2.4a2, multiple decorators could be
 specified on the same line).

 People also complained that the syntax quickly got unwieldy when
@@ -330,52 +372,61 @@
 were small and thus this was not a large worry.

 Some of the advantages of this form are that the decorators live outside
-the method body -- they are obviously executed at the time the function
+the function/class body -- they are obviously executed at the time the object
 is defined.

-Another advantage is that a prefix to the function definition fits
+Another advantage is that a prefix to the definition fits
 the idea of knowing about a change to the semantics of the code before
-the code itself, thus you know how to interpret the code's semantics
+the code itself. This way, you know how to interpret the code's semantics
 properly without having to go back and change your initial perceptions
 if the syntax did not come before the function definition.

 Guido decided `he preferred`_ having the decorators on the line before
-the 'def', because it was felt that a long argument list would mean that
-the decorators would be 'hidden'
+the 'def' or 'class', because it was felt that a long argument list would mean
+that the decorators would be 'hidden'

 .. _he preferred:
     http://mail.python.org/pipermail/python-dev/2004-March/043756.html

-The second form is the decorators between the def and the function name,
-or the function name and the argument list::
+The second form is the decorators between the 'def' or 'class' and the object's
+name, or between the name and the argument list::

     def @classmethod foo(arg1,arg2):
         pass
+
+    class @singleton Foo(arg1, arg2):
+        pass

     def @accepts(int,int),@returns(float) bar(low,high):
         pass

     def foo @classmethod (arg1,arg2):
         pass
+
+    class Foo @singleton (arg1, arg2):
+        pass

     def bar @accepts(int,int),@returns(float) (low,high):
         pass

 There are a couple of objections to this form.  The first is that it
-breaks easily 'greppability' of the source -- you can no longer search
+breaks easy 'greppability' of the source -- you can no longer search
 for 'def foo(' and find the definition of the function.  The second,
 more serious, objection is that in the case of multiple decorators, the
 syntax would be extremely unwieldy.

 The next form, which has had a number of strong proponents, is to have
 the decorators between the argument list and the trailing ``:`` in the
-'def' line::
+'def' or 'class' line::

     def foo(arg1,arg2) @classmethod:
         pass

     def bar(low,high) @accepts(int,int),@returns(float):
         pass
+
+    class Foo(object) @singleton:
+        pass

 Guido `summarized the arguments`_ against this form (many of which also
 apply to the previous form) as:
@@ -403,15 +454,19 @@
         @accepts(int,int)
         @returns(float)
         pass
+
+    class Foo(object):
+        @singleton
+        pass

 The primary objection to this form is that it requires "peeking inside"
-the method body to determine the decorators.  In addition, even though
-the code is inside the method body, it is not executed when the method
+the suite body to determine the decorators.  In addition, even though
+the code is inside the suite body, it is not executed when the code
 is run.  Guido felt that docstrings were not a good counter-example, and
 that it was quite possible that a 'docstring' decorator could help move
 the docstring to outside the function body.

-The final form is a new block that encloses the method's code.  For this
+The final form is a new block that encloses the function or class.  For this
 example, we'll use a 'decorate' keyword, as it makes no sense with the
 @syntax. ::

@@ -425,9 +480,14 @@
         returns(float)
         def bar(low,high):
             pass
+
+    decorate:
+        singleton
+        class Foo(object):
+            pass

 This form would result in inconsistent indentation for decorated and
-undecorated methods.  In addition, a decorated method's body would start
+undecorated code.  In addition, a decorated object's body would start
 three indent levels in.


@@ -444,6 +504,10 @@
     @returns(float)
     def bar(low,high):
         pass
+
+    @singleton
+    class Foo(object):
+        pass

   The major objections against this syntax are that the @ symbol is
   not currently used in Python (and is used in both IPython and Leo),
@@ -461,6 +525,10 @@
     |returns(float)
     def bar(low,high):
         pass
+
+    |singleton
+    class Foo(object):
+        pass

   This is a variant on the @decorator syntax -- it has the advantage
   that it does not break IPython and Leo.  Its major disadvantage
@@ -476,6 +544,10 @@
     [accepts(int,int), returns(float)]
     def bar(low,high):
         pass
+
+    [singleton]
+    class Foo(object):
+        pass

   The major objection to the list syntax is that it's currently
   meaningful (when used in the form before the method).  It's also
@@ -490,6 +562,10 @@
     <accepts(int,int), returns(float)>
     def bar(low,high):
         pass
+
+    <singleton>
+    class Foo(object):
+        pass

   None of these alternatives gained much traction. The alternatives
   which involve square brackets only serve to make it obvious that the
@@ -659,7 +735,10 @@
 .. _subsequently rejected:
      http://mail.python.org/pipermail/python-dev/2004-September/048518.html

+For Python 2.6, the Python grammar and compiler were modified to allow
+class decorators in addition to function and method decorators.

+
 Community Consensus
 -------------------

From jimjjewett at gmail.com  Tue May  1 17:54:24 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 11:54:24 -0400
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>
References: <46371BD2.7050303@v.loewis.de>
	<43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>
Message-ID: <fb6fbf560705010854y930ce5br19079828ca0bf208@mail.gmail.com>

On 5/1/07, Collin Winter <collinw at gmail.com> wrote:
> On 5/1/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > Rationale
> > =========

> That makes absolutely no sense. You mean to tell me that people write
> Python without being able to understand any of the language's
> keywords, builtin functions, standard library or documentation?

If they have translations of the important documentation and the small
number of keywords -- yes, they probably do; the alternative
programming languages aren't really all that much easier for
non-English speakers.

FWIW, I've used undocumented variants of Assembler, based only on
examples.  (So no doc, didn't have a complete set of
keywords/functions/libraries, misunderstood some of what I did have.)  I
won't say it was a great environment, but it did work.

-jJ

From martin at v.loewis.de  Tue May  1 17:56:10 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 01 May 2007 17:56:10 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <f178kn$s0n$2@sea.gmane.org>
References: <46371BD2.7050303@v.loewis.de> <f178kn$s0n$2@sea.gmane.org>
Message-ID: <4637631A.6030702@v.loewis.de>

>> Title: Supporting Non-ASCII Identifiers
> 
> Isn't this already blacklisted in PEP 3099?

It's not clear to me. That was in response to a suggestion
that non-ASCII symbols would be used in the syntax of Python,
i.e. in a way making it mandatory to be able to type these
symbols.

This is not the intent of this PEP.

There is also

http://mail.python.org/pipermail/python-3000/2006-April/001526.html

where Guido states that he trusts me that it can be made to work,
and that "eventually" it needs to be supported. Rather than asking
for trust, I put out a specification of how precisely the change
would be implemented. Then, in

http://mail.python.org/pipermail/python-3000/2006-April/001551.html

he indicates that this doesn't have to be synchronized with Py3k.

So if it is rejected for Py3k because of PEP 3099, I will need to
suggest it for addition to Python 2.6. However, if I had proposed
it for Python 2.6, people would have objected that it should rather
be included in Py3k. If it is rejected for 2.6 on the grounds of
being premature, I will resubmit it for 3.1, and so on, until
"eventually" is "now". If it gets rejected "for good", I shall feel
sorry.

Regards,
Martin

From martin at v.loewis.de  Tue May  1 17:59:39 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 01 May 2007 17:59:39 +0200
Subject: [Python-3000] PEP: Drop Implicit String Concatenation
In-Reply-To: <43aa6ff70705010822q1066ab51sa7c58547dd3d18f1@mail.gmail.com>
References: <009e01c78bbd$da8b71c0$f001a8c0@RaymondLaptop1>
	<43aa6ff70705010822q1066ab51sa7c58547dd3d18f1@mail.gmail.com>
Message-ID: <463763EB.3070400@v.loewis.de>

Collin Winter schrieb:
> On 4/30/07, Raymond Hettinger <python at rcn.com> wrote:
>> PEP:  Remove Implicit String Concatenation
> 
> Jim Jewett has already submitted a PEP that does this, PEP 3126. It's
> in SVN but not showing up on PEP 0 for some reason:

It does show on PEP 0:

http://www.python.org/dev/peps/pep-0000/

however it does not show on the PEP index, for some reason.

Andrew said he fixed that, but it apparently still doesn't work.

Martin

From collinw at gmail.com  Tue May  1 18:05:52 2007
From: collinw at gmail.com (Collin Winter)
Date: Tue, 1 May 2007 09:05:52 -0700
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <4637631A.6030702@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de> <f178kn$s0n$2@sea.gmane.org>
	<4637631A.6030702@v.loewis.de>
Message-ID: <43aa6ff70705010905l3f87d57ck5a8f5597a6de9dab@mail.gmail.com>

On 5/1/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> >> Title: Supporting Non-ASCII Identifiers
> >
> > Isn't this already blacklisted in PEP 3099?
>
> It's not clear to me. That was in response to a suggestion
> that non-ASCII symbols will be used in the syntax of Python,
> i.e. in a way making it mandatory to be able to type these
> symbols.

Reading from http://mail.python.org/pipermail/python-3000/2006-April/001474.html,
the message that prompted this particular addition to PEP 3099, "I
want good Unicode support for string literals and comments. Everything
else in the language ought to be ASCII."

Identifiers aren't string literals or comments.

Collin Winter

From pje at telecommunity.com  Tue May  1 18:09:30 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 12:09:30 -0400
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
Message-ID: <5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>

At 12:31 AM 5/1/2007 -0700, Raymond Hettinger wrote:
>The alternative is to code the automatic finalization steps using
>weakref callbacks.  For those used to using __del__, it takes a little
>while to learn the idiom but essentially the technique is hold a proxy
>or ref with a callback to a boundmethod for finalization:
>     self.resource = resource = CreateResource()
>     self.callbacks.append(proxy(resource, resource.closedown))
>In this manner, all of the object's resources can be freed automatically
>when the object is collected.  Note, that the callbacks only bind
>the resource object and not client object, so the client object
>can already have been collected and the teardown code can be run
>without risk of resurrecting the client (with a possibly invalid state).

I'm a bit confused about the above.  My understanding is that in order for 
a weakref's callback to be invoked, the weakref itself *must still be 
live*.  That means that if 'self' in your example above is collected, then 
the weakref no longer exists, so the closedown won't be called.

Yet, at the same time, it appears that in your example, deleting 
self.resource would *not* cause resource to be GC'd either, because the 
weakref still holds a reference to 'resource.closedown', which in turn must 
hold a reference to 'resource'.

So, at first glance, your example looks like it can't possibly do the right 
thing, ever, unless I'm missing something rather big.  In which case, the 
explanation for *how* this is supposed to work should go in the PEP.

In principle I'm in favor of ditching __del__, as long as there's actually 
a viable technique for doing so.  My own experience has been that setting 
up a simple mechanism to replace it (and that actually works) is really 
difficult, because you have to find some place for the weakref itself to 
live, which usually means a global dictionary or something of that 
sort.  It would be nice if the gc or weakref modules grew a facility to 
make it easier to register finalization callbacks, and could optionally 
check whether you were registering a callback that referenced the thing you 
were tying the callback's life to.
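
For what it's worth, the smallest registry sketch I can think of that
keeps the weakrefs alive and catches the obvious mistake looks something
like this (a sketch only, not a concrete proposal; the names are made up):

    import weakref

    _live_refs = set()    # module-level home for the weakrefs themselves

    def register_finalizer(obj, callback):
        # callback() runs after obj is collected; it must not reference obj
        if getattr(callback, '__self__', None) is obj:
            raise TypeError("callback would keep the object alive forever")
        def _run(ref):
            _live_refs.discard(ref)
            callback()
        ref = weakref.ref(obj, _run)
        _live_refs.add(ref)
        return ref

The point being that the registry, not the client object, owns the
weakref, so the callback still fires after the client itself is gone.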


From pje at telecommunity.com  Tue May  1 18:14:19 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 12:14:19 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705010711w3d533820m53b1eeef229c7131@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430200255.04b88e10@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430200255.04b88e10@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501121143.02d31398@sparrow.telecommunity.com>

At 10:11 AM 5/1/2007 -0400, Jim Jewett wrote:
>On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>At 07:29 PM 4/30/2007 -0400, Jim Jewett wrote:
>> >On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>
>> >>PEP 3115, however, requires that a class' metaclass be determined
>> >>*before* the class body has executed, making it impossible to use this
>> >>technique for class decoration any more.
>
>> >It doesn't say what that metaclass has to do, though.
>
>> >Is there any reason the metaclass couldn't delegate differently
>> >depending on the value of __my_magic_attribute__ ?
>
>>Sure -- that's what I suggested in the "super(), class decorators, and PEP
>>3115" thread, but Guido voted -1 on adding such a magic attribute to PEP
>>3115.
>
>I don't think we're understanding each other.

Yup, and we're still not now.  :)  Or at least, I don't understand what the 
code below does, or more precisely, why it's different from just having a 
__decorators__ list containing direct callbacks.  The extra indirection of 
having an "after hooks" registry and separate attributes doesn't appear to 
add anything, although if it turned out you really needed it, you could 
just add a callback to __decorators__ that did it.

>   Why couldn't you use a
>suitably fleshed-out version of:
>
>class _ConditionalMetaclass(type):
>
>    def __init__(cls, name, bases, dct):
>        super(_ConditionalMetaclass, cls).__init__(name, bases, dct)
>        hooks = [(k, v) for (k, v) in dct.items() if k.startswith("_afterhook_")]
>        for k, v in hooks:
>            cls = AfterHooksRegistry[k](cls, v)
>
>-jJ


From martin at v.loewis.de  Tue May  1 18:14:10 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 01 May 2007 18:14:10 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>
References: <46371BD2.7050303@v.loewis.de>
	<43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>
Message-ID: <46376752.2070007@v.loewis.de>

>> Python code is written by many people in the world who are not
>> familiar with the English language, or even well-acquainted with the
>> Latin writing system.
> [snip]
> 
> That makes absolutely no sense. You mean to tell me that people write
> Python without being able to understand any of the language's
> keywords, builtin functions, standard library or documentation?

Exactly so. They have natural-language documentation, by means of
books and literal translations of the Python documentation, and they
don't try to grasp the meaning of the identifiers (e.g. I only yesterday
learned what a "hub" is, as in "hub-and-spoke". I accepted it to mean
"networking device that forwards packets" before. Many people around
here think that ASCII is pronounced A-S-C-two, i.e. II stands for a
Roman numeral - and these people did have some English training.)

I still don't understand why the "no operation" statement is called
"pass" - it's not the opposite of "fail", and seems to have no
relationship to "can you pass me the butter, please?".

The point is that even though many people get some passive knowledge
of English over time, they have a hard time with active usage of the
language. So when they need to come up with identifiers and put comments
into the code, they use their first language. See the comments for PEP
328 in

http://python.com.ua/ru/news/2006/09/20/nakonets-to-vyishel-python-25/

(I'm sure I can also find code with transliterated identifiers in the
 net, but finding that is a bit more tedious, so I would prefer it if
 you trust me on that).

Regards,
Martin

From jjb5 at cornell.edu  Tue May  1 18:16:08 2007
From: jjb5 at cornell.edu (Joel Bender)
Date: Tue, 01 May 2007 12:16:08 -0400
Subject: [Python-3000] Breakthrough in thinking about ABCs (PEPs 3119
 and 3141)
In-Reply-To: <5.1.1.6.0.20070430205955.04953100@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430205955.04953100@sparrow.telecommunity.com>
Message-ID: <463767C8.5070608@cornell.edu>

Phillip J. Eby wrote:

>> Personally, I still think that the most uniform way of spelling this
>> is overloading isinstance and issubclass; that has the highest
>> likelihood of standardizing the spelling for such inquiries.
> 
> A big +1 here.  This is no different than e.g. operator.mul() being able to 
> do different things depending on the second argument.

n00b here, trying to follow this...

     class X:
         def __mul__(self, y): print self, "mul", y
         def __rmul__(self, y): print self, "rmul", y

Treating isinstance like operator.mul, I could do this (and I would 
expect that you want to make it a class method)...

     class Y:
         @classmethod
         def __risinstance__(cls, obj): print obj, "is instance of", cls

So issubclass(D, C) would call D.__issubclass__(C) or 
C.__rissubclass__(D) and leave it up to the programmer.  The former is 
"somebody is checking to see if I inherit some functionality" and the 
latter is "somebody is checking to see if something is a proper derived 
class of me".

     class A(object):
         @classmethod
         def __rissubclass__(cls, subcls):
             if not object.__rissubclass__(cls, subcls):
                 return False
             return subcls.f is not A.f

         def f(self):
             raise RuntimeError, "f must be overridden"

     class B(A):
         def g(self): print "B.g"

     class C(A):
         def f(self): print "C.f"

Now my testing can check issubclass(B, A) and it will fail because B.f 
hasn't been provided, but issubclass(C, A) passes.  I don't have to call 
B().f() and have it fail; it might be expensive to create a B().


Joel


From martin at v.loewis.de  Tue May  1 18:19:02 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 01 May 2007 18:19:02 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <43aa6ff70705010905l3f87d57ck5a8f5597a6de9dab@mail.gmail.com>
References: <46371BD2.7050303@v.loewis.de> <f178kn$s0n$2@sea.gmane.org>	
	<4637631A.6030702@v.loewis.de>
	<43aa6ff70705010905l3f87d57ck5a8f5597a6de9dab@mail.gmail.com>
Message-ID: <46376876.1010803@v.loewis.de>

> Reading from
> http://mail.python.org/pipermail/python-3000/2006-April/001474.html,
> the message that prompted this particular addition to PEP 3099, "I
> want good Unicode support for string literals and comments. Everything
> else in the language ought to be ASCII."
> 
> Identifiers aren't string literals or comments.

Sure, but please follow the follow-up communication also.

If this is to be rejected, I'd rather get a PEP number and an explicit
rejection, instead of having to guess.

Regards,
Martin

From jimjjewett at gmail.com  Tue May  1 18:19:43 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 12:19:43 -0400
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46371BD2.7050303@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de>
Message-ID: <fb6fbf560705010919u6e765c7av39b81f3fd17e5dba@mail.gmail.com>

On 5/1/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:

> The identifier syntax is <ID_Start> <ID_Continue>\*.

> ID_Start is defined as all characters having one of the general
> categories uppercase letters (Lu), lowercase letters (Ll), titlecase
> letters (Lt), modifier letters (Lm), other letters (Lo), letter
> numbers (Nl), plus the underscore (XXX what are "stability extensions
> listed in UAX 31).

Are you sure that modifier letters should be included?  The standard
says so, but as nearly as I can tell, these are really more like
diacritics -- and some of them look an awful lot like punctuation.

    http://unicode.org/charts/PDF/U02B0.pdf
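
(For anyone checking individual characters against these categories,
unicodedata already answers the question; U+02B0 is MODIFIER LETTER
SMALL H:)

    >>> import unicodedata
    >>> unicodedata.category(u'\u02b0')
    'Lm'
    >>> unicodedata.category(u'a'), unicodedata.category(u'_')
    ('Ll', 'Pc')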

-jJ

From talin at acm.org  Tue May  1 18:13:29 2007
From: talin at acm.org (Talin)
Date: Tue, 01 May 2007 09:13:29 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
Message-ID: <46376729.9000008@acm.org>

Phillip J. Eby wrote:
> Proceeding to the "Next" Method
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> If the first parameter of an overloaded function is named
> ``__proceed__``, it will be passed a callable representing the next
> most-specific method.  For example, this code::
> 
>      def foo(bar:object, baz:object):
>          print "got objects!"
> 
>      @overload
>      def foo(__proceed__, bar:int, baz:int):
>          print "got integers!"
>          return __proceed__(bar, baz)

I don't care for the idea of testing against a specially named argument. 
Why couldn't you just have a different decorator, such as 
"overload_chained" which triggers this behavior?

-- Talin


From martin at v.loewis.de  Tue May  1 18:24:49 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 01 May 2007 18:24:49 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <fb6fbf560705010919u6e765c7av39b81f3fd17e5dba@mail.gmail.com>
References: <46371BD2.7050303@v.loewis.de>
	<fb6fbf560705010919u6e765c7av39b81f3fd17e5dba@mail.gmail.com>
Message-ID: <463769D1.5020505@v.loewis.de>

Jim Jewett schrieb:
> On 5/1/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> 
>> The identifier syntax is <ID_Start> <ID_Continue>\*.
> 
>> ID_Start is defined as all characters having one of the general
>> categories uppercase letters (Lu), lowercase letters (Ll), titlecase
>> letters (Lt), modifier letters (Lm), other letters (Lo), letter
>> numbers (Nl), plus the underscore (XXX what are "stability extensions
>> listed in UAX 31).
> 
> Are you sure that modifier letters should be included?  The standard
> says so, but as nearly as I can tell, these are really more like
> diacritics -- and some of them look an awful lot like punctuation.

Interesting question. I included them because the standard says so,
but I don't see an inherent need. I'll see whether I can find some
rationale as to why they were included in UAX 31, and then check
whether that rationale applies to Python as well.

Regards,
Martin

From pje at telecommunity.com  Tue May  1 18:27:50 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 12:27:50 -0400
Subject: [Python-3000] Why isinstance() and issubclass() don't need to be
	unforgeable
Message-ID: <5.1.1.6.0.20070501121527.02f47f90@sparrow.telecommunity.com>

I just wanted to throw in a note for those who are upset with the idea that 
classes should be able to decide how isinstance() and issubclass() 
work.  If you want "true, unforgeable" isinstance and subclass, you can 
still use these formulas:


def true_issubclass(C1, C2):
     return C2 in type.__mro__.__get__(C1)

def isinstance_no_proxy(o, C):
     return true_issubclass(type(o), C)

def isinstance_with_proxy(o, C):
     cls = getattr(o, '__class__', None)
     return true_issubclass(cls, C) or isinstance_no_proxy(o, C)


Their complexity reflects the fact that they rely on implementation details 
which the vast majority of code should not care about.

So, if you really have a need to find out whether something is truly an 
instance of something for *structural* reasons, you will still be able to 
do that.  Yes, it will be a pain.  But deliberately inducing structural 
dependencies *should* be painful, because you're making it painful for the 
*users* of your code, whenever you impose isinstance/issubclass checks 
beyond necessity.

The fact that it's currently *not* painful, is precisely what makes it such 
a good idea to add the new hooks to make these operations forgeable.

The default, in other words, should not be to care about what objects 
*are*, only what they *claim* to be.
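
(A quick demonstration of the existing forgeability, and of the structural
check still working; FakeInt is made up for illustration:)

    def true_issubclass(C1, C2):            # as defined above
        return C2 in type.__mro__.__get__(C1)

    class FakeInt(object):
        __class__ = int                     # proxies already lie this way

    obj = FakeInt()
    print(isinstance(obj, int))             # True  -- the claim is believed
    print(true_issubclass(type(obj), int))  # False -- the structural truth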


From l.mastrodomenico at gmail.com  Tue May  1 18:27:30 2007
From: l.mastrodomenico at gmail.com (Lino Mastrodomenico)
Date: Tue, 1 May 2007 18:27:30 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46376752.2070007@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de>
	<43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>
	<46376752.2070007@v.loewis.de>
Message-ID: <cc93256f0705010927u72b04ffeo14fdbea2112ce5b6@mail.gmail.com>

2007/5/1, "Martin v. L?wis" <martin at v.loewis.de>:
> The point is that even though many people get some passive knowledge
> of English over time, they have a hard time with active usage of the
> language. So when they need to come up with identifiers and put comments
> into the code, they use their first language. See the comments for PEP
> 328 in
>
> http://python.com.ua/ru/news/2006/09/20/nakonets-to-vyishel-python-25/
>
> (I'm sure I can also find code with transliterated identifiers in the
>  net, but finding that is bit more tedious, so I would prefer if
>  you trust me on that).

If this can help the discussion, the first example of Python code in
the Italian translation of the tutorial is:

>>> il_mondo_è_piatto = 1
>>> if il_mondo_è_piatto:
...     print "Occhio a non caderne fuori!"
...
Occhio a non caderne fuori!

http://python.it/doc/Python-Docs/html/tut/node4.html

Please note the "?" character in the variable name. And yes, this code
used to work out of the box (AFAIK at least until Python 2.2).

-- 
Lino Mastrodomenico
E-mail: l.mastrodomenico at gmail.com

From pje at telecommunity.com  Tue May  1 18:36:17 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 12:36:17 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <46376729.9000008@acm.org>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>

At 09:13 AM 5/1/2007 -0700, Talin wrote:
>Phillip J. Eby wrote:
>>Proceeding to the "Next" Method
>>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>If the first parameter of an overloaded function is named
>>``__proceed__``, it will be passed a callable representing the next
>>most-specific method.  For example, this code::
>>      def foo(bar:object, baz:object):
>>          print "got objects!"
>>      @overload
>>      def foo(__proceed__, bar:int, baz:int):
>>          print "got integers!"
>>          return __proceed__(bar, baz)
>
>I don't care for the idea of testing against a specially named argument. 
>Why couldn't you just have a different decorator, such as 
>"overload_chained" which triggers this behavior?

The PEP lists *five* built-in decorators, all of which support this behavior::

    @overload, @when, @before, @after, @around

And in addition, it demonstrates how to create *new* method combination 
decorators, that *also* support this behavior (e.g. '@discount').

All in all, there are an unbounded number of possible decorators that would 
require chained and non-chained variations.

The other alternative would be to have a "magic" function like 
"get_next_method()" that you could call, but the setup for such an animal 
is more complex and would likely involve either sys._getframe() or some 
kind of special thread variable(s).  Performance would also be reduced for 
*all* generic function invocations, because the setup would have to occur 
whether or not chaining was happening.  The argument list technique allows 
the overhead to happen only once, and only when it's needed.
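
(To make the cost argument concrete, here is a toy chainer -- emphatically
not PEP 3124's implementation -- which pre-binds the next method as an
argument only for functions that asked for it, so everyone else pays
nothing:)

    def chain(methods):
        # methods: list of (func, wants_next) pairs, most-specific first
        def make_proceed(i):
            def proceed(*args, **kw):
                if i >= len(methods):
                    raise TypeError("no next method")
                func, wants_next = methods[i]
                if wants_next:
                    return func(make_proceed(i + 1), *args, **kw)
                return func(*args, **kw)
            return proceed
        return make_proceed(0)

    def foo_default(bar, baz):
        return "got objects!"

    def foo_ints(__proceed__, bar, baz):
        return "got integers! then " + __proceed__(bar, baz)

    foo = chain([(foo_ints, True), (foo_default, False)])
    print(foo(1, 2))    # got integers! then got objects!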

One new possibility, however...  suppose we did it like this:

      from overloading import next_method

      @overload
      def foo(blah:next_method, ...):

That is, if we used an argument *annotation* to designate the argument that 
would receive the next method?  For efficiency's sake, it would still need 
to be the first argument, but at least the special name would go away, and 
you could call it whatever you like.


From janssen at parc.com  Tue May  1 18:42:43 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 1 May 2007 09:42:43 PDT
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <F6F8D381-EC6F-4F6C-95F1-339476B78F47@python.org> 
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
	<1d36917a0704300816ma3bf9c2o4dd674cfcefa9172@mail.gmail.com>
	<-3456230403858254882@unknownmsgid>
	<740c3aec0704301501u7df7b5a6uaea854d4716eb87e@mail.gmail.com>
	<F6F8D381-EC6F-4F6C-95F1-339476B78F47@python.org>
Message-ID: <07May1.094247pdt."57996"@synergy1.parc.xerox.com>

> To me, interfaces and/or generic functions strike the right balance.

I agree.  As I've said before, if this was 1994 I think I'd be in the
PJE camp and prefer generic functions.  As it is, I think interfaces
better fit the current state of Python.  And I think the existing type
system has everything that's needed to indicate interfaces.  All we
need are some base definitions to stand on (dict, sequence, file, etc.).

> Such tools are completely invisible for Python programmers who don't
> care about them (the vast majority).  They're also essential for a
> very small subclass of very important Python applications.

Yep, those of us who write very large Python applications.

> If ABCs can walk that same tightrope of utility and invisibility,
> then maybe they'll successfully fill that niche.

I'm sure they will.

Bill

From pje at telecommunity.com  Tue May  1 18:44:49 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 12:44:49 -0400
Subject: [Python-3000] Breakthrough in thinking about ABCs (PEPs 3119
 and 3141)
In-Reply-To: <463767C8.5070608@cornell.edu>
References: <5.1.1.6.0.20070430205955.04953100@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430205955.04953100@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501124243.068aa3a0@sparrow.telecommunity.com>

At 12:16 PM 5/1/2007 -0400, Joel Bender wrote:
>So issubclass(D, C) would call D.__issubclass__(C) or
>C.__rissubclass__(D) and leave it up to the programmer.

Yes, except there's only the '__r__' versions and they're not called that.


>   The former is
>"somebody is checking to see if I inherit some functionality" and the
>latter is "somebody is checking to see if something is a proper derived
>class of me".
>
>      class A(object):
>          @classmethod
>          def __rissubclass__(cls, subcls):
>              if not object.__rissubclass__(cls, subcls):
>                  return False
>              return subcls.f is not A.f
>
>          def f(self):
>              raise RuntimeError, "f must be overridden"
>
>      class B(A):
>          def g(self): print "B.g"
>
>      class C(A):
>          def f(self): print "C.f"
>
>Now my testing can check issubclass(B, A) and it will fail because B.f
>hasn't been provided, but issubclass(C, A) passes.  I don't have to call
>B().f() and have it fail, it might be expensive to create a B().

Right; you've just pointed out something important that hasn't been stated 
until now.  The objections to this extension have focused on the idea that 
this makes type checking less strict, but you've just demonstrated that it 
can actually be used to make it *more* strict.  I hadn't thought of that.


From pje at telecommunity.com  Tue May  1 18:45:43 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 12:45:43 -0400
Subject: [Python-3000] Derivation of "pass" in Python (was Re: PEP:
 Supporting Non-ASCII Identifiers)
In-Reply-To: <46376752.2070007@v.loewis.de>
References: <43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>
	<46371BD2.7050303@v.loewis.de>
	<43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>
Message-ID: <5.1.1.6.0.20070501123738.05263610@sparrow.telecommunity.com>

At 06:14 PM 5/1/2007 +0200, Martin v. Löwis wrote:
>I still don't understand why the "no operation" statement is called
>"pass" - it's not the opposite of "fail", and seems to have no
>relationship to "can you pass me the butter, please?".

Actually, it does, in the sense that to "pass" on something means to give 
up the chance to take it.  So, if butter is being passed around the dinner 
table, one who chooses not to take it, but passes it on to the next person, 
is said to be "passing on" (i.e. conceding the opportunity).

Thus, when someone is offered something, they may say, "I'll pass", meaning 
they are declining to act.  Ergo, to "pass" in Python is to pass up the
opportunity to act.


From guido at python.org  Tue May  1 18:48:43 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 09:48:43 -0700
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <fb6fbf560705010731i719da8efibc14c72e6175053d@mail.gmail.com>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>
	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
	<46368EE5.6050409@canterbury.ac.nz> <4636AE9E.2020905@acm.org>
	<fb6fbf560705010731i719da8efibc14c72e6175053d@mail.gmail.com>
Message-ID: <ca471dc20705010948o5b348fb3hf3f03e4cdf1dbb16@mail.gmail.com>

On 5/1/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 4/30/07, Talin <talin at acm.org> wrote:
> > Greg Ewing wrote:
> > > Patrick Maupin wrote:
>
> > >> Method calls are deliberately disallowed by the PEP, so that the
> > >> implementation has some hope of being securable.
>
> > > If attribute access is allowed, arbitrary code can already
> > > be triggered, so I don't see how this makes a difference
> > > to security.
>
> > Not quite. It depends on what you mean by 'arbitrary code'. ...
>
> If I understood that correctly, then
>
> (1)  The format string cannot run arbitrary code, but
> (2)  The formatted objects themselves can.
>
> This is probably a feature, since you can pass proxy objects, but it
> should definately be called out explicitly in the security section
> (currently just some text in Simple and Compound Names section).
> Example Text:
>
>
> Note that while (literal strings used as) format strings are
> effectively sandboxed, the formatted objects themselves are not.
>
>     "My name is {0[name]}".format(evil_map)
>
> would still allow evil_map to run arbitrary code.

And how on earth would that be a security threat?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From janssen at parc.com  Tue May  1 18:48:47 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 1 May 2007 09:48:47 PDT
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <463690E5.6060603@canterbury.ac.nz> 
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
	<438708814690534630@unknownmsgid>
	<79990c6b0704301001ga0d2429sdaded9ac75fa15c5@mail.gmail.com>
	<07Apr30.141916pdt.57996@synergy1.parc.xerox.com>
	<463690E5.6060603@canterbury.ac.nz>
Message-ID: <07May1.094853pdt."57996"@synergy1.parc.xerox.com>

Greg Ewing writes:
> But I don't think there is any such definition, and
> the confusion arises because people lazily use the
> vague term "file-like" instead of spelling out what
> they really mean ("has a read() method", etc.)

Yes, I agree with this.  That's why
http://wiki.python.org/moin/AbstractBaseClasses?highlight=%28AbstractBaseClasses%29
splits the "file-like" interface into a number of composable pieces.

Bill

From martin at v.loewis.de  Tue May  1 18:54:49 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 01 May 2007 18:54:49 +0200
Subject: [Python-3000] Derivation of "pass" in Python (was Re: PEP:
 Supporting Non-ASCII Identifiers)
In-Reply-To: <5.1.1.6.0.20070501123738.05263610@sparrow.telecommunity.com>
References: <43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>
	<46371BD2.7050303@v.loewis.de>
	<43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>
	<5.1.1.6.0.20070501123738.05263610@sparrow.telecommunity.com>
Message-ID: <463770D9.3050405@v.loewis.de>

> Thus, when someone is offered something, they may say, "I'll pass",
> meaning they are declining to act.  Ergo, to "pass" in Python is to
> pass up the opportunity to act.

Ah, ok. It would then be similar to "Passe!" in German, which is
used in card games when you don't play a card but instead hand
over to the next player. Even though this is clearly the same
ancestry, it never occurred to me that the same meaning is also
present in English (also, "passen" is somewhat oldish now, so
I don't use it actively myself).

Regards,
Martin

From janssen at parc.com  Tue May  1 18:54:38 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 1 May 2007 09:54:38 PDT
Subject: [Python-3000] Traits/roles instead of ABCs
In-Reply-To: <1d36917a0704302031hd34ffcfu2eee879aef426931@mail.gmail.com> 
References: <43aa6ff70704291840s3384824et44ebfd360c15eda@mail.gmail.com>
	<014201c78adc$ca70d960$f101a8c0@RaymondLaptop1>
	<1d36917a0704300816ma3bf9c2o4dd674cfcefa9172@mail.gmail.com>
	<-3456230403858254882@unknownmsgid>
	<740c3aec0704301501u7df7b5a6uaea854d4716eb87e@mail.gmail.com>
	<1d36917a0704302031hd34ffcfu2eee879aef426931@mail.gmail.com>
Message-ID: <07May1.095448pdt."57996"@synergy1.parc.xerox.com>

Alan McIntyre writes:
> I have a nagging concern that these additions will
> clutter up the core, and--no matter how hard you try--adding them is
> going to have an impact on "run-of-the-mill" users of the language.

I don't think that will be the case, if we just use ABCs.  There will
be a definition somewhere of the basic type APIs, but the normal user
will still just say "f=open(FILENAME)" or "d={}" without really caring
what APIs the value returned by "open" or "{}" support -- the current
informal and fuzzy understanding of the type will suffice, just as it
does now.  Those who do care, however, will be able to find out, and
use that information.

Bill

From guido at python.org  Tue May  1 18:58:03 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 09:58:03 -0700
Subject: [Python-3000] BList PEP
In-Reply-To: <eae285400705010000l2af0e890ifc8c2e0de8219961@mail.gmail.com>
References: <eae285400705010000l2af0e890ifc8c2e0de8219961@mail.gmail.com>
Message-ID: <ca471dc20705010958o15121023ke7546e5a1a1d23f3@mail.gmail.com>

On 5/1/07, Daniel Stutzbach <daniel at stutzbachenterprises.com> wrote:
> PEP: 30XX
> Title: BList: A faster list-like type

Checked in as PEP 3128.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fumanchu at amor.org  Tue May  1 18:58:32 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Tue, 1 May 2007 09:58:32 -0700
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46371BD2.7050303@v.loewis.de>
Message-ID: <435DF58A933BA74397B42CDEB8145A860B745DEF@ex9.hostedexchange.local>

Martin v. Löwis wrote:
> Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
> 
> Common Objections
> =================
> 
> People claim that they will not be able to use a library if to do so
> they have to use characters they cannot type on their
> keyboards. However, it is the choice of the designer of the library to
> decide on various constraints for using the library: people may not be
> able to use the library because they cannot get physical access to the
> source code (because it is not published), or because licensing
> prohibits usage, or because the documentation is in a language they
> cannot understand. A developer wishing to make a library widely
> available needs to make a number of explicit choices (such as
> publication, licensing, language of documentation, and language of
> identifiers). It should always be the choice of the author to make
> these decisions - not the choice of the language designers.

That seems true when each such decision is considered in isolation. But the language designers are responsible for making sure the number of such explicit decisions/choices does not grow beyond a reasonable limit.


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From guido at python.org  Tue May  1 19:07:31 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 10:07:31 -0700
Subject: [Python-3000] Why isinstance() and issubclass() don't need to
	be unforgeable
In-Reply-To: <5.1.1.6.0.20070501121527.02f47f90@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070501121527.02f47f90@sparrow.telecommunity.com>
Message-ID: <ca471dc20705011007g7e3d8e2dyf03587ffccea6bf2@mail.gmail.com>

On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> I just wanted to throw in a note for those who are upset with the idea that
> classes should be able to decide how isinstance() and issubclass()
> work.  If you want "true, unforgeable" isinstance and subclass, you can
> still use these formulas:
>
>
> def true_issubclass(C1, C2):
>      return C2 in type.__mro__.__get__(C1)
>
> def isinstance_no_proxy(o, C):
>      return true_issubclass(type(o), C)
>
> def isinstance_with_proxy(o, C):
>      cls = getattr(o, '__class__', None)
>      return true_issubclass(cls, C) or isinstance_no_proxy(o, C)
>
>
> Their complexity reflects the fact that they rely on implementation details
> which the vast majority of code should not care about.
>
> So, if you really have a need to find out whether something is truly an
> instance of something for *structural* reasons, you will still be able to
> do that.  Yes, it will be a pain.  But deliberately inducing structural
> dependencies *should* be painful, because you're making it painful for the
> *users* of your code, whenever you impose isinstance/issubclass checks
> beyond necessity.
>
> The fact that it's currently *not* painful, is precisely what makes it such
> a good idea to add the new hooks to make these operations forgeable.
>
> The default, in other words, should not be to care about what objects
> *are*, only what they *claim* to be.

(Or what is claimed about them!)

Thanks for writing this note!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May  1 19:17:25 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 10:17:25 -0700
Subject: [Python-3000] PEP index out of date, and work-around
Message-ID: <ca471dc20705011017n291992eenb4a1c88c803c44eb@mail.gmail.com>

There seems to be an issue with the PEP index:
http://python.org/dev/peps/ lists PEP 3122 as the last PEP (not
counting PEP 3141 which is deliberately out of sequence). As a
work-around, an up to date index is here:

  http://python.org/dev/peps/pep-0000/

PEPs 3123-3128 are alive and well and reachable via this index.

One of the webmasters will look into this tonight.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May  1 19:19:21 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 10:19:21 -0700
Subject: [Python-3000] Adding class decorators to PEP 318
In-Reply-To: <43aa6ff70705010852g112924a2hbf13f31d83631a85@mail.gmail.com>
References: <43aa6ff70705010852g112924a2hbf13f31d83631a85@mail.gmail.com>
Message-ID: <ca471dc20705011019o68069256mf2b8678a17692f71@mail.gmail.com>

I don't like this -- it seems like rewriting history to me. I'd rather
leave PEP 318 alone and create a new PEP. Of course the new PEP can be
short because it can refer to PEP 318.

--Guido

On 5/1/07, Collin Winter <collinw at gmail.com> wrote:
> In talking to Neal Norwitz about this, I don't see a need for a
> separate PEP for class decorators; we already have a decorators PEP,
> #318. The following is a proposed patch to PEP 318 that adds in class
> decorators.
>
> Collin Winter
>
>
> Index: pep-0318.txt
> ===================================================================
> --- pep-0318.txt        (revision 55034)
> +++ pep-0318.txt        (working copy)
> @@ -1,5 +1,5 @@
>  PEP: 318
> -Title: Decorators for Functions and Methods
> +Title: Decorators for Functions, Methods and Classes
>  Version: $Revision$
>  Last-Modified: $Date$
>  Author: Kevin D. Smith, Jim Jewett, Skip Montanaro, Anthony Baxter
> @@ -9,7 +9,7 @@
>  Created: 05-Jun-2003
>  Python-Version: 2.4
>  Post-History: 09-Jun-2003, 10-Jun-2003, 27-Feb-2004, 23-Mar-2004, 30-Aug-2004,
> -              2-Sep-2004
> +              2-Sep-2004, 30-Apr-2007
>
>
>  WarningWarningWarning
> @@ -22,24 +22,40 @@
>  negatives of each form.
>
>
> +UpdateUpdateUpdate
> +==================
> +
> +In April 2007, this PEP was updated to reflect the evolution of the Python
> +community's attitude toward class decorators.  Though they had previously
> +been rejected as too obscure and with limited use-cases, by mid-2006,
> +class decorators had come to be seen as the logical next step, with some
> +wondering why they hadn't been included originally.  As a result, class
> +decorators will ship in Python 2.6.
> +
> +This PEP has been modified accordingly, with references to class decorators
> +injected into the narrative.  While some references to the lack of class
> +decorators have been left in place to preserve the historical record, others
> +have been removed for the sake of coherence.
> +
> +
>  Abstract
>  ========
>
> -The current method for transforming functions and methods (for instance,
> -declaring them as a class or static method) is awkward and can lead to
> -code that is difficult to understand.  Ideally, these transformations
> -should be made at the same point in the code where the declaration
> -itself is made.  This PEP introduces new syntax for transformations of a
> -function or method declaration.
> +The current method for transforming functions, methods and classes (for
> +instance, declaring a method as a class or static method) is awkward and
> +can lead to code that is difficult to understand.  Ideally, these
> +transformations should be made at the same point in the code where the
> +declaration itself is made.  This PEP introduces new syntax for
> +transformations of a function, method or class declaration.
>
>
>  Motivation
>  ==========
>
> -The current method of applying a transformation to a function or method
> -places the actual transformation after the function body.  For large
> -functions this separates a key component of the function's behavior from
> -the definition of the rest of the function's external interface.  For
> +The current method of applying a transformation to a function, method or class
> +places the actual transformation after the body.  For large
> +code blocks this separates a key component of the object's behavior from
> +the definition of the rest of the object's external interface.  For
>  example::
>
>      def foo(self):
> @@ -69,14 +85,22 @@
>  are not as immediately apparent.  Almost certainly, anything which could
>  be done with class decorators could be done using metaclasses, but
>  using metaclasses is sufficiently obscure that there is some attraction
> -to having an easier way to make simple modifications to classes.  For
> -Python 2.4, only function/method decorators are being added.
> +to having an easier way to make simple modifications to classes.  The
> +following is much clearer than the metaclass-based alternative::
>
> +    @singleton
> +    class Foo(object):
> +        pass
>
> +Because of the greater ease-of-use of class decorators and the symmetry
> +with function and method decorators, class decorators will be included in
> +Python 2.6.
> +
> +
>  Why Is This So Hard?
>  --------------------
>
> -Two decorators (``classmethod()`` and ``staticmethod()``) have been
> +Two method decorators (``classmethod()`` and ``staticmethod()``) have been
>  available in Python since version 2.2.  It's been assumed since
>  approximately that time that some syntactic support for them would
>  eventually be added to the language.  Given this assumption, one might
> @@ -135,11 +159,16 @@
>  .. _gareth mccaughan:
>     http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=slrna40k88.2h9o.Gareth.McCaughan%40g.local
>
> -Class decorations seem like an obvious next step because class
> +Class decorations seemed like an obvious next step because class
>  definition and function definition are syntactically similar,
> -however Guido remains unconvinced, and class decorators will almost
> -certainly not be in Python 2.4.
> +however Guido was not convinced of their usefulness, and class
> +decorators were not in Python 2.4.  `The issue was revisited`_ in March 2006
> +and sufficient use-cases were found to justify the inclusion of class
> +decorators in Python 2.6.
>
> +.. _The issue was revisited:
> +   http://mail.python.org/pipermail/python-dev/2006-March/062942.html
> +
>  The discussion continued on and off on python-dev from February
>  2002 through July 2004.  Hundreds and hundreds of posts were made,
>  with people proposing many possible syntax variations.  Guido took
> @@ -147,8 +176,8 @@
>  place.  Subsequent to this, he decided that we'd have the `Java-style`_
>  @decorator syntax, and this appeared for the first time in 2.4a2.
>  Barry Warsaw named this the 'pie-decorator' syntax, in honor of the
> -Pie-thon Parrot shootout which was occured around the same time as
> -the decorator syntax, and because the @ looks a little like a pie.
> +Pie-thon Parrot shootout which was occurring around the same time as
> +the decorator syntax debate, and because the @ looks a little like a pie.
>  Guido `outlined his case`_ on Python-dev, including `this piece`_
>  on some of the (many) rejected forms.
>
> @@ -250,6 +279,19 @@
>  decorators are near the function declaration.  The @ sign makes it clear
>  that something new is going on here.
>
> +Python 2.6's class decorators work similarly::
> +
> +    @dec2
> +    @dec1
> +    class Foo:
> +        pass
> +
> +This is equivalent to::
> +
> +    class Foo:
> +        pass
> +    Foo = dec2(dec1(Foo))
> +
>  The rationale for the `order of application`_ (bottom to top) is that it
>  matches the usual order for function-application.  In mathematics,
>  composition of functions (g o f)(x) translates to g(f(x)).  In Python,
> @@ -321,7 +363,7 @@
>  There have been a number of objections raised to this location -- the
>  primary one is that it's the first real Python case where a line of code
>  has an effect on a following line.  The syntax available in 2.4a3
> -requires one decorator per line (in a2, multiple decorators could be
> +requires one decorator per line (in 2.4a2, multiple decorators could be
>  specified on the same line).
>
>  People also complained that the syntax quickly got unwieldy when
> @@ -330,52 +372,61 @@
>  were small and thus this was not a large worry.
>
>  Some of the advantages of this form are that the decorators live outside
> -the method body -- they are obviously executed at the time the function
> +the function/class body -- they are obviously executed at the time the object
>  is defined.
>
> -Another advantage is that a prefix to the function definition fits
> +Another advantage is that a prefix to the definition fits
>  the idea of knowing about a change to the semantics of the code before
> -the code itself, thus you know how to interpret the code's semantics
> +the code itself. This way, you know how to interpret the code's semantics
>  properly without having to go back and change your initial perceptions
>  if the syntax did not come before the function definition.
>
>  Guido decided `he preferred`_ having the decorators on the line before
> -the 'def', because it was felt that a long argument list would mean that
> -the decorators would be 'hidden'
> +the 'def' or 'class', because it was felt that a long argument list would mean
> +that the decorators would be 'hidden'
>
>  .. _he preferred:
>      http://mail.python.org/pipermail/python-dev/2004-March/043756.html
>
> -The second form is the decorators between the def and the function name,
> -or the function name and the argument list::
> +The second form is the decorators between the 'def' or 'class' and the object's
> +name, or between the name and the argument list::
>
>      def @classmethod foo(arg1,arg2):
>          pass
> +
> +    class @singleton Foo(arg1, arg2):
> +        pass
>
>      def @accepts(int,int),@returns(float) bar(low,high):
>          pass
>
>      def foo @classmethod (arg1,arg2):
>          pass
> +
> +    class Foo @singleton (arg1, arg2):
> +        pass
>
>      def bar @accepts(int,int),@returns(float) (low,high):
>          pass
>
>  There are a couple of objections to this form.  The first is that it
> -breaks easily 'greppability' of the source -- you can no longer search
> +breaks easy 'greppability' of the source -- you can no longer search
>  for 'def foo(' and find the definition of the function.  The second,
>  more serious, objection is that in the case of multiple decorators, the
>  syntax would be extremely unwieldy.
>
>  The next form, which has had a number of strong proponents, is to have
>  the decorators between the argument list and the trailing ``:`` in the
> -'def' line::
> +'def' or 'class' line::
>
>      def foo(arg1,arg2) @classmethod:
>          pass
>
>      def bar(low,high) @accepts(int,int),@returns(float):
>          pass
> +
> +    class Foo(object) @singleton:
> +        pass
>
>  Guido `summarized the arguments`_ against this form (many of which also
>  apply to the previous form) as:
> @@ -403,15 +454,19 @@
>          @accepts(int,int)
>          @returns(float)
>          pass
> +
> +    class Foo(object):
> +        @singleton
> +        pass
>
>  The primary objection to this form is that it requires "peeking inside"
> -the method body to determine the decorators.  In addition, even though
> -the code is inside the method body, it is not executed when the method
> +the suite body to determine the decorators.  In addition, even though
> +the code is inside the suite body, it is not executed when the code
>  is run.  Guido felt that docstrings were not a good counter-example, and
>  that it was quite possible that a 'docstring' decorator could help move
>  the docstring to outside the function body.
>
> -The final form is a new block that encloses the method's code.  For this
> +The final form is a new block that encloses the function or class.  For this
>  example, we'll use a 'decorate' keyword, as it makes no sense with the
>  @syntax. ::
>
> @@ -425,9 +480,14 @@
>          returns(float)
>          def bar(low,high):
>              pass
> +
> +    decorate:
> +        singleton
> +        class Foo(object):
> +            pass
>
>  This form would result in inconsistent indentation for decorated and
> -undecorated methods.  In addition, a decorated method's body would start
> +undecorated code.  In addition, a decorated object's body would start
>  three indent levels in.
>
>
> @@ -444,6 +504,10 @@
>      @returns(float)
>      def bar(low,high):
>          pass
> +
> +    @singleton
> +    class Foo(object):
> +        pass
>
>    The major objections against this syntax are that the @ symbol is
>    not currently used in Python (and is used in both IPython and Leo),
> @@ -461,6 +525,10 @@
>      |returns(float)
>      def bar(low,high):
>          pass
> +
> +    |singleton
> +    class Foo(object):
> +        pass
>
>    This is a variant on the @decorator syntax -- it has the advantage
>    that it does not break IPython and Leo.  Its major disadvantage
> @@ -476,6 +544,10 @@
>      [accepts(int,int), returns(float)]
>      def bar(low,high):
>          pass
> +
> +    [singleton]
> +    class Foo(object):
> +        pass
>
>    The major objection to the list syntax is that it's currently
>    meaningful (when used in the form before the method).  It's also
> @@ -490,6 +562,10 @@
>      <accepts(int,int), returns(float)>
>      def bar(low,high):
>          pass
> +
> +    <singleton>
> +    class Foo(object):
> +        pass
>
>    None of these alternatives gained much traction. The alternatives
>    which involve square brackets only serve to make it obvious that the
> @@ -659,7 +735,10 @@
>  .. _subsequently rejected:
>       http://mail.python.org/pipermail/python-dev/2004-September/048518.html
>
> +For Python 2.6, the Python grammar and compiler were modified to allow
> +class decorators in addition to function and method decorators.
>
> +
>  Community Consensus
>  -------------------
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From foom at fuhm.net  Tue May  1 18:58:19 2007
From: foom at fuhm.net (James Y Knight)
Date: Tue, 1 May 2007 12:58:19 -0400
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <fb6fbf560705010919u6e765c7av39b81f3fd17e5dba@mail.gmail.com>
References: <46371BD2.7050303@v.loewis.de>
	<fb6fbf560705010919u6e765c7av39b81f3fd17e5dba@mail.gmail.com>
Message-ID: <C54B1073-F326-43E7-A7E4-098DDA506288@fuhm.net>


On May 1, 2007, at 12:19 PM, Jim Jewett wrote:

> On 5/1/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
>
>> The identifier syntax is <ID_Start> <ID_Continue>\*.
>
>> ID_Start is defined as all characters having one of the general
>> categories uppercase letters (Lu), lowercase letters (Ll), titlecase
>> letters (Lt), modifier letters (Lm), other letters (Lo), letter
>> numbers (Nl), plus the underscore (XXX what are "stability extensions
>> listed in UAX 31).
>
> Are you sure that modifier letters should be included?  The standard
> says so, but as nearly as I can tell, these are really more like
> diacritics -- and some of them look an awful lot like punctuation.
>
>     http://unicode.org/charts/PDF/U02B0.pdf

The entire point of these characters is that they are to be treated  
as letters (that is, can make up part of a word). If they were  
punctuation or diacritics, the other very-similar-looking characters  
in other parts of the codespace could be used. These letters seem to  
be mainly intended for spelling out phonetic pronunciations. It's  
unlikely that anyone would want to write a Python identifier in IPA,  
but that's not a good reason to go against the standard.
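
For reference, a rough sketch of the ID_Start test being discussed, using
unicodedata; the underscore and the UAX 31 stability extensions are left
out here:

    import unicodedata

    def is_id_start(ch):
        # Lu, Ll, Lt, Lm, Lo, Nl, per the definition quoted above
        return unicodedata.category(ch) in ('Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Nl')

    print(is_id_start(u'\u02b0'))   # MODIFIER LETTER SMALL H, category Lm -> True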

James

From martin at v.loewis.de  Tue May  1 19:39:44 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 01 May 2007 19:39:44 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <435DF58A933BA74397B42CDEB8145A860B745DEF@ex9.hostedexchange.local>
References: <435DF58A933BA74397B42CDEB8145A860B745DEF@ex9.hostedexchange.local>
Message-ID: <46377B60.1030501@v.loewis.de>

Robert Brewer schrieb:
> Martin v. Löwis wrote:
>> Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
>> 
>> Common Objections =================
>> 
>> People claim that they will not be able to use a library if to do
>> so they have to use characters they cannot type on their keyboards.
>> However, it is the choice of the designer of the library to decide
>> on various constraints for using the library: people may not be 
>> able to use the library because they cannot get physical access to
>> the source code (because it is not published), or because licensing
>>  prohibits usage, or because the documentation is in a language
>> they cannot understand. A developer wishing to make a library
>> widely available needs to make a number of explicit choices (such
>> as publication, licensing, language of documentation, and language
>> of identifiers). It should always be the choice of the author to
>> make these decisions - not the choice of the language designers.
> 
> That seems true when each such decision is considered in isolation.
> But the language designers are responsible to make sure the number of
> such explicit decisions/choices does not grow beyond a reasonable
> limit.

Right. However, it is already the developer's choice today to use
English-based identifiers, or identifiers from a different language
written in transliteration.  So offering support for the correct script
when they have chosen to use native-language identifiers does not
really change the number of explicit decisions.

Regards,
Martin


From jimjjewett at gmail.com  Tue May  1 20:13:40 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 14:13:40 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070501121143.02d31398@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430200255.04b88e10@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501121143.02d31398@sparrow.telecommunity.com>
Message-ID: <fb6fbf560705011113n2ad71706n5f1a72127fd316c8@mail.gmail.com>

On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 10:11 AM 5/1/2007 -0400, Jim Jewett wrote:
> >On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> >> >On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:

> >> >>PEP 3115, however, requires that a class' metaclass be determined
> >> >>*before* the class body has executed, making it impossible to use this
> >> >>technique for class decoration any more.

...

> >>Sure -- that's what I suggested in the "super(), class decorators, and PEP
> >>3115" thread, but Guido voted -1 on adding such a magic attribute to PEP
> >>3115.

> >I don't think we're understanding each other.

> Yup, and we're still not now.  :)  Or at least, I don't understand what the
> code below does, or more precisely, why it's different from just having a
> __decorators__ list containing direct callbacks.

That would be fine too... but I thought you were saying that you
couldn't do this at all any more, because the metaclass had to be
determined before the class, instead of inside it.

Note that it doesn't have to be any particular magic name -- just one
agreed upon by the metaclass and the class author.  Today, some such
names are semi-standardized already; you don't need language support.

Why would you suddenly start needing language support after 3115?

-jJ

From jimjjewett at gmail.com  Tue May  1 20:20:01 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 14:20:01 -0400
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <ca471dc20705010948o5b348fb3hf3f03e4cdf1dbb16@mail.gmail.com>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>
	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
	<46368EE5.6050409@canterbury.ac.nz> <4636AE9E.2020905@acm.org>
	<fb6fbf560705010731i719da8efibc14c72e6175053d@mail.gmail.com>
	<ca471dc20705010948o5b348fb3hf3f03e4cdf1dbb16@mail.gmail.com>
Message-ID: <fb6fbf560705011120q46b8c096y59388140836acd27@mail.gmail.com>

On 5/1/07, Guido van Rossum <guido at python.org> wrote:
> On 5/1/07, Jim Jewett <jimjjewett at gmail.com> wrote:

> > Note that while (literal strings used as) format strings are
> > effectively sandboxed, the formatted objects themselves are not.

> >     "My name is {0[name]}".format(evil_map)

> > would still allow evil_map to run arbitrary code.

> And how on earth would that be a security threat?

There are some things you can safely do with even arbitrary objects --
such as appending them to a list.

By mentioning security as a reason to restrict the format, it suggests
that this is another safe context.  It isn't.

-jJ

From jason.orendorff at gmail.com  Tue May  1 20:22:27 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 1 May 2007 14:22:27 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<46376729.9000008@acm.org>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
Message-ID: <bb8868b90705011122l55bdb168y3f363e0a35255081@mail.gmail.com>

On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:13 AM 5/1/2007 -0700, Talin wrote:
> >I don't care for the idea of testing against a specially named argument.
> >Why couldn't you just have a different decorator, such as
> >"overload_chained" which triggers this behavior?
>
> The PEP lists *five* built-in decorators, all of which support this behavior::
>
>     @overload, @when, @before, @after, @around

Actually @before and @after don't support __proceeds__,
according to the first draft anyway.

I think I would prefer to *always* pass the next method
to @around methods, which always need it, and *never*
pass it to any of the others.  What use case am I missing?
The one in the PEP involves foo(bar, baz), not a very
convincing example.

-j

From guido at python.org  Tue May  1 20:25:52 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 11:25:52 -0700
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <fb6fbf560705011120q46b8c096y59388140836acd27@mail.gmail.com>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>
	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
	<46368EE5.6050409@canterbury.ac.nz> <4636AE9E.2020905@acm.org>
	<fb6fbf560705010731i719da8efibc14c72e6175053d@mail.gmail.com>
	<ca471dc20705010948o5b348fb3hf3f03e4cdf1dbb16@mail.gmail.com>
	<fb6fbf560705011120q46b8c096y59388140836acd27@mail.gmail.com>
Message-ID: <ca471dc20705011125h4f68a3ck6ed1b79c77a1cbbd@mail.gmail.com>

On 5/1/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/1/07, Guido van Rossum <guido at python.org> wrote:
> > On 5/1/07, Jim Jewett <jimjjewett at gmail.com> wrote:
>
> > > Note that while (literal strings used as) format strings are
> > > effectively sandboxed, the formatted objects themselves are not.
>
> > >     "My name is {0[name]}".format(evil_map)
>
> > > would still allow evil_map to run arbitrary code.
>
> > And how on earth would that be a security threat?
>
> There are some things you can safely do with even arbitrary objects --
> such as appending them to a list.
>
> By mentioning security as a reason to restrict the format, it suggests
> that this is another safe context.  It isn't.

But your presumption that the map is already evil makes it irrelevant
whether the format is safe or not. Having the evil map is the problem,
not passing it to the format operation.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May  1 20:31:00 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 11:31:00 -0700
Subject: [Python-3000] PEP Parade
Message-ID: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>

So the PEP submissions are in, and a few late ones will be submitted
ASAP. Let me write up a capsule review of what we've got. Please let
me know if I missed anything (e.g. a PEP that someone has committed to
write but hasn't submitted yet).


First the PEPs that have numbers as of this writing (I'm pasting the
section heads right out of PEP 0, so apologies for the formatting):

 S  3101  Advanced String Formatting                   Talin

While we're still tweaking details, I expect this will be ready for
acceptance soon. We also have an implementation in the sandbox!

 S  3108  Standard Library Reorganization              Cannon

I expect this to happen after 3.0a1 is released.

 S  3116  New I/O                                      Stutzbach, Verdone, GvR

A prototype is in the py3k branch. There are details to work through
(like how to seek on text files with non-trivial encodings) but I feel
that the basis is solid. I could use help coding!

 S  3117  Postfix Type Declarations                    Brandl

I forgot to reject this -- it was my favorite April Fool's post of the
year though. :-)

 S  3118  Revising the buffer protocol                 Oliphant, Banks

Where's this standing? I'm assuming that it's pretty much ready to be
implemented. I haven't had the time to participate in the discussion.

 S  3119  Introducing Abstract Base Classes            GvR, Talin

This is clearly still controversial. It is also awaiting a rewrite. I
am still in favor of something like this (or I wouldn't bother with the
rewrite).

 S  3120  Using UTF-8 as the default source encoding   von Löwis

The basic idea seems very reasonable. I expect that the changes to the
parser may be quite significant though. Also, the parser ought to be
weaned off C stdio in favor of Python's own I/O library. I wonder if
it's really possible to let the parser read the raw bytes though --
this would seem to rule out supporting encodings like UTF-16. Somehow
I wonder if it wouldn't be easier if the parser operated on Unicode
input? That way parsing unicode strings (which we must support as all
strings will become unicode) will be simpler.

 S  3121  Module Initialization and finalization       von Löwis

I like it. I wish the title were changed to "Extension Module ..." though.

 S  3123  Making PyObject_HEAD conform to standard C   von Löwis

I like it, but who's going to make the changes? Once those changes have
been made, will it still be reasonable to expect to merge C code from
the (2.6) trunk into the 3.0 branch?

 S  3124  Overloading, Generic Functions, Interfaces   Eby

I haven't had the time to read this in detail, but in general I'm
feeling favorable about this idea. I'd rather see it decoupled from
sys._getframe() and modifying func_code (actually __code__ nowadays,
see PEP 3100).

 S  3125  Remove Backslash Continuation                Jewett

Sounds reasonable. I think we should still support \ inside string
literals though; the PEP isn't clear on this. I hope this falls within
the scope of the refactoring tool (sandbox/2to3).

 S  3126  Remove Implicit String Concatenation         Jewett

Sounds reasonable as well. A fixer for this would be trivial to add to
the refactoring tool.

 S  3127  Integer Literal Support and Syntax           Maupin

Fully in favor.

 S  3128  BList: A Faster List-like Type               Stutzbach

I still have misgivings about having too many options for developers.
While wizards will have no problem deciding between regular lists and
BLists, I worry that a meme might spread among junior coders that the
built-in list type is slow, causing overuse of BLists for no good
reason. But I am deferring to Raymond Hettinger in this matter.

 S  3141  A Type Hierarchy for Numbers                 Yasskin

Jeffrey has promised to rewrite this, removing most of the references
to algebra. I expect I'll like his rewrite, once it happens.


Now on to the PEPs that don't have numbers yet.

PEP: Supporting Non-ASCII identifiers (Martin von Loewis)

I'm on record as not liking this; my worry is that it will become a
barrier to the free exchange of code. It's not just languages I can't
read (Russian transliterated to the Latin alphabet would be just as
bad and we don't stop that now); many text editors have no or limited
support for other scripts (not to mention mixing right-to-left script
with Python's left-to-right identifiers). But if this receives a lot
of popular support I'm willing to give it a try. The One Laptop Per
Child project for example would like to enable students to code in
their own language (of course they'd rather see the language keywords
and standard library translated too...).

PEP: Adding class decorators (???)

I'm in favor of this. I'm just waiting for someone to write it up.

PEP: Eliminate __del__ (Raymond Hettinger)

I would be in favor of this or one of the alternative ideas for fixing
the can't-GC-a-cycle-with-__del__ issue if there was a clear recipe
and (if necessary) stdlib support for what to do instead. There are
real use cases for automatic finalization for which the atexit module
isn't the right solution and try/finally or with statements don't cut
it either.

PEP: Information Attributes (Raymond Hettinger)

This would be better served by a continued discussion about the merits
and flaws of ABCs (PEP 3119 and 3141).

PEP: Traits/roles instead of ABCs (Collin Winter)

This could serve as an interesting alternative to PEP 3119. However, I
believe that it doesn't really solve the distinction between
abstractions that can be implemented as "classic" ABCs and
abstractions that require a metaclass (like TotalOrder or Ring).


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From collinw at gmail.com  Tue May  1 20:34:18 2007
From: collinw at gmail.com (Collin Winter)
Date: Tue, 1 May 2007 11:34:18 -0700
Subject: [Python-3000] PEP Parade
In-Reply-To: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
Message-ID: <43aa6ff70705011134p46c8269dv39059c242ba8e12b@mail.gmail.com>

On 5/1/07, Guido van Rossum <guido at python.org> wrote:
> PEP: Adding class decorators (???)
>
> I'm in favor of this. I'm just writing for someone to write it up.

I just checked in PEP 3129, "Class Decorators".

Collin Winter

From jimjjewett at gmail.com  Tue May  1 20:39:59 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 14:39:59 -0400
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <ca471dc20705011125h4f68a3ck6ed1b79c77a1cbbd@mail.gmail.com>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>
	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
	<46368EE5.6050409@canterbury.ac.nz> <4636AE9E.2020905@acm.org>
	<fb6fbf560705010731i719da8efibc14c72e6175053d@mail.gmail.com>
	<ca471dc20705010948o5b348fb3hf3f03e4cdf1dbb16@mail.gmail.com>
	<fb6fbf560705011120q46b8c096y59388140836acd27@mail.gmail.com>
	<ca471dc20705011125h4f68a3ck6ed1b79c77a1cbbd@mail.gmail.com>
Message-ID: <fb6fbf560705011139r7b23d76ge7f110be0e2a6851@mail.gmail.com>

On 5/1/07, Guido van Rossum <guido at python.org> wrote:
> On 5/1/07, Jim Jewett <jimjjewett at gmail.com> wrote:

> > There are some things you can safely do with even arbitrary objects --
> > such as appending them to a list.

> > By mentioning security as a reason to restrict the format, it suggests
> > that this is another safe context.  It isn't.

> But your presumption that the map is already evil makes it irrelevant
> whether the format is safe or not. Having the evil map is the problem,
> not passing it to the format operation.

Using a map was probably misleading.  Let me rephrase:

While the literal string itself is safe, the format function is only
as safe as the objects being formatted.  The example below gets
person.name; if the person object itself is malicious, then even this
attribute access could run arbitrary code.

     "My name is {0.name}".format(person)

-jJ

From pje at telecommunity.com  Tue May  1 20:48:54 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 14:48:54 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <bb8868b90705011122l55bdb168y3f363e0a35255081@mail.gmail.co
 m>
References: <5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<46376729.9000008@acm.org>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501143840.02ca7760@sparrow.telecommunity.com>

At 02:22 PM 5/1/2007 -0400, Jason Orendorff wrote:
>On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>At 09:13 AM 5/1/2007 -0700, Talin wrote:
>> >I don't care for the idea of testing against a specially named argument.
>> >Why couldn't you just have a different decorator, such as
>> >"overload_chained" which triggers this behavior?
>>
>>The PEP lists *five* built-in decorators, all of which support this 
>>behavior::
>>
>>     @overload, @when, @before, @after, @around
>
>Actually @before and @after don't support __proceeds__,
>according to the first draft anyway.

True; anything that derives from MethodList isn't going to need it, so that 
means that @discount won't use it, either.

Still, that's three decorators left: @overload, @when, and @around, plus 
any custom decorators based on Method in place of MethodList.  (@when and 
@around are implemented as the 'make_decorator' of Method and Around, 
respectively.)


>I think I would prefer to *always* pass the next method
>to @around methods, which always need it, and *never*
>pass it to any of the others.  What use case am I missing?

Calling the next method in a generic function is equivalent to calling 
super() in a normal method.  Anytime you want to add more specific behavior 
for a type, while reusing the more general behavior, you're going to need 
it.  Therefore, "primary" methods are always potential users of it.

Syntactically speaking, I would certainly agree that the ideal solution is 
something that looks like a super() call; it's just that supporting that 
requires *more* of the sort of hackery that Guido wants *less* of 
here.  Signature inspection isn't as much of a black art as magical 
functions that need to know how the current function was invoked.

The other possibility would be to clone the functions using copied 
func_globals (__globals__?) so that 'next_method' in those namespaces would 
point to the right next method.  But then, if the function *writes* any 
globals, it'll be updating the wrong namespace.  Do you have any other ideas?


From pje at telecommunity.com  Tue May  1 20:51:58 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 14:51:58 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705011113n2ad71706n5f1a72127fd316c8@mail.gmail.co
 m>
References: <5.1.1.6.0.20070501121143.02d31398@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430200255.04b88e10@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501121143.02d31398@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501144907.04c2efe8@sparrow.telecommunity.com>

At 02:13 PM 5/1/2007 -0400, Jim Jewett wrote:
>On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > Yup, and we're still not now.  :)  Or at least, I don't understand what the
> > code below does, or more precisely, why it's different from just having a
> > __decorators__ list containing direct callbacks.
>
>That would be fine too... but I thought you were saying that you
>couldn't do this at all any more, because the metaclass had to be
>determined before the class, instead of inside it.

Correct.


>Note that it doesn't have to be any particular magic name -- just one
>agreed upon by the metaclass and the class author.  Today, some such
>names are semi-standardized already; you don't need language support.
>
>Why would you suddenly start needing language support after 3115?

Because it eliminated an existing magic name: __metaclass__.  Under the old 
regime, you could simply replace __metaclass__ with a function that called 
the old __metaclass__, then applied any desired decoration to the result.
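
Roughly, the Python 2-era hack being described looked like the sketch
below; decorate() is a stand-in for whatever class-level tweak is wanted,
and this is exactly the pattern that PEP 3115 no longer supports:

    def decorate(cls):
        cls.decorated = True              # any desired post-processing
        return cls

    class Example(object):
        def __metaclass__(name, bases, ns):
            cls = type(name, bases, ns)   # call the "old" metaclass
            return decorate(cls)          # then decorate the result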

A __decorators__ hook would replace this hack with something less 
convoluted, and allow method decorators and attribute descriptors a chance 
to modify the class, if needed.  (For example, the @abstractmethod could 
ensure the class was abstract, or raise an error if the class wasn't 
explicitly declared abstract.)
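
A minimal sketch of how such a hook could be spelled today with an
agreed-upon name and a cooperating metaclass (the names DecoratingMeta,
__decorators__ and mark are hypothetical, and this is Python 2 spelling,
not the PEP 3115 machinery):

    def mark(cls):
        cls.marked = True
        return cls

    class DecoratingMeta(type):
        def __new__(meta, name, bases, ns):
            cls = super(DecoratingMeta, meta).__new__(meta, name, bases, ns)
            for deco in reversed(ns.get('__decorators__', ())):
                cls = deco(cls)
            return cls

    class Example(object):
        __metaclass__ = DecoratingMeta    # applied at class-creation time
        __decorators__ = [mark]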


From pmaupin at gmail.com  Tue May  1 20:52:20 2007
From: pmaupin at gmail.com (Patrick Maupin)
Date: Tue, 1 May 2007 13:52:20 -0500
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <fb6fbf560705011139r7b23d76ge7f110be0e2a6851@mail.gmail.com>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>
	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
	<46368EE5.6050409@canterbury.ac.nz> <4636AE9E.2020905@acm.org>
	<fb6fbf560705010731i719da8efibc14c72e6175053d@mail.gmail.com>
	<ca471dc20705010948o5b348fb3hf3f03e4cdf1dbb16@mail.gmail.com>
	<fb6fbf560705011120q46b8c096y59388140836acd27@mail.gmail.com>
	<ca471dc20705011125h4f68a3ck6ed1b79c77a1cbbd@mail.gmail.com>
	<fb6fbf560705011139r7b23d76ge7f110be0e2a6851@mail.gmail.com>
Message-ID: <d09829f50705011152r146914b5k7d1c92877f5f32c9@mail.gmail.com>

On 5/1/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/1/07, Guido van Rossum <guido at python.org> wrote:
> > But your presumption that the map is already evil makes it irrelevant
> > whether the format is safe or not. Having the evil map is the problem,
> > not passing it to the format operation.
>
> Using a map was probably misleading.  Let me rephrase:
>
> While the literal string itself is safe, the format function is only
> as safe as the objects being formatted.  The example below gets
> person.name; if the person object itself is malicious, then even this
> attribute access could run arbitrary code.
>
>      "My name is {0.name}".format(person)
>
> -jJ

There is a (perhaps misguided) consensus that the format() operation
ought to have the property that a programmer can write a program which
will not have an issue with potentially hostile strings.  (Personally,
I view security as an open-ended problem, and don't deal with hostile
strings without a LOT of massaging.)

It is, and will continue to be the case, that the programmer can
EASILY write code that would do something bad with a given format
string, and yet not do something bad with another format string.  This
is true even with the percent operator and a dictionary (which might
be subclassed to do something evil on a lookup operation).
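
A tiny illustration of that dictionary case (EvilDict is purely
hypothetical):

    class EvilDict(dict):
        def __getitem__(self, key):
            print("arbitrary code runs on lookup of %r" % (key,))
            return dict.__getitem__(self, key)

    print("My name is %(name)s" % EvilDict(name="me"))   # the lookup itself ran code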

All the format() operation can do to help in this instance is impose a
few minor restrictions: don't allow calls, don't allow lookups of
attributes with leading underscores.  This makes it relatively easy to
write "format-safe" objects.  Does it make it impossible to write a
"format-unsafe" object?  No, and that was never the intention.

Regards,
Pat

From eric+python-dev at trueblade.com  Tue May  1 20:54:53 2007
From: eric+python-dev at trueblade.com (Eric V. Smith)
Date: Tue, 01 May 2007 14:54:53 -0400
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <fb6fbf560705011139r7b23d76ge7f110be0e2a6851@mail.gmail.com>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>	<46368EE5.6050409@canterbury.ac.nz>
	<4636AE9E.2020905@acm.org>	<fb6fbf560705010731i719da8efibc14c72e6175053d@mail.gmail.com>	<ca471dc20705010948o5b348fb3hf3f03e4cdf1dbb16@mail.gmail.com>	<fb6fbf560705011120q46b8c096y59388140836acd27@mail.gmail.com>	<ca471dc20705011125h4f68a3ck6ed1b79c77a1cbbd@mail.gmail.com>
	<fb6fbf560705011139r7b23d76ge7f110be0e2a6851@mail.gmail.com>
Message-ID: <46378CFD.2000004@trueblade.com>

Jim Jewett wrote:
> On 5/1/07, Guido van Rossum <guido at python.org> wrote:
>> On 5/1/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> 
>>> There are some things you can safely do with even arbitrary objects --
>>> such as appending them to a list.
> 
>>> By mentioning security as a reason to restrict the format, it suggests
>>> that this is another safe context.  It isn't.
> 
>> But your presumption that the map is already evil makes it irrelevant
>> whether the format is safe or not. Having the evil map is the problem,
>> not passing it to the format operation.
> 
> Using a map was probably misleading.  Let me rephrase:
> 
> While the literal string itself is safe, the format function is only
> as safe as the objects being formatted.  The example below gets
> person.name; if the person object itself is malicious, then even this
> attribute access could run arbitrary code.
> 
>      "My name is {0.name}".format(person)
> 

I think the concern is this:

Suppose we have:

class Person:
     def destroy_children(self):
         # do something destructive
         pass
     name = 'me'

person = Person()

"My name is {0.name}".format(person)               # ok
"My name is {0.destroy_children()}".format(person) # ouch

One intent of the PEP is that the strings come from a translation, or 
are otherwise out of the direct control of the original programmer.  So 
the thought is that attributes of objects being formatted are probably 
always "safe" to call, while methods might be "unsafe", for some 
definitions of "safe" and "unsafe".

Whether this justifies the exclusion of calling methods (or callables 
themselves), I can't say.  I can say that calling methods that have 
parameters would significantly complicate our implementation of PEP 
3101.  The original message in this thread only has examples of calling 
methods without parameters; it's not clear to me if that's the only 
intended use.

From jimjjewett at gmail.com  Tue May  1 20:57:41 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 14:57:41 -0400
Subject: [Python-3000] PEP Parade
In-Reply-To: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
Message-ID: <fb6fbf560705011157k6644a405vd66bd2a029457fdf@mail.gmail.com>

On 5/1/07, Guido van Rossum <guido at python.org> wrote:

> So the PEP submissions are in, and a few late ones will be submitted
> ASAP. Let me write up a capsule review of what we've got. Please let
> me know if I missed anything (e.g. a PEP that someone has committed to
> write but hasn't submitted yet).

(1)  The __this_*__ PEP was written and posted; I'll revise it slightly tonight.

One benefit would be a minimal-change version of super.

(2)  Calvin's and Tim's more complete reworking of super.

(3)  final/once/name annotations -- I *think* this was dropped when
case statements were rejected, but I'm not sure.


> PEP: Eliminate __del__ (Raymond Hettinger)

> I would be in favor of this or one of the alternative ideas for fixing
> the can't-GC-a-cycle-with-__del__ issue if there was a clear recipe
> and (if necessary) stdlib support for what to do instead. There are
> real use cases for automatic finalization for which the atexit module
> isn't the right solution and try/finally or with statements don't cut
> it either.

Does the alternative need to cover 100% of use cases?

If it covers 99%, should the other 1% become impossible, or should we
keep __del__ as fallback?

-jJ

From guido at python.org  Tue May  1 21:01:53 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 12:01:53 -0700
Subject: [Python-3000] PEP Parade
In-Reply-To: <fb6fbf560705011157k6644a405vd66bd2a029457fdf@mail.gmail.com>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
	<fb6fbf560705011157k6644a405vd66bd2a029457fdf@mail.gmail.com>
Message-ID: <ca471dc20705011201m4d913f42g2ef38c477ecae52c@mail.gmail.com>

On 5/1/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/1/07, Guido van Rossum <guido at python.org> wrote:
>
> > So the PEP submissions are in, and a few late ones will be submitted
> > ASAP. Let me write up a capsule review of what we've got. Please let
> > me know if I missed anything (e.g. a PEP that someone has committed to
> > write but hasn't submitted yet).
>
> (1)  The __this_*__ PEP was written and posted; I'll revise it slightly tonight.

__this__? What's that? I must've missed the posting of the pep, sorry.
You can mail me the PEP (best as an attachment) and I will assign it a
number and check it in.

> One benefit would be a minimal-change version of super.
>
> (2)  Calvin's and Tim's more complete reworking of super.

Oooh, I missed that too.

> (3)  final/once/name annotations -- I *think* this was dropped when
> case statements were rejected, but I'm not sure.

Unless there's a PEP that was posted before the deadline I don't want
to hear about it.

> > PEP: Eliminate __del__ (Raymond Hettinger)
>
> > I would be in favor of this or one of the alternative ideas for fixing
> > the can't-GC-a-cycle-with-__del__ issue if there was a clear recipe
> > and (if necessary) stdlib support for what to do instead. There are
> > real use cases for automatic finalization for which the atexit module
> > isn't the right solution and try/finally or with statements don't cut
> > it either.
>
> Does the alternative need to cover 100% of use cases?
>
> If it covers 99%, should the other 1% become impossible, or should we
> keep __del__ as fallback?

What 1% use case are you thinking of?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Tue May  1 21:07:36 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 15:07:36 -0400
Subject: [Python-3000] PEP Parade
In-Reply-To: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.co
 m>
Message-ID: <5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>

At 11:31 AM 5/1/2007 -0700, Guido van Rossum wrote:
>I haven't had the time to read this in detail, but in general I'm
>feeling favorable about this idea. I'd rather see it decoupled from
>sys._getframe() and modifying func_code (actually __code__ nowadays,
>see PEP 3100).

I've figured out how to drop *some* (but not all) of the _getframe() 
hackery from the current proposal, btw.  (Specifically, I believe I can 
make the decorators decide which function to return using __name__ 
comparisons instead of by checking frame contents.)

Regarding __code__, however, it's either that or allow functions to be 
subclassed and have their type changed at runtime.

In other words, if you could meaningfully assign to a function's __class__, 
then mucking with its __code__ would be unnecessary; we'd just override 
__call__ in a subclass, and change the __class__ when overloading an 
existing function.

Unfortunately, I believe that CPython 2.3 and up don't let you change the 
type of instances of built-in classes, and it's never been possible to 
subclass the function type, AFAIK.

OTOH, these restrictions may not exist in Jython, IronPython, or PyPy; if 
they allow you to subclass the function type and change a function's 
__class__, then that approach becomes a reasonable implementation choice on 
those platforms.

Thus, assignment to __code__ might reasonably be considered a workaround 
for the limitations of CPython in this respect, rather than a 
CPython-dependent hack.  :)


From jimjjewett at gmail.com  Tue May  1 21:08:02 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 1 May 2007 15:08:02 -0400
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <d09829f50705011152r146914b5k7d1c92877f5f32c9@mail.gmail.com>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>
	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
	<46368EE5.6050409@canterbury.ac.nz> <4636AE9E.2020905@acm.org>
	<fb6fbf560705010731i719da8efibc14c72e6175053d@mail.gmail.com>
	<ca471dc20705010948o5b348fb3hf3f03e4cdf1dbb16@mail.gmail.com>
	<fb6fbf560705011120q46b8c096y59388140836acd27@mail.gmail.com>
	<ca471dc20705011125h4f68a3ck6ed1b79c77a1cbbd@mail.gmail.com>
	<fb6fbf560705011139r7b23d76ge7f110be0e2a6851@mail.gmail.com>
	<d09829f50705011152r146914b5k7d1c92877f5f32c9@mail.gmail.com>
Message-ID: <fb6fbf560705011208s3ab11bc6o13e506916190df25@mail.gmail.com>

On 5/1/07, Patrick Maupin <pmaupin at gmail.com> wrote:

> attributes with leading underscores.  This makes it relatively easy to
> write "format-safe" objects.  Does it make it impossible to write a
> "format-unsafe" object?  No, and that was never the intention.

Agreed; I just think this restriction should be explicit, given that
security is mentioned.

-jJ

From guido at python.org  Tue May  1 21:11:31 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 12:11:31 -0700
Subject: [Python-3000] Addition to PEP 3101
In-Reply-To: <fb6fbf560705011139r7b23d76ge7f110be0e2a6851@mail.gmail.com>
References: <8f01efd00704300953t6154d7e1j7ef18cead1acb344@mail.gmail.com>
	<d09829f50704301256n2b08b1fav7ac64b4fcc6a742c@mail.gmail.com>
	<46368EE5.6050409@canterbury.ac.nz> <4636AE9E.2020905@acm.org>
	<fb6fbf560705010731i719da8efibc14c72e6175053d@mail.gmail.com>
	<ca471dc20705010948o5b348fb3hf3f03e4cdf1dbb16@mail.gmail.com>
	<fb6fbf560705011120q46b8c096y59388140836acd27@mail.gmail.com>
	<ca471dc20705011125h4f68a3ck6ed1b79c77a1cbbd@mail.gmail.com>
	<fb6fbf560705011139r7b23d76ge7f110be0e2a6851@mail.gmail.com>
Message-ID: <ca471dc20705011211r5e15916dw6b203a0349228a4f@mail.gmail.com>

On 5/1/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/1/07, Guido van Rossum <guido at python.org> wrote:
> > On 5/1/07, Jim Jewett <jimjjewett at gmail.com> wrote:
>
> > > There are some things you can safely do with even arbitrary objects --
> > > such as appending them to a list.
>
> > > By mentioning security as a reason to restrict the format, it suggests
> > > that this is another safe context.  It isn't.
>
> > But your presumption that the map is already evil makes it irrelevant
> > whether the format is safe or not. Having the evil map is the problem,
> > not passing it to the format operation.
>
> Using a map was probably misleading.  Let me rephrase:
>
> While the literal string itself is safe, the format function is only
> as safe as the objects being formatted.  The example below gets
> person.name; if the person object itself is malicious, then even this
> attribute access could run arbitrary code.
>
>      "My name is {0.name}".format(person)

And my point is that the security concerns here are not about
malicious arguments to the format() method; that's not part of the
threat model. If you have a person object in your program you can't
trust, you have a problem whether or not you use the format method.

The threat we're concerned about here (as Patrick explained in his
response) is format strings provided by translators or non-root
webmasters or (less likely) end users. Translation is probably the
main use case; another use case is exemplified by mailman, which gives
list owners the means to edit list-specific html templates which are
used as format strings. We want to prevent those folks from
(accidentally or intentionally) crashing the web server or elevating
their privileges.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May  1 21:14:47 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 12:14:47 -0700
Subject: [Python-3000] PEP Parade
In-Reply-To: <5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
Message-ID: <ca471dc20705011214j3c191fb9x9013ec6fba6b01c4@mail.gmail.com>

Suppose you couldn't assign to __class__ of a function (that's too
messy to deal with in CPython) and you couldn't assign to its __code__
either. What proposed functionality would you lose? How would you
ideally implement that functionality if you had the ability to modify
CPython in other ways? (I'm guessing you'd want to add some
functionality to function objects; what would that functionality have
to do?)

--Guido

On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 11:31 AM 5/1/2007 -0700, Guido van Rossum wrote:
> >I haven't had the time to read this in detail, but in general I'm
> >feeling favorable about this idea. I'd rather see it decoupled from
> >sys._getframe() and modifying func_code (actually __code__ nowadays,
> >see PEP 3100).
>
> I've figured out how to drop *some* (but not all) of the _getframe()
> hackery from the current proposal, btw.  (Specifically, I believe I can
> make the decorators decide which function to return using __name__
> comparisons instead of by checking frame contents.)
>
> Regarding __code__, however, it's either that or allow functions to be
> subclassed and have their type changed at runtime.
>
> In other words, if you could meaningfully assign to a function's __class__,
> then mucking with its __code__ would be unnecessary; we'd just override
> __call__ in a subclass, and change the __class__ when overloading an
> existing function.
>
> Unfortunately, I believe that CPython 2.3 and up don't let you change the
> type of instances of built-in classes, and it's never been possible to
> subclass the function type, AFAIK.
>
> OTOH, these restrictions may not exist in Jython, IronPython, or PyPy; if
> they allow you to subclass the function type and change a function's
> __class__, then that approach becomes a reasonable implementation choice on
> those platforms.
>
> Thus, assignment to __code__ might reasonably be considered a workaround
> for the limitations of CPython in this respect, rather than a
> CPython-dependent hack.  :)
>
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Tue May  1 22:08:02 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 16:08:02 -0400
Subject: [Python-3000] PEP Parade
In-Reply-To: <ca471dc20705011214j3c191fb9x9013ec6fba6b01c4@mail.gmail.co
 m>
References: <5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>

At 12:14 PM 5/1/2007 -0700, Guido van Rossum wrote:
>Suppose you couldn't assign to __class__ of a function (that's too
>messy to deal with in CPython) and you couldn't assign to its __code__
>either. What proposed functionality would you lose?

The ability to overload any function, without having to track down all the 
places it's already been imported or otherwise saved, and change them to 
point to a new function or a non-function object.


>How would you
>ideally implement that functionality if you had the ability to modify
>CPython in other ways? (I'm guessing you'd want to add some
>functionality to function objects; what would that functionality have
>to do?)

Hm...  well, in PyPy they have a "become" feature (I don't know if it's a 
mainline feature or not) that allows you to say, "replace object A with 
object B, wherever A is currently referenced".  Then the replacement object 
(GF implementation) needn't even be a function.

A narrower feature, however, more specific to functions, would just be 
*some* way to redirect or guard the function's actual execution.  For 
example, if function objects had a writable __call__ attribute, that would 
be invoked in place of the normal behavior.  (Assuming there was a way to 
save the old __call__ or make a copy of the function before it was modified.)

I really just need a way to make calling the function do something 
different from what it normally would -- and ideally this should be in such 
a way that I could still invoke the function's original behavior.  (So it 
can be used as the default method when nothing else matches, or the 
least-specific fallback method.)


From jgarber at ionzoft.com  Tue May  1 21:14:05 2007
From: jgarber at ionzoft.com (Jason Garber)
Date: Tue, 1 May 2007 14:14:05 -0500
Subject: [Python-3000] DB API SQL injection issue
Message-ID: <E7DE807861E8474E8AC3DC7AC2C75EE50143A6DE@34093-EVS2C1.exchange.rackspace.com>

Hello,

In PEP 249 (Python Database API Specification v2.0), there is a
paragraph about cursors that reads:

.execute(operation[,parameters]) 
   Prepare and execute a database operation (query or
   command).  Parameters may be provided as sequence or
   mapping and will be bound to variables in the operation.
   Variables are specified in a database-specific notation
   (see the module's paramstyle attribute for details). [5]

I propose that the second parameter to execute() is changed to be a
required parameter to prevent accidental SQL injection vulnerabilities.

Why?  Consider the following two lines of code

cur.execute("SELECT * FROM t WHERE a=%s", (avalue))
cur.execute("SELECT * FROM t WHERE a=%s" % (avalue))

It is easy for a developer to inadvertently place a "%" operator instead
of a "," between the two parameters.  In this case, python string
formatting rules take over, and un-escaped values get inserted directly
into the SQL - silently.

After using standard string formatting characters like "%s" in the
string, it is quite natural to place a % at the end.

The requirement of the second parameter would eliminate this
possibility.  None would be passed (explicitly) if there are no
replacements needed.

My rationale for this is based:
1. partly on observation of code with this problem.
2. partly on the rationale for PEP 3126 (Remove Implicit String
Concatenation).

From PEP 3126: Rationale for Removing Implicit String Concatenation

    Implicit String concatenation can lead to confusing, or even
    silent, errors.

        def f(arg1, arg2=None): pass

        f("abc" "def")  # forgot the comma, no warning ...
                        # silently becomes f("abcdef", None)

    or, using the scons build framework,

        sourceFiles = [
        'foo.c'
        'bar.c',
        #...many lines omitted...
        'q1000x.c']

    It's a common mistake to leave off a comma, and then scons complains
    that it can't find 'foo.cbar.c'.  This is pretty bewildering behavior
    even if you *are* a Python programmer, and not everyone here is. [1]


I know that this is not a functional problem, but perhaps a safeguard
can be put in place to prevent disastrous SQL injection issues from
arising needlessly.

For your consideration.

Sincerely,

Jason Garber
Senior Systems Engineer
IonZoft, Inc.






From g.brandl at gmx.net  Tue May  1 22:11:54 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 01 May 2007 22:11:54 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46376876.1010803@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de>
	<f178kn$s0n$2@sea.gmane.org>		<4637631A.6030702@v.loewis.de>	<43aa6ff70705010905l3f87d57ck5a8f5597a6de9dab@mail.gmail.com>
	<46376876.1010803@v.loewis.de>
Message-ID: <f186u7$v33$1@sea.gmane.org>

Martin v. Löwis wrote:
>> Reading from
>> http://mail.python.org/pipermail/python-3000/2006-April/001474.html,
>> the message that prompted this particular addition to PEP 3099, "I
>> want good Unicode support for string literals and comments. Everything
>> else in the language ought to be ASCII."
>> 
>> Identifiers aren't string literals or comments.
> 
> Sure, but please follow the follow-up communication also.

In any case, the entry in PEP 3099 should not be used as a reason
to reject the PEP.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From pedronis at openendsystems.com  Tue May  1 22:34:58 2007
From: pedronis at openendsystems.com (Samuele Pedroni)
Date: Tue, 01 May 2007 22:34:58 +0200
Subject: [Python-3000] PEP Parade
In-Reply-To: <5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>	<5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>
Message-ID: <4637A472.7090307@openendsystems.com>

For what it's worth, changing func_code is supported by both PyPy and
Jython. What cannot be done in Jython is constructing a code object out
of a string of CPython bytecode, but a code object can be extracted from
other functions.

Jython 2.2b1 on java1.5.0_07
Type "copyright", "credits" or "license" for more information.
 >>> def f():
...    return 1
...
 >>> f()
1
 >>> def g():
...    return 2
...
 >>> f.func_code = g.func_code
 >>> f()
2
 >>>


Phillip J. Eby wrote:
> At 12:14 PM 5/1/2007 -0700, Guido van Rossum wrote:
>   
>> Suppose you couldn't assign to __class__ of a function (that's too
>> messy to deal with in CPython) and you couldn't assign to its __code__
>> either. What proposed functionality would you lose?
>>     
>
> The ability to overload any function, without having to track down all the 
> places it's already been imported or otherwise saved, and change them to 
> point to a new function or a non-function object.
>
>
>   
>> How would you
>> ideally implement that functionality if you had the ability to modify
>> CPython in other ways? (I'm guessing you'd want to add some
>> functionality to function objects; what would that functionality have
>> to do?)
>>     
>
> Hm...  well, in PyPy they have a "become" feature (I don't know if it's a 
> mainline feature or not) that allows you to say, "replace object A with 
> object B, wherever A is currently referenced".  Then the replacement object 
> (GF implementation) needn't even be a function.
>
> A narrower feature, however, more specific to functions, would just be 
> *some* way to redirect or guard the function's actual execution.  For 
> example, if function objects had a writable __call__ attribute, that would 
> be invoked in place of the normal behavior.  (Assuming there was a way to 
> save the old __call__ or make a copy of the function before it was modified.)
>
> I really just need a way to make calling the function do something 
> different from what it normally would -- and ideally this should be in such 
> a way that I could still invoke the function's original behavior.  (So it 
> can be used as the default method when nothing else matches, or the 
> least-specific fallback method.)
>
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/pedronis%40openendsystems.com
>   


From tcdelaney at optusnet.com.au  Tue May  1 22:51:35 2007
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Wed, 2 May 2007 06:51:35 +1000
Subject: [Python-3000] PEP Parade
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
	<fb6fbf560705011157k6644a405vd66bd2a029457fdf@mail.gmail.com>
Message-ID: <01cb01c78c32$8254d6c0$0201a8c0@ryoko>

From: "Jim Jewett" <jimjjewett at gmail.com>


> On 5/1/07, Guido van Rossum <guido at python.org> wrote:
>
>> So the PEP submissions are in, and a few late ones will be submitted
>> ASAP. Let me write up a capsule review of what we've got. Please let
>> me know if I missed anything (e.g. a PEP that someone has committed to
>> write but hasn't submitted yet).
>
> (1)  The __this_*__ PEP was written and posted; I'll revise it slightly 
> tonight.
>
> One benefit would be a minimal-change version of super.

I intend for the 'super' PEP to not rely on this in any way, but will add a 
note that your PEP (and other changes) may make the implementation simpler, 
and so the implementation should be revisited before 3.0.

The semantics of 'super' OTOH should be fully clarified in our PEP.

Tim Delaney 


From nicko at nicko.org  Tue May  1 22:38:26 2007
From: nicko at nicko.org (Nicko van Someren)
Date: Tue, 1 May 2007 21:38:26 +0100
Subject: [Python-3000] DB API SQL injection issue
In-Reply-To: <E7DE807861E8474E8AC3DC7AC2C75EE50143A6DE@34093-EVS2C1.exchange.rackspace.com>
References: <E7DE807861E8474E8AC3DC7AC2C75EE50143A6DE@34093-EVS2C1.exchange.rackspace.com>
Message-ID: <AC576D4F-6217-4390-85EA-5DE21EE148BF@nicko.org>

On 1 May 2007, at 20:14, Jason Garber wrote:
> In PEP 249 (Python Database API Specification v2.0), there is a
> paragraph about cursors that reads:
>
> .execute(operation[,parameters])
>    Prepare and execute a database operation (query or
>    command).  Parameters may be provided as sequence or
>    mapping and will be bound to variables in the operation.
>    Variables are specified in a database-specific notation
>    (see the module's paramstyle attribute for details). [5]
>
> I propose that the second parameter to execute() is changed to be a
> required parameter to prevent accidental SQL injection  
> vulnerabilities.

How do you propose to deal with the SQL commands for which there is
no need to do any parameter replacement?  This is not at all uncommon;
would you expect to make people type
cur.execute("SELECT DISTINCT zip_code FROM customer_addresses", None)
or somesuch?

	Nicko


From guido at python.org  Tue May  1 23:04:35 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 14:04:35 -0700
Subject: [Python-3000] PEP Parade
In-Reply-To: <5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>
Message-ID: <ca471dc20705011404r57a83c04y55211acd9eb83969@mail.gmail.com>

On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 12:14 PM 5/1/2007 -0700, Guido van Rossum wrote:
> >Suppose you couldn't assign to __class__ of a function (that's too
> >messy to deal with in CPython) and you couldn't assign to its __code__
> >either. What proposed functionality would you lose?
>
> The ability to overload any function, without having to track down all the
> places it's already been imported or otherwise saved, and change them to
> point to a new function or a non-function object.

Frankly, I'm not sure this is worth all the proposed contortions. I'd
be happy (especially as long as this is a pure-Python thing) to have
to flag the base implementation explicitly with a decorator to make it
overloadable. That seems KISS to me.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Tue May  1 23:44:59 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 01 May 2007 23:44:59 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
Message-ID: <f18ccp$utj$1@sea.gmane.org>

This is a bit late, but it was in my queue by April 30, I swear! ;)
Comments are appreciated; some of the phrasing sounds very clumsy
to me, but I couldn't find anything better.

Georg


PEP: 3132
Title: Extended Iterable Unpacking
Version: $Revision$
Last-Modified: $Date$
Author: Georg Brandl <georg at python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Apr-2007
Python-Version: 3.0
Post-History:


Abstract
========

This PEP proposes a change to iterable unpacking syntax, allowing a
"catch-all" name to be specified, which will be assigned a list of all
items not assigned to a "regular" name.

An example says more than a thousand words::

    >>> a, *b, c = range(5)
    >>> a
    0
    >>> c
    4
    >>> b
    [1, 2, 3]


Rationale
=========

Many algorithms require splitting a sequence into a "first, rest" pair.
With the new syntax, ::

    first, rest = seq[0], seq[1:]

is replaced by the cleaner and probably more efficient::

    first, *rest = seq

For more complex unpacking patterns, the new syntax looks even
cleaner, and the clumsy index handling is not necessary anymore.


Specification
=============

A tuple (or list) on the left side of a simple assignment (unpacking
is not defined for augmented assignment) may contain at most one
expression prepended with a single asterisk.  For the rest of this
section, the other expressions in the list are called "mandatory".

Note that this also refers to tuples in implicit assignment context,
such as in a ``for`` statement.

This designates a subexpression that will be assigned a list of all
items from the iterable being unpacked that are not assigned to any
of the mandatory expressions, or an empty list if there are no such
items.

It is an error (as it is currently) if the iterable doesn't contain
enough items to assign to all the mandatory expressions.
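
Illustration (assuming the semantics above; not output from the draft
implementation)::

    >>> a, *b, c = [1, 2]
    >>> a, b, c
    (1, [], 2)
    >>> for first, *rest in [(1, 2, 3), (4,)]:
    ...     print(first, rest)
    1 [2, 3]
    4 []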


Implementation
==============

The proposed implementation strategy is:

- add a new grammar rule, ``star_test``, which consists of ``'*'
  test`` and is used in test lists
- add a new ASDL type ``Starred`` to represent a starred expression
- catch all cases where starred expressions are not allowed in the AST
  and symtable generation stage
- add a new opcode, ``UNPACK_EX``, which will only be used if a
  list/tuple to be assigned to contains a starred expression
- change ``unpack_iterable()`` in ceval.c to handle the extended
  unpacking case

Note that the starred expression element introduced here is universal
and could be used for other purposes in non-assignment context, such
as the ``yield *iterable`` proposal.

The author has written a draft implementation, but there are some open
issues which will be resolved in case this PEP is looked upon
benevolently.


Open Issues
===========

- Should the catch-all expression be assigned a list or a tuple of items?


References
==========

None yet.


Copyright
=========

This document has been placed in the public domain.


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From guido at python.org  Wed May  2 00:00:33 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 15:00:33 -0700
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <f18ccp$utj$1@sea.gmane.org>
References: <f18ccp$utj$1@sea.gmane.org>
Message-ID: <ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>

On 5/1/07, Georg Brandl <g.brandl at gmx.net> wrote:
> This is a bit late, but it was in my queue by April 30, I swear! ;)

Accepted.

> Comments are appreciated, especially some phrasing sounds very clumsy
> to me, but I couldn't find a better one.
>
> Georg
>
>
> PEP: 3132
> Title: Extended Iterable Unpacking
> Version: $Revision$
> Last-Modified: $Date$
> Author: Georg Brandl <georg at python.org>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 30-Apr-2007
> Python-Version: 3.0
> Post-History:
>
>
> Abstract
> ========
>
> This PEP proposes a change to iterable unpacking syntax, allowing to
> specify a "catch-all" name which will be assigned a list of all items
> not assigned to a "regular" name.
>
> An example says more than a thousand words::
>
>     >>> a, *b, c = range(5)
>     >>> a
>     0
>     >>> c
>     4
>     >>> b
>     [1, 2, 3]

Has it been pointed out to you already that this particular example is
hard to implement if the RHS is an iterator whose length is not known
a priori? The implementation would have to be quite hairy -- it would
have to assign everything to the list b until the iterator is
exhausted, and then pop a value from the end of the list and assign it
to c.  It would be much easier if *b was only allowed at the end. (It
would be even worse if b were assigned a tuple instead of a list, as
per your open issues.)
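
For concreteness, a pure-Python sketch of that strategy (illustrative
only, not the proposed ceval.c change):

def unpack_with_star(iterable, n_before, n_after):
    it = iter(iterable)
    before = []
    for _ in range(n_before):
        try:
            before.append(next(it))
        except StopIteration:
            raise ValueError("not enough values to unpack")
    star = list(it)                  # exhaust the iterator into the list
    if len(star) < n_after:
        raise ValueError("not enough values to unpack")
    after = [star.pop() for _ in range(n_after)]
    after.reverse()                  # popped from the end, so restore order
    return before, star, after

# a, *b, c = range(5)  corresponds roughly to  unpack_with_star(range(5), 1, 1)
# and gives ([0], [1, 2, 3], [4])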

Also, what should this do? Perhaps the grammar could disallow it?

*a = range(5)

> Rationale
> =========
>
> Many algorithms require splitting a sequence in a "first, rest" pair.
> With the new syntax, ::
>
>     first, rest = seq[0], seq[1:]
>
> is replaced by the cleaner and probably more efficient::
>
>     first, *rest = seq
>
> For more complex unpacking patterns, the new syntax looks even
> cleaner, and the clumsy index handling is not necessary anymore.
>
>
> Specification
> =============
>
> A tuple (or list) on the left side of a simple assignment (unpacking
> is not defined for augmented assignment) may contain at most one
> expression prepended with a single asterisk.  For the rest of this
> section, the other expressions in the list are called "mandatory".
>
> Note that this also refers to tuples in implicit assignment context,
> such as in a ``for`` statement.
>
> This designates a subexpression that will be assigned a list of all
> items from the iterable being unpacked that are not assigned to any
> of the mandatory expressions, or an empty list if there are no such
> items.
>
> It is an error (as it is currently) if the iterable doesn't contain
> enough items to assign to all the mandatory expressions.
>
>
> Implementation
> ==============
>
> The proposed implementation strategy is:
>
> - add a new grammar rule, ``star_test``, which consists of ``'*'
>   test`` and is used in test lists
> - add a new ASDL type ``Starred`` to represent a starred expression
> - catch all cases where starred expressions are not allowed in the AST
>   and symtable generation stage
> - add a new opcode, ``UNPACK_EX``, which will only be used if a
>   list/tuple to be assigned to contains a starred expression
> - change ``unpack_iterable()`` in ceval.c to handle the extended
>   unpacking case
>
> Note that the starred expression element introduced here is universal
> and could be used for other purposes in non-assignment context, such
> as the ``yield *iterable`` proposal.
>
> The author has written a draft implementation, but there are some open
> issues which will be resolved in case this PEP is looked upon
> benevolently.
>
>
> Open Issues
> ===========
>
> - Should the catch-all expression be assigned a list or a tuple of items?
>
>
> References
> ==========
>
> None yet.
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
>
> --
> Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
> Four shall be the number of spaces thou shalt indent, and the number of thy
> indenting shall be four. Eight shalt thou not indent, nor either indent thou
> two, excepting that thou then proceed to four. Tabs are right out.
>
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nevillegrech at gmail.com  Wed May  2 00:55:55 2007
From: nevillegrech at gmail.com (Neville Grech Neville Grech)
Date: Wed, 2 May 2007 00:55:55 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
Message-ID: <de9ae4950705011555s12d77147j4851fcab723c40d1@mail.gmail.com>

This reminds me a lot of haskell/prolog's head/tail list splitting. Looks
like a good feature.

a*=range(5)
hmmn maybe in such a case, whenever there is the * operator, the resulting
item is always a list/tuple, like the following:
 a=[[0,1,2,3,4]] ?

I have another question: what would happen in the case a*,b=tuple(range(5))?

a = (0,1,2,3) ?

Should this keep the same type of container, i.e. lists to lists and tuples
to tuples, or always convert to a list?

-Neville

On 5/2/07, Guido van Rossum <guido at python.org> wrote:
>
> On 5/1/07, Georg Brandl <g.brandl at gmx.net> wrote:
> > This is a bit late, but it was in my queue by April 30, I swear! ;)
>
> Accepted.
>
> > Comments are appreciated, especially some phrasing sounds very clumsy
> > to me, but I couldn't find a better one.
> >
> > Georg
> >
> >
> > PEP: 3132
> > Title: Extended Iterable Unpacking
> > Version: $Revision$
> > Last-Modified: $Date$
> > Author: Georg Brandl <georg at python.org>
> > Status: Draft
> > Type: Standards Track
> > Content-Type: text/x-rst
> > Created: 30-Apr-2007
> > Python-Version: 3.0
> > Post-History:
> >
> >
> > Abstract
> > ========
> >
> > This PEP proposes a change to iterable unpacking syntax, allowing to
> > specify a "catch-all" name which will be assigned a list of all items
> > not assigned to a "regular" name.
> >
> > An example says more than a thousand words::
> >
> >     >>> a, *b, c = range(5)
> >     >>> a
> >     0
> >     >>> c
> >     4
> >     >>> b
> >     [1, 2, 3]
>
> Has it been pointed out to you already that this particular example is
> hard to implement if the RHS is an iterator whose length is not known
> a priori? The implementation would have to be quite hairy -- it would
> have to assign everything to the list b until the iterator is
> exhausted, and then pop a value from the end of the list and assign it
> to c. it would be much easier if *b was only allowed at the end. (It
> would be even worse if b were assigned a tuple instead of a list, as
> per your open issues.)
>
> Also, what should this do? Perhaps the grammar could disallow it?
>
> *a = range(5)
>
> > Rationale
> > =========
> >
> > Many algorithms require splitting a sequence in a "first, rest" pair.
> > With the new syntax, ::
> >
> >     first, rest = seq[0], seq[1:]
> >
> > is replaced by the cleaner and probably more efficient::
> >
> >     first, *rest = seq
> >
> > For more complex unpacking patterns, the new syntax looks even
> > cleaner, and the clumsy index handling is not necessary anymore.
> >
> >
> > Specification
> > =============
> >
> > A tuple (or list) on the left side of a simple assignment (unpacking
> > is not defined for augmented assignment) may contain at most one
> > expression prepended with a single asterisk.  For the rest of this
> > section, the other expressions in the list are called "mandatory".
> >
> > Note that this also refers to tuples in implicit assignment context,
> > such as in a ``for`` statement.
> >
> > This designates a subexpression that will be assigned a list of all
> > items from the iterable being unpacked that are not assigned to any
> > of the mandatory expressions, or an empty list if there are no such
> > items.
> >
> > It is an error (as it is currently) if the iterable doesn't contain
> > enough items to assign to all the mandatory expressions.
> >
> >
> > Implementation
> > ==============
> >
> > The proposed implementation strategy is:
> >
> > - add a new grammar rule, ``star_test``, which consists of ``'*'
> >   test`` and is used in test lists
> > - add a new ASDL type ``Starred`` to represent a starred expression
> > - catch all cases where starred expressions are not allowed in the AST
> >   and symtable generation stage
> > - add a new opcode, ``UNPACK_EX``, which will only be used if a
> >   list/tuple to be assigned to contains a starred expression
> > - change ``unpack_iterable()`` in ceval.c to handle the extended
> >   unpacking case
> >
> > Note that the starred expression element introduced here is universal
> > and could be used for other purposes in non-assignment context, such
> > as the ``yield *iterable`` proposal.
> >
> > The author has written a draft implementation, but there are some open
> > issues which will be resolved in case this PEP is looked upon
> > benevolently.
> >
> >
> > Open Issues
> > ===========
> >
> > - Should the catch-all expression be assigned a list or a tuple of
> items?
> >
> >
> > References
> > ==========
> >
> > None yet.
> >
> >
> > Copyright
> > =========
> >
> > This document has been placed in the public domain.
> >
> >
> > --
> > Thus spake the Lord: Thou shalt indent with four spaces. No more, no
> less.
> > Four shall be the number of spaces thou shalt indent, and the number of
> thy
> > indenting shall be four. Eight shalt thou not indent, nor either indent
> thou
> > two, excepting that thou then proceed to four. Tabs are right out.
> >
> > _______________________________________________
> > Python-3000 mailing list
> > Python-3000 at python.org
> > http://mail.python.org/mailman/listinfo/python-3000
> > Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/guido%40python.org
> >
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/nevillegrech%40gmail.com
>

From pje at telecommunity.com  Wed May  2 02:30:20 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 20:30:20 -0400
Subject: [Python-3000] PEP Parade
In-Reply-To: <ca471dc20705011404r57a83c04y55211acd9eb83969@mail.gmail.com>
References: <5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501195640.02e20ca0@sparrow.telecommunity.com>

At 02:04 PM 5/1/2007 -0700, Guido van Rossum wrote:
>On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 12:14 PM 5/1/2007 -0700, Guido van Rossum wrote:
> > >Suppose you couldn't assign to __class__ of a function (that's too
> > >messy to deal with in CPython) and you couldn't assign to its __code__
> > >either. What proposed functionality would you lose?
> >
> > The ability to overload any function, without having to track down all the
> > places it's already been imported or otherwise saved, and change them to
> > point to a new function or a non-function object.
>
>Frankly, I'm not sure this is worth all the proposed contortions. I'd
>be happy (especially as long as this is a pure-Python thing) to have
>to flag the base implementation explicitly with a decorator to make it
>overloadable. That seems KISS to me.

I can see that perspective; in fact my earlier libraries didn't have this 
feature.  But later I realized that making only specially-marked functions 
amenable to overloading was rather like having classes that had to be 
specially marked in order to enable others to subclass them.

It would mean that either you would obsessively mark every class in order 
to make sure that you or others would be able to extend it later, or you 
would have to sit and think on whether a given class would be meaningful 
for other users to subclass, since they wouldn't be able to change the 
status of a class without changing your source code.  Either way, after 
using classes for a bit, it would make you wonder why classes shouldn't 
just be subclassable by default, to save all the effort and/or worry.

Of course, I also understand that you aren't likely to consider overloads 
to be so ubiquitous as subclassing; however, in languages where merely 
static overloading exists, it tends to be used just that ubiquitously.  And 
even C++, which requires you to declare subclass-overrideable methods 
as  "virtual", does not require you to specifically declare which names 
will have overloads!

But all that having been said, it appears that all of the current major 
Python implementations (CPython, Jython, IronPython, and PyPy) do in fact 
support assigning to func_code as long as the assigned value comes from 
another valid function object.  So at the moment it certainly seems 
practical (if perhaps not pure!) to make use of this.

Unless, of course, your intention is to make functions immutable in 
3.x.  But that would seem to put a damper on e.g. your recent "xreload" 
module, which makes use of __code__ assignment for exactly the purpose of 
redefining a function in-place.


From lists at cheimes.de  Wed May  2 02:32:35 2007
From: lists at cheimes.de (Christian Heimes)
Date: Wed, 02 May 2007 02:32:35 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <de9ae4950705011555s12d77147j4851fcab723c40d1@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<de9ae4950705011555s12d77147j4851fcab723c40d1@mail.gmail.com>
Message-ID: <f18m75$ar7$1@sea.gmane.org>

Neville Grech Neville Grech wrote:
> This reminds me a lot of haskell/prolog's head/tail list splitting. Looks
> like a good feature.

Agreed!
> a*=range(5)
> hmmn maybe in such a case, whenever there is the * operator, the resulting
> item is always a list/tuple, like the following:
> a=[[0,1,2,3,4]] ?

Did you mean *a = range(5)?
The result is too surprising for me. I would suspect that *a = range(5) 
has the same output as a = range(5).

 >>> *b = (1, 2, 3)
 >>> b
(1, 2, 3)

 >>> a, *b = (1, 2, 3)
 >>> a, b
1, (2, 3)

 >>> *b, c = (1, 2, 3)
 >>> b, c
(1, 2), 3


 >>> a, *b, c = (1, 2, 3)
 >>> a, b, c
1, (2,), 3

But what would happen when the right side is too small?
 >>> a, *b, c = (1, 2)
 >>> a, b, c
1, (), 2

or should it raise an unpack exception?

This should definitely raise an exception
 >>> a, *b, c, d = (1, 2)

Christian


From tjreedy at udel.edu  Wed May  2 02:48:54 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 1 May 2007 20:48:54 -0400
Subject: [Python-3000] BList PEP
References: <eae285400705010000l2af0e890ifc8c2e0de8219961@mail.gmail.com>
Message-ID: <f18n5k$esa$1@sea.gmane.org>


"Daniel Stutzbach" <daniel at stutzbachenterprises.com> wrote in message 
news:eae285400705010000l2af0e890ifc8c2e0de8219961 at mail.gmail.com...
| Sort      O(n log n)                           O(n log n)

Tim Peters' list.sort is, I believe, better than nlogn for a number of 
practically important special cases.  I believe he documented this in the 
code comments.  Can you duplicate this with your structure?

tjr







From guido at python.org  Wed May  2 02:52:16 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 17:52:16 -0700
Subject: [Python-3000] PEP Parade
In-Reply-To: <5.1.1.6.0.20070501195640.02e20ca0@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501195640.02e20ca0@sparrow.telecommunity.com>
Message-ID: <ca471dc20705011752q36eecbdbu6dff7d2a2eb85b72@mail.gmail.com>

On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 02:04 PM 5/1/2007 -0700, Guido van Rossum wrote:
> >On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > At 12:14 PM 5/1/2007 -0700, Guido van Rossum wrote:
> > > >Suppose you couldn't assign to __class__ of a function (that's too
> > > >messy to deal with in CPython) and you couldn't assign to its __code__
> > > >either. What proposed functionality would you lose?
> > >
> > > The ability to overload any function, without having to track down all the
> > > places it's already been imported or otherwise saved, and change them to
> > > point to a new function or a non-function object.
> >
> >Frankly, I'm not sure this is worth all the proposed contortions. I'd
> >be happy (especially as long as this is a pure-Python thing) to have
> >to flag the base implementation explicitly with a decorator to make it
> >overloadable. That seems KISS to me.
>
> I can see that perspective; in fact my earlier libraries didn't have this
> feature.  But later I realized that making only specially-marked functions
> amenable to overloading was rather like having classes that had to be
> specially marked in order to enable others to subclass them.

I admit I'm new to this game -- but most 3.0 users will be too. But I
would be rather fearful of someone else stomping on a function I
defined (and which I may be calling myself!) without my knowing it.
ISTM (again admittedly from the fairly inexperienced perspective) that
most functions and methods just *aren't* going to be useful as generic
functions. The most likely initial use cases are situations where
people sit down and specifically design an extensible framework with
some seedling GFs and instructions for extending them.

> It would mean that either you would obsessively mark every class in order
> to make sure that you or others would be able to extend it later, or you
> would have to sit and think on whether a given class would be meaningful
> for other users to subclass, since they wouldn't be able to change the
> status of a class without changing your source code.  Either way, after
> using classes for a bit, it would make you wonder why classes shouldn't
> just be subclassable by default, to save all the effort and/or worry.

Looking at it a different way, you *do* have to mark APIs as
subclassable explicitly -- using the "class" syntax. You can leave that
out, and then you end up with a bunch of functions in a module. Every
time I write some code I make a conscious decision whether to do it as
a class or as a method -- I don't create classes for everything by
default.

> Of course, I also understand that you aren't likely to consider overloads
> to be so ubiquitous as subclassing; however, in languages where merely
> static overloading exists, it tends to be used just that ubiquitously.  And
> even C++, which requires you to declare subclass-overrideable methods
> as  "virtual", does not require you to specifically declare which names
> will have overloads!

So this example can be interpreted both ways -- sometimes you have to
declare an anticipated use, sometimes you don't. It still hasn't
convinced me that it's such a burden to have to declare GFs. I rather
like the idea that it warns readers who are new to GFs and more
familiar with how functions behave in Python 2. I can guarantee that
very few people are aware of being able to assign to func_code (hey,
*I* had to look it up! :-).

> But all that having been said, it appears that all of the current major
> Python implementations (CPython, Jython, IronPython, and PyPy) do in fact
> support assigning to func_code as long as the assigned value comes from
> another valid function object.  So at the moment it certainly seems
> practical (if perhaps not pure!) to make use of this.

I see your PBP and I raise you an EIBTI. :-)

> Unless, of course, your intention is to make functions immutable in
> 3.x.  But that would seem to put a damper on e.g. your recent "xreload"
> module, which makes use of __code__ assignment for exactly the purpose of
> redefining a function in-place.

No plans in that direction. Just general discomfort with depending on
the feature. Also noting that __code__ is an implementation detail --
it doesn't exist for other callables such as built-in functions.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Wed May  2 03:40:41 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 21:40:41 -0400
Subject: [Python-3000] PEP Parade
In-Reply-To: <ca471dc20705011752q36eecbdbu6dff7d2a2eb85b72@mail.gmail.com>
References: <5.1.1.6.0.20070501195640.02e20ca0@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501195640.02e20ca0@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501212508.02f11d30@sparrow.telecommunity.com>

At 05:52 PM 5/1/2007 -0700, Guido van Rossum wrote:
>I rather like the idea that it warns readers who are new to GFs and more 
>familiar with how functions behave in Python 2.

Until somebody adds an overload, it *does* behave the same; that was sort 
of the point.  :)

>Also noting that __code__ is an implementation detail --
>it doesn't exist for other callables such as built-in functions.

Fair enough, although the PEP doesn't propose to allow extending built-in 
functions, only Python ones.

>I would be rather fearful of someone else stomping on a function I defined 
>(and which I may be calling myself!) without my knowing it.

All they can do is add special cases or wrappers to it; which is not quite 
the same thing.  It's actually *safer* than monkeypatching, as you don't 
have to go out of your way to save the original version of the function, 
your method is only called when its condition applies, etc.  For simple 
callbacks using before/after methods they needn't even remember to *call* 
the old function.

However, since your objections are more in the nature of general unease 
than arguments against, it probably doesn't make sense for me to continue 
quibbling with them point by point, and instead focus on how to move forward.

If you would like to require that the stdlib module use some sort of 
decorator (@overloadable, perhaps?) to explicitly mark a function as 
generic, that's probably fine, because the way it will work internally is 
that all the overloads still have to pass through a generic 
function...  which I can then easily add an overload to in a separate 
library, which will then allow direct modification of existing functions, 
without needing a decorator.  That way, we're both happy, and maybe by 3.1 
you'll be comfortable with dropping the extra decorator.  :)
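
For concreteness, a toy sketch of what such a marker could look like,
dispatching on the type of the first argument (hypothetical names; this is
neither the PEP 3124 API nor how peak.rules is implemented):

def overloadable(default):
    # Toy marker: dispatch on the type of the first positional argument.
    registry = {}
    def dispatcher(*args, **kw):
        for cls in type(args[0]).__mro__:
            if cls in registry:
                return registry[cls](*args, **kw)
        return default(*args, **kw)
    def overload(cls):
        def register(func):
            registry[cls] = func
            return func
        return register
    dispatcher.overload = overload
    dispatcher.__name__ = default.__name__
    return dispatcher

@overloadable
def describe(obj):
    return "some object"

@describe.overload(int)
def describe_int(obj):
    return "the integer %d" % obj

print(describe(3), describe("x"))   # the integer 3 some object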

One possible issue, however, with this approach, is pydoc.  In all three of 
my existing generic function libraries, I use function objects rather than 
custom objects, for the simple reason that pydoc won't document the 
signatures of anything else.  On the other hand, I suppose there's no 
reason that the "make this overloadable" decorator couldn't just create 
another function object via compile or exec, whose implementation is fixed 
at creation time to do whatever lookup is required.


From cvrebert at gmail.com  Wed May  2 03:51:24 2007
From: cvrebert at gmail.com (Chris Rebert)
Date: Tue, 1 May 2007 18:51:24 -0700
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <f18m75$ar7$1@sea.gmane.org>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<de9ae4950705011555s12d77147j4851fcab723c40d1@mail.gmail.com>
	<f18m75$ar7$1@sea.gmane.org>
Message-ID: <47c890dc0705011851o273d1f16x5c4d9363c5e62822@mail.gmail.com>

In the interest of furthering the discussion, here are two past
threads on similar suggestions:

[Python-Dev] Half-baked proposal: * (and **?) in assignments
http://mail.python.org/pipermail/python-dev/2002-November/030349.html

[Python-ideas] Javascript Destructuring Assignment
http://mail.python.org/pipermail/python-ideas/2007-March/000284.html

- Chris Rebert

On 5/1/07, Christian Heimes <lists at cheimes.de> wrote:
> Neville Grech Neville Grech wrote:
> > This reminds me a lot of haskell/prolog's head/tail list splitting. Looks
> > like a good feature.
>
> Agreed!
> > a*=range(5)
> > hmmn maybe in such a case, whenever there is the * operator, the resulting
> > item is always a list/tuple, like the following:
> > a=[[0,1,2,3,4]] ?
>
> Did you mean *a = range(5)?
> The result is too surprising for me. I would suspect that *a = range(5)
> has the same output as a = range(5).
>
>  >>> *b = (1, 2, 3)
>  >>> b
> (1, 2, 3)
>
>  >>> a, *b = (1, 2, 3)
>  >>> a, b
> 1, (2, 3)
>
>  >>> *b, c = (1, 2, 3)
>  >>> b, c
> (1, 2), 3
>
>
>  >>> a, *b, c = (1, 2, 3)
>  >>> a, b, c
> 1, (2,), 3
>
> But what would happen when the right side is too small?
>  >>> a, *b, c = (1, 2)
>  >>> a, b, c
> 1, (), 2
>
> or should it raise an unpack exception?
>
> This should definitely raise an exception
>  >>> a, *b, c, d = (1, 2)
>
> Christian
>
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/cvrebert%40gmail.com
>

From brett at python.org  Wed May  2 04:02:11 2007
From: brett at python.org (Brett Cannon)
Date: Tue, 1 May 2007 19:02:11 -0700
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
Message-ID: <bbaeab100705011902r5598ec44k9c170c1314b6b981@mail.gmail.com>

On 5/1/07, Guido van Rossum <guido at python.org> wrote:
>
> On 5/1/07, Georg Brandl <g.brandl at gmx.net> wrote:
> > This is a bit late, but it was in my queue by April 30, I swear! ;)
>
> Accepted.
>
> > Comments are appreciated, especially some phrasing sounds very clumsy
> > to me, but I couldn't find a better one.
> >
> > Georg
> >
> >
> > PEP: 3132
> > Title: Extended Iterable Unpacking
> > Version: $Revision$
> > Last-Modified: $Date$
> > Author: Georg Brandl <georg at python.org>
> > Status: Draft
> > Type: Standards Track
> > Content-Type: text/x-rst
> > Created: 30-Apr-2007
> > Python-Version: 3.0
> > Post-History:
> >
> >
> > Abstract
> > ========
> >
> > This PEP proposes a change to iterable unpacking syntax, allowing to
> > specify a "catch-all" name which will be assigned a list of all items
> > not assigned to a "regular" name.
> >
> > An example says more than a thousand words::
> >
> >     >>> a, *b, c = range(5)
> >     >>> a
> >     0
> >     >>> c
> >     4
> >     >>> b
> >     [1, 2, 3]
>
> Has it been pointed out to you already that this particular example is
> hard to implement if the RHS is an iterator whose length is not known
> a priori? The implementation would have to be quite hairy -- it would
> have to assign everything to the list b until the iterator is
> exhausted, and then pop a value from the end of the list and assign it
> to c. it would be much easier if *b was only allowed at the end. (It
> would be even worse if b were assigned a tuple instead of a list, as
> per your open issues.)



If a clean implementation solution cannot be found then I say go with the
last-item-only restriction.  You still get the nice functional language
feature of car/cdr (or x:xs if you prefer ML or Haskell) without the
implementation headache.  I mean how often do you want the head and tail
with everything in between left together?  If I needed that kind of sequence
control I would feed the iterator to a list comp and get to the items that
way.

> Also, what should this do? Perhaps the grammar could disallow it?
>
> *a = range(5)



I say disallow it.  That is ambiguous as to what your intentions are even if
you know what '*' does for multiple assignment.

-Brett

From talin at acm.org  Wed May  2 04:21:28 2007
From: talin at acm.org (Talin)
Date: Tue, 01 May 2007 19:21:28 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
Message-ID: <4637F5A8.5000009@acm.org>

Phillip J. Eby wrote:
> At 09:13 AM 5/1/2007 -0700, Talin wrote:
>> Phillip J. Eby wrote:
>>> Proceeding to the "Next" Method
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> If the first parameter of an overloaded function is named
>>> ``__proceed__``, it will be passed a callable representing the next
>>> most-specific method.  For example, this code::
>>>      def foo(bar:object, baz:object):
>>>          print "got objects!"
>>>      @overload
>>>      def foo(__proceed__, bar:int, baz:int):
>>>          print "got integers!"
>>>          return __proceed__(bar, baz)
>>
>> I don't care for the idea of testing against a specially named 
>> argument. Why couldn't you just have a different decorator, such as 
>> "overload_chained" which triggers this behavior?
> 
> The PEP lists *five* built-in decorators, all of which support this 
> behavior::
> 
>    @overload, @when, @before, @after, @around
> 
> And in addition, it demonstrates how to create *new* method combination 
> decorators, that *also* support this behavior (e.g. '@discount').
> 
> All in all, there are an unbounded number of possible decorators that 
> would require chained and non-chained variations.

Well, I suppose you could make "chained" a modifier of the decorator, so 
for example @operator.chained, @discount.chained, and so on. In other 
words, the decorator can be called directly, or the attribute 'chained' 
also produces a callable that causes the modified behavior. Moreover, 
this would support an arbitrary number of modifiers on the decorator, 
such as @overload.chained.strict(True).whatever.

-- Talin

From greg.ewing at canterbury.ac.nz  Wed May  2 04:23:02 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 02 May 2007 14:23:02 +1200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <4637631A.6030702@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de> <f178kn$s0n$2@sea.gmane.org>
	<4637631A.6030702@v.loewis.de>
Message-ID: <4637F606.4060707@canterbury.ac.nz>

Martin v. Löwis wrote:

> http://mail.python.org/pipermail/python-3000/2006-April/001526.html
> 
> where Guido states that he trusts me that it can be made to work,
> and that "eventually" it needs to be supported.

He says "the tools aren't ready yet", which I take to
mean that Python won't need to support it until all
widely-used editors, email and news software, etc, etc,
reliably support displaying and editing of all
unicode characters. We're clearly a long way from
that situation.

--
Greg

From guido at python.org  Wed May  2 04:37:55 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 19:37:55 -0700
Subject: [Python-3000] PEP Parade
In-Reply-To: <5.1.1.6.0.20070501212508.02f11d30@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501195640.02e20ca0@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501212508.02f11d30@sparrow.telecommunity.com>
Message-ID: <ca471dc20705011937n3c53517r807a1f572d944cbe@mail.gmail.com>

On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> However, since your objections are more in the nature of general unease
> than arguments against, it probably doesn't make sense for me to continue
> quibbling with them point by point, and instead focus on how to move forward.

Thanks for indulging my insecurities.

> If you would like to require that the stdlib module use some sort of
> decorator (@overloadable, perhaps?) to explicitly mark a function as
> generic, that's probably fine, because the way it will work internally is
> that all the overloads still have to pass through a generic
> function...  which I can then easily add an overload to in a separate
> library, which will then allow direct modification of existing functions,
> without needing a decorator.  That way, we're both happy, and maybe by 3.1
> you'll be comfortable with dropping the extra decorator.  :)

I'll take my cue from the users.

> One possible issue, however, with this approach, is pydoc.  In all three of
> my existing generic function libraries, I use function objects rather than
> custom objects, for the simple reason that pydoc won't document the
> signatures of anything else.  On the other hand, I suppose there's no
> reason that the "make this overloadable" decorator couldn't just create
> another function object via compile or exec, whose implementation is fixed
> at creation time to do whatever lookup is required.

That's one solution. Another solution would be to use GFs in Pydoc to
make it overloadable; I'd say pydoc could use a bit of an overhaul at
this point.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed May  2 04:39:45 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 1 May 2007 19:39:45 -0700
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <bbaeab100705011902r5598ec44k9c170c1314b6b981@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<bbaeab100705011902r5598ec44k9c170c1314b6b981@mail.gmail.com>
Message-ID: <ca471dc20705011939m7953fa4fif3c4e2361d7d6aa8@mail.gmail.com>

On 5/1/07, Brett Cannon <brett at python.org> wrote:
> > Also, what should this do? Perhaps the grammar could disallow it?
> >
> > *a = range(5)
>
> I say disallow it.  That is ambiguous as to what your intentions are even if
> you know what '*' does for multiple assignment.

My real point was that the PEP lacks precision here. It should list
the exact proposed changes to Grammar/Grammar.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Wed May  2 04:38:57 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 02 May 2007 14:38:57 +1200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46376752.2070007@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de>
	<43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com>
	<46376752.2070007@v.loewis.de>
Message-ID: <4637F9C1.2060303@canterbury.ac.nz>

Martin v. Löwis wrote:

> I still don't understand why the "no operation" statement is called
> "pass" - it's not the opposite of "fail", and seems to have no
> relationship to "can you pass me the butter, please?".

It's "pass" as in "pass through", i.e. move on to the next
statement without stopping to do anything.

Also there's an idiom that you hear in a setting such as
a quiz show, where a contestant will say "Pass", meaning
"I don't know the answer to that, give me the next
question."

--
Greg

From jason.orendorff at gmail.com  Wed May  2 04:43:22 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 1 May 2007 22:43:22 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070501143840.02ca7760@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<46376729.9000008@acm.org>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501143840.02ca7760@sparrow.telecommunity.com>
Message-ID: <bb8868b90705011943s7de5f0b8j42eadcfebfaf6f5e@mail.gmail.com>

On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 02:22 PM 5/1/2007 -0400, Jason Orendorff wrote:
> >I think I would prefer to *always* pass the next method
> >to @around methods, which always need it, and *never*
> >pass it to any of the others.  What use case am I missing?
>
> Calling the next method in a generic function is equivalent to calling
> super() in a normal method.  Anytime you want to add more specific behavior
> for a type, while reusing the more general behavior, you're going to need
> it.  [...]

Oh, I see.  I thought @before, @after, and @around should cover all
the use cases.  But timing is not the only difference between them.
It also affects how you affect other people's advice and how later,
more specific advice will affect you.  In short, you have to ask
yourself: am I hooking something (before/after), implementing it
(when), or just generally looking for trouble (around)?

I haven't used CLOS or Aspect-J, but I have played Magic: the
Gathering, which judging by these examples is largely the same
thing.  Incidentally, Magic gets by with just @around (which they
spell "instead of") and @after (which they spell "when").

Come to think of it, Inform 7 is the other system I know of that has
an advice system like this.  Now I'm suspicious.  Are you trying to
turn Python into some kind of game?

I forgot to say earlier:  Thanks very much for writing this PEP.
This should be interesting.

> The other possibility would be to clone the functions using copied
> func_globals (__globals__?) so that 'next_method' in those namespaces would
> point to the right next method.  But then, if the function *writes* any
> globals, it'll be updating the wrong namespace.  Do you have any other ideas?

Here's what I've got left.  Take your pick:

  @when(bisect.bisect, withNextMethod=True)
  def bisect_bee(nextMethod, seq : Sequence, eric : Bee, *options):
      ...

..in which case @override would be left out in the cold, but I'm okay
with that.  Or else:

  @override
  @withNextMethod
  def bisect(nextMethod, ...): ...

Your idea of using the argument annotation was fine, too.  Any
of these three is better than detecting the argument name.

-j

From pje at telecommunity.com  Wed May  2 04:47:07 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 22:47:07 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <4637F5A8.5000009@acm.org>
References: <5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501223519.029ca628@sparrow.telecommunity.com>

At 07:21 PM 5/1/2007 -0700, Talin wrote:
>Well, I suppose you could make "chained" a modifier of the decorator, so
>for example @operator.chained, @discount.chained, and so on. In other
>words, the decorator can be called directly, or the attribute 'chained'
>also produces a callable that causes the modified behavior.

Well, that certainly seems like enough of an option to list as an 
alternative in the PEP, but I personally think it increases implementation 
complexity, compared to the way things work now.  One reason for that is 
that right now the decorators don't actually *do* much of anything, as per 
this excerpt from peak.rules.core::

         def decorate(f, pred=()):
             rules = rules_for(f)
             def callback(frame, name, func, old_locals):
                 rule = parse_rule(
                     rules, func, pred, maker, frame.f_locals, frame.f_globals
                 )
                 rules.add(rule)
                 if old_locals.get(name) in (f, rules):
                     return f    # prevent overwriting if name is the same
                 return func
             return decorate_assignment(callback)

The above is the function used for *all* of the decorators proposed in the 
PEP, except for @overload.  The only bit that differs between them is the 
``maker``, which is a classmethod of the corresponding Method class (e.g. 
Method, Before, Around, etc.).  The maker is used to create action 
instances, which are then combined into chains using 
combine_actions().  The signature stuff for __proceed__ (which is actually 
called 'next_method' in peak.rules) is done inside the ``maker`` and the 
action instance itself, not in the decorator.

So, it would require a fair amount of refactoring and additional complexity 
to do it the way you suggest.  It's intriguing, but I'm not sure it's a big 
win compared to e.g. "def foo(next:next_method, ...)".  I *could* see 
allowing the next_method to be in a different position, since the partial() 
can still be precomputed, and bound methods could still be used in the case 
where it was in the first position.


>  Moreover, this would support an arbitrary number of modifiers on the 
> decorator, such as @overload.chained.strict(True).whatever.

Actually, I don't think Python's grammar allows you to do that.  IIRC, 
decorators have to be a dotted name followed by an optional (arglist).  So 
the '.whatever' part wouldn't be legal.


From talin at acm.org  Wed May  2 04:50:11 2007
From: talin at acm.org (Talin)
Date: Tue, 01 May 2007 19:50:11 -0700
Subject: [Python-3000] PEP Parade
In-Reply-To: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
Message-ID: <4637FC63.7070301@acm.org>

Guido van Rossum wrote:
>  S  3125  Remove Backslash Continuation                Jewett
> 
> Sounds reasonable. I think we should still support \ inside string
> literals though; the PEP isn't clear on this. I hope this falls within
> the scope of the refactoring tool (sandbox/2to3).

I'm a strong -1 on this one BTW. I really dislike the idea of having to 
add spurious parentheses or other grouping operators in order to force 
line continuation. It requires the Python programmer to replace an ugly
lexical-level hack with an ugly and cluttered parsing-level hack.
Readability suffers as a consequence. In general, parens or grouping 
operators should only be used when they *mean* something, not merely as 
a hint to the parser as to how to parse something.

-- Talin

From greg.ewing at canterbury.ac.nz  Wed May  2 04:50:12 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 02 May 2007 14:50:12 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
Message-ID: <4637FC64.4050701@canterbury.ac.nz>

Phillip J. Eby wrote:
> The PEP lists *five* built-in decorators, all of which support this behavior::
> 
>     @overload, @when, @before, @after, @around

This seems massively over-designed. All you need is the
ability to call the next method, and you can get all of
these behaviours. If you call it first, then you get
after behaviour; if you call it last, you get before
behaviour; etc.
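
A toy illustration of that point (the explicit 'proceed' argument stands
in for whatever next-method mechanism ends up being used):

def save(obj):                    # the next (more general) method
    print("saving", obj)

def logged_save(proceed, obj):
    print("about to save", obj)   # runs "before"
    result = proceed(obj)         # call the next method
    print("saved", obj)           # runs "after"
    return result

logged_save(save, "report.txt")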

--
Greg

From talin at acm.org  Wed May  2 05:10:53 2007
From: talin at acm.org (Talin)
Date: Tue, 01 May 2007 20:10:53 -0700
Subject: [Python-3000] Some canonical use-cases for ABCs/Interfaces/Generics
Message-ID: <4638013D.8090902@acm.org>

One of my concerns in the ABC/interface discussion so far is that a lot 
of the use cases presented are "toy" examples. This makes perfect sense 
considering that you don't want to have to spend several pages 
explaining the use case. But at the same time, it means that we might be 
solving problems that aren't real, while ignoring problems that are.

What I'd like to do is collect a set of "real-world" use cases and 
document them. The idea would be that we could refer to these use cases 
during the discussion, using a common terminology and shorthand examples.

I'll present one very broad use case here, and I'd be interested if 
people have ideas for other use cases. The goal is to define a small 
number of broadly-defined cases that provide broad coverage of the 
problem space.

====

The use case I will describe is what I will call "Object Graph 
Transformation". The general pattern is that you have a collection of 
objects organized in a graph which you wish to transform. The objects in 
the graph may be standard Python built-in types (lists, tuples, dicts, 
numbers), or they may be specialized application-specific types.

The Python "pickle" operation is an example of this type of 
transformation: Converting a graph of objects into a flat stream format 
that can later be reconstituted back into a graph.

Other kinds of transformations would include:

   Serialization: pickling, marshaling, conversion to XML or JSON, ORMs 
and other persistence frameworks, migration of objects between runtime 
environments or languages, etc.

   Presentation: Conversion of a graph of objects to a visible form, 
such as a web page.

   Interactive Editing: The graph is converted to a user editable form, 
a la JavaBeans. An example is a user-interface editor application which 
allows widgets to be edited via a property sheet. The object graph is 
displayed in a live "preview" window, while a "tree view" of object 
properties is shown in a side panel. The transformation occurs when the 
objects in the graph are transformed into a hierarchy of key/value 
properties that are displayed in the tree view window.

These various cases may seem different but they all have a similar 
structure in terms of the roles of the participants involved. For a 
given transformation, there are 4 roles involved:

   1) The author of the objects to be transformed.
   2) The author of the generic transform function, such as "serialize".
   3) The author of the special transform function for each specific class.
   4) The person invoking the transform operation within the application.

We can give names to these various bits of code if we wish, such as the 
"Operand", the "General Operator", the "Special Operator", and the 
"Invocation". But for now, I'll simply refer to them by number.

Using the terminology of generic functions, (1) is the author of the 
argument that is passed to the generic function, (2) is the author of 
the original "generic" function, (3) is the author of the overloads of 
the generic function, and (4) is the person calling the generic function.

Each of these authors may have participated at different times and may 
be unaware of each other's work. The only dependencies are that (3) must 
know about (1) and (2), and (4) must know about (2).

Note that if any of these participants *do* have prior knowledge of the 
others, then the need for a generic adaption framework is considerably 
weakened. So for example, if (2) already knows all of the types of 
objects that are going to be operated on, then it can simply hard-code 
that knowledge into its own implementation. Similarly, if (1) knows that 
it is going to be operated on in this way, then it can simply add a 
method to do that operation. It's only when the system needs to be N-way 
extensible, where N is the number of participants, that a more general 
dispatch solution is required.

A real-world example of this use case is the TurboGears/TurboJSON 
conversion of Python objects into JSON format, which currently uses 
RuleDispatch to do the heavy lifting.

    @jsonify.when(Person)
    def jsonify_person(obj):
       # Code to convert a Person object to a dict of properties
       # which can be serialized as JSON, e.g. (attribute names
       # here are purely illustrative):
       return {'name': obj.name, 'email': obj.email}

In this example, the "Person" need never know anything about JSON 
formatting, and conversely the JSON serialization framework need know 
nothing about Person objects. Instead, this little adaptor function is 
the glue that ties them together.

This also means that built-in types can be serialized under the new 
system without having to modify them. Otherwise, you would either have 
to build into the serializer special-case knowledge of these types, or 
you would have to restrict your object graph to using only special 
application-specific container and value types. Thus, a list of Person 
objects can be a plain list, but can still be serialized using the same 
persistence framework as is used for the Person object.

===

OK, that is the description of the use case. I'd be interested to know 
what use cases people have that fall *outside* of the above.

-- Talin

From daniel at stutzbachenterprises.com  Wed May  2 05:17:04 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Tue, 1 May 2007 22:17:04 -0500
Subject: [Python-3000] BList PEP
In-Reply-To: <f18n5k$esa$1@sea.gmane.org>
References: <eae285400705010000l2af0e890ifc8c2e0de8219961@mail.gmail.com>
	<f18n5k$esa$1@sea.gmane.org>
Message-ID: <eae285400705012017s513023f4u1bc443f6fe702826@mail.gmail.com>

On 5/1/07, Terry Reedy <tjreedy at udel.edu> wrote:
> "Daniel Stutzbach" <daniel at stutzbachenterprises.com> wrote in message
> news:eae285400705010000l2af0e890ifc8c2e0de8219961 at mail.gmail.com...
> | Sort      O(n log n)                           O(n log n)
>
> Tim Peters' list.sort is, I believe, better than nlogn for a number of
> practically important special cases.  I believe he documented this in the
> code comments.  Can you duplicate this with your structure?

The table in the PEP lists worst-case execution times.  I'll make that
explicit in the next revision.  You are correct that TimSort is O(n)
for nearly-sorted lists.  It's possible to implement TimSort over the
BList, but I have not yet done so.

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

From talin at acm.org  Wed May  2 05:38:31 2007
From: talin at acm.org (Talin)
Date: Tue, 01 May 2007 20:38:31 -0700
Subject: [Python-3000] Another way to understand static metaprogramming in
	functional languages
Message-ID: <463807B7.2000308@acm.org>

Guido was complaining to me today, something along the lines that every 
time someone presents him with an example of Haskell code, his eyes 
start glazing over. I have pretty much the same problem, even though 
I've actually taken the time to read a little bit about Haskell.

If you are someone who is interested in how the magic of static 
metaprogramming in functional languages can work, but find Haskell code 
hard to read, then I strongly recommend Graydon Hoare's "One Day 
Compilers" presentation:

http://www.venge.net/graydon/talks/mkc/html/mgp00001.html

This is a slideshow that shows how to build a simple compiler in one day 
in OCAML. The slideshow is easy to read, and covers a brief introduction 
to OCAML (which is similar to Haskell in spirit) as well as details of 
how to construct the compiler. There's lots of stuff in there on how to 
use manipulation of types to get things done fast.

-- Talin

From tjreedy at udel.edu  Wed May  2 05:42:19 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 1 May 2007 23:42:19 -0400
Subject: [Python-3000] Derivation of "pass" in Python (was Re: PEP:
	Supporting Non-ASCII Identifiers)
References: <43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com><46371BD2.7050303@v.loewis.de><43aa6ff70705010844u4a6333f5hf1d4d3a807361ffe@mail.gmail.com><5.1.1.6.0.20070501123738.05263610@sparrow.telecommunity.com>
	<463770D9.3050405@v.loewis.de>
Message-ID: <f191ap$plu$1@sea.gmane.org>


""Martin v. L?wis"" <martin at v.loewis.de> wrote in message 
news:463770D9.3050405 at v.loewis.de...

|> Thus, when someone is offered something, they may say, "I'll pass",
| > meaning they are declining to act.  Ergo, to "pass" in Python is to
| > decline to give up the opportunity to act.

The person being quoted meant "to decline, to give up...".  The missing 
comma inverts the meaning.

|
| Ah, ok. It would then be similar to "Passe!" in German, which is
| used in card games, if you don't play a card, but instead hand
| over to the next player. Even though this is clearly the same
| ancestry, it never occurred to me that the same meaning is also
| present in English (also, "passen" is somewhat oldish now, so
| I don't use it actively myself).

In the card game bridge, for instance, 'pass' is the official word for 'no 
bid'.  Anything else meaning the same thing is illegal.  So 'pass', among 
other things, can either mean 'not fail' or 'fail to act' ;-)

tjr




From pje at telecommunity.com  Wed May  2 05:54:20 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 23:54:20 -0400
Subject: [Python-3000] PEP Parade
In-Reply-To: <ca471dc20705011937n3c53517r807a1f572d944cbe@mail.gmail.com
 >
References: <5.1.1.6.0.20070501212508.02f11d30@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501195640.02e20ca0@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501212508.02f11d30@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501234608.02a589f8@sparrow.telecommunity.com>

At 07:37 PM 5/1/2007 -0700, Guido van Rossum wrote:
>On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > However, since your objections are more in the nature of general unease
> > than arguments against, it probably doesn't make sense for me to continue
> > quibbling with them point by point, and instead focus on how to move 
> forward.
>
>Thanks for indulging my insecurities.

I hope that didn't come across as patronizing; I didn't mean to say that 
your arguments weren't valid, just that it seemed unlikely your position 
would be swayed solely by argument, and that thus it would be better not to 
keep arguing with you about them.


>That's one solution. Another solution would be to use GFs in Pydoc to
>make it overloadable; I'd say pydoc could use a bit of an overhaul at
>this point.

True enough; until you mentioned that, I'd forgotten that a week or two ago 
I got an email from somebody working on the pydoc overhaul who mentioned 
that he had had to work up an ad-hoc generic function implementation for 
just that reason.  :)


From pje at telecommunity.com  Wed May  2 05:56:39 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 01 May 2007 23:56:39 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <bb8868b90705011943s7de5f0b8j42eadcfebfaf6f5e@mail.gmail.co
 m>
References: <5.1.1.6.0.20070501143840.02ca7760@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<46376729.9000008@acm.org>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501143840.02ca7760@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501235431.043df428@sparrow.telecommunity.com>

At 10:43 PM 5/1/2007 -0400, Jason Orendorff wrote:
>In short, you have to ask
>yourself: am I hooking something (before/after), implementing it
>(when), or just generally looking for trouble (around)?

Nice summary!  I'll add something like this to the PEP, although I suppose 
I'll have to make the language a bit more formal.  :)


From pje at telecommunity.com  Wed May  2 06:04:35 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 02 May 2007 00:04:35 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <4637FC64.4050701@canterbury.ac.nz>
References: <5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>

At 02:50 PM 5/2/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
> > The PEP lists *five* built-in decorators, all of which support this 
> behavior::
> >
> >     @overload, @when, @before, @after, @around
>
>This seems massively over-designed. All you need is the
>ability to call the next method, and you can get all of
>these behaviours. If you call it first, then you get
>after behaviour; if you call it last, you get before
>behaviour; etc.

Yep, that was my theory too, until I actually used generic functions.  As 
it happens, it's:

1) a lot more pleasant not to write the extra boilerplate all the time, and

2) having @before or @after tells you right away the intent of the method, 
without having to carefully inspect the body to see when and whether it is 
calling the next method, and whether it is modifying the arguments or 
return values in some way.

In other words, the restricted behavior of @before and @after methods makes 
them easier to write *and* easier to read.

By the way, if you look at the PEP, you'll find motivating examples for 
each of the decorators, as well as an explanation and examples of when and 
how you might want to create even *more* such decorators.

IIRC, CLOS has about *8 more* kinds of method combinators that come 
standard, including ones that we'd probably spell something like @sum, 
@product, @min, @max, @list, @any, and @all, if it weren't for most of 
those names already being builtins that mean something else.  :)  The PEP 
doesn't propose implementing all of those, but it does show how easily you 
can create things like that if you want to.


From pje at telecommunity.com  Wed May  2 06:29:06 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 02 May 2007 00:29:06 -0400
Subject: [Python-3000] Some canonical use-cases for
 ABCs/Interfaces/Generics
In-Reply-To: <4638013D.8090902@acm.org>
Message-ID: <5.1.1.6.0.20070502000659.044c03c8@sparrow.telecommunity.com>

Thanks for writing this!

At 08:10 PM 5/1/2007 -0700, Talin wrote:
>Other kinds of transformations would include:

Compilation is a collection of such transforms over an AST, by the 
way.  Documentation generators are transforms over either an AST 
(source-based doc generators) or a set of modules' contents (e.g. epydoc 
and pydoc).

Zope 3 transforms object trees into views.  It also uses something called 
"event adapters" that are basically a crude sort of before/after/around 
method system.  Twisted and Zope also do a lot of adaptation, which is sort 
of a poor man's generic function combined with namespaces.


>Note that if any of these participants *do* have prior knowledge of the
>others, then the need for a generic adaption framework is considerably
>weakened. So for example, if (2) already knows all of the types of
>objects that are going to be operated on, then it can simply hard-code
>that knowledge into its own implementation.

One of the goals of PEP 3124, btw, is to encourage people to use overloads 
even in the case where they *think* that they know all the types to be 
operated on, because there is always the chance that somebody else will 
come along and want to reuse that code.  Pydoc and epydoc are good examples 
of a situation where author #2 thought they knew all the things to be 
operated on.


>Similarly, if (1) knows that
>it is going to be operated on in this way, then it can simply add a
>method to do that operation. Its only when the system needs to be N-way
>extensible, where N is the number of participants, that a more general
>dispatch solution is required.

This makes it sound more complex than it is; all that is required is for 
one person to try to use two other people's code -- neither of whom 
anticipated the combination.

This situation is very easy to come by, but from your description each of 
persons 1 through 4 might conclude that since they normally work alone, 
this won't affect them.  ;)


>OK that is the description of the use case. I'd be interested to know
>what uses cases people have that fall *outside* of the above.

Well, consider what other things generic functions are used for in ordinary 
Python, like len(), iter(), sum(), all the operator.* functions, etc.  Any 
operation that might be performed on a variety of types is applicable.

Of course, these generic functions built in to Python use __special__ 
methods rather than having a registry; but this is an implementation 
detail.  They are simply generic functions that don't let you register 
methods for built-in types (since they're immutable and you thus can't add 
the __special__ methods).

So, generic functions that allow registration are just a generalization of 
what we already have so that:

1. there are no namespace collisions between authors competing for reserved 
method names
2. dispatch can be on any number of arguments
3. any type can play in any function, even if the type is built-in

Note that if it sounds like I'm saying that all functions are potentially 
generic, it's because they are.  Heck, they *already* are, in Python.  It's 
just that we don't have a uniform way of expressing them, as opposed to an 
ad-hoc assortment of patterns.
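
For concreteness, here is a minimal sketch of such a registration-based
generic function -- single-dispatch only, and nothing like the PEP's actual
implementation, but it shows a built-in type participating via a registry
rather than via __special__ methods:

def generic(default):
    registry = {}
    def dispatch(arg, *rest):
        impl = registry.get(type(arg), default)
        return impl(arg, *rest)
    def when(cls):
        def register(func):
            registry[cls] = func
            return func
        return register
    dispatch.when = when
    return dispatch

@generic
def pretty(obj):
    return repr(obj)                 # fallback for unregistered types

@pretty.when(list)
def pretty_list(obj):
    return "[" + ", ".join(pretty(x) for x in obj) + "]"

print(pretty([1, "two", [3]]))       # built-in list plays along: [1, 'two', [3]]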


From fdrake at acm.org  Wed May  2 06:31:43 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 2 May 2007 00:31:43 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <bb8868b90705011943s7de5f0b8j42eadcfebfaf6f5e@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501143840.02ca7760@sparrow.telecommunity.com>
	<bb8868b90705011943s7de5f0b8j42eadcfebfaf6f5e@mail.gmail.com>
Message-ID: <200705020031.44147.fdrake@acm.org>

On Tuesday 01 May 2007, Jason Orendorff wrote:
 > Come to think of it, Inform 7 is the other system I know of that has
 > an advice system like this.  Now I'm suspicious.  Are you trying to
 > turn Python into some kind of game?

Software is always a game, and I've been beginning to think the spoils of the 
victor always involve large amounts of pain.  For the loser, at least the 
pain ends.

I wonder if today is one of my cynical days?  --sigh--


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From santagada at gmail.com  Wed May  2 06:32:36 2007
From: santagada at gmail.com (Leonardo Santagada)
Date: Wed, 2 May 2007 01:32:36 -0300
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46377B60.1030501@v.loewis.de>
References: <435DF58A933BA74397B42CDEB8145A860B745DEF@ex9.hostedexchange.local>
	<46377B60.1030501@v.loewis.de>
Message-ID: <8C502782-3E6F-4613-BB8D-5BB827FB7238@gmail.com>


I don't know if I can speak on the py3k list, but I would give this a  
-1.

Supporting non-ASCII identifiers doesn't fix the bigger problem. People  
want to write programs in their own language. Not only identifiers,  
but all of the literals in the syntax of Python would be better if  
they could be in the programmer's language, which is what the OLPC folks  
want. I think we should defer this PEP and try to come up with a broader  
solution that can work as a different dialect of Python... something  
using the Python VM but with a completely different parser. Having a  
parser that reads Unicode, as Guido recently suggested, is the first  
step. Then you could have something like the encoding declaration, but  
called language, where you set your dialect of Python (maybe this can be  
set per account in a system), and for the last part you will need some  
files that translate the standard library and any other library so you  
can do stuff like this:

#!/usr/bin/env python
# _*_ encoding: utf-8
# _*_ lang: Portuguese

minha_variável = 2

para contador em faixa(10):
	se contador % minha_variável == 0:
		imprime "oi mundo"


This is useful for teaching programming to really young kids and in  
places where English is really not common...  
but the thing is that just having non-ASCII identifiers is not going  
to solve your problem.

--
Leonardo Santagada
santagada at gmail.com




From krstic at solarsail.hcs.harvard.edu  Wed May  2 06:42:17 2007
From: krstic at solarsail.hcs.harvard.edu (=?UTF-8?B?SXZhbiBLcnN0acSH?=)
Date: Wed, 02 May 2007 00:42:17 -0400
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <8C502782-3E6F-4613-BB8D-5BB827FB7238@gmail.com>
References: <435DF58A933BA74397B42CDEB8145A860B745DEF@ex9.hostedexchange.local>	<46377B60.1030501@v.loewis.de>
	<8C502782-3E6F-4613-BB8D-5BB827FB7238@gmail.com>
Message-ID: <463816A9.8070704@solarsail.hcs.harvard.edu>

Leonardo Santagada wrote:
> but all of the literals on the sintax of python would be better if  
> they can be on the programmers language, as what the guys from OLPC  
> want. 

It's not clear to me that that's what we want, actually. I think Alan
Kay mentioned that they can do this level of i18n with Squeak already,
and that will probably do quite well for the really young kids. For the
rest, I think a single set of language keywords is generally a Very Good
Thing.

Guido and others have justified not wanting to add more syntax-level
metaprogramming abilities to Python by saying that it's important for
all Python code to read as Python code, not as "Python code that might
sometimes mean something entirely different because of macros". Keyword
translation would cause a similarly ugly problem, I suspect.

-- 
Ivan Krstić <krstic at solarsail.hcs.harvard.edu> | GPG: 0x147C722D

From tjreedy at udel.edu  Wed May  2 06:46:50 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 2 May 2007 00:46:50 -0400
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
References: <46371BD2.7050303@v.loewis.de> <f178kn$s0n$2@sea.gmane.org>
Message-ID: <f1953p$3vo$1@sea.gmane.org>


"Giovanni Bajo" <rasky at develer.com> wrote in message 
news:f178kn$s0n$2 at sea.gmane.org...
On 01/05/2007 12.52, Martin v. Löwis wrote:

| Isn't this already blacklisted in PEP 3099?

In today's PEP Parade post, he implies no.

"PEP: Supporting Non-ASCII identifiers (Martin von Loewis)

I'm on record as not liking this; my worry is that it will become a
barrier to the free exchange of code. It's not just languages I can't
read (Russian transliterated to the latin alphabet would be just as
bad and we don't stop that now); many text editors have no or limited
support for other scripts (not to mention mixing right-to-left script
with Python's left-to-right identifiers). But if this receives a lot
of popular support I'm willing to give it a try. The One Laptop Per
Child project for example would like to enable students to code in
their own language (of course they'd rather see the language keywords
and standard library translated too...)."

OLPC, which is one realization of Guido's CP4E dream (computer programming 
for everyone), changes the ball game (to use an American expression).  I 
expect that most anyone with a college education from anywhere in the world 
has been exposed to latin characters and at least a few English words.  But 
the case is different, I think, for elementary kids.

Given that Guido has given the language and implementation freely and for 
free, I think it reasonable that he want to be able to read programs that 
recipients write.  And 'foreign' words are much easier for him and many of 
us to read, match, and differentiate* when transliterated to Latin chars 
than when written in one of the ever proliferating character sets.  (And I 
believe that Unicode is, sadly, encouraging the invention of unneeded new 
sets for obscure languages that would be much better off using one of the 
existing writing systems.)

* To understand a program, one must be able to match all occurrences of the 
same identifier and differentiate different identifiers.

So, Martin, I suggest that you expand your proposal to include a 
transliteration mechanism and limit the allowed characters to those which 
can be transliterated.  I presume that this would be an expanding set.  Once 
a mechanism is in place, people who want 'their' character set included can 
do the work needed for that set.

Terry Jan Reedy








From martin at v.loewis.de  Wed May  2 07:05:21 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 02 May 2007 07:05:21 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <4637F606.4060707@canterbury.ac.nz>
References: <46371BD2.7050303@v.loewis.de>
	<f178kn$s0n$2@sea.gmane.org>	<4637631A.6030702@v.loewis.de>
	<4637F606.4060707@canterbury.ac.nz>
Message-ID: <46381C11.5030104@v.loewis.de>

> He says "the tools aren't ready yet", which I take to
> mean that Python won't need to support it until all
> widely-used editors, email and news software, etc, etc,
> reliably support displaying and editing of all
> unicode characters. We're clearly a long way from
> that situation.

I don't understand that requirement. Clearly, editors have
supported non-ASCII characters for many years (at least
since 1980, maybe longer). Is the complaint that a single
editor does not support all characters? I don't see a need
for that - the editor will present a replacement character.
However, if somebody bothered entering the character in a
source file, there is actually a high chance that an editor
can display it (how else did he enter the character?)

Or is the complaint that editors don't support UTF-8?
That is simply not true anymore. E.g. IDLE has supported
editing UTF-8 for several Python releases now.

Regards,
Martin

From martin at v.loewis.de  Wed May  2 07:10:35 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 02 May 2007 07:10:35 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <f1953p$3vo$1@sea.gmane.org>
References: <46371BD2.7050303@v.loewis.de> <f178kn$s0n$2@sea.gmane.org>
	<f1953p$3vo$1@sea.gmane.org>
Message-ID: <46381D4B.2010802@v.loewis.de>

> So, Martin, I suggest that you expand your proposal to include a 
> transliteration mechanism and limit the allowed characters to those which 
> can be translitered.  I presume that this would be an expanding set.  Once 
> a mechanism is in place, people who want 'their' character set included can 
> do the work needed for that set.

I can certainly add that as a request, but I'm -1 on it. There shouldn't
be two different spellings for the same identifier, plus transliteration
systems often depend on the natural language (e.g. ö is transliterated
as oe in German, but (I believe) just as o in the Scandinavian languages
that have that character).
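
A tiny illustration of that point (the tables below are drastically
simplified, purely to show that one identifier would end up with two
different ASCII spellings):

GERMAN = {"ö": "oe", "ä": "ae", "ü": "ue"}
SWEDISH = {"ö": "o", "ä": "a"}       # grossly simplified, for illustration

def translit(name, table):
    return "".join(table.get(ch, ch) for ch in name)

print(translit("öl", GERMAN))    # oel
print(translit("öl", SWEDISH))   # ol -- a different identifier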

Regards,
Martin

From g.brandl at gmx.net  Wed May  2 09:02:18 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 02 May 2007 09:02:18 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <ca471dc20705011939m7953fa4fif3c4e2361d7d6aa8@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>	<bbaeab100705011902r5598ec44k9c170c1314b6b981@mail.gmail.com>
	<ca471dc20705011939m7953fa4fif3c4e2361d7d6aa8@mail.gmail.com>
Message-ID: <f19d1n$mfm$1@sea.gmane.org>

Guido van Rossum schrieb:
> On 5/1/07, Brett Cannon <brett at python.org> wrote:
>> > Also, what should this do? Perhaps the grammar could disallow it?
>> >
>> > *a = range(5)
>>
>> I say disallow it.  That is ambiguous as to what your intentions are even if
>> you know what '*' does for multiple assignment.
> 
> My real point was that the PEP lacks precision here. It should list
> the exact proposed changes to Grammar/Grammar.

You're right.

I tried to imply this with "A tuple (or list) on the left side of a
simple assignment", but it isn't clear enough.

I'll update the PEP to incorporate the grammar changes today.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From g.brandl at gmx.net  Wed May  2 09:04:29 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 02 May 2007 09:04:29 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
Message-ID: <f19d5r$mfm$2@sea.gmane.org>

Guido van Rossum schrieb:
> On 5/1/07, Georg Brandl <g.brandl at gmx.net> wrote:
>> This is a bit late, but it was in my queue by April 30, I swear! ;)
> 
> Accepted.
> 
>> Comments are appreciated, especially some phrasing sounds very clumsy
>> to me, but I couldn't find a better one.
>>
>> Georg
>>
>>
>> PEP: 3132
>> Title: Extended Iterable Unpacking
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Georg Brandl <georg at python.org>
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 30-Apr-2007
>> Python-Version: 3.0
>> Post-History:
>>
>>
>> Abstract
>> ========
>>
>> This PEP proposes a change to iterable unpacking syntax, allowing to
>> specify a "catch-all" name which will be assigned a list of all items
>> not assigned to a "regular" name.
>>
>> An example says more than a thousand words::
>>
>>     >>> a, *b, c = range(5)
>>     >>> a
>>     0
>>     >>> c
>>     4
>>     >>> b
>>     [1, 2, 3]
> 
> Has it been pointed out to you already that this particular example is
> hard to implement if the RHS is an iterator whose length is not known
> a priori? The implementation would have to be quite hairy -- it would
> have to assign everything to the list b until the iterator is
> exhausted, and then pop a value from the end of the list and assign it
> to c.

Yes, that is correct. My implementation isn't *that* hairy, though, it's
only 13 lines of code more.

I'll post the patch to SourceForge later today.
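
For the curious, here is a rough pure-Python model of the semantics under
discussion (not the actual patch): collect everything into the starred list,
then pop the trailing targets back off its end.

def unpack_star(iterable, n_before, n_after):
    it = iter(iterable)
    before = [next(it) for _ in range(n_before)]
    rest = list(it)                       # exhaust the iterator into a list
    if len(rest) < n_after:
        raise ValueError("need more than %d values to unpack"
                         % (n_before + len(rest)))
    after = rest[len(rest) - n_after:]
    del rest[len(rest) - n_after:]        # what's left is the starred target
    return before, rest, after

print(unpack_star(range(5), 1, 1))        # ([0], [1, 2, 3], [4])
print(unpack_star(iter("abcdef"), 2, 1))  # (['a', 'b'], ['c', 'd', 'e'], ['f'])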

> it would be much easier if *b was only allowed at the end. (It
> would be even worse if b were assigned a tuple instead of a list, as
> per your open issues.)

The created tuple is a fresh one, so can't I just copy pointers like from a
list and set ob_size later?

> Also, what should this do? Perhaps the grammar could disallow it?
> 
> *a = range(5)

I'm not so sure about the grammar, I'm currently catching it in the AST
generation stage.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From greg.ewing at canterbury.ac.nz  Wed May  2 09:15:17 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 02 May 2007 19:15:17 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <bb8868b90705011943s7de5f0b8j42eadcfebfaf6f5e@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<46376729.9000008@acm.org>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501143840.02ca7760@sparrow.telecommunity.com>
	<bb8868b90705011943s7de5f0b8j42eadcfebfaf6f5e@mail.gmail.com>
Message-ID: <46383A85.1020003@canterbury.ac.nz>

Jason Orendorff wrote:
> Now I'm suspicious.  Are you trying to
> turn Python into some kind of game?

You mean it isn't already? I've always felt that
writing Python code is more like fun than work...

--
Greg

From greg.ewing at canterbury.ac.nz  Wed May  2 09:48:23 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 02 May 2007 19:48:23 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
Message-ID: <46384247.8020601@canterbury.ac.nz>

Phillip J. Eby wrote:

> Yep, that was my theory too, until I actually used generic functions.

Is there something about generic functions that makes
them different from methods in this regard? I've used
OO systems which have the equivalent of @before, @after
etc. for overriding methods, and others (including
Python) which don't, and I've never found myself missing
them. So I'm skeptical that they're a must-have feature
for generic functions.

> 1) a lot more pleasant not to write the extra boilerplate all the time,

I'd work on that by finding ways to reduce the boilerplate.
Calling the next method of a generic function shouldn't
be any harder than calling the inherited implementation
of a normal method.

> By the way, if you look at the PEP, you'll find motivating examples for 
> each of the decorators,

There are examples, yes, but they don't come across as
very compelling as to why there should be so many variations
of the overloading decorator rather than a single general
one.

> IIRC, CLOS has about *8 more* kinds of method combinators

CLOS strikes me as being the union of all Lisp dialects that
anyone has ever used, rather than something with a coherent
design behind it. So quoting CLOS is not going to make me
think better of anything.

--
Greg

From eric+python-dev at trueblade.com  Wed May  2 14:26:42 2007
From: eric+python-dev at trueblade.com (Eric V. Smith)
Date: Wed, 02 May 2007 08:26:42 -0400
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46371BD2.7050303@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de>
Message-ID: <46388382.90800@trueblade.com>

Martin v. Löwis wrote:
...
> Specification of Language Changes
> =================================
> 
> The syntax of identifiers in Python will be based on the Unicode
> standard annex UAX-31 [1]_, with elaboration and changes as defined
> below.
> 
> Within the ASCII range (U+0001..U+007F), the valid characters for
> identifiers are the same as in Python 2.5. This specification only
> introduces additional characters from outside the ASCII range. For
> other characters, the classification uses the version of the Unicode
> Character Database as included in the unicodedata module.
> 
> The identifier syntax is <ID_Start> <ID_Continue>\*.
> 
> ID_Start is defined as all characters having one of the general
> categories uppercase letters (Lu), lowercase letters (Ll), titlecase
> letters (Lt), modifier letters (Lm), other letters (Lo), letter
> numbers (Nl), plus the underscore (XXX what are "stability extensions
> listed in UAX 31).
> 
> ID_Continue is defined as all characters in ID_Start, plus nonspacing
> marks (Mn), spacing combining marks (Mc), decimal number (Nd), and
> connector punctuations (Pc).
> 
> All identifiers are converted into the normal form NFC while parsing;
> comparison of identifiers is based on NFC.

Martin:

I don't understand Unicode nearly well enough to really comment on this, 
but could you add a comment that the PEP3101 code might need to be 
adjusted to deal with Unicode identifiers?

I don't actually think your PEP would make any difference to how we're 
parsing, because we don't have a "is this a valid character for an 
identifier" function.  But I'd like to get a note somewhere in the PEP 
saying that all code that parses for identifiers might be impacted.  The 
PEP 3101 code is one place where we have such a parser.  We'd at least 
need to implement tests for Unicode identifiers.
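
(For reference, the NFC behaviour the PEP describes is easy to check at the
string level -- something along these lines could become one of those tests:)

import unicodedata

a = "caf\u00e9"       # 'café' spelled with a precomposed é
b = "cafe\u0301"      # 'café' spelled as 'e' + combining acute accent

print(a == b)                                    # False
print(unicodedata.normalize("NFC", a) ==
      unicodedata.normalize("NFC", b))           # True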

Which reminds me that we need better tests for the existing PEP 3101 
code, especially for strings with surrogate pairs.  I'll look at beefing 
that up.

Thanks.

Eric.


From rrr at ronadam.com  Wed May  2 15:24:58 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 02 May 2007 08:24:58 -0500
Subject: [Python-3000] PEP Parade
In-Reply-To: <5.1.1.6.0.20070501234608.02a589f8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070501212508.02f11d30@sparrow.telecommunity.com>	<5.1.1.6.0.20070501145217.04d293e8@sparrow.telecommunity.com>	<5.1.1.6.0.20070501155833.043589f8@sparrow.telecommunity.com>	<5.1.1.6.0.20070501195640.02e20ca0@sparrow.telecommunity.com>	<5.1.1.6.0.20070501212508.02f11d30@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501234608.02a589f8@sparrow.telecommunity.com>
Message-ID: <4638912A.1060804@ronadam.com>

Phillip J. Eby wrote:
> At 07:37 PM 5/1/2007 -0700, Guido van Rossum wrote:

>> That's one solution. Another solution would be to use GFs in Pydoc to
>> make it overloadable; I'd say pydoc could use a bit of an overhaul at
>> this point.
> 
> True enough; until you mentioned that, I'd forgotten that a week or two ago 
> I got an email from somebody working on the pydoc overhaul who mentioned 
> that he had had to work up an ad-hoc generic function implementation for 
> just that reason.  :)

Ah, That would be me.  :-)


I'm still working on it and hope to have it done before the library 
reorganization.  (Minus Python 3000 enhancements since it needs to work 
with python 2.6)

The resulting (yes, ad-hoc) solution is basically what I needed to do to 
get it to work nicely.

Talin showed me an example of his that used a decorator to initialize a 
dispatch table, which is much easier to maintain than manually editing it 
as I was doing.

Here is an outline of how it generally works.  Maybe you can see where 
proper generic functions might be useful.


INTROSPECTION or (Making a DocInfo Object.)
===========================================

The actual introspection consists of mostly using the inspect module or 
looking directly at either the attributes, files, or file system.  The end 
product is a data structure containing tagged strings that can be parsed 
and formatted at a later stage.

* A DocInfo object is a DocInfo-list of strings, and more DocInfo objects 
in FIFO order, with a tag attribute and a depth-first iterator method. 
Ultimately the contents are strings (or string like objects) that came from 
some input source.

(Note: All or any of this can be used outside of pydoc if it is found to be 
generally useful.)

General use:

     1.  Create an inspection dispatcher.

               select = Dispatcher()   # A dispatcher/dictionary.

     2.  Define introspective functions, and use a decorator to
         add them to the dispatcher.

               @select.add('tag')
               def foo(tag, name, obj):
                  # The tag is added by the dispatcher.
                  items = get_some_info_about_obj()
                  title = DocInfo('title', name)
                  body = DocInfo('list', items)
                  return DocInfo(tag, title, body)

          Do the above for all unique objects.
         (Functions can have more than one tag name.)

     3.  Get input from the help function, interactive help, or web
         server request and create a DocInfo structure.

            def get_info(request):

                # parse if needed. (search, topics, indexes, etc...)

                obj = locate(request)
                tag = describe(obj)   # get a description that matches a tag.
                return select(tag, request, obj)

Some keys are very general such as 'list', 'item', 'name', 'text', and some 
are specific to the source the data came from... 'package', 'class', 
'module', 'function', etc...

If all you want is to send the text out the way it came in... you can use 
simple string functions.

   result = DocInfo(request).format(str)

That will produce one long everythingruntogether output.


   def repr_line(s):
      return repr(s) + '\n'

   result = DocInfo(request).present(repr_line)

This will put each tagged string on its own line with quotes around it. 
Since it's a nested structure the quotes will be nested too.



ADVANCED FORMATTING
===================

Create a DocFormatter object to format a DocInfo data structure with.

     1.  Define a pre_processor function. (optional)

         This alters the data structure.  For example you can
         rearrange, remove, and/or replace parts before any formatting
         has occurred.

     2.  Define a pre_formatter function.  (optional)

         Pre-format input strings at the bottom (depth first) but not
         intermediate result strings.  Any function that takes a string
         and returns a string is fine here, e.g. cgi.escape()

     3.  Define a formatter to format DocInfo list objects according to
         the tags.  (list objects are joined after sub items are formatted.)

	* A function with an if-elif-else structure here is
          perfectly fine, but a dispatcher is better for more complex things. ;-)

         (a)  Create a dispatcher object.

         (b)  Add functions to the dispatcher by using decorators and
              tag names.

         * The tags passed to the functions by the dispatcher contain
         all parent tags prepended to them with '.' separators.  This
         allows you to format based on where something is in addition
         to what it is.


The dispatcher class *is* the formatter function in this case.  The 
__call__ method is used to keep it interchangeable with regular functions. 
  So a method @dispatcher.add(tag1) is used as the decorator to add 
functions to the dispatcher.

     select = Dispatcher()

     @select.add('function', 'method')
     def format_function(tag, info):
        ...
        return formatted_info

Multiple tags allow a single function to perform several roles.

     4.  Create a post_formatter.  (optional)

         An example of this might be to replace place holders with
         dynamic content collected during the formatting process.

     5.  Combine the formatters into a single DocFormatter class.

         Example from the html formatter:

             formatter = DocFormatter(preformat=cgi.escape,
                                      formatter=select,
                                      postformat=page)

The callable, 'select', is the dispatcher, and 'page' is a post-formatter 
that wraps the contents into the final html page with head, navigation bar 
and the body sections.

The DocInfo format method iterates its contents and calls the formatter on 
it.  The formatter determines what action needs to be taken by what it's given 
and whether or not it's the first time or last time it is called.

And so to put it all together...

        result = DocInfo(request).format(formatter)

I currently have three formatters.

    - text/console
    - html
    - xml

It should be very easy to write other formatters by using these as starting 
points.
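
In case it helps picture it, here is a minimal sketch of the kind of
Dispatcher described above (a guess at the shape, not my actual code): a dict
of tag -> function, with a decorator for registration and __call__ for
dispatch.

class Dispatcher(dict):
    def add(self, *tags):
        def register(func):
            for tag in tags:
                self[tag] = func
            return func
        return register

    def __call__(self, tag, *args, **kw):
        # Dotted tags like 'module.function' fall back to their last
        # component if the full tag isn't registered.
        func = self.get(tag) or self[tag.rsplit('.', 1)[-1]]
        return func(tag, *args, **kw)

select = Dispatcher()

@select.add('function', 'method')
def format_function(tag, info):
    return "def %s(...)" % info

print(select('module.function', 'spam'))   # -> def spam(...)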

Cheers,
    Ron



From ark at acm.org  Wed May  2 15:37:37 2007
From: ark at acm.org (Andrew Koenig)
Date: Wed, 2 May 2007 09:37:37 -0400
Subject: [Python-3000] PEP-3125 -- remove backslash continuation
Message-ID: <003601c78cbf$0cacec90$2606c5b0$@org>

Looking at PEP-3125, I see that one of the rejected alternatives is to allow
any unfinished expression to indicate a line continuation.

I would like to suggest a modification to that alternative that has worked
successfully in another programming language, namely Stu Feldman's EFL.  EFL
is a language intended for numerical programming; it compiles into Fortran
with the interesting property that the resulting Fortran code is intended to
be human-readable and maintainable by people who do not happen to have
access to the EFL compiler.

Anyway, the (only) continuation rule in EFL is that if the last token in a
line is one that lexically cannot be the last token in a statement, then the
next line is considered a continuation of the current line.

Python currently has a rule that if parentheses are unbalanced, a newline
does not end the statement.  If we were to translate the EFL rule to Python,
it would be something like this:

	The whitespace that follows an operator or open bracket or parenthesis
	can include newline characters.

Note that if this suggestion were implemented, it would presumably be at a
very low lexical level--even before the decision is made to turn a newline
followed by spaces into an INDENT or DEDENT token.  I think that this
property solves the difficulty-of-parsing problem.  Indeed, I think that
this suggestion would be easier to implement than the current
unbalanced-parentheses rule.

Note also that like the current backslash rule, the space after the newline
would be just space, with no special significance.  So to rewrite the
examples from the PEP:

	"abc" +      # Plus is an operator, so it continues
	    "def"    # The extra spaces before "def" do not constitute an
INDENT

	"abc"        # Line does not end with an operator, so statement ends
	    + "def"  # The newline and spaces constitute an INDENT -- this
is a syntax error

	("abc"       # I have no opinion about keeping the
unbalanced-parentheses rule --
	    + "def") # but I do think that it is harder to parse (and also
harder to read)
	             # than what I am proposing.
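
Here is a toy, purely line-based sketch of the rule (real code would look at
tokens rather than trailing characters, and would have to be careful about
words that merely end in "or", "in", and so on):

OPEN_ENDED = tuple("+-*/%<>=,|&^~([{") + ("and", "or", "not", "in", "is")

def join_continuations(lines):
    out = []
    for line in lines:
        if out and out[-1].rstrip().endswith(OPEN_ENDED):
            # Previous line cannot end a statement: splice this one on.
            out[-1] = out[-1].rstrip() + " " + line.strip()
        else:
            out.append(line.rstrip("\n"))
    return out

src = ['x = "abc" +', '    "def"', 'y = 1']
print(join_continuations(src))   # ['x = "abc" + "def"', 'y = 1']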



From jimjjewett at gmail.com  Wed May  2 15:58:37 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 2 May 2007 09:58:37 -0400
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46381D4B.2010802@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de> <f178kn$s0n$2@sea.gmane.org>
	<f1953p$3vo$1@sea.gmane.org> <46381D4B.2010802@v.loewis.de>
Message-ID: <fb6fbf560705020658g2cef6c9ey483f0dc63f300bf5@mail.gmail.com>

On 5/2/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > So, Martin, I suggest that you expand your proposal to include a
> > transliteration mechanism and limit the allowed characters to those which
> > can be translitered.  I presume that this would be an expanding set.  Once
> > a mechanism is in place, people who want 'their' character set included can
> > do the work needed for that set.

> I can certainly add that as a request, but I'm -1 on it. There shouldn't
> be two different spellings for the same identifier, plus transliteration
> systems often depend on the natural language (e.g. ? is transliterated
> as oe in German, but (I believe) just as o in the Skandinavian languages
> that have that character).

I think this might be a job for the IDE, or at most an import hook.

If you want the German transliteration, then use the German import hook.

The reason it might need to be part of the IDE is to transliterate
back; for example when there are error messages.

Pity there isn't a good way to say

"Stop parsing this source file; feed the rest to XXX and then what it
sends back."

-jJ

From fuzzyman at voidspace.org.uk  Wed May  2 17:42:09 2007
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 02 May 2007 16:42:09 +0100
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
Message-ID: <4638B151.6020901@voidspace.org.uk>

Jim Jewett wrote:
> PEP: 30xz
> Title: Simplified Parsing
> Version: $Revision$
> Last-Modified: $Date$
> Author: Jim J. Jewett <JimJJewett at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/plain
> Created: 29-Apr-2007
> Post-History: 29-Apr-2007
>
>
> Abstract
>
>     Python initially inherited its parsing from C.  While this has
>     been generally useful, there are some remnants which have been
>     less useful for python, and should be eliminated.
>
>     + Implicit String concatenation
>
>     + Line continuation with "\"
>
>     + 034 as an octal number (== decimal 28).  Note that this is
>       listed only for completeness; the decision to raise an
>       Exception for leading zeros has already been made in the
>       context of PEP XXX, about adding a binary literal.
>
>
> Rationale for Removing Implicit String Concatenation
>
>     Implicit String concatenation can lead to confusing, or even
>     silent, errors. [1]
>
>         def f(arg1, arg2=None): pass
>
>         f("abc" "def")  # forgot the comma, no warning ...
>                         # silently becomes f("abcdef", None)
>
>   
Implicit string concatenation is massively useful for creating long 
strings in a readable way though:

    call_something("first part\n"
                           "second line\n"
                            "third line\n")

I find it an elegant way of building strings and would be sad to see it 
go. Adding trailing '+' signs is ugly.

Michael Foord


>     or, using the scons build framework,
>
>         sourceFiles = [
>         'foo.c',
>         'bar.c',
>         #...many lines omitted...
>         'q1000x.c']
>
>     It's a common mistake to leave off a comma, and then scons complains
>     that it can't find 'foo.cbar.c'.  This is pretty bewildering behavior
>     even if you *are* a Python programmer, and not everyone here is.
>
>     Note that in C, the implicit concatenation is more justified; there
>     is no other way to join strings without (at least) a function call.
>
>     In Python, strings are objects which support the __add__ operator;
>     it is possible to write:
>
>         "abc" + "def"
>
>     Because these are literals, this addition can still be optimized
>     away by the compiler.
>
>     Guido indicated [2] that this change should be handled by PEP, because
>     there were a few edge cases with other string operators, such as the %.
>     The resolution is to treat them the same as today.
>
>         ("abc %s def" + "ghi" % var)  # fails like today.
>                                       # raises TypeError because of
>                                       # precedence.  (% before +)
>
>         ("abc" + "def %s ghi" % var)  # works like today; precedence makes
>                                       # the optimization more difficult to
>                                       # recognize, but does not change the
>                                       # semantics.
>
>         ("abc %s def" + "ghi") % var  # works like today, because of
>                                       # precedence:  () before %
>                                       # CPython compiler can already
>                                       # add the literals at compile-time.
>
>
> Rationale for Removing Explicit Line Continuation
>
>     A terminal "\" indicates that the logical line is continued on the
>     following physical line (after whitespace).
>
>     Note that a non-terminal "\" does not have this meaning, even if the
>     only additional characters are invisible whitespace.  (Python depends
>     heavily on *visible* whitespace at the beginning of a line; it does
>     not otherwise depend on *invisible* terminal whitespace.)  Adding
>     whitespace after a "\" will typically cause a syntax error rather
>     than a silent bug, but it still isn't desirable.
>
>     The reason to keep "\" is that occasionally code looks better with
>     a "\" than with a () pair.
>
>         assert True, (
>             "This Paren is goofy")
>
>     But realistically, that paren is no worse than a "\".  The only
>     advantage of "\" is that it is slightly more familiar to users of
>     C-based languages.  These same languages all also support line
>     continuation with (), so reading code will not be a problem, and
>     there will be one less rule to learn for people entirely new to
>     programming.
>
>
> Rationale for Removing Implicit Octal Literals
>
>     This decision should be covered by PEP ???, on numeric literals.
>     It is mentioned here only for completeness.
>
>     C treats integers beginning with "0" as octal, rather than decimal.
>     Historically, Python has inherited this usage.  This has caused
>     quite a few annoying bugs for people who forgot the rule, and
>     tried to line up their constants.
>
>         a = 123
>         b = 024   # really only 20, because octal
>         c = 245
>
>     In Python 3.0, the second line will instead raise a SyntaxError,
>     because of the ambiguity.  Instead, the line should be written
>     as in one of the following ways:
>
>         b = 24    # PEP 8
>         b =  24   # columns line up, for quick scanning
>         b = 0t24  # really did want an Octal!
>
>
> References
>
>     [1] Implicit String Concatenation, Jewett, Orendorff
>         http://mail.python.org/pipermail/python-ideas/2007-April/000397.html
>
>     [2] PEP 12, Sample reStructuredText PEP Template, Goodger, Warsaw
>         http://www.python.org/peps/pep-0012
>
>     [3] http://www.opencontent.org/openpub/
>
>
>
> Copyright
>
>     This document has been placed in the public domain.
>
>
> 
> Local Variables:
> mode: indented-text
> indent-tabs-mode: nil
> sentence-end-double-space: t
> fill-column: 70
> coding: utf-8
> End:
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk
>
>   


From pje at telecommunity.com  Wed May  2 18:00:03 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 02 May 2007 12:00:03 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <46384247.8020601@canterbury.ac.nz>
References: <5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070502113122.04e41350@sparrow.telecommunity.com>

At 07:48 PM 5/2/2007 +1200, Greg Ewing wrote:
>Is there something about generic functions that makes
>them different from methods in this regard?

Yes.

1. When you're dispatching on more than one argument type, you're likely to 
have more methods involved.

2. If you are using generic functions to implement "events", or using them 
AOP-style to "hook" other actions (e.g. to implement logging, persistence, 
transactions, undo, etc.), then you will be *mostly* doing "before" and 
"after" actions, with the occasional "around".  (See also Jason's comment 
quoted below.)


>>1) a lot more pleasant not to write the extra boilerplate all the time,
>
>I'd work on that by finding ways to reduce the boilerplate.

Um...  I did.  They're called @before and @after.  :)


>There are examples, yes, but they don't come across as
>very compelling as to why there should be so many variations
>of the overloading decorator rather than a single general
>one.

I notice that you didn't respond to my point that these also make it easier 
for the reader to tell what the method is doing, without needing to 
carefully inspect the body.

Meanwhile, it takes less than 40 lines of code to implement both @before 
and @after; if nothing else they would serve as excellent examples of how 
to implement other method combinations (besides the @discount example in 
the PEP).  However, as it happens they are quite useful in and of 
themselves.  As Jason Orendorff put it:

"""In short, you have to ask yourself: am I hooking something 
(before/after), implementing it (when), or just generally looking for 
trouble (around)?"""


>CLOS strikes me as being the union of all Lisp dialects that
>anyone has ever used,

You seem to be confusing Common Lisp with CLOS.  They are not the same thing.

Meanwhile, AspectJ and Inform 7 also include before/after/around advice for 
their generic functions, so it's hardly only CLOS as an example of their 
usefulness.  In Inform 7, the manual notes that "after" and "instead" 
(around) are the most commonly used; however, this is probably because 
every action (generic function) in the language already has three method 
combination phases called "check", "carry out", and "report"!

So, some of the uses of "before" that would happen in other languages get 
handled as "check"-phase rules in Inform.  (Also, they are called "instead" 
rules because in Inform you are usually *not* invoking the overridden 
action, but providing a substitute behavior.)


From steven.bethard at gmail.com  Wed May  2 19:00:01 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed, 2 May 2007 11:00:01 -0600
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <4638B151.6020901@voidspace.org.uk>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
Message-ID: <d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>

On 5/2/07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> Implicit string concatenation is massively useful for creating long
> strings in a readable way though:
>
>     call_something("first part\n"
>                            "second line\n"
>                             "third line\n")
>
> I find it an elegant way of building strings and would be sad to see it
> go. Adding trailing '+' signs is ugly.

You'll still have textwrap.dedent::

    call_something(dedent('''\
        first part
        second line
        third line
        '''))

And using textwrap.dedent, you don't have to remember to add the \n at
the end of every line.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From trentm at activestate.com  Wed May  2 19:34:15 2007
From: trentm at activestate.com (Trent Mick)
Date: Wed, 02 May 2007 10:34:15 -0700
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>	<4638B151.6020901@voidspace.org.uk>
	<d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
Message-ID: <4638CB97.1040503@activestate.com>

Steven Bethard wrote:
> On 5/2/07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>> Implicit string concatenation is massively useful for creating long
>> strings in a readable way though:
>>
>>     call_something("first part\n"
>>                            "second line\n"
>>                             "third line\n")
>>
>> I find it an elegant way of building strings and would be sad to see it
>> go. Adding trailing '+' signs is ugly.
> 
> You'll still have textwrap.dedent::
> 
>     call_something(dedent('''\
>         first part
>         second line
>         third line
>         '''))
> 
> And using textwrap.dedent, you don't have to remember to add the \n at
> the end of every line.

But if you don't want the EOLs? Example from some code of mine:

     raise MakeError("extracting '%s' in '%s' did not create the "
                     "directory that the Python build will expect: "
                     "'%s'" % (src_pkg, dst_dir, dst))

I use this kind of thing frequently. Don't know if others consider it 
bad style.

Trent

-- 
Trent Mick
trentm at activestate.com

From guido at python.org  Wed May  2 20:08:36 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 May 2007 11:08:36 -0700
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <f19d5r$mfm$2@sea.gmane.org>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
Message-ID: <ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>

[Georg]
> >>     >>> a, *b, c = range(5)
> >>     >>> a
> >>     0
> >>     >>> c
> >>     4
> >>     >>> b
> >>     [1, 2, 3]

[Guido]
> > Has it been pointed out to you already that this particular example is
> > hard to implement if the RHS is an iterator whose length is not known
> > a priori? The implementation would have to be quite hairy -- it would
> > have to assign everything to the list b until the iterator is
> > exhausted, and then pop a value from the end of the list and assign it
> > to c.

[Georg]
> Yes, that is correct. My implementation isn't *that* hairy, though; it's
> only 13 more lines of code.

OK. The PEP was kind of light on substance here. Glad you've thought about it.
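
Just to make those semantics concrete for readers, here is a rough
pure-Python emulation of what  a, *b, c = <iterator>  has to do (the
helper name and signature are invented for illustration; this is not
the actual patch):

    def unpack_with_star(iterable, n_before, n_after):
        # Emulates  a, *b, c = iterable  for n_before names before the
        # star and n_after names after it.
        it = iter(iterable)
        before = [next(it) for _ in range(n_before)]
        rest = list(it)              # drain the iterator into a list
        if len(rest) < n_after:
            raise ValueError("need more values to unpack")
        cut = len(rest) - n_after    # then pop the tail off the end
        return before + [rest[:cut]] + rest[cut:]

    a, b, c = unpack_with_star(range(5), 1, 1)
    # a == 0, b == [1, 2, 3], c == 4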

> I'll post the patch to SourceForge later today.

Cool.

> > it would be much easier if *b was only allowed at the end. (It
> > would be even worse if b were assigned a tuple instead of a list, as
> > per your open issues.)
>
> The created tuple is a fresh one, so can't I just copy pointers like from a
> list and set ob_size later?

Sure.

> > Also, what should this do? Perhaps the grammar could disallow it?
> >
> > *a = range(5)
>
> I'm not so sure about the grammar, I'm currently catching it in the AST
> generation stage.

Hopefully it's possible to only allow this if there's at least one comma?

In any case the grammar will probably end up accepting *a in lots of
places where it isn't really allowed and you'll have to fix all of
those. That sounds messy; only allowing *a at the end seems a bit more
manageable. But I'll hold off until I can shoot holes in your
implementation. ;-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed May  2 20:15:04 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 May 2007 11:15:04 -0700
Subject: [Python-3000] PEP-3125 -- remove backslash continuation
In-Reply-To: <003601c78cbf$0cacec90$2606c5b0$@org>
References: <003601c78cbf$0cacec90$2606c5b0$@org>
Message-ID: <ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>

On 5/2/07, Andrew Koenig <ark at acm.org> wrote:
> Looking at PEP-3125, I see that one of the rejected alternatives is to allow
> any unfinished expression to indicate a line continuation.
>
> I would like to suggest a modification to that alternative that has worked
> successfully in another programming language, namely Stu Feldman's EFL.  EFL
> is a language intended for numerical programming; it compiles into Fortran
> with the interesting property that the resulting Fortran code is intended to
> be human-readable and maintainable by people who do not happen to have
> access to the EFL compiler.
>
> Anyway, the (only) continuation rule in EFL is that if the last token in a
> line is one that lexically cannot be the last token in a statement, then the
> next line is considered a continuation of the current line.
>
> Python currently has a rule that if parentheses are unbalanced, a newline
> does not end the statement.  If we were to translate the EFL rule to Python,
> it would be something like this:
>
>         The whitespace that follows an operator or open bracket or
>         parenthesis can include newline characters.
>
> Note that if this suggestion were implemented, it would presumably be at a
> very low lexical level--even before the decision is made to turn a newline
> followed by spaces into an INDENT or DEDENT token.  I think that this
> property solves the difficulty-of-parsing problem.  Indeed, I think that
> this suggestion would be easier to implement than the current
> unbalanced-parentheses rule.
>
> Note also that like the current backslash rule, the space after the newline
> would be just space, with no special significance.  So to rewrite the
> examples from the PEP:
>
>         "abc" +      # Plus is an operator, so it continues
>             "def"    # The extra spaces before "def" do not constitute an
> INDENT
>
>         "abc"        # Line does not end with an operator, so statement ends
>             + "def"  # The newline and spaces constitute an INDENT -- this
> is a syntax error
>
>         ("abc"       # I have no opinion about keeping the
> unbalanced-parentheses rule --
>             + "def") # but I do think that it is harder to parse (and also
> harder to read)
>                      # than what I am proposing.

I am worried that (as no indent is required on the next line) it will
accidentally introduce legal interpretations for certain common (?)
typos, e.g.

  x = y+    # Used to be y+1, the 1 got dropped
  f(x)

Still, if someone wants to give implementing this a try we could add
this to the PEP.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ark at acm.org  Wed May  2 20:24:50 2007
From: ark at acm.org (Andrew Koenig)
Date: Wed, 2 May 2007 14:24:50 -0400
Subject: [Python-3000] PEP-3125 -- remove backslash continuation
In-Reply-To: <ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>
References: <003601c78cbf$0cacec90$2606c5b0$@org>
	<ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>
Message-ID: <001e01c78ce7$2c986f70$85c94e50$@org>

> I am worried that (as no indent is required on the next line) it will
> accidentally introduce legal interpretations for certain common (?)
> typos, e.g.

>   x = y+    # Used to be y+1, the 1 got dropped
>   f(x)

A reasonable worry.  It could still be solved at the lexical level by
requiring every continuation line to have more leading whitespace than the
first of the lines being continued, and still not mapping that whitespace
into an INDENT, but of course that approach adds complexity.
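
Concretely, that variant would read something like this (a sketch of the
proposed rule only, not current Python):

    x = y +
            1      # more indented than the "x = ..." line: continuation
    f(x)           # back at the original indentation: a new statement

    x = y+         # the dropped-1 typo from above
    f(x)           # not indented further, so no continuation; the
                   # incomplete "x = y+" is then a syntax error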

All I can say is that it worked in practice in EFL, and I adopted the same
approach in Snocone without any complaints.  (Of course, Python has lots more
users than Snocone.)



From steven.bethard at gmail.com  Wed May  2 20:45:38 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed, 2 May 2007 12:45:38 -0600
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <20070502181937.GF19189@seldon>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<20070502175301.GA24510@localhost.localdomain>
	<20070502181937.GF19189@seldon>
Message-ID: <d11dcfba0705021145p26751bdeke184602448afbdd0@mail.gmail.com>

On 5/2/07, Brian Harring <ferringb at gmail.com> wrote:
> Personally, I'm -1 on nuking implicit string concatenation; the
> examples provided for the 'why' aren't that strong in my experience,
> and the forced shift to concattenation is rather annoying when you're
> dealing with code limits (80 char limit for example)-
>
>                         dprint("depends level cycle: %s: "
>                                "dropping cycle for %s from %s" %
>                                 (cur_frame.atom, datom,
>                                  cur_frame.current_pkg),
>                                 "cycle")
>

FWLIW, I pretty much always write this as::

    msg = "depends level cycle: %s: dropping cycle for %s from %s"
    tup = cur_frame.atom, datom, cur_frame.current_pkg
    dprint(msg % tup, "cycle")

But yes, occasionally I run into problems when the string still
doesn't fit on a single line. (Of course, I usually solve that by
shortening the string...) ;-)

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From pje at telecommunity.com  Wed May  2 20:51:06 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 02 May 2007 14:51:06 -0400
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <4638CB97.1040503@activestate.com>
References: <d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
Message-ID: <5.1.1.6.0.20070502144742.02bc1908@sparrow.telecommunity.com>

At 10:34 AM 5/2/2007 -0700, Trent Mick wrote:
>But if you don't want the EOLs? Example from some code of mine:
>
>      raise MakeError("extracting '%s' in '%s' did not create the "
>                      "directory that the Python build will expect: "
>                      "'%s'" % (src_pkg, dst_dir, dst))
>
>I use this kind of thing frequently. Don't know if others consider it
>bad style.

Well, I do it a lot too; don't know if that makes it good or bad, though.  :)

I personally don't see a lot of benefit to changing the lexical rules for 
Py3K, however.  The hard part of lexing Python is INDENT/DEDENT (and the 
associated unbalanced parens rule), and none of these proposals suggest 
removing *that*.  Overall, this whole thing seems like a bikeshed to me.


From fdrake at acm.org  Wed May  2 20:57:38 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 2 May 2007 14:57:38 -0400
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <4638CB97.1040503@activestate.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<4638CB97.1040503@activestate.com>
Message-ID: <200705021457.38344.fdrake@acm.org>

On Wednesday 02 May 2007, Trent Mick wrote:
 >      raise MakeError("extracting '%s' in '%s' did not create the "
 >                      "directory that the Python build will expect: "
 >                      "'%s'" % (src_pkg, dst_dir, dst))
 >
 > I use this kind of thing frequently. Don't know if others consider it
 > bad style.

I do this too; this is a good way to have a simple human-readable message
without doing weird things to avoid extraneous newlines or strange
indentation.

-1 on removing implicit string catenation.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From mike.klaas at gmail.com  Wed May  2 21:08:21 2007
From: mike.klaas at gmail.com (Mike Klaas)
Date: Wed, 2 May 2007 12:08:21 -0700
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <4637F606.4060707@canterbury.ac.nz>
References: <46371BD2.7050303@v.loewis.de> <f178kn$s0n$2@sea.gmane.org>
	<4637631A.6030702@v.loewis.de> <4637F606.4060707@canterbury.ac.nz>
Message-ID: <3d2ce8cb0705021208g38d88020t6b183a3061191766@mail.gmail.com>

On 5/1/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Martin v. Löwis wrote:
>
> > http://mail.python.org/pipermail/python-3000/2006-April/001526.html
> >
> > where Guido states that he trusts me that it can be made to work,
> > and that "eventually" it needs to be supported.

+0

> He says "the tools aren't ready yet", which I take to
> mean that Python won't need to support it until all
> widely-used editors, email and news software, etc, etc,
> reliably support displaying and editing of all
> unicode characters. We're clearly a long way from
> that situation.

Couldn't the same argument be applied against non-ascii characters in
string literals?  It would be safest to enforce the use of \u escapes,
no?

It is certainly true that the use of non-ascii identifiers will cause
the code to be unusable by _someone_ using _some_ set of tools.  This
is something that the user will be aware of (as all new users of
non-ascii have been in the past).  Tools won't change without bug
reports.

-Mike

From snaury at gmail.com  Wed May  2 21:23:47 2007
From: snaury at gmail.com (Alexey Borzenkov)
Date: Wed, 2 May 2007 23:23:47 +0400
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
Message-ID: <e2480c70705021223y18e7dc5bxc58f39f31be62ac4@mail.gmail.com>

On 4/30/07, Jim Jewett <jimjjewett at gmail.com> wrote:
>     Python initially inherited its parsing from C.  While this has
>     been generally useful, there are some remnants which have been
>     less useful for python, and should be eliminated.
>
>     + Implicit String concatenation
>
>     + Line continuation with "\"

I don't know if I can vote, but if I could I'd be -1 on this. Can't
say I'm using continuation often, but there's one case when I'm using
it and I'd like to continue using it:

    #!/usr/bin/env python
    """\
    Usage: some-tool.py [arguments...]

        Does this and that based on its arguments"""

    if condition:
        print __doc__
        sys.exit(1)

This way usage immediately stands out much better, without any
unnecessary new lines.

Best regards,
Alexey.

From barry at python.org  Wed May  2 21:40:33 2007
From: barry at python.org (Barry Warsaw)
Date: Wed, 2 May 2007 15:40:33 -0400
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <5.1.1.6.0.20070502144742.02bc1908@sparrow.telecommunity.com>
References: <d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<5.1.1.6.0.20070502144742.02bc1908@sparrow.telecommunity.com>
Message-ID: <179D5383-88F0-4246-B355-5A817B9F7EBE@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On May 2, 2007, at 2:51 PM, Phillip J. Eby wrote:

> At 10:34 AM 5/2/2007 -0700, Trent Mick wrote:
>> But if you don't want the EOLs? Example from some code of mine:
>>
>>      raise MakeError("extracting '%s' in '%s' did not create the "
>>                      "directory that the Python build will expect: "
>>                      "'%s'" % (src_pkg, dst_dir, dst))
>>
>> I use this kind of thing frequently. Don't know if others consider it
>> bad style.
>
> Well, I do it a lot too; don't know if that makes it good or bad,  
> though.  :)

I just realized that changing these lexical rules might have an  
adverse effect on internationalization.  Or it might force more lines
to go over the 79 character limit.

The problem is that

	_("some string"
	  " and more of it")

is not the same as

	_("some string" +
	  " and more of it")

because the latter won't be extracted by tools like pygettext (I'm  
not sure about standard gettext).  You would either have to teach  
pygettext and maybe gettext about this construct, or you'd have to  
use something different.  Triple quoted strings are probably not so  
good because you'd still have to backslash the trailing newlines.
You can't split the strings up into sentence fragments because that  
makes some translations impossible.  Someone ease my worries here.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRjjpOHEjvBPtnXfVAQJ/xwP7BNMGvrmuxKmb7QiIawYjORKt9Pxmz7XJ
kFVHl47UusOGzgmtwm6Qi2DeSDsG0JOu0XwlZbX3YPE8omTzTP8WLdavJ1e+i2nP
V8GwXVyFgyFHx3V1jb0o9eiUGFEwkXInCGcOFqdWOEF49TtRNHGY6ne+eumwkqxK
qOyTGkcreG4=
=J6I/
-----END PGP SIGNATURE-----

From barry at python.org  Wed May  2 21:41:38 2007
From: barry at python.org (Barry Warsaw)
Date: Wed, 2 May 2007 15:41:38 -0400
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <e2480c70705021223y18e7dc5bxc58f39f31be62ac4@mail.gmail.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<e2480c70705021223y18e7dc5bxc58f39f31be62ac4@mail.gmail.com>
Message-ID: <BE39B1F5-CA43-4A59-A44F-B2CFEB84ABF7@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On May 2, 2007, at 3:23 PM, Alexey Borzenkov wrote:

> On 4/30/07, Jim Jewett <jimjjewett at gmail.com> wrote:
>>     Python initially inherited its parsing from C.  While this has
>>     been generally useful, there are some remnants which have been
>>     less useful for python, and should be eliminated.
>>
>>     + Implicit String concatenation
>>
>>     + Line continuation with "\"
>
> I don't know if I can vote, but if I could I'd be -1 on this. Can't
> say I'm using continuation often, but there's one case when I'm using
> it and I'd like to continue using it:
>
>     #!/usr/bin/env python
>     """\
>     Usage: some-tool.py [arguments...]
>
>         Does this and that based on its arguments"""
>
>     if condition:
>         print __doc__
>         sys.exit(1)
>
> This way usage immediately stands out much better, without any
> unnecessary new lines.

Me too, all the time.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRjjpcnEjvBPtnXfVAQL0ngP9FwE7swQSdPiH4wAMQRe1CAzWXBLCXKok
d08GHhyp5GWHs1UzDZbnxnLRVZt+ra/3iSJT8g32X2gX9gWkFUJfqZFN9wLVjzDZ
qlX4m2cJs4nlskRDsycPMY9MLGUwQ8bt7mn92Oh3vXAvtXm42Dxu66NvTlyYdIFQ
9M2HrMbBn1M=
=3kNg
-----END PGP SIGNATURE-----

From tim.peters at gmail.com  Wed May  2 21:46:05 2007
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 2 May 2007 15:46:05 -0400
Subject: [Python-3000] PEP-3125 -- remove backslash continuation
In-Reply-To: <ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>
References: <003601c78cbf$0cacec90$2606c5b0$@org>
	<ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>
Message-ID: <1f7befae0705021246w3446183du10c178067b3d3d7d@mail.gmail.com>

...

[Guido]
> I am worried that (as no indent is required on the next line) it will
> accidentally introduce legal interpretations for certain common (?)
> typos, e.g.
>
>   x = y+    # Used to be y+1, the 1 got dropped
>   f(x)

The Icon language also uses this rule, and I never experienced
problems with it there.

OTOH, the "open bracket" rule is certainly sufficient by itself, and
is invaluable for writing "big" list, tuple, and dict literals (things
I doubt come up in Andrew's EFL inspiration).

From rasky at develer.com  Wed May  2 21:50:52 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Wed, 02 May 2007 21:50:52 +0200
Subject: [Python-3000] PEP-3125 -- remove backslash continuation
In-Reply-To: <ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>
References: <003601c78cbf$0cacec90$2606c5b0$@org>
	<ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>
Message-ID: <f1aq2r$t74$1@sea.gmane.org>

On 02/05/2007 20.15, Guido van Rossum wrote:

> I am worried that (as no indent is required on the next line) it will
> accidentally introduce legal interpretations for certain common (?)
> typos, e.g.
> 
>   x = y+    # Used to be y+1, the 1 got dropped
>   f(x)

It would also change the meaning of existing valid programs such as:

   x = 1,
   y()

The additional indent would solve this of course, but as you already said it's
a bad idea from an implementation standpoint.
-- 
Giovanni Bajo


From ark-mlist at att.net  Wed May  2 22:03:15 2007
From: ark-mlist at att.net (Andrew Koenig)
Date: Wed, 2 May 2007 16:03:15 -0400
Subject: [Python-3000] PEP-3125 -- remove backslash continuation
In-Reply-To: <1f7befae0705021246w3446183du10c178067b3d3d7d@mail.gmail.com>
References: <003601c78cbf$0cacec90$2606c5b0$@org>	<ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>
	<1f7befae0705021246w3446183du10c178067b3d3d7d@mail.gmail.com>
Message-ID: <003a01c78cf4$ebe54ad0$c3afe070$@net>

> OTOH, the "open bracket" rule is certainly sufficient by itself, and
> is invaluable for writing "big" list, tuple, and dict literals (things
> I doubt come up in Andrew's EFL inspiration).

If comma is treated as an operator, the "open bracket" rule doesn't seem all
that invaluable to me.  Can you give me an example?



From ark-mlist at att.net  Wed May  2 22:04:46 2007
From: ark-mlist at att.net (Andrew Koenig)
Date: Wed, 2 May 2007 16:04:46 -0400
Subject: [Python-3000] PEP-3125 -- remove backslash continuation
In-Reply-To: <f1aq2r$t74$1@sea.gmane.org>
References: <003601c78cbf$0cacec90$2606c5b0$@org>	<ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>
	<f1aq2r$t74$1@sea.gmane.org>
Message-ID: <003b01c78cf5$2227ac50$667704f0$@net>

> It would also change the meaning of existing valid programs such as:
> 
>    x = 1,
>    y()

This is the strongest argument against the idea that I've seen so far.

It could be solved by *not* treating , as an operator, and by keeping the
open bracket rule.



From guido at python.org  Wed May  2 22:17:39 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 May 2007 13:17:39 -0700
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <e2480c70705021223y18e7dc5bxc58f39f31be62ac4@mail.gmail.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<e2480c70705021223y18e7dc5bxc58f39f31be62ac4@mail.gmail.com>
Message-ID: <ca471dc20705021317w32fdff7du80e21dcec2632754@mail.gmail.com>

On 5/2/07, Alexey Borzenkov <snaury at gmail.com> wrote:
> I don't know if I can vote, but if I could I'd be -1 on this. Can't
> say I'm using continuation often, but there's one case when I'm using
> it and I'd like to continue using it:
>
>     #!/usr/bin/env python
>     """\
>     Usage: some-tool.py [arguments...]
>
>         Does this and that based on its arguments"""

I've been trying to tease out of the PEP author whether \ inside
string literals would also be dropped. I'd be against that even if \
outside strings were to be killed. So a vote based solely on this
argument has little value.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com  Wed May  2 22:30:16 2007
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 2 May 2007 16:30:16 -0400
Subject: [Python-3000] PEP-3125 -- remove backslash continuation
In-Reply-To: <003a01c78cf4$ebe54ad0$c3afe070$@net>
References: <003601c78cbf$0cacec90$2606c5b0$@org>
	<ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>
	<1f7befae0705021246w3446183du10c178067b3d3d7d@mail.gmail.com>
	<003a01c78cf4$ebe54ad0$c3afe070$@net>
Message-ID: <1f7befae0705021330y755e20fcrfbaf8bfd94151ded@mail.gmail.com>

[Tim Peters]
>> ...
>> OTOH, the "open bracket" rule is certainly sufficient by itself, and
>> is invaluable for writing "big" list, tuple, and dict literals (things
>> I doubt come up in Andrew's EFL inspiration).

[Andrew Koenig]
> If comma is treated as an operator, the "open bracket" rule doesn't seem all
> that invaluable to me.  Can you give me an example?

Treating comma as an infix operator would clash in weird ways with the
current "sometimes" treatment of comma as denoting a tuple literal ...
and I see that Giovanni Bajo already posted an example while I was
typing this :-)  Icon doesn't have this problem, and I'm guessing that
EFL doesn't either.

Incidentally, I know one Python programmer who writes list literals like this:

mylist = [
           1
         , 2
         , 3
         ]

In a fixed-width font, the commas and brackets are all in the same
column.  While "bleech" is the proper reaction ;-), that does work
fine today.

Historical note:  the open bracket rule was introduced in Python 0.9.9
(29 Jul 1993).  Before that, backslash continuation was the only way
to split a statement across lines.  If the open bracket rule had been
there from the start, I doubt backslash continuation would have been
there at all (except in string literals).

From ark-mlist at att.net  Wed May  2 23:09:18 2007
From: ark-mlist at att.net (Andrew Koenig)
Date: Wed, 2 May 2007 17:09:18 -0400
Subject: [Python-3000] PEP-3125 -- remove backslash continuation
In-Reply-To: <1f7befae0705021330y755e20fcrfbaf8bfd94151ded@mail.gmail.com>
References: <003601c78cbf$0cacec90$2606c5b0$@org>	
	<ca471dc20705021115q1c19b6aied959c8c452559a@mail.gmail.com>	
	<1f7befae0705021246w3446183du10c178067b3d3d7d@mail.gmail.com>	
	<003a01c78cf4$ebe54ad0$c3afe070$@net>
	<1f7befae0705021330y755e20fcrfbaf8bfd94151ded@mail.gmail.com>
Message-ID: <004501c78cfe$264c39a0$72e4ace0$@net>


> Incidentally, I know one Python programmer who writes list literals
> like this:
> 
> mylist = [
>            1
>          , 2
>          , 3
>          ]
> 
> In a fixed-width font, the commas and brackets are all in the same
> column.  While "bleech" is the proper reaction ;-), that does work
> fine today.

Sounds like another argument in favor of not treating comma as an operator
(because it *can* end a statement) but keeping the open-bracket rule.



From g.brandl at gmx.net  Wed May  2 23:31:40 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 02 May 2007 23:31:40 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
Message-ID: <f1avvq$vm1$1@sea.gmane.org>

Guido van Rossum schrieb:

>> > Also, what should this do? Perhaps the grammar could disallow it?
>> >
>> > *a = range(5)
>>
>> I'm not so sure about the grammar, I'm currently catching it in the AST
>> generation stage.
> 
> Hopefully it's possible to only allow this if there's at least one comma?

That's easy. But now that I have lightened the grammar changes a bit, catching
the no-comma case has gotten a bit hairy, as you'll see in the patch.
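
Concretely, the distinction is (illustrative session only; the exact
error message is whatever the implementation ends up choosing):

    >>> *a, = range(5)    # at least one comma: a valid target list
    >>> a
    [0, 1, 2, 3, 4]
    >>> *a = range(5)     # bare starred name: rejected
    SyntaxError: ...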

> In any case the grammar will probably end up accepting *a in lots of
> places where it isn't really allowed and you'll have to fix all of
> those.

In fact it's not too hard: only the Store context is allowed.

> That sounds messy; only allowing *a at the end seems a bit more
> manageable. But I'll hold off until I can shoot holes in your
> implementation. ;-)

The patch is at http://python.org/sf/1711529. Have fun :)

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From skip at pobox.com  Wed May  2 23:23:00 2007
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 2 May 2007 16:23:00 -0500
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <4638CB97.1040503@activestate.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<4638CB97.1040503@activestate.com>
Message-ID: <17977.308.192435.48545@montanaro.dyndns.org>

    Trent> But if you don't want the EOLs? Example from some code of mine:

    Trent>      raise MakeError("extracting '%s' in '%s' did not create the "
    Trent>                      "directory that the Python build will expect: "
    Trent>                      "'%s'" % (src_pkg, dst_dir, dst))

    Trent> I use this kind of thing frequently. Don't know if others
    Trent> consider it bad style.

I use it all the time.  For example, to build up (what I consider to be)
readable SQL queries:

    rows = self.executesql("select cities.city, state, country"
                           "    from cities, venues, events, addresses"
                           "    where cities.city like %s"
                           "      and events.active = 1"
                           "      and venues.address = addresses.id"
                           "      and addresses.city = cities.id"
                           "      and events.venue = venues.id",
                           (city,))

I would be disappointed if string literal concatenation went away.

Skip

From snaury at gmail.com  Wed May  2 23:25:00 2007
From: snaury at gmail.com (Alexey Borzenkov)
Date: Thu, 3 May 2007 01:25:00 +0400
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <ca471dc20705021317w32fdff7du80e21dcec2632754@mail.gmail.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<e2480c70705021223y18e7dc5bxc58f39f31be62ac4@mail.gmail.com>
	<ca471dc20705021317w32fdff7du80e21dcec2632754@mail.gmail.com>
Message-ID: <e2480c70705021425o5b87e32ak3094efeabcb16b6@mail.gmail.com>

On 5/3/07, Guido van Rossum <guido at python.org> wrote:
> On 5/2/07, Alexey Borzenkov <snaury at gmail.com> wrote:
> > I don't know if I can vote, but if I could I'd be -1 on this. Can't
> > say I'm using continuation often, but there's one case when I'm using
> > it and I'd like to continue using it:
> >
> >     #!/usr/bin/env python
> >     """\
> >     Usage: some-tool.py [arguments...]
> >
> >         Does this and that based on its arguments"""
> I've been trying to tease out of the PEP author whether \ inside
> string literals would also be dropped. I'd be against that even if \
> outside strings were to be killed. So a vote based solely on this
> argument has little value.

Ouch, I didn't even think it could be dropped one way and not the
other. To be honest, I don't have an opinion on the use of \ outside of
string literals; I've never needed it, and it seems there's always a
workaround with parentheses. So I'm sorry, that was very premature.

--
Alexey

From rhamph at gmail.com  Thu May  3 00:48:30 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Wed, 2 May 2007 16:48:30 -0600
Subject: [Python-3000] Some canonical use-cases for
	ABCs/Interfaces/Generics
In-Reply-To: <4638013D.8090902@acm.org>
References: <4638013D.8090902@acm.org>
Message-ID: <aac2c7cb0705021548o6c10b6e6qa238829911a63ac3@mail.gmail.com>

On 5/1/07, Talin <talin at acm.org> wrote:
> One of my concerns in the ABC/interface discussion so far is that a lot
> of the use cases presented are "toy" examples. This makes perfect sense
> considering that you don't want to have to spend several pages
> explaining the use case. But at the same time, it means that we might be
> solving problems that aren't real, while ignoring problems that are.
>
> What I'd like to do is collect a set of "real-world" use cases and
> document them. The idea would be that we could refer to these use cases
> during the discussion, using a common terminology and shorthand examples.
>
> I'll present one very broad use case here, and I'd be interested if
> people have ideas for other use cases. The goal is to define a small
> number of broadly-defined cases that provide a broad coverage of the
> problem space.

The only use case I commonly experience is that of __init__.  For instance:

class MyClass:
    def __init__(self, x):
        if isinstance(x, basestring):
            self.stream = open(x)
        else:
            self.stream = x

It can be called with either a path or a file-like object.  It might
be possible to replace the LBYL with EAFP, passing x into open() and
catching errors, but it's not obvious what if any exception would be
raised; the try/except is too broad.

Many builtin types (int, float, etc) are conceptually similar, but
with an acceptable way to narrow down the try/except: check for an
x.__int__ method, without calling it.
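
A minimal sketch of that narrowed check (the helper name is invented,
purely for illustration):

    def as_int(x):
        # LBYL: look for the slot without calling it, so we don't have
        # to guess which exceptions a broad try/except should swallow.
        if not hasattr(x, '__int__'):
            raise TypeError("expected an int-like object, got %r" % type(x))
        return int(x)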

<ramble>
It's not obvious to me how to best do this dispatching in the long
term.  If you hardcode the references to basestring then, if open()
starts accepting path objects that don't derive from str, all the
existing code will break.  Generic functions provide a cleaner way to
override them if you don't want to modify the original code, but
you'll still have to write *some* update for every piece of code that
does a check.  The only way to avoid all the updates is to create some
sort of Pathish ABC to check for, but that assumes you know you'll
switch to non-str-derived path objects when you define Pathish.

For those keeping score, that means that duck typing will *always*
fail at these problems.  You have to hardcode some way of
discriminating between types, and you'll have to rewrite all that
hardcoding if the assumptions you made no longer hold.  The goal then
is to pick assumptions that will live for the longest period of time
and require the least effort to change (and avoid making them unless
you *need* them, ie stick to duck typing if it works).
</ramble>

-- 
Adam Olsen, aka Rhamphoryncus

From mhammond at skippinet.com.au  Thu May  3 01:59:35 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Thu, 3 May 2007 09:59:35 +1000
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
Message-ID: <02b401c78d15$f04b6110$090a0a0a@enfoldsystems.local>

Please add my -1 to the chorus here, for the same reasons already expressed.

Cheers,

Mark

> -----Original Message-----
> From: python-dev-bounces+mhammond=keypoint.com.au at python.org
> [mailto:python-dev-bounces+mhammond=keypoint.com.au at python.org]
> On Behalf Of Jim Jewett
> Sent: Monday, 30 April 2007 1:29 PM
> To: Python 3000; Python Dev
> Subject: [Python-Dev] PEP 30XZ: Simplified Parsing
>
>
> PEP: 30xz
> Title: Simplified Parsing
> Version: $Revision$
> Last-Modified: $Date$
> Author: Jim J. Jewett <JimJJewett at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/plain
> Created: 29-Apr-2007
> Post-History: 29-Apr-2007
>
>
> Abstract
>
>     Python initially inherited its parsing from C.  While this has
>     been generally useful, there are some remnants which have been
>     less useful for python, and should be eliminated.
>
>     + Implicit String concatenation
>
>     + Line continuation with "\"
>
>     + 034 as an octal number (== decimal 28).  Note that this is
>       listed only for completeness; the decision to raise an
>       Exception for leading zeros has already been made in the
>       context of PEP XXX, about adding a binary literal.
>
>
> Rationale for Removing Implicit String Concatenation
>
>     Implicit String concatenation can lead to confusing, or even
>     silent, errors. [1]
>
>         def f(arg1, arg2=None): pass
>
>         f("abc" "def")  # forgot the comma, no warning ...
>                         # silently becomes f("abcdef", None)
>
>     or, using the scons build framework,
>
>         sourceFiles = [
>         'foo.c',
>         'bar.c',
>         #...many lines omitted...
>         'q1000x.c']
>
>     It's a common mistake to leave off a comma, and then
> scons complains
>     that it can't find 'foo.cbar.c'.  This is pretty
> bewildering behavior
>     even if you *are* a Python programmer, and not everyone here is.
>
>     Note that in C, the implicit concatenation is more justified;
>     there is no other way to join strings without (at least) a
>     function call.
>
>     In Python, strings are objects which support the __add__ operator;
>     it is possible to write:
>
>         "abc" + "def"
>
>     Because these are literals, this addition can still be optimized
>     away by the compiler.
>
>     Guido indicated [2] that this change should be handled by PEP,
>     because there were a few edge cases with other string operators,
>     such as the %.
>     The resolution is to treat them the same as today.
>
>         ("abc %s def" + "ghi" % var)  # fails like today.
>                                       # raises TypeError because of
>                                       # precedence.  (% before +)
>
>         ("abc" + "def %s ghi" % var)  # works like today;
> precedence makes
>                                       # the optimization more
> difficult to
>                                       # recognize, but does
> not change the
>                                       # semantics.
>
>         ("abc %s def" + "ghi") % var  # works like today, because of
>                                       # precedence:  () before %
>                                       # CPython compiler can already
>                                       # add the literals at
> compile-time.
>
>
> Rationale for Removing Explicit Line Continuation
>
>     A terminal "\" indicates that the logical line is continued on the
>     following physical line (after whitespace).
>
>     Note that a non-terminal "\" does not have this meaning, even if
>     the only additional characters are invisible whitespace.  (Python
>     depends heavily on *visible* whitespace at the beginning of a
>     line; it does not otherwise depend on *invisible* terminal
>     whitespace.)  Adding
>     whitespace after a "\" will typically cause a syntax error rather
>     than a silent bug, but it still isn't desirable.
>
>     The reason to keep "\" is that occasionally code looks better with
>     a "\" than with a () pair.
>
>         assert True, (
>             "This Paren is goofy")
>
>     But realistically, that paren is no worse than a "\".  The only
>     advantage of "\" is that it is slightly more familiar to users of
>     C-based languages.  These same languages all also support line
>     continuation with (), so reading code will not be a problem, and
>     there will be one less rule to learn for people entirely new to
>     programming.
>
>
> Rationale for Removing Implicit Octal Literals
>
>     This decision should be covered by PEP ???, on numeric literals.
>     It is mentioned here only for completeness.
>
>     C treats integers beginning with "0" as octal, rather than
>     decimal.
>     Historically, Python has inherited this usage.  This has caused
>     quite a few annoying bugs for people who forgot the rule, and
>     tried to line up their constants.
>
>         a = 123
>         b = 024   # really only 20, because octal
>         c = 245
>
>     In Python 3.0, the second line will instead raise a SyntaxError,
>     because of the ambiguity.  Instead, the line should be written
>     as in one of the following ways:
>
>         b = 24    # PEP 8
>         b =  24   # columns line up, for quick scanning
>         b = 0t24  # really did want an Octal!
>
>
> References
>
>     [1] Implicit String Concatenation, Jewett, Orendorff
>
http://mail.python.org/pipermail/python-ideas/2007-April/000397.html

    [2] PEP 12, Sample reStructuredText PEP Template, Goodger, Warsaw
        http://www.python.org/peps/pep-0012

    [3] http://www.opencontent.org/openpub/



Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:


From greg.ewing at canterbury.ac.nz  Thu May  3 02:38:46 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 03 May 2007 12:38:46 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070502113122.04e41350@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070502113122.04e41350@sparrow.telecommunity.com>
Message-ID: <46392F16.7080707@canterbury.ac.nz>

Phillip J. Eby wrote:
> At 07:48 PM 5/2/2007 +1200, Greg Ewing wrote:
 >
> > I'd work on that by finding ways to reduce the boilerplate.
> 
> Um...  I did.  They're called @before and @after.  :)

I was talking about the need to put extra magic names
in the parameter list just to be able to call the next
method.

> I notice that you didn't respond to my point that these also make it 
> easier for the reader to tell what the method is doing,

No, it doesn't. It tells you a very small amount about
*how* the method does whatever it does. To find out
*what* the method does, you have to either read the
comment/docstring, or if it doesn't have one, read the
method body anyway. If you read the body, you'll notice
whether and when it calls the next method.

In other words, I see the calling of the next method
as an implementation detail that doesn't need to be
announced prominently at the top of the method.

> Meanwhile, it takes less than 40 lines of code to implement both @before 
> and @after;

Size of implementation isn't the issue, it's the mental
load on someone trying to learn all this stuff and keep
it in their head. It's a lot easier to learn and retain
knowledge about one general mechanism than five or more
special-case variations of it.

> """In short, you have to ask yourself: am I hooking something 
> (before/after), implementing it (when), or just generally looking for 
> trouble (around)?"""

There are a lot of other things you have to ask yourself
before writing your method, too. I don't see this particular
question as fundamental enough to pick out for special
treatment.

> You seem to be confusing Common Lisp with CLOS.  They are not the same thing.

You're right, my comment was really about Common Lisp as
a whole. But if they can't even keep the basic Lisp dialect
clean and coherent, it doesn't give me confidence that
they've made any attempt to do so with its object system.

More generally, arguments of the form "Language X does
it this way, so it must be good" don't impress me if I
don't regard language X as being particularly well
designed in the first place.

> Meanwhile, AspectJ and Inform 7 also include before/after/around advice 
> for their generic functions, so it's hardly only CLOS as an example of 
> their usefulness.

I'm very skeptical about the whole business of aspects,
too, and I find Inform 7 to be massively confusing in
many ways. So you're not going to impress me by appealing
to those, either. :-)

--
Greg

From python-dev at zesty.ca  Thu May  3 02:26:36 2007
From: python-dev at zesty.ca (Ka-Ping Yee)
Date: Wed, 2 May 2007 19:26:36 -0500 (CDT)
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0705021924420.29254@server1.LFW.org>

I fully support the removal of implicit string concatenation
(explicit is better than implicit; there's only one way to do it).

I also fully support the removal of backslashes for line continuation
of statements (same reasons).  (I mean this as distinct from line
continuation within a string; that's a separate issue.)


-- ?!ng

From greg.ewing at canterbury.ac.nz  Thu May  3 02:49:14 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 03 May 2007 12:49:14 +1200
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <4638CB97.1040503@activestate.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<4638CB97.1040503@activestate.com>
Message-ID: <4639318A.8030206@canterbury.ac.nz>

Trent Mick wrote:

> But if you don't want the EOLs? Example from some code of mine:
> 
>      raise MakeError("extracting '%s' in '%s' did not create the "
>                      "directory that the Python build will expect: "
>                      "'%s'" % (src_pkg, dst_dir, dst))
> 
> I use this kind of thing frequently. Don't know if others consider it 
> bad style.

I use it too, and would be disappointed if it were
taken away. I find the usefulness considerably
outweighs any occasional problems.

--
Greg

From greg.ewing at canterbury.ac.nz  Thu May  3 02:52:21 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 03 May 2007 12:52:21 +1200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
Message-ID: <46393245.5080203@canterbury.ac.nz>

Guido van Rossum wrote:

> In any case the grammar will probably end up accepting *a in lots of
> places where it isn't really allowed and you'll have to fix all of
> those. That sounds messy; only allowing *a at the end seems a bit more
> manageable.

I also would be quite happy if it were only allowed at
the end, and not allowed on its own. I don't see any
utility in being able to write *a = b instead of
a = list(b) or some such.

--
Greg

From guido at python.org  Thu May  3 02:55:49 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 May 2007 17:55:49 -0700
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <4639318A.8030206@canterbury.ac.nz>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<4638CB97.1040503@activestate.com> <4639318A.8030206@canterbury.ac.nz>
Message-ID: <ca471dc20705021755h1f178dbfq5996149bedd4f450@mail.gmail.com>

It looks like not enough people are ready for both these
changes (PEP 3125 and PEP 3126). Maybe we could start by discouraging
these in the style guide (PEP 8) instead?

--Guido

On 5/2/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Trent Mick wrote:
>
> > But if you don't want the EOLs? Example from some code of mine:
> >
> >      raise MakeError("extracting '%s' in '%s' did not create the "
> >                      "directory that the Python build will expect: "
> >                      "'%s'" % (src_pkg, dst_dir, dst))
> >
> > I use this kind of thing frequently. Don't know if others consider it
> > bad style.
>
> I use it too, and would be disappointed if it were
> taken away. I find the usefulness considerably
> outweighs any occasional problems.
>
> --
> Greg


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com  Thu May  3 03:03:39 2007
From: python at rcn.com (Raymond Hettinger)
Date: Wed,  2 May 2007 21:03:39 -0400 (EDT)
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
Message-ID: <20070502210339.BHU28881@ms09.lnh.mail.rcn.net>

[Skip]
> I use it all the time.  For example, to build up (what I consider to be)
>readable SQL queries:
>
> rows = self.executesql("select cities.city, state, country"
>                        "    from cities, venues, events, addresses"
>                        "    where cities.city like %s"
>                        "      and events.active = 1"
>                        "      and venues.address = addresses.id"
>                        "      and addresses.city = cities.id"
>                        "      and events.venue = venues.id",
>                        (city,))

I find that style hard to maintain.  What is the advantage over multi-line strings?


 rows = self.executesql('''
    select cities.city, state, country
    from cities, venues, events, addresses
    where cities.city like %s
          and events.active = 1
          and venues.address = addresses.id
          and addresses.city = cities.id
          and events.venue = venues.id
    ''', 
    (city,))


Raymond

From guido at python.org  Thu May  3 03:29:53 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 2 May 2007 18:29:53 -0700
Subject: [Python-3000] [Python-ideas] PEP 30xx: Access to
	Module/Class/Function Currently Being Defined (this)
In-Reply-To: <fb6fbf560704222005le4798a4j5daa5e71e644f069@mail.gmail.com>
References: <fb6fbf560704222005le4798a4j5daa5e71e644f069@mail.gmail.com>
Message-ID: <ca471dc20705021829p53d67f57g49298b91be4bbb8f@mail.gmail.com>

Summary for the impatient: -1; the PEP is insufficiently motivated and
poorly specified.

> PEP: 3130
> Title: Access to Current Module/Class/Function
> Version: $Revision: 55056 $
> Last-Modified: $Date: 2007-05-01 12:35:45 -0700 (Tue, 01 May 2007) $
> Author: Jim J. Jewett <jimjjewett at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/plain
> Created: 22-Apr-2007
> Python-Version: 3.0
> Post-History: 22-Apr-2007
>
>
> Abstract
>
>     It is common to need a reference to the current module, class,
>     or function, but there is currently no entirely correct way to
>     do this.  This PEP proposes adding the keywords __module__,
>     __class__, and __function__.
>
>
> Rationale for __module__
>
>     Many modules export various functions, classes, and other objects,
>     but will perform additional activities (such as running unit
>     tests) when run as a script.  The current idiom is to test whether
>     the module's name has been set to a magic value.
>
>         if __name__ == "__main__": ...
>
>     More complicated introspection requires a module to (attempt to)
>     import itself.  If importing the expected name actually produces
>     a different module, there is no good workaround.
>
>         # __import__ lets you use a variable, but... it gets more
>         # complicated if the module is in a package.
>         __import__(__name__)
>
>         # So just go to sys modules... and hope that the module wasn't
>         # hidden/removed (perhaps for security), that __name__ wasn't
>         # changed, and definitely hope that no other module with the
>         # same name is now available.
>         class X(object):
>             pass
>
>         import sys
>         mod = sys.modules[__name__]
>         mod = sys.modules[X.__class__.__module__]

You're making this way too complicated.  sys.modules[__name__] always
works.

>     Proposal:  Add a __module__ keyword which refers to the module
>     currently being defined (executed).  (But see open issues.)
>
>         # XXX sys.main is still changing as draft progresses.  May
>         # really need sys.modules[sys.main]
>         if __module__ is sys.main:    # assumes PEP (3122), Cannon
>             ...

PEP 3122 is already rejected.

> Rationale for __class__
>
>     Class methods are passed the current instance; from this they can

"current instance" is confusing when talking about class method.
I'll assume you mean "class".

>     determine self.__class__ (or cls, for class methods).
>     Unfortunately, this reference is to the object's actual class,

Why unfortunately?  All the semantics around self.__class__ and the cls
argument are geared towards the instance's class, not the lexically
current class.

>     which may be a subclass of the defining class.  The current
>     workaround is to repeat the name of the class, and assume that the
>     name will not be rebound.
>
>         class C(B):
>
>             def meth(self):
>                 super(C, self).meth() # Hope C is never rebound.
>
>         class D(C):
>
>             def meth(self):
>                 # ?!? issubclass(D,C), so it "works":
>                 super(C, self).meth()
>
>     Proposal: Add a __class__ keyword which refers to the class
>     currently being defined (executed).  (But see open issues.)
>
>         class C(B):
>             def meth(self):
>                 super(__class__, self).meth()
>
>     Note that super calls may be further simplified by the "New Super"
>     PEP (Spealman).  The __class__ (or __this_class__) attribute came
>     up in attempts to simplify the explanation and/or implementation
>     of that PEP, but was separated out as an independent decision.
>
>     Note that __class__ (or __this_class__) is not quite the same as
>     the __thisclass__ property on bound super objects.  The existing
>     super.__thisclass__ property refers to the class from which the
>     Method Resolution Order search begins.  In the above class D, it
>     would refer to (the current reference of name) C.

Do you have any other use cases?  Because Tim Delaney's 'super'
implementation doesn't need this.

I also note that the name __class__ is a bit confusing because it
means "the object's class" in other contexts.

> Rationale for __function__
>
>     Functions (including methods) often want access to themselves,
>     usually for a private storage location or true recursion.  While
>     there are several workarounds, all have their drawbacks.

Often?  Private storage can just as well be placed in the class or
module.  The recursion use case just doesn't occur as a problem in
reality (hasn't since we introduced properly nested namespaces in
2.1).

>         def counter(_total=[0]):
>             # _total shouldn't really appear in the
>             # signature at all; the list wrapping and
>             # [0] unwrapping obscure the code
>             _total[0] += 1
>             return _total[0]
>
>         @annotate(total=0)

It makes no sense to put dangling references like this in motivating
examples.  Without the definition of @annotate the example is
meaningless.

>         def counter():
>             # Assume name counter is never rebound:

Why do you care so much about this?  It's a vanishingly rare situation
in my experience.

>             counter.total += 1
>             return counter.total

You're abusing function attributes here IMO.  Function attributes are
*metadata* about the function; they should not be used as per-function
global storage.

>         # class exists only to provide storage:

If you don't need a class, use a module global.  That's what they're
for.  Name it with a leading underscore to flag the fact that it's an
implementation detail.
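
(Something like this, i.e. the storage is just a private module-level
name:)

    _total = 0

    def counter():
        global _total
        _total += 1
        return _total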

>         class _wrap(object):
>
>             __total = 0
>
>             def f(self):
>                 self.__total += 1
>                 return self.__total
>
>         # set module attribute to a bound method:
>         accum = _wrap().f
>
>         # This function calls "factorial", which should be itself --
>         # but the same programming styles that use heavy recursion
>         # often have a greater willingness to rebind function names.
>         def factorial(n):
>             return (n * factorial(n-1) if n else 1)
>
>     Proposal: Add a __function__ keyword which refers to the function
>     (or method) currently being defined (executed).  (But see open
>     issues.)
>
>         @annotate(total=0)
>         def counter():
>             # Always refers to this function obj:
>             __function__.total += 1
>             return __function__.total
>
>         def factorial(n):
>             return (n * __function__(n-1) if n else 1)
>
>
> Backwards Compatibility
>
>     While a user could be using these names already, double-underscore
>     names ( __anything__ ) are explicitly reserved to the interpreter.
>     It is therefore acceptable to introduce special meaning to these
>     names within a single feature release.
>
>
> Implementation
>
>     Ideally, these names would be keywords treated specially by the
>     bytecode compiler.

That is a completely insufficient attempt at describing the semantics.

>     Guido has suggested [1] using a cell variable filled in by the
>     metaclass.
>
>     Michele Simionato has provided a prototype using bytecode hacks
>     [2].  This does not require any new bytecode operators; it just
>     modifies which specific sequence of existing operators gets
>     run.

Sorry, bytecode hacks don't count as a semantic specification.


> Open Issues
>
>     - Are __module__, __class__, and __function__ the right names?  In
>       particular, should the names include the word "this", either as
>       __this_module__, __this_class__, and __this_function__, (format
>       discussed on the python-3000 and python-ideas lists) or as
>       __thismodule__, __thisclass__, and __thisfunction__ (inspired
>       by, but conflicting with, current usage of super.__thisclass__).
>
>     - Are all three keywords needed, or should this enhancement be
>       limited to a subset of the objects?  Should methods be treated
>       separately from other functions?

What do __class__ and __function__ refer to inside a nested class or
function?
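
To illustrate the question, a purely hypothetical snippet using the
names from the proposal (neither name exists today):

    class Outer:
        def meth(self):
            class Inner:
                def f(self):
                    return __class__     # Inner, or Outer?
            def helper():
                return __function__      # helper, or meth?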

> References
>
>     [1] Fixing super anyone?  Guido van Rossum
>         http://mail.python.org/pipermail/python-3000/2007-April/006671.html
>
>     [2] Descriptor/Decorator challenge,  Michele Simionato
>         http://groups.google.com/group/comp.lang.python/browse_frm/thread/a6010c7494871bb1/62a2da68961caeb6?lnk=gst&q=simionato+challenge&rnum=1&hl=en#62a2da68961caeb6
>
>
> Copyright
>
>     This document has been placed in the public domain.
>
>
> 
> Local Variables:
> mode: indented-text
> indent-tabs-mode: nil
> sentence-end-double-space: t
> fill-column: 70
> coding: utf-8
> End:

---
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Thu May  3 03:32:23 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 02 May 2007 21:32:23 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <46392F16.7080707@canterbury.ac.nz>
References: <5.1.1.6.0.20070502113122.04e41350@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070502113122.04e41350@sparrow.telecommunity.com>
Message-ID: <5.1.1.6.0.20070502211440.02a386d8@sparrow.telecommunity.com>

At 12:38 PM 5/3/2007 +1200, Greg Ewing wrote:
>In other words, I see the calling of the next method
>as an implementation detail that doesn't need to be
>announced prominently at the top of the method.

It's not an implementation detail - it's an expression of *intent*.  E.g., 
in English, "After you start a transaction on a database, make sure you 
turn its logging up all the way."

Please explain how you would improve on the clarity of that sentence 
*without* using the word "after" or any synonyms thereof.

ISTM that your argument is like saying there's no need for C with all its 
fancy function parameter names; after all, if you read the assembly code 
you can see right away which registers are being used for what.  That may 
be true, but I'd rather not have to.

Meanwhile, in the case of before/after methods, not having to call the next 
method or return its return value means there's less code to possibly get 
wrong in the process.
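
Roughly, the difference looks like this (a sketch in the spirit of the
decorators under discussion; the generic function and method names are
made up):

    def start_transaction(db):
        db.begin()                  # primary method

    @after(start_transaction)
    def crank_up_logging(db):
        # Runs after the primary method; there is no next-method call
        # to forget and no return value to accidentally drop.
        db.set_log_level("DEBUG")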


>So you're not going to impress me by appealing
>to those, either. :-)

I wasn't under the illusion that impressing you was possible, actually.  :)


From tjreedy at udel.edu  Thu May  3 03:35:45 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 2 May 2007 21:35:45 -0400
Subject: [Python-3000] PEP3099 += 'Assignment will not become an operation'
Message-ID: <f1be9f$5re$1@sea.gmane.org>

and hence '=' will not become an operator and hence '=' will not become 
overloadable.

(unless, of course, Guido has revised previous rejections).

Came up again today on c.l.p.  Surprised it's not already in the PEP.

tjr




From rrr at ronadam.com  Thu May  3 08:05:38 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 03 May 2007 01:05:38 -0500
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <f1bqp0$vf0$1@sea.gmane.org>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>	<02b401c78d15$f04b6110$090a0a0a@enfoldsystems.local>
	<f1bqp0$vf0$1@sea.gmane.org>
Message-ID: <46397BB2.4060404@ronadam.com>

Georg Brandl wrote:
> FWIW, I'm -1 on both proposals too. I like implicit string literal concatenation
> and I really can't see what we gain from backslash continuation removal.
> 
> Georg

-1 on removing them also.  I find they are helpful.


The backslash could be made optional in block headers that end with a ':'.
It's already optional (just redundant white space) in parenthesized
expressions, tuples, lists, and dictionary literals:

 >>> [1,\
... 2,\
... 3]
[1, 2, 3]

 >>> (1,\
... 2,\
... 3)
(1, 2, 3)

 >>> {1:'a',\
... 2:'b',\
... 3:'c'}
{1: 'a', 2: 'b', 3: 'c'}

The rule would be that anything from a keyword that starts a block (class, 
def, if, elif, with, etc.) up to the colon that closes the header would 
always be treated as a single logical line, whether or not it has 
parentheses or line continuations in it.  These headers can never be 
multi-line statements as far as I know.

The back slash would still be needed in console input.



The following inconsistency still bothers me, but I suppose it's an edge 
case that doesn't cause problems.

 >>> print r"hello world\"
   File "<stdin>", line 1
     print r"hello world\"
                         ^
SyntaxError: EOL while scanning single-quoted string
 >>> print r"hello\
...         world"
hello\
         world

In the first case, it's treated as a continuation character even though 
it's not at the end of a physical line. So it gives an error.

In the second case, it's accepted as a continuation character *and* as a '\' 
character at the same time. (?)

Cheers,
    Ron

From python at rcn.com  Thu May  3 07:23:39 2007
From: python at rcn.com (Raymond Hettinger)
Date: Wed, 2 May 2007 22:23:39 -0700
Subject: [Python-3000] [Python-Dev] Implicit String Concatenation and Octal
	Literals Was: PEP 30XZ: Simplified Parsing
References: <20070502210339.BHU28881@ms09.lnh.mail.rcn.net>
	<17977.16058.847429.905398@montanaro.dyndns.org>
Message-ID: <000401c78d4c$796bfe60$f301a8c0@RaymondLaptop1>


>    Raymond> I find that style hard to maintain.  What is the advantage over
>    Raymond> multi-line strings?
> 
>    Raymond>  rows = self.executesql('''
>    Raymond>     select cities.city, state, country
>    Raymond>     from cities, venues, events, addresses
>    Raymond>     where cities.city like %s
>    Raymond>           and events.active = 1
>    Raymond>           and venues.address = addresses.id
>    Raymond>           and addresses.city = cities.id
>    Raymond>           and events.venue = venues.id
>    Raymond>     ''', 
>    Raymond>     (city,))

[Skip]
> Maybe it's just a quirk of how python-mode in Emacs treats multiline strings
> that caused me to start doing things this way (I've been doing my embedded
> SQL statements this way for several years now), but when I hit LF in an open
> multiline string a newline is inserted and the cursor is lined up under the
> "r" of "rows", not under the opening quote of the multiline string, and not
> where you chose to indent your example.  When I use individual strings the
> parameters line up where I want them to (the way I lined things up in my
> example).  At any rate, it's what I'm used to now.


I completely understand.  Almost any simplification or feature elimination
proposal is going to bump up against "what we're used to now".
Py3k may be our last chance to simplify the language.  We have so many
special little rules that even advanced users can't keep them
all in their head.  Certainly, every feature has someone who uses it.
But, there is some value to reducing the number of rules, especially
if those rules are non-essential (i.e. implicit string concatenation has
simple, clear alternatives with multi-line strings or with the plus-operator).
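
For example, the two alternatives look roughly like this (table and
column names made up):

    query = ("select city, state "
             + "from cities "
             + "where active = 1")       # explicit plus-operator

    query = """select city, state
               from cities
               where active = 1"""       # or a multi-line string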

Another way to look at it is to ask whether we would consider 
adding implicit string concatenation if we didn't already have it.
I think there would be a chorus of emails against it -- arguing
against language bloat and noting that we already have triple-quoted
strings, raw-strings, a verbose flag for regexs, backslashes inside multiline
strings, the explicit plus-operator, and multi-line expressions delimited
by parentheses or brackets.  Collectively, that is A LOT of ways to do it.

I'm asking this group to give up a minor habit so that we can achieve
at least a few simplifications on the way to Py3.0 -- basically, our last chance.

Similar thoughts apply to the octal literal PEP.  I'm -1 on introducing
yet another way to write the literal (and a non-standard one at that).
My proposal was simply to eliminate it.  The use cases are few and
far between (translating C headers and setting unix file permissions).
In either case, writing int('0777', 8) suffices.  In the latter case, we've
already provided clear symbolic alternatives.  This simplification of the
language would be a freebie (impacting very little code, simplifying the
lexer, eliminating a special rule, and eliminating a source of confusion
for the young among us who do not know about such things).
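
For example, a permission pattern can be spelled symbolically instead of
as an octal literal (the file name is made up):

    import os, stat

    # equivalent to the old octal literal 0755, i.e. int('0755', 8)
    os.chmod("setup.sh", stat.S_IRWXU
                         | stat.S_IRGRP | stat.S_IXGRP
                         | stat.S_IROTH | stat.S_IXOTH)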


Raymond

From greg.ewing at canterbury.ac.nz  Thu May  3 08:27:55 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 03 May 2007 18:27:55 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070502211440.02a386d8@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070502113122.04e41350@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070502113122.04e41350@sparrow.telecommunity.com>
	<5.1.1.6.0.20070502211440.02a386d8@sparrow.telecommunity.com>
Message-ID: <463980EB.1070102@canterbury.ac.nz>

Phillip J. Eby wrote:

> "After you start a transaction on a database, make 
> sure you turn its logging up all the way."
> 
> Please explain how you would improve on the clarity of that sentence 
> *without* using the word "after" or any synonyms thereof.

I don't object to using the word "after" in the docstring
if it helps. Although in this case the intent could be
described as "Ensure that all transactions are performed
with logging turned up all the way." Whether this is
done before or after starting the transaction doesn't
seem particularly important.

If it *is* important for some reason, that fact should
be mentioned in the docstring. The mere presence of an
@after decorator doesn't indicate whether it's important.
And if it's mentioned in the docstring, there's no need to
announce it again in the decorator.

> ISTM that your argument is like saying there's no need for C
> with all its fancy function parameter names

Parameter names help to document the interface of a
function, which is something you need to know when you're
calling it. You don't need to know the position of a
next-method call to use a generic function.

I don't doubt that things like @before and @after are
handy. But being handy isn't enough for something to
get into the Python core. Python has come a long way
by providing a few very general mechanisms that can
be used in flexible ways. It tends not to go in for
gimmicks whose only benefit is to save a line or two
of code here and there. I think the same philosophy
should be applied to generic functions if we are to
get them.

--
Greg

From greg.ewing at canterbury.ac.nz  Thu May  3 08:31:14 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 03 May 2007 18:31:14 +1200
Subject: [Python-3000] PEP3099 += 'Assignment will not become an
	operation'
In-Reply-To: <f1be9f$5re$1@sea.gmane.org>
References: <f1be9f$5re$1@sea.gmane.org>
Message-ID: <463981B2.6090704@canterbury.ac.nz>

Terry Reedy wrote:
> and hence '=' will not become an operator and hence '=' will not become 
> overloadable.

Actually, '=' *is* overloadable in most cases, if
you can arrange for a suitably customised object
to be used as the namespace being assigned into.
About the only case you can't hook is assignment
to a local name in a function.
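
A minimal sketch of what that can look like, using the Py3k-style exec()
function and a made-up class name:

    class LoggingNamespace(dict):
        def __setitem__(self, name, value):
            print("assigning %r = %r" % (name, value))
            dict.__setitem__(self, name, value)

    ns = LoggingNamespace()
    exec("a = 1; b = a + 1", ns)    # each '=' goes through __setitem__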

--
Greg

From greg.ewing at canterbury.ac.nz  Thu May  3 08:36:15 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 03 May 2007 18:36:15 +1200
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <17977.16058.847429.905398@montanaro.dyndns.org>
References: <20070502210339.BHU28881@ms09.lnh.mail.rcn.net>
	<17977.16058.847429.905398@montanaro.dyndns.org>
Message-ID: <463982DF.6000700@canterbury.ac.nz>

skip at pobox.com wrote:
> when I hit LF in an open
> multiline string a newline is inserted and the cursor is lined up under the
> "r" of "rows", not under the opening quote of the multiline string, and not
> where you chose to indent your example.

Seems to me that Python actually benefits from an
editor which doesn't try to be too clever about
auto-formatting. I'm doing most of my Python editing
at the moment using BBEdit Lite, which knows nothing
at all about Python code -- but it works very
well.

--
Greg

From martin at v.loewis.de  Thu May  3 09:03:16 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 03 May 2007 09:03:16 +0200
Subject: [Python-3000] PEP Parade
In-Reply-To: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
Message-ID: <46398934.2010700@v.loewis.de>

>  S  3121  Module Initialization and finalization       von Löwis
> 
> I like it. I wish the title were changed to "Extension Module ..." though.

Done!

Martin

From skip at pobox.com  Thu May  3 03:45:30 2007
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 2 May 2007 20:45:30 -0500
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <20070502210339.BHU28881@ms09.lnh.mail.rcn.net>
References: <20070502210339.BHU28881@ms09.lnh.mail.rcn.net>
Message-ID: <17977.16058.847429.905398@montanaro.dyndns.org>


    Raymond> [Skip]
    >> I use it all the time.  For example, to build up (what I consider to be)
    >> readable SQL queries:
    >> 
    >> rows = self.executesql("select cities.city, state, country"
    >>                        "    from cities, venues, events, addresses"
    >>                        "    where cities.city like %s"
    >>                        "      and events.active = 1"
    >>                        "      and venues.address = addresses.id"
    >>                        "      and addresses.city = cities.id"
    >>                        "      and events.venue = venues.id",
    >>                        (city,))

    Raymond> I find that style hard to maintain.  What is the advantage over
    Raymond> multi-line strings?

    Raymond>  rows = self.executesql('''
    Raymond>     select cities.city, state, country
    Raymond>     from cities, venues, events, addresses
    Raymond>     where cities.city like %s
    Raymond>           and events.active = 1
    Raymond>           and venues.address = addresses.id
    Raymond>           and addresses.city = cities.id
    Raymond>           and events.venue = venues.id
    Raymond>     ''', 
    Raymond>     (city,))

Maybe it's just a quirk of how python-mode in Emacs treats multiline strings
that caused me to start doing things this way (I've been doing my embedded
SQL statements this way for several years now), but when I hit LF in an open
multiline string a newline is inserted and the cursor is lined up under the
"r" of "rows", not under the opening quote of the multiline string, and not
where you chose to indent your example.  When I use individual strings the
parameters line up where I want them to (the way I lined things up in my
example).  At any rate, it's what I'm used to now.

Skip

From martin at v.loewis.de  Thu May  3 09:19:04 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 03 May 2007 09:19:04 +0200
Subject: [Python-3000] PEP 3120 (Was: PEP Parade)
In-Reply-To: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
Message-ID: <46398CE8.2060206@v.loewis.de>

>  S  3120  Using UTF-8 as the default source encoding   von Löwis
> 
> The basic idea seems very reasonable. I expect that the changes to the
> parser may be quite significant though. Also, the parser ought to be
> weaned off C stdio in favor of Python's own I/O library. I wonder if
> it's really possible to let the parser read the raw bytes though --
> this would seem to rule out supporting encodings like UTF-16. Somehow
> I wonder if it wouldn't be easier if the parser operated on Unicode
> input? That way parsing unicode strings (which we must support as all
> strings will become unicode) will be simpler.

Actually, changes should be fairly minimal. The parser already
transforms all input (no matter what source encoding) to UTF-8
before doing the parsing; this has worked well (as all keywords
continue to be one-byte characters). The parser also already
special-cases UTF-8 as the input encoding, by not putting it
through a codec. That can also stay, except that it should now
check that any non-ASCII bytes are well-formed UTF-8.

Untangling the parser from stdio - sure. I also think it would
be desirable to read the whole source into a buffer, rather than
applying a line-by-line input. That might be a bigger change,
making the tokenizer a multi-stage algorithm:
1. read input into a buffer
2. determine source encoding (looking at a BOM, else a
   declaration within the first two lines, else default
   to UTF-8)
3. if the source encoding is not UTF-8, pass it through
   a codec (decode to string, encode to UTF-8). Otherwise,
   check that all bytes are really well-formed UTF-8.
4. start parsing
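
A rough sketch of steps 1-3 in Python (not the actual tokenizer code;
the helper names are made up):

    import codecs, re

    _coding_re = re.compile(r"coding[:=]\s*([-\w.]+)")

    def read_source_as_utf8(path):
        f = open(path, "rb")
        data = f.read()                       # 1. read input into a buffer
        f.close()
        encoding = "utf-8"                    # 2. determine source encoding
        if data.startswith(codecs.BOM_UTF8):
            data = data[len(codecs.BOM_UTF8):]
        else:
            for line in data.split(b"\n", 2)[:2]:
                m = _coding_re.search(line.decode("latin-1"))
                if m:
                    encoding = m.group(1)
                    break
        if encoding.lower().replace("_", "-") in ("utf-8", "utf8"):
            data.decode("utf-8")              # 3. just check well-formedness
            return data
        return data.decode(encoding).encode("utf-8")  # else recode to UTF-8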

As for UTF-16: the lexer currently does not support UTF-16
as a source encoding, as we require an ASCII superset.

I'm not sure whether UTF-16 needs to be supported as a
source encoding, but with the above changes, it would be fairly
easy to support, assuming we detect UTF-16 from the BOM
(can't use the encoding declaration, because that works
only for ASCII supersets).

Regards,
Martin

From talin at acm.org  Thu May  3 09:24:30 2007
From: talin at acm.org (Talin)
Date: Thu, 03 May 2007 00:24:30 -0700
Subject: [Python-3000] [Python-Dev] Implicit String Concatenation and
 Octal	Literals Was: PEP 30XZ: Simplified Parsing
In-Reply-To: <000401c78d4c$796bfe60$f301a8c0@RaymondLaptop1>
References: <20070502210339.BHU28881@ms09.lnh.mail.rcn.net>	<17977.16058.847429.905398@montanaro.dyndns.org>
	<000401c78d4c$796bfe60$f301a8c0@RaymondLaptop1>
Message-ID: <46398E2E.1010604@acm.org>

Raymond Hettinger wrote:
>>    Raymond> I find that style hard to maintain.  What is the advantage over
>>    Raymond> multi-line strings?
>>
>>    Raymond>  rows = self.executesql('''
>>    Raymond>     select cities.city, state, country
>>    Raymond>     from cities, venues, events, addresses
>>    Raymond>     where cities.city like %s
>>    Raymond>           and events.active = 1
>>    Raymond>           and venues.address = addresses.id
>>    Raymond>           and addresses.city = cities.id
>>    Raymond>           and events.venue = venues.id
>>    Raymond>     ''', 
>>    Raymond>     (city,))
> 
> [Skip]
>> Maybe it's just a quirk of how python-mode in Emacs treats multiline strings
>> that caused me to start doing things this way (I've been doing my embedded
>> SQL statements this way for several years now), but when I hit LF in an open
>> multiline string a newline is inserted and the cursor is lined up under the
>> "r" of "rows", not under the opening quote of the multiline string, and not
>> where you chose to indent your example.  When I use individual strings the
>> parameters line up where I want them to (the way I lined things up in my
>> example).  At any rate, it's what I'm used to now.
> 
> 
> I completely understand.  Almost any simplification or feature elimination
> proposal is going to bump up against "what we're used to now".
> Py3k may be our last chance to simplify the language.  We have so many
> special little rules that even advanced users can't keep them
> all in their head.  Certainly, every feature has someone who uses it.
> But, there is some value to reducing the number of rules, especially
> if those rules are non-essential (i.e. implicit string concatenation has
> simple, clear alternatives with multi-line strings or with the plus-operator).
> 
> Another way to look at it is to ask whether we would consider 
> adding implicit string concatenation if we didn't already have it.
> I think there would be a chorus of emails against it -- arguing
> against language bloat and noting that we already have triple-quoted
> strings, raw-strings, a verbose flag for regexs, backslashes inside multiline
> strings, the explicit plus-operator, and multi-line expressions delimited
> by parentheses or brackets.  Collectively, that is A LOT of ways to do it.
> 
> I'm asking this group to give up a minor habit so that we can achieve
> at least a few simplifications on the way to Py3.0 -- basically, our last chance.
> 
> Similar thoughts apply to the octal literal PEP.  I'm -1 on introducing
> yet another way to write the literal (and a non-standard one at that).
> My proposal was simply to eliminate it.  The use cases are few and
> far between (translating C headers and setting unix file permissions).
> In either case, writing int('0777', 8) suffices.  In the latter case, we've
> already provided clear symbolic alternatives.  This simplification of the
> language would be a freebie (impacting very little code, simplifying the
> lexer, eliminating a special rule, and eliminating a source of confusion
> for the young among us who do not know about such things).

My counter argument is that these simplifications aren't simplifying 
much - that is, the removals don't cascade and cause other 
simplifications. The grammar file, for example, won't look dramatically 
different if these changes are made. The simplification argument seems 
weak to me because the change in overall language complexity is very 
small, whereas the inconvenience caused, while not huge, is at least 
significant.

That being said, line continuation is the only one I really care about. 
And I would happily give up backslashes in exchange for a more sane 
method of continuing lines. Either way avoids "spurious" grouping 
operators which IMHO don't make for easier-to-read code.

-- Talin


From rasky at develer.com  Thu May  3 09:25:44 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Thu, 03 May 2007 09:25:44 +0200
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
Message-ID: <f1c2pp$l30$1@sea.gmane.org>

On 01/05/2007 18.09, Phillip J. Eby wrote:

>> The alternative is to code the automatic finalization steps using
>> weakref callbacks.  For those used to using __del__, it takes a little
>> while to learn the idiom but essentially the technique is hold a proxy
>> or ref with a callback to a boundmethod for finalization:
>>     self.resource = resource = CreateResource()
>>     self.callbacks.append(proxy(resource, resource.closedown))
>> In this manner, all of the object's resources can be freed automatically
>> when the object is collected.  Note, that the callbacks only bind
>> the resource object and not client object, so the client object
>> can already have been collected and the teardown code can be run
>> without risk of resurrecting the client (with a possibly invalid state).
> 
> I'm a bit confused about the above.  My understanding is that in order for 
> a weakref's callback to be invoked, the weakref itself *must still be 
> live*.  That means that if 'self' in your example above is collected, then 
> the weakref no longer exists, so the closedown won't be called.

Yes, but as far as I understand it, the GC takes special care to ensure that 
the callback of a weakref that is *not* part of the cyclic trash being 
collected is always called. See this comment in gcmodule.c:

	 * OTOH, if wr isn't part of CT, we should invoke the callback:  the
	 * weakref outlived the trash.  Note that since wr isn't CT in this
	 * case, its callback can't be CT either -- wr acted as an external
	 * root to this generation, and therefore its callback did too.  So
	 * nothing in CT is reachable from the callback either, so it's hard
	 * to imagine how calling it later could create a problem for us.  wr
	 * is moved to wrcb_to_call in this case.


I might be wrong about the internals of the GC, but I have used the weakref 
idiom many times and it has always appeared to work.
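
For reference, the idiom as I use it looks roughly like this (the class
names are made up; the module-level list is what keeps the weakref
itself alive):

    import weakref

    _finalizer_refs = []        # keeps the weakrefs themselves alive

    class Resource(object):
        def closedown(self):
            print("resource closed down")

    class Client(object):
        def __init__(self):
            self.resource = resource = Resource()
            # The callback binds only the resource, not the client, so
            # collecting the client cannot resurrect it.
            _finalizer_refs.append(
                weakref.ref(self, lambda ref, r=resource: r.closedown()))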

> In principle I'm in favor of ditching __del__, as long as there's actually 
> a viable technique for doing so.  My own experience has been that setting 
> up a simple mechanism to replace it (and that actually works) is really 
> difficult, because you have to find some place for the weakref itself to 
> live, which usually means a global dictionary or something of that 
> sort.

Others suggested that such a framework could be prepared, but I have not seen 
one yet.

> It would be nice if the gc or weakref modules grew a facility to
> make it easier to register finalization callbacks, and could optionally 
> check whether you were registering a callback that referenced the thing you 
> were tying the callback's life to.

That'd be absolutely great! OTOH, the GC could possibly re-verify such an 
assertion every time it kicks in (when a special debug flag is activated).
-- 
Giovanni Bajo
Develer S.r.l.
http://www.develer.com


From tjreedy at udel.edu  Thu May  3 10:17:15 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 3 May 2007 04:17:15 -0400
Subject: [Python-3000] PEP3099 += 'Assignment will not become
	anoperation'
References: <f1be9f$5re$1@sea.gmane.org> <463981B2.6090704@canterbury.ac.nz>
Message-ID: <f1c5q9$u81$1@sea.gmane.org>


"Greg Ewing" <greg.ewing at canterbury.ac.nz> wrote in message 
news:463981B2.6090704 at canterbury.ac.nz...
| Terry Reedy wrote:
| > and hence '=' will not become an operator and hence '=' will not become
| > overloadable.
|
| Actually, '=' *is* overloadable in most cases,

It is not overloadable in the sense I meant, and in the sense people 
occasionally request, which is to have '=' be an *operation* that invokes a 
special method such as __assign__, just as the '+' operation invokes 
'__add__'.

| you can arrange for a suitably customised object
| to be used as the namespace being assigned into.
| About the only case you can't hook is assignment
| to a local name in a function.

I mentioned purse classes in the appropriate place -- c.l.p.
I cannot think of any way to make plain assignment statements ('a = 
object') at module scope do anything other than bind an object to a name in 
the global namespace.

Back to my original point: people occasionally ask that assignment 
statements become assignment expressions, as in C, by making '=' an 
operation with an overloadable special method.  Guido has consistently said 
no.  This came up again today.  Since this is a much more frequent request 
than some of the items already in 3099, I think it should be added there.

Terry Jan Reedy




From walter at livinglogic.de  Thu May  3 12:01:48 2007
From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=)
Date: Thu, 03 May 2007 12:01:48 +0200
Subject: [Python-3000] [Python-checkins] r55079 - in
 python/branches/py3k-struni/Lib: [many files]
In-Reply-To: <20070502191059.A48491E4010@bag.python.org>
References: <20070502191059.A48491E4010@bag.python.org>
Message-ID: <4639B30C.10307@livinglogic.de>

guido.van.rossum wrote:

> Author: guido.van.rossum
> Date: Wed May  2 21:09:54 2007
> New Revision: 55079
> 
> Modified:
> Log:
 > [...]
> Rip out all the u"..." literals and calls to unicode().

That might be one of the largest diffs in Python's history. ;)

Some of the changes lead to strange code like
    isinstance(foo, (str, str))

Below are the strange spots I noticed at first glance. I'm sure I missed 
a few.

Servus,
    Walter

> [...]
> Modified: python/branches/py3k-struni/Lib/copy.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/copy.py	(original)
> +++ python/branches/py3k-struni/Lib/copy.py	Wed May  2 21:09:54 2007
> @@ -186,7 +186,7 @@
>      pass
>  d[str] = _deepcopy_atomic
>  try:
> -    d[unicode] = _deepcopy_atomic
> +    d[str] = _deepcopy_atomic
>  except NameError:
>      pass

The try:except: is unnecessary now.

>  try:
> 
> Modified: python/branches/py3k-struni/Lib/ctypes/__init__.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/ctypes/__init__.py	(original)
> +++ python/branches/py3k-struni/Lib/ctypes/__init__.py	Wed May  2 21:09:54 2007
> @@ -59,7 +59,7 @@
>      create_string_buffer(anInteger) -> character array
>      create_string_buffer(aString, anInteger) -> character array
>      """
> -    if isinstance(init, (str, unicode)):
> +    if isinstance(init, (str, str)):
>          if size is None:
>              size = len(init)+1
>          buftype = c_char * size
> @@ -281,7 +281,7 @@
>          create_unicode_buffer(anInteger) -> character array
>          create_unicode_buffer(aString, anInteger) -> character array
>          """
> -        if isinstance(init, (str, unicode)):
> +        if isinstance(init, (str, str)):
>              if size is None:
>                  size = len(init)+1
>              buftype = c_wchar * size

This could be simplified to:
    if isinstance(init, str):

> Modified: python/branches/py3k-struni/Lib/distutils/command/bdist_wininst.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/distutils/command/bdist_wininst.py	(original)
> +++ python/branches/py3k-struni/Lib/distutils/command/bdist_wininst.py	Wed May  2 21:09:54 2007
> @@ -247,11 +247,11 @@
>  
>          # Convert cfgdata from unicode to ascii, mbcs encoded
>          try:
> -            unicode
> +            str
>          except NameError:
>              pass
>          else:
> -            if isinstance(cfgdata, unicode):
> +            if isinstance(cfgdata, str):
>                  cfgdata = cfgdata.encode("mbcs")

The try:except: is again unnecessary.

> Modified: python/branches/py3k-struni/Lib/doctest.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/doctest.py	(original)
> +++ python/branches/py3k-struni/Lib/doctest.py	Wed May  2 21:09:54 2007
> @@ -196,7 +196,7 @@
>      """
>      if inspect.ismodule(module):
>          return module
> -    elif isinstance(module, (str, unicode)):
> +    elif isinstance(module, (str, str)):

-> elif isinstance(module, str):


> Modified: python/branches/py3k-struni/Lib/encodings/idna.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/encodings/idna.py	(original)
> +++ python/branches/py3k-struni/Lib/encodings/idna.py	Wed May  2 21:09:54 2007
> @@ -4,11 +4,11 @@
>  from unicodedata import ucd_3_2_0 as unicodedata
>  
>  # IDNA section 3.1
> -dots = re.compile(u"[\u002E\u3002\uFF0E\uFF61]")
> +dots = re.compile("[\u002E\u3002\uFF0E\uFF61]")
>  
>  # IDNA section 5
>  ace_prefix = "xn--"
> -uace_prefix = unicode(ace_prefix, "ascii")
> +uace_prefix = str(ace_prefix, "ascii")

This looks unnecessary to me.

> Modified: python/branches/py3k-struni/Lib/idlelib/PyParse.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/idlelib/PyParse.py	(original)
> +++ python/branches/py3k-struni/Lib/idlelib/PyParse.py	Wed May  2 21:09:54 2007
> @@ -105,7 +105,7 @@
>  del ch
>  
>  try:
> -    UnicodeType = type(unicode(""))
> +    UnicodeType = type(str(""))
>  except NameError:
>      UnicodeType = None

This should probably be:
    UnicodeType = str
(or the code could directly use str)

> Modified: python/branches/py3k-struni/Lib/lib-tk/Tkinter.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/lib-tk/Tkinter.py	(original)
> +++ python/branches/py3k-struni/Lib/lib-tk/Tkinter.py	Wed May  2 21:09:54 2007
> @@ -3736,7 +3736,7 @@
>      text = "This is Tcl/Tk version %s" % TclVersion
>      if TclVersion >= 8.1:
>          try:
> -            text = text + unicode("\nThis should be a cedilla: \347",
> +            text = text + str("\nThis should be a cedilla: \347",
>                                    "iso-8859-1")

Better:
             text = text + "\nThis should be a cedilla: \xe7"

> Modified: python/branches/py3k-struni/Lib/pickle.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/pickle.py	(original)
> +++ python/branches/py3k-struni/Lib/pickle.py	Wed May  2 21:09:54 2007
> @@ -523,22 +523,22 @@
>      if StringType == UnicodeType:
>          # This is true for Jython

What's happening here?

> [...]


> Modified: python/branches/py3k-struni/Lib/plat-mac/EasyDialogs.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/plat-mac/EasyDialogs.py	(original)
> +++ python/branches/py3k-struni/Lib/plat-mac/EasyDialogs.py	Wed May  2 21:09:54 2007
> @@ -662,7 +662,7 @@
>          return tpwanted(rr.selection[0])
>      if issubclass(tpwanted, str):
>          return tpwanted(rr.selection_fsr[0].as_pathname())
> -    if issubclass(tpwanted, unicode):
> +    if issubclass(tpwanted, str):
>          return tpwanted(rr.selection_fsr[0].as_pathname(), 'utf8')
>      raise TypeError, "Unknown value for argument 'wanted': %s" % repr(tpwanted)
>  
> @@ -713,7 +713,7 @@
>          raise TypeError, "Cannot pass wanted=FSRef to AskFileForSave"
>      if issubclass(tpwanted, Carbon.File.FSSpec):
>          return tpwanted(rr.selection[0])
> -    if issubclass(tpwanted, (str, unicode)):
> +    if issubclass(tpwanted, (str, str)):

-> if issubclass(tpwanted, str):

>          if sys.platform == 'mac':
>              fullpath = rr.selection[0].as_pathname()
>          else:
> @@ -722,10 +722,10 @@
>              pardir_fss = Carbon.File.FSSpec((vrefnum, dirid, ''))
>              pardir_fsr = Carbon.File.FSRef(pardir_fss)
>              pardir_path = pardir_fsr.FSRefMakePath()  # This is utf-8
> -            name_utf8 = unicode(name, 'macroman').encode('utf8')
> +            name_utf8 = str(name, 'macroman').encode('utf8')
>              fullpath = os.path.join(pardir_path, name_utf8)
> -        if issubclass(tpwanted, unicode):
> -            return unicode(fullpath, 'utf8')
> +        if issubclass(tpwanted, str):
> +            return str(fullpath, 'utf8')
>          return tpwanted(fullpath)
>      raise TypeError, "Unknown value for argument 'wanted': %s" % repr(tpwanted)
>  
> @@ -775,7 +775,7 @@
>          return tpwanted(rr.selection[0])
>      if issubclass(tpwanted, str):
>          return tpwanted(rr.selection_fsr[0].as_pathname())
> -    if issubclass(tpwanted, unicode):
> +    if issubclass(tpwanted, str):

This does the same check twice.

> Modified: python/branches/py3k-struni/Lib/plat-mac/plistlib.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/plat-mac/plistlib.py	(original)
> +++ python/branches/py3k-struni/Lib/plat-mac/plistlib.py	Wed May  2 21:09:54 2007
> @@ -70,7 +70,7 @@
>      usually is a dictionary).
>      """
>      didOpen = 0
> -    if isinstance(pathOrFile, (str, unicode)):
> +    if isinstance(pathOrFile, (str, str)):

-> if isinstance(pathOrFile, str):

>          pathOrFile = open(pathOrFile)
>          didOpen = 1
>      p = PlistParser()
> @@ -85,7 +85,7 @@
>      file name or a (writable) file object.
>      """
>      didOpen = 0
> -    if isinstance(pathOrFile, (str, unicode)):
> +    if isinstance(pathOrFile, (str, str)):

-> if isinstance(pathOrFile, str):

>          pathOrFile = open(pathOrFile, "w")
>          didOpen = 1
>      writer = PlistWriter(pathOrFile)
> @@ -231,7 +231,7 @@
>          DumbXMLWriter.__init__(self, file, indentLevel, indent)
>  
>      def writeValue(self, value):
> -        if isinstance(value, (str, unicode)):
> +        if isinstance(value, (str, str)):

-> if isinstance(value, str):

>              self.simpleElement("string", value)
>          elif isinstance(value, bool):
>              # must switch for bool before int, as bool is a
> @@ -270,7 +270,7 @@
>          self.beginElement("dict")
>          items = sorted(d.items())
>          for key, value in items:
> -            if not isinstance(key, (str, unicode)):
> +            if not isinstance(key, (str, str)):

-> if not isinstance(key, str):

> Modified: python/branches/py3k-struni/Lib/sqlite3/test/factory.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/sqlite3/test/factory.py	(original)
> +++ python/branches/py3k-struni/Lib/sqlite3/test/factory.py	Wed May  2 21:09:54 2007
> @@ -139,31 +139,31 @@
>          self.con = sqlite.connect(":memory:")
>  
>      def CheckUnicode(self):
> -        austria = unicode("Österreich", "latin1")
> +        austria = str("Österreich", "latin1")
>          row = self.con.execute("select ?", (austria,)).fetchone()
> -        self.failUnless(type(row[0]) == unicode, "type of row[0] must be unicode")
> +        self.failUnless(type(row[0]) == str, "type of row[0] must be unicode")
>  
>      def CheckString(self):
>          self.con.text_factory = str
> -        austria = unicode("Österreich", "latin1")
> +        austria = str("Österreich", "latin1")
>          row = self.con.execute("select ?", (austria,)).fetchone()
>          self.failUnless(type(row[0]) == str, "type of row[0] must be str")
>          self.failUnless(row[0] == austria.encode("utf-8"), "column must equal original data in UTF-8")

It looks like both those tests do the same thing now.

> Modified: python/branches/py3k-struni/Lib/tarfile.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/tarfile.py	(original)
> +++ python/branches/py3k-struni/Lib/tarfile.py	Wed May  2 21:09:54 2007
> @@ -1031,7 +1031,7 @@
>          for name, digits in (("uid", 8), ("gid", 8), ("size", 12), ("mtime", 12)):
>              val = info[name]
>              if not 0 <= val < 8 ** (digits - 1) or isinstance(val, float):
> -                pax_headers[name] = unicode(val)
> +                pax_headers[name] = str(val)
>                  info[name] = 0
>  
>          if pax_headers:
> @@ -1054,12 +1054,12 @@
>  
>      @staticmethod
>      def _to_unicode(value, encoding):
> -        if isinstance(value, unicode):
> +        if isinstance(value, str):
>              return value
>          elif isinstance(value, (int, float)):
> -            return unicode(value)
> +            return str(value)
>          elif isinstance(value, str):
> -            return unicode(value, encoding)
> +            return str(value, encoding)
>          else:
>              raise ValueError("unable to convert to unicode: %r" % value)

Here the same test is done twice too.

> Modified: python/branches/py3k-struni/Lib/test/pickletester.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/pickletester.py	(original)
> +++ python/branches/py3k-struni/Lib/test/pickletester.py	Wed May  2 21:09:54 2007
> @@ -484,8 +484,8 @@
>  
>      if have_unicode:
>          def test_unicode(self):
> -            endcases = [unicode(''), unicode('<\\u>'), unicode('<\\\u1234>'),
> -                        unicode('<\n>'),  unicode('<\\>')]
> +            endcases = [str(''), str('<\\u>'), str('<\\\u1234>'),
> +                        str('<\n>'),  str('<\\>')]

The str() call is unnecessary.

> Modified: python/branches/py3k-struni/Lib/test/string_tests.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/string_tests.py	(original)
> +++ python/branches/py3k-struni/Lib/test/string_tests.py	Wed May  2 21:09:54 2007
> @@ -589,7 +589,7 @@
>          self.checkequal(['a']*19 + ['a '], aaa, 'split', None, 19)
>  
>          # mixed use of str and unicode
> -        self.checkequal([u'a', u'b', u'c d'], 'a b c d', 'split', u' ', 2)
> +        self.checkequal(['a', 'b', 'c d'], 'a b c d', 'split', ' ', 2)
>  
>      def test_additional_rsplit(self):
>          self.checkequal(['this', 'is', 'the', 'rsplit', 'function'],
> @@ -622,7 +622,7 @@
>          self.checkequal([' a  a'] + ['a']*18, aaa, 'rsplit', None, 18)
>  
>          # mixed use of str and unicode
> -        self.checkequal([u'a b', u'c', u'd'], 'a b c d', 'rsplit', u' ', 2)
> +        self.checkequal(['a b', 'c', 'd'], 'a b c d', 'rsplit', ' ', 2)
>  
>      def test_strip(self):
>          self.checkequal('hello', '   hello   ', 'strip')
> @@ -644,14 +644,14 @@
>  
>          # strip/lstrip/rstrip with unicode arg
>          if test_support.have_unicode:
> -            self.checkequal(unicode('hello', 'ascii'), 'xyzzyhelloxyzzy',
> -                 'strip', unicode('xyz', 'ascii'))
> -            self.checkequal(unicode('helloxyzzy', 'ascii'), 'xyzzyhelloxyzzy',
> -                 'lstrip', unicode('xyz', 'ascii'))
> -            self.checkequal(unicode('xyzzyhello', 'ascii'), 'xyzzyhelloxyzzy',
> -                 'rstrip', unicode('xyz', 'ascii'))
> -            self.checkequal(unicode('hello', 'ascii'), 'hello',
> -                 'strip', unicode('xyz', 'ascii'))
> +            self.checkequal(str('hello', 'ascii'), 'xyzzyhelloxyzzy',
> +                 'strip', str('xyz', 'ascii'))
> +            self.checkequal(str('helloxyzzy', 'ascii'), 'xyzzyhelloxyzzy',
> +                 'lstrip', str('xyz', 'ascii'))
> +            self.checkequal(str('xyzzyhello', 'ascii'), 'xyzzyhelloxyzzy',
> +                 'rstrip', str('xyz', 'ascii'))
> +            self.checkequal(str('hello', 'ascii'), 'hello',
> +                 'strip', str('xyz', 'ascii'))

The str() call is unnecessary.

>          self.checkraises(TypeError, 'hello', 'strip', 42, 42)
>          self.checkraises(TypeError, 'hello', 'lstrip', 42, 42)
> @@ -908,13 +908,13 @@
>          self.checkequal(False, '', '__contains__', 'asdf')    # vereq('asdf' in '', False)
>  
>      def test_subscript(self):
> -        self.checkequal(u'a', 'abc', '__getitem__', 0)
> -        self.checkequal(u'c', 'abc', '__getitem__', -1)
> -        self.checkequal(u'a', 'abc', '__getitem__', 0)
> -        self.checkequal(u'abc', 'abc', '__getitem__', slice(0, 3))
> -        self.checkequal(u'abc', 'abc', '__getitem__', slice(0, 1000))
> -        self.checkequal(u'a', 'abc', '__getitem__', slice(0, 1))
> -        self.checkequal(u'', 'abc', '__getitem__', slice(0, 0))
> +        self.checkequal('a', 'abc', '__getitem__', 0)
> +        self.checkequal('c', 'abc', '__getitem__', -1)
> +        self.checkequal('a', 'abc', '__getitem__', 0)
> +        self.checkequal('abc', 'abc', '__getitem__', slice(0, 3))
> +        self.checkequal('abc', 'abc', '__getitem__', slice(0, 1000))
> +        self.checkequal('a', 'abc', '__getitem__', slice(0, 1))
> +        self.checkequal('', 'abc', '__getitem__', slice(0, 0))
>          # FIXME What about negative indices? This is handled differently by [] and __getitem__(slice)
>  
>          self.checkraises(TypeError, 'abc', '__getitem__', 'def')
> @@ -957,11 +957,11 @@
>          self.checkequal('abc', 'a', 'join', ('abc',))
>          self.checkequal('z', 'a', 'join', UserList(['z']))
>          if test_support.have_unicode:
> -            self.checkequal(unicode('a.b.c'), unicode('.'), 'join', ['a', 'b', 'c'])
> -            self.checkequal(unicode('a.b.c'), '.', 'join', [unicode('a'), 'b', 'c'])
> -            self.checkequal(unicode('a.b.c'), '.', 'join', ['a', unicode('b'), 'c'])
> -            self.checkequal(unicode('a.b.c'), '.', 'join', ['a', 'b', unicode('c')])
> -            self.checkraises(TypeError, '.', 'join', ['a', unicode('b'), 3])
> +            self.checkequal(str('a.b.c'), str('.'), 'join', ['a', 'b', 'c'])
> +            self.checkequal(str('a.b.c'), '.', 'join', [str('a'), 'b', 'c'])
> +            self.checkequal(str('a.b.c'), '.', 'join', ['a', str('b'), 'c'])
> +            self.checkequal(str('a.b.c'), '.', 'join', ['a', 'b', str('c')])
> +            self.checkraises(TypeError, '.', 'join', ['a', str('b'), 3])

The str() call is unnecessary.

> Modified: python/branches/py3k-struni/Lib/test/test_array.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_array.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_array.py	Wed May  2 21:09:54 2007
> @@ -747,7 +747,7 @@
>  
>      def test_nounicode(self):
>          a = array.array(self.typecode, self.example)
> -        self.assertRaises(ValueError, a.fromunicode, unicode(''))
> +        self.assertRaises(ValueError, a.fromunicode, str(''))
>          self.assertRaises(ValueError, a.tounicode)

Should the methods fromunicode() and tounicode() be renamed?

>  tests.append(CharacterTest)
> @@ -755,27 +755,27 @@
>  if test_support.have_unicode:
>      class UnicodeTest(StringTest):
>          typecode = 'u'
> -        example = unicode(r'\x01\u263a\x00\ufeff', 'unicode-escape')
> -        smallerexample = unicode(r'\x01\u263a\x00\ufefe', 'unicode-escape')
> -        biggerexample = unicode(r'\x01\u263a\x01\ufeff', 'unicode-escape')
> -        outside = unicode('\x33')
> +        example = str(r'\x01\u263a\x00\ufeff', 'unicode-escape')
> +        smallerexample = str(r'\x01\u263a\x00\ufefe', 'unicode-escape')
> +        biggerexample = str(r'\x01\u263a\x01\ufeff', 'unicode-escape')
> +        outside = str('\x33')
>          minitemsize = 2
>  
>          def test_unicode(self):
> -            self.assertRaises(TypeError, array.array, 'b', unicode('foo', 'ascii'))
> +            self.assertRaises(TypeError, array.array, 'b', str('foo', 'ascii'))
> -            a = array.array('u', unicode(r'\xa0\xc2\u1234', 'unicode-escape'))
> -            a.fromunicode(unicode(' ', 'ascii'))
> -            a.fromunicode(unicode('', 'ascii'))
> -            a.fromunicode(unicode('', 'ascii'))
> -            a.fromunicode(unicode(r'\x11abc\xff\u1234', 'unicode-escape'))
> +            a = array.array('u', str(r'\xa0\xc2\u1234', 'unicode-escape'))
> +            a.fromunicode(str(' ', 'ascii'))
> +            a.fromunicode(str('', 'ascii'))
> +            a.fromunicode(str('', 'ascii'))
> +            a.fromunicode(str(r'\x11abc\xff\u1234', 'unicode-escape'))
>              s = a.tounicode()
>              self.assertEqual(
>                  s,
> -                unicode(r'\xa0\xc2\u1234 \x11abc\xff\u1234', 'unicode-escape')
> +                str(r'\xa0\xc2\u1234 \x11abc\xff\u1234', 'unicode-escape')
>              )
>  
> -            s = unicode(r'\x00="\'a\\b\x80\xff\u0000\u0001\u1234', 'unicode-escape')
> +            s = str(r'\x00="\'a\\b\x80\xff\u0000\u0001\u1234', 'unicode-escape')
>              a = array.array('u', s)
>              self.assertEqual(
>                  repr(a),

The str(..., 'ascii') call is unnecessary.

> Modified: python/branches/py3k-struni/Lib/test/test_binascii.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_binascii.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_binascii.py	Wed May  2 21:09:54 2007
> @@ -124,7 +124,7 @@
>  
>          # Verify the treatment of Unicode strings
>          if test_support.have_unicode:
> -            self.assertEqual(binascii.hexlify(unicode('a', 'ascii')), '61')
> +            self.assertEqual(binascii.hexlify(str('a', 'ascii')), '61')

The str() call is unnecessary.

> Modified: python/branches/py3k-struni/Lib/test/test_bool.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_bool.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_bool.py	Wed May  2 21:09:54 2007
> @@ -208,28 +208,28 @@
>          self.assertIs("xyz".startswith("z"), False)
>  
>          if test_support.have_unicode:
> -            self.assertIs(unicode("xyz", 'ascii').endswith(unicode("z", 'ascii')), True)
> -            self.assertIs(unicode("xyz", 'ascii').endswith(unicode("x", 'ascii')), False)
> -            self.assertIs(unicode("xyz0123", 'ascii').isalnum(), True)
> -            self.assertIs(unicode("@#$%", 'ascii').isalnum(), False)
> -            self.assertIs(unicode("xyz", 'ascii').isalpha(), True)
> -            self.assertIs(unicode("@#$%", 'ascii').isalpha(), False)
> -            self.assertIs(unicode("0123", 'ascii').isdecimal(), True)
> -            self.assertIs(unicode("xyz", 'ascii').isdecimal(), False)
> -            self.assertIs(unicode("0123", 'ascii').isdigit(), True)
> -            self.assertIs(unicode("xyz", 'ascii').isdigit(), False)
> -            self.assertIs(unicode("xyz", 'ascii').islower(), True)
> -            self.assertIs(unicode("XYZ", 'ascii').islower(), False)
> -            self.assertIs(unicode("0123", 'ascii').isnumeric(), True)
> -            self.assertIs(unicode("xyz", 'ascii').isnumeric(), False)
> -            self.assertIs(unicode(" ", 'ascii').isspace(), True)
> -            self.assertIs(unicode("XYZ", 'ascii').isspace(), False)
> -            self.assertIs(unicode("X", 'ascii').istitle(), True)
> -            self.assertIs(unicode("x", 'ascii').istitle(), False)
> -            self.assertIs(unicode("XYZ", 'ascii').isupper(), True)
> -            self.assertIs(unicode("xyz", 'ascii').isupper(), False)
> -            self.assertIs(unicode("xyz", 'ascii').startswith(unicode("x", 'ascii')), True)
> -            self.assertIs(unicode("xyz", 'ascii').startswith(unicode("z", 'ascii')), False)
> +            self.assertIs(str("xyz", 'ascii').endswith(str("z", 'ascii')), True)
> +            self.assertIs(str("xyz", 'ascii').endswith(str("x", 'ascii')), False)
> +            self.assertIs(str("xyz0123", 'ascii').isalnum(), True)
> +            self.assertIs(str("@#$%", 'ascii').isalnum(), False)
> +            self.assertIs(str("xyz", 'ascii').isalpha(), True)
> +            self.assertIs(str("@#$%", 'ascii').isalpha(), False)
> +            self.assertIs(str("0123", 'ascii').isdecimal(), True)
> +            self.assertIs(str("xyz", 'ascii').isdecimal(), False)
> +            self.assertIs(str("0123", 'ascii').isdigit(), True)
> +            self.assertIs(str("xyz", 'ascii').isdigit(), False)
> +            self.assertIs(str("xyz", 'ascii').islower(), True)
> +            self.assertIs(str("XYZ", 'ascii').islower(), False)
> +            self.assertIs(str("0123", 'ascii').isnumeric(), True)
> +            self.assertIs(str("xyz", 'ascii').isnumeric(), False)
> +            self.assertIs(str(" ", 'ascii').isspace(), True)
> +            self.assertIs(str("XYZ", 'ascii').isspace(), False)
> +            self.assertIs(str("X", 'ascii').istitle(), True)
> +            self.assertIs(str("x", 'ascii').istitle(), False)
> +            self.assertIs(str("XYZ", 'ascii').isupper(), True)
> +            self.assertIs(str("xyz", 'ascii').isupper(), False)
> +            self.assertIs(str("xyz", 'ascii').startswith(str("x", 'ascii')), True)
> +            self.assertIs(str("xyz", 'ascii').startswith(str("z", 'ascii')), False)

These tests can IMHO simply be dropped.

> Modified: python/branches/py3k-struni/Lib/test/test_builtin.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_builtin.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_builtin.py	Wed May  2 21:09:54 2007
> @@ -74,22 +74,22 @@
>  ]
>  if have_unicode:
>      L += [
> -        (unicode('0'), 0),
> -        (unicode('1'), 1),
> -        (unicode('9'), 9),
> -        (unicode('10'), 10),
> -        (unicode('99'), 99),
> -        (unicode('100'), 100),
> -        (unicode('314'), 314),
> -        (unicode(' 314'), 314),
> -        (unicode(b'\u0663\u0661\u0664 ','raw-unicode-escape'), 314),
> -        (unicode('  \t\t  314  \t\t  '), 314),
> -        (unicode('  1x'), ValueError),
> -        (unicode('  1  '), 1),
> -        (unicode('  1\02  '), ValueError),
> -        (unicode(''), ValueError),
> -        (unicode(' '), ValueError),
> -        (unicode('  \t\t  '), ValueError),
> +        (str('0'), 0),
> +        (str('1'), 1),
> +        (str('9'), 9),
> +        (str('10'), 10),
> +        (str('99'), 99),
> +        (str('100'), 100),
> +        (str('314'), 314),
> +        (str(' 314'), 314),
> +        (str(b'\u0663\u0661\u0664 ','raw-unicode-escape'), 314),
> +        (str('  \t\t  314  \t\t  '), 314),
> +        (str('  1x'), ValueError),
> +        (str('  1  '), 1),
> +        (str('  1\02  '), ValueError),
> +        (str(''), ValueError),
> +        (str(' '), ValueError),
> +        (str('  \t\t  '), ValueError),
>          (unichr(0x200), ValueError),
>  ]

Most of these tests can probably be dropped too.

Probably any test that checks have_unicode should be looked at.

> Modified: python/branches/py3k-struni/Lib/test/test_cfgparser.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_cfgparser.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_cfgparser.py	Wed May  2 21:09:54 2007
> @@ -248,12 +248,12 @@
>          cf.set("sect", "option2", "splat")
>          cf.set("sect", "option2", mystr("splat"))
>          try:
> -            unicode
> +            str
>          except NameError:
>              pass
>          else:
> -            cf.set("sect", "option1", unicode("splat"))
> -            cf.set("sect", "option2", unicode("splat"))
> +            cf.set("sect", "option1", str("splat"))
> +            cf.set("sect", "option2", str("splat"))

The try:except: and the str() calls are unnecessary.

> Modified: python/branches/py3k-struni/Lib/test/test_charmapcodec.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_charmapcodec.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_charmapcodec.py	Wed May  2 21:09:54 2007
> @@ -27,27 +27,27 @@
>  
>  class CharmapCodecTest(unittest.TestCase):
>      def test_constructorx(self):
> -        self.assertEquals(unicode('abc', codecname), u'abc')
> -        self.assertEquals(unicode('xdef', codecname), u'abcdef')
> -        self.assertEquals(unicode('defx', codecname), u'defabc')
> -        self.assertEquals(unicode('dxf', codecname), u'dabcf')
> -        self.assertEquals(unicode('dxfx', codecname), u'dabcfabc')
> +        self.assertEquals(str('abc', codecname), 'abc')
> +        self.assertEquals(str('xdef', codecname), 'abcdef')
> +        self.assertEquals(str('defx', codecname), 'defabc')
> +        self.assertEquals(str('dxf', codecname), 'dabcf')
> +        self.assertEquals(str('dxfx', codecname), 'dabcfabc')
>  
>      def test_encodex(self):
> -        self.assertEquals(u'abc'.encode(codecname), 'abc')
> -        self.assertEquals(u'xdef'.encode(codecname), 'abcdef')
> -        self.assertEquals(u'defx'.encode(codecname), 'defabc')
> -        self.assertEquals(u'dxf'.encode(codecname), 'dabcf')
> -        self.assertEquals(u'dxfx'.encode(codecname), 'dabcfabc')
> +        self.assertEquals('abc'.encode(codecname), 'abc')
> +        self.assertEquals('xdef'.encode(codecname), 'abcdef')
> +        self.assertEquals('defx'.encode(codecname), 'defabc')
> +        self.assertEquals('dxf'.encode(codecname), 'dabcf')
> +        self.assertEquals('dxfx'.encode(codecname), 'dabcfabc')
>  
>      def test_constructory(self):
> -        self.assertEquals(unicode('ydef', codecname), u'def')
> -        self.assertEquals(unicode('defy', codecname), u'def')
> -        self.assertEquals(unicode('dyf', codecname), u'df')
> -        self.assertEquals(unicode('dyfy', codecname), u'df')
> +        self.assertEquals(str('ydef', codecname), 'def')
> +        self.assertEquals(str('defy', codecname), 'def')
> +        self.assertEquals(str('dyf', codecname), 'df')
> +        self.assertEquals(str('dyfy', codecname), 'df')

These should probably be b'...' constants.
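
E.g. something like

    self.assertEquals('abc'.encode(codecname), b'abc')

so that the expected values are bytes, which is what .encode() returns now.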

> Modified: python/branches/py3k-struni/Lib/test/test_complex.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_complex.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_complex.py	Wed May  2 21:09:54 2007
> @@ -227,7 +227,7 @@
>  
>          self.assertEqual(complex("  3.14+J  "), 3.14+1j)
>          if test_support.have_unicode:
> -            self.assertEqual(complex(unicode("  3.14+J  ")), 3.14+1j)
> +            self.assertEqual(complex(str("  3.14+J  ")), 3.14+1j)
>  
>          # SF bug 543840:  complex(string) accepts strings with \0
>          # Fixed in 2.3.
> @@ -251,8 +251,8 @@
>          self.assertRaises(ValueError, complex, "1+(2j)")
>          self.assertRaises(ValueError, complex, "(1+2j)123")
>          if test_support.have_unicode:
> -            self.assertRaises(ValueError, complex, unicode("1"*500))
> -            self.assertRaises(ValueError, complex, unicode("x"))
> +            self.assertRaises(ValueError, complex, str("1"*500))
> +            self.assertRaises(ValueError, complex, str("x"))

The str() calls are unnecessary.

> Modified: python/branches/py3k-struni/Lib/test/test_contains.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_contains.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_contains.py	Wed May  2 21:09:54 2007
> @@ -59,31 +59,31 @@
>  
>      # Test char in Unicode
>  
> -    check('c' in unicode('abc'), "'c' not in u'abc'")
> -    check('d' not in unicode('abc'), "'d' in u'abc'")
> +    check('c' in str('abc'), "'c' not in u'abc'")
> +    check('d' not in str('abc'), "'d' in u'abc'")
>  
> -    check('' in unicode(''), "'' not in u''")
> -    check(unicode('') in '', "u'' not in ''")
> -    check(unicode('') in unicode(''), "u'' not in u''")
> -    check('' in unicode('abc'), "'' not in u'abc'")
> -    check(unicode('') in 'abc', "u'' not in 'abc'")
> -    check(unicode('') in unicode('abc'), "u'' not in u'abc'")
> +    check('' in str(''), "'' not in u''")
> +    check(str('') in '', "u'' not in ''")
> +    check(str('') in str(''), "u'' not in u''")
> +    check('' in str('abc'), "'' not in u'abc'")
> +    check(str('') in 'abc', "u'' not in 'abc'")
> +    check(str('') in str('abc'), "u'' not in u'abc'")
>  
>      try:
> -        None in unicode('abc')
> +        None in str('abc')
>          check(0, "None in u'abc' did not raise error")
>      except TypeError:
>          pass
>  
>      # Test Unicode char in Unicode
>  
> -    check(unicode('c') in unicode('abc'), "u'c' not in u'abc'")
> -    check(unicode('d') not in unicode('abc'), "u'd' in u'abc'")
> +    check(str('c') in str('abc'), "u'c' not in u'abc'")
> +    check(str('d') not in str('abc'), "u'd' in u'abc'")

The str() calls are unnecessary.

>      # Test Unicode char in string
>  
> -    check(unicode('c') in 'abc', "u'c' not in 'abc'")
> -    check(unicode('d') not in 'abc', "u'd' in 'abc'")
> +    check(str('c') in 'abc', "u'c' not in 'abc'")
> +    check(str('d') not in 'abc', "u'd' in 'abc'")

This is testing the same as above.

> Modified: python/branches/py3k-struni/Lib/test/test_descr.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_descr.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_descr.py	Wed May  2 21:09:54 2007
> @@ -264,7 +264,7 @@
>      del junk
>  
>      # Just make sure these don't blow up!
> -    for arg in 2, 2, 2j, 2e0, [2], "2", u"2", (2,), {2:2}, type, test_dir:
> +    for arg in 2, 2, 2j, 2e0, [2], "2", "2", (2,), {2:2}, type, test_dir:

This tests "2" twice.

>          dir(arg)
>  
>      # Test dir on custom classes. Since these have object as a
> @@ -1100,25 +1100,25 @@
>  
>      # Test unicode slot names
>      try:
> -        unicode
> +        str
>      except NameError:
>          pass

The try:except: is unnecessary.

>      else:
>          # Test a single unicode string is not expanded as a sequence.
>          class C(object):
> -            __slots__ = unicode("abc")
> +            __slots__ = str("abc")

The str() call is unnecessary.

>          c = C()
>          c.abc = 5
>          vereq(c.abc, 5)
>  
>          # _unicode_to_string used to modify slots in certain circumstances
> -        slots = (unicode("foo"), unicode("bar"))
> +        slots = (str("foo"), str("bar"))

The str() calls are unnecessary.

>          class C(object):
>              __slots__ = slots
>          x = C()
>          x.foo = 5
>          vereq(x.foo, 5)
> -        veris(type(slots[0]), unicode)
> +        veris(type(slots[0]), str)
>          # this used to leak references
>          try:
>              class C(object):
> @@ -2301,64 +2301,64 @@
> [...]
>      class sublist(list):
>          pass
> @@ -2437,12 +2437,12 @@
>      vereq(int(x=3), 3)
>      vereq(complex(imag=42, real=666), complex(666, 42))
>      vereq(str(object=500), '500')
> -    vereq(unicode(string='abc', errors='strict'), u'abc')
> +    vereq(str(string='abc', errors='strict'), 'abc')
>      vereq(tuple(sequence=range(3)), (0, 1, 2))
>      vereq(list(sequence=(0, 1, 2)), range(3))
>      # note: as of Python 2.3, dict() no longer has an "items" keyword arg
>  
> -    for constructor in (int, float, int, complex, str, unicode,
> +    for constructor in (int, float, int, complex, str, str,
>                          tuple, list, file):
>          try:
>              constructor(bogus_keyword_arg=1)
> @@ -2719,13 +2719,13 @@
>      class H(object):
>          __slots__ = ["b", "a"]
>      try:
> -        unicode
> +        str

The try:except: is unnecessary.

>      except NameError:
>          class I(object):
>              __slots__ = ["a", "b"]
>      else:
>          class I(object):
> -            __slots__ = [unicode("a"), unicode("b")]
> +            __slots__ = [str("a"), str("b")]
>      class J(object):
>          __slots__ = ["c", "b"]
>      class K(object):
> @@ -3124,9 +3124,9 @@
>  
>      # It's not clear that unicode will continue to support the character
>      # buffer interface, and this test will fail if that's taken away.
> -    class MyUni(unicode):
> +    class MyUni(str):
>          pass
> -    base = u'abc'
> +    base = 'abc'
>      m = MyUni(base)
>      vereq(binascii.b2a_hex(m), binascii.b2a_hex(base))

> Modified: python/branches/py3k-struni/Lib/test/test_file.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_file.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_file.py	Wed May  2 21:09:54 2007
> @@ -145,7 +145,7 @@
>  
>      def testUnicodeOpen(self):
>          # verify repr works for unicode too
> -        f = open(unicode(TESTFN), "w")
> +        f = open(str(TESTFN), "w")
>          self.assert_(repr(f).startswith("<open file u'" + TESTFN))

This test might fail, because the u prefix is gone.

> Modified: python/branches/py3k-struni/Lib/test/test_format.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_format.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_format.py	Wed May  2 21:09:54 2007
> @@ -35,7 +35,7 @@
>  def testboth(formatstr, *args):
>      testformat(formatstr, *args)
>      if have_unicode:
> -        testformat(unicode(formatstr), *args)
> +        testformat(str(formatstr), *args)

This is the same test twice.

> Modified: python/branches/py3k-struni/Lib/test/test_iter.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_iter.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_iter.py	Wed May  2 21:09:54 2007
> @@ -216,9 +216,9 @@
>      # Test a Unicode string
>      if have_unicode:
>          def test_iter_unicode(self):
> -            self.check_for_loop(iter(unicode("abcde")),
> -                                [unicode("a"), unicode("b"), unicode("c"),
> -                                 unicode("d"), unicode("e")])
> +            self.check_for_loop(iter(str("abcde")),
> +                                [str("a"), str("b"), str("c"),
> +                                 str("d"), str("e")])

The str() calls are unnecessary.

>      # Test a directory
>      def test_iter_dict(self):
> @@ -518,7 +518,7 @@
>                  i = self.i
>                  self.i = i+1
>                  if i == 2:
> -                    return unicode("fooled you!")
> +                    return str("fooled you!")

The str() call is unnecessary.

>                  return next(self.it)
>  
>          f = open(TESTFN, "w")
> @@ -535,7 +535,7 @@
>          # and pass that on to unicode.join().
>          try:
>              got = " - ".join(OhPhooey(f))
> -            self.assertEqual(got, unicode("a\n - b\n - fooled you! - c\n"))
> +            self.assertEqual(got, str("a\n - b\n - fooled you! - c\n"))

The str() call is unnecessary.

> Modified: python/branches/py3k-struni/Lib/test/test_pep352.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_pep352.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_pep352.py	Wed May  2 21:09:54 2007
> @@ -90,7 +90,7 @@
>          arg = "spam"
>          exc = Exception(arg)
>          results = ([len(exc.args), 1], [exc.args[0], arg], [exc.message, arg],
> -                [str(exc), str(arg)], [unicode(exc), unicode(arg)],
> +                [str(exc), str(arg)], [str(exc), str(arg)],
>              [repr(exc), exc.__class__.__name__ + repr(exc.args)])
>          self.interface_test_driver(results)
>  
> @@ -101,7 +101,7 @@
>          exc = Exception(*args)
>          results = ([len(exc.args), arg_count], [exc.args, args],
>                  [exc.message, ''], [str(exc), str(args)],
> -                [unicode(exc), unicode(args)],
> +                [str(exc), str(args)],
>                  [repr(exc), exc.__class__.__name__ + repr(exc.args)])
>          self.interface_test_driver(results)
>  
> @@ -109,7 +109,7 @@
>          # Make sure that with no args that interface is correct
>          exc = Exception()
>          results = ([len(exc.args), 0], [exc.args, tuple()], [exc.message, ''],
> -                [str(exc), ''], [unicode(exc), u''],
> +                [str(exc), ''], [str(exc), ''],
>                  [repr(exc), exc.__class__.__name__ + '()'])
>          self.interface_test_driver(results)

Seems like the same test is done twice here too.

> Modified: python/branches/py3k-struni/Lib/test/test_pprint.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_pprint.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_pprint.py	Wed May  2 21:09:54 2007
> @@ -3,7 +3,7 @@
>  import unittest
>  
>  try:
> -    uni = unicode
> +    uni = str
>  except NameError:
>      def uni(x):
>          return x

This can be simplified to
    uni = str
(or use str everywhere)

> Modified: python/branches/py3k-struni/Lib/test/test_re.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_re.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_re.py	Wed May  2 21:09:54 2007
> @@ -324,12 +324,12 @@
> [...]
>      def test_stack_overflow(self):
> @@ -561,10 +561,10 @@
>      def test_bug_764548(self):
>          # bug 764548, re.compile() barfs on str/unicode subclasses
>          try:
> -            unicode
> +            str
>          except NameError:
>              return  # no problem if we have no unicode

The try:except: can be removed.

> -        class my_unicode(unicode): pass
> +        class my_unicode(str): pass
>          pat = re.compile(my_unicode("abc"))
>          self.assertEqual(pat.match("xyz"), None)
>  
> @@ -575,7 +575,7 @@
>  
>      def test_bug_926075(self):
>          try:
> -            unicode
> +            str
>          except NameError:
>              return # no problem if we have no unicode
>          self.assert_(re.compile('bug_926075') is not

The try:except: can be removed.

> @@ -583,7 +583,7 @@
>  
>      def test_bug_931848(self):
>          try:
> -            unicode
> +            str
>          except NameError:
>              pass
>          pattern = eval('u"[\u002E\u3002\uFF0E\uFF61]"')

The try:except: can be removed.

> Modified: python/branches/py3k-struni/Lib/test/test_set.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_set.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_set.py	Wed May  2 21:09:54 2007
> @@ -72,7 +72,7 @@
>          self.assertEqual(type(u), self.thetype)
>          self.assertRaises(PassThru, self.s.union, check_pass_thru())
>          self.assertRaises(TypeError, self.s.union, [[]])
> -        for C in set, frozenset, dict.fromkeys, str, unicode, list, tuple:
> +        for C in set, frozenset, dict.fromkeys, str, str, list, tuple:

This tests str twice. (This happens several times in test_set.py.)

>              self.assertEqual(self.thetype('abcba').union(C('cdc')), set('abcd'))
>              self.assertEqual(self.thetype('abcba').union(C('efgfe')), set('abcefg'))
>              self.assertEqual(self.thetype('abcba').union(C('ccb')), set('abc'))
> [...]

> Modified: python/branches/py3k-struni/Lib/test/test_str.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_str.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_str.py	Wed May  2 21:09:54 2007
> @@ -31,7 +31,7 @@
>          # Make sure __str__() behaves properly
>          class Foo0:
>              def __unicode__(self):

What happens with __unicode__ after unification?


> Modified: python/branches/py3k-struni/Lib/test/test_support.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_support.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_support.py	Wed May  2 21:09:54 2007
> @@ -131,7 +131,7 @@
>      return (x > y) - (x < y)
>  
>  try:
> -    unicode
> +    str
>      have_unicode = True
>  except NameError:
>      have_unicode = False

Can this be dropped?

> @@ -151,13 +151,13 @@
>          # Assuming sys.getfilesystemencoding()!=sys.getdefaultencoding()
>          # TESTFN_UNICODE is a filename that can be encoded using the
>          # file system encoding, but *not* with the default (ascii) encoding
> -        if isinstance('', unicode):
> +        if isinstance('', str):
>              # python -U
>              # XXX perhaps unicode() should accept Unicode strings?
>              TESTFN_UNICODE = "@test-\xe0\xf2"
>          else:
>              # 2 latin characters.
> -            TESTFN_UNICODE = unicode("@test-\xe0\xf2", "latin-1")
> +            TESTFN_UNICODE = str("@test-\xe0\xf2", "latin-1")
>          TESTFN_ENCODING = sys.getfilesystemencoding()
>          # TESTFN_UNICODE_UNENCODEABLE is a filename that should *not* be
>          # able to be encoded by *either* the default or filesystem encoding.
 >
> Modified: python/branches/py3k-struni/Lib/test/test_unicode.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_unicode.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_unicode.py	Wed May  2 21:09:54 2007

This should probably be dropped/merged into test_str.

> Modified: python/branches/py3k-struni/Lib/test/test_xmlrpc.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/test/test_xmlrpc.py	(original)
> +++ python/branches/py3k-struni/Lib/test/test_xmlrpc.py	Wed May  2 21:09:54 2007
> @@ -5,7 +5,7 @@
>  from test import test_support
>  
>  try:
> -    unicode
> +    str
>  except NameError:
>      have_unicode = False

The try:except: can be dropped.


> Modified: python/branches/py3k-struni/Lib/textwrap.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/textwrap.py	(original)
> +++ python/branches/py3k-struni/Lib/textwrap.py	Wed May  2 21:09:54 2007
> @@ -70,7 +70,7 @@
>      whitespace_trans = string.maketrans(_whitespace, ' ' * len(_whitespace))
>  
>      unicode_whitespace_trans = {}
> -    uspace = ord(u' ')
> +    uspace = ord(' ')
>      for x in map(ord, _whitespace):
>          unicode_whitespace_trans[x] = uspace
>  
> @@ -127,7 +127,7 @@
>          if self.replace_whitespace:
>              if isinstance(text, str):
>                  text = text.translate(self.whitespace_trans)
> -            elif isinstance(text, unicode):
> +            elif isinstance(text, str):

This checks for str twice.

> Modified: python/branches/py3k-struni/Lib/types.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/types.py	(original)
> +++ python/branches/py3k-struni/Lib/types.py	Wed May  2 21:09:54 2007
> @@ -28,7 +28,7 @@
>  # types.StringTypes", you should use "isinstance(x, basestring)".  But
>  # we keep around for compatibility with Python 2.2.
>  try:
> -    UnicodeType = unicode
> +    UnicodeType = str
>      StringTypes = (StringType, UnicodeType)
>  except NameError:
>      StringTypes = (StringType,)

Can we drop this?

> Modified: python/branches/py3k-struni/Lib/urllib.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/urllib.py	(original)
> +++ python/branches/py3k-struni/Lib/urllib.py	Wed May  2 21:09:54 2007
> @@ -984,13 +984,13 @@
>  # quote('abc def') -> 'abc%20def')
>  
>  try:
> -    unicode
> +    str
>  except NameError:
>      def _is_unicode(x):
>          return 0
>  else:
>      def _is_unicode(x):
> -        return isinstance(x, unicode)
> +        return isinstance(x, str)

Can _is_unicode simply return True?

> Modified: python/branches/py3k-struni/Lib/xml/dom/minicompat.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/xml/dom/minicompat.py	(original)
> +++ python/branches/py3k-struni/Lib/xml/dom/minicompat.py	Wed May  2 21:09:54 2007
> @@ -41,11 +41,11 @@
>  import xml.dom
>  
>  try:
> -    unicode
> +    str
>  except NameError:
>      StringTypes = type(''),
>  else:
> -    StringTypes = type(''), type(unicode(''))
> +    StringTypes = type(''), type(str(''))

This amounts to
    StringTypes = str

>  class NodeList(list):
> 
> Modified: python/branches/py3k-struni/Lib/xmlrpclib.py
> ==============================================================================
> --- python/branches/py3k-struni/Lib/xmlrpclib.py	(original)
> +++ python/branches/py3k-struni/Lib/xmlrpclib.py	Wed May  2 21:09:54 2007
> @@ -144,9 +144,9 @@
>  # Internal stuff
>  
>  try:
> -    unicode
> +    str
>  except NameError:
> -    unicode = None # unicode support not available
> +    str = None # unicode support not available

The try:except: can be dropped and all subsequent "if str:" tests too.


From percivall at gmail.com  Thu May  3 12:26:56 2007
From: percivall at gmail.com (Simon Percivall)
Date: Thu, 3 May 2007 12:26:56 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
Message-ID: <D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>

On 2 maj 2007, at 20.08, Guido van Rossum wrote:
> [Georg]
>>>>>>> a, *b, c = range(5)
>>>>>>> a
>>>>     0
>>>>>>> c
>>>>     4
>>>>>>> b
>>>>     [1, 2, 3]
>
> <snip>
> That sounds messy; only allowing *a at the end seems a bit more
> manageable. But I'll hold off until I can shoot holes in your
> implementation. ;-)

As the patch works right now, any iterator will be exhausted,
but if the proposal is constrained to only allowing the *name at
the end, wouldn't a more useful behavior be to not exhaust the
iterator, making it similar to:

 > it = iter(range(10))
 > a = next(it)
 > b = it

or would this be too surprising?

//Simon

From g.brandl at gmx.net  Thu May  3 13:46:10 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 03 May 2007 13:46:10 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>	<f19d5r$mfm$2@sea.gmane.org>	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
Message-ID: <f1ci1u$7tk$1@sea.gmane.org>

Simon Percivall schrieb:
> On 2 maj 2007, at 20.08, Guido van Rossum wrote:
>> [Georg]
>>>>>>>> a, *b, c = range(5)
>>>>>>>> a
>>>>>     0
>>>>>>>> c
>>>>>     4
>>>>>>>> b
>>>>>     [1, 2, 3]
>>
>> <snip>
>> That sounds messy; only allowing *a at the end seems a bit more
>> manageable. But I'll hold off until I can shoot holes in your
>> implementation. ;-)
> 
> As the patch works right now, any iterator will be exhausted,
> but if the proposal is constrained to only allowing the *name at
> the end, wouldn't a more useful behavior be to not exhaust the
> iterator, making it similar to:
> 
>  > it = iter(range(10))
>  > a = next(it)
>  > b = it
> 
> or would this be too surprising?

IMO yes.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From skip at pobox.com  Thu May  3 12:35:09 2007
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 3 May 2007 05:35:09 -0500
Subject: [Python-3000] [Python-Dev] Implicit String Concatenation and
 Octal Literals Was: PEP 30XZ: Simplified Parsing
In-Reply-To: <000401c78d4c$796bfe60$f301a8c0@RaymondLaptop1>
References: <20070502210339.BHU28881@ms09.lnh.mail.rcn.net>
	<17977.16058.847429.905398@montanaro.dyndns.org>
	<000401c78d4c$796bfe60$f301a8c0@RaymondLaptop1>
Message-ID: <17977.47837.397664.190390@montanaro.dyndns.org>


    Raymond> Another way to look at it is to ask whether we would consider
    Raymond> adding implicit string concatenation if we didn't already have
    Raymond> it.

As I recall it was a "relatively recent" addition.  Maybe 2.0 or 2.1?  It
certainly hasn't been there from the beginning.

Skip

From benji at benjiyork.com  Thu May  3 15:01:54 2007
From: benji at benjiyork.com (Benji York)
Date: Thu, 03 May 2007 09:01:54 -0400
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <46397BB2.4060404@ronadam.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>	<02b401c78d15$f04b6110$090a0a0a@enfoldsystems.local>	<f1bqp0$vf0$1@sea.gmane.org>
	<46397BB2.4060404@ronadam.com>
Message-ID: <4639DD42.3020307@benjiyork.com>

Ron Adam wrote:
> The following inconsistency still bothers me, but I suppose it's an edge 
> case that doesn't cause problems.
> 
>  >>> print r"hello world\"
>    File "<stdin>", line 1
>      print r"hello world\"
>                          ^
> SyntaxError: EOL while scanning single-quoted string

> In the first case, it's treated as a continuation character even though 
> it's not at the end of a physical line. So it gives an error.

No, that is unrelated to line continuation.  The \" is an escape 
sequence, therefore there is no double-quote to end the string literal.
-- 
Benji York
http://benjiyork.com

From g.brandl at gmx.net  Thu May  3 15:50:07 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 03 May 2007 15:50:07 +0200
Subject: [Python-3000] Escaping in raw strings (was Re: [Python-Dev] PEP
 30XZ: Simplified Parsing)
Message-ID: <f1cpad$4ml$1@sea.gmane.org>

Benji York schrieb:
> Ron Adam wrote:
>> The following inconsistency still bothers me, but I suppose it's an edge
>> case that doesn't cause problems.
>>
>>  >>> print r"hello world\"
>>    File "<stdin>", line 1
>>      print r"hello world\"
>>                          ^
>> SyntaxError: EOL while scanning single-quoted string
>
>> In the first case, it's treated as a continuation character even though
>> it's not at the end of a physical line. So it gives an error.
>
> No, that is unrelated to line continuation.  The \" is an escape
> sequence, therefore there is no double-quote to end the string literal.

But IMHO this is really something that can and ought to be fixed.

I would let a raw string end at the first matching quote and not have any
escaping available. That's no loss of functionality since there is no
way to put a single " into a r"" string today. You can do r"\"", but it
doesn't have the effect of just escaping the closing quote, so it's
pretty useless.
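
To illustrate the current behaviour at the prompt:

    >>> r"\""
    '\\"'
    >>> len(r"\"")
    2

The backslash stays in the string, so you never end up with just a quote.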

Is that something that can be agreed upon without a PEP?

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From rrr at ronadam.com  Thu May  3 15:55:13 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 03 May 2007 08:55:13 -0500
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <4639DD42.3020307@benjiyork.com>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>	<02b401c78d15$f04b6110$090a0a0a@enfoldsystems.local>	<f1bqp0$vf0$1@sea.gmane.org>	<46397BB2.4060404@ronadam.com>
	<4639DD42.3020307@benjiyork.com>
Message-ID: <4639E9C1.4010109@ronadam.com>

Benji York wrote:
> Ron Adam wrote:
>> The following inconsistency still bothers me, but I suppose it's an edge 
>> case that doesn't cause problems.
>>
>>  >>> print r"hello world\"
>>    File "<stdin>", line 1
>>      print r"hello world\"
>>                          ^
>> SyntaxError: EOL while scanning single-quoted string
> 
>> In the first case, it's treated as a continuation character even though 
>> it's not at the end of a physical line. So it gives an error.
> 
> No, that is unrelated to line continuation.  The \" is an escape 
> sequence, therefore there is no double-quote to end the string literal.

Are you sure?


 >>> print r'\"'
\"

It's just a '\' here.

These are raw strings if you didn't notice.


Cheers,
    Ron


From fdrake at acm.org  Thu May  3 15:58:53 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 3 May 2007 09:58:53 -0400
Subject: [Python-3000] Escaping in raw strings (was Re: [Python-Dev] PEP
	30XZ: Simplified Parsing)
In-Reply-To: <f1cpad$4ml$1@sea.gmane.org>
References: <f1cpad$4ml$1@sea.gmane.org>
Message-ID: <200705030958.53558.fdrake@acm.org>

On Thursday 03 May 2007, Georg Brandl wrote:
 > Is that something that can be agreed upon without a PEP?

I expect this to be at least somewhat controversial, so a PEP is warranted.  
I'd like to see it fixed, though.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From skip at pobox.com  Thu May  3 15:11:01 2007
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 3 May 2007 08:11:01 -0500
Subject: [Python-3000] [Python-Dev] Implicit String Concatenation and
 Octal Literals Was: PEP 30XZ: Simplified Parsing
In-Reply-To: <17977.47837.397664.190390@montanaro.dyndns.org>
References: <20070502210339.BHU28881@ms09.lnh.mail.rcn.net>
	<17977.16058.847429.905398@montanaro.dyndns.org>
	<000401c78d4c$796bfe60$f301a8c0@RaymondLaptop1>
	<17977.47837.397664.190390@montanaro.dyndns.org>
Message-ID: <17977.57189.849175.981712@montanaro.dyndns.org>

>>>>> "skip" == skip  <skip at pobox.com> writes:

    Raymond> Another way to look at it is to ask whether we would consider
    Raymond> adding implicit string concatenation if we didn't already have
    Raymond> it.

    skip> As I recall it was a "relatively recent" addition.  Maybe 2.0 or
    skip> 2.1?  It certainly hasn't been there from the beginning.

Misc/HISTORY suggests this feature was added in 1.0.2 (May 1994).  Apologies
for my bad memory.

Skip

From jimjjewett at gmail.com  Thu May  3 16:16:53 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 3 May 2007 10:16:53 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <463980EB.1070102@canterbury.ac.nz>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070502113122.04e41350@sparrow.telecommunity.com>
	<5.1.1.6.0.20070502211440.02a386d8@sparrow.telecommunity.com>
	<463980EB.1070102@canterbury.ac.nz>
Message-ID: <fb6fbf560705030716l50b3a1deqd2f01b7e39c864fb@mail.gmail.com>

On 5/3/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

> I don't doubt that things like @before and @after are
> handy. But being handy isn't enough for something to
> get into the Python core.

I hadn't thought of @before and @after as truly core; I had assumed
they were decorators that would be available in a genfunc module.

I'll agree that the actual timing of the super-call is often not
essential, and having time-words confuses that.  On the other hand,
they do give you

(1)  The function being added as an overload doesn't have to know
anything about the framework, or even that another method may ever be
called at all; so long as the super-call is at one end, the
registration function can take care of this.

(2)  The explicit version of next_method corresponds to super, but is
uglier in practice, because there isn't inheritance involved.  My
strawman would boil down to...

    def foo():...
        next_method = GenFunc.dispatch(*args, after=__this_function__)

Note that the overriding function foo would need to have both a
reference to itself (as opposed to its name, which will often be bound
to something else) and to the generic function from which it is being
called (and it might be called from several such functions).
Arranging this during the registration seems like an awful lot of
work to avoid @after.

-jJ

From turnbull at sk.tsukuba.ac.jp  Thu May  3 16:40:03 2007
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Thu, 03 May 2007 23:40:03 +0900
Subject: [Python-3000] [Python-Dev]   PEP 30XZ: Simplified Parsing
In-Reply-To: <179D5383-88F0-4246-B355-5A817B9F7EBE@python.org>
References: <d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<5.1.1.6.0.20070502144742.02bc1908@sparrow.telecommunity.com>
	<179D5383-88F0-4246-B355-5A817B9F7EBE@python.org>
Message-ID: <87hcquezss.fsf@uwakimon.sk.tsukuba.ac.jp>

Barry Warsaw writes:

 > The problem is that
 > 
 > 	_("some string"
 > 	  " and more of it")
 > 
 > is not the same as
 > 
 > 	_("some string" +
 > 	  " and more of it")

Are you worried about translators?  The gettext functions themselves
will just see the result of the operation.  The extraction tools like
xgettext do fail, however.  Translating the above to

# The problem is that

 	gettext("some string"
 	  " and more of it")

# is not the same as
 
 	gettext("some string" +
 	  " and more of it")

and invoking "xgettext --force-po --language=Python test.py" gives

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL at ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2007-05-03 23:32+0900\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL at ADDRESS>\n"
"Language-Team: LANGUAGE <LL at li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"

#: test.py:3
msgid "some string and more of it"
msgstr ""

#: test.py:8
msgid "some string"
msgstr ""

BTW, it doesn't work for the C equivalent, either.

 > You would either have to teach pygettext and maybe gettext about
 > this construct, or you'd have to use something different.

Teaching Python-based extraction tools about it isn't hard, just make
sure that you slurp in the whole argument, and eval it.  If what you
get isn't a string, throw an exception.  xgettext will be harder,
since it apparently does not do it, nor does it even know enough to error
or warn on syntax it doesn't handle within gettext()'s argument.
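
Roughly, for the Python side (just a sketch, not real pygettext code):

    def extract_argument(arg_source):
        # arg_source is the raw source text of _()'s argument,
        # possibly spanning several lines
        value = eval(arg_source, {"__builtins__": {}}, {})
        if not isinstance(value, str):
            raise ValueError("_() argument is not a string literal: %r"
                             % (arg_source,))
        return value

    # Implicit concatenation and explicit '+' both collapse to one msgid:
    s1 = extract_argument('"some string" " and more of it"')
    s2 = extract_argument('"some string" + " and more of it"')
    assert s1 == s2 == "some string and more of it"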


From barry at python.org  Thu May  3 17:34:58 2007
From: barry at python.org (Barry Warsaw)
Date: Thu, 3 May 2007 11:34:58 -0400
Subject: [Python-3000] [Python-Dev]   PEP 30XZ: Simplified Parsing
In-Reply-To: <87hcquezss.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<5.1.1.6.0.20070502144742.02bc1908@sparrow.telecommunity.com>
	<179D5383-88F0-4246-B355-5A817B9F7EBE@python.org>
	<87hcquezss.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <2CF5A0DA-509D-4A3D-96A6-30D601572E3E@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On May 3, 2007, at 10:40 AM, Stephen J. Turnbull wrote:

> Barry Warsaw writes:
>
>> The problem is that
>>
>> 	_("some string"
>> 	  " and more of it")
>>
>> is not the same as
>>
>> 	_("some string" +
>> 	  " and more of it")
>
> Are you worried about translators?  The gettext functions themselves
> will just see the result of the operation.  The extraction tools like
> xgettext do fail, however.

Yep, sorry, it is the extraction tools I'm worried about.

> Teaching Python-based extraction tools about it isn't hard, just make
> sure that you slurp in the whole argument, and eval it.  If what you
> get isn't a string, throw an exception.  xgettext will be harder,
> since apparently does not do it, nor does it even know enough to error
> or warn on syntax it doesn't handle within gettext()'s argument.

IMO, this is a problem.  We can make the Python extraction tool work,  
but we should still be very careful about breaking 3rd party tools  
like xgettext, since other projects may be using such tools.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRjoBI3EjvBPtnXfVAQLg0AP/Y1ncqie1NgzRFzuZpnZapMs/+oo+5BCK
1MYqsJwucnDJnOqrUcU34Vq3SB7X7VsSDv3TuoTNnheinX6senorIFQKRAj4abKT
f2Y63t6BT97mSOAITFZvVSj0YSG+zkD/HMGeDj4dOJFLj1tYxgKpVprlhMbELzG1
AIKe+wsYjcs=
=+oFV
-----END PGP SIGNATURE-----

From pje at telecommunity.com  Thu May  3 18:05:32 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 03 May 2007 12:05:32 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705030716l50b3a1deqd2f01b7e39c864fb@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501123026.0524eac8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070501235655.02a57fd8@sparrow.telecommunity.com>
	<5.1.1.6.0.20070502113122.04e41350@sparrow.telecommunity.com>
	<5.1.1.6.0.20070502211440.02a386d8@sparrow.telecommunity.com>
	<463980EB.1070102@canterbury.ac.nz>
	<fb6fbf560705030716l50b3a1deqd2f01b7e39c864fb@mail.gmail.com>
Message-ID: <20070503160351.1FF833A4070@sparrow.telecommunity.com>

At 10:16 AM 5/3/2007 -0400, Jim Jewett wrote:
>On 5/3/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>
> > I don't doubt that things like @before and @after are
> > handy. But being handy isn't enough for something to
> > get into the Python core.
>
>I hadn't thought of @before and @after as truly core; I had assumed
>they were decorators that would be available in a genfunc module.

Everything in the PEP is imported from an "overloading" module.  I'm 
not crazy enough to try proposing any built-ins at this point.


>(2)  The explicit version of next_method corresponds to super, but is
>uglier in practice, becaues their isn't inheritance involved.  My
>strawman would boil down to...
>
>     def foo():...
>         next_method = GenFunc.dispatch(*args, after=__this_function__)

Keep in mind that the same function can be re-registered under 
multiple rules, so a reference to the function is insufficient to 
specify where to chain from.  Also, your proposal appears to be 
*re-dispatching* the arguments.  My implementation doesn't redispatch 
anything; it creates a chain of method objects, which each know their 
next method.  These chains are created and cached whenever a new 
combination of methods is required.

In RuleDispatch, the chains are actually linked as bound method 
objects, so that a function's next_method is bound as if it were the 
"self" of that function.  Thus, calling the next method takes 
advantage of Python's "bound method" optimizations.
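
Very roughly, the idea is something like this (a toy sketch, not the
actual RuleDispatch code):

    def chain(methods):
        def tail(*args, **kw):
            raise TypeError("no next method")
        next_method = tail
        for m in reversed(methods):
            def bind(m=m, next_method=next_method):
                def bound(*args, **kw):
                    # each method receives its successor as the first argument
                    return m(next_method, *args, **kw)
                return bound
            next_method = bind()
        return next_method

    def log_call(next_method, x):
        return next_method(x)

    def double(next_method, x):
        return 2 * x

    f = chain([log_call, double])
    assert f(21) == 42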


>Note that the overriding function foo would need to have both a
>reference to itself (as opposed to its name, which will often be bound
>to somthing else) and to the generic function from which it is being
>called (and it might be called from several such functions).
>Arranging this during the registration seems like an awaful lots of
>work to avoid @after

Yep, it's a whole lot simpler just to provide the next_method as an 
extra argument.


From steven.bethard at gmail.com  Thu May  3 18:08:54 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu, 3 May 2007 10:08:54 -0600
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
Message-ID: <d11dcfba0705030908i5a4fe2dfx9b38acd3f3b2fc10@mail.gmail.com>

On 5/3/07, Simon Percivall <percivall at gmail.com> wrote:
> On 2 maj 2007, at 20.08, Guido van Rossum wrote:
> > [Georg]
> >>>>>>> a, *b, c = range(5)
> >>>>>>> a
> >>>>     0
> >>>>>>> c
> >>>>     4
> >>>>>>> b
> >>>>     [1, 2, 3]
> >
> > <snip>
> > That sounds messy; only allowing *a at the end seems a bit more
> > manageable. But I'll hold off until I can shoot holes in your
> > implementation. ;-)
>
> As the patch works right now, any iterator will be exhausted,
> but if the proposal is constrained to only allowing the *name at
> the end, wouldn't a more useful behavior be to not exhaust the
> iterator, making it similar to:
>
>  > it = iter(range(10))
>  > a = next(it)
>  > b = it
>
> or would this be too surprising?

In argument lists, *args exhausts iterators, converting them to
tuples. I think it would be confusing if *args in tuple unpacking
didn't do the same thing.
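
For comparison, here is what function arguments do today (nothing new,
just current behaviour):

    >>> def f(*args):
    ...     return args
    ...
    >>> it = iter(range(5))
    >>> f(*it)
    (0, 1, 2, 3, 4)
    >>> list(it)
    []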

This brings up the question of why the patch produces lists, not
tuples. What's the reasoning behind that?

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From g.brandl at gmx.net  Thu May  3 18:12:44 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 03 May 2007 18:12:44 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <d11dcfba0705030908i5a4fe2dfx9b38acd3f3b2fc10@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>	<f19d5r$mfm$2@sea.gmane.org>	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<d11dcfba0705030908i5a4fe2dfx9b38acd3f3b2fc10@mail.gmail.com>
Message-ID: <f1d1lp$6nr$1@sea.gmane.org>

Steven Bethard schrieb:
> On 5/3/07, Simon Percivall <percivall at gmail.com> wrote:
>> On 2 maj 2007, at 20.08, Guido van Rossum wrote:
>> > [Georg]
>> >>>>>>> a, *b, c = range(5)
>> >>>>>>> a
>> >>>>     0
>> >>>>>>> c
>> >>>>     4
>> >>>>>>> b
>> >>>>     [1, 2, 3]
>> >
>> > <snip>
>> > That sounds messy; only allowing *a at the end seems a bit more
>> > manageable. But I'll hold off until I can shoot holes in your
>> > implementation. ;-)
>>
>> As the patch works right now, any iterator will be exhausted,
>> but if the proposal is constrained to only allowing the *name at
>> the end, wouldn't a more useful behavior be to not exhaust the
>> iterator, making it similar to:
>>
>>  > it = iter(range(10))
>>  > a = next(it)
>>  > b = it
>>
>> or would this be too surprising?
> 
> In argument lists, *args exhausts iterators, converting them to
> tuples. I think it would be confusing if *args in tuple unpacking
> didn't do the same thing.
> 
> This brings up the question of why the patch produces lists, not
> tuples. What's the reasoning behind that?

IMO, it's likely that you would like to further process the resulting
sequence, including modifying it.
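
For example (using the proposed syntax):

    first, *rest = [0, 1, 2, 3]
    rest.append(4)      # fine, rest is a list; a tuple would need copying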

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From mark.m.mcmahon at gmail.com  Thu May  3 17:59:42 2007
From: mark.m.mcmahon at gmail.com (Mark Mc Mahon)
Date: Thu, 3 May 2007 11:59:42 -0400
Subject: [Python-3000]  PEP: Supporting Non-ASCII Identifiers
Message-ID: <71b6302c0705030859x1aec71cena2ae950255043fd1@mail.gmail.com>

Hi,

One item that I haven't seen mentioned in support of this is that
there is code that uses getattr for accessing things that might be
accessed in other ways.

For example, the Attribute access Dictionaries
(http://mail.python.org/pipermail/python-list/2007-March/429137.html):
if one of the keys has a non-ASCII character then it will not be
accessible through attribute access.
(you could say the same for punctuation - but I think they are not the
same thing).

In pywinauto I try to let people use attribute access for accessing
dialogs and controls of Windows applications

e.g. your_app.DialogTitle.ControlCaption.Click()

This works great for English - but for other languages people have to
use item access
your_app[u'DialogTitle'][u'ControlCaption'].Click()

Anyway, just wanted to raise that option too for consideration.

Thanks for the wonderful language,
   Mark

From steven.bethard at gmail.com  Thu May  3 18:24:46 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu, 3 May 2007 10:24:46 -0600
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <f1d1lp$6nr$1@sea.gmane.org>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<d11dcfba0705030908i5a4fe2dfx9b38acd3f3b2fc10@mail.gmail.com>
	<f1d1lp$6nr$1@sea.gmane.org>
Message-ID: <d11dcfba0705030924u5f242b53y5882832db0cee583@mail.gmail.com>

On 5/3/07, Georg Brandl <g.brandl at gmx.net> wrote:
> Steven Bethard schrieb:
> > On 5/3/07, Simon Percivall <percivall at gmail.com> wrote:
> >> On 2 maj 2007, at 20.08, Guido van Rossum wrote:
> >> > [Georg]
> >> >>>>>>> a, *b, c = range(5)
> >> >>>>>>> a
> >> >>>>     0
> >> >>>>>>> c
> >> >>>>     4
> >> >>>>>>> b
> >> >>>>     [1, 2, 3]
[snip]
> > In argument lists, *args exhausts iterators, converting them to
> > tuples. I think it would be confusing if *args in tuple unpacking
> > didn't do the same thing.
> >
> > This brings up the question of why the patch produces lists, not
> > tuples. What's the reasoning behind that?
>
> IMO, it's likely that you would like to further process the resulting
> sequence, including modifying it.

Well if that's what you're aiming at, then I'd expect it to be more
useful to have the unpacking generate not lists, but the same type you
started with, e.g. if I started with a string, I probably want to
continue using strings::

    >>> first, *rest = 'abcdef'
    >>> assert first == 'a' and rest == 'bcdef'

By that same logic, if I started with iterators, I probably want to
continue using iterators, e.g.::

    >>> f = open(...)
    >>> first_line, *remaining_lines = f

So I guess it seems pretty arbitrary to me to assume that a list is
what people want to be using. And if we're going to be arbitrary, I
don't see why we shouldn't be arbitrary in the same way as function
arguments so that we only need one explanation.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From jimjjewett at gmail.com  Thu May  3 18:44:18 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 3 May 2007 12:44:18 -0400
Subject: [Python-3000] PEP 3120 (Was: PEP Parade)
In-Reply-To: <46398CE8.2060206@v.loewis.de>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
	<46398CE8.2060206@v.loewis.de>
Message-ID: <fb6fbf560705030944l595c51a8ha5a3aa1a2d6f382d@mail.gmail.com>

On 5/3/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> Untangling the parser from stdio - sure. I also think it would
> be desirable to read the whole source into a buffer, rather than
> applying a line-by-line input. That might be a bigger change,
> making the tokenizer a multi-stage algorithm:

> 1. read input into a buffer
> 2. determine source encoding (looking at a BOM, else a
>    declaration within the first two lines, else default
>    to UTF-8)
> 3. if the source encoding is not UTF-8, pass it through
>    a codec (decode to string, encode to UTF-8). Otherwise,
>    check that all bytes are really well-formed UTF-8.
> 4. start parsing

So people could hook into their own "codec" that, say, replaced native
language keywords with standard python keywords?

Part of me says that should be an import hook instead of pretending to
be a codec...

-jJ

From jseutter at gmail.com  Thu May  3 19:17:27 2007
From: jseutter at gmail.com (Jerry Seutter)
Date: Thu, 3 May 2007 11:17:27 -0600
Subject: [Python-3000] PEP-3125 -- remove backslash continuation
In-Reply-To: <003601c78cbf$0cacec90$2606c5b0$@org>
References: <003601c78cbf$0cacec90$2606c5b0$@org>
Message-ID: <2c8d48d70705031017l10254449q509f7e4fd06c0442@mail.gmail.com>

On 5/2/07, Andrew Koenig <ark at acm.org> wrote:
>
> Looking at PEP-3125, I see that one of the rejected alternatives is to
> allow
> any unfinished expression to indicate a line continuation.
>
> I would like to suggest a modification to that alternative that has worked
> successfully in another programming language, namely Stu Feldman's
> EFL.  EFL
> is a language intended for numerical programming; it compiles into Fortran
> with the interesting property that the resulting Fortran code is intended
> to
> be human-readable and maintainable by people who do not happen to have
> access to the EFL compiler.
>
> Anyway, the (only) continuation rule in EFL is that if the last token in a
> line is one that lexically cannot be the last token in a statement, then
> the
> next line is considered a continuation of the current line.
>
> Python currently has a rule that if parentheses are unbalanced, a newline
> does not end the statement.  If we were to translate the EFL rule to
> Python,
> it would be something like this:
>
>         The whitespace that follows an operator or open bracket or
> parenthesis
>         can include newline characters.
>
> Note that if this suggestion were implemented, it would presumably be at a
> very low lexical level--even before the decision is made to turn a newline
> followed by spaces into an INDENT or DEDENT token.  I think that this
> property solves the difficulty-of-parsing problem.  Indeed, I think that
> this suggestion would be easier to implement than the current
> unbalanced-parentheses rule.
>
>
Would this change alter where errors are reported by the parser?  Is my

x = x +    # Oops.
... some other code ...

going to have an error reported 15 lines below where the actual typo was
made?

Jerry
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-3000/attachments/20070503/2a3579bb/attachment.htm 

From barry at python.org  Thu May  3 19:52:11 2007
From: barry at python.org (Barry Warsaw)
Date: Thu, 3 May 2007 13:52:11 -0400
Subject: [Python-3000] [Python-Dev]   PEP 30XZ: Simplified Parsing
In-Reply-To: <878xc5g8qj.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<5.1.1.6.0.20070502144742.02bc1908@sparrow.telecommunity.com>
	<179D5383-88F0-4246-B355-5A817B9F7EBE@python.org>
	<87hcquezss.fsf@uwakimon.sk.tsukuba.ac.jp>
	<2CF5A0DA-509D-4A3D-96A6-30D601572E3E@python.org>
	<878xc5g8qj.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <1C94BBE1-F569-4F59-85E0-B585B9D21D1A@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On May 3, 2007, at 12:41 PM, Stephen J. Turnbull wrote:

> Barry Warsaw writes:
>
>> IMO, this is a problem.  We can make the Python extraction tool work,
>> but we should still be very careful about breaking 3rd party tools
>> like xgettext, since other projects may be using such tools.
>
> But
>
>  	_("some string" +
>  	  " and more of it")
>
> is already legal Python, and xgettext is already broken for it.

Yep, but the idiom that *gettext accepts is used far more often.  If  
that's outlawed then the tools /have/ to be taught the alternative.

> Arguably, xgettext's implementation of -L Python should be
>
>         execve ("pygettext", argv, environ);
>
> <wink>

Ouch. :)

- -Barry


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRjohUXEjvBPtnXfVAQLHhAQAmKNyjbPpIMIlz7zObvb09wdw7jyC2bBa
2w+rDilRgxicUXWqH/L6AeHHl3HiVOO+tELU6upTxOWBMlJG8xcY70rde/32I0gb
Wm0ylLlvDU/bAlSMyUscs77BVt82UQsBEqXyQ2+PRfQj7aOkpqgT8P3dwCYrtPaH
L4W4JzvoK1M=
=9pgu
-----END PGP SIGNATURE-----

From guido at python.org  Thu May  3 20:30:19 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 3 May 2007 11:30:19 -0700
Subject: [Python-3000] Escaping in raw strings (was Re: [Python-Dev] PEP
	30XZ: Simplified Parsing)
In-Reply-To: <200705030958.53558.fdrake@acm.org>
References: <f1cpad$4ml$1@sea.gmane.org> <200705030958.53558.fdrake@acm.org>
Message-ID: <ca471dc20705031130l73fb9f6dj88d383374bcfe952@mail.gmail.com>

On 5/3/07, Fred L. Drake, Jr. <fdrake at acm.org> wrote:
> On Thursday 03 May 2007, Georg Brandl wrote:
>  > Is that something that can be agreed upon without a PEP?
>
> I expect this to be at least somewhat controversial, so a PEP is warranted.
> I'd like to see it fixed, though.

It's too late for a new PEP.

It certainly is controversial; how would you write a regexp that
matches a single or double quote using r"..." or r'...'?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu May  3 20:35:50 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 3 May 2007 11:35:50 -0700
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <f1cpus$7de$1@sea.gmane.org>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<02b401c78d15$f04b6110$090a0a0a@enfoldsystems.local>
	<f1bqp0$vf0$1@sea.gmane.org> <46397BB2.4060404@ronadam.com>
	<4639DD42.3020307@benjiyork.com> <4639E9C1.4010109@ronadam.com>
	<f1cpus$7de$1@sea.gmane.org>
Message-ID: <ca471dc20705031135n4e6b8165i5f829e898daad361@mail.gmail.com>

On 5/3/07, Georg Brandl <g.brandl at gmx.net> wrote:
> > These are raw strings if you didn't notice.
>
> It's all in the implementation. The tokenizer takes it as an escape sequence
> -- it doesn't specialcase raw strings -- the AST builder (parsestr() in ast.c)
> doesn't.

FWIW, it wasn't designed this way so as to be easy to implement. It
was designed this way because the overwhelming use case is regular
expressions, where one needs to be able to escape single and double
quotes -- the re module unescapes \" and \' when it encounters them.
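
For example (current behaviour):

    >>> import re
    >>> len(r"\"")                       # two characters: backslash + quote
    2
    >>> re.match(r"\"", '"') is not None
    True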

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Thu May  3 20:40:18 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 03 May 2007 20:40:18 +0200
Subject: [Python-3000] Escaping in raw strings (was Re: [Python-Dev] PEP
 30XZ: Simplified Parsing)
In-Reply-To: <ca471dc20705031130l73fb9f6dj88d383374bcfe952@mail.gmail.com>
References: <f1cpad$4ml$1@sea.gmane.org> <200705030958.53558.fdrake@acm.org>
	<ca471dc20705031130l73fb9f6dj88d383374bcfe952@mail.gmail.com>
Message-ID: <f1daae$8g9$1@sea.gmane.org>

Guido van Rossum schrieb:
> On 5/3/07, Fred L. Drake, Jr. <fdrake at acm.org> wrote:
>> On Thursday 03 May 2007, Georg Brandl wrote:
>>  > Is that something that can be agreed upon without a PEP?
>>
>> I expect this to be at least somewhat controversial, so a PEP is warranted.
>> I'd like to see it fixed, though.
> 
> It's too late for a new PEP.

It wouldn't be too late for a 2.6 PEP, would it? However, I'm not going to
champion this.

> It certainly is controversial; how would you write a regexp that
> matches a single or double quote using r"..." or r'...'?

You'd have to concatenate two string literals...

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From martin at v.loewis.de  Thu May  3 23:09:16 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 03 May 2007 23:09:16 +0200
Subject: [Python-3000] PEP 3120 (Was: PEP Parade)
In-Reply-To: <fb6fbf560705030944l595c51a8ha5a3aa1a2d6f382d@mail.gmail.com>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>	
	<46398CE8.2060206@v.loewis.de>
	<fb6fbf560705030944l595c51a8ha5a3aa1a2d6f382d@mail.gmail.com>
Message-ID: <463A4F7C.8090406@v.loewis.de>

>> 1. read input into a buffer
>> 2. determine source encoding (looking at a BOM, else a
>>    declaration within the first two lines, else default
>>    to UTF-8)
>> 3. if the source encoding is not UTF-8, pass it through
>>    a codec (decode to string, encode to UTF-8). Otherwise,
>>    check that all bytes are really well-formed UTF-8.
>> 4. start parsing
> 
> So people could hook into their own "codec" that, say, replaced native
> language keywords with standard python keywords?

No, so that PEP 263 remains implemented.
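
For concreteness, a rough sketch of the lookup order in step 2 above (BOM,
then a coding declaration in the first two lines, else UTF-8). The real check
lives in the C tokenizer, so this is only illustrative:

    import codecs
    import re

    _COOKIE = re.compile(br"coding[:=]\s*([-\w.]+)")

    def guess_source_encoding(source_bytes):
        # BOM wins, then an explicit coding declaration, then the default.
        if source_bytes.startswith(codecs.BOM_UTF8):
            return "utf-8"
        for line in source_bytes.splitlines()[:2]:
            match = _COOKIE.search(line)
            if match:
                return match.group(1).decode("ascii")
        return "utf-8"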

Martin

From greg.ewing at canterbury.ac.nz  Fri May  4 06:08:59 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 04 May 2007 16:08:59 +1200
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <f1c2pp$l30$1@sea.gmane.org>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
	<f1c2pp$l30$1@sea.gmane.org>
Message-ID: <463AB1DB.5010308@canterbury.ac.nz>

Giovanni Bajo wrote:
> On 01/05/2007 18.09, Phillip J. Eby wrote:
> > That means that if 'self' in your example above is collected, then 
> > the weakref no longer exists, so the closedown won't be called.
> 
> Yes, but as far as I understand it, the GC does special care to ensure that 
> the callback of a weakref that is *not* part of a cyclic trash being collected 
> is always called.

It has nothing to do with cyclic GC. The point is that
if the refcount of a weak reference drops to zero before
that of the object being weakly referenced, the weak
reference object itself is deallocated and its callback
is *not* called. So having the resource-using object
hold the weak ref to the resource doesn't work -- it
has to be kept in some kind of separate registry.
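
For concreteness, one shape such a registry could take (the names here are
invented for illustration, not an existing API):

    import weakref

    _live_refs = set()

    def register_cleanup(obj, callback):
        # The weakref object itself must stay reachable, or its callback
        # is silently lost.  Note that callback must not close over obj,
        # or obj can never be collected at all.
        def _run(wr, callback=callback):
            _live_refs.discard(wr)
            callback()
        _live_refs.add(weakref.ref(obj, _run))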

--
Greg

From greg.ewing at canterbury.ac.nz  Fri May  4 06:13:21 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 04 May 2007 16:13:21 +1200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
Message-ID: <463AB2E1.2030408@canterbury.ac.nz>

Simon Percivall wrote:
> if the proposal is constrained to only allowing the *name at
> the end, wouldn't a more useful behavior be to not exhaust the
> iterator, making it similar to:
> 
>  > it = iter(range(10))
>  > a = next(it)
>  > b = it
> 
> or would this be too surprising?

It would surprise the heck out of me when I started
with something that wasn't an iterator and ended
up with b being something that I could only iterate
and couldn't index.

--
Greg

From greg.ewing at canterbury.ac.nz  Fri May  4 06:26:24 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 04 May 2007 16:26:24 +1200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <d11dcfba0705030908i5a4fe2dfx9b38acd3f3b2fc10@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<d11dcfba0705030908i5a4fe2dfx9b38acd3f3b2fc10@mail.gmail.com>
Message-ID: <463AB5F0.7020407@canterbury.ac.nz>

Steven Bethard wrote:

> This brings up the question of why the patch produces lists, not
> tuples. What's the reasoning behind that?

When dealing with an iterator, you don't know the
length in advance, so the only way to get a tuple
would be to produce a list first and then create
a tuple from it.

--
Greg

From guido at python.org  Fri May  4 06:34:58 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 3 May 2007 21:34:58 -0700
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <463AB1DB.5010308@canterbury.ac.nz>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
	<f1c2pp$l30$1@sea.gmane.org> <463AB1DB.5010308@canterbury.ac.nz>
Message-ID: <ca471dc20705032134u664dc510ucacfe91002c89654@mail.gmail.com>

In all the threads about this PEP I still haven't seen a single
example of how to write a finalizer.

Let's take a specific example of a file object (this occurs in io.py
in the p3yk branch). When a write buffer is GC'ed it must be flushed.
The current way of writing this is simple:

class BufferedWriter:

  def __init__(self, raw):
    self.raw = raw
    self.buffer = b""

  def write(self, data):
    self.buffer += data
    if len(self.buffer) >= 8192:
      self.flush()

  def flush(self):
    self.raw.write(self.buffer)
    self.buffer = b""

  def __del__(self):
    self.flush()

How would I write this without using __del__(), e.g. using weak references?

P.S. Don't bother arguing that the caller should use try/finally or
whatever. That's not the point. Assuming we have a class like this
where it has been decided that some method must be called upon
destruction, how do we arrange for that call to happen?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com  Fri May  4 07:12:19 2007
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 3 May 2007 22:12:19 -0700
Subject: [Python-3000] PEP:  Eliminate __del__
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1><5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com><f1c2pp$l30$1@sea.gmane.org>
	<463AB1DB.5010308@canterbury.ac.nz>
Message-ID: <011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>

From: "Greg Ewing" <greg.ewing at canterbury.ac.nz>
> It has nothing to do with cyclic GC. The point is that
> if the refcount of a weak reference drops to zero before
> that of the object being weakly referenced, the weak
> reference object itself is deallocated and its callback
> is *not* called. So having the resource-using object
> hold the weak ref to the resource doesn't work -- it
> has to be kept in some kind of separate registry.

I'll write up an idiomatic approach and include it in the PEP this weekend.



Raymond

From turnbull at sk.tsukuba.ac.jp  Thu May  3 18:41:40 2007
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Fri, 04 May 2007 01:41:40 +0900
Subject: [Python-3000] [Python-Dev]   PEP 30XZ: Simplified Parsing
In-Reply-To: <2CF5A0DA-509D-4A3D-96A6-30D601572E3E@python.org>
References: <d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<5.1.1.6.0.20070502144742.02bc1908@sparrow.telecommunity.com>
	<179D5383-88F0-4246-B355-5A817B9F7EBE@python.org>
	<87hcquezss.fsf@uwakimon.sk.tsukuba.ac.jp>
	<2CF5A0DA-509D-4A3D-96A6-30D601572E3E@python.org>
Message-ID: <878xc5g8qj.fsf@uwakimon.sk.tsukuba.ac.jp>

Barry Warsaw writes:

 > IMO, this is a problem.  We can make the Python extraction tool work,  
 > but we should still be very careful about breaking 3rd party tools  
 > like xgettext, since other projects may be using such tools.

But

 	_("some string" +
 	  " and more of it")

is already legal Python, and xgettext is already broken for it.
Arguably, xgettext's implementation of -L Python should be

        execve ("pygettext", argv, environ);

<wink>


From ms at cerenity.org  Thu May  3 18:06:58 2007
From: ms at cerenity.org (Michael Sparks)
Date: Thu, 3 May 2007 17:06:58 +0100
Subject: [Python-3000] [Python-Dev]   PEP 30XZ: Simplified Parsing
In-Reply-To: <87hcquezss.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<179D5383-88F0-4246-B355-5A817B9F7EBE@python.org>
	<87hcquezss.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <200705031706.59685.ms@cerenity.org>

On Thursday 03 May 2007 15:40, Stephen J. Turnbull wrote:
> Teaching Python-based extraction tools about it isn't hard, just make
> sure that you slurp in the whole argument, and eval it.

We generate our component documentation based on going through the AST
generated by compiler.ast, finding doc strings (and other strings in
other known/expected locations), and then formatting using docutils.

Eval'ing the file isn't always going to work due to imports relying on
libraries that may need to be installed. (This is especially the case with 
Kamaelia because we tend to wrap libraries for usage as components in a 
convenient way) 

We've also specifically moved away from importing the file or eval'ing things 
because of this issue. It makes it easier to have docs built on a random 
machine with not too much installed on it.

You could special-case "12345" + "67890" as a compile-time construct and 
jiggle things such that by the time it came out of the parser it looked like 
"1234567890", but I don't see what that has to gain over the current form 
(which doesn't look like an expression). I also think that's a rather nasty 
version.

On the flip side if we're eval'ing an expression to get a docstring, there 
would be great temptation to extend that to be a doc-object - eg using 
dictionaries, etc as well for more specific docs. Is that wise? I don't 
know :)


Michael.
--
Kamaelia project lead
http://kamaelia.sourceforge.net/Home

From stephen at xemacs.org  Thu May  3 19:54:54 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 04 May 2007 02:54:54 +0900
Subject: [Python-3000] [Python-Dev]   PEP 30XZ: Simplified Parsing
In-Reply-To: <200705031706.59685.ms@cerenity.org>
References: <d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<179D5383-88F0-4246-B355-5A817B9F7EBE@python.org>
	<87hcquezss.fsf@uwakimon.sk.tsukuba.ac.jp>
	<200705031706.59685.ms@cerenity.org>
Message-ID: <87y7k5eqs1.fsf@uwakimon.sk.tsukuba.ac.jp>

Michael Sparks writes:

 > We generate our component documentation based on going through the AST
 > generated by compiler.ast, finding doc strings (and other strings in
 > other known/expected locations), and then formatting using docutils.

Are you talking about I18N and gettext?  If so, I'm really lost ....

 > You could special case "12345" + "67890" as a compile timeconstructor and 
 > jiggle things such that by the time it came out the parser that looked like 
 > "1234567890", but I don't see what that has to gain over the current form. 

I'm not arguing it's a gain, simply that it's a case that *should* be
handled by extractors of translatable strings anyway, and if it were,
there would not be an I18N issue in this PEP.

It *should* be handled because this is just constant folding.  Any
half-witted compiler does it, and programmers expect their compilers
to do it.  pygettext and xgettext are (very special) compilers.  I
don't see why that expectation should be violated just because the
constants in question are translatable strings.
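
For concreteness, a sketch of the folding step itself, assuming an AST with
constant string nodes (the ast module is used purely for illustration;
pygettext and xgettext are of course implemented differently):

    import ast

    def folded_literal(node):
        # Fold "a" + "b" into one string; adjacent "a" "b" literals are
        # already folded by the parser before we ever see them.
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            return node.value
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            left = folded_literal(node.left)
            right = folded_literal(node.right)
            if left is not None and right is not None:
                return left + right
        return None

    call = ast.parse('_("some string" +\n   " and more of it")').body[0].value
    print(folded_literal(call.args[0]))   # some string and more of it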

I recognize that for xgettext implementing that in C for languages as
disparate as Lisp, Python, and Perl (all of which have string
concatenation operators) is hard, and to the extent that xgettext is
recommended by 9 out of 10 translators, we need to worry about how
long it's going to take for xgettext to get fixed (because it *is*
broken in this respect, at least for Python).


From percivall at gmail.com  Fri May  4 13:05:45 2007
From: percivall at gmail.com (Simon Percivall)
Date: Fri, 4 May 2007 13:05:45 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <463AB2E1.2030408@canterbury.ac.nz>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<463AB2E1.2030408@canterbury.ac.nz>
Message-ID: <A7B470E4-DABA-4223-8E97-237249459FCD@gmail.com>

On 4 May 2007, at 06.13, Greg Ewing wrote:
> Simon Percivall wrote:
>> if the proposal is constrained to only allowing the *name at
>> the end, wouldn't a more useful behavior be to not exhaust the
>> iterator, making it similar to:
>>  > it = iter(range(10))
>>  > a = next(it)
>>  > b = it
>> or would this be too surprising?
>
> It would surprise the heck out of me when I started
> with something that wasn't an iterator and ended
> up with b being something that I could only iterate
> and couldn't index.

Yes, that would be surprising.

This was more in the way of returning the type that was given:
if you start with a list you end up with a list in "b", if you
start with an iterator you end up with an iterator. This would
enable stuff like using this with itertools.count and other
iterators that represent infinite sequences.

Also, I'm not intending to argue this, but exhausting the
iterator is not exactly like *args in argument lists, because
the iterator isn't the name being starred. It's more like the
formal parameter of a function, when the receiver of the
iterator _is_ starred, but the iterator is not. The iterator
isn't automatically exhausted in those cases.
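
A side-by-side sketch of the two behaviours being compared (the first form is
what the PEP as written would do):

    # PEP 3132 as proposed: the starred target always becomes a list.
    a, *b = range(5)            # a == 0, b == [1, 2, 3, 4]

    # The lazier alternative discussed here would leave the iterator alone:
    it = iter(range(5))
    a = next(it)                # a == 0
    b = it                      # b is still the (unexhausted) iterator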

//Simon

From mike_mp at zzzcomputing.com  Fri May  4 16:21:59 2007
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Fri, 4 May 2007 10:21:59 -0400
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1><5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com><f1c2pp$l30$1@sea.gmane.org>
	<463AB1DB.5010308@canterbury.ac.nz>
	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>
Message-ID: <CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>


On May 4, 2007, at 1:12 AM, Raymond Hettinger wrote:

> From: "Greg Ewing" <greg.ewing at canterbury.ac.nz>
>> It has nothing to do with cyclic GC. The point is that
>> if the refcount of a weak reference drops to zero before
>> that of the object being weakly referenced, the weak
>> reference object itself is deallocated and its callback
>> is *not* called. So having the resource-using object
>> hold the weak ref to the resource doesn't work -- it
>> has to be kept in some kind of separate registry.
>
> I'll write-up an idiomaticc approach an include it in PEP this  
> weekend.
>

why not encapsulate the "proper" weakref-based approach in an easy-to- 
use method such as "__close__()" ?  that way nobody has to guess how  
to follow this pattern.

From python at rcn.com  Fri May  4 17:22:45 2007
From: python at rcn.com (Raymond Hettinger)
Date: Fri, 4 May 2007 08:22:45 -0700
Subject: [Python-3000] PEP:  Eliminate __del__
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1><5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com><f1c2pp$l30$1@sea.gmane.org>
	<463AB1DB.5010308@canterbury.ac.nz>
	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>
	<CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>
Message-ID: <01dd01c78e60$11d8e460$f001a8c0@RaymondLaptop1>

[Michael Bayer]
> why not encapsulate the "proper" weakref-based approach in an easy-to- 
> use method such as "__close__()" ?  that way nobody has to guess how  
> to follow this pattern.

An encapsulating function should be added to the weakref module
so that Guido's example could be written as:

class BufferedWriter:

  def __init__(self, raw):
    self.raw = raw
    self.buffer = ""
    weakref.cleanup(self, lambda s: s.raw.write(s.buffer))
    
  def write(self, data):
    self.buffer += data
    if len(self.buffer) >= 8192:
      self.flush()

  def flush(self):
    self.raw.write(self.buffer)
    self.buffer = ""


I've got a first cut at an encapsulating function but am not happy with it yet.
There is almost certainly a better way.  First draft:

from weakref import ref

def cleanup(obj, callback, _reg = []):
    class AttrMap(object):
        def __init__(self, map):
            self._map = map
        def __getattr__(self, key):
            return self._map[key]    
    def wrapper(wr, mp=AttrMap(obj.__dict__), callback=callback):
        _reg.remove(wr)
        callback(mp)
    _reg.append(ref(obj, wrapper))



Raymond

From steven.bethard at gmail.com  Fri May  4 17:54:40 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 4 May 2007 09:54:40 -0600
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <463AB5F0.7020407@canterbury.ac.nz>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<d11dcfba0705030908i5a4fe2dfx9b38acd3f3b2fc10@mail.gmail.com>
	<463AB5F0.7020407@canterbury.ac.nz>
Message-ID: <d11dcfba0705040854o6d48eb76y7cbec5a4a6d447ec@mail.gmail.com>

On 5/3/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Steven Bethard wrote:
>
> > This brings up the question of why the patch produces lists, not
> > tuples. What's the reasoning behind that?
>
> When dealing with an iterator, you don't know the
> length in advance, so the only way to get a tuple
> would be to produce a list first and then create
> a tuple from it.

Yep.  That was one of the reasons it was suggested that the *args
should only appear at the end of the tuple unpacking.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From mike_mp at zzzcomputing.com  Fri May  4 18:45:07 2007
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Fri, 4 May 2007 12:45:07 -0400
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <463B4455.7060100@develer.com>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1><5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com><f1c2pp$l30$1@sea.gmane.org>
	<463AB1DB.5010308@canterbury.ac.nz>
	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>
	<CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>
	<463B4455.7060100@develer.com>
Message-ID: <8B815829-8E98-4547-BC99-14E2241C13CB@zzzcomputing.com>


On May 4, 2007, at 10:33 AM, Giovanni Bajo wrote:

> On 5/4/2007 4:21 PM, Michael Bayer wrote:
>
>>>
>> why not encapsulate the "proper" weakref-based approach in an easy- 
>> to-use method such as "__close__()" ?  that way nobody has to  
>> guess how to follow this pattern.
>
> Because the idea is that the callback of the weakref will *NOT*  
> hold a reference to the object being destroyed, but only to the  
> resources that need to be deallocated (that is, to the objects  
> bound as attributes of the object).

a __close__() method on a class is first bound to the class, not any  
particular self.  the Python runtime could detect this and create the  
appropriate callable/weakref scenario behind the scenes; not even  
binding __close__() to the self in the usual way.  obviously it cant  
be a pure python solution, it would have to be a specific runtime  
supported idea (the same way __metaclass__ or any other magic  
attribute is supported).

i just dont understand why such an important feature would have to be  
relegated to just a "recipe".  i think thats a product of the notion  
that "implicit finalizers are bad, use try/finally".  thats not  
really valid for things like buffers that flush and database/network  
connections that must be released when they fall out of scope.






From guido at python.org  Fri May  4 19:15:19 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 4 May 2007 10:15:19 -0700
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <01dd01c78e60$11d8e460$f001a8c0@RaymondLaptop1>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
	<f1c2pp$l30$1@sea.gmane.org> <463AB1DB.5010308@canterbury.ac.nz>
	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>
	<CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>
	<01dd01c78e60$11d8e460$f001a8c0@RaymondLaptop1>
Message-ID: <ca471dc20705041015o34351f23k2c66f246f6b305a0@mail.gmail.com>

On 5/4/07, Raymond Hettinger <python at rcn.com> wrote:
> An encapsulating function should be added to the weakref module
> so that Guido's example could be written as:
>
> class BufferedWriter:
>
>   def __init__(self, raw):
>     self.raw = raw
>     self.buffer = ""
>     weakref.cleanup(self, lambda s: s.raw.write(s.buffer))

Or, instead of a new lambda, just use the unbound method:

    weakref.cleanup(self, self.__class__.flush)

Important: use the dynamic class (self.__class__), not the static
class (BufferedWriter). The distinction matters when BufferedWriter is
subclassed and the subclass overrides flush().

Hm, a thought just occurred to me. Why not arrange for object.__new__
to call [the moral equivalent of] weakref.cleanup(self,
self.__class__.__del__), and get rid of the direct call to __del__
from the destructor? (And the special-casing of objects with __del__
in the GC module, of course.)

Then classes that define __del__ won't have to be changed at all. (Of
course dynamically patching a different __del__ method into the class
won't have quite exactly the same semantics, but I don't really care
about such a fragile and rare possibility; I care about vanilla use of
__del__ methods.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com  Fri May  4 20:02:45 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 4 May 2007 12:02:45 -0600
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <ca471dc20705041015o34351f23k2c66f246f6b305a0@mail.gmail.com>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
	<f1c2pp$l30$1@sea.gmane.org> <463AB1DB.5010308@canterbury.ac.nz>
	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>
	<CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>
	<01dd01c78e60$11d8e460$f001a8c0@RaymondLaptop1>
	<ca471dc20705041015o34351f23k2c66f246f6b305a0@mail.gmail.com>
Message-ID: <d11dcfba0705041102y5bffc1f9ta444a0c4f44b53ce@mail.gmail.com>

On 5/4/07, Guido van Rossum <guido at python.org> wrote:
> On 5/4/07, Raymond Hettinger <python at rcn.com> wrote:
> > An encapsulating function should be added to the weakref module
> > so that Guido's example could be written as:
> >
> > class BufferedWriter:
> >
> >   def __init__(self, raw):
> >     self.raw = raw
> >     self.buffer = ""
> >     weakref.cleanup(self, lambda s: s.raw.write(s.buffer))
>
> Or, instead of a new lambda, just use the unbound method:
>
>     weakref.cleanup(self, self.__class__.flush)
>
> Important: use the dynamic class (self.__class___), not the static
> class (BufferedWriter). The distinction matters when BufferedWriter is
> subclassed and the subclass overrides flush().
>
> Hm, a thought just occurred to me. Why not arrange for object.__new__
> to call [the moral equivalent of] weakref.cleanup(self,
> self.__class__.__del__), and get rid of the direct call to __del__
> from the destructor? (And the special-casing of objects with __del__
> in the GC module, of course.)

That seems like a good idea, though I'm still a little unclear as to
how far the AttrMap should be going to look like a real instance. As
it stands, you can only access items from the instance __dict__. That
means no methods, class attributes, etc.::

>>> import weakref
>>> def cleanup(obj, callback, _reg=[]):
...    class AttrMap(object):
...        def __init__(self, map):
...            self._map = map
...        def __getattr__(self, key):
...            return self._map[key]
...    def wrapper(wr, mp=AttrMap(obj.__dict__), callback=callback):
...        _reg.remove(wr)
...        callback(mp)
...    _reg.append(weakref.ref(obj, wrapper))
...
>>> class Object(object):
...     # note that we do this in __init__ because in __new__, the
...     # object has no references to it yet
...     def __init__(self):
...         super(Object, self).__init__()
...         if hasattr(self.__class__, '__newdel__'):
...             # note we use .im_func so that we can later pass
...             # any object as the "self" parameter
...             cleanup(self, self.__class__.__newdel__.im_func)
...
>>> class Foo(Object):
...     def flush(self):
...         print 'flushing'
...     def __newdel__(self):
...         print 'deleting'
...         self.flush()
...
>>> f = Foo()
>>> del f
deleting
Exception exceptions.KeyError: 'flush' in <function wrapper at
0x00F34630> ignored

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From guido at python.org  Fri May  4 20:09:42 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 4 May 2007 11:09:42 -0700
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <d11dcfba0705041102y5bffc1f9ta444a0c4f44b53ce@mail.gmail.com>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
	<f1c2pp$l30$1@sea.gmane.org> <463AB1DB.5010308@canterbury.ac.nz>
	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>
	<CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>
	<01dd01c78e60$11d8e460$f001a8c0@RaymondLaptop1>
	<ca471dc20705041015o34351f23k2c66f246f6b305a0@mail.gmail.com>
	<d11dcfba0705041102y5bffc1f9ta444a0c4f44b53ce@mail.gmail.com>
Message-ID: <ca471dc20705041109k3543b311id396c85e2b3d03dd@mail.gmail.com>

On 5/4/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 5/4/07, Guido van Rossum <guido at python.org> wrote:
> > On 5/4/07, Raymond Hettinger <python at rcn.com> wrote:
> > > An encapsulating function should be added to the weakref module
> > > so that Guido's example could be written as:
> > >
> > > class BufferedWriter:
> > >
> > >   def __init__(self, raw):
> > >     self.raw = raw
> > >     self.buffer = ""
> > >     weakref.cleanup(self, lambda s: s.raw.write(s.buffer))
> >
> > Or, instead of a new lambda, just use the unbound method:
> >
> >     weakref.cleanup(self, self.__class__.flush)
> >
> > Important: use the dynamic class (self.__class___), not the static
> > class (BufferedWriter). The distinction matters when BufferedWriter is
> > subclassed and the subclass overrides flush().
> >
> > Hm, a thought just occurred to me. Why not arrange for object.__new__
> > to call [the moral equivalent of] weakref.cleanup(self,
> > self.__class__.__del__), and get rid of the direct call to __del__
> > from the destructor? (And the special-casing of objects with __del__
> > in the GC module, of course.)
>
> That seems like a good idea, though I'm still a little unclear as to
> how far the AttrMap should be going to look like a real instance. As
> it stands, you can only access items from the instance __dict__. That
> means no methods, class attributes, etc.::

Oh, you mean 'self' as passed to the callback is not the instance?
That kills the whole idea (since the typical __del__ calls
self.flush() or self.close()).

> >>> import weakref
> >>> def cleanup(obj, callback, _reg=[]):
> ...    class AttrMap(object):
> ...        def __init__(self, map):
> ...            self._map = map
> ...        def __getattr__(self, key):
> ...            return self._map[key]
> ...    def wrapper(wr, mp=AttrMap(obj.__dict__), callback=callback):
> ...        _reg.remove(wr)
> ...        callback(mp)
> ...    _reg.append(weakref.ref(obj, wrapper))
> ...
> >>> class Object(object):
> ...     # note that we do this in __init__ because in __new__, the
> ...     # object has no references to it yet
> ...     def __init__(self):
> ...         super(Object, self).__init__()
> ...         if hasattr(self.__class__, '__newdel__'):
> ...             # note we use .im_func so that we can later pass
> ...             # any object as the "self" parameter
> ...             cleanup(self, self.__class__.__newdel__.im_func)
> ...
> >>> class Foo(Object):
> ...     def flush(self):
> ...         print 'flushing'
> ...     def __newdel__(self):
> ...         print 'deleting'
> ...         self.flush()
> ...
> >>> f = Foo()
> >>> del f
> deleting
> Exception exceptions.KeyError: 'flush' in <function wrapper at
> 0x00F34630> ignored

If it really has to be done this way, I think the whole PEP is doomed.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rhamph at gmail.com  Fri May  4 20:35:28 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 4 May 2007 12:35:28 -0600
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <ca471dc20705041109k3543b311id396c85e2b3d03dd@mail.gmail.com>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
	<f1c2pp$l30$1@sea.gmane.org> <463AB1DB.5010308@canterbury.ac.nz>
	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>
	<CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>
	<01dd01c78e60$11d8e460$f001a8c0@RaymondLaptop1>
	<ca471dc20705041015o34351f23k2c66f246f6b305a0@mail.gmail.com>
	<d11dcfba0705041102y5bffc1f9ta444a0c4f44b53ce@mail.gmail.com>
	<ca471dc20705041109k3543b311id396c85e2b3d03dd@mail.gmail.com>
Message-ID: <aac2c7cb0705041135q426c995ei5569643ecf4e37d0@mail.gmail.com>

On 5/4/07, Guido van Rossum <guido at python.org> wrote:
> On 5/4/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > On 5/4/07, Guido van Rossum <guido at python.org> wrote:
> > > Hm, a thought just occurred to me. Why not arrange for object.__new__
> > > to call [the moral equivalent of] weakref.cleanup(self,
> > > self.__class__.__del__), and get rid of the direct call to __del__
> > > from the destructor? (And the special-casing of objects with __del__
> > > in the GC module, of course.)
> >
> > That seems like a good idea, though I'm still a little unclear as to
> > how far the AttrMap should be going to look like a real instance. As
> > it stands, you can only access items from the instance __dict__. That
> > means no methods, class attributes, etc.::
>
> Oh, you mean 'self' as passed to the callback is not the instance?
> That kills the whole idea (since the typical __del__ calls
> self.flush() or self.close()).
>
[..snip example using __dict__..]
>
> If it really has to be done this way, I think the whole PEP is doomed.

Any attempt that keeps the entire contents of __dict__ alive is
doomed.  It's likely to contain a cycle back to the original object,
and avoiding that is the whole point of jumping through these hoops.

I've got a metaclass that moves explicitly marked attributes and
methods into a "core" object, allowing you to write code like this:

class MyFile(safedel):
    __coreattrs__ = ['_fd']
    def __init__(self, path):
        super(MyFile, self).__init__()
        self._fd = os.open(path, ...)
    @coremethod
    def __safedel__(core):
        core.close()
    @coremethod
    def close(core):
        # This method is written to be idempotent
        if core._fd is not None:
            os.close(core._fd)
            core._fd = None

I've submitted it to the python cookbook, but I don't know how long
it'll take to get posted; it's a little on the long side at 163 lines.

The biggest limitation is you can't easily use super() in core
methods, although the proposed changes to super() would probably fix
this.

-- 
Adam Olsen, aka Rhamphoryncus

From python at rcn.com  Fri May  4 20:37:59 2007
From: python at rcn.com (Raymond Hettinger)
Date: Fri,  4 May 2007 14:37:59 -0400 (EDT)
Subject: [Python-3000] PEP: Eliminate __del__
Message-ID: <20070504143759.BIA64211@ms09.lnh.mail.rcn.net>

> If it really has to be done this way, I think the whole PEP is doomed.

This thread is getting way ahead of me and starting to self-destruct before I've had a chance to put together a concrete proposal and scan existing code for use cases.

Can I please press the <slow> button for a few days until I can offer a useful starting point?  So far, it is clear that some of the everyday use-cases can be handled trivially, but there are some use cases that are not going to yield without much more thought.

Raymond


From steve at holdenweb.com  Fri May  4 20:51:00 2007
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 04 May 2007 14:51:00 -0400
Subject: [Python-3000] PEP 30XZ: Simplified Parsing
In-Reply-To: <4638B151.6020901@voidspace.org.uk>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
Message-ID: <f1fvam$1ic$1@sea.gmane.org>

Michael Foord wrote:
> Jim Jewett wrote:
>> PEP: 30xz
>> Title: Simplified Parsing
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Jim J. Jewett <JimJJewett at gmail.com>
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/plain
>> Created: 29-Apr-2007
>> Post-History: 29-Apr-2007
>>
>>
>> Abstract
>>
>>     Python initially inherited its parsing from C.  While this has
>>     been generally useful, there are some remnants which have been
>>     less useful for python, and should be eliminated.
>>
>>     + Implicit String concatenation
>>
>>     + Line continuation with "\"
>>
>>     + 034 as an octal number (== decimal 28).  Note that this is
>>       listed only for completeness; the decision to raise an
>>       Exception for leading zeros has already been made in the
>>       context of PEP XXX, about adding a binary literal.
>>
>>
>> Rationale for Removing Implicit String Concatenation
>>
>>     Implicit String concatenation can lead to confusing, or even
>>     silent, errors. [1]
>>
>>         def f(arg1, arg2=None): pass
>>
>>         f("abc" "def")  # forgot the comma, no warning ...
>>                         # silently becomes f("abcdef", None)
>>
>>   
> Implicit string concatenation is massively useful for creating long 
> strings in a readable way though:
> 
>     call_something("first part\n"
>                            "second line\n"
>                             "third line\n")
> 
> I find it an elegant way of building strings and would be sad to see it 
> go. Adding trailing '+' signs is ugly.
> 
Currently at least possible, though doubtless some people won't like the 
left-hand alignment, is

     call_something("""\
first part
second part
third part
""")

Alas if the proposal to remove the continuation backslash goes through 
this may not remain available to us.

I realise that the arrival of Py3 means all these are up for grabs, but 
don't think any of them are really warty enough to require removal.

I take the point that octal constants are counter-intuitive and wouldn't 
be too disappointed by their removal. I still think Icon had the right 
answer there in allowing an explicit decimal radix in constants, so 16 
as a binary constant would be 10000r2, or 10r16. IIRC it still allowed 
0x10 as well (though Tim may shoot me down there).

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden
------------------ Asciimercial ---------------------
Get on the web: Blog, lens and tag your way to fame!!
holdenweb.blogspot.com        squidoo.com/pythonology
tagged items:         del.icio.us/steve.holden/python
All these services currently offer free registration!
-------------- Thank You for Reading ----------------


From jimjjewett at gmail.com  Fri May  4 21:09:46 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 4 May 2007 15:09:46 -0400
Subject: [Python-3000] updated PEP3125, Remove Backslash Continuation
Message-ID: <fb6fbf560705041209g45ca6ce1g543fd0491f2e40d7@mail.gmail.com>

Major rewrite.

The inside-a-string continuation is separated from the general continuation.

The alternatives section is expanded to also list Andrew Koenig's
improved inside-expressions variant, since that is a real contender.

If anyone feels I haven't acknowledged their concerns, please tell me.

--------------

PEP: 3125
Title: Remove Backslash Continuation
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett <JimJJewett at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Apr-2007
Post-History: 29-Apr-2007, 30-Apr-2007, 04-May-2007


Abstract
========

    Python initially inherited its parsing from C.  While this has
    been generally useful, there are some remnants which have been
    less useful for python, and should be eliminated.

    This PEP proposes elimination of terminal ``\`` as a marker for
    line continuation.


Motivation
==========

    One goal for Python 3000 should be to simplify the language by
    removing unnecessary or duplicated features.  There are currently
    several ways to indicate that a logical line is continued on the
    following physical line.

    The other continuation methods are easily explained as a logical
    consequence of the semantics they provide; ``\`` is simply an escape
    character that needs to be memorized.


Existing Line Continuation Methods
==================================


Parenthetical Expression - ([{}])
---------------------------------

    Open a parenthetical expression.  It doesn't matter whether people
    view the "line" as continuing; they do immediately recognize that
    the expression needs to be closed before the statement can end.

    Examples using each of (), [], and {}::

        def fn(long_argname1,
               long_argname2):
            settings = {"background":  "random noise"
                        "volume":  "barely audible"}
            restrictions = ["Warrantee void if used",
                            "Notice must be recieved by yesterday"
                            "Not responsible for sales pitch"]

    Note that it is always possible to parenthesize an expression,
    but it can seem odd to parenthesize an expression that needs
    them only for the line break::

        assert val>4, (
            "val is too small")


Triple-Quoted Strings
---------------------

    Open a triple-quoted string; again, people recognize that the
    string needs to finish before the next statement starts.

        banner_message = """
            Satisfaction Guaranteed,
            or DOUBLE YOUR MONEY BACK!!!





                                            some minor restrictions apply"""


Terminal ``\`` in the general case
----------------------------------

    A terminal ``\`` indicates that the logical line is continued on the
    following physical line (after whitespace).  There are no
    particular semantics associated with this.  This form is never
    required, although it may look better (particularly for people
    with a C language background) in some cases::

        >>> assert val>4, \
                "val is too small"

    Also note that the ``\`` must be the final character in the line.
    If your editor navigation can add whitespace to the end of a line,
    that invisible change will alter the semantics of the program.
    Fortunately, the typical result is only a syntax error, rather
    than a runtime bug::

        >>> assert val>4, \
                "val is too small"

        SyntaxError: unexpected character after line continuation character

    This PEP proposes to eliminate this redundant and potentially
    confusing alternative.


Terminal ``\`` within a string
------------------------------

    A terminal ``\`` within a single-quoted string, at the end of the
    line.  This is arguably a special case of the terminal ``\``, but
    it is a special case that may be worth keeping.

        >>> "abd\
         def"
        'abd def'

    + Many of the objections to removing ``\`` termination were really
      just objections to removing it within literal strings; several
      people clarified that they want to keep this literal-string
      usage, but don't mind losing the general case.

    + The use of ``\`` for an escape character within strings is well
      known.

    - But note that this particular usage is odd, because the escaped
      character (the newline) is invisible, and the special treatment
      is to delete the character.  That said, the ``\`` of
      ``\(newline)`` is still an escape which changes the meaning of
      the following character.


Alternate Proposals
===================

    Several people have suggested alternative ways of marking the line
    end.  Most of these were rejected for not actually simplifying things.

    The one exception was to let any unfinished expression signify a line
    continuation, possibly in conjunction with increased indentation.

    This is attractive because it is a generalization of the rule for
    parentheses.

    The initial objections to this were:

        - The amount of whitespace may be contentious; expression
          continuation should not be confused with opening a new
          suite.

        - The "expression continuation" markers are not as clearly marked
          in Python as the grouping punctuation "(), [], {}" marks are::

              # Plus needs another operand, so the line continues
              "abc" +
                  "def"

              # String ends an expression, so the line does
              # not continue.  The next line is a syntax error because
              # unary plus does not apply to strings.
              "abc"
                  + "def"

        - Guido objected for technical reasons.  [#dedent]_  The most
          obvious implementation would require allowing INDENT or
          DEDENT tokens anywhere, or at least in a widely expanded
          (and ill-defined) set of locations.  While this is concern
          only for the internal parsing mechanism (rather than for
          users), it would be a major new source of complexity.

    Andrew Koenig then pointed out [#lexical]_ a better implementation
    strategy, and said that it had worked quite well in other
    languages. [#snocone]_  The improved suggestion boiled down to::

        The whitespace that follows an (operator or) open bracket or
        parenthesis can include newline characters.

        It would be implemented at a very low lexical level -- even
        before the decision is made to turn a newline followed by
        spaces into an INDENT or DEDENT token.

    There is still some concern that it could mask bugs, as in this
    example [#guidobughide]_::

        # Used to be y+1, the 1 got dropped.  Syntax Error (today)
        # would become nonsense.
        x = y+
        f(x)

    Requiring that the continuation be indented more than the initial
    line would add both safety and complexity.


Open Issues
===========

    + Should ``\``-continuation be removed even inside strings?

    + Should the continuation markers be expanded from just ([{}])
      to include lines ending with an operator?

    + As a safety measure, should the continuation line be required
      to be more indented than the initial line?


References
==========

..  [#dedent] (email subject) PEP 30XZ: Simplified Parsing, van Rossum
    http://mail.python.org/pipermail/python-3000/2007-April/007063.html

..  [#lexical] (email subject) PEP-3125 -- remove backslash
    continuation, Koenig
    http://mail.python.org/pipermail/python-3000/2007-May/007237.html

..  [#snocone] The Snocone Programming Language, Koenig
    http://www.snobol4.com/report.htm

..  [#guidobughide] (email subject) PEP-3125 -- remove backslash
    continuation, van Rossum
    http://mail.python.org/pipermail/python-3000/2007-May/007244.html


Copyright
=========

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

From steven.bethard at gmail.com  Fri May  4 21:30:15 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 4 May 2007 13:30:15 -0600
Subject: [Python-3000] [Python-Dev] updated PEP3125,
	Remove Backslash Continuation
In-Reply-To: <fb6fbf560705041209g45ca6ce1g543fd0491f2e40d7@mail.gmail.com>
References: <fb6fbf560705041209g45ca6ce1g543fd0491f2e40d7@mail.gmail.com>
Message-ID: <d11dcfba0705041230t44aec001gd370768891d58b82@mail.gmail.com>

[cc -python-dev]

On 5/4/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> Open Issues
> ===========
>
>     + Should ``\``-continuation be removed even inside strings?

I'm a strong -1 on this PEP if ``\``-continuation is removed from
inside triple-quoted strings. I'd hate to have to go from writing::

    >>> textwrap.dedent('''\
    ...      foo
    ...      bar
    ... ''')
    'foo\nbar\n'

to writing::

    >>> textwrap.dedent('''
    ...     foo
    ...     bar
    ... '''[1:])
    'foo\nbar\n'

or maybe::

    >>> textwrap.dedent('''
    ...     foo
    ...     bar
    ... '''.lstrip('\n'))
    'foo\nbar\n'

>     + Should the continuation markers be expanced from just ([{}])
>       to include lines ending with an operator?

I think the only way to answer this is to have someone actually
implement it, so that we can evaluate the complexity of the
implementation.  If someone can produce a patch, we can talk about
this.

>     + As a safety measure, should the continuation line be required
>       to be more indented than the initial line?

Again, let's see a patch and we can talk about it.


STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From mike.klaas at gmail.com  Fri May  4 22:45:00 2007
From: mike.klaas at gmail.com (Mike Klaas)
Date: Fri, 4 May 2007 13:45:00 -0700
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <f1g4c4$l2m$1@sea.gmane.org>
References: <fb6fbf560704292029g118592bpb19a87a2ee5ea312@mail.gmail.com>
	<4638B151.6020901@voidspace.org.uk>
	<d11dcfba0705021000k3c0d1e20pdc961e2b9947f67a@mail.gmail.com>
	<f1g4c4$l2m$1@sea.gmane.org>
Message-ID: <3d2ce8cb0705041345m5b5d2b30oe11b0d392e7324cd@mail.gmail.com>

On 5/4/07, Baptiste Carvello <baptiste13 at altern.org> wrote:

> maybe we could have a "dedent" literal that would remove the first newline and
> all indentation so that you can just write:
>
> call_something( d'''
>                  first part
>                  second line
>                  third line
>                  ''' )

Surely

from textwrap import dedent as d

is close enough?

-Mike

From baptiste13 at altern.org  Fri May  4 22:47:07 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Fri, 04 May 2007 22:47:07 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <46371BD2.7050303@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de>
Message-ID: <f1g65s$nar$1@sea.gmane.org>

Martin v. Löwis wrote:
> PEP: 31xx
> Title: Supporting Non-ASCII Identifiers
> Version: $Revision$
> Last-Modified: $Date$
> Author: Martin v. Löwis <martin at v.loewis.de>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 1-May-2007
> Python-Version: 3.0
> Post-History:
> 
> Abstract
> ========
> 
> This PEP suggests to support Non-ASCII letters (such as accented
> characters, Cyrillic, Greek, Kanji, etc.) in Python identifiers.
> 

If this is to ever happen, it should be only accessible through a command-line
option to python. That way we make sure people are aware that they are making
their code incompatible with the larger world.

Cheers,
Baptiste


From nevillegrech at gmail.com  Fri May  4 22:51:19 2007
From: nevillegrech at gmail.com (Neville Grech Neville Grech)
Date: Fri, 4 May 2007 22:51:19 +0200
Subject: [Python-3000] [Python-Dev] updated PEP3125,
	Remove Backslash Continuation
In-Reply-To: <d11dcfba0705041230t44aec001gd370768891d58b82@mail.gmail.com>
References: <fb6fbf560705041209g45ca6ce1g543fd0491f2e40d7@mail.gmail.com>
	<d11dcfba0705041230t44aec001gd370768891d58b82@mail.gmail.com>
Message-ID: <de9ae4950705041351m52c793f3g83fff9ed6330a0a2@mail.gmail.com>

This PEP is much more reasonable.

Should ``\``-continuation be removed even inside strings? -1

Backslash continuation in strings is used a lot, especially in strings
that must not start with a newline but are written in the following format
for clarity:
'''\
  first line
  second line\
'''

Should the continuation markers be expanded from just ([{}]) to include
lines ending with an operator?

-1

I think that the following is much more clear:

a=(3 +
     2 +
     4)
f(x)

than:

a= 3+
     2+
     4
f(x)


On 5/4/07, Steven Bethard <steven.bethard at gmail.com> wrote:
>
> [cc -python-dev]
>
> On 5/4/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > Open Issues
> > ===========
> >
> >     + Should ``\``-continuation be removed even inside strings?
>
> I'm a strong -1 on this PEP if ``\``-continuation is removed from
> inside triple-quoted strings. I'd hate to have to go from writing::
>
>     >>> textwrap.dedent('''\
>     ...      foo
>     ...      bar
>     ... ''')
>     'foo\nbar\n'
>
> to writing::
>
>     >>> textwrap.dedent('''
>     ...     foo
>     ...     bar
>     ... '''[1:])
>     'foo\nbar\n'
>
> or maybe::
>
>     >>> textwrap.dedent('''
>     ...     foo
>     ...     bar
>     ... '''.lstrip('\n'))
>     'foo\nbar\n'
>
> >     + Should the continuation markers be expanced from just ([{}])
> >       to include lines ending with an operator?
>
> I think the only way to answer this is to have someone actually
> implement it, so that we can evaluate the complexity of the
> implementation.  If someone can produce a patch, we can talk about
> this.
>
> >     + As a safety measure, should the continuation line be required
> >       to be more indented than the initial line?
>
> Again, let's see a patch and we can talk about it.
>
>
> STeVe
> --
> I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
> tiny blip on the distant coast of sanity.
>         --- Bucky Katt, Get Fuzzy
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/nevillegrech%40gmail.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-3000/attachments/20070504/7326cc0e/attachment.htm 

From martin at v.loewis.de  Sat May  5 01:00:27 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 05 May 2007 01:00:27 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <f1g65s$nar$1@sea.gmane.org>
References: <46371BD2.7050303@v.loewis.de> <f1g65s$nar$1@sea.gmane.org>
Message-ID: <463BBB0B.40703@v.loewis.de>

> If this is to ever happen, it should be only accessible through a command-line
> option to python. That way we make sure people are aware that they are making
> their code incompatible with the larger world.

In what way will the source code be incompatible with the larger world?

Martin

From guido at python.org  Sat May  5 01:10:05 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 4 May 2007 16:10:05 -0700
Subject: [Python-3000] Can someone please make py3k* checkins go to the
	python-3000-checkins mailing list?
Message-ID: <ca471dc20705041610n69818041l883b5ed6ddefaee2@mail.gmail.com>

I don't know how the filters for checkin emails are set up, but this
seems wrong: mail related to the p3yk branch goes to
python-3000-checkins, but mail related to the py3k-unistr branch goes
to python-checkins. There are a bunch of branches of relevance to py3k
now; these should all go to the python-3000-checkins list. I suggest
to filter on branches that start with either py3k or with p3yk.
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Sat May  5 03:23:39 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 05 May 2007 13:23:39 +1200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <A7B470E4-DABA-4223-8E97-237249459FCD@gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<463AB2E1.2030408@canterbury.ac.nz>
	<A7B470E4-DABA-4223-8E97-237249459FCD@gmail.com>
Message-ID: <463BDC9B.2030500@canterbury.ac.nz>

Simon Percivall wrote:

> This was more in the way of returning the type that was given:
> if you start with a list you end up with a list in "b", if you
> start with an iterator you end up with an iterator.

I don't think that returning the type given is a goal
that should be attempted, because it can only ever work
for a fixed set of known types. Given an arbitrary
sequence type, there is no way of knowing how to
create a new instance of it with specified contents.

--
Greg

From daniel at stutzbachenterprises.com  Sat May  5 03:44:03 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Fri, 4 May 2007 20:44:03 -0500
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <463BDC9B.2030500@canterbury.ac.nz>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<463AB2E1.2030408@canterbury.ac.nz>
	<A7B470E4-DABA-4223-8E97-237249459FCD@gmail.com>
	<463BDC9B.2030500@canterbury.ac.nz>
Message-ID: <eae285400705041844l1dbecbe2o6d7858b0fde8e33@mail.gmail.com>

On 5/4/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> I don't think that returning the type given is a goal
> that should be attempted, because it can only ever work
> for a fixed set of known types. Given an arbitrary
> sequence type, there is no way of knowing how to
> create a new instance of it with specified contents.

For objects that support the sequence protocol, how about specifying that:

a, *b = container_object

must be equivalent to:

a, b = container_object[0], container_object[1:]

That way, b is assigned whatever container_object's getslice method
returns.  A list will return a list, a tuple will return a tuple, and
widgets (or BLists...) can return whatever makes sense for them.
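
A quick sketch of what that equivalence would give for the two builtin cases:

    t = (1, 2, 3)
    a, b = t[0], t[1:]          # b == (2, 3): slicing a tuple gives a tuple

    lst = [1, 2, 3]
    a, b = lst[0], lst[1:]      # b == [2, 3]: slicing a list gives a list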

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

From greg.ewing at canterbury.ac.nz  Sat May  5 04:07:00 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 05 May 2007 14:07:00 +1200
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <20070504143759.BIA64211@ms09.lnh.mail.rcn.net>
References: <20070504143759.BIA64211@ms09.lnh.mail.rcn.net>
Message-ID: <463BE6C4.5060309@canterbury.ac.nz>

Raymond Hettinger wrote:

> Can I please press the <slow> button for a few days until I can offer
 > a useful starting point.

Before you go any further, the important thing to take
from the thread so far is that you mustn't keep the
whole contents of the object's __dict__ alive via
the callback.

--
Greg

From foom at fuhm.net  Sat May  5 06:22:57 2007
From: foom at fuhm.net (James Y Knight)
Date: Sat, 5 May 2007 00:22:57 -0400
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <f1g65s$nar$1@sea.gmane.org>
References: <46371BD2.7050303@v.loewis.de> <f1g65s$nar$1@sea.gmane.org>
Message-ID: <C5DE295C-E159-4DBE-8C96-4972A02927FB@fuhm.net>

On May 4, 2007, at 4:47 PM, Baptiste Carvello wrote:
> If this is to ever happen, it should be only accessible through a  
> command-line
> option to python. That way we make sure people are aware that they  
> are making
> their code incompatible with the larger world.

That's ridiculous. Without your special option, the code would run  
perfectly well on pythons world-wide. Requiring a special option is a  
surefire way to *ensure* compatibility issues, of course...

James


From ncoghlan at gmail.com  Sat May  5 12:12:36 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 05 May 2007 20:12:36 +1000
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <8B815829-8E98-4547-BC99-14E2241C13CB@zzzcomputing.com>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1><5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com><f1c2pp$l30$1@sea.gmane.org>	<463AB1DB.5010308@canterbury.ac.nz>	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>	<CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>	<463B4455.7060100@develer.com>
	<8B815829-8E98-4547-BC99-14E2241C13CB@zzzcomputing.com>
Message-ID: <463C5894.8020601@gmail.com>

Michael Bayer wrote:
> i just dont understand why such an important feature would have to be  
> relegated to just a "recipe".  i think thats a product of the notion  
> that "implicit finalizers are bad, use try/finally".  thats not  
> really valid for things like buffers that flush and database/network  
> connections that must be released when they fall out of scope.

Implicit finalizers are typically bad because they don't provide any 
kind of guarantee as to when they're going to be executed - all they 
promise is "eventually, maybe". If the gc is paused or disabled for some 
reason, the answer is quite possibly never (and with current __del__ 
semantics, the answer in CPython may be never even when full gc is running).

It's just a quirk of CPython that "eventually" normally translates to 
"when the variable goes out of scope" for objects that don't participate 
in cycles.

Accordingly, anything which requires explicit finalization (such as 
flushing a buffer, or releasing a database connection) needs to migrate 
towards using the context management protocol and the with statement to 
ensure things are cleaned up properly regardless of the GC semantics 
that currently happen to be in force.
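
The standard file case already reads naturally that way:

# cleanup happens at the end of the block, regardless of how or when
# the garbage collector runs
with open("out.txt", "w") as f:
    f.write("data")
# f.close() has been called by this point, deterministically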

Implicit finalization still has a place though, and it is currently
supported in a far more definite fashion by using a weakref callback and 
leaving __del__ undefined. The downside is that the current weakref 
module leaves you with some extra work to do before you can easily use 
it for finalization.
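
A minimal sketch of the kind of recipe involved (assuming the resource exposes
a close() method; the module-level set is what keeps the weakref, and hence
its callback, alive until it fires):

import weakref

_live_refs = set()

class Holder:
    def __init__(self, resource):
        self.resource = resource
        # the callback closes over only the resource, never over self or
        # self.__dict__, so it cannot keep the holder alive
        def _finalize(ref, resource=resource):
            _live_refs.discard(ref)
            resource.close()
        _live_refs.add(weakref.ref(self, _finalize))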

The reason for initially pursuing a recipe approach for weakref based 
finalisation is that it allows time to determine whether or not there 
are better recipes than whatever is proposed in the PEP before casting 
it in the form of fixed language syntax. Adding syntactic sugar for a 
recipe is child's play compared to trying to get rid of syntax (or 
change its semantics) after discovering it is broken in some fashion.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From rasky at develer.com  Sat May  5 13:21:59 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 05 May 2007 13:21:59 +0200
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <aac2c7cb0705041135q426c995ei5569643ecf4e37d0@mail.gmail.com>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>	<f1c2pp$l30$1@sea.gmane.org>
	<463AB1DB.5010308@canterbury.ac.nz>	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>	<CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>	<01dd01c78e60$11d8e460$f001a8c0@RaymondLaptop1>	<ca471dc20705041015o34351f23k2c66f246f6b305a0@mail.gmail.com>	<d11dcfba0705041102y5bffc1f9ta444a0c4f44b53ce@mail.gmail.com>	<ca471dc20705041109k3543b311id396c85e2b3d03dd@mail.gmail.com>
	<aac2c7cb0705041135q426c995ei5569643ecf4e37d0@mail.gmail.com>
Message-ID: <f1hpcn$lti$1@sea.gmane.org>

On 04/05/2007 20.35, Adam Olsen wrote:

> Any attempt that keeps the entire contents of __dict__ alive is
> doomed.  It's likely to contain a cycle back to the original object,
> and avoiding that is the whole point of jumping through these hoops.

Uh? If __dict__ contains a cycle back to the original object, then the object 
is part of a cycle already, with or without getting an additional reference to 
the __dict__ within the finalization callback.

And if there's no cycle, you're not creating one by just referencing __dict__.
-- 
Giovanni Bajo


From rasky at develer.com  Sat May  5 13:39:37 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 05 May 2007 13:39:37 +0200
Subject: [Python-3000] PEP to change how the main module is delineated
In-Reply-To: <ca471dc20704231505w2ae34a0dwb6a2227f1df52831@mail.gmail.com>
References: <bbaeab100704221845i18417626o8dc1289fa7f9b685@mail.gmail.com>
	<ca471dc20704231505w2ae34a0dwb6a2227f1df52831@mail.gmail.com>
Message-ID: <f1hqdq$pf7$1@sea.gmane.org>

On 24/04/2007 0.05, Guido van Rossum wrote:

>> This PEP is to change the ``if __name__ == "__main__": ...`` idiom to
>> ``if __name__ == sys.main: ...``  so that you at least have a chance
>> to execute module in a package that use relative imports.
>>
>> Ran this PEP past python-ideas.  Stopped the discussion there when too
>> many new ideas were being proposed.  =)  I have listed all of them in
>> the Rejected Ideas section, although if overwhelming support for one
>> comes forward the PEP can shift to one of them.
> 
> I'm -1 on this and on any other proposed twiddlings of the __main__
> machinery. The only use case seems to be running scripts that happen
> to be living inside a module's directory, which I've always seen as an
> antipattern. To make me change my mind you'd have to convince me that
> it isn't.

Sometimes, beginners get confused because of this. They start with a single 
module; it grows and grows, until they split part of it out into a second 
module. But if the two modules then import each other, there is an asymmetry, 
because one of them is internally renamed to __main__. For instance:

==== a.py ====
class A:
    pass

if __name__ == "__main__":
    a = A()
    print A.__name__
    print a.__class__
    import b
    b.run(a)
===============

==== b.py ====
from a import A

def run(a):
    print A
    print a.__class__
    assert isinstance(a, A)  # FAIL!
==============

$ python a.py
A
__main__.A
a.A
__main__.A
Traceback (most recent call last):
   File "a.py", line 9, in ?
     b.run(a)
   File "E:\work\b.py", line 6, in run
     assert isinstance(a, A)  # FAIL!
AssertionError

I think this behaviour confuses many beginners, and it is unnatural for 
experts too. I've been bitten by it a few times in the past.

I still believe that it would be much easier to just support an explicit 
__main__ function:

==== a.py ====
class A:
    pass

def __main__():
    a = A()
    print A.__name__
    print a.__class__
    import b
    b.run(a)
===============

which is easier for beginners to read and lets the main module keep its 
original name, thus not causing these weird side-effects.
-- 
Giovanni Bajo


From exarkun at divmod.com  Sat May  5 18:27:08 2007
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Sat, 5 May 2007 12:27:08 -0400
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <f1hpcn$lti$1@sea.gmane.org>
Message-ID: <20070505162708.19381.10763891.divmod.quotient.8722@ohm>

On Sat, 05 May 2007 13:21:59 +0200, Giovanni Bajo <rasky at develer.com> wrote:
>On 04/05/2007 20.35, Adam Olsen wrote:
>
>> Any attempt that keeps the entire contents of __dict__ alive is
>> doomed.  It's likely to contain a cycle back to the original object,
>> and avoiding that is the whole point of jumping through these hoops.
>
>Uh? If __dict__ contains a cycle back to the original object, then the object
>is part of a cycle already, with or without getting an additional reference to
>the __dict__ within the finalization callback.

If the __dict__ contains a cycle back to the original object, then if you
keep the __dict__ alive in the weakref callback (which is what you are doing
if the weakref callback references the __dict__ - it does not weakly
reference it), then you will keep the original object alive and the weakref
callback will never run, because the original object will live forever.

Contrariwise, if the weakref callback holds references only to the particular
objects it needs, then it doesn't matter if there is a cycle through some
_other_ objects in the __dict__, since the callback will not keep that cycle
alive.  Eventually the cyclic gc will clean up the cycle (leaving the objects
referenced by the callback alone, since the callback is keeping them alive).
At that point the callback will run, because it was a weakref to the original
object, which has now been collected.  The callback can then use the specific
references it holds to do some cleanup, after which the weakref object, the
callback, and whatever specific objects it held references to will most
likely be dropped and eventually collected.  (That last part is not
guaranteed: the callback could choose to keep the specific objects it
referenced - not the weakly referenced object itself - alive by putting them
into some other living container.)
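
A small illustration (Holder/Resource are made up; the cycle goes through
__dict__, but the callback holds only the resource):

import gc
import weakref

class Resource:
    def close(self):
        print("resource closed")

class Holder:
    pass

h = Holder()
h.resource = Resource()
h.partner = h              # a cycle through h.__dict__

# the callback references only the resource, not h or h.__dict__
r = weakref.ref(h, lambda ref, res=h.resource: res.close())

del h
gc.collect()               # the cycle is collected and the callback fires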

Jean-Paul

From mike_mp at zzzcomputing.com  Sat May  5 21:46:51 2007
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Sat, 5 May 2007 15:46:51 -0400
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <463C5894.8020601@gmail.com>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1><5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com><f1c2pp$l30$1@sea.gmane.org>	<463AB1DB.5010308@canterbury.ac.nz>	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>	<CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>	<463B4455.7060100@develer.com>
	<8B815829-8E98-4547-BC99-14E2241C13CB@zzzcomputing.com>
	<463C5894.8020601@gmail.com>
Message-ID: <EB001A2E-AF6B-4F2C-86BB-E7BCE051785B@zzzcomputing.com>


On May 5, 2007, at 6:12 AM, Nick Coghlan wrote:

>
> The reason for initially pursuing a recipe approach for weakref  
> based finalisation is that it allows time to determine whether or  
> not there are better recipes than whatever is proposed in the PEP  
> before casting it in the form of fixed language syntax. Adding  
> syntactic sugar for a recipe is child's play compared to trying to  
> get rid of syntax (or change its semantics) after discovering it is  
> broken in some fashion.
>

if the recipe is just an interim step towards developing something  
that "just works", then we agree.   obviously explicit finalization  
is preferable and relying upon cpython's "immediate" GC of non-cycled  
objects is a bad trap to fall into (particularly if you then run the  
same code using Jython for example)...but in a garbage collected  
language, the "loose ends" still need some way to clean themselves up  
even if it's deferred.

From tomerfiliba at gmail.com  Sat May  5 15:29:47 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Sat, 5 May 2007 15:29:47 +0200
Subject: [Python-3000] the future of the GIL
Message-ID: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>

hi all

i have to admit i've been neglecting the list in the past few months,
and i don't know whether the issue i want to bring up has been
settled already.

as you all may have noticed, multicore processors are becoming
more and more common in all kinds of machines, from desktops
to servers, and will surely become more prevalent with time,
as all major CPU vendors plan to ship 8-core processors
by mid-2008.

back in the day of uniprocessor machines, having the GIL really
made life simpler and the sacrifice was negligible.

however, running a threaded python script over an 8-core
machine, where you can utilize at most 12.5% of the horsepower,
seems like too large a sacrifice to me.

the only way to overcome this with cpython is to Kill The GIL (TM),
and since it's a very big architectural change, it ought to happen
soon. pushing it further than version 3.0 means all library authors
would have to adapt their code twice (once to make it compatible
with 3.0, and then again to make it thread safe).

i see all hell has broken loose here, PEP-wise speaking, but i really
hope there's still time to consider killing the GIL at last.


-tomer

From steven.bethard at gmail.com  Sun May  6 01:15:44 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 5 May 2007 17:15:44 -0600
Subject: [Python-3000] the future of the GIL
In-Reply-To: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
Message-ID: <d11dcfba0705051615x21773145jf6cacb34074575f3@mail.gmail.com>

On 5/5/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> the only way to overcome this with cpython is to Kill The GIL (TM),
> and since it's a very big architectural change, it ought to happen
> soon. pushing it further than version 3.0 means all library authors
> would have to adapt their code twice (once to make it compatible
> with 3.0, and then again to make it thread safe).
>
> i see all hell has broken loose here, PEP-wise speaking, but i really
> hope there's still time to consider killing the GIL at last.

You've missed the deadline for Python 3000 PEPs.  (It was April 30th.)
 This discussion is also probably more appropriate for python-ideas
until someone has something resembling an implementation ready...

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From greg.ewing at canterbury.ac.nz  Sun May  6 01:14:30 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 06 May 2007 11:14:30 +1200
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <f1hpcn$lti$1@sea.gmane.org>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
	<f1c2pp$l30$1@sea.gmane.org> <463AB1DB.5010308@canterbury.ac.nz>
	<011601c78e0a$cadd0cd0$f001a8c0@RaymondLaptop1>
	<CA43D727-CB77-4ADE-8420-9F5B639A4B23@zzzcomputing.com>
	<01dd01c78e60$11d8e460$f001a8c0@RaymondLaptop1>
	<ca471dc20705041015o34351f23k2c66f246f6b305a0@mail.gmail.com>
	<d11dcfba0705041102y5bffc1f9ta444a0c4f44b53ce@mail.gmail.com>
	<ca471dc20705041109k3543b311id396c85e2b3d03dd@mail.gmail.com>
	<aac2c7cb0705041135q426c995ei5569643ecf4e37d0@mail.gmail.com>
	<f1hpcn$lti$1@sea.gmane.org>
Message-ID: <463D0FD6.4030100@canterbury.ac.nz>

Giovanni Bajo wrote:

> Uh? If __dict__ contains a cycle back to the original object, then the object 
> is part of a cycle already, with or without getting an additional reference to 
> the __dict__ within the finalization callback.

Yes, but storing a finalizer in a global registry that
references the __dict__ makes it an *immortal* cycle,
because the GC won't see it as an isolated cycle that's
not referenced from outside.

> And if there's no cycle, you're not creating one by just 
 > referencing __dict__.

It's not creation of the cycle that's the issue, it's
keeping it alive forever once it's created.

--
Greg

From greg.ewing at canterbury.ac.nz  Sun May  6 01:46:38 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 06 May 2007 11:46:38 +1200
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <463B02EB.6060006@develer.com>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
	<f1c2pp$l30$1@sea.gmane.org> <463AB1DB.5010308@canterbury.ac.nz>
	<463B02EB.6060006@develer.com>
Message-ID: <463D175E.1000201@canterbury.ac.nz>

Giovanni Bajo wrote:

> class Holder:
>    def __init__(self):
>       self.resource = ....
>       self.__wr = weakref(self.resource, ....)
> 
> So, are you 
> saying that it's possible that the weakreference refcount goes to zero 
> *before* Holder's refcount?

No, but depending on the order in which the dict contents
gets decrefed when Holder is deallocated, the __wr attribute
may get deallocated before the resource attribute. If that
happens, the callback is never called.

I have run the following code with Python 2.3, 2.4 and 2.5
and it does not print "Cleaning up":

from weakref import ref

class Resource:
     pass

def cleanup(x):
     print "Cleaning up"

class Holder:

     def __init__(self):
         self.resource = Resource()
         self.weakref = ref(self.resource, cleanup)

h = Holder()
del h

> Are you saying that the fact that it works for me in real-world code is 
> just out of luck and might randomically break?

If this is really what you're doing, then yes, it will
randomly break. I may have misunderstood exactly what it
is you're doing, however.

--
Greg

From talin at acm.org  Sun May  6 02:57:51 2007
From: talin at acm.org (Talin)
Date: Sat, 05 May 2007 17:57:51 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
Message-ID: <463D280F.6070101@acm.org>

tomer filiba wrote:
> the only way to overcome this with cpython is to Kill The GIL (TM),
> and since it's a very big architectural change, it ought to happen
> soon. pushing it further than version 3.0 means all library authors
> would have to adapt their code twice (once to make it compatible
> with 3.0, and then again to make it thread safe).
> 
> i see all hell has broken loose here, PEP-wise speaking, but i really
> hope there's still time to consider killing the GIL at last.

I've brought up this issue as well, but the consensus seems to be that 
this is just too hard to even consider for 3.0.

Note that Jython and IronPython don't have the same restrictions in this 
regard as CPython. Both VMs are able to run in multiprocessing 
environments. (I don't know whether Jython/IronPython even have a GIL.)

My suggested approach to making CPython concurrent is to first tackle 
the problem of garbage collection in a multiprocessing environment. Once 
that is done, the next piece would be to address the issues of thread 
safety of the interpreter's internal data structures.

At one point, I started working on a generic, concurrent garbage 
collector that would be useful for a variety of interpreted languages 
such as Python, but I haven't had time to work on it lately. It's similar 
to the Boehm collector, except that it's designed for "cooperative" 
languages in which the collector knows about the structure of objects.

When I last worked on it, I had gotten the "young generation" collection 
working, and I had just finished implementing the global heap, and was 
in the process of writing unit tests for it. I hadn't started on 
old-generation collection or cross-generation reference tracking.

-- Talin

From jcarlson at uci.edu  Sun May  6 03:29:31 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 05 May 2007 18:29:31 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
Message-ID: <20070505181324.649A.JCARLSON@uci.edu>


"tomer filiba" <tomerfiliba at gmail.com> wrote:
> the only way to overcome this with cpython is to Kill The GIL (TM),
> and since it's a very big architectural change, it ought to happen
> soon. pushing it further than version 3.0 means all library authors
> would have to adapt their code twice (once to make it compatible
> with 3.0, and then again to make it thread safe).

There are many solutions to handling the scaling of Python on multicore
processors, only one of which is killing the GIL.  Another is Greg
Ewing's ideas offered in the "Ideas towards GIL removal" thread in the
python-ideas list.

My personal favorite, because it doesn't require a complete re-design of
the CPython runtime, is better abstractions.  I was skeptical at first,
but in reading the documentation, installing, testing, and monkeying
around with the processing package by Richard Oudkerk, I do think that
it has the proper level of abstraction.  Like thread programming it has
its quirks, but it seems that one should be able to apply much of their
experience with threads to the processing module (as long as they rely
on explicitly shared objects for communication).
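
For flavour, the style of code involved looks roughly like this (the sketch
below is written against the multiprocessing API, essentially a renamed
descendant of the processing package; processing's own spelling is nearly
identical):

from multiprocessing import Process, Queue

def worker(inbox, outbox):
    # reads like a thread worker, but runs in a separate process
    for item in iter(inbox.get, None):
        outbox.put(item * item)

if __name__ == "__main__":
    inbox, outbox = Queue(), Queue()
    p = Process(target=worker, args=(inbox, outbox))
    p.start()
    for i in range(5):
        inbox.put(i)
    inbox.put(None)                        # sentinel: tell the worker to stop
    print([outbox.get() for _ in range(5)])
    p.join()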

If you are used to using threads, give the processing package a try. You
may be as pleasantly surprised as I was.  Note that it would take some
more work to get it to work with passing sockets to another process, but
that has been done before (I have code that others have written if
anyone is curious).


 - Josiah


From martin at v.loewis.de  Sun May  6 09:47:12 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 06 May 2007 09:47:12 +0200
Subject: [Python-3000] PEP 3112
Message-ID: <463D8800.1010906@v.loewis.de>

I just read PEP 3112, and I believe it contains a
flaw/underspecification.

It says

# Each shortstringchar or longstringchar must be a character between 1
# and 127 inclusive, regardless of any encoding declaration [2] in the
# source file.

What does that mean? In particular, what is "a character between 1 and
127"?

Assuming this refers to ordinal values in some encoding: what encoding?
It's particularly puzzling that it says "regardless of any encoding
declaration of the source file".

I fear (but hope that I'm wrong) that this was meant to mean "use the
bytes as they are stored on disk in the source file". If so: is the
attached file valid Python? In case your editor can't render it: it
reads

#! -*- coding: iso-2022-jp -*-
a = b"?????"

But if you look at the file with a hex editor, you see it contains
only bytes between 1 and 127.
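
(For the curious: ISO-2022-JP switches character sets via escape sequences,
so even Japanese text encodes entirely to bytes below 128.  A quick check:)

# "Nihon" in ISO-2022-JP: escape sequences plus 7-bit bytes only
data = "\u65e5\u672c".encode("iso-2022-jp")
print(all(b < 128 for b in data))    # True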

I would hope that this code is indeed ill-formed (i.e. that
the byte representation on disk is irrelevant, and only the
Unicode ordinals of the source characters matter)

If so, can the specification please be updated to clarify that
1. in Grammar changes: Each shortstringchar or longstringchar must
   be a character whose Unicode ordinal value is between 1 and
   127 inclusive.
2. in Semantics: The bytes in the new object are obtained as if
   encoding a string literal with "iso-8859-1"

Regards,
Martin

-------------- next part --------------
A non-text attachment was scrubbed...
Name: a.py
Type: text/x-python
Size: 55 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-3000/attachments/20070506/c0269ce4/attachment.py 

From martin at v.loewis.de  Sun May  6 10:20:02 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 06 May 2007 10:20:02 +0200
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <20070504143759.BIA64211@ms09.lnh.mail.rcn.net>
References: <20070504143759.BIA64211@ms09.lnh.mail.rcn.net>
Message-ID: <463D8FB2.8050900@v.loewis.de>

> Can I please press the <slow> button for a few days until I can offer a useful starting point. 

Socially, this is the point of the PEP process in the first place: the
PEP author is supposed to collect community feedback in the PEP, and
address it as necessary. People won't stop discussing if the PEP author
is away, but eventually, discussion will die off, and restart when a
new version of the PEP is published. Of course, by then people will have
formed their biases, and you can do nothing about that.

Procedurally, there is a problem that this still isn't an
officially-posted PEP, even though it's already several days
past the deadline. OTOH, it's listed in the PEP parade. Still,
I would like to see a posted PEP rather sooner than later.
Defending the deadline will be necessary in the future, and
that will become more difficult (on grounds of fairness) if
some PEPs get accepted that had their first appearance on
python.org/peps/ way after the deadline.

Regards,
Martin

From talin at acm.org  Sun May  6 10:34:14 2007
From: talin at acm.org (Talin)
Date: Sun, 06 May 2007 01:34:14 -0700
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <463D8FB2.8050900@v.loewis.de>
References: <20070504143759.BIA64211@ms09.lnh.mail.rcn.net>
	<463D8FB2.8050900@v.loewis.de>
Message-ID: <463D9306.1040109@acm.org>

Martin v. L?wis wrote:
> Procedurally, there is a problem that this still isn't an
> officially-posted PEP, even though it's already several days
> past the deadline. OTOH, it's listed in the PEP parade. Still,
> I would like to see a posted PEP rather sooner than later.
> Defending the deadline will be necessary in the future, and
> that will become more difficult (on grounds of fairness) if
> some PEPs get accepted that had their first appearance on
> python.org/peps/ way after the deadline.

My vote would be to allow those people who have "reserved a spot" for a 
PEP before the deadline to be allowed to proceed, even if they didn't 
have an actual PEP in hand by that date. So in other words, the rule at 
this point should be "no new *topics* for 3.0".

I would also say that a real PEP should follow within a few weeks, and 
if not then I'd say go ahead and disqualify the PEP - i.e. you lose your 
"reserved" spot if you don't come up with an actual document within a 
reasonable time frame.

-- Talin

From hasan.diwan at gmail.com  Sun May  6 10:44:54 2007
From: hasan.diwan at gmail.com (Hasan Diwan)
Date: Sun, 6 May 2007 01:44:54 -0700
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
Message-ID: <2cda2fc90705060144p1d621fb0w835c5b32ba999d65@mail.gmail.com>

On 01/05/07, Raymond Hettinger <python at rcn.com> wrote:
>
> PEP:  Eliminating __del__


+1


-- 
Cheers,
Hasan Diwan <hasan.diwan at gmail.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-3000/attachments/20070506/5c01aede/attachment.html 

From greg.ewing at canterbury.ac.nz  Sun May  6 11:00:45 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 06 May 2007 21:00:45 +1200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <20070505181324.649A.JCARLSON@uci.edu>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
	<20070505181324.649A.JCARLSON@uci.edu>
Message-ID: <463D993D.2020107@canterbury.ac.nz>

Josiah Carlson wrote:

> There are many solutions to handling the scaling of Python on multicore
> processors, only one of which is killing the GIL.  Another is Greg
> Ewing's ideas offered in the "Ideas towards GIL removal" thread in the
> python-ideas list.

Yeah, except I think only one of those would actually work
(the "permanent objects" idea). The "thread-local refcount"
idea seems to have at least one fatal flaw.

I'm now more interested in the IBM "Recycler" idea that
was mentioned. If I get a spare moment, I might have a
go at implementing a "Repycler" by means of suitable
redefinitions of Py_INCREF and Py_DECREF.

--
Greg

From baptiste13 at altern.org  Sat May  5 13:07:23 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Sat, 05 May 2007 13:07:23 +0200
Subject: [Python-3000] PEP: Supporting Non-ASCII Identifiers
In-Reply-To: <463BBB0B.40703@v.loewis.de>
References: <46371BD2.7050303@v.loewis.de> <f1g65s$nar$1@sea.gmane.org>
	<463BBB0B.40703@v.loewis.de>
Message-ID: <463C656B.9070200@altern.org>

Martin v. L?wis a ?crit :
>> If this is to ever happen, it should be only accessible through a command-line
>> option to python. That way we make sure people are aware that they are making
>> their code incompatible with the larger world.
> 
> In what way will the source code be incompatible with the larger world?
> 
> Martin
> 
> !DSPAM:463be8e2237561355422449!
> 

I mean incompatible from a maintenance point of view.

Imagine your employer buys some Chinese company (or some Chinese company decides
to open source its software), and you end up maintaining code where identifiers
are each one Chinese character... Maybe this can be solved easily with a proper
IDE, though.

As a user of open source software, I would also hate to open the source file in
search of a bug, only to find out I can't even distinguish the identifiers from
one another. I'm sure big projects will have guidelines, but in my field
(physics), a lot of code is written by people with little programming background.

For this reason, I think using this feature should be a conscious decision at
the project level, and not just one developer finding out about the "cool new
feature" and starting to use it in his code without thinking much about the
consequences.

Cheers,
Baptiste

P.S.: I do believe this feature is nice in some cases, for example when teaching
programming to children.

From tomerfiliba at gmail.com  Sun May  6 19:03:43 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Sun, 6 May 2007 19:03:43 +0200
Subject: [Python-3000] comments
Message-ID: <1d85506f0705061003y51d62ddcm7a0ea91221a613c6@mail.gmail.com>

i finished reading almost all of the new peps, so to prevent cluttering
i'll post all my comments in a single message.


3130 (Access to Current Module/Class/Function)
------------------------------------------------
why make them keywords? they could as well be builtin functions,
like globals() and locals(). i.e., getmodule(), getclass(), and
getfunction(). these functions will just climb up the stack frames
until they find what you're asking for.
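
for what it's worth, getmodule() at least is only a few lines over
sys._getframe (a sketch -- getting at the enclosing class or function
object is harder and would need help from the compiler):

import sys

def getmodule():
    # look at the caller's frame and map its __name__ back to the module
    caller = sys._getframe(1)
    return sys.modules[caller.f_globals["__name__"]]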

also -- the class object is constructed only AFTER the code
of the class has finished executing, meaning getclass()
or __thisclass__ will not work at class level.

so the class mechanism needs to be changed as well.


3119 (Introducing Abstract Base Classes)
3141 (A Type Hierarchy for Numbers)
------------------------------------------------
these two are very closely related, so i'll treat them as one.
first, being able to override isinstance and issubclass is a
great addition. it would make proxies much more transparent,
and i could remove half of the black magic code from my RPC lib.

other than that -- it would be horrible -- as i'll now explain.
first, about the algebraic structures, such as fields, rings
and what not -- not all of us are mathematicians.

when john doe wants to write HisLeetNumber, i doubt he'll
be able understand all of the subtle differences. adding two
numbers does not require one to take Algebra 101.

second -- i have stressed that before but i hope this time it
may sound more convincing -- a type hierarchy is a foolish concept
that is not strong enough to convey *all* of the constraints one
may want to express, while being very rigid and *static*.
PJE's proposal seems the only suitable one, imho.

sure, duck typing by itself is not powerful enough to allow
constraints and adaptation -- but a new type hierarchy
is not gonna solve these issues.

to start with, it's python after all, not some statically compiled
type-checking language. i can still derive from Set and change
the signature of a method (def __len__(self, foobar)), and break
everything -- even though isinstance would approve. this may happen
because of a "malicious" coder or just by a blunt user.

so i hope this settles the case for "type safety". if you want to be
static, use java.

what you DO want is a way to distinguish between objects that "look
similar" to others. for instance, sequences and mappings both
"look similar" -- having __getitem__, __len__, __contains__,
and so on.

what you want is a way to say "this class is a sequence".
you can do that by inheriting from some BaseSequence, but sooner
or later, you'll end up with classes that derive from 10 different
bases, and maintaining such a class hierarchy will become very
time consuming and bug-prone.

besides, imagine that someone wrote his own sequence class, which
does not inherit from BaseSequence, but is otherwise a perfectly
good sequence. still -- you will not be able to use it, as isinstance
checks will fail. manually patching the MRO is impossible, and so
you have to start finding workarounds.

the solution, imo, would be in the form of contracts. the contract
is just a class that defines the interface, and documents how it
should behave. for instance, whether __add__ is commutative, etc.

by itself it may sound just like abstract base classes, but the
difference is that your class won't inherit from them -- rather it
would state that it *conforms* with them.

class MappingContract:
    implements(ContainerContract)

    def __getitem__(self, key):
        """if key is not found, raises KeyError"""
    def get(self, key, default = None):
        """returns self[key] if this does not raise KeyError,
         and the default value if it does"""
    def __contains__(self, key):
        """tests if the key exists in the mapping"""

class LeetDict:
    implements(MappingContract)
    implements(SomeOtherContract)

    def ...

ld = LeetDict()
isimplementing(ld, MappingContract) # True
isimplementing(ld, ContainerContract) # True
isimplementing(ld, SequenceContract) # False

this way, you'll never have conflicting base classes/metaclasses,
and still be able to express any functionality that you'll ever
want. again, with ABCs, classes would grow very large inheritance
trees, that at some point are bound to conflict/collide.
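
just to make the idea concrete, here's a minimal sketch of how implements()
and isimplementing() could be built on a simple per-class registry (purely
illustrative, in the spirit of the old zope.interface frame trick; the names
are the ones used above):

import sys

def implements(contract):
    # record the contract in the class body being executed (CPython-specific)
    ns = sys._getframe(1).f_locals
    contracts = ns.get("__contracts__")
    if contracts is None:
        contracts = set()
        ns["__contracts__"] = contracts
    contracts.add(contract)

def isimplementing(obj, contract):
    declared = getattr(type(obj), "__contracts__", set())
    if contract in declared:
        return True
    # one level of indirection: a declared contract may itself declare others
    return any(contract in getattr(c, "__contracts__", set()) for c in declared)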

moreover, contracts are more "declarative". LeetDict declares
it complies to some contract, rather than forcing it to have statically
inherited that contract as an ABC. we can, if the need arises, patch
a third-party class by declaring it complies with some contract,
which is in fact unknown to the third-party class.

this approach is also more extensible than ABCs:
* a metaclass/class decorator can be used to check at class
  creation time that all of the contracts are satisfied, etc.
* the contract may be any object. even just a big string that
  describes the contract textually
* but it may also be possible to describe complex requirements
  expressively with decorators; for example:

        class Number:
            @commutative
            @associative
            def __add__(self, other):
                "returns self + other"

allowing you to specify individual "properties" to each member
of the contract, so you don't have to know about fields and rings
just to implement an associative operation.

still, the contracts approach has no trouble tackling the suggested
Fields/Rings/Monads classification, should one desire to.


3129, 3127, 3117
------------------------------------------------
as for 3129 (Class Decorators) and 3127 (Integer Literal Support
and Syntax) -- it's about time we have these. and btw, the status
of pep 3117 ought to be changed to 'accepted'... it would have
more impact that way :)


-tomer

From jimjjewett at gmail.com  Sun May  6 19:33:52 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sun, 6 May 2007 13:33:52 -0400
Subject: [Python-3000] comments
In-Reply-To: <1d85506f0705061003y51d62ddcm7a0ea91221a613c6@mail.gmail.com>
References: <1d85506f0705061003y51d62ddcm7a0ea91221a613c6@mail.gmail.com>
Message-ID: <fb6fbf560705061033h49465f5yb59c47886afa4a13@mail.gmail.com>

On 5/6/07, tomer filiba <tomerfiliba at gmail.com> wrote:

> 3130 (Access to Current Module/Class/Function)
> ------------------------------------------------
> why make them keywords? they could as well be builtin functions,
> like globals() and locals(). i.e., getmodule(), getclass(), and
> getfunction(). these functions will just climb up the stack frames
> until they find what you're asking for.

Because I couldn't figure out how to do it after compile-time.

> also -- the class object is constructed only AFTER the code
> of the class has finished executing, meaning getclass()
> or __thisclass__ will not work at class level.

Correct, but it would work within methods of the class.  Functions
also don't exist while still being defined, and modules aren't fully
usable while being defined.

-jJ

From rasky at develer.com  Sun May  6 21:10:40 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Sun, 06 May 2007 21:10:40 +0200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
Message-ID: <f1l97g$5cm$1@sea.gmane.org>

On 05/05/2007 15.29, tomer filiba wrote:

> however, running a threaded python script over an 8-core
> machine, where you can utilize at most 12.5% of the horsepower,
> seems like too large a sacrifice to me.

You seem to believe that the only way to parallelize your programs is to use 
threads. IMHO, threads are just the most common and absolutely the worst, from 
many points of view.
-- 
Giovanni Bajo


From talin at acm.org  Sun May  6 23:19:01 2007
From: talin at acm.org (Talin)
Date: Sun, 06 May 2007 14:19:01 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <f1l97g$5cm$1@sea.gmane.org>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
	<f1l97g$5cm$1@sea.gmane.org>
Message-ID: <463E4645.5000503@acm.org>

Giovanni Bajo wrote:
> On 05/05/2007 15.29, tomer filiba wrote:
> 
>> however, running a threaded python script over an 8-core
>> machine, where you can utilize at most 12.5% of the horsepower,
>> seems like too large a sacrifice to me.
> 
> You seem to believe that the only way to parallelize your programs is to use 
> threads. IMHO, threads are just the most common and absolutely the worst, from 
> many points of view.

I think it's a case of wanting the most general mechanism for doing 
parallel computation. Any algorithm that can be efficiently parallelized 
using processes can also be done with threads (assuming that the 
infrastructure for threading is there), but the converse is not true.

-- Talin

From brett at python.org  Sun May  6 23:50:09 2007
From: brett at python.org (Brett Cannon)
Date: Sun, 6 May 2007 14:50:09 -0700
Subject: [Python-3000] Dealing with timestamp issues for rebuiling AST using
	Parser/asdl_c.py
Message-ID: <bbaeab100705061450w1d280d93oc591891681c5d4d@mail.gmail.com>

I am sending this email to make sure people are aware of a possible build
problem they might come up against that is unique to Python 3.0 and how to
deal with it.

I decided to do a ``make distclean`` and rebuild my p3yk checkout.  But I
came across the error of::

  File "./Parser/asdl_c.py", line 744
    print(auto_gen_msg, file=f)

Oops.  Turns out the Makefile executes 'python' which is 2.4.3 on my
machine; joys of bootstrapping the build process with Python.  After
touching Include/Python-ast.h and Parser/Python-ast.h I got p3yk to build.
But to make sure I had the newest auto-generated files I touched
Parser/asdl.py and got the same error.  Oops again.

So, I took my clean build of Py3K in my checkout and basically did what the
Makefile wanted to do, just with the proper Python version::

  ./python.exe Parser/asdl_c.py -h Include Parser/Python.asdl
  ./python.exe Parser/asdl_c.py -c Python Parser/Python.asdl

This all came about because I am reviewing Tony Lownds' patch for PEP 3113
which touches both the grammar and the AST.  I had to run the above
statements in the separate checkout I have for this patch using my pristine
copy of the p3yk branch.  Then, after a ``make clean`` the thing built
properly.

Hopefully this won't be a problem with source distributions of Python or
else there might be a flurry of emails and such about this error when people
really start trying to use Python 3.0.

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-3000/attachments/20070506/10a05316/attachment.htm 

From greg.ewing at canterbury.ac.nz  Mon May  7 03:40:10 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 07 May 2007 13:40:10 +1200
Subject: [Python-3000] PEP:  Eliminate __del__
In-Reply-To: <463E29A4.5010003@develer.com>
References: <001d01c78bc2$b9367a60$f001a8c0@RaymondLaptop1>
	<5.1.1.6.0.20070501120218.02d31250@sparrow.telecommunity.com>
	<f1c2pp$l30$1@sea.gmane.org> <463AB1DB.5010308@canterbury.ac.nz>
	<463B02EB.6060006@develer.com> <463D175E.1000201@canterbury.ac.nz>
	<463E29A4.5010003@develer.com>
Message-ID: <463E837A.3010905@canterbury.ac.nz>

Giovanni Bajo wrote:
> What I really meant was:
> 
>    self.__wr = weakref.ref(self, ...)

Okay, that looks better. But I'm not sure what will
happen if the holder becomes part of a cycle. If the
GC picks the holder as the object to clear to break
the cycle, then the weakref will be deallocated
before the holder, and the callback won't be called.
So it doesn't seem to be an improvement over __del__.

--
Greg


From jcarlson at uci.edu  Mon May  7 07:36:35 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 06 May 2007 22:36:35 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <463E4645.5000503@acm.org>
References: <f1l97g$5cm$1@sea.gmane.org> <463E4645.5000503@acm.org>
Message-ID: <20070506222840.25B2.JCARLSON@uci.edu>


Talin <talin at acm.org> wrote:
> Giovanni Bajo wrote:
> > On 05/05/2007 15.29, tomer filiba wrote:
> > 
> >> however, running a threaded python script over an 8-core
> >> machine, where you can utilize at most 12.5% of the horsepower,
> >> seems like too large a sacrifice to me.
> > 
> > You seem to believe that the only way to parallelize your programs is to use 
> > threads. IMHO, threads are just the most common and absolutely the worst, from 
> > many points of view.
> 
> I think it's a case of wanting the most general mechanism for doing 
> parallel computation. Any algorithm that can be efficiently parallelized 
> using processes can also be done with threads (assuming that the 
> infrastructure for threading is there), but the converse is not true.

The proposals to remove the GIL have been under the assumption that
shared memory processing using multiple threads is desired.  They also
presume that there will be some sort of locking mechanism on a
per-object basis so that objects won't be clobbered.

By going multi-process rather than multi-threaded, one generally removes
shared memory from the equation.  Note that this has the same effect as
using queues with threads, which is generally seen as the only way of
making threads "easy".  If one *needs* shared memory, we can certainly
create an mmap-based shared memory subsystem with fine-grained object
locking, or emulate it via a server process as the processing package
has done.

Seriously, give the processing package a try.  It's much faster than one
would expect.


 - Josiah


From martin at v.loewis.de  Mon May  7 07:35:41 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 07 May 2007 07:35:41 +0200
Subject: [Python-3000] Dealing with timestamp issues for rebuiling AST
 using	Parser/asdl_c.py
In-Reply-To: <bbaeab100705061450w1d280d93oc591891681c5d4d@mail.gmail.com>
References: <bbaeab100705061450w1d280d93oc591891681c5d4d@mail.gmail.com>
Message-ID: <463EBAAD.6070102@v.loewis.de>

>   File "./Parser/asdl_c.py", line 744
>     print(auto_gen_msg, file=f)

I think asdl_c.py should be formulated in a way
that is compatible with 2.x. It already uses
f.write in many places; the few remaining ones
should be updated.

Regards,
Martin

From tdelaney at avaya.com  Mon May  7 00:34:52 2007
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Mon, 7 May 2007 08:34:52 +1000
Subject: [Python-3000] [Python-Dev] Pre-pre PEP for 'super' keyword
Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1ED82@au3010avexu1.global.avaya.com>

Steve Holden wrote:

> Tim Delaney wrote:
>> BTW, one of my test cases involves multiple super calls in the same
>> method - there is a *very* large performance improvement by
>> instantiating it once. 
>> 
> And how does speed deteriorate for methods with no uses of super at
> all (which will, I suspect, be in the majority)?

Zero - in those cases, no super instance is instantiated. There is a
small one-time cost when the class is constructed in the reference
implementation (due to the need to parse the bytecode to determine if
'super' is used) but in the final implementation that information will
be gathered during compilation.

Tim Delaney

From nnorwitz at gmail.com  Mon May  7 08:10:39 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 6 May 2007 23:10:39 -0700
Subject: [Python-3000] Dealing with timestamp issues for rebuiling AST
	using Parser/asdl_c.py
In-Reply-To: <463EBAAD.6070102@v.loewis.de>
References: <bbaeab100705061450w1d280d93oc591891681c5d4d@mail.gmail.com>
	<463EBAAD.6070102@v.loewis.de>
Message-ID: <ee2a432c0705062310u66fd64afie0b20c7c8ee8f8b7@mail.gmail.com>

On 5/6/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> >   File "./Parser/asdl_c.py", line 744
> >     print(auto_gen_msg, file=f)
>
> I think asdl_c.py should be formulated in a way
> that is compatible with 2.x. It already uses
> f.write in many places; the few remaining ones
> should be updated.

This is the case since about 6 minutes before you sent your message. :-)

Date: Mon May  7 07:29:18 2007
New Revision: 55162

Modified:
  python/branches/p3yk/Parser/asdl.py
  python/branches/p3yk/Parser/asdl_c.py
  python/branches/p3yk/Parser/spark.py
Log:
Get asdl code gen working with Python 2.3.  Should continue to work with 3.0

n

From nnorwitz at gmail.com  Mon May  7 09:05:14 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 7 May 2007 00:05:14 -0700
Subject: [Python-3000] Can someone please make py3k* checkins go to the
	python-3000-checkins mailing list?
In-Reply-To: <ca471dc20705041610n69818041l883b5ed6ddefaee2@mail.gmail.com>
References: <ca471dc20705041610n69818041l883b5ed6ddefaee2@mail.gmail.com>
Message-ID: <ee2a432c0705070005q500efcaco78432defa787c223@mail.gmail.com>

On 5/4/07, Guido van Rossum <guido at python.org> wrote:
> I don't know how the filters for checkin emails are set up, but this
> seems wrong: mail related to the p3yk branch goes to
> python-3000-checkins, but mail related to the py3k-unistr branch goes
> to python-checkins. There are a bunch of branches of relevance to py3k
> now; these should all go to the python-3000-checkins list. I suggest
> to filter on branches that start with either py3k or with p3yk.

I've done that (more or less).  Here is the regex.  Please (re)name
your branches appropriately.

    ^python/branches/(p3yk/|py3k).*

n

From nnorwitz at gmail.com  Mon May  7 10:04:22 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 7 May 2007 01:04:22 -0700
Subject: [Python-3000] failing tests
Message-ID: <ee2a432c0705070104j7c71842dxf33307a1468cfeb5@mail.gmail.com>

There are 3* failing tests:
    test_compiler test_doctest test_transformer
* plus a few more when running on a 64-bit platform

These failures occurred before and after xrange checkin.

Do other people see these failures?  Any ideas when they started?

The doctest failures are due to no space at the end of the line (print
behavior change).  Not sure what to do about that now that we prevent
blanks at the end of lines from being checked in. :-)

n

From tomerfiliba at gmail.com  Mon May  7 13:08:04 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Mon, 7 May 2007 13:08:04 +0200
Subject: [Python-3000] new io (pep 3116)
Message-ID: <1d85506f0705070408h6b21540ej54888e97ad6854dc@mail.gmail.com>

my original idea about the new i/o foundation was more elaborate
than the pep, but i have to admit the pep is more feasible and
compact. some comments though:

writeline
-----------------------------
TextIOBase should grow a writeline() method, to be symmetrical
with readline(). the reason is simple -- the newline char is
configurable in the constructor, so it's not necessarily "\n".
so instead of adding the configurable newline char manually,
the user should call writeline() which would append the
appropriate newline automatically.
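
something like this, with a toy wrapper standing in for TextIOBase just to
show the intent (newline is whatever was passed to the constructor):

import io

class TextWrapper:
    def __init__(self, raw, newline="\n"):
        self.raw = raw
        self.newline = newline      # configurable, not hardwired to "\n"

    def write(self, text):
        self.raw.write(text)

    def writeline(self, text):
        # append the configured newline automatically
        self.write(text)
        self.write(self.newline)

buf = io.StringIO()
w = TextWrapper(buf, newline="\r\n")
w.writeline("hello")
print(repr(buf.getvalue()))         # 'hello\r\n'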

sockets
-----------------------------
iirc, SocketIO is a layer that wraps an underlying socket object.
that's a good distinction -- to separate the underlying socket from
the RawIO interface -- but don't forget socket objects,
by themselves, need a cleanup too.

for instance, there's no point in UDP sockets having listen(), or send()
or getpeername() -- with UDP you only ever use sendto and recvfrom.
on the other hand, TCP sockets make no use of sendto(). and even with
TCP sockets, listeners never use send() or recv(), while connected
sockets never use listen() or connect().

moreover, the current socket interface simply mimics the BSD
interface. setsockopt, getsockopt, et al, are very unpythonic by nature --
they ought to be exposed as properties or methods of the socket.
all in all, the current socket model is very low level with no high
level design.

some time ago i was working on a sketch for a new socket module
(called sock2) which had a clear distinction between connected sockets,
listener sockets and datagram sockets. each protocol was implemented
as a subclass of one of these base classes, and exposed only the
relevant methods. socket options were added as properties and
methods, and a new DNS module was added for dns-related queries.

you can see it here -- http://sebulba.wikispaces.com/project+sock2
i know it's late already, but i can write a PEP over the weekend,
or if someone else wants to carry on with the idea, that's fine
with me.

non-blocking IO
-----------------------------
the pep says "In order to put an object in object in non-blocking
mode, the user must extract the fileno and do it by hand."
but i think it would only lead to trouble. as the entire IO library
is being rethought from the ground up, non-blocking IO
should be taken into account.

non-blocking IO depends greatly on the platform -- and this is
exactly why a cross-platform language should standardize that
as part of the new IO layer. saying "let's keep it for later" would only
require more work at some later stage.

it's true that SyncIO and AsyncIO don't mingle well with the same
interfaces. that's why i think they should be two distinct classes.
the class hierarchy should be something like:

class RawIO:
    def fileno()
    def close()

class SyncIO(RawIO):
    def read(count)
    def write(data)

class AsyncIO(RawIO):
    def read(count, timeout)
    def write(data, timeout)
    def bgread(count, callback)
    def bgwrite(data, callback)

or something similar. there's no point in adding both sync and async
operations at the RawIO level -- they just won't work together.
we need to keep the two distinct.

buffering should only support SyncIO -- i also don't see much point
in having buffered async IO. it's mostly used for sockets and devices,
which are most likely to work with binary data structures rather than
text, and if you *require* non-blocking mode, buffering will only
get in your way.

if you really want a buffered AsyncIO stream, you could write a
compatibility layer that makes the underlying AsyncIO object
appear synchronous.

records
-----------------------------
another addition to the PEP that seems useful to me would be a
RecordIOBase/Wrapper. records are fixed-length binary data
structures, defined as format strings of the struct-module.

class RecordIOWrapper:
    def __init__(self, buffer, format)
    def read(self) -> tuple of fields
    def write(self, *fields)
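
the whole thing is only a few lines on top of struct; a sketch, with a
BytesIO standing in for the underlying stream:

import io
import struct

class RecordIO:
    def __init__(self, stream, format):
        self.stream = stream
        self.format = format
        self.size = struct.calcsize(format)

    def read(self):
        # one fixed-size record -> tuple of fields
        return struct.unpack(self.format, self.stream.read(self.size))

    def write(self, *fields):
        self.stream.write(struct.pack(self.format, *fields))

buf = io.BytesIO()
rec = RecordIO(buf, "!BL")
rec.write(1, 1000)
buf.seek(0)
print(rec.read())                   # (1, 1000)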

another cool feature i can think of is "multiplexing",  or working
with the same underlying stream in different ways by having multiple
wrappers over it.

for example, to implement a type-length-value stream, which is very
common in communication protocols, one could do something like

class MultiplexedIO:
    def __init__(self, *streams):
        self.streams = itertools.cycle(streams)
    def read(self, *args):
        """read from the next stream each time it's called"""
        return self.streams.next().read(*args)

sock = BufferedRW(SocketIO(...))
tlrec = Record(sock, "!BL")
tlv = MultiplexedIO(tlrec, sock)

type, length = tlv.read()
value = tlv.read(length)

you can also build higher-level state machines with that -- for instance,
if the type was "int", the next call to read() would decode the value as
an integer, and so on. you could write parsers right on top of the IO
layer.

just an idea. i'm not sure if that's proper design or just a silly idea,
but we'll leave that to the programmer.


-tomer

From tomerfiliba at gmail.com  Mon May  7 13:21:39 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Mon, 7 May 2007 13:21:39 +0200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <463D280F.6070101@acm.org>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
	<463D280F.6070101@acm.org>
Message-ID: <1d85506f0705070421n1d083f47ge0bcfca2a27af8f9@mail.gmail.com>

[Talin]
> Note that Jython and IronPython don't have the same restrictions in this
> regard as CPython. Both VMs are able to run in multiprocessing
> environments. (I don't know whether or not Jython/IronPython even have a
> GIL or not.)

they don't. they rely on jvm/clr for GC, and probably per-thread locking when
they touch global data.

[Giovanni Bajo]
> You seem to believe that the only way to parallelize your programs is to use
> threads. IMHO, threads is just the most common and absolutely the worst, under
> many points of views.

not at all. personally i hate threads, but there are many places where you can
use them properly to distribute workload -- without mutual dependencies or
shared state. this makes them essentially like light-weight processes, using
background workers and queues, etc., only without the overhead of multiple
processes.

there could be a stdlib threading module that would provide you with all
kinds of queues, schedulers, locks, and decorators, so you wouldn't have
to manually lock things every time.


-tomer

From daniel at stutzbachenterprises.com  Mon May  7 15:32:06 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Mon, 7 May 2007 08:32:06 -0500
Subject: [Python-3000] new io (pep 3116)
In-Reply-To: <1d85506f0705070408h6b21540ej54888e97ad6854dc@mail.gmail.com>
References: <1d85506f0705070408h6b21540ej54888e97ad6854dc@mail.gmail.com>
Message-ID: <eae285400705070632p4731ac18x523112927ed945c1@mail.gmail.com>

On 5/7/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> for instance, there's no point in UDP sockets having listen(), or send()
> or getpeername() -- with UDP you only ever use sendto and recvfrom.
> on the other hand,

Actually, you can connect() UDP sockets, and then you can use send(),
recv(), and getpeername().
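
For instance (nothing needs to be listening on the other end for this to run):

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect(("127.0.0.1", 9999))   # just fixes the default peer address
print(s.getpeername())           # ('127.0.0.1', 9999)
s.send(b"ping")                  # no explicit address needed, unlike sendto()
s.close()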

> TCP sockets make no use of sendto(). and even with
> TCP sockets, listeners never use send() or recv(), while connected
> sockets never use listen() or connect().

Agreed.

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

From guido at python.org  Mon May  7 16:28:10 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 07:28:10 -0700
Subject: [Python-3000] failing tests
In-Reply-To: <ee2a432c0705070104j7c71842dxf33307a1468cfeb5@mail.gmail.com>
References: <ee2a432c0705070104j7c71842dxf33307a1468cfeb5@mail.gmail.com>
Message-ID: <ca471dc20705070728q2ce99be4x1b427b34543827f9@mail.gmail.com>

Thanks for checking in xrange!!!!! Woot!

test_compiler and test_transformer are waiting for someone to clean up
the compiler package (I forget what it doesn't support, perhaps only
nonlocal needs to be added.)

Looks like you diagnosed the doctest failure correctly. This is
probably because, when print changed into print(), lines ending in
spaces are generated in some cases:

  # Py 2 code, writes "42\n"
  print 42,
  print

  # Py3k automatically translated, writes "42 \n"
  print(42, end=" ")
  print()

I'm afraid we'll have to track down the places where this affects the
doctest and fix them. (Fixing the doctest is possible too, though less
elegant: just add <space> \n\ to the end of the line.)

--Guido

On 5/7/07, Neal Norwitz <nnorwitz at gmail.com> wrote:
> There are 3* failing tests:
>     test_compiler test_doctest test_transformer
> * plus a few more when running on a 64-bit platform
>
> These failures occurred before and after xrange checkin.
>
> Do other people see these failures?  Any ideas when they started?
>
> The doctest failures are due to no space at the end of the line (print
> behavior change).  Not sure what to do about that now that we prevent
> blanks at the end of lines from being checked in. :-)
>
> n
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jimjjewett at gmail.com  Mon May  7 17:47:00 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 7 May 2007 11:47:00 -0400
Subject: [Python-3000] updated PEP3126: Remove Implicit String Concatenation
Message-ID: <fb6fbf560705070847i7c5af87dr912de765d063b8eb@mail.gmail.com>

Rewritten -- please tell me if there are any concerns I have missed.

And of course, please tell me if you have a suggestion for the open
issue -- how to better support external internationalization tools, or
at least xgettext in particular.

-jJ

-----------------------------------

PEP: 3126
Title: Remove Implicit String Concatenation
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett <JimJJewett at gmail.com>,
        Raymond D. Hettinger <python at rcn.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Apr-2007
Post-History: 29-Apr-2007, 30-Apr-2007, 07-May-2007


Abstract
========

Python inherited many of its parsing rules from C.  While this has
been generally useful, there are some individual rules which are less
useful for Python, and should be eliminated.

This PEP proposes to eliminate implicit string concatenation based
only on the adjacency of literals.

Instead of::

    "abc" "def" == "abcdef"

authors will need to be explicit, and either add the strings::

    "abc" + "def" == "abcdef"

or join them::

    "".join(["abc", "def"]) == "abcdef"


Motivation
==========

One goal for Python 3000 should be to simplify the language by
removing unnecessary features.  Implicit string concatenation should
be dropped in favor of existing techniques. This will simplify the
grammar and simplify a user's mental picture of Python.  The latter is
important for letting the language "fit in your head".  A large group
of current users do not even know about implicit concatenation.  Of
those who do know about it, a large portion never use it or habitually
avoid it. Of those who both know about it and use it, very few could
state with confidence the precedence of the implicit operator, or say
under what circumstances the concatenation happens when the definition
is compiled versus when it is run.


History or Future
-----------------

Many Python parsing rules are intentionally compatible with C.  This
is a useful default, but Special Cases need to be justified based on
their utility in Python.  We should no longer assume that Python
programmers will also be familiar with C, so compatibility between
languages should be treated as a tie-breaker, rather than a
justification.

In C, implicit concatenation is the only way to join strings without
using a (run-time) function call to store into a variable.  In Python,
the strings can be joined (and still recognized as immutable) using
more standard Python idioms, such as ``+`` or ``"".join``.


Problem
-------

Implicit string concatenation leads to tuples and lists which are
shorter than they appear; this in turn can lead to confusing, or even
silent, errors.  For example, given a function which accepts several
parameters, but offers a default value for some of them::

    def f(fmt, *args):
        print fmt % args

This looks like a valid call, but isn't::

    >>> f("User %s got a message %s",
          "Bob"
          "Time for dinner")

    Traceback (most recent call last):
      File "<pyshell#8>", line 2, in <module>
        "Bob"
      File "<pyshell#3>", line 2, in f
        print fmt % args
    TypeError: not enough arguments for format string


Calls to this function can silently do the wrong thing::

    def g(arg1, arg2=None):
        ...

    # silently transformed into the possibly very different
    # g("arg1 on this linearg2 on this line", None)
    g("arg1 on this line"
      "arg2 on this line")

To quote Jason Orendorff [#Orendorff]_

    Oh.  I just realized this happens a lot out here.  Where I work,
    we use scons, and each SConscript has a long list of filenames::

        sourceFiles = [
            'foo.c'
            'bar.c',
            #...many lines omitted...
            'q1000x.c']

    It's a common mistake to leave off a comma, and then scons
    complains that it can't find 'foo.cbar.c'.  This is pretty
    bewildering behavior even if you *are* a Python programmer,
    and not everyone here is.


Solution
========

In Python, strings are objects and they support the __add__ operator,
so it is possible to write::

    "abc" + "def"

Because these are literals, this addition can still be optimized away
by the compiler; the CPython compiler already does so.
[#rcn-constantfold]_
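
A quick illustrative check (the exact bytecode varies by CPython version)
is to disassemble a function that adds two literals::

    import dis

    def f():
        return "abc" + "def"

    # The peephole optimizer typically folds the addition into a single
    # LOAD_CONST 'abcdef' instead of emitting a runtime add.
    dis.dis(f)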

Other existing alternatives include multiline (triple-quoted) strings,
and the join method::

    """This string
       extends across
       multiple lines, but you may want to use something like
       Textwrap.dedent
       to clear out the leading spaces
       and/or reformat.
    """


    >>> "".join(["empty", "string", "joiner"]) == "emptystringjoiner"
    True

    >>> " ".join(["space", "string", "joiner"]) == "space string joiner"
    True

    >>> "\n".join(["multiple", "lines"]) == "multiple\nlines" == (
    """multiple
    lines""")
    True


Concerns
========


Operator Precedence
-------------------

Guido indicated [#rcn-constantfold]_ that this change should be
handled by PEP, because there were a few edge cases with other string
operators, such as the %.  (Assuming that str % stays -- it may be
eliminated in favor of PEP 3101 -- Advanced String Formatting.
[#PEP3101]_ [#elimpercent]_)

The resolution is to use parentheses to enforce precedence -- the same
solution that can be used today::

    # Clearest, works today, continues to work, optimization is
    # already possible.
    ("abc %s def" + "ghi") % var

    # Already works today; precedence makes the optimization more
    # difficult to recognize, but does not change the semantics.
    "abc" + "def %s ghi" % var

as opposed to::

    # Already fails because modulus (%) is higher precedence than
    # addition (+)
    ("abc %s def" + "ghi" % var)

    # Works today only because adjacency is higher precedence than
    # modulus.  This will no longer be available.
    "abc %s" "def" % var

    # So the 2-to-3 translator can automatically replace it with the
    # (already valid):
    ("abc %s" + "def") % var


Long Commands
-------------

    ... build up (what I consider to be) readable SQL queries [#skipSQL]_::

        rows = self.executesql("select cities.city, state, country"
                               "    from cities, venues, events, addresses"
                               "    where cities.city like %s"
                               "      and events.active = 1"
                               "      and venues.address = addresses.id"
                               "      and addresses.city = cities.id"
                               "      and events.venue = venues.id",
                               (city,))

Alternatives again include triple-quoted strings, ``+``, and ``.join``::

    query="""select cities.city, state, country
                 from cities, venues, events, addresses
                 where cities.city like %s
                   and events.active = 1
                   and venues.address = addresses.id
                   and addresses.city = cities.id
                   and events.venue = venues.id"""

    query=( "select cities.city, state, country"
          + "    from cities, venues, events, addresses"
          + "    where cities.city like %s"
          + "      and events.active = 1"
          + "      and venues.address = addresses.id"
          + "      and addresses.city = cities.id"
          + "      and events.venue = venues.id"
          )

    query="\n".join(["select cities.city, state, country",
                     "    from cities, venues, events, addresses",
                     "    where cities.city like %s",
                     "      and events.active = 1",
                     "      and venues.address = addresses.id",
                     "      and addresses.city = cities.id",
                     "      and events.venue = venues.id"])

    # And yes, you *could* inline any of the above querystrings
    # the same way the original was inlined.
    rows = self.executesql(query, (city,))


Regular Expressions
-------------------

Complex regular expressions are sometimes stated in terms of several
implicitly concatenated strings with each regex component on a
different line and followed by a comment.  The plus operator can be
inserted here but it does make the regex harder to read.  One
alternative is to use the re.VERBOSE option.  Another alternative is
to build-up the regex with a series of += lines::

    # Existing idiom which relies on implicit concatenation
    r = ('a{20}'  # Twenty A's
         'b{5}'   # Followed by Five B's
         )

    # Mechanical replacement
    r = ('a{20}'  +# Twenty A's
         'b{5}'   # Followed by Five B's
         )

    # already works today
    r = '''a{20}  # Twenty A's
           b{5}   # Followed by Five B's
        '''                 # Compiled with the re.VERBOSE flag

    # already works today
    r = 'a{20}'   # Twenty A's
    r += 'b{5}'   # Followed by Five B's


Internationalization
--------------------

Some internationalization tools -- notably xgettext -- have already
been special-cased for implicit concatenation, but not for Python's
explicit concatenation. [#barryi8]_

These tools will fail to extract the (already legal)::

    _("some string" +
      " and more of it")

but often have a special case for::

    _("some string"
      " and more of it")

It should also be possible to just use an overly long line (xgettext
limits messages to 2048 characters [#xgettext2048]_, which is less
than Python's enforced limit) or triple-quoted strings, but these
solutions sacrifice some readability in the code::

    # Lines over a certain length are unpleasant.
    _("some string and more of it")

    # Changing whitespace is not ideal.
    _("""Some string
         and more of it""")
    _("""Some string
    and more of it""")
    _("Some string \
    and more of it")

I do not see a good short-term resolution for this.


Transition
==========

The proposed new constructs are already legal in current Python, and
can be used immediately.

The 2 to 3 translator can be made to mechanically change::

    "str1" "str2"
    ("line1"  #comment
     "line2")

into::

    ("str1" + "str2")
    ("line1"   +#comments
     "line2")

If users want to use one of the other idioms, they can; as these
idioms are all already legal in Python 2, the edits can be made
to the original source, rather than patching up the translator.


Open Issues
===========

Is there a better way to support external text extraction tools, or at
least ``xgettext`` [#gettext]_ in particular?


References
==========

..  [#Orendorff] Implicit String Concatenation, Orendorff
    http://mail.python.org/pipermail/python-ideas/2007-April/000397.html

..  [#rcn-constantfold] Reminder: Py3k PEPs due by April, Hettinger,
    van Rossum
    http://mail.python.org/pipermail/python-3000/2007-April/006563.html

..  [#PEP3101] PEP 3101, Advanced String Formatting, Talin
    http://www.python.org/peps/pep-3101.html

..  [#elimpercent] ps to question Re: Need help completing ABC pep,
    van Rossum
    http://mail.python.org/pipermail/python-3000/2007-April/006737.html

..  [#skipSQL] (email Subject) PEP 30XZ: Simplified Parsing, Skip,
    http://mail.python.org/pipermail/python-3000/2007-May/007261.html

..  [#barryi8] (email Subject) PEP 30XZ: Simplified Parsing
    http://mail.python.org/pipermail/python-3000/2007-May/007305.html

..  [#gettext] GNU gettext manual
    http://www.gnu.org/software/gettext/

..  [#xgettext2048] Unix man page for xgettext -- Notes section
    http://www.scit.wlv.ac.uk/cgi-bin/mansec?1+xgettext


Copyright
=========

    This document has been placed in the public domain.



..
    Local Variables:
    mode: indented-text
    indent-tabs-mode: nil
    sentence-end-double-space: t
    fill-column: 70
    coding: utf-8
    End:

From antoine.pitrou at wengo.com  Mon May  7 17:27:41 2007
From: antoine.pitrou at wengo.com (Antoine Pitrou)
Date: Mon, 07 May 2007 17:27:41 +0200
Subject: [Python-3000]  PEP:  Eliminate __del__
Message-ID: <1178551661.8251.16.camel@antoine-ubuntu>


FWIW and in light of the thread on removing __del__ from the language, I
just posted Yet Another Recipe for automatic finalization:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/519621

It allows writing a finalizer as a single __finalize__ method, at the
cost of explicitly calling an enable_finalizer() method with the list of
attributes to keep alive on the "ghost object".

Antoine.



From ncoghlan at gmail.com  Mon May  7 18:17:57 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 08 May 2007 02:17:57 +1000
Subject: [Python-3000] failing tests
In-Reply-To: <ca471dc20705070728q2ce99be4x1b427b34543827f9@mail.gmail.com>
References: <ee2a432c0705070104j7c71842dxf33307a1468cfeb5@mail.gmail.com>
	<ca471dc20705070728q2ce99be4x1b427b34543827f9@mail.gmail.com>
Message-ID: <463F5135.2090007@gmail.com>

Guido van Rossum wrote:
> Thanks for checking in xrange!!!!! Woot!
> 
> test_compiler and test_transformer are waiting for someone to clean up
> the compiler package (I forget what it doesn't support, perhaps only
> nonlocal needs to be added.)

It's definitely lagging on set comprehensions as well. I'm also pretty 
sure those two tests broke before nonlocal was added, as they were 
already broken when I started helping Georg in looking at the setcomp 
updates.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From collinw at gmail.com  Mon May  7 19:08:34 2007
From: collinw at gmail.com (Collin Winter)
Date: Mon, 7 May 2007 10:08:34 -0700
Subject: [Python-3000] PEP 3129: Class Decorators
Message-ID: <43aa6ff70705071008q6a33e00eq7e5073dba5fa07e@mail.gmail.com>

Can I go ahead and mark PEP 3129 as "accepted"?

From steven.bethard at gmail.com  Mon May  7 19:21:31 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 7 May 2007 11:21:31 -0600
Subject: [Python-3000] new io (pep 3116)
In-Reply-To: <1d85506f0705070408h6b21540ej54888e97ad6854dc@mail.gmail.com>
References: <1d85506f0705070408h6b21540ej54888e97ad6854dc@mail.gmail.com>
Message-ID: <d11dcfba0705071021n1cbbe25cgf8bd535d477eeb16@mail.gmail.com>

On 5/7/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> some time ago i was working on a sketch for a new socket module
> (called sock2) which had a clear distinction between connected sockets,
> listener sockets and datagram sockets. each protocol was implemented
> as a subclass of one of these base classes, and exposed only the
> relevant methods. socket options were added as properties and
> methods, and a new DNS module was added for dns-related queries.
>
> you can see it here -- http://sebulba.wikispaces.com/project+sock2
> i know it's late already, but i can write a PEP over the weekend,

It's not too late for standard library PEPs, only PEPs that change the
core language. Since your proposal here would presumably replace the
socket module, I assume it counts as a stdlib change.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From guido at python.org  Mon May  7 19:33:07 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 10:33:07 -0700
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <eae285400705041844l1dbecbe2o6d7858b0fde8e33@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<463AB2E1.2030408@canterbury.ac.nz>
	<A7B470E4-DABA-4223-8E97-237249459FCD@gmail.com>
	<463BDC9B.2030500@canterbury.ac.nz>
	<eae285400705041844l1dbecbe2o6d7858b0fde8e33@mail.gmail.com>
Message-ID: <ca471dc20705071033j13761ad1t9fc75dddcda3c0ce@mail.gmail.com>

On 5/4/07, Daniel Stutzbach <daniel at stutzbachenterprises.com> wrote:
> On 5/4/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > I don't think that returning the type given is a goal
> > that should be attempted, because it can only ever work
> > for a fixed set of known types. Given an arbitrary
> > sequence type, there is no way of knowing how to
> > create a new instance of it with specified contents.
>
> For objects that support the sequence protocol, how about specifying that:
>
> a, *b = container_object
>
> must be equivalent to:
>
> a, b = container_object[0], container_object[1:]
>
> That way, b is assigned whatever container_object's getslice method
> returns.  A list will return a list, a tuple will return a tuple, and
> widgets (or BLists...) can return whatever makes sense for them.

And what do you return when it doesn't support the container protocol?

Think about the use cases. It seems that *your* use case is some kind
of (car, cdr) splitting known from Lisp and from functional languages
(Haskell is built out of this idiom it seems from the examples). But
in Python, if you want to loop over one of those things, you ought to
use a for-loop; and if you really want a car/cdr split, explicitly
using the syntax you show above (x[0], x[1:]) is fine.

The important use case in Python for the proposed semantics is when
you have a variable-length record, the first few items of which are
interesting, and the rest of which is less so, but not unimportant.
(If you wanted to throw the rest away, you'd just write a, b, c =
x[:3] instead of a, b, c, *d = x.) It is much more convenient for this
use case if the type of d is fixed by the operation, so you can count
on its behavior.
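
For instance (a minimal sketch of the proposed semantics, with made-up
data; the starred target always gets a list):

  record = ("Bob", "bob@example.com", "2007-05-07", "extra-1", "extra-2")
  name, email, date, *rest = record
  assert rest == ["extra-1", "extra-2"]  # a list, whatever the input was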

There's a bug in the design of filter() in Python 2 (which will be
fixed in 3.0 by turning it into an iterator BTW): if the input is a
tuple, the output is a tuple too, but if the input is a list *or
anything else*, the output is a list.  That's a totally insane
signature, since it means that you can't count on the result being a
list, *nor* on it being a tuple -- if you need it to be one or the
other, you have to convert it to one, which is a waste of time and
space. Please let's not repeat this design bug.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  7 19:42:41 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 10:42:41 -0700
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <20070505124008.648D.JCARLSON@uci.edu>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
Message-ID: <ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>

[+python-3000; replies please remove python-dev]

On 5/5/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Fred L. Drake, Jr." <fdrake at acm.org> wrote:
> >
> > On Saturday 05 May 2007, Aahz wrote:
> >  > I'm with MAL and Fred on making literals immutable -- that's safe and
> >  > lots of newbies will need to use byte literals early in their Python
> >  > experience if they pick up Python to operate on network data.
> >
> > Yes; there are lots of places where bytes literals will be used the way str
> > literals are today.  buffer(b'...') might be good enough, but it seems more
> > than a little idiomatic, and doesn't seem particularly readable.
> >
> > I'm not suggesting that /all/ literals result in constants, but bytes literals
> > seem like a case where what's wanted is the value.  If b'...' results in a
> > new object on every reference, that's a lot of overhead for a network
> > protocol implementation, where the data is just going to be written to a
> > socket or concatenated with other data.  An immutable bytes type would be
> > very useful as a dictionary key as well, and more space-efficient than
> > tuple(b'...').
>
> I was saying the exact same thing last summer.  See my discussion with
> Martin about parsing/unmarshaling.  What I expect will happen with bytes
> as dictionary keys is that people will end up subclassing dictionaries
> (with varying amounts of success and correctness) to do something like
> the following...
>
>     class bytesKeys(dict):
>         ...
>         def __setitem__(self, key, value):
>             if isinstance(key, bytes):
>                 key = key.decode('latin-1')
>             else:
>                 raise KeyError("only bytes can be used as keys")
>             dict.__setitem__(self, key, value)
>         ...
>
> Is it optimal?  No.  Would it be nice to have immtable bytes?  Yes.  Do
> I think it will really be a problem in parsing/unmarshaling?  I don't
> know, but the fact that there now exists a reasonable literal syntax b'...'
> rather than the previous bytes([1, 2, 3, ...]) means that we are coming
> much closer to having what really is about the best way to handle this;
> Python 2.x str.

I don't know how this will work out yet. I'm not convinced that having
both mutable and immutable bytes is the right thing to do; but I'm
also not convinced of the opposite. I am slowly working on the
string/unicode unification, and so far, unfortunately, it is quite
daunting to get rid of 8-bit strings even at the Python level let
alone at the C level.

I suggest that the following exercise, to be carried out in the
py3k-struni branch, might be helpful: (1) change the socket module to
return bytes instead of strings (it already takes bytes, by virtue of
the buffer protocol); (2) change its makefile() method so that it uses
the new io.py library, in particular the SocketIO wrapper there; (3)
fix up the httplib module and perhaps other similar ones. Take copious
notes while doing this. Anyone up for this? I will listen! (I'd do it
myself but I don't know where I'd find the time).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  7 19:45:40 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 10:45:40 -0700
Subject: [Python-3000] PEP 3112
In-Reply-To: <463D8800.1010906@v.loewis.de>
References: <463D8800.1010906@v.loewis.de>
Message-ID: <ca471dc20705071045o509dd13bs93e7495e2fd3d288@mail.gmail.com>

On 5/6/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> I just read PEP 3112, and I believe it contains a
> flaw/underspecification.
>
> It says
>
> # Each shortstringchar or longstringchar must be a character between 1
> # and 127 inclusive, regardless of any encoding declaration [2] in the
> # source file.
>
> What does that mean? In particular, what is "a character between 1 and
> 127"?
>
> Assuming this refers to ordinal values in some encoding: what encoding?
> It's particularly puzzling that it says "regardless of any encoding
> declaration of the source file".
>
> I fear (but hope that I'm wrong) that this was meant to mean "use the
> bytes as they are stored on disk in the source file". If so: is the
> attached file valid Python? In case your editor can't render it: it
> reads
>
> #! -*- coding: iso-2022-jp -*-
> a = b"?????"
>
> But if you look at the file with a hex editor, you see it contains
> only bytes between 1 and 127.
>
> I would hope that this code is indeed ill-formed (i.e. that
> the byte representation on disk is irrelevant, and only the
> Unicode ordinals of the source characters matter)
>
> If so, can the specification please be updated to clarify that
> 1. in Grammar changes: Each shortstringchar or longstringchar must
>    be a character whose Unicode ordinal value is between 1 and
>    127 inclusive.
> 2. in Semantics: The bytes in the new object are obtained as if
>    encoding a string literal with "iso-8859-1"

Sounds like a good fix to me; I agree that bytes literals, like
Unicode literals, should not vary depending on the source encoding. In
step 2, can't you use "ascii" as the encoding?
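
In other words (a minimal sketch of the intended semantics), no matter
what coding declaration the source file carries, the following should
hold:

  assert b"abc" == "abc".encode("ascii")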

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  7 19:49:36 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 10:49:36 -0700
Subject: [Python-3000] comments
In-Reply-To: <1d85506f0705061003y51d62ddcm7a0ea91221a613c6@mail.gmail.com>
References: <1d85506f0705061003y51d62ddcm7a0ea91221a613c6@mail.gmail.com>
Message-ID: <ca471dc20705071049y3cd8a744s536285e8f5c60301@mail.gmail.com>

On 5/6/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> i finished reading almost all of the new peps, so to prevent cluttering
> i'll post all my comments in a single message.

Please don't do that -- it leads to multiple discussions going on in
the same email thread, and that's really hard to keep track of (as I
learned after posting my "PEP parade" email).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  7 19:58:45 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 10:58:45 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
Message-ID: <ca471dc20705071058n25a21acfvaca8e4979edfa404@mail.gmail.com>

On 5/5/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> i have to admit i've been neglecting the list in the past few months,
> and i don't know whether the issue i want to bring up has been
> settled already.

It's been settled by default -- nobody submitted a PEP to kill the GIL
in time for the April 30 deadline, and I won't accept one now.

> as you all may have noticed, multicore processors are becoming
> more and more common in all kinds of machines, from desktops
> to servers, and will surely become more prevalent with time,
> as all major CPU vendors plan to ship 8-core processors
> by mid-2008.
>
> back in the day of uniprocessor machines, having the GIL really
> made life simpler and the sacrifice was negligible.
>
> however, running a threaded python script over an 8-core
> machine, where you can utilize at most 12.5% of the horsepower,
> seems like too large a sacrifice to me.
>
> the only way to overcome this with cpython is to Kill The GIL (TM),
> and since it's a very big architectural change, it ought to happen
> soon. pushing it further than version 3.0 means all library authors
> would have to adapt their code twice (once to make it compatible
> with 3.0, and then again to make it thread safe).

Here's something I wrote recently to someone (a respected researcher)
who has a specific design in mind to kill the GIL (rather than an
agenda without a plan).

"""
Briefly, the reason why it's so hard to get rid of the GIL
is that this Python implementation uses reference
counting as its primary GC approach (there's a cycle-traversing GC
algorithm bolted on the side, but most objects are reclaimed by
refcounting). In Python, everything is an object (even small integers
and characters), and many objects are conceptually immutable, allowing
free sharing of objects as values between threads. There is no marking
of objects as "local to a thread" or "local to a frame" -- that would
be a totally alien concept. All objects have a refcount field (a long
at the front of the object structure) and this sees a lot of traffic.
As C doesn't have an atomic increment nor an atomic
decrement-and-test, the INCREF and DECREF macros sprinkled throughout
the code (many thousands of them) must be protected by some lock.

Around '99 Greg Stein and Mark Hammond tried to get rid of the GIL.
They removed most of the global mutable data structures, added
explicit locks to the remaining ones and to individual mutable
objects, and actually got the whole thing working. Unfortunately even
on the system with the fastest locking primitives (Windows at the
time) they measured a 2x slow-down on a single CPU due to all the
extra locking operations going on.

Good luck fixing this! My personal view on it is that it's not worth
it. If you want to run Python on a multiprocessor, you're much better
off finding a way to break the application off into multiple processes
each running a single CPU-bound thread and any number of I/O-bound
threads; alternatively, if you cannot resist the urge for multiple
CPU-bound threads, you can use one of the Python implementations built
on inherently multi-threading frameworks, i.e. Jython or IronPython.
But I'd be happy to be proven wrong, if only because this certainly is
a recurring heckle whenever I give a talk about Python anywhere.
"""

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  7 20:07:54 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 11:07:54 -0700
Subject: [Python-3000] updated PEP3126: Remove Implicit String
	Concatenation
In-Reply-To: <fb6fbf560705070847i7c5af87dr912de765d063b8eb@mail.gmail.com>
References: <fb6fbf560705070847i7c5af87dr912de765d063b8eb@mail.gmail.com>
Message-ID: <ca471dc20705071107n18ac4b7bsa674b389733bce77@mail.gmail.com>

Committed revision 55172.

For the record, I'm more and more -1 on this (and on its companion to
remove \ line continuation). These seem pretty harmless features that
serve a purpose; those of us who don't like them can avoid them.

--Guido

On 5/7/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> Rewritten -- please tell me if there are any concerns I have missed.
>
> And of course, please tell me if you have a suggestion for the open
> issue -- how to better support external internationalization tools, or
> at least xgettext in particular.
>
> -jJ
>
> -----------------------------------
>
> PEP: 3126
> Title: Remove Implicit String Concatenation
> Version: $Revision$
> Last-Modified: $Date$
> Author: Jim J. Jewett <JimJJewett at gmail.com>,
>         Raymond D. Hettinger <python at rcn.com>

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  7 20:10:30 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 11:10:30 -0700
Subject: [Python-3000] failing tests
In-Reply-To: <463F5135.2090007@gmail.com>
References: <ee2a432c0705070104j7c71842dxf33307a1468cfeb5@mail.gmail.com>
	<ca471dc20705070728q2ce99be4x1b427b34543827f9@mail.gmail.com>
	<463F5135.2090007@gmail.com>
Message-ID: <ca471dc20705071110n26989e7cs9e54cb1736ad8a65@mail.gmail.com>

On 5/7/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Guido van Rossum wrote:
> > Thanks for checking in xrange!!!!! Woot!
> >
> > test_compiler and test_transformer are waiting for someone to clean up
> > the compiler package (I forget what it doesn't support, perhaps only
> > nonlocal needs to be added.)
>
> It's definitely lagging on set comprehensions as well. I'm also pretty
> sure those two tests broke before nonlocal was added, as they were
> already broken when I started helping Georg in looking at the setcomp
> updates.

I  just fixed the doctest failures; but for the compiler package I
need help. Would you have the time?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  7 20:12:40 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 11:12:40 -0700
Subject: [Python-3000] PEP 3129: Class Decorators
In-Reply-To: <43aa6ff70705071008q6a33e00eq7e5073dba5fa07e@mail.gmail.com>
References: <43aa6ff70705071008q6a33e00eq7e5073dba5fa07e@mail.gmail.com>
Message-ID: <ca471dc20705071112y103d7ea1v68711794718e3fbf@mail.gmail.com>

On 5/7/07, Collin Winter <collinw at gmail.com> wrote:
> Can I go ahead and mark PEP 3129 as "accepted"?

Almost. I'm ok with it, but I think that to follow the procedure you
ought to post the full text at least once on python-3000, so you can
add the date to the "Post-History" header. In the mean time, I think
it would be fine to start on the implementation!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From daniel at stutzbachenterprises.com  Mon May  7 20:13:48 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Mon, 7 May 2007 13:13:48 -0500
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <ca471dc20705071033j13761ad1t9fc75dddcda3c0ce@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<463AB2E1.2030408@canterbury.ac.nz>
	<A7B470E4-DABA-4223-8E97-237249459FCD@gmail.com>
	<463BDC9B.2030500@canterbury.ac.nz>
	<eae285400705041844l1dbecbe2o6d7858b0fde8e33@mail.gmail.com>
	<ca471dc20705071033j13761ad1t9fc75dddcda3c0ce@mail.gmail.com>
Message-ID: <eae285400705071113l385f36eayd8f072c3b8312202@mail.gmail.com>

On 5/7/07, Guido van Rossum <guido at python.org> wrote:
> And what do you return when it doesn't support the container protocol?

Assign the iterator object with the remaining items to d.

> Think about the use cases. It seems that *your* use case is some kind
> of (car, cdr) splitting known from Lisp and from functional languages
> (Haskell is built out of this idiom it seems from the examples). But
> in Python, if you want to loop over one of those things, you ought to
> use a for-loop; and if you really want a car/cdr split, explicitly
> using the syntax you show above (x[0], x[1:]) is fine.

The use case I'm thinking of is this:

A container type or an iterable where the first few entries contain
one type of information, and the rest of the entries are something
that will either be discarded or run through a for-loop.

I encounter this frequently when reading text files where the first
few lines are some kind of header with a known format and the rest of
the file is data.
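
For example (a minimal sketch with made-up data, using next() to consume
the header lines and a for-loop for the rest):

    import io

    f = io.StringIO("version 1\ncolumns a b\n1 2\n3 4\n")
    version_line = next(f)   # fixed-format header lines
    columns_line = next(f)
    for line in f:           # the rest of the file streams as data
        print(line.split())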

> The important use case in Python for the proposed semantics is when
> you have a variable-length record, the first few items of which are
> interesting, and the rest of which is less so, but not unimportant.

> (If you wanted to throw the rest away, you'd just write a, b, c =
> x[:3] instead of a, b, c, *d = x.)

That doesn't work if x is an iterable that doesn't support getslice
(such as a file object).

> It is much more convenient for this
> use case if the type of d is fixed by the operation, so you can count
> on its behavior.

> There's a bug in the design of filter() in Python 2 (which will be
> fixed in 3.0 by turning it into an iterator BTW): if the input is a
> tuple, the output is a tuple too, but if the input is a list *or
> anything else*, the output is a list.  That's a totally insane
> signature, since it means that you can't count on the result being a
> list, *nor* on it being a tuple -- if you need it to be one or the
> other, you have to convert it to one, which is a waste of time and
> space. Please let's not repeat this design bug.

I agree that's broken, because it carves out a weird exception for
tuples.  I disagree that it's analogous because I'm not suggesting
carving out an exception.

I'm suggesting, that:

- lists return lists
- tuples return tuples
- XYZ containers return XYZ containers
- non-container iterables return iterators.

It's a consistent rule, albeit a different consistent rule than always
returning the same type.

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

From tomerfiliba at gmail.com  Mon May  7 20:28:27 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Mon, 7 May 2007 20:28:27 +0200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <ca471dc20705071058n25a21acfvaca8e4979edfa404@mail.gmail.com>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
	<ca471dc20705071058n25a21acfvaca8e4979edfa404@mail.gmail.com>
Message-ID: <1d85506f0705071128geac9062lef8309915f0ab7db@mail.gmail.com>

On 5/7/07, Guido van Rossum <guido at python.org> wrote:
> It's been settled by default -- nobody submitted a PEP to kill the GIL
> in time for the April 30 deadline, and I won't accept one now.

oh, i didn't mean to submit a PEP about that -- i don't have the time
or the brainpower to do that. i was just wondering if there were
any plans to do that in py3k, or if that's at all desired. but as you
said, it's been settled by default.


-tomer

From collinw at gmail.com  Mon May  7 20:29:25 2007
From: collinw at gmail.com (Collin Winter)
Date: Mon, 7 May 2007 11:29:25 -0700
Subject: [Python-3000] PEP 3129: Class Decorators
Message-ID: <43aa6ff70705071129r662d0627ma8882a2a8ded3b5d@mail.gmail.com>

Guido pointed out that this PEP hadn't been sent to the list yet.


Abstract
========

This PEP proposes class decorators, an extension to the function
and method decorators introduced in PEP 318.


Rationale
=========

When function decorators were originally debated for inclusion in
Python 2.4, class decorators were seen as obscure and unnecessary
[#obscure]_ thanks to metaclasses.  After several years' experience
with the Python 2.4.x series of releases and an increasing
familiarity with function decorators and their uses, the BDFL and
the community re-evaluated class decorators and recommended their
inclusion in Python 3.0 [#approval]_.

The motivating use-case was to make certain constructs more easily
expressed and less reliant on implementation details of the CPython
interpreter.  While it is possible to express class decorator-like
functionality using metaclasses, the results are generally
unpleasant and the implementation highly fragile [#motivation]_.  In
addition, metaclasses are inherited, whereas class decorators are not,
making metaclasses unsuitable for some, single class-specific uses of
class decorators. The fact that large-scale Python projects like Zope
were going through these wild contortions to achieve something like
class decorators won over the BDFL.


Semantics
=========

The semantics and design goals of class decorators are the same as
for function decorators ([#semantics]_, [#goals]_); the only
difference is that you're decorating a class instead of a function.
The following two snippets are semantically identical: ::

  class A:
    pass
  A = foo(bar(A))


  @foo
  @bar
  class A:
    pass

For a detailed examination of decorators, please refer to PEP 318.
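
As an illustrative sketch (the ``register`` function here is hypothetical,
not from any library), one class-specific use is registering classes as
they are defined, without resorting to a metaclass::

  registry = {}

  def register(cls):
      registry[cls.__name__] = cls
      return cls

  @register
  class A:
      pass

  assert registry == {"A": A}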


Implementation
==============

Adapating Python's grammar to support class decorators requires
modifying two rules and adding a new rule ::

 funcdef: [decorators] 'def' NAME parameters ['->' test] ':' suite

 compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt |
                with_stmt | funcdef | classdef

need to be changed to ::

 decorated: decorators (classdef | funcdef)

 funcdef: 'def' NAME parameters ['->' test] ':' suite

 compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt |
                with_stmt | funcdef | classdef | decorated

Adding ``decorated`` is necessary to avoid an ambiguity in the
grammar.

The Python AST and bytecode must be modified accordingly.

A reference implementation [#implementation]_ has been provided by
Jack Diederich.


References
==========

.. [#obscure]
   http://www.python.org/dev/peps/pep-0318/#motivation

.. [#approval]
   http://mail.python.org/pipermail/python-dev/2006-March/062942.html

.. [#motivation]
   http://mail.python.org/pipermail/python-dev/2006-March/062888.html

.. [#semantics]
   http://www.python.org/dev/peps/pep-0318/#current-syntax

.. [#goals]
   http://www.python.org/dev/peps/pep-0318/#design-goals

.. [#implementation]
   http://python.org/sf/1671208

From guido at python.org  Mon May  7 21:14:30 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 12:14:30 -0700
Subject: [Python-3000] new io (pep 3116)
In-Reply-To: <1d85506f0705070408h6b21540ej54888e97ad6854dc@mail.gmail.com>
References: <1d85506f0705070408h6b21540ej54888e97ad6854dc@mail.gmail.com>
Message-ID: <ca471dc20705071214h5214aba8od3dc2e341f3422b9@mail.gmail.com>

On 5/7/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> my original idea about the new i/o foundation was more elaborate
> than the pep, but i have to admit the pep is more feasible and
> compact. some comments though:
>
> writeline
> -----------------------------
> TextIOBase should grow a writeline() method, to be symmetrical
> with readline(). the reason is simple -- the newline char is
> configurable in the constructor, so it's not necessarily "\n".
> so instead of adding the configurable newline char manually,
> the user should call writeline() which would append the
> appropriate newline automatically.

That's not symmetric. readline() returns a string that includes a
trailing \n even if the actual file contained \r or \r\n. write()
already is supposed to translate \n anywhere (not just at the end of
the line) into the specified or platform-default (os.linesep) separator. A
method writeline() that *appended* a separator would be totally new to
the I/O library. Even writelines() doesn't do that.
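
To illustrate (a minimal sketch written against the text layer the PEP
proposes; io.BytesIO just stands in for a raw file):

  import io

  buf = io.BytesIO(b"first\r\nsecond\r\n")
  text = io.TextIOWrapper(buf, encoding="ascii", newline=None)
  print(repr(text.readline()))   # 'first\n' -- the \r\n is read back as \n

  out = io.BytesIO()
  writer = io.TextIOWrapper(out, encoding="ascii", newline="\r\n")
  writer.write("one\ntwo\n")     # every \n gets translated, not just the last
  writer.flush()
  print(out.getvalue())          # b'one\r\ntwo\r\n'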

> sockets
> -----------------------------
> iirc, SocketIO is a layer that wraps an underlying socket object.
> that's a good distinction -- to separate the underlying socket from
> the RawIO interface -- but don't forget socket objects,
> by themselves, need a cleanup too.

But that's out of the scope of the PEP. The main change I intend to
make is to return bytes instead of strings.

> for instance, there's no point in UDP sockets having listen(), or send()
> or getpeername() -- with UDP you only ever use sendto and recvfrom.
> on the other hand, TCP sockets make no use of sendto(). and even with
> TCP sockets, listeners never use send() or recv(), while connected
> sockets never use listen() or connect().
>
> moreover, the current socket interface simply mimics the BSD
> interface. setsockopt, getsockopt, et al, are very unpythonic by nature --
> they ought to be exposed as properties or methods of the socket.
> all in all, the current socket model is very low level with no high
> level design.

That's all out of scope for the PEP. Also I happen to think that
there's nothing particularly wrong with sockets -- they generally get
wrapped in higher layers like httplib.

> some time ago i was working on a sketch for a new socket module
> (called sock2) which had a clear distinction between connected sockets,
> listener sockets and datagram sockets. each protocol was implemented
> as a subclass of one of these base classes, and exposed only the
> relevant methods. socket options were added as properties and
> methods, and a new DNS module was added for dns-related queries.
>
> you can see it here -- http://sebulba.wikispaces.com/project+sock2
> i know it's late already, but i can write a PEP over the weekend,
> or if someone else wants to carry on with the idea, that's fine
> with me.

Sorry, too late. We're putting serious pressue already on authors who
posted draft PEPs before the deadline but haven't submitted their text
to Subversion yet. At this point we have a definite list of PEPs that
were either checked in or promised on time for the deadline. New
proposals will have to wait until after 3.0a1 is released (hopefully
end of June). Also note that the whole stdlib reorg is planned to
happen after that release.

> non-blocking IO
> -----------------------------
> the pep says "In order to put an object in object in non-blocking
> mode, the user must extract the fileno and do it by hand."
> but i think it would only lead to trouble. as the entire IO library
> is being rethought from the grounds up, non-blocking IO
> should be taken into account.

Why? Non-blocking I/O makes most of the proposed API useless.
Non-blocking I/O is highly specialized and hard to code against. I'm
all for a standard non-blocking I/O library but this one isn't it.

> non-blocking IO depends greatly on the platform -- and this is
> exactly why a cross-platform language should standardized that
> as part of the new IO layer. saying "let's keep it for later" would only
> require more work at some later stage.

Actually there are only two things platform-specific: how to turn it
on (or off) and how to tell the difference between "this operation
would block" and "there was an error".
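
For example (a minimal sketch; socketpair() is just a convenient way to
get a connected socket to poke at):

  import errno
  import socket

  a, b = socket.socketpair()
  a.setblocking(False)           # turning non-blocking mode on
  try:
      a.recv(1024)               # nothing has been sent yet
  except socket.error as e:
      if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
          pass                   # "would block" -- not a real error
      else:
          raise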

> it's true that SyncIO and AsyncIO don't mingle well with the same
> interfaces. that's why i think they should be two distinct classes.
> the class hierarchy should be something like:
>
> class RawIO:
>     def fileno()
>     def close()
>
> class SyncIO(RawIO):
>     def read(count)
>     def write(data)
>
> class AsyncIO(RawIO):
>     def read(count, timeout)
>     def write(data, timeout)
>     def bgread(count, callback)
>     def bgwrite(data, callback)
>
> or something similar. there's no point to add both sync and async
> operations to the RawIO level -- it just won't work together.
> we need to keep the two distinct.

I'd rather cut out all support for async I/O from this library and
leave it for someone else to invent. I don't need it. People who use
async I/O on sockets to implement e.g. fast web servers are unlikely
to use io.py; they have their own API on top of raw sockets + select
or poll.

> buffering should only support SyncIO -- i also don't see much point
> in having buffered async IO. it's mostly used for sockets and devices,
> which are most likely to work with binary data structures rather than
> text, and if you *require* non-blocking mode, buffering will only
> get in your way.
>
> if you really want a buffered AsyncIO stream, you could write a
> compatibility layer that makes the underlying AsyncIO object
> appear synchronous.

I agree with cutting async I/O from the buffered API, *except* for
specifying that when the equivalent of EWOULDBLOCK happens at the
lower level the buffering layer should not retry but raise an error.
I think it's okay if the raw layer has minimal support for async I/O.

> records
> -----------------------------
> another addition to the PEP that seems useful to me would be a
> RecordIOBase/Wrapper. records are fixed-length binary data
> structures, defined as format strings of the struct-module.
>
> class RecordIOWrapper:
>     def __init__(self, buffer, format)
>     def read(self) -> tuple of fields
>     def write(self, *fields)

The struct module has the means to build that out of lower-level reads
and writes already. If you think a library module to support this is
needed, write one and make it available as a third party module and
see how many customers you get. Personally I haven't had the need for
files consisting of fixed-length records of the same type since the
mid '80s.
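
For the record, a minimal sketch of what I mean, with a made-up "!BL"
record format:

  import io
  import struct

  fmt = "!BL"                    # one unsigned byte + one unsigned 32-bit int
  size = struct.calcsize(fmt)
  stream = io.BytesIO(struct.pack(fmt, 7, 1234) + struct.pack(fmt, 8, 5678))

  while True:
      chunk = stream.read(size)
      if len(chunk) < size:
          break
      print(struct.unpack(fmt, chunk))   # (7, 1234), then (8, 5678)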

> another cool feature i can think of is "multiplexing",  or working
> with the same underlying stream in different ways by having multiple
> wrappers over it.

That's why we make the underlying 'raw' object available as an
attribute. So you can experiment with this.

> for example, to implement a type-length-value stream, which is very
> common in communication protocols, one could do something like
>
> class MultiplexedIO:
>     def __init__(self, *streams):
>         self.streams = itertools.cycle(streams)
>     def read(self, *args):
>         """read from the next stream each time it's called"""
>         return self.streams.next().read(*args)
>
> sock = BufferedRW(SocketIO(...))
> tlrec = Record(sock, "!BL")
> tlv = MultiplexedIO(tvrec, sock)
>
> type, length = tlv.read()
> value = tlv.read(length)
>
> you can also build higher-level state machines with that -- for instance,
> if the type was "int", the next call to read() would decode the value as
> an integer, and so on. you could write parsers right on top of the IO
> layer.
>
> just an idea. i'm not sure if that's proper design or just a silly idea,
> but we'll leave that to the programmer.

I don't think the new I/O library is the place to put in a bunch of
new, essentially untried ideas. Instead, we should aim for a flexible
implementation of APIs that we know work and are needed. I think the
current stack is pretty flexible in that it supports streams and
random access, unidirectional and bidirectional, raw and buffered,
bytes and text. Applications can do a lot with those.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  7 21:16:27 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 12:16:27 -0700
Subject: [Python-3000] PEP 3129: Class Decorators
In-Reply-To: <43aa6ff70705071129r662d0627ma8882a2a8ded3b5d@mail.gmail.com>
References: <43aa6ff70705071129r662d0627ma8882a2a8ded3b5d@mail.gmail.com>
Message-ID: <ca471dc20705071216h529f7712xea61d36f8f411fd@mail.gmail.com>

On 5/7/07, Collin Winter <collinw at gmail.com> wrote:
[...]
> This PEP proposes class decorators, an extension to the function
> and method decorators introduced in PEP 318.
[...]
> The semantics and design goals of class decorators are the same as
> for function decorators ([#semantics]_, [#goals]_); the only
> difference is that you're decorating a class instead of a function.
> The following two snippets are semantically identical: ::
>
>   class A:
>     pass
>   A = foo(bar(A))
>
>
>   @foo
>   @bar
>   class A:
>     pass

I'm +1 on this PEP.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  7 21:24:28 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 12:24:28 -0700
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <eae285400705071113l385f36eayd8f072c3b8312202@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org> <f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<463AB2E1.2030408@canterbury.ac.nz>
	<A7B470E4-DABA-4223-8E97-237249459FCD@gmail.com>
	<463BDC9B.2030500@canterbury.ac.nz>
	<eae285400705041844l1dbecbe2o6d7858b0fde8e33@mail.gmail.com>
	<ca471dc20705071033j13761ad1t9fc75dddcda3c0ce@mail.gmail.com>
	<eae285400705071113l385f36eayd8f072c3b8312202@mail.gmail.com>
Message-ID: <ca471dc20705071224v61cd25cdsc3b21d93696d9273@mail.gmail.com>

On 5/7/07, Daniel Stutzbach <daniel at stutzbachenterprises.com> wrote:
> On 5/7/07, Guido van Rossum <guido at python.org> wrote:
> > And what do you return when it doesn't support the container protocol?
>
> Assign the iterator object with the remaining items to d.
>
> > Think about the use cases. It seems that *your* use case is some kind
> > of (car, cdr) splitting known from Lisp and from functional languages
> > (Haskell is built out of this idiom it seems from the examples). But
> > in Python, if you want to loop over one of those things, you ought to
> > use a for-loop; and if you really want a car/cdr split, explicitly
> > using the syntax you show above (x[0], x[1:]) is fine.
>
> The use came I'm thinking of is this:
>
> A container type or an iterable where the first few entries contain
> one type of information, and the rest of the entries are something
> that will either be discard or run through for-loop.
>
> I encounter this frequently when reading text files where the first
> few lines are some kind of header with a known format and the rest of
> the file is data.

This sounds like a parsing problem. IMO it's better to treat it as such.

> > The important use case in Python for the proposed semantics is when
> > you have a variable-length record, the first few items of which are
> > interesting, and the rest of which is less so, but not unimportant.
>
> > (If you wanted to throw the rest away, you'd just write a, b, c =
> > x[:3] instead of a, b, c, *d = x.)
>
> That doesn't work if x is an iterable that doesn't support getslice
> (such as a file object).
>
> > It is much more convenient for this
> > use case if the type of d is fixed by the operation, so you can count
> > on its behavior.
>
> > There's a bug in the design of filter() in Python 2 (which will be
> > fixed in 3.0 by turning it into an iterator BTW): if the input is a
> > tuple, the output is a tuple too, but if the input is a list *or
> > anything else*, the output is a list.  That's a totally insane
> > signature, since it means that you can't count on the result being a
> > list, *nor* on it being a tuple -- if you need it to be one or the
> > other, you have to convert it to one, which is a waste of time and
> > space. Please let's not repeat this design bug.
>
> I agree that's broken, because it carves out a weird exception for
> tuples.  I disagree that it's analogous because I'm not suggesting
> carving out an exception.
>
> I'm suggesting, that:
>
> - lists return lists
> - tuples return tuples
> - XYZ containers return XYZ containers
> - non-container iterables return iterators.
>
> It's a consistent rule, albeit a different consistent rule than always
> returning the same type.

But I expect less useful. It won't support "a, *b, c = <something>"
either. From an implementation POV, if you have an unknown object on
the RHS, you have to try slicing it before you try iterating over it;
this may cause problems e.g. if the object happens to be a defaultdict
-- since x[3:] is implemented as x[slice(3, None, None)], the
defaultdict will give you its default value. I'd much rather define
this in terms of iterating over the object until it is exhausted,
which can be optimized for certain known types like lists and tuples.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tomerfiliba at gmail.com  Mon May  7 22:12:05 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Mon, 7 May 2007 22:12:05 +0200
Subject: [Python-3000] new io (pep 3116)
In-Reply-To: <ca471dc20705071214h5214aba8od3dc2e341f3422b9@mail.gmail.com>
References: <1d85506f0705070408h6b21540ej54888e97ad6854dc@mail.gmail.com>
	<ca471dc20705071214h5214aba8od3dc2e341f3422b9@mail.gmail.com>
Message-ID: <1d85506f0705071312w43c1f59cwe4f84239a4a69a6a@mail.gmail.com>

On 5/7/07, Guido van Rossum <guido at python.org> wrote:
> That's not symmetric. readline() returns a string that includes a
> trailing \n even if the actual file contained \r or \r\n. write()
> already is supposed to translate \n anywhere (not just at the end of
> the line) into the specified or platform-default (os.sep) separator.

well, if write() is meant to do that anyway, writeline() is not required.

> > moreover, the current socket interface simply mimics the BSD
> > interface. setsockopt, getsockopt, et al, are very unpythonic by nature --
> > they ought to be exposed as properties or methods of the socket.
> > all in all, the current socket model is very low level with no high
> > level design.
>
> That's all out of scope for the PEP. Also I happen to think that
> there's nothing particularly wrong with sockets -- they generally get
> wrapped in higher layers like httplib.

first, when i first brought this up a year ago, you were in favor
http://mail.python.org/pipermail/python-3000/2006-April/001497.html

still, there's much code that handles sockets. the client side is
mostly standard: connect, do something, quit. but on the server side
it's another story, and i have had many long battles with man pages
to make sockets behave as expected. not all protocols are based
on http, after all, and the fellas that write modules like httplib have
a lot of black magic to do.

a better design of the socket module could help a lot, as well as
making small, repeated tasks easier/more logical. compare
    import socket
    s = socket.socket()
    s.connect(("foobar", 1234))
to
    from socket import TcpStream
    s = TcpStream("foobar", 1234)

> > you can see it here -- http://sebulba.wikispaces.com/project+sock2
> > i know it's late already, but i can write a PEP over the weekend,
> > or if someone else wants to carry on with the idea, that's fine
> > with me.
>
> Sorry, too late. We're putting serious pressue already on authors who
> posted draft PEPs before the deadline but haven't submitted their text
> to Subversion yet. At this point we have a definite list of PEPs that
> were either checked in or promised on time for the deadline. New
> proposals will have to wait until after 3.0a1 is released (hopefully
> end of June). Also note that the whole stdlib reorg is planned to
> happen after that release.

well, my code is pure python, and can just replace the existing socket.py
module. the _socket module remains intact. it can surely wait for the
stdlib reorg though, there's no need to rush into it now. i'll submit the
PEP in the near future.

> > non-blocking IO depends greatly on the platform -- and this is
> > exactly why a cross-platform language should standardized that
> > as part of the new IO layer. saying "let's keep it for later" would only
> > require more work at some later stage.
>
> Actually there are only two things platform-specific: how to turn it
> on (or off) and how to tell the difference between "this operation
> would block" and "there was an error".

well, the way i see it, that's exactly why this calls for standardization.

> The struct module has the means to build that out of lower-level reads
> and writes already. If you think a library module to support this is
> needed, write one and make it available as a third party module and
> see how many customers you get. Personally I haven't had the need for
> files containing of fixed-length records of the same type since the
> mid '80s.

not all of us are that lucky :)
there's still lots of ugly protocols and file formats for
us programmers to handle. and TLV structures happen to be
one of the pretty ones.
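
for example, a rough sketch of pulling one TLV record off a stream with
struct (the field layout here is made up for illustration):

    import struct

    def read_tlv(stream):
        # header: 1-byte type, 2-byte big-endian length, then the value bytes
        tag, length = struct.unpack("!BH", stream.read(3))
        value = stream.read(length)
        return tag, value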

> I don't think the new I/O library is the place to put in a bunch of
> new, essentially untried ideas. Instead, we should aim for a flexible
> implementation of APIs that we know work and are needed. I think the
> current stack is pretty flexible in that it supports streams and
> random access, unidirectional and bidirectional, raw and buffered,
> bytes and text. Applications can do a lot with those.

yeah, it was more like a wild idea really. it should be placed in a
different module.


-tomer

From martin at v.loewis.de  Tue May  8 00:02:58 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 08 May 2007 00:02:58 +0200
Subject: [Python-3000] PEP 3112
In-Reply-To: <ca471dc20705071045o509dd13bs93e7495e2fd3d288@mail.gmail.com>
References: <463D8800.1010906@v.loewis.de>
	<ca471dc20705071045o509dd13bs93e7495e2fd3d288@mail.gmail.com>
Message-ID: <463FA212.9070609@v.loewis.de>

>> 1. in Grammar changes: Each shortstringchar or longstringchar must
>>    be a character whose Unicode ordinal value is between 1 and
>>    127 inclusive.
> 
> Sounds like a good fix to me; I agree that bytes literals, like
> Unicode literals, should not vary depending on the source encoding. In
> step 2, can't you use "ascii" as the encoding?

Sure. Technically, ASCII might include \0 (depending on definition),
but that is ruled out as a character in Python source code, anyway.

So: "must be an ASCII character" is just as clear, and much shorter.

I guess Jason associated "ASCII character" with "single byte",
so it can't be simultaneously both ASCII and Unicode, hence he
chose the more elaborate wording.  Of course, if one views ASCII
as a character set (rather than a coded character set), a Unicode
character may or may not simultaneously be an ASCII character.
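
For illustration only, the practical effect of the restriction (using
the b"..." spelling proposed in PEP 3112):

    ok = b"abc\x00\x80"     # ASCII source characters; other bytes via escapes
    # bad = b"café"         # rejected: non-ASCII character in a bytes literal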

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Tue May  8 00:44:15 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 08 May 2007 10:44:15 +1200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <eae285400705071113l385f36eayd8f072c3b8312202@mail.gmail.com>
References: <f18ccp$utj$1@sea.gmane.org>
	<ca471dc20705011500u55f26862vd6747c1ac6bcff7f@mail.gmail.com>
	<f19d5r$mfm$2@sea.gmane.org>
	<ca471dc20705021108l5dc3b061m53ad668ce6f873f1@mail.gmail.com>
	<D87990F0-787E-4E79-A26F-BD408E3465E0@gmail.com>
	<463AB2E1.2030408@canterbury.ac.nz>
	<A7B470E4-DABA-4223-8E97-237249459FCD@gmail.com>
	<463BDC9B.2030500@canterbury.ac.nz>
	<eae285400705041844l1dbecbe2o6d7858b0fde8e33@mail.gmail.com>
	<ca471dc20705071033j13761ad1t9fc75dddcda3c0ce@mail.gmail.com>
	<eae285400705071113l385f36eayd8f072c3b8312202@mail.gmail.com>
Message-ID: <463FABBF.2080808@canterbury.ac.nz>

Daniel Stutzbach wrote:

> The use case I'm thinking of is this:
>
> A container type or an iterable where the first few entries contain
> one type of information, and the rest of the entries are something
> that will either be discarded or run through a for-loop.

If you know you're dealing with an iterator x, then
after a, b, c, *d = x, d would simply be x, so all
you really need is a function to get the first n
items from x.
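
(Something along those lines, sketched with itertools -- just an
illustration of the alternative, not part of any proposal:)

    from itertools import islice

    def take(n, iterable):
        # return the first n items as a list; the iterator keeps the rest
        return list(islice(iterable, n))

    it = iter(range(10))
    a, b, c = take(3, it)
    d = it          # d is simply the same iterator, as described above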

> I'm suggesting, that:
> 
> - lists return lists
> - tuples return tuples
> - XYZ containers return XYZ containers
> - non-container iterables return iterators.

How do you propose to distinguish between the last
two cases? Attempting to slice it and catching an
exception is not acceptable, IMO, as it can too
easily mask bugs.

--
Greg

From ark at acm.org  Tue May  8 01:48:27 2007
From: ark at acm.org (Andrew Koenig)
Date: Mon, 7 May 2007 19:48:27 -0400
Subject: [Python-3000] PEP 3125 -- a modest proposal
Message-ID: <000101c79102$385e1340$a91a39c0$@org>

Yes, I have read Swift :-)  And in that spirit, I don't know whether to take
this proposal seriously because it's kind of radical.  Nevertheless, here
goes...

It has occurred to me that as Python stands today, an indent always begins
with a colon.  So in principle, we could define anything that looks like an
indent but doesn't begin with a colon as a continuation.  So the idea would
be that you can continue a statement onto as many lines as you wish,
provided that

        Each line after the first is indented strictly more than the first line
        (but not necessarily more than the remaining lines in the statement), and

        If there is a colon that will precede an indent, it is the last token of
        the last line, in which case the line after the colon must be indented
        strictly more than the first line (but not necessarily more than the
        remaining lines in the statement).

For example:

	"abc"
	   + "def"	# second line with more whitespace than the first --
continuation

	"abc"
	+ "def"	# attempt to apply unary + to string literal

	"abc"
	     + "def"
	   + "ghi"	# OK -- this line is indented more than "abc"

This proposal has the advantage of being entirely lexical -- it doesn't even
rely on counting parentheses or brackets, so unlike the current Python rule,
it can be implemented entirely as a regular expression.

It has the disadvantage of being a change, and may have its own pitfalls:

	if foo		# Oops, I forgot the colon
	    + bar		# which makes this line a continuation

Of course, when "if" isn't followed eventually by a colon, the code won't
compile.

However...

	x = 3,
	    4        # x = (3, 4)

	x = 3,	 # x = (3,)
	4		 # evaluate 4 and throw it away

So it may be that this proposed rule is too tricky to use.  However, it does
have the merit of being even simpler than the current rule.
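
(A rough sketch of the rule in executable form, just to make it concrete;
the colon case is handled by treating a trailing colon as ending the
statement:)

    def join_continuations(lines):
        # Sketch only: a physical line indented strictly more than the first
        # line of the current statement is a continuation, unless that
        # statement already ends with a colon (then the indent opens a block).
        statements = []                 # list of [first_line_indent, text]
        for raw in lines:
            line = raw.expandtabs()
            if not line.strip():
                continue
            indent = len(line) - len(line.lstrip())
            if (statements
                    and indent > statements[-1][0]
                    and not statements[-1][1].rstrip().endswith(':')):
                statements[-1][1] += ' ' + line.strip()
            else:
                statements.append([indent, line.rstrip()])
        return [text for _, text in statements]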

Just a thought...



From ncoghlan at gmail.com  Tue May  8 03:36:46 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 08 May 2007 11:36:46 +1000
Subject: [Python-3000] failing tests
In-Reply-To: <ca471dc20705071110n26989e7cs9e54cb1736ad8a65@mail.gmail.com>
References: <ee2a432c0705070104j7c71842dxf33307a1468cfeb5@mail.gmail.com>	
	<ca471dc20705070728q2ce99be4x1b427b34543827f9@mail.gmail.com>	
	<463F5135.2090007@gmail.com>
	<ca471dc20705071110n26989e7cs9e54cb1736ad8a65@mail.gmail.com>
Message-ID: <463FD42E.8060408@gmail.com>

Guido van Rossum wrote:
> On 5/7/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Guido van Rossum wrote:
>> > Thanks for checking in xrange!!!!! Woot!
>> >
>> > test_compiler and test_transformer are waiting for someone to clean up
>> > the compiler package (I forget what it doesn't support, perhapes only
>> > nonlocal needs to be added.)
>>
>> It's definitely lagging on set comprehensions as well. I'm also pretty
>> sure those two tests broke before nonlocal was added, as they were
>> already broken when I started helping Georg in looking at the setcomp
>> updates.
> 
> I  just fixed the doctest failures; but for the compiler package I
> need help. Would you have the time?

I don't really know the compiler package at all. I'll have a look, but 
it's going to take me a while to even figure out where the fixes need to 
go, let alone what they will actually look like.

So if someone more familiar with the package beats me to fixing it, I 
won't be the least bit upset ;)

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Tue May  8 04:06:49 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 7 May 2007 19:06:49 -0700
Subject: [Python-3000] failing tests
In-Reply-To: <463FD42E.8060408@gmail.com>
References: <ee2a432c0705070104j7c71842dxf33307a1468cfeb5@mail.gmail.com>
	<ca471dc20705070728q2ce99be4x1b427b34543827f9@mail.gmail.com>
	<463F5135.2090007@gmail.com>
	<ca471dc20705071110n26989e7cs9e54cb1736ad8a65@mail.gmail.com>
	<463FD42E.8060408@gmail.com>
Message-ID: <ca471dc20705071906kec6bb0dtcc2be225d2c39e43@mail.gmail.com>

On 5/7/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Guido van Rossum wrote:
> > I  just fixed the doctest failures; but for the compiler package I
> > need help. Would you have the time?
>
> I don't really know the compiler package at all. I'll have a look, but
> it's going to take me a while to even figure out where the fixes need to
> go, let alone what they will actually look like.
>
> So if someone more familiar with the package beats me to fixing it, I
> won't be the least bit upset ;)

Same boat I'm in. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Tue May  8 07:36:36 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 7 May 2007 22:36:36 -0700
Subject: [Python-3000] ref counts
Message-ID: <ee2a432c0705072236o2124fb36q59c9e1b91528b59d@mail.gmail.com>

I'm starting up a continuous build of sorts on the PSF machine for the
3k branch.  Right now the failures will only go to me.  I've excluded
the two tests that are known to currently fail.  This will help us
find new failures (including ref leaks).  Probably in a week or so
I'll send the results to python-3000-checkins.  Since it's just
running on a single machine (every 12 hours), this should be pretty
stable.  It has been for the trunk and 2.5 branch.

I just wanted to point out some data points wrt ref counts.

At the end of a test run on trunk with 298 tests the total ref count is:
  [482838 refs]

When starting a new process (like during the subprocess tests):
  [7323 refs]

With 3k and 302 tests:
  [615279 refs]

and when starting a new process:
  [10457 refs]

I don't think these are problematic.  I expect that these ~30%
increases in total ref counts are primarily the result of new-style
classes.

n

From python at rcn.com  Tue May  8 07:40:21 2007
From: python at rcn.com (Raymond Hettinger)
Date: Mon, 7 May 2007 22:40:21 -0700
Subject: [Python-3000] PEP 3125 -- a modest proposal
References: <000101c79102$385e1340$a91a39c0$@org>
Message-ID: <009701c79142$6424a260$f001a8c0@RaymondLaptop1>

[Andrew Koenig]
> It has occurred to me that as Python stands today, an indent always begins
> with a colon.  So in principle, we could define anything that looks like an
> indent but doesn't begin with a colon as a continuation.  So the idea would
> be that you can continue a statement onto as many lines as you wish,

Too dangerous.  The most common Python syntax error (by far, even for
experienced users) is omission of a colon.  If the missing colon starts
to have its own special meaning, that would not be a good thing.

If you're in the mood to propose something radical, how about dropping
the colon altogether, leaving indention as the sure reliable cue and 
cleaning-up the appearance of code in a new world where colons
are also being used for annotation as well as slicing:

   def f(x: xtype, y: type)
        result = []
        for i, elem in enumerate(x)
             if elem < 0
                  result.append(y[:i])
             else
                  result.append(y[i:])
        return result

It looks very clean to my eyes.


Raymond

From rrr at ronadam.com  Tue May  8 10:28:56 2007
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 08 May 2007 03:28:56 -0500
Subject: [Python-3000] PEP 3125 -- a modest proposal
In-Reply-To: <009701c79142$6424a260$f001a8c0@RaymondLaptop1>
References: <000101c79102$385e1340$a91a39c0$@org>
	<009701c79142$6424a260$f001a8c0@RaymondLaptop1>
Message-ID: <464034C8.6030007@ronadam.com>

Raymond Hettinger wrote:
> [Andrew Koenig]
>> It has occurred to me that as Python stands today, an indent always begins
>> with a colon.  So in principle, we could define anything that looks like an
>> indent but doesn't begin with a colon as a continuation.  So the idea would
>> be that you can continue a statement onto as many lines as you wish,
> 
> Too dangerous.  The most common Python syntax error (by far, even for
> experienced users) is omission of a colon.  If the missing colon starts
> to have its own special meaning, that would not be a good thing.
> 
> If you're in the mood to propose something radical, how about dropping
> the colon altogether, leaving indention as the sure reliable cue and 
> cleaning-up the appearance of code in a new world where colons
> are also being used for annotation as well as slicing:
> 
>    def f(x: xtype, y: type)
>         result = []
>         for i, elem in enumerate(x)
>              if elem < 0
>                   result.append(y[:i])
>              else
>                   result.append(y[i:])
>         return result
> 
> It looks very clean to my eyes.

So no more single line definitions?


If you think of the colon as meaning 'associated to', its use is both
clear and consistent in all cases except when used in slicing.

I also think it helps make the code more readable: when it's used in
combination with indenting, it looks more like a common outline
definition that even non-programmers are familiar with.  So it may have
value in this regard because it makes the intent of the code clearer to new
users.

Removing it may blur the difference between block headers and block bodies
in the mind.  The computer may not need it, but I expect it helps us humans
keep things straight in our heads.

So maybe a more modest proposal is to change the colon in slicing to a
semi-colon where it can have its own meaning.

     def f(x: xtype, y: type):
          result = []
          for i, elem in enumerate(x):
               if elem < 0:
                    result.append(y[;i])
               else:
                    result.append(y[i;])
          return result

Cheers,
    Ron

From ncoghlan at gmail.com  Tue May  8 14:45:05 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 08 May 2007 22:45:05 +1000
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <02b401c78d15$f04b6110$090a0a0a@enfoldsystems.local>
References: <02b401c78d15$f04b6110$090a0a0a@enfoldsystems.local>
Message-ID: <464070D1.6040701@gmail.com>

Mark Hammond wrote:
> Please add my -1 to the chorus here, for the same reasons already expressed.

Another -1 here - while I agree there are benefits to removing backslash 
continuations and string literal concatenation, I don't think they're 
significant enough to justify the hassle of making it happen.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ark at acm.org  Tue May  8 15:08:51 2007
From: ark at acm.org (Andrew Koenig)
Date: Tue, 8 May 2007 09:08:51 -0400
Subject: [Python-3000] PEP 3125 -- a modest proposal
In-Reply-To: <009701c79142$6424a260$f001a8c0@RaymondLaptop1>
References: <000101c79102$385e1340$a91a39c0$@org>
	<009701c79142$6424a260$f001a8c0@RaymondLaptop1>
Message-ID: <007301c79172$065bfea0$1313fbe0$@org>

> Too dangerous.  The most common Python syntax error (by far, even for
> experienced users) is omission of a colon.  If the missing colon starts
> to have its own special meaning, that would not be a good thing.

It's not special -- omitting it would have exactly the same effect as
omitting a colon does today in a single-line statement.  That is, today you
can write

	if x < y: x = y

or you can forget the colon and write

	if x < y x = y

and (usually) be diagnosed by the compiler.  My proposal would make

	if x < y:
		x = y

and

	if x < y
		x = y

have the same meanings as (respectively) the first two examples above, so
the fourth example would still be diagnosed as an error for the same reason.



From jason.orendorff at gmail.com  Tue May  8 15:16:32 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 8 May 2007 08:16:32 -0500
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
Message-ID: <bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>

On 5/7/07, Guido van Rossum <guido at python.org> wrote:
> I don't know how this will work out yet. I'm not convinced that having
> both mutable and immutable bytes is the right thing to do; but I'm
> also not convinced of the opposite. I am slowly working on the
> string/unicode unification, and so far, unfortunately, it is quite
> daunting to get rid of 8-bit strings even at the Python level let
> alone at the C level.

Guido, if 3.x had an immutable bytes type, could 2to3 provide a
better guarantee?  Namely, "Set your default encoding to None
in your 2.x code today, and 2to3 will not introduce bugs around
str/unicode."

2to3 could produce 3.x code that preserves the 2.x meaning by
using 2.x-ish types, including immutable byte strings.

Without this, my understanding is that 2to3 will introduce bugs.
Am I wrong?

This might be worth doing even if you decide an immutable 8-bit
type is wrong for the core language.  The type could be hidden
away in an "upgradelib" module somewhere.  Surely people will
prefer correctness over "producing nice, idiomatic 3.x code"
in the 2to3 tool.

-j

From guido at python.org  Tue May  8 15:48:57 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 May 2007 06:48:57 -0700
Subject: [Python-3000] [Python-Dev] PEP 30XZ: Simplified Parsing
In-Reply-To: <464070D1.6040701@gmail.com>
References: <02b401c78d15$f04b6110$090a0a0a@enfoldsystems.local>
	<464070D1.6040701@gmail.com>
Message-ID: <ca471dc20705080648p7d36236i86546a094cc8ba54@mail.gmail.com>

On 5/8/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Mark Hammond wrote:
> > Please add my -1 to the chorus here, for the same reasons already expressed.
>
> Another -1 here - while I agree there are benefits to removing backslash
> continuations and string literal concatenation, I don't think they're
> significant enough to justify the hassle of making it happen.

OK. I'm just about ready to reject both PEP 3125 and PEP 3126 on the
grounds of lack of popular support and insufficient benefits. If
anyone is truly upset about this, let them speak up now, or be forever
silent.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May  8 16:02:01 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 May 2007 07:02:01 -0700
Subject: [Python-3000] Deadline for checking PEPs into subversion
Message-ID: <ca471dc20705080702p44bec213pb8eaa725d3911d0a@mail.gmail.com>

In fairness to would-be new PEP proposals for Python 3000, I am asking
everyone who still has a draft PEP that's not checked in to subversion
to please check in *a* version of it as soon as possible. This version
doesn't have to be final; expect debate which may require a rewrite
all or part of your PEP. But I want every proposal that's on the table
in subversion so we can make up a definitive list of proposals being
considered for Python 3.0a1 (to be released by the end of June).

If you don't have checkin permissions, send your PEP to
peps at python.org. Please do follow the PEP guidelines in PEP 1 and use
either PEP 9 or PEP 12 as a template.

Any PEP non checked into subversion (or at least received by
peps at python.org) by the end of Sunday, May 13, is out of
consideration.

Note: standard library reorganization PEPs don't fall under this
deadline; the stdlib reorg will begin after the release of 3.0a1.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From avassalotti at acm.org  Tue May  8 16:03:42 2007
From: avassalotti at acm.org (Alexandre Vassalotti)
Date: Tue, 8 May 2007 10:03:42 -0400
Subject: [Python-3000] PEP 3125 -- a modest proposal
In-Reply-To: <009701c79142$6424a260$f001a8c0@RaymondLaptop1>
References: <000101c79102$385e1340$a91a39c0$@org>
	<009701c79142$6424a260$f001a8c0@RaymondLaptop1>
Message-ID: <acd65fa20705080703y7972d953y21c2619ca78df82d@mail.gmail.com>

On 5/8/07, Raymond Hettinger <python at rcn.com> wrote:
> If you're in the mood to propose something radical, how about dropping
> the colon altogether, leaving indention as the sure reliable cue and
> cleaning-up the appearance of code in a new world where colons
> are also being used for annotation as well as slicing:
>
>    def f(x: xtype, y: type)
>         result = []
>         for i, elem in enumerate(x)
>              if elem < 0
>                   result.append(y[:i])
>              else
>                   result.append(y[i:])
>         return result
>
> It looks very clean to my eyes.
>

This proposal is surely doomed in advance. If I remember correctly, the
trailing colon comes from Python's precursor, ABC. They realized it
was not necessary for the parser, but it did make programs more
readable for humans.

Would it be a good idea to continue this thread on Python-ideas? I
doubt such changes will be accepted, since we are now past the PEP
deadline for changes to the core language.

-- Alexandre

From guido at python.org  Tue May  8 16:10:51 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 May 2007 07:10:51 -0700
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
	<bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
Message-ID: <ca471dc20705080710l4b88e5cbv87e6933c52f4f9c3@mail.gmail.com>

On 5/8/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> On 5/7/07, Guido van Rossum <guido at python.org> wrote:
> > I don't know how this will work out yet. I'm not convinced that having
> > both mutable and immutable bytes is the right thing to do; but I'm
> > also not convinced of the opposite. I am slowly working on the
> > string/unicode unification, and so far, unfortunately, it is quite
> > daunting to get rid of 8-bit strings even at the Python level let
> > alone at the C level.
>
> Guido, if 3.x had an immutable bytes type, could 2to3 provide a
> better guarantee?  Namely, "Set your default encoding to None
> in your 2.x code today, and 2to3 will not introduce bugs around
> str/unicode."

I don't know. I may be able to tell you when I'm further into the
process of unifying str and unicode.

> 2to3 could produce 3.x code that preserves the 2.x meaning by
> using 2.x-ish types, including immutable byte strings.

This sounds dangerously close to crippling 3.0 with backwards
compatibility. I want to reserve this option as a last resort.

> Without this, my understanding is that 2to3 will introduce bugs.
> Am I wrong?

No -- 2to3 cannot guarantee that your code will work correctly,
because it doesn't do any data flow analysis or type inferencing. This
is not limited to strings.

> This might be worth doing even if you decide an immutable 8-bit
> type is wrong for the core language.  The type could be hidden
> away in an "upgradelib" module somewhere.  Surely people will
> prefer correctness over "producing nice, idiomatic 3.x code"
> in the 2to3 tool.

With that I agree, at least in general (e.g. d.keys() gets translated
to list(d.keys()) and d.iterkeys() to iter(d.keys())). In the current
py3k-struni branch I have temporarily kept the 8-bit string type
around, renamed to str8. I am hoping I will be able to get rid of it
eventually but I may not succeed and then we'll have it available as a
backup.

For anyone who wants to discuss this more -- please come and help out
in the py3k-struni branch first. It is simply too soon to be able to
make decisions based on the evidence available so far, and I won't be
forced.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jimjjewett at gmail.com  Tue May  8 16:34:26 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 8 May 2007 10:34:26 -0400
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
	<bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
Message-ID: <fb6fbf560705080734s4c315bc3i38c5ccbe3fc9a9a3@mail.gmail.com>

On 5/8/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> On 5/7/07, Guido van Rossum <guido at python.org> wrote:

> > daunting to get rid of 8-bit strings even at the Python level let
> > alone at the C level.

> Guido, if 3.x had an immutable bytes type, could 2to3 provide a
> better guarantee?  Namely, "Set your default encoding to None
> in your 2.x code today, and 2to3 will not introduce bugs around
> str/unicode."

Presumably b"  " would be the immutable version.

In some sense, this would mean that the string/unicode unification
(assuming interning; so that I can use "is" for something stronger
than __eq__) would boil down to:

    Py2.6    b"str" is "str"  == u"str"
    Py3.X    b"str" == "str"  is u"str"

with a few details, like the fact that 2.5 didn't have the b"str"
spelling and 3.x might not support the u"str" spelling.

> This might be worth doing even if you decide an immutable 8-bit
> type is wrong for the core language.  The type could be hidden
> away in an "upgradelib" module somewhere.  Surely people will
> prefer correctness over "producing nice, idiomatic 3.x code"
> in the 2to3 tool.

I will be unhappy if 2to3 produces code that I can't run in (at least)
2.6, because then I would need to convert more than once.

I would be unhappy if 2to3 produced code that I couldn't safely copy;
that is too magical.

I would be unhappy if 2to3 produced code that isn't a good example,
unless it also had (at least an option, probably a default) to add
comments suggesting a manual verification and what could *probably* be
used instead.

-jJ

From lcaamano at gmail.com  Tue May  8 16:35:46 2007
From: lcaamano at gmail.com (Luis P Caamano)
Date: Tue, 8 May 2007 10:35:46 -0400
Subject: [Python-3000] the future of the GIL
Message-ID: <c56e219d0705080735g65f2412bu361d798ac403e538@mail.gmail.com>

On 5/7/07, "Guido van Rossum" wrote:
>
>
> Around '99 Greg Stein and Mark Hammond tried to get rid of the GIL.
> They removed most of the global mutable data structures, added
> explicit locks to the remaining ones and to individual mutable
> objects, and actually got the whole thing working. Unfortunately even
> on the system with the fastest locking primitives (Windows at the
> time) they measured a 2x slow-down on a single CPU due to all the
> extra locking operations going on.

That just breaks my heart.<sigh>

You gotta finish that sentence: it was a slowdown on a single CPU with
a speed increase on two or more CPUs, leveling out at 4 CPUs or so.

This was the same situation on every major OS kernel, including AIX,
HPUX, Linux, Tru64, etc., when they started supporting SMP machines,
which is why all of them at some time sported two kernels, one for SMP
machines with the spinlock code and one for single processor machines
with the spinlock code #ifdef'ed out.  For some, like IBM/AIX and
HPUX, eventually and as expected, all their servers became MPs and
then they stopped delivering the SP kernel.

The same would've been true for the python interpreter, one for MP and
one for SP, and eventually, even in the PC world, everything would be
MP and the SP interpreter would disappear.

People need to understand, though, that the GIL is not as bad as one
would initially think, since most C extensions release the GIL and run
concurrently on multiple CPUs.  It takes a bit of researching through
old emails on the python list and a bit of time to really understand
that.  Nevertheless, when the itch is bad enough, it'll get scratched.

-- 
Luis P Caamano
Atlanta, GA USA

From p.f.moore at gmail.com  Tue May  8 16:57:41 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 8 May 2007 15:57:41 +0100
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <fb6fbf560705080734s4c315bc3i38c5ccbe3fc9a9a3@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
	<bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
	<fb6fbf560705080734s4c315bc3i38c5ccbe3fc9a9a3@mail.gmail.com>
Message-ID: <79990c6b0705080757y23742af4pd4d424ba77e4fe7@mail.gmail.com>

On 08/05/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> I will be unhappy if 2to3 produces code that I can't run in (at least)
> 2.6, because then I would need to convert more than once.

IIUC, the idea is that you should be able to write valid Python 2.6
code which 2to3 can convert automatically. There is no intention that
2to3 should automatically handle arbitrary 2.x code (at least, not
without the risk of bugs), and certainly no intention that the
*output* of 2to3 be runnable in 2.6 (in general).

Yes, you convert more than once. Until you cut over, your 2.6 source
is the master, and the output of 2to3 should be treated as generated
code.

> I would be unhappy if 2to3 produced code that I couldn't safely copy;
> that is too magical.

Not sure what that means.

> I would be unhappy if 2to3 produced code that isn't a good example,
> unless it also had (at least an option, probably a default) to add
> comments suggesting a manual verification and what could *probably* be
> used instead.

I'd like 2to3 code to be at least maintainable. Surely it's too much
to assume it's going to be a good example of idiomatic 3.x code,
though?

Paul.

From jimjjewett at gmail.com  Tue May  8 17:09:46 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 8 May 2007 11:09:46 -0400
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <79990c6b0705080757y23742af4pd4d424ba77e4fe7@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
	<bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
	<fb6fbf560705080734s4c315bc3i38c5ccbe3fc9a9a3@mail.gmail.com>
	<79990c6b0705080757y23742af4pd4d424ba77e4fe7@mail.gmail.com>
Message-ID: <fb6fbf560705080809w634c9d7ak8d3d2d56b40ba872@mail.gmail.com>

On 5/8/07, Paul Moore <p.f.moore at gmail.com> wrote:
> On 08/05/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > I will be unhappy if 2to3 produces code that I can't run in
> > (at least) 2.6, because then I would need to convert more
> > than once.

> IIUC, the idea is that you should be able to write valid Python 2.6
> code which 2to3 can convert automatically. There is no intention
> that 2to3 should automatically handle arbitrary 2.x code (at
> least, not without the risk of bugs),

I thought that was indeed the goal.

> and certainly no intention that the
> *output* of 2to3 be runnable in 2.6 (in general).

Agreed that it isn't, but I think it should be.

> Yes, you convert more than once. Until you cut over, your
> 2.6 source is the master, and the output of 2to3 should be
> treated as generated code.

And you can't cut over until you're ready to abandon 2.x.

> > I would be unhappy if 2to3 produced code that I couldn't
> > safely copy; that is too magical.

> Not sure what that means.

Many people learn by example.  Many people don't even bother learning;
they just cut and paste.  If the only example Py3 code they see is
ugly and bloated, that is the idiom they will internalize.

> > I would be unhappy if 2to3 produced code that isn't a good
> > example, unless it also had (at least an option, probably a
> > default) to add comments suggesting a manual verification
> > and what could *probably* be used instead.
>
> I'd like 2to3 code to be at least maintainable. Surely it's too
> much to assume it's going to be a good example of idiomatic
> 3.x code, though?

Probably -- which is why it should at least be possible to focus your
attention on the parts that need manual changes.

(And, of course, the number of such places should be minimized,
particularly if you can't run the result in 2.6, since it is
effectively a fork.)

-jJ

From guido at python.org  Tue May  8 17:25:27 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 May 2007 08:25:27 -0700
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <fb6fbf560705080734s4c315bc3i38c5ccbe3fc9a9a3@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
	<bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
	<fb6fbf560705080734s4c315bc3i38c5ccbe3fc9a9a3@mail.gmail.com>
Message-ID: <ca471dc20705080825j6b665307o408569dd2a9fb55d@mail.gmail.com>

On 5/8/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> I will be unhappy if 2to3 produces code that I can't run in (at least)
> 2.6, because then I would need to convert more than once.

This is the first time I hear of this requirement. It has not so far
been a design goal for the conversions in 2to3. The workflow that I
have in mind (and that others have agreed to be workable) is more like
this:

1. develop working code under 2.6
2. make sure it is warning-free with the special -Wpy3k option
3. use 2to3 to convert it to 3.0 compatible syntax in a temporary directory
4. run your unit test suite with 3.0
5. for any defects you find, EDIT THE 2.6 SOURCE AND GO BACK TO STEP 2

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From foom at fuhm.net  Tue May  8 17:26:09 2007
From: foom at fuhm.net (James Y Knight)
Date: Tue, 8 May 2007 11:26:09 -0400
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
	<bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
Message-ID: <1B70083D-1CC2-44EA-A8F4-404F1E493271@fuhm.net>


On May 8, 2007, at 9:16 AM, Jason Orendorff wrote:

> Guido, if 3.x had an immutable bytes type, could 2to3 provide a
> better guarantee?  Namely, "Set your default encoding to None
> in your 2.x code today, and 2to3 will not introduce bugs around
> str/unicode."

You cannot set the default encoding to None (rather, "undefined") in  
2.x, without making half the stdlib completely unusable. So that's  
not really much of an option.

James

From guido at python.org  Tue May  8 17:37:44 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 May 2007 08:37:44 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <c56e219d0705080735g65f2412bu361d798ac403e538@mail.gmail.com>
References: <c56e219d0705080735g65f2412bu361d798ac403e538@mail.gmail.com>
Message-ID: <ca471dc20705080837h7e60a567ud3bcecb042dbb00d@mail.gmail.com>

On 5/8/07, Luis P Caamano <lcaamano at gmail.com> wrote:
> On 5/7/07, "Guido van Rossum" wrote:
> > Around '99 Greg Stein and Mark Hammond tried to get rid of the GIL.
> > They removed most of the global mutable data structures, added
> > explicit locks to the remaining ones and to individual mutable
> > objects, and actually got the whole thing working. Unfortunately even
> > on the system with the fastest locking primitives (Windows at the
> > time) they measured a 2x slow-down on a single CPU due to all the
> > extra locking operations going on.
>
> That just breaks my heart.<sigh>
>
> You gotta finish that sentence: it was a slowdown on a single CPU with
> a speed increase on two or more CPUs, leveling out at 4 CPUs or so.
>
> This was the same situation on every major OS kernel, including AIX,
> HPUX, Linux, Tru64, etc., when they started supporting SMP machines,
> which is why all of them at some time sported two kernels, one for SMP
> machines with the spinlock code and one for single processor machines
> with the spinlock code #ifdef'ed out.  For some, like IBM/AIX and
> HPUX, eventually and as expected, all their servers became MPs and
> then they stopped delivering the SP kernel.
>
> The same would've been true for the python interpreter, one for MP and
> one for SP, and eventually, even in the PC world, everything would be
> MP and the SP interpreter would disappear.

The difference is, for an OS kernel, there really isn't any other way
to benefit from multiple CPUs. But for Python, there is -- run
multiple processes instead of threads!

> People need to understand, though, that the GIL is not as bad as one
> would initially think, since most C extensions release the GIL and run
> concurrently on multiple CPUs.  It takes a bit of researching through
> old emails on the python list and a bit of time to really understand
> that.  Nevertheless, when the itch is bad enough, it'll get scratched.

I think you're overestimating the sophistication of the average
extension developer, and the hardware to which they have access.

Nevertheless, you're right the GIL is not as bad as you would
initially think: you just have to undo the brainwashing you got from
Windows and Java proponents who seem to consider threads as the only
way to approach concurrent activities.

Just because Java was once aimed at a set-top box OS that didn't
support multiple address spaces, and just because process creation in
Windows used to be slow as a dog, doesn't mean that multiple processes
(with judicious use of IPC) aren't a much better approach to writing
apps for multi-CPU boxes than threads.

Just Say No to the combined evils of locking, deadlocks, lock
granularity, livelocks, nondeterminism and race conditions.
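
(A toy illustration of the process-based approach, using the
multiprocessing API -- a later stdlib addition; the same idea works with
the third-party 'processing' package of this era:)

    import multiprocessing

    def work(n):
        # CPU-bound toy task; each call runs in its own process, so no
        # single interpreter's GIL serializes the workers.
        return sum(i * i for i in range(n))

    if __name__ == '__main__':
        pool = multiprocessing.Pool()          # one worker per CPU by default
        print(pool.map(work, [10**6] * 4))     # spreads across the CPUs
        pool.close()
        pool.join()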

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From theller at ctypes.org  Tue May  8 18:47:55 2007
From: theller at ctypes.org (Thomas Heller)
Date: Tue, 08 May 2007 18:47:55 +0200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <ca471dc20705080837h7e60a567ud3bcecb042dbb00d@mail.gmail.com>
References: <c56e219d0705080735g65f2412bu361d798ac403e538@mail.gmail.com>
	<ca471dc20705080837h7e60a567ud3bcecb042dbb00d@mail.gmail.com>
Message-ID: <f1q9jp$l8l$1@sea.gmane.org>

Guido van Rossum schrieb:
> On 5/8/07, Luis P Caamano <lcaamano at gmail.com> wrote:
>> On 5/7/07, "Guido van Rossum" wrote:
>> > Around '99 Greg Stein and Mark Hammond tried to get rid of the GIL.
>> > They removed most of the global mutable data structures, added
>> > explicit locks to the remaining ones and to individual mutable
>> > objects, and actually got the whole thing working. Unfortunately even
>> > on the system with the fastest locking primitives (Windows at the
>> > time) they measured a 2x slow-down on a single CPU due to all the
>> > extra locking operations going on.
>>
>> That just breaks my heart.<sigh>
>>
>> You gotta finish that sentence: it was a slowdown on a single CPU with
>> a speed increase on two or more CPUs, leveling out at 4 CPUs or so.
>>
>> This was the same situation on every major OS kernel, including AIX,
>> HPUX, Linux, Tru64, etc., when they started supporting SMP machines,
>> which is why all of them at some time sported two kernels, one for SMP
>> machines with the spinlock code and one for single processor machines
>> with the spinlock code #ifdef'ed out.  For some, like IBM/AIX and
>> HPUX, eventually and as expected, all their servers became MPs and
>> then they stopped delivering the SP kernel.
>>
>> The same would've been true for the python interpreter, one for MP and
>> one for SP, and eventually, even in the PC world, everything would be
>> MP and the SP interpreter would disappear.
> 
> The difference is, for an OS kernel, there really isn't any other way
> to benefit from multiple CPUs. But for Python, there is -- run
> multiple processes instead of threads!

Wouldn't multiple interpreters (assuming the problems with them would be fixed)
in the same process give the same benefit?  A separate GIL for each one?

Thomas


From guido at python.org  Tue May  8 19:09:46 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 May 2007 10:09:46 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <f1q9jp$l8l$1@sea.gmane.org>
References: <c56e219d0705080735g65f2412bu361d798ac403e538@mail.gmail.com>
	<ca471dc20705080837h7e60a567ud3bcecb042dbb00d@mail.gmail.com>
	<f1q9jp$l8l$1@sea.gmane.org>
Message-ID: <ca471dc20705081009k294fce71mf978cfd9eef20628@mail.gmail.com>

On 5/8/07, Thomas Heller <theller at ctypes.org> wrote:
> Wouldn't multiple interpreters (assuming the problems with them would be fixed)
> in the same process give the same benefit?  A separate GIL for each one?

No; numerous read-only and immutable objects (e.g. the small integers,
1-character strings, the empty tuple; and all built-in type objects)
are shared between all interpreters. Also, extensions can easily share
state between interpreters I believe.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Tue May  8 19:30:09 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 8 May 2007 10:30:09 -0700
Subject: [Python-3000] failing tests
In-Reply-To: <ee2a432c0705070104j7c71842dxf33307a1468cfeb5@mail.gmail.com>
References: <ee2a432c0705070104j7c71842dxf33307a1468cfeb5@mail.gmail.com>
Message-ID: <ee2a432c0705081030w9352c4cmd52723354d1c17ae@mail.gmail.com>

One more test is failing:

test test_fileio failed -- Traceback (most recent call last):
 File "/tmp/python-test-3.0/local/lib/python3.0/test/test_fileio.py",
line 128, in testAbles
   f = _fileio._FileIO("/dev/tty", "a")
IOError: [Errno 6] No such device or address: '/dev/tty'

This seems to only happen when there is no tty associated with a
terminal which happens when run from cron (among other situations).

n
--

On 5/7/07, Neal Norwitz <nnorwitz at gmail.com> wrote:
> There are 3* failing tests:
>     test_compiler test_doctest test_transformer
> * plus a few more when running on a 64-bit platform
>
> These failures occurred before and after xrange checkin.
>
> Do other people see these failures?  Any ideas when they started?
>
> The doctest failures are due to no space at the end of the line (print
> behavior change).  Not sure what to do about that now that we prevent
> blanks at the end of lines from being checked in. :-)
>
> n
>

From guido at python.org  Tue May  8 19:38:05 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 May 2007 10:38:05 -0700
Subject: [Python-3000] failing tests
In-Reply-To: <ee2a432c0705081030w9352c4cmd52723354d1c17ae@mail.gmail.com>
References: <ee2a432c0705070104j7c71842dxf33307a1468cfeb5@mail.gmail.com>
	<ee2a432c0705081030w9352c4cmd52723354d1c17ae@mail.gmail.com>
Message-ID: <ca471dc20705081038m3e04d475t7864969ef977b1a4@mail.gmail.com>

Should be fixed now.
Committed revision 55186.


On 5/8/07, Neal Norwitz <nnorwitz at gmail.com> wrote:
> One more test is failing:
>
> test test_fileio failed -- Traceback (most recent call last):
>  File "/tmp/python-test-3.0/local/lib/python3.0/test/test_fileio.py",
> line 128, in testAbles
>    f = _fileio._FileIO("/dev/tty", "a")
> IOError: [Errno 6] No such device or address: '/dev/tty'
>
> This seems to only happen when there is no tty associated with a
> terminal which happens when run from cron (among other situations).
>
> n
> --
>
> On 5/7/07, Neal Norwitz <nnorwitz at gmail.com> wrote:
> > There are 3* failing tests:
> >     test_compiler test_doctest test_transformer
> > * plus a few more when running on a 64-bit platform
> >
> > These failures occurred before and after xrange checkin.
> >
> > Do other people see these failures?  Any ideas when they started?
> >
> > The doctest failures are due to no space at the end of the line (print
> > behavior change).  Not sure what to do about that now that we prevent
> > blanks at the end of lines from being checked in. :-)
> >
> > n
> >


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From brett at python.org  Tue May  8 19:49:31 2007
From: brett at python.org (Brett Cannon)
Date: Tue, 8 May 2007 10:49:31 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <ca471dc20705081009k294fce71mf978cfd9eef20628@mail.gmail.com>
References: <c56e219d0705080735g65f2412bu361d798ac403e538@mail.gmail.com>
	<ca471dc20705080837h7e60a567ud3bcecb042dbb00d@mail.gmail.com>
	<f1q9jp$l8l$1@sea.gmane.org>
	<ca471dc20705081009k294fce71mf978cfd9eef20628@mail.gmail.com>
Message-ID: <bbaeab100705081049i39c3a73eube2809ddee51b9f4@mail.gmail.com>

On 5/8/07, Guido van Rossum <guido at python.org> wrote:
>
> On 5/8/07, Thomas Heller <theller at ctypes.org> wrote:
> > Wouldn't multiple interpreters (assuming the problems with them would be
> fixed)
> > in the same process give the same benefit?  A separate GIL for each one?
>
> No; numerous read-only and immutable objects (e.g. the small integers,
> 1-character strings, the empty tuple; and all built-in type objects)
> are shared between all interpreters. Also, extensions can easily share
> state between interpreters I believe.



All extensions share their state between interpreters.  The import machinery
literally caches the module dict for an extension and uses that to
reinitialize any new instances.

But Martin's PEP on module init helps to deal with this issue.

-Brett

From jimjjewett at gmail.com  Tue May  8 20:42:57 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 8 May 2007 14:42:57 -0400
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <ca471dc20705080825j6b665307o408569dd2a9fb55d@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
	<bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
	<fb6fbf560705080734s4c315bc3i38c5ccbe3fc9a9a3@mail.gmail.com>
	<ca471dc20705080825j6b665307o408569dd2a9fb55d@mail.gmail.com>
Message-ID: <fb6fbf560705081142j75dd696eybbefcf9e310b5a44@mail.gmail.com>

On 5/8/07, Guido van Rossum <guido at python.org> wrote:
> On 5/8/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > I will be unhappy if 2to3 produces code that I can't run in (at least)
> > 2.6, because then I would need to convert more than once.

> This is the first time I hear of this requirement. It has not so far
> been a design goal for the conversions in 2to3. The workflow that I
> have in mind (and that others have agreed to be workable) is more like
> this:

> 1. develop working code under 2.6
> 2. make sure it is warning-free with the special -Wpy3k option
> 3. use 2to3 to convert it to 3.0 compatible syntax in a temporary directory
> 4. run your unit test suite with 3.0
> 5. for any defects you find, EDIT THE 2.6 SOURCE AND GO BACK TO STEP 2

The problem is what to do after step 5 ...

Do you leave your 3 code in the awkward auto-generated format, and
suggest (by example) that py3 code is clunky?

Do you immediately stop supporting 2.x?

Or do you fork the code?

-jJ

From guido at python.org  Tue May  8 20:55:31 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 May 2007 11:55:31 -0700
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <fb6fbf560705081142j75dd696eybbefcf9e310b5a44@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
	<bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
	<fb6fbf560705080734s4c315bc3i38c5ccbe3fc9a9a3@mail.gmail.com>
	<ca471dc20705080825j6b665307o408569dd2a9fb55d@mail.gmail.com>
	<fb6fbf560705081142j75dd696eybbefcf9e310b5a44@mail.gmail.com>
Message-ID: <ca471dc20705081155u31b0b4d1y5868630398ca3206@mail.gmail.com>

On 5/8/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/8/07, Guido van Rossum <guido at python.org> wrote:
> > On 5/8/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > > I will be unhappy if 2to3 produces code that I can't run in (at least)
> > > 2.6, because then I would need to convert more than once.
>
> > This is the first time I hear of this requirement. It has not so far
> > been a design goal for the conversions in 2to3. The workflow that I
> > have in mind (and that others have agreed to be workable) is more like
> > this:
>
> > 1. develop working code under 2.6
> > 2. make sure it is warning-free with the special -Wpy3k option
> > 3. use 2to3 to convert it to 3.0 compatible syntax in a temporary directory
> > 4. run your unit test suite with 3.0
> > 5. for any defects you find, EDIT THE 2.6 SOURCE AND GO BACK TO STEP 2
>
> The problem is what to do after step 5 ...
>
> Do you leave your 3 code in the awkward auto-generated format, and
> suggest (by example) that py3 code is clunky?
>
> Do you immediately stop supporting 2.x?
>
> Or do you fork the code?

As long as you have to support 2.6, you keep developing for 2.6 and
cut distros from the converted code after they pass step 5. Once you
are comfortable with dropping support for 2.6 (or when 2.6 support can
be relegated to a maintenance branch) you can start developing using
the converted code.

I disagree that the converted code is awkward. Have you even tried the
2to3 tool yet?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jackdied at jackdied.com  Tue May  8 20:56:39 2007
From: jackdied at jackdied.com (Jack Diederich)
Date: Tue, 8 May 2007 14:56:39 -0400
Subject: [Python-3000] PEP 3129: Class Decorators
In-Reply-To: <ca471dc20705071112y103d7ea1v68711794718e3fbf@mail.gmail.com>
References: <43aa6ff70705071008q6a33e00eq7e5073dba5fa07e@mail.gmail.com>
	<ca471dc20705071112y103d7ea1v68711794718e3fbf@mail.gmail.com>
Message-ID: <20070508185639.GA5429@performancedrivers.com>

On Mon, May 07, 2007 at 11:12:40AM -0700, Guido van Rossum wrote:
> On 5/7/07, Collin Winter <collinw at gmail.com> wrote:
> > Can I go ahead and mark PEP 3129 as "accepted"?
> 
> Almost. I'm ok with it, but I think that to follow the procedure you
> ought to post the full text at least once on python-3000, so you can
> add the date to the "Post-History" header. In the mean time, I think
> it would be fine to start on the implementation!
> 

My implementation worked as of PyCon but has some conflicts with
stuff that has been checked in since.  I will have time next week
to get it working on the current 3k branch.

-Jack

From guido at python.org  Tue May  8 21:12:36 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 May 2007 12:12:36 -0700
Subject: [Python-3000] PEP 3129: Class Decorators
In-Reply-To: <20070508185639.GA5429@performancedrivers.com>
References: <43aa6ff70705071008q6a33e00eq7e5073dba5fa07e@mail.gmail.com>
	<ca471dc20705071112y103d7ea1v68711794718e3fbf@mail.gmail.com>
	<20070508185639.GA5429@performancedrivers.com>
Message-ID: <ca471dc20705081212i70c8f1a9p71e57753eccf99a9@mail.gmail.com>

Cool! Looking forward to it. Collin or someone else can help you get
it checked in if you don't have dev privs yet.

Given the lack of discussion following the posting of the PEP, let's accept it.

On 5/8/07, Jack Diederich <jackdied at jackdied.com> wrote:
> On Mon, May 07, 2007 at 11:12:40AM -0700, Guido van Rossum wrote:
> > On 5/7/07, Collin Winter <collinw at gmail.com> wrote:
> > > Can I go ahead and mark PEP 3129 as "accepted"?
> >
> > Almost. I'm ok with it, but I think that to follow the procedure you
> > ought to post the full text at least once on python-3000, so you can
> > add the date to the "Post-History" header. In the mean time, I think
> > it would be fine to start on the implementation!
> >
>
> My implementation worked as of PyCon but has some conflicts with
> stuff that has been checked in since.  I will have time next week
> to get it working on the current 3k branch.
>
> -Jack


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Tue May  8 21:25:15 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 08 May 2007 21:25:15 +0200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <f1q9jp$l8l$1@sea.gmane.org>
References: <c56e219d0705080735g65f2412bu361d798ac403e538@mail.gmail.com>	<ca471dc20705080837h7e60a567ud3bcecb042dbb00d@mail.gmail.com>
	<f1q9jp$l8l$1@sea.gmane.org>
Message-ID: <4640CE9B.9010907@v.loewis.de>

> Wouldn't multiple interpreters (assuming the problems with them would be fixed)
> in the same process give the same benefit?  A separate GIL for each one?

No. There is a global "current thread" variable that is protected by the
GIL (namely, _PyThreadState_Current). Without that, you would not even
know what the current interpreter is, so fixing all the other problems
with multiple interpreters won't help. You could try to save the current
thread reference into TLS, but, depending on the platform, that may be
expensive to access.

The "right" way would be to pass the current interpreter to all API
functions, the way Tcl does it. Indeed, Tcl's threading model is that
you have one interpreter per thread, and don't need any locking at
all (but you can't have multi-threaded Tcl scripts under that model).

However, even if you give multiple interpreters separate GILs, you
still won't see a speed-up on a multi-processor system if you have
a multi-threaded Python script: once one thread blocks on that
interpreter's GIL, that thread is also "wasted" for all other
interpreters, since the thread is hanging waiting for the GIL. To
fix that, you would also have to use separate threads for the
separate interpreters. When you do so, you might just as well start
separate OS processes.

Regards,
Martin

From collinw at gmail.com  Tue May  8 21:34:52 2007
From: collinw at gmail.com (Collin Winter)
Date: Tue, 8 May 2007 12:34:52 -0700
Subject: [Python-3000] PEP 3129: Class Decorators
In-Reply-To: <ca471dc20705081212i70c8f1a9p71e57753eccf99a9@mail.gmail.com>
References: <43aa6ff70705071008q6a33e00eq7e5073dba5fa07e@mail.gmail.com>
	<ca471dc20705071112y103d7ea1v68711794718e3fbf@mail.gmail.com>
	<20070508185639.GA5429@performancedrivers.com>
	<ca471dc20705081212i70c8f1a9p71e57753eccf99a9@mail.gmail.com>
Message-ID: <43aa6ff70705081234j2691202cj7c871c7c1b20f02d@mail.gmail.com>

On 5/8/07, Guido van Rossum <guido at python.org> wrote:
> Given the lack of discussion following the posting of the PEP, let's accept it.

Marked as accepted in r55190.

Collin Winter

> On 5/8/07, Jack Diederich <jackdied at jackdied.com> wrote:
> > On Mon, May 07, 2007 at 11:12:40AM -0700, Guido van Rossum wrote:
> > > On 5/7/07, Collin Winter <collinw at gmail.com> wrote:
> > > > Can I go ahead and mark PEP 3129 as "accepted"?
> > >
> > > Almost. I'm ok with it, but I think that to follow the procedure you
> > > ought to post the full text at least once on python-3000, so you can
> > > add the date to the "Post-History" header. In the mean time, I think
> > > it would be fine to start on the implementation!
> > >
> >
> > My implementation worked as of PyCon but has some conflicts with
> > stuff that has been checked in since.  I will have time next week
> > to get it working on the current 3k branch.
> >
> > -Jack
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)

From eucci.group at gmail.com  Tue May  8 23:52:48 2007
From: eucci.group at gmail.com (Jeff Shell)
Date: Tue, 8 May 2007 15:52:48 -0600
Subject: [Python-3000] ABC's, Roles, etc
Message-ID: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>

Hello. I just joined the list as the whole Abstract Base Class,
Interfaces, and Roles/Traits system is of significant interest to me.
I've tried to catch up on the discussion by reading through the
archives, but I'm sure I missed a few posts and I apologize if I'm
wasting time covering ground that's already been covered.

I have a lengthy post that dissects a major issue that I have with
ABCs and the Interface definition that I saw in PEP 3124:: it all
seems rigidly class and class-instance based. The cardinal sin I saw
in the Interface definition in PEP 3124 (at least, at the time I last
viewed it) was the inclusion of 'self' in a method spec.

It seems to me that Abstract Base Classes and even PEP 3124 are
primarily focused on classes. But in Python, "everything is an
object", but not everything is class-based.

Jim Fulton taught me a long time ago that there are numerous ways to
fulfill a role, or provide an interface. 'self' is an internal detail
of class-instance implementations. In my post, I show some (stupid)
implementations of the 'IStack' interface seen in PEP 3124, only one
of which is the traditional class - instance based style.

http://griddlenoise.blogspot.com/2007/05/abc-may-be-easy-as-123-but-it-cant-beat.html

The rest of this post focuses on what `zope.interface` already
provides - a system for specifying behavior and declaring support at
both the class and object level - and 'object' really means 'object',
which includes modules. You're more than welcome to tune out now. My
main focus is on determining what Abstract Base Classes and/or PEP
3124's Interfaces do better than `zope.interface` (if anyone else is
familiar with that package). I've found great success using
`zope.interface` to satisfy many of the requirements and issues that
these systems may try to solve, and more. In fact, `zope.interface` is
closer to Roles/Traits than anything else.

# .....

I wanted to chime in here and say that `zope.interface` (from Zope 3,
but available separately) is an existing implementation that comes
quite close to what Collin Winter proposed. Even in some of its
spellings.

http://cheeseshop.python.org/pypi/zope.interface/3.3.0.1

The main thing is that `zope.interface` focuses declaration on the
object - NOT the class. You do not use `self` in interface
specifications.

Terms I've grown fond of while using `zope.interface` are "specifies",
"provides", and "implements".

An Interface **specifies** desired *object behavior* - basically it's the API::

    class IAuthVerification(Interface):
        def verify(invoice_number, amount):
            """
            Returns an IAuthResult containing status information about
            success or failure.
            """

An *object* **provides** that behavior::

    >>> IAuthVerification.providedBy(authorizer)
    True
    >>> result = authorizer.verify(invoice_number='KB125', amount=43.40)

Now, a class may **implement** that behavior, which is a way of saying
that "instances of this class will provide the behavior":

    class AuthNet(object):
        def verify(self, invoice_number, amount):
            """ ... (class - instance based implementation) """
    classImplements(AuthNet, IAuthVerification)

    >>> IAuthVerification.providedBy(AuthNet)
    False
    >>> AuthNet.verify(invoice_number='KB125', amount=43.40)
    <UnboundMethod Exception>

Alternatively, class or static methods could be used:

    class StaticAuthNet(object):
        @staticmethod
        def verify(invoice_number, amount):
            """ ... """
    alsoProvides(StaticAuthNet, IAuthVerification)

    >>> IAuthVerification.providedBy(StaticAuthNet)
    True
    >>> result = StaticAuthNet.verify(invoice_number='KB125', amount=43.40)

Or a module could even provide the interfaces. In the first example
above (under 'an object **provides** that behavior'), do you know
whether 'authorizer' is an instance, class, or module? Hell, maybe
it's a function that has 'verify' added as an attribute. It doesn't
matter - it fills the 'IAuthVerification' role.
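
A module version might look like this (the module name and the import
path are invented for the example):

    # auth_module.py -- an invented module that fills the role itself
    from zope.interface import moduleProvides
    from myproject.interfaces import IAuthVerification   # invented path

    moduleProvides(IAuthVerification)

    def verify(invoice_number, amount):
        """ ... (module-level implementation) """

    >>> import auth_module
    >>> IAuthVerification.providedBy(auth_module)
    True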

In my blog post, I also show a dynamically constructed object
providing an interface's specified behavior. An instance of an empty
class is made, and then methods and other supporting attributes are
attached to this specific instance only. Real world examples of this
include Zope 2, where a folder may have "Python Scripts" or other
callable members that, in effect, make for a totally custom object. It
can also provide this same behavior (in fact, I was able to take
advantage of this on some old old old Zope 2 projects that started in
the web environment and transitioned to regular Python
modules/classes).
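
In miniature, that dynamically-built object looks something like this
(the Empty class is just a throwaway shell for the example):

    from zope.interface import alsoProvides

    class Empty(object):
        pass

    authorizer = Empty()

    def verify(invoice_number, amount):
        """ ... (attached to this one instance only) """

    authorizer.verify = verify
    alsoProvides(authorizer, IAuthVerification)

    >>> IAuthVerification.providedBy(authorizer)
    True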

In any case, there are numerous ways to fulfill a role. I think any
system that was limited to classes and involved 'issubclass' and
'isinstance' trickery would be limiting or confusing if it started to
be used to describe behaviors of modules, one-off objects, and so on.

-- 
Jeff Shell

From greg.ewing at canterbury.ac.nz  Wed May  9 02:46:14 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 09 May 2007 12:46:14 +1200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <c56e219d0705080735g65f2412bu361d798ac403e538@mail.gmail.com>
References: <c56e219d0705080735g65f2412bu361d798ac403e538@mail.gmail.com>
Message-ID: <464119D6.6060303@canterbury.ac.nz>

Luis P Caamano wrote:

> You gotta finish that sentence, it was a slow down on single CPU with
> a speed increase with two or more CPUs, leveling out at 4 CPUs or so.

But it's still going to slow down all code that
doesn't use threads. I don't want to be *forced*
to use threads to get decent speed from my programs!

--
Greg

From exarkun at divmod.com  Wed May  9 03:43:49 2007
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Tue, 8 May 2007 21:43:49 -0400
Subject: [Python-3000] the future of the GIL
In-Reply-To: <464119D6.6060303@canterbury.ac.nz>
Message-ID: <20070509014349.19381.246190631.divmod.quotient.10259@ohm>

On Wed, 09 May 2007 12:46:14 +1200, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>Luis P Caamano wrote:
>
>> You gotta finish that sentence, it was a slow down on single CPU with
>> a speed increase with two or more CPUs, leveling out at 4 CPUs or so.
>
>But it's still going to slow down all code that
>doesn't use threads. I don't want to be *forced*
>to use threads to get decent speed from my programs!
>

It would also make Python applications a much greater drag on the
system as a whole, as they would need to use four whole CPUs to
just break even on a multithreaded compute-intensive task.  Even if
this can be improved over time (which it probably can be, to some
extent, given sufficient effort), one might want to consider the
consequences of having any widespread usage of such a resource-
intensive Python interpreter on general perception of Python as a
language.

Having a GIL-free build of CPython alongside a GIL-having build
does something to alleviate this, but it's not immediately clear what
the development and maintenance burden of this would be, nor whether
the resulting user experience would be desirable (which build would be
packaged by default, what would become of "#!/usr/bin/env python", etc.).  It
may be doable, but it doesn't strike me as an obviously good idea.

Jean-Paul

From pje at telecommunity.com  Wed May  9 03:57:37 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 08 May 2007 21:57:37 -0400
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
Message-ID: <20070509015553.9C6843A4061@sparrow.telecommunity.com>

At 03:52 PM 5/8/2007 -0600, Jeff Shell wrote:
>I have a lengthy post that dissects a major issue that I have with
>ABCs and the Interface definition that I saw in PEP 3124:: it all
>seems rigidly class and class-instance based.

Hi Jeff; I read your post a few days ago, but your blog doesn't 
support comments, so I've been "getting around to" writing a 
counterpoint on my blog.  But, now I can do it here instead.  :)


>The cardinal sin I saw
>in the Interface definition in PEP 3124 (at least, at the time I last
>viewed it) was the inclusion of 'self' in a method spec.

That's because you're confusing a generic function and a "method 
spec".  PEP 3124 interfaces are not *specifications*; they're 
namespaces for generic functions.  They are much closer in nature to 
ABCs than they are zope.interface-style Interfaces.  The principal 
thing they have in common with zope.interface (aside from the name) 
is the support for "IFoo(ob)"-style adaptation.

Very few of zope.interface's design goals are shared by PEP 
3124.  Notably, they are not particularly good for type checking or 
verification.  In PEP 3124, for example, IFoo(ob) *always* returns an 
object with the specified attributes; the only way to know whether 
they are actually implemented is to try using them.

Notice that this is diametrically opposed to what zope.interface 
wants to do in such situations -- which is why the PEP makes such a 
big deal about it being possible to use zope.interface *instead*.

That is, my own observation is that different frameworks sometimes 
need different kinds of interfaces.  For example, someone might 
create a framework that wants to verify preconditions and 
postconditions of methods in an interface, rather than merely 
specifying their names and arguments!  Using zope.interface as an 
exclusive basis for interface definition and type annotations would 
block innovation in this area.

PEP 3124 interfaces are therefore explicitly intended to be merely 
one *possible* kind of interface, rather than a be-all end-all 
interface system.  They have many differences from zope.interface, 
which, depending on your goals, may be a plus or minus.  But you 
certainly aren't obligated to *use* them.  PEP 3124 merely proposes a 
framework for how to use interfaces for method overloading, generic 
functions, and AOP.

From the PEP itself:

"""For example, it should be possible
to use a ``zope.interface`` interface object to specify the desired
type of a function argument, as long as the ``zope.interface`` package
registered itself correctly (or a third party did the registration).

In this way, the proposed API simply offers a uniform way of accessing
the functionality within its scope, rather than prescribing a single
implementation to be used for all libraries, frameworks, and
applications."""


>... 'self' is an internal detail of class-instance implementations.

Again - this is because you're assuming the purpose of a PEP 3124 
interface is to *specify* an interface, when in fact it's much more 
like an ABC, which may also *implement* the interface.  The 
specification and implementation are intentionally unified here.

Of course, again, you will be able to use zope.interfaces as argument 
annotations to @overload-ed functions and methods, which is the point 
of the PEP.  Its "Interface" class is merely a suggested default, 
leaving the fancier tricks to established packages, in the same way 
that its generic function implementation will not do everything that 
RuleDispatch or PEAK-Rules can do.  It's supposed to be a core 
framework for such add-on packages, not a replacement for them.


>It seems to me that Abstract Base Classes and even PEP 3124 are
>primarily focused on classes. But in Python, "everything is an
>object", but not everything is class-based.
>...
>The rest of this post focuses on what `zope.interface` already
>provides - a system for specifying behavior and declaring support at
>both the class and object level - and 'object' really means 'object',
>which includes modules.

Right -- and *neither* "specifying behavior" nor "declaring support" 
are goals of PEP 3124; they're entirely out of its scope.  The 
Interface object's purpose is to support uniform access to, and 
implementation of, individual operations.  These are somewhat 
parallel concepts, but very different in thrust; zope.interface is 
LBYL, while PEP 3124 is EAFP all the way.

As a result, PEP 3124 chooses to punt on the issue of individual 
objects.  It's quite possible within the framework to allow 
instance-level checks during dispatching, but it's not going to be in 
the default engine (which is based on type-tuple dispatching; see 
Guido's Py3K overloading prototype).

zope.interface (and zope.component, IIRC) pay a high price in 
complexity for allowing interfaces to be per-instance rather than 
type-defined.  I implemented the same feature in PyProtocols, but 
over the years I rarely found it useful.  My understanding of its 
usefulness in Zope is that it:

1. supports specification and testing of module-level Zope APIs
2. allows views and other wrapping operations to be selected on a dynamic basis

Since #1 falls outside of PEP 3124's goals (i.e., it's not about 
specification or testing), that leaves use case #2.  In my 
experience, it has been more than sufficient to simply give these 
objects some *other* interface, such as an IViewTags interface with a
method to query these dynamic "tag" interfaces.  In other words, my 
experience and opinion supports the view that use case #2 is actually 
a coincidental abuse of interfaces for convenience, rather than the 
"one obvious way" to handle the use case.

To put it another way, if you define getView() as a generic function, 
you can always define a dynamic implementation of it for any type 
that you wish to have dynamic view selection capability.  Then, only 
those cases that require a complex solution have to pay for the complexity.
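
To make the shape concrete, here's a very rough sketch using
functools.singledispatch as a stand-in for a real generic-function
engine (the registry and the TaggedContent type are invented for the
illustration):

    from functools import singledispatch

    default_view_registry = {}          # invented: (type, name) -> view

    class TaggedContent(object):        # invented type that wants dynamism
        def __init__(self):
            self.dynamic_views = {}

    @singledispatch
    def get_view(ob, name):
        # ordinary case: static, type-based lookup only
        return default_view_registry[type(ob), name]

    @get_view.register(TaggedContent)
    def _(ob, name):
        # only this type pays for per-instance view selection
        if name in ob.dynamic_views:
            return ob.dynamic_views[name]
        return default_view_registry[TaggedContent, name]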

So, that's my rationale for why PEP 3124 doesn't provide any 
instance-based features out of the box; outside of API specs for 
singleton objects, the need for them is mostly an illusion created by 
Zope 3's dynamic view selection hack.


>My main focus is on determining what Abstract Base Classes and/or PEP
>3124's Interfaces do better than `zope.interface` (if anyone else is
>familiar with that package).

I at least am quite familiar with it, having helped to define some of 
its terminology and API, as well as being the original author of its 
class-decorator emulation for Python versions 2.2 and up.  :)  I also 
argued for its adoption of PEP 246, and wrote PyProtocols to unify 
Twisted and Zope interfaces in a PEP 246-based adaptation framework.

And what PEP 3124 does much better than zope.interface or even PyProtocols is:

1. Adaptation, especially incomplete adaptation.  You can implement 
only the methods that are actually needed for your use case.  If the 
interface includes generic implementations that are defined in terms 
of other methods in the interface, you need not reimplement 
them.  (Note: I'm well aware that here my definition of "better" 
would be considered "worse" by Jim Fulton, since zope.interface is 
LBYL-oriented.  However, for many users and use cases, EAFP *is* 
better, even if it's not for Zope.)

2. Interface recombination.  AFAIK, zope.interface doesn't support 
subset interfaces like PyProtocols does.  Neither zope.interface nor 
PyProtocols support method renaming, where two interfaces have a 
method with the same specification but different method names.

3. Low mental overhead.  PEP 3124 doesn't even *need* interfaces; 
simple use cases can just use overloaded functions and be on about 
their business.  Use cases that would require an interface and half a 
dozen adapter classes in zope.interface can be met by simply creating 
an overloaded function and adding methods.  And the resulting code 
reads like code in other languages that support overloading or 
generic functions, rather than reading like Java.


>In my blog post, I also show a dynamically constructed object
>providing an interface's specified behavior. An instance of an empty
>class is made, and then methods and other supporting attributes are
>attached to this specific instance only. Real world examples of this
>include Zope 2, where a folder may have "Python Scripts" or other
>callable members that, in effect, make for a totally custom object. It
>can also provide this same behavior (in fact, I was able to take
>advantage of this on some old old old Zope 2 projects that started in
>the web environment and transitioned to regular Python
>modules/classes).

And how often does this happen outside of Zope?  As I said, I rarely 
found it to be the case anywhere else.  I replicated the ability in 
PyProtocols because I was biased by my prior Zope experience, but 
once I got outside of Zope it almost entirely ceased to be useful.

Meanwhile, as I said, PEP 3124 is not closed to extension.  It's 
specifically intended that zope.interface (and any other interface 
packages that might arise in future) should be able to play as 
first-class citizens in the proposed API.  However, depending on the 
specific features desired, those packages might have some additional 
integration work to do.

(Note, by the way, that zope.interface is explicitly mentioned three 
times in the PEP, as an example of how other interface types should 
be able to be used for overloading, as long as they register 
appropriate methods with the provided framework.)


From monpublic at gmail.com  Wed May  9 03:58:13 2007
From: monpublic at gmail.com (Chris Monson)
Date: Tue, 8 May 2007 21:58:13 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
Message-ID: <da3f900e0705081858rc5fabfbheb9d520c5d007bc6@mail.gmail.com>

On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>
> This is just the first draft (also checked into SVN), and doesn't include
> the details of how the extension API works (so that third-party interfaces
> and generic functions can interoperate using the same decorators,
> annotations, etc.).
>
> Comments and questions appreciated, as it'll help drive better
> explanations
> of both the design and rationales.  I'm usually not that good at guessing
> what other people will want to know (or are likely to misunderstand) until
> I get actual questions.
>
>
> PEP: 3124
> Title: Overloading, Generic Functions, Interfaces, and Adaptation
> Version: $Revision: 55029 $
> Last-Modified: $Date: 2007-04-30 18:48:06 -0400 (Mon, 30 Apr 2007) $
> Author: Phillip J. Eby <pje at telecommunity.com>
> Discussions-To: Python 3000 List <python-3000 at python.org>
> Status: Draft
> Type: Standards Track
> Requires: 3107, 3115, 3119
> Replaces: 245, 246
> Content-Type: text/x-rst
> Created: 28-Apr-2007
> Post-History: 30-Apr-2007


[snip]


>
> "Before" and "After" Methods
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> In addition to the simple next-method chaining shown above, it is
> sometimes useful to have other ways of combining methods.  For
> example, the "observer pattern" can sometimes be implemented by adding
> extra methods to a function, that execute before or after the normal
> implementation.
>
> To support these use cases, the ``overloading`` module will supply
> ``@before``, ``@after``, and ``@around`` decorators, that roughly
> correspond to the same types of methods in the Common Lisp Object
> System (CLOS), or the corresponding "advice" types in AspectJ.
>
> Like ``@when``, all of these decorators must be passed the function to
> be overloaded, and can optionally accept a predicate as well::
>
>      def begin_transaction(db):
>          print "Beginning the actual transaction"
>
>
>      @before(begin_transaction)
>      def check_single_access(db: SingletonDB):
>          if db.inuse:
>              raise TransactionError("Database already in use")
>
>      @after(begin_transaction)
>      def start_logging(db: LoggableDB):
>          db.set_log_level(VERBOSE)



If we are looking at doing Design By Contract using @before and @after
(preconditions and postconditions), shouldn't there be some way of getting
at the return value in functions decorated with @after?  For example, it
seems reasonable to require an extra argument, perhaps at the beginning:

def successor(num):
  return num + 1

@before(successor)
def check_positive(num: int):
  if num < 0:
    raise PreconditionError("Positive integer inputs required")

@after(successor)
def check_successor(returned, num:int):
  if returned != num + 1:
    raise PostconditionError("successor failed to do its job")

Or am I missing something about how @after works?

+1, BTW, on this whole idea.

- C


> ``@before`` and ``@after`` methods are invoked either before or after
> the main function body, and are *never considered ambiguous*.  That
> is, it will not cause any errors to have multiple "before" or "after"
> methods with identical or overlapping signatures.  Ambiguities are
> resolved using the order in which the methods were added to the
> target function.
>
> "Before" methods are invoked most-specific method first, with
> ambiguous methods being executed in the order they were added.  All
> "before" methods are called before any of the function's "primary"
> methods (i.e. normal ``@overload`` methods) are executed.
>
> "After" methods are invoked in the *reverse* order, after all of the
> function's "primary" methods are executed.  That is, they are executed
> least-specific methods first, with ambiguous methods being executed in
> the reverse of the order in which they were added.
>
> The return values of both "before" and "after" methods are ignored,
> and any uncaught exceptions raised by *any* methods (primary or other)
> immediately end the dispatching process.  "Before" and "after" methods
> cannot have ``__proceed__`` arguments, as they are not responsible
> for calling any other methods.  They are simply called as a
> notification before or after the primary methods.
>
> Thus, "before" and "after" methods can be used to check or establish
> preconditions (e.g. by raising an error if the conditions aren't met)
> or to ensure postconditions, without needing to duplicate any existing
> functionality.
>
>
> "Around" Methods
> ~~~~~~~~~~~~~~~~
>
> The ``@around`` decorator declares a method as an "around" method.
> "Around" methods are much like primary methods, except that the
> least-specific "around" method has higher precedence than the
> most-specific "before" method.
>
> Unlike "before" and "after" methods, however, "Around" methods *are*
> responsible for calling their ``__proceed__`` argument, in order to
> continue the invocation process.  "Around" methods are usually used
> to transform input arguments or return values, or to wrap specific
> cases with special error handling or try/finally conditions, e.g.::
>
>      @around(commit_transaction)
>      def lock_while_committing(__proceed__, db: SingletonDB):
>          with db.global_lock:
>              return __proceed__(db)
>
> They can also be used to replace the normal handling for a specific
> case, by *not* invoking the ``__proceed__`` function.
>
> The ``__proceed__`` given to an "around" method will either be the
> next applicable "around" method, a ``DispatchError`` instance,
> or a synthetic method object that will call all the "before" methods,
> followed by the primary method chain, followed by all the "after"
> methods, and return the result from the primary method chain.
>
> Thus, just as with normal methods, ``__proceed__`` can be checked for
> ``DispatchError``-ness, or simply invoked.  The "around" method should
> return the value returned by ``__proceed__``, unless of course it
> wishes to modify or replace it with a different return value for the
> function as a whole.
>
>
> Custom Combinations
> ~~~~~~~~~~~~~~~~~~~
>
> The decorators described above (``@overload``, ``@when``, ``@before``,
> ``@after``, and ``@around``) collectively implement what in CLOS is
> called the "standard method combination" -- the most common patterns
> used in combining methods.
>
> Sometimes, however, an application or library may have use for a more
> sophisticated type of method combination.  For example, if you
> would like to have "discount" methods that return a percentage off,
> to be subtracted from the value returned by the primary method(s),
> you might write something like this::
>
>      from overloading import always_overrides, merge_by_default
>      from overloading import Around, Before, After, Method, MethodList
>
>      class Discount(MethodList):
>          """Apply return values as discounts"""
>
>          def __call__(self, *args, **kw):
>              retval = self.tail(*args, **kw)
>              for sig, body in self.sorted():
>                  retval -= retval * body(*args, **kw)
>              return retval
>
>      # merge discounts by priority
>      merge_by_default(Discount)
>
>      # discounts have precedence over before/after/primary methods
>      always_overrides(Discount, Before)
>      always_overrides(Discount, After)
>      always_overrides(Discount, Method)
>
>      # but not over "around" methods
>      always_overrides(Around, Discount)
>
>      # Make a decorator called "discount" that works just like the
>      # standard decorators...
>      discount = Discount.make_decorator('discount')
>
>      # and now let's use it...
>      def price(product):
>          return product.list_price
>
>      @discount(price)
>      def ten_percent_off_shoes(product: Shoe):
>          return Decimal('0.1')
>
> Similar techniques can be used to implement a wide variety of
> CLOS-style method qualifiers and combination rules.  The process of
> creating custom method combination objects and their corresponding
> decorators is described in more detail under the `Extension API`_
> section.
>
> Note, by the way, that the ``@discount`` decorator shown will work
> correctly with any new predicates defined by other code.  For example,
> if ``zope.interface`` were to register its interface types to work
> correctly as argument annotations, you would be able to specify
> discounts on the basis of its interface types, not just classes or
> ``overloading``-defined interface types.
>
> Similarly, if a library like RuleDispatch or PEAK-Rules were to
> register an appropriate predicate implementation and dispatch engine,
> one would then be able to use those predicates for discounts as well,
> e.g.::
>
>      from somewhere import Pred  # some predicate implementation
>
>      @discount(
>          price,
>          Pred("isinstance(product,Shoe) and"
>               " product.material.name=='Blue Suede'")
>      )
>      def forty_off_blue_suede_shoes(product):
>          return Decimal('0.4')
>
> The process of defining custom predicate types and dispatching engines
> is also described in more detail under the `Extension API`_ section.
>
>
> Overloading Inside Classes
> --------------------------
>
> All of the decorators above have a special additional behavior when
> they are directly invoked within a class body: the first parameter
> (other than ``__proceed__``, if present) of the decorated function
> will be treated as though it had an annotation equal to the class
> in which it was defined.
>
> That is, this code::
>
>      class And(object):
>          # ...
>          @when(get_conjuncts)
>          def __conjuncts(self):
>              return self.conjuncts
>
> produces the same effect as this (apart from the existence of a
> private method)::
>
>      class And(object):
>          # ...
>
>      @when(get_conjuncts)
>      def get_conjuncts_of_and(ob: And):
>          return ob.conjuncts
>
> This behavior is both a convenience enhancement when defining lots of
> methods, and a requirement for safely distinguishing multi-argument
> overloads in subclasses.  Consider, for example, the following code::
>
>      class A(object):
>          def foo(self, ob):
>              print "got an object"
>
>          @overload
>          def foo(__proceed__, self, ob:Iterable):
>              print "it's iterable!"
>              return __proceed__(self, ob)
>
>
>      class B(A):
>          foo = A.foo     # foo must be defined in local namespace
>
>          @overload
>          def foo(__proceed__, self, ob:Iterable):
>              print "B got an iterable!"
>              return __proceed__(self, ob)
>
> Due to the implicit class rule, calling ``B().foo([])`` will print
> "B got an iterable!" followed by "it's iterable!", and finally,
> "got an object", while ``A().foo([])`` would print only the messages
> defined in ``A``.
>
> Conversely, without the implicit class rule, the two "Iterable"
> methods would have the exact same applicability conditions, so calling
> either ``A().foo([])`` or ``B().foo([])`` would result in an
> ``AmbiguousMethods`` error.
>
> It is currently an open issue to determine the best way to implement
> this rule in Python 3.0.  Under Python 2.x, a class' metaclass was
> not chosen until the end of the class body, which means that
> decorators could insert a custom metaclass to do processing of this
> sort.  (This is how RuleDispatch, for example, implements the implicit
> class rule.)
>
> PEP 3115, however, requires that a class' metaclass be determined
> *before* the class body has executed, making it impossible to use this
> technique for class decoration any more.
>
> At this writing, discussion on this issue is ongoing.
>
>
> Interfaces and Adaptation
> -------------------------
>
> The ``overloading`` module provides a simple implementation of
> interfaces and adaptation.  The following example defines an
> ``IStack`` interface, and declares that ``list`` objects support it::
>
>      from overloading import abstract, Interface
>
>      class IStack(Interface):
>          @abstract
>          def push(self, ob):
>              """Push 'ob' onto the stack"""
>
>          @abstract
>          def pop(self):
>              """Pop a value and return it"""
>
>
>      when(IStack.push, (list, object))(list.append)
>      when(IStack.pop, (list,))(list.pop)
>
>      mylist = []
>      mystack = IStack(mylist)
>      mystack.push(42)
>      assert mystack.pop()==42
>
> The ``Interface`` class is a kind of "universal adapter".  It accepts
> a single argument: an object to adapt.  It then binds all its methods
> to the target object, in place of itself.  Thus, calling
> ``mystack.push(42)`` is the same as calling
> ``IStack.push(mylist, 42)``.
>
> The ``@abstract`` decorator marks a function as being abstract: i.e.,
> having no implementation.  If an ``@abstract`` function is called,
> it raises ``NoApplicableMethods``.  To become executable, overloaded
> methods must be added using the techniques previously described. (That
> is, methods can be added using ``@when``, ``@before``, ``@after``,
> ``@around``, or any custom method combination decorators.)
>
> In the example above, the ``list.append`` method is added as a method
> for ``IStack.push()`` when its arguments are a list and an arbitrary
> object.  Thus, ``IStack.push(mylist, 42)`` is translated to
> ``list.append(mylist, 42)``, thereby implementing the desired
> operation.
>
> (Note: the ``@abstract`` decorator is not limited to use in interface
> definitions; it can be used anywhere that you wish to create an
> "empty" generic function that initially has no methods.  In
> particular, it need not be used inside a class.)
>
>
> Subclassing and Re-assembly
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Interfaces can be subclassed::
>
>      class ISizedStack(IStack):
>          @abstract
>          def __len__(self):
>              """Return the number of items on the stack"""
>
>      # define __len__ support for ISizedStack
>      when(ISizedStack.__len__, (list,))(list.__len__)
>
> Or assembled by combining functions from existing interfaces::
>
>      class Sizable(Interface):
>          __len__ = ISizedStack.__len__
>
>      # list now implements Sizable as well as ISizedStack, without
>      # making any new declarations!
>
> A class can be considered to "adapt to" an interface at a given
> point in time, if no method defined in the interface is guaranteed to
> raise a ``NoApplicableMethods`` error if invoked on an instance of
> that class at that point in time.
>
> In normal usage, however, it is "easier to ask forgiveness than
> permission".  That is, it is easier to simply use an interface on
> an object by adapting it to the interface (e.g. ``IStack(mylist)``)
> or invoking interface methods directly (e.g. ``IStack.push(mylist,
> 42)``), than to try to figure out whether the object is adaptable to
> (or directly implements) the interface.
>
>
> Implementing an Interface in a Class
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> It is possible to declare that a class directly implements an
> interface, using the ``declare_implementation()`` function::
>
>      from overloading import declare_implementation
>
>      class Stack(object):
>          def __init__(self):
>              self.data = []
>          def push(self, ob):
>              self.data.append(ob)
>          def pop(self):
>              return self.data.pop()
>
>      declare_implementation(IStack, Stack)
>
> The ``declare_implementation()`` call above is roughly equivalent to
> the following steps::
>
>      when(IStack.push, (Stack,object))(lambda self, ob: self.push(ob))
>      when(IStack.pop, (Stack,))(lambda self: self.pop())
>
> That is, calling ``IStack.push()`` or ``IStack.pop()`` on an instance
> of any subclass of ``Stack``, will simply delegate to the actual
> ``push()`` or ``pop()`` methods thereof.
>
> For the sake of efficiency, calling ``IStack(s)`` where ``s`` is an
> instance of ``Stack``, **may** return ``s`` rather than an ``IStack``
> adapter.  (Note that calling ``IStack(x)`` where ``x`` is already an
> ``IStack`` adapter will always return ``x`` unchanged; this is an
> additional optimization allowed in cases where the adaptee is known
> to *directly* implement the interface, without adaptation.)
>
> For convenience, it may be useful to declare implementations in the
> class header, e.g.::
>
>      class Stack(metaclass=Implementer, implements=IStack):
>          ...
>
> Instead of calling ``declare_implementation()`` after the end of the
> suite.
>
>
> Interfaces as Type Specifiers
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ``Interface`` subclasses can be used as argument annotations to
> indicate what type of objects are acceptable to an overload, e.g.::
>
>      @overload
>      def traverse(g: IGraph, s: IStack):
>          g = IGraph(g)
>          s = IStack(s)
>          # etc....
>
> Note, however, that the actual arguments are *not* changed or adapted
> in any way by the mere use of an interface as a type specifier.  You
> must explicitly cast the objects to the appropriate interface, as
> shown above.
>
> Note, however, that other patterns of interface use are possible.
> For example, other interface implementations might not support
> adaptation, or might require that function arguments already be
> adapted to the specified interface.  So the exact semantics of using
> an interface as a type specifier are dependent on the interface
> objects you actually use.
>
> For the interface objects defined by this PEP, however, the semantics
> are as described above.  An interface I1 is considered "more specific"
> than another interface I2, if the set of descriptors in I1's
> inheritance hierarchy are a proper superset of the descriptors in I2's
> inheritance hierarchy.
>
> So, for example, ``ISizedStack`` is more specific than both
> ``ISizable`` and ``IStack``, irrespective of the inheritance
> relationships between these interfaces.  It is purely a question of
> what operations are included within those interfaces -- and the
> *names* of the operations are unimportant.
>
> Interfaces (at least the ones provided by ``overloading``) are always
> considered less-specific than concrete classes.  Other interface
> implementations can decide on their own specificity rules, both
> between interfaces and other interfaces, and between interfaces and
> classes.
>
>
> Non-Method Attributes in Interfaces
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> The ``Interface`` implementation actually treats all attributes and
> methods (i.e. descriptors) in the same way: their ``__get__`` (and
> ``__set__`` and ``__delete__``, if present) methods are called with
> the wrapped (adapted) object as "self".  For functions, this has the
> effect of creating a bound method linking the generic function to the
> wrapped object.
>
> For non-function attributes, it may be easiest to specify them using
> the ``property`` built-in, and the corresponding ``fget``, ``fset``,
> and ``fdel`` attributes::
>
>      class ILength(Interface):
>          @property
>          @abstract
>          def length(self):
>              """Read-only length attribute"""
>
>      # ILength(aList).length == list.__len__(aList)
>      when(ILength.length.fget, (list,))(list.__len__)
>
> Alternatively, methods such as ``_get_foo()`` and ``_set_foo()``
> may be defined as part of the interface, and the property defined
> in terms of those methods, but this is a bit more difficult for users
> to implement correctly when creating a class that directly implements
> the interface, as they would then need to match all the individual
> method names, not just the name of the property or attribute.
>
>
> Aspects
> -------
>
> The adaptation system provided assumes that adapters are "stateless",
> which is to say that adapters have no attributes or storage apart from
> those of the adapted object.  This follows the "typeclass/instance"
> model of Haskell, and the concept of "pure" (i.e., transitively
> composable) adapters.
>
> However, there are occasionally cases where, to provide a complete
> implementation of some interface, some sort of additional state is
> required.
>
> One possibility of course, would be to attach monkeypatched "private"
> attributes to the adaptee.  But this is subject to name collisions,
> and complicates the process of initialization.  It also doesn't work
> on objects that don't have a ``__dict__`` attribute.
>
> So the ``Aspect`` class is provided to make it easy to attach extra
> information to objects that either:
>
> 1. have a ``__dict__`` attribute (so aspect instances can be stored
>     in it, keyed by aspect class),
>
> 2. support weak referencing (so aspect instances can be managed using
>     a global but thread-safe weak-reference dictionary), or
>
> 3. implement or can be adapted to the ``overloading.IAspectOwner``
>     interface (technically, #1 or #2 imply this)
>
> Subclassing ``Aspect`` creates an adapter class whose state is tied
> to the life of the adapted object.
>
> For example, suppose you would like to count all the times a certain
> method is called on instances of ``Target`` (a classic AOP example).
> You might do something like::
>
>      from overloading import Aspect
>
>      class Count(Aspect):
>          count = 0
>
>      @after(Target.some_method)
>      def count_after_call(self, *args, **kw):
>          Count(self).count += 1
>
> The above code will keep track of the number of times that
> ``Target.some_method()`` is successfully called (i.e., it will not
> count errors).  Other code can then access the count using
> ``Count(someTarget).count``.
>
> ``Aspect`` instances can of course have ``__init__`` methods, to
> initialize any data structures.  They can use either ``__slots__``
> or dictionary-based attributes for storage.
>
> While this facility is rather primitive compared to a full-featured
> AOP tool like AspectJ, persons who wish to build pointcut libraries
> or other AspectJ-like features can certainly use ``Aspect`` objects
> and method-combination decorators as a base for more expressive AOP
> tools.
>
> XXX spec out full aspect API, including keys, N-to-1 aspects, manual
>      attach/detach/delete of aspect instances, and the ``IAspectOwner``
>      interface.
>
>
> Extension API
> =============
>
> TODO: explain how all of these work
>
> implies(o1, o2)
>
> declare_implementation(iface, class)
>
> predicate_signatures(ob)
>
> parse_rule(ruleset, body, predicate, actiontype, localdict, globaldict)
>
> combine_actions(a1, a2)
>
> rules_for(f)
>
> Rule objects
>
> ActionDef objects
>
> RuleSet objects
>
> Method objects
>
> MethodList objects
>
> IAspectOwner
>
>
>
> Implementation Notes
> ====================
>
> Most of the functionality described in this PEP is already implemented
> in the in-development version of the PEAK-Rules framework.  In
> particular, the basic overloading and method combination framework
> (minus the ``@overload`` decorator) already exists there.  The
> implementation of all of these features in ``peak.rules.core`` is 656
> lines of Python at this writing.
>
> ``peak.rules.core`` currently relies on the DecoratorTools and
> BytecodeAssembler modules, but both of these dependencies can be
> replaced, as DecoratorTools is used mainly for Python 2.3
> compatibility and to implement structure types (which can be done
> with named tuples in later versions of Python).  The use of
> BytecodeAssembler can be replaced using an "exec" or "compile"
> workaround, given a reasonable effort.  (It would be easier to do this
> if the ``func_closure`` attribute of function objects was writable.)
>
> The ``Interface`` class has been previously prototyped, but is not
> included in PEAK-Rules at the present time.
>
> The "implicit class rule" has previously been implemented in the
> RuleDispatch library.  However, it relies on the ``__metaclass__``
> hook that is currently eliminated in PEP 3115.
>
> I don't currently know how to make ``@overload`` play nicely with
> ``classmethod`` and ``staticmethod`` in class bodies.  It's not really
> clear if it needs to, however.
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>

From collinw at gmail.com  Wed May  9 05:35:42 2007
From: collinw at gmail.com (Collin Winter)
Date: Tue, 8 May 2007 20:35:42 -0700
Subject: [Python-3000] Build is broken (r55196)
Message-ID: <43aa6ff70705082035kd28a3a6t3f547d5aa27f025e@mail.gmail.com>

As of r55196 (and possibly earlier), the p3yk branch does not make
when configured with --with-pydebug. setup.py triggers this assertion
failure:

python: Objects/object.c:64: _Py_AddToAllObjects: Assertion
`(op->_ob_prev == ((void *)0)) == (op->_ob_next == ((void *)0))'
failed.

Any ideas?

From python at rcn.com  Wed May  9 05:00:26 2007
From: python at rcn.com (Raymond Hettinger)
Date: Tue, 8 May 2007 20:00:26 -0700
Subject: [Python-3000] Octal literals anecdote
Message-ID: <005401c791e7$0fd10d20$f301a8c0@RaymondLaptop1>

Those following the octal literal discussion might enjoy reading one of today's SF bug reports:
    www.python.org/sf/1715302


Raymond

From eucci.group at gmail.com  Wed May  9 05:57:54 2007
From: eucci.group at gmail.com (Jeff Shell)
Date: Tue, 8 May 2007 21:57:54 -0600
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <20070509015553.9C6843A4061@sparrow.telecommunity.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<20070509015553.9C6843A4061@sparrow.telecommunity.com>
Message-ID: <88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>

On 5/8/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 03:52 PM 5/8/2007 -0600, Jeff Shell wrote:
> >I have a lengthy post that dissects a major issue that I have with
> >ABCs and the Interface definition that I saw in PEP 3124:: it all
> >seems rigidly class and class-instance based.
>
> Hi Jeff; I read your post a few days ago, but your blog doesn't
> support comments, so I've been "getting around to" writing a
> counterpoint on my blog.  But, now I can do it here instead.  :)

Regarding the comments, blame spammers. And CAPTCHA. There's an equal
spot in hell (should I choose to believe in hell) for both. :) I miss
the chance to have conversation, but the weight of gardening or having
to try seven times to differentiate between two funnily-drawn
characters killed that part of my humanity.

> >The cardinal sin I saw
> >in the Interface definition in PEP 3124 (at least, at the time I last
> >viewed it) was the inclusion of 'self' in a method spec.
>
> That's because you're confusing a generic function and a "method
> spec".  PEP 3124 interfaces are not *specifications*; they're
> namespaces for generic functions.  They are much closer in nature to
> ABCs than they are zope.interface-style Interfaces.  The principal
> thing they have in common with zope.interface (aside from the name)
> is the support for "IFoo(ob)"-style adaptation.
>
> Very few of zope.interface's design goals are shared by PEP
> 3124.  Notably, they are not particularly good for type checking or
> verification.  In PEP 3124, for example, IFoo(ob) *always* returns an
> object with the specified attributes; the only way to know whether
> they are actually implemented is to try using them.
>
> Notice that this is diametrically opposed to what zope.interface
> wants to do in such situations -- which is why the PEP makes such a
> big deal about it being possible to use zope.interface *instead*.

Well that puts another fear in my heart about confusing the issue
further - "oh, these kind of sound and look the same but are
diametrically opposed?"

I must admit that I didn't read PEP 3124 in depth - most of it was
fascinating, some of it went way over my head in complexity, and then
suddenly I saw an Interface. It seemed quite out of place, actually,
and it seemed diametrically opposed to the simplicity and power I've
been enjoying.

> That is, my own observation is that different frameworks sometimes
> need different kinds of interfaces.  For example, someone might
> create a framework that wants to verify preconditions and
> postconditions of methods in an interface, rather than merely
> specifying their names and arguments!  Using zope.interface as an
> exclusive basis for interface definition and type annotations would
> block innovation in this area.

FWIW, zope.interface allows 'tagged values' on all Interface Elements;
Element is the base class/type of Attribute, Method, and even Interface
(I believe). Tagged values are used to hold invariants, which are code
objects in the specification and can provide the 'if obj.age < 18,
then obj.has_parents_permission must be true' type of logic. The
interface (ha!) for setting and getting tagged attributes aint the
prettiest, but it's the equivalent of type annotations and all other
such things. And like 'invariant', it's not too difficult to write
helper functions that deal with that interface.
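
Spelled out, that age/permission rule looks roughly like this (the
IPerson interface and its attributes are invented for the
illustration)::

    from zope.interface import Interface, Attribute, invariant
    from zope.interface.exceptions import Invalid

    def parental_permission(person):
        # the 'if obj.age < 18 ...' rule, expressed as an invariant
        if person.age < 18 and not person.has_parents_permission:
            raise Invalid("minors need parental permission")

    class IPerson(Interface):
        age = Attribute("Age in years")
        has_parents_permission = Attribute("Parental consent flag")
        invariant(parental_permission)

    # IPerson.validateInvariants(some_person) raises Invalid on failure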

I use this in a SQLAlchemy based system that uses zope.schema (which
builds on zope.interface to describe field (attributes) types and
restrictions). The spec looks something like this::

        class ILogins(Interface):
            login = zope.schema.TextLine(...)
            validateUnique(login, column=table.c.login)

I have even used `zope.interface` to stamp out a new abstract class
(of sorts), which it supports::

    schema = field.schema
    if IInterface.providedBy(schema):
        # schema is an Interface, not an implementation; we need a concrete
        # instance.
        schema = schema.deferred()
        directlyProvides(schema, field.schema)

This stamps out a new abstract instance, and declares support for the
interface. This particular use was for view binding (something that
you mention), but it's three lines of code that are very useful in my
system.

This particular use case is for Zope 'Object' fields; like an address
attribute might expect to have a complex type, like 'IAddress'. There
are situations, such as dynamic UI generation, where an empty instance
is needed. This allows the abstract specification to provide just
enough of a concrete implementation to fill in for a real object
that's expected to arrive in the future.

One could envision having that 'deferred()' interface method filling
in stronger implementations. Which means that there's probably more
power, or hooks (at least) in zope.interface than may be realized. The
common uses of it don't cover all possibilities.

> >... 'self' is an internal detail of class-instance implementations.
>
> Again - this is because you're assuming the purpose of a PEP 3124
> interface is to *specify* an interface, when in fact it's much more
> like an ABC, which may also *implement* the interface.  The
> specification and implementation are intentionally unified here.

Hm. I'll have to process this one...

> >It seems to me that Abstract Base Classes and even PEP 3124 are
> >primarily focused on classes. But in Python, "everything is an
> >object", but not everything is class-based.
> >...
> >The rest of this post focuses on what `zope.interface` already
> >provides - a system for specifying behavior and declaring support at
> >both the class and object level - and 'object' really means 'object',
> >which includes modules.
>
> Right -- and *neither* "specifying behavior" nor "declaring support"
> are goals of PEP 3124; they're entirely out of its scope.  The
> Interface object's purpose is to support uniform access to, and
> implementation of, individual operations.  These are somewhat
> parallel concepts, but very different in thrust; zope.interface is
> LBYL, while PEP 3124 is EAFP all the way.

Funny that those things are apparently opposed. 'Look Before You Leap'
brings to mind the concept of "don't dive into an empty pool" or
"don't do a backwards flip onto the pointy rocks"; where as 'Easier to
Ask Forgiveness Than Permission' brings to bind the concept of "sorry
i dove head first into your empty pool and cracked my skull open Mr.
Johnson. If I had asked I'm sure you would have said no! In any case,
even though it's your pool and there was a fence and everything and
you did not give me permission, my parents are going to sue" (OK,
maybe that last bit is the result of a healthy american upbringing...
but still!)

> As a result, PEP 3124 chooses to punt on the issue of individual
> objects.  It's quite possible within the framework to allow
> instance-level checks during dispatching, but it's not going to be in
> the default engine (which is based on type-tuple dispatching; see
> Guido's Py3K overloading prototype).

Huh? I'll try to look at that. types, classes, instances... That does
it, I'm switching to Io. (Honestly - I've recently seen the light
about prototype based object oriented programming; in light of types
of types and classes of classes and classes and instances and oh my,
languages that believe in "there are only objects, and they are only
instances" are sounding sweeter every day)

> zope.interface (and zope.component, IIRC) pay a high price in
> complexity for allowing interfaces to be per-instance rather than
> type-defined.  I implemented the same feature in PyProtocols, but
> over the years I rarely found it useful.  My understanding of its
> usefulness in Zope is that it:
>
> 1. supports specification and testing of module-level Zope APIs

I've had uses for it outside of modules.

> 2. allows views and other wrapping operations to be selected on a dynamic basis

That's an essential (and very powerful) feature in a large system. But
there are uses outside of that.

An area where parts of the zope 3 component architecture DO pay a high
price in complexity is where it has to create dynamic types in order
to satisfy some core requirement. I can't remember where this is, but
I know that I HATE it - suddenly, Zope is playing with MY class
hierarchy. Suddenly I'm in debug mode and have no idea what I'm
looking at. I'd much rather have it augmenting my instance than
mangling my classes, unless I choose to have it mangle my classes by
subclassing from a mangler. Annotate my class, but don't replace it.

> Since #1 falls outside of PEP 3124's goals (i.e., it's not about
> specification or testing), that leaves use case #2.  In my
> experience, it has been more than sufficient to simply give these
> objects some *other* interface, such as an IViewTags interface with a
> method to query these dynamic "tag" interfaces.  In other words, my
> experience and opinion supports the view that use case #2 is actually
> a coincidental abuse of interfaces for convenience, rather than the
> "one obvious way" to handle the use case.

Ugh. Yeah, there are 'marker' interfaces, but.. ugh. dynamic "tag"
interfaces. Yuck.

My experience has been otherwise.

But I'm sorry that I confused that section of PEP 3124 to be about
specification and testing. I do, however, think that is a better use
case.

> To put it another way, if you define getView() as a generic function,
> you can always define a dynamic implementation of it for any type
> that you wish to have dynamic view selection capability.  Then, only
> those cases that require a complex solution have to pay for the complexity.
>
> So, that's my rationale for why PEP 3124 doesn't provide any
> instance-based features out of the box; outside of API specs for
> singleton objects, the need for them is mostly an illusion created by
> Zope 3's dynamic view selection hack.
>
>
> >My main focus is on determining what Abstract Base Classes and/or PEP
> >3124's Interfaces do better than `zope.interface` (if anyone else is
> >familiar with that package).
>
> I at least am quite familiar with it, having helped to define some of
> its terminology and API, as well as being the original author of its
> class-decorator emulation for Python versions 2.2 and up.  :)  I also
> argued for its adoption of PEP 246, and wrote PyProtocols to unify
> Twisted and Zope interfaces in a PEP 246-based adaptation framework.
>
> And what PEP 3124 does much better than zope.interface or even PyProtocols is:
>
> 1. Adaptation, especially incomplete adaptation.  You can implement
> only the methods that are actually needed for your use case.  If the
> interface includes generic implementations that are defined in terms
> of other methods in the interface, you need not reimplement
> them.  (Note: I'm well aware that here my definition of "better"
> would be considered "worse" by Jim Fulton, since zope.interface is
> LBYL-oriented.  However, for many users and use cases, EAFP *is*
> better, even if it's not for Zope.)

And for many users and use cases, LBYL is better. Especially for those
of us who get pissed off and start smoking every time we end up on a
spikey rock!

> 2. Interface recombination.  AFAIK, zope.interface doesn't support
> subset interfaces like PyProtocols does.  Neither zope.interface nor
> PyProtocols support method renaming, where two interfaces have a
> method with the same specification but different method names.

Um, then it's a different specification.

eat_pizza() and consume_pizza() are different. They may be the same to
Cookie Monster, but they're not the same for Grover.

> 3. Low mental overhead.  PEP 3124 doesn't even *need* interfaces;
> simple use cases can just use overloaded functions and be on about
> their business.  Use cases that would require an interface and half a
> dozen adapter classes in zope.interface can be met by simply creating
> an overloaded function and adding methods.  And the resulting code
> reads like code in other languages that support overloading or
> generic functions, rather than reading like Java.

The mental overhead in PEP 3124 was pretty high for me, but that may
stem from bias resulting from diametrically opposed interpretations of
the same word :).

> >In my blog post, I also show a dynamically constructed object
> >providing an interface's specified behavior. An instance of an empty
> >class is made, and then methods and other supporting attributes are
> >attached to this specific instance only. Real world examples of this
> >include Zope 2, where a folder may have "Python Scripts" or other
> >callable members that, in effect, make for a totally custom object. It
> >can also provide this same behavior (in fact, I was able to take
> >advantage of this on some old old old Zope 2 projects that started in
> >the web environment and transitioned to regular Python
> >modules/classes).
>
> And how often does this happen outside of Zope?  As I said, I rarely
> found it to be the case anywhere else.  I replicated the ability in
> PyProtocols because I was biased by my prior Zope experience, but
> once I got outside of Zope it almost entirely ceased to be useful.

We have a dynamic data transformation framework that exists outside of
Zope (Zope is basically used for UI). Objects are being dynamically
composed, wrapped, decomposed, rewrapped, filtered, and split -
constantly. Objects, not types. It's all composed of rules. I'm
itching to be able to add rules to apply zope.interface specifications
to the generated objects; if only to then make it much easier to add
other filtering rules later on.

With all of the wrapping and generation going on, we had to add some
basic 'is_a' methods to the base classes. And we do care whether an
object is a wrapper (isinstance) as well as whether the wrapped object
provides the DataSet interface.

It's another complex framework, but at least it's an example of this
kind of thing outside of Zope.
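
(To make that concrete, here's roughly the kind of check I mean. The
DataSet/Wrapper/Record names are placeholders I'm inventing just for
this message, but alsoProvides() and providedBy() are zope.interface's
real API:)

    from zope.interface import Interface, alsoProvides

    class IDataSet(Interface):
        """Things that behave like one of our data sets."""

    class Wrapper(object):
        def __init__(self, wrapped):
            self.wrapped = wrapped

    class Record(object):
        pass

    # a rule marks the dynamically composed object after the fact
    ob = Record()
    alsoProvides(ob, IDataSet)

    wrapped = Wrapper(ob)

    # the two checks we actually care about
    print(isinstance(wrapped, Wrapper))          # is it one of our wrappers?
    print(IDataSet.providedBy(wrapped.wrapped))  # does the wrapped thing provide DataSet?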

I know there's been some talk of ``__isinstance__()`` and
``__issubclass__()`` overriding being allowed, and I guess that's to
take care of the wrapped and wrapped and wrapped object situations?

In any case, it seems that I have long occupied worlds wherein complex
objects could be composed on the fly outside of the type system, and
I'd hate to have one of those constructed objects miss out on passing
an 'is-foo-like' test because they weren't raised by proper upper
middle class type parents.

> Meanwhile, as I said, PEP 3124 is not closed to extension.  It's
> specifically intended that zope.interface (and any other interface
> packages that might arise in future) should be able to play as
> first-class citizens in the proposed API.  However, depending on the
> specific features desired, those packages might have some additional
> integration work to do.
>
> (Note, by the way, that zope.interface is explicitly mentioned three
> times in the PEP, as an example of how other interface types should
> be able to be used for overloading, as long as they register
> appropriate methods with the provided framework.)

Thanks for clearing things up. I'll try to make another pass at
reading the PEP more closely. For me, at this moment, all of this
class/type based stuff is rubbing me the wrong way, and that's a
feeling that's very hard to get past. I'm not sure why. I'll try to
suppress those feelings when I revisit 3119 and 3124.

What's happening with Roles/Traits? That's still the system that I'd
like to see. I'm hoping that hasn't gotten swallowed up by generic
overloaded pre-post wrapped abstract methods. (as long as I never have
to type 'def public final', I'm cool).

I think roles/traits as a core concept (LBYL zope.interface style, if
you prefer to think of it that way) is useful, if not important. And I
still believe that zope.interface already provides a language/API from
which to build.

-- 
Jeff Shell

From nnorwitz at gmail.com  Wed May  9 06:18:02 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 8 May 2007 21:18:02 -0700
Subject: [Python-3000] Build is broken (r55196)
In-Reply-To: <43aa6ff70705082035kd28a3a6t3f547d5aa27f025e@mail.gmail.com>
References: <43aa6ff70705082035kd28a3a6t3f547d5aa27f025e@mail.gmail.com>
Message-ID: <ee2a432c0705082118n325e259ei2305fde72ad091b4@mail.gmail.com>

I had this problem.  make clean solved it. -- n

On 5/8/07, Collin Winter <collinw at gmail.com> wrote:
> As of r55196 (and possibly earlier), the p3yk branch does not make
> when configured with --with-pydebug. setup.py triggers this assertion
> failure:
>
> python: Objects/object.c:64: _Py_AddToAllObjects: Assertion
> `(op->_ob_prev == ((void *)0)) == (op->_ob_next == ((void *)0))'
> failed.
>
> Any ideas?
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/nnorwitz%40gmail.com
>

From collinw at gmail.com  Wed May  9 06:38:11 2007
From: collinw at gmail.com (Collin Winter)
Date: Tue, 8 May 2007 21:38:11 -0700
Subject: [Python-3000] Build is broken (r55196)
In-Reply-To: <ee2a432c0705082118n325e259ei2305fde72ad091b4@mail.gmail.com>
References: <43aa6ff70705082035kd28a3a6t3f547d5aa27f025e@mail.gmail.com>
	<ee2a432c0705082118n325e259ei2305fde72ad091b4@mail.gmail.com>
Message-ID: <43aa6ff70705082138i72a9a6fcvec2e33c405508155@mail.gmail.com>

Works on a different laptop with a fresh checkout. False alarm, sorry.

On 5/8/07, Neal Norwitz <nnorwitz at gmail.com> wrote:
> I had this problem.  make clean solved it. -- n
>
> On 5/8/07, Collin Winter <collinw at gmail.com> wrote:
> > As of r55196 (and possibly earlier), the p3yk branch does not make
> > when configured with --with-pydebug. setup.py triggers this assertion
> > failure:
> >
> > python: Objects/object.c:64: _Py_AddToAllObjects: Assertion
> > `(op->_ob_prev == ((void *)0)) == (op->_ob_next == ((void *)0))'
> > failed.
> >
> > Any ideas?
> > _______________________________________________
> > Python-3000 mailing list
> > Python-3000 at python.org
> > http://mail.python.org/mailman/listinfo/python-3000
> > Unsubscribe: http://mail.python.org/mailman/options/python-3000/nnorwitz%40gmail.com
> >
>

From foom at fuhm.net  Wed May  9 09:26:06 2007
From: foom at fuhm.net (James Y Knight)
Date: Wed, 9 May 2007 03:26:06 -0400
Subject: [Python-3000] the future of the GIL
In-Reply-To: <ca471dc20705071058n25a21acfvaca8e4979edfa404@mail.gmail.com>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
	<ca471dc20705071058n25a21acfvaca8e4979edfa404@mail.gmail.com>
Message-ID: <9F3AFD78-7C73-4C9F-8CA6-3D10A1468939@fuhm.net>

On May 7, 2007, at 1:58 PM, Guido van Rossum wrote:
> As C doesn't have an atomic increment nor an atomic
> decrement-and-test, the INCREF and DECREF macros sprinkled throughout
> the code (many thousands of them) must be protected by some lock.

I've been intently ignoring the rest of the thread (and will continue  
to do so), but, to respond to this one particular point...

This just isn't true. Python can do an atomic increment in a fast  
platform specific way. It need not restrict itself to what's  
available in C. (after all, *threads* aren't available in C....)

Two implementations of note:

1) gcc 4.1 has atomic operation builtins:
http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html#Atomic-Builtins

2) There's a pretty damn portable library which provides these  
functions for what looks to me like pretty much all CPUs anyone would  
use, under Linux, Windows, HP/UX, Solaris, and OSX, and has a  
fallback to using pthreads mutexes:

http://www.hpl.hp.com/research/linux/atomic_ops/index.php4
http://packages.debian.org/stable/libdevel/libatomic-ops-dev


It's quite possible the overhead of GIL-less INCREF/DECREF is still  
too high even with atomic increment/decrement primitives, but AFAICT  
nobody has actually tried it. So saying GIL-less operation for sure  
has too high of an overhead unless the refcounting GC is replaced  
seems a bit premature.

James


From tomerfiliba at gmail.com  Wed May  9 10:47:02 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Wed, 9 May 2007 10:47:02 +0200
Subject: [Python-3000] the future of the GIL
Message-ID: <1d85506f0705090147w155c15d3ka61f0f23b435b3a9@mail.gmail.com>

On 5/8/07, Thomas Heller <theller at ctypes.org> wrote:
> Wouldn't multiple interpreters (assuming the problems with them would be fixed)
> in the same process give the same benefit?  A separate GIL for each one?

hmm, i find this idea quite interesting really:

* builtin immutable objects such as None, small ints, non-heap types,
and builtin functions, would become uncollectible by the GC. after all,
we can't reclaim their memory anyway, so keeping the accounting
info is just a waste of time. the PyObject_HEAD struct would grow
an "ob_collectible" field, which would tell the GC to ignore these
objects altogether. for efficiency reasons, Py_INCREF/DECREF
would still change ob_refcount, only the GC will ignore it for
uncollectible objects.

* each thread would have a separate interpreter, and all APIs should
grow an additional parameter that specifies the interpreter state
to use.

* for compatibility reasons, we can also have a dict-like object mapping
between thread-ids to interpreter states. when you invoke an API,
it would get the interpreter state from the currently executing thread id.
maybe that could be defined as a macro over the real API function.

* the builtin immutable objects would be shared between all instances
of the interpreter. other than those, all other objects would be local
to the interpreter that created them.

* extension modules would have to be changed to support
per-interpreter initialization.

* in order to communicate between interpreters, we would use some
kind of IPC mechanism, to serialize access to objects. of course it
would be much more efficient, as no context switches are required
in the same process. this would make each thread basically as
protected as an OS process, so no locks would be required.

* in order to support the IPC, a new builtin type, Proxy, would be added
to the language. it would be the only object that can hold a cross-reference
to objects in different interpreters -- much like today's RPC libs -- only
it wouldn't have to work over a socket. (a toy sketch of this idea follows
at the end of this message)

* if python would ever have a tracing GC, that would greatly simplify
things. also, moving to an atomic incref/decref library could also
help.

of course i'm not talking about adding that to py3k. it's too immature
even for a pre-pep. but continuing to develop that idea more could
be the means to removing the GIL, and finally having really parallel
python scripts.
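
just to make the Proxy idea a bit more concrete, here's a toy sketch
that fakes it with plain threads and queues (python 2.x syntax, no real
per-thread interpreters, all names invented). the only point is that
cross-"interpreter" access goes through message passing, never through
a shared object:

    import threading
    from Queue import Queue

    class Proxy(object):
        """toy stand-in: forwards calls to the thread owning the object"""
        def __init__(self, requests):
            self._requests = requests
        def call(self, name, *args):
            replies = Queue()
            self._requests.put((name, args, replies))
            return replies.get()

    def interpreter_main(obj, requests):
        # pretend this thread is a separate interpreter: it is the only
        # place where obj is ever touched directly
        while True:
            name, args, replies = requests.get()
            if name is None:
                break
            replies.put(getattr(obj, name)(*args))

    requests = Queue()
    shared_list = []     # "lives" in the other interpreter
    worker = threading.Thread(target=interpreter_main,
                              args=(shared_list, requests))
    worker.start()

    p = Proxy(requests)
    p.call('append', 42)
    print(p.call('pop'))            # 42, but only via message passing
    requests.put((None, (), None))  # shut the fake interpreter down
    worker.join()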



-tomer

From rasky at develer.com  Wed May  9 10:54:12 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Wed, 09 May 2007 10:54:12 +0200
Subject: [Python-3000] PEP 3125 -- a modest proposal
In-Reply-To: <000101c79102$385e1340$a91a39c0$@org>
References: <000101c79102$385e1340$a91a39c0$@org>
Message-ID: <f1s27k$dq$1@sea.gmane.org>

On 08/05/2007 1.48, Andrew Koenig wrote:

> It has occurred to me that as Python stands today, an indent always begins
> with a colon.  So in principle, we could define anything that looks like an
> indent but doesn't begin with a colon as a continuation.  

I got a dejavu here :)
http://mail.python.org/pipermail/python-3000/2007-April/007045.html

and Guido's answer:
http://mail.python.org/pipermail/python-3000/2007-April/007063.html
-- 
Giovanni Bajo


From rasky at develer.com  Wed May  9 11:07:45 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Wed, 09 May 2007 11:07:45 +0200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <20070506222840.25B2.JCARLSON@uci.edu>
References: <f1l97g$5cm$1@sea.gmane.org> <463E4645.5000503@acm.org>
	<20070506222840.25B2.JCARLSON@uci.edu>
Message-ID: <f1s312$489$1@sea.gmane.org>

On 07/05/2007 7.36, Josiah Carlson wrote:

> By going multi-process rather than multi-threaded, one generally removes
> shared memory from the equation.  Note that this has the same effect as
> using queues with threads, which is generally seen as the only way of
> making threads "easy".  If one *needs* shared memory, we can certainly
> create an mmap-based shared memory subsystem with fine-grained object
> locking, or emulate it via a server process as the processing package
> has done.
> 
> Seriously, give the processing package a try.  It's much faster than one
> would expect.

I'm fully +1 with you on everything.

And part of the reason we have to advocate this is that Python has 
always had pretty good threading libraries, but not processing libraries; 
in fact, Python does have problems spawning processes: the whole 
popen/popen2/subprocess mess isn't even fully solved yet.

One thing to be said, though, is that using multiple processes causes some 
headaches with frozen distributions (PyInstaller, py2exe, etc.), like those 
usually found on Windows, specifically because Windows does not have fork().

The processing module, for instance, doesn't take this problem into account at 
all, making it worthless for many of my real-world use cases.
-- 
Giovanni Bajo


From aahz at pythoncraft.com  Wed May  9 14:31:35 2007
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 9 May 2007 05:31:35 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <1d85506f0705090147w155c15d3ka61f0f23b435b3a9@mail.gmail.com>
References: <1d85506f0705090147w155c15d3ka61f0f23b435b3a9@mail.gmail.com>
Message-ID: <20070509123135.GB1711@panix.com>

On Wed, May 09, 2007, tomer filiba wrote:
>
> of course i'm not talking about adding that to py3k. it's too immature
> even for a pre-pep. but continuing to develop that idea more could
> be the means to removing the GIL, and finally having really parallel
> python scripts.

...which is why this discussion belongs on python-ideas.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Look, it's your affair if you want to play with five people, but don't
go calling it doubles."  --John Cleese anticipates Usenet

From ark-mlist at att.net  Wed May  9 15:33:54 2007
From: ark-mlist at att.net (Andrew Koenig)
Date: Wed, 9 May 2007 09:33:54 -0400
Subject: [Python-3000] PEP 3125 -- a modest proposal
In-Reply-To: <f1s27k$dq$1@sea.gmane.org>
References: <000101c79102$385e1340$a91a39c0$@org> <f1s27k$dq$1@sea.gmane.org>
Message-ID: <002701c7923e$b0930040$11b900c0$@net>

> I got a dejavu here :)
> http://mail.python.org/pipermail/python-3000/2007-April/007045.html
> 
> and Guido's answer:
> http://mail.python.org/pipermail/python-3000/2007-April/007063.html

Well yes, but if it's done at the lexical level, the INDENT and DEDENT
tokens don't exist.



From walter at livinglogic.de  Wed May  9 17:04:21 2007
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 09 May 2007 17:04:21 +0200
Subject: [Python-3000] binascii.b2a_qp() in the p3yk branch
Message-ID: <4641E2F5.4000702@livinglogic.de>

binascii.b2a_qp() in the p3yk branch is broken. What I get is:

$ gdb ./python
GNU gdb 6.3-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-linux"...Using host libthread_db 
library "/lib/tls/libthread_db.so.1".

(gdb) run
Starting program: /var/home/walter/checkouts/Python/p3yk/python
[Thread debugging using libthread_db enabled]
[New Thread -1209593088 (LWP 17690)]
Python 3.0x (p3yk:55200, May  9 2007, 11:43:49)
[GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import binascii
 >>> binascii.b2a_qp(b'')

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1209593088 (LWP 17690)]
0xb7ee9093 in strchr () from /lib/tls/libc.so.6
(gdb) bt
#0  0xb7ee9093 in strchr () from /lib/tls/libc.so.6
#1  0xb7c4744c in binascii_b2a_qp (self=0x0, args=0x0, kwargs=0x0) at 
/var/home/walter/checkouts/Python/p3yk/Modules/binascii.c:1153
#2  0x0807788e in PyCFunction_Call (func=0xb7e26e8c, arg=0xb7e1328c, 
kw=0xa0a0a0a) at Objects/methodobject.c:77
#3  0x080adbb4 in call_function (pp_stack=0xbffff45c, oparg=0) at 
Python/ceval.c:3513
#4  0x080abe66 in PyEval_EvalFrameEx (f=0x8235aa4, throwflag=0) at 
Python/ceval.c:2191
#5  0x080ac9e4 in PyEval_EvalCodeEx (co=0xb7e0fbf0, globals=0x0, 
locals=0xa0a0a0a, args=0xb7e3102c, argcount=0, kws=0x0, kwcount=0, 
defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:2812
#6  0x080aef5f in PyEval_EvalCode (co=0x0, globals=0x0, locals=0x0) at 
Python/ceval.c:491
#7  0x080d14ba in run_mod (mod=0x0, filename=0x0, globals=0x0, 
locals=0x0, flags=0x0, arena=0x0) at Python/pythonrun.c:1282
#8  0x080d0967 in PyRun_InteractiveOneFlags (fp=0x0, filename=0x8116596 
"<stdin>", flags=0xbffff65c) at Python/pythonrun.c:800
#9  0x080d0793 in PyRun_InteractiveLoopFlags (fp=0xb7f9cca0, 
filename=0x8116596 "<stdin>", flags=0xbffff65c) at Python/pythonrun.c:724
#10 0x080d1d32 in PyRun_AnyFileExFlags (fp=0xb7f9cca0, 
filename=0x8116596 "<stdin>", closeit=0, flags=0xbffff65c) at 
Python/pythonrun.c:693
#11 0x080569ab in Py_Main (argc=-1208365920, argv=0xbffff65c) at 
Modules/main.c:491
#12 0x080564bb in main (argc=0, argv=0x0) at Modules/python.c:23

From jcarlson at uci.edu  Wed May  9 17:57:10 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 09 May 2007 08:57:10 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <9F3AFD78-7C73-4C9F-8CA6-3D10A1468939@fuhm.net>
References: <ca471dc20705071058n25a21acfvaca8e4979edfa404@mail.gmail.com>
	<9F3AFD78-7C73-4C9F-8CA6-3D10A1468939@fuhm.net>
Message-ID: <20070509084606.25E5.JCARLSON@uci.edu>


James Y Knight <foom at fuhm.net> wrote:
> On May 7, 2007, at 1:58 PM, Guido van Rossum wrote:
> > As C doesn't have an atomic increment nor an atomic
> > decrement-and-test, the INCREF and DECREF macros sprinkled throughout
> > the code (many thousands of them) must be protected by some lock.
> 
> 2) There's a pretty damn portable library which provides these  
> functions for what looks to me like pretty much all CPUs anyone would  
> use, under Linux, Windows, HP/UX, Solaris, and OSX, and has a  
> fallback to using pthreads mutexes:
> 
> http://www.hpl.hp.com/research/linux/atomic_ops/index.php4
> http://packages.debian.org/stable/libdevel/libatomic-ops-dev
> 
> 
> It's quite possible the overhead of GIL-less INCREF/DECREF is still  
> too high even with atomic increment/decrement primitives, but AFAICT  
> nobody has actually tried it. So saying GIL-less operation for sure  
> has too high of an overhead unless the refcounting GC is replaced  
> seems a bit premature.

Of course the trouble is that while this would be great for
incref/decref operations, and the handling of certain immutable types,
very many objects in Python are dynamic and require the GIL for all
operations on them.  Removing the need to hold the GIL for incref/decref
operations and telling people "some objects don't need to hold the GIL
when you monkey with them" is really just a great way to confuse the
hell out of people.  There is already confusion with borrowed references,
which affects fewer types than would be affected if we were to say
"some immutable c types can be accessed without the GIL".

Could it offer a speedup?  Probably, but how much, and what kind of a
PITA would it become to use and manipulate the immutable types?


 - Josiah


From pje at telecommunity.com  Wed May  9 18:34:02 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 09 May 2007 12:34:02 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <da3f900e0705081858rc5fabfbheb9d520c5d007bc6@mail.gmail.com
 >
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<da3f900e0705081858rc5fabfbheb9d520c5d007bc6@mail.gmail.com>
Message-ID: <20070509163219.A332C3A4061@sparrow.telecommunity.com>

At 09:58 PM 5/8/2007 -0400, Chris Monson wrote:
>If we are looking at doing Design By Contract using @before and 
>@after (preconditions and postconditions), shouldn't there be some 
>way of getting at the return value in functions decorated with @after?

Actually, it isn't really design by contract; i.e., I wasn't using 
the word "postconditions" in the DBC sense.  I was saying you could 
put code there to *ensure* (i.e. implement) additional 
postconditions, not *check* them.

If you wanted to implement DBC, it might be simplest to subclass 
overloading.Around to create a Contract class and @contract 
decorator, with a higher method-combination precedence than Around 
methods.  Indeed, that might make another nice example for the PEP.


From pje at telecommunity.com  Wed May  9 19:28:18 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 09 May 2007 13:28:18 -0400
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com
 >
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<20070509015553.9C6843A4061@sparrow.telecommunity.com>
	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
Message-ID: <20070509173942.B5A1B3A4061@sparrow.telecommunity.com>

At 09:57 PM 5/8/2007 -0600, Jeff Shell wrote:
>I must admit that I didn't read PEP 3124 in depth - most of it was
>fascinating, some of it went way over my head in complexity, and then
>suddenly I saw an Interface. It seemed quite out of place, actually,

Yep; it's really more like an "adapting abstract base class".  In a 
GF-based language like Dylan, it would probably just be called a 
"module".  In Dylan, the generic functions you export from a module 
define an interface, in precisely the same way as PEP 3124 Interfaces 
work, except they have no IFoo(bar).baz(), only IFoo.baz(bar).  (And 
with different syntax, of course.)


>[snip lots of stuff about zope.interface's specification features]

If you don't count in-house "enterprise" operations and shops like 
Google, Yahoo, et al., the development of Zope is certainly one of 
the largest (if not the very largest) Python projects.  It's 
understandable that LBYL is desirable in that environment.  On the 
other hand, such large projects in Python are pretty darn rare.

Meanwhile, the subject of typing systems for Python (into which 
category zope.interface most assuredly falls) is still an open 
research area.  I've watched zope.interface and its predecessors and 
spin-offs for almost a decade now, and it's *still* not particularly 
settled.  Look, for example, at Guido's blog posts about type 
expressions and type parameterization.  Look at how quickly the 
efforts to define standard ABCs for Py3K turned back from grand 
vision to, "oh heck that doesn't quite do what we wanted".

So, my thought is that LBYL type systems for Python are still "here 
there be dragons" territory.  PEP 3124 simply doesn't try to go 
there, but neither does it block the passage of other explorers.  It 
happily coexists with other type systems, and if you want to use 
something called "roles" or "traits" as argument annotations, it will 
be OK with that.

The specifics, which I haven't spelled out in the "extension API" 
section yet, are mainly that in order to work with the default 
dispatching engine, you must register methods for the "implies()" 
generic function, such that the engine can tell whether a class 
implies the annotation, the annotation implies a class, or the 
annotation implies any other annotation.

Of course, there will also be a generic function you can register 
with in order to use a different dispatch engine when your 
annotations are encountered.  This would be the hook you'd need to 
use in order to have instance-specific checks.  Bear in mind, of 
course, that such checks will necessarily be slow, and the 
slowdown may apply to every invocation of the function.


>'Look Before You Leap'
>brings to mind the concept of "don't dive into an empty pool" or
>"don't do a backwards flip onto the pointy rocks"; where as 'Easier to
>Ask Forgiveness Than Permission' brings to bind the concept of "sorry
>i dove head first into your empty pool and cracked my skull open Mr.
>Johnson. If I had asked I'm sure you would have said no! In any case,
>even though it's your pool and there was a fence and everything and
>you did not give me permission, my parents are going to sue" (OK,
>maybe that last bit is the result of a healthy american upbringing...
>but still!)

Well, if you run a program whose effects are that important without 
having tested it first, perhaps you *should* be sued.  :)


>Huh? I'll try to look at that. types, classes, instances... That does
>it, I'm switching to Io. (Honestly - I've recently seen the light
>about prototype based object oriented programming; in light of types
>of types and classes of classes and classes and instances and oh my,
>languages that believe in "there are only objects, and they are only
>instances" are sounding sweeter every day)

Why stop at Io?  Cecil is a prototype-based language with generic 
functions, including full predicate dispatch and even "dynamic types" 
(i.e., an object can be of several types at the same time, depending 
on its current state).  :)

For Python, though, I really don't see the need to create such 
ultra-dynamic objects.  It's so easy to just dynamically create a 
class whenever you need one, that it doesn't seem worth the bother to 
munge instances.  Zope, of course, is always a special case in this 
respect, since *persisting* dynamic classes is a PITA, compared to 
dynamic instances.  But the stdlib ain't Zope, and most Python code 
doesn't need to have its classes stored in a database.


>>Since #1 falls outside of PEP 3124's goals (i.e., it's not about
>>specification or testing), that leaves use case #2.  In my
>>experience, it has been more than sufficient to simply give these
>>object some *other* interface, such as an IViewTags interface with a
>>method to query these dynamic "tag" interfaces.  In other words, my
>>experience and opinion supports the view that use case #2 is actually
>>a coincidental abuse of interfaces for convenience, rather than the
>>"one obvious way" to handle the use case.
>
>Ugh. Yeah, there are 'marker' interfaces, but.. ugh. dynamic "tag"
>interfaces. Yuck.
>
>My experience has been otherwise.

Now you're confusing me.  How is having a way to ask for markers 
different from marker interfaces?  Both are equally "yuck", except 
that one isn't abusing interfaces to use them as markers.


>But I'm sorry that I confused that section of PEP 3124 to be about
>specification and testing. I do, however, think that is a better use
>case.

Well, write another PEP, then.  :)


>We have a dynamic data transformation framework that exists outside of
>Zope (Zope is basically used for UI). Objects are being dynamically
>composed, wrapped, decomposed, rewrapped, filtered, and split -
>constantly. Objects, not types. It's all composed of rules. I'm
>itching to be able to add rules to apply zope.interface specifications
>to the generated objects; if only to then make it much easier to add
>other filtering rules later on.

Maybe I'm getting to be like Guido in my old age, but maybe you 
should just write the program first and extract the framework second.  :)

In truth, though, I'd almost bet some serious cash that proper use of 
generic functions would evaporate your framework to virtual 
nothingness.  My observation has been that in languages with generic 
functions, the sort of thing that requires complex frameworks with 
lots of interfaces in Zope looks like a trivial little library.  In 
PEAK, I was able to cut out about 75% of the code in one 
sub-framework by switching it from interfaces+adaptation to generic 
functions, and in the process made it much more comprehensible.  With 
that kind of productivity enhancement, one can afford to write a lot 
more unit tests to do the LBYL-ing, and still come out ahead.  :)

Really, the problem of LBYL interfaces and adaptation is that they 
require you to laboriously figure out in advance where to allow 
flexibility in the framework.

Worse, they emphasize the *solution* domain rather than the problem 
domain.  They define what's required of parts that have to be plugged 
into a machine that then "solves" the problem.  So you have to design 
that machine and where various parts fit into it, which is largely a 
distraction from whatever you were trying to do in the first place!

In contrast, with generic functions, you focus simply on identifying 
the operations required by the problem domain, performing a 
functional decomposition rather than trying to create an entire 
network mesh of roles and responsibilities.  You code the *problem*, 
not the solution, so your code might even be comprehensible (or at 
least explainable) to non-programmer domain experts.  (i.e., even if 
they can't read the code, you can read it to them and confirm whether 
it matches the requirements.)

*Then*, after you have the functional decomposition (which is your 
real "specification" anyway), you can then decide what concrete 
object types you might need, and implement the lowest-level domain 
operations for those types.

And if you're following that process, interfaces really aren't 
anything but the documentation that explains what those 
problem-domain operations are supposed to do, so that if for some 
reason you need to implement new concrete types at a later time, you 
can add appropriate implementations.
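
To make that process concrete, here's a rough sketch using the
``overloading`` module this PEP proposes (so it won't run anywhere
today); the LineItem/Invoice names are invented for illustration:

     from overloading import overload   # the module proposed by this PEP

     # the problem-domain operation comes first; its docstring is the spec
     def total_price(ob):
         """Return the total price of 'ob' in cents."""
         raise NotImplementedError(ob)

     # concrete types show up later...
     class LineItem:
         def __init__(self, unit_price, quantity):
             self.unit_price, self.quantity = unit_price, quantity

     class Invoice:
         def __init__(self, items):
             self.items = items

     # ...and each one just adds an implementation of the operation
     @overload
     def total_price(ob: LineItem):
         return ob.unit_price * ob.quantity

     @overload
     def total_price(ob: Invoice):
         return sum(total_price(item) for item in ob.items)

Notice that ``total_price()`` exists (and is documented) before any
concrete type does; supporting a new type later means adding a method,
not editing the original function.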

IMO, that's a lot closer to being the One Obvious Way, because it 
doesn't *need* LBYL or anything like zope.interface, anywhere in that 
process.  See, if you want contract enforcement, you can just *build 
it right in* to the generic functions, using an appropriate method 
type.  See my comments here:

http://mail.python.org/pipermail/python-3000/2007-May/007444.html

Of course, that's not a *static* guarantee of correctness, but 
neither is most of what you're talking about.  Tests are still 
required, either way.


From guido at python.org  Wed May  9 19:56:19 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 May 2007 10:56:19 -0700
Subject: [Python-3000] binascii.b2a_qp() in the p3yk branch
In-Reply-To: <4641E2F5.4000702@livinglogic.de>
References: <4641E2F5.4000702@livinglogic.de>
Message-ID: <ca471dc20705091056ud68b79evaab3fc99a7daa8ab@mail.gmail.com>

Fixed. The code was using strchr() instead of memchr(), which was
wrong anyway; but b"" is the only object (apparently) whose buffer
pointer is NULL when the size is 0.

Committed revision 55204.

Please backport (I wouldn't be surprised if this could be exploited).

On 5/9/07, Walter Dörwald <walter at livinglogic.de> wrote:
> binascii.b2a_qp() in the p3yk branch is broken. What I get is:
>
> $ gdb ./python
> GNU gdb 6.3-debian
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-linux"...Using host libthread_db
> library "/lib/tls/libthread_db.so.1".
>
> (gdb) run
> Starting program: /var/home/walter/checkouts/Python/p3yk/python
> [Thread debugging using libthread_db enabled]
> [New Thread -1209593088 (LWP 17690)]
> Python 3.0x (p3yk:55200, May  9 2007, 11:43:49)
> [GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import binascii
>  >>> binascii.b2a_qp(b'')
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread -1209593088 (LWP 17690)]
> 0xb7ee9093 in strchr () from /lib/tls/libc.so.6
> (gdb) bt
> #0  0xb7ee9093 in strchr () from /lib/tls/libc.so.6
> #1  0xb7c4744c in binascii_b2a_qp (self=0x0, args=0x0, kwargs=0x0) at
> /var/home/walter/checkouts/Python/p3yk/Modules/binascii.c:1153
> #2  0x0807788e in PyCFunction_Call (func=0xb7e26e8c, arg=0xb7e1328c,
> kw=0xa0a0a0a) at Objects/methodobject.c:77
> #3  0x080adbb4 in call_function (pp_stack=0xbffff45c, oparg=0) at
> Python/ceval.c:3513
> #4  0x080abe66 in PyEval_EvalFrameEx (f=0x8235aa4, throwflag=0) at
> Python/ceval.c:2191
> #5  0x080ac9e4 in PyEval_EvalCodeEx (co=0xb7e0fbf0, globals=0x0,
> locals=0xa0a0a0a, args=0xb7e3102c, argcount=0, kws=0x0, kwcount=0,
> defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:2812
> #6  0x080aef5f in PyEval_EvalCode (co=0x0, globals=0x0, locals=0x0) at
> Python/ceval.c:491
> #7  0x080d14ba in run_mod (mod=0x0, filename=0x0, globals=0x0,
> locals=0x0, flags=0x0, arena=0x0) at Python/pythonrun.c:1282
> #8  0x080d0967 in PyRun_InteractiveOneFlags (fp=0x0, filename=0x8116596
> "<stdin>", flags=0xbffff65c) at Python/pythonrun.c:800
> #9  0x080d0793 in PyRun_InteractiveLoopFlags (fp=0xb7f9cca0,
> filename=0x8116596 "<stdin>", flags=0xbffff65c) at Python/pythonrun.c:724
> #10 0x080d1d32 in PyRun_AnyFileExFlags (fp=0xb7f9cca0,
> filename=0x8116596 "<stdin>", closeit=0, flags=0xbffff65c) at
> Python/pythonrun.c:693
> #11 0x080569ab in Py_Main (argc=-1208365920, argv=0xbffff65c) at
> Modules/main.c:491
> #12 0x080564bb in main (argc=0, argv=0x0) at Modules/python.c:23
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bjourne at gmail.com  Wed May  9 21:54:46 2007
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Wed, 9 May 2007 21:54:46 +0200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
Message-ID: <740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>

On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> Comments and questions appreciated, as it'll help drive better explanations
> of both the design and rationales.  I'm usually not that good at guessing
> what other people will want to know (or are likely to misunderstand) until
> I get actual questions.

I haven't read it all yet. But my first comment is "This PEP is HUGE!"
922 lines. Is there any way you could shorten it or split it up into
more manageable chunks? My second comment is that there are too few
examples in the PEP.

> The API will be implemented in pure Python with no C, but may have
> some dependency on CPython-specific features such as ``sys._getframe``
> and the ``func_code`` attribute of functions.  It is expected that
> e.g. Jython and IronPython will have other ways of implementing
> similar functionality (perhaps using Java or C#).
>
>
> Rationale and Goals
> ===================
>
> Python has always provided a variety of built-in and standard-library
> generic functions, such as ``len()``, ``iter()``, ``pprint.pprint()``,
> and most of the functions in the ``operator`` module.  However, it
> currently:
>
> 1. does not have a simple or straightforward way for developers to
>     create new generic functions,

I think there is a very straightforward way. For example, a generic
function for token handling could be written like this:

    def handle_any(val):
        pass

    def handle_tok(tok, val):
        handlers = {
            ANY        : handle_any,
            BRANCH     : handle_branch,
            CATEGORY   : handle_category
        }
        try:
            return handlers[tok](val)
        except KeyError, e:
            fmt = "Unsupported token type: %s"
            raise ValueError(fmt % tok)

This is an idiom I have used hundreds of times. The handle_tok
function is generic because it dispatches to the correct handler based
on the type of tok.

> 2. does not have a standard way for methods to be added to existing
>     generic functions (i.e., some are added using registration
>     functions, others require defining ``__special__`` methods,
>     possibly by monkeypatching), and

When does "external" code wants to add to a generic function? In the
above example, you add to the generic function by inserting a new
key-value pair in the handlers list. If needed, it wouldn't be very
hard to make the handle_tok function extensible. Just make the
handlers object global.

> 3. does not allow dispatching on multiple argument types (except in
>     a limited form for arithmetic operators, where "right-hand"
>     (``__r*__``) methods can be used to do two-argument dispatch.

Why would you want that?

> The ``@overload`` decorator allows you to define alternate
> implementations of a function, specialized by argument type(s).  A
> function with the same name must already exist in the local namespace.
> The existing function is modified in-place by the decorator to add
> the new implementation, and the modified function is returned by the
> decorator.  Thus, the following code::
>
>      from overloading import overload
>      from collections import Iterable
>
>      def flatten(ob):
>          """Flatten an object to its component iterables"""
>          yield ob
>
>      @overload
>      def flatten(ob: Iterable):
>          for o in ob:
>              for ob in flatten(o):
>                  yield ob
>
>      @overload
>      def flatten(ob: basestring):
>          yield ob
>
> creates a single ``flatten()`` function whose implementation roughly
> equates to::
>
>      def flatten(ob):
>          if isinstance(ob, basestring) or not isinstance(ob, Iterable):
>              yield ob
>          else:
>              for o in ob:
>                  for ob in flatten(o):
>                      yield ob
>
> **except** that the ``flatten()`` function defined by overloading
> remains open to extension by adding more overloads, while the
> hardcoded version cannot be extended.

I very much prefer the latter version. The reason is because the
"locality of reference" is much worse in the overloaded version and
because I have found it to be very hard to read and understand
overloaded code in practice.

Let's say you find some code that looks like this:

    def do_stuff(ob):
        yield obj

    @overload
    def do_stuff(ob : ClassA):
        for o in ob:
            for ob in do_stuff(o):
                yield ob

    @overload
    def do_stuff(ob : classb):
        yield ob

Or this:

    def do_stuff(ob):
        if isinstance(ob, classb) or not isinstance(ob, ClassA):
            yield ob
        else:
            for o in ob:
                for ob in do_stuff(o):
                    yield ob

With the overloaded code, you have to read EVERY definition of
"do_stuff" to understand what the code does. Not just every definition
in the same module, but every definition in the whole program because
someone might have extended the do_stuff generic function.

What if they have defined a do_stuff that dispatch on ClassC that is a
subclass of ClassA? Good luck in figuring out what the code does.

With the non-overloaded version you also have the ability to insert
debug print statements to figure out what happens.

> For example, if someone wants to use ``flatten()`` with a string-like
> type that doesn't subclass ``basestring``, they would be out of luck
> with the second implementation.  With the overloaded implementation,
> however, they can either write this::
>
>      @overload
>      def flatten(ob: MyString):
>          yield ob
> or this (to avoid copying the implementation)::
>
>      from overloading import RuleSet
>      RuleSet(flatten).copy_rules((basestring,), (MyString,))

That may be great for flexibility, but I contend that it is awful for
reality. In reality, it would be much simpler and more readable to
just rewrite the flatten method:

    def flatten(ob):
        flat = (isinstance(ob, (basestring, MyString)) or
                not isinstance(ob, Iterable))
        if flat:
            yield ob
        else:
            for o in ob:
                for ob in flatten(o):
                    yield ob

Or change MyString so that it derives from basestring.


> Most of the functionality described in this PEP is already implemented
> in the in-development version of the PEAK-Rules framework.  In
> particular, the basic overloading and method combination framework
> (minus the ``@overload`` decorator) already exists there.  The
> implementation of all of these features in ``peak.rules.core`` is 656
> lines of Python at this writing.

I think PEAK is a great framework and that generic functions are great
for those who like it. But I'm not convinced that writing multiple
dispatch functions the way PEAK prescribes is better than any of
the currently used idioms.

I first encountered them when I tried to fix a bug in the jsonify.py
module in TurboGears (now relocated to the TurboJSON package). It took
me about 30 minutes to figure out how it worked (including manual
reading). Had not PEAK style generic functions been used, it would
have taken me 2 minutes top.

So IMHO, generic functions certainly are useful for some things, but
not useful enough. Using them as a replacement for ordinary multiple
dispatch techniques is a bad idea.


-- 
mvh Björn

From steven.bethard at gmail.com  Wed May  9 22:16:26 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed, 9 May 2007 14:16:26 -0600
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
Message-ID: <d11dcfba0705091316j4ec75a38xb70ce8b4a4f0b375@mail.gmail.com>

On 5/9/07, BJörn Lindqvist <bjourne at gmail.com> wrote:
> I very much prefer the latter version. The reason is because the
> "locality of reference" is much worse in the overloaded version and
> because I have found it to be very hard to read and understand
> overloaded code in practice.
>
> Let's say you find some code that looks like this:
>
>     def do_stuff(ob):
>         yield obj
>
>     @overload
>     def do_stuff(ob : ClassA):
>         for o in ob:
>             for ob in do_stuff(o):
>                 yield ob
>
>     @overload
>     def do_stuff(ob : classb):
>         yield ob
>
> Or this:
>
>     def do_stuff(ob):
>         if isinstance(ob, classb) or not isinstance(ob, ClassA):
>             yield ob
>         else:
>             for o in ob:
>                 for ob in do_stuff(o):
>                     yield ob
>
> With the overloaded code, you have to read EVERY definition of
> "do_stuff" to understand what the code does. Not just every definition
> in the same module, but every definition in the whole program because
> someone might have extended the do_stuff generic function.

I don't buy this argument.  That's like saying that I can't understand
what len() does without examining every object that defines __len__().
 Do you really have trouble understanding functions like len() or
hash()?
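
Put in code form (a toy class, obviously):

    class Playlist(object):
        """its __len__ lives here, far away from any call site"""
        def __init__(self, tracks):
            self._tracks = list(tracks)
        def __len__(self):
            return len(self._tracks)

    print(len(Playlist(['a', 'b'])))   # 2 -- you trust len()'s contract,
                                       # not the source of every __len__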

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From steven.bethard at gmail.com  Wed May  9 22:41:14 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed, 9 May 2007 14:41:14 -0600
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
Message-ID: <d11dcfba0705091341g2609864fx835ab7f565a63e37@mail.gmail.com>

On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> PEP: 3124
> Title: Overloading, Generic Functions, Interfaces, and Adaptation
[snip]
>      from overloading import overload
>      from collections import Iterable
>
>      def flatten(ob):
>          """Flatten an object to its component iterables"""
>          yield ob
>
>      @overload
>      def flatten(ob: Iterable):
>          for o in ob:
>              for ob in flatten(o):
>                  yield ob
>
>      @overload
>      def flatten(ob: basestring):
>          yield ob
[snip]
> ``@overload`` vs. ``@when``
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> The ``@overload`` decorator is a common-case shorthand for the more
> general ``@when`` decorator.  It allows you to leave out the name of
> the function you are overloading, at the expense of requiring the
> target function to be in the local namespace.  It also doesn't support
> adding additional criteria besides the ones specified via argument
> annotations.  The following function definitions have identical
> effects, except for name binding side-effects (which will be described
> below)::
>
>      @overload
>      def flatten(ob: basestring):
>          yield ob
>
>      @when(flatten)
>      def flatten(ob: basestring):
>          yield ob
>
>      @when(flatten)
>      def flatten_basestring(ob: basestring):
>          yield ob
>
>      @when(flatten, (basestring,))
>      def flatten_basestring(ob):
>          yield ob
[snip]

+1 on @overload and @when.


> Proceeding to the "Next" Method
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[snip]
> "Before" and "After" Methods
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[snip]
> "Around" Methods
> ~~~~~~~~~~~~~~~~
[snip]
> Custom Combinations
> ~~~~~~~~~~~~~~~~~~~

I'd rather see all this left as a third-party library to start with.
(Yes, even including __proceed__.)  It shouldn't be a problem to
supply these things separately, right?


> Interfaces and Adaptation
> -------------------------
>
> The ``overloading`` module provides a simple implementation of
> interfaces and adaptation.  The following example defines an
> ``IStack`` interface, and declares that ``list`` objects support it::
>
>      from overloading import abstract, Interface
>
>      class IStack(Interface):
>          @abstract
>          def push(self, ob):
>              """Push 'ob' onto the stack"""
>
>          @abstract
>          def pop(self):
>              """Pop a value and return it"""
>
>
>      when(IStack.push, (list, object))(list.append)
>      when(IStack.pop, (list,))(list.pop)
>
>      mylist = []
>      mystack = IStack(mylist)
>      mystack.push(42)
>      assert mystack.pop()==42
>
> The ``Interface`` class is a kind of "universal adapter".  It accepts
> a single argument: an object to adapt.  It then binds all its methods
> to the target object, in place of itself.  Thus, calling
> ``mystack.push(42)`` is the same as calling
> ``IStack.push(mylist, 42)``.

+1 on adapters like this.

> Interfaces as Type Specifiers
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ``Interface`` subclasses can be used as argument annotations to
> indicate what type of objects are acceptable to an overload, e.g.::
>
>      @overload
>      def traverse(g: IGraph, s: IStack):
>          g = IGraph(g)
>          s = IStack(s)
>          # etc....

and +1 on being able to specify Interfaces as "types".

> Aspects
> -------
[snip]
>      from overloading import Aspect
>
>      class Count(Aspect):
>          count = 0
>
>      @after(Target.some_method)
>      def count_after_call(self, *args, **kw):
>          Count(self).count += 1

Again, I'd rather see this kind of thing in a third-party library.


Summary of my PEP thoughts:
* Keep things simple: just @overload, @when, @abstract and Interface.
* More complex things like __proceed__, @before, @after, Aspects, etc.
should be added by third-party modules

As others have mentioned, the current PEP is overwhelming. I'd rather
see Py3K start with just the basics. When people are comfortable with
the core, we can look into introducing the extras.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From pje at telecommunity.com  Wed May  9 22:58:39 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 09 May 2007 16:58:39 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
Message-ID: <20070509205655.622A63A4061@sparrow.telecommunity.com>

At 09:54 PM 5/9/2007 +0200, BJörn Lindqvist wrote:
>On 5/1/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > Comments and questions appreciated, as it'll help drive better explanations
> > of both the design and rationales.  I'm usually not that good at guessing
> > what other people will want to know (or are likely to misunderstand) until
> > I get actual questions.
>
>I haven't read it all yet. But my first comment is "This PEP is HUGE!"
>922 lines. Is there any way you could shorten it or split it up into
>more manageable chunks? My second comment is that there are too few
>examples in the PEP.

So it's too big AND too small.  I guess it's then equally displeasing 
to everyone.  :)

I notice that most of the rest of your message calls for further additions.  :)

> > 1. does not have a simple or straightforward way for developers to
> >     create new generic functions,
>
>I think there is a very straightforward way. For example, a generic
>function for token handling could be written like this:
>
>     def handle_any(val):
>         pass
>
>     def handle_tok(tok, val):
>        handlers = {
>            ANY        : handle_any,
>            BRANCH     : handle_branch,
>            CATEGORY   : handle_category
>        }
>        try:
>            return handlers[tok](val)
>        except KeyError, e:
>            fmt = "Unsupported token type: %s"
>            raise ValueError(fmt % tok)
>
>This is an idiom I have used hundreds of times. The handle_tok
>function is generic because it dispatches to the correct handler based
>on the type of tok.

First, this example is broken, since there's no way for anybody to 
add handlers to it (entirely aside from the fact that it recreates 
the dispatch table every time it executes).

Second, even if you *could* add handlers to it, you'd need to 
separately document the mechanism for adding handlers, for each and 
every new generic function. The purpose of the API in 3124 is to have 
a standard API that's independent of *how* the dispatching is 
actually implemented.  That is, whether you look up types in a 
dictionary or implement full predicate dispatch makes no difference to the API.
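
(For the record, the usual ad-hoc fix looks something like the sketch
below -- the ANY constant and handler names are stubbed in from your
example -- and the point is that every such generic function ends up
growing its own slightly different copy of this registration machinery:)

     ANY = "ANY"         # placeholder for the real token constant

     handlers = {}       # module-level registry, for *this* function only

     def register(tok_type):
         """ad-hoc registration decorator; each generic function needs its own"""
         def decorate(func):
             handlers[tok_type] = func
             return func
         return decorate

     @register(ANY)
     def handle_any(val):
         pass

     def handle_tok(tok, val):
         try:
             return handlers[tok](val)
         except KeyError:
             raise ValueError("Unsupported token type: %s" % tok)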


> > 2. does not have a standard way for methods to be added to existing
> >     generic functions (i.e., some are added using registration
> >     functions, others require defining ``__special__`` methods,
> >     possibly by monkeypatching), and
>
>When does "external" code wants to add to a generic function?

Any time you want to use new code with an existing framework.  For 
example, anyone who wants their objects documented by pydoc currently 
has to reverse-engineer a bunch of inspection code, while in a GF-based 
design they'd just add methods.

For more examples, see this thread:

http://mail.python.org/pipermail/python-3000/2007-May/007217.html


>What if they have defined a do_stuff that dispatch on ClassC that is a
>subclass of ClassA? Good luck in figuring out what the code does.
>
>With the non-overloaded version you also have the ability to insert
>debug print statements to figure out what happens.

Ahem.

     @before(do_stuff)
     def debug_it(ob: ClassC):
         import pdb
         pdb.set_trace()

Note that you don't even need to know what module the behavior you're 
looking for is even *in*; you only need to know where to import 
do_stuff and ClassC from, and put the above in a module that's been 
imported when do_stuff is called.

In other words, generic functions massively increase your ability to 
trace specific execution paths.


>That may be great for flexibility, but I contend that it is awful for
>reality. In reality, it would be much simpler and more readable to
>just rewrite the flatten method:

Not if it's *not your flatten function*, it wouldn't be.


>Or change MyString so that it derives from basestring.

Not if it's *not your MyString* class.


>I first encountered them when I tried to fix a bug in the jsonify.py
>module in TurboGears (now relocated to the TurboJSON package). It took
>me about 30 minutes to figure out how it worked (including manual
>reading). Had not PEAK style generic functions been used, it would
>have taken me 2 minutes top.

So, you're saying it only took 28 minutes to acquire a skill that you 
can now use elsewhere?  That sounds great, actually.  :)


>So IMHO, generic functions certainly are useful for some things, but
>not useful enough. Using them as a replacement for ordinary multiple
>dispatch techniques is a bad idea.

What do you mean by "ordinary multiple dispatch techniques"?  No 
offense intended, but from the overall context of your message, it 
sounds like perhaps you don't know what "multiple dispatch" means, 
since you earlier asked "why would you want that?" in reference to an 
example of it.


From ncoghlan at gmail.com  Wed May  9 23:09:22 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 May 2007 07:09:22 +1000
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070509205655.622A63A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
Message-ID: <46423882.6070006@gmail.com>

Phillip J. Eby wrote:
> At 09:54 PM 5/9/2007 +0200, BJörn Lindqvist wrote:
>> With the non-overloaded version you also have the ability to insert
>> debug print statements to figure out what happens.
> 
> Ahem.
> 
>      @before(do_stuff)
>      def debug_it(ob: ClassC):
>          import pdb
>          pdb.set_trace()
> 
> Note that you don't even need to know what module the behavior you're 
> looking for is even *in*; you only need to know where to import 
> do_stuff and ClassC from, and put the above in a module that's been 
> imported when do_stuff is called.
> 
> In other words, generic functions massively increase your ability to 
> trace specific execution paths.

Possibly another good example to include in the PEP...

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From pje at telecommunity.com  Wed May  9 23:38:28 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 09 May 2007 17:38:28 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <d11dcfba0705091341g2609864fx835ab7f565a63e37@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<d11dcfba0705091341g2609864fx835ab7f565a63e37@mail.gmail.com>
Message-ID: <20070509213643.A37643A4061@sparrow.telecommunity.com>

At 02:41 PM 5/9/2007 -0600, Steven Bethard wrote:
>On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>Proceeding to the "Next" Method
>>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>[snip]
>>"Before" and "After" Methods
>>~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>[snip]
>>"Around" Methods
>>~~~~~~~~~~~~~~~~
>[snip]
>>Custom Combinations
>>~~~~~~~~~~~~~~~~~~~
>
>I'd rather see all this left as a third-party library to start with.
>(Yes, even including __proceed__.)

That'd be rather like adding new-style classes but not super().


>It shouldn't be a problem to supply these things separately, right?

Separating proceed-ability out would be tough; every function that 
wanted to use it in any way would need additional decoration to flag 
that it wanted to use an alternative base method implementation.

Meanwhile, for the rest of the features, most of the implementation 
would still have to be in the core module.  The method combination 
framework has to exist in the core, or it can't do method combination 
without essentially replacing what's *in* the core, at which point 
you're not really *using* it any more.  That is, you'd just be using 
the third-party library.

In other words, no, you can't take out all forms of method 
combination (which is essentially what you're proposing) and still 
have the ability to add it back in later.

Meanwhile, leaving in the ability to have method combination later, 
but removing the actual implementation of the @before/around/after 
decorators that's currently in place, would delete a total of less 
than 40 non-blank lines of code.  Removing __proceed__ support would 
delete maybe 10 lines more, tops.

Given that removing the 40 lines removes an excellent example of how 
to use the combination framework, and removing the 10 imposes 
considerable difficulty for anybody else to put them back, it seems 
unwise to me to take either of them out.  That is, I don't see what 
gain there is by removing them, that wouldn't be equally well 
addressed by splitting documentation.

(These lines-of-code estimates are based on what's in 
peak.rules.core, of course, and so might change a bit depending on 
how things go with the PEP.)


>>Aspects
>>-------
>[snip]
>>      from overloading import Aspect
>>
>>      class Count(Aspect):
>>          count = 0
>>
>>      @after(Target.some_method)
>>      def count_after_call(self, *args, **kw):
>>          Count(self).count += 1
>
>Again, I'd rather see this kind of thing in a third-party library.

The reason for it being in the PEP is that it benefits from having a 
single shared implementation (especially for the weakref dictionary, 
but also for common-maintenance reasons).  Also, the core's 
implementation of generic functions will almost certainly be using 
Aspects itself, so it might as well expose that implementation for 
others to use...


>Summary of my PEP thoughts:
>* Keep things simple: just @overload, @when, @abstract and Interface.
>* More complex things like __proceed__, @before, @after, Aspects, etc.
>should be added by third-party modules
>
>As others have mentioned, the current PEP is overwhelming. I'd rather
>see Py3K start with just the basics. When people are comfortable with
>the core, we can look into introducing the extras.

Naturally, I don't consider any of these items "extras", or I 
wouldn't have included them.  The "extras" to me are things like full 
predicate dispatch with pattern matching and variable binding, 
ordered classifiers, parsing combinators (i.e. using overloads to 
define grammar productions), custom implication precedence, custom 
predicate indexes, and all that sort of thing.

What's proposed in the PEP is a far cry from being even as expressive 
as CLOS or AspectJ are, but it does supply the bare minimum needed to 
create a foundation for other libraries to build such capabilities on it.

(Btw, a side data point: Ruby 2.0 is supposed to include method 
combination; specifically ":pre", ":post", and ":wrap" qualifiers 
modeled on CLOS's "before", "after", and "around", respectively.)


From pje at telecommunity.com  Wed May  9 23:44:02 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 09 May 2007 17:44:02 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <46423882.6070006@gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<46423882.6070006@gmail.com>
Message-ID: <20070509214217.D03463A4061@sparrow.telecommunity.com>

At 07:09 AM 5/10/2007 +1000, Nick Coghlan wrote:
>Phillip J. Eby wrote:
> > At 09:54 PM 5/9/2007 +0200, BJ?rn Lindqvist wrote:
> >> With the non-overloaded version you also have the ability to insert
> >> debug print statements to figure out what happens.
> >
> > Ahem.
> >
> >      @before(do_stuff)
> >      def debug_it(ob: ClassC):
> >          import pdb
> >          pdb.set_trace()
> >
> > Note that you don't even need to know what module the behavior you're
> > looking for is even *in*; you only need to know where to import
> > do_stuff and ClassC from, and put the above in a module that's been
> > imported when do_stuff is called.
> >
> > In other words, generic functions massively increase your ability to
> > trace specific execution paths.
>
>Possibly another good example to include in the PEP...

Probably.  When I write PEP's I tend to assume my primary audience is 
Guido, and I know he's already seen tons of 
tracing/logging/debugging/contract checking examples of what you can 
do with AOP.  Or at least, I know he's previously mentioned being 
unimpressed by such.  In any case, I didn't want to use that sort of 
example for fear that some might write the entire thing off as being 
more "unconvincing examples of AOP".

Still, it is kind of handy that you can write all your contract 
checking and debug/trace/log code in separate modules from your main 
code, and simply import those modules to activate those 
features.  It's just not the only reason or even the most important 
reason to have generic functions.


From benji at benjiyork.com  Thu May 10 00:11:36 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 09 May 2007 18:11:36 -0400
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>	<20070509015553.9C6843A4061@sparrow.telecommunity.com>	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
Message-ID: <46424718.20006@benjiyork.com>

Phillip J. Eby wrote:
> If you don't count in-house "enterprise" operations and shops like 
> Google, Yahoo, et al., the development of Zope is certainly one of 
> the largest (if not the very largest) Python project.  It's 
> understandable that LBYL is desirable in that environment.  On the 
> other hand, such large projects in Python are pretty darn rare.

By way of clarification: Even in the large Zope 3 projects I work on 
(which obviously use zope.interface), we virtually never use interfaces 
for LBYL (just as Zope 3 itself rarely does).

Instead, we either assume something implements a (little "i") interface 
and act as such (never invoking the interface machinery, the way most 
people write Python), or we use adaptation to ask for something that 
implements a particular (big "I") Interface (but even there no 
verification is done).

My point is, people generally use zope.interface Interfaces as 
documentation and names for particular behavior/API, not as an LBYL 
enforcement mechanism.
-- 
Benji York
http://benjiyork.com

From jimjjewett at gmail.com  Thu May 10 00:26:48 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 9 May 2007 18:26:48 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070509205655.622A63A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
Message-ID: <fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>

On 5/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:54 PM 5/9/2007 +0200, BJörn Lindqvist wrote:

> >What if they have defined a do_stuff that dispatch on ClassC that is a
> >subclass of ClassA? Good luck in figuring out what the code does.

> >With the non-overloaded version you also have the ability to insert
> >debug print statements to figure out what happens.

>      @before(do_stuff)
>      def debug_it(ob: ClassC):
>          import pdb
>          pdb.set_trace()

I think this may be backwards from his question.  As I read it, you
know about class A, but have never heard about class C (which happens
to be a substitute for A).  Someone added a different do_stuff
implementation for class C.

    @before(do_stuff)
    def debug_it(obj: ClassA):    # Never called, it is a classC

    def debug_it(obj: not ClassA)   # can't do this?

    def debug_it(obj):                # OK, trace *everything*.
        # Or, at least, everything that nicely did a call_next_method,
        # in case you wanted to wrap it this way.  Objects that thought
        # they were providing a complete concrete implementation will
        # still sneak through

    def wrap_the_generic(generic_name, debug_it):
        orig = generic_name
        def replacement( ...)  # hope you get the .sig right
            debug_it(...)
            orig(...)
        generic_name = replacement   # hope you can monkeypatch
        # uhh ... was the original supposed to have additional behavior,
        # for more registrations, etc...

Unless I'm missing something, this only simplifies things when all
specific implementations not only drink the kool-ade, but avoid
kool-ade related bugs.
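
(For concreteness, a minimal runnable sketch of that wrapping idea -- the
"somemod" namespace below is just a throwaway stand-in for whatever module
actually owns the generic, so the snippet runs on its own:)

    class somemod:                       # stand-in for the module owning do_stuff
        @staticmethod
        def do_stuff(ob):
            return "primary do_stuff(%r)" % (ob,)

    def trace_call(*args, **kw):
        print("do_stuff called with", args, kw)

    def wrap_the_generic(namespace, name, debug_it):
        orig = getattr(namespace, name)
        def replacement(*args, **kw):    # forward whatever signature orig has
            debug_it(*args, **kw)        # trace first...
            return orig(*args, **kw)     # ...then delegate to the original
        setattr(namespace, name, replacement)   # monkeypatch; anyone who did
                                                # "from somemod import do_stuff"
                                                # still holds the old object

    wrap_the_generic(somemod, 'do_stuff', trace_call)
    print(somemod.do_stuff(42))          # traced, then dispatched as before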

-jJ

From jimjjewett at gmail.com  Thu May 10 00:46:28 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 9 May 2007 18:46:28 -0400
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <ca471dc20705081155u31b0b4d1y5868630398ca3206@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
	<bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
	<fb6fbf560705080734s4c315bc3i38c5ccbe3fc9a9a3@mail.gmail.com>
	<ca471dc20705080825j6b665307o408569dd2a9fb55d@mail.gmail.com>
	<fb6fbf560705081142j75dd696eybbefcf9e310b5a44@mail.gmail.com>
	<ca471dc20705081155u31b0b4d1y5868630398ca3206@mail.gmail.com>
Message-ID: <fb6fbf560705091546q3d4e3bbav7f381df3bd1c7ac8@mail.gmail.com>

On 5/8/07, Guido van Rossum <guido at python.org> wrote:
> On 5/8/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > On 5/8/07, Guido van Rossum <guido at python.org> wrote:


> > > 1. develop working code under 2.6
> > > 2. make sure it is warning-free with the special -Wpy3k option
> > > 3. use 2to3 to convert it to 3.0 compatible syntax in a temporary directory
> > > 4. run your unit test suite with 3.0
> > > 5. for any defects you find, EDIT THE 2.6 SOURCE AND GO BACK TO STEP 2

> > The problem is what to do after step 5 ...

> > Do you leave your 3 code in the awkward auto-generated format, and
> > suggest (by example) that py3 code is clunky?

> > Do you immediately stop supporting 2.x?

> > Or do you fork the code?

> I disagree that the converted code is awkward.

On python-dev, there was a recent discussion about changing stdlib
xrange into range.  It was pointed out that this would make conversion
harder, because range will get converted to list(xrange).

Maybe the tool has gotten smart enough to avoid constructions like:

    for k, v in list(dict.items()):

    for i in list(range(10)):

but I can't help feeling there will always be a few cases where it
makes the code longer and worse.

The hand patch for removing tuple parameters from the stdlib was certainly
better than anything the tool could ever be expected to generate.

I would be satisfied if the tool generates something like

    # 2to3:  Did you really *need* a list?
    for k, v in list(dict.items()):

and I can then change it back to

    for k, v in dict.items():

knowing it will run OK in both versions.  I will not be happy if I
have to do this editing more than once.

At the moment, I haven't seen anything that can't be expressed in code
that would run in either version.  (There might be something,
particularly in str/unicode or tracebacks; I just don't remember seeing
it, so I think I could choose to avoid it for most code.)

(Note that letting the common code be less efficient in 2.x is an
acceptable tradeoff for me.  Others might prefer a way to annotate a
function or class as already dual.)
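
(As a rough sketch of the kind of dual-version code I mean -- hand-written,
not anything the tool would emit:)

    import sys

    if sys.version_info[0] >= 3:
        def iteritems(d):
            return iter(d.items())       # 3.0: items() no longer builds a list
    else:
        def iteritems(d):
            return d.iteritems()         # 2.x: avoid the intermediate list

    for k, v in iteritems({'a': 1, 'b': 2}):
        print((k, v))                    # extra parens so it prints the same
                                         # thing under 2.x and 3.0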

-jJ

From greg.ewing at canterbury.ac.nz  Thu May 10 03:08:03 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 May 2007 13:08:03 +1200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <9F3AFD78-7C73-4C9F-8CA6-3D10A1468939@fuhm.net>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>
	<ca471dc20705071058n25a21acfvaca8e4979edfa404@mail.gmail.com>
	<9F3AFD78-7C73-4C9F-8CA6-3D10A1468939@fuhm.net>
Message-ID: <46427073.701@canterbury.ac.nz>

James Y Knight wrote:

> This just isn't true. Python can do an atomic increment in a fast  
> platform specific way.

The problem with this, from what I've heard, is that
atomic increment instructions tend to be on the order
of 100 times slower than normal memory accesses (I
guess because they have to bypass the cache or do extra
work to keep it consistent).

If that's true, even a single-instruction atomic increment
could be much slower than the currently used instruction
sequence for a Py_INCREF or Py_DECREF.

> It's quite possible the overhead of GIL-less INCREF/DECREF is still  
> too high even with atomic increment/decrement primitives, but AFAICT  
> nobody has actually tried it.

I thought that's what the oft-cited previous attempt was
doing, but maybe not. If not, it could be worth trying
to see what happens.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From pje at telecommunity.com  Thu May 10 03:23:41 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 09 May 2007 21:23:41 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
Message-ID: <20070510012202.EB0A83A4061@sparrow.telecommunity.com>

At 06:26 PM 5/9/2007 -0400, Jim Jewett wrote:
>On 5/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>At 09:54 PM 5/9/2007 +0200, BJörn Lindqvist wrote:
>> >What if they have defined a do_stuff that dispatch on ClassC that is a
>> >subclass of ClassA? Good luck in figuring out what the code does.
>
>> >With the non-overloaded version you also have the ability to insert
>> >debug print statements to figure out what happens.
>
>>      @before(do_stuff)
>>      def debug_it(ob: ClassC):
>>          import pdb
>>          pdb.set_trace()
>
>I think this may be backwards from his question.  As I read it, you
>know about class A, but have never heard about class C (which happens
>to be a substitute for A).  Someone added a different do_stuff
>implementation for class C.
>
>    @before(do_stuff)
>    def debug_it(obj: ClassA):    # Never called, it is a classC

Actually, if you read what was said above, ClassC is a subclass of 
ClassA, so the above *is* called.


>    def debug_it(obj: not ClassA)   # can't do this?

Actually, you can, if you create something like a NotClass type and 
register methods to define its implication relationships to classes 
and other criteria.  Of course, it then wouldn't be called for ClassC...


>    def debug_it(obj):                # OK, trace *everything*.
>        # Or, at least, everything that nicely did a call_next_method,
>        # in case you wanted to wrap it this way.  Objects that thought
>        # they were providing a complete concrete implementation will
>        # still sneak through

Which is an excellent demonstration, by the way, of another reason 
why before/after methods are useful.  They're all *always* called 
before and after the primary methods, regardless of how many of them 
were registered.


>    def wrap_the_generic(generic_name, debug_it):
>        orig = generic_name
>        def replacement( ...)  # hope you get the .sig right
>            debug_it(...)
>            orig(...)
>        generic_name = replacement   # hope you can monkeypatch
>        # uhh ... was the original supposed to have additional behavior,
>        # for more registrations, etc...

I don't understand what this last example is supposed to be, but note 
that if you want to create special @debug methods with higher 
precedence than Around methods, it's relatively simple:

    class Debug(Around):
        """Like an Around, but with higher precedence"""

    debug = Debug.make_decorator('debug')
    always_overrides(Debug, Around)
    always_overrides(Debug, Method)
    always_overrides(Debug, Before)
    always_overrides(Debug, After)
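
(Registering a debug method would then look just like the other qualifiers --
a sketch reusing do_stuff and ClassC from the example earlier in the thread:)

    @debug(do_stuff)
    def trace_it(ob: ClassC):
        print("about to run do_stuff on", ob)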

(It occurs to me that although the current prototype implementation 
requires you to explicitly declare method override relationships for 
all applicable types, I should probably make the transitive 
declaration(s) automatic, so that the above would require only 
'always_overrides(Debug, Around)' to work.)


>Unless I'm missing something, this only simplifies things when all
>specific implementations not only drink the kool-ade, but avoid
>kool-ade related bugs.

I don't understand what you mean here.


From greg.ewing at canterbury.ac.nz  Thu May 10 03:24:44 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 May 2007 13:24:44 +1200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <f1s312$489$1@sea.gmane.org>
References: <f1l97g$5cm$1@sea.gmane.org> <463E4645.5000503@acm.org>
	<20070506222840.25B2.JCARLSON@uci.edu> <f1s312$489$1@sea.gmane.org>
Message-ID: <4642745C.1040702@canterbury.ac.nz>

Giovanni Bajo wrote:
> using multiple processes cause some 
> headaches with frozen distributions (PyInstaller, py2exe, etc.), like those 
> usually found on Windows, specifically because Windows does not have fork().

Isn't that just a problem with Windows generally? I don't
see what the method of packaging has to do with it.

Also, I've seen it suggested that there may actually be
a way of doing something equivalent to a fork in Windows,
even though it doesn't have a fork() system call as such.
Does anyone know more about this?

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From pje at telecommunity.com  Thu May 10 03:30:22 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 09 May 2007 21:30:22 -0400
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <46424718.20006@benjiyork.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<20070509015553.9C6843A4061@sparrow.telecommunity.com>
	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
	<46424718.20006@benjiyork.com>
Message-ID: <20070510012836.16D0D3A4061@sparrow.telecommunity.com>

At 06:11 PM 5/9/2007 -0400, Benji York wrote:
>Phillip J. Eby wrote:
>>If you don't count in-house "enterprise" operations and shops like 
>>Google, Yahoo, et al., the development of Zope is certainly one of 
>>the largest (if not the very largest) Python project.  It's 
>>understandable that LBYL is desirable in that environment.  On the 
>>other hand, such large projects in Python are pretty darn rare.
>
>By way of clarification: Even in the large Zope 3 projects I work on 
>(which obviously use zope.interface), we virtually never use 
>interfaces for LBYL (just as Zope 3 itself rarely does).

Yet, this is precisely what Jeff is claiming zope.interface is 
useful/desirable *for*, and Jim Fulton has also been quite clear that 
LBYL is its very raison d'etre.

But of course, as you point out below, that's not necessarily what 
most zope.interface users actually *do* with it.  :)


>Instead, we either assume something implements a (little "i") 
>interface and act as such (never invoking the interface machinery, 
>the way most people write Python), or we use adaptation to ask for 
>something that implements a particular (big "I") Interface (but even 
>there no verification is done).
>
>My point is, people generally use zope.interface Interfaces as 
>documentation and names for particular behavior/API, not as an LBYL 
>enforcement mechanism.

And thus, for all of the use cases you just described, the minimal 
PEP 3124 Interface implementation should do just fine, yes?  Indeed, 
ABCs would work for those use cases too, if you didn't need 
adaptation.  Or am I missing something?


From talin at acm.org  Thu May 10 04:21:34 2007
From: talin at acm.org (Talin)
Date: Wed, 09 May 2007 19:21:34 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <4642745C.1040702@canterbury.ac.nz>
References: <f1l97g$5cm$1@sea.gmane.org>
	<463E4645.5000503@acm.org>	<20070506222840.25B2.JCARLSON@uci.edu>
	<f1s312$489$1@sea.gmane.org> <4642745C.1040702@canterbury.ac.nz>
Message-ID: <464281AE.7040903@acm.org>

Greg Ewing wrote:
> Giovanni Bajo wrote:
>> using multiple processes cause some 
>> headaches with frozen distributions (PyInstaller, py2exe, etc.), like those 
>> usually found on Windows, specifically because Windows does not have fork().
> 
> Isn't that just a problem with Windows generally? I don't
> see what the method of packaging has to do with it.
> 
> Also, I've seen it suggested that there may actually be
> a way of doing something equivalent to a fork in Windows,
> even though it doesn't have a fork() system call as such.
> Does anyone know more about this?

I also wonder about embedded systems and game consoles. I don't know how 
many embedded microprocessors support fork(), but I know that modern 
consoles such as PS/3 and Xbox do not, since they have no support for 
virtual memory at all.

Also remember that the PS/3 is supposed to be one of the poster children 
for multiprocessing -- the whole 'cell processor' thing. You can't write 
an efficient game on the PS/3 unless it uses multiple processors.

Admittedly, not many current console-based games use Python, but that 
need not always be the case in the future, and a number of PC-based 
games are using it already.

This much I agree: There's no point in talking about supporting multiple 
processors using threads as long as we're living in a refcounting world.

Thought experiment: Suppose you were writing a brand-new dynamic 
language today, designed to work efficiently on multi-processor systems. 
Forget all of Python's legacy implementation details such as GILs and 
refcounts and such. What would it look like, and how well would it 
perform? (And I don't mean purely functional languages a la Erlang.)

For example, in a language that is based on continuations at a very deep 
level, there need not be any "global interpreter" at all. Each separate 
flow of execution is merely a pointer to a call frame, the evaluation of 
which produces a pointer to another call frame (or perhaps the same 
one). Yes, there would still be some shared state that would have to be 
managed, but I wouldn't think that the performance penalty of managing 
that would be horrible.

-- Talin

From greg.ewing at canterbury.ac.nz  Thu May 10 04:40:34 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 May 2007 14:40:34 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070509205655.622A63A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
Message-ID: <46428622.7000204@canterbury.ac.nz>

Phillip J. Eby wrote:
> For 
> example, objects to be documented with pydoc currently have to 
> reverse engineer a bunch of inspection code, while in a GF-based 
> design they'd just add methods.

There's a problem with this that I haven't seen a good
answer to yet. To add a method to a generic function,
you have to import the module that defines the base
function. So any module that wants its objects documented
in a custom way ends up depending on pydoc.

This problem doesn't arise if a protocol-based approach
is used, e.g. having pydoc look for a __document__ method
or some such.

There's also the possibility that other documentation
systems could make use of the same protocol if it's
designed appropriately, whereas extending pydoc-defined
generic functions benefits pydoc and nothing else.

--
Greg

From steven.bethard at gmail.com  Thu May 10 04:43:07 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed, 9 May 2007 20:43:07 -0600
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070509213643.A37643A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<d11dcfba0705091341g2609864fx835ab7f565a63e37@mail.gmail.com>
	<20070509213643.A37643A4061@sparrow.telecommunity.com>
Message-ID: <d11dcfba0705091943m2fc2d233n1f24392855f86f62@mail.gmail.com>

On 5/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 02:41 PM 5/9/2007 -0600, Steven Bethard wrote:
> >On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> >>Proceeding to the "Next" Method
> >>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >[snip]
> >>"Before" and "After" Methods
> >>~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >[snip]
> >>"Around" Methods
> >>~~~~~~~~~~~~~~~~
> >[snip]
> >>Custom Combinations
> >>~~~~~~~~~~~~~~~~~~~
> >
> >I'd rather see all this left as a third-party library to start with.
> >(Yes, even including __proceed__.)
>
> That'd be rather like adding new-style classes but not super().

Ok, then leave __proceed__ in.  I'm not really particular about the
details -- I'm just hoping you can cut things down to the absolute
minimum you need, and provide the rest in a third party module.  As it
is, I think there's too much in the PEP for it to be comprehensible.
And @before, @after, etc. seemed like good candidates for being
supplied later.

> Meanwhile, for the rest of the features, most of the implementation
> would still have to be in the core module.

That's fine. I'm not worried about the implementation. I trust you can
handle that. ;-) I'm worried about trying to pack too much stuff into
a PEP.

> Meanwhile, leaving in the ability to have method combination later,
> but removing the actual implementation of the @before/around/after
> decorators in place would delete a total of less than 40 non-blank
> lines of code.

Sure, but it would also delete huge chunks of explanation about
something which really isn't the core of the PEP. Python got
decorators without the 6 lines of functools.update_wrapper -- I see
this as being roughly the same. In particular,
functools.update_wrapper was never mentioned in PEP 318.

> >As others have mentioned, the current PEP is overwhelming. I'd rather
> >see Py3K start with just the basics. When people are comfortable with
> >the core, we can look into introducing the extras.
>
> Naturally, I don't consider any of these items "extras", or I
> wouldn't have included them.

I understand that.  I'm just hoping you can find a way to cut the PEP
down enough so that folks have a chance of wrapping their head around
it. ;-)  I really do think something along these lines
(overloading/generic functions) is right for Python.  I just think the
current PEP is too overwhelming for people to see that.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From greg.ewing at canterbury.ac.nz  Thu May 10 04:56:08 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 May 2007 14:56:08 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070510012202.EB0A83A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
Message-ID: <464289C8.4080004@canterbury.ac.nz>

Phillip J. Eby wrote:

> Which is an excellent demonstration, by the way, of another reason 
> why before/after methods are useful.  They're all *always* called 
> before and after the primary methods, regardless of how many of them 
> were registered.

But unless I'm mistaken, ClassC can still take over the
whole show using a method that doesn't call the next
method.

>     debug = Debug.make_decorator('debug')
>     always_overrides(Debug, Around)
>     always_overrides(Debug, Method)
>     always_overrides(Debug, Before)
>     always_overrides(Debug, After)

This is getting seriously brain-twisting. Are you saying
that this somehow overrides the subclass relationships,
so that an @Debug method for ClassA always gets called
before other methods, even ones for ClassC? If so, I
think this is all getting way too deeply magical.

Also, you still can't completely win, as someone could
define an @UtterlySelfish decorator that takes precedence
over your @Debug decorator.

For that matter, what if there is simply another
decorator @Foo that is defined to always_override
@Around? The precedence between that and your
@Debug decorator then appears to be undefined.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Thu May 10 05:18:05 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 May 2007 15:18:05 +1200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <3d2ce8cb0705091839w7b4fec56ud6a1ed9cb0ad264d@mail.gmail.com>
References: <f1l97g$5cm$1@sea.gmane.org> <463E4645.5000503@acm.org>
	<20070506222840.25B2.JCARLSON@uci.edu> <f1s312$489$1@sea.gmane.org>
	<4642745C.1040702@canterbury.ac.nz>
	<3d2ce8cb0705091839w7b4fec56ud6a1ed9cb0ad264d@mail.gmail.com>
Message-ID: <46428EED.3060205@canterbury.ac.nz>

Mike Klaas wrote:

> NtCreateProcess with SectionHandler=NULL does a fork()-like
> copy-on-write thing.  But it is an internal kernel api.

I just did some googling on this, and it seems to be
described as "undocumented". Does this mean that it's
possible to call it from userland, just that it's not
guaranteed to exist in the future?

If so, it looks like it might be possible to give
Python a fork() that works on Windows, at least for
the time being.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Thu May 10 05:27:34 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 May 2007 15:27:34 +1200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <464281AE.7040903@acm.org>
References: <f1l97g$5cm$1@sea.gmane.org> <463E4645.5000503@acm.org>
	<20070506222840.25B2.JCARLSON@uci.edu> <f1s312$489$1@sea.gmane.org>
	<4642745C.1040702@canterbury.ac.nz> <464281AE.7040903@acm.org>
Message-ID: <46429126.5070801@canterbury.ac.nz>

Talin wrote:

> Thought experiment: Suppose you were writing a brand-new dynamic 
> language today, designed to work efficiently on multi-processor systems. 
> Forget all of Python's legacy implementation details such as GILs and 
> refcounts and such. What would it look like, and how well would it 
> perform? (And I don't mean purely functional languages a la Erlang.)

Although I wouldn't make it purely functional, I think
I'd take some ideas from things like Erlang and Occam.

In particular, I'd keep the processes/threads/whatever
as separated as possible, communicating only via well-
defined channels having copy semantics for mutable objects.

Anything directly shared between processes (code objects,
classes, etc.) would be read-only, and probably exempt
from refcounting to enable access without locking.
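
(Just to illustrate the shape of that model in Python as it stands -- this is
a sketch in the multiprocessing style offered today by the third-party
'processing' package; everything sent through a Queue is pickled, so each
side gets its own copy of any mutable object:)

    from multiprocessing import Process, Queue   # 'processing' exposes the same names

    def worker(inbox, outbox):
        data = inbox.get()          # a *copy* of the parent's object, not a shared ref
        data.append('done')         # mutating it cannot affect the parent's list
        outbox.put(data)

    if __name__ == '__main__':
        inbox, outbox = Queue(), Queue()
        p = Process(target=worker, args=(inbox, outbox))
        p.start()
        inbox.put(['work item'])
        print(outbox.get())         # ['work item', 'done']
        p.join()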

Hmmm... guess I'll have to go away and design PyLang
now. :-)

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From mike.klaas at gmail.com  Thu May 10 05:16:05 2007
From: mike.klaas at gmail.com (Mike Klaas)
Date: Wed, 9 May 2007 20:16:05 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <464281AE.7040903@acm.org>
References: <f1l97g$5cm$1@sea.gmane.org> <463E4645.5000503@acm.org>
	<20070506222840.25B2.JCARLSON@uci.edu> <f1s312$489$1@sea.gmane.org>
	<4642745C.1040702@canterbury.ac.nz> <464281AE.7040903@acm.org>
Message-ID: <3d2ce8cb0705092016n4ca01f45ld99ed8156aa7bf0f@mail.gmail.com>

On 5/9/07, Talin <talin at acm.org> wrote:

<>
> This much I agree: There's no point in talking about supporting multiple
> processors using threads as long as we're living in a refcounting world.
<>

But Python isn't--CPython, though, certainly is.  The CPython
interpreter has enormous stability, backward-compatibility, and speed
expectations to live up to, which makes huge architectural upheavals
an unlikely proposition.

I build multi-machine distributed systems using python (and hence use
multi-process parallelism all the time), but I would still like to
have a GILless CPython.  I don't buy the "multi-processor machines
aren't common" argument (certainly has not been my experience), nor
"threading is inferior to multiple processes as the former is too
hard": neither of these arguments would carry the day if (for
instance) a new python interpreter was created from scratch today.

Instead, the real reason the GIL still lingers in CPython is that such
an architectural change (while maintaining the same performance) is
difficult and _not done_.  No-one has solved this challenge, and until
that happens, talking on mailing lists about how great it would be is
pretty much pointless.  It would probably be more fruitful to start a
new python interpreter project based on a different architecture.
Perhaps you could even write it in python.  I suggest that you call it
"PyPy".

-MIke

From greg.ewing at canterbury.ac.nz  Thu May 10 05:31:42 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 May 2007 15:31:42 +1200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <3d2ce8cb0705092016n4ca01f45ld99ed8156aa7bf0f@mail.gmail.com>
References: <f1l97g$5cm$1@sea.gmane.org> <463E4645.5000503@acm.org>
	<20070506222840.25B2.JCARLSON@uci.edu> <f1s312$489$1@sea.gmane.org>
	<4642745C.1040702@canterbury.ac.nz> <464281AE.7040903@acm.org>
	<3d2ce8cb0705092016n4ca01f45ld99ed8156aa7bf0f@mail.gmail.com>
Message-ID: <4642921E.9070707@canterbury.ac.nz>

Mike Klaas wrote:
> It would probably be more fruitful to start a
> new python interpreter project based on a different architecture.

But it's not even clear what that different architecture
should be...

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From jcarlson at uci.edu  Thu May 10 05:38:49 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 09 May 2007 20:38:49 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <4642745C.1040702@canterbury.ac.nz>
References: <f1s312$489$1@sea.gmane.org> <4642745C.1040702@canterbury.ac.nz>
Message-ID: <20070509203702.25EF.JCARLSON@uci.edu>


Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Giovanni Bajo wrote:
> > using multiple processes cause some 
> > headaches with frozen distributions (PyInstaller, py2exe, etc.), like those 
> > usually found on Windows, specifically because Windows does not have fork().
> 
> Isn't that just a problem with Windows generally? I don't
> see what the method of packaging has to do with it.
> 
> Also, I've seen it suggested that there may actually be
> a way of doing something equivalent to a fork in Windows,
> even though it doesn't have a fork() system call as such.
> Does anyone know more about this?

Cygwin emulates fork() by creating a shared mmap, creating a new child
process, copying the contents of the parent process' memory to the child
process (after performing the proper allocations), and then hacking up the
child process' call stack.

 - Josiah


From rrr at ronadam.com  Thu May 10 08:22:54 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 10 May 2007 01:22:54 -0500
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <46428622.7000204@canterbury.ac.nz>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<46428622.7000204@canterbury.ac.nz>
Message-ID: <4642BA3E.8080708@ronadam.com>

Greg Ewing wrote:
> Phillip J. Eby wrote:
>> For 
>> example, objects to be documented with pydoc currently have to 
>> reverse engineer a bunch of inspection code, while in a GF-based 
>> design they'd just add methods.
> 
> There's a problem with this that I haven't seen a good
> answer to yet. To add a method to a generic function,
> you have to import the module that defines the base
> function. So any module that wants its objects documented
> in a custom way ends up depending on pydoc.

If you have everything at the same level then that may be true, but I don't 
think that is what Phillip is suggesting.


There might be a group of generic functions for introspection that all 
return some consistent data format back.  This might be in the inspect module.

Then you might have another set of generic functions for combining 
different sources of information together into another data structure. 
This would be used to pre-format and order the information.  These might be 
in docutils.

Then you might have a third level of generic functions for outputting that 
data in different formats.  Ie.. text, html, xml... reST.  This might be 
part of a generic formatting package.

Then pydoc becomes a very lightweight module that ties these together to 
do what it does, but it can still extend each framework where it needs to.
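
(A very rough sketch of that layering, written in the same @overload style as
the flatten example elsewhere in this thread; the function names and the data
format are made up purely for illustration:)

    from overloading import overload

    def describe(ob):
        """Introspection layer: return a plain-data description of ob."""
        return {'kind': 'object', 'name': repr(ob), 'doc': getattr(ob, '__doc__', None)}

    @overload
    def describe(ob: type):
        return {'kind': 'class', 'name': ob.__name__, 'doc': ob.__doc__}

    def render(info):
        """Formatting layer: turn a description into output text."""
        return '%(kind)s %(name)s: %(doc)s' % info

    # pydoc (or any other tool) just glues the layers together, and can
    # register additional describe()/render() methods where it needs to.
    def document(ob):
        return render(describe(ob))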


A lot of the apparent fear involved with ABC's and generic functions seems 
to be disregarding the notion that generally you know something about the 
data (and code) that is being used at particular points in a process.  It 
is that implicit quality that allows us to not need to LBYL or put 
try-excepts around everything we do.  I think ABC's and generic functions 
may allow us to extend that quality to more of our code.


*My feelings about ABC's and inheritance are that they are very useful for 
easily creating new objects.

*I think generic functions are very useful for doing operations on those 
objects when it doesn't make sense for those objects to do yet another type 
of operation on themselves.  Or to put it another way, not everything done 
to an object should be done by a method in that object.

Cheers,
    Ron

From paul.dubois at gmail.com  Thu May 10 08:23:01 2007
From: paul.dubois at gmail.com (Paul Du Bois)
Date: Wed, 9 May 2007 23:23:01 -0700
Subject: [Python-3000] the future of the GIL
In-Reply-To: <464281AE.7040903@acm.org>
References: <f1l97g$5cm$1@sea.gmane.org> <463E4645.5000503@acm.org>
	<20070506222840.25B2.JCARLSON@uci.edu> <f1s312$489$1@sea.gmane.org>
	<4642745C.1040702@canterbury.ac.nz> <464281AE.7040903@acm.org>
Message-ID: <85f6a31f0705092323h48c0130ayd48e1b6e03adb3a4@mail.gmail.com>

I'll just chime in tersely since this really seems like -ideas and not
-3000 territory

On 5/9/07, Talin <talin at acm.org> wrote:
> modern consoles such as PS/3 and Xbox do not, since they have no support for
> virtual memory at all.

Well, they have virtual memory as in virtual address spaces, but they
don't swap. Lack of fork() is more of a control thing.

> Also remember that the PS/3 is supposed to be one of the poster children
> for multiprocessing -- the whole 'cell processor' thing. You can't write
> an efficient game on the PS/3 unless it uses multiple processors.

The PS3 is a good argument _against_ having multiple threads in one
interp. With the cell architecture you want to stay very far away from
a shared memory threading model. The 8 "SPUs" in the cell run separate
processes in their own address space (256K, code _and_ data!), so the
cell works best with a "multiple processes with tightly managed
intercommunication channels" program architecture.

> Thought experiment: Suppose you were writing a brand-new dynamic
> language today, designed to work efficiently on multi-processor systems.

Let's see... it would make state sharing difficult and asynchronous
communication easy!

paul

From theller at ctypes.org  Thu May 10 08:49:38 2007
From: theller at ctypes.org (Thomas Heller)
Date: Thu, 10 May 2007 08:49:38 +0200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <46427073.701@canterbury.ac.nz>
References: <1d85506f0705050629k35ebdf6aj285e8f10489d21d5@mail.gmail.com>	<ca471dc20705071058n25a21acfvaca8e4979edfa404@mail.gmail.com>	<9F3AFD78-7C73-4C9F-8CA6-3D10A1468939@fuhm.net>
	<46427073.701@canterbury.ac.nz>
Message-ID: <f1ufa2$bd2$1@sea.gmane.org>

Greg Ewing schrieb:
> James Y Knight wrote:
> 
>> This just isn't true. Python can do an atomic increment in a fast  
>> platform specific way.
> 
> The problem with this, from what I've heard, is that
> atomic increment instructions tend to be on the order
> of 100 times slower than normal memory accesses (I
> guess because they have to bypass the cache or do extra
> work to keep it consistent).
> 
> If that's true, even a single-instruction atomic increment
> could be much slower than the currently used instruction
> sequence for a Py_INCREF or Py_DECREF.
> 
>> It's quite possible the overhead of GIL-less INCREF/DECREF is still  
>> too high even with atomic increment/decrement primitives, but AFAICT  
>> nobody has actually tried it.
> 
> I thought that's what the oft-cited previous attempt was
> doing, but maybe not. If not, it could be worth trying
> to see what happens.
> 
I have recompiled Python from svn trunk on Windows, after replacing
'(op)->ob_refcnt++' and '--(op)->ob_refcnt' with calls to InterlockedIncrement()
and InterlockedDecrement().  The result was that the pystones/second went
down from ~52000 to ~24500.  Quite disappointing, I would say.
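
(For comparison on other machines, the stock pystone benchmark can be re-run
in place with:)

    from test import pystone
    pystone.main()    # prints "This machine benchmarks at ... pystones/second"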

Thomas


From p.f.moore at gmail.com  Thu May 10 10:26:38 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 10 May 2007 09:26:38 +0100
Subject: [Python-3000] [Python-Dev] Byte literals (was Re:
	[Python-checkins] Changing string constants to byte arrays (
	r55119 - in python/branches/py3k-struni/Lib: codecs.py
	test/test_codecs.py ))
In-Reply-To: <fb6fbf560705091546q3d4e3bbav7f381df3bd1c7ac8@mail.gmail.com>
References: <20070505150035.GA16303@panix.com>
	<200705051334.45120.fdrake@acm.org>
	<20070505124008.648D.JCARLSON@uci.edu>
	<ca471dc20705071042x586b21bcgbfdf8fbf8144913e@mail.gmail.com>
	<bb8868b90705080616w471a42e4r68ba66d0e837f2f8@mail.gmail.com>
	<fb6fbf560705080734s4c315bc3i38c5ccbe3fc9a9a3@mail.gmail.com>
	<ca471dc20705080825j6b665307o408569dd2a9fb55d@mail.gmail.com>
	<fb6fbf560705081142j75dd696eybbefcf9e310b5a44@mail.gmail.com>
	<ca471dc20705081155u31b0b4d1y5868630398ca3206@mail.gmail.com>
	<fb6fbf560705091546q3d4e3bbav7f381df3bd1c7ac8@mail.gmail.com>
Message-ID: <79990c6b0705100126o339e1371ue8f5349b5b9a8d8a@mail.gmail.com>

On 09/05/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> Maybe the tool has gotten smart enough to avoid constructions like:
>
>     for k, v in list(dict.items()):
>
>     for i in list(range(10)):
>
> but I can't help feeling there will always be a few cases where it
> makes the code longer and worse.

Why don't you (in these cases) change your 2.x code to

    for k, v in dict.iteritems():

    for i in xrange(10):

Then 2to3 will do the right thing, *and* your 2.x code is improved...

> knowing it will run OK in both versions.  I will not be happy if I
> have to do this editing more than once.

If you edit the 2.6 source, you only need to do that once.

Paul.

From bjourne at gmail.com  Thu May 10 12:40:13 2007
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Thu, 10 May 2007 12:40:13 +0200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
Message-ID: <740c3aec0705100340w2f50ef4ex32a212a7949c8c7a@mail.gmail.com>

On 5/10/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/9/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 09:54 PM 5/9/2007 +0200, BJörn Lindqvist wrote:
>
> > >What if they have defined a do_stuff that dispatch on ClassC that is a
> > >subclass of ClassA? Good luck in figuring out what the code does.
>
> > >With the non-overloaded version you also have the ability to insert
> > >debug print statements to figure out what happens.
>
> >      @before(do_stuff)
> >      def debug_it(ob: ClassC):
> >          import pdb
> >          pdb.set_trace()
>
> I think this may be backwards from his question.  As I read it, you
> know about class A, but have never heard about class C (which happens
> to be a substitute for A).  Someone added a different do_stuff
> implementation for class C.

It is backwards; using the debugger solves a problem that should not
have been there in the first place. Let's assume the original flatten
example again:

    from overloading import overload
    from collections import Iterable

    def flatten(ob):
        """Flatten an object to its component iterables"""
        yield ob

    @overload
    def flatten(ob: Iterable):
        for o in ob:
            for ob in flatten(o):
                yield ob

    @overload
    def flatten(ob: basestring):
        yield ob

Let's also assume that:

1. The above code is stored in a file flatten.py.
2. There is a class MyString in file mystring.py which is an Iterable
   but which is not a basestring.
3. There is a third file foo.py which contains

    RuleSet(flatten).copy_rules((basestring,), (MyString,))

4. There is a fourth file, bar.py, which contains

    ms = MyString('hello')
    print list(flatten(ms))

5. These four files are part of a moderately large Python project
   containing 80 modules.

According to how Phillip has described PEAK-style generic functions,
these assumptions are not at all unreasonable.

I am a new programmer analyzing the code on that project. I have read
the files flatten.py, mystring.py and bar.py but not foo.py. The
snippet in bar.py is then very surprising to me because it will print
['hello'] instead of ['h', 'e', 'l', 'l', 'o'].

Using a simple dispatch technique like the one in my handle_tok
example, or in the non-generic version of flatten, I wouldn't have
this problem. Now I do, so how do I troubleshoot it?

I could use the debug_it @before-function, but I don't think I should
have to just to see the control flow of a darn flatten function. The
other approach would be to grep the whole source for "flatten." Then I
should be able to figure out which dispatch rules are active when the
snippet in bar.py is invoked. But it would require considerable work.


-- 
mvh Björn

From tomerfiliba at gmail.com  Thu May 10 13:36:55 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Thu, 10 May 2007 13:36:55 +0200
Subject: [Python-3000] mixin class decorator
Message-ID: <1d85506f0705100436j4ed5c2f7xe6bef98c3b86f5bf@mail.gmail.com>

with the new class decorators of py3k, new use cases emerge.
for example, now it is easy to have real mixin classes or even
mixin modules, a la ruby.

unlike inheritance, this mixin mechanism simply merges the
namespace of the class or module into the namespace of the
decorated class. it does not affect class hierarchies/MRO,
and provides finer granularity as to what methods are merged,
i.e., you explicit mark which methods should be merged.


def mixinmethod(func):
    """marks a method as a mixin method"""
    func.is_mixin = True
    return func

def get_unbound(obj, name):
    if name in obj.__dict__:
         return obj.__dict__[name]
    else:
         for b in obj.mro():
             if name in b.__dict__:
                 return b.__dict__[name]

def mixin(obj, override = False):
    """a class decorator that merges the attributes of 'obj' into the class"""
    def wrapper(cls):
        for name in dir(obj):
            attr = get_unbound(obj, name)
            if getattr(attr, "is_mixin", False):
                if override or not hasattr(cls, name):
                    setattr(cls, name, attr)
        return cls
    return wrapper

Example
==================
class DictMixin:
    @mixinmethod
    def __iter__(self):
        for k in self.keys():
            yield k

    @mixinmethod
    def has_key(self, key):
        try:
            value = self[key]
        except KeyError:
            return False
        return True

    @mixinmethod
    def clear(self):
        for key in self.keys():
            del self[key]
    ...

@mixin(DictMixin)
class MyDict:
    def keys(self):
        return range(10)

md = MyDict()
for k in md:
    print(k)
==================================

does it seem useful? should it be included in some stdlib?
or at least mentioned as a use case for class decorators in PEP 3129?
(not intended for 3.0a1)


-tomer

From benji at benjiyork.com  Thu May 10 15:06:24 2007
From: benji at benjiyork.com (Benji York)
Date: Thu, 10 May 2007 09:06:24 -0400
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <20070510012836.16D0D3A4061@sparrow.telecommunity.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>	<20070509015553.9C6843A4061@sparrow.telecommunity.com>	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>	<46424718.20006@benjiyork.com>
	<20070510012836.16D0D3A4061@sparrow.telecommunity.com>
Message-ID: <464318D0.2000109@benjiyork.com>

Phillip J. Eby wrote:
> At 06:11 PM 5/9/2007 -0400, Benji York wrote:
>> By way of clarification: Even in the large Zope 3 projects I work on 
>> (which obviously use zope.interface), we virtually never use 
>> interfaces for LBYL (just as Zope 3 itself rarely does).
> 
> Yet, this is precisely what Jeff is claiming zope.interface is 
> useful/desirable *for*,

I'll let him speak for himself.

> and Jim Fulton has also been quite clear that 
> LBYL is its very raison d'etre.

I would let Jim speak for himself too, but I prefer to put words in his
mouth. ;)  While zope.interface has anemic facilities for "verifying"
interfaces, few people use them, and even then rarely outside of very
simple "does this object look right" when testing.  It may have been
believed verification would be a great thing, but it's all but
deprecated at this point.

> And thus, for all of the use cases you just described, the minimal 
> PEP 3124 Interface implementation should do just fine, yes?

Could be, especially if it allows for adaptation.  I don't have the time
to pore over the PEP right now.  My main intent in piping up was
dispelling the LBYL aspersions about zope.interface. ;)

If the PEP cooperates as well with zope.interface as you suggest, all
will be good in the world.  Personally I'd prefer sufficient hooks be
added to the language and these types of things (interfaces, adaptation,
generic functions, etc.) be left to third parties (like yourself)
instead of being canonicalized unnecessarily.

> Indeed, 
> ABCs would work for those use cases too, if you didn't need 
> adaptation.  Or am I missing something?

The main advantage I see to zope.interface is adaptation.  Other than 
that, the fact that the inheritance and interface hierarchies aren't mixed.

I would turn the argument around and assert that interfaces can be used
for the rare LBYL uses that ABCs appear to be aimed at, as well as more
interesting things.
-- 
Benji York
http://benjiyork.com


From ncoghlan at gmail.com  Thu May 10 16:08:44 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 May 2007 00:08:44 +1000
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <464318D0.2000109@benjiyork.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>	<20070509015553.9C6843A4061@sparrow.telecommunity.com>	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>	<46424718.20006@benjiyork.com>	<20070510012836.16D0D3A4061@sparrow.telecommunity.com>
	<464318D0.2000109@benjiyork.com>
Message-ID: <4643276C.5040100@gmail.com>

Benji York wrote:
> If the PEP cooperates as well with zope.interface as you suggest, all
> will be good in the world.  Personally I'd prefer sufficient hooks be
> added to the language and these types of things (interfaces, adaptation,
> generic functions, etc.) be left to third parties (like yourself)
> instead of being canonicalized unnecessarily.

My understanding of PJE's PEP is that adding those hooks you mention is 
essentially what it is about :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From pje at telecommunity.com  Thu May 10 17:23:00 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 May 2007 11:23:00 -0400
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <4643276C.5040100@gmail.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<20070509015553.9C6843A4061@sparrow.telecommunity.com>
	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
	<46424718.20006@benjiyork.com>
	<20070510012836.16D0D3A4061@sparrow.telecommunity.com>
	<464318D0.2000109@benjiyork.com> <4643276C.5040100@gmail.com>
Message-ID: <20070510152120.407AB3A4061@sparrow.telecommunity.com>

At 12:08 AM 5/11/2007 +1000, Nick Coghlan wrote:
>Benji York wrote:
> > If the PEP cooperates as well with zope.interface as you suggest, all
> > will be good in the world.  Personally I'd prefer sufficient hooks be
> > added to the language and these types of things (interfaces, adaptation,
> > generic functions, etc.) be left to third parties (like yourself)
> > instead of being canonicalized unnecessarily.
>
>My understanding of PJE's PEP is that adding those hooks you mention is
>essentially what it is about :)

Yes, exactly -- plus a handful of useful default implementations that 
cover the most common use cases.

However, because the hooks themselves are implemented using those 
default implementations, we can't separate out the implementations 
and just leave the hooks!  There has to be *some* implementation of 
generic functions, method combination, interfaces, adaptation, and 
aspects, in order to implement the very hooks by which any 
replacement implementations would be installed.  It would be like 
trying to provide the idea of metaclasses without having the "type" 
type implemented.

That being the case, one might as well expose the basic functionality 
for people to use, until/unless their needs require an extended implementation.


From pje at telecommunity.com  Thu May 10 17:36:51 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 May 2007 11:36:51 -0400
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <464318D0.2000109@benjiyork.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<20070509015553.9C6843A4061@sparrow.telecommunity.com>
	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
	<46424718.20006@benjiyork.com>
	<20070510012836.16D0D3A4061@sparrow.telecommunity.com>
	<464318D0.2000109@benjiyork.com>
Message-ID: <20070510153507.205EB3A4061@sparrow.telecommunity.com>

At 09:06 AM 5/10/2007 -0400, Benji York wrote:
>I would let Jim speak for himself too, but I prefer to put words in his
>mouth. ;)  While zope.interface has anemic facilities for "verifying"
>interfaces, few people use them, and even then rarely outside of very
>simple "does this object look right" when testing.  It may have been
>believed verification would be a great thing, but it's all but
>deprecated at this point.

Okay, but that's quite the opposite of what I understand Jeff to be 
saying in this thread, which is that not only is LBYL good, but that 
he does it all the time.


>>And thus, for all of the use cases you just described, the minimal 
>>PEP 3124 Interface implementation should do just fine, yes?
>
>Could be, especially if it allows for adaptation.

Yes, it does.  In fact, adaptation is pretty much all they're good 
for, except for specifying argument types:

http://python.org/dev/peps/pep-3124/#interfaces-and-adaptation
http://python.org/dev/peps/pep-3124/#interfaces-as-type-specifiers


>My main intent in piping up was
>dispelling the LBYL aspersions about zope.interface. ;)

Well, "back in the day", before PyProtocols was written, I discovered 
PEP 246 adaptation and began trying to convince Jim Fulton that 
adaptation beat the pants off of using if-then's to do "implements" 
testing.  His argument then, IIRC, was that interface verification 
was more important.  I then went off and wrote PyProtocols in large 
part (specifically the large documentation part!) to show him what 
could be done using adaptation as a core concept.


>If the PEP cooperates as well with zope.interface as you suggest, all
>will be good in the world.  Personally I'd prefer sufficient hooks be
>added to the language and these types of things (interfaces, adaptation,
>generic functions, etc.) be left to third parties (like yourself)
>instead of being canonicalized unnecessarily.

Well, as Nick Coghlan's already pointed out, the PEP is mostly about 
creating a standard set of hooks, so that each framework doesn't have 
to reinvent decorators and syntax.


>>Indeed, ABCs would work for those use cases too, if you didn't need 
>>adaptation.  Or am I missing something?
>
>The main advantage I see to zope.interface is adaptation.  Other 
>than that, the fact that the inheritance and interface hierarchies 
>aren't mixed.

In which case, you might well be happy with PEP 3124 interfaces, 
unless you want to use instance-specific interfaces a lot.


>I would turn the argument around and assert that interfaces can be used
>for the rare LBYL uses that ABCs appear to be aimed at, as well as more
>interesting things.

Sure, which is another reason why PEP 3124 includes them.


From pje at telecommunity.com  Thu May 10 17:40:32 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 May 2007 11:40:32 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <740c3aec0705100340w2f50ef4ex32a212a7949c8c7a@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<740c3aec0705100340w2f50ef4ex32a212a7949c8c7a@mail.gmail.com>
Message-ID: <20070510153851.691673A40A0@sparrow.telecommunity.com>

At 12:40 PM 5/10/2007 +0200, BJörn Lindqvist wrote:
>I could use the debug_it @before-function, but I don't think I should
>have to just to see the control flow of a darn flatten function. The
>other approach would be to grep the whole source for "flatten." Then I
>should be able to figure out which dispatch rules are active when the
>snippet in bar.py is invoked. But it would require considerable work.

Or, you could simply print out the contents of RuleSet(flatten).  Or 
perhaps just the subset of those rules that you're interested in.


From pje at telecommunity.com  Thu May 10 17:47:19 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 May 2007 11:47:19 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <46428622.7000204@canterbury.ac.nz>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<46428622.7000204@canterbury.ac.nz>
Message-ID: <20070510154540.8CEC23A4061@sparrow.telecommunity.com>

At 02:40 PM 5/10/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
> > For
> > example, objects to be documented with pydoc currently have to
> > reverse engineer a bunch of inspection code, while in a GF-based
> > design they'd just add methods.
>
>There's a problem with this that I haven't seen a good
>answer to yet. To add a method to a generic function,
>you have to import the module that defines the base
>function. So any module that wants its objects documented
>in a custom way ends up depending on pydoc.

Using the "Importing" package from the Cheeseshop:

def register_pydoc(pydoc):

     @when(pydoc.get_signature)
     def signature_for_mytype(ob:MyType):
         ...  # etc.

     @when(pydoc.get_contents)
     def contents_for_mytype(ob:MyType):
         ...  # etc.

from peak.util.imports import whenImported
whenImported('pydoc', register_pydoc)


I certainly wouldn't object to making 'whenImported' and its friends 
a part of the stdlib.


>There's also the possibility that other documentation
>systems could make use of the same protocol if it's
>designed appropriately, whereas extending pydoc-defined
>generic functions benefits pydoc and nothing else.

Of course; it's actually somewhat more likely that the basic GFs 
should actually live in "inspect" (or something like it) rather than 
in "pydoc" per se.


From pje at telecommunity.com  Thu May 10 17:59:41 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 May 2007 11:59:41 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <d11dcfba0705091943m2fc2d233n1f24392855f86f62@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<d11dcfba0705091341g2609864fx835ab7f565a63e37@mail.gmail.com>
	<20070509213643.A37643A4061@sparrow.telecommunity.com>
	<d11dcfba0705091943m2fc2d233n1f24392855f86f62@mail.gmail.com>
Message-ID: <20070510155756.2DEB73A4061@sparrow.telecommunity.com>

At 08:43 PM 5/9/2007 -0600, Steven Bethard wrote:
>>Meanwhile, leaving in the ability to have method combination later,
>>but removing the actual implementation of the @before/around/after
>>decorators in place would delete a total of less than 40 non-blank
>>lines of code.
>
>Sure, but it would also delete huge chunks of explanation about
>something which really isn't the core of the PEP. Python got
>decorators without the 6 lines of functools.update_wrapper -- I see
>this as being roughly the same. In particular,
>functools.update_wrapper was never mentioned in PEP 318.

I see this as being more analogous to contextlib.contextmanager and 
PEP 343, myself.  :)


>I'm just hoping you can find a way to cut the PEP
>down enough so that folks have a chance of wrapping their head around
>it.

Well, it's a bit like new-style types, in that there are a bunch of 
pieces that go together, i.e., descriptors, metaclasses, slots, and 
mro.  I could certainly split the PEP into separate documents, but it 
might give the impression that the parts are more separable than they are.


>  ;-)  I really do think something along these lines
>(overloading/generic functions) is right for Python.  I just think the
>current PEP is too overwhelming for people to see that.

Yeah, and the dilemma is that if I go back and add in all the 
examples and clarifications that have come up in these threads, it's 
going to be even bigger.  Ditto for when I actually document the 
extension API part.  The PEP is already 50% larger (in text line 
count) than the implementation of most of its features!  (And the 
implementation already includes a bunch of the extension API.)

I'm certainly open to suggestions as to how best to proceed; I just 
don't see how, for example, to explain the PEP's interfaces without 
reference to generic functions.  So, even if it was split into 
different documents, you'd still have to read them in much the same 
order as in the one large document.

By the way, I have gotten off-list notes of encouragement from a 
number of people who've said they hope the PEP makes it, so evidently 
it's not overwhelming to everyone.  Unfortunately, it seems to be 
suffering a bit from Usenet Nod Syndrome among the people who are in 
favor of it.


From pje at telecommunity.com  Thu May 10 18:16:02 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 May 2007 12:16:02 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <464289C8.4080004@canterbury.ac.nz>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
Message-ID: <20070510161417.192943A4061@sparrow.telecommunity.com>

At 02:56 PM 5/10/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
>
> > Which is an excellent demonstration, by the way, of another reason
> > why before/after methods are useful.  They're all *always* called
> > before and after the primary methods, regardless of how many of them
> > were registered.
>
>But unless I'm mistaken, ClassC can still take over the
>whole show using a method that doesn't call the next
>method.

No, because you're still thinking of "before" and "after" as if they 
were syntax sugar for normal method chaining.  As I said above (and 
in the PEP), *all* before and after methods are always called, unless 
an exception is raised somewhere along the way.  This is one of the 
reasons they're useful to have, in addition to normal and "around" methods.


> >     debug = Debug.make_decorator('debug')
> >     always_overrides(Debug, Around)
> >     always_overrides(Debug, Method)
> >     always_overrides(Debug, Before)
> >     always_overrides(Debug, After)
>
>This is getting seriously brain-twisting. Are you saying
>that this somehow overrides the subclass relationships,
>so that an @Debug method for ClassA always gets called
>before other methods, even ones for ClassC?

Just like all the Around methods are always called before the before, 
after, and primary methods, and just like all the before methods are 
always called before the primary and after methods, etc.

This was all explicitly spelled out in the PEP:

   ``@before`` and ``@after`` methods are invoked either before or after
   the main function body, and are *never considered ambiguous*.  That
   is, it will not cause any errors to have multiple "before" or "after"
   methods with identical or overlapping signatures.  Ambiguities are
   resolved using the order in which the methods were added to the
   target function.

   "Before" methods are invoked most-specific method first, with
   ambiguous methods being executed in the order they were added.  All
   "before" methods are called before any of the function's "primary"
   methods (i.e. normal ``@overload`` methods) are executed.

   "After" methods are invoked in the *reverse* order, after all of the
   function's "primary" methods are executed.  That is, they are executed
   least-specific methods first, with ambiguous methods being executed in
   the reverse of the order in which they were added.

In particular, note the last sentence of the second paragraph, and 
the first sentence of the third paragraph.
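
For concreteness, here is a tiny self-contained toy sketch of that 
ordering (this is *not* the PEP 3124 implementation -- the decorator 
spellings are invented, @around methods are left out, and ambiguity 
errors are ignored; it only mimics the rules quoted above):

    class ToyGeneric:
        # Toy dispatcher: run all applicable "before" methods, the single
        # most specific "primary" method, then all applicable "after"s.
        def __init__(self):
            self.rules = {'before': [], 'primary': [], 'after': []}

        def _register(self, kind, typ):
            def decorate(func):
                self.rules[kind].append((typ, func))
                return func
            return decorate

        def before(self, typ): return self._register('before', typ)
        def when(self, typ):   return self._register('primary', typ)
        def after(self, typ):  return self._register('after', typ)

        def __call__(self, ob):
            mro = type(ob).__mro__
            def applicable(kind):
                return [(mro.index(t), f)
                        for t, f in self.rules[kind] if isinstance(ob, t)]
            # "before": most-specific first; ties keep registration order
            for _, f in sorted(applicable('before'), key=lambda p: p[0]):
                f(ob)
            # only the most specific primary method runs in this toy
            _, primary = min(applicable('primary'), key=lambda p: p[0])
            result = primary(ob)
            # "after": least-specific first (the real PEP additionally
            # reverses registration order among ambiguous after methods)
            for _, f in sorted(applicable('after'), key=lambda p: -p[0]):
                f(ob)
            return result

    do_stuff = ToyGeneric()

    class A: pass
    class B(A): pass

    @do_stuff.before(A)
    def note_a(ob): print('before A')

    @do_stuff.before(B)
    def note_b(ob): print('before B')

    @do_stuff.when(A)
    def main_a(ob): print('primary A')

    @do_stuff.when(B)
    def main_b(ob): print('primary B')

    @do_stuff.after(A)
    def clean_a(ob): print('after A')

    @do_stuff.after(B)
    def clean_b(ob): print('after B')

    do_stuff(B())   # before B, before A, primary B, after A, after B
    do_stuff(A())   # before A, primary A, after A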


>Also, you still can't completely win, as someone could
>define an @UtterlySelfish decorator that takes precedence
>over your @Debug decorator.

So?  Maybe that's what they *want*.  Sounds like "consenting adults" to me.


>For that matter, what if there is simply another
>decorator @Foo that is defined to always_override
>@Around? The precedence between that and your
>@Debug decorator then appears to be undefined.

If so, then you'll get an AmbiguousMethods error (either when 
defining the function or calling it) and thus be informed that you 
need another override declaration.


From jjb5 at cornell.edu  Thu May 10 18:23:11 2007
From: jjb5 at cornell.edu (Joel Bender)
Date: Thu, 10 May 2007 12:23:11 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070509205655.622A63A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
Message-ID: <464346EF.1080408@cornell.edu>

>      @before(do_stuff)
>      def debug_it(ob: ClassC):
>          import pdb
>          pdb.set_trace()

This is probably far fetched, but I would much rather see:

       before do_stuff(ob: ClassC):
           import pdb
           pdb.set_trace()

So the keywords 'before' and 'after' are just like 'def': they define 
functions with a particular signature that get inserted into the "which 
function to call" execution sequence.

I would want to be able to reference functions within classes as well:

       >>> class A:
       ...     def f(self, x):
       ...         print 'A.f, x =', x
       ...
       >>> z = A()
       >>> z.f(1)
       A.f, x = 1
       >>>
       >>> before A.f(self, x):
       ...     print 'yo!'
       ...
       >>> z.f(2)
       yo!
       A.f, x = 2

Could the sequence of opcodes for the 'before f()' get mushed into the 
front of the existing code for f()?  That would mean that changes to 'x' 
would be reflected in the original f():

       >>> before A.f(self, x):
       ...     print 'doubled'
       ...     x = x * 2
       >>> z.f(3)
       doubled
       yo!
       A.f, x = 6

And does a 'return' statement from a before short-circuit the call, or 
should it mean the same thing as falling off the end?


Joel

From tjreedy at udel.edu  Thu May 10 20:11:56 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 10 May 2007 14:11:56 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com><d11dcfba0705091341g2609864fx835ab7f565a63e37@mail.gmail.com><20070509213643.A37643A4061@sparrow.telecommunity.com><d11dcfba0705091943m2fc2d233n1f24392855f86f62@mail.gmail.com>
	<20070510155756.2DEB73A4061@sparrow.telecommunity.com>
Message-ID: <f1vn9c$8pt$1@sea.gmane.org>


"Phillip J. Eby" <pje at telecommunity.com> wrote in message 
news:20070510155756.2DEB73A4061 at sparrow.telecommunity.com...
| At 08:43 PM 5/9/2007 -0600, Steven Bethard wrote:
| Yeah, and the dilemma is that if I go back and add in all the
| examples and clarifications that have come up in these threads, it's
| going to be even bigger.  Ditto for when I actually document the
| extension API part.  The PEP is already 50% larger (in text line
| count) than the implementation of most of its features!  (And the
| implementation already includes a bunch of the extension API.)
|
| I'm certainly open to suggestions as to how best to proceed; I just
| don't see how, for example, to explain the PEP's interfaces without
| reference to generic functions.  So, even if it was split into
| different documents, you'd still have to read them in much the same
| order as in the one large document.

Without having read the PEP itself (yet), as opposed to numerous posts 
here, I would note that many PEPs are shortened by referencing other docs. 
(The class decorator PEP being an extreme example.)  This makes it easier to 
get an 'executive overview' of the proposal if one is not interested in the 
details.  I will try to take a look in the next week.

| By the way, I have gotten off-list notes of encouragement from a
| number of people who've said they hope the PEP makes it, so evidently
| it's not overwhelming to everyone.  Unfortunately, it seems to be
| suffering a bit from Usenet Nod Syndrome among the people who are in
| favor of it.

I don't see an immediate personal use for either ABCs or your generic 
function machinery.  But to me, GFs seem at least as much in the spirit of 
Python as ABCs.  So here is a public probably yes nod from me;-)

Terry Jan Reedy





From p.f.moore at gmail.com  Thu May 10 20:50:08 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 10 May 2007 19:50:08 +0100
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070510155756.2DEB73A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<d11dcfba0705091341g2609864fx835ab7f565a63e37@mail.gmail.com>
	<20070509213643.A37643A4061@sparrow.telecommunity.com>
	<d11dcfba0705091943m2fc2d233n1f24392855f86f62@mail.gmail.com>
	<20070510155756.2DEB73A4061@sparrow.telecommunity.com>
Message-ID: <79990c6b0705101150x4de890d6wc5d01c434a4e6d48@mail.gmail.com>

On 10/05/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> By the way, I have gotten off-list notes of encouragement from a
> number of people who've said they hope the PEP makes it, so evidently
> it's not overwhelming to everyone.  Unfortunately, it seems to be
> suffering a bit from Usenet Nod Syndrome among the people who are in
> favor of it.

I'll add my public +1 here, then.

OTOH, I do find the PEP to be too long and fairly hard to follow - as
an example, the bit in the PEP saying

"""
(to avoid copying the implementation):

    from overloading import RuleSet
    RuleSet(flatten).copy_rules((basestring,), (MyString,))
"""

adds nothing to the point, but adds complexity which I suspect will
simply put people off. OK, it's only 3 lines, but I believe that
removing them will substantially improve the impact of that section,
and lose nothing of importance. I'm sure there's more of the same, as
well.

As I say, though, I am in favour of the idea - don't mistake criticism
of the way it's presented as dislike of the concept!

Paul.

From greg.ewing at canterbury.ac.nz  Thu May 10 23:39:56 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 11 May 2007 09:39:56 +1200
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <20070510152120.407AB3A4061@sparrow.telecommunity.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<20070509015553.9C6843A4061@sparrow.telecommunity.com>
	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
	<46424718.20006@benjiyork.com>
	<20070510012836.16D0D3A4061@sparrow.telecommunity.com>
	<464318D0.2000109@benjiyork.com> <4643276C.5040100@gmail.com>
	<20070510152120.407AB3A4061@sparrow.telecommunity.com>
Message-ID: <4643912C.1050208@canterbury.ac.nz>

Phillip J. Eby wrote:

> However, because the hooks themselves are implemented using those 
> default implementations, we can't separate out the implementations 
> and just leave the hooks!

What this seems to mean is that the necessary hooks
are already there.

--
Greg

From greg.ewing at canterbury.ac.nz  Thu May 10 23:59:07 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 11 May 2007 09:59:07 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070510161417.192943A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
Message-ID: <464395AB.6040505@canterbury.ac.nz>

Phillip J. Eby wrote:
> As I said above (and in 
> the PEP), *all* before and after methods are always called, unless an 
> exception is raised somewhere along the way.

>   "Before" methods are invoked most-specific method first, with
>   ambiguous methods being executed in the order they were added.  All
>   "before" methods are called before any of the function's "primary"
>   methods (i.e. normal ``@overload`` methods) are executed.

Well, it wasn't clear to me at all from the PEP that
this is how it works. The above paragraph doesn't say
anything about @around methods, for example, and it's
not obvious whether they should be considered "normal"
or "primary".

>> For that matter, what if there is simply another
>> decorator @Foo that is defined to always_override
>> @Around? The precedence between that and your
>> @Debug decorator then appears to be undefined.
> 
> If so, then you'll get an AmbiguousMethods error (either when defining 
> the function or calling it) and thus be informed that you need another 
> override declaration.

I can see a problem with this. If Library1 defines a
method that always overrides an @around method, and
Library2 does the same thing, then if I try to use
both libraries at the same time, I'll get an exception
that I don't know the cause of and don't have any
idea how to fix.

--
Greg

From guido at python.org  Fri May 11 01:00:00 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 10 May 2007 16:00:00 -0700
Subject: [Python-3000] PEPs update
Message-ID: <ca471dc20705101600l5d362752j1444238c923ea533@mail.gmail.com>

I've accepted some PEPs:

 SA 3120  Using UTF-8 as the default source encoding   von Löwis
 SA 3121  Extension Module Initialization & Finalization  von Löwis
 SA 3123  Making PyObject_HEAD conform to standard C   von Löwis
 SA 3127  Integer Literal Support and Syntax           Maupin
 SA 3129  Class Decorators                             Winter

(3129 is listed for completeness, I think it was already approved a
few days ago.)

and rejected some others:

 SR 3125  Remove Backslash Continuation                Jewett
 SR 3126  Remove Implicit String Concatenation         Jewett
 SR 3130  Access to Current Module/Class/Function      Jewett

I'm looking forward to seeing these implemented in the p3yk branch
(and a few others that have been accepted for a while now, e.g. 3109,
3110, 3113).

Other status updates:

3101 (string formatting) -- Talin will continue to shepherd this in
cooperation with the authors of the sandbox implementation.

3116 (new I/O) -- I'm slowly chipping away at implementing this. The
PEP is behind in tracking the actual implementation.

3119, 3124, 3141 (ABCs, GFs) -- I'm still thinking about this; 3119
and 3141 are still awaiting major rewrites and 3124 is under heavy
discussion.

3118 (buffer protocol) -- This is long, but I trust Travis. Maybe he
should just submit an implementation (hint, hint).

3128 (BList) -- I'll leave this for Raymond Hettinger to review.

3131 (non-ASCII identifiers) -- I'm leaning towards rejecting.

3132 (extended iterable unpacking) -- I'm leaning towards accepting.

I'm still hoping Raymond will check his draft PEPs in by Sunday night.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Fri May 11 01:20:26 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 May 2007 19:20:26 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <464395AB.6040505@canterbury.ac.nz>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
Message-ID: <20070510231845.9C98C3A4061@sparrow.telecommunity.com>

At 09:59 AM 5/11/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
>>As I said above (and in the PEP), *all* before and after methods 
>>are always called, unless an exception is raised somewhere along the way.
>
>>   "Before" methods are invoked most-specific method first, with
>>   ambiguous methods being executed in the order they were added.  All
>>   "before" methods are called before any of the function's "primary"
>>   methods (i.e. normal ``@overload`` methods) are executed.
>
>Well, it wasn't clear to me at all from the PEP that
>this is how it works. The above paragraph doesn't say
>anything about @around methods, for example,

That's because @around methods haven't been introduced at that point 
in the PEP; the following section introduces @around and explains 
that @arounds are called before the befores, etc.  For example, part 
of the section describing @around methods says:

   The ``__proceed__`` given to an "around" method will either be the
   next applicable "around" method, a ``DispatchError`` instance,
   or a synthetic method object that will call all the "before" methods,
   followed by the primary method chain, followed by all the "after"
   methods, and return the result from the primary method chain.

Of course, it's also stated in the PEP that it's basically copying 
CLOS's "standard method combination", but I really should add 
appropriate references for that.


>>>For that matter, what if there is simply another
>>>decorator @Foo that is defined to always_override
>>>@Around? The precedence between that and your
>>>@Debug decorator then appears to be undefined.
>>If so, then you'll get an AmbiguousMethods error (either when 
>>defining the function or calling it) and thus be informed that you 
>>need another override declaration.
>
>I can see a problem with this. If Library1 defines a
>method that always overrides an @around method, and
>Library2 does the same thing, then if I try to use
>both libraries at the same time, I'll get an exception
>that I don't know the cause of and don't have any
>idea how to fix.

Actually, that would require that Library1 and Library2 both add 
methods to a generic function in Library3.  Not only that, but *those 
methods would have to apply to the same classes*.  So, it's actually 
a lot harder to create that situation than it sounds.

In particular, notice that if Library1 only uses its combinators for 
methods applying to its own types, and Library2 does the same, they 
*cannot* create any method ambiguity in the third library's generic functions!

Of course, outside of debug hooks, adding custom combinators to 
somebody else's generic function probably isn't a very good idea in 
the first place, at least for instances of *that library's types* or 
of *built-in types* -- which is the only way to produce a conflict 
between two libraries that don't otherwise know about each 
other.  (That is, if L1 and L2 don't know each other, they can hardly 
be registering methods for each other's types with a common generic function.)

Meanwhile, in CLOS the set of allowed qualifiers for a generic 
function's methods is decided by the function itself, so there's no 
way for you to add foreign method types at all.  I personally think 
that's a little too restrictive, though, as it not only goes against 
"consenting adults", but it also effectively rules out "true" 
aspect-oriented programming.

By the way, I feel I should mention that although I disagree with a 
lot of your arguments, I *do* appreciate your taking the time to find 
possible edge or corner failure conditions and "unintended 
consequences".  So please don't let my poking of holes in your poking 
of holes in the PEP, stop you from trying to poke more holes in it.  :)


From python at rcn.com  Fri May 11 01:27:57 2007
From: python at rcn.com (Raymond Hettinger)
Date: Thu, 10 May 2007 19:27:57 -0400 (EDT)
Subject: [Python-3000] PEPs update
Message-ID: <20070510192757.BIU44325@ms09.lnh.mail.rcn.net>

> and rejected some others:
> SR 3126  Remove Implicit String Concatenation         Jewett

I had high hopes for this one.  C'est la vie.

I did not see octal literals on your list.  FWIW, I'm -1 on the proposal.  The current situation is only a minor nuisance.  While I prefer to see octal literal support dropped entirely, I would rather live with the 0123 form than add more complexity with a non-standard format and a set of warnings for decimals with leading zeros:  date(2007, 05, 09).


> 3128 (BList) -- I'll leave this for Raymond Hettinger to review.

After looking at the source, I think this has almost zero chance for replacing list().  There is too much value in a simple C API, low space overhead for small lists, good performance in common use cases, and having performance that is easily understood.  The BList implementation lacks these virtues and trades off a little performance in common cases for much better performance in uncommon cases.  As a Py3.0 PEP, I think it can be rejected.

Depending on its success as a third-party module, it still has a chance for inclusion in the collections module.  The essential criterion for that is whether it is a superior choice for some real-world use cases.  I've scanned my own code and found no instances where BList would have been preferable to a regular list.  However, that scan has a selection bias because it doesn't reflect what I would have written had BList been available.  So, after a few months, I intend to poll comp.lang.python for BList success stories.  If they exist, then I have no problem with inclusion in the collections module.  After all, its learning curve is near zero -- the only cost is the clutter factor stemming from indecision about the most appropriate data structure for a given task.

> I'm still hoping Raymond will check his draft PEPs in by Sunday night.

Sorry for the delay; I've been fully task-saturated, and the PEP writing has been slowed by the need to explore the ideas more fully.

The PEP for eliminating __del__ seemed straight-forward at the outset, but the use case you presented doesn't seem to have a clean substitute (as it requires the object to be alive to finalize it).  Other use cases do have a clean solution.  So, I'll go forward with the PEP but am a bit disheartened that it is going to have to advise try/finally or somesuch for the harder cases.

The information attributes idea is going well and is sticking close to the original presentation except that I've now seen the wisdom of modifying isinstance() to let some objects fake or proxy a type that they don't inherit from.



Raymond

From steven.bethard at gmail.com  Fri May 11 01:38:33 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu, 10 May 2007 17:38:33 -0600
Subject: [Python-3000] PEPs update
In-Reply-To: <20070510192757.BIU44325@ms09.lnh.mail.rcn.net>
References: <20070510192757.BIU44325@ms09.lnh.mail.rcn.net>
Message-ID: <d11dcfba0705101638o5fa216d3t1640c9b6bc2a080f@mail.gmail.com>

On 5/10/07, Raymond Hettinger <python at rcn.com> wrote:
> The PEP for eliminating __del__ seemed straight-forward
> at the outset, but the use case you presented doesn't
> seem to have a clean substitute (as it requires the object
> to be alive to finalize it).  Other use cases do have a clean
> solution.  So, I'll go forward with the PEP but am a bit
> disheartened that it is going to have to advise try/finally
> or somesuch for the harder cases.

You've probably already seen these, but just in case you haven't,
there have been two alternatives to __del__ posted recently to the
cookbook:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/519621
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/519610

Of course, they have their own sets of problems, but maybe there's
something worthwhile in there for the PEP.
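
For context, the general shape of the weakref-callback approach those
recipes build on is something like this (a sketch of the pattern only,
not the recipes' actual code; the names are made up):

    import weakref

    _live_refs = []          # keep the weakrefs themselves alive somewhere

    def _report(name):
        def callback(ref):   # the weakref callback receives the dead ref
            print('finalizing ' + name)
        return callback

    class Wrapper:
        def __init__(self, name):
            self.name = name
            # The callback closes over 'name' rather than over self, so
            # it cannot keep the instance alive or resurrect it.
            _live_refs.append(weakref.ref(self, _report(name)))

    w = Wrapper('demo')
    del w                    # callback runs once the instance is collected

The recipes add handling for trickier cases (such as cleanup at
interpreter shutdown), but the core idea is the same: move the cleanup
state out of the object so that no __del__ is needed.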

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From guido at python.org  Fri May 11 01:39:11 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 10 May 2007 16:39:11 -0700
Subject: [Python-3000] PEPs update
In-Reply-To: <20070510192757.BIU44325@ms09.lnh.mail.rcn.net>
References: <20070510192757.BIU44325@ms09.lnh.mail.rcn.net>
Message-ID: <ca471dc20705101639t2aa07267ub0d2163c819918e4@mail.gmail.com>

On 5/10/07, Raymond Hettinger <python at rcn.com> wrote:
> I did not see octal literals on your list.  FWIW, I'm -1 on the proposal.  The current situation is only a minor nuisance.  While I prefer to see octal literal support dropped entirely, I would rather live with the 0123 form than add more complexity with a non-standard format and a set of warnings for decimals with leading zeros:  date(2007, 05, 09).

It was on the list, accepted:
>  SA 3127  Integer Literal Support and Syntax           Maupin

> > 3128 (BList) -- I'll leave this for Raymond Hettinger to review.
>
> After looking at the source, I think this has almost zero chance for replacing list().  There is too much value in a simple C API, low space overhead for small lists, good performance is common use cases, and having performance that is easily understood.  The BList implementation lacks these virtues and trades-off a little performance is common cases for much better performance in uncommon cases.  As a Py3.0 PEP, I think it can be rejected.

OK, will do. I'll quote you in the rejection notice.

> Depending on its success as a third-party module, it still has a chance for inclusion in the collections module.  The essential criteria for that is whether it is a superior choice for some real-world use cases.  I've scanned my own code and found no instances where BList would have been preferable to a regular list.  However, that scan has a selection bias because it doesn't reflect what I would have written had BList been available.  So, after a few months, I intend to poll comp.lang.python for BList success stories.  If they exist, then I have no problem with inclusion in the collections module.  After all, its learning curve is near zero -- the only cost is the clutter factor stemming from indecision about the most appropriate data structure for a given task.
>
> > I'm still hoping Raymond will check his draft PEPs in by Sunday night.
>
> Sorry for the delay, I've been fully task saturated and the PEP writing has been slowed by the need to explore the ideas more fully.
>
> The PEP for eliminating __del__ seemed straight-forward at the outset, but the use case you presented doesn't seem to have a clean substitute (as it requires the object to be alive to finalize it).  Other use cases do have a clean solution.  So, I'll go forward with the PEP but am a bit disheartened that it is going to have to advise try/finally or somesuch for the harder cases.
>
> The information attributes idea is going well and is sticking close to the original presentation except that I've now seen the wisdom of modifying isinstance() to let some objects fake or proxy a type that they don't inherit from.

Cool. FWIW, the rewrite of PEP 3119 will focus mostly on overloading
isinstance() and issubclass() and a few examples of what can be done
with these.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Fri May 11 01:54:58 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 May 2007 19:54:58 -0400
Subject: [Python-3000] __del__ (was Re:  PEPs update)
In-Reply-To: <20070510192757.BIU44325@ms09.lnh.mail.rcn.net>
References: <20070510192757.BIU44325@ms09.lnh.mail.rcn.net>
Message-ID: <20070510235312.21F463A4061@sparrow.telecommunity.com>

At 07:27 PM 5/10/2007 -0400, Raymond Hettinger wrote:
>The PEP for eliminating __del__ seemed straight-forward at the 
>outset, but the use case you presented doesn't seem to have a clean 
>substitute (as it requires the object to be alive to finalize 
>it).  Other use cases do have a clean solution.  So, I'll go forward 
>with the PEP but am a bit disheartened that it is going to have to 
>advise try/finally or somesuch for the harder cases.

By the way - another issue with removing __del__ is that try/finally 
in generators (PEP 342) is implemented using it.

Which means that if you took away __del__ from the Python level, you 
could still simulate it by saving a reference to a running generator 
with a finally clause.  Of course, that would have at least as many 
problems as using __del__ directly, but there you go.  :)
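
Roughly, the trick looks like this (a sketch only; the names are made
up, and it relies on CPython closing a collected generator, which runs
its pending finally clause -- with much the same caveats as __del__):

    def _finalizer(resource):
        try:
            yield                    # stay suspended for the holder's lifetime
        finally:
            print('cleaning up ' + resource)

    class Holder:
        def __init__(self, resource):
            self._cleanup = _finalizer(resource)
            self._cleanup.send(None)     # advance to the yield, arming finally

    h = Holder('some resource')
    del h    # the generator is collected along with h; its finally runs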


From greg.ewing at canterbury.ac.nz  Fri May 11 03:20:52 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 11 May 2007 13:20:52 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070510231845.9C98C3A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
Message-ID: <4643C4F4.30708@canterbury.ac.nz>

Phillip J. Eby wrote:

> That's because @around methods haven't been introduced at that point in 
> the PEP; the following section introduces @around and explains that 
> @arounds are called before the befores, etc.

Hmm, so it's not the case that @before methods are
called "before" all other methods. That makes it even
more confusing.

I'm now even more of the opinion that this is too
complicated for Python's first generic function system.
"If it's hard to explain, it's probably a bad idea."

> Of course, it's also stated in the PEP that it's basically copying 
> CLOS's "standard method combination", but I really should add 
> appropriate references for that.

Relying on people knowing about CLOS in order to follow
this stuff doesn't seem like a good idea.

--
Greg

From jimjjewett at gmail.com  Fri May 11 03:35:37 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 10 May 2007 21:35:37 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070510154540.8CEC23A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<46428622.7000204@canterbury.ac.nz>
	<20070510154540.8CEC23A4061@sparrow.telecommunity.com>
Message-ID: <fb6fbf560705101835i600ac62bw2ec3e088c53ded2a@mail.gmail.com>

On 5/10/07, Phillip J. Eby <pje at telecommunity.com> wrote:

> Using the "Importing" package from the Cheeseshop:
...
> from peak.util.imports import whenImported
> whenImported('pydoc', register_pydoc)

> I certainly wouldn't object to making 'whenImported' and its friends
> a part of the stdlib.

Adding whenImported would be useful, even outside of ABCs and generic
functions.

But please don't go overboard with the "and its friends" part.  That
15K zip file boiled down to a 370 line python module.  Over 200 of
those lines were to support things like module inheritance or
returning a sequence with strings replaced by the result of running
import/getattr on them.  Those uses are probably too obscure for the
stdlib.

-jJ

From pje at telecommunity.com  Fri May 11 04:05:22 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 10 May 2007 22:05:22 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705101835i600ac62bw2ec3e088c53ded2a@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<46428622.7000204@canterbury.ac.nz>
	<20070510154540.8CEC23A4061@sparrow.telecommunity.com>
	<fb6fbf560705101835i600ac62bw2ec3e088c53ded2a@mail.gmail.com>
Message-ID: <20070511020339.5F4313A4061@sparrow.telecommunity.com>

At 09:35 PM 5/10/2007 -0400, Jim Jewett wrote:
>On 5/10/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>
>>Using the "Importing" package from the Cheeseshop:
>...
>>from peak.util.imports import whenImported
>>whenImported('pydoc', register_pydoc)
>
>>I certainly wouldn't object to making 'whenImported' and its friends
>>a part of the stdlib.
>
>Adding whenImported would be useful, even outside of ABCs and generic
>functions.
>
>But please don't go overboard with the "and its friends" part.  That
>15K zip file boiled down to a 370 line python module.  Over 200 of
>those lines were to support things like module inheritance

Actually, the part that deals with module inheritance is 16 lines, 
unless you count the documentation for it, which is another 22 
lines.  But I have no problem leaving that out of the stdlib; module 
inheritance is deprecated even in PEAK.


>or
>returning a sequence with strings replaced by the result of running
>import/getattr on them.  Those uses are probably too obscure for the
>stdlib.

If you mean importObject, importSequence, and importSuite, I agree with you.

Really, by "and friends" I mean importString and lazyModule, and I'm 
fine with relocating and renaming them, as well as stripping out the 
relative path bit.


From jimjjewett at gmail.com  Fri May 11 06:05:55 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 11 May 2007 00:05:55 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070511020339.5F4313A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<46428622.7000204@canterbury.ac.nz>
	<20070510154540.8CEC23A4061@sparrow.telecommunity.com>
	<fb6fbf560705101835i600ac62bw2ec3e088c53ded2a@mail.gmail.com>
	<20070511020339.5F4313A4061@sparrow.telecommunity.com>
Message-ID: <fb6fbf560705102105o7442d872x5bf8545c90686fee@mail.gmail.com>

On 5/10/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:35 PM 5/10/2007 -0400, Jim Jewett wrote:

> >Adding whenImported would be useful, even outside of ABCs and
> >generic functions.

> >But please don't go overboard with the "and its friends" part.

> If you mean importObject, importSequence, and importSuite, I agree
> with you.

> Really, by "and friends" I mean importString and lazyModule, and I'm
> fine with relocating and renaming them, as well as stripping out the
> relative path bit.

So we're mostly in agreement, but I had also wanted to leave out importString.

I know it can seem simpler to treat everything as an object, and not
worry about where the type switches from package to module to instance
to attribute.  I see it used in Twisted.

But I'm not sure it is *really* simpler for someone who isn't familiar
with your codebase, and I don't see why it is needed for whenImported.

-jJ

From p.f.moore at gmail.com  Fri May 11 10:40:27 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 11 May 2007 09:40:27 +0100
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <4643C4F4.30708@canterbury.ac.nz>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
Message-ID: <79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>

On 11/05/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> I'm now even more of the opinion that this is too
> complicated for Python's first generic function system.
> "If it's hard to explain, it's probably a bad idea."

Hmm. My view is that it *is* simple to explain, but unfortunately
Phillip's explanation in the PEP is not that simple explanation :-(

In my view, too much of the PEP is taken up with edge cases,
relatively obscure specialist uses, and unnecessary explanations of
implementation details. However, I haven't had any time recently to
review it in enough detail to offer a concrete proposal on how to
simplify it, so I've kept quiet so far.

I would argue that the PEP could be *very* simple if it restricted
itself to the basic idea. Much of what is being discussed is, in my
view, implementation detail - which Phillip finds compelling because
it shows the power of the basic approach, but which is turning others
off because it's more complex and subtle than a basic use case.

There are many features in Python which are powerful and simple on the
surface, but get quite gory when you delve beneath the covers
(new-style classes, decorators, generators, for example). That doesn't
mean they shouldn't be there.

Paul.

From jimjjewett at gmail.com  Fri May 11 15:46:19 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 11 May 2007 09:46:19 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070510231845.9C98C3A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
Message-ID: <fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>

On 5/10/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:59 AM 5/11/2007 +1200, Greg Ewing wrote:
> >Phillip J. Eby wrote:
> >>As I said above (and in the PEP), *all* before and after methods
> >>are always called, unless an exception is raised somewhere along the way.

> >>   "Before" methods are invoked most-specific method first, with
> >>   ambiguous methods being executed in the order they were added.  All
> >>   "before" methods are called before any of the function's "primary"
> >>   methods (i.e. normal ``@overload`` methods) are executed.

As much as it seems clear once you understand ... it isn't, if only
because it is so unexpected.  I think it needs an example, such as

    class A: ...
    class B(A): ...

Then register before/after/around/normal methods for each, and show
the execution path for a B().  As I understand it now (without
rereading the PEP)

    AroundB part 1
    AroundA part 1
    BeforeA
    BeforeB
    NormalB
    # NormalA gets skipped, unless NormalB calls it explicitly
    AfterA
    AfterB
    AroundA part 2
    AroundB part 2

But maybe it would just be AroundB, because an Around is really a replacement?

> >I can see a problem with this. If Library1 defines a
> >method that always overrides an @around method, and
> >Library2 does the same thing, then if I try to use
> >both libraries at the same time, I'll get an exception
> >that I don't know the cause of and don't have any
> >idea how to fix.

> Actually, that would require that Library1 and Library2 both add
> methods to a generic function in Library3.  Not only that, but *those
> methods would have to apply to the same classes*.  So, it's actually
> a lot harder to create that situation than it sounds.

> In particular, notice that if Library1 only uses its combinators for
> methods applying to its own types, and Library2 does the same, they
> *cannot* create any method ambiguity in the third library's generic
> functions!

Library 1 and Library 2 both register Sage classes with Numpy, or vice
versa.  Library 1 and 2 don't know about each other.  Library 1 and 2
also go through some extra version skew pains when Sage starts
registering its types itself.

hmm... if Library 2 is slightly buggy, or makes a slightly different
mapping than library 1, then my getting correct results will depend on
which of Library 1/Library 2 gets imported first -- or, rather, first
got to the registration stage of their being imported.

-jJ

From daniel at stutzbachenterprises.com  Fri May 11 17:00:46 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Fri, 11 May 2007 10:00:46 -0500
Subject: [Python-3000] PEPs update
In-Reply-To: <20070510192757.BIU44325@ms09.lnh.mail.rcn.net>
References: <20070510192757.BIU44325@ms09.lnh.mail.rcn.net>
Message-ID: <eae285400705110800w352a3ea6gcd8a20f96e6fed9e@mail.gmail.com>

On 5/10/07, Raymond Hettinger <python at rcn.com> wrote:
> > 3128 (BList) -- I'll leave this for Raymond Hettinger to review.
>
> After looking at the source, I think this has almost zero chance for replacing
> list().  There is too much value in a simple C API, low space overhead for small lists,

Thanks for taking time to review my code.  Did you look through the
PEP as well?  Both of these issues were specifically addressed.  In
fact, I am half way done with implementing the change so that small
BLists are memory efficient.

> good performance is common use cases,

This is also addressed, to some extent, in the PEP.

> and having performance that is easily understood.

I am not sure what aspect of the performance might be misunderstood.
Just about everything is O(log n).  Could you clarify your concern?

> The BList implementation lacks these virtues and trades-off a little performance
> is common cases for much better performance in uncommon cases.  As a Py3.0
> PEP, I think it can be rejected.

Would it be useful if I created an experimental fork of 2.5 that
replaces array-based lists with BLists, so that the performance
penalty (if any) on existing code can be measured?

> Depending on its success as a third-party module, it still has a chance for
> inclusion in the collections module.  The essential criteria for that is whether
> it is a superior choice for some real-world use cases.  I've scanned my own
> code and found no instances where BList would have been preferable to a
> regular list.  However, that scan has a selection bias because it doesn't reflect
> what I would have written had BList been available.

Indeed, I wrote the BList because there were idioms that I wanted to
use that were just not practical with an array-based list.

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

From python at rcn.com  Fri May 11 18:22:57 2007
From: python at rcn.com (Raymond Hettinger)
Date: Fri, 11 May 2007 12:22:57 -0400 (EDT)
Subject: [Python-3000] PEPs update
Message-ID: <20070511122257.BIW67893@ms09.lnh.mail.rcn.net>

> Thanks for taking time to review my code.

You're welcome.  And thanks for the continuing development effort.

> Did you look through the PEP as well? 

Yes.

> In fact, I am half way done with implementing the change so
> that small BLists are memory efficient.

As the code continues to evolve, I'll continue to look at it.  I look forward to seeing how far you can take this.  Newly developed code always faces an uphill battle when compared to mature open-source.

> I am not sure what aspect of the performance might 
> be misunderstood. Just about everything is O(log n).
>  Could you clarify your concern?

End-users (everyday Python programmers) need to be able to understand the performance intuitively and have a clear understanding of what is going on under the hood.  Our existing data structures have the virtue of having a simple mental model (except for aspects of re-sizing and over-allocation, which are a bit obscure).
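
(A quick way to watch that over-allocation happen -- just a sketch,
using sys.getsizeof and therefore a newer CPython than 2.5 -- is:)

    import sys

    xs = []
    last = sys.getsizeof(xs)
    for i in range(50):
        xs.append(i)
        size = sys.getsizeof(xs)
        if size != last:             # size only changes on re-allocation
            print('%2d items -> %d bytes' % (len(xs), size))
            last = size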


> Would it be useful if I created an experimental fork of 2.5 
> that replaces array-based lists with BLists,
>  so that the performance penalty (if any) on existing code 
> can be measured?

That would likely be an informative exercise and would ensure that your code is truly interchangeable with regular lists.  It would also highlight the under-the-hood difficulties you'll encounter with the C-API.

That being said, it is a labor intensive exercise and the time might be better spent on tweaking the third-party module code and building a happy user-base.


> Indeed, I wrote the BList because there were idioms that I
> wanted to use that were just not practical with an array-based list. 

We ought to set up a page on the wiki for success stories with blist as a third-party module.  In time, the Right Answer (tm) will become self-evident.  


Raymond

From g.brandl at gmx.net  Fri May 11 17:49:14 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 11 May 2007 17:49:14 +0200
Subject: [Python-3000] PEP 3132: Extended Iterable Unpacking
In-Reply-To: <f18ccp$utj$1@sea.gmane.org>
References: <f18ccp$utj$1@sea.gmane.org>
Message-ID: <f2239r$67c$1@sea.gmane.org>

Georg Brandl schrieb:
> This is a bit late, but it was in my queue by April 30, I swear! ;)
> Comments are appreciated, especially some phrasing sounds very clumsy
> to me, but I couldn't find a better one.

This was now accepted by Guido and checked in. Thanks for all the
comments!

Georg


From pje at telecommunity.com  Fri May 11 18:29:48 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 11 May 2007 12:29:48 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>
Message-ID: <20070511162803.5E70F3A4061@sparrow.telecommunity.com>

At 09:46 AM 5/11/2007 -0400, Jim Jewett wrote:
>As much as it seems clear once you understand ... it isn't, if only
>because it is so unexpected.  I think it needs an example, such as
>
>     class A: ...
>     class B(A): ...
>
>Then register before/after/around/normal methods for each, and show
>the execution path for a B().  As I understand it now (without
>rereading the PEP)
>
>     AroundB part 1
>     AroundA part 1
>     BeforeA
>     BeforeB
>     NormalB
>     # NormalA gets skipped, unless NormalB calls it explicitly
>     AfterA
>     AfterB
>     AroundA part 2
>     AroundB part 2

The above is correct, except that either AroundB or AroundA *may* 
choose to skip calling the parts they enclose.


>But maybe it would just be AroundB, because an Around is really a replacement?

If AroundB didn't call its next-method, it would indeed be a replacement.


> > >I can see a problem with this. If Library1 defines a
> > >method that always overrides an @around method, and
> > >Library2 does the same thing, then if I try to use
> > >both libraries at the same time, I'll get an exception
> > >that I don't know the cause of and don't have any
> > >idea how to fix.
>
> > Actually, that would require that Library1 and Library2 both add
> > methods to a generic function in Library3.  Not only that, but *those
> > methods would have to apply to the same classes*.  So, it's actually
> > a lot harder to create that situation than it sounds.
>
> > In particular, notice that if Library1 only uses its combinators for
> > methods applying to its own types, and Library2 does the same, they
> > *cannot* create any method ambiguity in the third library's generic
> > functions!
>
>Library 1 and Library 2 both register Sage classes with Numpy, or vice
>versa.  Library 1 and 2 don't know about each other.  Library 1 and 2
>also go through some extra version skew pains when Sage starts
>registering its types itself.

Well, all the more reason to have this in place for 3.0 where 
everybody is starting over anyway.  ;-)

Seriously though, it seems to me that registering third-party types 
in fourth-party generic functions, from *library* code (as opposed to 
application code) is unwise.  I mean, you're already talking about 
FOUR people there, *not* counting Library 2!  (i.e., Sage, Numpy, 
Library 1, and the user).

However, the simple solution is that L1 and L2 should subclass the 
relevant Sage types and only register their subclasses.  Then, they 
each effectively "own" the types, and if Sage registers useful stuff 
later, they can just drop their subclasses.
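
In code, that pattern is roughly the following (``SageInt`` and ``to_array``
are stand-ins for a Sage type and a NumPy-owned generic function, and
``overloading`` is the module the PEP proposes -- none of these are real,
installed APIs):

    from overloading import when

    class L1SageInt(SageInt):       # Library 1's own subclass of the Sage type
        pass

    @when(to_array)                 # a generic function owned by NumPy
    def l1_sage_to_array(value: L1SageInt):
        ...                         # convert however Library 1 sees fit

Library 1 only ever registers ``L1SageInt``, a type it owns, so its
registration can't collide with one that Sage itself (or Library 2) adds
for ``SageInt`` later.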

That doesn't eliminate the issue of what type(s) the user of L1 and 
L2 should use, unless of course the use of Sage in at least one of L1 
and L2 is embedded and not user-visible.  However, it's not like such 
questions of choice and compatibility don't come up all the time 
anyway, and the user could, if he/she had to, use multiple 
inheritance plus some additional registrations of their own to work things out.

Also, remember that the user can always resolve ambiguities between 
libraries by making additional registrations that more specifically 
apply to the situation.

So, can you write a library that messes things up for other 
people?  Sure!  But you can already do that; this ain't Java, and 
we're all consenting adults.  If you write libraries that mess stuff 
up, you're going to get complaints.

The best practice here reminds me of a joke my coworkers used to tell 
when I was in the real estate software business.  One of our 
salespeople was talking to a real estate broker and explaining the 
menu of our program:

"See, if your company lists it, and another company sells it, or if 
you sell it, that's an "Inside Listing Sold".  But if another company 
lists it, and *you* sell it, then we call that an "Outside Listing Sold"."

The broker nodded.  "But what if another company lists *and* sells it?"

The salesperson thought a moment, then smiled.  "Well, we call that, 
"None of your business!""

In the same way here, you can register your types with other people's 
generic functions, or other people's types with your generic 
functions, or even your own types with your own generic 
functions.  But registering other people's types with other people's 
generic functions is what we would politely call, "none of your business".  :)


>hmm... if Library 2 is slightly buggy, or makes a slightly different
>mapping than library 1, then my getting correct results will depend on
>which of Library 1/Library 2 gets imported first -- or, rather, first
>got to the registration stage of their being imported.

Note that for "@around" and "@when/@overload", import order *does not 
resolve ambiguity*.  If the registrations are for the same types, 
calling the function with those types will raise an AmbiguousMethods 
error that lists the conflicting methods.
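
In other words, if Library 1 and Library 2 both did this (same stand-in
names as above; just a sketch):

    @when(to_array)                 # registered by Library 1
    def l1_version(value: SageInt):
        ...

    @when(to_array)                 # registered by Library 2, same type
    def l2_version(value: SageInt):
        ...

then neither one silently wins: the first call to to_array() with a SageInt
raises AmbiguousMethods naming both methods.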

But, as I pointed out above, it's a bad idea for those two libraries 
to directly register another library's types without subclassing them 
first, per the NOYB rule.  :) 


From pje at telecommunity.com  Fri May 11 18:37:37 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 11 May 2007 12:37:37 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705102105o7442d872x5bf8545c90686fee@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<46428622.7000204@canterbury.ac.nz>
	<20070510154540.8CEC23A4061@sparrow.telecommunity.com>
	<fb6fbf560705101835i600ac62bw2ec3e088c53ded2a@mail.gmail.com>
	<20070511020339.5F4313A4061@sparrow.telecommunity.com>
	<fb6fbf560705102105o7442d872x5bf8545c90686fee@mail.gmail.com>
Message-ID: <20070511163551.C04D83A4061@sparrow.telecommunity.com>

At 12:05 AM 5/11/2007 -0400, Jim Jewett wrote:
>So we're mostly in agreement, but I had also wanted to leave out importString.
>
>I know it can seem simpler to treat everything as an object, and not
>worry about where the type switches from package to module to instance
>to attribute.  I see it used in Twisted.
>
>But I'm not sure it is *really* simpler for someone who isn't familiar
>with your codebase,

The use case is to be able to have a string that refers to an 
importable object.  The unittest module has something similar, egg 
entry points do, and so does mod_python.  (I wouldn't be surprised if 
mod_wsgi has something like that also.)  Chandler's repository 
(object database) also had code to "load classes" by using a string 
import, before I got there.

The thing is, string-import code is tricky to get just right; it 
therefore seems like a natural for "batteries included" if you're 
creating a stdlib module that's already doing stuff with strings and importing.
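
The core of such a helper is only a few lines; the edge cases (telling
"module not found" apart from "module found but its own imports failed",
walking dotted attributes, and so on) are where it gets tricky.  A minimal
sketch, not PEAK's actual importString:

    def import_string(name):
        """Resolve a dotted name such as 'package.module.attr' to an object."""
        parts = name.split('.')
        remaining = []
        while parts:
            try:
                obj = __import__('.'.join(parts), {}, {}, ['*'])
                break
            except ImportError:
                # NB: this also swallows ImportErrors raised *inside* the
                # module being imported -- one of the classic gotchas.
                remaining.insert(0, parts.pop())
        else:
            raise ImportError("No importable prefix in %r" % name)
        for attr in remaining:
            obj = getattr(obj, attr)
        return obj

    handler_cls = import_string('logging.handlers.RotatingFileHandler')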


>and I don't see why it is needed for whenImported.

It isn't.  I'm just saying if we were going to add it to the stdlib, 
importString (perhaps with a name change) just seems like a 
no-brainer to include.  (vs. importObject, importSequence, and 
importSuite, which are just boilerplate over importString.)

Anyway, perhaps this should piggyback on the coming discussion of 
moving the full import code to Python; it might be that lazy imports 
and callbacks could be more cleanly implemented as part of that 
machinery, than by being tacked on afterwards.


From steven.bethard at gmail.com  Fri May 11 18:57:21 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 11 May 2007 10:57:21 -0600
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
Message-ID: <d11dcfba0705110957i7db27aefy530237673e683199@mail.gmail.com>

On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> PEP: 3124
> Title: Overloading, Generic Functions, Interfaces, and Adaptation

Ok, one more try at simplifying things.  How about you just drop the sections:

    "Before" and "After" Methods
    "Around" Methods
    Custom Combinations
    Aspects

Yes, I know that 90% of the machinery to support these will already be
in the module, but I still think it would make a clearer PEP if that
remaining 10% was factored out into third-party code.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From pje at telecommunity.com  Fri May 11 19:11:53 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 11 May 2007 13:11:53 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <d11dcfba0705110957i7db27aefy530237673e683199@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<d11dcfba0705110957i7db27aefy530237673e683199@mail.gmail.com>
Message-ID: <20070511171007.D27C23A4061@sparrow.telecommunity.com>

At 10:57 AM 5/11/2007 -0600, Steven Bethard wrote:
>On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > PEP: 3124
> > Title: Overloading, Generic Functions, Interfaces, and Adaptation
>
>Ok, one more try at simplifying things.  How about you just drop the sections:
>
>     "Before" and "After" Methods
>     "Around" Methods
>     Custom Combinations
>     Aspects
>
>Yes, I know that 90% of the machinery to support these will already be
>in the module, but I still think it would make a clearer PEP if that
>remaining 10% was factored out into third-party code.

ISTM that your statement is still true if you replace the phrase 
"third-party code" with "second PEP".  :)


From steven.bethard at gmail.com  Fri May 11 19:16:31 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 11 May 2007 11:16:31 -0600
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070511171007.D27C23A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<d11dcfba0705110957i7db27aefy530237673e683199@mail.gmail.com>
	<20070511171007.D27C23A4061@sparrow.telecommunity.com>
Message-ID: <d11dcfba0705111016s1e77036eo6a2e392be608656d@mail.gmail.com>

On 5/11/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 10:57 AM 5/11/2007 -0600, Steven Bethard wrote:
> >On 4/30/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > PEP: 3124
> > > Title: Overloading, Generic Functions, Interfaces, and Adaptation
> >
> >Ok, one more try at simplifying things.  How about you just drop the sections:
> >
> >     "Before" and "After" Methods
> >     "Around" Methods
> >     Custom Combinations
> >     Aspects
> >
> >Yes, I know that 90% of the machinery to support these will already be
> >in the module, but I still think it would make a clearer PEP if that
> >remaining 10% was factored out into third-party code.
>
> ISTM that your statement is still true if you replace the phrase
> "third-party code" with "second PEP".  :)

That's fine too. =)

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From jimjjewett at gmail.com  Fri May 11 19:27:19 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 11 May 2007 13:27:19 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070511162803.5E70F3A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>
	<20070511162803.5E70F3A4061@sparrow.telecommunity.com>
Message-ID: <fb6fbf560705111027g21bf5299xcafe0cad52b8f6a5@mail.gmail.com>

On 5/11/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:46 AM 5/11/2007 -0400, Jim Jewett wrote:
> >As much as it seems clear once you understand ... it isn't, if only
> >because it is so unexpected.  I think it needs an example, such as

> >     class A: ...
> >     class B(A): ...

> >Then register before/after/around/normal methods for each, and show
> >the execution path for a B().  As I understand it now (without
> >rereading the PEP)

> >     AroundB part 1
> >     AroundA part 1
> >     BeforeA
> >     BeforeB
> >     NormalB
> >     # NormalA gets skipped, unless NormalB calls it explicitly
> >     AfterA
> >     AfterB
> >     AroundA part 2
> >     AroundB part 2

> The above is correct, except that either AroundB or AroundA *may*
> choose to skip calling the parts they enclose.

So how is an Around method any different than a full concrete
implementation?  Just because it has higher precedence, so it can win
without being the most specific?

Could you drop the precedence stuff from the core library, and just have

"here is how to register a concrete implementation"
"here is the equivalent of super -- a way to call whatever would have
been called without your replacement"

I understand that the full version offers more functionality, but it
is also more complicated.  Maybe use that fuller version as a test
case, and mention in the module docs that it is possible to create
more powerful dispatch rules, and that there is an example
(test\test_generic_reg  ?), with even more powerful extensions
available as PEAK.rules (http:// ...)  and Zope.Interfaces
(http://...)

> >Library 1 and Library 2 both register Sage classes with Numpy, or vice
> >versa.  Library 1 and 2 don't know about each other.  Library 1 and 2
> >also go through some extra version skew pains when Sage starts
> >registering its types itself.

> Seriously though, it seems to me that registering third-party types
> in fourth-party generic functions, from *library* code (as opposed to
> application code) is unwise.  I mean, you're already talking about
> FOUR people there, *not* counting Library 2!  (i.e., Sage, Numpy,
> Library 1, and the user).

Those are all math libraries; Library 1 and Library 2 *should* both
work well with both NumPy and Sage, and can reasonably be considered
extensions of both.

Saying "You can do this with most numbers, but not NumPy numbers" is ugly.

Saying "You can do this, but sometimes it will break because the
extensions I work with don't know about each other, and I won't
translate, as a matter of policy" is ... probably not going to happen.

Ideally, NumPy and Sage would make the introductions directly, or
there would at least be a canonical mapping somewhere that Libraries 1
and 2 could agree on ... but that won't happen at any specific time.

Saying "You need to upgrade to at least version Package A version
2.3.4 and Package B version 4.3 to use my code" is unlikely to happen;
you yourself still support Python 2.2 in your own packages.

 > anyway, and the user could, if he/she had to, use multiple
> inheritance plus some additional registrations of their own to work things out.

If there are two registrations for the same selection criteria, how
can the user resolve things?  Either the first one registered wins, or
the second, or the user sees some sort of import failure, and can't
fix it without modifying somebody else's code to avoid one of those
registrations.

-jJ

From nnorwitz at gmail.com  Fri May 11 20:29:46 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 11 May 2007 11:29:46 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
Message-ID: <ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>

On 5/11/07, Paul Moore <p.f.moore at gmail.com> wrote:
> On 11/05/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > I'm now even more of the opinion that this is too
> > complicated for Python's first generic function system.
> > "If it's hard to explain, it's probably a bad idea."
>
> Hmm. My view is that it *is* simple to explain, but unfortunately
> Phillip's explanation in the PEP is not that simple explanation :-(

[snip]

> I would argue that the PEP could be *very* simple if it restricted
> itself to the basic idea.

Paul,

Could you write up the simple version that you would use instead?

n

From pje at telecommunity.com  Fri May 11 20:51:12 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 11 May 2007 14:51:12 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705111027g21bf5299xcafe0cad52b8f6a5@mail.gmail.co
 m>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>
	<20070511162803.5E70F3A4061@sparrow.telecommunity.com>
	<fb6fbf560705111027g21bf5299xcafe0cad52b8f6a5@mail.gmail.com>
Message-ID: <20070511184927.7A2043A4061@sparrow.telecommunity.com>

At 01:27 PM 5/11/2007 -0400, Jim Jewett wrote:
>So how is an Around method any different than a full concrete
>implementation?  Just because it has higher precedence, so it can win
>without being the most specific?

Yep.


>Could you drop the precedence stuff from the core library, and just have
>
>"here is how to register a concrete implementation"
>"here is the equivalent of super -- a way to call whatever would have
>been called without your replacement"
>
>I understand that the full version offers more functionality, but it
>is also more complicated.  Maybe use that fuller version as a test
>case, and mention in the module docs that it is possible to create
>more powerful dispatch rules, and that there is an example
>(test\test_generic_reg  ?), with even more powerful extensions
>available as PEAK.rules (http:// ...)  and Zope.Interfaces
>(http://...)

I don't have a problem with moving method combination (other than the 
super()-analogue) and aspects to a separate PEP for ease of 
understanding, but the implementation is pretty much the smallest 
indivisible collection of features that still allows features like 
those to be added.  (See the other PEP 3124 threads for that discussion.)

I think this is analogous to PEP 252 and 253, in that their 
implementation is interdependent, but could be considered separate 
topics and thus easier to read when separated.


> > >Library 1 and Library 2 both register Sage classes with Numpy, or vice
> > >versa.  Library 1 and 2 don't know about each other.  Library 1 and 2
> > >also go through some extra version skew pains when Sage starts
> > >registering its types itself.
>
> > Seriously though, it seems to me that registering third-party types
> > in fourth-party generic functions, from *library* code (as opposed to
> > application code) is unwise.  I mean, you're already talking about
> > FOUR people there, *not* counting Library 2!  (i.e., Sage, Numpy,
> > Library 1, and the user).
>
>Those are all math libraries; Library 1 and Library 2 *should* both
>work well with both NumPy and Sage, and can reasonably be considered
>extensions of both.
>
>Saying "You can do this with most numbers, but not NumPy numbers" is ugly.
>
>Saying "You can do this, but sometimes it will break because the
>extensions I work with don't know about each other, and I won't
>translate, as a matter of policy" is ... probably not going to happen.

If L1 defines a generic function, it's fine for it to register Sage 
and NumPy types for it.  But if NumPy defines a generic function and 
Sage defines the type, how is it any of L1's business?


>Ideally, NumPy and Sage would make the introductions directly, or
>there would at least be a canonical mapping somewhere that Libraries 1
>and 2 could agree on ... but that won't happen at any specific time.

Again, nothing stops L1 and L2 from subclassing those types, and 
registering only those subtypes.  Or from offering *optional* 
registration support modules, so an application can *choose* to 
import them.  NOYB ("none of your business") registration should only 
be done by applications, if it's done at all.


>Saying "You need to upgrade to at least version Package A version
>2.3.4 and Package B version 4.3 to use my code" is unlikely to happen;
>you yourself still support Python 2.2 in your own packages.

2.3, actually, as it's the Python used by the most 
widely-available/supported Linux distros at the moment.


>  > anyway, and the user could, if he/she had to, use multiple
> > inheritance plus some additional registrations of their own to 
> work things out.
>
>If there are two registrations for the same selection criteria, how
>can the user resolve things?

With an @around method, or by creating and using subclasses.
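
For example (reusing the illustrative to_array/SageInt/l1_version names from
earlier in the thread -- none of them real APIs), the application can install
its own most-specific answer:

    from overloading import around

    @around(to_array)
    def application_decides(__proceed__, value: SageInt):
        return l1_version(value)    # explicitly pick Library 1's behaviour

Because this @around method wraps the whole chain and never calls
__proceed__, the ambiguous pair of primary methods is never reached.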


>Either the first one registered wins, or
>the second, or the user sees some sort of import failure, and can't
>fix it without modifying somebody else's code to avoid one of those
>registrations.

AmbiguousMethods is a call-time error, not a definition time error, 
unless you are using custom combinators.  L1 and L2 would have to 
define their own @lib1 and @lib2 combinators, and register them both 
with the same generic function *and* the same types, before you could 
get a definition-time error.

And I could probably change the implementation to avoid this by 
always deferring method combination until the function is invoked at 
least once, but I'm not convinced it's worth it, especially since it 
could make other errors harder to find when writing combinators.  In 
my experience, most combinators are defined by the library that 
defines the generic function using them, or else are general-purpose 
AOP-ish combinators like @before/@after/@around.


From daniel at stutzbachenterprises.com  Fri May 11 22:20:28 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Fri, 11 May 2007 15:20:28 -0500
Subject: [Python-3000] PEPs update
In-Reply-To: <20070511122257.BIW67893@ms09.lnh.mail.rcn.net>
References: <20070511122257.BIW67893@ms09.lnh.mail.rcn.net>
Message-ID: <eae285400705111320l2533796fj52c96c75f516f3ce@mail.gmail.com>

On 5/11/07, Raymond Hettinger <python at rcn.com> wrote:
> Newly developed code always faces an uphill battle when compared to
> mature open-source.

As it should. :-)

> End-users (everyday Python programmers) need to be able to understand the
> performance intuitively and have a clear understanding of what is going
> on under-the-hood.  Our existing data structures have the virtue of having a
> simple mental model (except for aspects of re-sizing and over-allocation which
> are a bit obscure).

I guess I have a different perspective.  One advantage of the BList is
that the user doesn't *need* to understand what's going on
under-the-hood.  They can rely on it to have good performance for any
operation.

One of my motivations in creating it was so I could be more lazy in
the future.  With a BList, I don't have to wonder whether Python code
I write will ever be called with a really big list, and, if so,
whether I need to rewrite my algorithm to avoid O(n^2) behavior.
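
The sort of idiom I have in mind, for example (a toy sketch; exact numbers
will vary):

    # With the built-in array-based list, every insert(0, x) shifts all the
    # existing elements, so this loop is O(n**2) overall.
    items = []
    for i in range(200000):
        items.insert(0, i)

    # A BList is meant to be a drop-in replacement, and the same loop becomes
    # roughly O(n log n), since an insert near the front touches only
    # O(log n) nodes:
    #
    #   from blist import blist   # third-party package
    #   items = blist()
    #   ...same loop...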

> That would likely be an informative exercise and would assure that your code
> is truly interchangable with regular lists.  It would also highlight the
> under-the-hood difficulties you'll encounter with the C-API.
>
> That being said, it is a labor intensive exercise and the time might be better
> spent on tweaking the third-party module code and building a happy user-base.

I actually don't think it will be that bad, since list operations go
through one thin API.  I just need to redirect the API in listobject.h
and I'm mostly done.  I think.

Maybe I'll take a quick pass at it, and if it turns into a nightmare,
I'll reconsider.

> > Indeed, I wrote the BList because there were idioms that I
> > wanted to use that were just not practical with an array-based list.
>
> We ought to set up a page on the wiki for success stories with blist as a
> third-party module.  In time, the Right Answer (tm) will become self-evident.

I haven't used the python.org wiki before.  If you point me to the
right place to put a link to a BList page, I'd be happy to create one.
Somewhere under UsefulModules?

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

From python at rcn.com  Fri May 11 22:53:06 2007
From: python at rcn.com (Raymond Hettinger)
Date: Fri, 11 May 2007 16:53:06 -0400 (EDT)
Subject: [Python-3000] PEPs update
Message-ID: <20070511165306.BIX50379@ms09.lnh.mail.rcn.net>

> I haven't used the python.org wiki before.  If you point me to the
> right place to put a link to a BList page, I'd be happy to create one.
> Somewhere under UsefulModules?

That would be a good place:

   http://wiki.python.org/moin/UsefulModules


Raymond

From benji at benjiyork.com  Fri May 11 23:28:24 2007
From: benji at benjiyork.com (Benji York)
Date: Fri, 11 May 2007 17:28:24 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070511163551.C04D83A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>	<20070509205655.622A63A4061@sparrow.telecommunity.com>	<46428622.7000204@canterbury.ac.nz>	<20070510154540.8CEC23A4061@sparrow.telecommunity.com>	<fb6fbf560705101835i600ac62bw2ec3e088c53ded2a@mail.gmail.com>	<20070511020339.5F4313A4061@sparrow.telecommunity.com>	<fb6fbf560705102105o7442d872x5bf8545c90686fee@mail.gmail.com>
	<20070511163551.C04D83A4061@sparrow.telecommunity.com>
Message-ID: <4644DFF8.8030609@benjiyork.com>

Phillip J. Eby wrote:
> At 12:05 AM 5/11/2007 -0400, Jim Jewett wrote:
>> So we're mostly in agreement, but I had also wanted to leave out importString.
>>
>> I know it can seem simpler to treat everything as an object, and not
>> worry about where the type switches from package to module to instance
>> to attribute.  I see it used in Twisted.
>>
>> But I'm not sure it is *really* simpler for someone who isn't familiar
>> with your codebase,
> 
> The use case is to be able to have a string that refers to an 
> importable object.  The unittest module has something similar, egg 
> entry points do, and so does mod_python.  (I wouldn't be surprised if 
> mod_wsgi has something like that also.)  Chandler's repository 
> (object database) also had code to "load classes" by using a string 
> import, before I got there.

zope.interface also allows "lazy" imports using string versions of 
module names in specific circumstances where circular dependencies are 
common.
-- 
Benji York
http://benjiyork.com

From greg.ewing at canterbury.ac.nz  Sat May 12 02:54:30 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 12 May 2007 12:54:30 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070511162803.5E70F3A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>
	<20070511162803.5E70F3A4061@sparrow.telecommunity.com>
Message-ID: <46451046.4000109@canterbury.ac.nz>

Phillip J. Eby wrote:
> you can register your types with other people's 
> generic functions, or other people's types with your generic 
> functions,

There's still a possibility of conflict even then. Fred
registers one of Mary's types with his generic function,
which he feels entitled to do because he owns the function.
Meanwhile, Mary registers the same type with the same
function, which she feels entitled to do because she
owns the type.

The problem is that nobody entirely owns the (type,
function) pair, which is what's required to be unique.

--
Greg

From rasky at develer.com  Sat May 12 02:58:58 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 12 May 2007 02:58:58 +0200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <4642745C.1040702@canterbury.ac.nz>
References: <f1l97g$5cm$1@sea.gmane.org>
	<463E4645.5000503@acm.org>	<20070506222840.25B2.JCARLSON@uci.edu>
	<f1s312$489$1@sea.gmane.org> <4642745C.1040702@canterbury.ac.nz>
Message-ID: <f233gi$8j7$1@sea.gmane.org>

On 10/05/2007 3.24, Greg Ewing wrote:

>> using multiple processes cause some 
>> headaches with frozen distributions (PyInstaller, py2exe, etc.), like those 
>> usually found on Windows, specifically because Windows does not have fork().
> 
> Isn't that just a problem with Windows generally? I don't
> see what the method of packaging has to do with it.

The processing module has two ways of creating a new process that executes 
the same program as the current process:

- fork
- the moral equivalent of popen(sys.executable, sys.argv[0]), plus some magic 
values passed on the command line which encode a pickled state.

The second method doesn't work out-of-the-box when the program is packaged, 
and it is the only one available on Windows.
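
Stripped to its essentials, the second mechanism looks something like this 
(a much-simplified sketch, not processing's actual protocol; the flag name 
is made up):

    import pickle, subprocess, sys

    def spawn(state):
        # Re-run the same program and hand the child its starting state as a
        # pickled, hex-encoded command-line argument.
        payload = pickle.dumps(state).hex()
        return subprocess.Popen([sys.executable, sys.argv[0],
                                 "--bootstrap", payload])

    if "--bootstrap" in sys.argv:
        state = pickle.loads(bytes.fromhex(
            sys.argv[sys.argv.index("--bootstrap") + 1]))
        # ...run the child's work with `state`...

In a frozen distribution sys.executable is the bundled .exe rather than a 
Python interpreter that will re-execute sys.argv[0], which is why this 
breaks out-of-the-box.
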
-- 
Giovanni Bajo
Develer S.r.l.
http://www.develer.com


From rasky at develer.com  Sat May 12 03:00:36 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 12 May 2007 03:00:36 +0200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <20070509203702.25EF.JCARLSON@uci.edu>
References: <f1s312$489$1@sea.gmane.org> <4642745C.1040702@canterbury.ac.nz>
	<20070509203702.25EF.JCARLSON@uci.edu>
Message-ID: <f233jk$8j7$2@sea.gmane.org>

On 10/05/2007 5.38, Josiah Carlson wrote:

>>> using multiple processes cause some 
>>> headaches with frozen distributions (PyInstaller, py2exe, etc.), like those 
>>> usually found on Windows, specifically because Windows does not have fork().
>> Isn't that just a problem with Windows generally? I don't
>> see what the method of packaging has to do with it.
>>
>> Also, I've seen it suggested that there may actually be
>> a way of doing something equivalent to a fork in Windows,
>> even though it doesn't have a fork() system call as such.
>> Does anyone know more about this?
> 
> Cygwin emulates fork() by creating a shared mmap, creating a new child
> process, copying the contents of the parent process' memory to the child
> process (after performing the proper allocations), then hacking up the
> child process' call stack.

Yes, that's the theory. If you look at the implementation, it's full of 
complexities, corner cases, undocumented glitches and whatnot.

Cygwin's fork() is mature, but I don't think it's easy to extract from Cygwin. 
Moreover, there would be license issues since fork() is GPL. Doing another 
implementation from scratch is going to be hard.
-- 
Giovanni Bajo


From benji at benjiyork.com  Sat May 12 03:07:24 2007
From: benji at benjiyork.com (Benji York)
Date: Fri, 11 May 2007 21:07:24 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <4644E9AB.6080603@trueblade.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>	<20070509205655.622A63A4061@sparrow.telecommunity.com>	<46428622.7000204@canterbury.ac.nz>	<20070510154540.8CEC23A4061@sparrow.telecommunity.com>	<fb6fbf560705101835i600ac62bw2ec3e088c53ded2a@mail.gmail.com>	<20070511020339.5F4313A4061@sparrow.telecommunity.com>	<fb6fbf560705102105o7442d872x5bf8545c90686fee@mail.gmail.com>	<20070511163551.C04D83A4061@sparrow.telecommunity.com>
	<4644DFF8.8030609@benjiyork.com> <4644E9AB.6080603@trueblade.com>
Message-ID: <4645134C.2030509@benjiyork.com>

Eric V. Smith wrote:
> Benji York wrote:
>> zope.interface also allows "lazy" imports using string versions of 
>> module names in specific circumstances where circular dependencies are 
>> common.
> 
> Could you give an example of that?  I'm familiar with zope.interface, 
> but not with this feature.

I was mistaken, it's actually zope.dottedname that does this, which is 
then used by zope.app.container.constraints.  My confusion stems from 
the fact that zope.app.container.constraints is often used when defining 
interfaces.

My only reason for bringing it up was to reinforce the idea that it's a 
popular thing to reinvent, so either adding this to the stdlib or 
(preferably) creating a small, solid module as a stand-alone project seems 
worthwhile.

zope.dottedname documentation: 
http://svn.zope.org/zope.dottedname/trunk/src/zope/dottedname/resolve.txt?rev=75116&view=markup
zope.app.container documentation:
http://svn.zope.org/zope.app.container/trunk/src/zope/app/container/constraints.txt?rev=75262&view=markup
-- 
Benji York
http://benjiyork.com

From greg.ewing at canterbury.ac.nz  Sat May 12 03:16:45 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 12 May 2007 13:16:45 +1200
Subject: [Python-3000] PEPs update
In-Reply-To: <eae285400705111320l2533796fj52c96c75f516f3ce@mail.gmail.com>
References: <20070511122257.BIW67893@ms09.lnh.mail.rcn.net>
	<eae285400705111320l2533796fj52c96c75f516f3ce@mail.gmail.com>
Message-ID: <4645157D.8050404@canterbury.ac.nz>

Daniel Stutzbach wrote:

> I actually don't think it will be that bad, since list operations go
> through one thin API.  I just need to redirect the API in listobject.h
> and I'm mostly done.

Some of that API consists of macros that index directly
into the list. Currently those are O(1) and inlined. You
would have to replace them with function calls that would
be O(log n) and not inlined. The performance implications
of that could be unpleasant.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat May 12 03:42:42 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 12 May 2007 13:42:42 +1200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <f233jk$8j7$2@sea.gmane.org>
References: <f1s312$489$1@sea.gmane.org> <4642745C.1040702@canterbury.ac.nz>
	<20070509203702.25EF.JCARLSON@uci.edu> <f233jk$8j7$2@sea.gmane.org>
Message-ID: <46451B92.7010706@canterbury.ac.nz>

Giovanni Bajo wrote:

> cygwin's fork() is mature, but I don't think it's easy to extract from cygwin. 
> Moreover, there would be license issues since fork() is GPL. Doing another 
> implementation from scratch is going to be hard.

Also it doesn't sound very efficient compared to a
real unix fork, if it has to copy the whole address
space instead of using copy-on-write.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat May 12 03:43:19 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 12 May 2007 13:43:19 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070511184927.7A2043A4061@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>
	<20070511162803.5E70F3A4061@sparrow.telecommunity.com>
	<fb6fbf560705111027g21bf5299xcafe0cad52b8f6a5@mail.gmail.com>
	<20070511184927.7A2043A4061@sparrow.telecommunity.com>
Message-ID: <46451BB7.9030703@canterbury.ac.nz>

Phillip J. Eby wrote:
> At 01:27 PM 5/11/2007 -0400, Jim Jewett wrote:
>>If there are two registrations for the same selection criteria, how
>>can the user resolve things?

But what if there's *already* an @around method being
used? Then you need an @even_more_around method. Etc
ad infinitum?

--
Greg

From ncoghlan at gmail.com  Sat May 12 10:13:33 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 12 May 2007 18:13:33 +1000
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <46451046.4000109@canterbury.ac.nz>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>	<740c3aec0705091254n7a406621qb5cc491cf3af9743@mail.gmail.com>	<20070509205655.622A63A4061@sparrow.telecommunity.com>	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>	<464289C8.4080004@canterbury.ac.nz>	<20070510161417.192943A4061@sparrow.telecommunity.com>	<464395AB.6040505@canterbury.ac.nz>	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>	<20070511162803.5E70F3A4061@sparrow.telecommunity.com>
	<46451046.4000109@canterbury.ac.nz>
Message-ID: <4645772D.5070206@gmail.com>

Greg Ewing wrote:
> Phillip J. Eby wrote:
>> you can register your types with other people's 
>> generic functions, or other people's types with your generic 
>> functions,
> 
> There's still a possibility of conflict even then. Fred
> registers one of Mary's types with his generic function,
> which he feels entitled to do because he owns the function.
> Meanwhile, Mary registers the same type with the same
> function, which she feels entitled to do because she
> owns the type.
> 
> The problem is that nobody entirely owns the (type,
> function) pair, which is what's required to be unique.

However, even if it *does* happen, the application programmer can still 
resolve the conflict by picking one of the two implementations and 
registering it as an override.

At the moment, if you don't like the way a particular library handles 
another library's or your application's types, your ability to do 
anything about it is pretty close to nonexistent (unless the library 
employs some kind of interface or generic function mechanism).

Generic functions don't magically make library compatibility problems go 
away, particularly when the libraries involved are interacting directly 
rather than going through the main application. What they *do* provide 
is a standard toolkit for reducing the likelihood of incompatibility 
occurring in the first place, and providing the means for resolving 
whatever conflicts do arise.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Sat May 12 01:47:03 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 11 May 2007 16:47:03 -0700
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
Message-ID: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>

Here's a new version of the ABC PEP. A lot has changed; a lot remains.
I can't give a detailed overview of all the changes, and a diff would
show too many spurious changes, but some of the highlights are:

- Overloading isinstance and issubclass is now a key mechanism rather
than an afterthought; it is also the only change to C code required
[12].

- No built-in types need to be modified, and @abstractmethod is once
again imported from abc.py, which defines this and a new metaclass,
ABCMeta.

- Built-in (and user-defined) types can be registered as "virtual
subclasses" (not related to virtual base classes in C++) of the
standard ABCs, e.g. Sequence.register(tuple) makes issubclass(tuple,
Sequence) true (but Sequence won't show up in __bases__ or __mro__).
You can define your own ABCs and register standard ABCs or built-in
types as their virtual subclasses.

- The number of pre-defined ABCs is greatly reduced. Apart from the
one-trick ponies, which are mostly unchanged, we now have: Set,
MutableSet, Mapping, MutableMapping, Sequence, MutableSequence. That's
it.

Enjoy,

PEP: 3119
Title: Introducing Abstract Base Classes
Version: $Revision: 55276 $
Last-Modified: $Date: 2007-05-11 13:49:12 -0700 (Fri, 11 May 2007) $
Author: Guido van Rossum <guido at python.org>, Talin <talin at acm.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 18-Apr-2007
Post-History: 26-Apr-2007, 11-May-2007


Abstract
========

This is a proposal to add Abstract Base Class (ABC) support to Python
3000.  It proposes:

* A way to overload ``isinstance()`` and ``issubclass()``.

* A new module ``abc`` which serves as an "ABC support framework".  It
  defines a metaclass for use with ABCs and a decorator that can be
  used to define abstract methods.

* Specific ABCs for containers and iterators, to be added to the
  collections module.

Much of the thinking that went into the proposal is not about the
specific mechanism of ABCs, as contrasted with Interfaces or Generic
Functions (GFs), but about clarifying philosophical issues like "what
makes a set", "what makes a mapping" and "what makes a sequence".

There's also a companion PEP 3141, which defines ABCs for numeric
types.


Acknowledgements
----------------

Talin wrote the Rationale below [1]_ as well as most of the section on
ABCs vs. Interfaces.  For that alone he deserves co-authorship.  The
rest of the PEP uses "I" referring to the first author.


Rationale
=========

In the domain of object-oriented programming, the usage patterns for
interacting with an object can be divided into two basic categories,
which are 'invocation' and 'inspection'.

Invocation means interacting with an object by invoking its methods.
Usually this is combined with polymorphism, so that invoking a given
method may run different code depending on the type of an object.

Inspection means the ability for external code (outside of the
object's methods) to examine the type or properties of that object,
and make decisions on how to treat that object based on that
information.

Both usage patterns serve the same general end, which is to be able to
support the processing of diverse and potentially novel objects in a
uniform way, but at the same time allowing processing decisions to be
customized for each different type of object.

In classical OOP theory, invocation is the preferred usage pattern,
and inspection is actively discouraged, being considered a relic of an
earlier, procedural programming style.  However, in practice this view
is simply too dogmatic and inflexible, and leads to a kind of design
rigidity that is very much at odds with the dynamic nature of a
language like Python.

In particular, there is often a need to process objects in a way that
wasn't anticipated by the creator of the object class.  It is not
always the best solution to build in to every object methods that
satisfy the needs of every possible user of that object.  Moreover,
there are many powerful dispatch philosophies that are in direct
contrast to the classic OOP requirement of behavior being strictly
encapsulated within an object, examples being rule or pattern-match
driven logic.

On the other hand, one of the criticisms of inspection by classic
OOP theorists is the lack of formalisms and the ad hoc nature of what
is being inspected.  In a language such as Python, in which almost any
aspect of an object can be reflected and directly accessed by external
code, there are many different ways to test whether an object conforms
to a particular protocol or not.  For example, if asking 'is this
object a mutable sequence container?', one can look for a base class
of 'list', or one can look for a method named '__getitem__'.  But note
that although these tests may seem obvious, neither of them is
correct, as one generates false negatives, and the other false
positives.

The generally agreed-upon remedy is to standardize the tests, and
group them into a formal arrangement.  This is most easily done by
associating with each class a set of standard testable properties,
either via the inheritance mechanism or some other means.  Each test
carries with it a set of promises: it contains a promise about the
general behavior of the class, and a promise as to what other class
methods will be available.

This PEP proposes a particular strategy for organizing these tests
known as Abstract Base Classes, or ABC.  ABCs are simply Python
classes that are added into an object's inheritance tree to signal
certain features of that object to an external inspector.  Tests are
done using ``isinstance()``, and the presence of a particular ABC
means that the test has passed.

In addition, the ABCs define a minimal set of methods that establish
the characteristic behavior of the type.  Code that discriminates
objects based on their ABC type can trust that those methods will
always be present.  Each of these methods is accompanied by a
generalized abstract semantic definition that is described in the
documentation for the ABC.  These standard semantic definitions are
not enforced, but are strongly recommended.

Like all other things in Python, these promises are in the nature of a
gentlemen's agreement, which in this case means that while the
language does enforce some of the promises made in the ABC, it is up
to the implementer of the concrete class to ensure that the remaining
ones are kept.


Specification
=============

The specification follows the categories listed in the abstract:

* A way to overload ``isinstance()`` and ``issubclass()``.

* A new module ``abc`` which serves as an "ABC support framework".  It
  defines a metaclass for use with ABCs and a decorator that can be
  used to define abstract methods.

* Specific ABCs for containers and iterators, to be added to the
  collections module.


Overloading ``isinstance()`` and ``issubclass()``
-------------------------------------------------

During the development of this PEP and of its companion, PEP 3141, we
repeatedly faced the choice between standardizing more, fine-grained
ABCs or fewer, coarse-grained ones.  For example, at one stage, PEP
3141 introduced the following stack of base classes used for complex
numbers: MonoidUnderPlus, AdditiveGroup, Ring, Field, Complex (each
derived from the previous).  And the discussion mentioned several
other algebraic categorizations that were left out: Algebraic,
Transcendental, IntegralDomain, and PrincipalIdealDomain.  In
earlier versions of the current PEP, we considered the use cases for
separate classes like Set, ComposableSet, MutableSet, HashableSet,
MutableComposableSet, HashableComposableSet.

The dilemma here is that we'd rather have fewer ABCs, but then what
should a user do who needs a less refined ABC?  Consider e.g. the
plight of a mathematician who wants to define his own kind of
Transcendental numbers, but also wants float and int to be considered
Transcendental.  PEP 3141 originally proposed to patch float.__bases__
for that purpose, but there are some good reasons to keep the built-in
types immutable (for one, they are shared between all Python
interpreters running in the same address space, as is used by
mod_python).

Another example would be someone who wants to define a generic
function (PEP 3124) for any sequence that has an ``append()`` method.
The ``Sequence`` ABC (see below) doesn't promise the ``append()``
method, while ``MutableSequence`` requires not only ``append()`` but
also various other mutating methods.

To solve these and similar dilemmas, the next section will propose a
metaclass for use with ABCs that will allow us to add an ABC as a
"virtual base class" (not the same concept as in C++) to any class,
including to another ABC.  This allows the standard library to define
ABCs ``Sequence`` and ``MutableSequence`` and register these as
virtual base classes for built-in types like ``basestring``, ``tuple``
and ``list``, so that for example the following conditions are all
true::

    isinstance([], Sequence)
    issubclass(list, Sequence)
    issubclass(list, MutableSequence)
    isinstance((), Sequence)
    not issubclass(tuple, MutableSequence)
    isinstance("", Sequence)
    issubclass(bytes, MutableSequence)

The primary mechanism proposed here is to allow overloading the
built-in functions ``isinstance()`` and ``issubclass()``.  The
overloading works as follows: The call ``isinstance(x, C)`` first
checks whether ``C.__instancecheck__`` exists, and if so, calls
``C.__instancecheck__(x)`` instead of its normal implementation.
Similarly, the call ``issubclass(D, C)`` first checks whether
``C.__subclasscheck__`` exists, and if so, calls
``C.__subclasscheck__(D)`` instead of its normal implementation.

Note that the magic names are not ``__isinstance__`` and
``__issubclass__``; this is because the reversal of the arguments
could cause confusion, especially for the ``issubclass()`` overloader.

A prototype implementation of this is given in [12]_.

Here is an example with (naively simple) implementations of
``__instancecheck__`` and ``__subclasscheck__``::

    class ABCMeta(type):

        def __instancecheck__(cls, inst):
            """Implement isinstance(inst, cls)."""
            return any(cls.__subclasscheck__(c)
                       for c in {type(inst), inst.__class__})

        def __subclasscheck__(cls, sub):
            """Implement issubclass(sub, cls)."""
            candidates = cls.__dict__.get("__subclass__", set()) | {cls}
            return any(c in candidates for c in sub.mro())

    class Sequence(metaclass=ABCMeta):
        __subclass__ = {list, tuple}

    assert issubclass(list, Sequence)
    assert issubclass(tuple, Sequence)

    class AppendableSequence(Sequence):
        __subclass__ = {list}

    assert issubclass(list, AppendableSequence)
    assert isinstance([], AppendableSequence)

    assert not issubclass(tuple, AppendableSequence)
    assert not isinstance((), AppendableSequence)

The next section proposes a full-fledged implementation.


The ``abc`` Module: an ABC Support Framework
--------------------------------------------

The new standard library module ``abc``, written in pure Python,
serves as an ABC support framework.  It defines a metaclass
``ABCMeta`` and a decorator ``@abstractmethod``.  A sample
implementation is given by [13]_.

The ``ABCMeta`` class overrides ``__instancecheck__`` and
``__subclasscheck__`` and defines a ``register`` method.  The
``register`` method takes one argument, which must be a class; after
the call ``B.register(C)``, the call ``issubclass(C, B)`` will return
True, by virtue of ``B.__subclasscheck__(C)`` returning True.
Also, ``isinstance(x, B)`` is equivalent to ``issubclass(x.__class__,
B) or issubclass(type(x), B)``.  (It is possible ``type(x)`` and
``x.__class__`` are not the same object, e.g. when x is a proxy
object.)
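
To make the intended semantics concrete, a stripped-down version of such a
metaclass might look like this (an illustrative sketch only, not the sample
implementation from [13]_)::

    class ABCMeta(type):

        def __new__(mcls, name, bases, namespace):
            cls = super().__new__(mcls, name, bases, namespace)
            cls._registry = set()   # virtual subclasses added via register()
            return cls

        def register(cls, subclass):
            """Register *subclass* as a virtual subclass of *cls*."""
            if not isinstance(subclass, type):
                raise TypeError("Can only register classes")
            cls._registry.add(subclass)

        def __instancecheck__(cls, instance):
            return any(cls.__subclasscheck__(c)
                       for c in {type(instance), instance.__class__})

        def __subclasscheck__(cls, subclass):
            # A real subclass ...
            if cls in getattr(subclass, "__mro__", ()):
                return True
            # ... or a subclass of something registered with us ...
            if any(issubclass(subclass, r) for r in cls._registry):
                return True
            # ... or a virtual subclass of one of our real subclasses.
            return any(s.__subclasscheck__(subclass)
                       for s in cls.__subclasses__())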

These methods are intended to be called on classes whose metaclass
is (derived from) ``ABCMeta``; for example::

    from abc import ABCMeta

    class MyABC(metaclass=ABCMeta):
        pass

    MyABC.register(tuple)

    assert issubclass(tuple, MyABC)
    assert isinstance((), MyABC)

The last two asserts are equivalent to the following two::

    assert MyABC.__subclasscheck__(tuple)
    assert MyABC.__instancecheck__(())

Of course, you can also directly subclass MyABC::

    class MyClass(MyABC):
        pass

    assert issubclass(MyClass, MyABC)
    assert isinstance(MyClass(), MyABC)

Also, of course, a tuple is not a ``MyClass``::

    assert not issubclass(tuple, MyClass)
    assert not isinstance((), MyClass)

You can register another class as a subclass of ``MyClass``::

    MyClass.register(list)

    assert issubclass(list, MyClass)
    assert issubclass(list, MyABC)

You can also register another ABC::

    class AnotherClass(metaclass=ABCMeta):
        pass

    AnotherClass.register(basestring)

    MyClass.register(AnotherClass)

    assert issubclass(str, MyABC)

That last assert requires tracing the following superclass-subclass
relationships::

    MyABC -> MyClass (using regular subclassing)
    MyClass -> AnotherClass (using registration)
    AnotherClass -> basestring (using registration)
    basestring -> str (using regular subclassing)

The ``abc`` module also defines a new decorator, ``@abstractmethod``,
to be used to declare abstract methods.  A class containing at least
one method declared with this decorator that hasn't been overridden
yet cannot be instantiated.  Such a method may be called from the
overriding method in the subclass (using ``super`` or direct
invocation).  For example::

    from abc import ABCMeta, abstractmethod

    class A(metaclass=ABCMeta):
        @abstractmethod
        def foo(self): pass

    A()  # raises TypeError

    class B(A):
        pass

    B()  # raises TypeError

    class C(A):
        def foo(self): print(42)

    C()  # works

**Note:** The ``@abstractmethod`` decorator should only be used
inside a class body, and only for classes whose metaclass is (derived
from) ``ABCMeta``.  Dynamically adding abstract methods to a class, or
attempting to modify the abstraction status of a method or class once
it is created, are not supported.  The ``@abstractmethod`` decorator only
affects subclasses derived using regular inheritance; "virtual
subclasses" registered with the ``register()`` method are not affected.

It has been suggested that we should also provide a way to define
abstract data attributes.  As it is easy to add these in a later
stage, and as the use case is considerably less common (apart from
pure documentation), we punt on this for now.

**Implementation:** The ``@abstractmethod`` decorator sets the
function attribute ``__isabstractmethod__`` to the value ``True``.
The ``ABCMeta.__new__`` method computes the type attribute
``__abstractmethods__`` as the set of all method names that have an
``__isabstractmethod__`` attribute whose value is true.  It does this
by combining the ``__abstractmethods__`` attributes of the base
classes, adding the names of all methods in the new class dict that
have a true ``__isabstractmethod__`` attribute, and removing the names
of all methods in the new class dict that don't have a true
``__isabstractmethod__`` attribute.  If the resulting
``__abstractmethods__`` set is non-empty, the class is considered
abstract, and attempts to instantiate it will raise ``TypeError``.
(If this were implemented in CPython, an internal flag
``Py_TPFLAGS_ABSTRACT`` could be used to speed up this check [6]_.)
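
The bookkeeping described above can be sketched in a few lines (again an
illustration of the algorithm, not the referenced sample implementation nor
what CPython would do)::

    def abstractmethod(funcobj):
        funcobj.__isabstractmethod__ = True
        return funcobj

    class ABCMeta(type):

        def __new__(mcls, name, bases, namespace):
            cls = super().__new__(mcls, name, bases, namespace)
            # Methods declared abstract in this class body ...
            abstracts = {n for n, v in namespace.items()
                         if getattr(v, "__isabstractmethod__", False)}
            # ... plus inherited abstract names that have not been
            # overridden with a concrete method.
            for base in bases:
                for n in getattr(base, "__abstractmethods__", set()):
                    if getattr(getattr(cls, n, None),
                               "__isabstractmethod__", False):
                        abstracts.add(n)
            cls.__abstractmethods__ = frozenset(abstracts)
            return cls

        def __call__(cls, *args, **kwds):
            if cls.__abstractmethods__:
                raise TypeError(
                    "Can't instantiate abstract class %s with abstract "
                    "methods %s" % (cls.__name__,
                                    ", ".join(sorted(cls.__abstractmethods__))))
            return super().__call__(*args, **kwds)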

**Discussion:** Unlike C++ or Java, abstract methods as defined here
may have an implementation.  This implementation can be called via the
``super`` mechanism from the class that overrides it.  This could be
useful as an end-point for a super-call in a framework using
cooperative multiple inheritance [7]_, [8]_.


ABCs for Containers and Iterators
---------------------------------

The ``collections`` module will define ABCs necessary and sufficient
to work with sets, mappings, sequences, and some helper types such as
iterators and dictionary views.  All ABCs have the above-mentioned
``ABCMeta`` as their metaclass.

The ABCs provide implementations of their abstract methods that are
technically valid but fairly useless; e.g. ``__hash__`` returns 0, and
``__iter__`` returns an empty iterator.  In general, the abstract
methods represent the behavior of an empty container of the indicated
type.

Some ABCs also provide concrete (i.e. non-abstract) methods; for
example, the ``Iterator`` class has an ``__iter__`` method returning
itself, fulfilling an important invariant of iterators (which in
Python 2 has to be implemented anew by each iterator class).  These
ABCs can be considered "mix-in" classes.
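
For example, an ``Iterator`` along these lines gives every concrete
iterator its ``__iter__`` for free (a sketch of the idea only; the actual
class hierarchy is described below)::

    from abc import ABCMeta, abstractmethod

    class Iterator(metaclass=ABCMeta):

        @abstractmethod
        def __next__(self):
            raise StopIteration

        def __iter__(self):          # concrete "mix-in" method
            return self

    class CountDown(Iterator):

        def __init__(self, n):
            self.n = n

        def __next__(self):
            if self.n <= 0:
                raise StopIteration
            self.n -= 1
            return self.n

    assert list(CountDown(3)) == [2, 1, 0]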

No ABCs defined in the PEP override ``__init__``, ``__new__``,
``__str__`` or ``__repr__``.  Defining a standard constructor
signature would unnecessarily constrain custom container types, for
example Patricia trees or gdbm files.  Defining a specific string
representation for a collection is similarly left up to individual
implementations.

**Note:** There are no ABCs for ordering operations (``__lt__``,
``__le__``, ``__ge__``, ``__gt__``).  Defining these in a base class
(abstract or not) runs into problems with the accepted type for the
second operand.  For example, if class ``Ordering`` defined
``__lt__``, one would assume that for any ``Ordering`` instances ``x``
and ``y``, ``x < y`` would be defined (even if it just defines a
partial ordering).  But this cannot be the case: If both ``list`` and
``str`` derived from ``Ordering``, this would imply that ``[1, 2] <
(1, 2)`` should be defined (and presumably return False), while in
fact (in Python 3000!)  such "mixed-mode comparisons" operations are
explicitly forbidden and raise ``TypeError``.  See PEP 3100 and [14]_
for more information.  (This is a special case of a more general issue
with operations that take another argument of the same type:


One Trick Ponies
''''''''''''''''

These abstract classes represent single methods like ``__iter__`` or
``__len__``.

``Hashable``
    The base class for classes defining ``__hash__``.  The
    ``__hash__`` method should return an integer.  The abstract
    ``__hash__`` method always returns 0, which is a valid (albeit
    inefficient) implementation.  **Invariant:** If classes ``C1`` and
    ``C2`` both derive from ``Hashable``, the condition ``o1 == o2``
    must imply ``hash(o1) == hash(o2)`` for all instances ``o1`` of
    ``C1`` and all instances ``o2`` of ``C2``.  IOW, two objects
    should never compare equal but have different hash values.

    Another constraint is that hashable objects, once created, should
    never change their value (as compared by ``==``) or their hash
    value.  If a class cannot guarantee this, it should not derive
    from ``Hashable``; if it cannot guarantee this for certain
    instances, ``__hash__`` for those instances should raise a
    ``TypeError`` exception.

    **Note:** being an instance of this class does not imply that an
    object is immutable; e.g. a tuple containing a list as a member is
    not immutable; its ``__hash__`` method raises ``TypeError``.
    (This is because it recursively tries to compute the hash of each
    member; if a member is unhashable it raises ``TypeError``.)

``Iterable``
    The base class for classes defining ``__iter__``.  The
    ``__iter__`` method should always return an instance of
    ``Iterator`` (see below).  The abstract ``__iter__`` method
    returns an empty iterator.

``Iterator``
    The base class for classes defining ``__next__``.  This derives
    from ``Iterable``.  The abstract ``__next__`` method raises
    ``StopIteration``.  The concrete ``__iter__`` method returns
    ``self``.  Note the distinction between ``Iterable`` and
    ``Iterator``: an ``Iterable`` can be iterated over, i.e. supports
    the ``__iter__`` method; an ``Iterator`` is what the built-in
    function ``iter()`` returns, i.e. supports the ``__next__``
    method.
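
    A minimal sketch of the mix-in at work (module location again per the
    eventual ``collections.abc``): a countdown iterator supplies only
    ``__next__`` and inherits the concrete ``__iter__``::

        from collections.abc import Iterator

        class CountDown(Iterator):
            def __init__(self, n):
                self._n = n

            def __next__(self):
                if self._n <= 0:
                    raise StopIteration
                value = self._n
                self._n -= 1
                return value

        # __iter__ comes from the Iterator mix-in, so this just works:
        assert list(CountDown(3)) == [3, 2, 1]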

``Sized``
    The base class for classes defining ``__len__``.  The ``__len__``
    method should return an ``Integer`` (see "Numbers" below) >= 0.
    The abstract ``__len__`` method returns 0.  **Invariant:** If a
    class ``C`` derives from ``Sized`` as well as from ``Iterable``,
    the invariant ``sum(1 for x in o) == len(o)`` should hold for any
    instance ``o`` of ``C``.

``Container``
    The base class for classes defining ``__contains__``.  The
    ``__contains__`` method should return a ``bool``.  The abstract
    ``__contains__`` method returns ``False``.  **Invariant:** If a
    class ``C`` derives from ``Container`` as well as from
    ``Iterable``, then ``(x in o for x in o)`` should be a generator
    yielding only True values for any instance ``o`` of ``C``.

**Open issues:** Conceivably, instead of using the ABCMeta metaclass,
these classes could override ``__instancecheck__`` and
``__subclasscheck__`` to check for the presence of the applicable
special method; for example::

    class Sized(metaclass=ABCMeta):
        @abstractmethod
        def __len__(self):
            return 0
        @classmethod
        def __instancecheck__(cls, x):
            return hasattr(x, "__len__")
        @classmethod
        def __subclasscheck__(cls, C):
            return hasattr(C, "__bases__") and hasattr(C, "__len__")

This has the advantage of not requiring explicit registration.
However, the semantics are hard to get exactly right given the confusing
semantics of instance attributes vs. class attributes, and that a
class is an instance of its metaclass; the check for ``__bases__`` is
only an approximation of the desired semantics.  **Strawman:** Let's
do it, but let's arrange it in such a way that the registration API
also works.
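
As a hedged illustration of where this strawman ended up: the ``abc``
machinery that eventually shipped exposes a ``__subclasshook__`` hook on
``ABCMeta`` classes, which allows exactly this kind of structural check
while leaving the registration API intact::

    from abc import ABCMeta, abstractmethod

    class Sized(metaclass=ABCMeta):
        @abstractmethod
        def __len__(self):
            return 0

        @classmethod
        def __subclasshook__(cls, C):
            if cls is Sized:
                # Structural check: any class defining __len__ counts.
                return any("__len__" in B.__dict__ for B in C.__mro__)
            return NotImplemented

    class Banana:               # neither subclassed nor registered
        def __len__(self):
            return 3

    assert issubclass(Banana, Sized)
    assert isinstance(Banana(), Sized)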


Sets
''''

These abstract classes represent read-only sets and mutable sets.  The
most fundamental set operation is the membership test, written as ``x
in s`` and implemented by ``s.__contains__(x)``.  This operation is
already defined by the ``Container`` class defined above.  Therefore,
we define a set as a sized, iterable container for which certain
invariants from mathematical set theory hold.

The built-in type ``set`` derives from ``MutableSet``.  The built-in
type ``frozenset`` derives from ``Set`` and ``Hashable``.

``Set``

    This is a sized, iterable container, i.e., a subclass of
    ``Sized``, ``Iterable`` and ``Container``.  Not every subclass of
    those three classes is a set though!  Sets have the additional
    invariant that each element occurs only once (as can be determined
    by iteration), and in addition sets define concrete operators that
    implement the inequality operations as subclass/superclass tests.
    In general, the invariants for finite sets in mathematics
    hold. [11]_

    Sets with different implementations can be compared safely,
    (usually) efficiently and correctly using the mathematical
    definitions of the subclass/superclass operations for finite sets.
    The ordering operations have concrete implementations; subclasses
    may override these for speed but should maintain the semantics.
    Because ``Set`` derives from ``Sized``, ``__eq__`` may take a
    shortcut and return ``False`` immediately if two sets of unequal
    length are compared.  Similarly, ``__le__`` may return ``False``
    immediately if the first set has more members than the second set.
    Note that set inclusion implements only a partial ordering;
    e.g. ``{1, 2}`` and ``{1, 3}`` are not ordered (all three of
    ``<``, ``==`` and ``>`` return ``False`` for these arguments).
    Sets cannot be ordered relative to mappings or sequences, but they
    can be compared to those for equality (and then they always
    compare unequal).

    This class also defines concrete operators to compute union,
    intersection, symmetric and asymmetric difference, respectively
    ``__or__``, ``__and__``, ``__xor__`` and ``__sub__``.  These
    operators should return instances of ``Set``.  The default
    implementations call the overridable class method
    ``_from_iterable()`` with an iterable argument.  This factory
    method's default implementation returns a ``frozenset`` instance;
    it may be overridden to return another appropriate ``Set``
    subclass.

    Finally, this class defines a concrete method ``_hash`` which
    computes the hash value from the elements.  Hashable subclasses of
    ``Set`` can implement ``__hash__`` by calling ``_hash`` or they
    can reimplement the same algorithm more efficiently; but the
    algorithm implemented should be the same.  Currently the algorithm
    is fully specified only by the source code [15]_.

    **Note:** the ``issubset`` and ``issuperset`` methods found on the
    set type in Python 2 are not supported, as these are mostly just
    aliases for ``__le__`` and ``__ge__``.
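
    A hedged sketch of the mix-in behaviour described above (module
    location per the eventual ``collections.abc``): a read-only set backed
    by a sorted tuple supplies only the three abstract methods, and the
    comparison and arithmetic operators come for free::

        from collections.abc import Set

        class TupleSet(Set):
            def __init__(self, iterable=()):
                self._items = tuple(sorted(set(iterable)))

            def __contains__(self, value):
                return value in self._items

            def __iter__(self):
                return iter(self._items)

            def __len__(self):
                return len(self._items)

            @classmethod
            def _from_iterable(cls, iterable):
                # Make the inherited operators build TupleSet instances.
                return cls(iterable)

        a, b = TupleSet([1, 2, 3]), TupleSet([2, 3, 4])
        assert list(a & b) == [2, 3]        # concrete __and__ mix-in
        assert a <= TupleSet([1, 2, 3, 4])  # inherited subset test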

``MutableSet``

    This is a subclass of ``Set`` implementing additional operations
    to add and remove elements.  The supported methods have the
    semantics known from the ``set`` type in Python 2 (except for
    ``discard``, which is modeled after Java):

    ``.add(x)``
        Abstract method returning a ``bool`` that adds the element
        ``x`` if it isn't already in the set.  It should return
        ``True`` if ``x`` was added, ``False`` if it was already
        there. The abstract implementation raises
        ``NotImplementedError``.

    ``.discard(x)``
        Abstract method returning a ``bool`` that removes the element
        ``x`` if present.  It should return ``True`` if the element
        was present and ``False`` if it wasn't.  The abstract
        implementation raises ``NotImplementedError``.

    ``.pop()``
        Concrete method that removes and returns an arbitrary item.
        If the set is empty, it raises ``KeyError``.  The default
        implementation removes the first item returned by the set's
        iterator.

    ``.toggle(x)``
        Concrete method returning a ``bool`` that adds ``x`` to the set if
        it wasn't there, but removes it if it was there.  It should
        return ``True`` if ``x`` was added, ``False`` if it was
        removed.

    ``.clear()``
        Concrete method that empties the set.  The default
        implementation repeatedly calls ``self.pop()`` until
        ``KeyError`` is caught.  (**Note:** this is likely much slower
        than simply creating a new set, even if an implementation
        overrides it with a faster approach; but in some cases object
        identity is important.)

    This also supports the in-place mutating operations ``|=``,
    ``&=``, ``^=``, ``-=``.  These are concrete methods whose right
    operand can be an arbitrary ``Iterable``, except for ``&=``, whose
    right operand must be a ``Container``.  This ABC does not support
    the named methods present on the built-in concrete ``set`` type
    that perform (almost) the same operations.
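
    A hedged sketch of a mutable variant, written against the
    ``MutableSet`` ABC that eventually shipped (whose ``add`` and
    ``discard`` return ``None`` rather than the ``bool`` described above,
    and which has no ``toggle``); the in-place operators come from the
    mix-in::

        from collections.abc import MutableSet

        class ListSet(MutableSet):
            """A small set backed by a list, preserving insertion order."""

            def __init__(self, iterable=()):
                self._items = []
                for value in iterable:
                    self.add(value)

            def __contains__(self, value):
                return value in self._items

            def __iter__(self):
                return iter(self._items)

            def __len__(self):
                return len(self._items)

            def add(self, value):
                if value not in self._items:
                    self._items.append(value)

            def discard(self, value):
                if value in self._items:
                    self._items.remove(value)

        s = ListSet([1, 2])
        s |= [2, 3]                 # concrete __ior__ accepts any iterable
        assert sorted(s) == [1, 2, 3]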


Mappings
''''''''

These abstract classes represent read-only mappings and mutable
mappings.  The ``Mapping`` class represents the most common read-only
mapping API.

The built-in type ``dict`` derives from ``MutableMapping``.

``Mapping``

    A subclass of ``Container``, ``Iterable`` and ``Sized``.  The keys
    of a mapping naturally form a set.  The (key, value) pairs (which
    must be tuples) are also referred to as items.  The items also
    form a set.  Methods:

    ``.__getitem__(key)``
        Abstract method that returns the value corresponding to
        ``key``, or raises ``KeyError``.  The implementation always
        raises ``KeyError``.

    ``.get(key, default=None)``
        Concrete method returning ``self[key]`` if this does not raise
        ``KeyError``, and the ``default`` value if it does.

    ``.__contains__(key)``
        Concrete method returning ``True`` if ``self[key]`` does not
        raise ``KeyError``, and ``False`` if it does.

    ``.__len__()``
        Abstract method returning the number of distinct keys (i.e.,
        the length of the key set).

    ``.__iter__()``
        Abstract method returning each key in the key set exactly once.

    ``.keys()``
        Concrete method returning the key set as a ``Set``.  The
        default concrete implementation returns a "view" on the key
        set (meaning if the underlying mapping is modified, the view's
        value changes correspondingly); subclasses are not required to
        return a view but they should return a ``Set``.

    ``.items()``
        Concrete method returning the items as a ``Set``.  The default
        concrete implementation returns a "view" on the item set;
        subclasses are not required to return a view but they should
        return a ``Set``.

    ``.values()``
        Concrete method returning the values as a sized, iterable
        container (not a set!).  The default concrete implementation
        returns a "view" on the values of the mapping; subclasses are
        not required to return a view but they should return a sized,
        iterable container.

    The following invariants should hold for any mapping ``m``::

        len(m.values()) == len(m.keys()) == len(m.items()) == len(m)
        [value for value in m.values()] == [m[key] for key in m.keys()]
        [item for item in m.items()] == [(key, m[key]) for key in m.keys()]

    i.e. iterating over the items, keys and values should return
    results in the same order.
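
    A minimal sketch of a read-only mapping (module location per the
    eventual ``collections.abc``): only the three abstract methods are
    supplied, while ``get``, ``__contains__``, ``keys``, ``items`` and
    ``values`` are inherited::

        from collections.abc import Mapping

        class FrozenMap(Mapping):
            def __init__(self, *args, **kwargs):
                self._data = dict(*args, **kwargs)

            def __getitem__(self, key):
                return self._data[key]

            def __len__(self):
                return len(self._data)

            def __iter__(self):
                return iter(self._data)

        m = FrozenMap(a=1, b=2)
        assert m.get("c", 0) == 0             # concrete get()
        assert set(m.keys()) == {"a", "b"}    # concrete keys() view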

``MutableMapping``
    A subclass of ``Mapping`` that also implements some standard
    mutating methods.  Abstract methods include ``__setitem__``,
    ``__delitem__``.  Concrete methods include ``pop``, ``popitem``,
    ``clear``, ``update``.  **Note:** ``setdefault`` is *not* included.
    **Open issues:** Write out the specs for the methods.
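
    In the same spirit, a hedged sketch against the ``MutableMapping``
    ABC that eventually shipped: adding ``__setitem__`` and
    ``__delitem__`` is enough to inherit concrete ``pop``, ``popitem``,
    ``clear`` and ``update``::

        from collections.abc import MutableMapping

        class LowerDict(MutableMapping):
            """A dict-backed mapping that lower-cases its string keys."""

            def __init__(self):
                self._data = {}

            def __getitem__(self, key):
                return self._data[key.lower()]

            def __setitem__(self, key, value):
                self._data[key.lower()] = value

            def __delitem__(self, key):
                del self._data[key.lower()]

            def __len__(self):
                return len(self._data)

            def __iter__(self):
                return iter(self._data)

        d = LowerDict()
        d.update(Spam=1, EGGS=2)      # concrete update()
        assert d.pop("SPAM") == 1     # concrete pop()
        assert dict(d) == {"eggs": 2}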


Sequences
'''''''''

These abstract classes represent read-only sequences and mutable
sequences.

The built-in ``list`` and ``bytes`` types derive from
``MutableSequence``.  The built-in ``tuple`` and ``str`` types derive
from ``Sequence`` and ``Hashable``.

``Sequence``

    A subclass of ``Iterable``, ``Sized``, ``Container``.  It
    defines a new abstract method ``__getitem__`` that has a somewhat
    complicated signature: when called with an integer, it returns an
    element of the sequence or raises ``IndexError``; when called with
    a ``slice`` object, it returns another ``Sequence``.  The concrete
    ``__iter__`` method iterates over the elements using
    ``__getitem__`` with integer arguments 0, 1, and so on, until
    ``IndexError`` is raised.  The length should be equal to the
    number of values returned by the iterator.

    **Open issues:** Other candidate methods, which can all have
    default concrete implementations that only depend on ``__len__``
    and ``__getitem__`` with an integer argument: ``__reversed__``,
    ``index``, ``count``, ``__add__``, ``__mul__``.
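
    A hedged sketch of the pattern: a sequence of squares supplies only
    ``__getitem__`` and ``__len__``; iteration and ``in`` tests come from
    the concrete methods, and the ``index``/``count`` candidates above are
    indeed provided as mix-ins by the ABC that eventually shipped::

        from collections.abc import Sequence

        class Squares(Sequence):
            def __init__(self, n):
                self._n = n

            def __getitem__(self, index):
                if isinstance(index, slice):
                    # A plain list stands in for "another Sequence" here.
                    return [i * i for i in range(self._n)][index]
                if not 0 <= index < self._n:
                    raise IndexError(index)
                return index * index

            def __len__(self):
                return self._n

        sq = Squares(5)
        assert list(sq) == [0, 1, 4, 9, 16]   # concrete __iter__
        assert 9 in sq                        # concrete __contains__
        assert sq.index(16) == 4              # mix-in in the shipped ABC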

``MutableSequence``

    A subclass of ``Sequence`` adding some standard mutating methods.
    Abstract mutating methods: ``__setitem__`` (for integer indices as
    well as slices), ``__delitem__`` (ditto), ``insert``, ``append``,
    ``reverse``.  Concrete mutating methods: ``extend``, ``pop``,
    ``remove``.  Concrete mutating operators: ``+=``, ``*=`` (these
    mutate the object in place).  **Note:** this does not define
    ``sort()`` -- that is only required to exist on genuine ``list``
    instances.


Strings
-------

Python 3000 will likely have at least two built-in string types: byte
strings (``bytes``), deriving from ``MutableSequence``, and (Unicode)
character strings (``str``), deriving from ``Sequence`` and
``Hashable``.

**Open issues:** define the base interfaces for these so alternative
implementations and subclasses know what they are in for.  This may be
the subject of a new PEP or PEPs (PEP 358 should be co-opted for the
``bytes`` type).


ABCs vs. Alternatives
=====================

In this section I will attempt to compare and contrast ABCs to other
approaches that have been proposed.


ABCs vs. Duck Typing
--------------------

Does the introduction of ABCs mean the end of Duck Typing?  I don't
think so.  Python will not require that a class derives from
``BasicMapping`` or ``Sequence`` when it defines a ``__getitem__``
method, nor will the ``x[y]`` syntax require that ``x`` is an instance
of either ABC.  You will still be able to assign any "file-like"
object to ``sys.stdout``, as long as it has a ``write`` method.

Of course, there will be some carrots to encourage users to derive
from the appropriate base classes; these vary from default
implementations for certain functionality to an improved ability to
distinguish between mappings and sequences.  But there are no sticks.
If ``hasattr(x, "__len__")`` works for you, great!  ABCs are intended to
solve problems that don't have a good solution at all in Python 2,
such as distinguishing between mappings and sequences.
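
A small hedged illustration of that point: both checks remain legal, and
an object that merely has ``__len__`` passes the ABC check as well (in
current Python the check lives in ``collections.abc``)::

    from collections.abc import Sized

    def describe(x):
        if hasattr(x, "__len__"):      # classic duck-typing check
            print("has a length:", len(x))
        if isinstance(x, Sized):       # ABC-based check; same objects pass
            print("looks Sized too")

    describe([1, 2, 3])    # both branches fire for a plain list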


ABCs vs. Generic Functions
--------------------------

ABCs are compatible with Generic Functions (GFs).  For example, my own
Generic Functions implementation [4]_ uses the classes (types) of the
arguments as the dispatch key, allowing derived classes to override
base classes.  Since (from Python's perspective) ABCs are quite
ordinary classes, using an ABC in the default implementation for a GF
can be quite appropriate.  For example, if I have an overloaded
``prettyprint`` function, it would make total sense to define
pretty-printing of sets like this::

    @prettyprint.register(Set)
    def pp_set(s):
        return "{" + ... + "}"  # Details left as an exercise

and implementations for specific subclasses of Set could be added
easily.
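
A hedged sketch of the same idea using ``functools.singledispatch``, a
later single-argument stdlib mechanism in the same spirit as the
overloading implementations referenced here; registering on the ``Set``
ABC covers every set-like type, and a more specific override can still be
added::

    from collections.abc import Set
    from functools import singledispatch

    @singledispatch
    def prettyprint(obj):
        return repr(obj)

    @prettyprint.register(Set)
    def _(s):
        return "{" + ", ".join(sorted(repr(x) for x in s)) + "}"

    @prettyprint.register(frozenset)   # the more specific class wins
    def _(s):
        return "frozenset(" + ", ".join(sorted(repr(x) for x in s)) + ")"

    assert prettyprint({1, 2}) == "{1, 2}"
    assert prettyprint(frozenset([3])) == "frozenset(3)"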

I believe ABCs also won't present any problems for RuleDispatch,
Phillip Eby's GF implementation in PEAK [5]_.

Of course, GF proponents might claim that GFs (and concrete, or
implementation, classes) are all you need.  But even they will not
deny the usefulness of inheritance; and one can easily consider the
ABCs proposed in this PEP as optional implementation base classes;
there is no requirement that all user-defined mappings derive from
``BasicMapping``.


ABCs vs. Interfaces
-------------------

ABCs are not intrinsically incompatible with Interfaces, but there is
considerable overlap.  For now, I'll leave it to proponents of
Interfaces to explain why Interfaces are better.  I expect that much
of the work that went into e.g. defining the various shades of
"mapping-ness" and the nomenclature could easily be adapted for a
proposal to use Interfaces instead of ABCs.

"Interfaces" in this context refers to a set of proposals for
additional metadata elements attached to a class which are not part of
the regular class hierarchy, but do allow for certain types of
inheritance testing.

Such metadata would be designed, at least in some proposals, so as to
be easily mutable by an application, allowing application writers to
override the normal classification of an object.

The drawback to this idea of attaching mutable metadata to a class is
that classes are shared state, and mutating them may lead to conflicts
of intent.  Additionally, overriding the classification of an object
can be done more cleanly using generic functions: In the
simplest case, one can define a "category membership" generic function
that simply returns False in the base implementation, and then provide
overrides that return True for any classes of interest.


References
==========

.. [1] An Introduction to ABC's, by Talin
   (http://mail.python.org/pipermail/python-3000/2007-April/006614.html)

.. [2] Incomplete implementation prototype, by GvR
   (http://svn.python.org/view/sandbox/trunk/abc/)

.. [3] Possible Python 3K Class Tree?, wiki page created by Bill Janssen
   (http://wiki.python.org/moin/AbstractBaseClasses)

.. [4] Generic Functions implementation, by GvR
   (http://svn.python.org/view/sandbox/trunk/overload/)

.. [5] Charming Python: Scaling a new PEAK, by David Mertz
   (http://www-128.ibm.com/developerworks/library/l-cppeak2/)

.. [6] Implementation of @abstractmethod
   (http://python.org/sf/1706989)

.. [7] Unifying types and classes in Python 2.2, by GvR
   (http://www.python.org/download/releases/2.2.3/descrintro/)

.. [8] Putting Metaclasses to Work: A New Dimension in Object-Oriented
   Programming, by Ira R. Forman and Scott H. Danforth
   (http://www.amazon.com/gp/product/0201433052)

.. [9] Partial order, in Wikipedia
   (http://en.wikipedia.org/wiki/Partial_order)

.. [10] Total order, in Wikipedia
   (http://en.wikipedia.org/wiki/Total_order)

.. [11] Finite set, in Wikipedia
   (http://en.wikipedia.org/wiki/Finite_set)

.. [12] Make isinstance/issubclass overloadable
   (http://python.org/sf/1708353)

.. [13] ABCMeta sample implementation
   (http://svn.python.org/view/sandbox/trunk/abc/xyz.py)

.. [14] python-dev email ("Comparing heterogeneous types")
   http://mail.python.org/pipermail/python-dev/2004-June/045111.html

.. [15] Function ``frozenset_hash()`` in Object/setobject.c
   (http://svn.python.org/view/python/trunk/Objects/setobject.c)


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gproux+py3000 at gmail.com  Sat May 12 17:27:09 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Sun, 13 May 2007 00:27:09 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
Message-ID: <19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>

Dear all,

Pleased to meet you. I just subscribed to the list because I wanted to
join the discussion regarding a specific PEP (for all the rest, you
are all much more expert than me)

Guido:
> 3131 (non-ASCII identifiers) -- I'm leaning towards rejecting.

I would like to voice my opposition to the rejection at this stage and
request that more time be spent gathering and analysing the opinions of
more people, especially the people who have to deal with non-roman
languages on a daily basis and people in the education field (along
with other interesting people like the OLPC people).

French but living in Japan and essentially trilingual, I have
experience in localization/internationalization (as an i18n engineer
for Symbian Ltd.), and as a very ardent Python supporter I have tried
(and sometimes managed) teaching Python to a number of younger or less
young people, male and female, in both French and Japanese environments.

In this respect, I strongly believe that support non-ASCII identifiers
as proposed by PEP3131 would improve a number of things:
- discussion and uptake of python in "non-ascii" countries
- ability for children to learn programming in their own language (I
started programming at 7 years old and would have been very disturbed
if I could not use my own language to type in programs)
- increase of the number of new "interesting" packages from non-ascii countries
- ability for local programmers and local companies to provide
"bridges" between international (english) APIs and local APIs.
- Increase the number of python users (from 7 to 77 years old)

In my humble opinion, now that UTF8 is accepted as the standard source
code encoding, it is very difficult to understand why we should start
putting restrictions on the kind of identifiers that are used (which
would force people to comment line by line as they do now!).

When I am programming in Python, I am VERY DISTURBED when the code I
write contains much comment. It needs to be readable just by glancing
at it.

However, for most of the people who are core python developers, you
should ask what is the typical reading speed for "ascii" characters
for a e.g. standard Japanese pupil. You would be very surprised how
slow that is. In my opinion (after living in Japan for quite a bit),
people are very slow to read ASCII characters and this definitely
restrains their programming productivity and expressiveness.

Of course, for things like "standard libraries", I think that
self-regulation and project based regulation will impose ASCII
charsets for the base libraries and APIs but i really believe that
letting people use their own charset to express themselves will REALLY
give them the productivity boost they would deserve from python.

Let me know if you have any question.

Regards,

Guillaume

From steven.bethard at gmail.com  Sat May 12 19:03:45 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 12 May 2007 11:03:45 -0600
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <1178551661.8251.16.camel@antoine-ubuntu>
References: <1178551661.8251.16.camel@antoine-ubuntu>
Message-ID: <d11dcfba0705121003r4d705b02p798e9b0422101e54@mail.gmail.com>

On 5/7/07, Antoine Pitrou <antoine.pitrou at wengo.com> wrote:
> FWIW and in light of the thread on removing __del__ from the language, I
> just posted Yet Another Recipe for automatic finalization:
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/519621
>
> It allows writing a finalizer as a single __finalize__ method, at the
> cost of explicitly calling an enable_finalizer() method with the list of
> attributes to keep alive on the "ghost object".

And here's a version that doesn't lose updates to the finalizer attributes:

    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/519635

It replaces enable_finalizer() with a class attribute __finalattrs__.
From __finalize__, all class attributes and methods are accessible, as
are any instance attributes specified by __finalattrs__. Guido's
BufferedWriter example looks like::

    class BufferedWriter(Finalized):
        __finalattrs__ = 'buffer', 'raw'
        ...
        def flush(self):
            self.raw.write(self.buffer)
            self.buffer = b""

        def __finalize__(self):
            self.flush()


STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From eucci.group at gmail.com  Sat May 12 19:58:10 2007
From: eucci.group at gmail.com (Jeff Shell)
Date: Sat, 12 May 2007 11:58:10 -0600
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <20070510153507.205EB3A4061@sparrow.telecommunity.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<20070509015553.9C6843A4061@sparrow.telecommunity.com>
	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
	<46424718.20006@benjiyork.com>
	<20070510012836.16D0D3A4061@sparrow.telecommunity.com>
	<464318D0.2000109@benjiyork.com>
	<20070510153507.205EB3A4061@sparrow.telecommunity.com>
Message-ID: <88d0d31b0705121058n7a92af7dn178f220dba922a91@mail.gmail.com>

On 5/10/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:06 AM 5/10/2007 -0400, Benji York wrote:
> >I would let Jim speak for himself too, but I prefer to put words in his
> >mouth. ;)  While zope.interface has anemic facilities for "verifying"
> >interfaces, few people use them, and even then rarely outside of very
> >simple "does this object look right" when testing.  It may have been
> >believed verification would be a great thing, but it's all but
> >deprecated at this point.
>
> Okay, but that's quite the opposite of what I understand Jeff to be
> saying in this thread, which is that not only is LBYL good, but that
> he does it all the time.

Actually, I don't know what LBYL and EFTP (or whatever that other one
is) mean in this context. This is the first time I've heard, or at
least paid attention to, those acronyms. In this context anyways.

If you could explain what this really means, and KTATAM (Keep The
Acronyms To A Minimum), I would appreciate it. I recognize the
arguments I've made seem to go behind LBYL, but that was mostly chosen
because that's what you said zope.interface did or was. And
gul-darnit, I like zope.interface.

> >My main intent in piping up was
> >dispelling the LBYL dispersions about zope.interface. ;)
>
> Well, "back in the day", before PyProtocols was written, I discovered
> PEP 246 adaptation and began trying to convince Jim Fulton that
> adaptation beat the pants off of using if-then's to do "implements"
> testing.  His argument then, IIRC, was that interface verification
> was more important.  I then went off and wrote PyProtocols in large
> part (specifically the large documentation part!) to show him what
> could be done using adaptation as a core concept.

I think it's beneficial to have both. But I agree, it's usually better
to program against adaptation. It provides more flexibility. I think
the 'hasattr()' syndrome still hangs over many of us, however. We're
used to looking at the piece of the duck we're interested in more than
trying to see if we can put something into a duck suit (or better yet
- Duck Soup!)

But the 'provides / provided by' piece is still important to me.
Adaptation isn't *always* needed or useful.

I like that the interface hierarchy is different than an
implementation hierarchy. I like that it's easier to test for
interface provision than it is to use isinstance() -
`IFoo.providedBy(obj)` often works regardless of whether 'obj' is a
proxy or wrapper, and without tampering with `isinstance()`. I know
that there's been talk of having ``__isinstance()__`` and
``__issubclass()__``, which could be used to take care of the
proxy/wrapper problem. But I haven't formed an opinion about how I
feel about that.

I like the Roles/Traits side of zope.interface because I can declare
that information about third party products. For example, I was able
to add some 'implements' directives to a SQLAlchemy 'InstrumentedList'
class - basically I said that it supported the common `ISequence`
interface. Which I recognize that in this particular scenario, if that
role/trait or abstract base class was built in, then I wouldn't have
had to do that (since it is based on a common Python type spec). Still
though - it doesn't matter whether `InstrumentedList` derives from
`list` or `UserList` or implements the entire sequence API directly.
The Trait could be assigned independent of implementation, and could
be done in another product without affecting any internals of
SQLAlchemy: I didn't have to make a subclass that SQLAlchemy wouldn't
know to instantiate. I didn't have to write an adapter. I just had to
say "I happen to know that instances of this class will have this
trait".

I don't know if that's LBYL, EYV (Eat Your Vegetables), LBWBCTS (Look
Both Ways Before Crossing The Street), or what. I think it's just a
way of saying "I happen to know that this thing smells like a duck. It
doesn't say that it smells like a duck, but I know it smells like a
duck. And for everywhere that I expect to find the fine fragrance of
duck, this thing should be allowed." No adapters, no changing the base
classes, no meddling with method resolution order, just adding a
trait. The trait in this case is like an access pass - just an extra
thing worn around the neck that will grant you access to certain doors
and pathways. It doesn't change who you are or how you accomplish your
job.

MTAAFT (Maybe There's Another Acronym For This)?

-- 
Jeff Shell

From pje at telecommunity.com  Sat May 12 19:55:12 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 12 May 2007 13:55:12 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <46451BB7.9030703@canterbury.ac.nz>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070509205655.622A63A4061@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>
	<20070511162803.5E70F3A4061@sparrow.telecommunity.com>
	<fb6fbf560705111027g21bf5299xcafe0cad52b8f6a5@mail.gmail.com>
	<20070511184927.7A2043A4061@sparrow.telecommunity.com>
	<46451BB7.9030703@canterbury.ac.nz>
Message-ID: <20070512180213.EEDC93A4088@sparrow.telecommunity.com>

At 01:43 PM 5/12/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
> > At 01:27 PM 5/11/2007 -0400, Jim Jewett wrote:
> >>If there are two registrations for the same selection criteria, how
> >>can the user resolve things?
>
>But what if there's *already* an @around method being
>used? Then you need an @even_more_around method. Etc
>ad infinitum?

Yep, so simple things are simple, and complex things are 
possible.  That's the Python Way(tm).  :)

To put your comment in another perspective, "but what if somebody 
already defined a method in their class that I want to change?  Then 
I need to subclass it.  And if somebody wants to change that they 
have to subclass *that*, etc. ad infinitum?  Clearly classes are too 
complicated!"  :)

In practice, @around is mostly used for application-defined special 
cases, and there is no higher authority than the application who 
needs to override things.  If a library needs special combinators 
internally, it's better off making them lower-than-@around 
precedence.  Normal, before, and after methods are usually adequate 
for libraries.  (Aside from special-purpose combinators like the 
@discount example.)


From stargaming at gmail.com  Sat May 12 20:12:38 2007
From: stargaming at gmail.com (Stargaming)
Date: Sat, 12 May 2007 20:12:38 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
Message-ID: <f2502n$92j$1@sea.gmane.org>

Guillaume Proux schrieb:
> Dear all,
> 
[snip]
> 
> In this respect, I strongly believe that support non-ASCII identifiers
> as proposed by PEP3131 would improve a number of things:
> - discussion and uptake of python in "non-ascii" countries

While still separating them from ascii-countries. They would start 
writing programs that expose foreign-phrased APIs, but we would be 
unable to use them because we couldn't even type a single word!

> - ability for children to learn programming in their own language (I
> started programming at 7 years old and would have been very disturbed
> if I could not use my own language to type in programs)

AFAIK, allowing non-ascii identifiers would still *not* translate 
python. They would still have to struggle with every part of python that 
is builtin, i.e. builtins (you could let non-ascii identifiers reference 
them, though) and keywords. Better come up with some proposal to 
translate python (perhaps PyPy could do something here?) or all 
python-scripts (I think a translator could do its job here) to improve 
the situation.

> - increase of the number of new "interesting" packages from non-ascii countries

As stated above, we could not use them though. Bad deal, if you ask me!

> - ability for local programmers and local companies to provide
> "bridges" between international (english) APIs and local APIs.

I don't get the improvement offered by this one. We should *allow* 
non-ascii identifiers to **require** wrappers?

> - Increase the number of python users (from 7 to 77 years old)

Works in English, too.

> 
> In my humble opinion, now that UTF8 is accepted as the standard source
> code encoding, it is very difficult to understand why we should start
> putting restrictions on the kind of identifiers that are used (which
> would force people to comment line by line as they do now!).

No, we do not restrict them, we simply do not allow them (which is a huge 
difference here). UTF-8 will be allowed (*and* enforced by default) as a 
file encoding, i.e. strings and comments will be affected. I don't see 
the real restriction here. Correct me please, if I'm wrong.

> 
> When I am programming in Python, I am VERY DISTURBED when the code I
> write contains much comment. It needs to be readable just by glancing
> at it.

OTOH, I cannot glance at japanese code and know what it means. So it is 
better that the japanese developer names it badly but explains it than 
that I have to consult a dictionary.

> 
> However, for most of the people who are core python developers, you
> should ask what is the typical reading speed for "ascii" characters
> for a e.g. standard Japanese pupil. You would be very surprised how
> slow that is. In my opinion (after leaving in Japan for quite a bit),
> people are very slow to read ASCII characters and this definitely
> restrain their programming productivity and expressiveness.

See above, at least *my* reading speed for japanese text tends to zero 
(if not less!).

> 
> Of course, for things like "standard libraries", I think that
> self-regulation and project based regulation will impose ASCII
> charsets for the base libraries and APIs but i really believe that
> letting people use their own charset to express themself will REALLY
> give them the productivity boost they would deserve from python.

They're free to express their thoughts in comments, today, still 
separating them from ascii-developers.

> 
> Let me know if you have any question.
> 
> Regards,
> 
> Guillaume

I do not think allowing people to program in *their* language would 
enhance integration. It would just split the python community *even* 
more. I like communicating with non-native English speakers much more 
than not communicating with them at all because they got their own 
language in there.
Additionally, I think the reason for rejection of this PEP is the same 
one that applied to all those "Let the user extend Python's grammar at 
runtime" proposals -- one developer would have to learn a completely new 
language just to understand a program.
To communicate, we just have to find (or agree on) a common point 
between devs. Python is English, that's a matter of fact IMO. It is the 
common language that makes us a community and *one* language.

I'm, well, -1 on this (even though I don't know if I got a voice here).

-- 
Greetings,
Stargaming


From pje at telecommunity.com  Sat May 12 20:29:17 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 12 May 2007 14:29:17 -0400
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <88d0d31b0705121058n7a92af7dn178f220dba922a91@mail.gmail.co
 m>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<20070509015553.9C6843A4061@sparrow.telecommunity.com>
	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
	<46424718.20006@benjiyork.com>
	<20070510012836.16D0D3A4061@sparrow.telecommunity.com>
	<464318D0.2000109@benjiyork.com>
	<20070510153507.205EB3A4061@sparrow.telecommunity.com>
	<88d0d31b0705121058n7a92af7dn178f220dba922a91@mail.gmail.com>
Message-ID: <20070512182731.977973A4088@sparrow.telecommunity.com>

At 11:58 AM 5/12/2007 -0600, Jeff Shell wrote:
>Actually, I don't know what LBYL and EFTP (or whatever that other one
>is) mean in this context. This is the first time I've heard, or at
>least paid attention to, those acronyms. In this context anyways.
>
>If you could explain what this really means, and KTATAM (Keep The
>Acronyms To A Minimum), I would appreciate it. I recognize the
>arguments I've made seem to go behind LBYL, but that was mostly chosen
>because that's what you said zope.interface did or was. And
>gul-darnit, I like zope.interface.

Checking whether an object provides an interface is LBYL.  Simply 
proceeding as if it does (or adapting it to a desired interface) is EAFP.

zope.interface can certainly be used in either style, but when it was 
first created, LBYL is *all* it did.  Adaptation was added later.
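
A minimal sketch of the two styles, with a hypothetical quack() method
standing in for whatever attribute you actually care about:

    def poke_lbyl(duck):
        # LBYL: check first, then act.
        if hasattr(duck, "quack"):
            duck.quack()

    def poke_eafp(duck):
        # EAFP: just act, and handle the failure if it comes.
        try:
            duck.quack()
        except AttributeError:
            pass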


> > Well, "back in the day", before PyProtocols was written, I discovered
> > PEP 246 adaptation and began trying to convince Jim Fulton that
> > adaptation beat the pants off of using if-then's to do "implements"
> > testing.  His argument then, IIRC, was that interface verification
> > was more important.  I then went off and wrote PyProtocols in large
> > part (specifically the large documentation part!) to show him what
> > could be done using adaptation as a core concept.
>
>I think it's beneficial to have both. But I agree, it's usually better
>to program against adaptation. It provides more flexibility. I think
>the 'hasattr()' syndrome still hangs over many of us, however. We're
>used to looking at the piece of the duck we're interested in more than
>trying to see if we can put something into a duck suit (or better yet
>- Duck Soup!)
>
>But the 'provides / provided by' piece is still important to me.
>Adaptation isn't *always* needed or useful.

That's actually an illusion created by the economic impact of using 
interfaces and adapters instead of generic functions.  In languages 
with generic functions, nobody bothers creating separate "trait" 
systems, apart from designating groups of GFs that "go 
together".  (Haskell typeclasses and Dylan modules, for example), 
because GFs are so easy and elementary that it seems like part of the 
normal development flow.

Interfaces+Adaptation are such a clumsy way of doing the same thing, 
that it often seems easier to get by *checking* for an existing 
interface, instead of defining a new one and adapting to it.  (Or 
just using an overload.)

But in the early days of PyProtocols, I soon realized that checking 
for an interface was *always* an antipattern, no matter how 
temptingly convenient it might appear to be to rationalize an 
interface check at the time.  You can get away with it 
sometimes...  but never for long, if your code is being reused.


>I don't know if that's LBYL, EYV (Eat Your Vegetables), LBWBCTS (Look
>Both Ways Before Crossing The Street), or what. I think it's just a
>way of saying "I happen to know that this thing smells like a duck. It
>doesn't say that it smells like a duck, but I know it smells like a
>duck. And for everywhere that I expect to find the fine fragrance of
>duck, this thing should be allowed."

Note that this is still one level of abstraction away from your goal: 
to get some behavior.  Instead of checking for duckness or 
quackability, *just perform the "quack" operation*.

If you want to know about quackability because you intend to do 
something *else* with the object, then just do that "something else".

The point of generic functions is that the only reason it's worth 
knowing something about a "trait" is to select *how* you will 
accomplish something.  So just accomplish the something, instead of 
micromanaging.

Remember the bad old days before OO?  The big step forward was to get 
rid of all those switch/cases in your functions, replacing them with 
method dispatching.  The second big step forward is to get rid of the 
type/hasattr/interface/role/trait testing, and replace it with 
generic functions.


From guido at python.org  Sat May 12 20:53:58 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 12 May 2007 11:53:58 -0700
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <d11dcfba0705121003r4d705b02p798e9b0422101e54@mail.gmail.com>
References: <1178551661.8251.16.camel@antoine-ubuntu>
	<d11dcfba0705121003r4d705b02p798e9b0422101e54@mail.gmail.com>
Message-ID: <ca471dc20705121153x3701cef9ied415f4d88f0ba84@mail.gmail.com>

On 5/12/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> And here's a version that doesn't lose updates to the finalizer attributes:
>
>     http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/519635
>
> It replaces enable_finalizer() with a class attribute __finalattrs__.
> From __finalize__, all class attributes and methods are accessible, as
> are any instance attributes specified by __finalattrs__. Guido's
> BufferedWriter example looks like::
>
>     class BufferedWriter(Finalized):
>         __finalattrs__ = 'buffer', 'raw'
>         ...
>         def flush(self):
>             self.raw.write(self.buffer)
>             self.buffer = b""
>
>         def __finalize__(self):
>             self.flush()

But can I subclass it and in the subclass override (extend) flush()? E.g.

class MyWriter(BufferedWriter):
  def flush(self):
    super(MyWriter, self).flush()  # Or super.flush() once PEP xxx is accepted
    print("Feel free to unplug the disk now")

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jimjjewett at gmail.com  Sat May 12 21:03:05 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sat, 12 May 2007 15:03:05 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070512180213.EEDC93A4088@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>
	<20070511162803.5E70F3A4061@sparrow.telecommunity.com>
	<fb6fbf560705111027g21bf5299xcafe0cad52b8f6a5@mail.gmail.com>
	<20070511184927.7A2043A4061@sparrow.telecommunity.com>
	<46451BB7.9030703@canterbury.ac.nz>
	<20070512180213.EEDC93A4088@sparrow.telecommunity.com>
Message-ID: <fb6fbf560705121203n7762f8dfr8a2b2cea1aa6f6f2@mail.gmail.com>

On 5/12/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 01:43 PM 5/12/2007 +1200, Greg Ewing wrote:

> In practice, @around is mostly used for application-defined special
> cases, and there is no higher authority than the application who
> needs to override things.  If a library needs special combinators
> internally, it's better off making them lower-than- at around
> precedence.  Normal, before, and after methods are usually adequate
> for libraries.  (Aside from special-purpose combinators like the
> @discount example.)

(1)  Would it be reasonable to say this in the PEP?

(2)  Would it be reasonable to leave out (or at least, leave for
another PEP) the extension methods like discount?

-jJ

From talin at acm.org  Sat May 12 21:07:24 2007
From: talin at acm.org (Talin)
Date: Sat, 12 May 2007 12:07:24 -0700
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
In-Reply-To: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
References: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
Message-ID: <4646106C.1090608@acm.org>

Guido van Rossum wrote:
> Here's a new version of the ABC PEP. A lot has changed; a lot remains.
> I can't give a detailed overview of all the changes, and a diff would
> show too many spurious changes, but some of the highlights are:

Some general comments on the PEP:

Compared to the previous version, this version of the PEP is closer in 
spirit to the various other competing proposals for 'post-hoc object 
taxonomies', although some important differences remain. I'd like to 
point out both the similarities and the differences, especially the 
latter as they form the basis for further discussion and possibly evolution.

First, the ways in which the new PEP more closely resembles its competitors:

The new version of the PEP is more strongly oriented towards post-hoc 
classification of objects, in other words, putting classes into 
categories that may not have existed when the classes were created.

It also means that there is no longer a requirement that categories for 
built-in objects have an official Python "seal of approval". Anyone can 
come along and re-categorize the built-ins however they like; and they 
can do so in a way that doesn't interfere with any previously existing 
categories. There will of course be certain 'standard' categories (as 
outlined in the PEP), but these standard categories do not have any 
privileged status, unlike the ones in the earlier versions of the PEP.

It means that if we make a mistake defining the categories (or more 
likely, if we fail to address someone's needs), it is possible for 
someone else to come along and repair that mistake by defining a 
competing taxonomy.

The categorization relationships are now stored preferentially in a map 
which is external to the objects being categorized, allowing objects to 
be recategorized without mutating them. This is similar to the behavior 
of Collin Winter's 'roles' proposal and some others.

(For the remainder of this document, I am going to use the term "dynamic 
inheritance" to describe specifying inheritance via Guido's special 
methods, as opposed to "traditional inheritance", what we have now.)

Now, on to the differences:

The key differentiator between Guido's proposal and the others can be 
summarized by the following question: "Should the mechanism which 
defines the hierarchy of classes be the same as the mechanism that 
defines the hierarchy of categories?"

To put it another way, is a "category" (or "interface" or "role" or 
whatever term you want to use) a "class" in the normal sense, or is it 
some other thing?

In the terminology of Java and C# and other languages which support 
interfaces, the term 'interface' is explicitly defined as something that 
is 'not a class'. A class is a unit of implementation, and interfaces 
contain no implementation. [1]

In these object classification systems, there are three different 
relationships we care about:

    -- The normal inheritance relationship between classes.

    -- The specification of which classes belong to which categories.

    -- The relationship between the categories themselves.

(Note that in some systems, such as Raymond Hettinger's attribute-based 
proposal, the third type of relationship doesn't exist - each category 
is standalone, although you can simulate the effects of a category 
hierarchy by putting objects in multiple categories. Thus, there's no 
MutableSequence category, but you can place an object in both Mutable 
and Sequence and infer from there.)

Given these different types of relationships, the question to be asked 
is, should all of these various things use the same mechanism and the 
same testing predicate (isinstance), or should they be separate mechanisms?

I'll try to summarize some of the pros and cons, although this should 
not be considered a comprehensive list:

Arguments in favor of reusing 'isinstance':

    -- It's familiar and easy to remember.

    -- Not everyone considers interfaces and implementations to be 
distinct things, at least not in Python where there are no clear 
boundaries enforced by the language (as can be seen in Guido's desire to 
have some partial implementation in the ABCs.)

    -- Declaring overloads in PJE's generic function proposal is cleaner 
if we only have to worry about annotating arguments for types rather 
than types + interfaces. In other words, we would need two different 
kinds of annotations for a given method signature, and a way to 
discriminate between them. If categories are just base classes, then we 
only have one dispatch type to worry about. [2]

Arguments in favor of a different mechanism:

    -- Mixing different kinds of inheritance mechanisms within a single 
object might lead to some strange inconsistencies. For example, if you 
have two classes, one which derives from an ABC using traditional 
inheritance, and one which derives using dynamic inheritance, they may 
behave differently.

    (For example, the @abstractmethod decorator only affects classes 
that derive from the ABC using traditional inheritance, not dynamic 
inheritance. Some folks may find this inconsistency objectionable.)

    -- For some people, an interface is not the same thing as a class, 
and should not be treated as such. In particular, there is a desire by 
some people to enforce a stricter separation between interface and 
implementation.

    -- Forcing them to be separate allows you to make certain 
simplifying assumptions about the class hierarchy. If categories can 
relate to each other via traditional inheritance, and if I want to trace 
upwards from a given class to find all interfaces that it implements, 
then I may have to trace both traditional and dynamic inheritance links. 
If categories can only relate via some special scheme, however, then I 
can simply do my tracing in two passes: First find all base classes 
using traditional inheritance, and then given that set, find all 
categories using dynamic inheritance. In other words, I don't have to 
keep switching inheritance types as I trace.

---

[1] On the other hand, both C# and Java allow interfaces to be tested by 
their equivalent of "isinstance" so there is some conflation of the two. 
On the gripping hand, however, C# and Java are both statically typed 
languages, where things like "isinstance" really mean "istype", whereas 
in Python "isinstance" really means something more like 
"isimplementation". So there is no exact equivalent to what Python does 
here.

[2] I should mention that one of my personal criteria for evaluating 
these proposals is the level of synergy achieved with PJE's PEP. Now, 
PJE may claim that he doesn't need interfaces or ABCs or anything, but I 
believe that his PEP benefits considerably by the existence of ABCs, 
because it means that you need far fewer overloads in an ABC world. 
Thus, I can overload based on "Sequence" rather than having to have 
separate overloads for list, tuple, and various user-created sequence 
types. (Although I can if I really need to.)

I would go further, and say that these object taxonomies should only go 
so far as to provide what is needed to obtain that synergy; Any features 
beyond that are mostly superfluous. But that's just my personal opinion.

-- Talin

From talin at acm.org  Sat May 12 21:09:52 2007
From: talin at acm.org (Talin)
Date: Sat, 12 May 2007 12:09:52 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705121203n7762f8dfr8a2b2cea1aa6f6f2@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>	<20070510161417.192943A4061@sparrow.telecommunity.com>	<464395AB.6040505@canterbury.ac.nz>	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>	<20070511162803.5E70F3A4061@sparrow.telecommunity.com>	<fb6fbf560705111027g21bf5299xcafe0cad52b8f6a5@mail.gmail.com>	<20070511184927.7A2043A4061@sparrow.telecommunity.com>	<46451BB7.9030703@canterbury.ac.nz>	<20070512180213.EEDC93A4088@sparrow.telecommunity.com>
	<fb6fbf560705121203n7762f8dfr8a2b2cea1aa6f6f2@mail.gmail.com>
Message-ID: <46461100.9080609@acm.org>

Jim Jewett wrote:
> On 5/12/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>> At 01:43 PM 5/12/2007 +1200, Greg Ewing wrote:
> 
>> In practice, @around is mostly used for application-defined special
>> cases, and there is no higher authority than the application who
>> needs to override things.  If a library needs special combinators
>> internally, it's better off making them lower-than- at around
>> precedence.  Normal, before, and after methods are usually adequate
>> for libraries.  (Aside from special-purpose combinators like the
>> @discount example.)
> 
> (1)  Would it be reaonable to say this in the PEP?
> 
> (2)  Would it be reasonable to leave out (or at least, leave for
> another PEP) the extension methods like discount?

There ought to be a way to preserve with each PEP a separate document 
containing a more lengthy discussion of the rationales and consequences. 
Similar to the way that the _Federalist Papers_ is often used to 
interpret the meaning of the U.S. Constitution.

> -jJ


From guido at python.org  Sat May 12 21:19:58 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 12 May 2007 12:19:58 -0700
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <88d0d31b0705121058n7a92af7dn178f220dba922a91@mail.gmail.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<20070509015553.9C6843A4061@sparrow.telecommunity.com>
	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
	<46424718.20006@benjiyork.com>
	<20070510012836.16D0D3A4061@sparrow.telecommunity.com>
	<464318D0.2000109@benjiyork.com>
	<20070510153507.205EB3A4061@sparrow.telecommunity.com>
	<88d0d31b0705121058n7a92af7dn178f220dba922a91@mail.gmail.com>
Message-ID: <ca471dc20705121219o56909306qadf513c531447135@mail.gmail.com>

On 5/12/07, Jeff Shell <eucci.group at gmail.com> wrote:
> I like that the interface hierarchy is different than an
> implementation hierarchy. I like that it's easier to test for
> interface provision than it is to use isinstance() -
> `IFoo.providedBy(obj)` often works regardless of whether 'obj' is a
> proxy or wrapper, and without tampering with `isinstance()`. I know
> that there's been talk of having ``__isinstance()__`` and
> ``__issubclass()__``, which could be used to take care of the
> proxy/wrapper problem. But I haven't formed an opinion about how I
> feel about that.
>
> I like the Roles/Traits side of zope.interface because I can declare
> that information about third party products. For example, I was able
> to add some 'implements' directives to a SQLAlchemy 'InstrumentedList'
> class - basically I said that it supported the common `ISequence`
> interface. Which I recognize that in this particular scenario, if that
> role/trait or abstract base class was built in, than I wouldn't have
> had to do that (since it is based on a common Python type spec). Still
> though - it doesn't matter whether `InstrumentedList` derives from
> `list` or `UserList` or implements the entire sequence API directly.
> The Trait could be assigned independent of implementation, and could
> be done in another product without affecting any internals of
> SQLAlchemy: I didn't have to make a subclass that SQLAlchemy wouldn't
> know to instantiate. I didn't have to write an adapter. I just had to
> say "I happen to know that instances of this class will have this
> trait".
>
> I don't know if that's LBYL, EYV (Eat Your Vegetables), LBWBCTS (Look
> Both Ways Before Crossing The Street), or what. I think it's just a
> way of saying "I happen to know that this thing smells like a duck. It
> doesn't say that it smells like a duck, but I know it smells like a
> duck. And for everywhere that I expect to find the fine fragrance of
> duck, this thing should be allowed." No adapters, no changing the base
> classes, no meddling with method resolution order, just adding a
> trait. The trait in this case is like an access pass - just an extra
> thing worn around the neck that will grant you access to certain doors
> and pathways. It doesn't change who you are or how you accomplish your
> job.

Please have a look at the latest version (updated yesterday) of PEP
3119. Using the classes there, you can say

  from collections import Sequence
  Sequence.register(InstrumentedList)

From this point, issubclass(InstrumentedList, Sequence) will be true
(and likewise for instances of it and isinstance(x, Sequence)). But
InstrumentedList's __mro__ and __bases__ are unchanged. This is pretty
close to what you expect from a Zope interface, except you can also
subclass Sequence if you want to, and a later version of SQLAlchemy
could subclass InstrumentedList from Sequence. (The register() call
would then be redundant, but you won't have to remove it -- it will
act as a no-op if the given subclass relationship already holds.)
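
A minimal runnable sketch of that behaviour, with a stand-in class in
place of SQLAlchemy's InstrumentedList (import path per current Python's
collections.abc):

    from collections.abc import Sequence

    class FakeInstrumentedList:
        def __getitem__(self, index):
            raise IndexError(index)
        def __len__(self):
            return 0

    Sequence.register(FakeInstrumentedList)

    assert issubclass(FakeInstrumentedList, Sequence)
    assert isinstance(FakeInstrumentedList(), Sequence)
    # __mro__ and __bases__ of FakeInstrumentedList are unchanged.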

A subclass of Sequence can behave either as an implementation class
(when it provides implementations of all required methods) or as
another interface (if it adds one or more new abstract method). You
can think of Sequence and its brethren as mix-ins -- they provide some
default implementations of certain methods, and abstract definitions
of others (the "essential" ones; e.g. Sequence makes __len__ and
__getitem__ abstract but __iter__ has a concrete default
implementation).
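
To make that concrete, here is a rough, self-contained sketch of how the
proposed machinery is meant to behave. Since the PEP 3119 classes aren't in
any released Python yet, the Sequence below is a cut-down illustration
written directly against the abc module the PEP describes, and
InstrumentedList is just a stand-in class, not SQLAlchemy's:

    import abc

    class Sequence(metaclass=abc.ABCMeta):
        @abc.abstractmethod
        def __len__(self): ...
        @abc.abstractmethod
        def __getitem__(self, index): ...
        def __iter__(self):    # concrete default built on the abstract methods
            for i in range(len(self)):
                yield self[i]

    class InstrumentedList:    # stand-in: implements the API, no special bases
        def __init__(self, items):
            self._items = list(items)
        def __len__(self):
            return len(self._items)
        def __getitem__(self, index):
            return self._items[index]

    Sequence.register(InstrumentedList)   # the one line a third party would add

    assert issubclass(InstrumentedList, Sequence)
    assert isinstance(InstrumentedList([1, 2, 3]), Sequence)
    assert Sequence not in InstrumentedList.__mro__   # bases/mro untouched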

Phillip has told me that this is transparent to his GF machinery -- if
you overload a GF on Sequence, and InstrumentedList is a subclass of
Sequence (whether through registration or subclassing), then that
version of the GF will be used for InstrumentedList (unless there's a
more specific overloaded version of course).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Sat May 12 21:26:10 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 12 May 2007 15:26:10 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705121203n7762f8dfr8a2b2cea1aa6f6f2@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>
	<20070511162803.5E70F3A4061@sparrow.telecommunity.com>
	<fb6fbf560705111027g21bf5299xcafe0cad52b8f6a5@mail.gmail.com>
	<20070511184927.7A2043A4061@sparrow.telecommunity.com>
	<46451BB7.9030703@canterbury.ac.nz>
	<20070512180213.EEDC93A4088@sparrow.telecommunity.com>
	<fb6fbf560705121203n7762f8dfr8a2b2cea1aa6f6f2@mail.gmail.com>
Message-ID: <20070512192425.84E3C3A4088@sparrow.telecommunity.com>

At 03:03 PM 5/12/2007 -0400, Jim Jewett wrote:
>On 5/12/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>At 01:43 PM 5/12/2007 +1200, Greg Ewing wrote:
>
>>In practice, @around is mostly used for application-defined special
>>cases, and there is no higher authority than the application who
>>needs to override things.  If a library needs special combinators
>>internally, it's better off making them lower-than- at around
>>precedence.  Normal, before, and after methods are usually adequate
>>for libraries.  (Aside from special-purpose combinators like the
>>@discount example.)
>
>(1)  Would it be reasonable to say this in the PEP?

Sure.


>(2)  Would it be reasonable to leave out (or at least, leave for
>another PEP) the extension methods like discount?

The emerging consensus appears to be that everything relating to 
method combination and Aspects should be a second PEP, much like the 
Python 2.2 type system overhaul was separated into a 
mro/metaclass-oriented PEP and a descriptor-oriented PEP, even though 
the two were  quite interrelated.

So, examples for custom method combination, as well as best-practices 
for the standard combinators' uses would reasonably both go in the 
method-combination-and-aspects PEP.


From steven.bethard at gmail.com  Sat May 12 21:28:54 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 12 May 2007 13:28:54 -0600
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <ca471dc20705121153x3701cef9ied415f4d88f0ba84@mail.gmail.com>
References: <1178551661.8251.16.camel@antoine-ubuntu>
	<d11dcfba0705121003r4d705b02p798e9b0422101e54@mail.gmail.com>
	<ca471dc20705121153x3701cef9ied415f4d88f0ba84@mail.gmail.com>
Message-ID: <d11dcfba0705121228m7134ed5fmd0d74d5d54682869@mail.gmail.com>

On 5/12/07, Guido van Rossum <guido at python.org> wrote:
> On 5/12/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > And here's a version that doesn't lose updates to the finalizer attributes:
> >
> >     http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/519635
> >
> > It replaces enable_finalizer() with a class attribute __finalattrs__.
> > From __finalize__, all class attributes and methods are accessible, as
> > are any instance attributes specified by __finalattrs__. Guido's
> > BufferedWriter example looks like::
> >
> >     class BufferedWriter(Finalized):
> >         __finalattrs__ = 'buffer', 'raw'
> >         ...
> >         def flush(self):
> >             self.raw.write(self.buffer)
> >             self.buffer = b""
> >
> >         def __finalize__(self):
> >             self.flush()
>
> But can I subclass it and in the subclass override (extend) flush()? E.g.
>
> class MyWriter(BufferedWriter):
>   def flush(self):
>     super(MyWriter, self).flush()  # Or super.flush() once PEP xxx is accepted
>     print("Feel free to unplug the disk now")

Yep. The 'self' passed to __finalize__ is still an instance of the
same class (e.g. BufferedWriter or MyWriter). So inheritance works
normally:

>>> class BufferedWriter(Finalized):
...     __finalattrs__ = 'buffer', 'raw'
...     def __init__(self):
...         self.buffer = ''
...         self.raw = 'raw'
...     def flush(self):
...         print 'writing:', self.buffer, 'to', self.raw
...         self.buffer = ''
...     def __finalize__(self):
...         self.flush()
...
>>> class MyWriter(BufferedWriter):
...     def flush(self):
...         super(MyWriter, self).flush()
...         print 'feel free to unplug the disk now'
...
>>> w = MyWriter()
>>> del w
writing:  to raw
feel free to unplug the disk now
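
For anyone who doesn't want to click through to the recipe, here is a rough
sketch (in py3k-style syntax) of the kind of machinery involved -- not the
recipe itself, just an illustration of the idea: each name in __finalattrs__
becomes a class-level descriptor that mirrors its value onto a hidden "core"
object, and a weakref callback runs __finalize__ against that core once the
real instance is gone. Names like _FinalAttr and __core__ are invented here:

    import weakref

    _refs = set()    # keeps the weak references (and their callbacks) alive

    class _FinalAttr:
        """Descriptor that stores its value on the hidden core object, so the
        finalizer still sees up-to-date values after the instance is gone."""
        def __init__(self, name):
            self.key = '_stored_' + name   # plain dict key, so running methods
                                           # on the core won't recurse into here
        def __get__(self, obj, objtype=None):
            if obj is None:
                return self
            try:
                return obj.__dict__['__core__'].__dict__[self.key]
            except KeyError:
                raise AttributeError(self.key)
        def __set__(self, obj, value):
            obj.__dict__['__core__'].__dict__[self.key] = value

    class FinalizedMeta(type):
        def __new__(mcls, name, bases, ns):
            cls = super().__new__(mcls, name, bases, ns)
            for attr in ns.get('__finalattrs__', ()):
                setattr(cls, attr, _FinalAttr(attr))   # one descriptor per name
            return cls

        def __call__(cls, *args, **kwargs):
            obj = cls.__new__(cls)
            # the core is a bare second instance of the same class: it can run
            # the class's methods but carries only the mirrored attributes
            core = object.__new__(cls)
            obj.__dict__['__core__'] = core
            core.__dict__['__core__'] = core   # so methods also work on the core

            def _callback(ref, core=core):
                _refs.discard(ref)
                core.__finalize__()            # never touches the dead instance

            _refs.add(weakref.ref(obj, _callback))
            obj.__init__(*args, **kwargs)
            return obj

    class Finalized(metaclass=FinalizedMeta):
        __finalattrs__ = ()
        def __finalize__(self):
            pass

With something along these lines, the BufferedWriter/MyWriter session above
behaves as shown: __finalize__ sees the current values of 'buffer' and 'raw'
through the core, and subclasses pick up the descriptors automatically.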

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From guido at python.org  Sat May 12 22:03:59 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 12 May 2007 13:03:59 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070512192425.84E3C3A4088@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<fb6fbf560705110646h7a990f90p5cfb3797541f7ccd@mail.gmail.com>
	<20070511162803.5E70F3A4061@sparrow.telecommunity.com>
	<fb6fbf560705111027g21bf5299xcafe0cad52b8f6a5@mail.gmail.com>
	<20070511184927.7A2043A4061@sparrow.telecommunity.com>
	<46451BB7.9030703@canterbury.ac.nz>
	<20070512180213.EEDC93A4088@sparrow.telecommunity.com>
	<fb6fbf560705121203n7762f8dfr8a2b2cea1aa6f6f2@mail.gmail.com>
	<20070512192425.84E3C3A4088@sparrow.telecommunity.com>
Message-ID: <ca471dc20705121303y5bce459bpb5b083bb90121306@mail.gmail.com>

On 5/12/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> The emerging consensus appears to be that everything relating to
> method combination and Aspects should be a second PEP, [...]

Yes, please.

I've just finished reading linearly through the version of PEP 3124
that's currently online, and the farther I got into the method
combining and Aspects section, the stronger the feeling I had that
there's just too much stuff here, and that it's all quite esoteric.

Some other feedback on the PEP (I will be awaiting your split-up
version before commenting in detail):

- Please supply a References section, linking to clear explanations
(and sometimes source code) of the various systems you mention (e.g.
Haskell typeclasses, CLOS, AspectJ, but also PEAK, RuleDispatch and so
on). Even in this age of search engines you owe your reader this
service. (And in the past I've had a helluva time finding things like
those "656 lines" in peak.rules.core!) Every time you mention a
concept that I don't know very well without a reference, I feel a
little stupider, and less favorably inclined towards the PEP. I
imagine that's not just my response; nobody likes reading something
that makes them feel stupid.

- Please provide motivating use cases beyond the toy examples for each
proposed feature. I am really glad that you have toy examples, because
they help tremendously to understand how a feature works. But I am
often stuck with the question "why would I need this"?

- Expect pushback on your assumption that every function or method
should be fair game for overloading. Requiring explicit tagging of the
base or default implementation makes things a lot more palatable and
predictable for those of us who are still struggling to accept GFs.

- Some of the examples of method overloading in classes look really
hard to follow and easy to get wrong if one deviates from the cookbook
examples. (Though this may be limited to the "advanced" PEP.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Sun May 13 03:40:11 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 13 May 2007 13:40:11 +1200
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <ca471dc20705121219o56909306qadf513c531447135@mail.gmail.com>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<20070509015553.9C6843A4061@sparrow.telecommunity.com>
	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
	<46424718.20006@benjiyork.com>
	<20070510012836.16D0D3A4061@sparrow.telecommunity.com>
	<464318D0.2000109@benjiyork.com>
	<20070510153507.205EB3A4061@sparrow.telecommunity.com>
	<88d0d31b0705121058n7a92af7dn178f220dba922a91@mail.gmail.com>
	<ca471dc20705121219o56909306qadf513c531447135@mail.gmail.com>
Message-ID: <46466C7B.5010006@canterbury.ac.nz>

Guido van Rossum wrote:

> From this point, issubclass(InstrumentedList, Sequence) will be true
> (and likewise for instances of it and isinstance(x, Sequence)). But
> InstrumentedList's __mro__ and __bases__ are unchanged.

I think I've figured out what bothers me about this
kind of overloading of isinstance(). Normally if
isinstance(x, C) is true, we expect that a method
call on x can at least potentially invoke a method
of class C. But if isinstance(x, C) can be true
even if C doesn't appear in the mro of x, this is
no longer the case.

This isn't so much of a worry when x is acting as a
proxy for C, since it's probably forwarding method
calls to a real instance of C somewhere. But using
it in a more general way seems strange.

--
Greg

From greg.ewing at canterbury.ac.nz  Sun May 13 03:52:42 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 13 May 2007 13:52:42 +1200
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <d11dcfba0705121228m7134ed5fmd0d74d5d54682869@mail.gmail.com>
References: <1178551661.8251.16.camel@antoine-ubuntu>
	<d11dcfba0705121003r4d705b02p798e9b0422101e54@mail.gmail.com>
	<ca471dc20705121153x3701cef9ied415f4d88f0ba84@mail.gmail.com>
	<d11dcfba0705121228m7134ed5fmd0d74d5d54682869@mail.gmail.com>
Message-ID: <46466F6A.30302@canterbury.ac.nz>

Steven Bethard wrote:

> Yep. The 'self' passed to __finalize__ is still an instance of the
> same class (e.g. BufferedWriter or MyWriter). So inheritance works
> normally:

However, if the overridden method uses any attributes
not mentioned in the original __finalattrs__, they
will need to be added to it somehow.

It might be useful if the metaclass gathered up the
contents of __finalattr__ from the class and all its
base classes. Then a class could just list its
own needed attributes without having to worry about
those needed by its base classes.

This also suggests that some care will be needed when
overriding methods of a class that uses this recipe.
You need to know whether the method can be called from
the finalizer, so you can be sure to include the
appropriate attributes in __finalattrs__.

--
Greg

From steven.bethard at gmail.com  Sun May 13 04:18:09 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 12 May 2007 20:18:09 -0600
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <46466F6A.30302@canterbury.ac.nz>
References: <1178551661.8251.16.camel@antoine-ubuntu>
	<d11dcfba0705121003r4d705b02p798e9b0422101e54@mail.gmail.com>
	<ca471dc20705121153x3701cef9ied415f4d88f0ba84@mail.gmail.com>
	<d11dcfba0705121228m7134ed5fmd0d74d5d54682869@mail.gmail.com>
	<46466F6A.30302@canterbury.ac.nz>
Message-ID: <d11dcfba0705121918k1dc5bb1dib998bb35130a7cb6@mail.gmail.com>

On 5/12/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Steven Bethard wrote:
>
> > Yep. The 'self' passed to __finalize__ is still an instance of the
> > same class (e.g. BufferedWriter or MyWriter). So inheritance works
> > normally:
>
> However, if the overridden method uses any attributes
> not mentioned in the original __finalattrs__, they
> will need to be added to it somehow.
>
> It might be useful if the metaclass gathered up the
> contents of __finalattr__ from the class and all its
> base classes. Then a class could just list its
> own needed attributes without having to worry about
> those needed by its base classes.

You already don't need to list the attributes from the base classes.
The __finalattrs__ are converted into class level descriptors, so if
class D inherits from class C, it has the __finalattrs__ descriptors
for both classes.

Did you try it and find that it didn't work?

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From gproux+py3000 at gmail.com  Sun May 13 05:50:30 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Sun, 13 May 2007 12:50:30 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <f2502n$92j$1@sea.gmane.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<f2502n$92j$1@sea.gmane.org>
Message-ID: <19dd68ba0705122050n45bec072s831c72270cf04bf4@mail.gmail.com>

Dear Stargaming,

On 5/13/07, Stargaming <stargaming at gmail.com> wrote:
> Guillaume Proux schrieb:

I see that the language you are most comfortable with is German.
Compared with the French (and even more so the Japanese), my impression
is that German people are very gifted in foreign languages and especially in
English...

> While still separating them from ascii-countries. They would start
> writing programs that expose foreign-phrased APIs but we would deny
> using them because we couldn't even type a single word!

If I rephrase your sentence above to use the local (e.g. Japanese)
view: "People in ascii countries, are writing programs that expose
foreign phrased APIs but we are denied using them because we cannot
even read a single word"

The situation right now is that each community in "non-ascii" countries
is rather small because they are denied writing ANY program at all.

Acceptance of 3131 would enable the following things:
1)  new contributors (including younger people who are not necessarily
able to deal with English) will start programming (in python)
2)  some of them will want to join the international community
3)  more programs, both in local and international communities



> AFAIK, allowing non-ascii identifiers would still *not* translate
> python. They would still have to struggle with every part of python that
> is builtin, i.e. builtins (you could let non-ascii identifiers reference

Your answer misses the point about the ability for children to learn
programming early.
In France, my experience was that it is very important to let children
use their own native vocabulary. Let me tell you about two things we
did in France regarding computer science teaching to young children around
the 1980s, which had a great influence on me and other children like
me:
1) LOGO programming: we could drive the turtle using simple French
words. That made a big impression on children. We would
spend hours playing with this. Now people just grab a Nintendo DS and
never approach computers with a "programming" mindset at that age.
2) Robots: we had a robotic arm that came with a custom programming
interface (in French). It was very challenging to optimize your
programs to achieve a given task (take the ball and drop it in the
glass) in the minimum of time and steps.

Without the ability to do all that in French at the age of 8 or 9,
we would probably not have enjoyed it as much.


> As stated above, we could not use them though. Bad deal, if you ask me!

They can't use what we give them here. Who is the loser in the deal
right now? Once again, we should not deny each language community the
chance to create its own package ecosystem. I believe really *good*
packages will always end up
having an i18n-ized version.

> I don't get the improvement offered by this one. We should *allow*
> non-ascii identifiers to **require** wrappers?

You are always taking the wrong side of the equation. By allowing
non-ascii chars in the mix, standard APIs will be able to be offered
in each local language, once again, for the local good.

> > - Increase the number of python users (from 7 to 77 years old)
> Works in English, too.

Do you know many young Japanese/Chinese children or elderly people
who are only speakers/readers/writers of their own language? Try
speaking to them in English just for fun, or worse, make them read
python code and ask them to explain to you what it means.

> No, we do not restrict them, we simply do not allow them (what is a huge
> difference here). UTF-8 will be allowed (*and* enforced by default) as a

Not allowing something which now becomes naturally possible is *not* a
restriction?

> file encoding, i.e. strings and comments will be affected. I don't see
> the real restriction here. Correct me please, if I'm wrong.

Imagine you would be born in a world where your alphabet is hardly
ever used in the computing world. I am sure you would have a much
harder time learning programming.

> OTOH, I cannot glance at japanese code and know what it means. So,
> better the japanese developer named it badly but explained it than
> requiring me to consult a dictionary.

I am talking about your own code, the code you might need to maintain for years.
Once again, you are looking at your own small world where it is "easy"
for you to *write* programs, if only because they use the character set
you have been immersed in since you were born.

> See above, at least *my* reading speed for japanese text tends to zero
> (if not less!).

And this is not the issue. Of course, if PEP3131 is accepted, there
will be some scripts which you won't be able to read. And that is
fine, because it will probably be some internally developed program
in a large Japanese company.

I am a strong believer that self regulation will happen for new
packages that could be interesting for the international python
community (remember that we are talking about new packages that would
never see the light of day without PEP3131)

> They're free to express their thoughts in comments, today, still
> separating them from ascii-developers.

Earlier you shrugged off the idea that it is important for people to
understand their code by glancing at it. And now you dismiss
their concern with, "Good enough for those guys to just do line by
line commenting". How nice.

> I do not think allowing people to program in *their* language would
> enhance integration. It would just split the python community *even*
> more. I like communicating with non-native English speakers much more

My point is that you cannot split a community... that does not even
exist *yet* because the entry barrier to the community is too high for
too many people in non-ascii countries. I am taking the long-term
view. Get people involved with Python today when they are 7-8 years
old, and in 10 years we will have strong community members from
non-ascii countries.

> To communicate, we just have to find (or agree on) a common point
> between devs. Python is English, that's a matter of fact IMO. It is the
> common language that makes us a community and *one* language.

Yes, and I don't think that PEP3131 will change anything about that fact,
but for each local community we should allow people to use their own
language (mostly as users).

The fact that I have seen NO comment on this issue from non-ascii devs
also definitely makes me think that the community is not reaching far
enough in those countries that are not using latin characters and that
PEP3131 will help bring in new blood from these countries.

I hope Guido will be able to see the long-term benefits of accepting this PEP.

Best Regards,

Guillaume

From guido at python.org  Sun May 13 05:59:18 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 12 May 2007 20:59:18 -0700
Subject: [Python-3000] ABC's, Roles, etc
In-Reply-To: <46466C7B.5010006@canterbury.ac.nz>
References: <88d0d31b0705081452w3082c1d0x7355e5eed7a3e093@mail.gmail.com>
	<88d0d31b0705082057l75ae6241gffa42545c25937d@mail.gmail.com>
	<20070509173942.B5A1B3A4061@sparrow.telecommunity.com>
	<46424718.20006@benjiyork.com>
	<20070510012836.16D0D3A4061@sparrow.telecommunity.com>
	<464318D0.2000109@benjiyork.com>
	<20070510153507.205EB3A4061@sparrow.telecommunity.com>
	<88d0d31b0705121058n7a92af7dn178f220dba922a91@mail.gmail.com>
	<ca471dc20705121219o56909306qadf513c531447135@mail.gmail.com>
	<46466C7B.5010006@canterbury.ac.nz>
Message-ID: <ca471dc20705122059v437a93d4vc4a8752d17030670@mail.gmail.com>

On 5/12/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>
> > From this point, issubclass(InstrumentedList, Sequence) will be true
> > (and likewise for instances of it and isinstance(x, Sequence)). But
> > InstrumentedList's __mro__ and __bases__ are unchanged.
>
> I think I've figured out what bothers me about this
> kind of overloading of isinstance(). Normally if
> isinstance(x, C) is true, we expect that a method
> call on x can at least potentially invoke a method
> of class C. But if isinstance(x, C) can be true
> even if C doesn't appear in the mro of x, this is
> no longer the case.

Well, not if x.__class__ overrides all of C's methods. And this
registration business we're *supposed* to register only classes that
provide concrete implementation of all of C's methods, which comes
down to the same thing.

Also, I'm unclear under what circumstances knowing that would make a
difference in your understanding of a program?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Sun May 13 06:45:57 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 13 May 2007 16:45:57 +1200
Subject: [Python-3000] PEP: Eliminate __del__
In-Reply-To: <d11dcfba0705121918k1dc5bb1dib998bb35130a7cb6@mail.gmail.com>
References: <1178551661.8251.16.camel@antoine-ubuntu>
	<d11dcfba0705121003r4d705b02p798e9b0422101e54@mail.gmail.com>
	<ca471dc20705121153x3701cef9ied415f4d88f0ba84@mail.gmail.com>
	<d11dcfba0705121228m7134ed5fmd0d74d5d54682869@mail.gmail.com>
	<46466F6A.30302@canterbury.ac.nz>
	<d11dcfba0705121918k1dc5bb1dib998bb35130a7cb6@mail.gmail.com>
Message-ID: <46469805.3000005@canterbury.ac.nz>

Steven Bethard wrote:

> You already don't need to list the attributes from the base classes.
> The __finalattrs__ are converted into class level descriptors, so if
> class D inherits from class C, it has the __finalattrs__ descriptors
> for both classes.

That's fine, then.

--
Greg

From talin at acm.org  Sun May 13 07:36:10 2007
From: talin at acm.org (Talin)
Date: Sat, 12 May 2007 22:36:10 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
Message-ID: <4646A3CA.40705@acm.org>

Guillaume Proux wrote:
> Dear all,
> 
> Pleased to meet you. I just subscribed to the list because I wanted to
> join the discussion regarding a specific PEP (for all the rest, you
> are all much more expert than me)
> 
> Guido:
>> 3131 (non-ASCII identifiers) -- I'm leaning towards rejecting.
> 
> I would like to voice my opposition to the rejection at that stage and
> request that more time is spent requesting/analysing the opinion of
> more people especially the people who have to deal with non-roman
> languages as a daily basis and especially people in the education
> field (along like other interesting people like the OLPC people)

One point that was raised by Alex Martelli is that the full set of 
Unicode 'letter' characters includes many characters which are visually 
indistinguishable in every font in the world. It means that, from now 
on, when I look at the variable named 'a', I can no longer be sure that 
what I am looking at is really the character I think it is. It means 
that we have introduced, for every Python programmer, a level of 
uncertainty that wasn't there before.
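
For instance, under the proposed rules both of the following are legal and
distinct identifiers (a made-up two-line illustration, not something from
Alex's post):

    a = 1          # LATIN SMALL LETTER A (U+0061)
    а = 2          # CYRILLIC SMALL LETTER A (U+0430) -- a different identifier
    print(a, а)    # prints: 1 2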

Programming languages are supposed to represent a compromise between the 
capabilities of humans and the capabilities of computers - in other 
words, both humans and computers are supposed to meet each other 
half-way and find a "sweet spot" that represents a restricted set of 
commands and symbols that both can understand. Each of them is expected 
to put a certain amount of effort into learning this common language - 
in the case of the computer, that effort is embodied into the design of 
the compiler, and in the case of the human, that effort is the learning 
of a formal dialect of commands and codes.

However, there is another use of programming languages, which is for 
programmers to communicate with each other. Specifically, the 
programming language provides a concise, unambiguous way to describe a 
particular algorithm or technique. Again, there is the expectation that 
practitioners of this discipline are expected to put a certain amount of 
effort into learning this formal language so that they can communicate 
with each other precisely.

The fact that programming languages resemble a particular human language 
is a pedagogical convenience, but it need not be so, and wasn't always 
that way. And the fact that a pidgin form of English words and grammar 
is used for most programming languages is a frozen accident, just as 
English is also the language used for international air traffic control. 
However, I think it is a mistake to think that programs themselves are 
written in "English", they are written in a formal language, similar to 
the language of mathematics, which every programmer needs to put in a 
modest amount of effort to learn, even native English speakers.

In any case, I would argue that if you teach someone to program in a 
dialect that cannot be understood by the global community of 
programmers, then you haven't really taught them 'programming' at all - 
you've taught them a kind of applied logic that they might be able to 
use personally, but that is only a small part of the craft of software 
engineering. The greater part is the ability to understand the vast 
corpus of literature out there that explains how to do just about 
everything you can think of with these tools. Only through learning a 
common language can they participate in the global technical 
infrastructure, which is more and more what I believe 'programming' is 
about.

There is another issue to be considered as well: Many human languages 
have a different grammatical structure. Even if you were to allow 
non-ASCII identifiers, and more so even if you were to allow the 
keywords themselves to be localized, you still have the problem that 
'if' comes at the start of a sentence, which makes no sense in many 
languages.

-- Talin

From hanser at club-internet.fr  Sun May 13 08:56:33 2007
From: hanser at club-internet.fr (Pierre Hanser)
Date: Sun, 13 May 2007 08:56:33 +0200
Subject: [Python-3000] Support for PEP 3131
Message-ID: <4646B6A1.7060007@club-internet.fr>

hello

i would like to add that even for a french user, it would be good
to have the possibility to use non ascii identifiers.

take the simple problem of the difference between "action to do"
and "action done".

In english, most of the time, adding 'ed' to the verb makes
the difference: change -> changed

great, this is still ascii!

in french:   change -> changé   (ends with 'eacute')

bad, not ascii!

The possibility to use non ascii characters for identifiers
would be a strong bonus in my opinion.
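
for example, something like this (a purely made-up snippet) would become
possible:

    montant_changé = True        # "amount changed" -- the final 'é' is U+00E9
    if montant_changé:
        print("changé")
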
-- 
	Pierre

From martin at v.loewis.de  Sun May 13 13:55:26 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 May 2007 13:55:26 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4646A3CA.40705@acm.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org>
Message-ID: <4646FCAE.7090804@v.loewis.de>

> One point that was raised by Alex Martelli is that the full set of 
> Unicode 'letter' characters includes many characters which are visually 
> indistinguishable in every font in the world. It means that, from now 
> on, when I look at the variable named 'a', I can no longer be sure that 
> what I am looking at is really the character I think it is. It means 
> that we have introduced, for every Python programmer, a level of 
> uncertainty that wasn't there before.

That's a red herring. This problem is unlikely to occur in practice.
There are other, more serious cases of presentation ambiguity
(e.g. tabs vs. spaces), yet nobody suggests to ban tabs from the
language for that reason.

> However, I think it is a mistake to think that programs themselves are 
> written in "English", they are written in a formal language, similar to 
> the language of mathematics, which every programmer needs to put in a 
> modest amount of effort to learn, even native English speakers.

While that is true from the computer's point of view, it is not so
for many people writing programs. They want to understand programs
"naturally", not "mathematically". If it was only for the mathematical
properties, we could restrict ourselves to identifiers _1, _2, _3,
and so on.

> In any case, I would argue that if you teach someone to program in a 
> dialect that cannot be understood by the global community of 
> programmers, then you haven't really taught them 'programming' at all - 
> you've taught them a kind of applied logic that they might be able to 
> use personally, but that is only a small part of the craft of software 
> engineering.

What is the relationship to this PEP? (and in the paragraphs that I
snipped?) Following your long text of reasoning - are you now in favor
or opposed to the language change proposed in PEP 3131?

> There is another issue to be considered as well: Many human languages 
> have a different grammatical structure. Even if you were to allow 
> non-ASCII identifiers, and more so even if you were to allow the 
> keywords themselves to be localized, you still have the problem that 
> 'if' comes at the start of a sentence, which makes no sense in many 
> languages.

The PEP doesn't propose to adjust the grammar of Python to the grammar
of any natural language. All it proposes is to extend the language
to allow additional characters in identifiers.

Regards,
Martin

From martin at v.loewis.de  Sun May 13 15:11:48 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 May 2007 15:11:48 +0200
Subject: [Python-3000] the future of the GIL
In-Reply-To: <46428EED.3060205@canterbury.ac.nz>
References: <f1l97g$5cm$1@sea.gmane.org>
	<463E4645.5000503@acm.org>	<20070506222840.25B2.JCARLSON@uci.edu>
	<f1s312$489$1@sea.gmane.org>	<4642745C.1040702@canterbury.ac.nz>	<3d2ce8cb0705091839w7b4fec56ud6a1ed9cb0ad264d@mail.gmail.com>
	<46428EED.3060205@canterbury.ac.nz>
Message-ID: <46470E94.6040307@v.loewis.de>

> If so, it looks like it might be possible to give
> Python a fork() that works on Windows, at least for
> the time being.

It's quite a challenge. That call just creates a process,
not a thread. You need to invoke many more API calls
to make the process actually run.

For some reason (which I couldn't figure out), Cygwin
abstained from using that in their implementation of
fork.

Regards,
Martin

From jason.orendorff at gmail.com  Sun May 13 15:39:22 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Sun, 13 May 2007 09:39:22 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4646A3CA.40705@acm.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org>
Message-ID: <bb8868b90705130639p69b719e8h2f40863e44547106@mail.gmail.com>

On 5/13/07, Talin <talin at acm.org> wrote:
> The fact that programming languages resemble a particular human language
> is a pedagogical convenience, but it need not be so, and wasn't always
> that way.

"Crucial usability feature", not "pedagogical convenience".

Choosing good names for things is an important skill in all modern
programming languages.  It always involves a natural language, and
great skill in that natural language pays off.


Look--this whole discussion has lacked perspective.  Non-ASCII
identifiers are not going to cause confusion or "split the community
even more".  Java and XML have not suffered.  People writing code
for open distribution will stick to ASCII in practice.  There is no
problem.

But for the same reason, the benefit isn't all that great either.

Python should allow foreign-language identifiers because (1) it's a
gesture of good will to people everywhere who don't speak English
fluently;  (2) some students will benefit;  (3) some people writing code
that no one else will ever see will benefit.

Non-English-language tutorials might also benefit.  (?)

I think the gesture alone is worth it, even if no one ever used the
feature productively.  But people will.  The cost to python-dev is low,
and the cost to English-speaking users is very likely zero.

What am I missing?

-j

From gproux+py3000 at gmail.com  Sun May 13 16:23:56 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Sun, 13 May 2007 23:23:56 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <bb8868b90705130639p69b719e8h2f40863e44547106@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org>
	<bb8868b90705130639p69b719e8h2f40863e44547106@mail.gmail.com>
Message-ID: <19dd68ba0705130723r793e034ax46319b86e166ac5b@mail.gmail.com>

Hi Jason,

Very interesting post. I will just make a little comment.

On 5/13/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> Python should allow foreign-language identifiers because (1) it's a
> gesture of good will to people everywhere who don't speak English
> fluently;  (2) some students will benefit;  (3) some people writing code
> that no one else will ever see will benefit.

I would change your (1) slightly to make it clear that it is NOT a
fluency in English vs. other language issue. What this is about is
really the ability to efficiently read/write Python programs when it
is usually a challenge to read (but also write) latin characters.
Once again, you would be surprised how challenging it is for e.g. most
Japanese people to decipher text written in latin characters.

> I think the gesture alone is worth it, even if no one ever used the
> feature productively.  But people will.  The cost to python-dev is low,
> and the cost to English-speaking users is very likely zero.

ASCII limitations have disappeared everywhere in the usage of most
modern OSes because the historical memory limitations no longer
apply. It would be great to see Python become the first
language to *really* enter the 21st century.

Regards,

Guillaume

From martin at v.loewis.de  Sun May 13 16:30:00 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 May 2007 16:30:00 +0200
Subject: [Python-3000] PEP 3123 (Was: PEP Parade)
In-Reply-To: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
Message-ID: <464720E8.3040402@v.loewis.de>

>  S  3123  Making PyObject_HEAD conform to standard C   von Löwis
> 
> I like it, but who's going to make the changes? Once those changes have
> been made, will it still be reasonable to expect to merge C code from
> the (2.6) trunk into the 3.0 branch?

I just created bugs.python.org/1718153, which implements this PEP.

I had to add a number of additional macros (Py_Refcnt, Py_Size,
PyVarObject_HEAD_INIT); using these macros throughout is the bulk
of the change.

If the macros are backported to 2.x (omitting the "hard" changes
to PyObject itself), then the code base can stay the same between
2.x and 3.x (of course, backporting changes from 2.6 to 2.5 might
become harder, as the chances for conflicts increase).

As for statistics: there are ca. 580 uses of Py_Type in the code,
410 of Py_Size, and 20 of Py_Refcnt.

How should I proceed? The next natural step would be to change
2.6, and then wait until those changes get merged into 3k.

Regards,
Martin

From tanzer at swing.co.at  Sun May 13 16:31:03 2007
From: tanzer at swing.co.at (Christian Tanzer)
Date: Sun, 13 May 2007 16:31:03 +0200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: Your message of "Sat, 12 May 2007 13:03:59 PDT."
	<ca471dc20705121303y5bce459bpb5b083bb90121306@mail.gmail.com>
Message-ID: <E1HnF6F-0001S8-1B@swing.co.at>


"Guido van Rossum" <guido at python.org> wrote:

> - Expect pushback on your assumption that every function or method
> should be fair game for overloading. Requiring explicit tagging of the
> base or default implementation makes things a lot more palatable and
> predictable for those of us who are still struggling to accept GFs.

Up-front tagging might make it more palatable but IMHO it would be a
serious mistake.

I still shudder when I think of C++'s `virtual` (although it's been a
looong time since I stopped using C++ [thanks, Guido!] :-)

PS: I didn't have time to really study Phillip's PEP but I've followed
    the GF saga over the months and I'd be very happy if GF's were
    promoted to be a core feature of Python!

-- 
Christian Tanzer                                    http://www.c-tanzer.at/


From rrr at ronadam.com  Sun May 13 17:04:35 2007
From: rrr at ronadam.com (Ron Adam)
Date: Sun, 13 May 2007 10:04:35 -0500
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <bb8868b90705130639p69b719e8h2f40863e44547106@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>	<4646A3CA.40705@acm.org>
	<bb8868b90705130639p69b719e8h2f40863e44547106@mail.gmail.com>
Message-ID: <46472903.3070608@ronadam.com>

Jason Orendorff wrote:

> I think the gesture alone is worth it, even if no one ever used the
> feature productively.  But people will.  The cost to python-dev is low,
> and the cost to English-speaking users is very likely zero.
> 
> What am I missing?

I don't think you're missing anything.

I think you are correct; the perceived impact is greater than the actual one.
The reason we don't run into python written in other languages more often
is because most people have a language preference set when they do internet
searches.  It's not that all programs are written using only English.

As more people use python there is less (not more) need to read and
understand *everyone* else's programs, as there is most likely already what
you need, or something close to it, in your own language.

I believe the walls are not solid or one way.  Good programs written in 
other languages most likely get translated to English at some point if they 
are freely distributed.

Ron

From collinw at gmail.com  Sun May 13 17:22:03 2007
From: collinw at gmail.com (Collin Winter)
Date: Sun, 13 May 2007 08:22:03 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
Message-ID: <43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>

On 5/12/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
[snip]
> In this respect, I strongly believe that support non-ASCII identifiers
> as proposed by PEP3131 would improve a number of things:
> - discussion and uptake of python in "non-ascii" countries
> - ability for children to learn programming in their own language (I
> started programming at 7 years old and would have been very disturbed
> if I could not use my own language to type in programs)
> - increase of the number of new "interesting" packages from non-ascii countries
> - ability for local programmers and local companies to provide
> "bridges" between international (english) APIs and local APIs.
> - Increase the number of python users (from 7 to 77 years old)

Says you. So far, all I've seen from PEP 3131's supporters is a lot of
hollow assertions and idle theorizing: "Python will be easier to use
for people using non-ASCII character sets", "Python will be easier to
learn for those raised with non-Roman-influenced languages", etc, etc.
Until I see some kind of evidence, something to back up these claims,
I'm going to assume you're wrong.

Have there been studies on this kind of thing? Has there been any
research into whether a mixture of English keywords and, say, Japanese
and English identifiers makes a given programming language easier to
learn and use? If so, why aren't they referenced in the PEP or linked
in any emails? Given the lack of evidence presented so far, my
operating assumption is that the PEP's supporters -- including you --
are making things up to support a conclusion that they might wish to
be true.

> In my humble opinion, now that UTF8 is accepted as the standard source
> code encoding, it is very difficult to understand why we should start
> putting restrictions on the kind of identifiers that are used (which
> would force people to comment line by line as they do now!).
>
> When I am programming in Python, I am VERY DISTURBED when the code I
> write contains much comment. It needs to be readable just by glancing
> at it.
>
> However, for most of the people who are core python developers, you
> should ask what is the typical reading speed for "ascii" characters
> for a e.g. standard Japanese pupil. You would be very surprised how
> slow that is. In my opinion (after living in Japan for quite a bit),
> people are very slow to read ASCII characters and this definitely
> restrains their programming productivity and expressiveness.

See, that's the thing I have yet to see addressed: there's been a lot of
stress on "being able to write variable/class/method names in
Arabic/Mandarin/Hindi will make it easier for native speakers to
understand", but as far as I know, no-one has yet addressed how these
non-English identifiers will mesh with the existing English keywords
and English standard library functions. You say that being able to
write identifiers in Cyrillic will make Python easier for Russian
natives to read, to make Python code as you say, "readable just by
glancing at it". But the fact is any native-language identifiers will
be surrounded in a sea of English: keywords, the standard library,
almost all open-source packages, etc. How does that impact your
readability guesses?

Also, method/function names are traditionally expressed in English as
verb phrases (e.g., "isElementVisible()") which dovetail nicely with
Anglo-centric keywords like "if" and "for ... in ...". How do
identifiers in languages with dramatically different grammars like
Japanese -- or worse, different reading orders like Farsi and Hebrew
-- interact with "if", "while" and the new "x if y else z" expression,
which are deeply rooted in English grammar? My suspicion is, at least
for right-to-left languages like Arabic, not well, if at all.

Lastly, I take issue with one of the PEP's guidelines under the
"Policy Specification" section: "All identifiers in the Python
standard library...SHOULD use English words wherever feasible"
(emphasis in the original). Are we now going to admit the possibility
that part of the standard library will be written in English, some
parts will be written in Spanish and this one module over there will
be written in Czech? Absolutely ludicrous.

Come-on-tell-us-how-you-really-feel-ly,
Collin Winter

From tomerfiliba at gmail.com  Sun May 13 17:33:08 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Sun, 13 May 2007 17:33:08 +0200
Subject: [Python-3000] Support for PEP 3131
Message-ID: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>

[Guillaume Proux]
> In this respect, I strongly believe that support non-ASCII identifiers
> as proposed by PEP3131 would improve a number of things:
> - discussion and uptake of python in "non-ascii" countries
> - ability for children to learn programming in their own language (I
> started programming at 7 years old and would have been very disturbed
> if I could not use my own language to type in programs)
> - increase of the number of new "interesting" packages from non-ascii countries
> - ability for local programmers and local companies to provide
> "bridges" between international (english) APIs and local APIs.
> - Increase the number of python users (from 7 to 77 years old)

well, i myself am a native hebrew speaker, so i'm quite sensitive
to text-direction issues with all sorts of editors. to this day, i haven't
seen a single editor that handles RTL/LTR transitions correctly,
including microsoft word.

when you start mixing LTR and RTL texts, it's asking for trouble:
שם_משפחה = "doe"
גיל = 5

i don't know how that would render on your machine, but on mine it says:
shem_mishpacha = "doe"
5 = gil   # looks reversed, but it's actually correct (!!)

so that basically rules out using hebrew, arabic and farsi from being
used as identifiers, and the list is not complete.
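
you can see the root of the problem just by asking unicodedata for the
bidi class of the characters involved (a tiny illustration, not part of
any proposal):

    import unicodedata
    for ch in ('a', '\u05e9', '\u0627'):   # latin a, hebrew shin, arabic alef
        print(hex(ord(ch)), unicodedata.bidirectional(ch))
    # 0x61 L
    # 0x5e9 R
    # 0x627 AL

any editor that follows the unicode bidi algorithm will visually reorder
runs of R/AL characters, which is exactly what scrambles the assignments
above.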

now, since not all languages can be used, why bother supporting only some?
and if my library exposes a function with a chinese name, how would you
be able to invoke it without a chinese keyboard?

you'd do better with a translator sitting between the interpreter
and the editor.


-tomer

From foom at fuhm.net  Sun May 13 17:50:31 2007
From: foom at fuhm.net (James Y Knight)
Date: Sun, 13 May 2007 11:50:31 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
Message-ID: <575BCDBE-0368-4747-872F-115B2ED46122@fuhm.net>


On May 13, 2007, at 11:22 AM, Collin Winter wrote:

> See, that's the thing I have yet to see addressed: there's been lot of
> stress on "being able to write variable/class/method names in
> Arabic/Mandarin/Hindi will make it easier for native speakers to
> understand", but as far as I know, no-one has yet addressed how these
> non-English identifiers will mesh with the existing English keywords
> and English standard library functions. You say that being able to
> write identifiers in Cyrillic will make Python easier for Russian
> natives to read, to make Python code as you say, "readable just by
> glancing at it". But the fact is any native-language identifiers will
> be surrounded in a sea of English: keywords, the standard library,
> almost all open-source packages, etc. How does that impact your
> readability guesses?

In order to teach programming to non-english-speaking users, I would  
imagine people would translate some libraries (like, say, turtle),  
and simply teach the meaning of the few keywords.

Clearly in order to use the wider range of libraries out there and  
communicate with the wider python community, the person will have to  
know english. But it should be possible to start learning the  
fundamentals of programming without that.

James

From aahz at pythoncraft.com  Sun May 13 18:02:07 2007
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 13 May 2007 09:02:07 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4646FCAE.7090804@v.loewis.de>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
Message-ID: <20070513160206.GA3161@panix.com>

On Sun, May 13, 2007, "Martin v. Löwis" wrote:
>
> There are other, more serious cases of presentation ambiguity
> (e.g. tabs vs. spaces), yet nobody suggests to ban tabs from the
> language for that reason.

Well, I do.  ;-)
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Look, it's your affair if you want to play with five people, but don't
go calling it doubles."  --John Cleese anticipates Usenet

From pje at telecommunity.com  Sun May 13 18:19:36 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 13 May 2007 12:19:36 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <E1HnF6F-0001S8-1B@swing.co.at>
References: <Your message of "Sat, 12 May 2007 13:03:59 PDT."
	<ca471dc20705121303y5bce459bpb5b083bb90121306@mail.gmail.com>
	<E1HnF6F-0001S8-1B@swing.co.at>
Message-ID: <20070513161749.C56DA3A409B@sparrow.telecommunity.com>

At 04:31 PM 5/13/2007 +0200, Christian Tanzer wrote:

>"Guido van Rossum" <guido at python.org> wrote:
>
> > - Expect pushback on your assumption that every function or method
> > should be fair game for overloading. Requiring explicit tagging of the
> > base or default implementation makes things a lot more palatable and
> > predictable for those of us who are still struggling to accept GFs.
>
>Front-up tagging might make it more palatable but IMHO it would be a
>serious mistake.
>
>I still shudder when I think of C++'s `virtual` (although it's been a
>looong time since I stopped using C++ [thanks, Guido!] :-)

It's not *that* serious.  Even if we end up with the stdlib-supplied 
version having to pre-declare functions, it'll be trivial to 
implement a third party library that retroactively makes it unnecessary.

Specifically, the way it would work is that the 
overloading.rules_for() function will just need a "before" overload 
for FunctionType that modifies the function in-place to be suitable.

So, people who want to be able to do true AOP will just need to 
either write a short piece of code themselves, or import it from 
somewhere.  It'd probably look something like this:

    from overloading import before, rules_for, isgeneric, overloadable

    @before
    def rules_for(ob: type(lambda:None)):
        if not isgeneric(ob):
             gf = overloadable(ob)  # apply the decorator
             ob.__code__    = gf.__code__
             ob.__closure__ = gf.__closure__
             ob.__globals__ = gf.__globals__
             ob.__dict__    = gf.__dict__

The idea here is that if "@overloadable" is the decorator for turning 
a regular function into a generic function, you can simply apply it 
to the function and copy the new function's attributes to the old 
one, thereby converting it in-place.  It might be slightly trickier 
than shown, but probably not much.


From gproux+py3000 at gmail.com  Sun May 13 18:18:09 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Mon, 14 May 2007 01:18:09 +0900
Subject: [Python-3000] Fwd:  Support for PEP 3131
In-Reply-To: <19dd68ba0705130917y3988052cnbbebf1536cc25bdd@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
	<19dd68ba0705130917y3988052cnbbebf1536cc25bdd@mail.gmail.com>
Message-ID: <19dd68ba0705130918p29c8a79ds9ac59b890a3d7feb@mail.gmail.com>

Hello,

On 5/14/07, Collin Winter <collinw at gmail.com> wrote:
> Says you. So far, all I've seen from PEP 3131's supporters is a lot of
> hollow assertions and idle theorizing: "Python will be easier to use
> for people using non-ASCII character sets", "Python will be easier to
> learn for those raised with non-Roman-influenced languages", etc, etc.
> Until I see some kind of evidence, something to back up these claims,
> I'm going to assume you're wrong.

How could you gather any evidence without implementing the support first?
I would argue that your argument, being really weak and not supported by
actual facts or evidence, should also be considered void.

> Have there been studies on this kind of thing? Has there been any

Part of my first post was to try to call some attention to this,
precisely to gather more opinions and, if possible, evidence.

> natives to read, to make Python code as you say, "readable just by
> glancing at it". But the fact is any native-language identifiers will
> be surrounded in a sea of English: keywords, the standard library,
> almost all open-source packages, etc. How does that impact your
> readability guesses?

Let me ask how well you understand the problem yourself, since I am
trying to see why you are putting up such resistance to a change that
is innocuous for "you", a person for whom reading/writing latin
characters is a core competency.
After 10 years in Japan (and near fluency), I can personally attest to
the inappropriateness of using latin characters to express yourself
when thinking in Japanese.

> [...]
> which are deeply rooted in English grammar? My suspicion is, at least
> for right-to-left languages like Arabic, not well, if at all.

While I still do not understand the reason for such opposition from
people for whom this PEP would likely have no impact, I expected
people using RTL languages to come up with "this won't work with RTL
languages" arguments.
However, PEP3131 is not about grammar at all; it is about the ability to
freely choose your identifiers (vars and method names) from the
charset that pleases you for your own programs.
Implementation of PEP3131 will not hinder people with RTL languages.
The fact that Python will continue having a standard latin-inspired
grammar is not going to make RTL language people less productive than
before. Once again, this has zero impact on them and will still
greatly benefit people who understand the orthogonality between
grammar and variable naming.

Regards,

Guillaume

From gproux+py3000 at gmail.com  Sun May 13 18:25:37 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Mon, 14 May 2007 01:25:37 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>
References: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>
Message-ID: <19dd68ba0705130925j1dd55f1boba9e1b6c036d0422@mail.gmail.com>

Dear Tomer,

> well, i myself am a native hebrew speaker, so i'm quite sensitive
> to text-direction issues with all sorts of editors. to this day, i haven't
> seen a single editor that handles RTL/LTR transitions correctly,
> including microsoft word.

Are you talking about editor bugs? You should find a way to report the
bugs to the people in charge of development of those editors, but I
believe nothing here is Python's fault.

> when you start mixing LTR and RTL texts, it's asking for trouble:
> ??_????? = "doe"
> ??? = 5

Looks cool :)


> shem_mishpacha = "doe"
> 5 = gil   # looks reversed, but it's actually correct (!!)


> so that basically rules out using hebrew, arabic and farsi from being
> used as identifier, and the list is not complete.

You are ruling out PEP3131 because there is no good editor able to
support your language? By that reasoning, we should never have
made a Unicode standard.
If the editor is the problem, fix the editor.

> now, since not all languages can be used, why bother supporting only some?
> and if my library exposes a function with a chinese name, how would you
> be able to invoke it without a chinese keyboard?

If you don't speak chinese, i reckon the probability you will ever
(find and) use a library that exposes chinese names is 0.

Regards,

Guillaume

From collinw at gmail.com  Sun May 13 19:09:13 2007
From: collinw at gmail.com (Collin Winter)
Date: Sun, 13 May 2007 10:09:13 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705130925j1dd55f1boba9e1b6c036d0422@mail.gmail.com>
References: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>
	<19dd68ba0705130925j1dd55f1boba9e1b6c036d0422@mail.gmail.com>
Message-ID: <43aa6ff70705131009s7d5b177dmea7c790d670ac3c0@mail.gmail.com>

On 5/13/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> [Tomer Filiba]
> > when you start mixing LTR and RTL texts, it's asking for trouble:
> > ??_????? = "doe"
> > ??? = 5
> >
> > shem_mishpacha = "doe"
> > 5 = gil   # looks reversed, but it's actually correct (!!)
> >
> > so that basically rules out using hebrew, arabic and farsi from being
> > used as identifier, and the list is not complete.
>
> You are ruling out PEP3131 because there is no good editor able to
> support your language? True, for the same reason we should never have
> made a Unicode standard.
> If the editor is the problem, fix the editor.

No no no no no. This isn't a problem with the editor: it's a problem
with allowing Hebrew identifiers.  Tomer can correct me on this, but I
strongly doubt that readability is improved by forcing the programmer
to constantly switch the direction they are reading in, e.g.,  "if
??_?????.strip():"

Collin Winter

From jcarlson at uci.edu  Sun May 13 19:17:06 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 13 May 2007 10:17:06 -0700
Subject: [Python-3000] mixin class decorator
In-Reply-To: <1d85506f0705100436j4ed5c2f7xe6bef98c3b86f5bf@mail.gmail.com>
References: <1d85506f0705100436j4ed5c2f7xe6bef98c3b86f5bf@mail.gmail.com>
Message-ID: <20070513100929.854C.JCARLSON@uci.edu>


"tomer filiba" <tomerfiliba at gmail.com> wrote:
> 
> with the new class decorators of py3k, new use cases emerge.
> for example, now it is easy to have real mixin classes or even
> mixin modules, a la ruby.
[snip]
> does it seem useful? should it be included in some stdlib?
> or at least mentioned as a use case for class decorators in PEP 3129?
> (not intended for 3.0a1)

There are many use-cases for class decorators.  Since the PEP has
already been accepted, including this is unnecessary (for acceptance or
rejection purposes).

About the only thing that I think would be nice is if we could get class
decorators in 2.6 as well (a future import would work for me), which
would also open the gates for third-party or stdlib-based type
annotation libraries like @implements(interface.foo).
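
For concreteness, here is a minimal sketch of such a mixin class decorator
(the name "mixin" and the attribute-copying strategy are assumptions for
illustration, not the code from tomer's proposal; it needs the class
decorator syntax being discussed):

    # Sketch only: copy attributes from mixin sources onto the decorated
    # class, skipping dunder names and anything the class defines itself.
    def mixin(*sources):
        def decorate(cls):
            for source in sources:
                for name, value in vars(source).items():
                    if name.startswith('__') or name in vars(cls):
                        continue
                    setattr(cls, name, value)
            return cls
        return decorate

    class Greeter(object):
        def greet(self):
            return "hello from " + type(self).__name__

    @mixin(Greeter)
    class Widget(object):
        pass

    print(Widget().greet())    # -> hello from Widget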

 - Josiah


From fdrake at acm.org  Sun May 13 19:21:59 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sun, 13 May 2007 13:21:59 -0400
Subject: [Python-3000] mixin class decorator
In-Reply-To: <20070513100929.854C.JCARLSON@uci.edu>
References: <1d85506f0705100436j4ed5c2f7xe6bef98c3b86f5bf@mail.gmail.com>
	<20070513100929.854C.JCARLSON@uci.edu>
Message-ID: <200705131322.00152.fdrake@acm.org>

On Sunday 13 May 2007, Josiah Carlson wrote:
 > About the only thing that I think would be nice is if we could get class
 > decorators in 2.6 as well (a future import would work for me).

Since class decorators don't introduce a new keyword, there'd be no need for a 
future import.  Something that's no syntactically legal now would become 
legal, and that's an allowed change.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From tomerfiliba at gmail.com  Sun May 13 19:42:45 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Sun, 13 May 2007 19:42:45 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <43aa6ff70705131009s7d5b177dmea7c790d670ac3c0@mail.gmail.com>
References: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>
	<19dd68ba0705130925j1dd55f1boba9e1b6c036d0422@mail.gmail.com>
	<43aa6ff70705131009s7d5b177dmea7c790d670ac3c0@mail.gmail.com>
Message-ID: <1d85506f0705131042q23270a91qa31ff2f3940019ed@mail.gmail.com>

On 5/13/07, Collin Winter <collinw at gmail.com> wrote:
> No no no no no. This isn't a problem with the editor: it's a problem
> with allowing Hebrew identifiers.  Tomer can correct me on this, but I
> strongly doubt that it improves readability by forcing the programmer
> to constantly change which direction they're reading from, e.g.,  "if
> ??_?????.strip():"

well, you're right, but i'd've chosen another counter-example:

if ??????.?????:
    pass

which comes first? does it say bacon.eggs or eggs.bacon?
and what happens if the editor uses a dot prefixed by LTR
marker? the meaning is reversed, but it still looks the same!
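
A minimal sketch of that effect, with made-up Latin identifiers standing in
for the Hebrew ones: an invisible LEFT-TO-RIGHT MARK before the dot gives
two source fragments that can render identically but compare unequal.

    import unicodedata

    plain  = u"eggs.bacon"
    marked = u"eggs\u200e.bacon"   # can render the same in many editors

    print(plain == marked)   # False
    print([unicodedata.name(c) for c in marked if ord(c) > 127])
    # ['LEFT-TO-RIGHT MARK']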

[Guillaume Proux]
> You are ruling out PEP3131 because there is no good editor able to
> support your language?

first, technical limitations do control the way we use computers.
it's a fact. second, RTL/LTR issues are nondeterministic,
and i'd rather leave heuristics out of my code.

again, if you want a hebrew-version of python, you'd also want
hebrew semantics and hebrew syntax.
see also http://cheeseshop.python.org/pypi/hpy

---

you can always translate or transliterate a word to english, like so:
if beykon.beytzim:

in the worst case, it wouldn't be meaningful to the english reader,
but at least other ppl could use your code.

> If you don't speak chinese, i reckon the probability you will ever
> (find and) use a library that would expose chinese name is 0.

you'd be surprised how many times i scanned through
japanese/chinese forums with an online translator, looking
for documentation or cheaper products :)


-tomer

From gproux+py3000 at gmail.com  Sun May 13 20:04:21 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Mon, 14 May 2007 03:04:21 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <1d85506f0705131042q23270a91qa31ff2f3940019ed@mail.gmail.com>
References: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>
	<19dd68ba0705130925j1dd55f1boba9e1b6c036d0422@mail.gmail.com>
	<43aa6ff70705131009s7d5b177dmea7c790d670ac3c0@mail.gmail.com>
	<1d85506f0705131042q23270a91qa31ff2f3940019ed@mail.gmail.com>
Message-ID: <19dd68ba0705131104r85531f3o12b7e1769d7b7140@mail.gmail.com>

HI Tomer,

> if ??????.?????:
>     pass
>
> which comes first? does it say bacon.eggs or eggs.bacon?
> and what happens if the editor uses a dot prefixed by LTR
> marker? the meaning is reversed, but it still looks the same!

All that is really a *presentation* issue. As such, an editor
specialized in editing Hebrew or Arabic Python should help you write
the code you want to write.
However, why would you put an LTR marker there? Why add issues?
Also, as soon as UTF-8 is accepted as the *standard* encoding, isn't
this issue the same with Latin characters (not sure here, just
asking)?
Additionally, would a professional programmer choose to add LTR markers
to make the source code ambiguous?

> again, if you want a hebrew-version of python, you'd also want
> hebrew semantics and hebrew syntax.
> see also http://cheeseshop.python.org/pypi/hpy

Yes, but let PEP 3131 go forward first, as this will make it easier for
you to implement the full Hebrew semantics.

> you can always translate or transliterate a word to english, like so:
> if beykon.beytzim:

Is this a bijective translation? How good is the Latin-character reading
ability of most Hebrew speakers? For my part, I can tell from experience
that Japanese people have great difficulty reading English or even
transliterated Japanese (which is never good anyway because of
homonyms).

> you'd be surprised how many times i scanned through
> japanese/chinese forums with an online translator, looking
> for documentation or cheaper products :)

I am happy to see your open-minded spirit. Keep it when evaluating
PEP 3131  :)

Regards,

Guillaume

From martin at v.loewis.de  Sun May 13 20:27:34 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 13 May 2007 20:27:34 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
Message-ID: <46475896.80402@v.loewis.de>

> Have there been studies on this kind of thing? Has there been any
> research into whether a mixture of English keywords and, say, Japanese
> and English identifiers makes a given programming language easier to
> learn and use? If so, why aren't they referenced in the PEP or linked
> in any emails?

There is anecdotal evidence that people intuitively use characters
from their native language, and then are surprised by the syntax errors.

<sarcasm>
Unfortunately, they are not required to report their usage to the
Ministry for Use Of Funny Characters In Programming Languages.
</sarcasm>

> Given the lack of evidence presented so far, my
> operating assumption is that the PEP's supporters -- including you --
> are making things up to support a conclusion that they might wish to
> be true.

Are you also assuming that I make up my mentioning of anecdotal
evidence?

> Also, method/function names are traditionally expressed in English as
> verb phrases (e.g., "isElementVisible()") which dovetail nicely with
> Anglo-centric keywords like "if" and "for ... in ...". How do
> identifiers in languages with dramatically different grammars like
> Japanese -- or worse, different reading orders like Farsi and Hebrew
> -- interact with "if", "while" and the new "x if y else z" expression,
> which are deeply rooted in English grammar? My suspicion is, at least
> for right-to-left languages like Arabic, not well, if at all.

I don't speak Farsi, Arabic, or Hebrew, so I can't comment on that.
I know that in German, if/while is not an issue at all. People
regularly read "if" aloud as "wenn" or "falls".

> Lastly, I take issue with one of the PEP's guidelines under the
> "Policy Specification" section: "All identifiers in the Python
> standard library...SHOULD use English words wherever feasible"
> (emphasis in the original). Are we now going to admit the possibility
> that part of the standard library will be written in English, some
> parts will be written in Spanish and this one module over there will
> be written in Czech? Absolutely ludicrous.

The capitalization follows the convention of RFC 2119; it's not emphasis,
but an indication of the requirement level.

3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
   may exist valid reasons in particular circumstances to ignore a
   particular item, but the full implications must be understood and
   carefully weighed before choosing a different course.

There are already deviations from that rule in the standard library:

(aifc.AIFC_read.)initfp
(aifc.AIFC_read.)getnchannels
(aifc.AIFC_read.)getcomptype
(asyncore.dispatcher.)del_channel
(binhex.)_Hqxcoderengine
opcode.def_op
opcode.name_op
opcode.jrel_op
(etc.)

none of these are proper English words.

Regards,
Martin


From hanser at club-internet.fr  Sun May 13 21:01:35 2007
From: hanser at club-internet.fr (Pierre Hanser)
Date: Sun, 13 May 2007 21:01:35 +0200
Subject: [Python-3000] Support for PEP 3131
Message-ID: <4647608F.7000202@club-internet.fr>

Some personal thoughts on the subject, pro, of course:

1) it's a matter of justice:

everybody should have the right to name his variables
with the exact name he prefers.

2) it's a matter of equity:

why should only English speakers be able to write programs?
(and even English uses accents in numerous words: café, ...)

3) it's a matter of freedom, finally dropping old imperialism

I would say this PEP has something to do with freeing
the world from imperialism! You will have to be very
convincing to reject the PEP without spreading an
ungracious feeling of wanting to push English
further than needed (I know that many people on this list
are neither English nor US citizens, but that's not enough
to avoid this feeling...)

4) it takes nothing away from anybody:

the risk is having more programs available, even if you
can't fully read them. No need to come up with a proof of
utility: if you dislike it => don't use it, but don't prevent
others from using it.

5) rules for the official library may be stricter

Everybody is mature enough to know the audience of his
programs: personal => may use the native language, public
=> should use English, it seems.

In fact I'm not even against the inclusion of non-English
libraries in the standard lib, but if that is the price of
getting the PEP, I would accept restrictions.

Maybe a lexicon at the beginning of each library written
in a foreign language would be enough.

6) it gives access to literary programming

I try to avoid the expression 'literate programming'
because I know its traditional connotation, but my
English is not good enough to find a better title. So,
let me describe:

one of the biggest advantages I see
is the ability to write well-crafted programming
expressions. French without accents is not really
French, and it always looks poor to demanding readers.

The joy of programming comes with the beauty of well-written
algorithms, written in good language.
And your own language is always a bit better than any
other, if you can write it in its full glory.

7) most other considerations do not matter

fonts: you can probably find fonts for most languages (?)

difficulty of writing: this gets better from day to day

...

8) the programmer, "he" in the previous lines, may of course be
male or female...

my 8 cents for this evening
-- 
	Pierre


From murman at gmail.com  Sun May 13 21:04:48 2007
From: murman at gmail.com (Michael Urman)
Date: Sun, 13 May 2007 14:04:48 -0500
Subject: [Python-3000] Unicode strings, identifiers, and import
Message-ID: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>

This occurred to me while reading the PEP 3131 discussion, and while
it's not limited to PEP 3131 concerns, I don't believe I've seen it
discussed elsewhere yet. What is the interaction between import or
__import__ and Unicode module names (or at least Unicode strings
describing them)? Currently in Python 2.5, __import__ appears to coerce
to str, leading to the following error case:

>>> __import__(unicodedata.lookup('GREEK SMALL LETTER EPSILON'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u03b5' in
position 0: ordinal not in range(128)

With str being the Unicode type in py3k, this branch of the potential
problem needs to be addressed clearly, whether by defining __import__
as converting through ASCII, or by defining a useful semantic. If PEP
3131 is to be accepted, then it should probably address whether import
will work on non-ASCII identifiers, and if so what the semantics are
(if __import__ would otherwise be limited to ASCII).

I'm a little worried on the implementation side, because while on
Windows it should be easy to use unicode file APIs, on Linux the
filenames may or may not be UTF-8 friendly.

Michael
-- 
Michael Urman

From guido at python.org  Sun May 13 21:15:47 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 13 May 2007 12:15:47 -0700
Subject: [Python-3000] PEP 3123 (Was: PEP Parade)
In-Reply-To: <464720E8.3040402@v.loewis.de>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>
	<464720E8.3040402@v.loewis.de>
Message-ID: <ca471dc20705131215l4ac14cf8je421e99dcaf30cd@mail.gmail.com>

I'm okay with applying to 2.6 and then merging into 3.0.  ISTM though
that backporting this to 2.5 would cause the release manager to throw
a fit, so I think that's not worth it. What would be the benefit
anyway?

--Guido

On 5/13/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> >  S  3123  Making PyObject_HEAD conform to standard C   von L?wis
> >
> > I like it, but who's going to make the changes? Once those chnges have
> > been made, will it still be reasonable to expect to merge C code from
> > the (2.6) trunk into the 3.0 branch?
>
> I just created bugs.python.org/1718153, which implements this PEP.
>
> I had to add a number of additional macros (Py_Refcnt, Py_Size,
> PyVarObject_HEAD_INIT); using these macros throughout is the bulk
> of the change.
>
> If the macros are backported to 2.x (omitting the "hard" changes
> to PyObject itself), then the code base can stay the same between
> 2.x and 3.x (of course, backporting changes from 2.6 to 2.5 might
> become harder, as the chances for conflicts increase).
>
> As for statistics: there are ca. 580 uses of Py_Type in the code,
> 410 of Py_Size, and 20 of Py_Refcnt.
>
> How should I proceed? The next natural step would be to change
> 2.6, and then wait until those changes get merged into 3k.
>
> Regards,
> Martin
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun May 13 21:29:23 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 13 May 2007 12:29:23 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
Message-ID: <ca471dc20705131229n3ae005dfi6c6aa369cd90a17a@mail.gmail.com>

On 5/13/07, Collin Winter <collinw at gmail.com> wrote:
> Have there been studies on this kind of thing? Has there been any
> research into whether a mixture of English keywords and, say, Japanese
> and English identifiers makes a given programming language easier to
> learn and use? If so, why aren't they referenced in the PEP or linked
> in any emails? Given the lack of evidence presented so far, my
> operating assumption is that the PEP's supporters -- including you --
> are making things up to support a conclusion that they might wish to
> be true.

In particular, AFAIK Java has allowed all Unicode letters in
identifiers right from the start. I'd like to hear about descriptions
of actual user experiences with this feature, in Java or in any other
language that supports it. (*Are* there any others?) That would be far
more valuable to me than any continued argumentation for or against
the proposal.

I also note that there's no particular reason why this needs to be
done exactly in 3.0. It's not backwards incompatible -- it could be
done in 2.6 if people really really want it, or it could be introduced
in 3.1, 3.2 or whenever the world appears to be ready. I certainly
don't consider it an early design mistake to only require ASCII -- at
the time it was the only sane thing to do and I'm far from convinced
that it needs to change now.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun May 13 21:31:00 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 13 May 2007 12:31:00 -0700
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>
Message-ID: <ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>

The answer to all of this is the filesystem encoding, which is already
supported. Doesn't appear particularly difficult to me.
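
A rough sketch, for illustration only, of what "use the filesystem encoding"
could mean in Python 2.x terms; the helper name is made up, and this is not
the real __import__ machinery.

    import os
    import sys

    def candidate_filenames(module_name, search_path=None):
        # Encode the unicode module name with the filesystem encoding and
        # look for a matching .py file on the search path.
        fsenc = sys.getfilesystemencoding() or 'ascii'
        encoded = (module_name + u'.py').encode(fsenc)
        for directory in (search_path if search_path is not None else sys.path):
            yield os.path.join(directory, encoded)

    # On a UTF-8 file system, candidate_filenames(u'\u03b5') yields paths
    # ending in the two-byte UTF-8 sequence for GREEK SMALL LETTER EPSILON.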

On 5/13/07, Michael Urman <murman at gmail.com> wrote:
> This occurred to me while reading the PEP 3131 discussion, and while
> it's not limited to PEP 3131 concerns, I don't believe I've seen
> discussed yet elsewhere. What is the interaction between import or
> __import__ and Unicode module names (or at least Unicode strings
> describing them). Currently in python 2.5, __import__ appears coerce
> to str, leading to the following error case:
>
> >>> __import__(unicodedata.lookup('GREEK SMALL LETTER EPSILON'))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u03b5' in
> position 0: ordinal not in range(128)
>
> With str being the Unicode type in py3k, this branch of the potential
> problem needs to be addressed clearly, whether by defining __import__
> as converting through ASCII, or by defining a useful semantic. If PEP
> 3131 is to be accepted, then it should probably address whether import
> will work on non-ASCII identifiers, and if so what the semantics are
> (if __import__ would otherwise limit to ASCII).
>
> I'm a little worried on the implementation side, because while on
> Windows it should be easy to use unicode file APIs, on Linux the
> filenames may or may be UTF-8 friendly.
>
> Michael
> --
> Michael Urman
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From 2007 at jmunch.dk  Sun May 13 21:10:09 2007
From: 2007 at jmunch.dk (Anders J. Munch)
Date: Sun, 13 May 2007 21:10:09 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
Message-ID: <46476291.6040502@jmunch.dk>

Collin Winter wrote:
 >So far, all I've seen from PEP 3131's supporters is a lot of
 > hollow assertions and idle theorizing: "Python will be easier to use
 > for people using non-ASCII character sets", "Python will be easier to
 > learn for those raised with non-Roman-influenced languages", etc, etc.
 > Until I see some kind of evidence, something to back up these claims,
 > I'm going to assume you're wrong.
 >

You haven't brought any hard evidence to the table yourself, so in
the absence of that, my anecdotal evidence trumps your pure
speculation ;-)

I've coded non-trivial stuff in three languages: Danglish, English and
Danish. Well, strictly speaking only the latter two are real
languages; Danglish is just a name for the way Danish programmers
typically write: a hodge-podge of Danish and English mixed with no
apparent system, always preferring whichever word springs to mind first,
switching to (bad) English whenever the Danish alternative would need
transliteration.  Or worse, switching to a different but less
appropriate Danish word that has the sole advantage of not needing
transliteration.

I've found that using my native Danish is the better option of the
three because, unsurprisingly, I am more productive using my
native language than a foreign language.  Do I really need to submit
proof of that?  Isn't it just obvious?

 > See, that's the thing I have yet to see addressed: there's been lot of
 > stress on "being able to write variable/class/method names in
 > Arabic/Mandarin/Hindi will make it easier for native speakers to
 > understand", but as far as I know, no-one has yet addressed how these
 > non-English identifiers will mesh with the existing English keywords
 > and English standard library functions.

They mesh *brilliantly*.  The different languages used mean that the
provenance of identifiers is intuitively available: English
identifiers mean std. lib. or 3rd party, native-language identifiers mean
in-house.  Very helpful - my heart goes out to the poor suffering
monolinguals who must do without this valuable code-reading aid.

+1 on PEP 3131.  

greetings-from-rainy-Denmark-ly y'rs, Anders


From santagada at gmail.com  Sun May 13 22:16:35 2007
From: santagada at gmail.com (Leonardo Santagada)
Date: Sun, 13 May 2007 17:16:35 -0300
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705131229n3ae005dfi6c6aa369cd90a17a@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
	<ca471dc20705131229n3ae005dfi6c6aa369cd90a17a@mail.gmail.com>
Message-ID: <46477223.2040105@gmail.com>

Guido van Rossum escreveu:
> In particular, AFAIK Java has allowed all Unicode letters in
> identifiers right from the start. I'd like to hear about descriptions
> of actual user experiences with this feature, in Java or in any other
> language that supports it. (*Are* there any others?) That would be far
> more valuable to me than any continued argumentation for or against
> the proposal.
> 
JavaScript also supports identifiers using any Unicode letter, but you
cannot use escape sequences as in Java (I don't really know whether Java
supports that, but the ECMAScript spec gave me this impression).


The thing is, living in Brazil (Latin-1 characters) I have never seen
any JavaScript or Java code using Unicode identifiers. I've seen them
using Unicode string literals, and that is supported by Python.

I still think that what children and people starting with
programming need is something more than identifiers: a complete system in
their language (someone told me in passing that the C++ standard
committee had a proposal on this kind of stuff).

To improve the support for Brazilian Portuguese in Python we need the
whole standard library and all strings to be Unicode... and that will be
solved by Python 3k. There were comments in the Brazilian community that
Unicode errors were by far the most common kind of error in any software
that needs string processing.

From tjreedy at udel.edu  Sun May 13 22:16:33 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 13 May 2007 16:16:33 -0400
Subject: [Python-3000] Support for PEP 3131
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com><4646A3CA.40705@acm.org>
	<4646FCAE.7090804@v.loewis.de>
Message-ID: <f27rmv$k1d$1@sea.gmane.org>


""Martin v. L?wis"" <martin at v.loewis.de> wrote in message 
news:4646FCAE.7090804 at v.loewis.de...
| That's a red herring.

That is how I felt when you dismissed my effort to make your proposal more 
useful and more acceptable to some (by addressing transliteration) with the 
little molehill problem that Norwegians and Germans disagree about o: 
(rotated 90 degrees).

So I shut up.

However, to me, one impetus to expanding the Python char set is the OLPC 
project.  For children to share programs across national boundaries, 
programs written in local chars will usually have to be transliterated to 
the one set of chars common to all versions of the laptop.

Terry Jan Reedy




From baptiste13 at altern.org  Sun May 13 23:12:31 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Sun, 13 May 2007 23:12:31 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <bb8868b90705130639p69b719e8h2f40863e44547106@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>	<4646A3CA.40705@acm.org>
	<bb8868b90705130639p69b719e8h2f40863e44547106@mail.gmail.com>
Message-ID: <f27v02$553$1@sea.gmane.org>

Jason Orendorff a écrit :
> 
> Python should allow foreign-language identifiers because (1) it's a
> gesture of good will to people everywhere who don't speak English
> fluently;  (2) some students will benefit;  (3) some people writing code
> that no one else will ever see will benefit.
> 
As I said in a previous post, these use cases would be well served by a
command-line switch. People who do not care about distributing their code
could just do
alias python='python -I'

On the other hand, people who want wider distribution would test without the
switch and easily check that all their identifiers are ASCII.

The default should be the best choice for the Python open source community,
that is, ASCII-only identifiers.

Cheers,
Baptiste


From brett at python.org  Mon May 14 00:58:52 2007
From: brett at python.org (Brett Cannon)
Date: Sun, 13 May 2007 15:58:52 -0700
Subject: [Python-3000] getting compiler package failures
Message-ID: <bbaeab100705131558v1e57afb6y2255646231896bd@mail.gmail.com>

I just did a ``make distclean`` on a clean checkout (r55300) and
test_compiler/test_transformer are failing:

  File "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
line 715, in atom
    return self._atom_dispatch[nodelist[0][0]](nodelist)
KeyError: 322

or

  File "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
line 776, in lookup_node
    return self._dispatch[node[0]]
KeyError: 331

or

  File "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
line 783, in com_node
    return self._dispatch[node[0]](node[1:])
KeyError: 339


I don't know the compiler package at all (which is why I am currently stuck
on Tony Lownds' PEP 3113 patch since I am getting a
compiler.transformer.WalkerError) so I have no clue how to go about fixing
this.  Anyone happen to know what may have caused the breakage?
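
One way to see which grammar symbols those KeyError numbers refer to is to
look them up in the symbol and token tables; the numbers are specific to the
grammar of the checkout that produced them, so no particular output is
claimed here.

    # Map the failing node numbers to grammar symbol (or token) names.
    import symbol
    import token

    def describe(code):
        return symbol.sym_name.get(code) or token.tok_name.get(code) or 'unknown'

    for code in (322, 331, 339):
        print('%d -> %s' % (code, describe(code)))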

-Brett

From gproux+py3000 at gmail.com  Mon May 14 01:17:25 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Mon, 14 May 2007 08:17:25 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705131229n3ae005dfi6c6aa369cd90a17a@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
	<ca471dc20705131229n3ae005dfi6c6aa369cd90a17a@mail.gmail.com>
Message-ID: <19dd68ba0705131617w4bbc45f3p833159a20552e909@mail.gmail.com>

Dear all,

On 5/14/07, Guido van Rossum <guido at python.org> wrote:
> of actual user experiences with this feature, in Java or in any other
> language that supports it. (*Are* there any others?) That would be far
> more valuable to me than any continued argumentation for or against
> the proposal.

Interestingly, this is *not* a well-known fact. I asked two
seasoned Java programmer friends of mine and they were *amazed* that
this is supported.
However, if you consider XML as a language, then there was plenty of
discussion in the past about the benefits of allowing Unicode
characters in tags.
 see e.g. http://lists.xml.org/archives/xml-dev/200107/msg00254.html


> I also note that there's no particular reason why this needs to be
> done exactly in 3.0. It's not backwards incompatible -- it could be

Once one realizes that this needs to be done, I would love to see
it introduced in 2.5 :)

> don't consider it an early design mistake to only require ASCII -- at
> the time it was the only sane thing to do and I'm far from convinced
> that it needs to change now.

I wish you would let us know precisely what keeps you from
accepting this change. After reading
http://www.python.org/doc/essays/foreword/ , I expected you would be
very open-minded to it, especially given what you say there about
"the emphasis on readability" and how you came up with Python as an
evolution of ABC, "a wonderful teaching language". I cannot fail
to see how such a change would be incredibly important for Python as a
"wonderful teaching language", especially for children in countries
where Latin characters are really foreign.

Regards,

Guillaume

From guido at python.org  Mon May 14 02:15:24 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 13 May 2007 17:15:24 -0700
Subject: [Python-3000] getting compiler package failures
In-Reply-To: <bbaeab100705131558v1e57afb6y2255646231896bd@mail.gmail.com>
References: <bbaeab100705131558v1e57afb6y2255646231896bd@mail.gmail.com>
Message-ID: <ca471dc20705131715qc53a622x4ca6a6bc25ba0c22@mail.gmail.com>

test_compiler and test_transformer have been broken for a couple of
months now I believe.

Unless someone comes to the rescue of the compiler package soon, I'm
tempted to remove it from the p3yk branch -- it doesn't seem to serve
any particularly good purpose, especially now that the AST used by the
compiler written in C is exportable.

--Guido

On 5/13/07, Brett Cannon <brett at python.org> wrote:
> I just did a ``make distclean`` on a clean checkout (r55300) and
> test_compiler/test_transformer are failing:
>
>   File
> "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> line 715, in atom
>      return self._atom_dispatch[nodelist[0][0]](nodelist)
> KeyError: 322
>
> or
>
>   File
> "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> line 776, in lookup_node
>     return self._dispatch[node[0]]
> KeyError: 331
>
> or
>
>   File
> "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> line 783, in com_node
>     return self._dispatch[node[0]](node[1:])
> KeyError: 339
>
>
> I don't know the compiler package at all (which is why I am currently stuck
> on Tony Lownds' PEP 3113 patch since I am getting a
> compiler.transformer.WalkerError) so I have no clue how to
> go about fixing this.  Anyone happen to know what may have caused the
> breakage?
>
> -Brett
>
>
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From brett at python.org  Mon May 14 02:17:24 2007
From: brett at python.org (Brett Cannon)
Date: Sun, 13 May 2007 17:17:24 -0700
Subject: [Python-3000] getting compiler package failures
In-Reply-To: <ca471dc20705131715qc53a622x4ca6a6bc25ba0c22@mail.gmail.com>
References: <bbaeab100705131558v1e57afb6y2255646231896bd@mail.gmail.com>
	<ca471dc20705131715qc53a622x4ca6a6bc25ba0c22@mail.gmail.com>
Message-ID: <bbaeab100705131717o766daad0q927512aa57ed4aef@mail.gmail.com>

On 5/13/07, Guido van Rossum <guido at python.org> wrote:
>
> test_compiler and test_transformer have been broken for a couple of
> months now I believe.
>
> Unless someone comes to the rescue of the compiler package soon, I'm
> tempted to remove it from the p3yk branch -- it doesn't seem to serve
> any particularly good purpose, especially now that the AST used by the
> compiler written in C is exportable.



+1000 from me.  I was thinking of suggesting this, but I forgot to put it in
the email.

-Brett

From greg.ewing at canterbury.ac.nz  Mon May 14 02:32:04 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 14 May 2007 12:32:04 +1200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4646B6A1.7060007@club-internet.fr>
References: <4646B6A1.7060007@club-internet.fr>
Message-ID: <4647AE04.1040207@canterbury.ac.nz>

Pierre Hanser wrote:

> In english, most of the time, adding 'ed' to the verb will do
> the difference: change -> changed
> 
>> in french:   change -> changé   (ends with 'eacute')

Fine if the reader understands French, but if you
later want to translate this program so that a
non-French speaker can read it, what would you
do?

--
Greg

From guido at python.org  Mon May 14 02:39:54 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 13 May 2007 17:39:54 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705131617w4bbc45f3p833159a20552e909@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
	<ca471dc20705131229n3ae005dfi6c6aa369cd90a17a@mail.gmail.com>
	<19dd68ba0705131617w4bbc45f3p833159a20552e909@mail.gmail.com>
Message-ID: <ca471dc20705131739n318782d3k9198a8a5180fa5ef@mail.gmail.com>

On 5/13/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> On 5/14/07, Guido van Rossum <guido at python.org> wrote:
> > of actual user experiences with this feature, in Java or in any other
> > language that supports it. (*Are* there any others?) That would be far
> > more valuable to me than any continued argumentation for or against
> > the proposal.
>
> Interestingly, this is *not* a well known fact. I have asked 2
> friend-of-mine seasoned Java programmers and they were *amazed* that
> this is supported.

Well, maybe we should add it to Python as a secret feature. :-) :-) :-)

> However, if you consider XML as a language, then there was plenty of
> discussion in the past talking about the benefits of allowing unicode
> characters in tags.
>  see e.g. http://lists.xml.org/archives/xml-dev/200107/msg00254.html

I imagine the situation there is sufficiently different though; XML is
data, not code.

> > I also note that there's no particular reason why this needs to be
> > done exactly in 3.0. It's not backwards incompatible -- it could be
>
> As one realizes that this needs to be done, then I would love to see
> that introduced in 2.5 :)

I realize you've added a smiley, but please, don't propose new
features for a release that's already been released. The release
managers will put you in jail and not let you out until 4.0 has been
released. :-)

> > don't consider it an early design mistake to only require ASCII -- at
> > the time it was the only sane thing to do and I'm far from convinced
> > that it needs to change now.
>
> I wish you would be able to let us know precisely what hinders you in
> accepting this change.

Because most people still use systems that have very inadequate tools
for handling non-ASCII text, especially non-Latin-1 text. For example,
at work I use Ubuntu, a modern Linux distribution actively supported
by a company headquartered in South-Africa. Their main market lies
outside Europe and North America. And yet, there is no standard way to
enter non-ASCII characters as basic as c-cedilla or u-umlaut; the main
tools I use (Emacs, Firefox and bash running in a terminal emulator)
all have different input methods, different ideas of the default
character encoding, and so on. It's a crapshoot whether
copy-and-pasting even the simplest non-ASCII text (like the name of
PEP 3131's author :-) between any two of these will work.

I see program code as a tool for communication between people. Note
how you & I are using English in this thread even though it is not the
mother tongue for either of us. So we use English, since we can both
read and write it reasonably well. This is the *only* way that
programmers raised in different countries can exchange code at all.
(It may change if at some point in the future computer translation
gets 1000x better, but we're not there yet -- try translate.google.com
if you don't believe me.)

Now, you may disagree with me on the conclusion even if you agree on
the premises. But you asked for my motivation, and this is it.

> After reading the following document,
> http://www.python.org/doc/essays/foreword/ , I expected you would be
> very open-minded to that change especially when you are talking about
> "the emphasis on readibility" and how you came up with Python as an
> evolution of ABC "a wonderful teaching language" because I can't fail
> to see how such change would be incredibly important for Python as a
> "wonderful teaching language" especially towards children in countries
> where latin characters are really foreign.

You're stretching my words there. The issue of translation hadn't
crossed my mind when I wrote that (over 10 years ago) and the tools
*really* weren't ready then. And regarding readability, if all the
programmers in the world agreed to use broken English, the readability
of their code to each other would be much better dan als we allemaal
in onze eigen taal schreven (than if we all wrote in our own language).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From arvind1.singh at gmail.com  Mon May 14 02:41:49 2007
From: arvind1.singh at gmail.com (Arvind Singh)
Date: Mon, 14 May 2007 06:11:49 +0530
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46476291.6040502@jmunch.dk>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
	<46476291.6040502@jmunch.dk>
Message-ID: <c19402930705131741n7149073w3f531a6775873ffa@mail.gmail.com>

On 5/14/07, Anders J. Munch <2007 at jmunch.dk> wrote:

> You haven't brought any hard evidence to the table yourself, so in
> the absense of that, my anecdotal evidence trumps your pure
> speculation ;-)


Fact: Younger brains learn new concepts (and languages) faster than older
ones.
Argument: To be part of the "international" programming community (or to be
a *real* programmer), one has to learn English anyway; why help anyone
develop a habit which he/she will have to discard later? [Indians usually
deal with 3 languages in their childhood (English, Hindi, Sanskrit/local
language).]


I've coded non-trivial stuff in three languages: Danglish, English and
> Danish. Well, strictly speaking only the latter two are real
> langauages; Danglish is just a name for way Danish programmers
> typically write: A hodge-podge of Danish and English mixed with no
> apparent system, ever preferring whichever word springs to mind first,
> switching to (bad) English whenever the Danish alternative would need
> transliteration.  Or worse, switching to a different but less
> appropriate Danish word that has the sole advantage of not needing
> transliteration.


This PEP talks about support for *identifiers*. If you need an *extensive*
vocabulary for your *identifiers*, I'd assume that you're coding something
non-trivial (with ignorable exceptions). Such non-trivial code should be
sharable in a _common_ language that *others* can understand as well,
IMHO.

Further, if you are doing something non-trivial, I can also assume that
you'd be using third-party libraries. How would the code look if identifiers
were written in various encodings?


I've found that using my native Danish is the better option of the
> three because, unsurprisingly, I am are more productive using my
> native language than a foreign language.  Do I really need to submit
> proof for that?  Isn't that just obvious?


Not so obvious to me, actually. Ask any good user-interface designer: humans
aren't (generally; I see you as a "gifted" exception :-) ) good with
"modal" interfaces. The more "modes" one has to shift among, the lower the
productivity, in general. Maybe you feel more productive because of lengthy
"modes" or long pieces of code (i.e., looong functions): not a good
programming practice, as I've been taught.


They mesh *brilliantly*.  The different languages used means that the
> provenance of identifiers is intuitively available: English
> identifiers means std. lib. or 3rd party, native language means
> in-house.  Very helpful - my heart goes out to the poor suffering
> monolinguists who must do without this valuable code reading aid.


Since Hindi was mentioned, I'd like to say: Don't even think about it!


+1 on PEP 3131.


Without knowing whether I have a say or not:

-1 on this PEP

Regards,
Arvind

-- 
There should be one-- and preferably only one --obvious way to choose your
*identifiers*.

From greg.ewing at canterbury.ac.nz  Mon May 14 02:46:23 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 14 May 2007 12:46:23 +1200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4646FCAE.7090804@v.loewis.de>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
Message-ID: <4647B15F.7040700@canterbury.ac.nz>

Martin v. Löwis wrote:
> There are other, more serious cases of presentation ambiguity
> (e.g. tabs vs. spaces), yet nobody suggests to ban tabs from the
> language for that reason.

But we *have* suggested banning mixed tabs and spaces
(rather than just recommending against it), which is
something that can be automatically verified.

I don't think this scenario is all that unlikely. A
program is initially written by a Russian programmer
who uses his own version of "a" as a variable name.
Later an English-speaking programmer makes some
changes, and uses an ascii "a". Now there are two
subtly different variables called "a" in different
parts of the program.
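
A small sketch of that scenario, together with the kind of crude
per-character check that would catch it:

    import unicodedata

    ascii_a    = u'a'
    cyrillic_a = u'\u0430'   # CYRILLIC SMALL LETTER A, usually rendered like 'a'

    print(ascii_a == cyrillic_a)            # False
    print(unicodedata.name(ascii_a))        # LATIN SMALL LETTER A
    print(unicodedata.name(cyrillic_a))     # CYRILLIC SMALL LETTER A

    # A crude audit a project could run over its sources to enforce an
    # "ASCII-only identifiers" style rule: flag every non-ASCII character.
    def non_ascii_chars(source_text):
        return [(i, c, unicodedata.name(c, 'UNKNOWN'))
                for i, c in enumerate(source_text) if ord(c) > 127]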

--
Greg

From gproux+py3000 at gmail.com  Mon May 14 03:09:29 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Mon, 14 May 2007 10:09:29 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705131739n318782d3k9198a8a5180fa5ef@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
	<ca471dc20705131229n3ae005dfi6c6aa369cd90a17a@mail.gmail.com>
	<19dd68ba0705131617w4bbc45f3p833159a20552e909@mail.gmail.com>
	<ca471dc20705131739n318782d3k9198a8a5180fa5ef@mail.gmail.com>
Message-ID: <19dd68ba0705131809i11cd0b82pfc226bd81dd16e99@mail.gmail.com>

Hello,

> > Interestingly, this is *not* a well known fact. I have asked 2
> > friend-of-mine seasoned Java programmers and they were *amazed* that
> > this is supported.
> Well, maybe we should add it to Python as a secret feature. :-) :-) :-)

But they also said that:
1) they wish they had known earlier...
2) they would start using it immediately for their own small projects

> >  see e.g. http://lists.xml.org/archives/xml-dev/200107/msg00254.html
> I imagine the situation there is sufficiently different though; XML is
> data, not code.

I wish you had enough time to read some of the posts linked from the
above URL. In particular, you can see the viewpoint of some Japanese
people on the ability for them to describe data structures (which is
really a programming concept) in their own words.


> I realize you've added a smiley, but please, don't propose new
> features for a release that's already been released. The release
> managers will put you in jail and not let you out until 4.0 has been
> released. :-)

eheheheh :)

> Because most people still use systems that have very inadequate tools
> for handling non-ASCII text, especially non-Latin-1 text. For example,
> at work I use Ubuntu, a modern Linux distribution actively supported
> by a company headquartered in South-Africa. Their main market lies
> outside Europe and North America. And yet, there is no standard way to
> enter non-ASCII characters as basic as c-cedilla or u-umlaut; the main

I also use Ubuntu at home.
Regarding your issue: hmm? You can change the keyboard layout (I even
think it affects the current input system immediately). Also there
are a number of tools, like gucharmap
(http://gucharmap.sourceforge.net/shots/shot-003.png), that let you
copy and paste rare characters.

> tools I use (Emacs, Firefox and bash running in a terminal emulator)
> all have different input methods, different ideas of the default
> character encoding, and so on. It's a crapshoot whether
> copy-and-pasting even the simplest non-ASCII text (like the name of
> PEP 3131's author :-) between any two of these will work.

Ubuntu Feisty (and I think Edgy too) defaults to UTF-8 everywhere and I
have never had any issue using French, Japanese and English anywhere.
Windows reached this maturity point about 5-6 years ago.

> I see program code as a tool for communication between people. Note
> how you & I are using English in this thread even though it is not the
> mother tongue for either of us. So we use English, since we can both
> read and write it reasonably well. This is the *only* way that
> programmers raised in different countries can exchange code at all.

I *totally* agree with you: you sometimes need to go down to the
lowest common denominator (tongue in cheek)... But I still do not
understand why you are not happy to see people become more productive
with Python when there is no need for international exchange: the small
(or large) internal application,  the throw-away script, the ability
to extend C programs with a scripting language that is respectful of
the native language of the (mostly non-programmer) user, etc...

> gets 1000x better, but we're not there yet -- try translate.google.com
> if you don't believe me.)

I hope you get bonus points at work for mentioning this one. Believe
it or not, translate.google.com is my friend!


> You're stretching my words there. The issue if translation hadn't

Clearly you could not have thought of this issue then, but I am not
stretching your words. I was just reusing some of the *strong* points you
made about why you thought Python was such a great invention of yours (and
don't get me wrong, we all love it!). I was just applying those great points
to this new issue, which I believe fully deserves more attention.

> crossed my mind when I wrote that (over 10 years ago) and the tools
> *really* weren't ready then. And regarding readability, if all the

The tools are ready now. We live in a mostly Unicode world now,
and we just agreed in another PEP that the default source encoding of
files will be UTF-8...

> programmers in the world agreed to use broken English, the readability
> of their code to each other would be much better dan als we allemaal
> in onze eigen taal schreven.

The funny thing is that I can read this sentence very well: my life
was spent surrounded by Latin characters. I can probably even
understand it, as I can speak some German too.
allesmaal -> Jedesmal -> always
onze -> eine -> its
eigen -> eigen -> own
taal ->  sprache -> language
schreven -> schreiben -> write

My cultural background can help me decipher VERY QUICKLY what you
wrote. But think of a 7-year-old Japanese child. They are not
really taught Latin characters before they seriously start learning
English... yet that is the age at which I started programming (by copying
French listings of programs for Thomson TO7-70 computers... oh my
god!).

Regards,

Guillaume

From guido at python.org  Mon May 14 03:51:40 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 13 May 2007 18:51:40 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705131809i11cd0b82pfc226bd81dd16e99@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
	<ca471dc20705131229n3ae005dfi6c6aa369cd90a17a@mail.gmail.com>
	<19dd68ba0705131617w4bbc45f3p833159a20552e909@mail.gmail.com>
	<ca471dc20705131739n318782d3k9198a8a5180fa5ef@mail.gmail.com>
	<19dd68ba0705131809i11cd0b82pfc226bd81dd16e99@mail.gmail.com>
Message-ID: <ca471dc20705131851ldda4e84x55cbec487fe8771f@mail.gmail.com>

I respectfully disagree with the conclusion you draw from the same
data. I don't think either of us can say anything that will satisfy
the other.

--Guido

On 5/13/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> Hello,
>
> > > Interestingly, this is *not* a well known fact. I have asked 2
> > > friend-of-mine seasoned Java programmers and they were *amazed* that
> > > this is supported.
> > Well, maybe we should add it to Python as a secret feature. :-) :-) :-)
>
> But they also said that:
> 1) they wish they would have known earlier...
> 2) would start using this immediatly for their own small projects
>
> > >  see e.g. http://lists.xml.org/archives/xml-dev/200107/msg00254.html
> > I imagine the situation there is sufficiently different though; XML is
> > data, not code.
>
> I wish you had enough time to read some of the posts linked from the
> above URL. In particular, you can see the viewpoint of some Japanese
> people on the ability for them to describe data structures (which is
> really a programming concept) in their own words.
>
>
> > I realize you've added a smiley, but please, don't propose new
> > features for a release that's already been released. The release
> > managers will put you in jail and not let you out until 4.0 has been
> > released. :-)
>
> eheheheh :)
>
> > Because most people still use systems that have very inadequate tools
> > for handling non-ASCII text, especially non-Latin-1 text. For example,
> > at work I use Ubuntu, a modern Linux distribution actively supported
> > by a company headquartered in South-Africa. Their main market lies
> > outside Europe and North America. And yet, there is no standard way to
> > enter non-ASCII characters as basic as c-cedilla or u-umlaut; the main
>
> I also use Ubuntu at home.
> Regarding your issue: hum? you can change keyboard layout (I even
> think it does affect the current input system immediatly). Also there
> is a number of tools like gucharmap
> (http://gucharmap.sourceforge.net/shots/shot-003.png) that enables you
> to copy paste rare characters.
>
> > tools I use (Emacs, Firefox and bash running in a terminal emulator)
> > all have different input methods, different ideas of the default
> > character encoding, and so on. It's a crapshoot whether
> > copy-and-pasting even the simplest non-ASCII text (like the name of
> > PEP 3131's author :-) between any two of these will work.
>
> Ubuntu Feisty (and I think Edgy too) default on UTF8 everywhere and I
> have never had any issue using French, Japanese and English anywhere.
> Windows came to this maturity point about 5-6 years ago.
>
> > I see program code as a tool for communication between people. Note
> > how you & I are using English in this thread even though it is not the
> > mother tongue for either of us. So we use English, since we can both
> > read and write it reasonably well. This is the *only* way that
> > programmers raised in different countries can exchange code at all.
>
> I *totally* agree with you, you sometimes need to go down to the
> lowest common denominator (with tongue in cheek)... But I still do not
> understand that you are not happy to see people become more productive
> with Python when there is no need of international exchange: the small
> (or large) internal application,  the throw-away script, the ability
> to extend C programs with a scripting language that is respectful of
> the native language of the (mostly-non programmer) user etc...
>
> > gets 1000x better, but we're not there yet -- try translate.google.com
> > if you don't believe me.)
>
> I hope you get bonus points at work for mentioning this one. Believe
> it or not, translate.google.com is my friend!
>
>
> > You're stretching my words there. The issue if translation hadn't
>
> Clearly you could not think of this issue, but I am not stretching
> your word. I was just reusing some of the *strong* points you made why
> you thought Python was such a great invention of yours (and don't get
> me wrong, we all love it!). I was just applying those great points to
> this new issue which I believe fully deserve more attention.
>
> > crossed my mind when I wrote that (over 10 years ago) and the tools
> > *really* weren't ready then. And regarding readability, if all the
>
> The tools are ready now. We live in a mostly fully unicode world now,
> and we just agreed in another PEP that the default source encoding of
> files will be UTF8...
>
> > programmers in the world agreed to use broken English, the readability
> > of their code to each other would be much better dan als we allemaal
> > in onze eigen taal schreven.
>
> The funny thing is that I can read this sentence very well: my life
> was spent surrounded by latin characters. I can even probably
> understand it as I can speak some German too.
> allesmaal -> Jedesmal -> always
> onze -> eine -> its
> eigen -> eigen -> own
> taal ->  sprache -> language
> schreven -> schreiben -> write
>
> My cultural background can help me decipher VERY QUICKLY what you
> wrote. But think of the 7 years old Japanese child. They are not
> taught latin characters really before they will seriously learn
> English... but this is the year I started programming (by copying
> french listing of programs for Thomson TO7-70 computers... oh my
> god!).
>
> Regards,
>
> Guillaume
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From murman at gmail.com  Mon May 14 05:03:26 2007
From: murman at gmail.com (Michael Urman)
Date: Sun, 13 May 2007 22:03:26 -0500
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>
	<ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>
Message-ID: <dcbbbb410705132003k73cf08c3hf7f0b99e2e1b9895@mail.gmail.com>

On 5/13/07, Guido van Rossum <guido at python.org> wrote:
> The answer to all of this is the filesystem encoding, which is already
> supported. Doesn't appear particularly difficult to me.

Okay, that's fair. It seems reasonable to accept the limitations of
following the filesystem encoding for module names. I should probably
test py3k to make sure it already has updated __import__ to use the
filesystem encoding instead of the default encoding, but instead I'll
just feebly imply the question here.

Further thoughts related to this lead me to ask if there is to be only
the version of open() which takes a unicode string, or if there will
also be the opportunity to pass a byte string which doesn't pass
through the encoding. It's far too common for Linux users to have
files named with different encodings than their environment suggests.
If it's only possible to open files whose names can be decoded via the
filesystem encoding, I foresee several unhappy end-user experiences.
-- 
Michael Urman

From martin at v.loewis.de  Mon May 14 07:12:57 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 14 May 2007 07:12:57 +0200
Subject: [Python-3000] PEP 3123 (Was: PEP Parade)
In-Reply-To: <ca471dc20705131215l4ac14cf8je421e99dcaf30cd@mail.gmail.com>
References: <ca471dc20705011131m5848a04ct55b6bce5174248d5@mail.gmail.com>	
	<464720E8.3040402@v.loewis.de>
	<ca471dc20705131215l4ac14cf8je421e99dcaf30cd@mail.gmail.com>
Message-ID: <4647EFD9.9010001@v.loewis.de>

Guido van Rossum schrieb:
> I'm okay with applying to 2.6 and then merging into 3.0.  ISTM though
> that backporting this to 2.5 would cause the release manager to throw
> a fit, so I think that's not worth it. What would be the benefit
> anyway?

I think you misunderstood. If this is applied to 2.6, then (independent)
bug fixes that get applied to 2.6 will be more difficult to backport,
if the change falls into a region where ->ob_type was used, as the
patch might fail to apply.

Regards,
Martin

From martin at v.loewis.de  Mon May 14 07:24:59 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 14 May 2007 07:24:59 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4647B15F.7040700@canterbury.ac.nz>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
	<4647B15F.7040700@canterbury.ac.nz>
Message-ID: <4647F2AB.3060406@v.loewis.de>

> I don't think this scenario is all that unlikely. A
> program is initially written by a Russian programmer
> who uses his own version of "a" as a variable name.
> Later an English-speaking programmer makes some
> changes, and uses an ascii "a". Now there are two
> subtly different variables called "a" in different
> parts of the program.

If they work in the same project, they will have a coding
style that says "ASCII-only identifiers".

Also, if the change is in different parts of the program,
there won't be a common variable called "a". When was the
last time you called a variable 'a'? I hope it was a local
variable; if you use 'a' for class or method names, or
global variables, you have bigger problems than
typographical ones.

Regards,
Martin


From hanser at club-internet.fr  Mon May 14 07:33:12 2007
From: hanser at club-internet.fr (Pierre Hanser)
Date: Mon, 14 May 2007 07:33:12 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4647AE04.1040207@canterbury.ac.nz>
References: <4646B6A1.7060007@club-internet.fr>
	<4647AE04.1040207@canterbury.ac.nz>
Message-ID: <4647F498.90702@club-internet.fr>

Greg Ewing a écrit :
> Pierre Hanser wrote:
> 
>> In english, most of the time, adding 'ed' to the verb will do
>> the difference: change -> changed
>>
> >> in french:   change -> changé   (ends with 'eacute')
> 
> Fine if the reader understands French, but if you
> later want to translate this program so that a
> non-French speaker can read it, what would you
> do?

I will translate it, but I don't want to have to speak
English for my personal homework. That's all. And that
should be enough.

Currently, at home, my choice is poor, degenerate French,
or English. What a choice!
-
	Pierre

From collinw at gmail.com  Mon May 14 07:36:55 2007
From: collinw at gmail.com (Collin Winter)
Date: Sun, 13 May 2007 22:36:55 -0700
Subject: [Python-3000] PEP 3133: Introducing Roles
Message-ID: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>

PEP: 3133
Title: Introducing Roles
Version: $Revision$
Last-Modified: $Date$
Author: Collin Winter <collinw at gmail.com>
Status: Draft
Type: Standards Track
Requires: 3115, 3129
Content-Type: text/x-rst
Created: 1-May-2007
Python-Version: 3.0
Post-History: 13-May-2007


Abstract
========

Python's existing object model organizes objects according to their
implementation.  It is often desirable -- especially in a
duck typing-based language like Python -- to organize objects by
the part they play in a larger system (their intent), rather than by
how they fulfill that part (their implementation).  This PEP
introduces the concept of roles, a mechanism for organizing
objects according to their intent rather than their implementation.


Rationale
=========

In the beginning were objects.  They allowed programmers to marry
function and state, and to increase code reusability through concepts
like polymorphism and inheritance, and lo, it was good.  There came
a time, however, when inheritance and polymorphism weren't enough.
With the invention of both dogs and trees, we were no longer able to
be content with knowing merely, "Does it understand 'bark'?"
We now needed to know what a given object thought that "bark" meant.

One solution, the one detailed here, is that of roles, a mechanism
orthogonal and complementary to the traditional class/instance system.
Whereas classes concern themselves with state and implementation, the
roles mechanism deals exclusively with the behaviours embodied in a
given class.

This system was originally called "traits" and implemented for Squeak
Smalltalk [#traits-paper]_.  It has since been adapted for use in
Perl 6 [#perl6-s12]_ where it is called "roles", and it is primarily
from there that the concept is now being interpreted for Python 3.
Python 3 will preserve the name "roles".

In a nutshell: roles tell you *what* an object does, classes tell you
*how* an object does it.

In this PEP, I will outline a system for Python 3 that will make it
possible to easily determine whether a given object's understanding
of "bark" is tree-like or dog-like.  (There might also be more
serious examples.)


A Note on Syntax
----------------

The syntax proposals in this PEP are tentative and should be
considered strawmen.  The necessary bits that this PEP depends
on -- namely PEP 3115's class definition syntax and PEP 3129's class
decorators -- are still being formalized and may change.  Function
names will, of course, be subject to lengthy bikeshedding debates.


Performing Your Role
====================

Static Role Assignment
----------------------

Let's start out by defining ``Tree`` and ``Dog`` classes ::

  class Tree(Vegetable):

    def bark(self):
      return self.is_rough()


  class Dog(Animal):

    def bark(self):
      return self.goes_ruff()

While both implement a ``bark()`` method with the same signature,
they do wildly different things.  We need some way of differentiating
what we're expecting. Relying on inheritance and a simple
``isinstance()`` test will limit code reuse and/or force any dog-like
classes to inherit from ``Dog``, whether or not that makes sense.
Let's see if roles can help. ::

  @perform_role(Doglike)
  class Dog(Animal):
    ...

  @perform_role(Treelike)
  class Tree(Vegetable):
    ...

  @perform_role(SitThere)
  class Rock(Mineral):
    ...

We use class decorators from PEP 3129 to associate a particular role
or roles with a class.  Client code can now verify that an incoming
object performs the ``Doglike`` role, allowing it to handle ``Wolf``,
``LaughingHyena`` and ``Aibo`` [#aibo]_ instances, too.

Roles can be composed via normal inheritance: ::

  @perform_role(Guard, MummysLittleDarling)
  class GermanShepherd(Dog):

    def guard(self, the_precious):
      while True:
        if intruder_near(the_precious):
          self.growl()

    def get_petted(self):
      self.swallow_pride()

Here, ``GermanShepherd`` instances perform three roles: ``Guard`` and
``MummysLittleDarling`` are applied directly, whereas ``Doglike``
is inherited from ``Dog``.


Assigning Roles at Runtime
--------------------------

Roles can be assigned at runtime, too, by unpacking the syntactic
sugar provided by decorators.

Say we import a ``Robot`` class from another module, and since we
know that ``Robot`` already implements our ``Guard`` interface,
we'd like it to play nicely with guard-related code, too. ::

  >>> perform_role(Guard)(Robot)

This takes effect immediately and impacts all instances of ``Robot``.


Asking Questions About Roles
----------------------------

Even though we've told our robot army that they're guards, we'd
like to check in on them occasionally and make sure they're still at
their task. ::

  >>> performs(our_robot, Guard)
  True

What about that one robot over there? ::

  >>> performs(that_robot_over_there, Guard)
  True

The ``performs()`` function is used to ask if a given object
fulfills a given role.  It cannot be used, however, to ask a
class if its instances fulfill a role: ::

  >>> performs(Robot, Guard)
  False

This is because the ``Robot`` class is not interchangeable
with a ``Robot`` instance.


Defining New Roles
==================

Empty Roles
-----------

Roles are defined like a normal class, but use the ``Role``
metaclass. ::

  class Doglike(metaclass=Role):
    ...

Metaclasses are used to indicate that ``Doglike`` is a ``Role`` in
the same way 5 is an ``int`` and ``tuple`` is a ``type``.


Composing Roles via Inheritance
-------------------------------

Roles may inherit from other roles; this has the effect of composing
them.  Here, instances of ``Dog`` will perform both the
``Doglike`` and ``FourLegs`` roles. ::

  class FourLegs(metaclass=Role):
    pass

  class Doglike(FourLegs, Carnivor):
    pass

  @perform_role(Doglike)
  class Dog(Mammal):
    pass


Requiring Concrete Methods
--------------------------

So far we've only defined empty roles -- not very useful things.
Let's now require that all classes that claim to fulfill the
``Doglike`` role define a ``bark()`` method: ::

  class Doglike(FourLegs):

    def bark(self):
      pass

No decorators are required to flag the method as "abstract", and the
method will never be called, meaning whatever code it contains (if any)
is irrelevant.  Roles provide *only* abstract methods; concrete
default implementations are left to other, better-suited mechanisms
like mixins.

Once you have defined a role, and a class has claimed to perform that
role, it is essential that that claim be verified.  Here, the
programmer has misspelled one of the methods required by the role. ::

  @perform_role(FourLegs)
  class Horse(Mammal):

    def run_like_teh_wind(self):
      ...

This will cause the role system to raise an exception, complaining
that you're missing a ``run_like_the_wind()`` method.  The role
system carries out these checks as soon as a class is flagged as
performing a given role.

Concrete methods are required to match exactly the signature demanded
by the role.  Here, we've attempted to fulfill our role by defining a
concrete version of ``bark()``, but we've missed the mark a bit. ::

  @perform_role(Doglike)
  class Coyote(Mammal):

    def bark(self, target=moon):
      pass

This method's signature doesn't match exactly with what the
``Doglike`` role was expecting, so the role system will throw a bit
of a tantrum.
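
One purely illustrative way to carry out these checks -- this is not the
reference implementation, and the use of ``inspect.signature`` (Python
3.3+) for the signature comparison is an assumption of this sketch rather
than part of the proposal -- is a small metaclass plus class decorator: ::

  import inspect

  class Role(type):
    """Toy metaclass marking a class as a role."""

    def _required_methods(cls):
      # Every function defined on the role (or on a role it inherits
      # from) is a required method.
      required = {}
      for klass in cls.__mro__:
        if klass is object:
          continue
        for name, value in vars(klass).items():
          if inspect.isfunction(value):
            required.setdefault(name, value)
      return required

  def perform_role(*roles):
    """Class decorator: verify and record that a class performs roles."""
    def decorate(cls):
      for role in roles:
        for name, template in role._required_methods().items():
          impl = getattr(cls, name, None)
          if impl is None:
            raise TypeError("%s is missing %s(), required by role %s"
                            % (cls.__name__, name, role.__name__))
          if inspect.signature(impl) != inspect.signature(template):
            raise TypeError("%s.%s() does not match the signature "
                            "required by role %s"
                            % (cls.__name__, name, role.__name__))
      cls.__roles__ = frozenset(roles)
      return cls
    return decorate

  def performs(obj, role_or_roles):
    """Ask whether the *instance* obj fulfills a role (or any role out
    of a container of roles)."""
    if isinstance(obj, type):
      return False                      # a class is not its instances
    performed = set()
    for klass in type(obj).__mro__:
      for role in klass.__dict__.get('__roles__', ()):
        # A role composed from other roles implies those roles too.
        performed.update(r for r in role.__mro__ if isinstance(r, Role))
    if isinstance(role_or_roles, Role):
      return role_or_roles in performed
    return any(r in role_or_roles for r in performed)

  # Tiny demonstration:
  class FourLegs(metaclass=Role):
    def walk(self):
      pass

  @perform_role(FourLegs)
  class Horse:
    def walk(self):
      return "clop"

  assert performs(Horse(), FourLegs)
  assert not performs(Horse, FourLegs)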


Mechanism
=========

The following are strawman proposals for how roles might be expressed
in Python.  The examples here are phrased in a way that the roles
mechanism may be implemented without changing the Python interpreter.
(Examples adapted from an article on Perl 6 roles by Curtis Poe
[#roles-examples]_.)

1. Static class role assignment ::

     @perform_role(Thieving)
     class Elf(Character):
       ...

   ``perform_role()`` accepts multiple arguments, such that this is
   also legal: ::

     @perform_role(Thieving, Spying, Archer)
     class Elf(Character):
       ...

   The ``Elf`` class now performs the ``Thieving``, ``Spying``,
   and ``Archer`` roles.

2. Querying instances ::

     if performs(my_elf, Thieving):
       ...

   The second argument to ``performs()`` may also be anything with a
   ``__contains__()`` method, meaning the following is legal: ::

     if performs(my_elf, set([Thieving, Spying, BoyScout])):
       ...

   Like ``isinstance()``, the object needs only to perform a single
   role out of the set in order for the expression to be true.


Relationship to Abstract Base Classes
=====================================

Early drafts of this PEP [#proposal]_ envisioned roles as competing
with the abstract base classes proposed in PEP 3119.  After further
discussion and deliberation, a compromise and a delegation of
responsibilities and use-cases has been worked out as follows:

* Roles provide a way of indicating an object's semantics and abstract
  capabilities.  A role may define abstract methods, but only as a
  way of delineating an interface through which a particular set of
  semantics are accessed.  An ``Ordering`` role might require that
  some set of ordering operators  be defined. ::

    class Ordering(metaclass=Role):
      def __ge__(self, other):
        pass

      def __le__(self, other):
        pass

      def __ne__(self, other):
        pass

      # ...and so on

  In this way, we're able to indicate an object's role or function
  within a larger system without constraining or concerning ourselves
  with a particular implementation.

* Abstract base classes, by contrast, are a way of reusing common,
  discrete units of implementation.  For example, one might define an
  ``OrderingMixin`` that implements several ordering operators in
  terms of other operators. ::

    class OrderingMixin:
      def __ge__(self, other):
        return self > other or self == other

      def __le__(self, other):
        return self < other or self == other

      def __ne__(self, other):
        return not self == other

      # ...and so on

  Using this abstract base class - more properly, a concrete
  mixin - allows a programmer to define a limited set of operators
  and let the mixin in effect "derive" the others.

By combining these two orthogonal systems, we're able to both
a) provide functionality, and b) alert consumer systems to the
presence and availability of this functionality.  For example,
since the ``OrderingMixin`` class above satisfies the interface
and semantics expressed in the ``Ordering`` role, we say the mixin
performs the role: ::

  @perform_role(Ordering)
  class OrderingMixin:
    def __ge__(self, other):
      return self > other or self == other

    def __le__(self, other):
      return self < other or self == other

    def __ne__(self, other):
      return not self == other

    # ...and so on

Now, any class that uses the mixin will automatically -- that is,
without further programmer effort -- be tagged as performing the
``Ordering`` role.
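
Continuing the toy sketch given under "Requiring Concrete Methods", the
automatic tagging falls out of ordinary attribute inheritance: the role
record set by ``perform_role()`` lives on ``OrderingMixin`` and is picked
up when ``performs()`` walks a subclass's MRO. ::

  class MyNumber(OrderingMixin):   # no @perform_role() needed here
    ...

  assert performs(MyNumber(), Ordering)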

The separation of concerns into two distinct, orthogonal systems
is desirable because it allows us to use each one separately.
Take, for example, a third-party package providing a
``RecursiveHash`` role that indicates a container takes its
contents into account when determining its hash value.  Since
Python's built-in ``tuple`` and ``frozenset`` classes follow this
semantic, the ``RecursiveHash`` role can be applied to them. ::

  >>> perform_role(RecursiveHash)(tuple)
  >>> perform_role(RecursiveHash)(frozenset)

Now, any code that consumes ``RecursiveHash`` objects will be
able to consume tuples and frozensets.


Open Issues
===========

Allowing Instances to Perform Different Roles Than Their Class
--------------------------------------------------------------

Perl 6 allows instances to perform different roles than their class.
These changes are local to the single instance and do not affect
other instances of the class.  For example: ::

  my_elf = Elf()
  my_elf.goes_on_quest()
  my_elf.becomes_evil()
  now_performs(my_elf, Thieving) # Only this one elf is a thief
  my_elf.steals(["purses", "candy", "kisses"])

In Perl 6, this is done by creating an anonymous class that
inherits from the instance's original parent and performs the
additional role(s).  This is possible in Python 3, though whether it
is desirable is another matter.

Inclusion of this feature would, of course, make it much easier to
express the works of Charles Dickens in Python: ::

  >>> from literature import role, BildungsRoman
  >>> from dickens import Urchin, Gentleman
  >>>
  >>> with BildungsRoman() as OliverTwist:
  ...   mr_brownlow = Gentleman()
  ...   oliver, artful_dodger = Urchin(), Urchin()
  ...   now_performs(artful_dodger, [role.Thief, role.Scoundrel])
  ...
  ...   oliver.has_adventures_with(artful_dodger)
  ...   mr_brownlow.adopt_orphan(oliver)
  ...   now_performs(oliver, role.RichWard)


Requiring Attributes
--------------------

Neal Norwitz has requested the ability to make assertions about
the presence of attributes using the same mechanism used to require
methods.  Since roles take effect at class definition-time, and
since the vast majority of attributes are defined at runtime by a
class's ``__init__()`` method, there doesn't seem to be a good way
to check for attributes at the same time as methods.

It may still be desirable to include non-enforced attributes in the
role definition, if only for documentation purposes.


Roles of Roles
--------------

Under the proposed semantics, it is possible for roles to
have roles of their own. ::

  @perform_role(Y)
  class X(metaclass=Role):
    ...

While this is possible, it is meaningless, since roles
are generally not instantiated.  There has been some
off-line discussion about giving meaning to this expression, but so
far no good ideas have emerged.


class_performs()
----------------

It is currently not possible to ask a class if its instances perform
a given role.  It may be desirable to provide an analogue to
``performs()`` such that ::

  >>> isinstance(my_dwarf, Dwarf)
  True
  >>> performs(my_dwarf, Surly)
  True
  >>> performs(Dwarf, Surly)
  False
  >>> class_performs(Dwarf, Surly)
  True


Prettier Dynamic Role Assignment
--------------------------------

An early draft of this PEP included a separate mechanism for
dynamically assigning a role to a class.  This was spelled ::

  >>> now_perform(Dwarf, GoldMiner)

This same functionality already exists by unpacking the syntactic
sugar provided by decorators: ::

  >>> perform_role(GoldMiner)(Dwarf)

At issue is whether dynamic role assignment is sufficiently important
to warrant a dedicated spelling.


Syntax Support
--------------

Though the phrasings laid out in this PEP are designed so that the
roles system could be shipped as a stand-alone package, it may be
desirable to add special syntax for defining, assigning and
querying roles.  One example might be a role keyword, which would
translate ::

  class MyRole(metaclass=Role):
    ...

into ::

  role MyRole:
    ...

Assigning a role could take advantage of the class definition
arguments proposed in PEP 3115: ::

  class MyClass(performs=MyRole):
    ...


Implementation
==============

A reference implementation is forthcoming.


Acknowledgements
================

Thanks to Jeffery Yasskin, Talin and Guido van Rossum for several
hours of in-person discussion to iron out the differences, overlap
and finer points of roles and abstract base classes.


References
==========

.. [#aibo]
   http://en.wikipedia.org/wiki/AIBO

.. [#roles-examples]
   http://www.perlmonks.org/?node_id=384858

.. [#perl6-s12]
   http://dev.perl.org/perl6/doc/design/syn/S12.html

.. [#traits-paper]
   http://www.iam.unibe.ch/~scg/Archive/Papers/Scha03aTraits.pdf

.. [#proposal]
   http://mail.python.org/pipermail/python-3000/2007-April/007026.html


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From steven.bethard at gmail.com  Mon May 14 08:08:08 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 14 May 2007 00:08:08 -0600
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
Message-ID: <d11dcfba0705132308h712fbadeq91e4474f294ac936@mail.gmail.com>

On 5/13/07, Collin Winter <collinw at gmail.com> wrote:
> PEP: 3133
> Title: Introducing Roles
[snip]
> * Roles provide a way of indicating a object's semantics and abstract
>   capabilities.  A role may define abstract methods, but only as a
>   way of delineating an interface through which a particular set of
>   semantics are accessed.
[snip]
> * Abstract base classes, by contrast, are a way of reusing common,
>   discrete units of implementation.
[snip]
>   Using this abstract base class - more properly, a concrete
>   mixin - allows a programmer to define a limited set of operators
>   and let the mixin in effect "derive" the others.

So what's the difference between a role and an abstract base class
that used @abstractmethod on all of its methods? Isn't such an ABC
just "delineating an interface"?

> since the ``OrderingMixin`` class above satisfies the interface
> and semantics expressed in the ``Ordering`` role, we say the mixin
> performs the role: ::
>
>   @perform_role(Ordering)
>   class OrderingMixin:
>     def __ge__(self, other):
>       return self > other or self == other
>
>     def __le__(self, other):
>       return self < other or self == other
>
>     def __ne__(self, other):
>       return not self == other
>
>     # ...and so on
>
> Now, any class that uses the mixin will automatically -- that is,
> without further programmer effort -- be tagged as performing the
> ``Ordering`` role.

But why is::

    performs(obj, Ordering)

any better than::

    isinstance(obj, Ordering)

if Ordering is just an appropriately registered ABC?


(BTW, Ordering is a bad example since the ABC PEP no longer proposes
that.  Maybe Sequence or Mapping instead?)
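
For comparison, a rough sketch of the ABC side of that question (assuming
an ``abc`` module with ``ABCMeta``, ``abstractmethod`` and ``register()``
along the lines PEP 3119 proposes; names here are just illustrative):

    from abc import ABCMeta, abstractmethod

    class Ordering(metaclass=ABCMeta):
        @abstractmethod
        def __le__(self, other): ...

    class MyNumber:
        def __le__(self, other):
            return True

    Ordering.register(MyNumber)              # post-hoc registration
    assert isinstance(MyNumber(), Ordering)  # reads much like performs()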


STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From tcdelaney at optusnet.com.au  Mon May 14 09:23:52 2007
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Mon, 14 May 2007 17:23:52 +1000
Subject: [Python-3000] PEP 367: New Super
Message-ID: <003001c795f8$d5275060$0201a8c0@mshome.net>

Here is my modified version of PEP 367. The reference implementation in it 
is pretty long, and should probably be split out to somewhere else (esp. 
since it can't fully implement the semantics).

Cheers,

Tim Delaney


PEP: 367
Title: New Super
Version: $Revision$
Last-Modified: $Date$
Author: Calvin Spealman <ironfroggy at gmail.com>
Author: Tim Delaney <timothy.c.delaney at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Apr-2007
Python-Version: 2.6
Post-History: 28-Apr-2007, 29-Apr-2007 (1), 29-Apr-2007 (2), 14-May-2007

Abstract
========

This PEP proposes syntactic sugar for use of the ``super`` type to
automatically construct instances of the super type binding to the class
that a method was defined in, and the instance (or class object for
classmethods) that the method is currently acting upon.

The premise of the new super usage suggested is as follows::

    super.foo(1, 2)

to replace the old::

    super(Foo, self).foo(1, 2)

and the current ``__builtin__.super`` be aliased to ``__builtin__.__super__``
(with ``__builtin__.super`` to be removed in Python 3.0).

It is further proposed that assignment to ``super`` become a ``SyntaxError``,
similar to the behaviour of ``None``.


Rationale
=========

The current usage of super requires explicitly passing both the class and
the instance it must operate on, breaking the DRY (Don't Repeat Yourself)
rule. This hinders any change of class name, and is widely considered a
wart.


Specification
=============

Within the specification section, some special terminology will be used to
distinguish similar and closely related concepts. "super type" will refer to
the actual builtin type named "super". A "super instance" is simply an
instance of the super type, which is associated with a class and possibly
with an instance of that class.

Because the new ``super`` semantics are not backwards compatible with Python
2.5, the new semantics will require a ``__future__`` import::

    from __future__ import new_super

The current ``__builtin__.super`` will be aliased to ``__builtin__.__super__``.
This will occur regardless of whether the new ``super`` semantics are active.
It is not possible to simply rename ``__builtin__.super``, as that would
affect modules that do not use the new ``super`` semantics. In Python 3.0 it
is proposed that the name ``__builtin__.super`` will be removed.

Replacing the old usage of super, calls to the next class in the MRO (method
resolution order) can be made without explicitly creating a ``super``
instance (although doing so will still be supported via ``__super__``).
Every function will have an implicit local named ``super``. This name
behaves identically to a normal local, including use by inner functions
via a cell, with the following exceptions:

1. Assigning to the name ``super`` will raise a ``SyntaxError`` at compile
   time;

2. Calling a static method or normal function that accesses the name
   ``super`` will raise a ``TypeError`` at runtime.

Every function that uses the name ``super``, or has an inner function that
uses the name ``super``, will include a preamble that performs the
equivalent of::

    super = __builtin__.__super__(<class>, <instance>)

where ``<class>`` is the class that the method was defined in, and
``<instance>`` is the first parameter of the method (normally ``self`` for
instance methods, and ``cls`` for class methods). For static methods and
normal functions, ``<class>`` will be ``None``, resulting in a ``TypeError``
being raised during the preamble.

Note: The relationship between ``super`` and ``__super__`` is similar to
that between ``import`` and ``__import__``.

Much of this was discussed in the "Fixing super anyone?" thread on the
python-dev list [1]_.


Open Issues
-----------


Determining the class object to use
'''''''''''''''''''''''''''''''''''

The exact mechanism for associating the method with the defining class is
not specified in this PEP, and should be chosen for maximum performance.
For CPython, it is suggested that the class object be held in a C-level
variable on the function object, bound to one of ``NULL`` (not part of a
class), ``Py_None`` (static method) or a class object (instance or class
method).


Should ``super`` actually become a keyword?
'''''''''''''''''''''''''''''''''''''''''''

With this proposal, ``super`` would become a keyword to the same extent that
``None`` is a keyword. It is possible that further restricting the ``super``
name may simplify the implementation; however, some are against actually
making ``super`` a keyword. The simplest solution is often the correct one,
and the simplest solution may well be not adding additional keywords to the
language when they are not needed. Still, it may solve other open issues.


Closed Issues
-------------

super used with __call__ attributes
'''''''''''''''''''''''''''''''''''

It was considered that instantiating super instances the classic way might
be a problem, because calling such an instance would look up the __call__
attribute and thus try to perform an automatic super lookup to the next
class in the MRO. However, this was found not to be the case, because
calling an object only looks up the __call__ method directly on the
object's type. The following example shows this in action.

::

    class A(object):
        def __call__(self):
            return '__call__'
        def __getattribute__(self, attr):
            if attr == '__call__':
                return lambda: '__getattribute__'
    a = A()
    assert a() == '__call__'
    assert a.__call__() == '__getattribute__'

In any case, with the renaming of ``__builtin__.super`` to
``__builtin__.__super__`` this issue goes away entirely.


Reference Implementation
========================

It is impossible to implement the above specification entirely in Python.
This reference implementation has the following differences from the
specification:

1. New ``super`` semantics are implemented using bytecode hacking.

2. Assignment to ``super`` is not a ``SyntaxError``. Also see point #4.

3. Classes must either use the metaclass ``autosuper_meta`` or inherit from
   the base class ``autosuper`` to acquire the new ``super`` semantics.

4. ``super`` is not an implicit local variable. In particular, for inner
   functions to be able to use the super instance, there must be an
   assignment of the form ``super = super`` in the method.

The reference implementation assumes that it is being run on Python 2.5+.

::

    #!/usr/bin/env python
    #
    # autosuper.py

    from array import array
    import dis
    import new
    import types
    import __builtin__
    __builtin__.__super__ = __builtin__.super
    del __builtin__.super

    # We need these for modifying bytecode
    from opcode import opmap, HAVE_ARGUMENT, EXTENDED_ARG

    LOAD_GLOBAL = opmap['LOAD_GLOBAL']
    LOAD_NAME = opmap['LOAD_NAME']
    LOAD_CONST = opmap['LOAD_CONST']
    LOAD_FAST = opmap['LOAD_FAST']
    LOAD_ATTR = opmap['LOAD_ATTR']
    STORE_FAST = opmap['STORE_FAST']
    LOAD_DEREF = opmap['LOAD_DEREF']
    STORE_DEREF = opmap['STORE_DEREF']
    CALL_FUNCTION = opmap['CALL_FUNCTION']
    STORE_GLOBAL = opmap['STORE_GLOBAL']
    DUP_TOP = opmap['DUP_TOP']
    POP_TOP = opmap['POP_TOP']
    NOP = opmap['NOP']
    JUMP_FORWARD = opmap['JUMP_FORWARD']
    ABSOLUTE_TARGET = dis.hasjabs

    def _oparg(code, opcode_pos):
        return code[opcode_pos+1] + (code[opcode_pos+2] << 8)

    def _bind_autosuper(func, cls):
        co = func.func_code
        name = func.func_name
        newcode = array('B', co.co_code)
        codelen = len(newcode)
        newconsts = list(co.co_consts)
        newvarnames = list(co.co_varnames)

        # Check if the global 'super' keyword is already present
        try:
            sn_pos = list(co.co_names).index('super')
        except ValueError:
            sn_pos = None

        # Check if the varname 'super' keyword is already present
        try:
            sv_pos = newvarnames.index('super')
        except ValueError:
            sv_pos = None

        # Check if the callvar 'super' keyword is already present
        try:
            sc_pos = list(co.co_cellvars).index('super')
        except ValueError:
            sc_pos = None

        # If 'super' isn't used anywhere in the function, we don't have
        # anything to do
        if sn_pos is None and sv_pos is None and sc_pos is None:
            return func

        c_pos = None
        s_pos = None
        n_pos = None

        # Check if the 'cls_name' and 'super' objects are already in the
        # constants
        for pos, o in enumerate(newconsts):
            if o is cls:
                c_pos = pos

            if o is __super__:
                s_pos = pos

            if o == name:
                n_pos = pos

        # Add in any missing objects to constants and varnames
        if c_pos is None:
            c_pos = len(newconsts)
            newconsts.append(cls)

        if n_pos is None:
            n_pos = len(newconsts)
            newconsts.append(name)

        if s_pos is None:
            s_pos = len(newconsts)
            newconsts.append(__super__)

        if sv_pos is None:
            sv_pos = len(newvarnames)
            newvarnames.append('super')

        # This goes at the start of the function. It is:
        #
        #   super = __super__(cls, self)
        #
        # If 'super' is a cell variable, we store to both the
        # local and cell variables (i.e. STORE_FAST and STORE_DEREF).
        #
        preamble = [
            LOAD_CONST, s_pos & 0xFF, s_pos >> 8,
            LOAD_CONST, c_pos & 0xFF, c_pos >> 8,
            LOAD_FAST, 0, 0,
            CALL_FUNCTION, 2, 0,
        ]

        if sc_pos is None:
            # 'super' is not a cell variable - we can just use the local
            # variable
            preamble += [
                STORE_FAST, sv_pos & 0xFF, sv_pos >> 8,
            ]
        else:
            # If 'super' is a cell variable, we need to handle LOAD_DEREF.
            preamble += [
                DUP_TOP,
                STORE_FAST, sv_pos & 0xFF, sv_pos >> 8,
                STORE_DEREF, sc_pos & 0xFF, sc_pos >> 8,
            ]

        preamble = array('B', preamble)

        # Bytecode for loading the local 'super' variable.
        load_super = array('B', [
            LOAD_FAST, sv_pos & 0xFF, sv_pos >> 8,
        ])

        preamble_len = len(preamble)
        need_preamble = False
        i = 0

        while i < codelen:
            opcode = newcode[i]
            need_load = False
            remove_store = False

            if opcode == EXTENDED_ARG:
                raise TypeError("Cannot use 'super' in function with 
EXTENDED_ARG opcode")

            # If the opcode is an absolute target it needs to be adjusted
            # to take into account the preamble.
            elif opcode in ABSOLUTE_TARGET:
                oparg = _oparg(newcode, i) + preamble_len
                newcode[i+1] = oparg & 0xFF
                newcode[i+2] = oparg >> 8

            # If LOAD_GLOBAL(super) or LOAD_NAME(super) then we want to
            # change it into LOAD_FAST(super)
            elif (opcode == LOAD_GLOBAL or opcode == LOAD_NAME) and \
                    _oparg(newcode, i) == sn_pos:
                need_preamble = need_load = True

            # If LOAD_FAST(super) then we just need to add the preamble
            elif opcode == LOAD_FAST and _oparg(newcode, i) == sv_pos:
                need_preamble = need_load = True

            # If LOAD_DEREF(super) then we change it into LOAD_FAST(super)
            # because it's slightly faster.
            elif opcode == LOAD_DEREF and _oparg(newcode, i) == sc_pos:
                need_preamble = need_load = True

            if need_load:
                newcode[i:i+3] = load_super

            i += 1

            if opcode >= HAVE_ARGUMENT:
                i += 2

        # No changes needed - get out.
        if not need_preamble:
            return func

        # Our preamble will have 3 things on the stack
        co_stacksize = max(3, co.co_stacksize)

        # Conceptually, our preamble is on the `def` line.
        co_lnotab = array('B', co.co_lnotab)

        if co_lnotab:
            co_lnotab[0] += preamble_len

        co_lnotab = co_lnotab.tostring()

        # Our code consists of the preamble and the modified code.
        codestr = (preamble + newcode).tostring()

        codeobj = new.code(co.co_argcount, len(newvarnames), co_stacksize,
                           co.co_flags, codestr, tuple(newconsts), co.co_names,
                           tuple(newvarnames), co.co_filename, co.co_name,
                           co.co_firstlineno, co_lnotab, co.co_freevars,
                           co.co_cellvars)

        func.func_code = codeobj
        func.func_class = cls
        return func

    class autosuper_meta(type):
        def __init__(cls, name, bases, clsdict):
            UnboundMethodType = types.UnboundMethodType

            for v in vars(cls):
                o = getattr(cls, v)
                if isinstance(o, UnboundMethodType):
                    _bind_autosuper(o.im_func, cls)

    class autosuper(object):
        __metaclass__ = autosuper_meta

    if __name__ == '__main__':
        class A(autosuper):
            def f(self):
                return 'A'

        class B(A):
            def f(self):
                return 'B' + super.f()

        class C(A):
            def f(self):
                def inner():
                    return 'C' + super.f()

                # Needed to put 'super' into a cell
                super = super
                return inner()

        class D(B, C):
            def f(self, arg=None):
                var = None
                return 'D' + super.f()

        assert D().f() == 'DBCA'

Disassembly of B.f and C.f reveals the different preambles used when
``super`` is simply a local variable compared to when it is used by an
inner function.

::

    >>> dis.dis(B.f)

    214           0 LOAD_CONST               4 (<type 'super'>)
                  3 LOAD_CONST               2 (<class '__main__.B'>)
                  6 LOAD_FAST                0 (self)
                  9 CALL_FUNCTION            2
                 12 STORE_FAST               1 (super)

    215          15 LOAD_CONST               1 ('B')
                 18 LOAD_FAST                1 (super)
                 21 LOAD_ATTR                1 (f)
                 24 CALL_FUNCTION            0
                 27 BINARY_ADD
                 28 RETURN_VALUE

::

    >>> dis.dis(C.f)

    218           0 LOAD_CONST               4 (<type 'super'>)
                  3 LOAD_CONST               2 (<class '__main__.C'>)
                  6 LOAD_FAST                0 (self)
                  9 CALL_FUNCTION            2
                 12 DUP_TOP
                 13 STORE_FAST               1 (super)
                 16 STORE_DEREF              0 (super)

    219          19 LOAD_CLOSURE             0 (super)
                 22 LOAD_CONST               1 (<code object inner at 00C160A0, file "autosuper.py", line 219>)
                 25 MAKE_CLOSURE             0
                 28 STORE_FAST               2 (inner)

    223          31 LOAD_FAST                1 (super)
                 34 STORE_DEREF              0 (super)

    224          37 LOAD_FAST                2 (inner)
                 40 CALL_FUNCTION            0
                 43 RETURN_VALUE

Note that in the final implementation, the preamble would not be part of the
bytecode of the method, but would occur immediately following unpacking of
parameters.


Alternative Proposals
=====================

No Changes
----------

Although it's always attractive to just keep things how they are, people
have sought a change in how super is called for some time, and for good
reasons, all mentioned previously:

- Decoupling from the class name (which might not even be bound to the
  right class anymore!)
- Simpler looking, cleaner super calls would be better

Dynamic attribute on super type
-------------------------------

The proposal adds a dynamic attribute lookup to the super type, which will
automatically determine the proper class and instance parameters. Each super
attribute lookup identifies these parameters and performs the super lookup
on the instance, as the current super implementation does with the explicit
invocation of a super instance upon a class and instance.

This proposal relies on sys._getframe(), which is not appropriate for
anything except a prototype implementation.


super(__this_class__, self)
---------------------------

This is nearly an anti-proposal, as it basically relies on the acceptance of
the __this_class__ PEP, which proposes a special name that would always be
bound to the class within which it is used. If that is accepted,
__this_class__ could simply be used instead of the class' name explicitly,
solving the name binding issues [2]_.

self.__super__.foo(\*args)
--------------------------

The __super__ attribute is mentioned in this PEP in several places, and
could be a candidate for the complete solution, used explicitly instead of
any direct super usage. However, double-underscore names are usually an
internal detail, and are generally kept out of everyday code.

super(self, \*args) or __super__(self, \*args)
----------------------------------------------

This solution only solves the problem of the type indication, does not
handle differently named super methods, and is explicit about the name of
the instance. It is less flexible, since it cannot be applied to other
method names in cases where that is needed. One use case it fails to cover
is where a base class has a factory classmethod and a subclass has two
factory classmethods, both of which need to properly make super calls to
the one in the base class.

super.foo(self, \*args)
-----------------------

This variation actually eliminates the problems with locating the proper
instance, and if any of the alternatives were pushed into the spotlight, I
would want it to be this one.

super or super()
----------------

This proposal leaves no room for different names, signatures, or application
to other classes or instances. A way to allow some similar use alongside the
normal proposal would be favorable, encouraging good design of multiple
inheritance trees and compatible methods.

super(\*p, \*\*kw)
------------------

There has been a proposal that directly calling ``super(*p, **kw)`` would
be equivalent to calling the method on the ``super`` object with the same
name as the method currently being executed, i.e. the following two methods
would be equivalent:

::

    def f(self, *p, **kw):
        super.f(*p, **kw)

::

    def f(self, *p, **kw):
        super(*p, **kw)

There is strong sentiment for and against this, but implementation and style
concerns are obvious. Guido has suggested that this should be excluded from
this PEP on the principle of KISS (Keep It Simple Stupid).



History
=======
29-Apr-2007 - Changed title from "Super As A Keyword" to "New Super"
            - Updated much of the language and added a terminology section
              for clarification in confusing places.
            - Added reference implementation and history sections.

06-May-2007 - Updated by Tim Delaney to reflect discussions on the
              python-3000 and python-dev mailing lists.

References
==========

.. [1] Fixing super anyone?
   (http://mail.python.org/pipermail/python-3000/2007-April/006667.html)
.. [2] PEP 3130: Access to Module/Class/Function Currently Being Defined 
(this)
   (http://mail.python.org/pipermail/python-ideas/2007-April/000542.html)


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


From talin at acm.org  Mon May 14 09:48:23 2007
From: talin at acm.org (Talin)
Date: Mon, 14 May 2007 00:48:23 -0700
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
Message-ID: <46481447.7010100@acm.org>

Collin Winter wrote:
> PEP: 3133

[snip]

I'll probably have quite a few comments on this over the next few days. 
First let me start off by saying I like the general approach of your PEP.

Let me kick off the bikeshed part of the discussion by saying that the 
"role/performs" terminology is not my favorite - I kind of like the 
terminology that was introduced by Jeff Shell in an earlier message, 
specifically the terms "specifies", "provides" and "implements":

   * An interface *specifies* a set of methods.
   * An object can *provide* the services that are specified by an
     interface.
   * A class can *implement* the services that are specified by an
     interface.

In other words, the difference between 'provides' and 'implements' is 
that one is asking about the instance and another is asking about the class.

Thus, you can ask an object if it provides() an interface; You can also 
ask a class if it implements() an interface.


Another question that I'd like to ask is: Your PEP describes a mechanism 
for defining roles and testing for them. What it doesn't define is what 
roles will be defined in the standard library, and specifically what 
roles will be defined for the built-in classes.


The third issue I want to raise is how the roles system interacts with 
PJE's generic functions PEP. Let me give some background:

In the most general terms, a method of a generic function is a function 
with a set of constraints on the arguments. These constraints can be 
types, but they don't have to be. Depending on the actual calling 
arguments, the dispatcher will attempt to find the method whose 
constraints most closely match the calling arguments.

Clearly, in a system in which there are both roles and generics, we 
would want to create overloads in which the constraints can be role 
tests rather than type tests.

So for example, if Guard is a role, we want to be able to dispatch on it:

    @overload
    def idle( actor: Guard ):
       ...

We would also like to be able to define methods that contain both 
type-tests and role-tests:

    @overload
    def watch( actor: Guard, treasure: list ):
       ...

In order for this to work, the dispatcher will need to know that the 
first argument requires a role-test ("performs" or whatever), while the 
second argument requires a type-test. I would like to see some more 
detail on how this would work.

However, its even more complicated than that. Generic function 
dispatchers can be made to work efficiently if there is a way to compare 
constraints with each other. Specifically, what you need to know is 
this: given any two tests, are those tests completely disjoint, is one 
test a subset of the other, or neither?

For example, suppose we have the following overloads:

    class MyList( list ):
       ...

    @overload
    def watch( a: list )
       ...

    @overload
    def watch( a: tuple )
       ...

    @overload
    def watch( a: MyList )

The most efficient dispatch algorithm for this particular set of
overloads will first test whether the argument is a list; if not, it
will test whether it's a tuple; if it is a list, it will then test
whether it's a MyList.

In other words, even though there are three possible tests, we only need 
to perform two of them at most, because if it is a list, then it can't 
possibly be a tuple, and if it's not a list then it can't possibly be a 
MyList. As you add more overloads and more tests, this kind of pruning 
becomes important, and there are some wonderful algorithms for figuring 
this all out.

Now consider, however, the following situation, where you have a role, a 
class which implements that role, and a subclass:

    class Worker( Role ):
       ...

    @perform_role( Worker )
    class Robot:
       ...

    class ShinyRobot( Robot ):
       ...

Now, suppose we have a number of overloads:

    @overload
    def work( actor: Worker ):
       ...

    @overload
    def work( actor: Robot ):
       ...

    @overload
    def work( actor: ShinyRobot ):
       ...

In this case, when dispatching on the first argument we are sometimes
doing type tests and sometimes doing role tests.

Furthermore, we have an interaction between roles and types: the
ShinyRobot test (a type test) can never succeed unless the role test
(Worker) also succeeds. For dispatching efficiency, we want the
dispatcher to know that "ShinyRobot" is a subset of "Worker", even
though the two tests are different kinds of tests.

Thus, the generic function dispatcher will need to be able to take two 
tests, which might both be type tests, or both role tests, or one of 
each - and compare them to see if one is a subset of the other, or if 
they overlap at all.
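
To make the comparison concrete, here is a toy sketch (all names
hypothetical; roles are modelled as plain marker classes rather than
PEP 3133's Role metaclass, and the registry is just a dict):

    ROLES = {}   # class -> set of roles its instances perform

    class RoleMarker(object):
        """Toy stand-in for a role; role inheritance composes roles."""

    def perform_role(role):
        def decorate(cls):
            ROLES.setdefault(cls, set()).add(role)
            return cls
        return decorate

    def roles_of(cls):
        """All roles performed by cls, via both class and role bases."""
        found = set()
        for base in cls.__mro__:
            for role in ROLES.get(base, ()):
                found.update(r for r in role.__mro__
                             if issubclass(r, RoleMarker)
                             and r is not RoleMarker)
        return found

    def is_role(test):
        return isinstance(test, type) and issubclass(test, RoleMarker)

    def subsumes(outer, inner):
        """True if every argument matching `inner` also matches `outer`."""
        if is_role(outer) and is_role(inner):          # role vs role
            return issubclass(inner, outer)
        if not is_role(outer) and not is_role(inner):  # type vs type
            return issubclass(inner, outer)
        if is_role(outer):                             # type test under a role test
            return outer in roles_of(inner)
        return False    # a type test can't be shown to cover a role test

    # The Worker/Robot/ShinyRobot example from above:
    class Worker(RoleMarker): pass

    @perform_role(Worker)
    class Robot(object): pass

    class ShinyRobot(Robot): pass

    assert subsumes(Worker, ShinyRobot)       # ShinyRobot test implies Worker
    assert not subsumes(ShinyRobot, Worker)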


-- Talin


From arvind1.singh at gmail.com  Mon May 14 12:06:32 2007
From: arvind1.singh at gmail.com (Arvind Singh)
Date: Mon, 14 May 2007 15:36:32 +0530
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
Message-ID: <c19402930705140306w6f9a2df4m4f4cb9c5a1ba9630@mail.gmail.com>

> Asking Questions About Roles

Shouldn't there be some way to ``revoke'' roles?

How can we get a list of all roles played by an object?

Should there be a way to check ``loosely'' whether an object can
potentially play a given role? (i.e., checking whether an object
provides a given interface, at least syntactically)

I understand that this can be achieved via:

try:
      now_performs(instance.__class__, [role.RoleToCheck])
except:
      print("can't play role")
else:
      print("maybe plays role")

But such an approach will be error prone (``revoking'' roles later, and
so on; destructive checks are a bad idea, anyway). Better would be to
have::

if performs(instance, [role.RoleToCheck], loose=True):
      print("maybe plays role")


> Assigning Roles at Runtime

Maybe it should be suggested that dynamic role assignment should not
be made without knowing the implementation (with a reminder about
tree's bark() and dog's bark() ).


Regards,
Arvind

From p.f.moore at gmail.com  Mon May 14 12:56:47 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 14 May 2007 11:56:47 +0100
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
In-Reply-To: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
References: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
Message-ID: <79990c6b0705140356i2cfe4dccx9410534e211d8c94@mail.gmail.com>

On 12/05/07, Guido van Rossum <guido at python.org> wrote:
> Here's a new version of the ABC PEP. A lot has changed; a lot remains.
> I can't give a detailed overview of all the changes, and a diff would
> show too many spurious changes, but some of the highlights are:

As a general comment, I like the direction this has moved in. In
particular, I like the fact that ABCs can be registered after the fact
(as Talin describes it, "post-hoc classification").

> ABCs vs. Duck Typing

This remains my key concern. The PEP nicely addresses the issue as far
as core Python is concerned, but I'd be happier with some style
recommendations for 3rd party frameworks clarifying that they should
also avoid taking the "stick" approach. OTOH, we've had interface
implementations, and heavy users of them (e.g. zope.interface and
Twisted) for ages now, and the world hasn't ended, so I guess there's
no reason to assume that people won't use ABCs sensibly, too.

Overall, then, I'm moving towards a +1 (or at least a +0.5...)

Paul.

From exarkun at divmod.com  Mon May 14 13:32:40 2007
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Mon, 14 May 2007 07:32:40 -0400
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <dcbbbb410705132003k73cf08c3hf7f0b99e2e1b9895@mail.gmail.com>
Message-ID: <20070514113240.19381.1581980774.divmod.quotient.32995@ohm>

On Sun, 13 May 2007 22:03:26 -0500, Michael Urman <murman at gmail.com> wrote:
>On 5/13/07, Guido van Rossum <guido at python.org> wrote:
>> The answer to all of this is the filesystem encoding, which is already
>> supported. Doesn't appear particularly difficult to me.
>
>Okay, that's fair. It seems reasonable to accept the limitations of
>following the filesystem encoding for module names. I should probably
>test py3k to make sure it already has updated __import__ to use the
>filesystem encoding instead of the default encoding, but instead I'll
>just feebly imply the question here.

It's harder for this, actually.  Even if you know the encoding, you'll
still run into problems when you don't know the normalization.  Consider
the case where a developer creates a module with a non-ASCII name on OS X
and then distributes it.  There is a fair to strong chance that their
source code will use NFC for the module name.  During development, this
will work just fine, as OS X normalizes all filename access to NFD.  When
someone on another platform attempts to use the module though, they will
mysteriously find that it cannot be found.  Their NFC spelling of the
module name won't find the NFD file in the filesystem, and they will likely
be completely baffled by the failure.

This is, of course, an existing difficulty with dealing with unicode
filenames in Python, but at least the interpreter itself doesn't yet
have to concern itself with it, as no language features require it.
I suspect that if non-ASCII module names are allowed, a lot of people
will be running into this.
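
A quick illustration of the mismatch, using nothing more than the
stdlib's unicodedata module:

    import unicodedata

    nfc = unicodedata.normalize('NFC', u'caf\u00e9')   # e-acute as one code point
    nfd = unicodedata.normalize('NFD', nfc)            # 'e' + combining acute

    assert nfc != nfd                                  # plain comparison fails
    assert unicodedata.normalize('NFC', nfd) == nfc    # equal only after normalizing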

Jean-Paul

From p.f.moore at gmail.com  Mon May 14 16:00:45 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 14 May 2007 15:00:45 +0100
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<fb6fbf560705091526q68df2e2coafe477a62f7240b1@mail.gmail.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
Message-ID: <79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>

On 11/05/07, Neal Norwitz <nnorwitz at gmail.com> wrote:
> On 5/11/07, Paul Moore <p.f.moore at gmail.com> wrote:
> > Hmm. My view is that it *is* simple to explain, but unfortunately
> > Phillip's explanation in the PEP is not that simple explanation :-(
>
> [snip]
>
> > I would argue that the PEP could be *very* simple if it restricted
> > itself to the basic idea.
>
> Could you write up the simple version that you would use instead?

I'd have liked to, but unfortunately, I haven't had the time to do so
(and I probably won't in the near future). However, it looks like
there's a general feeling emerging that snipping certain sections
would be enough. I'd agree with that - my personal feeling is that
it'd be OK to remove all of the following sections:

* "Before" and "After" Methods (as per Steven Bethard's suggestion)
* "Around" Methods (as per Steven Bethard's suggestion)
* Custom Combinations (as per Steven Bethard's suggestion)
* Interfaces and Adaptation (doesn't feel like a core aspect of the proposal)
* Aspects (as per Steven Bethard's suggestion)
* Extension API (currently empty, and that hasn't hampered the discussions!!)

I'd be OK with them going into an additional PEP, but to be honest, it
wouldn't bother me to see them left out of the PEP process
altogether[1]. (I don't feel that I have enough experience with
*using* GFs to comment meaningfully, so I'd be willing to defer to
Phillip's judgement here).

Paul.

[1] But I'd like to see them documented in the final implementation -
I'm not suggesting they be undocumented features.

From guido at python.org  Mon May 14 17:22:48 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 08:22:48 -0700
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
In-Reply-To: <79990c6b0705140356i2cfe4dccx9410534e211d8c94@mail.gmail.com>
References: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
	<79990c6b0705140356i2cfe4dccx9410534e211d8c94@mail.gmail.com>
Message-ID: <ca471dc20705140822p7724769fo6a00c4db7ea9a26d@mail.gmail.com>

On 5/14/07, Paul Moore <p.f.moore at gmail.com> wrote:
> > ABCs vs. Duck Typing
>
> This remains my key concern. The PEP nicely addresses the issue as far
> as core Python is concerned, but I'd be happier with some style
> recommendations for 3rd party frameworks clarifying that they should
> also avoid taking the "stick" approach. OTOH, we've had interface
> implementations, and heavy users of them (e.g. zope.interface and
> Twisted) for ages now, and the world hasn't ended, so I guess there's
> no reason to assume that people won't use ABCs sensibly, too.

I'm not sure what language you would specifically like to see added to
the PEP. "Recommendation for 3rd party frameworks: please don't use
the stick approach." sounds a little strange. What's the point you're
trying to get across?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May 14 17:25:02 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 08:25:02 -0700
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <20070514113240.19381.1581980774.divmod.quotient.32995@ohm>
References: <dcbbbb410705132003k73cf08c3hf7f0b99e2e1b9895@mail.gmail.com>
	<20070514113240.19381.1581980774.divmod.quotient.32995@ohm>
Message-ID: <ca471dc20705140825s453a6281ife63414e2c7e0a9b@mail.gmail.com>

On 5/14/07, Jean-Paul Calderone <exarkun at divmod.com> wrote:
> On Sun, 13 May 2007 22:03:26 -0500, Michael Urman <murman at gmail.com> wrote:
> >On 5/13/07, Guido van Rossum <guido at python.org> wrote:
> >> The answer to all of this is the filesystem encoding, which is already
> >> supported. Doesn't appear particularly difficult to me.
> >
> >Okay, that's fair. It seems reasonable to accept the limitations of
> >following the filesystem encoding for module names. I should probably
> >test py3k to make sure it already has updated __import__ to use the
> >filesystem encoding instead of the default encoding, but instead I'll
> >just feebly imply the question here.
>
> It's harder for this, actually.  Even if you know the encoding, you'll
> still run into problems when you don't know the normalization.  Consider
> the case where a developer creates a module with a non-ASCII name on OS X
> and then distributes it.  There is a fair to strong chance that their
> source code will use NFC for the module name.  During development, this
> will work just fine, as OS X normalizes all filename access to NFD.  When
> someone on another platform attempts to use the module though, they will
> mysteriously find that it cannot be found.  Their NFC spelling of the
> module name won't find the NFD file in the filesystem, and they will likely
> be completely baffled by the failure.
>
> This is, of course, an existing difficulty with dealing with unicode
> filenames in Python, but at least the interpreter itself doesn't yet
> have to concern itself with it, as no language features require it.
> I suspect that if non-ASCII module names are allowed, a lot of people
> will be running into this.

Isn't normalization also going to be an issue with using non-ASCII in
general? Does it mean that Python will have to use a normalization
before comparing identifiers as equal? That's terrible, as it will
vastly increase the amount needed to hash a string, too.
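
To make the normalization pitfall concrete, here is a minimal sketch
using only the stdlib unicodedata module (the word used is arbitrary):

    import unicodedata

    nfc = unicodedata.normalize('NFC', 'r\u00eave')   # e-circumflex as one code point
    nfd = unicodedata.normalize('NFD', 'r\u00eave')   # 'e' plus a combining circumflex

    print(nfc == nfd)                # False: different code point sequences
    print(hash(nfc) == hash(nfd))    # almost certainly False as well
    print(unicodedata.normalize('NFC', nfd) == nfc)   # True once both are normalized

Unless something (the tokenizer, the import machinery, or the lookup
itself) normalizes first, the two spellings are simply different strings.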

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May 14 17:29:12 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 08:29:12 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
Message-ID: <ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>

On 5/14/07, Paul Moore <p.f.moore at gmail.com> wrote:
> However, it looks like
> there's a general feeling emerging that snipping certain sections
> would be enough. I'd agree with that - my personal feeling is that
> it'd be OK to remove all of the following sections:
>
> * "Before" and "After" Methods (as per Steven Bethard's suggestion)
> * "Around" Methods (as per Steven Bethard's suggestion)
> * Custom Combinations (as per Steven Bethard's suggestion)
> * Interfaces and Adaptation (doesn't feel like a core aspect of the proposal)
> * Aspects (as per Steven Bethard's suggestion)
> * Extension API (currently empty, and that hasn't hampered the discussions!!)
>
> I'd be OK with them going into an additional PEP, but to be honest, it
> wouldn't bother me to see them left out of the PEP process
> altogether[1]. (I don't feel that I have enough experience with
> *using* GFs to comment meaningfully, so I'd be willing to defer to
> Phillip's judgement here).

That would suit me fine, since my inclination is to approve some form
of the basics of the PEP (with reservations I will explain in another
message) but to reject the second PEP.

> [1] But I'd like to see them documented in the final implementation -
> I'm not suggesting they be undocumented features.

I'm suggesting they aren't features at all, except for the extension
API. All the other stuff should be addable in a separate module using
the extension API.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jason.orendorff at gmail.com  Mon May 14 17:42:24 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 14 May 2007 11:42:24 -0400
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <ca471dc20705140825s453a6281ife63414e2c7e0a9b@mail.gmail.com>
References: <dcbbbb410705132003k73cf08c3hf7f0b99e2e1b9895@mail.gmail.com>
	<20070514113240.19381.1581980774.divmod.quotient.32995@ohm>
	<ca471dc20705140825s453a6281ife63414e2c7e0a9b@mail.gmail.com>
Message-ID: <bb8868b90705140842x27257e07o9faaa8407f953d53@mail.gmail.com>

On 5/14/07, Guido van Rossum <guido at python.org> wrote:
> Isn't normalization also going to be an issue with using non-ASCII in
> general? Does it mean that Python will have to use a normalization
> before comparing identifiers as equal? That's terrible, as it will
> vastly increase the amount needed to hash a string, too.

PEP 3131 addresses this.  The tokenizer would normalize identifier
tokens to NFC.  Because this happens so early, the rest of Python
would be unaffected.

-j

From jason.orendorff at gmail.com  Mon May 14 18:22:56 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 14 May 2007 12:22:56 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4647B15F.7040700@canterbury.ac.nz>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
	<4647B15F.7040700@canterbury.ac.nz>
Message-ID: <bb8868b90705140922v2c1d862ic8fb4f91418656bd@mail.gmail.com>

On 5/13/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> I don't think this scenario is all that unlikely. A
> program is initially written by a Russian programmer
> who uses his own version of "a" as a variable name.
> Later an English-speaking programmer makes some
> changes, and uses an ascii "a". Now there are two
> subtly different variables called "a" in different
> parts of the program.

Greg,

If this scenario were *not* unlikely, it would have happened
to a Java programmer somewhere, right?  Has this *ever*
happened?  I wasn't able to find a case.

-- 
Jason

From pje at telecommunity.com  Mon May 14 18:34:15 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 12:34:15 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510012202.EB0A83A4061@sparrow.telecommunity.com>
	<464289C8.4080004@canterbury.ac.nz>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
Message-ID: <20070514163231.275CE3A4036@sparrow.telecommunity.com>

At 08:29 AM 5/14/2007 -0700, Guido van Rossum wrote:
>On 5/14/07, Paul Moore <p.f.moore at gmail.com> wrote:
> > However, it looks like
> > there's a general feeling emerging that snipping certain sections
> > would be enough. I'd agree with that - my personal feeling is that
> > it'd be OK to remove all of the following sections:
> >
> > * "Before" and "After" Methods (as per Steven Bethard's suggestion)
> > * "Around" Methods (as per Steven Bethard's suggestion)
> > * Custom Combinations (as per Steven Bethard's suggestion)
> > * Interfaces and Adaptation (doesn't feel like a core aspect of 
> the proposal)
> > * Aspects (as per Steven Bethard's suggestion)
> > * Extension API (currently empty, and that hasn't hampered the 
> discussions!!)
> >
> > I'd be OK with them going into an additional PEP, but to be honest, it
> > wouldn't bother me to see them left out of the PEP process
> > altogether[1]. (I don't feel that I have enough experience with
> > *using* GFs to comment meaningfully, so I'd be willing to defer to
> > Phillip's judgement here).
>
>That would suit me fine, since my inclination is to approve some form
>of the basics of the PEP (with reservations I will explain in another
>message) but to reject the second PEP.

FYI, wrt Paul's list, my own list for the 2nd PEP doesn't include 
interfaces and adaptation; they'd be squarely in the first PEP.


> > [1] But I'd like to see them documented in the final implementation -
> > I'm not suggesting they be undocumented features.
>
>I'm suggesting they aren't features at all, except for the extension
>API. All the other stuff should be addable in a separate module using
>the extension API.

I don't see what the benefit is of making people implement their own 
versions of @before, @after, and @around, which then won't 
interoperate properly with others' versions of the same thing.  Even 
if we leave in place the MethodList base class (which Before and 
After are subclasses of), one of its limitations is that it can only 
combine methods of the same type.  There's no way for two different 
user-implemented "befores" to merge at the same precedence level, 
without some fairly fancy footwork on the implementer's part, or some 
kind of convention being established as to how to tell whether a 
method intends to be a before or after or whatever.  (And this 
same-precedence merging is a critical feature of @before/@after, as 
they are used mainly for "observer"-like hooks, where multiple 
libraries may be observing the same thing.)
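
To make that concrete, here is a toy sketch (nothing like the PEP's
actual machinery) of why the hooks need one shared registration
mechanism -- observer-style "befores" from independent libraries all
fire at the same precedence, ahead of the primary function:

    _before_hooks = []

    def before(func):                # register an observer-style hook
        _before_hooks.append(func)
        return func

    def save(record):                # the "primary" function
        for hook in _before_hooks:   # every registered hook runs, in order
            hook(record)
        print("saving", record)

    # two unrelated libraries add observers without knowing about each other
    @before
    def validate(record):
        print("validating", record)

    @before
    def audit(record):
        print("audit trail for", record)

    save({"id": 1})

With two home-grown @before implementations, the hooks would land in two
separate registries and one set would silently never run.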

So, one of the reasons for including those features (along with 
Aspect) in the stdlib is the standardization part.  Really, 
standardization of a lot of this stuff is the main point to having a 
PEP at all.

By the way, I'm not sure if I mentioned this before, but Ruby 2.0 is 
supposed to include before/after/around qualifiers, except they're 
called pre/post/wrap, and I'm not sure if the combination rules are 
100% the same as my before/after/around.  And they're using Ruby's 
open classes rather than standalone generic functions.  But it's 
another data point.

Note that in current Ruby, you can simulate generic functions 
(single-dispatch only) via open classes as long as you use 
sufficiently-unique method names.  The fact that Matz wants to add 
these qualifiers seems to suggest that simple next-method chaining 
(i.e. super) isn't as expressive as they'd like.  Unfortunately, I 
haven't been able to find an RCR for this feature, only references to 
RubyConf slide presentations, so I don't know what their specific rationale is.


From collinw at gmail.com  Mon May 14 18:35:14 2007
From: collinw at gmail.com (Collin Winter)
Date: Mon, 14 May 2007 09:35:14 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <bb8868b90705140922v2c1d862ic8fb4f91418656bd@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
	<4647B15F.7040700@canterbury.ac.nz>
	<bb8868b90705140922v2c1d862ic8fb4f91418656bd@mail.gmail.com>
Message-ID: <43aa6ff70705140935g573a26a2wd3aa88703aa0f485@mail.gmail.com>

On 5/14/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> On 5/13/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > I don't think this scenario is all that unlikely. A
> > program is initially written by a Russian programmer
> > who uses his own version of "a" as a variable name.
> > Later an English-speaking programmer makes some
> > changes, and uses an ascii "a". Now there are two
> > subtly different variables called "a" in different
> > parts of the program.
>
> Greg,
>
> If this scenario were *not* unlikely, it would have happened
> to a Java programmer somewhere, right?  Has this *ever*
> happened?  I wasn't able to find a case.

Well, it's not exactly the kind of thing that makes for a riveting blog post.

This is something the Perl 6 people debated for months on end when
deciding whether to support Unicode identifiers. They eventually came
to the conclusion that if your editor doesn't flag this kind of thing,
it's a bug in the editor. I don't know of any editors that actually do
this, but there you go.

Of course, one of the main motivations for including Unicode support
in Perl 6 was that they were running out of "meaningful" ASCII
punctuation combinations and were looking to things like the ?+?
operator and the ? operator for their salvation. Thankfully Python
doesn't have this problem.

Collin Winter

From jcarlson at uci.edu  Mon May 14 18:40:08 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 14 May 2007 09:40:08 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <bb8868b90705140922v2c1d862ic8fb4f91418656bd@mail.gmail.com>
References: <4647B15F.7040700@canterbury.ac.nz>
	<bb8868b90705140922v2c1d862ic8fb4f91418656bd@mail.gmail.com>
Message-ID: <20070514093643.8559.JCARLSON@uci.edu>


"Jason Orendorff" <jason.orendorff at gmail.com> wrote:
> 
> On 5/13/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > I don't think this scenario is all that unlikely. A
> > program is initially written by a Russian programmer
> > who uses his own version of "a" as a variable name.
> > Later an English-speaking programmer makes some
> > changes, and uses an ascii "a". Now there are two
> > subtly different variables called "a" in different
> > parts of the program.
> 
> If this scenario were *not* unlikely, it would have happened
> to a Java programmer somewhere, right?  Has this *ever*
> happened?  I wasn't able to find a case.

Have you been able to find substantial Java source in which non-ascii
identifiers were used?  I have been curious about its prevalence, but
wouldn't even know how to start searching for such code.

 - Josiah


From pje at telecommunity.com  Mon May 14 18:42:27 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 12:42:27 -0400
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
Message-ID: <20070514164041.C8C033A4036@sparrow.telecommunity.com>

At 10:36 PM 5/13/2007 -0700, Collin Winter wrote:
>2. Querying instances ::
>
>      if performs(my_elf, Thieving):
>        ...

-1 on using any function other than isinstance() for this.

Rationale: isinstance() makes the code smell of inspection more 
obvious, where another function name makes it seem like you are doing 
something harmless.  In reality, performs() testing (or any other 
kind of interface testing) using if-then is always harmful in library code.


>    The second argument to ``performs()`` may also be anything with a
>    ``__contains__()`` method, meaning the following is legal: ::
>
>      if performs(my_elf, set([Thieving, Spying, BoyScout])):
>        ...
>
>    Like ``isinstance()``, the object needs only to perform a single
>    role out of the set in order for the expression to be true.

Right, so let's just use isinstance().  Likewise, issubclass() for 
checking whether instances of a class perform a role.  (And if 
issubclass() works, then roles will also be usable by PEP 3124 
generic functions without any additional effort.)
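
For what it's worth, here is a rough sketch of what that looks like with
the overloadable isinstance()/issubclass() machinery proposed in PEP 3119
(Thieving and my_elf come from Collin's example above; the Elf class is
made up):

    from abc import ABCMeta

    class Thieving(metaclass=ABCMeta):
        """A role: no state, just a label classes can be registered for."""

    class Elf:
        pass

    Thieving.register(Elf)        # declare that Elf performs the role

    my_elf = Elf()
    print(isinstance(my_elf, Thieving))   # True -- no separate performs() needed
    print(issubclass(Elf, Thieving))      # True -- works for class-level checks too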


From guido at python.org  Mon May 14 18:41:55 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 09:41:55 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070514163231.275CE3A4036@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
Message-ID: <ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 08:29 AM 5/14/2007 -0700, Guido van Rossum wrote:
> >On 5/14/07, Paul Moore <p.f.moore at gmail.com> wrote:
> > > However, it looks like
> > > there's a general feeling emerging that snipping certain sections
> > > would be enough. I'd agree with that - my personal feeling is that
> > > it'd be OK to remove all of the following sections:
> > >
> > > * "Before" and "After" Methods (as per Steven Bethard's suggestion)
> > > * "Around" Methods (as per Steven Bethard's suggestion)
> > > * Custom Combinations (as per Steven Bethard's suggestion)
> > > * Interfaces and Adaptation (doesn't feel like a core aspect of
> > the proposal)
> > > * Aspects (as per Steven Bethard's suggestion)
> > > * Extension API (currently empty, and that hasn't hampered the
> > discussions!!)
> > >
> > > I'd be OK with them going into an additional PEP, but to be honest, it
> > > wouldn't bother me to see them left out of the PEP process
> > > altogether[1]. (I don't feel that I have enough experience with
> > > *using* GFs to comment meaningfully, so I'd be willing to defer to
> > > Phillip's judgement here).
> >
> >That would suit me fine, since my inclination is to approve some form
> >of the basics of the PEP (with reservations I will explain in another
> >message) but to reject the second PEP.
>
> FYI, wrt to Paul's list, my own list for the 2nd PEP doesn't include
> interfaces and adaptation; they'd be squarely in the first PEP.
>
>
> > > [1] But I'd like to see them documented in the final implementation -
> > > I'm not suggesting they be undocumented features.
> >
> >I'm suggesting they aren't features at all, except for the extension
> >API. All the other stuff should be addable in a separate module using
> >the extension API.
>
> I don't see what the benefit is of making people implement their own
> versions of @before, @after, and @around, which then won't
> interoperate properly with others' versions of the same thing.  Even
> if we leave in place the MethodList base class (which Before and
> After are subclasses of), one of its limitations is that it can only
> combine methods of the same type.  There's no way for two different
> user-implemented "befores" to merge at the same precedence level,
> without some fairly fancy footwork on the implementer's part, or some
> kind of convention being established as to how to tell whether a
> method intends to be a before or after or whatever.  (And this
> same-precedence merging is a critical feature of @before/@after, as
> they are used mainly for "observer"-like hooks, where multiple
> libraries may be observing the same thing.)
>
> So, one of the reasons for including those features (along with
> Aspect) in the stdlib is the standardization part.  Really,
> standardization of a lot of this stuff is the main point to having a
> PEP at all.

OK, let me repeat this request then: real use cases! Point me to code
that uses or could be dramatically simplified by adding all this.
Until then, before/after and everything beyond it is solidly in
YAGNI-land.

> By the way, I'm not sure if I mentioned this before, but Ruby 2.0 is
> supposed to include before/after/around qualifiers, except they're
> called pre/post/wrap, and I'm not sure if the combination rules are
> 100% the same as my before/after/around.  And they're using Ruby's
> open classes rather than standalone generic functions.  But it's
> another data point.
>
> Note that in current Ruby, you can simulate generic functions
> (single-dispatch only) via open classes as long as you use
> sufficiently-unique method names.  The fact that Matz wants to add
> these qualifiers seems to suggest that simple next-method chaining
> (i.e. super) isn't as expressive as they'd like.  Unfortunately, I
> haven't been able to find an RCR for this feature, only references to
> RubyConf slide presentations, so I don't know what their specific rationale is.

So if Matz jumped off a cliff, would you recommend I jump too?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May 14 18:43:22 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 09:43:22 -0700
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <bb8868b90705140842x27257e07o9faaa8407f953d53@mail.gmail.com>
References: <dcbbbb410705132003k73cf08c3hf7f0b99e2e1b9895@mail.gmail.com>
	<20070514113240.19381.1581980774.divmod.quotient.32995@ohm>
	<ca471dc20705140825s453a6281ife63414e2c7e0a9b@mail.gmail.com>
	<bb8868b90705140842x27257e07o9faaa8407f953d53@mail.gmail.com>
Message-ID: <ca471dc20705140943r43ca5f33tad5aff8df87e96ec@mail.gmail.com>

On 5/14/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> On 5/14/07, Guido van Rossum <guido at python.org> wrote:
> > Isn't normalization also going to be an issue with using non-ASCII in
> > general? Does it mean that Python will have to use a normalization
> > before comparing identifiers as equal? That's terrible, as it will
> > vastly increase the amount needed to hash a string, too.
>
> PEP 3131 addresses this.  The tokenizer would normalize identifier
> tokens to NFC.  Because this happens so early, the rest of Python
> would be unaffected.

Does the tokenizer do this for all string literals, too? Otherwise you
could still get surprises with things like x.foo vs. getattr(x,
"foo"), if the name foo were normalized but the string "foo" were not.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Mon May 14 18:58:51 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 12:58:51 -0400
Subject: [Python-3000] PEP 367: New Super
In-Reply-To: <003001c795f8$d5275060$0201a8c0@mshome.net>
References: <003001c795f8$d5275060$0201a8c0@mshome.net>
Message-ID: <20070514165704.4F8D23A4036@sparrow.telecommunity.com>

At 05:23 PM 5/14/2007 +1000, Tim Delaney wrote:
>Determining the class object to use
>'''''''''''''''''''''''''''''''''''
>
>The exact mechanism for associating the method with the defining class is
>not
>specified in this PEP, and should be chosen for maximum performance. For
>CPython, it is suggested that the class instance be held in a C-level
>variable
>on the function object which is bound to one of ``NULL`` (not part of a
>class),
>``Py_None`` (static method) or a class object (instance or class method).

Another open issue here: is the decorated class used, or the undecorated class?


From p.f.moore at gmail.com  Mon May 14 19:01:34 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 14 May 2007 18:01:34 +0100
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
In-Reply-To: <ca471dc20705140822p7724769fo6a00c4db7ea9a26d@mail.gmail.com>
References: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
	<79990c6b0705140356i2cfe4dccx9410534e211d8c94@mail.gmail.com>
	<ca471dc20705140822p7724769fo6a00c4db7ea9a26d@mail.gmail.com>
Message-ID: <79990c6b0705141001h1b9cf648s791ccedd8175474c@mail.gmail.com>

On 14/05/07, Guido van Rossum <guido at python.org> wrote:
> I'm not sure what language you would specifically like to see added to
> the PEP. "Recommendation for 3rd party frameworks: please don't use
> the stick approach." sounds a little strange. What's the point you're
> trying to get across?

Something like:

As a style issue, 3rd party code which wishes to use ABCs should
follow the lead of the core and standard library, and be written in
such a way as to allow, but not require, the use of ABCs.

But as I said, I'm coming to the view that worrying about such things
is FUD. So I'm happy enough to relegate this sort of thing to a possible
PEP 8 amendment if such an issue really does become a problem.

Paul.

From pje at telecommunity.com  Mon May 14 19:34:53 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 13:34:53 -0400
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
In-Reply-To: <79990c6b0705141001h1b9cf648s791ccedd8175474c@mail.gmail.com>
References: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
	<79990c6b0705140356i2cfe4dccx9410534e211d8c94@mail.gmail.com>
	<ca471dc20705140822p7724769fo6a00c4db7ea9a26d@mail.gmail.com>
	<79990c6b0705141001h1b9cf648s791ccedd8175474c@mail.gmail.com>
Message-ID: <20070514173307.E98933A4036@sparrow.telecommunity.com>

At 06:01 PM 5/14/2007 +0100, Paul Moore wrote:
>On 14/05/07, Guido van Rossum <guido at python.org> wrote:
> > I'm not sure what language you would specifically like to see added to
> > the PEP. "Recommendation for 3rd party frameworks: please don't use
> > the stick approach." sounds a little strange. What's the point you're
> > trying to get across?
>
>Something like:
>
>As a style issue, 3rd party code which wishes to use ABCs should
>follow the lead of the core and standard library, and be written in
>such a way as to allow, but not require, the use of ABCs.
>
>But as I said, I'm coming to the view that worrying about such things
>is FUD.

It's not FUD.  It's a pitfall that everybody falls into, even "wizards".

Realistically, warning people about it won't stop everyone from 
falling into it, but it will at least help some of them realize their 
mistake more quickly once they've made it.  :)

(That is, some will go, "oh, so *that's* why they said this was bad", 
instead of thinking their problems are one-time flukes.)

However, the issue I'm talking about here is that of using if-then 
tests to select behavior based on some global type, which is a bit 
more specific than "don't require ABCs", so YMMV.  :)


From jason.orendorff at gmail.com  Mon May 14 19:38:55 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 14 May 2007 13:38:55 -0400
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <ca471dc20705140943r43ca5f33tad5aff8df87e96ec@mail.gmail.com>
References: <dcbbbb410705132003k73cf08c3hf7f0b99e2e1b9895@mail.gmail.com>
	<20070514113240.19381.1581980774.divmod.quotient.32995@ohm>
	<ca471dc20705140825s453a6281ife63414e2c7e0a9b@mail.gmail.com>
	<bb8868b90705140842x27257e07o9faaa8407f953d53@mail.gmail.com>
	<ca471dc20705140943r43ca5f33tad5aff8df87e96ec@mail.gmail.com>
Message-ID: <bb8868b90705141038y757adfcbpde91fa632aec3063@mail.gmail.com>

On 5/14/07, Guido van Rossum <guido at python.org> wrote:
> Does the tokenizer do this for all string literals, too? Otherwise you
> could still get surprises with things like x.foo vs. getattr(x,
> "foo"), if the name foo were normalized but the string "foo" were not.

It does not; so yes, you could.
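
A minimal sketch of the surprise (the attribute name is made up):

    import unicodedata

    class C: pass
    obj = C()

    nfc = unicodedata.normalize('NFC', 'caf\u00e9')   # 4 code points
    nfd = unicodedata.normalize('NFD', 'caf\u00e9')   # 5 code points

    setattr(obj, nfc, 1)                 # what a tokenizer-normalized obj.<name> stores
    print(getattr(obj, nfd, 'missing'))  # 'missing': the NFD string literal doesn't match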

-j

From guido at python.org  Mon May 14 20:25:53 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 11:25:53 -0700
Subject: [Python-3000] PEP 3124 - more commentary
Message-ID: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>

First I'll try to explain why I don't like the sys._getframe() approach.

Phillip's current syntax is roughly:

  def flatten(x): ...      # this is the "base" function

  @overload
  def flatten(y: str): ...    # this adds an overloaded version

The implementation of @overload needs to use sys._getframe() to look
up the name of the function ('flatten') in the surrounding namespace.
I find this too fragile an approach; it means that I can't easily
write another function that calls overload to get the same effect; in
particular, I don't see how this code could work:

  def my_overload(func):
    "Shorthand for @some_decorator + @overload."
    return some_decorator(overload(func))

  @my_overload
  def flatten(z: int): ...

If the overload decorator simply looked in the calling scope, it would
not find 'flatten' there, since that's the local scope of my_overload.
(If it devised some clever scheme of descending down the stack, I
would just have to create a more complicated example.)
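
Here is a toy model of that lookup (nothing like Phillip's actual code,
just enough to show where the wrapper breaks):

    import sys

    def overload(func):
        caller = sys._getframe(1)                      # the frame that invoked us
        existing = caller.f_locals.get(func.__name__)  # "the surrounding namespace"
        if existing is None:
            raise NameError("no %r in the calling scope" % func.__name__)
        # a real implementation would register func as a new case of `existing`
        return existing

    def flatten(x): ...

    @overload            # fine: the caller is the module frame, which has 'flatten'
    def flatten(y): ...

    def my_overload(func):
        return overload(func)    # now the caller is my_overload's own frame...

    @my_overload
    def flatten(z): ...          # ...which has no 'flatten': NameError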

I find the semantics of things that use sys._getframe() muddy and
would really much much rather avoid them. Using the approach in my old
sandbox/overload/overloading.py code, this objection is removed: the
function being overloaded is named explicitly in the decorator.

I realize that @overload is only a shorthand for @when(function). But
I'd much rather not have @overload at all -- the frame inspection
makes it really hard for me to explain carefully what happens without
just giving the code that uses sys._getframe(); and this makes it
difficult to reason about code using @overload.

My own preference for spelling this example would be

@overloadable
def flatten(x): ...

@flatten.overload
def _(y: str): ...

And for the combined decorator:

@my_overload(flatten)
def _(z: int): ...
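
For concreteness, a minimal single-dispatch sketch of that spelling
(exact-type dispatch on one positional argument, just to show the shape
of the registration -- not a real implementation):

    def overloadable(default):
        registry = {}

        def dispatcher(arg):
            return registry.get(type(arg), default)(arg)

        def overload(func):
            for tp in func.__annotations__.values():   # register on the annotation
                registry[tp] = func
            return dispatcher
        dispatcher.overload = overload
        return dispatcher

    @overloadable
    def flatten(x):
        return ["default", x]

    @flatten.overload
    def _(y: str):
        return list(y)

    print(flatten("ab"))     # ['a', 'b']
    print(flatten(3))        # ['default', 3]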

******************

I also really don't like approaches based on patching the function
object's code in place. Again, it makes it hard to reason about
innocent-looking code.  It's one thing to say "we can prove property X
assuming no-one assigns a different function to my global f" (since
assigning to module globals from outside the module is an extremely
rare practice). It's quite another thing to say "we can prove property
X assuming no-one overloads my global f". This is why I really really
really want to require flagging the overloadable function before it
can be overloaded. (And that's why I propose @flatten.overload instead
of @overload(flatten).)

******************

Next, I have a question about the __proceed__ magic argument. I can
see why this is useful, and I can see why having this as a magic
argument is preferable over other solutions (I couldn't come up with a
better solution, and believe me I tried :-).  However, I think making
this the *first* argument would upset tools that haven't been taught
about this yet. Is there any problem with making it a keyword argument
with a default of None, by convention to be placed last?

******************

Finally, I looked at the example of overloading a method instead of a
function.  The little dance required to overload a method defined in a
base class feels fragile, and so does the magic apparently required to
special-case the first argument. This is unfortunate because I imagine
this to be an important use case -- I certainly would expect that the
pretty-printing example would need some state that's most conveniently
stored on a "pretty-printer" object where one overloads the pprint
method, not a pprint function.

******************

Forgive me if this is mentioned in the PEP, but what happens with
keyword args? Can I invoke an overloaded function with (some) keyword
args, assuming they match the argument names given in the default
implementation? Or are we restricted to positional argument passing
only? (That would be a big step backwards.)

******************

Also, can we overload different-length signatures (like in C++ or
Java)? This is very common in those languages; while Python typically
uses default argument values, there are use cases that don't easily
fit in that pattern (e.g. the signature of range()).
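
What the range() case amounts to, sketched by hand with a made-up name
(the two arities really behave like two separate functions):

    def _myrange1(stop):
        return list(range(0, stop))

    def _myrange2(start, stop):
        return list(range(start, stop))

    _by_arity = {1: _myrange1, 2: _myrange2}

    def myrange(*args):
        try:
            impl = _by_arity[len(args)]
        except KeyError:
            raise TypeError("myrange() takes 1 or 2 arguments")
        return impl(*args)

    print(myrange(3))       # [0, 1, 2]
    print(myrange(2, 5))    # [2, 3, 4]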

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From daniel at stutzbachenterprises.com  Mon May 14 20:31:42 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Mon, 14 May 2007 13:31:42 -0500
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705131104r85531f3o12b7e1769d7b7140@mail.gmail.com>
References: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>
	<19dd68ba0705130925j1dd55f1boba9e1b6c036d0422@mail.gmail.com>
	<43aa6ff70705131009s7d5b177dmea7c790d670ac3c0@mail.gmail.com>
	<1d85506f0705131042q23270a91qa31ff2f3940019ed@mail.gmail.com>
	<19dd68ba0705131104r85531f3o12b7e1769d7b7140@mail.gmail.com>
Message-ID: <eae285400705141131u7cda0202p8a1fef47f5021878@mail.gmail.com>

On 5/13/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> Is this a bijective translation ?  How good is most people latin
> character reading ability among Hebrew speakers? From the beginning, I
> can tell from experience that Japanese people have great difficulties
> in reading english or even transliterated japanese (which is never
> good anyway because of homonyms)

Unicode identifiers have been proposed before:

http://mail.python.org/pipermail/i18n-sig/2001-February/000741.html
http://mail.python.org/pipermail/python-list/2002-May/143901.html

Based on those threads, it seems that two empirical criteria that
would sway many in the Python community are:

1) Evidence of positive use and results from languages that already
support Unicode identifiers, such as Java, and/or

2) Support of Unicode identifiers in languages where the primary
language author's native tongue is not based on Latin characters
(notably Yukihiro Matsumoto's Ruby).

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

From jeremy at alum.mit.edu  Mon May 14 20:58:05 2007
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 14 May 2007 14:58:05 -0400
Subject: [Python-3000] getting compiler package failures
In-Reply-To: <ca471dc20705131715qc53a622x4ca6a6bc25ba0c22@mail.gmail.com>
References: <bbaeab100705131558v1e57afb6y2255646231896bd@mail.gmail.com>
	<ca471dc20705131715qc53a622x4ca6a6bc25ba0c22@mail.gmail.com>
Message-ID: <e8bf7a530705141158n7547e74eoccb767e7e1944d62@mail.gmail.com>

On 5/13/07, Guido van Rossum <guido at python.org> wrote:
> test_compiler and test_transformer have been broken for a couple of
> months now I believe.
>
> Unless someone comes to the rescue of the compiler package soon, I'm
> tempted to remove it from the p3yk branch -- it doesn't seem to serve
> any particularly good purpose, especially now that the AST used by the
> compiler written in C is exportable.

We currently lack the ability to take an AST exported by the Python-C
compiler and pass it back to the compiler to generate bytecode.  It
would be a lot more practical, however, to add this ability than to
try to maintain two different compilers.

So a qualified +1 from me.

Jeremy

>
> --Guido
>
> On 5/13/07, Brett Cannon <brett at python.org> wrote:
> > I just did a ``make distclean`` on a clean checkout (r55300) and
> > test_compiler/test_transformer are failing:
> >
> >   File
> > "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> > line 715, in atom
> >      return self._atom_dispatch[nodelist[0][0]](nodelist)
> > KeyError: 322
> >
> > or
> >
> >   File
> > "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> > line 776, in lookup_node
> >     return self._dispatch[node[0]]
> > KeyError: 331
> >
> > or
> >
> >   File
> > "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> > line 783, in com_node
> >     return self._dispatch[node[0]](node[1:])
> > KeyError: 339
> >
> >
> > I don't know the compiler package at all (which is why I am currently
> stuck
> > on Tony Lownds' PEP 3113 patch since I am getting a
> > compiler.transformer.WalkerError) so I have no clue how to
> > go about fixing this.  Anyone happen to know what may have caused the
> > breakage?
> >
> > -Brett
> >
> > _______________________________________________
> > Python-3000 mailing list
> > Python-3000 at python.org
> > http://mail.python.org/mailman/listinfo/python-3000
> > Unsubscribe:
> > http://mail.python.org/mailman/options/python-3000/guido%40python.org
> >
> >
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/jeremy%40alum.mit.edu
>

From guido at python.org  Mon May 14 21:00:28 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 12:00:28 -0700
Subject: [Python-3000] getting compiler package failures
In-Reply-To: <e8bf7a530705141158n7547e74eoccb767e7e1944d62@mail.gmail.com>
References: <bbaeab100705131558v1e57afb6y2255646231896bd@mail.gmail.com>
	<ca471dc20705131715qc53a622x4ca6a6bc25ba0c22@mail.gmail.com>
	<e8bf7a530705141158n7547e74eoccb767e7e1944d62@mail.gmail.com>
Message-ID: <ca471dc20705141200x375d702bn35859ab1e8be9dee@mail.gmail.com>

OK Brett, let 'er rip.

On 5/14/07, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> On 5/13/07, Guido van Rossum <guido at python.org> wrote:
> > test_compiler and test_transformer have been broken for a couple of
> > months now I believe.
> >
> > Unless someone comes to the rescue of the compiler package soon, I'm
> > tempted to remove it from the p3yk branch -- it doesn't seem to serve
> > any particularly good purpose, especially now that the AST used by the
> > compiler written in C is exportable.
>
> We currently lack the ability to take an AST exported by the Python-C
> compiler and pass it back to the compiler to generate bytecode.  It
> would be a lot more practical, however, to add this ability than to
> try to maintain two different compilers.
>
> So a qualified +1 from me.
>
> Jeremy
>
> >
> > --Guido
> >
> > On 5/13/07, Brett Cannon <brett at python.org> wrote:
> > > I just did a ``make distclean`` on a clean checkout (r55300) and
> > > test_compiler/test_transformer are failing:
> > >
> > >   File
> > > "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> > > line 715, in atom
> > >      return self._atom_dispatch[nodelist[0][0]](nodelist)
> > > KeyError: 322
> > >
> > > or
> > >
> > >   File
> > > "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> > > line 776, in lookup_node
> > >     return self._dispatch[node[0]]
> > > KeyError: 331
> > >
> > > or
> > >
> > >   File
> > > "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> > > line 783, in com_node
> > >     return self._dispatch[node[0]](node[1:])
> > > KeyError: 339
> > >
> > >
> > > I don't know the compiler package at all (which is why I am currently
> > stuck
> > > on Tony Lownds' PEP 3113 patch since I am getting a
> > > compiler.transformer.WalkerError) so I have no clue how to
> > > go about fixing this.  Anyone happen to know what may have caused the
> > > breakage?
> > >
> > > -Brett
> > >
> > > _______________________________________________
> > > Python-3000 mailing list
> > > Python-3000 at python.org
> > > http://mail.python.org/mailman/listinfo/python-3000
> > > Unsubscribe:
> > > http://mail.python.org/mailman/options/python-3000/guido%40python.org
> > >
> > >
> >
> >
> > --
> > --Guido van Rossum (home page: http://www.python.org/~guido/)
> > _______________________________________________
> > Python-3000 mailing list
> > Python-3000 at python.org
> > http://mail.python.org/mailman/listinfo/python-3000
> > Unsubscribe:
> > http://mail.python.org/mailman/options/python-3000/jeremy%40alum.mit.edu
> >
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tomerfiliba at gmail.com  Mon May 14 21:12:48 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Mon, 14 May 2007 21:12:48 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <eae285400705141131u7cda0202p8a1fef47f5021878@mail.gmail.com>
References: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>
	<19dd68ba0705130925j1dd55f1boba9e1b6c036d0422@mail.gmail.com>
	<43aa6ff70705131009s7d5b177dmea7c790d670ac3c0@mail.gmail.com>
	<1d85506f0705131042q23270a91qa31ff2f3940019ed@mail.gmail.com>
	<19dd68ba0705131104r85531f3o12b7e1769d7b7140@mail.gmail.com>
	<eae285400705141131u7cda0202p8a1fef47f5021878@mail.gmail.com>
Message-ID: <1d85506f0705141212m65b9ec37q5f685f507e394f01@mail.gmail.com>

as an english-second-language programmer, i'd really like to be able
to have unicode identifiers -- but my gut feeling is -- it will open the
door for a tower of babel.

once we have chinese, french and hindi function names, it'd be very
difficult to interoperate with third party libs. imagine i wrote my
code using twisted-he, while my client has installed twisted-fr...
kaboom? so the next step would be localization files that would
map standard names to locale-specific names? and then the
interpreter would use locale-dependent importing? we'll never see
the end of that. it would just grow more and more complicated.

english, or latin at least, is sufficient for programming. allowing for
more languages effectively means the creation of small, close
communities, rather than a global one.

-1 from me.


-tomer

On 5/14/07, Daniel Stutzbach <daniel at stutzbachenterprises.com> wrote:
> On 5/13/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> > Is this a bijective translation ?  How good is most people latin
> > character reading ability among Hebrew speakers? From the beginning, I
> > can tell from experience that Japanese people have great difficulties
> > in reading english or even transliterated japanese (which is never
> > good anyway because of homonyms)
>
> Unicode identifiers have been proposed before:
>
> http://mail.python.org/pipermail/i18n-sig/2001-February/000741.html
> http://mail.python.org/pipermail/python-list/2002-May/143901.html
>
> Based on those threads, it seems that two empirical criteria that
> would sway many in the Python community are:
>
> 1) Evidence of positive use and results from languages that already
> support Unicode identifiers, such as Java, and/or
>
> 2) Support of Unicode identifiers in languages where the primary
> language author's native tongue is not based on Latin characters
> (notably Yukihiro Matsumoto's Ruby).
>
> --
> Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC
>

From pje at telecommunity.com  Mon May 14 21:26:10 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 15:26:10 -0400
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
Message-ID: <20070514192423.624D63A4036@sparrow.telecommunity.com>

At 11:25 AM 5/14/2007 -0700, Guido van Rossum wrote:
>The implementation of @overload needs to use sys._getframe() to look
>up the name of the function ('flatten') in the surrounding namespace.
>I find this too fragile an approach; it means that I can't easily
>write another function that calls overload to get the same effect; in
>particular, I don't see how this code could work:
>
>   def my_overload(func):
>     "Shorthand for @some_decorator + @overload."
>     return some_decorator(overload(func))
>
>   @my_overload
>   def flatten(z: int): ...
>
>If the overload decorator simply looked in the calling scope, it would
>not find 'flatten' there, since that's the local scope of my_overload.
>(If it devised some clever scheme of descending down the stack, I
>would just have to create a more complicated example.)

Actually, your "my_overload" would just need to do its own getframe 
and call when() on the result, since overload is just sugar for when().


>I realize that @overload is only a shorthand for @when(function). But
>I'd much rather not have @overload at all -- the frame inspection
>makes it really hard for me to explain carefully what happens without
>just giving the code that uses sys._getframe(); and this makes it
>difficult to reason about code using @overload.

This is why in the very earliest GF discussions here, I proposed a 
'defop expr(...)' syntax, as it would eliminate the need for any 
getframe hackery.


>My own preference for spelling this example would be
>
>@overloadable
>def flatten(x): ...
>
>@flatten.overload
>def _(y: str): ...

Btw, this is similar to how RuleDispatch actually spells it, except 
that it's @flatten.when().  Later, I decided I preferred putting the 
*mode* of combination (e.g. when vs. around vs. whatever) first, both 
because it reads more naturally (e.g. "when flattening", "before 
flattening", etc.) and because it enabled one to retroactively extend 
existing functions.


>Next, I have a question about the __proceed__ magic argument. I can
>see why this is useful, and I can see why having this as a magic
>argument is preferable over other solutions (I couldn't come up with a
>better solution, and believe me I tried :-).  However, I think making
>this the *first* argument would upset tools that haven't been taught
>about this yet. Is there any problem with making it a keyword argument
>with a default of None, by convention to be placed last?

Actually, a pending revision to the PEP is to drop the special name 
and instead use a special annotation, e.g.:

     def whatever(nm:next_method, ...):

(This idea came up in an early thread when some folks queried whether 
a better name than __proceed__ could be found.)

Anyway, with this, it could also be placed as a keyword 
argument.  The main reason for putting it in the first position is 
performance.  Allowing it to be anywhere, however, would let the 
choice of where be a matter of style.


>Finally, I looked at the example of overloading a method instead of a
>function.  The little dance required to overload a method defined in a
>base class feels fragile,

Note that a defop syntax would simplify this; i.e. :

     defop MyBaseClass.methodname(...):
         ...

This doesn't help with the first-argument magic, however.

However, since we're going to have to have some way for 'super' to 
know the class a function is defined in, ISTM that the same magic 
should be reusable for the first-argument rule.


>Forgive me if this is mentioned in the PEP, but what happens with
>keyword args? Can I invoke an overloaded function with (some) keyword
>args, assuming they match the argument names given in the default
>implementation?

Yes.  That's done with code generation; PEAK-Rules uses direct 
bytecode generation, but a sourcecode-based generation is also 
possible and would be used for the PEP implementation (it was also 
used in RuleDispatch).
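
Roughly, "sourcecode-based generation" means building a wrapper that has
the same named parameters as the base function, so keyword calls still
line up.  A toy illustration (not PEAK-Rules' or RuleDispatch's actual
generator):

    def make_dispatcher(name, params, dispatch):
        src = "def %s(%s):\n    return _dispatch(%s)\n" % (name, params, params)
        ns = {'_dispatch': dispatch}
        exec(src, ns)
        return ns[name]

    def _dispatch(x, y):
        print("dispatching on", type(x).__name__, type(y).__name__)

    flatten = make_dispatcher('flatten', 'x, y', _dispatch)
    flatten(y=1, x='a')     # keyword arguments work: the generated def
                            # really has parameters named x and y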


>Also, can we overload different-length signatures (like in C++ or
>Java)? This is very common in those languages; while Python typically
>uses default argument values, there are use cases that don't easily
>fit in that pattern (e.g. the signature of range()).

I see a couple different possibilities for this.  Could you give an 
example of how you'd *like* it to work?


From pje at telecommunity.com  Mon May 14 21:40:39 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 15:40:39 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
Message-ID: <20070514193852.7BE1E3A4036@sparrow.telecommunity.com>

At 09:41 AM 5/14/2007 -0700, Guido van Rossum wrote:
> > Note that in current Ruby, you can simulate generic functions
> > (single-dispatch only) via open classes as long as you use
> > sufficiently-unique method names.  The fact that Matz wants to add
> > these qualifiers seems to suggest that simple next-method chaining
> > (i.e. super) isn't as expressive as they'd like.  Unfortunately, I
> > haven't been able to find an RCR for this feature, only references to
> > RubyConf slide presentations, so I don't know what their specific 
> rationale is.
>
>So if Matz jumped off a cliff, would you recommend I jump too?

If we're using cliff-diving as a metaphor for generic functions, I'd 
say that method combination is more comparable to saying that if Matz 
decided he'd like to have a swimsuit next time he went cliff-diving, 
then I would recommend that you consider whether you might like to 
take a swimsuit as well, were you planning your first such dive.  :)

In practice, however, I wasn't recommending blindly following Matz or 
anybody else.
I simply said the plan for Ruby was suggestive that method 
combination is worth looking into further, because in the case of 
Ruby, they already had single-dispatch generic functions, so the 
addition suggests combination is no longer considered a YAGNI there.

As I said, however, I unfortunately haven't been able to find any 
documented rationale for the proposal -- implying that I have no idea 
whether Matz' decision is more comparable to jumping off a cliff or 
packing a swimsuit, and thus cannot give any actual recommendation 
with respect to such.  :)

I simply mentioned the subject in case anybody else knew more about 
the rationale or where to look for the RCR (if one exists).  That is, 
I think it would be useful to know why there was interest in adding 
such a feature there.

Anyway, that's hardly the same as recommending you jump off a 
cliff.  Indeed, it wasn't a recommendation of any sort at all, just a 
comment on my investigation into useful references for the PEP -- 
i.e., something that you asked for more of.

As for actual code, I'm looking now for examples from people's code 
besides mine.  The canonical use for me is separating business rules 
like "@after sell(cust:GoldCustomer, prod:FooProduct): 
email_regional_sales_mgr()" from reusable library code, so that 
"enterprise" developers can add business rules to an upgradeable core 
library maintained by a vendor or architect.  Of course, I'll also 
use them to do debug prints or drop into the debugger.
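
Spelled out as a self-contained toy -- a hand-rolled after() that fires
a rule only when the arguments match its parameter annotations; this is
not the PEP's actual overloading module:

    class Customer: pass
    class GoldCustomer(Customer): pass
    class FooProduct: pass

    def email_regional_sales_mgr():
        print("emailing the regional sales manager")

    _after_rules = []

    def after(func):                    # func: the function being extended
        def register(rule):
            _after_rules.append(rule)
            return rule
        return register

    def sell(cust, prod):               # reusable core library code
        print("sold", type(prod).__name__, "to", type(cust).__name__)
        for rule in _after_rules:
            anns = rule.__annotations__
            if isinstance(cust, anns['cust']) and isinstance(prod, anns['prod']):
                rule(cust, prod)

    # the "enterprise" rule lives outside the core library:
    @after(sell)
    def _(cust: GoldCustomer, prod: FooProduct):
        email_regional_sales_mgr()

    sell(GoldCustomer(), FooProduct())  # triggers the rule
    sell(Customer(), FooProduct())      # does not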

Meanwhile, I've been told repeatedly that TurboGears makes extensive 
use of RuleDispatch, and my quick look today showed they actually use 
a custom method combination, but I haven't yet tracked down where it 
gets used, or what the rationale for it is.

It doesn't appear to be used in the core TurboGears package, so I 
suppose it must be in the various add-ons, which I haven't had time 
to go through yet.  Their custom method combination does *support* 
before/after/around methods, but their core tests only tested 
"around" methods that I saw.


From jimjjewett at gmail.com  Mon May 14 21:43:22 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 14 May 2007 15:43:22 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070514163231.275CE3A4036@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
Message-ID: <fb6fbf560705141243h557e7951s2c04930ffe6f0a39@mail.gmail.com>

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> I don't see what the benefit is of making people implement their own
> versions of @before, @after, and @around, which then won't
> interoperate properly with others' versions of the same thing.  Even
> if we leave in place the MethodList base class (which Before and
> After are subclasses of), one of its limitations is that it can only
> combine methods of the same type.


That sounds broken; could you use a numeric precedence with default
levels, like the logging library does?
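
Something along those lines, as a rough sketch (numeric levels with
named defaults, logging-style; equal levels all run, in registration
order):

    BEFORE, PRIMARY, AFTER = 10, 20, 30      # default levels, like logging's

    _hooks = []

    def combine(level=PRIMARY):
        def register(func):
            _hooks.append((level, func))
            return func
        return register

    def run(*args):
        for level, func in sorted(_hooks, key=lambda pair: pair[0]):
            func(*args)

    @combine(BEFORE)
    def check(x): print("check", x)

    @combine(BEFORE)      # same level as check(): both run, like two observers
    def log(x): print("log", x)

    @combine(PRIMARY)
    def work(x): print("work", x)

    run(42)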

-jJ

From jason.orendorff at gmail.com  Mon May 14 21:46:20 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 14 May 2007 15:46:20 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070514093643.8559.JCARLSON@uci.edu>
References: <4647B15F.7040700@canterbury.ac.nz>
	<bb8868b90705140922v2c1d862ic8fb4f91418656bd@mail.gmail.com>
	<20070514093643.8559.JCARLSON@uci.edu>
Message-ID: <bb8868b90705141246m25261365h2d66fb59d903e3a1@mail.gmail.com>

On 5/14/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> Have you been able to find substantial Java source in which non-ascii
> identifiers were used?  I have been curious about its prevalence, but
> wouldn't even know how to start searching for such code.

No, I haven't.

The most substantial use cases (if any) would have to be
in closed source code, which is hard to find.

I spent a little time looking for Java tutorials in a few
languages: Spanish, Japanese, Chinese, Korean.  Couldn't
find anything in Chinese.  (I don't know these languages. I
have no idea if I was looking in the right places, etc.)

  - For identifiers, the Spanish-language tutorials mostly
    used Spanish words stripped down to ASCII (accents
    and tildes dropped).

  - The Korean and Japanese tutorials I found (3 total)
    used English identifiers exclusively.

They did tend to use non-English characters freely in
comments and (about half the time) in string literals.
The Japanese tutorials had no comments at all in the
code.

-j

From guido at python.org  Mon May 14 21:47:26 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 12:47:26 -0700
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <20070514192423.624D63A4036@sparrow.telecommunity.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
Message-ID: <ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 11:25 AM 5/14/2007 -0700, Guido van Rossum wrote:
> >The implementation of @overload needs to use sys._getframe() to look
> >up the name of the function ('flatten') in the surrounding namespace.
> >I find this too fragile an approach; it means that I can't easily
> >write another function that calls overload to get the same effect; in
> >particular, I don't see how this code could work:
> >
> >   def my_overload(func):
> >     "Shorthand for @some_decorator + @overload."
> >     return some_decorator(overload(func))
> >
> >   @my_overload
> >   def flatten(z: int): ...
> >
> >If the overload decorator simply looked in the calling scope, it would
> >not find 'flatten' there, since that's the local scope of my_overload.
> >(If it devised some clever scheme of descending down the stack, I
> >would just have to create a more complicated example.)
>
> Actually, your "my_overload" would just need to do its own getframe
> and call when() on the result, since overload is just sugar for when().

That does nothing to address my abhorrence of sys._getframe(). To the
contrary, it looks like knowledge of the implementation is required for
proper use and understanding of @overload. A big fat -1 on that.

> >I realize that @overload is only a shorthand for @when(function). But
> >I'd much rather not have @overload at all -- the frame inspection
> >makes it really hard for me to explain carefully what happens without
> >just giving the code that uses sys._getframe(); and this makes it
> >difficult to reason about code using @overload.
>
> This is why in the very earliest GF discussions here, I proposed a
> 'defop expr(...)' syntax, as it would eliminate the need for any
> getframe hackery.

But that would completely kill your "but it's all pure Python code so
it's harmless and portable" argument.

It seems that you're really not interested at all in compromising to
accept mandatory marking of the base overloadable function. That's too
bad, because I'm not compromising either *on that particular issue*.

> >My own preference for spelling this example would be
> >
> >@overloadable
> >def flatten(x): ...
> >
> >@flatten.overload
> >def _(y: str): ...
>
> Btw, this is similar to how RuleDispatch actually spells it, except
> that it's @flatten.when().  Later, I decided I preferred putting the
> *mode* of combination (e.g. when vs. around vs. whatever) first, both
> because it reads more naturally (e.g. "when flattening", "before
> flattening", etc.)

But the function name isn't "flattening" (and there are good reasons
for that). This requires too much squinting to work.

> and because it enabled one to retroactively extend
> existing functions.

Which as you know I don't like, so that argument doesn't hold.

I find that "when" feels like a "condition" (albeit a temporal one),
and I'd much rather read the descriptor in terms of what the action of
the decorator is (i.e. some kind of registration) than have it read
like some vaguely declarative English phrase.

> >Next, I have a question about the __proceed__ magic argument. I can
> >see why this is useful, and I can see why having this as a magic
> >argument is preferable over other solutions (I couldn't come up with a
> >better solution, and believe me I tried :-).  However, I think making
> >this the *first* argument would upset tools that haven't been taught
> >about this yet. Is there any problem with making it a keyword argument
> >with a default of None, by convention to be placed last?
>
> Actually, a pending revision to the PEP is to drop the special name
> and instead use a special annotation, e.g.:
>
>      def whatever(nm:next_method, ...):
>
> (This idea came up in an early thread when some folks queried whether
> a better name than __proceed__ could be found.)

Cool, I agree that an annotation is better than a magic name.

> Anyway, with this, it could also be placed as a keyword
> argument.  The main reason for putting it in the first position is
> performance.  Allowing it to be anywhere, however, would let the
> choice of where be a matter of style.

Right. What's the performance issue with the first argument?

> >Finally, I looked at the example of overloading a method instead of a
> >function.  The little dance required to overload a method defined in a
> >base class feels fragile,
>
> Note that a defop syntax would simplify this; i.e. :
>
>      defop MyBaseClass.methodname(...):
>          ...
>
> This doesn't help with the first-argument magic, however.
>
> However, since we're going to have to have some way for 'super' to
> know the class a function is defined in, ISTM that the same magic
> should be reusable for the first-argument rule.

Perhaps. Though super only needs to know it once the method is being
called, while your decorator (presumably) needs to know when the
method is being defined, i.e. before the class object is constructed.

Also, the similarities between next-method and super are overwhelming.
It would be great if you could work with Tim Delaney on a mechanism
underlying all three issues, or at least two of the three.

> >Forgive me if this is mentioned in the PEP, but what happens with
> >keyword args? Can I invoke an overloaded function with (some) keyword
> >args, assuming they match the argument names given in the default
> >implementation?
>
> Yes.  That's done with code generation; PEAK-Rules uses direct
> bytecode generation, but a sourcecode-based generation is also
> possible and would be used for the PEP implementation (it was also
> used in RuleDispatch).

There's currently no discussion of this. Without a good understanding
of the implementation I cannot accept the PEP.

> >Also, can we overload different-length signatures (like in C++ or
> >Java)? This is very common in those languages; while Python typically
> >uses default argument values, there are use cases that don't easily
> >fit in that pattern (e.g. the signature of range()).
>
> I see a couple different possibilities for this.  Could you give an
> example of how you'd *like* it to work?

In the simplest case (no default argument values) overloading two-arg
functions and three-arg functions with the same name should act as if
there were two completely separate functions, except for the base
(default) function. Example:

@overloadable
def range(start:int, stop:int, step:int):
  ...  # implement xrange

@range.overload
def range(x): return range(0, x, 1)

@range.overload
def range(x, y): return range(x, y, 1)
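
For what it's worth, the arity-only case can already be emulated with a
plain registry keyed on argument count.  A minimal sketch (the decorator
names below are invented, not the PEP's API, and the function is renamed
to avoid shadowing the builtin):

    def overloadable_by_arity(default):
        # Illustrative registry: dispatch purely on the number of
        # positional arguments, falling back to the default function.
        registry = {}
        def dispatch(*args):
            impl = registry.get(len(args), default)
            return impl(*args)
        def overload(func):
            # func.__code__ is spelled func.func_code on 2.x
            registry[func.__code__.co_argcount] = func
            return dispatch
        dispatch.overload = overload
        return dispatch

    @overloadable_by_arity
    def range3(start, stop, step):
        return list(range(start, stop, step))   # stand-in for the real work

    @range3.overload
    def range3(stop):
        return range3(0, stop, 1)

    @range3.overload
    def range3(start, stop):
        return range3(start, stop, 1)

    # range3(5) == range3(0, 5) == range3(0, 5, 1)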

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May 14 21:51:23 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 12:51:23 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
	<20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
Message-ID: <ca471dc20705141251y7162cc1fpf2f93e9afd970a6@mail.gmail.com>

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> I simply said the plan for Ruby was suggestive that method
> combination is worth looking into further, because in the case of
> Ruby, they already had single-dispatch generic functions, so the
> addition suggests combination is no longer considered a YAGNI there.
>
> As I said, however, I unfortunately haven't been able to find any
> documented rationale for the proposal -- implying that I have no idea
> whether Matz' decision is more comparable to jumping off a cliff or
> packing a swimsuit, and thus cannot give any actual recommendation
> with respect to such.  :)

So how do you know what's going on there is the same as what's
apparently going on here, i.e. some folks have fallen in love with
CLOS or Haskell or whatever and are pushing for some theoretical ideal
that has no practical applications?

> Meanwhile, I've been told repeatedly that TurboGears makes extensive
> use of RuleDispatch, and my quick look today showed they actually use
> a custom method combination, but I haven't yet tracked down where it
> gets used, or what the rationale for it is.
>
> It doesn't appear to be used in the core TurboGears package, so I
> suppose it must be in the various add-ons, which I haven't had time
> to go through yet.  Their custom method combination does *support*
> before/after/around methods, but their core tests only tested
> "around" methods that I saw.

I'm looking forward to a more complete examination of that use case.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From collinw at gmail.com  Mon May 14 21:52:22 2007
From: collinw at gmail.com (Collin Winter)
Date: Mon, 14 May 2007 12:52:22 -0700
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <c19402930705140306w6f9a2df4m4f4cb9c5a1ba9630@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
	<c19402930705140306w6f9a2df4m4f4cb9c5a1ba9630@mail.gmail.com>
Message-ID: <43aa6ff70705141252r4dbebaa3p1b4e5939719abdff@mail.gmail.com>

On 5/14/07, Arvind Singh <arvind1.singh at gmail.com> wrote:
> > Asking Questions About Roles
>
> Shouldn't there be some way to ``revoke'' roles?

No, roles are purely additive. Allowing role revocation is an easy
recipe for race conditions where one bit of code says type X does a
given role and another bit of code says it doesn't.

> How can we get a list of all roles played by an object?

Something like this could be trivially added. What use-case do you have in mind?

> Should there be a way to check ``loosely'' whether an object can
> potentially play a given role? (i.e., checking whether an object
> provides a given interface, at least syntactically)

This could be added, yes.

Collin Winter

From jcarlson at uci.edu  Mon May 14 22:03:10 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 14 May 2007 13:03:10 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <bb8868b90705141246m25261365h2d66fb59d903e3a1@mail.gmail.com>
References: <20070514093643.8559.JCARLSON@uci.edu>
	<bb8868b90705141246m25261365h2d66fb59d903e3a1@mail.gmail.com>
Message-ID: <20070514125306.8563.JCARLSON@uci.edu>


"Jason Orendorff" <jason.orendorff at gmail.com> wrote:
> On 5/14/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> > Have you been able to find substantial Java source in which non-ascii
> > identifiers were used?  I have been curious about its prevalence, but
> > wouldn't even know how to start searching for such code.
> 
> No, I haven't.
> 
> The most substantial use cases (if any) would have to be
> in closed source code, which is hard to find.
[snip]
> They did tend to use non-English characters freely in
> comments and (about half the time) in string literals.
> The Japanese tutorials had no comments at all in the
> code.

Your findings seem to suggest (but not prove either way) that having
unicode strings and comments (that Python already supports) may be
sufficient for a majority of use-cases (assuming that people document
and comment their code ;).  It would be nice to be able to find more
examples in Java.

I guess the question is whether the potential for community
fragmentation is worth trying to handle a (seemingly much) smaller set
of use-cases than is (already arguably sufficiently) handled with ascii
identifiers.


 - Josiah


From collinw at gmail.com  Mon May 14 22:03:44 2007
From: collinw at gmail.com (Collin Winter)
Date: Mon, 14 May 2007 13:03:44 -0700
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <d11dcfba0705132308h712fbadeq91e4474f294ac936@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
	<d11dcfba0705132308h712fbadeq91e4474f294ac936@mail.gmail.com>
Message-ID: <43aa6ff70705141303t63520c44g8f88f8ae56732137@mail.gmail.com>

On 5/13/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 5/13/07, Collin Winter <collinw at gmail.com> wrote:
> > PEP: 3133
> > Title: Introducing Roles
> [snip]
> > * Roles provide a way of indicating an object's semantics and abstract
> >   capabilities.  A role may define abstract methods, but only as a
> >   way of delineating an interface through which a particular set of
> >   semantics are accessed.
> [snip]
> > * Abstract base classes, by contrast, are a way of reusing common,
> >   discrete units of implementation.
> [snip]
> >   Using this abstract base class - more properly, a concrete
> >   mixin - allows a programmer to define a limited set of operators
> >   and let the mixin in effect "derive" the others.
>
> So what's the difference between a role and an abstract base class
> that used @abstractmethod on all of its methods? Isn't such an ABC
> just "delineating an interface"?
>
> > since the ``OrderingMixin`` class above satisfies the interface
> > and semantics expressed in the ``Ordering`` role, we say the mixin
> > performs the role: ::
> >
> >   @perform_role(Ordering)
> >   class OrderingMixin:
> >     def __ge__(self, other):
> >       return self > other or self == other
> >
> >     def __le__(self, other):
> >       return self < other or self == other
> >
> >     def __ne__(self, other):
> >       return not self == other
> >
> >     # ...and so on
> >
> > Now, any class that uses the mixin will automatically -- that is,
> > without further programmer effort -- be tagged as performing the
> > ``Ordering`` role.
>
> But why is::
>
>     performs(obj, Ordering)
>
> any better than::
>
>     isinstance(obj, Ordering)
>
> if Ordering is just an appropriately registered ABC?

There really is no difference between roles and all-@abstractmethod
ABCs. From my point of view, though, roles win because they don't
require any changes to the interpreter; they're a much simpler way of
expressing the same concept. You may like adding the extra complexity
and indirection to the VM necessary to support
issubclass()/isinstance() overriding, but I don't.
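
To make that concrete, the machinery can be sketched in a few lines of
plain Python (illustrative only -- it skips the PEP's signature
checking, and the ``_performed_roles`` attribute is an invented detail):

    class Role(type):
        """Roles are ordinary classes built with this metaclass."""

    def perform_role(*roles):
        def decorate(cls):
            performed = set(getattr(cls, '_performed_roles', ()))
            performed.update(roles)
            cls._performed_roles = frozenset(performed)
            return cls
        return decorate

    def performs(obj, role):
        performed = getattr(type(obj), '_performed_roles', ())
        # A role is performed if it, or any role inheriting from it,
        # was attached to the object's class (or a base class).
        return any(issubclass(r, role) for r in performed)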

Collin Winter

From steven.bethard at gmail.com  Mon May 14 22:33:31 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 14 May 2007 14:33:31 -0600
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <43aa6ff70705141303t63520c44g8f88f8ae56732137@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
	<d11dcfba0705132308h712fbadeq91e4474f294ac936@mail.gmail.com>
	<43aa6ff70705141303t63520c44g8f88f8ae56732137@mail.gmail.com>
Message-ID: <d11dcfba0705141333j3aa93914s5d680fff99af7283@mail.gmail.com>

On 5/14/07, Collin Winter <collinw at gmail.com> wrote:
> There really is no difference between roles and all-@abstractmethod
> ABCs. From my point of view, though, roles win because they don't
> require any changes to the interpreter; they're a much simpler way of
> expressing the same concept.

Ok, you clearly have an implementation in mind, but I don't know what
it is.  As far as I can tell:

* metaclass=Role ~ metaclass=ABCMeta, except that all methods must be abstract
* perform_role(role)(cls) ~ role.register(cls)
* performs(obj, role) ~ isinstance(obj, role)

And so, as far as I can see, without an Implementation section, all
you're proposing is a different syntax for the same functionality. Was
there a discussion of your implementation that I missed?
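
For concreteness, the right-hand column of that mapping looks roughly
like this with the PEP 3119 machinery (a sketch only, reusing the
Ordering example):

    from abc import ABCMeta, abstractmethod

    class Ordering(metaclass=ABCMeta):           # ~ class Ordering(metaclass=Role)
        @abstractmethod
        def __lt__(self, other): ...

    class OrderingMixin:
        def __ge__(self, other):
            return self > other or self == other
        # ...and so on

    Ordering.register(OrderingMixin)             # ~ perform_role(Ordering)(OrderingMixin)
    print(isinstance(OrderingMixin(), Ordering)) # ~ performs(obj, Ordering) -> True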

> You may like adding the extra complexity
> and indirection to the VM necessary to support
> issubclass()/isinstance() overriding, but I don't.

Have you looked at Guido's issubclass()/isinstance() patch
(http://bugs.python.org/1708353)?  I'd hardly say that 34 lines of C
code is substantial "extra complexity".

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From foom at fuhm.net  Mon May 14 22:44:54 2007
From: foom at fuhm.net (James Y Knight)
Date: Mon, 14 May 2007 16:44:54 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
Message-ID: <7835189A-91EC-4582-8667-5DAC18DDF301@fuhm.net>

On May 14, 2007, at 12:41 PM, Guido van Rossum wrote:
> OK, let me repeat this request than: real use cases! Point me to code
> that uses or could be dramatically simplified by adding all this.
> Until, then, before/after and everything beyond it is solidly in
> YAGNI-land.


Excerpted from a recent post to scons-dev by Maciej Pasternacki :

> COMMON ISSUES
>
> Automake uses -local and -hook rules to allow software author to
> customize generated Makefile's behaviour without overriding it.  Some
> way of hinting what should be done before/after/around node is built
> should be provided to make it possible also in SCons.  API for this
> might be slightly based on how Common Lisp Object System's method
> combinations work
> (http://www.lispworks.com/documentation/HyperSpec/Body/07_ffb.htm).
>
> General solution would allow -local/-hook-type customization for all
> nodes, not just a few selected ones like Automake does.

I'm not sure if the poster had seen this PEP already or not, but I  
pointed him towards it. (Note: this is regarding a proposal, not  
existing code).

James

From brett at python.org  Mon May 14 23:10:31 2007
From: brett at python.org (Brett Cannon)
Date: Mon, 14 May 2007 14:10:31 -0700
Subject: [Python-3000] getting compiler package failures
In-Reply-To: <ca471dc20705141200x375d702bn35859ab1e8be9dee@mail.gmail.com>
References: <bbaeab100705131558v1e57afb6y2255646231896bd@mail.gmail.com>
	<ca471dc20705131715qc53a622x4ca6a6bc25ba0c22@mail.gmail.com>
	<e8bf7a530705141158n7547e74eoccb767e7e1944d62@mail.gmail.com>
	<ca471dc20705141200x375d702bn35859ab1e8be9dee@mail.gmail.com>
Message-ID: <bbaeab100705141410g2946ba9ag2b7f6bcf037c5984@mail.gmail.com>

On 5/14/07, Guido van Rossum <guido at python.org> wrote:
>
> OK Brett, let 'er rip.


Ripped in revision 55322.

-Brett


On 5/14/07, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> > On 5/13/07, Guido van Rossum <guido at python.org> wrote:
> > > test_compiler and test_transformer have been broken for a couple of
> > > months now I believe.
> > >
> > > Unless someone comes to the rescue of the compiler package soon, I'm
> > > tempted to remove it from the p3yk branch -- it doesn't seem to serve
> > > any particularly good purpose, especially now that the AST used by the
> > > compiler written in C is exportable.
> >
> > We currently lack the ability to take an AST exported by the Python-C
> > compiler and pass it back to the compiler to generate bytecode.  It
> > would be a lot more practical, however, to add this ability than to
> > try to maintain two different compilers.
> >
> > So a qualified +1 from me.
> >
> > Jeremy
> >
> > >
> > > --Guido
> > >
> > > On 5/13/07, Brett Cannon <brett at python.org> wrote:
> > > > I just did a ``make distclean`` on a clean checkout (r55300) and
> > > > test_compiler/test_transformer are failing:
> > > >
> > > >   File
> > > > "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> > > > line 715, in atom
> > > >      return self._atom_dispatch[nodelist[0][0]](nodelist)
> > > > KeyError: 322
> > > >
> > > > or
> > > >
> > > >   File
> > > > "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> > > > line 776, in lookup_node
> > > >     return self._dispatch[node[0]]
> > > > KeyError: 331
> > > >
> > > > or
> > > >
> > > >   File
> > > > "/Users/drifty/Dev/python/3.x/pristine/Lib/compiler/transformer.py",
> > > > line 783, in com_node
> > > >     return self._dispatch[node[0]](node[1:])
> > > > KeyError: 339
> > > >
> > > >
> > > > I don't know the compiler package at all (which is why I am currently
> > > > stuck on Tony Lownds' PEP 3113 patch since I am getting a
> > > > compiler.transformer.WalkerError) so I have no clue how to
> > > > go about fixing this.  Anyone happen to know what may have caused the
> > > > breakage?
> > > >
> > > > -Brett
> > > >
> > >
> > > --
> > > --Guido van Rossum (home page: http://www.python.org/~guido/)
> >
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>

From collinw at gmail.com  Mon May 14 23:18:35 2007
From: collinw at gmail.com (Collin Winter)
Date: Mon, 14 May 2007 14:18:35 -0700
Subject: [Python-3000] getting compiler package failures
In-Reply-To: <bbaeab100705141410g2946ba9ag2b7f6bcf037c5984@mail.gmail.com>
References: <bbaeab100705131558v1e57afb6y2255646231896bd@mail.gmail.com>
	<ca471dc20705131715qc53a622x4ca6a6bc25ba0c22@mail.gmail.com>
	<e8bf7a530705141158n7547e74eoccb767e7e1944d62@mail.gmail.com>
	<ca471dc20705141200x375d702bn35859ab1e8be9dee@mail.gmail.com>
	<bbaeab100705141410g2946ba9ag2b7f6bcf037c5984@mail.gmail.com>
Message-ID: <43aa6ff70705141418k23664327h7416fd24bd0851cd@mail.gmail.com>

On 5/14/07, Brett Cannon <brett at python.org> wrote:
>
>
> On 5/14/07, Guido van Rossum <guido at python.org> wrote:
> > OK Brett, let 'er rip.
>
> Ripped in revision 55322.

Woohoo!

From benji at benjiyork.com  Mon May 14 23:35:34 2007
From: benji at benjiyork.com (Benji York)
Date: Mon, 14 May 2007 17:35:34 -0400
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
Message-ID: <4648D626.1030201@benjiyork.com>

Collin Winter wrote:
> PEP: 3133
> Title: Introducing Roles

Everything included here is already available in zope.interface.  See 
the in-line comments below for the analogs.

[snip]

> Performing Your Role
> ====================
> 
> Static Role Assignment
> ----------------------
> 
> Let's start out by defining ``Tree`` and ``Dog`` classes ::
> 
>   class Tree(Vegetable):
> 
>     def bark(self):
>       return self.is_rough()
> 
> 
>   class Dog(Animal):
> 
>     def bark(self):
>       return self.goes_ruff()
> 
> While both implement a ``bark()`` method with the same signature,
> they do wildly different things.  We need some way of differentiating
> what we're expecting. Relying on inheritance and a simple
> ``isinstance()`` test will limit code reuse and/or force any dog-like
> classes to inherit from ``Dog``, whether or not that makes sense.
> Let's see if roles can help. ::
> 
>   @perform_role(Doglike)
>   class Dog(Animal):
>     ...

class Dog(Animal):
     zope.interface.implements(Doglike)

>   @perform_role(Treelike)
>   class Tree(Vegetable):
>     ...

     class Tree(Vegetable):
         zope.interface.implements(Treelike)

>   @perform_role(SitThere)
>   class Rock(Mineral):
>     ...

     class Rock(Mineral):
         zope.interface.implements(SitThere)

> We use class decorators from PEP 3129 to associate a particular role
> or roles with a class.

zope.interface.implements should be usable with the PEP 3129 class 
decorator syntax, but I showed the current (class body) syntax throughout.

> Client code can now verify that an incoming
> object performs the ``Doglike`` role, allowing it to handle ``Wolf``,
> ``LaughingHyena`` and ``Aibo`` [#aibo]_ instances, too.
> 
> Roles can be composed via normal inheritance: ::
> 
>   @perform_role(Guard, MummysLittleDarling)
>   class GermanShepherd(Dog):
> 
>     def guard(self, the_precious):
>       while True:
>         if intruder_near(the_precious):
>           self.growl()
> 
>     def get_petted(self):
>       self.swallow_pride()

class GermanShepherd(Dog):
     zope.interface.implements(Guard, MummysLittleDarling)

[rest of class definition is the same]

> Here, ``GermanShepherd`` instances perform three roles: ``Guard`` and
> ``MummysLittleDarling`` are applied directly, whereas ``Doglike``
> is inherited from ``Dog``.
> 
> 
> Assigning Roles at Runtime
> --------------------------
> 
> Roles can be assigned at runtime, too, by unpacking the syntactic
> sugar provided by decorators.
> 
> Say we import a ``Robot`` class from another module, and since we
> know that ``Robot`` already implements our ``Guard`` interface,
> we'd like it to play nicely with guard-related code, too. ::
> 
>   >>> perform(Guard)(Robot)
> 
> This takes effect immediately and impacts all instances of ``Robot``.

     >>> zope.interface.classImplements(Robot, Guard)

> Asking Questions About Roles
> ----------------------------
> 
> Just because we've told our robot army that they're guards, we'd
> like to check in on them occasionally and make sure they're still at
> their task. ::
> 
>   >>> performs(our_robot, Guard)
>   True

     >>> Guard.providedBy(our_robot)
     True

> What about that one robot over there? ::
> 
>   >>> performs(that_robot_over_there, Guard)
>   True

     >>> Guard.providedBy(that_robot_over_there)
     True

> The ``performs()`` function is used to ask if a given object
> fulfills a given role.  It cannot be used, however, to ask a
> class if its instances fulfill a role: ::
> 
>   >>> performs(Robot, Guard)
>   False

     >>> Guard.providedBy(Robot)
     False

> This is because the ``Robot`` class is not interchangeable
> with a ``Robot`` instance.

But if you want to find out if a class creates instances that provide an 
interface you can::

     >>> Guard.implementedBy(Robot)
     True

> 
> Defining New Roles
> ==================
> 
> Empty Roles
> -----------
> 
> Roles are defined like a normal class, but use the ``Role``
> metaclass. ::
> 
>   class Doglike(metaclass=Role):
>     ...

Interfaces are defined like normal classes, but subclass 
zope.interface.Interface:

     class Doglike(zope.interface.Interface):
         pass

> Metaclasses are used to indicate that ``Doglike`` is a ``Role`` in
> the same way 5 is an ``int`` and ``tuple`` is a ``type``.
> 
> 
> Composing Roles via Inheritance
> -------------------------------
> 
> Roles may inherit from other roles; this has the effect of composing
> them.  Here, instances of ``Dog`` will perform both the
> ``Doglike`` and ``FourLegs`` roles. ::
> 
>   class FourLegs(metaclass=Role):
>     pass
> 
>   class Doglike(FourLegs, Carnivor):
>     pass
> 
>   @perform_role(Doglike)
>   class Dog(Mammal):
>     pass

     class FourLegs(zope.interface.Interface):
         pass

     class Doglike(FourLegs, Carnivore):
         pass

     class Dog(Mammal):
         zope.interface.implements(Doglike)

> Requiring Concrete Methods
> --------------------------
> 
> So far we've only defined empty roles -- not very useful things.
> Let's now require that all classes that claim to fulfill the
> ``Doglike`` role define a ``bark()`` method: ::
> 
>   class Doglike(FourLegs):
> 
>     def bark(self):
>       pass

     class Doglike(FourLegs):
         def bark():
             pass

> No decorators are required to flag the method as "abstract", and the
> method will never be called, meaning whatever code it contains (if any)
> is irrelevant.  Roles provide *only* abstract methods; concrete
> default implementations are left to other, better-suited mechanisms
> like mixins.
> 
> Once you have defined a role, and a class has claimed to perform that
> role, it is essential that that claim be verified.  Here, the
> programmer has misspelled one of the methods required by the role. ::
> 
>   @perform_role(FourLegs)
>   class Horse(Mammal):
> 
>     def run_like_teh_wind(self)
>       ...
> 
> This will cause the role system to raise an exception, complaining
> that you're missing a ``run_like_the_wind()`` method.  The role
> system carries out these checks as soon as a class is flagged as
> performing a given role.

zope.interface does no runtime checking.  It has a similar mechanism in 
zope.interface.verify::

     >>> from zope.interface.verify import verifyObject
     >>> verifyObject(Guard, our_robot)
     True

> Concrete methods are required to match exactly the signature demanded
> by the role.  Here, we've attempted to fulfill our role by defining a
> concrete version of ``bark()``, but we've missed the mark a bit. ::
> 
>   @perform_role(Doglike)
>   class Coyote(Mammal):
> 
>     def bark(self, target=moon):
>       pass
> 
> This method's signature doesn't match exactly with what the
> ``Doglike`` role was expecting, so the role system will throw a bit
> of a tantrum.

zope.interface doesn't do anything like this.  I suspect *args and 
**kws make it impractical to do so (leaving aside whether or not it's a 
good idea).

The rest of the PEP concerns implementation and other details, so 
eliding that.
-- 
Benji York
http://benjiyork.com

From pje at telecommunity.com  Mon May 14 23:50:56 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 17:50:56 -0400
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
Message-ID: <20070514214915.C361C3A4036@sparrow.telecommunity.com>

At 12:47 PM 5/14/2007 -0700, Guido van Rossum wrote:
>> >I realize that @overload is only a shorthand for @when(function). But
>> >I'd much rather not have @overload at all -- the frame inspection
>> >makes it really hard for me to explain carefully what happens without
>> >just giving the code that uses sys._getframe(); and this makes it
>> >difficult to reason about code using @overload.
>>
>>This is why in the very earliest GF discussions here, I proposed a
>>'defop expr(...)' syntax, as it would eliminate the need for any
>>getframe hackery.
>
>But that would completely kill your "but it's all pure Python code so
>it's harmless and portable" argument.

Uh, wha?  You lost me completely there.  A 'defop' syntax simply 
eliminates the need to name the target function twice (once in the 
decorator, and again in the 'def').  I don't get what that has to do 
with stuff being harmless or portable or any of that.

Are you perhaps conflating this with the issue of marking functions 
as overloadable?  These are  independent ideas, AFAICT.


>It seems that you're really not interested at all in compromising to
>accept mandatory marking of the base overloadable function.

Uh, wha?  I already agreed to that a couple of weeks ago:

http://mail.python.org/pipermail/python-3000/2007-May/007205.html

I just haven't updated the PEP yet -- any more than I've updated it 
with anything else that's been in these ongoing threads, like the 
:next_method annotation or splitting the PEP.


>>Anyway, with this, it could also be placed as a keyword
>>argument.  The main reason for putting it in the first position is
>>performance.  Allowing it to be anywhere, however, would let the
>>choice of where be a matter of style.
>
>Right. What's the performance issue with the first argument?

Chaining using the first argument can be implemented using a bound 
method object, which gets performance bonuses from the C eval loop 
that partial() objects don't.  (Of course, when RuleDispatch was 
written, partial() objects didn't exist, anyway.)
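
A tiny illustration of the two ways of pre-binding the chained
continuation into the first-argument slot (the names are invented; the
point is only that the bound-method form benefits from CPython's fast
path for method calls):

    import types
    from functools import partial

    def add_header(next_method, x):
        # next_method is the chained continuation, injected as the first argument
        return "header\n" + next_method(x)

    def base(x):
        return str(x)

    via_partial = partial(add_header, base)
    via_bound = types.MethodType(add_header, base)   # 'base' plays the role of self

    assert via_partial(42) == via_bound(42) == "header\n42"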


>>However, since we're going to have to have some way for 'super' to
>>know the class a function is defined in, ISTM that the same magic
>>should be reusable for the first-argument rule.
>
>Perhaps. Though super only needs to know it once the method is being
>called, while your decorator (presumably) needs to know when the
>method is being defined, i.e. before the class object is constructed.

Not really; at some point the class object has to be assigned and 
stored somewhere for super to use, so if the same process of "assigning" 
can be used to actually perform the registration, we're good to go.


>Also, the similarities between next-method and super are overwhelming.
>It would be great if you could work with Tim Delaney on a mechanism
>underlying all three issues, or at least two of the three.

I'm not sure I follow you.  Do you mean, something like using :super 
as the annotation instead of next_method, or are you just talking 
about the implementation mechanics?


>> >Forgive me if this is mentioned in the PEP, but what happens with
>> >keyword args? Can I invoke an overloaded function with (some) keyword
>> >args, assuming they match the argument names given in the default
>> >implementation?
>>
>>Yes.  That's done with code generation; PEAK-Rules uses direct
>>bytecode generation, but a sourcecode-based generation is also
>>possible and would be used for the PEP implementation (it was also
>>used in RuleDispatch).
>
>There's currently no discussion of this.

Well, actually there's this bit:

"""The use of BytecodeAssembler can be replaced using an "exec" or "compile"
workaround, given a reasonable effort.  (It would be easier to do this
if the ``func_closure`` attribute of function objects was writable.)"""

But the closure bit is irrelevant if we're using @overloadable.

>Without a good understanding
>of the implementation I cannot accept the PEP.

The mechanism is exec'ing of a string containing a function 
definition.  The original function's signature is obtained using 
inspect.getargspec(), and the string is exec'd to obtain a new 
function whose signature matches, but whose body contains the generic 
function lookup code.

In practice, the actual function definition has to be nested, so that 
argument defaults can be passed in without needing to convert them to 
strings, and so that the needed lookup tables can be seen via closure 
variables.  A string template would look something like:

     def make_the_function(__defaults, __lookup):
         def $funcname($accept_signature):
             return __lookup($type_tuple)($call_signature)
         return $funcname

The $type_tuple bit would expand to something like:

     type(firstargname), type(secondargname), ...

And $accept_signature would expand to the original function's 
signature, with default values replaced by "__defaults[0]", 
"__defaults[1]", etc. in order to make the resulting function have 
the same default values.

The function that would be returned from @overloadable would be the 
result of calling "make_the_function", passing in the original 
function's func_defaults and an appropriate value for __lookup.

A similar approach is used in RuleDispatch currently.
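
A rough, self-contained sketch of that expansion (helper names invented;
*args/**kw, keyword-only arguments, error handling and caching are all
omitted):

    import inspect

    def make_dispatcher(func, lookup):
        # Build source for a function with the same positional signature and
        # defaults as `func`, whose body just looks up an implementation by
        # the types of its arguments and calls it.  Assumes at least one
        # positional argument.  (inspect.getargspec is the API named above;
        # modern Python spells it getfullargspec.)
        args, varargs, varkw, defaults = inspect.getargspec(func)
        defaults = defaults or ()
        first_default = len(args) - len(defaults)
        accept = ", ".join(
            name if i < first_default
            else "%s=__defaults[%d]" % (name, i - first_default)
            for i, name in enumerate(args))
        call = ", ".join(args)
        types_ = ", ".join("type(%s)" % name for name in args)
        src = ("def __make(__defaults, __lookup):\n"
               "    def %s(%s):\n"
               "        return __lookup((%s,))(%s)\n"
               "    return %s\n"
               % (func.__name__, accept, types_, call, func.__name__))
        ns = {}
        exec(src, ns)
        return ns["__make"](defaults, lookup)

    # Hypothetical usage: dispatch on the type of every positional argument.
    def default_flatten(x, sep=", "):
        return sep.join(str(item) for item in x)

    registry = {}

    def lookup(types_tuple):
        # Fall back to the original implementation when nothing is registered.
        return registry.get(types_tuple, default_flatten)

    flatten = make_dispatcher(default_flatten, lookup)
    registry[(str, str)] = lambda x, sep: x     # strings pass through unchanged

    assert flatten([1, 2]) == "1, 2"            # default path, default sep
    assert flatten("abc", "-") == "abc"         # the (str, str) implementation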



>> >Also, can we overload different-length signatures (like in C++ or
>> >Java)? This is very common in those languages; while Python typically
>> >uses default argument values, there are use cases that don't easily
>> >fit in that pattern (e.g. the signature of range()).
>>
>>I see a couple different possibilities for this.  Could you give an
>>example of how you'd *like* it to work?
>
>In the simplest case (no default argument values) overloading two-arg
>functions and three-arg functions with the same name should act as if
>there were two completely separate functions, except for the base
>(default) function. Example:
>
>@overloadable
>def range(start:int, stop:int, step:int):
>  ...  # implement xrange
>
>@range.overload
>def range(x): return range(0, x, 1)
>
>@range.overload
>def range(x, y): return range(x, y, 1)

Hm.  I'll need to give some thought to that, but it seems to me that 
it's sort of like having None defaults for the missing arguments, and 
then treating the missing-argument versions as requiring type(None) 
for those arguments.  Except that we'd need something besides None, 
and that the overloads would need wrappers that drop the extra 
arguments.  It certainly seems possible, anyway.

I'm not sure I like it, though.  It's not obvious from the first 
function's signature that you can call it with fewer arguments, or 
what that would mean.  For example, shouldn't the later signatures be 
"range(stop)" and "range(start,stop)"?  Hm.


From pje at telecommunity.com  Tue May 15 00:02:42 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 18:02:42 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <fb6fbf560705141243h557e7951s2c04930ffe6f0a39@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510161417.192943A4061@sparrow.telecommunity.com>
	<464395AB.6040505@canterbury.ac.nz>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<fb6fbf560705141243h557e7951s2c04930ffe6f0a39@mail.gmail.com>
Message-ID: <20070514220057.253A73A4036@sparrow.telecommunity.com>

At 03:43 PM 5/14/2007 -0400, Jim Jewett wrote:
>On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>I don't see what the benefit is of making people implement their own
>>versions of @before, @after, and @around, which then won't
>>interoperate properly with others' versions of the same thing.  Even
>>if we leave in place the MethodList base class (which Before and
>>After are subclasses of), one of its limitations is that it can only
>>combine methods of the same type.
>
>That sounds broken; could you use a numeric precedence with default
>levels, like the logging library does?

There are lots of things that *could* be done, but I personally 
dislike numeric levels because they're arbitrary and it's too easy to 
just tweak a number rather than think through what you actually intend.

However, nothing stops you from inventing a combination type or even 
a criterion type that uses a numeric precedence.  At this point, 
though, just to prevent further head-exploding, I've been leaving that 
part of the extension API vague.

But, the basic idea is that just like Interfaces or ABCs or Roles can 
be used to annotate arguments, so too could you add other types of 
criteria objects, and the precedence of those criteria could be used 
to disambiguate method precedence.

In other words, you're not limited to using different combinators in 
order to extend the precedence system.  That's just what we've been discussing.

One of the reasons to have standard versions of 
when/before/after/around, however, is so that most code will never 
need to define any combinators.  The standard ones should handle the 
vast majority of use cases.

Admittedly, before/after/around are IMO 20% cases, not 80% 
cases.  Probably basic overloading is 75-80% of use cases.  But 
before/after/around covers another 20-25% or so, leaving maybe 5% or 
less for the custom combinator cases.


From pje at telecommunity.com  Tue May 15 00:34:20 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 18:34:20 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <ca471dc20705141251y7162cc1fpf2f93e9afd970a6@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>
	<4643C4F4.30708@canterbury.ac.nz>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
	<20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
	<ca471dc20705141251y7162cc1fpf2f93e9afd970a6@mail.gmail.com>
Message-ID: <20070514223422.9CEF23A4036@sparrow.telecommunity.com>

At 12:51 PM 5/14/2007 -0700, Guido van Rossum wrote:
>On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > I simply said the plan for Ruby was suggestive that method
> > combination is worth looking into further, because in the case of
> > Ruby, they already had single-dispatch generic functions, so the
> > addition suggests combination is no longer considered a YAGNI there.
> >
> > As I said, however, I unfortunately haven't been able to find any
> > documented rationale for the proposal -- implying that I have no idea
> > whether Matz' decision is more comparable to jumping off a cliff or
> > packing a swimsuit, and thus cannot give any actual recommendation
> > with respect to such.  :)
>
>So how do you know what's going on there is the same as what's
>apparently going on here, i.e. some folks have fallen in love with
>CLOS or Haskell or whatever and are pushing for some theoretical ideal
>that has no practical applications?

I don't, which is why I said I'm *looking for the RCR or other 
rationale document*.

However, with respect, I didn't go to all the trouble of implementing 
method combination in RuleDispatch just for the heck of it.  (And it 
was considerable trouble, doing it the way CLOS implements it, until 
I figured out an approach more suitable for Python and decorators.)

But let me try to get closer to the issue that I have.  I honestly 
don't see at this moment in time, how to split out most of the 
features you don't like (mainly before/after/around), in such a way 
that they can be put back in by a third-party module, without leading 
to other problems.  For example, I fear that certain of those 
features (especially before/after/around) require a single "blessed" 
implementation in order to have a sane/stable base for library 
inter-op, even if they *could* be separated out and put back 
in.  That is, even if it's possible to separate the "mechanism", I 
think that for "policy" reasons, they should have a canonical implementation.

However, if we posit that I create some "third party" module that 
should be considered canonical or blessed for that purpose, then what 
is the difference from simply treating the entire thing as a 
third-party module to begin with?

I'm not trying to cause a problem here, nor dictate to anybody (least 
of all you!) how it all should be.  I'm just saying I don't know 
*how* to solve this bit in a way that works for everybody.

I can go back and spend some more time on the problem of how to 
separate method combination from the core that I currently 
envision.  But there's going to have to be at least *some* sort of 
hook there, to allow it to be added back in later.  (Notice that if 
the core doesn't provide a facility to modify existing functions, 
then the core has to declare all its hooks in advance.  But please 
don't confuse this statement of fact, with an argument for not doing 
something I've already agreed to do...)

Anyway, perhaps you don't care if those features can be added back 
in, or perhaps you actively wish to discourage this.  It would be 
good to know where you stand on this point.


From guido at python.org  Tue May 15 00:43:50 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 15:43:50 -0700
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <20070514214915.C361C3A4036@sparrow.telecommunity.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
Message-ID: <ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 12:47 PM 5/14/2007 -0700, Guido van Rossum wrote:
> >> >I realize that @overload is only a shorthand for @when(function). But
> >> >I'd much rather not have @overload at all -- the frame inspection
> >> >makes it really hard for me to explain carefully what happens without
> >> >just giving the code that uses sys._getframe(); and this makes it
> >> >difficult to reason about code using @overload.
> >>
> >>This is why in the very earliest GF discussions here, I proposed a
> >>'defop expr(...)' syntax, as it would eliminate the need for any
> >>getframe hackery.
> >
> >But that would completely kill your "but it's all pure Python code so
> >it's harmless and portable" argument.
>
> Uh, wha?  You lost me completely there.  A 'defop' syntax simply
> eliminates the need to name the target function twice (once in the
> decorator, and again in the 'def').  I don't get what that has to do
> with stuff being harmless or portable or any of that.
>
> Are you perhaps conflating this with the issue of marking functions
> as overloadable?  These are  independent ideas, AFAICT.
>
> >It seems that you're really not interested at all in compromising to
> >accept mandatory marking of the base overloadable function.
>
> Uh, wha?  I already agreed to that a couple of weeks ago:
>
> http://mail.python.org/pipermail/python-3000/2007-May/007205.html
>
> I just haven't updated the PEP yet -- any more than I've updated it
> with anything else that's been in these ongoing threads, like the
> :next_method annotation or splitting the PEP.

Ah, sorry. The way this misunderstanding probably originated was that
I read your "this is why I originally proposed defop, to avoid
getframe hackery" as a maintaining the current need for getframe,
instead of a historical fact.

> >>Anyway, with this, it could also be placed as a keyword
> >>argument.  The main reason for putting it in the first position is
> >>performance.  Allowing it to be anywhere, however, would let the
> >>choice of where be a matter of style.
> >
> >Right. What's the performance issue with the first argument?
>
> Chaining using the first argument can be implemented using a bound
> method object, which gets performance bonuses from the C eval loop
> that partial() objects don't.  (Of course, when RuleDispatch was
> written, partial() objects didn't exist, anyway.)

Sounds like premature optimization to me. We can find a way to do it
fast later; let's first make it right.

> >>However, since we're going to have to have some way for 'super' to
> >>know the class a function is defined in, ISTM that the same magic
> >>should be reusable for the first-argument rule.
> >
> >Perhaps. Though super only needs to know it once the method is being
> >called, while your decorator (presumably) needs to know when the
> >method is being defined, i.e. before the class object is constructed.
>
> Not really; at some point the class object has to be assigned and
> stored somewhere for super to use, so if the same process of "assigning" 
> can be used to actually perform the registration, we're good to go.

True. So are you working with Tim Delaney on this? Otherwise he may
propose a simpler mechanism that won't allow this re-use of the
mechanism.

> >Also, the similarities between next-method and super are overwhelming.
> >It would be great if you could work with Tim Delaney on a mechanism
> >underlying all three issues, or at least two of the three.
>
> I'm not sure I follow you.  Do you mean, something like using :super
> as the annotation instead of next_method, or are you just talking
> about the implementation mechanics?

super is going to be a keyword with magic properties. Wouldn't it be
great if instead of

@when(...)
def flatten(x: Mapping, nm: next_method):
  ...
  nm(x)

we could write

@when(...)
def flatten(x: Mapping):
  ...
  super.flatten(x)  # or super(x)

or some other permutation of super? Or do you see the need to call
both next-method and super from the same code?

> >> >Forgive me if this is mentioned in the PEP, but what happens with
> >> >keyword args? Can I invoke an overloaded function with (some) keyword
> >> >args, assuming they match the argument names given in the default
> >> >implementation?
> >>
> >>Yes.  That's done with code generation; PEAK-Rules uses direct
> >>bytecode generation, but a sourcecode-based generation is also
> >>possible and would be used for the PEP implementation (it was also
> >>used in RuleDispatch).
> >
> >There's currently no discussion of this.
>
> Well, actually there's this bit:
>
> """The use of BytecodeAssembler can be replaced using an "exec" or "compile"
> workaround, given a reasonable effort.  (It would be easier to do this
> if the ``func_closure`` attribute of function objects was writable.)"""
>
> But the closure bit is irrelevant if we're using @overloadable.

Thanks.

> >Without a good understanding
> >of the implementation I cannot accept the PEP.
>
> The mechanism is exec'ing of a string containing a function
> definition.  The original function's signature is obtained using
> inspect.getargspec(), and the string is exec'd to obtain a new
> function whose signature matches, but whose body contains the generic
> function lookup code.

Do note that e.g. in IronPython (and maybe also in Jython?)
exec/eval/compile are 10-50x slower (relative to the rest of the
system) than in CPython.

It does look like a clever approach though.

> In practice, the actual function definition has to be nested, so that
> argument defaults can be passed in without needing to convert them to
> strings, and so that the needed lookup tables can be seen via closure
> variables.  A string template would look something like:
>
>      def make_the_function(__defaults, __lookup):
>          def $funcname($accept_signature):
>              return __lookup($type_tuple)($call_signature)
>          return $funcname
>
> The $type_tuple bit would expand to something like:
>
>      type(firstargname), type(secondargname), ...
>
> And $accept_signature would expand to the original function's
> signature, with default values replaced by "__defaults[0]",
> "__defaults[1]", etc. in order to make the resulting function have
> the same default values.
>
> The function that would be returned from @overloadable would be the
> result of calling "make_the_function", passing in the original
> function's func_defaults and an appropriate value for __lookup.
>
> A similar approach is used in RuleDispatch currently.
>
>
>
> >> >Also, can we overload different-length signatures (like in C++ or
> >> >Java)? This is very common in those languages; while Python typically
> >> >uses default argument values, there are use cases that don't easily
> >> >fit in that pattern (e.g. the signature of range()).
> >>
> >>I see a couple different possibilities for this.  Could you give an
> >>example of how you'd *like* it to work?
> >
> >In the simplest case (no default argument values) overloading two-arg
> >functions and three-arg functions with the same name should act as if
> >there were two completely separate functions, except for the base
> >(default) function. Example:
> >
> >@overloadable
> >def range(start:int, stop:int, step:int):
> >  ...  # implement xrange
> >
> >@range.overload
> >def range(x): return range(0, x, 1)
> >
> >@range.overload
> >def range(x, y): return range(x, y, 1)
>
> Hm.  I'll need to give some thought to that, but it seems to me that
> it's sort of like having None defaults for the missing arguments, and
> then treating the missing-argument versions as requiring type(None)
> for those arguments.  Except that we'd need something besides None,
> and that the overloads would need wrappers that drop the extra
> arguments.  It certainly seems possible, anyway.
>
> I'm not sure I like it, though.

C++ and Java users use it all the time though.

> It's not obvious from the first
> function's signature that you can call it with fewer arguments, or
> what that would mean.  For example, shouldn't the later signatures be
> "range(stop)" and "range(start,stop)"?  Hm.

I don't know if the arg names for overloadings must match those of the
default function or not -- is that specified by your PEP?

My own trivially simple overloading code (sandbox/overload, and now
also added as an experiment to sandbox/abc, with slightly different
terminology and using issubclass exclusively, as you recommended over
a year ago :-) has no problem with this. Of course it only handles
positional arguments and completely ignores argument names except as
keys into the annotations dict.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May 15 01:19:06 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 16:19:06 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070514223422.9CEF23A4036@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
	<20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
	<ca471dc20705141251y7162cc1fpf2f93e9afd970a6@mail.gmail.com>
	<20070514223422.9CEF23A4036@sparrow.telecommunity.com>
Message-ID: <ca471dc20705141619o42fd2626m92b17c4db7a2085@mail.gmail.com>

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> However, with respect, I didn't go to all the trouble of implementing
> method combination in RuleDispatch just for the heck of it.  (And it
> was considerable trouble, doing it the way CLOS implements it, until
> I figured out an approach more suitable for Python and decorators.)

So you owe us more motivating examples (in addition to the explanatory
examples), showing how you had a particular problem, and you couldn't
solve it cleanly using the usual suspects (subclassing, callbacks,
etc.), and how method combining came to the rescue. Perhaps writing it
up like a pattern description a la GoF might help.

> But let me try to get closer to the issue that I have.  I honestly
> don't see at this moment in time, how to split out most of the
> features you don't like (mainly before/after/around), in such a way
> that they can be put back in by a third-party module, without leading
> to other problems.  For example, I fear that certain of those
> features (especially before/after/around) require a single "blessed"
> implementation in order to have a sane/stable base for library
> inter-op, even if they *could* be separated out and put back
> in.  That is, even if it's possible to separate the "mechanism", I
> think that for "policy" reasons, they should have a canonical implementation.

Please share more details, so your readers can understand this too.
Right now the whole discussion around this appears to be in your head
only, and what you write is the conclusion *you* have drawn. That's
not very helpful -- I have great respect for the powers of your mind,
but not quite up to the point that I'll accept a feature because you
say it has to be so.

> However, if we posit that I create some "third party" module that
> should be considered canonical or blessed for that purpose, then what
> is the difference from simply treating the entire thing as a
> third-party module to begin with?

You're absolutely free to implement your entire proposal as a 3rd
party library, and then eventually come back and point me to all the
users who are clamoring for its inclusion into the standard library.

> I'm not trying to cause a problem here, nor dictate to anybody (least
> of all you!) how it all should be.  I'm just saying I don't know
> *how* to solve this bit in a way that works for everybody.

But can you at least share enough of the problem so others can look at
it and either suggest a solution or agree with your conclusion?

> I can go back and spend some more time on the problem of how to
> separate method combination from the core that I currently
> envision.  But there's going to have to be at least *some* sort of
> hook there, to allow it to be added back in later.

I'm all for hooks. They can take the form of a particular factoring
into methods that make it easy to override some method; or using GF's
recursively for some of the implementation, etc.

> (Notice that if
> the core doesn't provide a facility to modify existing functions,
> then the core has to declare all its hooks in advance.  But please
> don't confuse this statement of fact, with an argument for not doing
> something I've already agreed to do...)

It should be easy though, because you know which hooks you'll need in
order to add @before and friends...

> Anyway, perhaps you don't care if those features can be added back
> in, or perhaps you actively wish to discourage this.  It would be
> good to know where you stand on this point.

Well, right now I don't care because you haven't shown me the use
case. I could definitely be swayed by a detailed description of a
large use case; much more so than by other arguments I've seen so far.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Tue May 15 01:21:51 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 19:21:51 -0400
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
Message-ID: <20070514232017.BA6A43A4036@sparrow.telecommunity.com>

At 03:43 PM 5/14/2007 -0700, Guido van Rossum wrote:
> > Chaining using the first argument can be implemented using a bound
> > method object, which gets performance bonuses from the C eval loop
> > that partial() objects don't.  (Of course, when RuleDispatch was
> > written, partial() objects didn't exist, anyway.)
>
>Sounds like premature optimization to me. We can find a way to do it
>fast later; let's first make it right.

As I said, when RuleDispatch was written, partial() didn't exist; it 
was less a matter of performance there than convenience.


>True. So are you working with Tim Delaney on this? Otherwise he may
>propose a simpler mechanism that won't allow this re-use of the
>mechanism.

PEP 367 doesn't currently propose a mechanism for the actual 
assignment; I was waiting to see what was proposed, to then suggest 
as minimal a tweak or generalization as necessary.  Also, prior to 
now, you hadn't commented on the first-argument-class rule and I 
didn't know if you were going to reject it anyway.


>super is going to be a keyword with magic properties. Wouldn't it be
>great if instead of
>
>@when(...)
>def flatten(x: Mapping, nm: next_method):
>   ...
>   nm(x)
>
>we could write
>
>@when(...)
>def flatten(x: Mapping):
>   ...
>   super.flatten(x)  # or super(x)
>
>or some other permutation of super?

Well, either we'd have to implement it using a hidden parameter, or 
give up on the possibility of the same function being added more than 
once to the same generic function (e.g., for both Mapping and some specific 
types).  There's no way for the code in the body of the overload to 
know in what context it was invoked.

The current mechanism works by creating bound methods for each 
registration of the same function object, in each "applicability chain".
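
To illustrate the trick concretely (a toy sketch, not PEAK-Rules' actual
code): functions are descriptors, so binding the "next" implementation in
as the method's self gives the eval loop an ordinary bound method to call:

    def flatten_mapping(next_method, ob):
        print("mapping case")
        return next_method(ob)

    def flatten_default(ob):
        print("default case")

    # bind the next method in the chain as the "self" of the
    # more specific implementation:
    chained = flatten_mapping.__get__(flatten_default)
    chained({})   # mapping case, then falls through to the default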

That doesn't mean it's impossible, just that I haven't given the 
mechanism any thought, and at first glance it looks really hairy to 
implement -- even if it were done using a hidden parameter.


>Or do you see the need to call
>both next-method and super from the same code?

Hm, that's a mind-bender.  I can't think of a sensible use case for 
that, though.  If you're a plain method, you'd just use super.  If 
you're a generic function or overloaded method, you'd just call the 
next method.

The only way I can see you doing that is if you needed to call the 
super of some *other* method, which doesn't make a lot of sense.  In 
any case, we could probably use super(...) for next-method and 
super.methodname() for everything else, so I wouldn't worry about 
it.  (Which means you'd have to use super.__call__() inside of a 
__call__ method, but I think that's OK.)


>Do note that e.g. in IronPython (and maybe also in Jython?)
>exec/eval/compile are 10-50x slower (relative to the rest of the
>system) than in CPython.

This would only get done by @overloadable, and never again thereafter.


>It does look like a clever approach though.

Does that mean you dislike it?  ;-)


> > Hm.  I'll need to give some thought to that, but it seems to me that
> > it's sort of like having None defaults for the missing arguments, and
> > then treating the missing-argument versions as requiring type(None)
> > for those arguments.  Except that we'd need something besides None,
> > and that the overloads would need wrappers that drop the extra
> > arguments.  It certainly seems possible, anyway.
> >
> > I'm not sure I like it, though.
>
>C++ and Java users use it all the time though.

Right, but they don't have keyword arguments or defaults, 
either.  The part I'm not sure about has to do with interaction with 
Python-specific things like those.  When do you use each one?  One 
Obvious Way seems to favor default arguments, especially since you 
can always use defaults of None and implement overloads for 
type(None) to catch the default cases.  i.e., ISTM that cases like 
range() are more an exception than the rule.
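
To make that idiom concrete, here's a throwaway sketch -- nothing to do
with the PEP's actual machinery, just the shape of the "None default plus
type(None) case" pattern:

    _cases = {}

    def case(t):                     # toy registry, dispatch on 2nd argument
        def register(fn):
            _cases[t] = fn
            return fn
        return register

    def scale(shape, factor=None):
        for t in type(factor).__mro__:
            if t in _cases:
                return _cases[t](shape, factor)
        raise TypeError("no case for %r" % type(factor))

    @case(type(None))
    def _(shape, factor):
        return scale(shape, 1.0)     # caller omitted the factor

    @case(float)
    def _(shape, factor):
        return [(x * factor, y * factor) for x, y in shape]

    print(scale([(1, 2), (3, 4)]))   # the type(None) case picks a default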


> > It's not obvious from the first
> > function's signature that you can call it with fewer arguments, or
> > what that would mean.  For example, shouldn't the later signatures be
> > "range(stop)" and "range(start,stop)"?  Hm.
>
>I don't know if the arg names for overloadings must match those of the
>default function or not -- is that specified by your PEP?

It isn't currently, but that's because it's assumed that all the 
methods have the same signature.  If we were going to allow 
subset-signatures (i.e., allow you to define methods whose signature 
omits portions of the main function's signature), ISTM that the 
argument names should have meaning.

Of course, maybe a motivating example other than "range()" would help 
here, since not too many other functions have optional positional 
arguments in the middle of the argument list.  :)


>My own trivially simple overloading code (sandbox/overload, and now
>also added as an experiment to sandbox/abc, with slightly different
>terminology and using issubclass exclusively, as you recommended over
>a year ago :-) has no problem with this. Of course it only handles
>positional arguments and completely ignores argument names except as
>keys into the annotations dict.

Yeah, none of my GF implementations care about the target methods' 
signatures except for the next_method thingy.  But with variable 
argument lists, I think we *should* care.

Also, AFAIK, the languages that allow different-sized argument lists 
for the same function either don't have first class functions (e.g. 
Java) or else have special syntax to allow you to refer to the 
different variations, e.g. "x/1" and "x/2" to refer to the 1 and 2 
argument versions of function x.  That is, they really *are* 
different objects.  (And Java and C++ of course have less 
comprehensible forms of name mangling internally.)

Personally, though, I think that kind of overloading is a poor 
substitute for the parameter flexibility we already have in 
Python.  That is, I think those other languages should be envying 
Python here, rather than the other way around.  :)


From guido at python.org  Tue May 15 02:17:57 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 17:17:57 -0700
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <20070514232017.BA6A43A4036@sparrow.telecommunity.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
Message-ID: <ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
[SNIP]
> Personally, though, I think that kind of overloading is a poor
> substitute for the parameter flexibility we already have in
> Python.  That is, I think those other languages should be envying
> Python here, rather than the other way around.  :)

Perhaps. Though C++ *does* have argument default values.

Other use cases that come to mind are e.g. APIs that you can pass
either a Point object or two (or three!) floats. This is not a natural
use case for argument default values, and it's not always convenient
to require the user to pass a tuple of floats (perhaps the
three-floats API already existed and its signature cannot be changed
for compatibility reasons). Or think of a networking function that
takes either a "host:port" string or a host and port pair; thinking of
this as having a default port is also slightly awkward, as you don't
know what to do when passed a "host:port" string and a port.
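
For concreteness, here is roughly what such an API ends up looking like
when written by hand today (just a sketch, not from any real library):

    def connect(host, port=None):
        if port is None:
            if isinstance(host, str) and ":" in host:
                host, port = host.rsplit(":", 1)
                port = int(port)
            else:
                raise TypeError("need a port or a 'host:port' string")
        # ...and if someone passes both a "host:port" string *and* a
        # port, we silently do the wrong thing.
        return host, port

    print(connect("example.com:8080"))
    print(connect("example.com", 8080))

With overloading you'd register one implementation per signature instead
of playing isinstance games inside one body.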

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Tue May 15 02:24:00 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 20:24:00 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <ca471dc20705141619o42fd2626m92b17c4db7a2085@mail.gmail.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
	<20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
	<ca471dc20705141251y7162cc1fpf2f93e9afd970a6@mail.gmail.com>
	<20070514223422.9CEF23A4036@sparrow.telecommunity.com>
	<ca471dc20705141619o42fd2626m92b17c4db7a2085@mail.gmail.com>
Message-ID: <20070515002213.532623A4036@sparrow.telecommunity.com>

At 04:19 PM 5/14/2007 -0700, Guido van Rossum wrote:
>On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>However, with respect, I didn't go to all the trouble of implementing
>>method combination in RuleDispatch just for the heck of it.  (And it
>>was considerable trouble, doing it the way CLOS implements it, until
>>I figured out an approach more suitable for Python and decorators.)
>
>So you owe us more motivating examples (in addition to the explanatory
>examples), showing how you had a particular problem, and you couldn't
>solve it cleanly using the usual suspects (subclassing, callbacks,
>etc.), and how method combining came to the rescue. Perhaps writing it
>up like a pattern description a la GoF might help.

It's really not that complicated.  If you have only strict precedence 
(i.e., methods with the same signature are ambiguous), you wind up in 
practice needing a way to disambiguate methods when you don't really 
care what order they're executed in (because they're being registered 
independently).

Before and After methods give you that escape, because they're 
assumed to be independent, and thus any number of libraries can thus 
register a before or after method for any given signature, without 
conflicting with each other.

So the "particular problem" I had is simply that when you are using 
GF methods as "observer"-like hooks, you need a way to specify them 
that doesn't result in ambiguities between code that's watching the 
same thing (but is written by different people).  And, the nature of 
these observer-ish use cases is that you sometimes need 
pre-observers, and sometimes you need post-observers.

(For example, a pre-observer like "block the sale if there's a hold 
on the item by a more valuable customer" or a post-observer like 
"send an email to the sales manager if this is an account we got from 
FooCorp.")

Can these use cases be handled with callbacks of some other 
sort?  Sure!  But then, we can and do also get by with implementing 
ad-hoc generic functions using __special__ methods and copy_reg and 
so on.  The point of the PEP was to provide a standardized API for 
generic functions and method combination, so you don't need to 
reinvent or relearn new ways of doing it for every single Python 
library that uses something that follows these patterns.

Indeed, having yet another implementation of generic functions was 
never the point of the PEP, as we already have several of them in the 
language and stdlib, plus several more third-party modules that implement them!

The point, instead, was to standardize an *API* for generic 
functions, so that one need only learn that API once.  A default GF 
implementation is merely necessary for bootstrapping that API, and 
useful for "batteries included"-ness.

So, if the bar is that a feature has to be unsolvable using ad hoc 
techniques, it seems the entire PEP would fail on those grounds.  We 
have plenty of ad hoc techniques for implementing GF's or quasi-GF's 
already, likewise for callbacks and the like.  The point was for you 
to Pronounce on One Obvious API (to Rule Them All).


>>But let me try to get closer to the issue that I have.  I honestly
>>don't see at this moment in time, how to split out most of the
>>features you don't like (mainly before/after/around), in such a way
>>that they can be put back in by a third-party module, without leading
>>to other problems.  For example, I fear that certain of those
>>features (especially before/after/around) require a single "blessed"
>>implementation in order to have a sane/stable base for library
>>inter-op, even if they *could* be separated out and put back
>>in.  That is, even if it's possible to separate the "mechanism", I
>>think that for "policy" reasons, they should have a canonical implementation.
>
>Please share more details, so your readers can understand this too.
>Right now the whole discussion around this appears to be in your head
>only, and what you write is the conclusion *you* have drawn.

Actually, the discussion about method combination precedence has been 
ongoing in several threads here on Py3K, mostly with Greg Ewing and 
Jim Jewett.  These discussions illustrate why having some basic 
operators of known precedence gives the system more stability when 
multiple libraries start playing together.


>But can you at least share enough of the problem so others can look at
>it and either suggest a solution or agree with your conclusion?

Sure.  Take a look at peak.rules.core (while keeping in mind all the 
bits that will be changed per your prior requests):

http://svn.eby-sarna.com/PEAK-Rules/peak/rules/core.py?view=markup

What you'll notice is that the method combination framework (Method, 
MethodList, combine_actions, always_overrides, and merge_by_default, 
if you don't count the places these things get called) is in fact 
most of the code, with relatively little of it being the actual 
implementation of Around, Before, or After (or even generic functions 
themselves!).

In principle, I could pull that framework out and leave just a 
mechanism for adding it back in.  But in practice, that framework 
lays down the principles of "governance" for method combination, as 
far as how to decide what things have precedence over what.
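
To give a flavor of what I mean by "governance", here's a deliberately
dumbed-down sketch (the real always_overrides in peak.rules.core is
itself a generic function, not this): precedence between *kinds* of
methods is declared pairwise and composed transitively, so everyone ends
up agreeing on who wins:

    _overrides = set()              # (winner, loser) pairs of method kinds

    def always_overrides(winner, loser):
        _overrides.add((winner, loser))
        changed = True
        while changed:              # naive transitive closure
            changed = False
            for a, b in list(_overrides):
                for c, d in list(_overrides):
                    if b is c and (a, d) not in _overrides:
                        _overrides.add((a, d))
                        changed = True

    class Around: pass
    class Method: pass              # "primary" methods
    class After: pass

    always_overrides(Around, Method)
    always_overrides(Method, After)
    print((Around, After) in _overrides)   # True: composed automatically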

Thus, I'm skeptical of how useful it is in this area to provide 
mechanism but no policy.  It's always possible for someone to create 
their own independent policy within the mechanism -- even if there's 
a default policy.  But One Obvious Way suggests that there should be 
*some* sort of policy in place by default, just like we have a 
standard set of descriptors that implement the conventional forms of 
properties and methods.  You can subclass them or entirely replace 
them, but they cover all the typical use cases, and you can use them 
as examples to understand how to do more exotic things.

Meanwhile, if we didn't have the examples of properties and methods, 
how would we know we were designing descriptor hooks correctly?  If 
we are positing that I know enough to design the hooks correctly, we 
are implicitly positing that I know what the hooks will be used and 
useful *for*.  :)  However, by making various use cases (before, 
after, around, and the custom example) explicit in the PEP, I was 
attempting to provide the motivation and rationale for the design of 
the hooks.  (Although in all fairness, the hooks are not actually 
documented in the PEP yet, aside from a listing of function names.)


>I'm all for hooks. They can take the form of a particular factoring
>into methods that make it easy to override some method; or using GF's
>recursively for some of the implementation, etc.

This is in fact how it works now; all the extension API functions in 
the PEP are either existing GF's in peak.rules.core, or proposed for addition.


From pje at telecommunity.com  Tue May 15 02:35:42 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 20:35:42 -0400
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
Message-ID: <20070515003354.194B83A4036@sparrow.telecommunity.com>

At 05:17 PM 5/14/2007 -0700, Guido van Rossum wrote:
>Other use cases that come to mind are e.g. APIs that you can pass
>either a Point object or two (or three!) floats. This is not a natural
>use case for argument default values, and it's not always convenient
>to require the user to pass a tuple of floats (perhaps the
>three-floats API already existed and its signature cannot be changed
>for compatibility reasons). Or think of a networking function that
>takes either a "host:port" string or a host and port pair; thinking of
>this as having a default port is also slightly awkward, as you don't
>know what to do when passed a "host:port" string and a port.

How do people handle these in Python now?  ISTM that idiomatic Python 
for these cases would either use tuples, or else different method names.

Or is the intention here to make it easier for people porting code 
over from Java and C++?

Anyway, as I said, I think it's *possible* to do this.  It just 
strikes me as more complex than existing ways of handling it in Python.

More importantly, it seems to go against the grain of at least my 
mental concept of Python call signatures, in which arguments are 
inherently *named* (and can be passed using explicit names), with 
only rare exceptions like range().  In contrast, the languages that 
have this sort of positional thing only allow arguments to be 
specified by position, IIRC.  That's what makes me uncomfortable with it.

That having been said, if you want it, there's probably a way to make 
it work.  I just think we should try to preserve the "nameness" of 
arguments in the process -- and consider whether the use cases you've 
listed here actually improve the code clarity any.


From guido at python.org  Tue May 15 02:45:42 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 17:45:42 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070515002213.532623A4036@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
	<20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
	<ca471dc20705141251y7162cc1fpf2f93e9afd970a6@mail.gmail.com>
	<20070514223422.9CEF23A4036@sparrow.telecommunity.com>
	<ca471dc20705141619o42fd2626m92b17c4db7a2085@mail.gmail.com>
	<20070515002213.532623A4036@sparrow.telecommunity.com>
Message-ID: <ca471dc20705141745w4ff9db11p9662fa553967919@mail.gmail.com>

I refuse to continue this discussion until the PEP has been rewritten.
It's probably a much better use of your time to rewrite the PEP than
to argue with me in email too.

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
[SNIP]

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May 15 02:51:19 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 17:51:19 -0700
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <20070515003354.194B83A4036@sparrow.telecommunity.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
	<20070515003354.194B83A4036@sparrow.telecommunity.com>
Message-ID: <ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 05:17 PM 5/14/2007 -0700, Guido van Rossum wrote:
> >Other use cases that come to mind are e.g. APIs that you can pass
> >either a Point object or two (or three!) floats. This is not a natural
> >use case for argument default values, and it's not always convenient
> >to require the user to pass a tuple of floats (perhaps the
> >three-floats API already existed and its signature cannot be changed
> >for compatibility reasons). Or think of a networking function that
> >takes either a "host:port" string or a host and port pair; thinking of
> >this as having a default port is also slightly awkward, as you don't
> >know what to do when passed a "host:port" string and a port.
>
> How do people handle these in Python now?  ISTM that idiomatic Python
> for these cases would either use tuples, or else different method names.

Both of which are sub-optimal compared to the C++ and Java solutions.
(Especially for constructors, where choosing different method names is
even more effort, as you'd need to switch to factory functions.)

> Or is the intention here to make it easier for people porting code
> over from Java and C++?

No, my observation is that they have something that would be useful for us.

> Anyway, as I said, I think it's *possible* to do this.  It just
> strikes me as more complex than existing ways of handling it in Python.
>
> More importantly, it seems to go against the grain of at least my
> mental concept of Python call signatures, in which arguments are
> inherently *named* (and can be passed using explicit names), with
> only rare exceptions like range().  In contrast, the languages that
> have this sort of positional thing only allow arguments to be
> specified by position, IIRC.  That's what makes me uncomfortable with it.

Well, in *my* mental model the argument names are just as often
irrelevant as they are useful. I'd be taken aback if I saw this in
someone's code: open(filename="/etc/passwd", mode="r"). Perhaps it's
too bad that Python cannot express the notion of "these parameters are
positional-only" except very clumsily.

> That having been said, if you want it, there's probably a way to make
> it work.  I just think we should try to preserve the "nameness" of
> arguments in the process -- and consider whether the use cases you've
> listed here actually improve the code clarity any.

There seems to be a stalemate. It seems I cannot convince you that
this type of overloading is useful. And it seems you cannot explain to
me why I need a framework for method combining.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com  Tue May 15 03:13:01 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 14 May 2007 19:13:01 -0600
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <20070515002213.532623A4036@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
	<20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
	<ca471dc20705141251y7162cc1fpf2f93e9afd970a6@mail.gmail.com>
	<20070514223422.9CEF23A4036@sparrow.telecommunity.com>
	<ca471dc20705141619o42fd2626m92b17c4db7a2085@mail.gmail.com>
	<20070515002213.532623A4036@sparrow.telecommunity.com>
Message-ID: <d11dcfba0705141813h1845db18lca8606ebffb9b6f9@mail.gmail.com>

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 04:19 PM 5/14/2007 -0700, Guido van Rossum wrote:
> >But can you at least share enough of the problem so others can look at
> >it and either suggest a solution or agree with your conclusion?
>
> Sure.  Take a look at peak.rules.core (while keeping in mind all the
> bits that will be changed per your prior requests):
>
> http://svn.eby-sarna.com/PEAK-Rules/peak/rules/core.py?view=markup
>
> What you'll notice is that the method combination framework (Method,
> MethodList, combine_actions, always_overrides, and merge_by_default,
> if you don't count the places these things get called) is in fact
> most of the code, with relatively little of it being the actual
> implementation of Around, Before, or After (or even generic functions
> themselves!).

Seems to me from this link that what we're missing is a good
explanation of how "Method" works, since that is the base for Before,
After, etc.  Thus I'd suggest ripping out the Before, After, etc.
sections in the PEP and replacing them with a section on how Method
works.  You can use Before and After as examples of how to extend
Method.

(I'm fine with Before and After being in the module.  It's just
confusing that they take such a prominent role in the PEP without the
mechanism behind them being explained enough.)

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From gproux+py3000 at gmail.com  Tue May 15 03:18:09 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Tue, 15 May 2007 10:18:09 +0900
Subject: [Python-3000] Support for PEP 3131 (some links to evidence of
	usage within communities)
Message-ID: <19dd68ba0705141818w62c942b7g576016fcd3cc0ac1@mail.gmail.com>

Found some evidence of usage of identifiers in Japanese while doing a
quick google search

All links below are in Japanese.

* Ruby has support for Japanese identifiers (which is not unexpected
when you know Ruby's country of origin):
http://www.ruby-lang.org/ja/man/?cmd=view;name=%CA%D1%BF%F4%A4%C8%C4%EA%BF%F4
Notice that it says this is only supported on a local basis (probably
because Ruby cannot handle Unicode natively). I also found other people
discussing their usage patterns for Japanese identifiers, and they
report that this is tremendously useful for beginners, especially when
you need to read a stack trace while debugging.


* Java has strong supporters of Japanese characters within identifiers:
http://java-house.jp/ml/archive/j-h-b/032664.html#body

They comment that using Japanese improves readability unless it is used
in an extreme way (like changing a *for* loop to use ????? instead of i).

One example they give is

-------------------------------------------
	i = revised(i);
???
        i = RevisedByMarubatuMethod(i);
???
        i = revised_by_marubatu_method(i);

???????

        i = ????????????(i);
-------------------------------------------
And of course they think the last one is the best.....


Table of contents of "Visual J++ Applet Programming book"
http://www.hir-net.com/book/book18/contents.html
see "Chapter 2.2: You can use Japanese Identifiers !!!"


Discussion about variable naming and how being able to use Japanese
would solve many naming issues:
http://www.atmarkit.co.jp/bbs/phpBB/viewtopic.php?topic=13878&forum=3&start=8&15

Another one like this, where people explain that because it is
difficult to come up with good names in English, they end up calling
everything makeItem, doItem, addItem:
http://www.atmarkit.co.jp/bbs/phpBB/viewtopic.php?mode=viewtopic&topic=18616&forum=7&start=0


And for fun, there is this interesting link about a programming
language "in Japanese", made for beginners (check this example...
awesome!):
  http://nadesi.com/doc/cmd/doc.cgi?mode=cmd&id=200

I am sure you can find a lot more evidence like this for each and
every language. Letting people use their own script and vocabulary to
name things will make them better programmers within their own country
and cultural frame of reference. This will increase the audience and
support for Python worldwide.

I will be contacting the Japanese Python user group to let them know
about the current discussion.

Regards,

Guillaume

From pje at telecommunity.com  Tue May 15 03:45:25 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 21:45:25 -0400
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
	<20070515003354.194B83A4036@sparrow.telecommunity.com>
	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
Message-ID: <20070515014338.8D8EA3A4036@sparrow.telecommunity.com>

At 05:51 PM 5/14/2007 -0700, Guido van Rossum wrote:
>On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>>At 05:17 PM 5/14/2007 -0700, Guido van Rossum wrote:
>> >Other use cases that come to mind are e.g. APIs that you can pass
>> >either a Point object or two (or three!) floats. This is not a natural
>> >use case for argument default values, and it's not always convenient
>> >to require the user to pass a tuple of floats (perhaps the
>> >three-floats API already existed and its signature cannot be changed
>> >for compatibility reasons). Or think of a networking function that
>> >takes either a "host:port" string or a host and port pair; thinking of
>> >this as having a default port is also slightly awkward, as you don't
>> >know what to do when passed a "host:port" string and a port.
>>
>>How do people handle these in Python now?  ISTM that idiomatic Python
>>for these cases would either use tuples, or else different method names.
>
>Both of which are sub-optimal compared to the C++ and Java solutions.

C++ and Java don't have tuples, do they?

The open(filename="...") example you gave doesn't bother me in the 
least, but when I see range()-style APIs, I cringe.  However, since 
this is a matter of taste, I yield to the BDFL.


>>That having been said, if you want it, there's probably a way to make
>>it work.  I just think we should try to preserve the "nameness" of
>>arguments in the process -- and consider whether the use cases you've
>>listed here actually improve the code clarity any.
>
>There seems to be a stalemate. It seems I cannot convince you that
>this type of overloading is useful. And it seems you cannot explain to
>me why I need a framework for method combining.

And yet, the difference is that I'm not ruling your proposal out; I'm 
merely suggesting that we work a bit more on defining what the best 
way to implement your proposal would be, in order to avoid collateral damage.

I also wanted to know more about your use cases; it's now clear that 
my previous thinking in terms of range() and named arguments as a 
typical use case is wrong; the things I'd want to do to handle that 
set of signatures are totally different from the thing you really 
want, which is to have truly positional arguments.

Perhaps the best thing would be to first define a syntactic notion of 
purely-positional arguments?  Then it would merely be a concept that 
overloading could respect, rather than being something that applies 
only to generic functions.

Or perhaps we could just say that if the main function is defined 
with *args, we treat those arguments as positional?  i.e.:

     @abstract
     def range(*args):
         """This just defines the signature; no implementation here"""

     @range.overload
     def range(stop):
         ...

     @range.overload
     def range(start, stop, step=None):
         ...

or:

     @abstract
     def draw(*coords):
         """This just defines the signature; no implementation here"""

     @draw.overload
     def draw(x:float, y:float, z:float):
         draw(Point(x,y,z))

     @draw.overload
     def draw(point:Point):
         ...


From greg.ewing at canterbury.ac.nz  Tue May 15 03:42:19 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 15 May 2007 13:42:19 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070515002213.532623A4036@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
	<20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
	<ca471dc20705141251y7162cc1fpf2f93e9afd970a6@mail.gmail.com>
	<20070514223422.9CEF23A4036@sparrow.telecommunity.com>
	<ca471dc20705141619o42fd2626m92b17c4db7a2085@mail.gmail.com>
	<20070515002213.532623A4036@sparrow.telecommunity.com>
Message-ID: <46490FFB.9050904@canterbury.ac.nz>

Phillip J. Eby wrote:
> If you have only strict precedence 
> (i.e., methods with the same signature are ambiguous), you wind up in 
> practice needing a way to disambiguate methods when you don't really 
> care what order they're executed in
 > ...
> And, the nature of 
> these observer-ish use cases is that you sometimes need 
> pre-observers, and sometimes you need post-observers.

This is by far the best explanation I've seen so far of
the rationale behind @before/@after. It should definitely
go in the PEP.

Can you provide a similar justification for @around?
Including why it should go around everything else
rather than between the @before/@afters and the normal
method.

Also, why have three things (@before/@after/@around)
instead of just one thing (@around with a next-method
call).

--
Greg

From pje at telecommunity.com  Tue May 15 03:56:33 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 14 May 2007 21:56:33 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <46490FFB.9050904@canterbury.ac.nz>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
	<20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
	<ca471dc20705141251y7162cc1fpf2f93e9afd970a6@mail.gmail.com>
	<20070514223422.9CEF23A4036@sparrow.telecommunity.com>
	<ca471dc20705141619o42fd2626m92b17c4db7a2085@mail.gmail.com>
	<20070515002213.532623A4036@sparrow.telecommunity.com>
	<46490FFB.9050904@canterbury.ac.nz>
Message-ID: <20070515015454.4F2563A4036@sparrow.telecommunity.com>

At 01:42 PM 5/15/2007 +1200, Greg Ewing wrote:
>Phillip J. Eby wrote:
> > If you have only strict precedence
> > (i.e., methods with the same signature are ambiguous), you wind up in
> > practice needing a way to disambiguate methods when you don't really
> > care what order they're executed in
>  > ...
> > And, the nature of
> > these observer-ish use cases is that you sometimes need
> > pre-observers, and sometimes you need post-observers.
>
>This is by far the best explanation I've seen so far of
>the rationale behind @before/@after. It should definitely
>go in the PEP.
>
>Can you provide a similar justification for @around?

@around is for applications to have the "last word" on how something 
should be handled, i.e. to replace or wrap everything else.


>Including why it should go around everything else
>rather than between the @before/@afters and the normal
>method.
>
>Also, why have three things (@before/@after/@around)
>instead of just one thing (@around with a next-method
>call).

Because "around" isn't additive, while before and after are.  Any 
number of before and after methods can be registered for any 
signature, because they can't directly interfere with one another 
(since they don't directly call the "next" method.

But primary and "around" methods *do* call the next method, so if 
they are applying any transformation to the arguments or return 
values, their ordering must be predictable and strict.  Thus, methods 
that can call a next-method (i.e. primaries and arounds) must have a 
guaranteed unambiguous precedence.  Imagine what would happen if the 
results of calling super() depended on what order your modules had 
been imported in!
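
A toy illustration of why (not the PEP's code): chainable methods
compose like nested function calls, so the answer depends entirely on
the composition order -- which therefore had better not depend on
import order:

    def primary(x):
        return x

    def around_square(next_method, x):
        return next_method(x) ** 2

    def around_halve(next_method, x):
        return next_method(x) / 2

    # bind each "around" to its next method (same bound-method trick):
    chain1 = around_square.__get__(around_halve.__get__(primary))
    chain2 = around_halve.__get__(around_square.__get__(primary))
    print(chain1(4), chain2(4))   # 4.0 8.0 -- the order changes the answer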

Thus, it's an ambiguity error to define two chainable methods for the 
same type signature.  Whereas unchained methods like befores and 
afters can have as many registrations for the signature as you'd like 
to include.


From talin at acm.org  Tue May 15 04:47:32 2007
From: talin at acm.org (Talin)
Date: Mon, 14 May 2007 19:47:32 -0700
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
Message-ID: <46491F44.9030406@acm.org>

Guido van Rossum wrote:
> Next, I have a question about the __proceed__ magic argument. I can
> see why this is useful, and I can see why having this as a magic
> argument is preferable over other solutions (I couldn't come up with a
> better solution, and believe me I tried :-).  However, I think making
> this the *first* argument would upset tools that haven't been taught
> about this yet. Is there any problem with making it a keyword argument
> with a default of None, by convention to be placed last?

I earlier suggested that the __proceed__ functionality be implemented by 
a differently-named decorator, such as "overload_chained".

Phillip objected to this on the basis that it would double the number of 
decorators. However, I don't think that this is the case, since only a 
few of the decorators that he has defined supports a __proceed__ 
argument - certainly 'before' and 'after' don't (since they *all* run), 
and around has it implicitly.

Also, I believe having a separate code path for the two cases would be 
more efficient when dispatching.

> Forgive me if this is mentioned in the PEP, but what happens with
> keyword args? Can I invoke an overloaded function with (some) keyword
> args, assuming they match the argument names given in the default
> implementation? Or are we restricted to positional argument passing
> only? (That would be a big step backwards.)
> 
> ******************
> 
> Also, can we overload different-length signatures (like in C++ or
> Java)? This is very common in those languages; while Python typically
> uses default argument values, there are use cases that don't easily
> fit in that pattern (e.g. the signature of range()).

Well, from an algorithmic purity standpoint, I know exactly how it would 
work: You put all of the overloads, regardless of number of arguments, 
keywords, defaults, and everything else into a single bin. When you call 
that function, you search through every entry in that bin and throw out 
all the ones that don't fit, then sort the remaining ones by specificity.
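
In toy form, ignoring keywords and defaults entirely and doing a linear 
scan on every call, the pure version is something like:

    registry = []    # the single bin: (signature tuple, implementation)

    def overload(*sig):
        def register(fn):
            registry.append((sig, fn))
            return fn
        return register

    def dispatch(*args):
        # throw out the entries that don't fit
        candidates = [(sig, fn) for sig, fn in registry
                      if len(sig) == len(args)
                      and all(isinstance(a, t) for a, t in zip(args, sig))]
        if not candidates:
            raise TypeError("no applicable overload")
        # sort the rest by specificity: prefer the most derived types
        def specificity(entry):
            sig, _ = entry
            return sum(len(t.__mro__) for t in sig)
        candidates.sort(key=specificity, reverse=True)
        return candidates[0][1](*args)

    @overload(int, int)
    def add(a, b): return a + b

    @overload(object, object)
    def add(a, b): return "fallback: %r %r" % (a, b)

    print(dispatch(1, 2))        # 3
    print(dispatch("x", 2))      # fallback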

The problem of course is that I don't know how to build an efficient 
dispatch table to do that, and I'm not even sure that it's possible.

-- Talin

From talin at acm.org  Tue May 15 04:53:00 2007
From: talin at acm.org (Talin)
Date: Mon, 14 May 2007 19:53:00 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>	<20070510161417.192943A4061@sparrow.telecommunity.com>	<464395AB.6040505@canterbury.ac.nz>	<20070510231845.9C98C3A4061@sparrow.telecommunity.com>	<4643C4F4.30708@canterbury.ac.nz>	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>	<20070514163231.275CE3A4036@sparrow.telecommunity.com>	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
	<20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
Message-ID: <4649208C.4020801@acm.org>

Phillip J. Eby wrote:
> Meanwhile, I've been told repeatedly that TurboGears makes extensive 
> use of RuleDispatch, and my quick look today showed they actually use 
> a custom method combination, but I haven't yet tracked down where it 
> gets used, or what the rationale for it is.

You want to look at the module TurboJSON:

http://docs.turbogears.org/1.0/JsonifyDecorator

http://trac.turbogears.org/browser/projects/TurboJson/trunk/turbojson/jsonify.py?rev=2200

-- Talin

From mbk.lists at gmail.com  Tue May 15 05:34:16 2007
From: mbk.lists at gmail.com (Mike Krell)
Date: Mon, 14 May 2007 20:34:16 -0700
Subject: [Python-3000] Support for PEP 3131 (some links to evidence of
	usage within communities)
In-Reply-To: <19dd68ba0705141818w62c942b7g576016fcd3cc0ac1@mail.gmail.com>
References: <19dd68ba0705141818w62c942b7g576016fcd3cc0ac1@mail.gmail.com>
Message-ID: <da7032ce0705142034l21e854ebhcfbe6d5d830a6086@mail.gmail.com>

> One example they give is
>
> -------------------------------------------
>         i = revised(i);
> ???
>         i = RevisedByMarubatuMethod(i);
> ???
>         i = revised_by_marubatu_method(i);
>
> ???????
>
>         i = ????????????(i);
> -------------------------------------------
> And of course think the last one is the best.....

What, the one with all the question marks? :-)

Sorry, couldn't resist.

   Mike

From guido at python.org  Tue May 15 06:37:27 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 21:37:27 -0700
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <46491F44.9030406@acm.org>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<46491F44.9030406@acm.org>
Message-ID: <ca471dc20705142137u5eec1fedo899942527781d253@mail.gmail.com>

On 5/14/07, Talin <talin at acm.org> wrote:
> Guido van Rossum wrote:
> > Also, can we overload different-length signatures (like in C++ or
> > Java)? This is very common in those languages; while Python typically
> > uses default argument values, there are use cases that don't easily
> > fit in that pattern (e.g. the signature of range()).
>
> Well, from an algorithmic purity standpoint, I know exactly how it would
> work: You put all of the overloads, regardless of number of arguments,
> keywords, defaults, and everything else into a single bin. When you call
> that function, you search through every entry in that bin and throw out
> all the ones that don't fit, then sort the remaining ones by specificity.
>
> The problem of course is that I don't know how to build an efficient
> dispatch table to do that, and I'm not even sure that it's possible.

Have a look at sandbox/abc/abc.py, class overloadable (if you don't
want to set up a svn workspace, see
http://svn.python.org/view/sandbox/trunk/abc/abc.py). It doesn't
handle keyword args or defaults, but it does handle positional
argument lists of different sizes efficiently, by using a cache
indexed with a tuple of the argument types. The first time a
particular combination of argument types is seen, it does an exhaustive
search; the result is then cached. Performance is good assuming there
are many calls but few distinct call signatures, per overloaded
function. (At least, I think it's efficient; I once timed an earlier
implementation of the same idea, and it wasn't too bad. That code is
still in sandbox/overload/.)
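
In outline -- and this is a paraphrase from memory, not the actual
sandbox code -- the trick looks like this:

    class overloadable:
        def __init__(self, default):
            self.default = default
            self.registry = {}         # signature tuple -> implementation
            self.cache = {}            # type tuple -> implementation

        def overload(self, *sig):
            def register(fn):
                self.registry[sig] = fn
                self.cache.clear()     # new registration invalidates the cache
                return fn
            return register

        def __call__(self, *args):
            types = tuple(type(a) for a in args)
            fn = self.cache.get(types)
            if fn is None:
                fn = self.find(types)  # exhaustive search, first time only
                self.cache[types] = fn
            return fn(*args)

        def find(self, types):
            best = None
            for sig, fn in self.registry.items():
                if (len(sig) == len(types) and
                        all(issubclass(t, s) for t, s in zip(types, sig))):
                    if best is None or all(issubclass(s, b)
                                           for s, b in zip(sig, best[0])):
                        best = (sig, fn)   # most specific wins; ties arbitrary
            return best[1] if best is not None else self.default

    @overloadable
    def describe(x):
        return "something"

    @describe.overload(int)
    def _(x):
        return "an int"

    print(describe(3), describe("s"))   # an int something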

Argument default values could be added relatively easily by treating a
function with a default argument value as multiple signatures; e.g.

  @foo.overload
  def _(a:str, b:int=1, c:Point=None): ...

would register these three signatures:

  (str,)
  (str, int)
  (str, int, Point)

Phillip suggested a clever idea to deal with keyword arguments, by
compiling a synthesized function that has the expected signature and
calls the dispatch machinery. I think it would need some adjustment to
deal with variable-length signatures too, but I think it could be made
to work as long as the problem isn't fundamentally ambiguous (which it
may be when you combine different-sized positional signatures with
defaults *and* keywords). The synthetic function is just a speed hack;
the same thing can be done without synthesizing code, at the cost
(considerable, and repeated per call) of decoding *args and **kwds
explicitly.
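
For the keyword-argument trick, the synthesized front-end would be
something along these lines (only a sketch with made-up names; the
real thing lives in RuleDispatch/PEAK-Rules):

    import inspect

    def make_front_end(proto, dispatch):
        # Build a function with the same signature as `proto` that simply
        # normalizes keywords into positions and hands off to the dispatcher.
        # (Assumes simple, repr()-able default values -- fine for a sketch.)
        spec = inspect.getargspec(proto)
        sig = inspect.formatargspec(*spec)                       # "(a, b=1, *rest)"
        call = inspect.formatargspec(spec[0], spec[1], spec[2])  # "(a, b, *rest)"
        src = "def %s%s:\n    return __dispatch__%s\n" % (proto.__name__, sig, call)
        ns = {'__dispatch__': dispatch}
        exec(src, ns)
        return ns[proto.__name__]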

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May 15 06:43:32 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2007 21:43:32 -0700
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <20070515014338.8D8EA3A4036@sparrow.telecommunity.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
	<20070515003354.194B83A4036@sparrow.telecommunity.com>
	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
	<20070515014338.8D8EA3A4036@sparrow.telecommunity.com>
Message-ID: <ca471dc20705142143p42cfb5d4x7d9d3cfe26047696@mail.gmail.com>

On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> Or perhaps we could just say that if the main function is defined
> with *args, we treat those arguments as positional?  i.e.:
>
>      @abstract
>      def range(*args):
>          """This just defines the signature; no implementation here"""

That sounds about right.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gproux at gmail.com  Tue May 15 01:46:17 2007
From: gproux at gmail.com (Guillaume Proux)
Date: Tue, 15 May 2007 08:46:17 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070514125306.8563.JCARLSON@uci.edu>
References: <20070514093643.8559.JCARLSON@uci.edu>
	<bb8868b90705141246m25261365h2d66fb59d903e3a1@mail.gmail.com>
	<20070514125306.8563.JCARLSON@uci.edu>
Message-ID: <19dd68ba0705141646j67423b20m87d9e752913830ef@mail.gmail.com>

Hello,

On 5/15/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> and comment their code ;).  It would be nice to be able to find more
> examples in Java.

I believe that a lot of people do not know that you can use most
Unicode characters in Java identifiers. I did not know myself until
this discussion. Furthermore, to find some examples of those, you
would have to search in the native language of each speaker.

> I guess the question is whether the potential for community
> fragmentation is worth trying to handle a (seemingly much) smaller set
> of use-cases than is (already arguably sufficiently) handled with ascii
> identifiers.

Which fragmentation? People who can write ascii-restricted Python will
not go away and form their own little sect. One of the big issues of
this debate here in English is that the people who have a real stake
(the ones who do not master English) cannot join in the discussion. I
was trying to let you know of my experience with native Japanese
speakers who have little or no English capability, but people do not
want to listen.

Regards,

Guillaume

From ncoghlan at gmail.com  Tue May 15 11:28:54 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 15 May 2007 19:28:54 +1000
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>	<20070514192423.624D63A4036@sparrow.telecommunity.com>	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>	<20070514214915.C361C3A4036@sparrow.telecommunity.com>	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>	<20070515003354.194B83A4036@sparrow.telecommunity.com>
	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
Message-ID: <46497D56.3020101@gmail.com>

Guido van Rossum wrote:
> On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
>> More importantly, it seems to go against the grain of at least my
>> mental concept of Python call signatures, in which arguments are
>> inherently *named* (and can be passed using explicit names), with
>> only rare exceptions like range().  In contrast, the languages that
>> have this sort of positional thing only allow arguments to be
>> specified by position, IIRC.  That's what makes me uncomfortable with it.
> 
> Well, in *my* mental model the argument names are just as often
> irrelevant as they are useful. I'd be taken aback if I saw this in
> someone's code: open(filename="/etc/passwd", mode="r"). Perhaps it's
> too bad that Python cannot express the notion of "these parameters are
> positional-only" except very clumsily.

The idea of positional-only arguments came up during the PEP 3102 
discussions. I believe the proposal was to allow a tuple of annotated 
names instead of a single name for the varargs parameter:

@overloadable
def range(*(start:int, stop:int, step:int)):
     ...  # implement xrange

@range.overload
def range(*(stop:int,)):
     return range(0, stop, 1)

@range.overload
def range(*(start:int, stop:int)):
     return range(start, stop, 1)

PJE's approach (using *args in the base signature, but allowing 
overloads to omit it) is probably cleaner, though.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From greg.ewing at canterbury.ac.nz  Tue May 15 13:20:10 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 15 May 2007 23:20:10 +1200
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <20070515014338.8D8EA3A4036@sparrow.telecommunity.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
	<20070515003354.194B83A4036@sparrow.telecommunity.com>
	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
	<20070515014338.8D8EA3A4036@sparrow.telecommunity.com>
Message-ID: <4649976A.1030301@canterbury.ac.nz>

Phillip J. Eby wrote:

> C++ and Java don't have tuples, do they?

No, but in C++ you could probably do something clever by
overloading the comma operator if you were feeling perverse
enough...

--
Greg

From greg.ewing at canterbury.ac.nz  Tue May 15 13:25:36 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 15 May 2007 23:25:36 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070515015454.4F2563A4036@sparrow.telecommunity.com>
References: <5.1.1.6.0.20070430184628.02c9b280@sparrow.telecommunity.com>
	<79990c6b0705110140u59c6d46euddc59b919f55b4e8@mail.gmail.com>
	<ee2a432c0705111129jbec3135ie937d29f31778aab@mail.gmail.com>
	<79990c6b0705140700x11409b5eje2305d5baca09b62@mail.gmail.com>
	<ca471dc20705140829u14cce1c3l93db30d1324303d9@mail.gmail.com>
	<20070514163231.275CE3A4036@sparrow.telecommunity.com>
	<ca471dc20705140941w4eb5736che823b388a1068548@mail.gmail.com>
	<20070514193852.7BE1E3A4036@sparrow.telecommunity.com>
	<ca471dc20705141251y7162cc1fpf2f93e9afd970a6@mail.gmail.com>
	<20070514223422.9CEF23A4036@sparrow.telecommunity.com>
	<ca471dc20705141619o42fd2626m92b17c4db7a2085@mail.gmail.com>
	<20070515002213.532623A4036@sparrow.telecommunity.com>
	<46490FFB.9050904@canterbury.ac.nz>
	<20070515015454.4F2563A4036@sparrow.telecommunity.com>
Message-ID: <464998B0.6020906@canterbury.ac.nz>

Phillip J. Eby wrote:
> Imagine what would happen if the results of 
> calling super() depended on what order your modules had been imported in!

Actually, something like this does happen with super.
You can't be sure which method super() will call when
you write it, because it depends on what other classes
people inherit along with your class, and what order
they're in.

--
Greg

From ajm at flonidan.dk  Tue May 15 13:50:18 2007
From: ajm at flonidan.dk (Anders J. Munch)
Date: Tue, 15 May 2007 13:50:18 +0200
Subject: [Python-3000] Support for PEP 3131
Message-ID: <9B1795C95533CA46A83BA1EAD4B01030031F92@flonidanmail.flonidan.net>

tomer filiba wrote:
> 
> once we have chinese, french and hindi function names, it'd be very
> difficult to interoperate with third party libs. imagine i wrote my
> code using twisted-he, while my client has installed twisted-fr...
> kaboom?

Indeed if the authors of twisted suddenly go insane and decide to
produce multiple, incompatible versions, that would be bad.  If you're
afraid of that happening, rather than argue over PEP 3131, you should
give Matthew Lefkowitz a big hug and buy him a beer.

The Java community isn't fragmented by language barriers despite
having had Unicode identifiers from the outset.  There's no reason to
think this will make the Python community spontaneously self-destruct
either.

- Anders

From tanzer at swing.co.at  Tue May 15 14:53:51 2007
From: tanzer at swing.co.at (Christian Tanzer)
Date: Tue, 15 May 2007 14:53:51 +0200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: Your message of "Tue, 15 May 2007 23:25:36 +1200."
	<464998B0.6020906@canterbury.ac.nz>
Message-ID: <E1HnwXH-0004sG-B8@swing.co.at>


Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

> Phillip J. Eby wrote:
> > Imagine what would happen if the results of
> > calling super() depended on what order your modules had been imported in!
>
> Actually, something like this does happen with super.

No, it doesn't.

The order of super-calls is always well-defined (and the only sane
one)!

> You can't be sure which method super() will call when
> you write it, because it depends on what other classes
> people inherit along with your class, and what order
> they're in.

This is true but doesn't matter (which is the beauty of super).

-- 
Christian Tanzer                                    http://www.c-tanzer.at/


From pje at telecommunity.com  Tue May 15 16:52:28 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 15 May 2007 10:52:28 -0400
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <46491F44.9030406@acm.org>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<46491F44.9030406@acm.org>
Message-ID: <20070515145039.913B73A40A9@sparrow.telecommunity.com>

At 07:47 PM 5/14/2007 -0700, Talin wrote:
>Guido van Rossum wrote:
>>Next, I have a question about the __proceed__ magic argument. I can
>>see why this is useful, and I can see why having this as a magic
>>argument is preferable over other solutions (I couldn't come up with a
>>better solution, and believe me I tried :-).  However, I think making
>>this the *first* argument would upset tools that haven't been taught
>>about this yet. Is there any problem with making it a keyword argument
>>with a default of None, by convention to be placed last?
>
>I earlier suggested that the __proceed__ functionality be 
>implemented by a differently-named decorator, such as "overload_chained".
>
>Phillip objected to this on the basis that it would double the 
>number of decorators. However, I don't think that this is the case, 
>since only a few of the decorators that he has defined supports a 
>__proceed__ argument - certainly 'before' and 'after' don't (since 
>they *all* run), and around has it implicitly.
>
>Also, I believe having a separate code path for the two cases would 
>be more efficient when dispatching.

This isn't so.  Method combination only takes place when a particular 
combination of arguments hasn't been seen before, and the result of 
combination is a single object.  That object can be a bound method 
chain, which is *very* efficient.  In fact, CPython invokes bound 
methods almost as quickly as plain functions, as it has a C-level 
check for them.

In any case, if a method does not have a next-method argument, the 
resulting "combined" method is just the function object, which is 
called directly.

(PEAK-Rules, btw, doesn't incorporate this bound-method-or-function 
optimization at the moment, but it's built into RuleDispatch and is 
pretty darn trivial.)


>The problem of course is that I don't know how to build an efficient 
>dispatch table to do that, and I'm not even sure that it's possible.

Oh, it's possible all right.  The only tricky bit with the proposal 
under discussion is that I need to know the maximum arity (number of 
arguments) that the function is ever dispatched on, in order to build 
a dispatch tuple of the correct length.  Missing arguments get a 
"missing" class put in the corresponding tuple position.  The rest 
works just like normal type-tuple dispatching, so it's really not that complex.
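
In code, the tuple-building part is tiny (sketch; names made up):

    class _Missing(object):
        """Stand-in type for argument positions that weren't supplied."""

    def dispatch_key(args, max_arity):
        # Always produce a tuple of exactly max_arity types, padding with
        # the _Missing class for arguments that weren't passed.
        types = [type(a) for a in args[:max_arity]]
        types += [_Missing] * (max_arity - len(types))
        return tuple(types)

    dispatch_key((1, "two"), 3)          # -> (int, str, _Missing)
    dispatch_key((1, "two", 3.0, 4), 3)  # -> (int, str, float)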


From pje at telecommunity.com  Tue May 15 17:07:44 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 15 May 2007 11:07:44 -0400
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <ca471dc20705142143p42cfb5d4x7d9d3cfe26047696@mail.gmail.co
 m>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514192423.624D63A4036@sparrow.telecommunity.com>
	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
	<20070515003354.194B83A4036@sparrow.telecommunity.com>
	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
	<20070515014338.8D8EA3A4036@sparrow.telecommunity.com>
	<ca471dc20705142143p42cfb5d4x7d9d3cfe26047696@mail.gmail.com>
Message-ID: <20070515150556.9CE063A40A7@sparrow.telecommunity.com>

At 09:43 PM 5/14/2007 -0700, Guido van Rossum wrote:
>On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > Or perhaps we could just say that if the main function is defined
> > with *args, we treat those arguments as positional?  i.e.:
> >
> >      @abstract
> >      def range(*args):
> >          """This just defines the signature; no implementation here"""
>
>That sounds about right.

After thinking about the implementation some more, I believe it'll be 
necessary to know *in advance* the maximum size of *args that will be 
used by any subsequent overload, in order to both generate the 
correct code for the main function (which must construct a fixed-size 
lookup tuple containing special values for not-supplied arguments), 
and the correct type tuples for individual overloads (which must 
contain similar special values for the to-be-omitted arguments).

So, if we could do something like this:

     @abstract
     def range(*args:3):
         ...

then that would be best.  I propose, therefore, that we require an 
integer annotation on the *args to enable positional dispatching.

If there are more *args at call time than this defined amount, only 
methods that have more positional arguments (or a *args) will be selected.

If the number is omitted (e.g. just *args with no annotation), the 
*args will not be used for method selection.
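
Reading the count off the function object is trivial -- with
def range(*args:3) the annotation shows up under the parameter name,
i.e. range.__annotations__ == {'args': 3} -- so the decorator can just
stash it away.  A sketch (assuming the parameter is literally named
"args"):

    def abstract(func):
        # 0 means "don't dispatch on *args at all"
        func.__max_star__ = func.__annotations__.get('args', 0)
        return func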

Still good?


From jjb5 at cornell.edu  Tue May 15 17:31:05 2007
From: jjb5 at cornell.edu (Joel Bender)
Date: Tue, 15 May 2007 11:31:05 -0400
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <20070515150556.9CE063A40A7@sparrow.telecommunity.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>	<20070514192423.624D63A4036@sparrow.telecommunity.com>	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>	<20070514214915.C361C3A4036@sparrow.telecommunity.com>	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>	<20070515003354.194B83A4036@sparrow.telecommunity.com>	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>	<20070515014338.8D8EA3A4036@sparrow.telecommunity.com>	<ca471dc20705142143p42cfb5d4x7d9d3cfe26047696@mail.gmail.com>
	<20070515150556.9CE063A40A7@sparrow.telecommunity.com>
Message-ID: <4649D239.5030009@cornell.edu>

>      @abstract
>      def range(*args:3):
>          ...
> 
> then that would be best.  I propose, therefore, that we require an 
> integer annotation on the *args to enable positional dispatching.

I thought there was already a proposal to do something like this:

        @abstract
        def range(x, y, z, *):
            ...

So there was a specific flag indicating that there are no more positional 
arguments.  Even in an abstract function definition they would at least 
be labeled, which is a good thing.


Joel

From guido at python.org  Tue May 15 17:32:02 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 May 2007 08:32:02 -0700
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <20070515150556.9CE063A40A7@sparrow.telecommunity.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
	<20070515003354.194B83A4036@sparrow.telecommunity.com>
	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
	<20070515014338.8D8EA3A4036@sparrow.telecommunity.com>
	<ca471dc20705142143p42cfb5d4x7d9d3cfe26047696@mail.gmail.com>
	<20070515150556.9CE063A40A7@sparrow.telecommunity.com>
Message-ID: <ca471dc20705150832n29e110f4vae84150b068514a3@mail.gmail.com>

On 5/15/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:43 PM 5/14/2007 -0700, Guido van Rossum wrote:
> >On 5/14/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > Or perhaps we could just say that if the main function is defined
> > > with *args, we treat those arguments as positional?  i.e.:
> > >
> > >      @abstract
> > >      def range(*args):
> > >          """This just defines the signature; no implementation here"""
> >
> >That sounds about right.
>
> After thinking about the implementation some more, I believe it'll be
> necessary to know *in advance* the maximum size of *args that will be
> used by any subsequent overload, in order to both generate the
> correct code for the main function (which must construct a fixed-size
> lookup tuple containing special values for not-supplied arguments),
> and the correct type tuples for individual overloads (which must
> contain similar special values for the to-be-omitted arguments).
>
> So, if we could do something like this:
>
>      @abstract
>      def range(*args:3):
>          ...
>
> then that would be best.  I propose, therefore, that we require an
> integer annotation on the *args to enable positional dispatching.
>
> If there are more *args at call time than this defined amount, only
> methods that have more positional arguments (or a *args) will be selected.
>
> If the number is omitted (e.g. just *args with no annotation), the
> *args will not be used for method selection.
>
> Still good?

Not so good; I expect the overloads could be written by different
authors or at least at different times. Why can't you dynamically
update the dispatcher when an overloading with more arguments comes
along?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Tue May 15 18:25:24 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 15 May 2007 12:25:24 -0400
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <ca471dc20705150832n29e110f4vae84150b068514a3@mail.gmail.co
 m>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514214915.C361C3A4036@sparrow.telecommunity.com>
	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
	<20070515003354.194B83A4036@sparrow.telecommunity.com>
	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
	<20070515014338.8D8EA3A4036@sparrow.telecommunity.com>
	<ca471dc20705142143p42cfb5d4x7d9d3cfe26047696@mail.gmail.com>
	<20070515150556.9CE063A40A7@sparrow.telecommunity.com>
	<ca471dc20705150832n29e110f4vae84150b068514a3@mail.gmail.com>
Message-ID: <20070515162336.AD9A23A4036@sparrow.telecommunity.com>

At 08:32 AM 5/15/2007 -0700, Guido van Rossum wrote:
>Not so good; I expect the overloads could be written by different
>authors or at least at different times. Why can't you dynamically
>update the dispatcher when an overloading with more arguments comes
>along?

You mean by changing its __code__?  The code to generate the tuple 
goes in the original function object generated by @abstract or @overloadable.

If we can't specify the count in advance, the remaining choices appear to be:

* Require *args to be annotated :overloadable in order to enable 
dispatching on them, or

* Only enable *args dispatching if the original function has no 
explicit positional arguments

* Mutate the function

Of these, I lean towards the third, but I imagine you'll like one of 
the other two better.  :)

If we don't do one of these things, the performance of functions that 
have *args but don't want to dispatch on them will suffer enormously 
due to the need to loop over *args and create a dynamic-length 
tuple.  (As shown by the performance tests you did on your tuple 
dispatch prototype.)

Conversely, if we mutate the function, then even dispatching over 
*args won't require a loop slowdown; the tuple can *always* be of a 
fixed length.


From guido at python.org  Tue May 15 18:40:28 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 May 2007 09:40:28 -0700
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <20070515162336.AD9A23A4036@sparrow.telecommunity.com>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
	<20070515003354.194B83A4036@sparrow.telecommunity.com>
	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
	<20070515014338.8D8EA3A4036@sparrow.telecommunity.com>
	<ca471dc20705142143p42cfb5d4x7d9d3cfe26047696@mail.gmail.com>
	<20070515150556.9CE063A40A7@sparrow.telecommunity.com>
	<ca471dc20705150832n29e110f4vae84150b068514a3@mail.gmail.com>
	<20070515162336.AD9A23A4036@sparrow.telecommunity.com>
Message-ID: <ca471dc20705150940h61cc6591s1791fc9181cf99ef@mail.gmail.com>

On 5/15/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 08:32 AM 5/15/2007 -0700, Guido van Rossum wrote:
> >Not so good; I expect the overloads could be written by different
> >authors or at least at different times. Why can't you dynamically
> >update the dispatcher when an overloading with more arguments comes
> >along?
>
> You mean by changing its __code__?  The code to generate the tuple
> goes in the original function object generated by @abstract or @overloadable.
>
> If we can't specify the count in advance, the remaining choices appear to be:
>
> * Require *args to be annotated :overloadable in order to enable
> dispatching on them, or
>
> * Only enable *args dispatching if the original function has no
> explicit positional arguments
>
> * Mutate the function
>
> Of these, I lean towards the third, but I imagine you'll like one of
> the other two better.  :)
>
> If we don't do one of these things, the performance of functions that
> have *args but don't want to dispatch on them will suffer enormously
> due to the need to loop over *args and create a dynamic-length
> tuple.  (As shown by the performance tests you did on your tuple
> dispatch prototype.)
>
> Conversely, if we mutate the function, then even dispatching over
> *args won't require a loop slowdown; the tuple can *always* be of a
> fixed length.

It looks like you're focused on an implementation that is both highly
optimized and (technically) pure Python (using every trick in the
book). Personally I would rather go for a slower but simpler pure
Python implementation and eventually add C support to speed it up;
that way the Python version can maintain more of the advantages of
Python code like readability, maintainability, and evolvability (just
in case we don't get it perfect on the first try -- even knowing that
it's more like the 3rd try for you ;-).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Tue May 15 19:41:53 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 15 May 2007 13:41:53 -0400
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <ca471dc20705150940h61cc6591s1791fc9181cf99ef@mail.gmail.co
 m>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>
	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>
	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>
	<20070515003354.194B83A4036@sparrow.telecommunity.com>
	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>
	<20070515014338.8D8EA3A4036@sparrow.telecommunity.com>
	<ca471dc20705142143p42cfb5d4x7d9d3cfe26047696@mail.gmail.com>
	<20070515150556.9CE063A40A7@sparrow.telecommunity.com>
	<ca471dc20705150832n29e110f4vae84150b068514a3@mail.gmail.com>
	<20070515162336.AD9A23A4036@sparrow.telecommunity.com>
	<ca471dc20705150940h61cc6591s1791fc9181cf99ef@mail.gmail.com>
Message-ID: <20070515174006.49A063A4036@sparrow.telecommunity.com>

At 09:40 AM 5/15/2007 -0700, Guido van Rossum wrote:
>It looks like you're focused on an implementation that is both highly
>optimized and (technically) pure Python (using every trick in the
>book). Personally I would rather go for a slower but simpler pure
>Python implementation

Actually, the need to handle keyword arguments intelligently pretty 
much demands that you use a function object as a front-end.  It's a 
*lot* easier to compile a function that does the right thing for one 
specific signature, than it is to write a single routine that 
interprets arbitrary function signatures correctly!

(Note that the "inspect" module does all the heavy lifting, including 
the formatting of all the argument strings.  It even supports nested 
tuple arguments, though of course we won't be needing those for Py3K!)

IOW, I originally started using functions as front-ends in 
RuleDispatch to support keyword arguments correctly, not to improve 
performance.  It actually slows things down a little in RuleDispatch 
to do that, because it adds an extra calling level.  But it's a 
correctness thing, not a performance thing.

Anyway, I've figured out at least one way to handle *args 
efficiently, without any pre-declaration, by modifying the function 
template slightly when *args are in play:

     def make_function(__defaults, __lookup, __starcount):
         def $funcname(..., *args, ...):
             if args and __starcount:
                  # code to make a type tuple using args[:__starcount]
             else:
                  # fast code that doesn't use __starcount
         def __setcount(count):
             nonlocal __starcount
             __starcount = count

         return $funcname, __setcount

This avoids any need to mutate the function later; instead, the 
dispatch engine can just call __setcount() when it encounters 
signatures that dispatch on the contents of *args.  So, I think this 
will do everything you wanted.
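
Stripped of the code generation, the closure trick is just this (toy
names, and ignoring the named parameters the real template also has):

    def make_dispatcher(lookup, default):
        starcount = 0          # how many *args positions go into the key

        def dispatcher(*args):
            if args and starcount:
                key = tuple(type(a) for a in args[:starcount])
            else:
                key = ()       # fast path: *args ignored for dispatch
            return lookup.get(key, default)(*args)

        def setcount(count):
            nonlocal starcount
            starcount = max(starcount, count)

        return dispatcher, setcount

The engine calls setcount() the first time it sees a signature that
dispatches on *args, and nothing about the dispatcher function itself
ever has to be rewritten.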


From collinw at gmail.com  Wed May 16 00:30:13 2007
From: collinw at gmail.com (Collin Winter)
Date: Tue, 15 May 2007 15:30:13 -0700
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <d11dcfba0705141333j3aa93914s5d680fff99af7283@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
	<d11dcfba0705132308h712fbadeq91e4474f294ac936@mail.gmail.com>
	<43aa6ff70705141303t63520c44g8f88f8ae56732137@mail.gmail.com>
	<d11dcfba0705141333j3aa93914s5d680fff99af7283@mail.gmail.com>
Message-ID: <43aa6ff70705151530m1f414acdlc4b70383ab2471d5@mail.gmail.com>

On 5/14/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 5/14/07, Collin Winter <collinw at gmail.com> wrote:
> > There really is no difference between roles and all- at abstractmethod
> > ABCs. From my point of view, though, roles win because they don't
> > require any changes to the interpreter; they're a much simpler way of
> > expressing the same concept.
>
> Ok, you clearly have an implementation in mind, but I don't know what
> it is.  As far as I can tell:
>
> * metaclass=Role ~ metaclass=ABCMeta, except that all methods must be abstract
> * perform_role(role)(cls) ~ role.register(cls)
> * performs(obj, role) ~ isinstance(obj, role)
>
> And so, as far as I can see, without an Implementation section, all
> you're propsing is a different syntax for the same functionality. Was
> there a discussion of your implementation that I missed?
>
> > You may like adding the extra complexity
> > and indirection to the VM necessary to support
> > issubclass()/isinstance() overriding, but I don't.
>
> Have you looked at Guido's issubclass()/isinstance() patch
> (http://bugs.python.org/1708353)?  I'd hardly say that 34 lines of C
> code is substantial "extra complexity".

This is what I don't understand: ABCs require changing the VM, roles
don't; all that change buys you is the ability to spell "performs()"
as "isinstance()". Why are ABCs preferable, again?

Collin Winter

From guido at python.org  Wed May 16 00:39:59 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 May 2007 15:39:59 -0700
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <43aa6ff70705151530m1f414acdlc4b70383ab2471d5@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
	<d11dcfba0705132308h712fbadeq91e4474f294ac936@mail.gmail.com>
	<43aa6ff70705141303t63520c44g8f88f8ae56732137@mail.gmail.com>
	<d11dcfba0705141333j3aa93914s5d680fff99af7283@mail.gmail.com>
	<43aa6ff70705151530m1f414acdlc4b70383ab2471d5@mail.gmail.com>
Message-ID: <ca471dc20705151539n1ca9a787r7b691666efea1bce@mail.gmail.com>

On 5/15/07, Collin Winter <collinw at gmail.com> wrote:
> This is what I don't understand: ABCs require changing the VM, roles
> don't; all that change buys you is the ability to spell "performs()"
> as "isinstance()". Why are ABCs preferable, again?

Actually, if you didn't care about overloading isinstance(), you could
have everything else in PEP 3119 by using a different spelling than
isinstance(). Suppose the playing field were to be leveled like this
-- IMO ABCs would still be preferable because they can *also* be
subclassed directly and provide concrete or partially-implemented
methods, acting as mix-in classes. You can also turn it around. If
Roles were overloading isinstance() -- how would they be better than
ABCs?

But I *like* overloading isinstance(), because it means there's less
to learn, and so does Phillip -- it means there can be a uniform way
for the GF machinery to talk about the relationships between instances
and the various things that can be used as argument annotations (even
zope.interfaces could overload isinstance() to do the right thing).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com  Wed May 16 01:43:53 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Tue, 15 May 2007 17:43:53 -0600
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <43aa6ff70705151530m1f414acdlc4b70383ab2471d5@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
	<d11dcfba0705132308h712fbadeq91e4474f294ac936@mail.gmail.com>
	<43aa6ff70705141303t63520c44g8f88f8ae56732137@mail.gmail.com>
	<d11dcfba0705141333j3aa93914s5d680fff99af7283@mail.gmail.com>
	<43aa6ff70705151530m1f414acdlc4b70383ab2471d5@mail.gmail.com>
Message-ID: <d11dcfba0705151643w2a5872a9s96394fbac44f91f5@mail.gmail.com>

On 5/15/07, Collin Winter <collinw at gmail.com> wrote:
> On 5/14/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > On 5/14/07, Collin Winter <collinw at gmail.com> wrote:
> > > You may like adding the extra complexity
> > > and indirection to the VM necessary to support
> > > issubclass()/isinstance() overriding, but I don't.
> >
> > Have you looked at Guido's issubclass()/isinstance() patch
> > (http://bugs.python.org/1708353)?  I'd hardly say that 34 lines of C
> > code is substantial "extra complexity".
>
> This is what I don't understand: ABCs require changing the VM, roles
> don't; all that change buys you is the ability to spell "performs()"
> as "isinstance()".

Sorry, I can't really respond to this until you give me some idea what
your implementation is.  You keep saying that roles don't require
changing the VM, but I don't know what they *do* involve changing. So
I can't judge how different that is from allowing isinstance() to be
overloaded.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From greg.ewing at canterbury.ac.nz  Wed May 16 02:19:29 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 16 May 2007 12:19:29 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <E1HnwXH-0004sG-B8@swing.co.at>
References: <E1HnwXH-0004sG-B8@swing.co.at>
Message-ID: <464A4E11.9030107@canterbury.ac.nz>

Christian Tanzer wrote:
> Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
> > Phillip J. Eby wrote:
> >
> > > Imagine what would happen if the results of
> > > calling super() depended on what order your modules had been imported in!
> >
> > Actually, something like this does happen with super.
> 
> This is true but doesn't matter (which is the beauty of super).

Only because super methods are written with this
knowledge in mind, however. Seems to me you ought
to have something similar in mind when overloading
a generic function.

--
Greg

From collinw at gmail.com  Wed May 16 02:38:42 2007
From: collinw at gmail.com (Collin Winter)
Date: Tue, 15 May 2007 17:38:42 -0700
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
In-Reply-To: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
References: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
Message-ID: <43aa6ff70705151738x715c4616pbfaf1b085ffda1fa@mail.gmail.com>

On 5/11/07, Guido van Rossum <guido at python.org> wrote:
> - Overloading isinstance and issubclass is now a key mechanism rather
> than an afterthought; it is also the only change to C code required
>
> - Built-in (and user-defined) types can be registered as "virtual
> subclasses" (not related to virtual base classes in C++) of the
> standard ABCs, e.g. Sequence.register(tuple) makes issubclass(tuple,
> Sequence) true (but Sequence won't show up in __bases__ or __mro__).

(The bit about "issubclass(tuple, Sequence)" currently isn't true with
the sandbox prototype, but let's assume that it is/will be.)

Given:

class MyABC(metaclass=ABCMeta):
  def foo(self): # A concrete method
    return 5

class MyClass(MyABC): # Mark as implementing the ABC's interface
  pass

>>> a = MyClass()
>>> isinstance(a, MyABC)
True # Good, I can call foo()
>>> a.foo()
5

>>> MyABC.register(list)
>>> isinstance([], MyABC)
True # Good, I can call foo()
>>> [].foo()
Traceback (most recent call last):
AttributeError: 'list' object has no attribute 'foo'

Have I missed something? It would seem that when dealing with ABCs
that provide concrete methods, "isinstance(x, SomeABC) == True" is
useless.

Collin Winter

From pje at telecommunity.com  Wed May 16 02:41:38 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 15 May 2007 20:41:38 -0400
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <464A4E11.9030107@canterbury.ac.nz>
References: <E1HnwXH-0004sG-B8@swing.co.at> <464A4E11.9030107@canterbury.ac.nz>
Message-ID: <20070516003949.BE0733A4036@sparrow.telecommunity.com>

At 12:19 PM 5/16/2007 +1200, Greg Ewing wrote:
>Christian Tanzer wrote:
> > Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> >
> > > Phillip J. Eby wrote:
> > >
> > > > Imagine what would happen if the results of
> > > > calling super() depended on what order your modules had been 
> imported in!
> > >
> > > Actually, something like this does happen with super.
> >
> > This is true but doesn't matter (which is the beauty of super).
>
>Only because super methods are written with this
>knowledge in mind, however. Seems to me you ought
>to have something similar in mind when overloading
>a generic function.

This discussion has wandered away from the point.  Next-method calls 
are in fact identical to super calls in the degenerate case of 
specializing only on the first argument.

However, the point of before/after methods is that they don't follow 
this pattern.

If only one person is writing all the methods in a generic function, 
they don't have much benefit from using before/after methods, because 
they could just code the desired behavior into the primary methods.

The benefit of before/after, on the other hand, is that they allow 
any number of developers to "hook" the calling of the function.  Any 
given developer can predict the calling order for *their* before and 
after methods, but does not necessarily know when other developers' 
before/after methods might be called.

If you and I both define a @before(foo,(X,Y)) method, there is no way 
for either of us to predict whose method will be called first, even 
though we can each predict that our own (X,Y) method will be called 
before an (object,object) method that we also registered.  Our 
methods are in parallel universes that do not overlap.

This is the driving force for having before and after methods: 
allowing independent hooks to be registered, while ensuring that they 
can't mess anything up (as long as they stick to their own business).

To put it another way, if you care about the *overall* (absolute?) 
order, you have to use primary or "around" methods.  If you only care 
about the *relative* order, you want a before or after method.  The 
fact that you do NOT have explicit control over the chaining is the 
very thing that makes them able to be independent.
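
A toy version of the registration side makes the point concrete (this
is nothing like the real combination machinery, just two hook lists):

    class hookable:
        def __init__(self, primary):
            self.primary = primary
            self.befores = []
            self.afters = []

        def before(self, hook):
            self.befores.append(hook)   # registration order is not a contract
            return hook

        def after(self, hook):
            self.afters.append(hook)
            return hook

        def __call__(self, *args, **kw):
            for hook in self.befores:          # *all* of them run,
                hook(*args, **kw)              # no explicit chaining
            result = self.primary(*args, **kw)
            for hook in reversed(self.afters):
                hook(*args, **kw)
            return result

Two developers can each add a before hook without knowing about the
other; both hooks run, and neither can accidentally skip the other by
forgetting a next-method call.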


From greg.ewing at canterbury.ac.nz  Wed May 16 03:01:19 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 16 May 2007 13:01:19 +1200
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
 Interfaces, etc.
In-Reply-To: <20070516003949.BE0733A4036@sparrow.telecommunity.com>
References: <E1HnwXH-0004sG-B8@swing.co.at> <464A4E11.9030107@canterbury.ac.nz>
	<20070516003949.BE0733A4036@sparrow.telecommunity.com>
Message-ID: <464A57DF.90002@canterbury.ac.nz>

Phillip J. Eby wrote:

> This is the driving force for having before and after methods: allowing 
> independent hooks to be registered, while ensuring that they can't mess 
> anything up (as long as they stick to their own business).

Some discipline is still required to make sure they do
stick to their business. The same result could be achieved
with only one order-independent decorator instead of two,
and an additional discipline of always calling the next
method.

--
Greg

From guido at python.org  Wed May 16 03:25:48 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 May 2007 18:25:48 -0700
Subject: [Python-3000] PEP 3124 - Overloading, Generic Functions,
	Interfaces, etc.
In-Reply-To: <464A4E11.9030107@canterbury.ac.nz>
References: <E1HnwXH-0004sG-B8@swing.co.at> <464A4E11.9030107@canterbury.ac.nz>
Message-ID: <ca471dc20705151825h74727a5k337b09f151128b0f@mail.gmail.com>

Note that Phillip's hypothetical was about it depending on *the order
in which modules are imported*. Super has no such dependency -- it
just depends on the inheritance graph, which is much more
well-defined.

--Guido

On 5/15/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Christian Tanzer wrote:
> > Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> >
> > > Phillip J. Eby wrote:
> > >
> > > > Imagine what would happen if the results of
> > > > calling super() depended on what order your modules had been imported in!
> > >
> > > Actually, something like this does happen with super.
> >
> > This is true but doesn't matter (which is the beauty of super).
>
> Only because super methods are written with this
> knowledge in mind, however. Seems to me you ought
> to have something similar in mind when overloading
> a generic function.
>
> --
> Greg
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed May 16 03:34:52 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 May 2007 18:34:52 -0700
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
In-Reply-To: <43aa6ff70705151738x715c4616pbfaf1b085ffda1fa@mail.gmail.com>
References: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
	<43aa6ff70705151738x715c4616pbfaf1b085ffda1fa@mail.gmail.com>
Message-ID: <ca471dc20705151834q2dd5e766sb4321f8f5c79f79e@mail.gmail.com>

On 5/15/07, Collin Winter <collinw at gmail.com> wrote:
> On 5/11/07, Guido van Rossum <guido at python.org> wrote:
> > - Overloading isinstance and issubclass is now a key mechanism rather
> > than an afterthought; it is also the only change to C code required
> >
> > - Built-in (and user-defined) types can be registered as "virtual
> > subclasses" (not related to virtual base classes in C++) of the
> > standard ABCs, e.g. Sequence.register(tuple) makes issubclass(tuple,
> > Sequence) true (but Sequence won't show up in __bases__ or __mro__).
>
> (The bit about "issubclass(tuple, Sequence)" currently isn't true with
> the sandbox prototype, but let's assume that it is/will be.)

Perhaps you tried it without the patch (reference [12] from PEP 3119)
applied? It works for me:

guido at pythonic:abc$ python3.0
Python 3.0x (p3yk, May 10 2007, 17:05:42)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import abc
>>> isinstance((), abc.Sequence)
True
>>>


> Given:
>
> class MyABC(metaclass=ABCMeta):
>   def foo(self): # A concrete method
>     return 5
>
> class MyClass(MyABC): # Mark as implementing the ABC's interface
>   pass
>
> >>> a = MyClass()
> >>> isinstance(a, MyABC)
> True # Good, I can call foo()
> >>> a.foo()
> 5
>
> >>> MyABC.register(list)
> >>> isinstance([], MyABC)
> True # Good, I can call foo()
> >>> [].foo()
> Traceback (most recent call last):
> AttributeError: 'list' object has no attribute 'foo'
>
> Have I missed something? It would seem that when dealing with ABCs
> that provide concrete methods, "isinstance(x, SomeABC) == True" is
> useless.

The intention is that you shouldn't register such cases. This falls
under the consenting-adults rule.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Wed May 16 03:54:56 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 15 May 2007 21:54:56 -0400
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
In-Reply-To: <ca471dc20705151834q2dd5e766sb4321f8f5c79f79e@mail.gmail.co
 m>
References: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
	<43aa6ff70705151738x715c4616pbfaf1b085ffda1fa@mail.gmail.com>
	<ca471dc20705151834q2dd5e766sb4321f8f5c79f79e@mail.gmail.com>
Message-ID: <20070516015310.156A53A4036@sparrow.telecommunity.com>

At 06:34 PM 5/15/2007 -0700, Guido van Rossum wrote:
> > Have I missed something? It would seem that when dealing with ABCs
> > that provide concrete methods, "isinstance(x, SomeABC) == True" is
> > useless.
>
>The intention is that you shouldn't register such cases. This falls
>under the consenting-adults rule.

Not only that, but the presence of the isinstance()/issubclass() 
hooks actually means that if you want to create your own "Role" or 
"Interface" types that actually *verify* your requirements, you can do so!


From yi.codeplayer at gmail.com  Wed May 16 05:06:29 2007
From: yi.codeplayer at gmail.com (=?UTF-8?B?6buE5q+F?=)
Date: Wed, 16 May 2007 11:06:29 +0800
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070514093643.8559.JCARLSON@uci.edu>
References: <4647B15F.7040700@canterbury.ac.nz>
	<bb8868b90705140922v2c1d862ic8fb4f91418656bd@mail.gmail.com>
	<20070514093643.8559.JCARLSON@uci.edu>
Message-ID: <d7bc2f9d0705152006xc6db8a9seb850fe718bef9c6@mail.gmail.com>

>
> Have you been able to find substantial Java source in which non-ascii
> identifiers were used?  I have been curious about its prevalence, but
> wouldn't even know how to start searching for such code.


I've seen a lot of (and written some) Java and C# code that uses Chinese
identifiers, and yes, most of that kind of code is closed source.
And I'd like to see this feature in Python, because as
an English-second-language programmer, it's hard to translate all the
Chinese terms in my head to English; sometimes it's even impossible to do,
and then I have to make up some strange words based on the pronunciation,
and I hate that very much.

-- 
http://codeplayer.blogspot.com/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-3000/attachments/20070516/ac27a3fa/attachment.htm 

From guido at python.org  Wed May 16 06:46:48 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 15 May 2007 21:46:48 -0700
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <4648D626.1030201@benjiyork.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
	<4648D626.1030201@benjiyork.com>
Message-ID: <ca471dc20705152146p382ac396u5f7dc4b4f138cd26@mail.gmail.com>

On 5/14/07, Benji York <benji at benjiyork.com> wrote:
> Collin Winter wrote:
> > PEP: 3133
> > Title: Introducing Roles
>
> Everything included here is included in zope.interface.  See in-line
> comments below for the analogs.

Could you look at PEP 3119 and do a similar analysis? I expect that
the main thing missing there is that it (currently) has no way to
claim that a particular *object* has a certain behavior. The
overloading of isinstance() makes it possible to add this however --
if not as part of that PEP, then as part of a revamping of
zope.interface using isinstance()/issubclass() overloading and PEP
3129 style class decorators. PEP 3119 currently also doesn't have a
verification step -- but this could easily be added as an (optional)
part of the registration call.

If this is confirmed, I like the convergence that this suggests -- if
several designs (ABCs, Roles and zope.interface) mostly map onto each
other, we're probably on to an important concept, even if we can
quibble over the spelling of behavior checks and other details. It
also all appears to dovetail nicely with GFs.

BTW I think Collin made a mistake when he claimed that the Doglike
role should throw a tantrum just because the actual bark()
implementation has an optional extra argument; that would be like
complaining that it also has  a poop() method which is not part of the
Doglike role. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gproux+py3000 at gmail.com  Wed May 16 07:13:16 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Wed, 16 May 2007 14:13:16 +0900
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
	users group
Message-ID: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>

Hello,

Just to let you know that a discussion on japanese python users group
is going on regarding this issue.

Most people feel like the PEP3131 would be a welcome addition.
-> some people point out the fact that special characters like the
greek letters would be great for all kind of  maths calculation.
-> Many people think that this would enable them to make their own DSL
-> unittest - very useful to give a better overview of the result of
unit test. People pointed us at a Visual C# MVP tutorial
  http://www.atmarkit.co.jp/fdotnet/nagile/nagile02/nagile02_03.html

One person expressed the worry that mixing japanese and ascii would
oblige them to change input mode too often but other posters said that
this could be easily arranged by putting the right settings in the
IME.

Guillaume

From ntoronto at cs.byu.edu  Wed May 16 07:18:47 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Tue, 15 May 2007 23:18:47 -0600
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
 users group
In-Reply-To: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
Message-ID: <464A9437.6030701@cs.byu.edu>

Guillaume Proux wrote:
> Hello,
>
> Just to let you know that a discussion on japanese python users group
> is going on regarding this issue.
>
> Most people feel like the PEP3131 would be a welcome addition.
> -> some people point out the fact that special characters like the
> greek letters would be great for all kind of  maths calculation.

It could be nice for reading, for those who know the Greek alphabet. 
(Those who don't would see every Greek letter as just a squiggle.) 
Writing, though? I don't have a clue how to type Greek letters, so I'd 
end up copy-and-pasting variable names. Icky.

Neil


From gproux+py3000 at gmail.com  Wed May 16 07:46:40 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Wed, 16 May 2007 14:46:40 +0900
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
	users group
In-Reply-To: <464A9437.6030701@cs.byu.edu>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
	<464A9437.6030701@cs.byu.edu>
Message-ID: <19dd68ba0705152246n51d268acpcc90710157a6bca3@mail.gmail.com>

One of the big advantages of Japanese input methods: they can be
extended easily to fit your needs.

I can type "siguma" on my laptop here and Windows (same in Linux)
gives me the following choices

???
???
?
?


cute no?

Guillaume

On 5/16/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:
> Guillaume Proux wrote:
> > Hello,
> >
> > Just to let you know that a discussion on japanese python users group
> > is going on regarding this issue.
> >
> > Most people feel like the PEP3131 would be a welcome addition.
> > -> some people point out the fact that special characters like the
> > greek letters would be great for all kind of  maths calculation.
>
> It could be nice for reading, for those who know the Greek alphabet.
> (Those who don't would see every Greek letter as just a squiggle.)
> Writing, though? I don't have a clue how to type Greek letters, so I'd
> end up copy-and-pasting variable names. Icky.
>
> Neil
>
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/gproux%2Bpy3000%40gmail.com
>

From pedronis at openendsystems.com  Wed May 16 09:35:05 2007
From: pedronis at openendsystems.com (Samuele Pedroni)
Date: Wed, 16 May 2007 09:35:05 +0200
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
In-Reply-To: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
References: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
Message-ID: <464AB429.5010305@openendsystems.com>

Guido van Rossum wrote:
> **Open issues:** Conceivably, instead of using the ABCMeta metaclass,
> these classes could override ``__instancecheck__`` and
> ``__subclasscheck__`` to check for the presence of the applicable
> special method; for example::
>
>     class Sized(metaclass=ABCMeta):
>         @abstractmethod
>         def __len__(self):
>             return 0
>         @classmethod
>         def __instancecheck__(cls, x):
>             return hasattr(x, "__len__")
>         @classmethod
>         def __subclasscheck__(cls, C):
>             return hasattr(C, "__bases__") and hasattr(C, "__len__")
>
> This has the advantage of not requiring explicit registration.
> However, the semantics are hard to get exactly right given the confusing
> semantics of instance attributes vs. class attributes, and that a
> class is an instance of its metaclass; the check for ``__bases__`` is
> only an approximation of the desired semantics.  **Strawman:** Let's
> do it, but let's arrange it in such a way that the registration API
> also works.
>
>
>   
just to say that I still think the strawman would be right thing to do.





From ncoghlan at gmail.com  Wed May 16 12:10:27 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 May 2007 20:10:27 +1000
Subject: [Python-3000] Revised PEP 3119 (Abstract Base Classes)
In-Reply-To: <43aa6ff70705151738x715c4616pbfaf1b085ffda1fa@mail.gmail.com>
References: <ca471dc20705111647t1096c665o301d6ef14018146d@mail.gmail.com>
	<43aa6ff70705151738x715c4616pbfaf1b085ffda1fa@mail.gmail.com>
Message-ID: <464AD893.8010702@gmail.com>

Collin Winter wrote:
>>>> MyABC.register(list)
>>>> isinstance([], MyABC)
> True # Good, I can call foo()
>>>> [].foo()
> Traceback (most recent call last):
> AttributeError: 'list' object has no attribute 'foo'
> 
> Have I missed something? It would seem that when dealing with ABCs
> that provide concrete methods, "isinstance(x, SomeABC) == True" is
> useless.

You've missed something - the declaration in your example that list is 
compliant with the example ABC when that is not in fact the case is an 
out-and-out bug that leads directly to the exception on the last line.

I can construct an identical example for PEP 3133:

class MyRole(metaclass=Role):
   def foo(self): # An abstract method
     pass

@performs_role(MyRole)
class MyRoleMixin(object):
   def foo(self): # A concrete method
     return 5

class MyClass(MyRoleMixin): # Use Mixin to perform the Role
   pass

.>>> a = MyClass()
.>>> performs(a, MyRole)
True # Good, I can call foo()
.>>> a.foo()
5

.>>> performs_role(MyRole)(list) # This assertion is WRONG!
.>>> performs([], MyRole)
True # Good, I can call foo()
.>>> [].foo()
Traceback (most recent call last):
AttributeError: 'list' object has no attribute 'foo'


One of the key things that PEP 3119 does is to permit a single ABC to 
handle both of the jobs that PEP 3133 assigns to separate Role and Mixin 
classes.

When implementing a PEP 3119 style interface you have two options - 
inherit from the ABC and benefit from its mixin characteristics, or else 
do a post-hoc registration and implement everything yourself.

Enforcing an explicit Role/Mixin distinction the way that PEP 3133 does 
just makes the interface developer repeat themselves - once to write the 
Role and then again to write a Mixin that provides the concrete methods 
which can be defined entirely in terms of other methods in the interface.

After that extra work, the user of the interface still has the same two 
options - inherit from the Mixin and benefit from the partial 
implementation, or do the post-hoc registration and full implementation.

Equivalent expressiveness and significantly less typing give me a 
strong personal preference towards the PEP 3119 approach. The 
improvements in proxy object support and keeping isinstance() as the one 
obvious way to introspect interfaces are also nice bonuses.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Wed May 16 16:13:22 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 07:13:22 -0700
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
	users group
In-Reply-To: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
Message-ID: <ca471dc20705160713o3fd968f6gbe4a5bfe9eff1551@mail.gmail.com>

After the warm user testimonials posted in the last few days I am now
warming up to this proposal. Hearing about how it has been a positive
influence in certain local Java communities was especially useful.

--Guido

On 5/15/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> Hello,
>
> Just to let you know that a discussion on japanese python users group
> is going on regarding this issue.
>
> Most people feel like the PEP3131 would be a welcome addition.
> -> some people point out the fact that special characters like the
> greek letters would be great for all kind of  maths calculation.
> -> Many people think that this would enable them to make their own DSL
> -> unittest - very useful to give a better overview of the result of
> unit test. People pointed us at a Visual C# MVP tutorial
>   http://www.atmarkit.co.jp/fdotnet/nagile/nagile02/nagile02_03.html
>
> One person expressed the worry that mixing Japanese and ASCII would
> oblige them to change input mode too often, but other posters said that
> this could easily be arranged by putting the right settings in the
> IME.
>
> Guillaume
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed May 16 16:48:24 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 07:48:24 -0700
Subject: [Python-3000] Alternatives for __del__
Message-ID: <ca471dc20705160748h2ff7110qa33ebe0507309326@mail.gmail.com>

Since no PEP has been submitted about eliminating __del__, __del__
remains, by default, in Python 3000. I am more comfortable with this
anyway.

However, I still welcome an informational PEP describing the "best
practices" for avoiding it by using weak references, including some
support code to be added to weakref.py (this could probably be added
to Python 2.6 as well; and for earlier releases it could be made
available as a 3rd party add-on or as a "recipe" in the online Python
Cookbook, http://aspn.activestate.com/ASPN/Python/Cookbook/).
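
Not that PEP, obviously, but a rough sketch of the weakref-callback idiom
such a document would presumably cover (Resource and _Handle are invented
names for illustration):

import weakref

class _Handle:
    """Stands in for an external resource (file, socket, C pointer, ...)."""
    def close(self):
        print("handle closed")

_live_refs = set()   # keep the weakrefs alive until their callbacks fire

class Resource:
    def __init__(self):
        self._handle = _Handle()
        handle = self._handle    # capture the handle, *not* self, so the
                                 # callback doesn't keep the object alive
        def _cleanup(ref, handle=handle):
            _live_refs.discard(ref)
            handle.close()
        _live_refs.add(weakref.ref(self, _cleanup))

r = Resource()
del r   # the callback fires (in CPython, immediately) and closes the handle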

I am hoping that someone besides Raymond will volunteer to write such
a PEP; his busy schedule makes it unlikely that he will have the time
necessary to devote to this project.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed May 16 17:46:13 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 08:46:13 -0700
Subject: [Python-3000] Raw strings containing \u or \U
Message-ID: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>

Walter Doerwald, in private mail, reminded me of a third use case for
raw strings: docstrings containing example code using backslashes.
Here it really seems wrong to interpolate \u and \U.

So this is swaying me towards changing this behavior: r"\u1234" will
be a string of length 6, and r"\U00012345" one of length 10.
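
Spelled out as an interpreter session, the proposed behavior would be:

>>> len(r"\u1234")       # backslash and 'u' become ordinary characters
6
>>> len(r"\U00012345")
10
>>> len("\u1234")        # non-raw strings still interpret the escape
1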

I'm still on the fence about the trailing backslash; I personally
prefer to write Windows paths using regular strings and doubled
backslashes.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From collinw at gmail.com  Wed May 16 17:55:15 2007
From: collinw at gmail.com (Collin Winter)
Date: Wed, 16 May 2007 08:55:15 -0700
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
	users group
In-Reply-To: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
Message-ID: <43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>

On 5/15/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> Just to let you know that a discussion on japanese python users group
> is going on regarding this issue.
>
> Most people feel like the PEP3131 would be a welcome addition.
> -> some people point out the fact that special characters like the
> greek letters would be great for all kind of  maths calculation.
> -> Many people think that this would enable them to make their own DSL

Oooh, and we could use actual lambdas instead of the lambda keyword. </sarcasm>

So now we've made the jump from "help (some) international users" to
"I want to use unicode characters just for the hell of it".

> -> unittest - very useful to give a better overview of the result of
> unit test. People pointed us at a Visual C# MVP tutorial
>   http://www.atmarkit.co.jp/fdotnet/nagile/nagile02/nagile02_03.html

I don't know what "a better overview of the result of unit test"
means. Also, the linked page is in Japanese.

Collin Winter

From collinw at gmail.com  Wed May 16 18:04:38 2007
From: collinw at gmail.com (Collin Winter)
Date: Wed, 16 May 2007 09:04:38 -0700
Subject: [Python-3000] Support for PEP 3131 (some links to evidence of
	usage within communities)
In-Reply-To: <19dd68ba0705141818w62c942b7g576016fcd3cc0ac1@mail.gmail.com>
References: <19dd68ba0705141818w62c942b7g576016fcd3cc0ac1@mail.gmail.com>
Message-ID: <43aa6ff70705160904p104962b3nb142a1e08bb68b78@mail.gmail.com>

On 5/14/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> Found some evidence of usage of identifiers in Japanese while doing a
> quick google search
>
> All links below are in Japanese.

I have absolutely no way of evaluating the content of these links.
Testimonials that I can't read are less than interesting.

Collin Winter

From murman at gmail.com  Wed May 16 18:15:00 2007
From: murman at gmail.com (Michael Urman)
Date: Wed, 16 May 2007 11:15:00 -0500
Subject: [Python-3000] Support for PEP 3131 (some links to evidence of
	usage within communities)
In-Reply-To: <43aa6ff70705160904p104962b3nb142a1e08bb68b78@mail.gmail.com>
References: <19dd68ba0705141818w62c942b7g576016fcd3cc0ac1@mail.gmail.com>
	<43aa6ff70705160904p104962b3nb142a1e08bb68b78@mail.gmail.com>
Message-ID: <dcbbbb410705160915t6967d2f6s894f5a64ecd0ff61@mail.gmail.com>

On 5/16/07, Collin Winter <collinw at gmail.com> wrote:
> On 5/14/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> > Found some evidence of usage of identifiers in Japanese while doing a
> > quick google search
> >
> > All links below are in Japanese.
>
> I have absolutely no way of evaluating the content of these links.
> Testimonials that I can't read are less than interesting.

See the web page option, Japanese to English BETA:
http://translate.google.com/translate_t

-- 
Michael Urman

From collinw at gmail.com  Wed May 16 18:26:29 2007
From: collinw at gmail.com (Collin Winter)
Date: Wed, 16 May 2007 09:26:29 -0700
Subject: [Python-3000] Support for PEP 3131 (some links to evidence of
	usage within communities)
In-Reply-To: <19dd68ba0705160909y47a1eb4cpcabc9d53a8581b6e@mail.gmail.com>
References: <19dd68ba0705141818w62c942b7g576016fcd3cc0ac1@mail.gmail.com>
	<43aa6ff70705160904p104962b3nb142a1e08bb68b78@mail.gmail.com>
	<19dd68ba0705160909y47a1eb4cpcabc9d53a8581b6e@mail.gmail.com>
Message-ID: <43aa6ff70705160926n59aa9c4el9b165b074e172e4c@mail.gmail.com>

On 5/16/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> Hi Collin,
>
> You express the same frustration as people who can't read English.
> You feel the same as Japanese people faced with an impenetrable wall
> of English...

Presumably people who don't speak English aren't provided
English-language reading materials as evidence during an
otherwise-Japanese discussion.

>  > Testimonials that I can't read are less than interesting.
>
> You seem to be unable to open your mind to other cultures.

I speak three languages. It's insulting to allege that my opposition
to this proposal is somehow based in English-language cultural
imperialism or some other politically-correct nonsense.

Collin Winter

From guido at python.org  Wed May 16 18:39:07 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 09:39:07 -0700
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
	users group
In-Reply-To: <43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
	<43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
Message-ID: <ca471dc20705160939w2f4dc0c6se98f1fb3cfb14afc@mail.gmail.com>

On 5/16/07, Collin Winter <collinw at gmail.com> wrote:
> On 5/15/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> > Just to let you know that a discussion on japanese python users group
> > is going on regarding this issue.
> >
> > Most people feel like the PEP3131 would be a welcome addition.
> > -> some people point out the fact that special characters like the
> > greek letters would be great for all kind of  maths calculation.
> > -> Many people think that this would enable them to make their own DSL
>
> Oooh, and we could use actual lambdas instead of the lambda keyword. </sarcasm>

Calm down, Collin. You know full well that that is not in the PEP and
if it were I'd be the first to reject it.

> So now we've made the jump from "help (some) international users" to
> "I want to use unicode characters just for the hell of it".

Down that road lies Perl 6. We need to give the world a sane alternative.

> > -> unittest - very useful to give a better overview of the result of
> > unit test. People pointed us at a Visual C# MVP tutorial
> >   http://www.atmarkit.co.jp/fdotnet/nagile/nagile02/nagile02_03.html
>
> I don't know what "a better overview of the result of unit test"
> means. Also, the linked page is in Japanese.

I've just ignored the pages in Japanese, except as proof that there
*are* people out there who like to discuss programming in their native
language which isn't English. I say more power to them.

Just to clarify my position to those who might think I have gone soft:
the standard library (with the exception of test modules specifically
aimed at testing this feature) should continue to use ASCII
exclusively for identifiers, English exclusively for comments and
messages, and should limit the use of non-ASCII characters in comments
and string literals to the names of contributors. Where names are
written using an alphabet that is not the Latin alphabet, a Latin
translation should be given alongside. I'd like to see this added to
both PEP 3131 and, for good measure, to PEP 8, the style guide (which
ought to be self-contained, and has a wider applicability than just
the standard library).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jason.orendorff at gmail.com  Wed May 16 18:44:50 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Wed, 16 May 2007 12:44:50 -0400
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
	users group
In-Reply-To: <43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
	<43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
Message-ID: <bb8868b90705160944x674042cfkfd59bbaaaae610cb@mail.gmail.com>

On 5/16/07, Collin Winter <collinw at gmail.com> wrote:
> > -> unittest - very useful to give a better overview of the result of
> > unit test. People pointed us at a Visual C# MVP tutorial
> >   http://www.atmarkit.co.jp/fdotnet/nagile/nagile02/nagile02_03.html
>
> I don't know what "a better overview of the result of unit test"
> means. Also, the linked page is in Japanese.

The page illustrates how a unit test can serve as an
executable specification.  The third box of code is a
TestFixture class with methods like this one:

>>  [Test][ExpectedException(typeof(ArgumentException))]
>>  public void ???????????()
>>  {
>>    Date date = new Date(0, 1, 1);
>>  }

The name translates to something like "if the year is
less than one, it's an error".  Interesting.  Kind of a weird
thing to do; ordinarily you wouldn't want method names
that take so long to type.  But a unit test method is a
special case.
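
For illustration, a rough Python 3 / PEP 3131 equivalent of that style.
The Date class below is a toy stand-in, and the Japanese method name
(roughly "error if the year is less than one") is made up for the sketch:

import unittest

class Date:
    """Toy stand-in for the Date class in the C# example."""
    def __init__(self, year, month, day):
        if year < 1:
            raise ValueError("year must be >= 1")
        self.year, self.month, self.day = year, month, day

class DateTest(unittest.TestCase):
    def test_年が1未満ならエラー(self):   # "error if the year is less than 1"
        with self.assertRaises(ValueError):
            Date(0, 1, 1)

if __name__ == "__main__":
    unittest.main()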

The mix of Japanese and English is not as visually
jarring as I expected.  It actually looks kinda cool. :)

-j

From guido at python.org  Wed May 16 18:49:16 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 09:49:16 -0700
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
	users group
In-Reply-To: <bb8868b90705160944x674042cfkfd59bbaaaae610cb@mail.gmail.com>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
	<43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
	<bb8868b90705160944x674042cfkfd59bbaaaae610cb@mail.gmail.com>
Message-ID: <ca471dc20705160949h293612d6i23fb91c7b2f0d6d3@mail.gmail.com>

On 5/16/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> On 5/16/07, Collin Winter <collinw at gmail.com> wrote:
> > > -> unittest - very useful to give a better overview of the result of
> > > unit test. People pointed us at a Visual C# MVP tutorial
> > >   http://www.atmarkit.co.jp/fdotnet/nagile/nagile02/nagile02_03.html
> >
> > I don't know what "a better overview of the result of unit test"
> > means. Also, the linked page is in Japanese.
>
> The page illustrates how a unit test can serve as an
> executable specification.  The third box of code is a
> TestFixture class with methods like this one:
>
> >>  [Test][ExpectedException(typeof(ArgumentException))]
> >>  public void ???????????()
> >>  {
> >>    Date date = new Date(0, 1, 1);
> >>  }
>
> The name translates to something like "if the year is
> less than one, it's an error".  Interesting.  Kind of a weird
> thing to do; ordinarily you wouldn't want method names
> that take so long to type.  But a unit test method is a
> special case.
>
> The mix of Japanese and English is not as visually
> jarring as I expected.  It actually looks kinda cool. :)

Thanks for the translation!

This meshes nicely with a pattern I've only recently learned in unit
testing -- using long descriptive names for tests so the name of the
test indicates the tested behavior (as opposed to, say, the name of
the class or method being tested).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From hfoffani at gmail.com  Wed May 16 18:51:51 2007
From: hfoffani at gmail.com (Hernan M Foffani)
Date: Wed, 16 May 2007 18:51:51 +0200
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
	users group
In-Reply-To: <43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
	<43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
Message-ID: <11fab4bc0705160951v3a0577eahdf00081ebd8a9032@mail.gmail.com>

2007/5/16, Collin Winter <collinw at gmail.com>:
> On 5/15/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> > Just to let you know that a discussion on japanese python users group
> > is going on regarding this issue.
> >
> > Most people feel like the PEP3131 would be a welcome addition.
> > -> some people point out the fact that special characters like the
> > greek letters would be great for all kind of  maths calculation.
> > -> Many people think that this would enable them to make their own DSL
>
> Oooh, and we could use actual lambdas instead of the lambda keyword. </sarcasm>
>
> So now we've made the jump from "help (some) international users" to
> "I want to use unicode characters just for the hell of it".

I understand that the acronym DSL is not the right choice in this
discussion because it already has a well-known meaning.  What I do
believe is that the proposal will help users implement their
solutions using the same words they already use in their domain.

> > -> unittest - very useful to give a better overview of the result of
> > unit test. People pointed us at a Visual C# MVP tutorial
> >   http://www.atmarkit.co.jp/fdotnet/nagile/nagile02/nagile02_03.html
>
> I don't know what "a better overview of the result of unit test"
> means. Also, the linked page is in Japanese.

Please, Guillaume, correct me if I'm wrong, but my understanding
is that one of the participants is proposing the use of natural language
for test names, making the correspondence between the specification
and the test case as close as possible.  Thus, you can use current
unittest tools to show, for instance, the state of your project.

From steven.bethard at gmail.com  Wed May 16 18:55:57 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed, 16 May 2007 10:55:57 -0600
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>
Message-ID: <d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>

On 5/16/07, Guido van Rossum <guido at python.org> wrote:
> Walter Doerwald, in private mail, reminded me of a third use case for
> raw strings: docstrings containing example code using backslashes.
> Here it really seems wrong to interpolate \u and \U.
>
> So this is swaying me towards changing this behavior: r"\u1234" will
> be a string of length 6, and r"\U00012345" one of length 10.

+1 for making raw strings truly raw (where backslashes don't escape
anything) and teaching the re module about the necessary escapes (\u,
\n, \r, etc.).

> I'm still on the fence about the trailing backslash; I personally
> prefer to write Windows paths using regular strings and doubled
> backslashes.

+1 for no escaping of quotes in raw strings.  Python provides so many
different ways to quote a string, the cases in which you can't just
switch to another quoting style are vanishingly small.  Examples from
the stdlib and their translations::

    '\'' --> "'"
    '("|\')' --> '''("|')'''
    'Can\'t stat' --> "Can't stat"
    '(\'[^\']*\'|"[^"]*")?' --> '''('[^']*'|"[^"]*")?'''

Note that allowing trailing backslashes could also clean up stuff in
modules like ntpath::

    path[-1] in "/\\" --> path[-1] in r"/\"
    firstTwo == '\\\\' --> firstTwo == r'\\'


STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From guido at python.org  Wed May 16 19:05:45 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 10:05:45 -0700
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>
	<d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>
Message-ID: <ca471dc20705161005r684c359dg8fa7b78355cb4ccc@mail.gmail.com>

On 5/16/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 5/16/07, Guido van Rossum <guido at python.org> wrote:
> > Walter Doerwald, in private mail, reminded me of a third use case for
> > raw strings: docstrings containing example code using backslashes.
> > Here it really seems wrong to interpolate \u and \U.
> >
> > So this is swaying me towards changing this behavior: r"\u1234" will
> > be a string of length 6, and r"\U00012345" one of length 10.
>
> +1 for making raw strings truly raw (where backslashes don't escape
> anything) and teaching the re module about the necessary escapes (\u,
> \n, \r, etc.).

It already knows about all of those except \u and \U. Someone care to
submit a patch?
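
To make the point concrete: in a raw pattern the backslash reaches re
untouched, and re resolves the escapes it already understands itself:

import re

assert re.search(r"\n", "line one\nline two")   # re interprets \n in the pattern
assert re.search(r"\t", "col1\tcol2")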

> > I'm still on the fence about the trailing backslash; I personally
> > prefer to write Windows paths using regular strings and doubled
> > backslashes.
>
> +1 for no escaping of quotes in raw strings.  Python provides so many
> different ways to quote a string, the cases in which you can't just
> switch to another quoting style are vanishingly small.  Examples from
> the stdlib and their translations::
>
>     '\'' --> "'"
>     '("|\')' --> '''("|')'''
>     'Can\'t stat' --> "Can't stat"
>     '(\'[^\']*\'|"[^"]*")?' --> '''('[^']*'|"[^"]*")?'''
>
> Note that allowing trailing backslashes could also clean up stuff in
> modules like ntpath::
>
>     path[-1] in "/\\" --> path[-1] in r"/\"
>     firstTwo == '\\\\' --> firstTwo == r'\\'

Can you also search for how often this feature is *used* (i.e. a raw
string that has to be raw for other reasons also contains an escaped
quote)? If that's rare or we can agree on easy fixes it would ease my
mind about this part of the proposal.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gproux+py3000 at gmail.com  Wed May 16 19:14:03 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Thu, 17 May 2007 02:14:03 +0900
Subject: [Python-3000] Support for PEP 3131 (some links to evidence of
	usage within communities)
In-Reply-To: <43aa6ff70705160926n59aa9c4el9b165b074e172e4c@mail.gmail.com>
References: <19dd68ba0705141818w62c942b7g576016fcd3cc0ac1@mail.gmail.com>
	<43aa6ff70705160904p104962b3nb142a1e08bb68b78@mail.gmail.com>
	<19dd68ba0705160909y47a1eb4cpcabc9d53a8581b6e@mail.gmail.com>
	<43aa6ff70705160926n59aa9c4el9b165b074e172e4c@mail.gmail.com>
Message-ID: <19dd68ba0705161014x11beea5j679334204e211018@mail.gmail.com>

Hi Collin,

Sorry did not mean to hurt your feelings.

On 5/17/07, Collin Winter <collinw at gmail.com> wrote:
> I speak three languages. It's insulting to allege that my opposition
> to this proposal is somehow based in English-language cultural
> imperialism or some other politically-correct nonsense.

I was just trying to point out that the people most likely to be
impacted by PEP 3131 are *exactly* the ones who will not write up pages
in English or any Latin-script language. I did not mean to attack your
knowledge or capabilities.

My point is: if you really want to understand the benefit of PEP 3131
for Japanese people as seen through their eyes, I believe you have no
choice but to either learn Japanese or get by with automatic
Japanese-to-English translation tools, and that you should not complain
about the links being in Japanese: this is exactly why people would love
Python being able to speak their own language.


Regards,

Guillaume

From benji at benjiyork.com  Wed May 16 19:41:11 2007
From: benji at benjiyork.com (Benji York)
Date: Wed, 16 May 2007 13:41:11 -0400
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <ca471dc20705152146p382ac396u5f7dc4b4f138cd26@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>	
	<4648D626.1030201@benjiyork.com>
	<ca471dc20705152146p382ac396u5f7dc4b4f138cd26@mail.gmail.com>
Message-ID: <464B4237.4090802@benjiyork.com>

Guido van Rossum wrote:
> On 5/14/07, Benji York <benji at benjiyork.com> wrote:
>> Collin Winter wrote:
>>> PEP: 3133
>>> Title: Introducing Roles
>> Everything included here is included in zope.interface.  See in-line
>> comments below for the analogs.
> 
> Could you look at PEP 3119 and do a similar analysis?

Sure.

> I expect that
> the main thing missing there is that it (currently) has no way to
> claim that a particular *object* has a certain behavior.

Is "it" in that sentence the ABC PEP or zope.interface?

> PEP 3119 currently also doesn't have a
> verification step -- but this could easily be added as an (optional)
> part of the registration call.

I don't care much for verification.  People using zope.interface have 
found that writing good tests is superior to on-demand verification, and 
I suspect execution time verification is a non-starter because of the 
overhead (not to mention its actual desirability, or lack thereof).

> BTW I think Collin made a mistake when he claimed that the Doglike
> role should throw a tantrum just because the actual bark()
> implementation has an optional extra argument

Agreed.
-- 
Benji York
http://benjiyork.com

From guido at python.org  Wed May 16 19:50:07 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 10:50:07 -0700
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <464B4237.4090802@benjiyork.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
	<4648D626.1030201@benjiyork.com>
	<ca471dc20705152146p382ac396u5f7dc4b4f138cd26@mail.gmail.com>
	<464B4237.4090802@benjiyork.com>
Message-ID: <ca471dc20705161050w3f99d762r68b90bd5253c1b8c@mail.gmail.com>

On 5/16/07, Benji York <benji at benjiyork.com> wrote:
> Guido van Rossum wrote:
> > On 5/14/07, Benji York <benji at benjiyork.com> wrote:
> >> Collin Winter wrote:
> >>> PEP: 3133
> >>> Title: Introducing Roles
> >> Everything included here is included in zope.interface.  See in-line
> >> comments below for the analogs.
> >
> > Could you look at PEP 3119 and do a similar analysis?
>
> Sure.
>
> > I expect that
> > the main thing missing there is that it (currently) has no way to
> > claim that a particular *object* has a certain behavior.
>
> Is "it" in that sentence the ABC PEP or zope.interface?

The ABC PEP.

> > PEP 3119 currently also doesn't have a
> > verification step -- but this could easily be added as an (optional)
> > part of the registration call.
>
> I don't care much for verification.  People using zope.interface have
> found that writing good tests is superior to on-demand verification, and
> I suspect execution time verification is a non-starter because of the
> overhead (not to mention its actual desirability, or lack thereof).

I don't care much for it either. If zope.interface users don't care
for it either, I'm happy to declare it a non-use case. I was just
thinking of how to "sell" ABCs as an alternative to current happy
users of zope.interface.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com  Wed May 16 20:32:37 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed, 16 May 2007 12:32:37 -0600
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <ca471dc20705161005r684c359dg8fa7b78355cb4ccc@mail.gmail.com>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>
	<d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>
	<ca471dc20705161005r684c359dg8fa7b78355cb4ccc@mail.gmail.com>
Message-ID: <d11dcfba0705161132o227fd065y2c7fef01c0824822@mail.gmail.com>

On 5/16/07, Guido van Rossum <guido at python.org> wrote:
> On 5/16/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> > +1 for no escaping of quotes in raw strings.  Python provides so many
> > different ways to quote a string, the cases in which you can't just
> > switch to another quoting style are vanishingly small.  Examples from
> > the stdlib and their translations::
> >
> >     '\'' --> "'"
> >     '("|\')' --> '''("|')'''
> >     'Can\'t stat' --> "Can't stat"
> >     '(\'[^\']*\'|"[^"]*")?' --> '''('[^']*'|"[^"]*")?'''
> >
> > Note that allowing trailing backslashes could also clean up stuff in
> > modules like ntpath::
> >
> >     path[-1] in "/\\" --> path[-1] in r"/\"
> >     firstTwo == '\\\\' --> firstTwo == r'\\'
>
> Can you also search for how often this feature is *used* (i.e. a raw
> string that has to be raw for other reasons also contains an escaped
> quote)? If that's rare or we can agree on easy fixes it would ease my
> mind about this part of the proposal.

Well, remembering that when you escape a quote in a raw string, the
backslash is left in regardless of the enclosing quote type, e.g.::

    r"\"" == r'\"' == r"""\"""" == r'''\"''' == '\\"'

the question is then whether there are any situations where you can't
just switch the quote type. The only things in the stdlib that I could
find[1] where the string quotes and the escaped quote were of the same
type were:

    r"^\s*=\s*\"([^\"\\]*(?:\\.[^\"\\]*)*)\""
    r"([\"\\])"
    r'[^\\\'\"%s ]*'
    r'#\s*doctest:\s*([^\n\'"]*)$',
    r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]*))?'
    r"([^.'\"\\#]\b|^)"
    r'(\'[^\']*\'|"[^"]*")\s*'
    r'((\\[\\abfnrtv\'"]|\\[0-9]..|\\x..|\\u....)+)',
    r'(\'[^\']*\'|"[^"]*"|[][\-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~\'"@]*))?'
    r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w))'
    r'[\"\']?'
    r'[ \(\)<>@,;:\\"/\[\]\?=]'
    r"[&<>\"\x80-\xff]+"

I believe every one of these would continue to work if you simply
replaced r'...' or r"..." with r'''...''', that is, if you used the
triple-quoted version. Even some much nastier ones than what's in the
stdlib (e.g. where the string starts and ends with different quote
types) seem to work out okay when you switch to the appropriate triple
quotes::

    r'\'\"' == r'''\'\"'''
    r'"\'' == r""""\'"""

I actually wasn't able to find something I couldn't translate.  It
would be helpful to have another set of eyes if anyone has the time.

[1] I skipped the tests dir because I'm lazy. ;-)

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From bwinton at latte.ca  Wed May 16 20:51:54 2007
From: bwinton at latte.ca (Blake Winton)
Date: Wed, 16 May 2007 14:51:54 -0400
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
 users group
In-Reply-To: <bb8868b90705160944x674042cfkfd59bbaaaae610cb@mail.gmail.com>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>	<43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
	<bb8868b90705160944x674042cfkfd59bbaaaae610cb@mail.gmail.com>
Message-ID: <464B52CA.9030001@latte.ca>

Jason Orendorff wrote:
> On 5/16/07, Collin Winter <collinw at gmail.com> wrote:
>>>  [Test][ExpectedException(typeof(ArgumentException))]
>>>  public void ???????????()
>>>  {
>>>    Date date = new Date(0, 1, 1);
>>>  }
> The mix of Japanese and English is not as visually
> jarring as I expected.  It actually looks kinda cool. :)

I agree, but that particular example kind of worried me, since in my
browser's font, ? looks a lot like ( followed by some other Japanese
character.  I spent a couple of minutes looking for the closing paren
before realizing that it wasn't what I thought it was...  Of course, I
have the same problem in English, with "rn" looking a lot like "m"
sometirnes.  (In a related story, a friend of mine mentioned she was on
the Pom-pom squad in high-school.)

Later,
Blake.


From guido at python.org  Wed May 16 21:14:36 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 12:14:36 -0700
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
	users group
In-Reply-To: <464B52CA.9030001@latte.ca>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
	<43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
	<bb8868b90705160944x674042cfkfd59bbaaaae610cb@mail.gmail.com>
	<464B52CA.9030001@latte.ca>
Message-ID: <ca471dc20705161214p6ddc5de1o1a32af049ded2ff2@mail.gmail.com>

Yeah, I've decided for myself that similar-looking characters are a
non-issue. They are a real problem in domain names because spammers
use them to fool users into believing they're going to the real ebay.
But source code just doesn't have that attack model. There are lots of
characters that look the same already -- 1/l/I, o/O/0, in some fonts
{/( and )/}. We deal with them.

--Guido

On 5/16/07, Blake Winton <bwinton at latte.ca> wrote:
> Jason Orendorff wrote:
> > On 5/16/07, Collin Winter <collinw at gmail.com> wrote:
> >>>  [Test][ExpectedException(typeof(ArgumentException))]
> >>>  public void ???????????()
> >>>  {
> >>>    Date date = new Date(0, 1, 1);
> >>>  }
> > The mix of Japanese and English is not as visually
> > jarring as I expected.  It actually looks kinda cool. :)
>
> I agree, but that particular example kind of worried me, since in my
> browser's font, ? looks a lot like ( followed by some other Japanese
> character.  I spent a couple of minutes looking for the closing paren
> before realizing that it wasn't what I thought it was...  Of course, I
> have the same problem in English, with "rn" looking a lot like "m"
> sometirnes.  (In a related story, a friend of mine mentioned she was on
> the Pom-pom squad in high-school.)
>
> Later,
> Blake.
>
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rrr at ronadam.com  Wed May 16 22:01:01 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 16 May 2007 15:01:01 -0500
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <d11dcfba0705161132o227fd065y2c7fef01c0824822@mail.gmail.com>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>	<d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>	<ca471dc20705161005r684c359dg8fa7b78355cb4ccc@mail.gmail.com>
	<d11dcfba0705161132o227fd065y2c7fef01c0824822@mail.gmail.com>
Message-ID: <464B62FD.4070400@ronadam.com>

Steven Bethard wrote:

> I actually wasn't able to find something I couldn't translate.  It
> would be helpful to have another set of eyes if anyone has the time.

I have a patch against (*) 2.6 tokenize.py that ignores '\' characters in 
raw strings.  This has two effects.  A matching quote, """, ''', ", ', of 
the type that started the string closes the string even if it is preceded 
by a backslash, and a backslash can end a raw string.  No changes to 
regular string behavior were made.

I'll try to make a patch against the python 3000 branch and upload it so it 
can be used for testing.  (Unless of course someone else has already done it.)

Ron


* I didn't have the python 3000 branch on my computer at the time.

From guido at python.org  Wed May 16 22:29:04 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 13:29:04 -0700
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <464B62FD.4070400@ronadam.com>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>
	<d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>
	<ca471dc20705161005r684c359dg8fa7b78355cb4ccc@mail.gmail.com>
	<d11dcfba0705161132o227fd065y2c7fef01c0824822@mail.gmail.com>
	<464B62FD.4070400@ronadam.com>
Message-ID: <ca471dc20705161329h2b7ae659qe21ad594e2939d6a@mail.gmail.com>

That would be great! This will automatically turn \u1234 into 6
characters, right?

Perhaps you could make the patch against the py3k-struni branch
instead of against the regular p3yk (sic) branch?

On 5/16/07, Ron Adam <rrr at ronadam.com> wrote:
> Steven Bethard wrote:
>
> > I actually wasn't able to find something I couldn't translate.  It
> > would be helpful to have another set of eyes if anyone has the time.
>
> I have a patch against (*) 2.6 tokenize.py that ignores '\' characters in
> raw strings.  This has two effects.  A matching quote, """, ''', ", ', of
> the type that started the string closes the string even if it is preceded
> by a backslash, and a backslash can end a raw string.  No changes to
> regular string behavior were made.
>
> I'll try to make a patch against the python 3000 branch and upload it so it
> can be used for testing.  (Unless of course someone else has already done it.)
>
> Ron
>
>
> * I didn't have the python 3000 branch on my computer at the time.
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rrr at ronadam.com  Wed May 16 23:05:57 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 16 May 2007 16:05:57 -0500
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <ca471dc20705161329h2b7ae659qe21ad594e2939d6a@mail.gmail.com>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>	
	<d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>	
	<ca471dc20705161005r684c359dg8fa7b78355cb4ccc@mail.gmail.com>	
	<d11dcfba0705161132o227fd065y2c7fef01c0824822@mail.gmail.com>	
	<464B62FD.4070400@ronadam.com>
	<ca471dc20705161329h2b7ae659qe21ad594e2939d6a@mail.gmail.com>
Message-ID: <464B7235.20500@ronadam.com>

Guido van Rossum wrote:
> That would be great! This will automatically turn \u1234 into 6
> characters, right?

I'm not exactly clear when the '\uxxxx' characters get converted.  There 
isn't any conversion done in tokenize.c that I can see.  It's primarily 
concerned with finding the beginning and end of the string at that 
point.  It looks like everything between the beginning and end is just 
passed along "as is" and translated later in the chain.

(I had said tokenize.py earlier; I meant tokenize.c.)


> Perhaps you could make the patch against the py3k-struni branch
> instead of against the regular p3yk (sic) branch?

I can do that.  :-)



> On 5/16/07, Ron Adam <rrr at ronadam.com> wrote:
>> Steven Bethard wrote:
>>
>> > I actually wasn't able to find something I couldn't translate.  It
>> > would be helpful to have another set of eyes if anyone has the time.
>>
>> I have a patch against (*) 2.6 tokenize.py that ignores '\' characters in
>> raw strings.  This has two effects.  A matching quote, """, ''', ", ', of
>> the type that started the string closes the string even if it is preceded
>> by a backslash, and a backslash can end a raw string.  No changes to
>> regular string behavior were made.
>>
>> I'll try to make a patch against the python 3000 branch and upload it
>> so it
>> can be used for testing.  (Unless of course someone else has already
>> done it.)
>>
>> Ron
>>
>>
>> * I didn't have the python 3000 branch on my computer at the time.
>> _______________________________________________
>> Python-3000 mailing list
>> Python-3000 at python.org
>> http://mail.python.org/mailman/listinfo/python-3000
>> Unsubscribe: 
>> http://mail.python.org/mailman/options/python-3000/guido%40python.org
>>
> 
> 


From guido at python.org  Wed May 16 23:10:17 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 14:10:17 -0700
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <464B7235.20500@ronadam.com>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>
	<d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>
	<ca471dc20705161005r684c359dg8fa7b78355cb4ccc@mail.gmail.com>
	<d11dcfba0705161132o227fd065y2c7fef01c0824822@mail.gmail.com>
	<464B62FD.4070400@ronadam.com>
	<ca471dc20705161329h2b7ae659qe21ad594e2939d6a@mail.gmail.com>
	<464B7235.20500@ronadam.com>
Message-ID: <ca471dc20705161410g88e3ab1me677e59e06fab10b@mail.gmail.com>

On 5/16/07, Ron Adam <rrr at ronadam.com> wrote:
> Guido van Rossum wrote:
> > That would be great! This will automatically turn \u1234 into 6
> > characters, right?
>
> I'm not exactly clear when the '\uxxxx' characters get converted.  There
> isn't any conversion done in tokanize.c that I can see.  It's primarily
> only concerned with finding the beginning and ending of the string at that
> point.  It looks like everything between the beginning and end is just
> passed along "as is" and it's translated further later in the chain.

OK, I think that happens in a totally different place. But it also
needs to be fixed. :-)

> (I had said earlier tokanize.py,  meant tokanize.c)

Well, actually, tokenize.py also needs adjustments to support this...

> > Perhaps you could make the patch against the py3k-struni branch
> > instead of against the regular p3yk (sic) branch?
>
> I can do that.  :-)

Great!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rrr at ronadam.com  Thu May 17 00:42:24 2007
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 16 May 2007 17:42:24 -0500
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <ca471dc20705161329h2b7ae659qe21ad594e2939d6a@mail.gmail.com>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>	
	<d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>	
	<ca471dc20705161005r684c359dg8fa7b78355cb4ccc@mail.gmail.com>	
	<d11dcfba0705161132o227fd065y2c7fef01c0824822@mail.gmail.com>	
	<464B62FD.4070400@ronadam.com>
	<ca471dc20705161329h2b7ae659qe21ad594e2939d6a@mail.gmail.com>
Message-ID: <464B88D0.6080309@ronadam.com>

Guido van Rossum wrote:
> That would be great! This will automatically turn \u1234 into 6
> characters, right?
> 
> Perhaps you could make the patch against the py3k-struni branch
> instead of against the regular p3yk (sic) branch?

Done.  Patch number 1720390

https://sourceforge.net/tracker/index.php?func=detail&aid=1720390&group_id=5470&atid=305470

This doesn't include the changes to library strings needed to pass 
all the tests.  That's mostly changing single quotes to triple quotes when 
a string contains both quote characters.  I'll make a second patch that 
includes those.

Cheers,
    Ron

From rasky at develer.com  Thu May 17 01:29:25 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Thu, 17 May 2007 01:29:25 +0200
Subject: [Python-3000] PEP 3124 - more commentary
In-Reply-To: <4649976A.1030301@canterbury.ac.nz>
References: <ca471dc20705141125o183bec55pa18124ac47c4a810@mail.gmail.com>	<20070514192423.624D63A4036@sparrow.telecommunity.com>	<ca471dc20705141247j166a7b8bi76afa0186a925086@mail.gmail.com>	<20070514214915.C361C3A4036@sparrow.telecommunity.com>	<ca471dc20705141543m6ad00b2ame7698080ba94367e@mail.gmail.com>	<20070514232017.BA6A43A4036@sparrow.telecommunity.com>	<ca471dc20705141717m34a9b20dp7567c06af8a977f9@mail.gmail.com>	<20070515003354.194B83A4036@sparrow.telecommunity.com>	<ca471dc20705141751n41f904depe90d08947c7ed73d@mail.gmail.com>	<20070515014338.8D8EA3A4036@sparrow.telecommunity.com>
	<4649976A.1030301@canterbury.ac.nz>
Message-ID: <f2g44l$55m$1@sea.gmane.org>

On 15/05/2007 13.20, Greg Ewing wrote:

>> C++ and Java don't have tuples, do they?
> 
> No, but in C++ you could probably do something clever by
> overloading the comma operator if you were feeling perverse
> enough...

Well, there's also tr1::tuple :)
-- 
Giovanni Bajo


From jimjjewett at gmail.com  Thu May 17 02:19:21 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 16 May 2007 20:19:21 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <bb8868b90705130639p69b719e8h2f40863e44547106@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org>
	<bb8868b90705130639p69b719e8h2f40863e44547106@mail.gmail.com>
Message-ID: <fb6fbf560705161719m6e7f1c9cka4e843297d932aea@mail.gmail.com>

On 5/13/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> I think the gesture alone is worth it, even if no one ever used the
> feature productively.  But people will.  The cost to python-dev is low,
> and the cost to English-speaking users is very likely zero.

> What am I missing?

Additional costs:

(1)  Security concerns.

Offhand, I'm not sure how to exploit it, but I could imagine scenarios, such as

    if var<limit:

where "var<limit" turned out to be an identifier (using a character
that looked like "<") rather than a comparison.

(2)  Obscure bugs.

I have seen code that did the wrong thing because a method override
(or global variable name) was misspelled.  You can argue that it was
sloppy code, but that sort of thing would be more common when the
programmer couldn't tell the difference visually.  (Just as today's
typos are more likely to involve "0" and "O" than "T" and "5")
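
A contrived sketch of that failure mode, using a Cyrillic letter (U+0430)
that renders like a Latin 'a'; the class names are invented:

class Report:
    def total(self):
        return 100

class DiscountedReport(Report):
    # The 'а' in 'totаl' below is CYRILLIC SMALL LETTER A (U+0430), so this
    # defines a brand-new method instead of overriding total().
    def totаl(self):
        return 90

print(DiscountedReport().total())   # 100 -- the "override" silently never happened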

Guillaume has pointed out that people whose native language isn't
written in Latin characters already have this problem, but it is a
problem they already learn to deal with as part of learning to
program.

-jJ

From guido at python.org  Thu May 17 02:25:49 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2007 17:25:49 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705161719m6e7f1c9cka4e843297d932aea@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org>
	<bb8868b90705130639p69b719e8h2f40863e44547106@mail.gmail.com>
	<fb6fbf560705161719m6e7f1c9cka4e843297d932aea@mail.gmail.com>
Message-ID: <ca471dc20705161725g2d3222f7naf2cd9f7b81fef6f@mail.gmail.com>

As I mentioned before, I don't expect either of these will be much of
a concern. I guess tools like pylint could optionally warn if
non-ASCII characters are used.

On 5/16/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/13/07, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> > I think the gesture alone is worth it, even if no one ever used the
> > feature productively.  But people will.  The cost to python-dev is low,
> > and the cost to English-speaking users is very likely zero.
>
> > What am I missing?
>
> Additional costs:
>
> (1)  Security concerns.
>
> Offhand, I'm not sure how to exploit it, but I could imagine scenarios, such as
>
>     if var<limit:
>
> where "var<limit" turned out to be an identifier (using a character
> that looked like "<") rather than a comparison.
>
> (2)  Obscure bugs.
>
> I have seen code that did the wrong thing because a method override
> (or global variable name) was misspelled.  You can argue that it was
> sloppy code, but that sort of thing would be more common when the
> programmer couldn't tell the difference visually.  (Just as today's
> typos are more likely to involve "0" and "O" than "T" and "5")
>
> Guillaume has pointed out that people whose native language isn't
> written in Latin characters already have this problem, but it is a
> problem they already learn to deal with as part of learning to
> program.
>
> -jJ
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jimjjewett at gmail.com  Thu May 17 02:26:27 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 16 May 2007 20:26:27 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705131104r85531f3o12b7e1769d7b7140@mail.gmail.com>
References: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>
	<19dd68ba0705130925j1dd55f1boba9e1b6c036d0422@mail.gmail.com>
	<43aa6ff70705131009s7d5b177dmea7c790d670ac3c0@mail.gmail.com>
	<1d85506f0705131042q23270a91qa31ff2f3940019ed@mail.gmail.com>
	<19dd68ba0705131104r85531f3o12b7e1769d7b7140@mail.gmail.com>
Message-ID: <fb6fbf560705161726s123a2e19xeaf2321db9607f33@mail.gmail.com>

On 5/13/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> HI Tomer,

> > if ??????.?????:
> >     pass

> > which comes first? does it say bacon.eggs or eggs.bacon?
> > and what happens if the editor uses a dot prefixed by LTR
> > marker? the meaning is reversed, but it still looks the same!

> All that is really a *presentation* issue. And as such, an editor
> specialized in editing hebrew or arabic python should help you write
> the code you want to write.

How should I interpret:

    if ??????.spam:

Even if we restricted identifiers to a single script, the combinations
of identifiers would still have this issue.

> Additionally, would a professional programmer choose to add LTR markers
> to make the source code ambiguous?

Maybe they're trying to inject a security breach?  Unicode identifiers
do make auditing by inspection harder.

> > you can always translate or transliterate a word to english, like so:
> > if beykon.beytzim:

> Is this a bijective translation?  How good is most people's Latin
> character reading ability among Hebrew speakers? From the beginning, I
> can tell from experience that Japanese people have great difficulties
> reading English or even transliterated Japanese (which is never
> good anyway because of homonyms)

It could be turned into one, using a custom "encoding" codec.

-jJ

From jimjjewett at gmail.com  Thu May 17 02:31:41 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 16 May 2007 20:31:41 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <c19402930705131741n7149073w3f531a6775873ffa@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<43aa6ff70705130822q32e3971bradf80fd90ac36578@mail.gmail.com>
	<46476291.6040502@jmunch.dk>
	<c19402930705131741n7149073w3f531a6775873ffa@mail.gmail.com>
Message-ID: <fb6fbf560705161731s3e912d3bk5330458c08ad29b7@mail.gmail.com>

On 5/13/07, Arvind Singh <arvind1.singh at gmail.com> wrote:
> On 5/14/07, Anders J. Munch <2007 at jmunch.dk > wrote:

> This PEP talks about support for *identifiers*. If you need *extensive*
> vocabulary for your *identifiers*, I'd assume that you're coding something
> non-trivial (with ignorable exceptions). Such non-trivial code should be
> sharable under a _common_ language that *others* can understand as well,
> IMHO.

But that common language might well be Japanese, particularly if you
are writing for a specific customer which happens to be a Japanese
company.

>  Further, if you are doing something non-trivial, I can also assume that
> you'd be using third-party libraries. How would the code look if identifiers
> were written in various encodings?

The core of CPython prefixes its identifiers with Py_ to distinguish
them from other libraries.  I suspect that a Chinese character vs. a
Latin character would be almost as distinctive as whether or not the
identifier starts with "Py_".

-jJ

From jyasskin at gmail.com  Thu May 17 02:31:50 2007
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Thu, 17 May 2007 02:31:50 +0200
Subject: [Python-3000] Updated and simplified PEP 3141: A Type Hierarchy for
	Numbers
Message-ID: <5d44f72f0705161731j4700bdb3h4e36e97757bd6a32@mail.gmail.com>

I've updated PEP 3141 to remove the algebraic classes and bring the
numeric hierarchy much closer to Scheme's design. Let me know what you
think. Feel free to send typographical and formatting problems just to
me. My schedule's a little shaky for the next couple of weeks, but I'll
make updates as quickly as I can.



PEP: 3141
Title: A Type Hierarchy for Numbers
Version: $Revision: 54928 $
Last-Modified: $Date: 2007-04-23 16:37:29 -0700 (Mon, 23 Apr 2007) $
Author: Jeffrey Yasskin <jyasskin at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Apr-2007
Post-History: Not yet posted


Abstract
========

This proposal defines a hierarchy of Abstract Base Classes (ABCs) (PEP
3119) to represent number-like classes. It proposes a hierarchy of
``Number :> Complex :> Real :> Rational :> Integer`` where ``A :> B``
means "A is a supertype of B", and a pair of ``Exact``/``Inexact``
classes to capture the difference between ``floats`` and
``ints``. These types are significantly inspired by Scheme's numeric
tower [#schemetower]_.

Rationale
=========

Functions that take numbers as arguments should be able to determine
the properties of those numbers, and if and when overloading based on
types is added to the language, should be overloadable based on the
types of the arguments. For example, slicing requires its arguments to
be ``Integers``, and the functions in the ``math`` module require
their arguments to be ``Real``.
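
As a concrete illustration of the kind of check this enables (a sketch
written against the stdlib ``numbers`` module that grew out of this PEP;
it spells the integer ABC ``Integral``)::

    import numbers

    def nth(seq, index):
        # An indexing-style check: accept any Integral, not just int.
        if not isinstance(index, numbers.Integral):
            raise TypeError("index must be an Integral number")
        return seq[index]

    def mean(values):
        # A math-style function that only promises to work for Real inputs.
        if not all(isinstance(v, numbers.Real) for v in values):
            raise TypeError("mean() requires Real numbers")
        return sum(values) / len(values)

    print(nth("abcde", 2))       # 'c'
    print(mean([1, 2.5, 3]))     # 2.1666...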

Specification
=============

This PEP specifies a set of Abstract Base Classes with default
implementations. If the reader prefers to think in terms of Roles (PEP
3133), the default implementations for (for example) the Real ABC
would be moved to a RealDefault class, with Real keeping just the
method declarations.

Although this PEP uses terminology from PEP 3119, the hierarchy is
intended to be meaningful for any systematic method of defining sets
of classes, including Interfaces. I'm also using the extra notation
from PEP 3107 (Function Annotations) to specify some types.


Exact vs. Inexact Classes
-------------------------

Floating point values may not exactly obey several of the properties
you would expect. For example, it is possible for ``(X + -X) + 3 ==
3``, but ``X + (-X + 3) == 0``. On the range of values that most
functions deal with this isn't a problem, but it is something to be
aware of.

Therefore, I define ``Exact`` and ``Inexact`` ABCs to mark whether
types have this problem. Every instance of ``Integer`` and
``Rational`` should be Exact, but ``Reals`` and ``Complexes`` may or
may not be. (Do we really only need one of these, and the other is
defined as ``not`` the first?)::

    class Exact(metaclass=MetaABC): pass
    class Inexact(metaclass=MetaABC): pass


Numeric Classes
---------------

We begin with a Number class to make it easy for people to be fuzzy
about what kind of number they expect. This class only helps with
overloading; it doesn't provide any operations. **Open question:**
Should it specify ``__add__``, ``__sub__``, ``__neg__``, ``__mul__``,
and ``__abs__`` like Haskell's ``Num`` class?::

    class Number(metaclass=MetaABC): pass


Some types (primarily ``float``) define "Not a Number" (NaN) values
that return false for any comparison, including equality with
themselves, and are maintained through operations. Because this
doesn't work well with the Reals (which are otherwise totally ordered
by ``<``), Guido suggested we might put NaN in its own type. It is
conceivable that this can still be represented by C doubles but be
included in a different ABC at runtime. **Open issue:** Is this a good
idea?::

    class NotANumber(Number):
        """Implement IEEE 754 semantics."""
        def __lt__(self, other): return False
        def __eq__(self, other): return False
        ...
        def __add__(self, other): return self
        def __radd__(self, other): return self
        ...

Complex numbers are immutable and hashable. Implementors should be
careful that they make equal numbers equal and hash them to the same
values. This may be subtle if there are two different extensions of
the real numbers::

    class Complex(Hashable, Number):
        """A ``Complex`` should define the operations that work on the
        Python ``complex`` type. If it is given heterogeneous
        arguments, it may fall back on this class's definition of the
        operations. These operators should never return a
        TypeError as long as both arguments are instances of Complex
        (or even just implement __complex__).
        """
        @abstractmethod
        def __complex__(self):
            """This operation gives the arithmetic operations a fallback.
            """
            return complex(self.real, self.imag)
        @property
        def real(self):
            return complex(self).real
        @property
        def imag(self):
            return complex(self).imag

I define the reversed operations here so that they serve as the final
fallback for operations involving instances of Complex. **Open
issue:** Should Complex's operations check for ``isinstance(other,
Complex)``? Duck typing seems to imply that we should just try
__complex__ and succeed if it works, but stronger typing might be
justified for the operators. TODO: analyze the combinations of normal
and reversed operations with real and virtual subclasses of Complex::

        def __radd__(self, other):
            """Should this catch any type errors and return
            NotImplemented instead?"""
            return complex(other) + complex(self)
        def __rsub__(self, other):
            return complex(other) - complex(self)
        def __neg__(self):
            return -complex(self)
        def __rmul__(self, other):
            return complex(other) * complex(self)
        def __rdiv__(self, other):
            return complex(other) / complex(self)

        def __abs__(self):
            return abs(complex(self))

        def conjugate(self):
            return complex(self).conjugate()

        def __hash__(self):
            """Two "equal" values of different complex types should
            hash in the same way."""
            return hash(complex(self))


The ``Real`` ABC indicates that the value is on the real line, and
supports the operations of the ``float`` builtin. Real numbers are
totally ordered. (NaNs were handled above.)::

    class Real(Complex, metaclass=TotallyOrderedABC):
        @abstractmethod
        def __float__(self):
            """Any Real can be converted to a native float object."""
            raise NotImplementedError
        def __complex__(self):
            """Which gives us an easy way to define the conversion to
            complex."""
            return complex(float(self))
        @property
        def real(self): return self
        @property
        def imag(self): return 0

        def __radd__(self, other):
            if isinstance(other, Real):
                return float(other) + float(self)
            else:
                return super(Real, self).__radd__(other)
        def __rsub__(self, other):
            if isinstance(other, Real):
                return float(other) - float(self)
            else:
                return super(Real, self).__rsub__(other)
        def __neg__(self):
            return -float(self)
        def __rmul__(self, other):
            if isinstance(other, Real):
                return float(other) * float(self)
            else:
                return super(Real, self).__rmul__(other)
        def __rdiv__(self, other):
            if isinstance(other, Real):
                return float(other) / float(self)
            else:
                return super(Real, self).__rdiv__(other)
        def __rdivmod__(self, other):
            """Implementing divmod() for your type is sufficient to
            get floordiv and mod too.
            """
            if isinstance(other, Real):
                return divmod(float(other), float(self))
            else:
                return super(Real, self).__rdivmod__(other)
        def __rfloordiv__(self, other):
            return divmod(other, self)[0]
        def __rmod__(self, other):
            return divmod(other, self)[1]

        def __trunc__(self):
            """Do we want properfraction, floor, ceiling, and round?"""
            return trunc(float(self))

        def __abs__(self):
            return abs(float(self))

There is no way to define only the reversed comparison operators, so
these operations take precedence over any defined in the other
type. :( ::

        def __lt__(self, other):
            """The comparison operators in Python seem to be more
            strict about their input types than other functions. I'm
            guessing here that we want types to be incompatible even
            if they define a __float__ operation, unless they also
            declare themselves to be Real numbers.
            """
            if isinstance(other, Real):
                return float(self) < float(other)
            else:
                return NotImplemented

        def __le__(self, other):
            if isinstance(other, Real):
                return float(self) <= float(other)
            else:
                return NotImplemented

        def __eq__(self, other):
            if isinstance(other, Real):
                return float(self) == float(other)
            else:
                return NotImplemented


There is no built-in rational type, but it's straightforward to write,
so we provide an ABC for it::

    class Rational(Real, Exact):
        """rational.numerator and rational.denominator should be in
        lowest terms.
        """
        @property
        @abstractmethod
        def numerator(self):
            raise NotImplementedError
        @property
        @abstractmethod
        def denominator(self):
            raise NotImplementedError

        def __float__(self):
            return self.numerator / self.denominator


    class Integer(Rational):
        @abstractmethod
        def __int__(self):
            raise NotImplementedError
        def __float__(self):
            return float(int(self))
        @property
        def numerator(self): return self
        @property
        def denominator(self): return 1

        def __ror__(self, other):
            return int(other) | int(self)
        def __rxor__(self, other):
            return int(other) ^ int(self)
        def __rand__(self, other):
            return int(other) & int(self)
        def __rlshift__(self, other):
            return int(other) << int(self)
        def __rrshift__(self, other):
            return int(other) >> int(self)
        def __invert__(self):
            return ~int(self)

        def __radd__(self, other):
            """All of the Real methods need to be overridden here too
            in order to get a more exact type for their results.
            """
            if isinstance(other, Integer):
                return int(other) + int(self)
            else:
                return super(Integer, self).__radd__(other)
        ...

        def __hash__(self):
            """Surprisingly, hash() needs to be overridden too, since
            there are integers that float can't represent."""
            return hash(int(self))

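For example, with the builtin types (which an ``Integer`` implementation
needs to stay consistent with), in CPython::

    >>> n = 2**53 + 1
    >>> float(n) == n                # float cannot represent n exactly
    False
    >>> hash(n) == hash(float(n))
    False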

Adding More Numeric ABCs
------------------------

There are, of course, more possible ABCs for numbers, and this would
be a poor hierarchy if it precluded the possibility of adding
those. You can add ``MyFoo`` between ``Complex`` and ``Real`` with::

    class MyFoo(Complex): ...
    MyFoo.register(Real)

TODO(jyasskin): Check this.
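
A quick check with the stdlib ``abc`` machinery (a standalone sketch
using stand-in classes, not the classes defined above) suggests this
behaves as intended::

    from abc import ABCMeta

    class Complex(metaclass=ABCMeta): pass
    class Real(Complex): pass           # stand-ins for the ABCs above
    class MyFoo(Complex): pass

    MyFoo.register(Real)                # interpose MyFoo between Complex and Real

    assert issubclass(Real, MyFoo)      # true via the registration
    assert issubclass(MyFoo, Complex)   # true by ordinary inheritance
    assert issubclass(Real, Complex)    # still true

Note that the registration only affects ``isinstance``/``issubclass``
checks; ``Real`` does not thereby inherit any concrete methods that
``MyFoo`` defines.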


Rejected Alternatives
=====================

The initial version of this PEP defined an algebraic hierarchy
inspired by a Haskell Numeric Prelude [#numericprelude]_ including
MonoidUnderPlus, AdditiveGroup, Ring, and Field, and mentioned several
other possible algebraic types before getting to the numbers. I had
expected this to be useful to people using vectors and matrices, but
the NumPy community really wasn't interested. The numbers then had a
much more branching structure to include things like the Gaussian
Integers and Z/nZ, which could be Complex but wouldn't necessarily
support things like division. The community decided that this was too
much complication for Python, so the proposal has been scaled back to
resemble the Scheme numeric tower much more closely.

References
==========

.. [#pep3119] Introducing Abstract Base Classes
   (http://www.python.org/dev/peps/pep-3119/)

.. [#pep3107] Function Annotations
   (http://www.python.org/dev/peps/pep-3107/)

.. [3] Possible Python 3K Class Tree?, wiki page created by Bill Janssen
   (http://wiki.python.org/moin/AbstractBaseClasses)

.. [#numericprelude] NumericPrelude: An experimental alternative
   hierarchy of numeric type classes
   (http://darcs.haskell.org/numericprelude/docs/html/index.html)

.. [#schemetower] The Scheme numerical tower
   (http://www.swiss.ai.mit.edu/ftpdir/scheme-reports/r5rs-html/r5rs_8.html#SEC50)


Acknowledgements
================

Thanks to Neil Norwitz for encouraging me to write this PEP in the
first place, to Travis Oliphant for pointing out that the numpy people
didn't really care about the algebraic concepts, to Alan Isaac for
reminding me that Scheme had already done this, and to Guido van
Rossum and lots of other people on the mailing list for refining the
concept.

Copyright
=========

This document has been placed in the public domain.

From greg.ewing at canterbury.ac.nz  Thu May 17 02:48:11 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 17 May 2007 12:48:11 +1200
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>
Message-ID: <464BA64B.4020007@canterbury.ac.nz>

Guido van Rossum wrote:

> I'm still on the fence about the trailing backslash; I personally
> prefer to write Windows paths using regular strings and doubled
> backslashes.

Maybe we should have a special w"..." string in the Windows
version of Python for pathnames.

It would raise a SyntaxError in non-Windows Pythons, thus
discouraging people trying to use Windows pathnames in
cross-platform code. :-)

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From jimjjewett at gmail.com  Thu May 17 02:50:33 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 16 May 2007 20:50:33 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <1d85506f0705141212m65b9ec37q5f685f507e394f01@mail.gmail.com>
References: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>
	<19dd68ba0705130925j1dd55f1boba9e1b6c036d0422@mail.gmail.com>
	<43aa6ff70705131009s7d5b177dmea7c790d670ac3c0@mail.gmail.com>
	<1d85506f0705131042q23270a91qa31ff2f3940019ed@mail.gmail.com>
	<19dd68ba0705131104r85531f3o12b7e1769d7b7140@mail.gmail.com>
	<eae285400705141131u7cda0202p8a1fef47f5021878@mail.gmail.com>
	<1d85506f0705141212m65b9ec37q5f685f507e394f01@mail.gmail.com>
Message-ID: <fb6fbf560705161750n2e78547dl2444dcbeae6622ba@mail.gmail.com>

On 5/14/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> as an english-second-language programmer, i'd really like to be able
> to have unicode identifiers -- but my gut feeling is -- it will open the
> door for a tower of babel.

I don't think this happened in Lisp.  I won't pretend there hasn't
been a tower of babel there, but it isn't because you can use
non-ascii symbols.

You can use any character in a symbol (~= identifier), including (if
your implementation supports such characters at all, even in comments)
Hebrew or Chinese characters.

On the other hand, you have to go out of your way to use unusual
identifier characters (including latin characters, if you care about
the case); this may have contributed to the strong tendency to stick
with ascii.

-jJ

From greg.ewing at canterbury.ac.nz  Thu May 17 02:51:58 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 17 May 2007 12:51:58 +1200
Subject: [Python-3000] Support for PEP 3131 - discussion on python zope
 users group
In-Reply-To: <43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
References: <19dd68ba0705152213h7dc04e48qfc2f1ad4f5d61b99@mail.gmail.com>
	<43aa6ff70705160855s8d2edb8k9212455f0696c6f8@mail.gmail.com>
Message-ID: <464BA72E.9010204@canterbury.ac.nz>

Collin Winter wrote:

> So now we've made the jump from "help (some) international users" to
> "I want to use unicode characters just for the hell of it".

Seems to me it's more like "I want to express my algorithm
in a way that other mathematicians can easily follow".

Which isn't all that much different from "I want to express
my algorithm in a way that other speakers of my native
language can easily follow".

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From tomerfiliba at gmail.com  Thu May 17 03:06:13 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Thu, 17 May 2007 03:06:13 +0200
Subject: [Python-3000] pep 3131 again
Message-ID: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>

=== RTL/LTR ===
i pointed out already that no existing editor can handle LTR-RTL
representation correctly, which essentially renders all RTL languages
out of the scope of this PEP. that doesn't bother me personally so much,
as i'm not going to use this feature anyway, but that still leaves us with
the "european imposed colonialism" :)

the only practical way to use RTL languages in code is to have an RTL
programming language, where "if" is spelled "??", "for" as "????",
"in" as "????", and so on, and the entire program is RTL. having code
like --

for ??? in ????(1,2,3)

is simply unreadable (since the parentheses are LTR, while
the name is RTL, etc.)

=== help people who can't type english ===
since the keywords remain ASCII, along with stdlib and all other major
third party libs -- how does that help the english-illiterate programmer?

    import random
    ?? = range(100)
    random.shuffle(?? )
    ? = ??.pop(7)
    if len(?) > 58:
        print "?????!!!" #  ?? ?? ??????? ???? ??? ?????

apart from excessive visual noise, the amount of *latin* identifiers and
keywords is not negligible. if all you're trying to save is coming up with
english names for your functions, then that's okay, but saying
"japanese people have a hard time coding in the latin alphabet"
does not withstand practical usage.

the solution is an intermediate translator that lies between the programmer
and the interpreter. that -- or learning latin (it's only *26* letters :) and
transliterating japanese names with latin characters.

all in all, i'm still -1 on that. i would rather go halfway -- allow unicode
comments. let people write docs in their native language, that's all
fine with me (or is that already imposed by the UTF8 PEP?)



-tomer

From jcarlson at uci.edu  Thu May 17 03:30:29 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 16 May 2007 18:30:29 -0700
Subject: [Python-3000] pep 3131 again
In-Reply-To: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
References: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
Message-ID: <20070516182431.8591.JCARLSON@uci.edu>


"tomer filiba" <tomerfiliba at gmail.com> wrote:
> all in all, i'm still -1 on that. i would rather go halfway -- allow unicode
> comments. let people write docs in their native language, that's all
> fine with me (or is that already imposed by the UTF8 PEP?)

I could have sworn that unicode comments and docstrings are already
allowed with any suitable encoding, with or without the UTF8 default
encoding PEP.  Testing on Python 2.3 seems to confirm this.
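
For example, something like this has long been accepted (assuming a
UTF-8-capable editor; the coding declaration is what matters):

    # -*- coding: utf-8 -*-
    def greet():
        """Docstrings may contain non-ASCII text: café, naïve."""
        return 42  # and so may comments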

 - Josiah


From talin at acm.org  Thu May 17 04:30:03 2007
From: talin at acm.org (Talin)
Date: Wed, 16 May 2007 19:30:03 -0700
Subject: [Python-3000] PEP 3131 - the details
Message-ID: <464BBE2B.1050201@acm.org>

While there has been a lot of discussion as to whether to accept PEP 
3131 as a whole, there has been little discussion as to the specific 
details of the PEP. In particular, is it generally agreed that the 
Unicode character classes listed in the PEP are the ones we want to 
include in identifiers? My preference is to be conservative in terms of 
what's allowed.

-- Talin

From mike.klaas at gmail.com  Thu May 17 04:58:39 2007
From: mike.klaas at gmail.com (Mike Klaas)
Date: Wed, 16 May 2007 19:58:39 -0700
Subject: [Python-3000] pep 3131 again
In-Reply-To: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
References: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
Message-ID: <3BD9B8F2-64EE-4864-A229-8C0D7E86EE96@gmail.com>


On 16-May-07, at 6:06 PM, tomer filiba wrote:
>
> === help people who can't type english ===
> since the keywords remain ASCII, along with stdlib and all other major
> third party libs -- how does that help the english-illiterate  
> programmer?
>
>     import random
>     ?? = range(100)
>     random.shuffle(?? )
>     ? = ??.pop(7)
>     if len(?) > 58:
>         print "?????!!!" #  ?? ?? ???????  
> ???? ??? ?????
>
> apart from excessive visual noise, the amount of *latin*  
> identifiers and
> keywords is not negligible. if all you're trying to save is coming  
> up with
> english names for your functions, than that's okay, but saying
> "japanese people have a hard time coding in the latin alphabet"
> does not withstand practical usage.

It will always be harder for non-english-speaking people to learn an  
english-derived programming language.  It is somewhat specious to  
equate the difficulty of learning the keywords and (some of the)  
standard library with the difficulty of using Latin throughout.

Consider that for many languages which aren't as close to pseudocode as  
python, there is already a need to learn arbitrary symbols.  $,%,@  
have special meaning in perl, "car/cdr" in lisp, '!/&&/||/~' in c...  
these finite sets of symbols are necessary for english-speaking  
people to learn, and non-english-speaking people would (I imagine)  
apply similar rules for learning the keywords of python.

Imagine if python keywords were in english, but written using the  
phonetic symbols.  It would take a while to get used to the different  
keywords, as would learning any new symbols.  It might even be  
extremely difficult.  However, the difficulty would not be the same  
as learning to quickly write and understand english words written  
phonetically (which would be required if the phonetic alphabet were  
the canonical characters of python symbols).

I don't have experience learning to program in a foreign language,  
but it seems evident to me that the two levels of familiarity are  
substantially different.

-Mike

From foom at fuhm.net  Thu May 17 05:14:21 2007
From: foom at fuhm.net (James Y Knight)
Date: Wed, 16 May 2007 23:14:21 -0400
Subject: [Python-3000] pep 3131 again
In-Reply-To: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
References: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
Message-ID: <82E08374-B97F-4884-9D26-2F5A4CCF9392@fuhm.net>


On May 16, 2007, at 9:06 PM, tomer filiba wrote:

> === RTL/LTR ===
> i pointed out already that no existing editor can handle LTR-RTL
> representation correctly, which essentially renders all RTL languages
> out of the scope of this PEP. that doesn't bother me personally so  
> much,
> as i'm not going to use this feature anyway, but that still leaves  
> us with
> the "european imposed colonialism" :)
>
> the only practical way to use RTL languages in code is to have an RTL
> programming language, where "if" is spelled "??", "for" as  
> "????",
> "in" as "????", and so on, and the entire program is RTL. having  
> code
> like --

> for ??? in ????(1,2,3)

> is only unreadable by all means (since the parenthesis are LTR, while
> the name is RTL, etc.)

It is interesting to contrast the rendering of that (ABC being  
substitutes for hebrew characters):
for ABB in 1,2,3)ACAC)

with the rendering of:
for ??? in ????(a,b,c)
as:
for ABB in ACAC(a,b,c)

This is I suppose due to numbers and punctuation having weak  
directionality in the bidi algorithm, which isn't really appropriate  
for tokens in a programming language. So yes, clearly, an editor that  
takes into account the special needs of programming languages is  
necessary to effectively write bidi code. But it's certainly not  
inconceivable, and I don't see that the non-existence of an effective  
bidi editor should influence the decision to allow unicode characters  
in python at all. For a majority of languages that are LTR, it is not  
an issue, and I have every confidence that the bidi programming  
editor problem will be solved at some point in the future. The only  
thing python can possibly do to help with this is to ignore any RLO/ 
LRO/LRE/RLE/PDF/RLM/LRM characters it sees during tokenization.  
(probably ought to ignore anything with the  
"Default_Ignorable_Code_Point" unicode property).

This would allow a smart editor to save the text with such formatting  
characters in it, so that other "dumb" viewers would not be confused.
For example, with explicit formatting added, rendering can be made  
correct:
for ????? in ???????(1,2,3)

http://imagic.weizmann.ac.il/~dov/Hebrew/logicUI24.htm#h1-25 shows  
someone has thought about this at least a little from the editor  
perspective...

James

From tjreedy at udel.edu  Thu May 17 05:38:53 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 16 May 2007 23:38:53 -0400
Subject: [Python-3000] PEP 3131 - the details
References: <464BBE2B.1050201@acm.org>
Message-ID: <f2gioe$df5$1@sea.gmane.org>


"Talin" <talin at acm.org> wrote in message news:464BBE2B.1050201 at acm.org...
| While there has been a lot of discussion as to whether to accept PEP
| 3131 as a whole, there has been little discussion as to the specific
| details of the PEP. In particular, is it generally agreed that the
| Unicode character classes listed in the PEP are the ones we want to
| include in identifiers? My preference is to be conservative in terms of
| what's allowed.

Some questions I have: is the defined UID set the same as in the referenced 
appendix?  Is it the same as in Java (and hence Jython)?  The same as in 
.NET (and hence IronPython)?

tjr




From greg.ewing at canterbury.ac.nz  Thu May 17 05:53:26 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 17 May 2007 15:53:26 +1200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705161750n2e78547dl2444dcbeae6622ba@mail.gmail.com>
References: <1d85506f0705130833v1058b022re0597cf9f259320d@mail.gmail.com>
	<19dd68ba0705130925j1dd55f1boba9e1b6c036d0422@mail.gmail.com>
	<43aa6ff70705131009s7d5b177dmea7c790d670ac3c0@mail.gmail.com>
	<1d85506f0705131042q23270a91qa31ff2f3940019ed@mail.gmail.com>
	<19dd68ba0705131104r85531f3o12b7e1769d7b7140@mail.gmail.com>
	<eae285400705141131u7cda0202p8a1fef47f5021878@mail.gmail.com>
	<1d85506f0705141212m65b9ec37q5f685f507e394f01@mail.gmail.com>
	<fb6fbf560705161750n2e78547dl2444dcbeae6622ba@mail.gmail.com>
Message-ID: <464BD1B6.9020202@canterbury.ac.nz>

Jim Jewett wrote:

> You can use any character in a symbol (~= identifier), including (if
> your implementation supports such characters at all, even in comments)
> Hebrew or Chinese characters.

Lisp is a bit different, because it's always had only a very
few chars that aren't identifier chars, so you're used to seeing
identifiers with all sorts of junk in them. But in Python, you
tend to see anything that you don't recognise as a letter or
digit as "punctuation" and therefore non-identifier.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From g.brandl at gmx.net  Thu May 17 07:45:17 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 17 May 2007 07:45:17 +0200
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <464B7235.20500@ronadam.com>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>		<d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>		<ca471dc20705161005r684c359dg8fa7b78355cb4ccc@mail.gmail.com>		<d11dcfba0705161132o227fd065y2c7fef01c0824822@mail.gmail.com>		<464B62FD.4070400@ronadam.com>	<ca471dc20705161329h2b7ae659qe21ad594e2939d6a@mail.gmail.com>
	<464B7235.20500@ronadam.com>
Message-ID: <f2gq55$d4p$1@sea.gmane.org>

Ron Adam schrieb:
> Guido van Rossum wrote:
>> That would be great! This will automatically turn \u1234 into 6
>> characters, right?
> 
> I'm not exactly clear when the '\uxxxx' characters get converted.  There 
> isn't any conversion done in tokanize.c that I can see.  It's primarily 
> only concerned with finding the beginning and ending of the string at that 
> point.  It looks like everything between the beginning and end is just 
> passed along "as is" and it's translated further later in the chain.

Look at Python/ast.c, which has functions parsestr() and decode_unicode().
The latter calls PyUnicode_DecodeRawUnicodeEscape() which I think is the
function you're looking for.

Georg


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From foom at fuhm.net  Thu May 17 07:50:17 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 17 May 2007 01:50:17 -0400
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464BBE2B.1050201@acm.org>
References: <464BBE2B.1050201@acm.org>
Message-ID: <69B09BFE-3BF3-4532-98EA-8A7E44461D77@fuhm.net>

On May 16, 2007, at 10:30 PM, Talin wrote:
> While there has been a lot of discussion as to whether to accept PEP
> 3131 as a whole, there has been little discussion as to the specific
> details of the PEP. In particular, is it generally agreed that the
> Unicode character classes listed in the PEP are the ones we want to
> include in identifiers?

One issue I see is that the PEP defines ID_Start and ID_Continue  
itself. It should not do that, but instead reference as authoritative  
the unicode properties ID_Start and ID_Continue defined in the  
unicode property database.

ID_Start is officially: Lu+Ll+Lt+Lm+Lo+Nl+Other_ID_Start
and ID_Continue is officially: ID_Start + Mn+Mc+Nd+Pc +  
Other_ID_Continue

The only differences between PEP 3131's definition and the official  
ones are the Other_* bits. Those are there to ensure the requirement  
that anything now in ID_Start/ID_Continue will always in the future  
be in said categories. That is an important feature, and should not  
be overlooked. Without the supplemental list, a future version of  
unicode which changes the general class of a character could make a  
previously valid identifier become invalid. The list currently  
includes the following entries:

2118          ; Other_ID_Start # So       SCRIPT CAPITAL P
212E          ; Other_ID_Start # So       ESTIMATED SYMBOL
309B..309C    ; Other_ID_Start # Sk   [2] KATAKANA-HIRAGANA VOICED SOUND MARK..KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
1369..1371    ; Other_ID_Continue # No   [9] ETHIOPIC DIGIT ONE..ETHIOPIC DIGIT NINE

This list is available as part of the PropList.txt file in the  
unicode data, which ought to be included automatically in python's  
unicode database so as to get future changes.
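
A rough sketch of the corresponding checks using the stdlib unicodedata
module (ID_Start/ID_Continue themselves aren't exposed there, so the
Other_ID_* sets are approximated with the four entries listed above):

    import unicodedata

    OTHER_ID_START = {0x2118, 0x212E, 0x309B, 0x309C}
    OTHER_ID_CONTINUE = set(range(0x1369, 0x1372))   # Ethiopic digits 1-9

    def is_id_start(ch):
        return (unicodedata.category(ch) in ('Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Nl')
                or ord(ch) in OTHER_ID_START)

    def is_id_continue(ch):
        return (is_id_start(ch)
                or unicodedata.category(ch) in ('Mn', 'Mc', 'Nd', 'Pc')
                or ord(ch) in OTHER_ID_CONTINUE)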

> My preference is to be conservative in terms of what's allowed.

I do not believe it is a good idea for python to define its own  
identifier rules. The rules defined in UAX31 make sense and should be  
used directly, with only the minor amendment of _ as an allowable  
start character.

James


From talin at acm.org  Thu May 17 09:40:19 2007
From: talin at acm.org (Talin)
Date: Thu, 17 May 2007 00:40:19 -0700
Subject: [Python-3000] Updated and simplified PEP 3141: A Type Hierarchy
 for	Numbers
In-Reply-To: <5d44f72f0705161731j4700bdb3h4e36e97757bd6a32@mail.gmail.com>
References: <5d44f72f0705161731j4700bdb3h4e36e97757bd6a32@mail.gmail.com>
Message-ID: <464C06E3.2090104@acm.org>

Jeffrey Yasskin wrote:
> I've updated PEP3141 to remove the algebraic classes and bring the
> numeric hierarchy much closer to scheme's design. Let me know what you
> think. Feel free to send typographical and formatting problems just to
> me. My schedule's a little shaky the next couple weeks, but I'll make
> updates as quickly as I can.

General comments:

I need to give some background first, so be patient :)

The original version of this PEP was written at a time when ABCs were at 
an earlier stage in their conceptual development. The notion of 
overriding 'isinstance' had not yet been introduced, and so the only way 
to inherit from an ABC was by the traditional inheritance mechanism.

At that time, there were a number of proposals for adding ABC base 
classes to Python's built-in types. Those ABCs, being the foundation for 
the built-in types, would have had to be built-ins themselves, and would 
have been required to be initialized prior to the built-ins that 
depended on them. This in turn meant that those ABCs were "special", in 
the sense that they were officially sanctioned by the Python runtime 
itself. The ABCs in this PEP and in the other ABC PEPs would have been 
given a privileged status, elevated even above the classes in the 
standard library.

My feeling at the time was that I was uncomfortable with a brand new 
type hierarchy, still in a relatively immature stage of development, 
being deeply rooted into the core of Python. Being embedded into the 
interpreter means that it would be hard to experiment with different 
variations and to test out different options for the number hierarchy. 
Not only would the embedded classes be hard to change, but there would 
be no way that alternative proposals could compete.

My concern was that there would be little, if any, evolution of these 
concepts, and that we would be stuck with a set of decisions which had 
been made in haste. This is exactly contrary to the usual prescription 
for Python library modules, which are supposed to prove themselves in 
real-world apps before being enshrined in the standard library.

Now, the situation has changed somewhat. The ABC PEP has radically 
shifted its focus, de-emphasizing traditional inheritance towards a new 
mechanism which I call 'dynamic inheritance' - the ability to declare 
new inheritance relations after a class has been created.

Lets therefore assume that the numeric ABCs will use this new 
inheritance mechanism, avoiding the problem of taking an immature class 
hierarchy and setting it in stone. The ABCs in this PEP would then no 
longer need to have this privileged status; they could be replaced and 
changed at will.

Assuming that this is true, the question then becomes whether these 
classes should be treated like any other standard library submission. In 
other words, shouldn't this PEP be implemented as a separate module, and 
have to prove itself 'in the wild' before being adopted into the stdlib? 
Does this PEP even need to be a PEP at all, or can it just be a 
3rd-party library that is eventually adopted into Python?

Now, I *could* see adopting an untried library embodying untested ideas 
into the stdlib if there was a crying need for the features of such a 
library, and those needs were clearly being unfulfilled. However, I am 
not certain that this is the case here.

At the very least, I think it should be stated in the PEP whether or not 
the ABCs defined here are going to be using traditional or dynamic 
inheritance.

If it is the latter, and we decide that this PEP is going to be part of 
the stdlib, then I propose the following library organization:

    import abc              # Imports the basic ABC mechanics
    import abc.collections  # MutableSequence and such
    import abc.math         # The number hierarchy
    ... and so on

Now, there is another issue that needs to be discussed.

The classes in the PEP appear to be written with lots of mixin methods, 
such as __rsub__ and __abs__. Unfortunately, the current 
proposed method for dynamic inheritance does not allow for methods or 
properties to be inherited from the 'virtual' base class. Which means 
that all of the various methods defined in this PEP are utterly 
meaningless other than as documentation - except in the case of a new 
user-created class of numbers which inherit from these ABCs using 
traditional inheritance, which is not something that I expect to happen 
very often at all. For virtually all practical uses, the elaborate 
methods defined in this PEP will be unused and inaccessible.
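
A quick sketch with made-up names (HasDouble, Plain) that illustrates
the point:

    from abc import ABCMeta

    class HasDouble(metaclass=ABCMeta):
        def double(self):              # mixin method defined on the ABC
            return 2 * self.value

    class Plain:
        def __init__(self, value):
            self.value = value

    HasDouble.register(Plain)          # 'dynamic inheritance'

    p = Plain(3)
    assert isinstance(p, HasDouble)    # the isinstance check passes...
    assert not hasattr(p, "double")    # ...but the mixin method isn't there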

This really highlights what I think is a problem with dynamic 
inheritance, and I think that this inconsistency between traditional and 
dynamic inheritance will eventually come back to haunt us. It has always 
been the case in the past that for every property of class B, if 
isinstance(A, B) == True, then A also has that property, either 
inherited from B, or overridden in A. The fact that this invariant will 
no longer hold true is a problem in my opinion.

I realize that there isn't currently a solution to efficiently allow 
inheritance of properties via dynamic inheritance. As a software 
engineer, however, I generally feel that if a feature is unreliable, 
then it shouldn't be used at all. So if I were designing a class 
hierarchy of ABCs, I would probably make a rule for myself not to define 
any properties or methods in the ABCs at all, and to *only* use ABCs for 
type testing via 'isinstance'.

In other words, if I were writing this PEP, all of those special methods 
would be omitted, simply because as a writer of a subclass I couldn't 
rely on being able to use them.

The only alternative that I can see is to not use dynamic inheritance at 
all, and instead have the number classes inherit from these ABCs using 
the traditional mechanism. But that brings up all the problems of 
immaturity and requiring them to be built-in that I brought up earlier.

-- Talin


From martin at v.loewis.de  Thu May 17 10:51:19 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 17 May 2007 10:51:19 +0200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <f2gioe$df5$1@sea.gmane.org>
References: <464BBE2B.1050201@acm.org> <f2gioe$df5$1@sea.gmane.org>
Message-ID: <464C1787.7090209@v.loewis.de>

> Some questions I have: is the defined UID set the same as in the referenced 
> appendix?  

Yes; it was copied from there.

> Is it the same as in Java (and hence Jython)?

No. Not sure whether I can produce a complete list of differences, but
some of them are:
- Java allows $ in identifiers, the PEP doesn't (as is Python tradition)
  (more generally: it allows currency symbols in identifiers)
- Java allows arbitrary connecting punctuators as the start; the PEP
  only allows the underscore
- Java allows "arbitrary" digits in an identifier. I'm not quite sure
  what that means: JLS refers to isJavaIdentifierPart, which specifies
  "a digit" and refers to isLetterOrDigit, which refers to JLS. isDigit
  gives true if the character NAME contains DIGIT, and the digit is
  not in the range U+2000..U+2FFF
  The PEP specifies that digits need to have the Nd class.
  Comparing these two, it seems that Java allows several characters from
  the No class, which Python does not allow.
- Java allows "ignorable control characters" in identifiers, which
  Python doesn't allow.

So, in short, it seems that Python's identifier syntax would be strictly
more restrictive than Java's.

>  The same as in .NET (and hence IronPython)?

This kind of research is time consuming; it cost me an hour to come
up with above list. Please research it for yourself.

Regards,
Martin

From martin at v.loewis.de  Thu May 17 11:10:58 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 17 May 2007 11:10:58 +0200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <69B09BFE-3BF3-4532-98EA-8A7E44461D77@fuhm.net>
References: <464BBE2B.1050201@acm.org>
	<69B09BFE-3BF3-4532-98EA-8A7E44461D77@fuhm.net>
Message-ID: <464C1C22.7030806@v.loewis.de>

> One issue I see is that the PEP defines ID_Start and ID_Continue  
> itself. It should not do that, bue instead reference as authoritative  
> the unicode properties ID_Start and ID_Continue defined in the  
> unicode property database.

ID_Start and ID_Continue are derived non-mandatory properties, and I
believe UAX#31 is the one defining these properties. So I thought I
could just copy the definition.

Currently, the Python unicodedata module does not contain a
definition for ID_Start and ID_Continue, so I could not use
it in the PEP.

> ID_Start is officially: Lu+Ll+Lt+Lm+Lo+Nl+Other_ID_Start
> and ID_Continue is officially: ID_Start + Mn+Mc+Nd+Pc +  
> Other_ID_Continue

I now see what the 'stability extensions' mentioned in the PEP
(copied from UAX#31) are. Even though Python currently does
not include Other_ID_Start and Other_ID_Continue, this could be
handled in the parser.

It would have been nice if UAX#31 had mentioned that the "stability
extensions" are recorded in these properties.

> The only differences between PEP 3131's definition and the official  
> ones is the Other_* bits. Those are there to ensure the requirement  
> that anything now in ID_Start/ID_Continue will always in the future  
> be in said categories. That is an important feature, and should not  
> be overlooked.

See the PEP: there was an XXX remark I still needed to resolve.

> This list is available as part of the PropList.txt file in the  
> unicode data, which ought to be included automatically in python's  
> unicode database so as to get future changes.

This I'm not so sure about. I changed the PEP to say that
Other_ID_{Start|Continue} should be included. Whether the other
properties should be added to the unicodedata module, I don't know -
I would like to see use cases first before including them.

> I do not believe it is a good idea for python to define its own  
> identifier rules. The rules defined in UAX31 make sense and should be  
> used directly, with only the minor amendment of _ as an allowable  
> start character.

That was my plan indeed.

Regards,
Martin

From martin at v.loewis.de  Thu May 17 11:13:48 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 17 May 2007 11:13:48 +0200
Subject: [Python-3000] pep 3131 again
In-Reply-To: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
References: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
Message-ID: <464C1CCC.3070008@v.loewis.de>

> all in all, i'm still -1 on that. i would rather go halfway -- allow unicode
> comments. let people write docs in their native language, that's all
> fine with me (or is that already imposed by the UTF8 PEP?)

As others have pointed out: non-ASCII comments have been usable for a long
time. In fact, you could use non-ASCII comments in *all* versions of
Python, and people have been doing so for years.

Regards,
Martin

From martin at v.loewis.de  Thu May 17 11:23:00 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 17 May 2007 11:23:00 +0200
Subject: [Python-3000] pep 3131 again
In-Reply-To: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
References: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
Message-ID: <464C1EF4.6040603@v.loewis.de>

> === help people who can't type english ===
> since the keywords remain ASCII, along with stdlib and all other major
> third party libs -- how does that help the english-illiterate programmer?

english-illiterate and "can't type english" are very different things.
By "can't type english", I assume you mean "can't type Latin
characters". These users are not helped at all by this PEP, but I think
they are really rare, since keyboards commonly support a mode to enter
Latin characters (perhaps after pressing some modifier key, or switching
to Latin mode).

> 
>     import random
>     ?? = range(100)
>     random.shuffle(?? )
>     ? = ??.pop(7)
>     if len(?) > 58:
>         print "?????!!!" #  ?? ?? ??????? ???? ??? ?????
> 
> apart from excessive visual noise, the amount of *latin* identifiers and
> keywords is not negligible.

Right. However, you don't have to understand *English* to write or read
this text. You don't need to know that "import" means "to bring from a
foreign or external source", and that "shuffle" means "to mix in a mass
confusedly". Instead, understanding them by their Python meaning is
enough.

> if all you're trying to save is coming up with
> english names for your functions, than that's okay, but saying
> "japanese people have a hard time coding in the latin alphabet"
> does not withstand practical usage.

Coming up with English names is not necessary today. Coming up
with Latin spellings is.

Whether or not Japanese or Chinese people with no knowledge of
English still can master the Latin alphabet easily, I don't know,
as all Chinese people I do know speak German or English well.

I would say "they can speak for themselves", except that then
neither of us would understand them.

Regards,
Martin

From hfoffani at gmail.com  Thu May 17 11:27:51 2007
From: hfoffani at gmail.com (Hernan M Foffani)
Date: Thu, 17 May 2007 11:27:51 +0200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464C1787.7090209@v.loewis.de>
References: <464BBE2B.1050201@acm.org> <f2gioe$df5$1@sea.gmane.org>
	<464C1787.7090209@v.loewis.de>
Message-ID: <11fab4bc0705170227s1925cee2j2181c45b7772d9d7@mail.gmail.com>

> >  The same as in .NET (and hence IronPython)?
>
> This kind of research is time consuming; it cost me an hour to come
> up with above list. Please research it for yourself.

FYI:

-----------------
ECMA-334

C# Language Specification
9 Lexical structure
9.4 Tokens
9.4.2 Identifiers
                                        Paragraph 1 (Page 55, Line 11)

1 The rules for identifiers given in this section correspond
exactly to those recommended by the Unicode Standard Annex 15 except
that underscore is allowed as an initial character (as is traditional
in the C programming language), Unicode escape sequences are permitted
in identifiers, and the "@" character is allowed as a prefix to enable
keywords to be used as identifiers.

identifier :
    available-identifier
    @ identifier-or-keyword

available-identifier :
    An identifier-or-keyword that is not a keyword

identifier-or-keyword :
    identifier-start-character identifier-part-characters(opt)

identifier-start-character :
    letter-character
    _ (the underscore character U+005F)

identifier-part-characters :
    identifier-part-character
    identifier-part-characters identifier-part-character

identifier-part-character :
    letter-character
    decimal-digit-character
    connecting-character
    combining-character
    formatting-character

letter-character :
    A Unicode character of classes Lu, Ll, Lt, Lm, Lo, or Nl
    A unicode-escape-sequence representing a character of classes Lu, Ll, Lt, Lm, Lo, or Nl

combining-character :
    A Unicode character of classes Mn or Mc
    A unicode-escape-sequence representing a character of classes Mn or Mc

decimal-digit-character :
    A Unicode character of the class Nd
    A unicode-escape-sequence representing a character of the class Nd

connecting-character :
    A Unicode character of the class Pc
    A unicode-escape-sequence representing a character of the class Pc

formatting-character :
    A Unicode character of the class Cf
    A unicode-escape-sequence representing a character of the class Cf

-------

Disclaimer: don't know the specification date nor its authenticity.

From hernan at foffani.org  Thu May 17 11:35:19 2007
From: hernan at foffani.org (=?ISO-8859-1?Q?Hernan_Mart=EDnez-Foffani?=)
Date: Thu, 17 May 2007 11:35:19 +0200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <11fab4bc0705170227s1925cee2j2181c45b7772d9d7@mail.gmail.com>
References: <464BBE2B.1050201@acm.org> <f2gioe$df5$1@sea.gmane.org>
	<464C1787.7090209@v.loewis.de>
	<11fab4bc0705170227s1925cee2j2181c45b7772d9d7@mail.gmail.com>
Message-ID: <11fab4bc0705170235m3d9d50fg2aab33eb712f05b0@mail.gmail.com>

> > >  The same as in .NET (and hence IronPython)?
> >
> > This kind of research is time consuming; it cost me an hour to come
> > up with above list. Please research it for yourself.

C# identifiers (cont)
About normalization and @

---------


                                        Paragraph 2 (Page 56, Line 5)

1 An identifier in a conforming program must be in the canonical
format defined by Unicode Normalization Form C, as defined by Unicode
Standard Annex 15. 2 The behavior when encountering an identifier not
in Normalization Form C is implementation-defined; however, a
diagnostic is not required.



                                        Paragraph 3 (Page 56, Line 8)

1 The prefix "@" enables the use of keywords as identifiers, which is
useful when interfacing with other programming languages. 2 The
character @ is not actually part of the identifier, so the identifier
might be seen in other languages as a normal identifier, without the
prefix. 3 An identifier with an @ prefix is called a verbatim
identifier. [Note: Use of the @ prefix for identifiers that are not
keywords is permitted, but strongly discouraged as a matter of style.
end note] [Example: The example:

class @class
{
   public static void @static(bool @bool) {
      if (@bool)
      System.Console.WriteLine("true");
      else
      System.Console.WriteLine("false");
   }
}
class Class1
{
   static void M() {
      cl\u0061ss.st\u0061tic(true);
   }
}

defines a class named "class" with a static method named "static" that
takes a parameter named "bool". Note that since Unicode escapes are
not permitted in keywords, the token "cl\u0061ss" is an identifier,
and is the same identifier as "@class". end example]



                                       Paragraph 4 (Page 56, Line 32)

1 Two identifiers are considered the same if they are identical after
the following transformations are applied, in order:

    * 2 The prefix "@", if used, is removed.
    * 3 Each unicode-escape-sequence is transformed into its
corresponding Unicode character.
    * 4 Any formatting-characters are removed.



                                       Paragraph 5 (Page 56, Line 37)

1 Identifiers containing two consecutive underscore characters
(U+005F) are reserved for use by the implementation; however, no
diagnostic is required if such an identifier is defined. [Note: For
example, an implementation might provide extended keywords that begin
with two underscores. end note]

-----------------

Same disclaimer as before applies.

Regards,
-Hern?n.

From ncoghlan at gmail.com  Thu May 17 12:56:04 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 May 2007 20:56:04 +1000
Subject: [Python-3000] Updated and simplified PEP 3141: A Type Hierarchy
 for	Numbers
In-Reply-To: <464C06E3.2090104@acm.org>
References: <5d44f72f0705161731j4700bdb3h4e36e97757bd6a32@mail.gmail.com>
	<464C06E3.2090104@acm.org>
Message-ID: <464C34C4.2080702@gmail.com>

Talin wrote:
> This really highlights what I think is a problem with dynamic 
> inheritance, and I think that this inconsistency between traditional and 
> dynamic inheritance will eventually come back to haunt us. It has always 
> been the case in the past that for every property of class B, if 
> isinstance(A, B) == True, then A also has that property, either 
> inherited from B, or overridden in A. The fact that this invariant will 
> no longer hold true is a problem in my opinion.
> 
> I realize that there isn't currently a solution to efficiently allow 
> inheritance of properties via dynamic inheritance. As a software 
> engineer, however, I generally feel that if a feature is unreliable, 
> then it shouldn't be used at all. So if I were designing a class 
> hierarchy of ABCs, I would probably make a rule for myself not to define 
> any properties or methods in the ABCs at all, and to *only* use ABCs for 
> type testing via 'isinstance'.

If a class doesn't implement the interface defined by an ABC, you should 
NOT be registering it with that ABC via dynamic inheritance. *That's* 
the bug - the program is claiming that "instances of class A can be 
treated as if they were an instance of B" when that statement is simply 
not true. And without defining an interface, dispatching on the ABC is 
pointless - you don't know whether or not you support the operations 
implied by that ABC because there aren't any defined!

Now, with respect to the number hierarchy, I think building it as a 
vertical stack doesn't really match the way numbers have historically 
worked in Python -  integers and floats, for example, don't implement 
the complex number API:

 >>> (1).real
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'real'
 >>> (1.0).real
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
AttributeError: 'float' object has no attribute 'real'

Given the migration of PEP 3119 to an approach which is friendlier to 
classification after the fact, it's probably fine to simply punt on the 
question of an ABC hierarchy for numbers (as Talin already pointed out).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Thu May 17 13:11:20 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 May 2007 21:11:20 +1000
Subject: [Python-3000] r55359
	-	python/branches/py3k-struni/Lib/test/test_strop.py
In-Reply-To: <bbaeab100705161423w750c92efp41854fce8d892264@mail.gmail.com>
References: <20070515214221.787161E4012@bag.python.org>	<bbaeab100705151455m7c719bas8a79533655e477b8@mail.gmail.com>	<17995.29840.132278.935792@montanaro.dyndns.org>
	<bbaeab100705161423w750c92efp41854fce8d892264@mail.gmail.com>
Message-ID: <464C3858.3030307@gmail.com>

(relocating thread from python-3000-checkins)

Brett Cannon wrote:
> On 5/16/07, *skip at pobox.com <mailto:skip at pobox.com>* <skip at pobox.com 
> <mailto:skip at pobox.com>> wrote:
>         Brett> Strop should go when the string module goes.  I don't
>     remember
>         Brett> where the last "let's kill string but what do we do about
>     the few
>         Brett> useful things in there" conversation went.
> 
>     Sorry, I don't read the py3k list (but see checkins).  What about
>     the few
>     bits of string that have no obvious other place to live (lowercase,
>     digits,
>     etc)?  Do they somehow become attributes of the str class? 
> 
> That's undecided at the moment.  Guido killed strop as there is a Python 
> implementation so it doesn't affect how to handle the string module.  As 
> of this moment no decision has been made whether to keep 'string' or to 
> kill it.

To be honest, I have never understood the repeated proposals to get rid 
of the string module. Get rid of the functions that are just duplicates 
of str methods, sure, but the module makes sense to me as a home for 
text related constants and other machinery (such as string.Template and 
the various building blocks for more advanced PEP 3101 based formatting).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From santagada at gmail.com  Thu May 17 14:36:14 2007
From: santagada at gmail.com (Leonardo Santagada)
Date: Thu, 17 May 2007 09:36:14 -0300
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <11fab4bc0705170235m3d9d50fg2aab33eb712f05b0@mail.gmail.com>
References: <464BBE2B.1050201@acm.org> <f2gioe$df5$1@sea.gmane.org>
	<464C1787.7090209@v.loewis.de>
	<11fab4bc0705170227s1925cee2j2181c45b7772d9d7@mail.gmail.com>
	<11fab4bc0705170235m3d9d50fg2aab33eb712f05b0@mail.gmail.com>
Message-ID: <0EE9E992-F418-4E2A-872D-7B2CE012FAC3@gmail.com>

Here are the rules for identifiers in javascript in case someone  
wants to know:
http://interglacial.com/javascript_spec/a-7.html#a-7.6

--
Leonardo Santagada
santagada at gmail.com




From aahz at pythoncraft.com  Thu May 17 15:06:39 2007
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 17 May 2007 06:06:39 -0700
Subject: [Python-3000] Whither string? (was Re:
	python/branches/py3k-struni/Lib/test/test_strop.py)
In-Reply-To: <464C3858.3030307@gmail.com>
References: <20070515214221.787161E4012@bag.python.org>
	<bbaeab100705151455m7c719bas8a79533655e477b8@mail.gmail.com>
	<17995.29840.132278.935792@montanaro.dyndns.org>
	<bbaeab100705161423w750c92efp41854fce8d892264@mail.gmail.com>
	<464C3858.3030307@gmail.com>
Message-ID: <20070517130639.GA20958@panix.com>

On Thu, May 17, 2007, Nick Coghlan wrote:
> 
> To be honest, I have never understood the repeated proposals to get
> rid of the string module. Get rid of the functions that are just
> duplicates of str methods, sure, but the module makes sense to me
> as a home for text related constants and other machinery (such as
> string.Template and the various building blocks for more advanced PEP
> 3101 based formatting).

The trend in support seems to be toward moving everything left that is
useful from "string" to "text", which would be a package.  Overall, I'm
+1 on that idea.  I can see arguments in favor of leaving string, but
that name just has too much baggage.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Look, it's your affair if you want to play with five people, but don't
go calling it doubles."  --John Cleese anticipates Usenet

From benji at benjiyork.com  Thu May 17 15:15:28 2007
From: benji at benjiyork.com (Benji York)
Date: Thu, 17 May 2007 09:15:28 -0400
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <ca471dc20705161050w3f99d762r68b90bd5253c1b8c@mail.gmail.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>	<4648D626.1030201@benjiyork.com>	<ca471dc20705152146p382ac396u5f7dc4b4f138cd26@mail.gmail.com>	<464B4237.4090802@benjiyork.com>
	<ca471dc20705161050w3f99d762r68b90bd5253c1b8c@mail.gmail.com>
Message-ID: <464C5570.5050205@benjiyork.com>

Guido van Rossum wrote:
> On 5/16/07, Benji York <benji at benjiyork.com> wrote:
>> Guido van Rossum wrote:
>>> On 5/14/07, Benji York <benji at benjiyork.com> wrote:
>>>> Collin Winter wrote:
>>>>> PEP: 3133
>>>>> Title: Introducing Roles
>>>> Everything included here is included in zope.interface.  See in-line
>>>> comments below for the analogs.
>>> Could you look at PEP 3119 and do a similar analysis?
>> Sure.

And here it is:

 > PEP: 3119
 > Title: Introducing Abstract Base Classes

I've placed my comments in-line and snipped chunks of the original PEP
where it seemed appropriate.

 > Version: $Revision$
 > Last-Modified: $Date$
 > Author: Guido van Rossum <guido at python.org>, Talin <talin at acm.org>
 > Status: Draft
 > Type: Standards Track
 > Content-Type: text/x-rst
 > Created: 18-Apr-2007
 > Post-History: 26-Apr-2007, 11-May-2007

[snip]

 > Rationale
 > =========
 >
 > In the domain of object-oriented programming, the usage patterns for
 > interacting with an object can be divided into two basic categories,
 > which are 'invocation' and 'inspection'.
 >
 > Invocation means interacting with an object by invoking its methods.
 > Usually this is combined with polymorphism, so that invoking a given
 > method may run different code depending on the type of an object.
 >
 > Inspection means the ability for external code (outside of the
 > object's methods) to examine the type or properties of that object,
 > and make decisions on how to treat that object based on that
 > information.
 >
 > Both usage patterns serve the same general end, which is to be able to
 > support the processing of diverse and potentially novel objects in a
 > uniform way, but at the same time allowing processing decisions to be
 > customized for each different type of object.
 >
 > In classical OOP theory, invocation is the preferred usage pattern,
 > and inspection is actively discouraged, being considered a relic of an
 > earlier, procedural programming style.  However, in practice this view
 > is simply too dogmatic and inflexible, and leads to a kind of design
 > rigidity that is very much at odds with the dynamic nature of a
 > language like Python.

I disagree with the last sentence in the above paragraph.  While
zope.interface has been shown (in a separate message) to perform the
same tasks as the "roles" PEP (3133) and below I show the similarities
between this PEP (ABCs) and zope.interface, I want to point out that
users of zope.interface don't actually use it in these ways.

So, what /do/ people use zope.interface for?  There are two primary
uses: making contracts explicit and adaptation.  If more detail is
desired about these uses; I'll be glad to share.

My main point is that the time machine worked; people have had the moral
equivalent of ABCs and Roles for years and have decided against using
them the way the PEPs envision.  Of course if people still think ABCs
are keen, then a stand-alone package can be created and we can see if
there is uptake; if so, it can be added to the standard library later.

If I recall correctly, the original motivation for ABCs was that some
times people want to "sniff" an object and see what it is, almost always
to dispatch appropriately.  That use case of "dispatch in the small",
would seem to me to be much better addressed by generic functions.  If
those generic functions want something in addition to classes to
dispatch on, then interfaces can be used too.

If GF aren't desirable for that use case, then basefile, basesequence,
and basemapping can be added to Python and cover 90% of what people
need.  I think the Java Collections system has shown that it's not
neccesary to provide all interfaces for all people.  If you can only
provide a subset of an interface, make unimplemented methods raise
NotImplementedError.

[snip]

 > Overloading ``isinstance()`` and ``issubclass()``
 > -------------------------------------------------

Perhaps the PEP should just be reduced to include only this section.

[snip]

 > The ``abc`` Module: an ABC Support Framework
 > --------------------------------------------
[snip]
 > These methods are intended to be called on classes whose metaclass
 > is (derived from) ``ABCMeta``; for example::
 >
 >     from abc import ABCMeta

     import zope.interface

 >     class MyABC(metaclass=ABCMeta):
 >         pass

     class MyInterface(zope.interface.Interface):
         pass

 >     MyABC.register(tuple)

     zope.interface.classImplements(tuple, MyInterface)

 >     assert issubclass(tuple, MyABC)

     assert MyInterface.implementedBy(tuple)

 >     assert isinstance((), MyABC)

     assert MyInterface.providedBy(())


 > The last two asserts are equivalent to the following two::
 >
 >     assert MyABC.__subclasscheck__(tuple)
 >     assert MyABC.__instancecheck__(())
 >
 > Of course, you can also directly subclass MyABC::
 >
 >     class MyClass(MyABC):
 >         pass

     class MyClass:
         zope.interface.implements(MyInterface)

 >     assert issubclass(MyClass, MyABC)

     assert MyInterface.implementedBy(MyClass)

 >     assert isinstance(MyClass(), MyABC)

     assert MyInterface.providedBy(MyClass())

 > Also, of course, a tuple is not a ``MyClass``::
 >
 >     assert not issubclass(tuple, MyClass)
 >     assert not isinstance((), MyClass)
 >
 > You can register another class as a subclass of ``MyClass``::
 >
 >     MyClass.register(list)

There is an interface that MyClass implements that list implements as well.

     class MyClassInterface(MyInterface):
         pass

     zope.interface.classImplements(list, MyClassInterface)

Sidebar: this highlights one of the reasons zope.interface users employ
the naming convention of prefixing their interface names with "I", it
helps keep interface names short while giving you an easy name for
"interface that corresponds to things of class Foo", which would be
IFoo.

 >     assert issubclass(list, MyClass)

     assert MyClassInterface.implementedBy(list)

 >     assert issubclass(list, MyABC)

     assert MyClassInterface.extends(MyInterface)

 > You can also register another ABC::
 >
 >     class AnotherClass(metaclass=ABCMeta):
 >         pass

     class AnotherInterface(zope.interface.Interface):
         pass

 >     AnotherClass.register(basestring)

     zope.interface.classImplements(basestring, AnotherInterface)

 >     MyClass.register(AnotherClass)

I don't quite understand the intent of the above line.  It appears to be
extending the contract that AnotherClass embodies to promise to fulfill
any contract that MyClass embodies.  That seems to be an unusual thing
to want to express.  Although unusual, you could still do it using
zope.interface.  One way would be to add MyClassInterface to the
__bases__ of AnotherInterface.
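
As a rough sketch of that idea (illustrative only), the same relationship
could also be declared up front, by making MyClassInterface a base of
AnotherInterface instead of patching __bases__ afterwards:

    import zope.interface

    class MyClassInterface(zope.interface.Interface):
        pass

    # AnotherInterface now promises everything MyClassInterface promises.
    class AnotherInterface(MyClassInterface):
        pass

    assert AnotherInterface.extends(MyClassInterface)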

OTOH, I might be confused by the collapsing of the class and interface
hierarchies.  Do the classes in the above line of code represent the
implementation or specification?

[snip]

 > ABCs for Containers and Iterators
 > ---------------------------------

zope.interface defines similar interfaces.  Surprisingly they aren't
used all that often.  They can be viewed at
http://svn.zope.org/zope.interface/trunk/src/zope/interface/common/.
The files mapping.py, sequence.py, and idatetime.py are the most
interesting.

[snip rest]

> I was just
> thinking of how to "sell" ABCs as an alternative to current happy
> users of zope.interfaces.

One of the things that makes zope.interface users happy is the 
separation of specification and implementation.  The increasing 
separation of specification from implementation is what has
driven Abstract Data Types in procedural languages, encapsulation
in OOP, and now zope.interface.  Mixing the two back together in
ABCs doesn't seem attractive.

As for "selling" current users on an alternative, why bother?  If people 
need interfaces, they know where to find them.  I suspect I'm confused 
as to the intent of this discussion.
-- 
Benji York
http://benjiyork.com

From martin at v.loewis.de  Thu May 17 15:49:31 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 17 May 2007 15:49:31 +0200
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <ca471dc20705140943r43ca5f33tad5aff8df87e96ec@mail.gmail.com>
References: <dcbbbb410705132003k73cf08c3hf7f0b99e2e1b9895@mail.gmail.com>	<20070514113240.19381.1581980774.divmod.quotient.32995@ohm>	<ca471dc20705140825s453a6281ife63414e2c7e0a9b@mail.gmail.com>	<bb8868b90705140842x27257e07o9faaa8407f953d53@mail.gmail.com>
	<ca471dc20705140943r43ca5f33tad5aff8df87e96ec@mail.gmail.com>
Message-ID: <464C5D6B.6070302@v.loewis.de>

> Does the tokenizer do this for all string literals, too? Otherwise you
> could still get surprises with things like x.foo vs. getattr(x,
> "foo"), if the name foo were normalized but the string "foo" were not.

No. If you use a string literal, chances are very high that you put
NFC into your source code file (if it's not UTF-8, most codecs will
produce NFC naturally; if it is UTF-8, it depends on your editor).

If you get the attribute name from elsewhere, it's a design choice
of who should perform the normalization. One could specify that
builtin getattr does that, or one could require that the application
does it in cases where the strings aren't guaranteed to be in NFC.
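
A small sketch (hypothetical class C, for illustration only) of why an
un-normalized dynamic attribute name can miss an NFC identifier, and how
normalizing it first helps:

    import unicodedata

    class C:
        pass

    nfc = unicodedata.normalize("NFC", "caf\u00e9")   # one precomposed code point
    nfd = unicodedata.normalize("NFD", nfc)           # 'e' plus combining acute

    setattr(C, nfc, 42)
    assert getattr(C, nfd, None) is None              # raw NFD spelling misses it
    assert getattr(C, unicodedata.normalize("NFC", nfd)) == 42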

The only case where I know of a software that explicitly changes
the normalization, and not to NFC, is OSX, which uses NFD on
disk.

Regards,
Martin

From fdrake at acm.org  Thu May 17 16:26:37 2007
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 17 May 2007 10:26:37 -0400
Subject: [Python-3000]
	=?iso-8859-1?q?r55359_-=09python/branches/py3k-stru?=
	=?iso-8859-1?q?ni/Lib/test/test=5Fstrop=2Epy?=
In-Reply-To: <464C3858.3030307@gmail.com>
References: <20070515214221.787161E4012@bag.python.org>
	<bbaeab100705161423w750c92efp41854fce8d892264@mail.gmail.com>
	<464C3858.3030307@gmail.com>
Message-ID: <200705171026.37851.fdrake@acm.org>

On Thursday 17 May 2007, Nick Coghlan wrote:
 > To be honest, I have never understood the repeated proposals to get rid
 > of the string module. Get rid of the functions that are just duplicates
 > of str methods, sure, but the module makes sense to me as a home for
 > text related constants and other machinery (such as string.Template and
 > the various building blocks for more advanced PEP 3101 based formatting).

Agreed.  I see no need to add another name to the stdlib as a place to store 
those values, and placing them on str doesn't seem particularly attractive 
(especially for 2.x).


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From tomerfiliba at gmail.com  Thu May 17 16:41:16 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Thu, 17 May 2007 16:41:16 +0200
Subject: [Python-3000] pep 3131 again
In-Reply-To: <464C1EF4.6040603@v.loewis.de>
References: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
	<464C1EF4.6040603@v.loewis.de>
Message-ID: <1d85506f0705170741t653cbc2fx74dadbfa2cf44303@mail.gmail.com>

well, i still don't see what problems having that would solve. it seems
like just "a cool feature" people want to have. they will still need to use
latin text/english docs most of the time.

on the other hand i don't see a reason to limit them intentionally. if that
would keep them content/make the transition easier/help them learn
programming, i'd guess there's nothing wrong with that.

so i'm not enthused about it all, but i'll give that +0



-tomer

On 5/17/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
>
> > === help people who can't type english ===
> > since the keywords remain ASCII, along with stdlib and all other major
> > third party libs -- how does that help the english-illiterate
> programmer?
>
> english-illiterate and "can't type english" are very different things.
> By "can't type english", I assume you mean "can't type Latin
> characters". These users are not helped at all by this PEP, but I think
> they are really rare, since keyboards commonly support a mode to enter
> Latin characters (perhaps after pressing some modifier key, or switching
> to Latin mode).
>
> >
> >     import random
> >     ?? = range(100)
> >     random.shuffle(?? )
> >     ? = ??.pop(7)
> >     if len(?) > 58:
> >         print "?????!!!" #  ?? ?? ??????? ???? ??? ?????
> >
> > apart from excessive visual noise, the amount of *latin* identifiers and
> > keywords is not negligible.
>
> Right. However, you don't have to understand *English* to write or read
> this text. You don't need to know that "import" means "to bring from a
> foreign or external source", and that "shuffle" means "to mix in a mass
> confusedly". Instead, understanding them by their Python meaning is
> enough.
>
> > if all you're trying to save is coming up with
> > english names for your functions, than that's okay, but saying
> > "japanese people have a hard time coding in the latin alphabet"
> > does not withstand practical usage.
>
> Coming up with English names is not necessary today. Coming up
> with Latin spellings is.
>
> Whether or not Japanese or Chinese people with no knowledge of
> English still can master the Latin alphabet easily, I don't know,
> as all Chinese people I do know speak German or English well.
>
> I would say "they can speak for themselves", except that then
> neither of us would understand them.
>
> Regards,
> Martin
>

From martin at v.loewis.de  Thu May 17 17:22:19 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 17 May 2007 17:22:19 +0200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464BBE2B.1050201@acm.org>
References: <464BBE2B.1050201@acm.org>
Message-ID: <464C732B.8050103@v.loewis.de>

> While there has been a lot of discussion as to whether to accept PEP 
> 3131 as a whole, there has been little discussion as to the specific 
> details of the PEP. In particular, is it generally agreed that the 
> Unicode character classes listed in the PEP are the ones we want to 
> include in identifiers? My preference is to be conservative in terms of 
> what's allowed.

John Nagle suggested to consider UTR#39
(http://unicode.org/reports/tr39/). I encourage anybody to help me
understand what it says.

The easiest part is 3.1: this seems to say we should restrict characters
listed as "restrict" in [idmod]. My suggestion would be to warn about
them. I'm not sure about the purpose of the additional characters:
surely, they don't think we should support HYPHEN-MINUS in identifiers?

4. Confusable Detection: Without considering details, it seems you need
two strings to decide whether they are confusable. So it's not clear
to me how this could apply to banning certain identifiers.

5. Mixed Script Detection: That might apply, but I can't map the
algorithm to terminology I'm familiar with. What is UScript.COMMON
and UScript.INHERITED? I'm skeptical about mixed-script detection,
because you surely want to allow ASCII digits (0..9) in Cyrillic
identifiers - not sure whether the detection would claim that the
digits are Latin (which they aren't - they are Arabic numbers).
So a precise algorithm in Python (using unicodedata) would be
helpful. I still would like to make that produce a warning only;
users more concerned about phishing could turn the warning into
an error.
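
As a strawman only (this is not a precise algorithm, and it sidesteps
UScript.COMMON/INHERITED by abusing the first word of each character's
Unicode name as a stand-in for its script), a warning-only check might
look like:

    import unicodedata

    def warn_if_mixed_script(identifier):
        scripts = set()
        for ch in identifier:
            if ch.isdigit() or ch == "_":
                continue                      # treat digits and '_' as neutral
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])  # e.g. 'LATIN', 'CYRILLIC'
        if len(scripts) > 1:
            print("Warning: %r mixes scripts %s" % (identifier, sorted(scripts)))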

Regards,
Martin

From martin at v.loewis.de  Thu May 17 17:28:14 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 17 May 2007 17:28:14 +0200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <0EE9E992-F418-4E2A-872D-7B2CE012FAC3@gmail.com>
References: <464BBE2B.1050201@acm.org>
	<f2gioe$df5$1@sea.gmane.org>	<464C1787.7090209@v.loewis.de>	<11fab4bc0705170227s1925cee2j2181c45b7772d9d7@mail.gmail.com>	<11fab4bc0705170235m3d9d50fg2aab33eb712f05b0@mail.gmail.com>
	<0EE9E992-F418-4E2A-872D-7B2CE012FAC3@gmail.com>
Message-ID: <464C748E.7020209@v.loewis.de>

Leonardo Santagada schrieb:
> Here are the rules for identifiers in javascript in case someone  
> wants to know:
> http://interglacial.com/javascript_spec/a-7.html#a-7.6

In all these reports, part of the analysis is also to determine
how those specifications deviate (or not) from PEP 3131.

In this case, it seems that:
- JS additionally allows $
- PEP 3131 additionally considers the stability extensions
  (Other_ID_{Start|Continue}). That may be because the JS
  specification was based on an earlier version of UAX#31,
  which perhaps didn't had the need for this stability
  feature.
- JS uses Unicode 3.0, whereas Python uses whatever Unicode
  version lives in unicodedata.

Regards,
Martin


From janssen at parc.com  Thu May 17 17:48:18 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 17 May 2007 08:48:18 PDT
Subject: [Python-3000] Whither string? (was Re:
	python/branches/py3k-struni/Lib/test/test_strop.py)
In-Reply-To: <20070517130639.GA20958@panix.com> 
References: <20070515214221.787161E4012@bag.python.org>
	<bbaeab100705151455m7c719bas8a79533655e477b8@mail.gmail.com>
	<17995.29840.132278.935792@montanaro.dyndns.org>
	<bbaeab100705161423w750c92efp41854fce8d892264@mail.gmail.com>
	<464C3858.3030307@gmail.com> <20070517130639.GA20958@panix.com>
Message-ID: <07May17.084825pdt."57996"@synergy1.parc.xerox.com>

> The trend in support seems to be toward moving everything left that is
> useful from "string" to "text", which would be a package.

Sigh.  After 15 years of carefully writing python code using "text"
instead of "string" as a variable name, I'm sure this will work out
just fine.  :-)

Bill


From g.brandl at gmx.net  Thu May 17 17:54:55 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 17 May 2007 17:54:55 +0200
Subject: [Python-3000] Whither string? (was Re:
	python/branches/py3k-struni/Lib/test/test_strop.py)
In-Reply-To: <07May17.084825pdt."57996"@synergy1.parc.xerox.com>
References: <20070515214221.787161E4012@bag.python.org>	<bbaeab100705151455m7c719bas8a79533655e477b8@mail.gmail.com>	<17995.29840.132278.935792@montanaro.dyndns.org>	<bbaeab100705161423w750c92efp41854fce8d892264@mail.gmail.com>	<464C3858.3030307@gmail.com>
	<20070517130639.GA20958@panix.com>
	<07May17.084825pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <f2hts6$rqh$1@sea.gmane.org>

Bill Janssen schrieb:
>> The trend in support seems to be toward moving everything left that is
>> useful from "string" to "text", which would be a package.
> 
> Sigh.  After 15 years of carefully writing python code using "text"
> instead of "string" as a variable name, I'm sure this will work out
> just fine.  :-)

As long as we don't find other functions/classes that warrant the name
"text", I propose to keep the string module as it is.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From collinw at gmail.com  Thu May 17 18:37:39 2007
From: collinw at gmail.com (Collin Winter)
Date: Thu, 17 May 2007 09:37:39 -0700
Subject: [Python-3000] Updated and simplified PEP 3141: A Type Hierarchy
	for Numbers
In-Reply-To: <464C34C4.2080702@gmail.com>
References: <5d44f72f0705161731j4700bdb3h4e36e97757bd6a32@mail.gmail.com>
	<464C06E3.2090104@acm.org> <464C34C4.2080702@gmail.com>
Message-ID: <43aa6ff70705170937u19113f3et9f23971448049c0e@mail.gmail.com>

On 5/17/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Talin wrote:
> > This really highlights what I think is a problem with dynamic
> > inheritance, and I think that this inconsistency between traditional and
> > dynamic inheritance will eventually come back to haunt us. It has always
> > been the case in the past that for every property of class B, if
> > isinstance(A, B) == True, then A also has that property, either
> > inherited from B, or overridden in A. The fact that this invariant will
> > no longer hold true is a problem in my opinion.
> >
> > I realize that there isn't currently a solution to efficiently allow
> > inheritance of properties via dynamic inheritance. As a software
> > engineer, however, I generally feel that if a feature is unreliable,
> > then it shouldn't be used at all. So if I were designing a class
> > hierarchy of ABCs, I would probably make a rule for myself not to define
> > any properties or methods in the ABCs at all, and to *only* use ABCs for
> > type testing via 'isinstance'.
>
> If a class doesn't implement the interface defined by an ABC, you should
> NOT be registering it with that ABC via dynamic inheritance. *That's*
> the bug - the program is claiming that "instances of class A can be
> treated as if they were an instance of B" when that statement is simply
> not true. And without defining an interface, dispatching on the ABC is
> pointless - you don't know whether or not you support the operations
> implied by that ABC because there aren't any defined!

ABCs can define concrete methods. These concrete methods provide
functionality that the child classes do not themselves provide. Let's
imagine that Python didn't have the readlines() method, and that I
wanted to define one. I could create an ABC that provides a default
concrete implementation of readlines() in terms of readline().

class ReadlinesABC(metaclass=ABCMeta):
  def readlines(self):
    # some concrete implementation

  @abstractmethod
  def readline(self):
    pass
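
Purely to make the sketch runnable, the stub could be filled in along
these lines (assuming readline() returns an empty string at EOF):

    from abc import ABCMeta, abstractmethod

    class ReadlinesABC(metaclass=ABCMeta):
        def readlines(self):
            # concrete mix-in, expressed in terms of the abstract readline()
            return list(iter(self.readline, ''))

        @abstractmethod
        def readline(self):
            pass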

If I register a Python-language class as implementing this ABC,
"isinstance(x, ReadlinesABC) == True" means that I can now call the
readlines() method. However, if I register a C-language extension
class as implementing this ABC, "isinstance(x, ReadlinesABC) == True"
may or may not indicate that I can call readlines(), making the test
of questionable value.

You can say that I shouldn't have registered a C extension class with
this ABC in the first place, but that's not the point. The point is
that for consumer code "isinstance(x, ReadlinesABC) == True" is an
unreliable test that may or may not accurately reflect the object's
true capabilities.

Maybe attempting to use partially-concrete ABCs in tandem with C
classes should raise an exception. That would make this whole issue go
away.

Collin Winter

From guido at python.org  Thu May 17 18:48:27 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 May 2007 09:48:27 -0700
Subject: [Python-3000] PEP 3131 accepted
Message-ID: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>

I have accepted PEP 3131. Note that it now contains the following policy:

"""
As an addition to the Python Coding style, the following policy is
prescribed: All identifiers in the Python standard library MUST use
ASCII-only identifiers, and SHOULD use English words wherever feasible
(in many cases, abbreviations and technical terms are used which
aren't English). In addition, string literals and comments must also
be in ASCII. The only exceptions are (a) test cases testing the
non-ASCII features, and (b) names of authors. Authors whose names are
not based on the latin alphabet MUST provide a latin transliteration
of their names.
"""

I recommend that open source projects with a global audience adopt a
similar policy. I'll also add it to PEP 8.

I expect that small details of the PEP will still change as discussion
about these takes place and as implementation is undertaken. This does
not affect my acceptance of the PEP.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu May 17 19:08:11 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 May 2007 10:08:11 -0700
Subject: [Python-3000] Updated and simplified PEP 3141: A Type Hierarchy
	for Numbers
In-Reply-To: <43aa6ff70705170937u19113f3et9f23971448049c0e@mail.gmail.com>
References: <5d44f72f0705161731j4700bdb3h4e36e97757bd6a32@mail.gmail.com>
	<464C06E3.2090104@acm.org> <464C34C4.2080702@gmail.com>
	<43aa6ff70705170937u19113f3et9f23971448049c0e@mail.gmail.com>
Message-ID: <ca471dc20705171008t9aa7625if9fa5f5249ffe56e@mail.gmail.com>

On 5/17/07, Collin Winter <collinw at gmail.com> wrote:
> ABCs can define concrete methods. These concrete methods provide
> functionality that the child classes do not themselves provide.

You seem to be misreading my intention here. ABCs serve two purposes:
they are interface specifications, and they provide "default" or
"mix-in" implementations of some of the methods they specify. The
pseudo-inheritance enabled by the register() call uses only the
specification part, and requires that the registered class implement
all the specified methods itself. In order to benefit from the
"mix-in" side of the ABC, you must subclass it directly.

> Let's
> imagine that Python didn't have the readlines() method, and that I
> wanted to define one. I could create an ABC that provides a default
> concrete implementation of readlines() in terms of readline().
>
> class ReadlinesABC(metaclass=ABCMeta):
>   def readlines(self):
>     # some concrete implementation
>
>   @abstractmethod
>   def readline(self):
>     pass
>
> If I register a Python-language class as implementing this ABC,
> "isinstance(x, ReadlinesABC) == True" means that I can now call the
> readlines() method. However, if I register a C-language extension
> class as implementing this ABC, "isinstance(x, ReadlinesABC) == True"
> may or may not indicate that I can call readlines(), making the test
> of questionable value.
>
> You can say that I shouldn't have registered a C extension class with
> this ABC in the first place, but that's not the point.

No, it is *exactly* the point. If you want to have functionality that
is *not* provided by some class, you should use an adaptor.

> The point is
> that for consumer code "isinstance(x, ReadlinesABC) == True" is an
> unreliable test that may or may not accurately reflect the object's
> true capabilities.
>
> Maybe attempting to use partially-concrete ABCs in tandem with C
> classes should raise an exception. That would make this whole issue go
> away.

The register() call could easily verify that the registered class
implements the specified set of methods -- this would include methods
that are concrete for the benefit of direct subclassing.
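
A rough sketch of what such a check might look like (hypothetical helper,
not part of the proposal):

    def checked_register(abc, cls):
        # Collect the public methods the ABC itself defines, plus its
        # abstract methods, and refuse to register a class missing any.
        wanted = set()
        for klass in abc.__mro__[:-1]:            # skip object
            for name, value in vars(klass).items():
                if callable(value) and not name.startswith("_"):
                    wanted.add(name)
        wanted.update(getattr(abc, "__abstractmethods__", ()))
        missing = sorted(name for name in wanted if not hasattr(cls, name))
        if missing:
            raise TypeError("%s does not implement: %s"
                            % (cls.__name__, ", ".join(missing)))
        abc.register(cls)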

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From foom at fuhm.net  Thu May 17 19:13:57 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 17 May 2007 13:13:57 -0400
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464C1C22.7030806@v.loewis.de>
References: <464BBE2B.1050201@acm.org>
	<69B09BFE-3BF3-4532-98EA-8A7E44461D77@fuhm.net>
	<464C1C22.7030806@v.loewis.de>
Message-ID: <1192E673-212B-4C5C-AB1F-31EBA657DE4E@fuhm.net>

On May 17, 2007, at 5:10 AM, Martin v. Löwis wrote:
>> This list is available as part of the PropList.txt file in the
>> unicode data, which ought to be included automatically in python's
>> unicode database so as to get future changes.
>
> This I'm not so sure about. I changed the PEP to say that
> Other_ID_{Start|Continue} should be included. Whether the other
> properties should be added to the unidata module, I don't know -
> I would like to see use cases first before including them.

I only meant that Python's idea of Other_ID_* should be
automatically generated from the unicode data file, so that when  
someone upgrades python's database to Unicode 5.1 (or whatever), they  
don't forget to update a manually copied Other_ID_* list as well.

>> I do not believe it is a good idea for python to define its own
>> identifier rules. The rules defined in UAX31 make sense and should be
>> used directly, with only the minor amendment of _ as an allowable
>> start character.
>
> That was my plan indeed.

I was voicing my support for your plan, in contrast to Talin's  
comment that perhaps a more conservative subset would be good.

James

From foom at fuhm.net  Thu May 17 19:28:40 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 17 May 2007 13:28:40 -0400
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464C732B.8050103@v.loewis.de>
References: <464BBE2B.1050201@acm.org> <464C732B.8050103@v.loewis.de>
Message-ID: <12AAEFAC-9E5A-4954-89DB-7A195A8E64A4@fuhm.net>


On May 17, 2007, at 11:22 AM, Martin v. Löwis wrote:

>> While there has been a lot of discussion as to whether to accept PEP
>> 3131 as a whole, there has been little discussion as to the specific
>> details of the PEP. In particular, is it generally agreed that the
>> Unicode character classes listed in the PEP are the ones we want to
>> include in identifiers? My preference is to be conservative in  
>> terms of
>> what's allowed.
>
> John Nagle suggested to consider UTR#39
> (http://unicode.org/reports/tr39/). I encourage anybody to help me
> understand what it says.

I think this is not something that is appropriate for Python. It  
looks fairly specific to implementing a centralized name registry  
(say: DNS).  Specifically, the backwards compatibility is not  
appropriate, as it doesn't guarantee that a name valid now will be  
valid in the future. They point out that that is okay for DNS, where  
the rules can be applied at name-registration time, and previously- 
registered names can continue to be used.

James


From martin at v.loewis.de  Thu May 17 19:43:50 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 17 May 2007 19:43:50 +0200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <12AAEFAC-9E5A-4954-89DB-7A195A8E64A4@fuhm.net>
References: <464BBE2B.1050201@acm.org> <464C732B.8050103@v.loewis.de>
	<12AAEFAC-9E5A-4954-89DB-7A195A8E64A4@fuhm.net>
Message-ID: <464C9456.2000805@v.loewis.de>

> I think this is not something that is appropriate for Python. It looks
> fairly specific to implementing a centralized name registry (say: DNS). 
> Specifically, the backwards compatibility is not appropriate, as it
> doesn't guarantee that a name valid now will be valid in the future.
> They point out that that is okay for DNS, where the rules can be applied
> at name-registration time, and previously-registered names can continue
> to be used.

Right - that would be a reason to not ban identifiers that are
considered questionable. Issuing a warning might be possible, though:
if an identifier is warned about that wasn't warned about before,
the program would still run.

It turns out that John Nagle had a different spec in mind, though:
Level 2 (Highly Restrictive) from

http://unicode.org/reports/tr36/#Security_Levels_and_Alerts

I think that is way too restrictive for programming languages,
as it would ban combining cyrillic letters with ASCII digits.
2.10.2.B.1 of TR#36 recommends using the general profile
from UTS-39; 2.10.2.B.2 recommends using NFKC and case-folding
for identifier comparison - that, again, can't apply to Python
as the language is case-sensitive.

Regards,
Martin


From guido at python.org  Thu May 17 19:53:42 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 May 2007 10:53:42 -0700
Subject: [Python-3000] Updated and simplified PEP 3141: A Type Hierarchy
	for Numbers
In-Reply-To: <464C06E3.2090104@acm.org>
References: <5d44f72f0705161731j4700bdb3h4e36e97757bd6a32@mail.gmail.com>
	<464C06E3.2090104@acm.org>
Message-ID: <ca471dc20705171053g42c077bfx7c3727fa26e454e9@mail.gmail.com>

On 5/17/07, Talin <talin at acm.org> wrote:
> Lets therefore assume that the numeric ABCs will use this new
> inheritance mechanism, avoiding the problem of taking an immature class
> hierarchy and setting it in stone. The PEPs in this class would then no
> longer need to have this privileged status; They could be replaced and
> changed at will.
>
> Assuming that this is true, the question then becomes whether these
> classes should be treated like any other standard library submission. In
> other words, shouldn't this PEP be implemented as a separate module, and
> have to prove itself 'in the wild' before being adopted into the stdlib?
> Does this PEP even need to be a PEP at all, or can it just be a
> 3rd-party library that is eventually adopted into Python?

No; I think there's a lot of synergy to be had by making it a standard
library module. For example, the Complex, Real and Integer types
provide a common ground for the built-in types and the types
implemented in numpy. Assuming (some form of) PEP 3124 is accepted, it
would be a shame if we had to specialize GFs on concrete types like
int or float. If we could encourage the habit right from the start to
use the abstract classes in such positions, then numpy integration
would be much easier.

> Now, I *could* see adopting an untried library embodying untested ideas
> into the stdlib if there was a crying need for the features of such a
> library, and those needs were clearly being unfulfilled. However, I am
> not certain that this is the case here.

The ideas here are hardly untested; the proposed hierarchy is taught
in high school math if not before, and many other languages use it
(e.g. Scheme's numeric tower, referenced in the PEP). Some
implementation details are untested, but I doubt that the general idea
sparks much controversy (it hasn't so far).

> At the very least, I think it should be stated in the PEP whether or not
> the ABCs defined here are going to be using traditional or dynamic
> inheritance.

Dynamic inheritance for sure.

> If it is the latter, and we decide that this PEP is going to be part of
> the stdlib, then I propose the following library organization:
>
>     import abc              # Imports the basic ABC mechanics
>     import abc.collections  # MutableSequence and such
>     import abc.math         # The number hierarchy
>     ... and so on

I don't like the idea of creating a ghetto for ABCs. Just like the
ABCs for use with I/O are defined in the io module (read PEP 3116 and
notice that it already uses a thinly disguised form of ABCs), the ABCs
for collections should be in the existing collections module. I'm not
sure where to place the numeric ABCs, but I'd rather have a top-level
numbers module.

> Now, there is another issue that needs to be dicussed.

> The classes in the PEP appear to be written with lots of mixin methods,
> such as __rsub__ and __abs__ and such. Unfortunately, the current
> proposed method for dynamic inheritance does not allow for methods or
> properties to be inherited from the 'virtual' base class. Which means
> that all of the various methods defined in this PEP are utterly
> meaningless other than as documentation - except in the case of a new
> user-created class of numbers which inherit from these ABCs using
> traditional inheritance, which is not something that I expect to happen
> very often at all. For virtually all practical uses, the elaborate
> methods defined in this PEP will be unused and inaccessible.

And yet they are designed for the benefit of new numeric type
implementations so that mixed-mode arithmetic involving two different
3rd party implementations can be defined soundly. If this is deemed
too controversial or unwieldy and unnecessary I'd be okay with
dropping it, though I'm not sure that it hurts. There are some
problems with specifying it so that all the cases work right, however.
In any case we could have separate mix-in classes for this purpose --
while ABCs *can* be dual-purpose (both specification and mix-in),
there's no rule that says they *must* be.

The problem Jeffrey and I were trying to solve with this part of the
spec is what should happen if you have two independently developed 3rd
party types, e.g. MyReal and YourReal, both implementing the Real ABC.
Obviously we want MyReal("3.5") + YourReal("4.5") to return an object
for which isinstance(x, Real) holds and whose value is close to
float("8.0"). But this is not so easy. Typically, classes like this
have methods like these::

class MyReal:
  def __add__(self, other):
    if not isinstance(other, MyReal): return NotImplemented
    return MyReal(...)
  def __radd__(self, other):
    if not isinstance(other, MyReal): return NotImplemented
    return MyReal(...)

but this doesn't support mixed-mode arithmetic at all.

(Reminder of the underlying mechanism: for a+b, first a.__add__(b) is
tried; if that returns NotImplemented, b.__radd__(a) is tried; if that
also returns NotImplemented, the Python VM raises TypeError.
Exceptions raised at any stage cause the remaining steps to be
abandoned, so if e.g. __add__ raises TypeError, __radd__ is never
tried. This is the crux of the problem we're trying to solve.)

Supporting mixed-mode arithmetic with *known* other types like float
is easy: just add either

  if isinstance(other, float): return self + MyReal(other)

or

  if isinstance(other, float): return float(self) + other

to the top of each method. But that still doesn't support MyReal() +
YourReal(). For that to work, at least one of the two classes has to
blindly attempt to cast the other argument to a known class. For
example, instead of returning NotImplemented, __radd__ (which knows it
is being called as a last resort) could return self+float(other), or
float(self)+float(other), under the assumption that all 3rd party Real
types can be converted to the built-in float type with only moderate
loss.
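
A rough sketch of that "blind cast as a last resort" idea (MyReal is
hypothetical, and it assumes the unknown operand can be converted with
float()):

    class MyReal:
        def __init__(self, value):
            self.value = float(value)
        def __float__(self):
            return self.value
        def __add__(self, other):
            if isinstance(other, MyReal):
                return MyReal(self.value + other.value)
            if isinstance(other, float):
                return MyReal(self.value + other)
            return NotImplemented
        def __radd__(self, other):
            # Called only after other.__add__(self) failed, so blindly
            # cast the unknown operand to float and accept moderate loss.
            try:
                return MyReal(float(other) + self.value)
            except (TypeError, ValueError):
                return NotImplemented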

Unfortunately, there are a lot of cases to consider here. E.g. we
could be adding MyComplex() to YourReal(), and then the float cast in
__radd__ would be a disaster (since MyComplex() can't be cast to
float, only to complex). We are trying to make things easier for 3rd
parties that *do* want to use the ABC as a mix-in, by asking them to
call super.__[r]add__(self, other) whenever the other argument is not
something they specifically recognize.

I'm pretty sure that we haven't gotten the logic for this quite right
yet. I'm only moderately sure that we *can* get it right. But in any
case, please don't let this distract you from the specification part
of the numeric ABCs.

> This really highlights what I think is a problem with dynamic
> inheritance, and I think that this inconsistency between traditional and
> dynamic inheritance will eventually come back to haunt us. It has always
> been the case in the past that for every property of class B, if
> isinstance(A, B) == True, then A also has that property, either
> inherited from B, or overridden in A. The fact that this invariant will
> no longer hold true is a problem in my opinion.

Actually there is a school of thought (which used to prevail amongst
Zopistas, I don't know if they've been cured yet) that class
inheritance was purely for implementation inheritance, and that a
subclass was allowed to reverse policies set by the base class. While
I don't endorse this as a general rule, I don't see how (with a
sufficiently broad definition of "property") your invariant can be
assumed even for traditional inheritance. E.g. a base class may be
hashable or immutable but the subclass may not be (it's trivial to
create mutable subclasses of int, str or tuple).

> I realize that there isn't currently a solution to efficiently allow
> inheritance of properties via dynamic inheritance. As a software
> engineer, however, I generally feel that if a feature is unreliable,
> then it shouldn't be used at all.

That sounds like a verdict against all dynamic properties of the language.

> So if I were designing a class
> hierarchy of ABCs, I would probably make a rule for myself not to define
> any properties or methods in the ABCs at all, and to *only* use ABCs for
> type testing via 'isinstance'.

And that is a fine policy to hold yourself to.

> In other words, if I were writing this PEP,

Hey, you are! Or do you want your name taken off? At the very least I
think the rationale (all your words) needs some revision in the light
of Benji York's comments in another thread.

> all of those special methods
> would be omitted, simply because as a writer of a subclass I couldn't
> rely on being able to use them.

As the author of a subclass, you control the choice between real
inheritance via inclusion in __bases__ or pseudo-inheritance via
register(). So I don't see your point here.

> The only alternative that I can see is to not use dynamic inheritance at
> all, and instead have the number classes inherit from these ABCs using
> the traditional mechanism. But that brings up all the problems of
> immaturity and requiring them to be built-in that I brought up earlier.

That's off the table already.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jason.orendorff at gmail.com  Thu May 17 19:55:57 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Thu, 17 May 2007 13:55:57 -0400
Subject: [Python-3000] pep 3131 again
In-Reply-To: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
References: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
Message-ID: <bb8868b90705171055w619bbd02t76743245fce6deb0@mail.gmail.com>

Martin, this message suggests an addition to PEP 3131.

On 5/16/07, tomer filiba <tomerfiliba at gmail.com> wrote:
> === RTL/LTR ===
> the only practical way to use RTL languages in code is to have an RTL
> programming language, where "if" is spelled "??", "for" as "????",
> "in" as "????", and so on, and the entire program is RTL. having code
> like --
>
> for ??? in ????(1,2,3)
>
> is only unreadable by all means (since the parenthesis are LTR, while
> the name is RTL, etc.)

In theory, the Right Thing to do for this is support Unicode bidi
format control characters.  Check this out:

  for ??? in ?????(1,2,3):
      blort(???)

I just added U+200E, "LEFT-TO-RIGHT MARK", after each
misbehaving RTL identifier, as recommended here:
  http://unicode.org/reports/tr9/#Usage

Note: some mail/news agents strip out format characters.
(?.gnikrow era sretcarahc lortnoc idib ,siht daer nac uoy fI??)
(?If you can read this, control characters were stripped/ignored.??)

Now... it's clearly absurd to be pasting invisible magic characters
into source code, but that part is automatable.  Just hack your
editor to add U+200E after each run of strong-RTL characters,
except in strings and comments.  The real problems are:

1.  Many editors don't have bidi support.  This might improve
with time.  Or not.

2.  Python forbids these characters.  Martin, JavaScript
treats these specially, and I think Python probably
should, too:

The ECMAScript 3 standard for JavaScript requires the
tokenizer to throw away all Unicode format-control characters
(general category Cf).

ECMAScript 4 will likely tweak this (an incompatible change)
to retain those characters only in strings and regexps.
I like that better.

Cheers,
-j

From foom at fuhm.net  Thu May 17 20:03:54 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 17 May 2007 14:03:54 -0400
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464BBE2B.1050201@acm.org>
References: <464BBE2B.1050201@acm.org>
Message-ID: <2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>

I mentioned this in another thread as an aside in the middle of the  
email, but I thought I'd put it out here at the top:

It should be considered whether formatting characters should be  
ignored. And if so, which list of properties should be used for that.

I notice that the excerpt from the C# standard says:
>     * 4 Any formatting-characters are removed.

I don't know what they mean by that, but I'm going to guess  
characters in the Cf class.

However, UAX #31 says:
> 2.2 Layout and Format Control Characters
>
> Certain Unicode characters are used to control joining behavior,  
> bidirectional ordering control, and alternative formats for  
> display. These have the General_Category value of Cf. Unlike space  
> characters or other delimiters, they do not indicate word, line, or  
> other unit boundaries.
>
> While it is possible to ignore these characters in determining  
> identifiers, the recommendation is to not ignore them and to not  
> permit them in identifiers except in special cases. This is because  
> of the possibility for confusion between two visually identical  
> strings; see [UTR36]. Some possible exceptions are the ZWJ and ZWNJ  
> in certain contexts, such as between certain characters in Indic  
> words.

It doesn't seem to me that an attack vector here is particularly  
relevant, so perhaps going along with C# and ignoring Cf characters  
in the source code might be a good idea. But I do notice that Unicode  
4.0.1 and earlier used to recommend ignoring formatting characters in  
identifiers (Ch 5 of the book), so that might be where C# got it from.

So, maybe it's better to keep the status quo, and not allow Cf  
characters, unless someone comes up with a particular need for doing  
so. Hm, I think I've convinced myself of that now. :)

James

From martin at v.loewis.de  Thu May 17 20:22:13 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 17 May 2007 20:22:13 +0200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>
References: <464BBE2B.1050201@acm.org>
	<2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>
Message-ID: <464C9D55.9080501@v.loewis.de>

> So, maybe it's better to keep the status quo, and not allow Cf
> characters, unless someone comes up with a particular need for doing so.
> Hm, I think I've convinced myself of that now. :)

That is my reasoning, too. People seem to want to be conservative,
so it's safer to reject formatting characters for the moment.
If people come up with a need, they still can be added.

(there might be a need for it in RTL languages, supporting
200E..200F and 202A..202E, but it seems that speakers of RTL
languages are skeptical about the entire PEP, so it's unclear
whether allowing these would help anything)

Regards,
Martin

From collinw at gmail.com  Thu May 17 20:24:14 2007
From: collinw at gmail.com (Collin Winter)
Date: Thu, 17 May 2007 11:24:14 -0700
Subject: [Python-3000] Updated and simplified PEP 3141: A Type Hierarchy
	for Numbers
In-Reply-To: <ca471dc20705171008t9aa7625if9fa5f5249ffe56e@mail.gmail.com>
References: <5d44f72f0705161731j4700bdb3h4e36e97757bd6a32@mail.gmail.com>
	<464C06E3.2090104@acm.org> <464C34C4.2080702@gmail.com>
	<43aa6ff70705170937u19113f3et9f23971448049c0e@mail.gmail.com>
	<ca471dc20705171008t9aa7625if9fa5f5249ffe56e@mail.gmail.com>
Message-ID: <43aa6ff70705171124i63c3edc7j1d4e133bdce1ce4f@mail.gmail.com>

On 5/17/07, Guido van Rossum <guido at python.org> wrote:
> On 5/17/07, Collin Winter <collinw at gmail.com> wrote:
> > ABCs can define concrete methods. These concrete methods provide
> > functionality that the child classes do not themselves provide.
>
> You seem to be misreading my intention here. ABCs serve two purposes:
> they are interface specifications, and they provide "default" or
> "mix-in" implementations of some of the methods they specify. The
> pseudo-inheritance enabled by the register() call uses only the
> specification part, and requires that the registered class implement
> all the specified methods itself. In order to benefit from the
> "mix-in" side of the ABC, you must subclass it directly.

I think I'm getting confused between the PEP and what you've said at
one of the various whiteboard sessions.

From guido at python.org  Thu May 17 20:36:27 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 May 2007 11:36:27 -0700
Subject: [Python-3000] PEP 3133: Introducing Roles
In-Reply-To: <464C5570.5050205@benjiyork.com>
References: <43aa6ff70705132236w6d2dbfc4r8d1fadede753a6ca@mail.gmail.com>
	<4648D626.1030201@benjiyork.com>
	<ca471dc20705152146p382ac396u5f7dc4b4f138cd26@mail.gmail.com>
	<464B4237.4090802@benjiyork.com>
	<ca471dc20705161050w3f99d762r68b90bd5253c1b8c@mail.gmail.com>
	<464C5570.5050205@benjiyork.com>
Message-ID: <ca471dc20705171136s5a4879ebgc854f06a478d694a@mail.gmail.com>

On 5/17/07, Benji York <benji at benjiyork.com> wrote:
> > [PEP 3119]
>  > In classical OOP theory, invocation is the preferred usage pattern,
>  > and inspection is actively discouraged, being considered a relic of an
>  > earlier, procedural programming style.  However, in practice this view
>  > is simply too dogmatic and inflexible, and leads to a kind of design
>  > rigidity that is very much at odds with the dynamic nature of a
>  > language like Python.
>
> I disagree with the last sentence in the above paragraph.  While
> zope.interface has been shown (in a separate message) to perform the
> same tasks as the "roles" PEP (3133), and below I show the similarities
> between this PEP (ABCs) and zope.interface, I want to point out that
> users of zope.interface don't actually use it in these ways.

I'm not wedded to this sentence; the rationale didn't get a facelift
like the rest of the PEP. I do want to point out that other mechanisms
like GFs need to have access to isinstance/isclass or equivalent in
order to do their magic.

> So, what /do/ people use zope.interface for?  There are two primary
> uses: making contracts explicit and adaptation.  If more detail is
> desired about these uses; I'll be glad to share.

I know what they are. Adaptation is another area where isinstance or
equivalent is needed by the underlying machinery.

I do note that a fairly common place where *some* kind of type
checking (whether isinstance- or hasattr-based) occurs is the implementation
of binary operators; a typical __add__ or __radd__ method usually
starts by testing whether the other argument is an object it
understands, and if not, it returns NotImplemented. Since this is
essentially *implementing* a (limited) GF strategy, I don't see how
adaptation or GFs will help to eliminate such type checks.

> My main point is that the time machine worked; people have had the moral
> equivalent of ABCs and Roles for years and have decided against using
> them the way the PEPs envision.

It's not the only way the PEP envisions they are used, and no longer
the major way. I expect that the uses you mention are actually more
important. And given that people want these I think it would be useful
to have them in the standard library.

> Of course if people still think ABCs
> are keen, then a stand-alone package can be created and we can see if
> there is uptake, if so; it can be added to the standard library later.

I want to add something to the standard library now, because it's been
relegated to 3rd party status for too long. However, I think I can do
better than zope.interface; in some cases I just disagree with its
design choices, in other cases I can change the language to improve
upon the contortions that zope had to go through to make things look
right. (E.g. overloading isinstance, keyword arguments to class
declarations, class decorators.) I am also leaving some of the more
esoteric parts of zope.interface out (e.g. assertions about
instances), but the mechanism I am proposing (isinstance overloading)
supports this just fine.

> If I recall correctly, the original motivation for ABCs was that
> sometimes people want to "sniff" an object and see what it is, almost always
> to dispatch appropriately.  That use case of "dispatch in the small"
> would seem to me to be much better addressed by generic functions.  If
> those generic functions want something in addition to classes to
> dispatch on, then interfaces can be used too.

That's just one motivation. Another, more important motivation is to
have something that can fulfill the role that zope.interfaces
fulfills, but with fewer arbitrary differences from what we already
know, a class hierarchy.

> If GF aren't desirable for that use case, then basefile, basesequence,
> and basemapping can be added to Python and cover 90% of what people
> need.  I think the Java Collections system has shown that it's not
> necessary to provide all interfaces for all people.  If you can only
> provide a subset of an interface, make unimplemented methods raise
> NotImplementedError.

The ABC PEP currently defines fewer ABCs than the Java Collections
system, so I'm not sure what you're worried about.

>  > Overloading ``isinstance()`` and ``issubclass()``
>  > -------------------------------------------------
>
> Perhaps the PEP should just be reduced to include only this section.

And require every 3rd party library to invent its own notions of
sequence, mapping etc.? No way!

> Sidebar: this highlights one of the reasons zope.interface users employ
> the naming convention of prefixing their interface names with "I", it
> helps keep interface names short while giving you an easy name for
> "interface that corresponds to things of class Foo", which would be
> IFoo.

Yeah, that is out of necessity because zope forces you to have a
separate interface and implementation class. The ABC proposal does
away with this silly duplicate hierarchy.

>  >     assert issubclass(list, MyClass)
>
>      assert MyClassInterface.implementedBy(list)
>
>  >     assert issubclass(list, MyABC)
>
>      assert MyClassInterface.extends(MyInterface)
>
>  > You can also register another ABC::
>  >
>  >     class AnotherClass(metaclass=ABCMeta):
>  >         pass
>
>      class AnotherInterface(zope.interface.Interface):
>          pass
>
>  >     AnotherClass.register(basestring)
>
>      zope.interface.classImplements(basestring, AnotherInterface)
>
>  >     MyClass.register(AnotherClass)
>
> I don't quite understand the intent of the above line.  It appears to be
> extending the contract that AnotherClass embodies to promise to fulfill
> any contract that MyClass embodies.  That seems to be an unusual thing
> to want to express.

MyClass is meant to be an ABC here whose contract is weaker than that
of AnotherClass. Suppose there was only a built-in MutableSequence,
but a 3rd party needed a Sequence interface that didn't need to be
mutable. It could use this example, with AnotherClass being the
built-in MutableSequence, and MyClass being the 3rd party's Sequence
class.

> Although unusual, you could still do it using
> zope.interface.  One way would be to add MyClassInterface to the
> __bases__ of AnotherInterface.
>
> OTOH, I might be confused by the collapsing of the class and interface
> hierarchies.  Do the classes in the above line of code represent the
> implementation or specification?

Specification.

> [snip]
>
>  > ABCs for Containers and Iterators
>  > ---------------------------------
>
> zope.interface defines similar interfaces.  Surprisingly they aren't
> used all that often.  They can be viewed at
> http://svn.zope.org/zope.interface/trunk/src/zope/interface/common/.
> The files mapping.py, sequence.py, and idatetime.py are the most
> interesting.

I believe I worked for Zope Corp around the time these were designed.
I remember it was pretty painful to agree on which methods to include
or exclude. Having this decided by the standard library would solve
the problem by fiat for a larger audience.

> [snip rest]
>
> > I was just
> > thinking of how to "sell" ABCs as an alternative to current happy
> > users of zope.interfaces.
>
> One of the things that makes zope.interface users happy is the
> separation of specification and implementation.  The increasing
> separation of specification from implementation is what has
> driven Abstract Data Types in procedural languages, encapsulation
> in OOP, and now zope.interface.  Mixing the two back together in
> ABCs doesn't seem attractive.

This is a self-selecting audience -- people who see it differently
(like me) aren't likely to use zope.interface, or might be hoping for
something better.

> As for "selling" current users on an alternative, why bother?  If people
> need interfaces, they know where to find them.  I suspect I'm confused
> as to the intent of this discussion.

Many people have a strong preference for standard features over 3rd
party features, everything else being equal -- or even if everything
else isn't quite equal (but mostly so). A feature-by-feature
comparison between zope.interface and ABCs is helpful for people who
aren't yet current zope.interface users. It seems that, on balance,
we have:

- ABCs have support in the standard library

- ABCs don't force you to separate specification from implementation.
This counts as a pro for some, a con for others

- ABCs (out of the box) don't let you make assertions about instances.
This is a con for those who need that feature,  but I haven't met many
of those.

- With ABCs, the spelling of "does this object conform to this
interface" and "does this object inherit implementation from this
class" is the same (isinstance()). This counts as a pro for some, a
con for others. IMO it's mostly emotional and there is no rational
reason to worry about the spelling being the same.

I won't be offended if the authors and users of zope.interface decide
to ignore the ABCs in the standard library; the rest of us will still
benefit from them (and from GFs). However, I fully expect that
zope.interface will embrace at least some of the mechanisms added to
Py3k in support of ABCs, as they make implementing zope.interface
easier too. Also, if we add standard GFs, zope.interface will want to
work closely with those.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Thu May 17 20:42:56 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 17 May 2007 20:42:56 +0200
Subject: [Python-3000] pep 3131 again
In-Reply-To: <bb8868b90705171055w619bbd02t76743245fce6deb0@mail.gmail.com>
References: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
	<bb8868b90705171055w619bbd02t76743245fce6deb0@mail.gmail.com>
Message-ID: <464CA230.3000004@v.loewis.de>

> 2.  Python forbids these characters.  Martin, JavaScript
> treats these specially, and I think Python probably
> should, too:
> 
> The ECMAScript 3 standard for JavaScript requires the
> tokenizer to throw away all Unicode format-control characters
> (general category Cf).
> 
> ECMAScript 4 will likely tweak this (an incompatible change)
> to retain those characters only in strings and regexps.
> I like that better.

I've added this as an open issue. It would be easy to add,
but I would like to get some confirmation first that it
actually helps writers of the RTL languages (preferably
from some native speakers).

The proposed change would be that Cf characters would be
allowed *only* in and immediately around identifiers, and
in string literals and comments, i.e. the scanner would
work this way:

- perform token classification only based on individual
  ASCII letters; classify all non-ASCII letters as potential
  identifiers.
- for potential identifiers (i.e. runs of
  non-ASCII characters and ASCII letters, digits, and
  underscore), drop Cf characters, then verify identifier
  syntax.

IOW, you couldn't put the formatting characters around
whitespace, keywords, or punctuation.

An alternative implementation would be to drop formatting
characters everywhere except in string literals.
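
A rough sketch of the first variant, assuming a Py3k-style
str.isidentifier() for the final check (the helper name is made up):

    import unicodedata

    def clean_identifier(token):
        # Drop Cf (format-control) characters from the candidate
        # identifier, then verify that what remains is still valid.
        stripped = "".join(ch for ch in token
                           if unicodedata.category(ch) != "Cf")
        if not stripped.isidentifier():
            raise SyntaxError("invalid identifier: %r" % (token,))
        return stripped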

I'll repeat that UTR#39 explicitly discourages support
for formatting characters in identifiers.

Regards,
Martin


From rasky at develer.com  Fri May 18 00:50:36 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Fri, 18 May 2007 00:50:36 +0200
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
Message-ID: <f2im7s$cur$2@sea.gmane.org>

On 17/05/2007 18.48, Guido van Rossum wrote:

> I have accepted PEP 3131.

Do you have a rationale to share with us? Especially given that your previous 
public mails about the PEP looked mostly against it. This way, the rationale 
can be embedded in the PEP for future reference.

Thanks!
-- 
Giovanni Bajo


From guido at python.org  Fri May 18 00:56:19 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 May 2007 15:56:19 -0700
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <f2im7s$cur$2@sea.gmane.org>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<f2im7s$cur$2@sea.gmane.org>
Message-ID: <ca471dc20705171556l998fbbak42c2ffdee7dc9337@mail.gmail.com>

You've missed a few of my mails. I liked the reports from the Java world.

On 5/17/07, Giovanni Bajo <rasky at develer.com> wrote:
> On 17/05/2007 18.48, Guido van Rossum wrote:
>
> > I have accepted PEP 3131.
>
> Do you have a rationale to share with us? Especially given that your previous
> public mails about the PEP looked mostly against it. This way, the rationale
> can be embedded in the PEP for future reference.
>
> Thanks!
> --
> Giovanni Bajo
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rasky at develer.com  Fri May 18 01:04:35 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Fri, 18 May 2007 01:04:35 +0200
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>
	<ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>
Message-ID: <f2in24$h71$1@sea.gmane.org>

On 13/05/2007 21.31, Guido van Rossum wrote:

> The answer to all of this is the filesystem encoding, which is already
> supported. Doesn't appear particularly difficult to me.

sys.getfilesystemencoding() is None on most Linux computers I have access to. 
How is the problem solved there?

In fact, I have a question about this. Can anybody show me a valid
multi-platform Python code snippet that, given a filename as a *unicode* string,
creates a file with that name, possibly adjusting the name so as to ignore any
encoding problem (so that the function *always* succeeds)?

def dump_to_file(unicode_filename):
     ...


I attempted this a couple of times without being satisfied at all by the 
solutions.
-- 
Giovanni Bajo


From guido at python.org  Fri May 18 01:10:25 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 May 2007 16:10:25 -0700
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <f2in24$h71$1@sea.gmane.org>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>
	<ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>
	<f2in24$h71$1@sea.gmane.org>
Message-ID: <ca471dc20705171610q698ad0bciaa63f4d96a2cae0@mail.gmail.com>

On 5/17/07, Giovanni Bajo <rasky at develer.com> wrote:
> On 13/05/2007 21.31, Guido van Rossum wrote:
>
> > The answer to all of this is the filesystem encoding, which is already
> > supported. Doesn't appear particularly difficult to me.
>
> sys.getfilesystemencoding() is None on most Linux computers I have access to.
> How is the problem solved there?

I suppose on such systems filenames are binary strings (except for '/'
and '\0'), and defaulting to utf8 would work just fine.

> In fact, I have a question about this. Can anybody show me a valid
> multi-platform Python code snippet that, given a filename as *unicode* string,
> create a file with that name, possibly adjusting the name so to ignore an
> encoding problem (so that the function *always* succeed)?
>
> def dump_to_file(unicode_filename):
>      ...
>
>
> I attempted this a couple of times without being satisfied at all by the
> solutions.

Why does it have to be cross-platform? The mapping from module names
to the filesystem is considered platform specific.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rasky at develer.com  Fri May 18 01:14:00 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Fri, 18 May 2007 01:14:00 +0200
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <ca471dc20705171610q698ad0bciaa63f4d96a2cae0@mail.gmail.com>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>	<ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>	<f2in24$h71$1@sea.gmane.org>
	<ca471dc20705171610q698ad0bciaa63f4d96a2cae0@mail.gmail.com>
Message-ID: <f2injo$h71$2@sea.gmane.org>

On 18/05/2007 1.10, Guido van Rossum wrote:

 >> In fact, I have a question about this. Can anybody show me a valid
 >> multi-platform Python code snippet that, given a filename as *unicode*
 >> string,
 >> create a file with that name, possibly adjusting the name so to ignore an
 >> encoding problem (so that the function *always* succeed)?
 >>
 >> def dump_to_file(unicode_filename):
 >>      ...
 >>
 >>
 >> I attempted this a couple of times without being satisfied at all by the
 >> solutions.
 >
 > Why does it have to be cross-platform? The mapping from module names
 > to the filesystem is considered platform specific.

With cross-platform, I meant a snippet of code that works on all platforms.
The canonicalization of the filename that is produced could of course be
different on each platform.
-- 
Giovanni Bajo


From guido at python.org  Fri May 18 01:22:11 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 May 2007 16:22:11 -0700
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <f2injo$h71$2@sea.gmane.org>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>
	<ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>
	<f2in24$h71$1@sea.gmane.org>
	<ca471dc20705171610q698ad0bciaa63f4d96a2cae0@mail.gmail.com>
	<f2injo$h71$2@sea.gmane.org>
Message-ID: <ca471dc20705171622p6a16d594kf4fa00a896af5853@mail.gmail.com>

On 5/17/07, Giovanni Bajo <rasky at develer.com> wrote:
> On 18/05/2007 1.10, Guido van Rossum wrote:
>
>  >> In fact, I have a question about this. Can anybody show me a valid
>  >> multi-platform Python code snippet that, given a filename as *unicode*
>  >> string,
>  >> create a file with that name, possibly adjusting the name so to ignore an
>  >> encoding problem (so that the function *always* succeed)?
>  >>
>  >> def dump_to_file(unicode_filename):
>  >>      ...
>  >>
>  >>
>  >> I attempted this a couple of times without being satisfied at all by the
>  >> solutions.
>  >
>  > Why does it have to be cross-platform? The mapping from module names
>  > to the filesystem is considered platform specific.
>
> With cross-platform, I meant a snippet of code which worked on all platform.
> The canonicalization of the filename that is produced could of course be
> different on each plaform.

And I meant what I said. The algorithm is up to the Python
implementation on a specific platform. This means that we will have to
decide what it will be. Feel free to contribute a suggestion to the
PEP author.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rasky at develer.com  Fri May 18 01:22:07 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Fri, 18 May 2007 01:22:07 +0200
Subject: [Python-3000] pep 3131 again
In-Reply-To: <464C1EF4.6040603@v.loewis.de>
References: <1d85506f0705161806w19914adfid76e36c4151c336@mail.gmail.com>
	<464C1EF4.6040603@v.loewis.de>
Message-ID: <f2io30$jvm$1@sea.gmane.org>

On 17/05/2007 11.23, Martin v. Löwis wrote:

> Whether or not Japanese or Chinese people with no knowledge of
> English still can master the Latin alphabet easily, I don't know,
> as all Chinese people I do know speak German or English well.

All Chinese people are taught the Latin-character transliteration of Mandarin 
in school. It's called "pin-yin": http://en.wikipedia.org/wiki/Pin_yin. In 
fact, they use this Latin transliteration as the main *means* to teach children 
how to pronounce each Chinese character.

This transliteration is so common that it is supported as an input method on 
devices like cellphones or keyboards (even though it is usually not the 
default: they have more specific and tuned input methods for computers and SMS).

So yes, Chinese people do master the Latin alphabet. And funnily enough for 
this thread, Pin-yin cannot be fully expressed in ASCII because it requires 
accented vowels (such as ǎ or ü, and many others I don't have handy).
-- 
Giovanni Bajo



From foom at fuhm.net  Fri May 18 01:24:21 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 17 May 2007 19:24:21 -0400
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <f2in24$h71$1@sea.gmane.org>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>
	<ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>
	<f2in24$h71$1@sea.gmane.org>
Message-ID: <8CC6390F-107E-43A2-AD3A-8051BF039B70@fuhm.net>


On May 17, 2007, at 7:04 PM, Giovanni Bajo wrote:

> On 13/05/2007 21.31, Guido van Rossum wrote:
>
>> The answer to all of this is the filesystem encoding, which is  
>> already
>> supported. Doesn't appear particularly difficult to me.
>
> sys.getfilesystemencoding() is None on most Linux computers I have  
> access to.
> How is the problem solved there?
>
> In fact, I have a question about this. Can anybody show me a valid
> multi-platform Python code snippet that, given a filename as  
> *unicode* string,
> create a file with that name, possibly adjusting the name so to  
> ignore an
> encoding problem (so that the function *always* succeed)?
>
> def dump_to_file(unicode_filename):
>      ...

unicode_filename.encode(sys.getfilesystemencoding() or 'ascii',  
'xmlcharrefreplace') would work.

Although I don't think I've seen a platform where  
sys.getfilesystemencoding() is None.

If I unset LANG/LANGUAGE/LC_*, python reports 'ANSI_X3.4-1968'. But  
normally on my system it reports 'UTF-8', since I have LANG=en_US.UTF-8.
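
Wrapped up as a helper, the suggestion above is roughly this (just
a sketch; fs_name is a made-up name):

    import sys

    def fs_name(unicode_filename):
        # Fall back to ASCII when no filesystem encoding is reported;
        # unencodable characters become &#NNNN; references instead of
        # raising UnicodeEncodeError.
        enc = sys.getfilesystemencoding() or 'ascii'
        return unicode_filename.encode(enc, 'xmlcharrefreplace')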

The *really* tricky thing is that on unix systems, if you want to be  
able to access all the files on the disk, you have to use the byte- 
string API, as not all filenames are convertible to unicode. But on  
windows, if you want to be able to access all the files on the disk,  
you *CANNOT* use the byte-string api, because not all filenames  
(which are unicode on disk) are convertible to bytestrings via the  
"mbcs" encoding (which is what getfilesystemencoding() reports). It's  
quite a pain in the ass really.

James

From rasky at develer.com  Fri May 18 01:31:03 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Fri, 18 May 2007 01:31:03 +0200
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <8CC6390F-107E-43A2-AD3A-8051BF039B70@fuhm.net>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>	<ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>	<f2in24$h71$1@sea.gmane.org>
	<8CC6390F-107E-43A2-AD3A-8051BF039B70@fuhm.net>
Message-ID: <f2iojn$l55$1@sea.gmane.org>

On 18/05/2007 1.24, James Y Knight wrote:

> unicode_filename.encode(sys.getfilesystemencoding() or 'ascii',  
> 'xmlcharrefreplace') would work.

Thanks - using "xmlcharrefreplace" hadn't occurred to me!

> The *really* tricky thing is that on unix systems, if you want to be  
> able to access all the files on the disk, you have to use the byte- 
> string API, as not all filenames are convertible to unicode. But on  
> windows, if you want to be able to access all the files on the disk,  
> you *CANNOT* use the byte-string api, because not all filenames  
> (which are unicode on disk) are convertible to bytestrings via the  
> "mbcs" encoding (which is what getfilesystemencoding() reports). It's  
> quite a pain in the ass really.

Yes. I hope that Py3k will solve this somehow.
-- 
Giovanni Bajo


From guido at python.org  Fri May 18 01:48:57 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 May 2007 16:48:57 -0700
Subject: [Python-3000] Radical idea: remove built-in open (require import io)
Message-ID: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>

Do people think it would be too radical if the built-in open()
function was removed altogether, requiring all code that opens files
to import the io module first? This would make it easier to identify
modules that engage in I/O.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at gmail.com  Fri May 18 02:14:49 2007
From: aleaxit at gmail.com (Alex Martelli)
Date: Thu, 17 May 2007 17:14:49 -0700
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
Message-ID: <e8a0972d0705171714r1400f9eyb280d2c9878abf73@mail.gmail.com>

On 5/17/07, Guido van Rossum <guido at python.org> wrote:
> Do people think it would be too radical if the built-in open()
> function was removed altogether, requiring all code that opens files
> to import the io module first? This would make it easier to identify
> modules that engage in I/O.

I think it would be an excellent idea.

Among other advantages, it makes it easier/cleaner to "mock things up"
for testing purposes.

Right now, if I want to make very small and lightweight unit-tests for
a module that uses `open', I have to do that by poking a fake 'open'
in the builtins (or in the module under test, but that may be hard to
achieve if the module imports other modules which import other modules
which...).  I do it, but not happily.

If all I/O occurred through the io module, I could mock things up in
an easier and cleaner way by sticking a "mock io module" in
sys.modules['io'] before I import from my unittest the module I'm
testing -- very similar to what I do in order to have small
lightweight tests of modules that interact with the filesystem with
functions such as os.listdir, and the like; I am far more comfortable
with this approach than I am with poking into builtins.
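
Concretely, something like this (a sketch; module_under_test stands
for whatever module is actually being tested):

    import sys, types, StringIO

    fake_io = types.ModuleType('io')
    def fake_open(name, mode='r'):
        # Hand back an in-memory file instead of touching the disk.
        return StringIO.StringIO()
    fake_io.open = fake_open

    sys.modules['io'] = fake_io    # install the mock first...
    import module_under_test       # ...so its "import io" picks it up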


Alex

From shiblon at gmail.com  Fri May 18 02:20:12 2007
From: shiblon at gmail.com (Chris Monson)
Date: Thu, 17 May 2007 20:20:12 -0400
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
Message-ID: <da3f900e0705171720n7a0bac04u185752114998d32d@mail.gmail.com>

Would other IO builtins also move, like (formerly raw_) input and
print?  What about the file type?

It seems to me that if the rationale is to make use of IO
identifiable, then all IO functions would have to move into the io
module.  What am I missing?

- C

On 5/17/07, Guido van Rossum <guido at python.org> wrote:
> Do people think it would be too radical if the built-in open()
> function was removed altogether, requiring all code that opens files
> to import the io module first? This would make it easier to identify
> modules that engage in I/O.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/shiblon%40gmail.com
>

From guido at python.org  Fri May 18 02:42:54 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 May 2007 17:42:54 -0700
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <da3f900e0705171720n7a0bac04u185752114998d32d@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
	<da3f900e0705171720n7a0bac04u185752114998d32d@mail.gmail.com>
Message-ID: <ca471dc20705171742s74976fbere593192438d5e685@mail.gmail.com>

On 5/17/07, Chris Monson <shiblon at gmail.com> wrote:
> Would other IO builtins also move, like (formerly raw_) input and
> print?  What about the file type?

The file type is already gone in py3k.

> it seems to me that if the rationale is to make use of IO
> identifiable, then all IO functions would have to move into the io
> module.  What am I missing?

I guess a refinement of the point is that you need the io module to
create new I/O streams, while input() and print() act on existing
streams. Code that makes read() and write() calls doesn't need to
import the io module either, so we're not really making all I/O
identifiable, just the open() calls.

--Guido

> - C
>
> On 5/17/07, Guido van Rossum <guido at python.org> wrote:
> > Do people think it would be too radical if the built-in open()
> > function was removed altogether, requiring all code that opens files
> > to import the io module first? This would make it easier to identify
> > modules that engage in I/O.
> >
> > --
> > --Guido van Rossum (home page: http://www.python.org/~guido/)
> > _______________________________________________
> > Python-3000 mailing list
> > Python-3000 at python.org
> > http://mail.python.org/mailman/listinfo/python-3000
> > Unsubscribe:
> > http://mail.python.org/mailman/options/python-3000/shiblon%40gmail.com
> >
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From shiblon at gmail.com  Fri May 18 02:58:49 2007
From: shiblon at gmail.com (Chris Monson)
Date: Thu, 17 May 2007 20:58:49 -0400
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <ca471dc20705171742s74976fbere593192438d5e685@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
	<da3f900e0705171720n7a0bac04u185752114998d32d@mail.gmail.com>
	<ca471dc20705171742s74976fbere593192438d5e685@mail.gmail.com>
Message-ID: <da3f900e0705171758o9e416aep9a6742dd05b7d740@mail.gmail.com>

On 5/17/07, Guido van Rossum <guido at python.org> wrote:
>
> On 5/17/07, Chris Monson <shiblon at gmail.com> wrote:
> > Would other IO builtins also move, like (formerly raw_) input and
> > print?  What about the file type?
>
> The file type is already gone in py3k.
>
> > it seems to me that if the rationale is to make use of IO
> > identifiable, then all IO functions would have to move into the io
> > module.  What am I missing?
>
> I guess a refinement of the point is that you need the io module to
> create new I/O streams, while input() and print() act on existing
> streams. Code that makes read() and write() calls doesn't need to
> import the io module either, so we're not really making all I/O
> identifiable, just the open() calls.


Aha.  Of course, now that you say all of that, it seems obvious.  :-)

- C

--Guido
>
> > - C
> >
> > On 5/17/07, Guido van Rossum <guido at python.org> wrote:
> > > Do people think it would be too radical if the built-in open()
> > > function was removed altogether, requiring all code that opens files
> > > to import the io module first? This would make it easier to identify
> > > modules that engage in I/O.
> > >
> > > --
> > > --Guido van Rossum (home page: http://www.python.org/~guido/)
> > > _______________________________________________
> > > Python-3000 mailing list
> > > Python-3000 at python.org
> > > http://mail.python.org/mailman/listinfo/python-3000
> > > Unsubscribe:
> > > http://mail.python.org/mailman/options/python-3000/shiblon%40gmail.com
> > >
> >
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-3000/attachments/20070517/d45f58cd/attachment.html 

From greg.ewing at canterbury.ac.nz  Fri May 18 03:21:17 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 18 May 2007 13:21:17 +1200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464C9D55.9080501@v.loewis.de>
References: <464BBE2B.1050201@acm.org>
	<2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>
	<464C9D55.9080501@v.loewis.de>
Message-ID: <464CFF8D.7040504@canterbury.ac.nz>

Martin v. Löwis wrote:

> (there might be a need for it in RTL languages, supporting
> 200E..200F and 202A..202E, but it seems that speakers of RTL
> languages are skeptical about the entire PEP, so it's unclear
> whether allowing these would help anything)

The ideal kind of programming language for use by both
LTR and RTL people would be some kind of RPN. Then the
whole program could be read either way as either prefix
or postfix. :-)

--
Greg

From greg.ewing at canterbury.ac.nz  Fri May 18 03:41:29 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 18 May 2007 13:41:29 +1200
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <8CC6390F-107E-43A2-AD3A-8051BF039B70@fuhm.net>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>
	<ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>
	<f2in24$h71$1@sea.gmane.org>
	<8CC6390F-107E-43A2-AD3A-8051BF039B70@fuhm.net>
Message-ID: <464D0449.9020406@canterbury.ac.nz>

James Y Knight wrote:

> The *really* tricky thing is that on unix systems, if you want to be  
> able to access all the files on the disk, you have to use the byte- 
> string API ... But on windows ... you *CANNOT* use the byte-string api

How are we going to cope with this in Py3k with
unicode-only strings?

--
Greg

From shiblon at gmail.com  Fri May 18 04:09:52 2007
From: shiblon at gmail.com (Chris Monson)
Date: Thu, 17 May 2007 22:09:52 -0400
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464CFF8D.7040504@canterbury.ac.nz>
References: <464BBE2B.1050201@acm.org>
	<2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>
	<464C9D55.9080501@v.loewis.de> <464CFF8D.7040504@canterbury.ac.nz>
Message-ID: <da3f900e0705171909j589bf2cdj97299acc583a09dd@mail.gmail.com>

Ignoring for a moment that prefix != reverse(postfix), that is....

:-)

- C

On 5/17/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Martin v. Löwis wrote:
>
> > (there might be a need for it in RTL languages, supporting
> > 200E..200F and 202A..202E, but it seems that speakers of RTL
> > languages are skeptical about the entire PEP, so it's unclear
> > whether allowing these would help anything)
>
> The ideal kind of programming language for use by both
> LTR and RTL people would be some kind of RPN. Then the
> whole program could be read either way as either prefix
> or postfix. :-)
>
> --
> Greg
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/shiblon%40gmail.com
>

From guido at python.org  Fri May 18 04:35:23 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 17 May 2007 19:35:23 -0700
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <464D0449.9020406@canterbury.ac.nz>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>
	<ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>
	<f2in24$h71$1@sea.gmane.org>
	<8CC6390F-107E-43A2-AD3A-8051BF039B70@fuhm.net>
	<464D0449.9020406@canterbury.ac.nz>
Message-ID: <ca471dc20705171935n7c89d645pe8268779bd45df68@mail.gmail.com>

On 5/17/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> James Y Knight wrote:
> > The *really* tricky thing is that on unix systems, if you want to be
> > able to access all the files on the disk, you have to use the byte-
> > string API ... But on windows ... you *CANNOT* use the byte-string api
>
> How are we going to cope with this in Py3k with
> unicode-only strings?

Not any different than we do now -- you can already pass both types of
strings to a Windows API, and we convert them to the kind of string the
API needs.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz at pythoncraft.com  Fri May 18 04:56:52 2007
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 17 May 2007 19:56:52 -0700
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
Message-ID: <20070518025651.GA7643@panix.com>

On Thu, May 17, 2007, Guido van Rossum wrote:
>
> Do people think it would be too radical if the built-in open()
> function was removed altogether, requiring all code that opens files
> to import the io module first? This would make it easier to identify
> modules that engage in I/O.

My initial take was -1, but now that I see that the existing tutorial
introduces modules before it discusses files, I'm only -0.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Look, it's your affair if you want to play with five people, but don't
go calling it doubles."  --John Cleese anticipates Usenet

From brett at python.org  Fri May 18 05:53:20 2007
From: brett at python.org (Brett Cannon)
Date: Thu, 17 May 2007 20:53:20 -0700
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
Message-ID: <bbaeab100705172053u420a0a7dk5da680a9815b713f@mail.gmail.com>

On 5/17/07, Guido van Rossum <guido at python.org> wrote:
>
> Do people think it would be too radical if the built-in open()
> function was removed altogether, requiring all code that opens files
> to import the io module first? This would make it easier to identify
> modules that engage in I/O.



I support it.  My security work wanted open and execfile yanked out of the
built-in namespace anyway.  =)

-Brett
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-3000/attachments/20070517/975a52f6/attachment.htm 

From martin at v.loewis.de  Fri May 18 07:26:09 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 18 May 2007 07:26:09 +0200
Subject: [Python-3000] Unicode strings, identifiers, and import
In-Reply-To: <f2in24$h71$1@sea.gmane.org>
References: <dcbbbb410705131204x17a32b5dq363703f219cb707@mail.gmail.com>	<ca471dc20705131231r510d6fb3i10f035feee2d4477@mail.gmail.com>
	<f2in24$h71$1@sea.gmane.org>
Message-ID: <464D38F1.5030808@v.loewis.de>

>> The answer to all of this is the filesystem encoding, which is already
>> supported. Doesn't appear particularly difficult to me.
> 
> sys.getfilesystemencoding() is None on most Linux computers I have access to. 

That's strange. Is LANG not set?

> How is the problem solved there?

A default needs to be applied. In 2.x, the default is the system
encoding. Not sure whether the notion of a Python system encoding
will be preserved for 3.x, but it should be safe, on Unix, to default
to UTF-8 for the file system encoding unless LANG specifies something
different.
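
Something along these lines, say (only a sketch; locale.getpreferredencoding()
reflects LANG/LC_CTYPE on most Unix systems):

    import sys, locale

    def fs_encoding():
        # What Python reports, else the locale's preference, else UTF-8.
        return (sys.getfilesystemencoding()
                or locale.getpreferredencoding()
                or 'utf-8')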

> In fact, I have a question about this. Can anybody show me a valid 
> multi-platform Python code snippet that, given a filename as *unicode* string, 
> create a file with that name, possibly adjusting the name so to ignore an 
> encoding problem (so that the function *always* succeed)?

That's not really a python-dev or py3k question. If you want to support
*arbitrary* Unicode strings, you clearly cannot map them to file names
directly: what if the Unicode string contains the directory separator,
or other characters not allowed in file names (such as : or * on
Windows)?

If you need to guarantee that any Unicode string can map to a file
name, I suggest

f = open(filename.encode("utf-8").encode("hex"), "w")

> I attempted this a couple of times without being satisfied at all by the 
> solutions.

That's probably because you failed to specify all the requirements that
you need satisfied. If you specified them explicitly, you would likely
find that they conflict, that no solution satisfying all of them can
possibly exist, and that this has nothing to do with Unicode.

Notice that my solution above meets the *specified* needs: it supports
all unicode strings, always succeeds, and possibly adjusts the file
name to ignore an encoding problem. Of course, interpreting the file
name in a file explorer is somewhat tedious...
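
Spelled out as the requested function, the sketch would be (Python 2
codecs):

    def dump_to_file(unicode_filename):
        # Any Unicode name becomes a plain-ASCII hex string, e.g.
        # u'r\xe9sum\xe9' -> '72c3a973756dc3a9'; the original is
        # recoverable via .decode('hex').decode('utf-8').
        return open(unicode_filename.encode('utf-8').encode('hex'), 'w')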

Regards,
Martin

From jimjjewett at gmail.com  Fri May 18 17:24:19 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 18 May 2007 11:24:19 -0400
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464C732B.8050103@v.loewis.de>
References: <464BBE2B.1050201@acm.org> <464C732B.8050103@v.loewis.de>
Message-ID: <fb6fbf560705180824p59885a1dq5da6d2c70e0ed8b0@mail.gmail.com>

On 5/17/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > is it generally agreed that the
> > Unicode character classes listed in the PEP are the ones we want to
> > include in identifiers? My preference is to be conservative in terms of
> > what's allowed.

> John Nagle suggested to consider UTR#39
> (http://unicode.org/reports/tr39/). I encourage anybody to help me
> understand what it says.

> The easiest part is 3.1: this seems to say we should restrict characters
> listed as "restrict" in [idmod]. My suggestion would be to warn about
> them. I'm not sure about the purpose of the additional characters:
> surely, they don't think we should support HYPHEN-MINUS in identifiers?

Rather, they mean that it is commonly used (Lisp and DNS names, at
least), and is (deemed by them as) safe (given that you have applied
their exclusions, such as the dashes).  Python should still use a
tailoring and exclude it.

 > 4. Confusable Detection: Without considering details, it seems you need
> two strings to decide whether they are confusable. So it's not clear
> to me how this could apply to banning certain identifiers.

In most cases, the strings are confusable because individual characters are.

TR 39 makes it sound more complicated than it needs to be, because they
want to permit all sorts of strangeness, so long as it is at least
unambiguous strangeness.

My take:

Single-script confusables are things like "1" vs "l", and it is
probably too late to fight them.

Whole-script confusables are cases where two scripts look alike; you
can get something looking like "scope" in either Latin or Cyrillic.
If we're going to allow non-Latin identifiers, then we'll probably
have to live with this.

Mixed-script confusables are spoofing that wouldn't work if you
insisted that any single identifier stick to a "single" script.
('pаypаl', with Cyrillic 'а's).

Their algorithm talks about entire strings because they want to allow
'toys-я-us'.
Technically, Latin doesn't have a character that looks like a
backwards-R, and Cyrillic doesn't have matches for *all of* "toys us".
Personally, I don't see a strong need to support toys_я_us just
because it would be possible.

On the other hand, I'm not sure how often users of non-latin languages
will want to mix in latin letters.  The tech report suggested that it
is fairly common to use all of (Hiragana | Katakana | Han | Latin) in
Japanese text, but I'm not sure whether it would be normal to mix them
within a single identifier.

 > 5. Mixed Script Detection: That might apply, but I can't map the
> algorithm to terminology I'm familiar with. What is UScript.COMMON
> and UScript.INHERITED?

Those are characters used in many different languages.  From TR 24

    http://www.unicode.org/reports/tr24/

    Inherited: for characters that may be used with multiple scripts,
and inherit their script from the preceding characters. Includes
nonspacing marks, enclosing marks, and the zero width
joiner/non-joiner characters.

    Common: for other characters that may be used with multiple scripts.


> I'm skeptical about mixed-script detection,
> because you surely want to allow ASCII digits (0..9) in Cyrillic

According to http://www.unicode.org/Public/UNIDATA/Scripts.txt, the 52
letters [A-Za-z] are latin, but the rest of ASCII (including digits)
is COMMON, and should be allowed with any script.
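
As a sketch of what per-identifier mixed-script detection could look
like (char_script is a hypothetical lookup built by parsing Scripts.txt;
there is no such table in unicodedata):

    def is_single_script(identifier, char_script):
        # char_script maps a character to its script name ('Latin',
        # 'Cyrillic', ...), as listed in Scripts.txt.
        scripts = set(char_script.get(ch, 'Common') for ch in identifier)
        # Common and Inherited characters may combine with any script.
        scripts.discard('Common')
        scripts.discard('Inherited')
        return len(scripts) <= 1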

-jJ

From ark-mlist at att.net  Fri May 18 17:51:58 2007
From: ark-mlist at att.net (Andrew Koenig)
Date: Fri, 18 May 2007 11:51:58 -0400
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
Message-ID: <002c01c79964$77a6fc00$66f4f400$@net>

> Do people think it would be too radical if the built-in open()
> function was removed altogether, requiring all code that opens files
> to import the io module first? This would make it easier to identify
> modules that engage in I/O.

+1.

Presumably you can still write to the standard input, output, error, and log
files without importing io.

(I'm feeling slightly pedantic today, so I want to say that the proposal
doesn't make it any easier to identify modules that engage in I/O -- it
makes it easier to identify modules that assuredly do not engage in I/O.  +1
anyway.)



From rrr at ronadam.com  Fri May 18 18:17:53 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 18 May 2007 11:17:53 -0500
Subject: [Python-3000] Raw strings containing \u or \U
In-Reply-To: <f2gq55$d4p$1@sea.gmane.org>
References: <ca471dc20705160846s7daf048age45300a15111c645@mail.gmail.com>		<d11dcfba0705160955r769e6356qe4bad4e776b4a55d@mail.gmail.com>		<ca471dc20705161005r684c359dg8fa7b78355cb4ccc@mail.gmail.com>		<d11dcfba0705161132o227fd065y2c7fef01c0824822@mail.gmail.com>		<464B62FD.4070400@ronadam.com>	<ca471dc20705161329h2b7ae659qe21ad594e2939d6a@mail.gmail.com>	<464B7235.20500@ronadam.com>
	<f2gq55$d4p$1@sea.gmane.org>
Message-ID: <464DD1B1.1080009@ronadam.com>

Georg Brandl wrote:
> Ron Adam schrieb:
>> Guido van Rossum wrote:
>>> That would be great! This will automatically turn \u1234 into 6
>>> characters, right?
>> I'm not exactly clear when the '\uxxxx' characters get converted.  There 
>> isn't any conversion done in tokanize.c that I can see.  It's primarily 
>> only concerned with finding the beginning and ending of the string at that 
>> point.  It looks like everything between the beginning and end is just 
>> passed along "as is" and it's translated further later in the chain.
> 
> Look at Python/ast.c, which has functions parsestr() and decode_unicode().
> The latter calls PyUnicode_DecodeRawUnicodeEscape() which I think is the
> function you're looking for.
> 
> Georg

Thanks, I'll look there.

That should be where I need to look to fix a glitch where the last quote of 
a raw string is both the end of the string and part of the string.

 >>> r'\'
"\\'"

Interestingly, it works just fine for raw byte strings.  (I wish the letters
were reversed; saying bytes-raw-string is awkward.)

 >>> br'\'
b'\\'

Anyway, I've made the corresponding modifications to tokenize.py and 
tokenize_tests.txt.

The tests for tokenize.py need to be updated.  They do a round trip test, 
but I've found that doesn't mean it's the correct round trip!

Cheers,
    Ron






From fumanchu at amor.org  Fri May 18 18:35:30 2007
From: fumanchu at amor.org (Robert Brewer)
Date: Fri, 18 May 2007 09:35:30 -0700
Subject: [Python-3000] Radical idea: remove built-in open (require import
	io)
In-Reply-To: <002c01c79964$77a6fc00$66f4f400$@net>
Message-ID: <435DF58A933BA74397B42CDEB8145A860C23930D@ex9.hostedexchange.local>

Guido van Rossum wrote:
> Do people think it would be too radical if the built-in open()
> function was removed altogether, requiring all code that opens files
> to import the io module first? This would make it easier to identify
> modules that engage in I/O.

I must be dense, because I don't see how the proposal "makes it easier
to identify modules that engage in I/O". Who's supposed to be doing the
identification and when? And how will it not be fooled by __import__ and
plain ol' cross-module references?


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From guido at python.org  Fri May 18 18:44:54 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 18 May 2007 09:44:54 -0700
Subject: [Python-3000] Radical idea: remove built-in open (require import
	io)
In-Reply-To: <435DF58A933BA74397B42CDEB8145A860C23930D@ex9.hostedexchange.local>
References: <002c01c79964$77a6fc00$66f4f400$@net>
	<435DF58A933BA74397B42CDEB8145A860C23930D@ex9.hostedexchange.local>
Message-ID: <ca471dc20705180944o7c2111faqf55ca915a122691d@mail.gmail.com>

On 5/18/07, Robert Brewer <fumanchu at amor.org> wrote:
> Guido van Rossum wrote:
> > Do people think it would be too radical if the built-in open()
> > function was removed altogether, requiring all code that opens files
> > to import the io module first? This would make it easier to identify
> > modules that engage in I/O.
>
> I must be dense, because I don't see how the proposal "makes it easier
> to identify modules that engage in I/O". Who's supposed to be doing the
> identification and when? And how will it not be fooled by __import__ and
> plain 'ol cross-module references?

I wasn't thinking of this from a security POV -- more from the
perspective of trying to understand roughly what a module does.
Looking at the imports is often a good place to start. If you see it
importing socket, that's kind of a hint that it might need the
network. If you see it importing io or os, that would be a similar
hint that it might access the filesystem. Of course, if you see it
import some other module you will have to understand what that module
does (or put it on your stack for later), and so on.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com  Fri May 18 18:54:05 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 18 May 2007 10:54:05 -0600
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <ca471dc20705171742s74976fbere593192438d5e685@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
	<da3f900e0705171720n7a0bac04u185752114998d32d@mail.gmail.com>
	<ca471dc20705171742s74976fbere593192438d5e685@mail.gmail.com>
Message-ID: <d11dcfba0705180954v405d7b77ka2f48dcbe238dc88@mail.gmail.com>

Guido van Rossum wrote:
> Do people think it would be too radical if the built-in open()
> function was removed altogether, requiring all code that opens files
> to import the io module first? This would make it easier to identify
> modules that engage in I/O.
[and later]
> I guess a refinement of the point is that you need the io module to
> create new I/O streams, while input() and print() act on existing
> streams. Code that makes read() and write() calls doesn't need to
> import the io module either, so we're not really making all I/O
> identifiable, just the open() calls.

+0.5.  I'm all for keeping the builtins as simple as possible.  And if
you're already used to importing io for file, when you discover you
need to do something more complicated involving other layers of the io
stack, you'll already be looking in the right place.

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From collinw at gmail.com  Fri May 18 19:02:35 2007
From: collinw at gmail.com (Collin Winter)
Date: Fri, 18 May 2007 10:02:35 -0700
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
Message-ID: <43aa6ff70705181002u779a6ae3w9a4ea050601cde70@mail.gmail.com>

On 5/17/07, Guido van Rossum <guido at python.org> wrote:
> Do people think it would be too radical if the built-in open()
> function was removed altogether, requiring all code that opens files
> to import the io module first? This would make it easier to identify
> modules that engage in I/O.

+1

Thinking out loud: I wonder if the io module should also become the
canonical source for stdin, stdout, stderr instead of sys.

Collin Winter

From guido at python.org  Fri May 18 20:02:56 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 18 May 2007 11:02:56 -0700
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
Message-ID: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>

While reviewing PEPs, I stumbled over PEP 335 (Overloadable Boolean
Operators) by Greg Ewing. I am of two minds about this -- on the one
hand, it's been a long time without any working code or anything. OTOH
it might be quite useful to e.g. the numpy folks.

It is time to reject it due to lack of interest, or revive it!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From baptiste13 at altern.org  Fri May 18 20:05:52 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Fri, 18 May 2007 20:05:52 +0200
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
Message-ID: <f2kpv6$sm4$1@sea.gmane.org>

Guido van Rossum wrote:
> Do people think it would be too radical if the built-in open()
> function was removed altogether, requiring all code that opens files
> to import the io module first? This would make it easier to identify
> modules that engage in I/O.
> 

-1

Will someone think of the interactive users ?


From guido at python.org  Fri May 18 20:10:13 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 18 May 2007 11:10:13 -0700
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <f2kpv6$sm4$1@sea.gmane.org>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
	<f2kpv6$sm4$1@sea.gmane.org>
Message-ID: <ca471dc20705181110w30093c03ufadcbbf074aa0ff8@mail.gmail.com>

On 5/18/07, Baptiste Carvello <baptiste13 at altern.org> wrote:
> Guido van Rossum wrote:
> > Do people think it would be too radical if the built-in open()
> > function was removed altogether, requiring all code that opens files
> > to import the io module first? This would make it easier to identify
> > modules that engage in I/O.
>
> -1
>
> Will someone think of the interactive users ?

What kind of interactive use are you making of open()?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Fri May 18 20:21:43 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 18 May 2007 20:21:43 +0200
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <f2kpv6$sm4$1@sea.gmane.org>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
	<f2kpv6$sm4$1@sea.gmane.org>
Message-ID: <f2kqre$4cl$1@sea.gmane.org>

Baptiste Carvello schrieb:
> Guido van Rossum wrote:
>> Do people think it would be too radical if the built-in open()
>> function was removed altogether, requiring all code that opens files
>> to import the io module first? This would make it easier to identify
>> modules that engage in I/O.
>> 
> 
> -1
> 
> Will someone think of the interactive users ?

They can still put "import sys, os, io" in their PYTHONSTARTUP file.

Or use IPython.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From python at rcn.com  Fri May 18 20:55:44 2007
From: python at rcn.com (Raymond Hettinger)
Date: Fri, 18 May 2007 14:55:44 -0400 (EDT)
Subject: [Python-3000] Radical idea: remove built-in open
 (require import io)
Message-ID: <20070518145544.BJT29928@ms09.lnh.mail.rcn.net>

> I wasn't thinking of this from a security POV -- more from the
> perspective of trying to understand roughly what a module does.
> Looking at the imports is often a good place to start.

In the case of open(), this may be a false benefit. Too many other calls (logging, shelve, etc) can open files, so the presence or absence of an IO import is not a reliable indicator of anything.

Also, the character of a script doesn't change when it decides to switch from stdin/stdout to actual files.  I don't think we gain anything here and are instead adding a small irritant. The open() function is so basic, it should remain a builtin. In theory, all builtins could be moved to other modules, but in practice it would be a PITA for day-to-day script writing.

I enjoy being able to dash off quick, expressive lines like this:

   for i, line in enumerate(open('data.txt')): ...

Needing an import for that frequently used function would detract from the enjoyment.

Taking a more global viewpoint, I'm experiencing a little FUD about Py3k.  There were good reasons for introducing the print() function, but then we've made "hello world" a little less lightweight.  Larger applications have some legitimate needs which make abstract base classes attractive, but they are going to add significantly to the learning curve for beginners when using APIs that require them.  Packages were introduced to address the needs of large applications and complicated namespace issues, but now we're about to split the trivially simple string module into a package.  One of the design goals should be to keep the core language as trivially simple/straightforward as possible for day-to-day use.  Requiring an import for file opening runs contrary to that goal.


Raymond

From jdahlin at async.com.br  Fri May 18 21:49:11 2007
From: jdahlin at async.com.br (Johan Dahlin)
Date: Fri, 18 May 2007 16:49:11 -0300
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
Message-ID: <464E0337.3000905@async.com.br>

Guido van Rossum wrote:
> While reviewing PEPs, I stumbled over PEP 335 ( Overloadable Boolean
> Operators) by Greg Ewing. I am of two minds of this -- on the one
> hand, it's been a long time without any working code or anything. OTOH
> it might be quite useful to e.g. numpy folks.

This kind of feature would also be useful for ORMs, to be able to map
boolean operators to SQL.

Johan


From greg.ewing at canterbury.ac.nz  Sat May 19 01:57:36 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 19 May 2007 11:57:36 +1200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <da3f900e0705171909j589bf2cdj97299acc583a09dd@mail.gmail.com>
References: <464BBE2B.1050201@acm.org>
	<2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>
	<464C9D55.9080501@v.loewis.de> <464CFF8D.7040504@canterbury.ac.nz>
	<da3f900e0705171909j589bf2cdj97299acc583a09dd@mail.gmail.com>
Message-ID: <464E3D70.8050408@canterbury.ac.nz>

Chris Monson wrote:
> Ignoring for a moment that prefix != reverse(postfix), that is....

It is if you don't insist on putting silly
parentheses all over the place. (IOW, "prefix"
is not synonymous with "Lisp".)

--
Greg

From shiblon at gmail.com  Sat May 19 02:29:13 2007
From: shiblon at gmail.com (Chris Monson)
Date: Fri, 18 May 2007 20:29:13 -0400
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464E3D70.8050408@canterbury.ac.nz>
References: <464BBE2B.1050201@acm.org>
	<2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>
	<464C9D55.9080501@v.loewis.de> <464CFF8D.7040504@canterbury.ac.nz>
	<da3f900e0705171909j589bf2cdj97299acc583a09dd@mail.gmail.com>
	<464E3D70.8050408@canterbury.ac.nz>
Message-ID: <da3f900e0705181729h16c31dbfq72713e28903b422b@mail.gmail.com>

So / 4 2 = 2 4 / ?  I beg to differ :-).  At any rate, </silly_digression>

- C

On 5/18/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Chris Monson wrote:
> > Ignoring for a moment that prefix != reverse(postfix), that is....
>
> It is if you don't insist on putting silly
> parentheses all over the place. (IOW, "prefix"
> is not synonymous with "Lisp".)
>
> --
> Greg
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/shiblon%40gmail.com
>

From greg.ewing at canterbury.ac.nz  Sat May 19 02:47:29 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 19 May 2007 12:47:29 +1200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
Message-ID: <464E4921.9040101@canterbury.ac.nz>

Guido van Rossum wrote:
> While reviewing PEPs, I stumbled over PEP 335 ( Overloadable Boolean
> Operators) by Greg Ewing.
> 
> It is time to reject it due to lack of interest, or revive it!

Didn't you post something about this a short time ago,
suggesting you were in favour of it?

If you need an up-to-date implementation before it can
be accepted, let me know and I'll see what I can do.
I wouldn't want it to be rejected just because of that.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat May 19 03:12:19 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 19 May 2007 13:12:19 +1200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <da3f900e0705181729h16c31dbfq72713e28903b422b@mail.gmail.com>
References: <464BBE2B.1050201@acm.org>
	<2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>
	<464C9D55.9080501@v.loewis.de> <464CFF8D.7040504@canterbury.ac.nz>
	<da3f900e0705171909j589bf2cdj97299acc583a09dd@mail.gmail.com>
	<464E3D70.8050408@canterbury.ac.nz>
	<da3f900e0705181729h16c31dbfq72713e28903b422b@mail.gmail.com>
Message-ID: <464E4EF3.8090107@canterbury.ac.nz>

Chris Monson wrote:
> So / 4 2 = 2 4 / ?

It would be unusual, but there's nothing to prevent /
from being defined that way in the postfix version of
the language.

--
Greg

From guido at python.org  Sat May 19 03:21:45 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 18 May 2007 18:21:45 -0700
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <464E4921.9040101@canterbury.ac.nz>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<464E4921.9040101@canterbury.ac.nz>
Message-ID: <ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>

On 5/18/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
> > While reviewing PEPs, I stumbled over PEP 335 ( Overloadable Boolean
> > Operators) by Greg Ewing.
> >
> > It is time to reject it due to lack of interest, or revive it!
>
> Didn't you post something about this a short time ago,
> suggesting you were in favour of it?

I think I did, but I hope I'm not the only one in favor.

> If you need an up-to-date implementation before it can
> be accepted, let me know and I'll see what I can do.
> I wouldn't want it to be rejected just because of that.

Working implementations are good for all sorts of reasons.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rasky at develer.com  Sat May 19 13:24:55 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 19 May 2007 13:24:55 +0200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>	<464E4921.9040101@canterbury.ac.nz>
	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>
Message-ID: <f2mmq8$mjv$1@sea.gmane.org>

On 19/05/2007 3.21, Guido van Rossum wrote:

>>> While reviewing PEPs, I stumbled over PEP 335 ( Overloadable Boolean
>>> Operators) by Greg Ewing.
>>>
>>> It is time to reject it due to lack of interest, or revive it!
>> Didn't you post something about this a short time ago,
>> suggesting you were in favour of it?
> 
> I think I did, but I hope I'm not the only one in favor.

I'm -0 on the idea: these operators are very rarely overloaded in C++ as well,
since there are only a few really valid use cases.

In fact, the only examples I have seen so far were those of constructing
meta-languages using Python's syntax, which is something that Python has never
really encouraged (see the metaprogramming syntax, which is now officially vetoed).

But I'm not -1 because I assume that (just like unicode identifiers) they will 
not be abused by the community, and they probably do help some very rare and 
uncommon use cases where they are really required.
-- 
Giovanni Bajo


From ncoghlan at gmail.com  Sat May 19 13:54:23 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 19 May 2007 21:54:23 +1000
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <fb6fbf560705180824p59885a1dq5da6d2c70e0ed8b0@mail.gmail.com>
References: <464BBE2B.1050201@acm.org> <464C732B.8050103@v.loewis.de>
	<fb6fbf560705180824p59885a1dq5da6d2c70e0ed8b0@mail.gmail.com>
Message-ID: <464EE56F.5040300@gmail.com>

Jim Jewett wrote:
> On the other hand, I'm not sure how often users of non-latin languages
> will want to mix in latin letters.  The tech report suggested that it
> is fairly common to use all of (Hiragana | Katakana | Han | Latin) in
> Japanese text, but I'm not sure whether it would be normal to mix them
> within a single identifier.

Mixing Kanji (Han script) & Hiragana in a single Japanese word is 
certainly quite common (main part of the word in kanji, the ending in 
hiragana).  I can't think of any cases where the other two would be 
mixed (with each other or with either of the first two scripts) within a 
single word, but my Japanese is pretty poor - there could easily be 
cases I'm not aware of.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From jason.orendorff at gmail.com  Sat May 19 14:41:32 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Sat, 19 May 2007 08:41:32 -0400
Subject: [Python-3000] [Python-Dev] Wither PEP 335 (Overloadable Boolean
	Operators)?
In-Reply-To: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
Message-ID: <bb8868b90705190541i4b459097waa0be2d8e14ddbfb@mail.gmail.com>

On 5/18/07, Guido van Rossum <guido at python.org> wrote:
> While reviewing PEPs, I stumbled over PEP 335 ( Overloadable Boolean
> Operators) by Greg Ewing.

-1.  "and" and "or" affect the flow of control.  It's a matter
of taste, but I feel the benefit is too small here to add
another flow-control quirk.  I like that part of the language
to be simple.

Anyway, if this *is* done, logically it should cover
"(... if ... else ...)" as well.  Same use cases.

-j

From ntoronto at cs.byu.edu  Sat May 19 19:12:28 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Sat, 19 May 2007 11:12:28 -0600
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <f2mmq8$mjv$1@sea.gmane.org>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>	<464E4921.9040101@canterbury.ac.nz>	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>
	<f2mmq8$mjv$1@sea.gmane.org>
Message-ID: <464F2FFC.3060902@cs.byu.edu>

Giovanni Bajo wrote:
> On 19/05/2007 3.21, Guido van Rossum wrote:
>   
>>>> While reviewing PEPs, I stumbled over PEP 335 ( Overloadable Boolean
>>>> Operators) by Greg Ewing.
>>>>
>>>> It is time to reject it due to lack of interest, or revive it!
>>>>         
>>> Didn't you post something about this a short time ago,
>>> suggesting you were in favour of it?
>>>       
>> I think I did, but I hope I'm not the only one in favor.
>>     
>
> I'm -0 on the idea, they're very rarely overloaded in C++ as well, since there 
> are only few really valid use cases.
>
> In fact, the only example I saw till now where those of constructing 
> meta-languages using Python's syntax, which is something that Python has never 
> really encouraged (see the metaprogramming syntax which is now officially vetoed).
>
> But I'm not -1 because I assume that (just like unicode identifiers) they will 
> not be abused by the community, and they probably do help some very rare and 
> uncommon use cases where they are really required.
>   

There's a fairly common one, actually, that comes up quite a lot in 
Numpy. Currently, best practice is a wart. Here's some code of mine for 
evaluating log probabilities from the Multinomial family:

    class Multinomial(DistFamily):
        @classmethod
        def logProb(cls, x, n, p):
            x = scipy.asarray(x)
            n = scipy.asarray(n)
            p = scipy.asarray(p)
            result = (special.gammaln(n + 1) - special.gammaln(x + 1).sum(-1)
                      + (x * scipy.log(p)).sum(-1))
            xsum = x.sum(-1)
            psum = p.sum(-1)
            return scipy.where((xsum != n) | (psum < 0.99999) |
                               (psum > 1.00001) | ~scipy.isfinite(result),
                               -scipy.inf, result)


That last bit is really confusing to new Numpy users, especially 
figuring out how to do it in the first place. (Once you get it, it's not 
*so* bad.) The parentheses are required, by the way. With overloadable 
booleans, it would become much more readable and newbie-friendly:

            return scipy.where(xsum != n or psum < 0.99999 or psum > 1.00001
                               or not scipy.isfinite(result),
                               -scipy.inf, result)


This isn't just an issue with "where" though - boolean arrays come up 
quite a bit elsewhere, especially in indexing (you can index an array 
with an array of booleans) and counting. Given that we're supposed to 
see tighter integration with Numpy, I'd say this family of use cases is 
fairly significant.
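
For instance, the everyday boolean-array idiom looks like this (a small
illustration; "and" raises ValueError on arrays instead of operating
elementwise):

    import numpy
    a = numpy.array([1, 5, 2, 8])
    mask = (a > 2) & (a < 8)   # elementwise; parentheses required
    print a[mask]              # -> [5]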

Neil


From rasky at develer.com  Sat May 19 21:21:58 2007
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 19 May 2007 21:21:58 +0200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <464F2FFC.3060902@cs.byu.edu>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>	<464E4921.9040101@canterbury.ac.nz>	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>	<f2mmq8$mjv$1@sea.gmane.org>
	<464F2FFC.3060902@cs.byu.edu>
Message-ID: <f2nion$5b2$1@sea.gmane.org>

On 19/05/2007 19.12, Neil Toronto wrote:

> There's a fairly common one, actually, that comes up quite a lot in 
> Numpy. Currently, best practice is a wart. Here's some code of mine for 
> evaluating log probabilities from the Multinomial family:
> 
>     class Multinomial(DistFamily):
>         @classmethod
>         def logProb(cls, x, n, p):
>             x = scipy.asarray(x)
>             n = scipy.asarray(n)
>             p = scipy.asarray(p)
>             result = special.gammaln(n + 1) - special.gammaln(x + 
> 1).sum(-1) + (x * scipy.log(p)).sum(-1)
>             xsum = x.sum(-1)
>             psum = p.sum(-1)
>             return scipy.where((xsum != n) | (psum < 0.99999) | (psum > 
> 1.00001) | ~scipy.isfinite(result), -scipy.inf, result)
> 
> 
> That last bit is really confusing to new Numpy users, especially 
> figuring out how to do it in the first place. (Once you get it, it's not 
> *so* bad.) The parenthesis are required, by the way. With overloadable 
> booleans, it would become much more readable and newbie-friendly:
> 
>             return scipy.where(xsum != n or psum < 0.99999 or psum > 
> 1.00001 or not scipy.isfinite(result), -scipy.inf, result)

It's probably better in the numpy context, but it's surely a little confusing 
at first sight for someone who isn't numpy-savvy. In fact, as you said, I don't 
think the current best practice is *that* bad after all. I'll keep my -0.

========================

Now for the fun side :)

Another workaround could be:

return scipy.where(
     "xsum != n or psum < 0.99999 or "
     "psum > 1.000001 or not scipy.isfinite(result)",
     -scipy.inf, result)

with the necessary magic to pull out variables from the stack frame. Parsing 
could be done only once, of course. But I'm sure the numpy guys have already 
thought of and discarded this solution, as it's more complicated.
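
For concreteness, here is one way that "magic" could be sketched: pull the
caller's variables out of its stack frame and rewrite and/or/not into the
element-wise &/|/~ before evaluating. Everything here (the where_expr name,
the use of the later ast module) is made up for illustration; it is a toy,
not something numpy provides:

    import ast
    import sys
    import numpy as np

    class _BoolToBitwise(ast.NodeTransformer):
        # rewrite 'a or b' -> 'a | b', 'a and b' -> 'a & b', 'not a' -> '~a'
        def visit_BoolOp(self, node):
            self.generic_visit(node)
            op = ast.BitOr() if isinstance(node.op, ast.Or) else ast.BitAnd()
            result = node.values[0]
            for value in node.values[1:]:
                result = ast.BinOp(left=result, op=op, right=value)
            return result

        def visit_UnaryOp(self, node):
            self.generic_visit(node)
            if isinstance(node.op, ast.Not):
                return ast.UnaryOp(op=ast.Invert(), operand=node.operand)
            return node

    def where_expr(expr, iftrue, iffalse):
        # grab the caller's namespace from the stack frame
        frame = sys._getframe(1)
        namespace = dict(frame.f_globals)
        namespace.update(frame.f_locals)
        tree = _BoolToBitwise().visit(ast.parse(expr, mode='eval'))
        cond = eval(compile(ast.fix_missing_locations(tree),
                            '<where_expr>', 'eval'), namespace)
        return np.where(cond, iftrue, iffalse)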

[[ In fact, numpy is actually trying to create a DSL with Python itself. I 
assume things like "x.sum(-1)" would probably have been spelled sum(x, -1), if 
you could freely decide what to do without worrying about the implementation. ]]

Or, another workaround is something like this: 
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/384122, which could 
probably be extended to more "operators" that numpy can't simulate using the 
plain Python syntax.
-- 
Giovanni Bajo


From robert.kern at gmail.com  Sat May 19 21:45:58 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 19 May 2007 14:45:58 -0500
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <f2nion$5b2$1@sea.gmane.org>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>	<464E4921.9040101@canterbury.ac.nz>	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>	<f2mmq8$mjv$1@sea.gmane.org>	<464F2FFC.3060902@cs.byu.edu>
	<f2nion$5b2$1@sea.gmane.org>
Message-ID: <f2nk5p$bqf$1@sea.gmane.org>

Giovanni Bajo wrote:
> Another workaround could be:
> 
> return scipy.where(
>      "xsum != n or psum < 0.99999 or "
>      "psum > 1.000001 or not scipy.isfinite(result)",
>      -scipy.inf, result)
> 
> with the necessary magic to pull out variables from the stack frame. Parsing 
> could be done only once of course. But I'm sure the numpy guys have already 
> thought and discarded this solution as it's more complicated.

Well, it doesn't actually solve the problem. Yes, we could write functions that
parse some language that looks like Python but executes as something else, but
that doesn't advance us towards the goal of making the code easier to understand.

> [[ In fact, numpy is actually trying to create a DSL with Python itself.

It isn't. At least, not any more than any other custom type is.

> I 
> assume things like "x.sum(-1)" would have been probably spelled sum(x, -1), if 
>   you could freely decide what to do without worrying about the implementation. ]]

In fact, it can be spelled that way, and in Numeric it once could *only* be spelled that way.

> Or, another workaround is something like this: 
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/384122, which could 
> probably be extended to more "operators" that numpy can't simulate using the 
> plain Python syntax.

Much as we'd like it to be, it's just not practical.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From python at rcn.com  Sat May 19 23:37:23 2007
From: python at rcn.com (Raymond Hettinger)
Date: Sat, 19 May 2007 14:37:23 -0700
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>	<464E4921.9040101@canterbury.ac.nz>	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>	<f2mmq8$mjv$1@sea.gmane.org>	<464F2FFC.3060902@cs.byu.edu><f2nion$5b2$1@sea.gmane.org>
	<f2nk5p$bqf$1@sea.gmane.org>
Message-ID: <025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>

> Giovanni Bajo wrote:
>> Another workaround could be:

Before focusing mental talents on workarounds and implementations,
it would be worthwhile to consider whether the idea would help or
hurt the language.  The and/or keywords already have some complexity
due to their returning non-boolean values.  IMO, it would be a disservice
to the language to further complexify their meanings.  Right now, at least, 
we can make a static reading of the code and have a good idea of what 
the and/or keywords mean.

Someone once proposed overloadable behavior for the "is" operator.
IMO, the reasons for rejecting that idea also apply to this proposal.

FWIW, the peephole optimizer takes advantage of the current meaning
of and/or to generate faster code.  It would be a shame to lose this
optimization and have all applications pay a price in slower code.

Raymond

From bob at redivi.com  Sun May 20 00:01:45 2007
From: bob at redivi.com (Bob Ippolito)
Date: Sat, 19 May 2007 15:01:45 -0700
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<464E4921.9040101@canterbury.ac.nz>
	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>
	<f2mmq8$mjv$1@sea.gmane.org> <464F2FFC.3060902@cs.byu.edu>
	<f2nion$5b2$1@sea.gmane.org> <f2nk5p$bqf$1@sea.gmane.org>
	<025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
Message-ID: <6a36e7290705191501m3ea09731u2329c39473301e58@mail.gmail.com>

On 5/19/07, Raymond Hettinger <python at rcn.com> wrote:
> > Giovanni Bajo wrote:
> >> Another workaround could be:
>
> Before focusing mental talents on workarounds and implementations,
> it would be worthwhile to consider whether the idea would help or
> hurt the language.  The and/or keywords already have some complexity
> due to their returning non-boolean values.  IMO, it would be a disservice
> to the language to further complexify their meanings.  Right now, at least,
> we can make a static reading of the code and have a good idea of what
> the and/or keywords mean.

Would "and" and "or" still be able to properly short-circuit given
this proposal?

-bob

From robert.kern at gmail.com  Sun May 20 02:19:24 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 19 May 2007 19:19:24 -0500
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>	<464E4921.9040101@canterbury.ac.nz>	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>	<f2mmq8$mjv$1@sea.gmane.org>	<464F2FFC.3060902@cs.byu.edu><f2nion$5b2$1@sea.gmane.org>	<f2nk5p$bqf$1@sea.gmane.org>
	<025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
Message-ID: <f2o46h$hrs$1@sea.gmane.org>

Raymond Hettinger wrote:
>> Giovanni Bajo wrote:
>>> Another workaround could be:
> 
> Before focusing mental talents on workarounds and implementations,
> it would be worthwhile to consider whether the idea would help or
> hurt the language.  The and/or keywords already have some complexity
> due to their returning non-boolean values.  IMO, it would be a disservice
> to the language to further complexify their meanings.  Right now, at least, 
> we can make a static reading of the code and have a good idea of what 
> the and/or keywords mean.

It would probably hurt the language, and for the record, I'm against it. We
already have problems with rich comparisons not reliably returning booleans.
It's a fairly common occurrence to do equality testing against generic data
types. For example, finding if an object is in a list with list.index().
However, this does not reliably work when == can return something that is not
interpretable as a boolean value like numpy arrays do. I don't think rich
comparisons are a mistake (I use them much more frequently than I use
list.index(), for example), but propagating the uncertainty further is probably
a mistake. For numpy, the bitwise operators |&~ work fine on boolean arrays, and
that's all such operators really need to work on.
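
A quick illustration of the list.index() problem, with made-up data:

    import numpy as np

    items = ['spam', np.array([1, 2]), 'eggs']
    try:
        items.index(2)
    except ValueError, e:
        # list.index() compares each element with ==; the array comparison
        # returns an array, which can't be collapsed to a single True/False
        print e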

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


From greg.ewing at canterbury.ac.nz  Sun May 20 02:28:04 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 20 May 2007 12:28:04 +1200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<464E4921.9040101@canterbury.ac.nz>
	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>
	<f2mmq8$mjv$1@sea.gmane.org> <464F2FFC.3060902@cs.byu.edu>
	<f2nion$5b2$1@sea.gmane.org> <f2nk5p$bqf$1@sea.gmane.org>
	<025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
Message-ID: <464F9614.1010208@canterbury.ac.nz>

Raymond Hettinger wrote:
> Someone once proposed overloadable behavior for the "is" operator.
> IMO, the reasons for rejecting that idea also apply to this proposal.

The reason for rejecting that is that it would leave us
with no way of reliably testing whether two references
point to the same object.

That objection doesn't apply here, because there would
still be a way of ensuring that you get boolean semantics
if it matters for some reason: bool(a) and bool(b), etc.

--
Greg

From greg.ewing at canterbury.ac.nz  Sun May 20 02:32:58 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 20 May 2007 12:32:58 +1200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<464E4921.9040101@canterbury.ac.nz>
	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>
	<f2mmq8$mjv$1@sea.gmane.org> <464F2FFC.3060902@cs.byu.edu>
	<f2nion$5b2$1@sea.gmane.org> <f2nk5p$bqf$1@sea.gmane.org>
	<025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
Message-ID: <464F973A.8070205@canterbury.ac.nz>

Raymond Hettinger wrote:

> FWIW, the peephole optimizer takes advantage of the current meaning
> of and/or to generate faster code.

Can you give some examples of the sort of optimisations
that are done? It may still be possible to do them --
the AND1 and OR1 bytecodes in my proposal are conditional
branch instructions, much like the existing boolean operator
bytecodes.

--
Greg

From greg.ewing at canterbury.ac.nz  Sun May 20 02:34:51 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 20 May 2007 12:34:51 +1200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <6a36e7290705191501m3ea09731u2329c39473301e58@mail.gmail.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<464E4921.9040101@canterbury.ac.nz>
	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>
	<f2mmq8$mjv$1@sea.gmane.org> <464F2FFC.3060902@cs.byu.edu>
	<f2nion$5b2$1@sea.gmane.org> <f2nk5p$bqf$1@sea.gmane.org>
	<025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
	<6a36e7290705191501m3ea09731u2329c39473301e58@mail.gmail.com>
Message-ID: <464F97AB.2040808@canterbury.ac.nz>

Bob Ippolito wrote:

> Would "and" and "or" still be able to properly short-circuit given
> this proposal?

Yes. I was very careful to ensure that all the existing semantics
are preserved in the case of no overloads, and also that overloads
can mimic all of the existing semantics if they need to.
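
Concretely, the first operand gets a chance to decide on its own before the
second is ever evaluated. A rough sketch of an overload that keeps today's
'and' semantics (method names as in the PEP draft; nothing dispatches to
these methods in any current interpreter, so this is illustration only):

    # The PEP defines NeedOtherOperand as a special singleton; a stand-in
    # is used here so the sketch is self-contained.
    NeedOtherOperand = object()

    class Example(object):
        def __init__(self, value):
            self.value = value

        def __and1__(self):
            # Phase 1: left operand only. Returning a result here
            # short-circuits; the right operand is never evaluated.
            if not self.value:
                return self
            return NeedOtherOperand

        def __and2__(self, other):
            # Phase 2: both operands available; mimic today's 'a and b'.
            return other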

--
Greg

From python at rcn.com  Sun May 20 00:13:10 2007
From: python at rcn.com (Raymond Hettinger)
Date: Sat, 19 May 2007 15:13:10 -0700
Subject: [Python-3000] Radical idea: remove built-in open (requireimport
	io)
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>
	<002c01c79964$77a6fc00$66f4f400$@net>
Message-ID: <003701c79a87$a7db5460$f101a8c0@RaymondLaptop1>

From: "Andrew Koenig" <ark-mlist at att.net>
> (I'm feeling slightly pedantic today, so I want to say that the proposal
> doesn't make it any easier to identify modules that engage in I/O -- it
> makes it easier to identify modules that assuredly do not engage in I/O.

import urllib, shelve, logging

u = urllib.urlopen('http://www.python.org')
s = shelve.open('persistantmap.shl')
logging.basicConfig(filename='events.log')


Raymond

From jason.orendorff at gmail.com  Sun May 20 05:46:22 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Sat, 19 May 2007 23:46:22 -0400
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <464F973A.8070205@canterbury.ac.nz>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<464E4921.9040101@canterbury.ac.nz>
	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>
	<f2mmq8$mjv$1@sea.gmane.org> <464F2FFC.3060902@cs.byu.edu>
	<f2nion$5b2$1@sea.gmane.org> <f2nk5p$bqf$1@sea.gmane.org>
	<025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
	<464F973A.8070205@canterbury.ac.nz>
Message-ID: <bb8868b90705192046u5a682a2evf63dc2a83dffab22@mail.gmail.com>

On 5/19/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Raymond Hettinger wrote:
> > FWIW, the peephole optimizer takes advantage of the current meaning
> > of and/or to generate faster code.
>
> Can you give some examples of the sort of optimisations
> that are done?

Look in Python/peephole.c, function PyCode_Optimize().  Search for
"case JUMP_IF_FALSE".  There's a nice comment immediately
preceding that line.
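
For a quick look from the Python side, dis shows the conditional jumps that
and/or compile to (exact opcodes vary by version):

    import dis

    def f(a, b):
        return a and b

    dis.dis(f)   # the short-circuit shows up as a conditional jump
                 # (JUMP_IF_FALSE on the 2.x trunk); those are the jumps
                 # the peephole pass rewrites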

-j

From greg.ewing at canterbury.ac.nz  Sun May 20 06:57:55 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 20 May 2007 16:57:55 +1200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <bb8868b90705192046u5a682a2evf63dc2a83dffab22@mail.gmail.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<464E4921.9040101@canterbury.ac.nz>
	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>
	<f2mmq8$mjv$1@sea.gmane.org> <464F2FFC.3060902@cs.byu.edu>
	<f2nion$5b2$1@sea.gmane.org> <f2nk5p$bqf$1@sea.gmane.org>
	<025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
	<464F973A.8070205@canterbury.ac.nz>
	<bb8868b90705192046u5a682a2evf63dc2a83dffab22@mail.gmail.com>
Message-ID: <464FD553.1090000@canterbury.ac.nz>

Jason Orendorff wrote:
> Look in Python/peephole.c,

Which version of Python is this in? I can't find a file by
that name anywhere in my 2.3, 2.4.3 or 2.5 sources.

--
Greg

From steven.bethard at gmail.com  Sun May 20 07:05:01 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sat, 19 May 2007 23:05:01 -0600
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <464FD553.1090000@canterbury.ac.nz>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<ca471dc20705181821m6fc8bbeah7a1b9c920a68ef4c@mail.gmail.com>
	<f2mmq8$mjv$1@sea.gmane.org> <464F2FFC.3060902@cs.byu.edu>
	<f2nion$5b2$1@sea.gmane.org> <f2nk5p$bqf$1@sea.gmane.org>
	<025401c79a5d$e3b4a010$f101a8c0@RaymondLaptop1>
	<464F973A.8070205@canterbury.ac.nz>
	<bb8868b90705192046u5a682a2evf63dc2a83dffab22@mail.gmail.com>
	<464FD553.1090000@canterbury.ac.nz>
Message-ID: <d11dcfba0705192205v5dad6a69w3875f4ac69905171@mail.gmail.com>

On 5/19/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Jason Orendorff wrote:
> > Look in Python/peephole.c,
>
> Which version of Python is this in? I can't find a file by
> that name anywhere in my 2.3, 2.4.3 or 2.5 sources.

http://svn.python.org/view/python/trunk/Python/peephole.c?rev=54086&view=markup

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy

From tcdelaney at optusnet.com.au  Sun May 20 08:25:52 2007
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Sun, 20 May 2007 16:25:52 +1000
Subject: [Python-3000] PEP 367: New Super
References: <003001c795f8$d5275060$0201a8c0@mshome.net>
	<20070514165704.4F8D23A4036@sparrow.telecommunity.com>
Message-ID: <000b01c79aa7$ba716cc0$0201a8c0@mshome.net>

Phillip J. Eby wrote:
> At 05:23 PM 5/14/2007 +1000, Tim Delaney wrote:
>> Determining the class object to use
>> '''''''''''''''''''''''''''''''''''
>>
>> The exact mechanism for associating the method with the defining
>> class is not
>> specified in this PEP, and should be chosen for maximum performance.
>> For CPython, it is suggested that the class instance be held in a
>> C-level variable
>> on the function object which is bound to one of ``NULL`` (not part
>> of a class),
>> ``Py_None`` (static method) or a class object (instance or class
>> method).
>
> Another open issue here: is the decorated class used, or the
> undecorated class?

Sorry I've taken so long to get back to you about this - had email problems.

I'm not sure what you're getting at here - are you referring to the 
decorators for classes PEP? In that case, the decorator is applied after the 
class is constructed, so it would be the undecorated class.

Are class decorators going to update the MRO? I see nothing about that in 
PEP 3129, so using the undecorated class would match the current super(cls, 
self) behaviour.

Tim Delaney 


From tcdelaney at optusnet.com.au  Sun May 20 08:44:03 2007
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Sun, 20 May 2007 16:44:03 +1000
Subject: [Python-3000] PEP 367: New Super
Message-ID: <009c01c79aaa$441b0dd0$0201a8c0@mshome.net>

Tim Delaney wrote:
> Phillip J. Eby wrote:
>> At 05:23 PM 5/14/2007 +1000, Tim Delaney wrote:
>>> Determining the class object to use
>>> '''''''''''''''''''''''''''''''''''
>>>
>>> The exact mechanism for associating the method with the defining
>>> class is not
>>> specified in this PEP, and should be chosen for maximum performance.
>>> For CPython, it is suggested that the class instance be held in a
>>> C-level variable
>>> on the function object which is bound to one of ``NULL`` (not part
>>> of a class),
>>> ``Py_None`` (static method) or a class object (instance or class
>>> method).
>>
>> Another open issue here: is the decorated class used, or the
>> undecorated class?
>
> Sorry I've taken so long to get back to you about this - had email
> problems.
> I'm not sure what you're getting at here - are you referring to the
> decorators for classes PEP? In that case, the decorator is applied
> after the class is constructed, so it would be the undecorated class.
>
> Are class decorators going to update the MRO? I see nothing about
> that in PEP 3129, so using the undecorated class would match the
> current super(cls, self) behaviour.

Duh - I'm an idiot. Of course, the current behaviour uses name lookup, so it 
would use the decorated class.

So the question is, should the method store the class, or the name? Looking 
up by name could pick up a totally unrelated class, but storing the 
undecorated class could miss something important in the decoration.
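
A small sketch of the name-lookup behaviour, using the class decorator syntax
from PEP 3129 (decorator and class names invented for illustration):

    def swap(cls):
        # a decorator that returns a *different* class object
        class Replacement(cls):
            pass
        return Replacement

    @swap
    class C(object):
        def which(self):
            # the name C is resolved when the method runs, so it refers
            # to whatever the decorator returned -- the decorated class
            return C

    print C().which().__name__    # prints 'Replacement'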

Tim Delaney 


From ncoghlan at gmail.com  Sun May 20 09:42:22 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 20 May 2007 17:42:22 +1000
Subject: [Python-3000] PEP 367: New Super
In-Reply-To: <009c01c79aaa$441b0dd0$0201a8c0@mshome.net>
References: <009c01c79aaa$441b0dd0$0201a8c0@mshome.net>
Message-ID: <464FFBDE.4000109@gmail.com>

Tim Delaney wrote:
> So the question is, should the method store the class, or the name? Looking 
> up by name could pick up a totally unrelated class, but storing the 
> undecorated class could miss something important in the decoration.

Couldn't we provide a mechanism whereby the cell can be adjusted to 
point to the decorated class? (heck, the interpreter has access to both 
classes after execution of the class statement - it could probably 
arrange for this to happen automatically whenever the decorated and 
undecorated classes are different).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From martin at v.loewis.de  Sun May 20 09:47:16 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 20 May 2007 09:47:16 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <f27rmv$k1d$1@sea.gmane.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com><4646A3CA.40705@acm.org>	<4646FCAE.7090804@v.loewis.de>
	<f27rmv$k1d$1@sea.gmane.org>
Message-ID: <464FFD04.90602@v.loewis.de>

> That is how I felt when you dismissed my effort to make your proposal more 
> useful and more acceptable to some (by addressing transliteration) with the 
> little molehill problem that Norwegians and Germans disagree about o: 
> (rotated 90 degrees).

So let me phrase this differently: I'm not aware of an algorithm that
can do transliteration for all Unicode characters. Therefore, I cannot
add transliteration into the PEP.

Do you know of any?

Regards,
Martin


From tcdelaney at optusnet.com.au  Sun May 20 10:20:37 2007
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Sun, 20 May 2007 18:20:37 +1000
Subject: [Python-3000] PEP 367: New Super
References: <009c01c79aaa$441b0dd0$0201a8c0@mshome.net>
	<464FFBDE.4000109@gmail.com>
Message-ID: <00ae01c79ab7$c1ea5100$0201a8c0@mshome.net>

Nick Coghlan wrote:
> Tim Delaney wrote:
>> So the question is, should the method store the class, or the name?
>> Looking up by name could pick up a totally unrelated class, but
>> storing the undecorated class could miss something important in the
>> decoration. 
> 
> Couldn't we provide a mechanism whereby the cell can be adjusted to
> point to the decorated class? (heck, the interpreter has access to
> both classes after execution of the class statement - it could
> probably arrange for this to happen automatically whenever the
> decorated and undecorated classes are different).

Yep - I thought of that. I think that's probably the right way to go.

Tim Delaney

From jdahlin at async.com.br  Fri May 18 21:49:11 2007
From: jdahlin at async.com.br (Johan Dahlin)
Date: Fri, 18 May 2007 16:49:11 -0300
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
Message-ID: <464E0337.3000905@async.com.br>

Guido van Rossum wrote:
> While reviewing PEPs, I stumbled over PEP 335 ( Overloadable Boolean
> Operators) by Greg Ewing. I am of two minds of this -- on the one
> hand, it's been a long time without any working code or anything. OTOH
> it might be quite useful to e.g. numpy folks.

This kind of feature would also be useful for ORMs, to be able to map
boolean operators to SQL.

Johan

From baptiste13 at altern.org  Sun May 20 16:10:15 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Sun, 20 May 2007 16:10:15 +0200
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <ca471dc20705181110w30093c03ufadcbbf074aa0ff8@mail.gmail.com>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>	<f2kpv6$sm4$1@sea.gmane.org>
	<ca471dc20705181110w30093c03ufadcbbf074aa0ff8@mail.gmail.com>
Message-ID: <f2pktt$ssh$1@sea.gmane.org>

Guido van Rossum a écrit :
> On 5/18/07, Baptiste Carvello <baptiste13 at altern.org> wrote:
>> Guido van Rossum a écrit :
>>> Do people think it would be too radical if the built-in open()
>>> function was removed altogether, requiring all code that opens files
>>> to import the io module first? This would make it easier to identify
>>> modules that engage in I/O.
>> -1
>>
>> Will someone think of the interactive users ?
> 
> What kind of interactive use are you making of open()?
> 

Well, mostly two things: for one, quick inspection of data files (I'm working in
physics). Sure, I can also use pylab.load with most reasonable data file
formats. But sometimes, you have a really weird format and/or you just want to
quickly read a few values. The other main use case is common sysadmin-type jobs,
as in

>>> for line in open('records.txt'):
...     print line.split(':')[0]

Now, I was jokingly making it sound more dramatic than it really is. Of course,
I can do import io (especially with a 2-letter module name, it's not that bad),
just like I now do import shutil (or is that shutils, I never remember) when I
need to modify the filesystem. No big deal.

I just wanted to point out that any cleaning of the builtin namespace is a
benefit for programmers, but also a disadvantage for interactive users. How the
trade-off is made is yours to decide.

Thanks for caring,
Baptiste


From baptiste13 at altern.org  Sun May 20 16:19:14 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Sun, 20 May 2007 16:19:14 +0200
Subject: [Python-3000] Radical idea: remove built-in open (require
	import io)
In-Reply-To: <f2kqre$4cl$1@sea.gmane.org>
References: <ca471dc20705171648i28bd435dk2358c1ed77da4714@mail.gmail.com>	<f2kpv6$sm4$1@sea.gmane.org>
	<f2kqre$4cl$1@sea.gmane.org>
Message-ID: <f2plep$v0f$1@sea.gmane.org>

Georg Brandl a écrit :
> Baptiste Carvello schrieb:
>> Guido van Rossum a écrit :
>>> Do people think it would be too radical if the built-in open()
>>> function was removed altogether, requiring all code that opens files
>>> to import the io module first? This would make it easier to identify
>>> modules that engage in I/O.
>>>
>> -1
>>
>> Will someone think of the interactive users ?
> 
> They can still put "import sys, os, io" in their PYTHONSTARTUP file.
> 
Thanks, I had forgotten that possibility.

> Or use IPython.
> 
Well, I have to say that I'm a bit worried by a current trend on python-dev
of answering any question about interactive use by pointing to IPython. I *love*
IPython. I'm using it a lot. But sometimes, because of the longer startup time,
or because you want to stay close to "normal" Python, you prefer to use the
standard interpreter. And I believe this should really stay a *supported* use.

Of course, on this specific case, I understand a trade-off has to be made.

Baptiste


From alexandre at peadrop.com  Sun May 20 23:28:14 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Sun, 20 May 2007 17:28:14 -0400
Subject: [Python-3000] Introduction and request for commit access to the
	sandbox.
Message-ID: <acd65fa20705201428y69c90329i84602ecee5b9cf8e@mail.gmail.com>

Hello,

As some of you may already know, I will be working on Python for this
year's Google Summer of Code. My project is to merge the modules that have
dual C and Python implementations, i.e. cPickle/pickle,
cStringIO/StringIO and cProfile/profile [1]. This project is part of
the standard library reorganization for Python 3000 [2]. My mentor
for this project is Brett Cannon.

So first, let me introduce myself. I am currently a student from
Quebec, Canada. I plan to make a career as a (hopefully good)
programmer, so I dedicate a lot of my free time to contributing
to open source projects, like Ubuntu. I recently became interested
in how compilers and interpreters work, so I started reading Python's
source code, which is one of the most well-organized and comprehensive
code bases I have seen. This motivated me to start contributing to
Python. However, since school kept me fairly busy, I haven't had the
chance to do anything other than provide support to Python's users
in the #python FreeNode IRC channel. This year's Summer of Code will
give me the chance to make a significant contribution to Python, and to
get started with Python development as well.

With that said, I would like to request svn access to the sandbox for my
work. I will use this access only for modifying stuff in the directory
I will be assigned to. I would like to use the username "avassalotti"
and the attached SSH2 public key for this access.

One last thing: if you know of semantic differences (other than the
obvious ones) between the C and Python versions of the modules I need
to merge, please let me know. This will greatly simplify the merge and
reduce the chances of later breakage.

Cheers,
-- Alexandre

.. [1] Abstract of my application, Merge the C and Python
implementations of the same interface
   (http://code.google.com/soc/psf/appinfo.html?csaid=C6768E09BEF7CCE2)
.. [2] PEP 3108, Standard Library Reorganization, Cannon
   (http://www.python.org/dev/peps/pep-3108)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: id_dsa.pub
Type: application/octet-stream
Size: 610 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-3000/attachments/20070520/1242e699/attachment.obj 

From pje at telecommunity.com  Mon May 21 02:07:42 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 20 May 2007 20:07:42 -0400
Subject: [Python-3000] PEP 367: New Super
In-Reply-To: <000b01c79aa7$ba716cc0$0201a8c0@mshome.net>
References: <003001c795f8$d5275060$0201a8c0@mshome.net>
	<20070514165704.4F8D23A4036@sparrow.telecommunity.com>
	<000b01c79aa7$ba716cc0$0201a8c0@mshome.net>
Message-ID: <20070521000552.B64C93A4061@sparrow.telecommunity.com>

At 04:25 PM 5/20/2007 +1000, Tim Delaney wrote:
>I'm not sure what you're getting at here - are you referring to the 
>decorators for classes PEP? In that case, the decorator is applied 
>after the class is constructed, so it would be the undecorated class.
>
>Are class decorators going to update the MRO? I see nothing about 
>that in PEP 3129, so using the undecorated class would match the 
>current super(cls, self) behaviour.

Class decorators can (and sometimes *do*, in PEAK) return an object 
that's not the original class object.  So that would break super, 
which is why my inclination is to go with using the decorated result.


From pje at telecommunity.com  Mon May 21 02:11:24 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 20 May 2007 20:11:24 -0400
Subject: [Python-3000] [Python-Dev]  PEP 367: New Super
In-Reply-To: <00ae01c79ab7$c1ea5100$0201a8c0@mshome.net>
References: <009c01c79aaa$441b0dd0$0201a8c0@mshome.net>
	<464FFBDE.4000109@gmail.com>
	<00ae01c79ab7$c1ea5100$0201a8c0@mshome.net>
Message-ID: <20070521000933.202083A4061@sparrow.telecommunity.com>

At 06:20 PM 5/20/2007 +1000, Tim Delaney wrote:
>Nick Coghlan wrote:
> > Tim Delaney wrote:
> >> So the question is, should the method store the class, or the name?
> >> Looking up by name could pick up a totally unrelated class, but
> >> storing the undecorated class could miss something important in the
> >> decoration.
> >
> > Couldn't we provide a mechanism whereby the cell can be adjusted to
> > point to the decorated class? (heck, the interpreter has access to
> > both classes after execution of the class statement - it could
> > probably arrange for this to happen automatically whenever the
> > decorated and undecorated classes are different).
>
>Yep - I thought of that. I think that's probably the right way to go.

Btw, PEP 3124 needs a way to receive the same class object at more or 
less the same moment, although in the form of a callback rather than 
a cell assignment.  Guido suggested I co-ordinate with you to design 
a mechanism for this.



From martin at v.loewis.de  Mon May 21 06:44:25 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 21 May 2007 06:44:25 +0200
Subject: [Python-3000] [Python-Dev] Introduction and request for commit
 access to the sandbox.
In-Reply-To: <acd65fa20705201428y69c90329i84602ecee5b9cf8e@mail.gmail.com>
References: <acd65fa20705201428y69c90329i84602ecee5b9cf8e@mail.gmail.com>
Message-ID: <465123A9.8090500@v.loewis.de>

> With that said, I would to request svn access to the sandbox for my
> work. I will use this access only for modifying stuff in the directory
> I will be assigned to. I would like to use the username "avassalotti"
> and the attached SSH2 public key for this access.

I have added your key. As we have a strict first.last account policy,
I named it alexandre.vassalotti; please correct me if I misspelled it.

> One last thing, if you know semantic differences (other than the
> obvious ones) between the C and Python versions of the modules I need
> to merge, please let know. This will greatly simplify the merge and
> reduce the chances of later breaking.

Somebody noticed on c.l.p that, for cPickle,
a) cPickle will start memo keys at 1; pickle at 0
b) cPickle will not put things into the memo if their refcount is
   1, whereas pickle puts everything into the memo.

Not sure what you'd consider obvious, but I'll mention that cStringIO
"obviously" is constrained in what data types you can write (namely,
byte strings only), whereas StringIO allows Unicode strings as well.
Less obviously, StringIO also allows

py> s = StringIO(0)
py> s.write(10)
py> s.write(20)
py> s.getvalue()
'1020'
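
Whatever the internal differences, the two implementations should stay
interchangeable on the wire, which suggests a cheap sanity check to keep
around during the merge (a minimal Python 2 sketch, made-up data):

    import pickle
    import cPickle

    data = {'a': [1, 2, 3], 'b': u'text'}
    for proto in range(pickle.HIGHEST_PROTOCOL + 1):
        assert cPickle.loads(pickle.dumps(data, proto)) == data
        assert pickle.loads(cPickle.dumps(data, proto)) == data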

Regards,
Martin

From tomerfiliba at gmail.com  Mon May 21 12:03:50 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Mon, 21 May 2007 12:03:50 +0200
Subject: [Python-3000] PEP 3131 - the details
Message-ID: <1d85506f0705210303g7e18a769vd95480799be08dd2@mail.gmail.com>

[Martin v. Löwis]
> > So, maybe it's better to keep the status quo, and not allow Cf
> > characters, unless someone comes up with a particular need for doing so.
> > Hm, I think I've convinced myself of that now. :)
>
> That is my reasoning, too. People seem to want to be conservative,
> so it's safer to reject formatting characters for the moment.
> If people come up with a need, they still can be added.
>
> (there might be a need for it in RTL languages, supporting
> 200E..200F and 202A..202E, but it seems that speakers of RTL
> languages are skeptical about the entire PEP, so it's unclear
> whether allowing these would help anything)

i thought of simply treating Cf chars as whitespace -- i.e., they
are allowed BETWEEN identifiers, but not INSIDE of them.

but then again, what if i wanted identifiers in more than one language
or direction? that may seem pointless, but i can give concrete
examples of usage -- the cardinal numbers (aleph one and friends):
ℵ1
1ℵ

without the LTR marker, it would read one-aleph, which also *looks* like
an invalid identifier, because it begins with a number (although it doesn't).
the point is -- you must allow such markers to appear inside tokens.

allowing me to use greek symbols in equations, but NOT allowing me
to use hebrew ones, is just wrong. either you allow latin-only, or you
allow every character supported by unicode. there's no justification
for compromises, as the motivation of the PEP is localization, and
you can't discriminate one locale from another.

it's getting complicated. that's why i was against it from the very start.
i mean, i wouldn't mind having it, but being familiar with RTL languages,
i know how complex it is.


-tomer

From jimjjewett at gmail.com  Mon May 21 17:56:21 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 21 May 2007 11:56:21 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <464FFD04.90602@v.loewis.de>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
	<f27rmv$k1d$1@sea.gmane.org> <464FFD04.90602@v.loewis.de>
Message-ID: <fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>

On 5/20/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > That is how I felt when you dismissed my effort to make your proposal more
> > useful and more acceptable to some (by addressing transliteration) with the
> > little molehill problem that Norwegians and Germans disagree about o:
> > (rotated 90 degrees).

> So let me phrase this differently: I'm not aware of an algorithm that
> can do transliteration for all Unicode characters. Therefore, I cannot
> add transliteration into the PEP.

> Do you know of any?

There is no single transliteration that will both

    (1)  Work for all languages, and
    (2)  Be readable on its own

But are those real requirements?

(1)

Would it be acceptable to create an encoding such that you could read and write

    Löwis

in your editor, but upon import, python treated it as though you had writtten

    LU_246wis

Other modules would see LU_246wis, unless they also used that encoding
-- in which case the user should also see Löwis while editing.

(I'm not suggesting character-at-a-time replacements as the *right*
answer, but the mechanics of recoding are less important than whether
or not to accept the use of mangled internal identifiers.)
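
For what it's worth, one possible reading of the (admittedly naive)
character-at-a-time scheme fits in a few lines; the U_<codepoint> escape
format is invented here purely for illustration:

    def mangle(name):
        # replace each non-ASCII character with U_<decimal codepoint>
        return ''.join(ch if ord(ch) < 128 else 'U_%d' % ord(ch)
                       for ch in name)

    print mangle(u'L\xf6wis')    # -> 'LU_246wis'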

(2)

If the above is not acceptable, and even the internal representation
has to be readable, then would it be acceptable to make the
transliteration strategy something the user could set, similar to
today's coding: directive?

-jJ

From jimjjewett at gmail.com  Mon May 21 18:30:35 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 21 May 2007 12:30:35 -0400
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <1d85506f0705210303g7e18a769vd95480799be08dd2@mail.gmail.com>
References: <1d85506f0705210303g7e18a769vd95480799be08dd2@mail.gmail.com>
Message-ID: <fb6fbf560705210930i656666fxde5a8423e85675cf@mail.gmail.com>

On 5/21/07, tomer filiba <tomerfiliba at gmail.com> wrote:

> i thought of simply treating Cf chars as whitespace -- i.e., they
> are allowed BETWEEN identifiers, but not INSIDE of them.

I think the suggestion from other languages was to strip them out
during canonicalization.   This allows abc and cba to refer to the
same identifier, if someone is being sneaky.  Whether that is a
problem or not, ... I think so, but it is a judgment call.


> but then again, what if i wanted identifiers in more than one language
> or direction? that may seem pointless, but i can give concrete
> examples of usage -- the cardinal numbers (aleph one and friends):
> ?1
> 1?

In my English math classes, this was simply written with the aleph
before the one; since the aleph was only a single character, it didn't
really matter which order we would have used for additional
characters.

I think this could be generalized so that RTL is assumed to switch
when switching back out of an RTL script, even if the next character
is "inherited" (like parens) or "common" (like numbers).

ℵ
 123

-jJ

From jason.orendorff at gmail.com  Mon May 21 19:18:59 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 21 May 2007 13:18:59 -0400
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <464C9D55.9080501@v.loewis.de>
References: <464BBE2B.1050201@acm.org>
	<2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>
	<464C9D55.9080501@v.loewis.de>
Message-ID: <bb8868b90705211018r74396a6fofa671ea671efc31f@mail.gmail.com>

On 5/17/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> That is my reasoning, too. People seem to want to be conservative,
> so it's safer to reject formatting characters for the moment.
> If people come up with a need, they still can be added.

How about this: *require* the LEFT-TO-RIGHT MARK after
every sequence of RTL characters outside a string or
comment; and *forbid* all other Cf characters.

This is just as conservative, but supports RTL-language
identifiers better. It prevents all the "stupid bidi tricks"
I know of (abc = cba and so forth).

It pins the cost of maintaining bidi sanity on writers rather
than readers of code.  For all existing code, this is no cost
at all, of course.  For RTL languages this is a nontrivial
burden, but Python can't fix that--it's a fact of bidi life.

-j

From tjreedy at udel.edu  Mon May 21 23:30:28 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 May 2007 17:30:28 -0400
Subject: [Python-3000] Support for PEP 3131
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com><4646A3CA.40705@acm.org>	<4646FCAE.7090804@v.loewis.de><f27rmv$k1d$1@sea.gmane.org>
	<464FFD04.90602@v.loewis.de>
Message-ID: <f2t31i$p07$1@sea.gmane.org>


""Martin v. L?wis"" <martin at v.loewis.de> wrote in message 
news:464FFD04.90602 at v.loewis.de...
| I'm not aware of an algorithm that
| can do transliteration for all Unicode characters.

Were you proposing to allow all Unicode characters in Python names?-)

| Therefore, I cannot add transliteration into the PEP.

Non sequitur.  How I read this is "Because I do not know how to do 
something that does not need to be done, I cannot do something that could 
be done."  So it strikes me as another red-herring dismissal that seems to 
ignore the actual content of what I proposed, which was to do something 
that I believe can be done and which would be useful to do.

My proposal was that the Unicode characters allowed in Python identifiers 
be limited to those with a transliteration, either current or to be 
developed by those who want to use a particular character set.  So if, for 
instance, one or more people wanted to program in Klingon in its 'native' 
characters, they would need to provide the mapping (which I suspect already 
exists).  More or less official transliterations do exist, I believe, for 
the major languages that we are seriously concerned with.  And just for 
readability purposes, I would leave the accented latin chars alone, and even 
let them be available as part of an extended target set.  So while I might 
be wrong, I *think* that we could get 99% use-case coverage.

While the PEP's acceptance as-is (for which I congratulate you for your 
persistence) makes transliteration moot as an acceptability enhancement, it 
does not change its desirability for use purposes.  To repeat: without it, 
national character identifiers will tend to ghettoize code.  While this 
might be a minor issue for Chinese, it will be a bigger issue for people 
writing in Thai or Ibo or other languages with small pioneering groups of 
Python programmers.

Terry Jan Reedy






From ncoghlan at gmail.com  Tue May 22 00:01:29 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 May 2007 08:01:29 +1000
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <f2t31i$p07$1@sea.gmane.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com><4646A3CA.40705@acm.org>	<4646FCAE.7090804@v.loewis.de><f27rmv$k1d$1@sea.gmane.org>	<464FFD04.90602@v.loewis.de>
	<f2t31i$p07$1@sea.gmane.org>
Message-ID: <465216B9.8040802@gmail.com>

Terry Reedy wrote:
> 
> My proposal was that the Unicode characters allowed in Python identifiers 
> be limited to those with a transliteration, either current or to be 
> developed by those who want to use a particular character set.

Japanese has a transliteration to Roman script, but it suffers from 
ambiguity that typically isn't present in the native written forms of 
the words (i.e. there are different characters in Kanji which are 
pronounced the same way, and spelt the same way in hiragana - and it is 
only the hiragana syllabary which can be mapped to roman characters).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From martin at v.loewis.de  Tue May 22 00:19:33 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 22 May 2007 00:19:33 +0200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <1d85506f0705210303g7e18a769vd95480799be08dd2@mail.gmail.com>
References: <1d85506f0705210303g7e18a769vd95480799be08dd2@mail.gmail.com>
Message-ID: <46521AF5.3030500@v.loewis.de>

> i thought of simply treating Cf chars as whitespace -- i.e., they
> are allowed BETWEEN identifiers, but not INSIDE of them.

Ok - that would also work. Are you proposing that the PEP be changed
in that way, or are you merely stating that it would "work"? (i.e.
would you prefer to see it changed that way?)

> without the LTR marker, it would read one-aleph, which also *looks* like
> an invalid indentifier, because it begins with a number (although it
> doesn't).
> the point is -- you must allow such markers to appear inside tokens.

That seems to be a different specification now - you are now saying
that they should *not* be treated like whitespace.

So I'm still at a loss what the PEP should say about Cf characters.

> allowing me to use greek symbols in equations, but NOT allowing me
> to use hebrew ones, is just wrong. either you allow latin-only, or you
> allow every character supported by unicode. there's no justification
> for compromises, as the motivation of the PEP is localization, and
> you can't discriminate one locale from another.

But the PEP does not do that! It allows to use both Hebrew and Greek
letters in identifiers.

> it's getting complicated. that's why i was against it from the very start.
> i mean, i wouldn't mind having it, but being familiar with RTL languages,
> i know how complex it is.

Sure. If there isn't a clearly "correct" specification, the conservative
approach requested by several people here would require rejecting Cf
characters - they are not letters, so they are *not* similar to Greek
letters (not sure whether you suggested that they are).

Then, if later there is a demonstrated need for formatting characters,
they still could be added.

Regards,
Martin

From martin at v.loewis.de  Tue May 22 00:27:35 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 22 May 2007 00:27:35 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>	
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>	
	<f27rmv$k1d$1@sea.gmane.org> <464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
Message-ID: <46521CD7.9030004@v.loewis.de>

> Would it be acceptable to create an encoding such that you could read
> and write
> 
>    Löwis
> 
> in your editor, but upon import, python treated it as though you had
> writtten
> 
>    LU_246wis
> 
> Other modules would see LU_246wis, unless they also used that encoding
> -- in which case the user should also see L?wis while editing.

What problem would that solve? You could type the identifier that
way - but you would need to know already that this is the identifier
you want to type; how do you know?

> (I'm not suggesting character-at-a-time replacements as the *right*
> answer, but the mechanics of recoding are less important than whether
> or not to accept the use of mangled internal identifiers.)

Again, I'm uncertain what the use case here would be. For "proper"
transliteration, users can memorize easily what the transliterated
name would be, and visually identify the two representations.
With a "numeric transliteration", users would *not* normally be
able to tell what a transliterated character means, or how to
transliterate a given character.

> If the above is not acceptable, and even the internal representation
> has to be readable, then would it be acceptable to make the
> transliteration strategy something the user could set, similar to
> today's coding: directive?
> 

Then I don't understand your above proposal. I thought you were
proposing to replace all non-ASCII characters with some ASCII form
on import of the module. What do you mean by "readable internal
representation"?

Regards,
Martin


From martin at v.loewis.de  Tue May 22 00:31:36 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 22 May 2007 00:31:36 +0200
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <bb8868b90705211018r74396a6fofa671ea671efc31f@mail.gmail.com>
References: <464BBE2B.1050201@acm.org>	
	<2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>	
	<464C9D55.9080501@v.loewis.de>
	<bb8868b90705211018r74396a6fofa671ea671efc31f@mail.gmail.com>
Message-ID: <46521DC8.3080704@v.loewis.de>

> How about this: *require* the LEFT-TO-RIGHT MARK after
> every sequence of RTL characters outside a string or
> comment; and *forbid* all other Cf characters.
> 
> This is just as conservative, but supports RTL-language
> identifiers better. It prevents all the "stupid bidi tricks"
> I know of (abc = cba and so forth).

This is indeed more conservative, and I could happily put it
in the PEP, but again I prefer not to do so without an explicit
confirmation from a user of such a language that this actually
helps anything.

tomer's comment (that you need the mark even inside an identifier)
has puzzled me.

Regards,
Martin

From jimjjewett at gmail.com  Tue May 22 01:29:36 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 21 May 2007 19:29:36 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46521CD7.9030004@v.loewis.de>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
	<f27rmv$k1d$1@sea.gmane.org> <464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
Message-ID: <fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>

On 5/21/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > Would it be acceptable to create an encoding such that you could read
> > and write

> >    Löwis

> > in your editor, but upon import, python treated it as though you had
> > writtten

> >    LU_246wis

> > Other modules would see LU_246wis, unless they also used that encoding
> > -- in which case the user should also see Löwis while editing.

> What problem would that solve? You could type the identifier that
> way - but you would need to know already that this is the identifier
> you want to type; how do you know?

(1)
If I am using the module based on its documentation, or based on
opening it up and reading it, then I can use the same encoding, and I
can write Löwis.

(2)
If I do arbitrary introspection, such as

    import sys
    for k, v in sys.modules.items():
        if v:
            print dir(v)

then I will get something usable, though perhaps not easily readable.

(3)
The mapping is reversible, so I can work interactively with the
arbitrary characters by setting my console/idle preferences to the
special encoding.

> Again, I'm uncertain what the use case here would be. For "proper"
> transliteration, users can memorize easily what the transliterated
> name would be, and visually identify the two representations.

For two latin-based alphabets, yes.  I'm not so sure for non-western scripts.

As you pointed out, the correct transliteration may depend on the
natural language (instead of just the character code point), which
means we probably can't do it automatically.

It also has to be a one-way transliteration; if ö -> o (or oe) then an
o (or oe) in the result can't always be transliterated back.

> With a "numeric transliteration", users would *not* normally be
> able to tell what a transliterated character means, or how to
> transliterate a given character.

(1)  They shouldn't ever need to see the numeric version unless
they're intentionally peeking under the covers, or their site doesn't
have the appropriate encoding installed.  One advantage of this method
is that a single transliteration method could work for any language,
so it probably would be installed already.

(2)  Even if users did somehow see the numeric version, it wouldn't be
that awful.  For the languages close enough to ASCII that a
transliteration is straightforward, the number of extra characters to
memorize is fairly small.

> > If the above is not acceptable, and even the internal representation
> > has to be readable, then would it be acceptable to make the
> > transliteration strategy something the user could set, similar to
> > today's coding: directive?

> Then I don't understand your above proposal. I thought you were
> proposing to replace all non-ASCII characters with some ASCII form
> on import of the module. What do you mean by "readable internal
> representation"?

This alternative would let an individual user say "I'm writing
Swedish; turn my ö into an o."   The actual identifiers used by Python
itself would be more readable, but the downside is that users would
have to read them more often, instead of using/editing/viewing
strictly in the untransliterated version.

-jJ

From foom at fuhm.net  Tue May 22 02:28:20 2007
From: foom at fuhm.net (James Y Knight)
Date: Mon, 21 May 2007 20:28:20 -0400
Subject: [Python-3000] PEP 3131 - the details
In-Reply-To: <46521DC8.3080704@v.loewis.de>
References: <464BBE2B.1050201@acm.org>	
	<2A4F5FE3-9F8A-4B74-B46D-B63F1260B7FD@fuhm.net>	
	<464C9D55.9080501@v.loewis.de>
	<bb8868b90705211018r74396a6fofa671ea671efc31f@mail.gmail.com>
	<46521DC8.3080704@v.loewis.de>
Message-ID: <18E17A4B-1B0A-4C75-993B-2B74B8DE5D91@fuhm.net>

On May 21, 2007, at 6:31 PM, Martin v. Löwis wrote:
> This is indeed more conservative, and I could happily put it
> in the PEP, but again I prefer not to do so without an explicit
> confirmation from a user of such a language that this actually
> helps anything.
>
> tomer's comment (that you need the mark even inside an identifier)
> has puzzled me.

I agree: nothing should be done without an explicit example of how it  
will actually improve matters. For example: editor XYZ can be used to  
sensibly edit JS/C#/Java/whatever code in an RTL language, and could  
also be used to edit python if only python did <insert helpful thing  
here>. Anything else seems to be simply wild speculation, and should  
not be implemented on the off chance that it might be useful in the  
future.

James

From martin at v.loewis.de  Tue May 22 07:00:52 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 22 May 2007 07:00:52 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>	
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>	
	<f27rmv$k1d$1@sea.gmane.org> <464FFD04.90602@v.loewis.de>	
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>	
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
Message-ID: <46527904.1000202@v.loewis.de>

> If I do arbitrary introspection, such as
> 
>    import sys
>    for k, v in sys.modules.items():
>        if v:
>            print dir(v)
> 
> then I will get something usable, though perhaps not easily readable.

I think this is unacceptable (at least I cannot accept it):
with reflection, I want to get the *true* variable names, not
the mangled ones. In the scenario that people had discussed with
using long Japanese method names for test methods, if the method
fails, you clearly want to see the Japanese name, so you can
easily read what failed.

> The mapping is reversible, so I can work interactively with the
> arbitrary characters by setting my console/idle preferences to the
> special encoding.

That could work both ways, of course. If you want a reflective
API to give you mangled names, you could easily implement that
yourself on top of PEP 3131.

>> Again, I'm uncertain what the use case here would be. For "proper"
>> transliteration, users can memorize easily what the transliterated
>> name would be, and visually identify the two representations.
> 
> For two latin-based alphabets, yes.  I'm not so sure for non-western
> scripts.

I know that the Chinese regularly use pinyin for transliteration,
and somebody confirmed in c.l.p that they also use it in programming
if they can't use the Chinese characters directly.

> As you pointed out, the correct transliteration may depend on the
> natural language (instead of just the character code point), which
> means we probably can't do it automatically.

That's the problem, yes.

> It also has to be a one-way transliteration; if ö -> o (or oe) then an
> o (or oe) in the result can't always be transliterated back.

The same is true for your "numeric transliteration": there is no way
to *reliably* tell whether some string is a mangled string, or
just happens to include U_ in the identifier (which it legally can
do today).

That's why Java and C++ use \u, so you would write L\u00F6wis
as an identifier. *This* is truly unambiguous. I claim that it
is also useless.

> (1)  They shouldn't ever need to see the numeric version unless
> they're intentionally peeking under the covers, or their site doesn't
> have the appropriate encoding installed.  One advantage of this method
> is that a single transliteration method could work for any language,
> so it probably would be installed already.

I think you are really arguing for \u escapes in identifiers here.

> (2)  Even if users did somehow see the numeric version, it wouldn't be
> that awful.  For the languages close enough to ASCII that a
> transliteration is straightforward, the number of extra characters to
> memorize is fairly small.

What about the other languages? This PEP is not just for latin-based
scripts.

>> Then I don't understand your above proposal. I thought you were
>> proposing to replace all non-ASCII characters with some ASCII form
>> on import of the module. What do you mean by "readable internal
>> representation"?
> 
> This alternative would let an individual user say "I'm writing
> Swedish; turn my ö into an o."   The actual identifiers used by Python
> itself would be more readable, but the downside is that users would
> have to read them more often, instead of using/editing/viewing
> strictly in the untransliterated version.

That again cannot work because you don't have transliteration
algorithms for all characters, or all languages.

Regards,
Martin

From martin at v.loewis.de  Tue May 22 07:58:17 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 22 May 2007 07:58:17 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <f2t31i$p07$1@sea.gmane.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com><4646A3CA.40705@acm.org>	<4646FCAE.7090804@v.loewis.de><f27rmv$k1d$1@sea.gmane.org>	<464FFD04.90602@v.loewis.de>
	<f2t31i$p07$1@sea.gmane.org>
Message-ID: <46528679.9060709@v.loewis.de>

> | I'm not aware of an algorithm that
> | can do transliteration for all Unicode characters.
> 
> Were you proposing to allow all Unicode characters in Python names?-)

Not sure how to interpret your question: no, I'm not proposing
to allow all Unicode characters, just a selected subset (but then,
I don't know a universal transliteration algorithm for that subset,
either).

> | Therefore, I cannot add transliteration into the PEP.
> 
> Non sequitur.  How I read this is "Because I do not know how to do 
> something that does not need to be done, I cannot do something that could 
> be done."

No. You should read it "because I don't know how to do it, *I* will
not do it".

> My proposal was that the Unicode characters allowed in Python identifiers 
> be limited to those with a transliteration, either current or to be 
> developed by those who want to use a particular character set.

But what would be the purpose of doing so? Mere existence of a
transliteration algorithm surely isn't what you are after.

> While the PEP's acceptance as-is (for which I congratulate you on your 
> persistence) makes transliteration moot as an acceptability enhancement, it 
> does not change its desirability for use purposes.  To repeat: without it, 
> national character identifiers will tend to ghettoize code.  While this 
> might be a minor issue for Chinese, it will be a bigger issue for people 
> writing in Thai or Ibo or other languages with small pioneering groups of 
> Python programmers.

What I fail to see is how existence of a transliteration algorithm would
remove the ghettoization. It must be used somehow, no?

Regards,
Martin

From jimjjewett at gmail.com  Tue May 22 22:29:02 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 22 May 2007 16:29:02 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46527904.1000202@v.loewis.de>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
	<f27rmv$k1d$1@sea.gmane.org> <464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
Message-ID: <fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>

On 5/22/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:

> That's why Java and C++ use \u, so you would write L\u00F6wis
> as an identifier. ...
> I think you are really arguing for \u escapes in identifiers here.

Yes, that is effectively what I was suggesting.

> *This* is truly unambiguous. I claim that it is also useless.

It means users could see the usability benefits of PEP3131, but the
python internals could still work with ASCII only.

It simplifies checking for identifiers that *don't* stick to ASCII,
which reduces some of the concerns about confusable characters, and
which ones to allow.

Short list of judgment calls that we need to resolve if we go with
non-ASCII identifiers, but can largely ignore if we just use escaping:

Based only on UAX 31:

    ID vs XID  (unicode changed their mind on recommendations)

    include stability extensions?  (*Python* didn't allow those
letters previously.)

    which of ID_CONTINUE should be left out.  (We don't want "-", and
some of the punctuation and other marks may be closer to "-" than to
"_".  Or they might not be, and I don't know how to judge that.)

    layout and control characters (At the top of section 2, tr31
recommends acting as though they weren't there ... but if we use a
normal (unicode) string, then they will still affect the hash.  Down
in 2.2, they say not to permit them, except sometimes...)

    Canonicalization (see the small NFKC example after this list)

    Combining Marks should be accepted (only as continuation chars),
but not if they're enclosing marks, because ... well, I'm not sure,
but I'll have to trust them.

    Specific character Adjustments (sec 2.3) -- The example suggests
that we might have to tailor for our use of "_", though I didn't get
that from the table.  They do suggest tailoring out certain
Decomposition Types.

    Additional (non-letter?) characters which may occur in words (see
UAX29, but I don't claim to fully understand it)

    Undefined code points, particularly those which might be defined later?

    Should we exclude the letters that look like punctuation?  A
proposed update (http://www.unicode.org/reports/tr31/tr31-8.html)
mentions U+02B9 (modifier letter prime) only because the visually
equivalent U+0374 (Greek Numeral Sign) shouldn't be an identifier, but
does fold to it under (some?) canonicalization.  (They suggest
allowing both, instead of neither.)
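
To make the canonicalization item above concrete, here is a small
illustration (not normative; it just shows what NFKC normalization
does to two spellings of the same name):

    import unicodedata

    a = u'L\u00f6wis'      # precomposed o-umlaut, U+00F6
    b = u'Lo\u0308wis'     # plain 'o' followed by U+0308 COMBINING DIAERESIS

    print a == b                                  # False
    print (unicodedata.normalize('NFKC', a) ==
           unicodedata.normalize('NFKC', b))      # True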

Then TR 39 http://www.unicode.org/reports/tr39/ recommends excluding
(most, but not all of)

    characters not in modern use;

    characters only used in specialized fields, such as liturgical
characters, mathematical letter-like symbols, and certain phonetic
alphabetics;

    and ideographic characters that are not part of a set of core CJK
ideographs consisting of the CJK Unified Ideographs block plus IICore
(the set of characters defined by the IRG as the minimal set of
required ideographs for East Asian use).

They summarize this in
http://www.unicode.org/reports/tr39/data/xidmodifications.txt; I
wouldn't add the hyphen-minus back in, but I don't know whether
katakana middle dot should be allowed.

Should mixed-script identifiers be allowed?  According to TR 36
(http://www.unicode.org/reports/tr36/) ASCII only is the safest, and
that is followed by limits on mixed-script identifiers.  Those limits
sound reasonable to me, but ... I'm not the one who would be mixing
them.

Note that even "highly restrictive" allows ASCII + Han + Hiragana +
Katakana, ASCII + Han + Bopomofo, and ASCII + Han + Hangul.  (I think
we wanted at least the ASCII numbers with anything.)

-jJ

From jimjjewett at gmail.com  Tue May 22 22:30:50 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 22 May 2007 16:30:50 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46528679.9060709@v.loewis.de>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
	<f27rmv$k1d$1@sea.gmane.org> <464FFD04.90602@v.loewis.de>
	<f2t31i$p07$1@sea.gmane.org> <46528679.9060709@v.loewis.de>
Message-ID: <fb6fbf560705221330nbb7a9ax6e3a025ad8cd182c@mail.gmail.com>

On 5/22/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:

[Referring to my alternate alternative proposal -- user-controlled
transliteration, rather than unicode escapes in identifiers]

> >> Then I don't understand your above proposal. I thought you were
> >> proposing to replace all non-ASCII characters with some ASCII
> >> form on import of the module. What do you mean by "readable
> >> internal representation"?

That ASCII form -- and the requirement that it still be something
humans don't mind reading -- which in turn means that it can't be done
as a single one-size-fits-all algorithm; users would have to be able
to choose (and perhaps locally modify) it.
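
Something as simple as a per-user mapping table is what I have in mind
(a toy sketch only; the table contents here are made up):

    # The user supplies -- and can edit -- their own table ...
    MY_TABLE = {u'\u00e5': 'a', u'\u00e4': 'ae', u'\u00f6': 'o'}

    def transliterate(name, table=MY_TABLE):
        # ... and every identifier is rewritten with it; characters not
        # in the table pass through if ASCII, else fall back to '_'.
        return ''.join(table.get(c, c if ord(c) < 128 else '_')
                       for c in name)

    print transliterate(u'storlek_p\u00e5_k\u00f6')    # prints: storlek_pa_ko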

-jJ

From alexandre at peadrop.com  Tue May 22 22:35:36 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 22 May 2007 16:35:36 -0400
Subject: [Python-3000] [Python-Dev] Introduction and request for commit
	access to the sandbox.
In-Reply-To: <465123A9.8090500@v.loewis.de>
References: <acd65fa20705201428y69c90329i84602ecee5b9cf8e@mail.gmail.com>
	<465123A9.8090500@v.loewis.de>
Message-ID: <acd65fa20705221335i1dc81496h7f6d168472adb170@mail.gmail.com>

On 5/21/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > With that said, I would to request svn access to the sandbox for my
> > work. I will use this access only for modifying stuff in the directory
> > I will be assigned to. I would like to use the username "avassalotti"
> > and the attached SSH2 public key for this access.
>
> I have added your key. As we have a strict first.last account policy,
> I named it alexandre.vassalotti; please correct me if I misspelled it.

Thanks!

> > One last thing, if you know semantic differences (other than the
> > obvious ones) between the C and Python versions of the modules I need
> > to merge, please let know. This will greatly simplify the merge and
> > reduce the chances of later breaking.
>
> Somebody noticed on c.l.p that, for cPickle,
> a) cPickle will start memo keys at 1; pickle at 0
> b) cPickle will not put things into the memo if their refcount is
>    1, whereas pickle puts everything into the memo.

Noted. I think I found the thread on c.l.p about it:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/68c72a5066e4c9bb/b2bc78f7d8d50320
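
A quick way to eyeball both differences (just a rough sketch, not
something from that thread) is to compare the PUT opcodes, whose
arguments are the memo keys:

    import pickle, cPickle, pickletools

    # pickletools.dis prints the opcode stream, including the PUT
    # opcodes and the memo keys they use.
    data = [['x'], ['x']]
    print '--- pickle ---'
    pickletools.dis(pickle.dumps(data))
    print '--- cPickle ---'
    pickletools.dis(cPickle.dumps(data))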

> Not sure what you'd consider obvious, but I'll mention that cStringIO
> "obviously" is constrained in what data types you can write (namely,
> byte strings only), whereas StringIO allows Unicode strings as well.

Yes. I was already aware of this. I just hope this problem will go
away with the string unification in Python 3000. However, I will need
to deal with this, sooner or later, if I want to port the merge to
2.x.

> Less obviously, StringIO also allows
>
> py> s = StringIO(0)
> py> s.write(10)
> py> s.write(20)
> py> s.getvalue()
> '1020'

That is probably due to the design of cStringIO, which is separated
into two subparts StringI and StringO. So when the constructor of
cStringIO is given a string, it builds an output object, otherwise it
builds an input object:

    static PyObject *
    IO_StringIO(PyObject *self, PyObject *args) {
      PyObject *s=0;

      if (!PyArg_UnpackTuple(args, "StringIO", 0, 1, &s)) return NULL;

      if (s) return newIobject(s);
      return newOobject(128);
    }
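
In other words (a quick toy illustration, not from the module docs):

    import cStringIO

    si = cStringIO.StringIO('hello')   # string argument -> StringI (read-only)
    print si.read()                    # prints: hello  (si has no write method)

    so = cStringIO.StringIO()          # no argument -> StringO (writable)
    so.write('hello')
    print so.getvalue()                # prints: hello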

As you see, cStringIO's code also needs a good cleanup to make it,
at least, conform to PEP 7.

-- Alexandre

From python at zesty.ca  Wed May 23 00:08:05 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Tue, 22 May 2007 17:08:05 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>

On Thu, 17 May 2007, Guido van Rossum wrote:
> I have accepted PEP 3131.

I'm surprised that this happened so quickly.  I oppose this proposal
quite strongly.

Currently Python has the property that the character set is a fully
known quantity.  There currently exists a choice of keyboard, a choice
of editor, and a set of literacy skills that is sufficient for any
Python code in the world.

Adopting PEP 3131 destroys this property.  It is not just that
particular communities (e.g. English speakers) will be unable to
understand code by other particular communities (e.g. Japanese
speakers); that is relatively minor and arguably already the case.
The real problem is that it will be impossible for *anyone*, no matter
what their background, to acquire the resources necessary to handle
all Python code.  There will exist no keyboard that enables one to
edit any Python program, and probably no editor.  There will not be a
single human being alive who can know or recognize the whole character
set.  Using APIs in a few different languages would yield a program
that no one could understand.

Today, if a non-English speaker asks you how to learn Python, you can
answer that question.  You can explain Python's syntax and semantics,
and tell them they need to know the 26 letters of the Roman alphabet.
After PEP 3131, you won't be able to answer their question -- because
it will be impossible for any human being to enumerate, let alone
possess, the knowledge required to read an arbitrary piece of Python
code.

PEP 3131 will also cause problems for code review.  Because many
characters have indistinguishable appearances, there will be no
mapping between what you see when you look at code and what the code
actually says.  So it will no longer be possible to look at a piece of
Python code on your screen or on paper and be sure you know what it
means, or even know that it is valid Python syntax.  It will be much
easier to write programs that look right but do the wrong thing, which
is particularly bad if you are concerned with security.

I like the idea that, after studying and working with Python for a
modest amount of time, one can acquire a complete understanding of the
language that affords confidence in the ability to read arbitrary
programs written in Python, make changes to anything written in Python,
and reuse any libraries or modules written in Python.  (It is for the
same reason that Python has a small and limited set of keywords that
Python should have a small character set.)  I don't like how PEP 3131
would not only take such abilities away from me, but remove them from
the realm of possibility altogether.

Of course, nothing stops one from creating a new language (say,
"UniPython") that consists of Python with Unicode identifiers.  One
could even write a translator from UniPython to Python, thus making it
straightforward to run UniPython programs.  But it would be much
better for this to be a separate language that no one is expected to
fully understand, so that Python can remain a language that one *can*
fully understand.


-- ?!ng

From santagada at gmail.com  Wed May 23 04:56:23 2007
From: santagada at gmail.com (Leonardo Santagada)
Date: Tue, 22 May 2007 23:56:23 -0300
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
Message-ID: <B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>


On 22/05/2007, at 19:08, Ka-Ping Yee wrote:

>
> Currently Python has the property that the character set is a fully
> known quantity.  There currently exists a choice of keyboard, a choice
> of editor, and a set of literacy skills that is sufficient for any
> Python code in the world.
>
No, any Python program can extend itself to infinity, as Python is a
Turing-complete language; we probably don't even have a way to say if
a Python program ever stops. So saying that you now possess the
capability to understand every Python program is being a little too
confident in your hacking skillz :)

> There will exist no keyboard that enables one to
> edit any Python program, and probably no editor.
Yes you can: using only a hex editor and simple cut and paste (or
copying hex codes onto paper and re-typing them back in) you can edit
any Python code.


> Today, if a non-English speaker asks you how to learn Python, you can
> answer that question.  You can explain Python's syntax and semantics,
> and tell them they need to know the 26 letters of the Roman alphabet.
Have you ever explained that to someone? "You need to know only the  
26 letters of the alphabet, plus _+=-{}[]()_0123456789!@#%^*><,./?\"  
Really? And I probably missed a lot of stuff in there. The syntax
rules continue to be as simple as ever; identifiers can contain
whatever characters the user knows (and all the ones he doesn't know,
but then he is not going to be able to read those anyway).

> PEP 3131 will also cause problems for code review.  Because many
> characters have indistinguishable appearances, there will be no
> mapping between what you see when you look at code and what the code
> actually says.
This was already discussed: if your font shows the same symbol for
different characters, it is not a problem with Python but with the
font. Then there are the different chars in Unicode that are really
supposed to be the same; you need to know the context of the
expression to know their meaning, and again that is not a Python
problem but maybe a Unicode problem. I like to think it is a cultural
problem, and we have to learn to live with it.

> so that Python can remain a language that one *can*
> fully understand.

I know I'm being picky, but think about this again for a bit: you are
probably afraid of this change, but really there is nothing to be
afraid of. If some code is written in a language you don't understand,
or in an encoding that your editor doesn't know how to handle, you
will not be able to edit it. But then again, this is probably already
true with the # -*- coding: ... -*- declaration and with lots of
crappy editors that don't even know how to handle UTF-8.

--
Leonardo Santagada
santagada at gmail.com




From python at zesty.ca  Wed May 23 05:30:03 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Tue, 22 May 2007 22:30:03 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
Message-ID: <Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>

On Tue, 22 May 2007, Leonardo Santagada wrote:
> > Today, if a non-English speaker asks you how to learn Python, you can
> > answer that question.  You can explain Python's syntax and semantics,
> > and tell them they need to know the 26 letters of the Roman alphabet.
> Have you ever explained that to someone? "You need to know only the
> 26 letters of the alphabet, plus _+=-{}[]()_0123456789!@#%^*><,./?\"
> Really? And i probably missed a lot of stuff in there.

Except for those with disabilities, every Python programmer today can
easily recognize, read, write, type, and speak every character in the
syntax character set.

Python fits your brain.  Let's keep it that way.

> > PEP 3131 will also cause problems for code review.  Because many
> > characters have indistinguishable appearances, there will be no
> > mapping between what you see when you look at code and what the code
> > actually says.
>
> This was already discussed, if your font has the same symbol for
> different characters it is not a problem with python, but with the
> font. Then there is the different chars in unicode that are really
> suposed to be the same, then you need to know the context of the
> expression to know their meaning and then again this is not a python
> problem, maybe a unicode problem, I like to think this is a cultural
> problem, and we have to learn to live with it.

Assigning blame elsewhere will not make the problem go away.  We do
not incorporate buggy libraries into the Python core and then absolve
ourselves by pointing fingers at the library authors; we should not
incorporate the complicated and unsolved problems of international
character sets into the language syntax definition, thereby turning
them from problems with Unicode to problems with Python.


-- ?!ng

From showell30 at yahoo.com  Wed May 23 05:25:45 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Tue, 22 May 2007 20:25:45 -0700 (PDT)
Subject: [Python-3000] please keep open() as a builtin,
	and general concerns about Py3k complexity
Message-ID: <601776.94139.qm@web33506.mail.mud.yahoo.com>

Hi, this is my first post to the list.  My name is
Steve Howell, and I currently work on a system,
largely written in Python, that processes a billion
transactions per year.  On the opposite side of the
spectrum, I've also had experience in classrooms
using Python as a teaching tool.

In the system I've worked on for the last three years,
we have at least 200 calls to the builtin open()
method.  Ironically, to compile that stat, I wrote a
tiny Python program that used open() as a builtin.

So I'm -201 on the proposal to eliminate it as a
builtin.  I understand the original justification for
the proposal--that it helps you identify modules that
do I/O--but I don't find it difficult in practice to
find modules that use I/O, and I definitely work with
a large enough code base where that comes up.

Although the open() debate seems to have died out, I'd
like to reply to Raymond Hettinger's observation that
"Taking a more global viewpoint, I'm experiencing a
little FUD about Py3k."  I think he's on to something.
 I've been following the Py3k discussions for several
months, and I find myself frequently feeling very
bewildered about the new features being proposed, even
though I'm hardly a newbie.

FWIW one of my favorite accepted PEPs is PEP 3111,
"Simple input built-in in Python 3000."  BTW  it's the
only 3000 series PEP with the word "simple" in the
title.  I realize looking at PEPs and mailing list
archives can skew an outsider's view of how well Py3K
simplifies the language, since simple ideas often
don't require PEPs, and complex ideas often lead to
lengthier debates than simple ones, but I'm not
feeling the simplicity.






 

From guido at python.org  Wed May 23 06:20:31 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 22 May 2007 21:20:31 -0700
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
Message-ID: <ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>

On 5/22/07, Ka-Ping Yee <python at zesty.ca> wrote:
> Python fits your brain.  Let's keep it that way.

I'm sorry, Ping, but you sound just like I was feeling about the PEP
at the start (and many others were too). You missed a bunch of
enlightening posts from people with quite a different perspective.

In particular, very helpful were a couple of reports from the Java
world, where Unicode letters in identifiers have been legal for a long
time now. (JavaScript also supports this BTW.) The Java world has not
fallen apart, but Java programmers in countries where English is not
spoken regularly between programmers (e.g. Japan) find it very helpful
to be able to communicate with each other through identifiers in their
own language. Remember the mantra that *human* readability of code is
important? Well, it helps if your code can use at least some of the
language spoken by those humans.

Of course, even Japanese programmers must master *some* English -- the
standard library and the language keywords are still in English, and
they are okay with that. But the code they write for each other to
read will be more readable *to them* if they don't have to resort to
Latin transliterations of Japanese words. Because that's what they do
today. And they don't like it. Their code is already unreadable for us
(for me, anyway :-) -- their comments are in Japanese (that's legal
today) and so are their output messages (that's also legal today).

My own personal example would be a program calculating Dutch income
tax -- I'd be crazy trying to translate the Dutch tax-technical terms
into English, and since the idiosyncrasies of taxes are utterly
localized, there would be no use for my program in other countries.
Now Dutch can (for the most part, without much loss of readability) be
written in ASCII, but the same idea of course applies to any
application of local law, customs etc.

Of course, for the standard library, there's a strict style rule
requiring only ASCII in identifiers, and using English for names,
comments and messages. A similar style guide is likely to be adopted
by other global open source projects. But there are lots of regional
open source projects too, and they can standardize on a different
common language.

Will there be occasional pain when someone writes a useful hack using
their local language and finds they have to translate it to English in
order to open source it? Sure. But the pain already exists if they
chose to use their own language for comments, messages, or even
identifiers (transliterated to the Latin alphabet). I don't expect
there to be much additional pain.

> > > PEP 3131 will also cause problems for code review.  Because many
> > > characters have indistinguishable appearances, there will be no
> > > mapping between what you see when you look at code and what the code
> > > actually says.

I trust most programmers to *want* to write clear code, so they will
steer clear from such things. If someone wants to obfuscate their code
they already have plenty of opportunities (even in Python!). The
problem is no worse than the lack of difference between 1 and l in
some fonts, and between l and I in others (and there are even fonts
where o and 0 look the same).

> Assigning blame elsewhere will not make the problem go away.

You may be misunderstanding the enthusiasm of your respondent.

> We do
> not incorporate buggy libraries into the Python core and then absolve
> ourselves by pointing fingers at the library authors; we should not
> incorporate the complicated and unsolved problems of international
> character sets into the language syntax definition, thereby turning
> them from problems with Unicode to problems with Python.

Yes, Unicode has its problems (so does ASCII BTW). But they can be
solved (see: Java and JavaScript). The Unicode standard also has some
guidelines. Solutions are actively being discussed in this list. If
you have any experience with other languages or fonts, please help. We
should probably be conservative; I'm not too hopeful about support for
right-to-left alphabets for example. But we can do better than ASCII
(or Latin-1, which is much worse).

Cheers,

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Wed May 23 06:43:17 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 22 May 2007 21:43:17 -0700
Subject: [Python-3000] [Python-Dev] Introduction and request for commit
	access to the sandbox.
In-Reply-To: <acd65fa20705221335i1dc81496h7f6d168472adb170@mail.gmail.com>
References: <acd65fa20705201428y69c90329i84602ecee5b9cf8e@mail.gmail.com>
	<465123A9.8090500@v.loewis.de>
	<acd65fa20705221335i1dc81496h7f6d168472adb170@mail.gmail.com>
Message-ID: <ee2a432c0705222143g2050bd11u4ec4f510b1f7c3a2@mail.gmail.com>

On 5/22/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
>
> As you see, cStringIO's code also needs a good cleanup to make it,
> at least, conforms to PEP-7.

Alexandre,

It would be great if you could break up unrelated changes into
separate patches.  Some of these can go in sooner rather than later.
I don't know all the things that need to be done, but I could imagine
a separate patch for each of:

 * whitespace normalization
 * function name modification
 * other formatting changes
 * bug fixes
 * changes to make consistent with StringIO

I don't know if all those items in the list need to change, but that's
the general idea.  Separate patches will make it much easier to review
and get benefits from your work earlier.

I look forward to seeing your work!

n

From stephen at xemacs.org  Wed May 23 07:05:05 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 23 May 2007 14:05:05 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
	<f27rmv$k1d$1@sea.gmane.org> <464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
Message-ID: <87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>

Jim Jewett writes:

 > On 5/22/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
 > 
 > > That's why Java and C++ use \u, so you would write L\u00F6wis
 > > as an identifier. ...
 > > I think you are really arguing for \u escapes in identifiers here.
 > 
 > Yes, that is effectively what I was suggesting.
 > 
 > > *This* is truly unambiguous. I claim that it is also useless.
 > 
 > It means users could see the usability benefits of PEP3131, but the
 > python internals could still work with ASCII only.

But this reasoning is not coherent.  Python internals will have no
problems with non-ASCII; in fact, they would have no problems with
tokens containing Cf characters or even reserved code points.  Just
give an unambiguous grammar for tokens composed of code points.  It's
only when a human enters the loop (ie, presentation of the identifier
on an output stream) that they cause problems.

It's *users* who are at risk, not the Python translator, and if there
are any usability benefits to be taken advantage of by *presenting*
identifiers that don't stick to ASCII, the risks of confusing or
deliberately obfuscated code inhere in that very presentation.  Not in
the internals.  For example:

 > It simplifies checking for identifiers that *don't* stick to ASCII,

Only if you assume that people will actually perceive the 10-character
string "L\u00F6wis" as an identifier, regardless of the fact that any
programmable editor can be trained to display the 5-character string
"L?wis" in a very small amount of code.  Conversely, any programmable
editor can easily be trained to take the internal representation
"L?wis" and display it as "L\u00F6wis", giving all the benefits of the
representation you propose.  But who would ever enable it?  (I suppose
this is what Martin means by "useless".)

 > which reduces some of the concerns about confusable characters, and
 > which ones to allow.

For the reasons given above, it reduces no concerns at all, except to
the extent that it makes use of human-readable identifiers as Python
identifiers inconvenient.

I conclude that IMO PEP 3131 is precisely correct in scope as far as
it goes.  The only issues PEP 3131 should be concerned with *defining*
are those that cause problems with canonicalization, and the range of
characters and languages allowed in the standard library.

I propose it would be useful to provide a standard mechanism for
auditing the input stream.  There would be one implementation for the
stdlib that complains[1] about non-ASCII characters and possibly
non-English words, and IMO that should be the default (for the reasons
Ka-Ping gives for opposing the whole PEP).  A second one should
provide a very conservative Unicode set, with provision for amendment
as experience shows restriction to be desirable or extension to be
safe.  A third, allowing any character that can be canonicalized into
the form that PEP 3131 allows internally, is left as an exercise for
the reader wild 'n' crazy enough to want to use it.
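
As a strawman, that first auditor could be little more than this rough
sketch (it assumes a tokenizer that already accepts PEP 3131
identifiers, which today's 2.x tokenize does not):

    import tokenize

    def audit(filename):
        # Report every identifier token that doesn't stick to ASCII.
        f = open(filename)
        try:
            for tok in tokenize.generate_tokens(f.readline):
                toktype, tokstr, (row, col) = tok[0], tok[1], tok[2]
                if toktype == tokenize.NAME and any(ord(c) > 127 for c in tokstr):
                    print '%s:%d:%d: non-ASCII identifier %r' % (
                        filename, row, col, tokstr)
        finally:
            f.close()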

For user convenience, it would be nice if these were implemented using
the codec interface, although if applied to raw input there would need
to be some duplication of parsing logic (specifically, comments and
strings would have to be passed unchecked).  I suppose it would be too
expensive to use the codec interface at the point of interning an
identifier (but maybe not, since it only needs to happen when adding
an identifier to the symbol table; later occurrences would be
short-circuited by probing the table and finding the token there).


Footnotes: 
[1]  I'm not sure what "complain" would mean in practice, since the
PEP acknowledges use cases for both non-ASCII and non-English in the
stdlib.


From nnorwitz at gmail.com  Wed May 23 07:13:46 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 22 May 2007 22:13:46 -0700
Subject: [Python-3000] please keep open() as a builtin,
	and general concerns about Py3k complexity
In-Reply-To: <601776.94139.qm@web33506.mail.mud.yahoo.com>
References: <601776.94139.qm@web33506.mail.mud.yahoo.com>
Message-ID: <ee2a432c0705222213n27a193d9lba68f6678de63e00@mail.gmail.com>

On 5/22/07, Steve Howell <showell30 at yahoo.com> wrote:
>
> In the system I've worked on for the last three years,
> we have at least 200 calls to the builtin open()
> method.

This number is meaningless by itself.  200 calls in how many lines of code?
How many files total and how many files use open?

I'm not sure if the numbers are useful, but if it's only used in 0.1%
of the modules, that's not a strong case for keeping it.

> FWIW one of my favorite accepted PEPs is PEP 3111,
> "Simple input built-in in Python 3000."  BTW  it's the
> only 3000 series PEP with the word "simple" in the
> title.

:-)  This PEP really just restores (raw_)input, so it mostly keeps
the status quo.  The name raw_input goes away; there is only input(),
which behaves the same as raw_input() in 2.x.
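
That is, in 3.0 (a hypothetical interactive example, not taken from
the PEP):

    name = input('Your name? ')     # returns the raw string, like 2.x raw_input()
    age = int(input('Your age? '))  # convert explicitly if you want a number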

>  I realize looking at PEPs and mailing list
> archives can skew an outsider's view of how well Py3K
> simplifies the language, since simple ideas often
> don't require PEPs, and complex ideas often lead to
> lengthier debates than simple ones, but I'm not
> feeling the simplicity.

Sure, that's understandable.  For the most part, the PEPs are about
adding new features, not about removing warts and cruft.  PEP 3100 is
the primary PEP which has info about removals.

I'll pull some stats from the Misc/NEWS file which (hopefully)
contains most of what's been done to date.

At least 7 builtins have been removed.  I expect at least 2-3 more
will be removed completely.  There will probably be ~5 others that are
not used frequently which will be moved elsewhere (e.g., intern was
already moved to sys).

1 package (compiler), 11 platform-independent modules, and probably
~20 platform-dependent modules have been removed.  I'd expect another
5-10 platform-independent modules will be removed.

Details from Misc/NEWS:

Core:
------

- Absolute import is the default behavior for 'import foo' etc.

- Removed support for syntax:
  backticks (ie, `x`), <>

- Removed these Python builtins:
  apply(), callable(), coerce(), file()

- Removed these Python methods:
  {}.has_key

- Removed these opcodes:
  BINARY_DIVIDE, INPLACE_DIVIDE, UNARY_CONVERT

- Remove C API support for restricted execution.

Library
-------

- Remove the imageop module.  Obsolete long with its unit tests becoming
  useless from the removal of rgbimg and imgfile.

- Removed these attributes from Python modules:
  * operator module: div, idiv, __div__, __idiv__, isCallable, sequenceIncludes

- Remove the compiler package.  Use of the _ast module and (an eventual)
  AST -> bytecode mechanism.

- Removed these modules:
  * Bastion, bsddb185, exceptions, md5, popen2, rexec,
    sets, sha, stringold, strop, xmllib

- Remove obsolete IRIX modules: al/AL, cd/CD, cddb, cdplayer, cl/CL, DEVICE,
  ERRNO, FILE, fl/FL, flp, fm, GET, gl/GL, GLWS, IN, imgfile, IOCTL, jpeg,
  panel, panelparser, readcd, sgi, sv/SV, torgb, WAIT.

- Remove obsolete functions:
  * commands.getstatus(), os.popen*,

- Remove functions in the string module that are also string methods.

- Remove support for long obsolete platforms: plat-aix3, plat-irix5.

- Remove xmlrpclib.SlowParser.  It was based on xmllib.


C API
-----

- Removed these Python slots:
  __coerce__, __div__, __idiv__, __rdiv__

- Removed these C APIs:
  PyNumber_Coerce(), PyNumber_CoerceEx()

- Removed these C slots/fields:
  nb_divide, nb_inplace_divide

- Removed these macros:
  staticforward, statichere, PyArg_GetInt, PyArg_NoArgs

- Removed these typedefs:
  intargfunc, intintargfunc, intobjargproc, intintobjargproc,
  getreadbufferproc, getwritebufferproc, getsegcountproc, getcharbufferproc

I'm pretty sure there is a lot missing from this list of removals.  I
also know there will be more coming. :-)

There will also be reorganizations that help reduce some conceptual
overhead.  So even though the standard library won't necessarily get
smaller, it will be easier for new people to ignore sections they
aren't interested in.  For example, database modules or web libraries.

n

From showell30 at yahoo.com  Wed May 23 07:45:15 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Tue, 22 May 2007 22:45:15 -0700 (PDT)
Subject: [Python-3000] please keep open() as a builtin,
	and general concerns about Py3k complexity
In-Reply-To: <ee2a432c0705222213n27a193d9lba68f6678de63e00@mail.gmail.com>
Message-ID: <250008.69531.qm@web33509.mail.mud.yahoo.com>


--- Neal Norwitz <nnorwitz at gmail.com> wrote:

> On 5/22/07, Steve Howell <showell30 at yahoo.com>
> wrote:
> >
> > In the system I've worked on for the last three
> years,
> > we have at least 200 calls to the builtin open()
> > method.
> 
> This number is meaningless by itself.  200 calls in
> how many lines of code?
> How many files total and how many files use open?
> 
> I'm not sure if the numbers are useful, but if it's
> only used in 0.1%
> of the modules, that's not a strong case for keeping
> it.
> 

17.7% of the files I searched have calls to open().

980 source files
174 files call open()
242898 lines of code
305 calls to open()

This is the quick and dirty Python code to compute
these stats, which has a call to the open() builtin.


    import os
    fns = []
    for dir in ('/ts-qa51', '/ars-qa12', '/is-qa7'):
        cmd = "cd %s && find . -name '*.py'" % dir
        output = os.popen(cmd).readlines()
        fns += [os.path.join(dir, line[2:]) for
                line in output]
        fns = [fn.strip() for fn in fns]

    numSourceFiles = len(fns)
    print '%d source files' % numSourceFiles
    loc = 0
    filesWithBuiltin = 0
    openLines = 0
    for fn in fns:
        fn = fn.strip()
        lines = open(fn).readlines()
        loc += len(lines)
        hasBuiltin = False
        for line in lines:
            if ' open(' in line:
                hasBuiltin = True
                openLines += 1
        if hasBuiltin:
            filesWithBuiltin += 1

    print '%d files call open()' % filesWithBuiltin
    print '%d lines of code' % loc
    print '%d calls to open()' % openLines



       

From gproux+py3000 at gmail.com  Wed May 23 07:48:03 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Wed, 23 May 2007 14:48:03 +0900
Subject: [Python-3000] please keep open() as a builtin,
	and general concerns about Py3k complexity
In-Reply-To: <250008.69531.qm@web33509.mail.mud.yahoo.com>
References: <ee2a432c0705222213n27a193d9lba68f6678de63e00@mail.gmail.com>
	<250008.69531.qm@web33509.mail.mud.yahoo.com>
Message-ID: <19dd68ba0705222248w69697069u704988fa6a333db0@mail.gmail.com>

On 5/23/07, Steve Howell <showell30 at yahoo.com> wrote:
> 17.7% of the files I searched have calls to open().

My understanding is that the mythical "python 2.x -> 3.0" tool will
automatically migrate your code by using the AST to find all
references to "open" and, when finding one, add the correct import and
replace the open call with io.open.
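
Something along those lines, I'd guess -- not the real fixer, just a
toy sketch using the 2.x compiler package to show the AST idea:

    import compiler
    from compiler import ast

    def find_open_calls(filename):
        # Return the line numbers of every call to a bare name 'open'.
        tree = compiler.parseFile(filename)
        calls = []
        def visit(node):
            if (isinstance(node, ast.CallFunc)
                    and isinstance(node.node, ast.Name)
                    and node.node.name == 'open'):
                calls.append(node.lineno)
            for child in node.getChildNodes():
                visit(child)
        visit(tree)
        return calls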

Regards,

Guillaume

From showell30 at yahoo.com  Wed May 23 08:01:43 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Tue, 22 May 2007 23:01:43 -0700 (PDT)
Subject: [Python-3000] please keep open() as a builtin,
	and general concerns about Py3k complexity
In-Reply-To: <19dd68ba0705222248w69697069u704988fa6a333db0@mail.gmail.com>
Message-ID: <418292.50469.qm@web33510.mail.mud.yahoo.com>


--- Guillaume Proux <gproux+py3000 at gmail.com> wrote:

> On 5/23/07, Steve Howell <showell30 at yahoo.com>
> wrote:
> > 17.7% of the files I searched have calls to
> open().
> 
> My understanding is that the mythical "python 2.x ->
> 3.0" tool will
> automatically migrate your code by using the AST to
> find all
> references to "open" and  when finding one, add the
> correct import and
> replace the open by the io.open call
> 

Agreed, but my concern isn't the conversion itself.  I
just want open() to stay as a builtin.  In simple
throwaway programs I appreciate the convenience, and
in larger programs I appreciate not having to
context-switch from the problem at hand to put an
"import" at the top.  

But since you mentioned conversion, our system is a
good example of a shop that will be running multiple
versions of Python side by side for many years.  We'll
cut over new components to Py3k, and then we'll
gradually upgrade legacy components.  And, of course,
some of those components will want to use the same
common modules.




From nnorwitz at gmail.com  Wed May 23 08:03:29 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 22 May 2007 23:03:29 -0700
Subject: [Python-3000] please keep open() as a builtin,
	and general concerns about Py3k complexity
In-Reply-To: <19dd68ba0705222248w69697069u704988fa6a333db0@mail.gmail.com>
References: <ee2a432c0705222213n27a193d9lba68f6678de63e00@mail.gmail.com>
	<250008.69531.qm@web33509.mail.mud.yahoo.com>
	<19dd68ba0705222248w69697069u704988fa6a333db0@mail.gmail.com>
Message-ID: <ee2a432c0705222303i279f4f4hb563d984a4e58c62@mail.gmail.com>

On 5/22/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> On 5/23/07, Steve Howell <showell30 at yahoo.com> wrote:
> > 17.7% of the files I searched have calls to open().
>
> My understanding is that the mythical "python 2.x -> 3.0" tool will
> automatically migrate your code by using the AST to find all
> references to "open" and  when finding one, add the correct import and
> replace the open by the io.open call

Sure a fixer would be written if this change was made.

I'm not sure from your comment about the tool being 'mythical' if you
meant to imply that this wasn't real.  Just in case there is any
doubt, it is alive and well and lives in the sandbox:

   http://svn.python.org/projects/sandbox/trunk/2to3/

There are currently fixers for:

apply, callable, dict, dummy, except, exec, has_key, input, intern,
long, ne, next, nonzero, numliterals, print, raise, raw_input, repr,
sysexcinfo, throw, tuple_params, unicode, ws_comma, xrange

I'm not sure if this is the best list to handle questions about what
does/doesn't exist for 3k.  However, I don't know of a better place to
discuss some of the transition issues.  If there are doubts about
what's being done, it would be great to raise them here and now, so we
can dispel any myths that might exist.

Other 3k status:
 * Most major changes have already been made
 * Biggest remaining change to the core language deals with
string-unicode unification
 * ~10 accepted PEPs have yet to be implemented (some have patches)
 * 8 PEPs have not been accepted or rejected yet
 * Re-organization of the standard library is starting to move forward a little
 * Doc needs lots of work, only some changes have been made
 * First alpha optimistically will ship within ~3 months

There are some issues with getting the alpha out within 3 months due
to finishing the important tasks (ie, people's availability).  So my
guess is that the alpha will slip a little.  str-uni needs to get
done.

We are running tests and building twice a day.  There is a single
failing test.  Generally all the tests are working.

n

From nnorwitz at gmail.com  Wed May 23 08:07:29 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 22 May 2007 23:07:29 -0700
Subject: [Python-3000] please keep open() as a builtin,
	and general concerns about Py3k complexity
In-Reply-To: <418292.50469.qm@web33510.mail.mud.yahoo.com>
References: <19dd68ba0705222248w69697069u704988fa6a333db0@mail.gmail.com>
	<418292.50469.qm@web33510.mail.mud.yahoo.com>
Message-ID: <ee2a432c0705222307t23dc8f73t4a15c7db867376f9@mail.gmail.com>

On 5/22/07, Steve Howell <showell30 at yahoo.com> wrote:
>
> But since you mentioned conversion, our system is a
> good example of a shop that will be running multiple
> versions of Python side by side for many years.  We'll
> cut over new components to Py3k, and then we'll
> gradually upgrade legacy components.  And, of course,
> some of those components will want to use the same
> common modules.

Once we get a solid 3.0 (probably in beta), we will focus more energy
on dealing with these sorts of problems.

I can see there being a compatibility module that could fix things up
to run with Python 2.x (*) - 3.0.  I don't know if that will be
distributed by the core or by a third party.  There are many people
that care about this issue.  It's not being forgotten.  We just
haven't gotten to it yet.  2.6 and 3.0 are a year away, probably more.

(*) probably between 2.2 and 2.4 depending on how hard it is to support.

Pretty soon I'll start focusing on getting 2.6 in shape to help ease
the transition.

n

From g.brandl at gmx.net  Wed May 23 08:21:10 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 23 May 2007 08:21:10 +0200
Subject: [Python-3000] please keep open() as a builtin,
 and general concerns about Py3k complexity
In-Reply-To: <418292.50469.qm@web33510.mail.mud.yahoo.com>
References: <19dd68ba0705222248w69697069u704988fa6a333db0@mail.gmail.com>
	<418292.50469.qm@web33510.mail.mud.yahoo.com>
Message-ID: <f30mh2$p33$1@sea.gmane.org>

Steve Howell schrieb:
> --- Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> 
>> On 5/23/07, Steve Howell <showell30 at yahoo.com>
>> wrote:
>> > 17.7% of the files I searched have calls to
>> open().
>> 
>> My understand is that the mythical "python 2.x ->
>> 3.0" tool will
>> automatically migrate your code by using the AST to
>> find all
>> references to "open" and  when finding one, add the
>> correct import and
>> replace the open by the io.open call
>> 
> 
> Agreed, but my concern isn't the conversion itself.  I
> just want open() to stay as a builtin.  In simple
> throwaway programs I appreciate the convenience, and
> in larger programs I appreciate not having to
> context-switch from the problem at hand to put an
> "import" at the top.  

ISTM that many modules using open() do also use os.path
utilities to create the filename given to open(). In that
case, you have an import statement in any case.

Georg


From showell30 at yahoo.com  Wed May 23 08:36:15 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Tue, 22 May 2007 23:36:15 -0700 (PDT)
Subject: [Python-3000] please keep open() as a builtin,
	and general concerns about Py3k complexity
In-Reply-To: <f30mh2$p33$1@sea.gmane.org>
Message-ID: <168303.61404.qm@web33510.mail.mud.yahoo.com>


--- Georg Brandl <g.brandl at gmx.net> wrote:

> ISTM that many modules using open() do also use
> os.path
> utilities to create the filename given to open(). In
> that
> case, you have an import statement in any case.
> 

Not the case for us:

  154 modules call only open()
  11 modules call only os.path.join()
  20 modules do call both

But to your larger point, 80 out of the 174 modules
that call open() do say "import os" for other reasons.




       

From stephen at xemacs.org  Wed May 23 09:36:24 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 23 May 2007 16:36:24 +0900
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
Message-ID: <87lkfg56vb.fsf@uwakimon.sk.tsukuba.ac.jp>

Guido van Rossum writes:
 > We should probably be conservative; I'm not too hopeful about
 > support for right-to-left alphabets for example.

I don't see what there is for *Python* to *support*.

My reasoning: bidi is entirely an issue of presentation; all Python
should do is prohibit[1] direction markers in identifiers.  To the
extent that we don't know of editors that can consistently[2] present
such identifiers as users would expect to see them, say bidi
identifiers should be avoided as a "best current practice".

AFAICS, PEP 3131 is going to work fine if we just delegate all the
problems that have been brought up to the development environment in
that way, except the important issues that Ka-Ping raises.  IMHO the
answer you gave is entirely satisfactory.


Footnotes: 
[1]   Or ignore, but I prefer prohibit because the bookkeeping
involved in ensuring that introspective output produces the identifier
that was read in from a file is unjustifiable overhead, and because
permitting them opens the door to "stupid bidi tricks" by authors (we
can't do anything about people who let their editors play stupid bidi
tricks on them).

[2]  Ie, so that different identifiers always look different, and the
same identifier is always presented in the same form.

From python at zesty.ca  Wed May 23 09:37:14 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Wed, 23 May 2007 02:37:14 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com> 
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org> 
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com> 
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>

I can see that I don't stand a very high chance of convincing you.
But I'd like to make sure you understand what I'm getting at, anyway.
(And I will get to some specific suggestions at the end of this
message.)

The key thing is that the language definition is about to transition
from something which has always "fit in your head", and which holds
that property as a core value, to something which cannot possibly fit
in anyone's head no matter how hard they try.  (This core value of
Python is not something I see as having been a core value of Java,
and it's one of the reasons I like Python better.)

> > PEP 3131 will also cause problems for code review.  Because many
> > characters have indistinguishable appearances, there will be no
> > mapping between what you see when you look at code and what the code
> > actually says.
>
> I trust most programmers to *want* to write clear code, so they will
> steer clear from such things. If someone wants to obfuscate their code
> they already have plenty of opportunities (even in Python!).

Indeed -- but that's not an argument for creating more opportunities.
For example, we like the fact that Python doesn't look like Perl; the
mere fact that some kinds of obfuscation are possible in Python
doesn't require us to give up on simplicity entirely and open the door
to a Perl-like proliferation of operators.

Not all programmers want to write clear code; from a security
perspective, the most important programmers are the ones who have an
incentive to fool you.  Unicode identifiers are a new avenue for any
insider who wants to use a Python program as a vector of attack; they
enable changes that are harder to detect, track down, and understand.

> The problem is no worse than the lack of difference between 1 and l in
> some fonts, and between l and I in others (and there are even fonts
> where o and 0 look the same).

It's far, far worse.  The number of ways in which characters can be
confused in Unicode is much greater.  There are many fonts you can
choose from that offer a clear visual difference between 1 and l and I,
whereas there are no fonts in the world that distinguish all the
identifier characters in Unicode.  More importantly, there probably
never will be.  It's not just incrementally harder to identify
characters; Unicode intends to make it impossible by design.

> Remember the mantra that *human* readability of code is
> important? Well, it helps if your code can use at least some of the
> language spoken by those humans.

Yes, a programming language is a communication medium among humans
and computers.  If you look at this as a communication medium, the
problem is that we're losing round-trip ability to human-readable media.

Suppose I hand you a printout of a Python program for you to review.
One of the questions you are faced with answering is, "Is this a valid
Python program?"  But your answer will necessarily be "I don't know",
for almost any program.  "I cannot possibly know" will be the only
truthful answer anyone can give.

Or suppose you are reading a book about Python and it shows you a bit
of code.  You want to type in the example -- but you cannot be sure
what you should type.

I don't deny that there is some convenience to be gained by those who
prefer to use other human languages when discussing and writing
programs.  But there is an extremely high cost to the language
definition.  With this definitional change, every Python program
that is displayed on a screen or printed on paper (or, in fact, in
any human-accessible representation) instantly becomes untrustworthy.

Another way to look at it is the computer science definition of a
language: what a language specifies is the set of acceptable programs.
So the purpose of a language is to restrict: to define the boundary
between what is in the language and what is not in the language.  But
that's just syntax; in addition, programming languages have semantics,
so the other half of the purpose is to give programs meaning for the
people who read them and construct compilers, interpreters, etc.  If
you put these two things together you get:

    The purpose of a programming language is to restrict the set
    of acceptable programs to a set that is small enough and simple
    enough that humans can agree on a clear meaning to each program.

Maybe this will help you see why I am so concerned about PEP 3131 --
in my judgement, it violates the fundamental purpose of a programming
language.  The big difference between natural languages and
programming languages is that it's okay for natural languages to be
fuzzy, but programs need to have exactly one meaning because they're
supposed to be operational.

                    *           *           *

Okay.  I've made my arguments, and I hope they will convince you.

But I recognize that they may not.  And if so, I have a couple of
suggestions for you to consider that might help address my concerns.

First: the "Common Objections" section of the PEP is too thin.  I'd
like the following arguments to be mentioned there for the record:

    1.  Python will lose the ability to make a reliable round trip
        between a computer file and any human-accessible medium
        such as a visual display or a printed page.

    2.  Python will become vulnerable to a new class of security
        exploits via the writing of misleading or malicious code
        that is visually indistinguishable from correct code.
        Consequently it will be more difficult for humans to
        inspect code and assure its correctness or trustworthiness.
        There is very little established best practice for
        addressing homograph security issues.

    3.  The Python language will become too large for any single
        person to fully know, in the sense that no human being can
        know the full character set, and therefore no one can ever
        acquire the ability to independently examine a program and
        decide whether it is valid Python.

    4.  Python programs that reuse other Python modules may come
        to contain a mix of character sets such that no one can
        fully read them or properly display them.

    5.  Unicode is young and unfinished.  As far as I know there
        are no truly complete Unicode fonts and there may not be
        for some time.  Tool support is weak.  The whole computer
        industry has 40 years of experience working with ASCII
        for everything, including programming languages; our
        experience with Unicode security issues and Unicode in
        programming languages is fairly immature.


Second: we need a way to be sure about the programs we're running.
So let the acceptance of Unicode identifiers be controlled by a
command-line flag, e.g. "python -U" accepts them, "python" alone
does not.  And let's keep the code for this feature clearly separated
so that one can be sure, with high confidence, that when this feature
is turned off, none of the code for Unicode identifiers will be
touched.  It should be possible to compile a Python that is incapable
of supporting Unicode identifiers.

Then people who want to use non-ASCII identifiers can do so, and
anyone can still run their programs if they want.  At the same time,
people who want to know exactly what their programs say can be
confident that Python is working with a small and manageable character
set.  And people who don't know or don't care about this change won't
suddenly have a whole new source of surprises thrust upon them; if
they know enough to know they want this feature, they can ask for it.

If we're going to introduce a significant new source of complexity,
let's at least make it easy to keep things simple (and reliably
simple) for those who want to do so; we can expect this to be the vast
majority, given interoperability and extensibility concerns, existing
industry practices, and the policy for the Python standard library.


What do you think?


-- ?!ng

From jcarlson at uci.edu  Wed May 23 10:29:57 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 23 May 2007 01:29:57 -0700
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
References: <ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
Message-ID: <20070523011101.85F0.JCARLSON@uci.edu>


Ka-Ping Yee <python at zesty.ca> wrote:
> I can see that I don't stand a very high chance of convincing you.
> But I'd like to make sure you understand what I'm getting at, anyway.
> (And I will get to some specific suggestions at the end of this
> message.)
> 
> The key thing is that the language definition is about to transition
> from something which has always "fit in your head", and which holds
> that property as a core value, to something which cannot possibly fit
> in anyone's head no matter how hard they try.  (This core value of
> Python is not something I see as having been a core value of Java,
> and it's one of the reasons I like Python better.)
[snip]
> If we're going to introduce a significant new source of complexity,
> let's at least make it easy to keep things simple (and reliably
> simple) for those who want to do so; we can expect this to be the vast
> majority, given interoperability and extensibility concerns, existing
> industry practices, and the policy for the Python standard library.
> 
> 
> What do you think?

For what it's worth, I've been wary of PEP 3131 for a while (if not
outright against it), for reasons ranging from identical character
glyph issues (which have been discussed off and on for at least a
year), to editing issues (being that I write and maintain a Python
editor), to code sharing issues (and the ghettoization of code, as Jim
Jewett calls it), to everything in between, and even things that we
haven't thought of.

Yes, PEP 3131 makes writing software in Python easier for some, but for
others, it makes maintenance of 3rd party code a potential nightmare
(regardless of 'community standards' to use ascii identifiers).


 - Josiah


From ian.bollinger at gmail.com  Wed May 23 12:03:43 2007
From: ian.bollinger at gmail.com (Ian D. Bollinger)
Date: Wed, 23 May 2007 06:03:43 -0400
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
Message-ID: <4654117F.9020901@gmail.com>

Ka-Ping Yee wrote:
>     2.  Python will become vulnerable to a new class of security
>         exploits via the writing of misleading or malicious code
>         that is visually indistinguishable from correct code.
>         Consequently it will be more difficult for humans to
>         inspect code and assure its correctness or trustworthiness.
>         There is very little established best practice for
>         addressing homograph security issues.
>   
Isn't it already easy enough to do that today?

 >>> import base64; exec base64.decodestring('cHJpbnQgJ0hlbGxvLCB3b3JsZCEn\n')
 ... Hello, world!

Admittedly, you could look for anything like that and be suspicious, but
running a program from an untrusted source is always going to be
dangerous.  For standalone applications, you can already do things like
compile malicious C extension modules that are impossible to verify.

As for programs that use Python for scripting, shouldn't it be up to
them to ensure that it runs in a restricted environment?  A browser, for
instance, would have to do that already.

- Ian D. Bollinger

From stephen at xemacs.org  Wed May 23 13:07:57 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 23 May 2007 20:07:57 +0900
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <20070523011101.85F0.JCARLSON@uci.edu>
References: <ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
	<20070523011101.85F0.JCARLSON@uci.edu>
Message-ID: <87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>

Josiah Carlson writes:

 > From identical character glyph issues (which have been discussed
 > off and on for at least a year),

In my experience, this is not a show-stopping problem.  Emacs/MULE has
had it for 20 years because of the (horrible) design decision to
attach charset information to each character in the representation of
text.  Thus, MULE distinguishes between NO-BREAK SPACE and NO-BREAK
SPACE (the same!) depending on whether the containing text "is" ISO
8859-15 or "is" ISO 8859-1.  (Semantically this is different from the
identical glyph, different character problem, since according to ISO
8859 those characters are identical.  However, as a practical matter,
the problem of detecting and dealing with the situation is the same,
since in MULE the character codes are different.)

How does Emacs deal with this?  Simple.  We provide facilities to
identify identical characters (not relevant to PEP 3131, probably), to
highlight suspicious characters (proposed, not actually implemented
AFAIK, since identification does what almost all users want), and to
provide information on characters in the editing buffer.  The
remaining problems with coding confusion are due to deficient
implementation (mea maxima culpa).

I consider this to be an editor/presentation problem, not a language
definition issue.

Note that Ka-Ping's worry about the infinite extensibility of Unicode
relative to any human being's capacity is technically not a problem.
You simply have your editor substitute machine-generated identifiers
for each identifier that contains characters outside of the user's
preferred set (eg, using hex codes to restrict to ASCII), then review
the code.  When you discover what an identifier's semantics are, you
give it a mnemonic name according to the local style guide.
Expensive, yes.  But cost is a management problem, not the kind of
conceptual problem Ka-Ping claims is presented by multilingual
identifiers.  Python is still, in this sense, a finitely generated
language.

 > to editing issues (being that I write and maintain a Python editor)

Multilingual editing (except for non-LTR scripts) is pretty much a
solved problem, in theory, although adding it to any given
implementation can be painful.  However, since there are many
programmer's editors that can handle multilingual text already, that
is not a strong argument against PEP 3131.

 > Yes, PEP 3131 makes writing software in Python easier for some, but for
 > others, it makes maintenance of 3rd party code a potential nightmare
 > (regardless of 'community standards' to use ascii identifiers).

Yes, there are lots of nightmares.  In over 15 years of experience
with multilingual identifiers, I can't recall any that have lasted
past the break of dawn, though.

I just don't see such identifiers very often, and when I do, they are
never hard to deal with.  Admittedly, I don't ever need to deal with
Arabic or Devanagari or Thai, but I'd be willing to bet I could deal
with identifiers in those languages, as long as the syntax is ASCII.

As for third party code, "the doctor says that if you put down that
hammer, your head will stop hurting".  If multilingual third party
code looks like a maintenance risk, don't deal with that third
party.[1]  Or budget for translation up front; translators are quite a
bit cheaper than programmers.

BTW, "find . -name '*.py' | xargs grep -l '[^[:ascii:]]'" is a pretty
cheap litmus test for your software vendors!  And yes, it *should* be
looking into strings and comments.  In practice (once I acquired a
multilingual editor), handling non-English strings and comments has
been 99% of the headache of maintaining code that contains non-ASCII.
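
For those who prefer to stay inside Python, a rough equivalent of that
litmus test might look like the sketch below.  It checks raw bytes, so
it flags any non-ASCII byte regardless of encoding, and it deliberately
looks at whole files -- strings and comments included -- for exactly
the reason given above.

    import os

    # List every .py file under the current directory that contains
    # any non-ASCII byte, whether in identifiers, strings or comments.
    for dirpath, dirnames, filenames in os.walk("."):
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                with open(path, "rb") as f:
                    data = f.read()
                if any(byte > 127 for byte in data):
                    print(path)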

I've been maintaining the edict.el library, an interface to Jim
Breen's Japanese-English dictionary EDICT for XEmacs for 10 years
(there was serious development activity for only about the first 2,
though).  A large fraction of the identifiers specific to that library
contain Japanese characters (both ideographic kanji and syllabic kana,
as well as the pseudo-namespace prefix "edict-" in ASCII).  There are
several Japanese identifiers in there whose meaning I still don't
know, except by referring to the code to see what it does (they're
technical terms in Japanese linguistics, I believe, and probably about
as intelligible to the layman as terms in Dutch tax law).  At the time
I started maintaining that library, I did so because I *couldn't read
Japanese* (obviously!).

This turned out to pose no problem.  Japanese identifiers were *not*
visually distinct to me, but when I needed to analyze a function, I
became familiar with the glyphs of related identifiers quickly.  And
having an intelligible name to start with wouldn't have helped much; I
needed to analyze the function because it wasn't doing what I wanted
it to do, not because I couldn't translate the name.

There are other packages in XEmacs which use non-ASCII, non-English
identifiers, but they are rare.  Maintaining them has never been
reported as a problem.

N.B.  This is limited experience with what many might characterize as
a niche language.  And I'm an idiosyncratic individual, blessed with a
reasonable amount of talent at language learning.  Both valid points.

However, I think the killer point in the above is the one about
strings and comments.  If you can discipline your team to write
comments and strings in ASCII/English, extending that to identifiers
is no problem.  If your team insists on multilingual strings/comments,
or needs them due to the task, multilingual identifiers will be the
least of your problems, and the most susceptible to technical solution
(eg, via identification and quarantine by cross-reference tables).

Granted, this is going to be a more or less costly transition for
ASCII-only Pythonistas.  I think we should focus on cost-reduction,
not on why it shouldn't happen.


Footnotes: 
[1]  Yes, I know, in the real world sometimes you have to.
Multilingual identifiers are the least of your worries when dealing
with a monopoly supplier.



From python at zesty.ca  Wed May 23 13:18:59 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Wed, 23 May 2007 06:18:59 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <4654117F.9020901@gmail.com>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com> 
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org> 
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com> 
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
	<4654117F.9020901@gmail.com>
Message-ID: <Pine.LNX.4.58.0705230542100.8399@server1.LFW.org>

On Wed, 23 May 2007, Ian D. Bollinger wrote:
> Ka-Ping Yee wrote:
> >     2.  Python will become vulnerable to a new class of security
> >         exploits via the writing of misleading or malicious code
> >         that is visually indistinguishable from correct code.
> >         Consequently it will be more difficult for humans to
> >         inspect code and assure its correctness or trustworthiness.
> >         There is very little established best practice for
> >         addressing homograph security issues.
> >
> Isn't it already easy enough to do that today?

There are two simultaneous errors in reasoning here.  First, the fact
that one can write confusing code today is not a reason to enable the
writing of even more confusing code.

Second, the Unicode identifier issue is different from the example you
give here.  In your example, it is obvious that the code is doing
something hard to understand; if I showed you something like this and
asked you what it did, you would think "hmm, that looks obfuscated":

>  >>> import base64; exec
> base64.decodestring('cHJpbnQgJ0hlbGxvLCB3b3JsZCEn\n')
> ... Hello, world!

But with Unicode identifiers you have no way to know even whether you
should be suspicious.  You would feel confident that you know what
a simple piece of code does, and yet be wrong.  For example, this
looks like a normal fragment of code:

    def remove_if_allowed(user, filename):
        allow = 1
        for group in disabled_groups:
            if user in group:
                allow = 0
        if allow:
            os.remove(filename)

But there is no way to tell by looking at it whether it works or not.
If all three occurrences of 'allow' are spelled with ASCII characters,
it will work.  If the second occurrence of 'allow' is spelled with a
Cyrillic 'a' (U+0430), you have a silent security hole.
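
To make the point concrete: the two spellings really are distinct
identifiers, and the only reliable way to tell them apart is to ask the
character database rather than your eyes.  A small sketch (the helper
name here is made up purely for illustration):

    import unicodedata

    def describe_nonascii(name):
        # Report every non-ASCII character in an identifier, by code
        # point and Unicode character name.
        for ch in name:
            if ord(ch) > 127:
                print("U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>")))

    spoofed = "\u0430llow"        # 'allow' spelled with a Cyrillic 'a'
    print(spoofed == "allow")     # False: it is a different identifier
    describe_nonascii(spoofed)    # U+0430 CYRILLIC SMALL LETTER A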

Now imagine that this is part of an open-source project that accepts
patches from the community, and senior developers check in the patches
after reviewing them.  The use of Unicode identifiers opens the door
for someone to introduce a security hole that is guaranteed to be
undetectable by reading the code, no matter how carefully anyone reads it.

Will this be caught?  Maybe someone will test the routine; maybe not.
Either way, it is clear that the reviewer's job has just gotten much
more difficult, and accepting patches is much more dangerous as a
result of PEP 3131.


-- ?!ng

From alexandre at peadrop.com  Wed May 23 16:01:11 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Wed, 23 May 2007 10:01:11 -0400
Subject: [Python-3000] [Python-Dev] Introduction and request for commit
	access to the sandbox.
In-Reply-To: <ee2a432c0705222143g2050bd11u4ec4f510b1f7c3a2@mail.gmail.com>
References: <acd65fa20705201428y69c90329i84602ecee5b9cf8e@mail.gmail.com>
	<465123A9.8090500@v.loewis.de>
	<acd65fa20705221335i1dc81496h7f6d168472adb170@mail.gmail.com>
	<ee2a432c0705222143g2050bd11u4ec4f510b1f7c3a2@mail.gmail.com>
Message-ID: <acd65fa20705230701q675d437cx5048518dccbe1d79@mail.gmail.com>

On 5/23/07, Neal Norwitz <nnorwitz at gmail.com> wrote:
> On 5/22/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> >
> > As you see, cStringIO's code also needs a good cleanup to make it,
> > at least, conforms to PEP-7.
>
> Alexandre,
>
> It would be great if you could break up unrelated changes into
> separate patches.  Some of these can go in sooner rather than later.
> I don't know all the things that need to be done, but I could imagine
> a separate patch for each of:
>
>  * whitespace normalization
>  * function name modification
>  * other formatting changes
>  * bug fixes
>  * changes to make consistent with StringIO
>
> I don't know if all those items in the list need to change, but that's
> the general idea.  Separate patches will make it much easier to review
> and get benefits from your work earlier.

I totally agree, and that was already my current idea.

> I look forward to seeing your work!

Thanks!

-- Alexandre

From jcarlson at uci.edu  Wed May 23 18:23:28 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 23 May 2007 09:23:28 -0700
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20070523011101.85F0.JCARLSON@uci.edu>
	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20070523082241.85F3.JCARLSON@uci.edu>


"Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> Josiah Carlson writes:
> 
>  > From identical character glyph issues (which have been discussed
>  > off and on for at least a year),
> 
> In my experience, this is not a show-stopping problem.

I never claimed that this, by itself, was a showstopper.

And my post should not be seen as a "these are all the problems that I
have seen with PEP 3131".  Those are merely the issues that have been
discussed over and over, for which I (and seemingly others) are still
concerned with, regardless of the hundreds of posts here and in
comp.lang.python seeking to convince us that "they are not a problem".

> Emacs/MULE has
> had it for 20 years because of the (horrible) design decision to
> attach charset information to each character in the representation of
> text.  Thus, MULE distinguishes between NO-BREAK SPACE and NO-BREAK
> SPACE (the same!) depending on whether the containing text "is" ISO
> 8859-15 or "is" ISO 8859-1.  (Semantically this is different from the
> identical glyph, different character problem, since according to ISO
> 8859 those characters are identical.  However, as a practical matter,
> the problem of detecting and dealing with the situation is the same as
> in MULE the character codes are different.)
> 
> How does Emacs deal with this?  Simple.  We provide facilities to
> identify identical characters (not relevant to PEP 3131, probably), to
> highlight suspicious characters (proposed, not actually implemented
> AFAIK, since identification does what almost all users want), and to
> provide information on characters in the editing buffer.  The
> remaining problems with coding confusion are due to deficient
> implementation (mea maxima culpa).
> 
> I consider this to be an editor/presentation problem, not a language
> definition issue.

This particular excuse pisses me off the most.  "If you can't
differentiate, then your font or editor sucks."  Thank you for passing
judgement on my choice of font or editor, but Ka-Ping already stated
why this argument is bullshit: there does not currently exist a font
where one *can* differentiate all the glyphs, and further, even if one
could visually differentiate similar glyphs, *remembering* the 64,000+
glyphs that are available in just the primary unicode plane to
differentiate them is a herculean task.

Never mind the fact that people use dozens, perhaps hundreds of
different editors to write and maintain Python code, which makes the
'Emacs works' argument poor at best.  Heck, Thomas Bushnell made the
same argument when I spoke with him 2 1/2 years ago (though he also
included Vim as an alternative to Emacs); it smelled like bullshit
then, and it smells like bullshit now.


> Note that Ka-Ping's worry about the infinite extensibility of Unicode
> relative to any human being's capacity is technically not a problem.
> You simply have your editor substitute machine-generated identifiers
> for each identifier that contains characters outside of the user's
> preferred set (eg, using hex codes to restrict to ASCII), then review
> the code.  When you discover what an identifier's semantics are, you
> give it a mnemonic name according to the local style guide.
> Expensive, yes.  But cost is a management problem, not the kind of
> conceptual problem Ka-Ping claims is presented by multilingual
> identifiers.  Python is still, in this sense, a finitely generated
> language.

That's a bullshit argument, and you know it.  "Just use hex escapes"? 
Modulo unicode comments and strings, all Python programs are easily read
in default fonts available on every platform on the planet today.  But
with 3131, people accepting 3rd party code need to break 15+ years of
"what you see is what is actually there" by verifying the character
content of every identifier?  That's a silly and unnecessary workload
addition for anyone who wants to accept patches from 3rd parties, and
relies on the same "your tools suck" argument to invalidate concerns
over unicode glyph similarity.

Speaking of which, do you know of a fixed-width font that is able to
allow for the visual distinction of all unicode glyphs in the primary
plane, or even the portion that Martin is proposing we support?  This
also "is not a show-stopper", but it certainly reduces audience
satisfaction by a large margin.


>  > to editing issues (being that I write and maintain a Python editor)
> 
> Multilingual editing (except for non-LTR scripts) is pretty much a
> solved problem, in theory, although adding it to any given
> implementation can be painful.  However, since there are many
> programmer's editors that can handle multilingual text already, that
> is not a strong argument against PEP 3131.

Another "your tools suck" argument.  While my editor has been able to
handle unicode content for a couple years now (supporting all encodings
available to Python), every editor that wants to properly support the
adding of unicode text in any locale will necessitate the creation of
charmap-like interfaces in basically every editor.

But really, I'm glad that Emacs works for you and has solved this
problem for you.  I honestly tried to use it 4 years ago, spent a couple
weeks with it.  But it didn't work for me, and I've spent the last 4
years writing an editor because it and the other 35 editors I tried at
the time didn't work for me (as have the dozens of others for the exact
same reason). But of course, our tools suck, and because we can't use
Emacs, we are already placed in a 2nd tier ghettoized part of the Python
community of "people with tools that suck".

Thank you for hitting home that unless people use Emacs, their tools
suck.  I still don't believe that my concerns have been addressed. And I
certainly don't believe that those Ka-Ping brought up (which are better
than mine) have been addressed.  But hey, my tools suck, so obviously my
concerns regarding using my tools to edit Python in the future don't
matter.  Thank you for the vote of confidence.


 - Josiah


From jimjjewett at gmail.com  Wed May 23 18:26:55 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 23 May 2007 12:26:55 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<4646FCAE.7090804@v.loewis.de> <f27rmv$k1d$1@sea.gmane.org>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>

On 5/23/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Jim Jewett writes:

>  > It simplifies checking for identifiers that *don't* stick to ASCII,

> Only if you assume that people will actually perceive the 10-character
> string "L\u00F6wis" as an identifier, regardless of the fact that any
> programmable editor can be trained to display the 5-character string
> "L?wis" in a very small amount of code.  Conversely, any programmable
> editor can easily be trained to take the internal representation
> "L?wis" and display it as "L\u00F6wis", giving all the benefits of the
> representation you propose.  But who would ever enable it?

I would.

I would like an alert (and possibly an import exception) on any code
whose *executable portion* is not entirely in ASCII.

Comments aren't a problem, unless they somehow erase or hide other
characters or line breaks.  Strings aren't a problem unless I evaluate
them.  Code ... I want to know if there is some non-ASCII.
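
A rough Python 3 sketch of the kind of alert I have in mind (just an
illustration, not a worked-out proposal; the audit helper is invented
for the example) could lean on the tokenize module, which already knows
which parts of a file are strings and comments:

    import io
    import tokenize

    def audit(source):
        # Flag non-ASCII characters that appear outside of strings and
        # comments, i.e. in the executable portion of the code.
        skip = {tokenize.STRING, tokenize.COMMENT,
                tokenize.NL, tokenize.NEWLINE}
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type in skip:
                continue
            if any(ord(ch) > 127 for ch in tok.string):
                print("line %d: suspicious token %r"
                      % (tok.start[0], tok.string))

    audit("allow = 1\n\u0430llow = 0\n")   # flags only the second line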

Even Latin-1 isn't much of a problem, except for single-quotes.  I do
want to know if 'abc' is a string or an identifier made with the
"prime" letter.

This might be an innocent cut-and-paste error (and how else would most
people enter non-native characters), but it is still a problem -- and
python would often create a new variable instead of warning me.

> The only issues PEP 3131 should be concerned with *defining*
> are those that cause problems with canonicalization, and the range of
> characters and languages allowed in the standard library.

Fair enough -- but the problem is that this isn't a solved issue yet;
the unicode group themselves make several contradictory
recommendations.

I can come up with rules that are probably just about right, but I
will make mistakes (just as the unicode consortium itself did, which
is why they have both ID and XID, and why both have stability
characters).  Even having read their reports, my initial rules would
still have banned mixed-script, which would have prevented your edict-
example.

So I'll agree that defining the charsets and combinations and
canonicalization is the right scope; I just feel that best practice
isn't yet clear enough.

> I propose it would be useful to provide a standard mechanism for
> auditing the input stream.  There would be one implementation for the
> stdlib that complains[1] about non-ASCII characters and possibly
> non-English words, and IMO that should be the default (for the reasons
> Ka-Ping gives for opposing the whole PEP).  A second one should
> provide a very conservative Unicode set, with provision for amendment
> as experience shows restriction to be desirable or extension to be
> safe.  A third, allowing any character that can be canonicalized into
> the form that PEP 3131 allows internally, is left as an exercise for
> the reader wild 'n' crazy enough to want to use it.

This might deal with my concerns.  It is a bit more complicated than
the current plans.

-jJ

From jimjjewett at gmail.com  Wed May 23 18:39:43 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 23 May 2007 12:39:43 -0400
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
Message-ID: <fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>

On 5/23/07, Ka-Ping Yee <python at zesty.ca> wrote:
> First: the "Common Objections" section of the PEP is too thin.  I'd
> like the following arguments to be mentioned there for the record:

>     4.  Python programs that reuse other Python modules may come
>         to contain a mix of character sets such that no one can
>         fully read them or properly display them.

4.a

Certain cut-and-paste errors (such as cutting from a word document
that uses "smart quotes") will change from syntax errors to silently
creating new identifiers.

>     5.  Unicode is young and unfinished.  As far as I know there
>         are no truly complete Unicode fonts and there may not be
>         for some time.  Tool support is weak.  The whole computer
>         industry has 40 years of experience working with ASCII
>         for everything, including programming languages; our
>         experience with Unicode security issues and Unicode in
>         programming languages is fairly immature.

5.a  Use of unicode for identifiers is not yet a resolved issue.  The
unicode consortium mostly recommends XID rather than the older ID;
both sets already have "stability characters" and canonicalization
concerns.  It isn't quite clear which marks/letters/scripts to leave
out.  (The recommendations conflict; other than ASCII-only, I'm not
sure I've found one yet that leaves out "letters" indistinguishable
(even in the reference font) from already-meaningful syntax
characters.)

We can make up our own answers, but if we do that... maybe we shouldn't rush.

-jJ

From guido at python.org  Wed May 23 18:45:46 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 23 May 2007 09:45:46 -0700
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
	<fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>
Message-ID: <ca471dc20705230945q6d9a31fas9e0011959fa1e643@mail.gmail.com>

On 5/23/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> Certain cut-and-paste errors (such as cutting from a word document
> that uses "smart quotes") will change from syntax errors to silently
> creating new identifiers.

Really? Are those quote characters considered letters by the Unicode standard?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From radix at twistedmatrix.com  Wed May 23 18:45:56 2007
From: radix at twistedmatrix.com (Christopher Armstrong)
Date: Wed, 23 May 2007 12:45:56 -0400
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
	<fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>
Message-ID: <60ed19d40705230945m558756day2863b81e38618747@mail.gmail.com>

On 5/23/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/23/07, Ka-Ping Yee <python at zesty.ca> wrote:
> > First: the "Common Objections" section of the PEP is too thin.  I'd
> > like the following arguments to be mentioned there for the record:
>
> >     4.  Python programs that reuse other Python modules may come
> >         to contain a mix of character sets such that no one can
> >         fully read them or properly display them.
>
> 4.a
>
> Certain cut-and-paste errors (such as cutting from a word document
> that uses "smart quotes") will change from syntax errors to silently
> creating new identifiers.

Is this actually true? Are the fancy quote characters really going to
be in the set of characters that would be valid in identifiers, as
proposed?


-- 
Christopher Armstrong
International Man of Twistery
http://radix.twistedmatrix.com/
http://twistedmatrix.com/
http://canonical.com/

From bwinton at latte.ca  Wed May 23 18:52:25 2007
From: bwinton at latte.ca (Blake Winton)
Date: Wed, 23 May 2007 12:52:25 -0400
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <Pine.LNX.4.58.0705230542100.8399@server1.LFW.org>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>	<4654117F.9020901@gmail.com>
	<Pine.LNX.4.58.0705230542100.8399@server1.LFW.org>
Message-ID: <46547149.8050304@latte.ca>

Ka-Ping Yee wrote:
 > But with Unicode identifiers you have no way to know even whether you
 > should be suspicious.  You would feel confident that you know what
 > a simple piece of code does, and yet be wrong.

Also, Jim Jewett wrote:
 > Strings aren't a problem unless I evaluate them.

a = """This string has a triple quote and a command in it. \"""
os.remove("*")
"""

If that \ is merely a unicode character that looks like \, you've just 
deleted your hard drive.  (To close it off, you could use """, where the 
middle quote is a unicode character that looks like ".)  Two strings, 
with some executable code in the middle, that look like one harmless 
string.

Actually, I think that could shorten down to:
a = """
os.remove("*")
"""
with the middle character of each """ not being a ".

My point here is that if you're confident that you know what a simple 
piece of code does, you're already wrong.  Unicode identifiers don't 
change that.

 > But there is no way to tell by looking at it whether it works or not.
 > If all three occurrences of 'allow' are spelled with ASCII characters,
 > it will work.  If the second occurrence of 'allow' is spelled with a
 > Cyrillic 'a' (U+0430), you have a silent security hole.

If you search for "allow", it'll only match the ones that actually 
match.  Yes, it makes patch reviewers jobs harder, or makes the tools 
they need to do their jobs need to be smarter.  No, I don't think it's 
as bad as you think it is.  And heck, if you're a patch reviewer, set 
the ASCII-only flag on your version of Python, or run a program before 
checking it in to flag non-ASCII characters, and reject all patches from 
that person in the future, since clearly they're a black hat.

Also, I find strangely amusing that complaints about characters that 
look the same as other characters come from someone named "?!ng".  :)

Later,
314|<3.


From jcarlson at uci.edu  Wed May 23 20:21:53 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 23 May 2007 11:21:53 -0700
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20070523011101.85F0.JCARLSON@uci.edu>
	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20070523111704.85FC.JCARLSON@uci.edu>


Removing those words that some found offensive, perhaps I will get a
response to the point of my post: "your tools aren't very good" and
"Emacs does it right" are not valid responses to the concerns brought up
regarding unicode.

"Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> Josiah Carlson writes:
> 
>  > From identical character glyph issues (which have been discussed
>  > off and on for at least a year),
> 
> In my experience, this is not a show-stopping problem.

I never claimed that this, by itself, was a showstopper.

And my post should not be seen as a "these are all the problems that I
have seen with PEP 3131".  Those are merely the issues that have been
discussed over and over, for which I (and seemingly others) are still
concerned with, regardless of the hundreds of posts here and in
comp.lang.python seeking to convince us that "they are not a problem".

> Emacs/MULE has
> had it for 20 years because of the (horrible) design decision to
> attach charset information to each character in the representation of
> text.  Thus, MULE distinguishes between NO-BREAK SPACE and NO-BREAK
> SPACE (the same!) depending on whether the containing text "is" ISO
> 8859-15 or "is" ISO 8859-1.  (Semantically this is different from the
> identical glyph, different character problem, since according to ISO
> 8859 those characters are identical.  However, as a practical matter,
> the problem of detecting and dealing with the situation is the same as
> in MULE the character codes are different.)
> 
> How does Emacs deal with this?  Simple.  We provide facilities to
> identify identical characters (not relevant to PEP 3131, probably), to
> highlight suspicious characters (proposed, not actually implemented
> AFAIK, since identification does what almost all users want), and to
> provide information on characters in the editing buffer.  The
> remaining problems with coding confusion are due to deficient
> implementation (mea maxima culpa).
> 
> I consider this to be an editor/presentation problem, not a language
> definition issue.

This particular excuse angers me the most.  "If you can't differentiate,
then your font or editor is garbage."  Thank you for passing judgement
on my choice of font or editor, but Ka-Ping already stated why this
argument isn't valid: there does not currently exist a font where one
*can* differentiate all the glyphs, and further, even if one could
visually differentiate similar glyphs, *remembering* the 64,000+ glyphs
that are available in just the primary unicode plane to differentiate
them is a herculean task.

Never mind the fact that people use dozens, perhaps hundreds of
different editors to write and maintain Python code, which makes the
'Emacs works' argument poor at best.  Heck, Thomas Bushnell made the
same argument when I spoke with him 2 1/2 years ago (though he also
included Vim as an alternative to Emacs); it smelled like garbage then,
and it smells like garbage now.


> Note that Ka-Ping's worry about the infinite extensibility of Unicode
> relative to any human being's capacity is technically not a problem.
> You simply have your editor substitute machine-generated identifiers
> for each identifier that contains characters outside of the user's
> preferred set (eg, using hex codes to restrict to ASCII), then review
> the code.  When you discover what an identifier's semantics are, you
> give it a mnemonic name according to the local style guide.
> Expensive, yes.  But cost is a management problem, not the kind of
> conceptual problem Ka-Ping claims is presented by multilingual
> identifiers.  Python is still, in this sense, a finitely generated
> language.

That's a poor argument, and you know it.  "Just use hex escapes"? Modulo
unicode comments and strings, all Python programs are easily read in
default fonts available on every platform on the planet today.  But with
3131, people accepting 3rd party code need to break 15+ years of "what
you see is what is actually there" by verifying the character content of
every identifier?  That's a silly and unnecessary workload addition for
anyone who wants to accept patches from 3rd parties, and relies on the
same "your tools are poor" argument to invalidate concerns over unicode
glyph similarity.

Speaking of which, do you know of a fixed-width font that is able to
allow for the visual distinction of all unicode glyphs in the primary
plane, or even the portion that Martin is proposing we support?  This
also "is not a show-stopper", but it certainly reduces audience
satisfaction by a large margin.


>  > to editing issues (being that I write and maintain a Python editor)
> 
> Multilingual editing (except for non-LTR scripts) is pretty much a
> solved problem, in theory, although adding it to any given
> implementation can be painful.  However, since there are many
> programmer's editors that can handle multilingual text already, that
> is not a strong argument against PEP 3131.

Another "your tools aren't very good" argument.  While my editor has
been able to handle unicode content for a couple years now (supporting
all encodings available to Python), every editor that wants to properly
support the adding of unicode text in any locale will necessitate the
creation of charmap-like interfaces in basically every editor.

But really, I'm glad that Emacs works for you and has solved this
problem for you.  I honestly tried to use it 4 years ago, spent a couple
weeks with it.  But it didn't work for me, and I've spent the last 4
years writing an editor because it and the other 35 editors I tried at
the time didn't work for me (as have the dozens of others for the exact
same reason). But of course, our tools suck, and because we can't use
Emacs, we are already placed in a 2nd tier ghettoized part of the Python
community of "people with tools that aren't Emacs".

Thank you for hitting home that unless people use Emacs, their tools
aren't sufficient for Python development. I still don't believe that my
concerns have been addressed. And I certainly don't believe that those
Ka-Ping brought up (which are better than mine) have been addressed.
 But hey, my tools aren't Emacs, so obviously my concerns regarding using
my tools to edit Python in the future don't matter.  Thank you for the
vote of confidence.


 - Josiah


From ntoronto at cs.byu.edu  Wed May 23 21:48:10 2007
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Wed, 23 May 2007 13:48:10 -0600
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <20070523111704.85FC.JCARLSON@uci.edu>
References: <20070523011101.85F0.JCARLSON@uci.edu>	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070523111704.85FC.JCARLSON@uci.edu>
Message-ID: <46549A7A.6000807@cs.byu.edu>

Josiah Carlson wrote:
> Thank you for hitting home that unless people use Emacs, their tools
> arent sufficient for Python development. I still don't believe that my
> concerns have been addressed. And I certainly don't believe that those
> Ka-Ping brought up (which are better than mine) have been addressed.
>  But hey, my tools aren't Emacs, so obviusly my concerns regarding using
> my tools to edit Python in the future don't matter.  Thank you for the
> vote of confidence.
>   

Though I don't develop an editor in my spare time, I had a similar 
reaction to the "Emacs does Unicode this way, which is correct" 
solutions. My favorite editor is going to have to get awfully smart.

It reminds me of some friction I experienced when trying out Lisp. It's 
fairly painful to program in Lisp without an editor that does 
paren-matching and automatic indentation. I tried Emacs, and I didn't 
like it, which is a shame because it's the One True Editor for 
programming in Lisp. I basically dropped Lisp over this issue.

In Lisp's case, the editor has to be smart because Lisp syntax is 
insufficient on its own to express program semantics *to a human*. 
(Every programming language has this problem to some extent, Lisp more 
than most because of all the parentheses and general lack of visual 
cues, and Python much less than most because of smart use of operators 
and syntactically significant whitespace.) This is a user interface 
problem for a *language*, so it rubs me the wrong way to have to have it 
solved by an *editor*.

Likewise, Unicode identifiers present numerous (detailed elsewhere) user 
interface problems. My general feeling is that language issues shouldn't 
be solved by editors. You should be able to comfortably change the 
semantics of a program with just about any text editor. Otherwise, we 
have a situation where some editors are blessed for use with the 
language and most are not, and if a would-be programmer's favorite isn't 
on the list, he leaves.

Neil


From python at zesty.ca  Wed May 23 23:11:00 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Wed, 23 May 2007 16:11:00 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <ca471dc20705230945q6d9a31fas9e0011959fa1e643@mail.gmail.com>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com> 
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org> 
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com> 
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org> 
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com> 
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org> 
	<fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>
	<ca471dc20705230945q6d9a31fas9e0011959fa1e643@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0705231554480.8399@server1.LFW.org>

On Wed, 23 May 2007, Guido van Rossum wrote:
> On 5/23/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > Certain cut-and-paste errors (such as cutting from a word document
> > that uses "smart quotes") will change from syntax errors to silently
> > creating new identifiers.
>
> Really? Are those quote characters considered letters by the Unicode standard?

According to the table at

    http://www.dcl.hpi.uni-potsdam.de/home/loewis/table-331.html

, the following quote-like characters are not identifier characters:

    U+2018 LEFT SINGLE QUOTATION MARK
    U+2019 RIGHT SINGLE QUOTATION MARK
    U+201C LEFT DOUBLE QUOTATION MARK
    U+201D RIGHT DOUBLE QUOTATION MARK

I believe these four are the "smart quotes" produced by Word.

But the following are identifier characters:

    U+02BB MODIFIER LETTER TURNED COMMA (same glyph as U+2018)
    U+02BC MODIFIER LETTER APOSTROPHE (same glyph as U+2019)
    U+02EE MODIFIER LETTER DOUBLE APOSTROPHE (same glyph as U+201D)
    U+0312 COMBINING TURNED COMMA ABOVE (same glyph as U+2018)
    U+0313 COMBINING COMMA ABOVE (same glyph as U+2019)
    U+0315 COMBINING COMMA ABOVE RIGHT (same glyph as U+2019)

So there are three sets of characters that look the same:

    U+02BB = U+0312 = U+2018
    U+02BC = U+0313 = U+0315 = U+2019
    U+02EE = U+201D

U+0312, U+0313, and U+0315 are combining characters that cause the
comma to appear over the preceding letter, and they are not allowed
to appear as the first character in an identifier.  So, if your
editor displays combining characters as properly combined, they will
not be confusable with quotation marks; otherwise, they could be.
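
For anyone who wants to double-check this kind of thing, the
unicodedata module reports the general category directly: the smart
quotes are punctuation (Pi/Pf), while the lookalikes are modifier
letters (Lm) or combining marks (Mn), which is what puts them on the
identifier side of the line.  A small sketch:

    import unicodedata

    for cp in (0x2018, 0x2019, 0x201C, 0x201D,   # smart quotes
               0x02BB, 0x02BC, 0x02EE,           # modifier letters
               0x0312, 0x0313, 0x0315):          # combining marks
        ch = chr(cp)
        print("U+%04X %-35s %s"
              % (cp, unicodedata.name(ch), unicodedata.category(ch)))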


-- ?!ng

From jimjjewett at gmail.com  Wed May 23 23:25:00 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 23 May 2007 17:25:00 -0400
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <ca471dc20705230945q6d9a31fas9e0011959fa1e643@mail.gmail.com>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
	<fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>
	<ca471dc20705230945q6d9a31fas9e0011959fa1e643@mail.gmail.com>
Message-ID: <fb6fbf560705231425q7c204f56r2820a6d58873c0ee@mail.gmail.com>

On 5/23/07, Guido van Rossum <guido at python.org> wrote:
> On 5/23/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > Certain cut-and-paste errors (such as cutting from a word document
> > that uses "smart quotes") will change from syntax errors to silently
> > creating new identifiers.

> Really? Are those quote characters considered letters by the Unicode standard?

I'm not certain which specific character MS Word uses for smart
quotes.  My best guess is that it is actually "PRIVATE USE 1", which
is supposed to be ignored (don't prevent it; just pretend it isn't
there).

My fears were heightened by
http://www.unicode.org/reports/tr31/tr31-8.html.  It discusses NFKC
canonicalization (though another tech report recommends NFKD).  If you
use NFKC, they say to modify it so that U+0374 ( ʹ ) GREEK NUMERAL SIGN
is not allowed, even though it folds to U+02B9 ( ʹ ) MODIFIER LETTER
PRIME, which they claim should be allowed.

Within the codepoints < 256, if we ban rather than ignore, the only
remaining problems are likely to be

(1)  that we must add _ as an allowed ID start, and
(2)  we must decide whether or not to allow the recommended

00AA          ; ID_Start # L&       FEMININE ORDINAL INDICATOR
00B5          ; ID_Start # L&       MICRO SIGN
00BA          ; ID_Start # L&       MASCULINE ORDINAL INDICATOR

(also in XID_START, and in the CONTINUE sets)
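
As a quick illustration of why those three are awkward (a sketch that
just queries the Unicode data from Python):

    import unicodedata

    # All three are in ID_Start, yet NFKC folds them to a plain 'a',
    # Greek mu, and 'o' respectively -- exactly the kind of
    # canonicalization wrinkle under discussion.
    for cp in (0x00AA, 0x00B5, 0x00BA):
        ch = chr(cp)
        print("U+%04X %-28s NFKC -> %r"
              % (cp, unicodedata.name(ch), unicodedata.normalize("NFKC", ch)))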

-jJ

From jason.orendorff at gmail.com  Wed May 23 23:19:57 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Wed, 23 May 2007 17:19:57 -0400
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <46549A7A.6000807@cs.byu.edu>
References: <20070523011101.85F0.JCARLSON@uci.edu>
	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070523111704.85FC.JCARLSON@uci.edu> <46549A7A.6000807@cs.byu.edu>
Message-ID: <bb8868b90705231419u1d471056ve0b11575dd95665d@mail.gmail.com>

This discussion is off the rails again.

I'm at least sympathetic to the spoofing argument, because theoretical
security concerns have a way of becoming serious practical concerns
overnight.

But I'm not sure what to make of the rest.  Other languages have had
this feature for many years.  The "numerous user interface problems"
do not seem to arise in practice.

-j

From python at zesty.ca  Thu May 24 00:02:01 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Wed, 23 May 2007 17:02:01 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <Pine.LNX.4.58.0705231554480.8399@server1.LFW.org>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com> 
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org> 
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com> 
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org> 
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com> 
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org> 
	<fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>
	<ca471dc20705230945q6d9a31fas9e0011959fa1e643@mail.gmail.com>
	<Pine.LNX.4.58.0705231554480.8399@server1.LFW.org>
Message-ID: <Pine.LNX.4.58.0705231636230.8399@server1.LFW.org>

On Wed, 23 May 2007, Ka-Ping Yee wrote:
> So there are three sets of characters that look the same:
>
>     U+02BB = U+0312 = U+2018
>     U+02BC = U+0313 = U+0315 = U+2019
>     U+02EE = U+201D

The Greek combining koronis, U+0343, is an allowed identifier
character and also looks identical to a single right quote,
U+02BC = U+0313 = U+0315 = U+0343 = U+2019.

> U+0312, U+0313, and U+0315 are combining characters that cause the
> comma to appear over the preceding letter, and they are not allowed
> to appear as the first character in an identifier.  So, if your
> editor displays combining characters as properly combined, they will
> not be confusable with quotation marks; otherwise, they could be.

I just realized that this is not the whole story.  There's no
requirement that a combining character has to actually come
after a character it can be combined with.  So there might be
valid identifiers containing sequences of characters that don't
have a sensible rendering, or that force the combining comma to
appear separately and thus indistinguishable from a quotation
mark even in a Unicode-aware editor.
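
A small sketch of that corner case, using str.isidentifier(), which
applies exactly the first-character rule described above:

    # U+0313 COMBINING COMMA ABOVE may not start an identifier, but it
    # is legal in any later position -- including after characters it
    # cannot sensibly combine with visually.
    print("\u0313abc".isidentifier())   # False
    print("a\u0313bc".isidentifier())   # True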


-- ?!ng

From python at zesty.ca  Thu May 24 00:35:52 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Wed, 23 May 2007 17:35:52 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org>
	<4646FCAE.7090804@v.loewis.de> <f27rmv$k1d$1@sea.gmane.org>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <Pine.LNX.4.58.0705231709231.8399@server1.LFW.org>

On Wed, 23 May 2007, Stephen J. Turnbull wrote:
>  > It means users could see the usability benefits of PEP3131, but the
>  > python internals could still work with ASCII only.
>
> But this reasoning is not coherent.  Python internals will have no
> problems with non-ASCII; in fact, they would have no problems with
> tokens containing Cf characters or even reserved code points.  Just
> give an unambiguous grammar for tokens composed of code points.  It's
> only when a human enters the loop (ie, presentation of the identifier
> on an output stream) that they cause problems.

You've got this backwards, and I suspect that's part of the root of
the disagreement.  It's not that "when humans enter the loop they
cause problems."  The purpose of the language is to *serve humans*.
Without humans, we would just use machine code instead of Python.
If it doesn't work for humans, it's not because the humans are broken;
it's because the language is broken.

The grammar has to be something a human can understand.

(And if 90%, or more than 50%, of the tools are "broken" with respect
to the language, that's a language problem, not just a tool problem.)

> I propose it would be useful to provide a standard mechanism for
> auditing the input stream.  There would be one implementation for the
> stdlib that complains[1] about non-ASCII characters and possibly
> non-English words, and IMO that should be the default

This should be built in to the Python interpreter and on by default,
unless it is turned off by a command-line switch that says "I want to
allow the full set of Unicode identifier characters in identifiers."

> A second one should provide a very conservative Unicode set, with
> provision for amendment as experience shows restriction to be
> desirable or extension to be safe.

If we are going to allow Unicode identifiers at all, then I would
recommend only allowing identifiers that are already normalized
(in NFC).  If this recommendation is rejected, then I propose that
the second-level mode that Stephen suggests here only allow
normalized identifiers.
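
To illustrate why normalization matters at all: the same visible
identifier can be encoded either precomposed or with combining marks,
and without a normalization rule those would be two different names.
A minimal sketch:

    import unicodedata

    precomposed = "caf\u00e9"     # 'cafe' ending in U+00E9, e-acute
    decomposed  = "cafe\u0301"    # 'cafe' plus U+0301, combining acute

    print(precomposed == decomposed)                                 # False
    print(unicodedata.normalize("NFC", decomposed) == precomposed)   # True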

In summary, my preference ordering of the possibilities would be:

    1.  Identifiers remain ASCII-only.

    2.  "python" allows only ASCII identifiers.  "python -U" allows
        Unicode identifiers that are in NFC and use a conservative,
        *fixed* subset of the available characters.  Support for
        "-U" is a compile-time option, preferably not compiled into
        official binary releases of Python.

    3.  "python" and "python -U" are as above.  "python -UU" allows
        all Unicode identifier characters (which may grow over time
        as the Unicode standard changes).  Support for "-UU" is a
        compile-time option, never on in official binary releases of
        Python, and discouraged with "here be dragons" warnings, etc.

The ideas that I'm in favour of include:

    (a) Require identifiers to be in ASCII.

    (b) Require a compile-time option to enable non-ASCII identifiers.

    (c) Require a command-line flag to enable non-ASCII identifiers.

    (d) Require identifiers to be in NFC.

    (e) Use a character set that is fixed over time.


-- ?!ng

From showell30 at yahoo.com  Thu May 24 02:57:44 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Wed, 23 May 2007 17:57:44 -0700 (PDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <46549A7A.6000807@cs.byu.edu>
Message-ID: <906645.13234.qm@web33503.mail.mud.yahoo.com>


--- Neil Toronto <ntoronto at cs.byu.edu> wrote:

> Josiah Carlson wrote:
> > Thank you for hitting home that unless people use Emacs, their
> > tools aren't sufficient for Python development. I still don't
> > believe that my concerns have been addressed. And I certainly
> > don't believe that those Ka-Ping brought up (which are better
> > than mine) have been addressed.  But hey, my tools aren't Emacs,
> > so obviously my concerns regarding using my tools to edit Python
> > in the future don't matter.  Thank you for the vote of confidence.
> 
> Though I don't develop an editor in my spare time, I
> had a similar 
> reaction to the "Emacs does Unicode this way, which
> is correct" 
> solutions. My favorite editor is going to have to
> get awfully smart.
> 
> It reminds me of some friction I experienced when
> trying out Lisp. It's 
> fairly painful to program in Lisp without an editor
> that does 
> paren-matching and automatic indentation. I tried
> Emacs, and I didn't 
> like it, which is a shame because it's the One True
> Editor for 
> programming in Lisp. I basically dropped Lisp over
> this issue. [...]

I'm +1 on being able to use Py3k effectively with a
relatively dumb editor.  (I now use vim, which sucks a
lot less than vi, but doesn't compare to some pretty
damn awesome Windows editors that I used in my
pre-Python days).

Still, I'm +1 on PEP 3131.  It will benefit me in
no way whatsoever, as I'm a native English speaker,
I'm of English descent, I work with people who code
most effectively in English (even though it's often
their second language), I18N doesn't fit my brain, I
like English muffins, etc.

The thing that's compelling to me about PEP 3131 is
that it truly opens up Python to a new audience.  I
just hope I never have to write an app that parses
Dutch tax laws.




       

From showell30 at yahoo.com  Thu May 24 03:56:56 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Wed, 23 May 2007 18:56:56 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705231709231.8399@server1.LFW.org>
Message-ID: <257359.78365.qm@web33507.mail.mud.yahoo.com>


--- Ka-Ping Yee <python at zesty.ca> wrote:
> In summary, my preference ordering of the
> possibilities would be:
> 
> [...]
> 
>     2.  "python" allows only ASCII identifiers. 
> "python -U" allows
>         Unicode identifiers that are in NFC and use
> a conservative,
>         *fixed* subset of the available characters. 
> Support for
>         "-U" is a compile-time option, preferably
> not compiled into
>         official binary releases of Python.
> 
>     3.  "python" and "python -U" are as above. 
> "python -UU" allows
>         all Unicode identifier characters (which may
> grow over time
>         as the Unicode standard changes).  Support
> for "-UU" is a
>         compile-time option, never on in official
> binary releases of
>         Python, and discouraged with "here be
> dragons" warnings, etc.
> 

I'm in favor of that, with the idea that by 3.1 or
3.later (depending on feedback from international
community), Python would eventually deprecate those
options, and it would eventually be the burden of
non-Unicoders (which includes me) to specify
--asciionly if they were worried about running
non-ASCII Python.

I disagree with option 1 (not quoted), but not
passionately.






       

From gproux+py3000 at gmail.com  Thu May 24 04:14:15 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Thu, 24 May 2007 11:14:15 +0900
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <906645.13234.qm@web33503.mail.mud.yahoo.com>
References: <46549A7A.6000807@cs.byu.edu>
	<906645.13234.qm@web33503.mail.mud.yahoo.com>
Message-ID: <19dd68ba0705231914x491174ffha533763799e17a81@mail.gmail.com>

Regarding the security issues around look-alike glyphs (in certain
fonts), wouldn't it be a good thing for any project anyway to have a
number of pre-conditions that any given contribution to the project
must clear?  One such litmus test could be the following:
import codecs

try:
    # .read() forces the decode; merely opening the file would not
    # raise an error for non-ASCII bytes.
    codecs.open("contributedfile.py", "r", "ascii").read()
    print("contribution accepted")
except UnicodeDecodeError:
    print("contribution rejected: evil non-ascii characters lurking "
          "in your source.")

(It should be possible, and this is left as an exercise to the reader,
to use some regexp to first remove strings and comments from the scope
of the test, or to use AST tools to run the test directly on the
generated AST.)

In Japan, replace the above "ascii" by "sjis" for example.

It should be fairly easy to write a number of tools that highlight
"strange" characters in a piece of source code, and I trust that if
there is such a need, the market for Python-specialized editors (and
other generic editors) will let you pick a different color for
characters that are not part of the ASCII set.  Once again, this is
mostly a presentation and workflow issue that can be solved by using
the right tools, or by writing some very simple tools to work around
your favorite editor's shortcomings.

Regards,

Guillaume

From showell30 at yahoo.com  Thu May 24 04:26:03 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Wed, 23 May 2007 19:26:03 -0700 (PDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <19dd68ba0705231914x491174ffha533763799e17a81@mail.gmail.com>
Message-ID: <707787.71027.qm@web33510.mail.mud.yahoo.com>


--- Guillaume Proux <gproux+py3000 at gmail.com> wrote:

> Regarding using looking-alike glyphs (in certain
> fonts) security
> issues, wouldn't it be a good thing for any project
> anyway to have a
> number of pre-conditions for any given contribution
> to a given project
> to be cleared. On of such litmus tests would be like
> the following.
> try:
>      codecs.open("contributedfile.py","r","ascii")
>      print("contribution accepted")
> except UnicodeDecodeError:
>      print("contribution rejected. evil non-ascii
> characters lurking
> in your source. ")
> 

Yep.  Pychecker and automated unit tests could also
protect against bugs or holes caused by bad encodings
or typos (whether malicious or accidental).




       

From guido at python.org  Thu May 24 04:26:31 2007
From: guido at python.org (Guido van Rossum)
Date: Wed, 23 May 2007 19:26:31 -0700
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <19dd68ba0705231914x491174ffha533763799e17a81@mail.gmail.com>
References: <46549A7A.6000807@cs.byu.edu>
	<906645.13234.qm@web33503.mail.mud.yahoo.com>
	<19dd68ba0705231914x491174ffha533763799e17a81@mail.gmail.com>
Message-ID: <ca471dc20705231926r79858529p5b4d8cd251125c1b@mail.gmail.com>

The tokenize module could easily be used to do such tests, as lenient
or as strict as required by any particular style guide.
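
A rough sketch of the kind of check meant here (Py3k syntax; the
function name and the ASCII-only policy are just an example):

    import token
    import tokenize

    def audit_identifiers(path, allowed=lambda ch: ord(ch) < 128):
        # Yield (line, column, name) for every NAME token containing
        # a character outside the allowed set.
        with open(path, encoding='utf-8') as f:
            for tok in tokenize.generate_tokens(f.readline):
                if (tok.type == token.NAME
                        and not all(allowed(ch) for ch in tok.string)):
                    yield tok.start[0], tok.start[1], tok.string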

On 5/23/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> Regarding using looking-alike glyphs (in certain fonts) security
> issues, wouldn't it be a good thing for any project anyway to have a
> number of pre-conditions for any given contribution to a given project
> to be cleared. On of such litmus tests would be like the following.
> try:
>      codecs.open("contributedfile.py","r","ascii")
>      print("contribution accepted")
> except UnicodeDecodeError:
>      print("contribution rejected. evil non-ascii characters lurking
> in your source. ")
>
> (it should be possible (and this is left as exercise to the reader) to
> use some regexp to first remove from the scope of the test strings and
> comments or to use AST tools to make the tests directly on the
> generated AST)
>
> In Japan, replace the above "ascii" by "sjis" for example.
>
> it should be fairly easy to write a number of tools that would
> highlight "strange" characters in a piece of source code and I trust
> that if there is such a need, the market for python specialized
> editors (and other generic editors) will let you pick a different
> color for characters that would not be part of the ascii set. Once
> again, mostly a presentation and workflow issue that can be solved by
> using the right tools or writing some very simple tools to work around
> your favorite editor's lacks.
>
> Regards,
>
> Guillaume
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe: http://mail.python.org/mailman/options/python-3000/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Thu May 24 07:12:54 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 May 2007 07:12:54 +0200
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <20070523082241.85F3.JCARLSON@uci.edu>
References: <20070523011101.85F0.JCARLSON@uci.edu>	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070523082241.85F3.JCARLSON@uci.edu>
Message-ID: <46551ED6.5070900@v.loewis.de>

> This particular excuse pisses me off the most.  "If you can't
> differentiate, then your font or editor sucks."  Thank you for passing
> judgement on my choice of font or editor, but Ka-Ping already stated
> why this argument is bullshit: there does not currently exist a font
> where one *can* differentiate all the glyphs

That's not true. In the Unicode BMP fallback font, you can easily
differentiate all Unicode characters (in the BMP):

http://scripts.sil.org/UnicodeBMPFallbackFont

> Speaking of which, do you know of a fixed-width font that is able to
> allow for the visual distinction of all unicode glyphs in the primary
> plane, or even the portion that Martin is proposing we support?  This
> also "is not a show-stopper", but it certainly reduces audience
> satisfaction by a large margin.

See above.

Regards,
Martin

From martin at v.loewis.de  Thu May 24 07:25:10 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 May 2007 07:25:10 +0200
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <Pine.LNX.4.58.0705231636230.8399@server1.LFW.org>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
	<fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>	<ca471dc20705230945q6d9a31fas9e0011959fa1e643@mail.gmail.com>	<Pine.LNX.4.58.0705231554480.8399@server1.LFW.org>
	<Pine.LNX.4.58.0705231636230.8399@server1.LFW.org>
Message-ID: <465521B6.1050601@v.loewis.de>

> I just realized that this is not the whole story.  There's no
> requirement that a combining character has to actually come
> after a character it can be combined with.  So there might be
> valid identifiers containing sequences of characters that don't
> have a sensible rendering, or that force the combining comma to
> appear separately and thus indistinguishable from a quotation
> mark even in a Unicode-aware editor.

That can't happen. In Unicode, there is no notion of "can be combined
with": any base character can be combined with any combining character.
The rendering engine is supposed to create a glyph on the fly.

Regards,
Martin

From martin at v.loewis.de  Thu May 24 07:38:48 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 May 2007 07:38:48 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705231709231.8399@server1.LFW.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>	<4646A3CA.40705@acm.org>	<4646FCAE.7090804@v.loewis.de>
	<f27rmv$k1d$1@sea.gmane.org>	<464FFD04.90602@v.loewis.de>	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>	<46521CD7.9030004@v.loewis.de>	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>	<46527904.1000202@v.loewis.de>	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<Pine.LNX.4.58.0705231709231.8399@server1.LFW.org>
Message-ID: <465524E8.4000008@v.loewis.de>

> You've got this backwards, and I suspect that's part of the root of
> the disagreement.  It's not that "when humans enter the loop they
> cause problems."  The purpose of the language is to *serve humans*.
> Without humans, we would just use machine code instead of Python.
> If it doesn't work for humans, it's not because the humans are broken,
> the language is broken.
> 
> The grammar has to be something a human can understand.

Indeed, it is easy for a human to still understand the Py3k grammar.
An identifier starts with a letter, followed by letters and digits.
It's really the same rule that was in use all the time.

It's not easy for a single human to memorize the entire *language*,
and never was. The language is not just about the syntax: it's
also about the library. While there are many details of the library
that you can memorize, I bet nobody could enumerate all classes,
functions, methods, symbolic constants etc in the entire library;
this causes no concern for people.

> If we are going to allow Unicode identifiers at all, then I would
> recommend only allowing identifiers that are already normalized
> (in NFC).

In what way would that be an improvement compared to what the PEP
already says?

>     2.  "python" allows only ASCII identifiers.  "python -U" allows
>         Unicode identifiers that are in NFC and use a conservative,
>         *fixed* subset of the available characters.  Support for
>         "-U" is a compile-time option, preferably not compiled into
>         official binary releases of Python.
> 
>     3.  "python" and "python -U" are as above.  "python -UU" allows
>         all Unicode identifier characters (which may grow over time
>         as the Unicode standard changes).  Support for "-UU" is a
>         compile-time option, never on in official binary releases of
>         Python, and discouraged with "here be dragons" warnings, etc.

This would cripple the feature, so I'm -1.

Regards,
Martin


From jcarlson at uci.edu  Thu May 24 09:05:39 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 24 May 2007 00:05:39 -0700
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <46551ED6.5070900@v.loewis.de>
References: <20070523082241.85F3.JCARLSON@uci.edu>
	<46551ED6.5070900@v.loewis.de>
Message-ID: <20070524000016.862B.JCARLSON@uci.edu>


"Martin v. L?wis" <martin at v.loewis.de> wrote:
> > This particular excuse pisses me off the most.  "If you can't
> > differentiate, then your font or editor sucks."  Thank you for passing
> > judgement on my choice of font or editor, but Ka-Ping already stated
> > why this argument is bullshit: there does not currently exist a font
> > where one *can* differentiate all the glyphs
> 
> That's not true. In the Unicode BMP fallback font, you can easily
> differentiate all Unicode characters (in the BMP):
> 
> http://scripts.sil.org/UnicodeBMPFallbackFont

That's a cute hack that offers a method of applying the "just use hex"
argument to any editor with multi-font support, but it certainly isn't
usable for actual work.

 - Josiah


From stephen at xemacs.org  Thu May 24 09:19:42 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 24 May 2007 16:19:42 +0900
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <20070523082241.85F3.JCARLSON@uci.edu>
References: <20070523011101.85F0.JCARLSON@uci.edu>
	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070523082241.85F3.JCARLSON@uci.edu>
Message-ID: <87bqga6641.fsf@uwakimon.sk.tsukuba.ac.jp>

Josiah Carlson writes:

 > Thank you for hitting home that unless people use Emacs, their tools
 > suck.

I'm sorry you took it that way.  My experience is limited to Emacs;
that's the only experience I can describe.

If you can tell the story of a maintainer of a package that contains
multilingual identifiers and who experienced a horror story, I'd like
to hear it, and I sure hope you tell Guido about it.

I'll deal with the technical content of your reply elsewhere.

Sincerely yours,
Steve


From showell30 at yahoo.com  Thu May 24 09:10:51 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Thu, 24 May 2007 00:10:51 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <465524E8.4000008@v.loewis.de>
Message-ID: <548742.28521.qm@web33503.mail.mud.yahoo.com>


--- "Martin v. L?wis" <martin at v.loewis.de> wrote:

> [...]
> >     2.  "python" allows only ASCII identifiers. 
> "python -U" allows
> >         Unicode identifiers that are in NFC and
> use a conservative,
> >         *fixed* subset of the available
> characters.  Support for
> >         "-U" is a compile-time option, preferably
> not compiled into
> >         official binary releases of Python.
> > 
> >     3.  "python" and "python -U" are as above. 
> "python -UU" allows
> >         all Unicode identifier characters (which
> may grow over time
> >         as the Unicode standard changes).  Support
> for "-UU" is a
> >         compile-time option, never on in official
> binary releases of
> >         Python, and discouraged with "here be
> dragons" warnings, etc.
> 
> This would cripple the feature, so I'm -1.
> 

FWIW the Ruby interpreter (1.8.5) seems to require
this flag to allow you to turn on the Japanese code
set.

  -Kkcode  specifies KANJI (Japanese) code-set

I have no idea whether or not this cripples the
feature in Ruby, and perhaps it's an apples/oranges
comparison.






       

From martin at v.loewis.de  Thu May 24 09:11:27 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 May 2007 09:11:27 +0200
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <20070524000016.862B.JCARLSON@uci.edu>
References: <20070523082241.85F3.JCARLSON@uci.edu>
	<46551ED6.5070900@v.loewis.de>
	<20070524000016.862B.JCARLSON@uci.edu>
Message-ID: <46553A9F.9010307@v.loewis.de>

>>> This particular excuse pisses me off the most.  "If you can't
>>> differentiate, then your font or editor sucks."  Thank you for passing
>>> judgement on my choice of font or editor, but Ka-Ping already stated
>>> why this argument is bullshit: there does not currently exist a font
>>> where one *can* differentiate all the glyphs
>> That's not true. In the Unicode BMP fallback font, you can easily
>> differentiate all Unicode characters (in the BMP):
>>
>> http://scripts.sil.org/UnicodeBMPFallbackFont
> 
> That's a cute hack that offers a method of applying the "just use hex"
> argument to any editor with multi-font support, but it certainly isn't
> usable for actual work.

Depends on what you want to achieve. If your objective is "I want to
visually recognize whether there are any stray characters in the file,
outside the range of characters which I normally use", then such
a kind of font can work very well.

This one (or one similar to it) is installed (by default?) on Debian
Linux, and it helps to recognize cases where you have characters
in a text that you could not display otherwise.

In any case, I still think it proves the argument wrong: "there does not
currently exist a font where one *can* differentiate all the glyphs".

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Thu May 24 10:28:40 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 24 May 2007 20:28:40 +1200
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <46551ED6.5070900@v.loewis.de>
References: <20070523011101.85F0.JCARLSON@uci.edu>
	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070523082241.85F3.JCARLSON@uci.edu> <46551ED6.5070900@v.loewis.de>
Message-ID: <46554CB8.7050209@canterbury.ac.nz>

Martin v. Löwis wrote:

> That's not true. In the Unicode BMP fallback font, you can easily
> differentiate all Unicode characters (in the BMP):
> 
> http://scripts.sil.org/UnicodeBMPFallbackFont

Er... somehow I don't think that's what Martin had in mind
when he used the word "font" in that context. :-)

--
Greg

From stephen at xemacs.org  Thu May 24 12:05:24 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 24 May 2007 19:05:24 +0900
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <20070523111704.85FC.JCARLSON@uci.edu>
References: <20070523011101.85F0.JCARLSON@uci.edu>
	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070523111704.85FC.JCARLSON@uci.edu>
Message-ID: <87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>

Josiah Carlson writes:

 > Removing those words that some found offensive, perhaps I will get a
 > response to the point of my post: "your tools aren't very good" and
 > "Emacs does it right" are not valid responses to the concerns brought up
 > regarding unicode.

You're missing my point still, and I don't find the words offensive.
(It's a pain in the neck, since I already wrote my reply, but I'll
remove them too.)  Nor do I find your completely groundless conclusion
that I'm deprecating other tools offensive.

I find them to be an indicator of your fears which cannot be grounded
in any experience of mine---in exactly the kind of environment PEP
3131 will provide.  I strongly suspect you have no experience at all,
not even hearsay, to offer.  *Please* prove me wrong!  My experience
is *far* from definitive.

But if you can't, well, I don't blame you for your fear, but I also
cannot take it seriously as a reason to not implement this PEP in the
face of my own long experience.

 > but Ka-Ping already stated why this argument is invalid: there
 > does not currently exist a font where one *can* differentiate all
 > the glyphs,

I'll tell you why Ka-Ping's argument is a strawman.  First, one only
*needs* to be able to distinguish those characters that one can read.
It's nice to be able to admire the rest, of course, but you don't need
to see them as a speaker of that language would.  You just use a font
you like for the characters you can read, and the rest can be any old
dog.

Second, you do *not* need a single font with universal coverage.  I
typically use different fonts for Roman, Kanji, half-width kana, and
Hangul.  If I happen to have some Chinese in there, that will be yet
another font.  If I had cause to use Arabic, Hebrew, or Thai, they
would be yet other fonts.  It simply is not at all unpleasant to use
LucidaTypewriter for ASCII and Latin-1 in the same buffer with Sazanami
Gothic for Japanese.

N.B.  Martin is correct to point out the existence of the SIL BMP
fallback font, but that doesn't answer the real issue, that users
should use the fonts (and tools) they like best.

 > and further, even if one could visually differentiate similar

I have actually worked in an environment where you can't visually
distinguish different characters.  Security aside, it's a PITA, and
you *do* want tools to deal with it.  Those tools are *not* expensive;
simply audit the editor buffer for characters outside of the user's
acceptable set, and be 99% happy.  Once you've got tools, it's not a
big deal.  Can you find somebody with experience to say otherwise?

 > glyphs, *remembering* the 64,000+ glyphs that are available in just
 > the primary unicode plane to differentiate them, is a herculean
 > task.

Strawman.  The only people who need to remember the glyphs are those
who need to read them anyway, or glyphs that look like them (cf
Ka-Ping's example).  So they have already memorized them.

 > Never mind the fact that people use dozens, perhaps hundreds of
 > different editors to write and maintain Python code, that the
 > 'Emacs works' argument is poor at best.  It was invalid then, and
 > it is invalid now.

It was intended only to counter Ka-Ping's strawman of "impossible to
detect", and it demolishes that claim.

But addressing the content of what you write, you mean that, in a
world that allows multilingual identifiers, 'Emacs works' "smells
like" [from your original post] a threat to the market share of
editors that can't deal with multilingual identifiers, not to mention
the work habits of Emacs-haters everywhere, don't you?

Well, you're probably wrong.  *If* your users need to deal with
multilingual identifiers, *maybe* they'll prefer to switch to Emacs.
*If* they need extremely robust handling of multilingual identifiers
on a daily basis, they probably will switch to Emacs.

I doubt it, though.  What they'll probably do is write a five line
patch to get them 90% of the way to what Emacs gives them out of the
box, and be ecstatic that they don't have to use Emacs at all.
(That's a guess, as an XEmacs developer I don't see much of that
activity.)

And that's a big "if".  Most of your users will not see code in a
language the current version of your editor can't deal with in their
working lives, and 90% won't in the usable life of your product.  That
I can tell you from experience.  Emacs has all these wonderful
multilingual features, but you know what?  95% of our users are
monoscript 100% of the time.[1]  90% of the rest use their primary
script 95% of the time.  Emacs being multilingual only means that the
one language might be Japanese or Thai.  If 99% of your users
currently use only ISO-8859-15, that isn't going to change by much just
because Python now allows Thai identifiers.

In other words, if you're up multilingual creek without a paddle,
Emacs will get you to shore.  Do you have a problem with it, put that
way?

 > That's an invalid argument, and you know it.  "Just use hex
 > escapes"?

No, my argument is not "just use hex escapes".  Please read it again,
and if you wish to respond to what I wrote, feel free.

So, you have my apologies, but I still advocate implementation of PEP
3131 over your objections, and those of Ka-Ping.

Footnotes: 
[1]  Eg, all Swiss know a half-dozen languages, but they can write all
of them with one script, ISO-8859-15.


From stephen at xemacs.org  Thu May 24 12:19:02 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 24 May 2007 19:19:02 +0900
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <46549A7A.6000807@cs.byu.edu>
References: <20070523011101.85F0.JCARLSON@uci.edu>
	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070523111704.85FC.JCARLSON@uci.edu>
	<46549A7A.6000807@cs.byu.edu>
Message-ID: <878xbe5xt5.fsf@uwakimon.sk.tsukuba.ac.jp>

Neil Toronto writes:

 > Though I don't develop an editor in my spare time, I had a similar 
 > reaction to the "Emacs does Unicode this way, which is correct" 
 > solutions. My favorite editor is going to have to get awfully smart.

It isn't.  It will need to learn about widechars, which is painful for
the editor's developer.  (But only if she writes in C: "what do you
mean I can't use strncat?!")  Other than that, there's probably not
that much to it (see the last part of my reply to Josiah).  Most
editors have access to a reasonable GUI environment these days that
will handle the input and the fonts (even if that environment comes
via Terminal or uxterm).


From stephen at xemacs.org  Thu May 24 13:17:57 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 24 May 2007 20:17:57 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705231709231.8399@server1.LFW.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
	<f27rmv$k1d$1@sea.gmane.org> <464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<Pine.LNX.4.58.0705231709231.8399@server1.LFW.org>
Message-ID: <877iqy5v2y.fsf@uwakimon.sk.tsukuba.ac.jp>

Ka-Ping Yee writes:

 > On Wed, 23 May 2007, Stephen J. Turnbull wrote:

 > >  > It means users could see the usability benefits of PEP3131, but the
 > >  > python internals could still work with ASCII only.

 > > But this reasoning is not coherent.  Python internals will have no
 > > problems with non-ASCII; in fact, they would have no problems with
 > > tokens containing Cf characters or even reserved code points.  Just
 > > give an unambiguous grammar for tokens composed of code points.  It's
 > > only when a human enters the loop (ie, presentation of the identifier
 > > on an output stream) that they cause problems.
 > 
 > You've got this backwards, and I suspect that's part of the root of
 > the disagreement.  It's not that "when humans enter the loop they
 > cause problems."  The purpose of the language is to *serve humans*.

Of course!  "Incoherent" refers *only* to "python internals".  We need
to look at the parts of the loop where the humans are.

N.B. I take offense at your misquote.  *Humans do not cause problems.*
It is *non-ASCII tokens* that *cause* the (putative) problem.  However,
the alleged problems only arise when humans are present.

 > The grammar has to be something a human can understand.

There are an infinite number of ASCII-only Python tokens.  Whether
those tokens are lexically composed of a small fixed finite alphabet
vs. a large extensible finite alphabet doesn't change anything in
terms of understanding the *grammar*.

The character-identity problem is vastly aggravated (created, if you
insist) by large numbers of characters, but that is something
separate.  I don't understand why you conflate lexical issues with the
still-fits-in-*my*-pin-head simplicity of the Python grammar.  Am I
missing something?

 > (And if 90%, or more than 50%, of the tools are "broken" with respect
 > to the language, that's a language problem, not just a tool problem.)

It's a *problem* for the tools, because they may become obsolete,
depending on how expensive the feature of handling new language
constructs is.  It is an *issue* for the language, *not* a "problem"
in the same sense.  The language designer must balance the problems
faced by the tools, and the cost of upgrading them---including users'
switching costs!---against the benefits of the new language feature.
Nothing new here.

The question is how expensive will the upgrade be, and what are the
benefits.  My experience suggests that the cost is negligible *because
most users won't use non-ASCII identifiers*, and they'll just stick
with their ASCII-only tools.  The benefits are speculative; I know
that my students love the idea of a programming language that doesn't
look like English (which has extremely painful associations for most).

And there are cases (Dutch tax law, Japanese morphology) where having
a judicious selection of non-ASCII identifiers is very convenient.
Specifically, from my own experience, if I don't know what a
particular function in edict is supposed to do, I just ask the nearest
Japanese.  And they tell me, "oh, that parses the INFLECTION-TYPE of
PART-OF-SPEECH", and when I look blank, they continue, "you know, the
'-masu' in 'gozaimasu'".  Now, since there is no exact equivalent to
"-masu" in English (or any European language AFAIK), it would be
impossible to give a precise self-documenting name in ASCII.  Sure,
you can work around this -- but why not put down the ASCII hammer and
save on all that ibuprofen?

 > > I propose it would be useful to provide a standard mechanism for
 > > auditing the input stream.  There would be one implementation for the
 > > stdlib that complains[1] about non-ASCII characters and possibly
 > > non-English words, and IMO that should be the default
 > 
 > This should be built in to the Python interpreter and on by default,
 > unless it is turned off by a command-line switch that says "I want to
 > allow the full set of Unicode identifier characters in identifiers."

I'd make it more tedious and more flexible to relax the restriction,
actually.  "python" gives you the stdlib, ASCII-only restriction.
"python -U TABLE" takes a mandatory argument, which is the table of
allowed characters.  If you want to rule out "stupid file substitution
tricks", TABLE could take the special arguments "stdlib" and "stduni"
which refer to built-in tables.  But people really should be able to
restrict to "Japanese joyo kanji, kana, and ASCII only" or "IBM
Japanese only" as local standards demand, so -U should also be able to
take a file name, or a module name, or something like that.

 > If we are going to allow Unicode identifiers at all, then I would
 > recommend only allowing identifiers that are already normalized
 > (in NFC).

Already in the PEP.

 > The ideas that I'm in favour of include:
 > 
 >     (e) Use a character set that is fixed over time.

The BASIC that I learned first only had 26 user identifiers.  Maybe
that's the way we should go?<duck />

From stephen at xemacs.org  Thu May 24 13:55:01 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 24 May 2007 20:55:01 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<4646FCAE.7090804@v.loewis.de> <f27rmv$k1d$1@sea.gmane.org>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
Message-ID: <87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>

Jim Jewett writes:

 > I would like an alert (and possibly an import exception) on any code
 > whose *executable portion* is not entirely in ASCII.

Are you talking about language definition or implementation?  I like
the idea of such checks, as long as they are not mandatory in the
language and can be turned off easily at run time in the default
configuration.  I'd also really like a generalization (described below).

 > > The only issues PEP 3131 should be concerned with *defining*
 > > are those that cause problems with canonicalization, and the range of
 > > characters and languages allowed in the standard library.
 > 
 > Fair enough -- but the problem is that this isn't a solved issue yet;

IMHO the stdlib *is* a solved issue.  The PEP says "in the standard
library, we use ASCII only, except in tests and the like," and "we use
English unless there is no reasonable equivalent in English."  That's
right.

AFAIK *canonicalization* is also a solved issue (although exactly what
"NFC" means might change with Unicode errata and of course with future
addition of combining characters or precombined characters).

The notion of "identifier constituent" is a bit thorny.  While in
general Cf characters don't belong in my understanding, there are some
weird references to ZWJ and ZWNJ that I don't understand in UAX#31.  I
say "leave them out until somebody named 'Bhattacharya' says 'Hey! I
need that!'"<wink>  In general, when in doubt, leave it out.

And prohibit it.  I think it's a very bad idea to give identifier
authors *any* control over their presentation to readers.  If an
editor has a broken or nonexistent bidi implementation, for example,
its user is probably used to that.  With *sufficient* breakage in a
presentation algorithm, I suppose that the same identifier could be
presented differently in different contexts, and that different
identifiers could be presented identically.  But that's not Python's
problem.  This can easily happen in ASCII, too.  (Consider an editor
that truncates display silently at column 80.)

 > Even having read their reports, my initial rules would still have
 > banned mixed-script, which would have prevented your edict-
 > example.

Urk.  I see your point (Ka-Ping's Cyrillic example makes it glaringly
clear why that's the conservative way to go).  I don't have to like
it, but I could live with it.  (Especially since "edict-" is a poor
man's namespace.  That device isn't needed in Python.)

 > > I propose it would be useful to provide a standard mechanism for
 > > auditing the input stream.  There would be one implementation for the
 > > stdlib ....  A second ....  A third, ....
 > 
 > This might deal with my concerns.  It is a bit more complicated than
 > the current plans.

Well, what I *really* want is a loadable table.  My motivation is that
I want organizations to be able to "enforce" a policy that is less
restrictive than "ASCII-only" but more restrictive than "almost
anything goes".  My students don't need Sanskrit; Guido's tax
accountant doesn't need kanji, and neither needs Arabic.  I think that
they should be able to get the same strict "alert or even import
exception" (that you want on non-ASCII) for characters outside their
larger, but still quite restricted, sets.

From showell30 at yahoo.com  Thu May 24 15:20:26 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Thu, 24 May 2007 06:20:26 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <877iqy5v2y.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <257070.91051.qm@web33507.mail.mud.yahoo.com>


--- "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> 
> I'd make it more tedious and more flexible to relax
> the restriction,
> actually.  "python" gives you the stdlib, ASCII-only
> restriction.
> "python -U TABLE" takes a mandatory argument, which
> is the table of
> allowed characters. 

Now that the PEP has been accepted, maybe some more
language could be added to it that addresses the
concerns of folks who want to keep their code
ASCII-only.

It seems that if Python, by default, restricts to
ASCII, then you at least eliminate the most obvious
objections.  (You still have the indirect arguments
about it contributing to less code written in English
worldwide, etc.).

Then, for all the other classes of users (Dutch tax
lawyer who still doesn't want Sanskrit, etc.), do you
advocate having multiple convenient ways to specify
their desired character set (command line flag, env
setting, magic directive at top of file, etc.), or do
you want the "one true way"?





From gproux at gmail.com  Thu May 24 10:47:13 2007
From: gproux at gmail.com (Guillaume Proux)
Date: Thu, 24 May 2007 17:47:13 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <548742.28521.qm@web33503.mail.mud.yahoo.com>
References: <465524E8.4000008@v.loewis.de>
	<548742.28521.qm@web33503.mail.mud.yahoo.com>
Message-ID: <19dd68ba0705240147k76e9009dna63a6acda449aafa@mail.gmail.com>

On 5/24/07, Steve Howell <showell30 at yahoo.com> wrote:
>   -Kkcode  specifies KANJI (Japanese) code-set
>

Isn't it simply to let Ruby know the actual code page (encoding)
in which the file is encoded?

Regards,

Guillaume Proux
Scala

From stephen at xemacs.org  Thu May 24 17:39:12 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 25 May 2007 00:39:12 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <257070.91051.qm@web33507.mail.mud.yahoo.com>
References: <877iqy5v2y.fsf@uwakimon.sk.tsukuba.ac.jp>
	<257070.91051.qm@web33507.mail.mud.yahoo.com>
Message-ID: <871wh65izj.fsf@uwakimon.sk.tsukuba.ac.jp>

Steve Howell writes:

 > Then, for all the other classes of users (Dutch tax
 > lawyer who still doesn't want Sanskrit, etc.), do you
 > advocate having multiple convenient ways to specify
 > their desired character set (command line flag, env
 > setting, magic directive at top of file, etc.), or do
 > you want the "one true way"?

-1 on magic directive.  That delegates the decision to the file.
That's not what we want here.

+1 on "command line only".  Ie, force the user to redefine the
Python command with an alias or something if they want to set a
different default from site policy.


From jimjjewett at gmail.com  Thu May 24 17:48:58 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 24 May 2007 11:48:58 -0400
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20070523011101.85F0.JCARLSON@uci.edu>
	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <fb6fbf560705240848s39697c70p61f5a2509559847b@mail.gmail.com>

On 5/24/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:

> I have actually worked in an environment where you can't visually
> distinguish different characters.  Security aside, it's a PITA, and
> you *do* want tools to deal with it.  ...
> simply audit the editor buffer for characters outside of the user's
> acceptable set, and be 99% happy.  Once you've got tools, it's not a
> big deal.  Can you find somebody with experience to say otherwise?

...

> And that's a big "if".  Most of your users will not see code in a
> language the current version of your editor can't deal with in their
> working lives, ...

The problem (with larger charsets) isn't that you regularly face
indistinguishable characters.  It is that you face them rarely enough
that you don't remember to run that audit, so the actual bug is very
difficult to track down.

Ignoring security issues, that could probably be handled by having to
flip a switch before importing those modules.  So long as the default
allows only ASCII, the act of flipping that switch is my reminder to
check.

While an on/off toggle would generally be sufficient for my needs, I
would feel more comfortable with a per-script allowance, so that I
could say "OK, go ahead and allow Kanji, but still warn me if there is
a stray Cyrillic character."
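
(As a very crude sketch of that kind of per-script allowance, one
could lean on Unicode character names via the unicodedata module;
a real implementation would consult the Scripts property instead,
and the names below are made up:)

    import unicodedata

    def stray_chars(name, allowed_prefixes=('LATIN', 'CJK',
                                            'HIRAGANA', 'KATAKANA')):
        # Yield non-ASCII characters whose Unicode names do not start
        # with one of the allowed script prefixes.
        for ch in name:
            if ord(ch) < 128:
                continue
            label = unicodedata.name(ch, '')
            if not label.startswith(allowed_prefixes):
                yield ch, label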

-jJ

From jcarlson at uci.edu  Thu May 24 19:50:39 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 24 May 2007 10:50:39 -0700
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20070524082737.862E.JCARLSON@uci.edu>


"Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> Josiah Carlson writes:
>  > Removing those words that some found offensive, perhaps I will get a
>  > response to the point of my post: "your tools aren't very good" and
>  > "Emacs does it right" are not valid responses to the concerns brought up
>  > regarding unicode.
> 
> You're missing my point still, and I don't find the words offensive.
> (It's a pain in the neck, since I already wrote my reply, but I'll
> remove them too.)  Nor do I find your completely groundless conclusion
> that I'm deprecating other tools offensive.

I'll skip to the chase here.

Many of my concerns could be addressed through the use of commandline,
environment variable, or in-source code definitions of what are
allowable identifier characters.  Generally, in-source definitions (like
the coding: directive) are the most flexible, but are the biggest pain
for editors and IDEs (which may want to verify every identifier as it is
being typed, etc.).  The not insignificant problem is that it allows for
identifier characters to be defined on a per-module basis.  This is
'fixed' by commandline/environment variables, but it also makes running
(rather than editing) a bigger pain than it should be.


If people can agree on a method for specifying 'ascii only' or 'ascii +
character sets X, Y, Z', and it actually becomes an accepted part of the
proposal, gets implemented, etc., I will grumble to myself at home, but
I will stop trying to raise a stink here.


> I find them to be an indicator of your fears which cannot be grounded
> in any experience of mine---in exactly the kind of environment PEP
> 3131 will provide.  I strongly suspect you have no experience at all,
> not even hearsay, to offer.  *Please* prove me wrong!  My experience
> is *far* from definitive.

My "fear" is that being able to prove (to myself and others) that the
code I am looking at does what it should do.  As you say, maybe I will
never see non-ascii source in my life.  But even if I don't, I know some
of my users will, and to not be American-centric, I need to continue to
provide them with "tools that don't suck" (which will likely necessitate
testing using non-ascii identifiers).


> But addressing the content of what you write, you mean that, in a
> world that allows multilingual identifiers, 'Emacs works' "smells
> like" [from your original post] a threat to the market share of
> editors that can't deal with multilingual identifiers, not to mention
> the work habits of Emacs-haters everywhere, don't you?

Please understand me, I don't hate Emacs.  I also don't hate Vim.  I
just don't find that they fit my personal aesthetics for doing what I
like and need to do: write Python (and a few others).  And because I'm
not selling my editor, I don't really care about (my) market share. What
I care about is functional tools for everyone who wants to use Python.

To me, that means that people should be able to write software in
whatever tool they currently prefer, whether that is Emacs, Vim, Eric3,
Idle, SPE, NewEdit, DrPython, PythonWin, Scite, PyPE, Nedit, Kate, Gedit,
Leo, Boa Constructor, Windows Notepad, Visual Studio, etc.  Some of
those will get certain functionality for free (due to their use of the
same editing component), but each will need to write their own "discover
usable characters, verify identifiers, report to user" mechanism (though
some will opt for merely syntax highlighting).  Some will want/need
alternate input methods for needing to write characters out of a user's
locale.

Who knows, maybe it is as simple as a 5 line change.  And maybe it won't
be as big a problem as we are concerned about.  But "it's not a problem",
"in my experience with Java", and "Emacs users rarely if ever have to
deal with such things" don't make me feel any better about the issues
regarding Python and editors that aren't Emacs*.


 - Josiah

* Partly because I don't know the market share that Emacs has with Java
developers, and/or whether Java editor market share is flat across
national boundaries.


From jimjjewett at gmail.com  Thu May 24 20:14:41 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 24 May 2007 14:14:41 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>

On 5/24/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Jim Jewett writes:

>  > I would like an alert (and possibly an import exception) on any code
>  > whose *executable portion* is not entirely in ASCII.

> Are you talking about language definition or implementation?  I like
> the idea of such checks, as long as they are not mandatory in the
> language and can be turned off easily at run time in the default
> configuration.  I'd also really like a generalization (described below).

Definition; I don't care whether it is a different argument to import
or a flag or an environment variable or a command-line option, or ...
I just want the decision to accept non-ASCII characters to be
explicit.

Ideally, it would even be explicit per extra character allowed, though
there should obviously be shortcuts to accept entire scripts.

>  > > The only issues PEP 3131 should be concerned with *defining*
>  > > are those that cause problems with canonicalization, and the range of
>  > > characters and languages allowed in the standard library.

Sorry; I missed the "stdlib" part of that sentence when I first
replied.  I agree except that the range of characters/languages
allowed by *python* is also an open issue.

> AFAIK *canonicalization* is also a solved issue (although exactly what
> "NFC" means might change with Unicode errata and of course with future
> addition of combining characters or precombined characters).

Why NFC?

The Tech Reports seem to suggest NFKD -- and that makes a certain
amount of sense.  Using compatibility characters reduces the problem
with equivalent characters that are distinct only for historical
reasons.  Using decomposed characters  simplifies processing.

On the other hand, NFC might often be faster in practice, as it might
not require changes -- but if you don't do the processing to verify
that, then you mess up the hash.
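
(A small example of the difference, using the unicodedata module;
U+FB01 is the "fi" ligature:)

    import unicodedata

    lig = '\ufb01'   # LATIN SMALL LIGATURE FI
    assert unicodedata.normalize('NFC', lig) == lig     # canonical: unchanged
    assert unicodedata.normalize('NFKD', lig) == 'fi'   # compatibility: 'f' + 'i'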

I'm willing to trust the judgment of those with more experience, but
the decision of which form to use should be explicit.

> The notion of "identifier constituent" is a bit thorny.

I think it is even thornier than you do, but I think we may agree on
an acceptable answer.

> Well, what I *really* want is a loadable table.  My motivation is that
> I want organizations to be able to "enforce" a policy that is less
> restrictive than "ASCII-only" but more restrictive than "almost
> anything goes".  My students don't need Sanskrit; Guido's tax
> accountant doesn't need kanji, and neither needs Arabic.  I think that
> they should be able to get the same strict "alert or even import
> exception" (that you want on non-ASCII) for characters outside their
> larger, but still quite restricted, sets.

So how about

(1)  By default, python allows only ASCII.
(2)  Additional characters are permitted if they appear in a table
named on the command line.

These additional characters should be restricted to code points larger
than ASCII (so you can't easily turn "!" into an ID char), but beyond
that, anything goes.  If you want to include punctuation or undefined
characters, so be it.
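
(A hypothetical sketch of the check being proposed; nothing like this
exists in Python today, and the names are made up:)

    def check_identifier(name, extra_allowed=frozenset()):
        # ASCII is always legal; anything above it must appear in the
        # table named on the command line.
        for ch in name:
            if ord(ch) >= 128 and ch not in extra_allowed:
                raise SyntaxError('character %r is not in the allowed '
                                  'identifier table' % (ch,))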

Presumably, code using Kanji would be fairly easy to run in a Kanji
environment, but code using punctuation or Linear B would ... need to
convince people that there was a valid reason for it.

Note that I think a single table argument is sufficient; I don't see
the point in saying that identifiers can include Japanese Accounting
Numbers, but can't start with them.  (Unless someone is going to
suggest that they be parsed to their numeric value?)

-jJ

From martin at v.loewis.de  Thu May 24 20:37:07 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 May 2007 20:37:07 +0200
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <46554CB8.7050209@canterbury.ac.nz>
References: <20070523011101.85F0.JCARLSON@uci.edu>	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>	<20070523082241.85F3.JCARLSON@uci.edu>
	<46551ED6.5070900@v.loewis.de> <46554CB8.7050209@canterbury.ac.nz>
Message-ID: <4655DB53.80200@v.loewis.de>

>> That's not true. In the Unicode BMP fallback font, you can easily
>> differentiate all Unicode characters (in the BMP):
>>
>> http://scripts.sil.org/UnicodeBMPFallbackFont
> 
> Er... somehow I don't think that's what Martin had in mind
> when he used the word "font" in that context. :-)

That might well be - however, I think that is because of an
unclear problem statement. From the discussion, I gathered
that the perceived problem is this:

"Somebody maliciously sends me a patch, and I want to be
able to tell visually that it's wrong."

A possible answer to that was proposed as "the editor should
render the characters differently", to which the
counter-argument was "there is no font to do that, so the
editor can't". I just wanted to point out that this just
is not true: there is an approach to Unicode fonts where
you can guarantee that all characters can be rendered,
and that all characters rendered in that font look different.

Regards,
Martin

From martin at v.loewis.de  Thu May 24 20:45:34 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 May 2007 20:45:34 +0200
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <20070524082737.862E.JCARLSON@uci.edu>
References: <20070523111704.85FC.JCARLSON@uci.edu>	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070524082737.862E.JCARLSON@uci.edu>
Message-ID: <4655DD4E.3050809@v.loewis.de>

> Much of my concerns could be addressed through the use of commandline,
> environment variable, or in-source code definitions of what are
> allowable identifier characters.  Generally, in-source definitions (like
> the coding: directive) are the most flexible, but are the biggest pain
> for editors and IDEs (which may want to verify every identifier as it is
> being typed, etc.).

Not sure (anymore) what problem you are trying to solve, but it might be
that the coding directive already *is* the solution. If you want to
constrain characters that you can use in a single source file, adding
a coding directive will automatically impose such a constraint (namely,
to the characters available in the encoding).

In particular, if you set the encoding to us-ascii, you have restricted
your source file to ASCII only.
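
For example, a file whose first line is

    # -*- coding: us-ascii -*-

should be rejected at compile time if any non-ASCII byte appears
anywhere in it, in identifiers, strings and comments alike.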

> If people can agree on a method for specifying, 'ascii only', 'ascii +
> character sets X, Y, Z', and it actually becomes an accepted part of the
> proposal, gets implemented, etc., I will grumble to myself at home, but
> I will stop trying to raise a stink here.

I think you can stop now - this is supported as a side effect of
PEP 263, and implemented for years.

> My "fear" is that being able to prove (to myself and others) that the
> code I am looking at does what it should do.  As you say, maybe I will
> never see non-ascii source in my life.  But even if I don't, I know some
> of my users will, and to not be American-centric, I need to continue to
> provide them with "tools that don't suck" (which will likely necessitate
> testing using non-ascii identifiers).

I think the PEP 263 machinery allows for great flexibility here.

Additional tools can be implemented, of course, and will be produced
if there is a demand for them (e.g. post-commit hooks for versioning
systems).

Regards,
Martin


From martin at v.loewis.de  Thu May 24 20:50:28 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 24 May 2007 20:50:28 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <548742.28521.qm@web33503.mail.mud.yahoo.com>
References: <548742.28521.qm@web33503.mail.mud.yahoo.com>
Message-ID: <4655DE74.4090708@v.loewis.de>

> FWIW the Ruby interpreter (1.8.5) seems to require
> this flag to allow you to turn on the Japanese code
> set.
> 
>   -Kkcode  specifies KANJI (Japanese) code-set
> 
> I have no idea whether or not this cripples the
> feature in Ruby, and perhaps it's an apples/oranges
> comparison.

If you don't have source encoding declarations (like the one
in PEP 263), you must have some means of setting the source
encoding; this is what -Kkcode does (similar to javac's
-encoding command line option).

This approach has several flaws, e.g. you can only specify
a single encoding, which breaks if you have modules in
different encodings.

In any case, it's different from the suggested -UU option:
Python already knows what the source encoding is, -UU
would not change that. Instead, that option would merely
serve to constrain the source code (if it's not being
passed).

Regards,
Martin

From jimjjewett at gmail.com  Thu May 24 21:17:39 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 24 May 2007 15:17:39 -0400
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <4655DD4E.3050809@v.loewis.de>
References: <20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070524082737.862E.JCARLSON@uci.edu> <4655DD4E.3050809@v.loewis.de>
Message-ID: <fb6fbf560705241217i3db8f42av2353d15007479c89@mail.gmail.com>

On 5/24/07, "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > Much of my concerns could be addressed through the use of commandline,
> > environment variable, or in-source code definitions of what are
> > allowable identifier characters.  Generally, in-source definitions (like
> > the coding: directive) are the most flexible, but are the biggest pain
> > for editors and IDEs (which may want to verify every identifier as it is
> > being typed, etc.).

> Not sure (anymore) what problem you are trying to solve, but it might be
> that the coding directive already *is* the solution. If you want to
> constrain characters that you can use in a single source file, adding
> a coding directive will automatically impose such a constraint (namely,
> to the characters available in the encoding).

Wanting to constrain identifiers is not the same as wanting to
constrain all characters.

> In particular, if you set the encoding to us-ascii, you have restricted
> your source file to ASCII only.

The stdlib is largely restricted to ASCII.  I don't think I want (the
vast majority of) the stdlib to grow a coding directive just to
enforce this.  I also don't want to lift that restriction and
accidentally allow Kanji identifiers just because Löwis appears in a
comment.

-jJ

From python at zesty.ca  Thu May 24 23:04:16 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Thu, 24 May 2007 16:04:16 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>

On Thu, 24 May 2007, Jim Jewett wrote:
> So how about
>
> (1)  By default, python allows only ASCII.
> (2)  Additional characters are permitted if they appear in a table
> named on the command line.
>
> These additional characters should be restricted to code points larger
> than ASCII (so you can't easily turn "!" into an ID char), but beyond
> that, anything goes.  If you want to include punctuation or undefined
> characters, so be it.

+1!  This is a fine solution.  It is better than the "python -U"
option I proposed -- it has all the advantages of that proposal, plus:

    - The identifier character set won't spontaneously change when
      one upgrades to a new version of Python, even for users of
      non-ASCII identifiers.

    - Having to specify the table of acceptable characters
      demonstrates at least some knowledge of the character set
      one is using.

    - It provides the flexibility for different communities
      to adopt identifier conventions that suit their preferred
      tradeoff of risk vs. expressiveness.

Jim's proposal appears to be the best path to making everyone happy.


-- ?!ng

From python at zesty.ca  Thu May 24 23:12:55 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Thu, 24 May 2007 16:12:55 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <4655DB53.80200@v.loewis.de>
References: <20070523011101.85F0.JCARLSON@uci.edu>
	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070523082241.85F3.JCARLSON@uci.edu> <46551ED6.5070900@v.loewis.de>
	<46554CB8.7050209@canterbury.ac.nz> <4655DB53.80200@v.loewis.de>
Message-ID: <Pine.LNX.4.58.0705241605100.8399@server1.LFW.org>

On Thu, 24 May 2007, "Martin v. Löwis" wrote:
> > That's not true. In the Unicode BMP fallback font, you can easily
> >> differentiate all Unicode characters (in the BMP):
> >>
> >> http://scripts.sil.org/UnicodeBMPFallbackFont
> >
> > Er... somehow I don't think that's what Martin had in mind
> > when he used the word "font" in that context. :-)
>
> That might well be - however, I think that is because of an
> unclear problem statement. From the discussion, I gathered
> that the perceived problem is this:
>
> "Somebody maliciously sends me a patch, and I want to be
> able to tell visually that it's wrong."

The BMP fallback font isn't a meaningful answer to that problem
unless most people get in the habit of doing code reviews using
that font.  Most Python programmers, who probably won't be aware
of this issue because it doesn't come up in their day-to-day use,
are unlikely to do that.


-- ?!ng

From python at zesty.ca  Thu May 24 23:35:47 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Thu, 24 May 2007 16:35:47 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <4655DD4E.3050809@v.loewis.de>
References: <20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070524082737.862E.JCARLSON@uci.edu> <4655DD4E.3050809@v.loewis.de>
Message-ID: <Pine.LNX.4.58.0705241620150.8399@server1.LFW.org>

On Thu, 24 May 2007, "Martin v. Löwis" wrote:
> > Much of my concerns could be addressed through the use of commandline,
> > environment variable, or in-source code definitions of what are
> > allowable identifier characters.
[...]
> Not sure (anymore) what problem you are trying to solve, but it might be
> that the coding directive already *is* the solution. If you want to
> constrain characters that you can use in a single source file, adding
> a coding directive will automatically impose such a constraint (namely,
> to the characters available in the encoding).
>
> In particular, if you set the encoding to us-ascii, you have restricted
> your source file to ASCII only.

Alas, the coding directive is not good enough.  Have a look at this:

    http://zesty.ca/python/tricky.png

That's an image of a text editor containing some Python code.  Can you
tell whether running it (post-PEP-3131) will delete your .bashrc file?


-- ?!ng

From python at zesty.ca  Thu May 24 23:44:02 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Thu, 24 May 2007 16:44:02 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20070523011101.85F0.JCARLSON@uci.edu>
	<87iraj6bn6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <Pine.LNX.4.58.0705241611220.8399@server1.LFW.org>

On Thu, 24 May 2007, Stephen J. Turnbull wrote:
> I'll tell you why Ka-Ping's argument is a strawman.  First, one only
> *needs* to be able to distinguish those characters that one can read.
> It's nice to be able to admire the rest, of course, but you don't need
> to see them as a speaker of that language would.  You just use a font
> you like for the characters you can read, and the rest can be any old
> dog.

The problem is that you don't know *when* you'll need to distinguish
those characters.

Situations where things are not obviously incorrect, but only subtly
incorrect, are a common source of practical problems.  Choosing the
full set of Unicode identifier characters as the identifier character
set for everyone puts nearly all Python users in that situation.

That's what the issue is here: defining correct practice to be
something sufficiently difficult that almost everyone's regular
practices are subtly wrong in ways they don't fully understand.
That's a recipe for bugs, vulnerabilities, confusion, etc.

The loadable table that you proposed, and Jim proposed, really sounds
like the best way to go here.  Those that are ready and able to handle
the added complexity can voluntarily adopt it, and those who don't (or
don't even know about it) won't have to deal with it.


-- ?!ng

From foom at fuhm.net  Thu May 24 23:47:45 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 24 May 2007 17:47:45 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
Message-ID: <781A2C3C-011E-4048-A72A-BE631C0C5127@fuhm.net>

On May 24, 2007, at 5:04 PM, Ka-Ping Yee wrote:
>> (1)  By default, python allows only ASCII.
>> (2)  Additional characters are permitted if they appear in a table
>> named on the command line.
>
> +1!  This is a fine solution.  It is better than the "python -U"
> option I proposed -- it has all the advantages of that proposal, plus:
>
>     - The identifier character set won't spontaneously change when
>       one upgrades to a new version of Python, even for users of
>       non-ASCII identifiers.

FUD. It already won't; Unicode explicitly makes that promise: they can
add characters, but not remove them.

>     - Having to specify the table of acceptable characters
>       demonstrates at least some knowledge of the character set
>       one is using.

This is a negative. Why should I have to show knowledge of the  
character set I'm using to type the characters?

>     - It provides the flexibility for different communities to
>       to adopt identifier conventions that suit their preferred
>       tradeoff of risk vs. expressiveness.

Also a negative. Now, if I want to run the modules from multiple  
communities I need to figure out how to merge the tables they have to  
separately distribute with their modules.

> Jim's proposal appears to be the best path to making everyone happy.

Nope. It does nobody any good. It may make people who fear non-ascii  
code happy, but only because it totally castrates this feature for  
people who do want to use non-ascii identifiers.

It really seems to me people are spewing a lot of FUD here. Rejecting  
certain characters when loading a file is simply not necessary.

Either:

a) you trust that the author of the file has authored it correctly,  
in which case it doesn't matter one bit what character set they used.  
Restricting the charset at import time is just something to get in  
your way with no actual value.

b) you don't trust the code, and want to inspect it.

Okay, in this case you actually have to inspect the *code* --  
checking the character set is an utterly useless thing to do by  
itself. It tells you nothing useful.

While checking the code, you may want to have strange characters  
outside your comfort range flagged for you. Either grep or editor  
support is a simple enough solution for this. Or, let's say your  
editor is unable to highlight suspicious characters, and you want to  
find identifiers with strange characters, and not get tripped up on  
comments. Fine, make a tool that uses the compiler.parser module to  
iterate over identifiers in the source code.
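
A rough sketch of such a tool, using the tokenize module here rather
than compiler.parser (the reporting format is just my assumption):

    import sys, tokenize

    def flag_nonascii_names(path):
        # Report every identifier (NAME) token containing a non-ASCII character.
        with open(path, "rb") as f:
            for tok in tokenize.tokenize(f.readline):
                if tok.type == tokenize.NAME and any(ord(c) > 127 for c in tok.string):
                    row, col = tok.start
                    print("%s:%d:%d: non-ASCII identifier %r"
                          % (path, row, col, tok.string))

    if __name__ == "__main__":
        for filename in sys.argv[1:]:
            flag_nonascii_names(filename)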

Adding baroque command line options for users of other languages to  
do some useless verification at import time is not an acceptable  
answer. It'd be better to just reject the PEP entirely.

James

From foom at fuhm.net  Thu May 24 23:50:50 2007
From: foom at fuhm.net (James Y Knight)
Date: Thu, 24 May 2007 17:50:50 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
Message-ID: <F672551C-1DC8-421C-91D6-66D09C2A1A1F@fuhm.net>


On May 24, 2007, at 2:14 PM, Jim Jewett wrote:

> The Tech Reports seem to suggest NFKD -- and that makes a certain
> amount of sense.  Using compatibility characters reduces the problem
> with equivalent characters that are distinct only for historical
> reasons.  Using decomposed characters  simplifies processing.

Please read again:

"Generally if the programming language has case-sensitive  
identifiers, then Normalization Form C is appropriate; whereas, if  
the programming language has case-insensitive identifiers, then  
Normalization Form KC is more appropriate."
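
A concrete way to see the difference with the stdlib unicodedata
module (U+FB01 is the "fi" ligature, a compatibility character):

    import unicodedata

    s = "\ufb01le"                            # LATIN SMALL LIGATURE FI + "le"
    print(unicodedata.normalize("NFC", s))    # ligature survives: C is canonical-only
    print(unicodedata.normalize("NFKC", s))   # prints "file": KC also folds compatibility chars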

James

From martin at v.loewis.de  Fri May 25 00:33:01 2007
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 May 2007 00:33:01 +0200
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <Pine.LNX.4.58.0705241620150.8399@server1.LFW.org>
References: <20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070524082737.862E.JCARLSON@uci.edu>
	<4655DD4E.3050809@v.loewis.de>
	<Pine.LNX.4.58.0705241620150.8399@server1.LFW.org>
Message-ID: <4656129D.5000406@v.loewis.de>

> Alas, the coding directive is not good enough.  Have a look at this:
> 
>     http://zesty.ca/python/tricky.png
> 
> That's an image of a text editor containing some Python code.  Can you
> tell whether running it (post-PEP-3131) will delete your .bashrc file?

I would think that it doesn't (i.e. allowed should stay at 0).

Why does os.remove get invoked?

Regards,
Martin

From martin at v.loewis.de  Fri May 25 00:46:33 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 May 2007 00:46:33 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<464FFD04.90602@v.loewis.de>	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>	<46521CD7.9030004@v.loewis.de>	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>	<46527904.1000202@v.loewis.de>	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
Message-ID: <465615C9.4080505@v.loewis.de>

Ka-Ping Yee schrieb:
> On Thu, 24 May 2007, Jim Jewett wrote:
>> So how about
>>
>> (1)  By default, python allows only ASCII.
>> (2)  Additional characters are permitted if they appear in a table
>> named on the command line.
>>
>> These additional characters should be restricted to code points larger
>> than ASCII (so you can't easily turn "!" into an ID char), but beyond
>> that, anything goes.  If you want to include punctuation or undefined
>> characters, so be it.
> 
> +1!  This is a fine solution.  It is better than the "python -U"
> option I proposed 

-2. Any solution found must also accommodate users who are unaware
of the security issue, and just want to use their native language
for identifiers. So requiring them to change their environment or
pass additional command line parameters is unacceptable.

> Jim's proposal appears to be the best path to making everyone happy.

Please *do* consider the needs of the people who want to actively
use the feature as well. Otherwise, you have no chance of understanding
what will make everyone happy.

Regards,
Martin

From mike.klaas at gmail.com  Fri May 25 01:03:30 2007
From: mike.klaas at gmail.com (Mike Klaas)
Date: Thu, 24 May 2007 16:03:30 -0700
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <4656129D.5000406@v.loewis.de>
References: <20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070524082737.862E.JCARLSON@uci.edu>
	<4655DD4E.3050809@v.loewis.de>
	<Pine.LNX.4.58.0705241620150.8399@server1.LFW.org>
	<4656129D.5000406@v.loewis.de>
Message-ID: <A2620FD5-C79D-4B8D-B824-AFB59A56735F@gmail.com>

On 24-May-07, at 3:33 PM, Martin v. Löwis wrote:

>> Alas, the coding directive is not good enough.  Have a look at this:
>>
>>     http://zesty.ca/python/tricky.png
>>
>> That's an image of a text editor containing some Python code.  Can  
>> you
>> tell whether running it (post-PEP-3131) will delete your .bashrc  
>> file?
>
> I would think that it doesn't (i.e. allowed should stay at 0).
>
> Why does os.remove get invoked?

Perhaps a letter in the encoding declaration is non-ascii, nullifying  
the encoding enforcement and allowing a cyrillic 'a' in  allowed = 0?
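
A hypothetical sketch of that kind of trap (not the actual contents of
tricky.png): the second assignment below uses U+0430 CYRILLIC SMALL
LETTER A, which renders like ASCII 'a' but would be a distinct
identifier character under PEP 3131.

    import os

    allowed = 0     # spelled with ASCII 'a'
    аllowed = 1     # spelled with CYRILLIC SMALL LETTER A: a different variable

    if аllowed:     # looks up the Cyrillic-spelled name, which is 1
        os.remove(os.path.expanduser("~/.bashrc"))   # so this line runs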

-Mike



From python at zesty.ca  Fri May 25 01:06:16 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Thu, 24 May 2007 18:06:16 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <465615C9.4080505@v.loewis.de>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
	<465615C9.4080505@v.loewis.de>
Message-ID: <Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>

On Fri, 25 May 2007, "Martin v. Löwis" wrote:
> Please *do* consider the needs of the people who want to actively
> use the feature as well. Otherwise, you have no chance of understanding
> what will make everyone happy.

People who want to use the feature can turn it on.  I don't see what's
so unreasonable about that.


-- ?!ng

From jimjjewett at gmail.com  Fri May 25 01:12:27 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 24 May 2007 19:12:27 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <465615C9.4080505@v.loewis.de>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
	<465615C9.4080505@v.loewis.de>
Message-ID: <fb6fbf560705241612o38fad58ascdbfd597d483da77@mail.gmail.com>

On 5/24/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Ka-Ping Yee schrieb:
> > On Thu, 24 May 2007, Jim Jewett wrote:
> >> So how about

> >> (1)  By default, python allows only ASCII.
> >> (2)  Additional characters are permitted if they appear in a table
> >> named on the command line.

> >> These additional characters should be restricted to code
> >> points larger than ASCII (so you can't easily turn "!" into
> >> an ID char), but beyond that, anything goes.  If you want to
> >> include punctuation or undefined characters, so be it.

> > +1!  This is a fine solution.  It is better than the "python -U"
> > option I proposed

> -2. Any solution found must also accommodate users which are
> unaware of the security issue, and just want to use their native
> language for identifiers. So requiring them to change their
> environment or pass additional command line parameters is
> unacceptable.

There is no hope of explaining security; therefore, the defaults
should be relatively safe.  If the default is "anything goes", that
isn't safe.  If the default is "ASCII", that is safe, but possibly
inconvenient.  It depends on how hard it is to make the switch.

Is your concern just that it should be possible to do once (perhaps at
install), rather than on each run?  That would probably be OK too, so
long as the default install was ASCII-only, so that *someone* had to
make a decision about what to allow.

I assume that large communities will standardize on a tailored table,
but a first-pass slightly-too-inclusive table is easy enough to
create.

Here are the Thaana lines from (unicode consortium file) Scripts.txt

0780..07A5    ; Thaana # Lo  [38] THAANA LETTER HAA..THAANA LETTER WAAVU
07A6..07B0    ; Thaana # Mn  [11] THAANA ABAFILI..THAANA SUKUN
07B1          ; Thaana # Lo       THAANA LETTER NAA

Though if it were me, I would probably simplify that to

0780..07B1    ; Thaana

Similarly, Devanagari has 15 lines in Scripts.txt, but you could simplify it to

0901..0939    ; Devanagari
093C..094D   ; Devanagari
0950..0954    ; Devanagari
0958..0963    ; Devanagari
0966..096F    ; Devanagari
097B..097F    ; Devanagari

or even
0901..097F    ; Devanagari and some undefined characters

if you (as a Devanagari speaker) were confident that none of your
characters would be confused with ASCII.  (In practice, you might well
want to exclude the Devanagari digits because they look too similar to
ASCII digits with different values, but ... that is a judgment call for
Devanagari speakers to make, so long as they make it explicitly.)
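
Consuming a table in that format would also be cheap.  A rough sketch
(the function names and the always-allow-ASCII rule below are my own
assumptions, not part of any proposal):

    def load_allowed_codepoints(path):
        # Parse lines like "0780..07B1 ; Thaana" or "07B1 ; Thaana" into a
        # set of allowed code points; everything after '#' is a comment.
        allowed = set()
        for line in open(path):
            line = line.split("#", 1)[0].strip()
            if not line:
                continue
            codepoints = line.split(";", 1)[0].strip()
            if ".." in codepoints:
                start, end = codepoints.split("..")
                allowed.update(range(int(start, 16), int(end, 16) + 1))
            else:
                allowed.add(int(codepoints, 16))
        return allowed

    def identifier_ok(name, allowed):
        # ASCII stays legal unconditionally; anything above must be listed.
        return all(ord(ch) < 128 or ord(ch) in allowed for ch in name)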

-jJ

From showell30 at yahoo.com  Fri May 25 01:49:49 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Thu, 24 May 2007 16:49:49 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4655DE74.4090708@v.loewis.de>
Message-ID: <83075.64514.qm@web33514.mail.mud.yahoo.com>


--- "Martin v. L?wis" <martin at v.loewis.de> wrote:

> > FWIW the Ruby interpreter (1.8.5) seems to require
> > this flag to allow you to turn on the Japanese
> code
> > set.
> > 
> >   -Kkcode  specifies KANJI (Japanese) code-set
> > 
> > I have no idea whether or not this cripples the
> > feature in Ruby, and perhaps it's an
> apples/oranges
> > comparison.
> 
> If you don't have source encoding declarations (like
> the one
> in PEP 263), you must have some means of setting the
> source
> encoding; this is what -Kkcode does (similar to
> javac's
> -encoding command line option).
> 
> This approach has several flaws, e.g. you can only
> specify
> a single encoding, which breaks if you have modules
> in
> different encodings.
> 
> In any case, it's different from the suggested -UU
> option:
> Python already knows what the source encoding is,
> -UU
> would not change that. Instead, that option would
> merely
> serve to constrain the source code (if it's not
> being
> passed).
> 

Ok, I think it's pretty clear that this is an
apples/oranges comparison, and there are lots of
differences between Ruby's implementation and PEP 3131
that muddy the waters.

Still, the reason I brought it up is valid, I
think.  

Ruby is a language that presumably has a lot of
Japanese users, and it appears to me (I'm not a Ruby
person, so I admit this is speculation) that Japanese
users have to explicitly choose to use Japanese
encoding to run source files encoded in Japanese.

Setting aside all the limitations of Ruby, wouldn't
the fact that non-latin-writing Japanese Ruby users
live with the command line restriction in Ruby suggest
that they'd be just as willing to live with command
line burdens in Python, if they decided to switch to
Python?

To your point about Py3k being more flexible, couldn't
you imagine a scenario where a Japanese programmer
gets fed up with Ruby's all-or-nothing capability with
respect to Kanji, and switches over to Python, and
changes his little wrapper shell script to say "python
-U" instead of "ruby -Kkcode"?  He could then start to
use non-Japanese Python modules while still writing
his own Python code in Japanese.

From guido at python.org  Fri May 25 02:08:53 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 May 2007 17:08:53 -0700
Subject: [Python-3000] Accepting PEP 3119, rejecting PEP 3133
Message-ID: <ca471dc20705241708m72d3f591kbce16beef4123d43@mail.gmail.com>

I'm accepting PEP 3119 (Abstract Base Classes). The latest round of
feedback has been sufficiently friendly that I am confident that it
will be a welcome addition. There are some loose ends in the PEP which
I will resolve while implementing it.

This means I'm also rejecting the main competing proposal, PEP 3133 (Roles).

I am hopeful that PEP 3124 (Generic Functions) will be updated; since
it works so well with ABCs I expect to accept it, in some form; but
I'm still waiting for the rewrite that Phillip proposed.

I am also expecting to accept PEP 3141 (numeric ABCs). The most
serious current objection to that one is that the concrete
implementations it provides may not be useful enough to warrant their
complexity; maybe I'll just take those out. I'll be pondering this
after implementing PEP 3119.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri May 25 02:10:12 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 May 2007 17:10:12 -0700
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
Message-ID: <ca471dc20705241710i50e50992m6405ed411e02aaac@mail.gmail.com>

On 5/18/07, Guido van Rossum <guido at python.org> wrote:
> While reviewing PEPs, I stumbled over PEP 335 ( Overloadable Boolean
> Operators) by Greg Ewing. I am of two minds of this -- on the one
> hand, it's been a long time without any working code or anything. OTOH
> it might be quite useful to e.g. numpy folks.
>
> It is time to reject it due to lack of interest, or revive it!

Last call for discussion! I'm tempted to reject this -- the ability to
generate optimized code based on the shortcut semantics of and/or is
pretty important to me.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From showell30 at yahoo.com  Fri May 25 02:15:38 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Thu, 24 May 2007 17:15:38 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <465615C9.4080505@v.loewis.de>
Message-ID: <320102.38046.qm@web33515.mail.mud.yahoo.com>


--- "Martin v. L?wis" <martin at v.loewis.de> wrote:
> 
> -2. Any solution found must also accommodate users
> which are unaware
> of the security issue, and just want to use their
> native language
> for identifiers. So requiring them to change their
> environment or
> pass additional command line parameters is
> unacceptable.

Let me say first that I'm 100% behind PEP 3131, and
that I agree with you that many of the objections
to the PEP are kind of FUDish.

Still, I have a hard time accepting your premise that
even ordinary, non-security-aware programmers are so
deterred by changing their environment.  

In almost every programming situation I've been in,
I've had to deal with environmental issues, even
though my character set of choice has never been the
primary issue.

When I programmed in C, I had to learn my way around
makefiles, figure out LD_LIBRARY_PATH, etc.

When I programmed in Perl, I had to change my shebangs
when I moved from one Unix box to another, due to the
way sys admins installed Perl.

When I programmed in Java, I had to learn how .jar
files worked.

Now that I program in Python, I still have to fuss
with PYTHONPATH and LD_LIBRARY_PATH (we use C
extensions) when I go between version 20 (installed in
the field), version 21 (installed in the test box),
and shDevBranch (code I'm working on now).

Also in Python, the concept of a wrapper shell script
is just part of a programmer's life.  I have a program
that needs to run as user "operator," and I can't
sudo-enable Python itself (big security hole), so I
write a one-line sudo script that just calls safe.py,
and I sudo enable it.

 
> 
> Please *do* consider the needs of the people who
> want to actively
> use the feature as well. Otherwise, you have no
> chance of understanding
> what will make everyone happy.
> 

I think there are things that can be done here, even
if we make Python's default mode to be ascii-pure. 
Regional distros can set the environment
appropriately.  Python error messages about non-ascii
characters can suggest how to enable the -U flag.  The
Tokyo Python User's Group can educate programmers,
etc.

From gproux+py3000 at gmail.com  Fri May 25 03:05:01 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Fri, 25 May 2007 10:05:01 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <320102.38046.qm@web33515.mail.mud.yahoo.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
Message-ID: <19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>

Hello,

There have been many proposals for flags floating around.
I don't even understand anymore which -U you are talking about now.

But let me add my own proposal for a flag. (just to confuse everybody
else a little more)

It is my understanding that the only remaining objection to unicode
in identifiers is the claimed security issues. The most important
application of unicode in identifiers, in my view, would be to bring
computer control back to children (like in the OLPC project).
(Notwithstanding the fact that AFAIK we have yet to hear about big
security issues in the Java/C# world that were caused by the ability
to use unicode chars.)

So probably a good flag for security-minded people would be to have,
like gcc, a "pedantic" flag:

    python -pedantic no_scary_chars_here.py

Regarding the notion that you should be able to give a single accepted
charset, the problem arises that restricting charsets on a global
scope (from a global command line flag or a site.py file) will prevent
me, for example, from freely mixing English, French, Greek and Japanese
in the same large project and/or dynamically calling on any .py with a
different charset.

I also think one of the great aspects of python is the ability to
be embedded simply in other C/C++/etc. projects, and as such we need
to give interpreter-embedders the ability to execute any script the
user presents them without restricting it to any specific charset.

The additional burden that ascii loving people would like to impose on
the rest of the world through the usage of command line switches is
unwanted IMHO.

I would think that a better way to help everybody would be to:
1) have a default of not restricting identifier charsets, but...
2) enable various people (or security-minded distributions) to have a
customized site.py or $HOME file that would spit warnings or raise
exceptions when opening up files that have identifiers that are not
pure ascii. Notice that having to verify that EACH and EVERY
identifier can be expressed in a specific charset is going to be an
expensive runtime cost.

A good middle ground would be to have the main python distribution
come out with the site.py spitting warnings (and giving a quick
explanation of why the warning appears and how to disable it for
yourself (not globally) if you are REALLY REALLY sure).

It would be very interesting to let the first-time "interactive" user
disable the warning for *this* user for good from a simple prompt.

Regards,

Guillaume

From jimjjewett at gmail.com  Fri May 25 03:37:47 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 24 May 2007 21:37:47 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
Message-ID: <fb6fbf560705241837m5b03cb0pc27a0bf78c5aa4ff@mail.gmail.com>

On 5/24/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:

> It is my understanding that the only remaining objection for unicode
> in identifier is for claimed security issues.

It isn't strictly security; when I've been burned by a cut-and-paste
that turned out to contain an unexpected character, it didn't cause
damage, but it did take me a long time to debug.

> Regarding the  the notion you should be able to give a single
> accepted charset, the problem arises that restricting charsets on a
> global scope (from a global command line flag or a site.py file) will
> prevent me for example to freely mix English, French, Greek and
> Japanese in the same large project

For most people, the appearance of a Greek or Japanese (let alone
both) character would be more likely to indicate a typo.  If you know
that your project is using both languages, then just allow both; the
point is that you have made an explicit decision to do so.

-jJ

From showell30 at yahoo.com  Fri May 25 04:01:07 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Thu, 24 May 2007 19:01:07 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
Message-ID: <250467.66423.qm@web33502.mail.mud.yahoo.com>


--- Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> 
> The additional burden that ascii loving people would
> like to impose on
> the rest of the world through the usage of command
> line switches is
> unwanted IMHO.
> 

I think now that PEP 3131 has been accepted, you can
coarsely frame the remaining conflict as between ascii
lovers and non-ascii lovers, and the dispute is over
who has to muck with their command line/environment to
get Python to reflect their bias.

Obviously, in any conflict, there are solutions that
mostly satisfy both parties.  

If Python 3.0 leaned too much toward appeasing
non-ascii lovers, you could still devise plenty of
workarounds that made ascii lovers not suffer too
immensely.  Ascii lovers could revisit their security
philosophies, by paying more scrunity to who actually
supplies patches, etc.  Ascii lovers could upgrade
their editors, run more unit tests, etc.  Ascii lovers
could build tools from tokenize.py, etc., that
facilitated the porting of non-English or non-Latin
code to English/lation.

If Python 3.0 leaned too much toward appeasing ascii
lovers, you could still devise plenty of workarounds
that made non-ascii lovers not suffer too immensely. 
You could make error messages more helpful, you could
have regional distros supply useful aliases, you could
have users groups educated newbies, etc.

If Python 3.0 judged the middle ground wrong, you
could adjust in Python 3.1.

Of course, there's a lot of gray area when you put
people on the spectrum.  As an example, take me--I'm
mostly an ascii lover, but I'm sympathetic to
non-ascii concerns.  My first language is English, but
I speak a bit of French, have written applications for
Spanish users, and have collaborated with people who
internationalized my software for languages that I'm
almost completely unfamiliar with (Dutch, Catalan,
etc.)

Regarding the "command line," this ascii mostly-lover
doesn't necessarily want to impose command line
restrictions on anybody.  I'd much rather impose
"environment" restrictions on ALL Python users.

Here's my reasoning:

  1) It's fair.  Even as an ascii lover and
beneficiary, I have to deal with environment variables
nearly as much as non-ascii lovers (PYTHONPATH,
LD_LIBRARY_PATH, ORA_HOME, etc.)

  2) It's really all about the environment.  There's a
difference between running Python in an enterprisy
environment, an OLPC environment, a
Japanese-person-trying-to-ween-himself-off-Ruby
environment, etc.

  3) It's often free.  I suspect most non-ASCII users
already have environment settings that suggest their
willingness to tolerate lack of ASCII purity.  Couldn't
Python sniff those out?

From gproux+py3000 at gmail.com  Fri May 25 04:01:56 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Fri, 25 May 2007 11:01:56 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705241837m5b03cb0pc27a0bf78c5aa4ff@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<fb6fbf560705241837m5b03cb0pc27a0bf78c5aa4ff@mail.gmail.com>
Message-ID: <19dd68ba0705241901h23468237md8e81aaa65f9b7a6@mail.gmail.com>

Hi Jim,
On 5/25/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> It isn't strictly security; when I've been burned by cut-and-paste
> that turned out to be an unexpected character, it didn't cause damage,
> but it did take me a long time to debug.

Can you give a longer explanation? I don't understand what the issue
is. Is it like the issue with confusing 0 and O? You seemingly already
have experience with using something that is now not legal in Python.
Was it in the Java or .NET world?

> For most people, the appearance of a Greek or Japanese (let alone
> both) character would be more likely to indicate a typo.  If you know
> that your project is using both languages, then just allow both; the
> point is that you have made an explicit decision to do so.

You are missing one of my main points, but it is maybe not a very
strong point (the earlier email was maybe throwing out too many ideas
at a time... I guess Japanese sake lasts longer in the mouth :) )

* Python is dynamic (you can have a e.g. pygtk user interface which
enables you to load at runtime a new .py file even to use a text view
to type in a mini-script that will do something specific in your
application domain): you never know what will get loaded next
* Python is embeddable: and often it is to bring the power of python
to less sophisticated users. You can imagine having a global system
deployed all around the world by a global company enabling each user
in each subsidiary to create their own extension scripts.
* There is a runtime cost for checking: the speed vs. security
tradeoff (for a security benefit that is still very much hypothetical
in the face of the experience of Java and .NET people) should be borne
by the paranoid people (who are ALREADY accustomed to losing CPU
cycles to RSBAC security systems).
* In real life, you won't see many python programs that are not
written in your script. If you are really paranoid about evil chars
taking over your python src dir, though, a -pedantic option as pointed
out earlier should take care of all your worries.

cheers,

G

From greg.ewing at canterbury.ac.nz  Fri May 25 04:05:35 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 25 May 2007 14:05:35 +1200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <ca471dc20705241710i50e50992m6405ed411e02aaac@mail.gmail.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<ca471dc20705241710i50e50992m6405ed411e02aaac@mail.gmail.com>
Message-ID: <4656446F.8030802@canterbury.ac.nz>

Guido van Rossum wrote:

> Last call for discussion! I'm tempted to reject this -- the ability to
> generate optimized code based on the shortcut semantics of and/or is
> pretty important to me.

Please don't be hasty. I've had to think about this issue
a bit.

The conclusion I've come to is that there may be a small loss
in the theoretical amount of optimization opportunity available,
but not much. Furthermore, if you take into account some other
improvements that can be made (which I'll explain below) the
result is actually *better* than what 2.5 currently generates.

For example, Python 2.5 currently compiles

   if a and b:
     <stats>

into

     <evaluate a>
     JUMP_IF_FALSE L1
     POP_TOP
     <evaluate b>
     JUMP_IF_FALSE L1
     POP_TOP
     <stats>
     JUMP_FORWARD L2
   L1:
      POP_TOP
   L2:

Under my PEP, without any other changes, this would become

     <evaluate a>
     LOGICAL_AND_1 L1
     <evaluate b>
     LOGICAL_AND_2
   L1:
     JUMP_IF_FALSE L2
     POP_TOP
     <stats>
     JUMP_FORWARD L3
   L2:
      POP_TOP
   L3:

The fastest path through this involves executing one extra
bytecode. However, since we're not using JUMP_IF_FALSE to
do the short-circuiting any more, there's no need for it
to leave its operand on the stack. So let's redefine it and
change its name to POP_JUMP_IF_FALSE. This allows us to
get rid of all the POP_TOPs, plus the jump at the end of
the statement body. Now we have

     <evaluate a>
     LOGICAL_AND_1 L1
     <evaluate b>
     LOGICAL_AND_2
   L1:
     POP_JUMP_IF_FALSE L2
     <stats>
   L2:

The fastest path through this executes one *less* bytecode
than in the current 2.5-generated code. Also, any path that
ends up executing the body benefits from the lack of a
jump at the end.

The same benefits also result when the boolean expression is
more complex, e.g.

   if a or b and c:
     <stats>

becomes

     <evaluate a>
     LOGICAL_OR_1 L1
     <evaluate b>
     LOGICAL_AND_1 L2
     <evaluate c>
     LOGICAL_AND_2
   L2:
     LOGICAL_OR_2
   L1:
     POP_JUMP_IF_FALSE L3
     <stats>
   L3:

which contains 3 fewer instructions overall than the
corresponding 2.5-generated code.

So I contend that optimization is not an argument for
rejecting this PEP, and may even be one for accepting
it.
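
(The 2.5 sequence above is easy to reproduce with the dis module; the
LOGICAL_* opcodes are of course hypothetical until the PEP is
implemented.)

    import dis

    def f(a, b):
        if a and b:
            pass

    # On Python 2.5 this prints the JUMP_IF_FALSE / POP_TOP pattern shown
    # above; later interpreters already emit pop-style jump opcodes.
    dis.dis(f)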

--
Greg

From gproux+py3000 at gmail.com  Fri May 25 04:13:01 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Fri, 25 May 2007 11:13:01 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <250467.66423.qm@web33502.mail.mud.yahoo.com>
References: <19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<250467.66423.qm@web33502.mail.mud.yahoo.com>
Message-ID: <19dd68ba0705241913j1d2f60e1ndc89e05bfd926c52@mail.gmail.com>

Hello,

On 5/25/07, Steve Howell <showell30 at yahoo.com> wrote:
> willingness to tolerate lack of ASCII purity.  Coudn't
> Python sniff those out?

On my Linux machine, my encoding is set to UTF8 (and I am sure that
most monolingual Ubuntu users have the same settings). On my Windows
PC, Unicode is the rule of the world.

I have a hard time seeing how you could sniff out the willingness to
accept, in a Japanese environment, a piece of code written in Russian:
say your buddy from Siberia has written this cool matrix class that is
30% faster than most, but it contains a bunch of cyrillic characters
because people used cyrillic for their local variable identifiers (but
not module-level identifiers).

I think the beauty of a world that has moved from everybody sitting on
their own little codepage island to a global UTF8-based world (on
Linux) and UTF16 world (on Windows) is that now all scripts are more or
less equal citizens: nobody benefits more than anyone else, and nobody
has to make more effort than the others to access their own language,
or other people's.

Doing a kind of language segregation would prevent getting more people
working together and exchanging code and ideas, while opening up
culturally to other horizons and cultures.

(and no, I am not smoking illegal substances in front of my keyboard)

Guillaume

From python at zesty.ca  Fri May 25 04:40:03 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Thu, 24 May 2007 21:40:03 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>

Guillaume Proux wrote:
> It is my understanding that the only remaining objection for unicode
> in identifier is for claimed security issues.

You're missing much of the debate.  Please read this message:

    http://mail.python.org/pipermail/python-3000/2007-May/007855.html

Steve Howell wrote:
> I think now that PEP 3131 has been accepted, you can coarsely frame
> the remaining conflict as between ascii lovers and non-ascii lovers

To pit this as "ascii lovers vs. non-ascii lovers" is a pretty large
oversimplification.  You could name them "people who want to be able
to know what the code says" and "people who don't mind not being able
to know what the code says".  Or you could name them "people who want
Python's lexical syntax to be something they fully understand" and
"people who don't mind the extra complexity".  Or "people who don't
want Python's lexical syntax to be tied to a changing external
standard" and "people who don't mind the extra variability."

However you characterize them, keep in mind that those in the former
group are asking for default behaviour that 100% of Python users
already use and understand.  There's no cost to keeping identifiers
ASCII-only because that's what Python already does.

I think that's a pretty strong reason for making the new, more complex
behaviour optional.


-- ?!ng

From showell30 at yahoo.com  Fri May 25 04:46:42 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Thu, 24 May 2007 19:46:42 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
Message-ID: <456503.41918.qm@web33505.mail.mud.yahoo.com>


--- Ka-Ping Yee <python at zesty.ca> wrote:

> Steve Howell wrote:
> > I think now that PEP 3131 has been accepted, you
> can coarsely frame
> > the remaining conflict as between ascii lovers and
> non-ascii lovers
> 
> To pit this as "ascii lovers vs. non-ascii lovers"
> is a pretty large
> oversimplification.  You could name them "people who
> want to be able
> to know what the code says" and "people who don't
> mind not being able
> to know what the code says".  Or you could name them
> "people who want
> Python's lexical syntax to be something they fully
> understand" and
> "people who don't mind the extra complexity".  Or
> "people who don't
> want Python's lexical syntax to be tied to a
> changing external
> standard" and "people who don't mind the extra
> variability."
> 

Agreed.

> However you characterize them, keep in mind that
> those in the former
> group are asking for default behaviour that 100% of
> Python users
> already use and understand.  There's no cost to
> keeping identifiers
> ASCII-only because that's what Python already does.
> 

Agreed.

> I think that's a pretty strong reason for making the
> new, more complex
> behaviour optional.
> 

Agreed also.  Just to be clear, I am 100% in the camp
of people who want non-ascii behavior to be an
explicit choice, at least for 3.0.  EIBTI.

But I also think we want to be as creative as possible
for enabling and encouraging non-ascii functionality. 
I think that's where this thread should start
focusing.

I also share Guillaume's optimistic viewpoint about a
Python world with no cultural boundaries, etc. (sorry
if that's a bad paraphrase).

From guido at python.org  Fri May 25 04:53:40 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 May 2007 19:53:40 -0700
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <4656446F.8030802@canterbury.ac.nz>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<ca471dc20705241710i50e50992m6405ed411e02aaac@mail.gmail.com>
	<4656446F.8030802@canterbury.ac.nz>
Message-ID: <ca471dc20705241953r5f7dbdb3x8a93b213a142f62a@mail.gmail.com>

On 5/24/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>
> > Last call for discussion! I'm tempted to reject this -- the ability to
> > generate optimized code based on the shortcut semantics of and/or is
> > pretty important to me.
>
> Please don't be hasty. I've had to think about this issue
> a bit.
>
> The conclusion I've come to is that there may be a small loss
> in the theoretical amount of optimization opportunity available,
> but not much. Furthermore, if you take into account some other
> improvements that can be made (which I'll explain below) the
> result is actually *better* than what 2.5 currently generates.
>
> For example, Python 2.5 currently compiles
>
>    if a and b:
>      <stats>
>
> into
>
>      <evaluate a>
>      JUMP_IF_FALSE L1
>      POP_TOP
>      <evaluate b>
>      JUMP_IF_FALSE L1
>      POP_TOP
>      <stats>
>      JUMP_FORWARD L2
>    L1:
>      POP_TOP
>    L2:
>
> Under my PEP, without any other changes, this would become
>
>      <evaluate a>
>      LOGICAL_AND_1 L1
>      <evaluate b>
>      LOGICAL_AND_2
>    L1:
>      JUMP_IF_FALSE L2
>      POP_TOP
>      <stats>
>      JUMP_FORWARD L3
>    L2:
>      POP_TOP
>    L3:
>
> The fastest path through this involves executing one extra
> bytecode. However, since we're not using JUMP_IF_FALSE to
> do the short-circuiting any more, there's no need for it
> to leave its operand on the stack. So let's redefine it and
> change its name to POP_JUMP_IF_FALSE. This allows us to
> get rid of all the POP_TOPs, plus the jump at the end of
> the statement body. Now we have
>
>      <evaluate a>
>      LOGICAL_AND_1 L1
>      <evaluate b>
>      LOGICAL_AND_2
>    L1:
>      POP_JUMP_IF_FALSE L2
>      <stats>
>    L2:
>
> The fastest path through this executes one *less* bytecode
> than in the current 2.5-generated code. Also, any path that
> ends up executing the body benefits from the lack of a
> jump at the end.
>
> The same benefits also result when the boolean expression is
> more complex, e.g.
>
>    if a or b and c:
>      <stats>
>
> becomes
>
>      <evaluate a>
>      LOGICAL_OR_1 L1
>      <evaluate b>
>      LOGICAL_AND_1 L2
>      <evaluate c>
>      LOGICAL_AND_2
>    L2:
>      LOGICAL_OR_2
>    L1:
>      POP_JUMP_IF_FALSE L3
>      <stats>
>    L3:
>
> which contains 3 fewer instructions overall than the
> corresponding 2.5-generated code.
>
> So I contend that optimization is not an argument for
> rejecting this PEP, and may even be one for accepting
> it.

Do you have an implementation available to measure this? In most cases
the cost is not in the number of bytecode instructions executed but in
the total amount of work. Two cheap bytecodes might well be cheaper
than one expensive one.

However, I'm happy to keep your PEP open until you have code that we
can measure. (Adding additional optimizations elsewhere to make up for
the loss wouldn't be fair, though -- we would have to compare with a
2.5 or trunk (2.6) interpreter with the same additional optimizations
added.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri May 25 05:09:33 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 May 2007 20:09:33 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
Message-ID: <ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>

On 5/24/07, Ka-Ping Yee <python at zesty.ca> wrote:
> To pit this as "ascii lovers vs. non-ascii lovers" is a pretty large
> oversimplification.  You could name them "people who want to be able
> to know what the code says" and "people who don't mind not being able
> to know what the code says".  Or you could name them "people who want
> Python's lexical syntax to be something they fully understand" and
> "people who don't mind the extra complexity".  Or "people who don't
> want Python's lexical syntax to be tied to a changing external
> standard" and "people who don't mind the extra variability."
>
> However you characterize them, keep in mind that those in the former
> group are asking for default behaviour that 100% of Python users
> already use and understand.  There's no cost to keeping identifiers
> ASCII-only because that's what Python already does.
>
> I think that's a pretty strong reason for making the new, more complex
> behaviour optional.

If there's a security argument to be made for restricting the alphabet
used by code contributions (even by co-workers at the same company), I
don't see why ASCII-only projects should have it easier than projects
in other cultures.

It doesn't look like any kind of global flag passed to the interpreter
would scale -- once I am using a known trusted contribution that uses
a different character set than mine, I would have to change the global
setting to be more lenient, and the leniency would affect all code I'm
using.

A more useful approach would seem to be a set of auditing tools that
can be applied routinely to all new contributions (e.g. as a
pre-commit hook when using a source control system), or to all code in
a given directory, download, etc. I don't see this as all that
different from using e.g. PyChecker of PyLint.

While I routinely perform visual code inspections (code review is the
law at Google, and I wrote the tool used internally to do these), I
certainly don't see this as a security audit -- I use it as a
mentoring activity and to reach agreement over issues as diverse as
coding style, architecture and implementation techniques between
trusting colleagues. Scanning for stray non-ASCII characters is best
left to automated tools.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From showell30 at yahoo.com  Fri May 25 05:54:17 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Thu, 24 May 2007 20:54:17 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
Message-ID: <66548.18605.qm@web33508.mail.mud.yahoo.com>


--- Guido van Rossum <guido at python.org> wrote:

> If there's a security argument to be made for
> restricting the alphabet
> used by code contributions (even by co-workers at
> the same company), I
> don't see why ASCII-only projects should have it
> easier than projects
> in other cultures.
> 
> It doesn't look like any kind of global flag passed
> to the interpreter
> would scale -- once I am using a known trusted
> contribution that uses
> a different character set than mine, I would have to
> change the global
> setting to be more lenient, and the leniency would
> affect all code I'm
> using.
> 

Ok, that argument sways me.

Can the debate about security be put to rest by adding
something to the "Common Objections" section of the
PEP, or has your pronouncement already put the debate
to rest?

To the extent that recent objections don't fall under
security, what are they?

Have these been adequately refuted?

1) People want to be able to know what non-ascii code
says.

2) People don't want extra complexity in the language.

3) People don't want Python's lexical syntax to be
tied to a changing external standard.

My opinion:

#1 -- easy to refute

#2 -- too general to refute

#3 -- still an interesting point for debate

From jcarlson at uci.edu  Fri May 25 06:36:12 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 24 May 2007 21:36:12 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
Message-ID: <20070524213605.864B.JCARLSON@uci.edu>


"Guido van Rossum" <guido at python.org> wrote:
> On 5/24/07, Ka-Ping Yee <python at zesty.ca> wrote:
> > To pit this as "ascii lovers vs. non-ascii lovers" is a pretty large
> > oversimplification.  You could name them "people who want to be able
> > to know what the code says" and "people who don't mind not being able
> > to know what the code says".  Or you could name them "people who want
> > Python's lexical syntax to be something they fully understand" and
> > "people who don't mind the extra complexity".  Or "people who don't
> > want Python's lexical syntax to be tied to a changing external
> > standard" and "people who don't mind the extra variability."
> >
> > However you characterize them, keep in mind that those in the former
> > group are asking for default behaviour that 100% of Python users
> > already use and understand.  There's no cost to keeping identifiers
> > ASCII-only because that's what Python already does.
> >
> > I think that's a pretty strong reason for making the new, more complex
> > behaviour optional.
> 
> If there's a security argument to be made for restricting the alphabet
> used by code contributions (even by co-workers at the same company), I
> don't see why ASCII-only projects should have it easier than projects
> in other cultures.

For the sake of argument, pretend that we went with a command line
option to enable certain character sets.  In my opinion, there should be
a default character set that is allowed.  The only character set that
makes sense as a default, ignoring previously-existing environment
variables (which don't necessarily help us), is ascii.

Why?  Primarily because ascii identifiers are what are allowed today,
and have been allowed for 15 years.  But there is this secondary data
point that Stephen Turnbull brought up; 95% of users (of Emacs) never
touch non-ascii code.  Poor extrapolation of statistics aside, to make
the default be something that does not help 95% of users seems a
bit... overenthusiastic.  Where else in Python have we made the default
behavior only desired or useful to 5% of our users?

With that said, and with what Stephen and others have said about unicode
in Java, I don't believe there will be terribly significant
cross-pollination of non-ascii identifier source.  Of the source that *does*
become popular and has non-ascii identifiers, I don't believe that it
would take much time before there are normalized versions of the source,
either published by the original authors or created by users. (having a
tool to do unicode -> ascii transliteration of identifiers would make
this a non-issue)

Though others don't like it, I think that having a command line option
to enable other character sets is a reasonable burden to place on the 5%
of users that will experience non-ascii identifiers.  For those who work
with it on a regular basis, having an environment variable should be
sufficient (with command line arguments to add additional allowable
character sets).  For those who wish to import code at runtime and/or
have arbitrary identifiers, having an interface for adding or removing
allowable character sets for code imported during runtime should work
reasonably well (both for people who want to allow arbitrary identifiers,
and those who want to restrict identifiers after the runtime system is
up).

In terms of speed issues that Guillaume has brought up, this is a
non-issue. The time to verify identifiers as a pyc is loaded, when every
identifier in a pyc file is interned on loading, is insignificant;
especially when in Python one can do...

    for identifier in identifiers:
        for character in identifier:
            if character not in allowable_characters:
                raise ImportError("...")

And considering we can do *millions* of dictionary/set lookups each
second on a modern machine, I can't imagine that identifier verification
time will be a significant burden.
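
For concreteness, a back-of-the-envelope version of that measurement
(the identifiers and the allowed set below are made up purely for
illustration):

    import time

    allowed = set("abcdefghijklmnopqrstuvwxyz"
                  "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_")
    identifiers = ["identifier_%d" % i for i in range(10000)]

    start = time.time()
    for identifier in identifiers:
        for character in identifier:
            if character not in allowed:
                raise ImportError("...")
    print("checked %d identifiers in %.4f seconds"
          % (len(identifiers), time.time() - start))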


 - Josiah


From martin at v.loewis.de  Fri May 25 06:32:15 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 May 2007 06:32:15 +0200
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <A2620FD5-C79D-4B8D-B824-AFB59A56735F@gmail.com>
References: <20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070524082737.862E.JCARLSON@uci.edu>
	<4655DD4E.3050809@v.loewis.de>
	<Pine.LNX.4.58.0705241620150.8399@server1.LFW.org>
	<4656129D.5000406@v.loewis.de>
	<A2620FD5-C79D-4B8D-B824-AFB59A56735F@gmail.com>
Message-ID: <465666CF.3040507@v.loewis.de>

> Perhaps a letter in the encoding declaration is non-ascii, nullifying
> the encoding enforcement and allowing a cyrillic 'a' in  allowed = 0?

I see. Of course, if I receive a patch where one of the lines changed
is the coding declaration, and there is no apparent difference between
the old and the new declaration, I would become cautious, wondering
what's going on.

Regards,
Martin

From martin at v.loewis.de  Fri May 25 06:35:58 2007
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Fri, 25 May 2007 06:35:58 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
	<465615C9.4080505@v.loewis.de>
	<Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>
Message-ID: <465667AE.2090000@v.loewis.de>

Ka-Ping Yee schrieb:
> On Fri, 25 May 2007, [ISO-8859-1] "Martin v. Löwis" wrote:
>> Please *do* consider the needs of the people who want to actively
>> use the feature as well. Otherwise, you have no chance of understanding
>> what will make everyone happy.
> 
> People who want to use the feature can turn it on.  I don't see what's
> so unreasonable about that.

People who want to use the feature would have to know that it is only
present if you turn it on. It's like saying "you can use hexadecimal
integer literals, but you have to turn them on". This wouldn't work:
people try to use them, find out that it won't work, and assume
that it's not supported.

Regards,
Martin

From martin at v.loewis.de  Fri May 25 06:38:27 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 May 2007 06:38:27 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705241612o38fad58ascdbfd597d483da77@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>	
	<46527904.1000202@v.loewis.de>	
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>	
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>	
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>	
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>	
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>	
	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>	
	<465615C9.4080505@v.loewis.de>
	<fb6fbf560705241612o38fad58ascdbfd597d483da77@mail.gmail.com>
Message-ID: <46566843.9080407@v.loewis.de>

> Is your concern just that it should be possible to do once (perhaps at
> install), rather than on each run?

My concern is that people assume that you can't use non-ASCII
identifiers if they try it out and it doesn't work. If they believe
the feature is not there, that's just as if it really wasn't there.

Regards,
Martin

From jcarlson at uci.edu  Fri May 25 07:04:53 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 24 May 2007 22:04:53 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <465667AE.2090000@v.loewis.de>
References: <Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>
	<465667AE.2090000@v.loewis.de>
Message-ID: <20070524215742.864E.JCARLSON@uci.edu>


"Martin v. L?wis" <martin at v.loewis.de> wrote:
> Ka-Ping Yee schrieb:
> > On Fri, 25 May 2007, [ISO-8859-1] "Martin v. Löwis" wrote:
> >> Please *do* consider the needs of the people who want to actively
> >> use the feature as well. Otherwise, you have no chance of understanding
> >> what will make everyone happy.
> > 
> > People who want to use the feature can turn it on.  I don't see what's
> > so unreasonable about that.
> 
> People who want to use the feature would have to know that it is only
> present if you turn it on. It's like saying "you can use hexadecimal
> integer literals, but you have to turn them on". This wouldn't work:
> people try to use them, find out that it won't work, and assume
> that it's not supported.

Are we going to stop offering informational error messages to people? 
Because an informational error message could go a long way towards
helping people to understand what is going on.


 - Josiah


From gproux+py3000 at gmail.com  Fri May 25 07:31:00 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Fri, 25 May 2007 14:31:00 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070524213605.864B.JCARLSON@uci.edu>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<20070524213605.864B.JCARLSON@uci.edu>
Message-ID: <19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>

On 5/25/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> a default character set that is allowed.  The only character set that
> makes sense as a default, ignoring previously-existing environment
> variables (which don't necessarily help us), is ascii.

This ignores the movement of the last 5-10 years in operating systems,
filesystems and even the language space.
Now, the "standard" allowed charset in all of the above environments
is Unicode.

> Why?  Primarily because ascii identifiers are what are allowed today,
> and have been allowed for 15 years.  But there is this secondary data

And guess what, they will still be allowed tomorrow... (tongue-in-cheek)

If you look at the typical use case for programs written in python
(usually also in rough order of experience)
A) directly in interpreter (i love that)
B) small-ish one-off scripts
C) middle size scripts
D) multi-module programs made by a single person
E) large-ish programs made by a group of people

Out of these, really only people belonging to category E) are
expressing an opinion that identifiers should stay ASCII forever.
Those should be the same people who have a strong source code
compliance policy, unit tests, lint-ization etc...

Unicode support out of the box without constraint strongly benefits
category A-D. (just for the funny story, I was asking the opinion of
my colleague this morning who is a beginner in Visual Basic.NET about
Japanese identifiers, and he was shocked to hear that Python does not
accept Japanese identifiers today out of the box... VB.NET apparently
does and entry level programmers here DO (ab?)use this). Unicode is an
accepted norm, isn't it? (even if some extremists in Japan have long argued
for the superiority of the local encoding over Unicode, but apart from 2ch
that is an old story now)

I think Martin's and my point is that to get people to level E) there
is no reason to put any charset restriction on levels A-D. And when
you are at level E), it is difficult to argue that making a one-time
test at source code checkin time is a bad practice.

Regards,

Guillaume

From rhamph at gmail.com  Fri May 25 07:38:19 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 24 May 2007 23:38:19 -0600
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<f27rmv$k1d$1@sea.gmane.org> <464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
Message-ID: <aac2c7cb0705242238o175718f9gd43b9a48f60c72d7@mail.gmail.com>

On 5/23/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > The only issues PEP 3131 should be concerned with *defining*
> > are those that cause problems with canonicalization, and the range of
> > characters and languages allowed in the standard library.
>
> Fair enough -- but the problem is that this isn't a solved issue yet;
> the unicode group themselves make several contradictory
> recommendations.
>
> I can come up with rules that are probably just about right, but I
> will make mistakes (just as the unicode consortium itself did, which
> is why they have both ID and XID, and why both have stability
> characters).  Even having read their reports, my initial rules would
> still have banned mixed-script, which would have prevented your edict-
> example.

If we allowed an underscore as a mixed-script separator (allowing "def
get_??(self):"), does this let us get away with otherwise banning
mixed-scripts?

This wouldn't protect us from single-character identifiers or a
single-character identifier segment, but those seem to be fairly
obscure (and perhaps suspicious, for those concerned about security).
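
To make the rule concrete, a crude sketch of such a check (the first
word of the Unicode character name stands in for the script here, since
that is what the stdlib readily exposes; a real implementation would use
the Unicode script property):

    import unicodedata

    def segment_scripts(segment):
        scripts = set()
        for ch in segment:
            if ch.isdigit():
                continue              # digits are treated as script-neutral
            scripts.add(unicodedata.name(ch).split()[0])
        return scripts

    def identifier_ok(identifier):
        # Each underscore-separated segment may draw on at most one script.
        return all(len(segment_scripts(seg)) <= 1
                   for seg in identifier.split("_"))

    print(identifier_ok("get_\u3042\u3044"))  # True: Latin and Hiragana, separated
    print(identifier_ok("pa\u0455sword"))     # False: Cyrillic 'dze' hidden in Latin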

-- 
Adam Olsen, aka Rhamphoryncus

From martin at v.loewis.de  Fri May 25 08:24:22 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 May 2007 08:24:22 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070524215742.864E.JCARLSON@uci.edu>
References: <Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>
	<465667AE.2090000@v.loewis.de>
	<20070524215742.864E.JCARLSON@uci.edu>
Message-ID: <46568116.202@v.loewis.de>

>> People who want to use the feature would have to know that it is only
>> present if you turn it on. It's like saying "you can use hexadecimal
>> integer literals, but you have to turn them on". This wouldn't work:
>> people try to use them, find out that it won't work, and assume
>> that it's not supported.
> 
> Are we going to stop offering informational error messages to people? 
> Because an informational error message could go a long way towards
> helping people to understand what is going on.

I don't think there is precedent in Python for such an informational
error message. It is not pythonic to give an error in the case
"I know what you want, and I could easily do it, but I don't feel
like doing it, read these ten pages of text to learn more about the
problem".

The most similar case is the future import statement, where we in fact
report an error even though it's typically clear what the desired
meaning of the program is. However, this statement is only meant
as a transitional measure, with a view of eventually changing
the error into making the future behavior the default. I understand
that you want that to be a permanent error, and this I object to.

People should not have to read long system configuration pages
just to run the program that they intuitively wrote correctly
right from the start.

If you think there are cases in which the user should be warned
about potential problems and risks, then the warning machinery
would be more appropriate. Of course, it would be important
to not produce too many false positives for such a warning.

Regards,
Martin

From jcarlson at uci.edu  Fri May 25 08:59:59 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 24 May 2007 23:59:59 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46568116.202@v.loewis.de>
References: <20070524215742.864E.JCARLSON@uci.edu> <46568116.202@v.loewis.de>
Message-ID: <20070524234516.8654.JCARLSON@uci.edu>


"Martin v. L?wis" <martin at v.loewis.de> wrote:
> >> People who want to use the feature would have to know that it is only
> >> present if you turn it on. It's like saying "you can use hexadecimal
> >> integer literals, but you have to turn them on". This wouldn't work:
> >> people try to use them, find out that it won't work, and assume
> >> that it's not supported.
> > 
> > Are we going to stop offering informational error messages to people? 
> > Because an informational error message could go a long way towards
> > helping people to understand what is going on.
> 
> I don't think there is precedent in Python for such an informational
> error message. It is not pythonic to give an error in the case
> "I know what you want, and I could easily do it, but I don't feel
> like doing it, read these ten pages of text to learn more about the
> problem".

ImportError("non-ascii names used without proper charset definition")

They hop online, enter that phrase into google, and (hopefully) get a
page at python.org that says something like...

If you have received this error, and merely want to get your source to
run, use: python --charset=unicode ...

If you know the character set of the source you want to run (which can
be discovered by checking the output of scripts/charset.py), you can
use: python --charset=<charset> ...

If you would like to make this the default, add a PY_CHARSET environment
variable with a comma separated list of allowable character sets (ascii
is always included).

If you would like to programmatically change the allowable character set,
use the <charset modification module> .
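
None of the above exists today, of course; as a purely hypothetical
sketch, the programmatic interface could be as small as:

    # Hypothetical sketch only -- no such module or functions exist.
    _allowed_ranges = [(0x00, 0x7F)]        # ascii is always included

    def allow_range(start, stop):
        """Also permit identifiers using code points in [start, stop]."""
        _allowed_ranges.append((start, stop))

    def identifier_allowed(name):
        return all(any(lo <= ord(ch) <= hi for lo, hi in _allowed_ranges)
                   for ch in name)

    # e.g. opting in to Cyrillic identifiers before importing such code:
    allow_range(0x0400, 0x04FF)
    assert identifier_allowed("\u0438\u043c\u044f")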


> The most similar case is the future import statement, where we in fact
> report an error even though it's typically clear what the desired
> meaning of the program is. However, this statement is only meant
> as a transitional measure, with a view of eventually changing
> the error into making the future behavior the default. I understand
> that you want that to be a permanent error, and this I object to.

That's fine, but it's not just me that has this opinion and desire for
ascii default behavior.

> People should not have to read long system configuration pages
> just to run the program that they intuitively wrote correctly
> right from the start.

You mean that 5% of users who run into code written using non-ascii
identifiers will find this sufficiently burdensome to force the 95% of
ascii users to use additional verification and checking tools to make
sure that they are not confronted with non-ascii identifiers?  I don't
find that a reasonable tradeoff for the majority of (non-unicode) users.


 - Josiah


From stephen at xemacs.org  Fri May 25 09:10:03 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 25 May 2007 16:10:03 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <83075.64514.qm@web33514.mail.mud.yahoo.com>
References: <4655DE74.4090708@v.loewis.de>
	<83075.64514.qm@web33514.mail.mud.yahoo.com>
Message-ID: <87veeh4bw4.fsf@uwakimon.sk.tsukuba.ac.jp>

Steve Howell writes:

 > respect to Kanji, and switches over to Python, and
 > changes his little wrapper shell script to say "python
 > -U" instead of "ruby -Kkcode"?  He could then start to
 > use non-Japanese Python modules while still writing
 > his own Python code in Japanese.

But that's not enough.  The problem is that the reason for -Kkcode is
that kcode != Unicode.  Japanese use several mutually incompatible
encodings, and they mix anarchically over the Internet.  What -K does
is allow you to specify which one you're giving to the interpreter at
runtime.

The analogy to -K would be if you get a English-language Python source
file from somewhere, look into it, realize it's from IBM, and run it
with "python -K ebcdic whizbang.py".  Same characters, only the bytes
are changed to confuse the innocent.  That's what -Kkcode is for.


From martin at v.loewis.de  Fri May 25 09:05:28 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 May 2007 09:05:28 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <83075.64514.qm@web33514.mail.mud.yahoo.com>
References: <83075.64514.qm@web33514.mail.mud.yahoo.com>
Message-ID: <46568AB8.2010601@v.loewis.de>

> Ruby is a language that presumably has a lot of
> Japanese users, and it appears to me (I'm not a Ruby
> person, so I admit this is speculation) that Japanese
> users have to explicitly choose to use Japanese
> encoding to run source files encoded in Japanese.
> 
> Setting aside all the limitations of Ruby, wouldn't
> the fact that non-latin-writing Japanese Ruby users
> live with the command line restriciton in Ruby suggest
> that they'd be just as willing to live with command
> line burdens in Python, if they decided to switch to
> Python?

"Just as willing" is probably the right analysis. It's
speculation that the ruby users are *happy* that they
cannot double-click a kcode script in the explorer
to run it, or perhaps there is another mechanism in
Ruby that avoids this problem - it's also speculation
that you *have* to use this command line option in
order to be able to use Japanese identifiers.

Regards,
Martin

From martin at v.loewis.de  Fri May 25 09:09:48 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 May 2007 09:09:48 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <320102.38046.qm@web33515.mail.mud.yahoo.com>
References: <320102.38046.qm@web33515.mail.mud.yahoo.com>
Message-ID: <46568BBC.9060801@v.loewis.de>

> In almost every programming situation I've been in,
> I've had to deal with environmental issues, even
> though my character set of choice has never been the
> primary issue.

People can certainly adjust to whatever challenges
technology confronts them with (some people can do
that easier, some have more difficulties). Still,
beautiful is better than ugly.

> I think there are things that can be done here, even
> if we make Python's default mode to be ascii-pure. 
> Regional distros can set the environment
> appropriately.  Python error messages about non-ascii
> characters can suggest how to enable the -U flag.  The
> Tokyo Python User's Group can educate programmers,
> etc.

Yes, but these are all work-arounds for an avoidable ugliness.

Regards,
Martin

From martin at v.loewis.de  Fri May 25 09:14:55 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 May 2007 09:14:55 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
References: <465615C9.4080505@v.loewis.de>	<320102.38046.qm@web33515.mail.mud.yahoo.com>	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
Message-ID: <46568CEF.2030900@v.loewis.de>

> However you characterize them, keep in mind that those in the former
> group are asking for default behaviour that 100% of Python users
> already use and understand.  There's no cost to keeping identifiers
> ASCII-only because that's what Python already does.

How does adding conditionality make the language easier to understand?
It seems you are still asking for a fork in the language. I very much
resist the notion that forking the language is desirable (for
whatever reasons).

> I think that's a pretty strong reason for making the new, more complex
> behaviour optional.

Thus making it simpler????? The more complex behavior still remains:
to fully understand the language, you have to understand that behavior,
*plus* you need to understand that it may sometimes not be present.

Regards,
Martin

From martin at v.loewis.de  Fri May 25 09:36:47 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 May 2007 09:36:47 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070524234516.8654.JCARLSON@uci.edu>
References: <20070524215742.864E.JCARLSON@uci.edu> <46568116.202@v.loewis.de>
	<20070524234516.8654.JCARLSON@uci.edu>
Message-ID: <4656920F.9040001@v.loewis.de>

>> People should not have to read long system configuration pages
>> just to run the program that they intuitively wrote correctly
>> right from the start.
> 
> You mean that 5% of users who run into code written using non-ascii
> identifiers will find this sufficiently burdensome to force the 95% of
> ascii users to use additional verification and checking tools to make
> sure that they are not confronted with non-ascii identifiers?  I don't
> find that a reasonable tradeoff for the majority of (non-unicode) users.

I think I lost track of what problem you are trying to solve: is it
the security issue, or is it the problem Ping stated ("you cannot
know the full lexical rules by heart anymore").

If it is the latter, I don't understand why the 95% ascii users need
to run additional verification and checking tools. If they don't
know the full language, they won't use it - why should they run
any checking tools?

If it is the security issue, I don't see why a warning wouldn't
address the concerns of these users just as well.

Regards,
Martin


From nevillegrech at gmail.com  Fri May 25 11:25:17 2007
From: nevillegrech at gmail.com (Neville Grech Neville Grech)
Date: Fri, 25 May 2007 11:25:17 +0200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <ca471dc20705241953r5f7dbdb3x8a93b213a142f62a@mail.gmail.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<ca471dc20705241710i50e50992m6405ed411e02aaac@mail.gmail.com>
	<4656446F.8030802@canterbury.ac.nz>
	<ca471dc20705241953r5f7dbdb3x8a93b213a142f62a@mail.gmail.com>
Message-ID: <de9ae4950705250225i64fe8a0fga5b86aed62556fae@mail.gmail.com>

From a user's POV, I'm +1 on having overloadable boolean operators. In many
cases I have had to resort to overloading add or neg instead of and & not,
and I foresee a lot of cases where the and overload could be used to join
objects which represent constraints. Overloadable boolean operators could
also be used to implement other types of logic (e.g. fuzzy logic).
Constraining them to just primitive binary operations will, in my view, be
limiting for a myriad of use cases.

Sure, in some cases one could overload the neg operator instead of the not,
but semantically they have different meanings.
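
The workaround in question usually looks something like this sketch
(hijacking the bitwise operators here; overloading add or neg works the
same way, and reads even less like the intended logic):

    # Sketch of the workaround: since and/or/not cannot be overloaded,
    # constraint objects get combined with bitwise operators instead.
    class Constraint:
        def __init__(self, expr):
            self.expr = expr
        def __and__(self, other):          # stands in for "and"
            return Constraint("(%s AND %s)" % (self.expr, other.expr))
        def __or__(self, other):           # stands in for "or"
            return Constraint("(%s OR %s)" % (self.expr, other.expr))
        def __invert__(self):              # stands in for "not"
            return Constraint("(NOT %s)" % self.expr)

    c = Constraint("x > 0") & ~Constraint("y > 0")
    print(c.expr)                          # (x > 0 AND (NOT y > 0))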

On 5/25/07, Guido van Rossum <guido at python.org> wrote:
>
> On 5/24/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > Guido van Rossum wrote:
> >
> > > Last call for discussion! I'm tempted to reject this -- the ability to
> > > generate optimized code based on the shortcut semantics of and/or is
> > > pretty important to me.
> >
> > Please don't be hasty. I've had to think about this issue
> > a bit.
> >
> > The conclusion I've come to is that there may be a small loss
> > in the theoretical amount of optimization opportunity available,
> > but not much. Furthermore, if you take into account some other
> > improvements that can be made (which I'll explain below) the
> > result is actually *better* than what 2.5 currently generates.
> >
> > For example, Python 2.5 currently compiles
> >
> >    if a and b:
> >      <stats>
> >
> > into
> >
> >      <evaluate a>
> >      JUMP_IF_FALSE L1
> >      POP_TOP
> >      <evaluate b>
> >      JUMP_IF_FALSE L1
> >      POP_TOP
> >      <stats>
> >      JUMP_FORWARD L2
> >    L1:
> >      15 POP_TOP
> >    L2:
> >
> > Under my PEP, without any other changes, this would become
> >
> >      <evaluate a>
> >      LOGICAL_AND_1 L1
> >      <evaluate b>
> >      LOGICAL_AND_2
> >    L1:
> >      JUMP_IF_FALSE L2
> >      POP_TOP
> >      <stats>
> >      JUMP_FORWARD L3
> >    L2:
> >      15 POP_TOP
> >    L3:
> >
> > The fastest path through this involves executing one extra
> > bytecode. However, since we're not using JUMP_IF_FALSE to
> > do the short-circuiting any more, there's no need for it
> > to leave its operand on the stack. So let's redefine it and
> > change its name to POP_JUMP_IF_FALSE. This allows us to
> > get rid of all the POP_TOPs, plus the jump at the end of
> > the statement body. Now we have
> >
> >      <evaluate a>
> >      LOGICAL_AND_1 L1
> >      <evaluate b>
> >      LOGICAL_AND_2
> >    L1:
> >      POP_JUMP_IF_FALSE L2
> >      <stats>
> >    L2:
> >
> > The fastest path through this executes one *less* bytecode
> > than in the current 2.5-generated code. Also, any path that
> > ends up executing the body benefits from the lack of a
> > jump at the end.
> >
> > The same benefits also result when the boolean expression is
> > more complex, e.g.
> >
> >    if a or b and c:
> >      <stats>
> >
> > becomes
> >
> >      <evaluate a>
> >      LOGICAL_OR_1 L1
> >      <evaluate b>
> >      LOGICAL_AND_1 L2
> >      <evaluate c>
> >      LOGICAL_AND_2
> >    L2:
> >      LOGICAL_OR_2
> >    L1:
> >      POP_JUMP_IF_FALSE L3
> >      <stats>
> >    L3:
> >
> > which contains 3 fewer instructions overall than the
> > corresponding 2.5-generated code.
> >
> > So I contend that optimization is not an argument for
> > rejecting this PEP, and may even be one for accepting
> > it.
>
> Do you have an implementation available to measure this? In most cases
> the cost is not in the number of bytecode instructions executed but in
> the total amount of work. Two cheap bytecodes might well be cheaper
> than one expensive one.
>
> However, I'm happy to keep your PEP open until you have code that we
> can measure. (However, adding additional optimizations elsewhere to
> make up for the loss wouldn't be fair -- we would have to compare with
> a 2.5 or trunk (2.6) interpreter with the same additional
> optimizations added.)
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> _______________________________________________
> Python-3000 mailing list
> Python-3000 at python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/nevillegrech%40gmail.com
>



-- 
Regards,
Neville Grech

From python at zesty.ca  Fri May 25 11:36:28 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Fri, 25 May 2007 04:36:28 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com> 
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0705250341180.27740@server1.LFW.org>

On Thu, 24 May 2007, Guido van Rossum wrote:
> If there's a security argument to be made for restricting the alphabet
> used by code contributions (even by co-workers at the same company), I
> don't see why ASCII-only projects should have it easier than projects
> in other cultures.

This keeps getting characterized as only a security argument, but
it's much deeper; it's a basic code comprehension issue.  It's all
five of the issues I mentioned at

    http://mail.python.org/pipermail/python-3000/2007-May/007855.html

and the additional point about Unicode standards raised by Jim at

    http://mail.python.org/pipermail/python-3000/2007-May/007863.html

I still believe all of these should at least be acknowledged in the PEP.

----

If you like, you could look at this as trying to serve two different
communities, the "ASCII folks" and the "non-ASCII folks", as has been
said in other messages here.  (IMHO, it would be better to think of
many different communities of non-ASCII folks rather than just one,
which is why the choose-your-own-table solution makes the most sense.)

But suppose we just look at the simpler question of "what should the
default be?" -- there are two possible behaviours; which should the
default favour?  All these decision criteria agree:

  - Explicit or implicit?  Better to explicitly enable the new feature.

  - Simple or complex?  ASCII is the simpler character set.

  - Majority or minority?  By far the majority will use only ASCII.

  - Status quo or new behaviour?  ASCII is established and familiar.

The safer choice is to stick to ASCII by default.  There's nothing to
lose by doing so.  Why rush to change the lexical syntax?  Why is it
*necessary* to do it right now, and all at once, and by default?

----

> A more useful approach would seem to be a set of auditing tools that
> can be applied routinely to all new contributions (e.g. as a
> pre-commit hook when using a source control system), or to all code in
> a given directory, download, etc. I don't see this as all that
> different from using e.g. PyChecker or PyLint.
[...]
> Scanning for stray non-ASCII characters is best
> left to automated tools.

...like the Python interpreter.  Having the Python interpreter do this
is a good idea for all the same reasons that the Python interpreter
checks for tab/space inconsistency.

Imagine a parallel universe in which Python has always forbidden
tabs and only allowed spaces for indentation.  In Python 3.0, it is
proposed to introduce tabs.  Alter-Guido announces he will accept
the proposal.  Some folks are opposed to adding tabs, saying it could
be confusing, but he disagrees.  Some folks suggest that this feature
could at least be made optional, but he disagrees.  Some folks suggest
that the Python interpreter should at least warn when this happens,
but he disagrees.

"But," they say, "mixing tabs and spaces can yield programs that have
invisibly different meanings."  "No matter," says alter-Guido, "you
just shouldn't do that."  Or "You should use an editor that takes care
of this for you."  Or "You need to write your own checking tools and
scan all your code before you check it in."  "But what about all the
users who aren't aware of this change?" they ask.

Wouldn't it just be so much easier if the Python interpreter did the
checking?  In our universe, it does, and this is a very good thing.
Why did we decide to do that?

I would say, because it makes our programs more reliable, and it
means we have less to worry about when we're coding.  Is it a
"security issue"?  You could call it that, but really it's just a
sanity issue.


-- ?!ng

From python at zesty.ca  Fri May 25 11:50:07 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Fri, 25 May 2007 04:50:07 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46568116.202@v.loewis.de>
References: <Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>
	<465667AE.2090000@v.loewis.de> <20070524215742.864E.JCARLSON@uci.edu>
	<46568116.202@v.loewis.de>
Message-ID: <Pine.LNX.4.58.0705250447390.27740@server1.LFW.org>

On Fri, 25 May 2007, [ISO-8859-1] "Martin v. Löwis" wrote:
> I don't think there is precedent in Python for such an informational
> error message.

SyntaxError: Non-ASCII character '\xd1' in file foo.py on line 2, but
no encoding declared; see http://www.python.org/peps/pep-0263.html for
details

> It is not pythonic to give an error in the case
> "I know what you want, and I could easily do it, but I don't feel
> like doing it, read these ten pages of text to learn more about the
> problem".

Python is not a DWIM language.  That is one of its strengths.  It is
Pythonic to give an error in the case "I could guess what this means,
but it might be a mistake.  Please be clear about what you want."


-- ?!ng

From python at zesty.ca  Fri May 25 11:51:18 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Fri, 25 May 2007 04:51:18 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <465667AE.2090000@v.loewis.de>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
	<465615C9.4080505@v.loewis.de>
	<Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>
	<465667AE.2090000@v.loewis.de>
Message-ID: <Pine.LNX.4.58.0705250439320.27740@server1.LFW.org>

On Fri, 25 May 2007, [UTF-8] "Martin v. Löwis" wrote:
> Ka-Ping Yee schrieb:
> > On Fri, 25 May 2007, [ISO-8859-1] "Martin v. Löwis" wrote:
> > People who want to use the feature can turn it on.  I don't see what's
> > so unreasonable about that.
>
> People who want to use the feature would have to know that it is only
> present if you turn it on. It's like saying "you can use hexadecimal
> integer literals, but you have to turn them on". This wouldn't work:
> people try to use them, find out that it won't work, and assume
> that it's not supported.

This argument is absurd.  If you know that you want non-ASCII identifiers
(a NEW FEATURE that has never existed in Python before), you know
enough to learn how to use the feature.

To show you just how absurd that argument is, realize that it is also
an argument for ignoring the entire standard library.  Since people
have to "import re" before using regular expressions, they'll assume
there's no regex support in Python?  Of course not -- part of learning
how to use regexes is that you "import re"; it's in the documentation,
it's in tutorials about regexes, it's how you teach beginners to use
regexes, etc.


-- ?!ng

From python at zesty.ca  Fri May 25 11:53:09 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Fri, 25 May 2007 04:53:09 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46568BBC.9060801@v.loewis.de>
References: <320102.38046.qm@web33515.mail.mud.yahoo.com>
	<46568BBC.9060801@v.loewis.de>
Message-ID: <Pine.LNX.4.58.0705250451220.27740@server1.LFW.org>

On Fri, 25 May 2007, [ISO-8859-1] "Martin v. Löwis" wrote:
> > I think there are things that can be done here, even
> > if we make Python's default mode to be ascii-pure.
> > Regional distros can set the environment
> > appropriately.  Python error messages about non-ascii
> > characters can suggest how to enable the -U flag.  The
> > Tokyo Python User's Group can educate programmers,
> > etc.
>
> Yes, but these are all work-arounds for an avoidable ugliness.

You've got the defaults backwards.

If "anything goes" is the default, failures are silent as well as
invisible, and you have no help in recovering from them.

If "ASCII only" is the default, failures produce an error message,
and that error message can guide you to the solution.


-- ?!ng

From bjourne at gmail.com  Fri May 25 11:55:58 2007
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Fri, 25 May 2007 11:55:58 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070524215742.864E.JCARLSON@uci.edu>
References: <Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>
	<465667AE.2090000@v.loewis.de> <20070524215742.864E.JCARLSON@uci.edu>
Message-ID: <740c3aec0705250255k642d6637re46e3929212f1369@mail.gmail.com>

On 5/25/07, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Martin v. L?wis" <martin at v.loewis.de> wrote:
> > Ka-Ping Yee schrieb:
> > > On Fri, 25 May 2007, [ISO-8859-1] "Martin v. Löwis" wrote:
> > >> Please *do* consider the needs of the people who want to actively
> > >> use the feature as well. Otherwise, you have no chance of understanding
> > >> what will make everyone happy.
> > >
> > > People who want to use the feature can turn it on.  I don't see what's
> > > so unreasonable about that.
> >
> > People who want to use the feature would have to know that it is only
> > present if you turn it on. It's like saying "you can use hexadecimal
> > integer literals, but you have to turn them on". This wouldn't work:
> > people try to use them, find out that it won't work, and assume
> > that it's not supported.
>
> Are we going to stop offering informational error messages to people?
> Because an informational error message could go a long way towards
> helping people to understand what is going on.

I think you are forgetting who this feature is intended for. I can't
for the life of me imagine that any free software project would start using
non-ASCII identifiers, nor any professional software development
company either. Decent programmers learn and use English because that
is the lingua franca of the computer world.

Newbies, on the other hand, would maybe appreciate being able to write:

Örjan = 42
Åsa = 12
Pär = 12
genomsnittsålder = (Örjan + Åsa + Pär) / 3
print genomsnittsålder

instead of using the (in Swedish) less readable identifiers Orjan,
Asa, Par and genomsnittsAlder.

If Python required a switch for such a program to run, then this
feature would be totally wasted on them. They might use an IDE,
program in notepad.exe and dragging the file to the python.exe icon or
not even know about cmd.exe or what a command line switch is. An error
message, even an informal one, isn't easy to understand if you don't
know English.


-- 
mvh Björn

From python at zesty.ca  Fri May 25 12:00:12 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Fri, 25 May 2007 05:00:12 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46568116.202@v.loewis.de>
References: <Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>
	<465667AE.2090000@v.loewis.de> <20070524215742.864E.JCARLSON@uci.edu>
	<46568116.202@v.loewis.de>
Message-ID: <Pine.LNX.4.58.0705250454420.27740@server1.LFW.org>

On Fri, 25 May 2007, [ISO-8859-1] "Martin v. Löwis" wrote:
> People should not have to read long system configuration pages
> just to run the program that they intuitively wrote correctly
> right from the start.

It is not intuitive.  One thing I learned from the discussion here
about Unicode identifiers in other languages is that, though this
support exists in several other languages, it is *different* in each
of them.  And PEP 3131 is different still.  They allow different
sets of characters, and even worse, use different normalization rules.

Can you keep straight which letters are allowed in Java, Javascript,
C#, Python?  What about two identifiers which refer to the same
variable in some languages but refer to different variables in others?

How do we know that PEP 3131's answer is the right answer and all
these other languages chose the wrong answer?

This is far from simple.


-- ?!ng

From jan.grant at bristol.ac.uk  Fri May 25 12:25:37 2007
From: jan.grant at bristol.ac.uk (Jan Grant)
Date: Fri, 25 May 2007 11:25:37 +0100 (BST)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
Message-ID: <20070525112422.K79178@tribble.ilrt.bris.ac.uk>

On Fri, 25 May 2007, Guillaume Proux wrote:

> Hello,
> 
> There has been many proposals of flags around.
> I don't even understand anymore which -U you are talking about now.
> 
> But let me add my own proposal for a flag. (just to confuse everybody
> else a little more)

If there must be a flag, +1* to the addition of an "ascii only" flag, 
and whilst we're at it, let's call it "-parochial".

Cheers,
jan

* although I do not get to vote.

-- 
jan grant, ISYS, University of Bristol. http://www.bris.ac.uk/
Tel +44 (0)117 3317661   http://ioctl.org/jan/
Unfortunately, I have a very good idea how fast my keys are moving.

From stephen at xemacs.org  Fri May 25 12:45:39 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 25 May 2007 19:45:39 +0900
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <4655DD4E.3050809@v.loewis.de>
References: <20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070524082737.862E.JCARLSON@uci.edu>
	<4655DD4E.3050809@v.loewis.de>
Message-ID: <87sl9l41ws.fsf@uwakimon.sk.tsukuba.ac.jp>

"Martin v. L?wis" writes:

 > > If people can agree on a method for specifying, 'ascii only', 'ascii +
 > > character sets X, Y, Z', and it actually becomes an accepted part of the
 > > proposal, gets implemented, etc., I will grumble to myself at home, but
 > > I will stop trying to raise a stink here.
 > 
 > I think you can stop now - this is supported as a side effect of
 > PEP 263, and implemented for years.

-1

That seems not to be the case.  PEP 263 allows you to specify a coding
system, not a character set.  Whether that will restrict the character
set depends on how the coding system is implemented.  For example,
ISO-2022-JP is implicitly a (near) UCS since it does not forbid
designations, so you don't know (XEmacs implements it as a UCS, I'm
not sure what GNU does), while ISO-2022-JP-2 is explicitly a UCS
because it explicitly permits designations.  And how about C1 code
points in ISO 2022-conformant 8-bit coding systems (including all ISO
8859 systems)?  Do they pass, or not?  Any restriction is simply a
side effect of the codec throwing an exception because it doesn't
recognize the input.  So this requires that users know how the
relevant codec is implemented.

Second, this also removes your ability to use literal strings and
comments outside that coding system.  (Of course Unicode escapes will
still be available, but hardly acceptable for string literals, and
completely out of the question for comments.)

Third, it also has the defect of requiring you to use a legacy coding
system, does it not?  Ie, if I want to restrict to ASCII + Cyrillic, I
can use ISO-8859-5 or KOI8-R but *not* UTF-8.

Finally it does not make it easy to create unions or subsets.  One has
to write a codec to do that.


From stephen at xemacs.org  Fri May 25 13:13:03 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 25 May 2007 20:13:03 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
Message-ID: <87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>

Jim Jewett writes:

 > Definition; I don't care whether it is a different argument to import
 > or a flag or an environment variable or a command-line option, or ...
 > I just want the decision to accept non-ASCII characters to be
 > explicit.

Ka-Ping's tricky.py shows that reliance on magic directives a la PEP
263 loses.  I agree with Martin that in practice most such hacks will
get caught in the ordinary process of editing, applying patches,
sending email, and the like, but if the compiler is going to do the
checking on behalf of the *user*, it should not rely on anything the
files say.

 > Ideally, it would even be explicit per extra character allowed, though
 > there should obviously be shortcuts to accept entire scripts.

How about a regexp character class as starting point?

 > So how about
 > 
 > (1)  By default, python allows only ASCII.

+1

But neither Martin nor Guido likes it, so I'm continuing to think
about it.  Martin's objection that people will try it and assume that
it's unimplemented smells like FUD to me, though.

 > (2)  Additional characters are permitted if they appear in a table
 > named on the command line.

+1

 > These additional characters should be restricted to code points larger
 > than ASCII (so you can't easily turn "!" into an ID char)

+1

You can specify any character you want, but if it's ASCII, or not in
the classes PEP 3131 ends up using to define the maximal set, it gets
deleted from the extension table (ASCII has its own table,
conceptually).  This permits whole scripts, blocks, or ranges to be
included.
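
A sketch of that filtering step (str.isidentifier() in Python 3 stands
in here for whatever classes PEP 3131 finally settles on):

    def extension_table(*ranges):
        # Build the table from explicit code point ranges, dropping ASCII
        # (it has its own table) and anything outside the identifier
        # classes -- approximated here with str.isidentifier().
        table = set()
        for lo, hi in ranges:
            for cp in range(lo, hi + 1):
                ch = chr(cp)
                if cp < 128:
                    continue
                if not ("_" + ch).isidentifier():
                    continue              # e.g. Cf characters get dropped
                table.add(ch)
        return table

    cyrillic = extension_table((0x0400, 0x04FF))
    print(len(cyrillic), "\u0436" in cyrillic)    # Cyrillic "zhe" survives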

Optionally warn on such deletions when the table is loaded (that would
be better as a separate tool), but preferably, when parsing the identifier,
throw a SyntaxError

    """This character is in the table of extension characters for
    identifiers, but is of class Cf, which is forbidden in identifiers."""

 > If you want to include punctuation or

-1

Why waste the effort of the Unicode technical committees?

 > undefined characters, so be it.

-1

Assuming undefined == reserved for future standardization, that
violates the Unicode standard.

-1 on private space characters

You *could* argue that a private space character could be valid within
a module, or an application of cooperating modules, but I don't think
it's worth trying to deal with it.  "I'm from Kansas, show me" (a use
case).


From stephen at xemacs.org  Fri May 25 13:33:38 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 25 May 2007 20:33:38 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <781A2C3C-011E-4048-A72A-BE631C0C5127@fuhm.net>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
	<781A2C3C-011E-4048-A72A-BE631C0C5127@fuhm.net>
Message-ID: <87ps4p3zot.fsf@uwakimon.sk.tsukuba.ac.jp>

James Y Knight writes:

 > >     - The identifier character set won't spontaneously change when
 > >       one upgrades to a new version of Python, even for users of
 > >       non-ASCII identifiers.
 > 
 > FUD. Already won't, unicode explicitly makes that promise. They can  
 > add characters, but not remove them.

Addition is a change, in fact it's the change Ka-Ping dislikes most.

 > >     - Having to specify the table of acceptable characters
 > >       demonstrates at least some knowledge of the character set
 > >       one is using.
 > 
 > This is a negative. Why should I have to show knowledge of the  
 > character set I'm using to type the characters?

You don't.  Jim's proposal doesn't specify it, but there should be at
least two built-in tables, ascii (for the stdlib) and unicode
(everything Pythonic in the Identifier classes defined by Unicode).
If you don't want to know, just specify -U unicode.

And if there isn't one, just grab the list off Martin's
"non-normative" table and there you go.

 > >     - It provides the flexibility for different communities to
 > >       to adopt identifier conventions that suit their preferred
 > >       tradeoff of risk vs. expressiveness.
 > 
 > Also a negative. Now, if I want to run the modules from multiple  
 > communities I need to figure out how to merge the tables they have to  
 > separately distribute with their modules.

No, you just use -U unicode.

 > a) you trust that the author of the file has authored it correctly,  
 > in which case it doesn't matter one bit what character set they used.  

Which is why 9 out of 10 American viruses recommend Internet Explorer
5 or below.  Because most users *do* trust authors and other
purveyors, including porn sites, etc.

This may be *much less* true of Python users, but I think most
domestic offices of most American corporations would be quite happy to
disable Unicode identifier support at compile time.

 > Restricting the charset at import time is just something to get in  
 > your way with no actual value.

So don't do it; use -U unicode.  I bet Jim J and Josiah and Ka-Ping
will all explicitly use -U ascii, just to make sure.  What's wrong
with that, if that's what they want?

 > Adding baroque command line options for users of other languages to  
 > do some useless verification at import time is not an acceptable  
 > answer. It'd be better to just reject the PEP entirely.

Speaking of exaggeration ....

From stephen at xemacs.org  Fri May 25 14:53:10 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 25 May 2007 21:53:10 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
Message-ID: <87odk93w09.fsf@uwakimon.sk.tsukuba.ac.jp>

Guido van Rossum writes:

 > If there's a security argument to be made for restricting the alphabet
 > used by code contributions (even by co-workers at the same company), I
 > don't see why ASCII-only projects should have it easier than projects
 > in other cultures.

(1) Because all projects are currently ASCII-only.  I don't hear any
complaints from projects currently using non-ASCII identifiers<wink>,
and there will be few for many months.  The scaling argument gets a
similar response.  I.e., "it won't hurt (not much nor soon)".
N.B. Consistent with my Emacs Lisp experience.  What is Common Lisp
and/or Java experience?

I recall Alex Martelli's discussion of even allowing non-English
comments during PEP 263.  Many shops will resist non-ASCII identifiers
in published or purchased modules, even in the European community, I
would think.  Jamie Zawinski has an amusing anecdote about the great
profanity purge at Netscape; I bet that kind of boss would not be at
all happy about the idea of swear words he can't read.

The only thing that really worries me here is Martin's "people will
try it and think it's unimplemented" argument (avoidably delaying
diffusion of -U unicode), but I think a

SyntaxError: 'non-ASCII identifier: invalid unless enabled with the -U option'

would alleviate that.

(2) Because, due to the scaling argument, reduced fear of the unknown,
and the development of collateral tools, changing the default from
'ascii' to 'unicode' will be very natural within a few years.

I'm sympathetic to the argument that it's even more natural to make
the default unicode _now_ (ie, for the release of Python 3 which is
still well in the future) and let the conservatives use '-U ascii',
but (a) we have no experience with such a Python, and (b) we don't
have any of the tools yet, and I don't see why we would trust them to
do a good job without the experience.  At least for the "lookalike
glyphs" issue the devil is very much in the details.  Trial and error
stuff, to some extent.

From showell30 at yahoo.com  Fri May 25 14:49:12 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Fri, 25 May 2007 05:49:12 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87veeh4bw4.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <105266.7737.qm@web33504.mail.mud.yahoo.com>


--- "Stephen J. Turnbull" <stephen at xemacs.org> wrote:

> Steve Howell writes:
> 
>  > respect to Kanji, and switches over to Python,
> and
>  > changes his little wrapper shell script to say
> "python
>  > -U" instead of "ruby -Kkcode"?  He could then
> start to
>  > use non-Japanese Python modules while still
> writing
>  > his own Python code in Japanese.
> 
> But that's not enough.  The problem is that the
> reason for -Kkcode is
> that kcode != Unicode.  Japanese use several
> mutually incompatible
> encodings, and they mix anarchically over the
> Internet.  What -K does
> is allow you to specify which one you're giving to
> the interpreter at
> runtime.
> 
> The analogy to -K would be if you get a
> English-language Python source
> file from somewhere, look into it, realize it's from
> IBM, and run it
> with "python -K ebcdic whizbang.py".  Same
> characters, only the bytes
> are changed to confuse the innocent.  That's what
> -Kkcode is for.
> 

I think you misinterpreted my post a bit.  I wasn't
suggesting that Python implement a flag that was
exactly equivalent to the -K flag in Ruby.  I
understand the arguments that such a flag might be
either unnecessary in Python, or unsatisfactory.

What I was trying to say here is that there might be
precedent for non-ascii users already tolerating
command line arguments. 




       

From showell30 at yahoo.com  Fri May 25 15:03:01 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Fri, 25 May 2007 06:03:01 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46568BBC.9060801@v.loewis.de>
Message-ID: <788141.82125.qm@web33507.mail.mud.yahoo.com>


--- "Martin v. L?wis" <martin at v.loewis.de> wrote:

> > In almost every programming situation I've been
> in,
> > I've had to deal with environmental issues, even
> > though my character set of choice has never been
> the
> > primary issue.
> 
> People can certainly adjust to whatever challenges
> technology confronts them with (some people can do
> that easier, some have more difficulties). Still,
> beautiful is better than ugly.
> 

Remember, you and I have no disagreement whatsoever
about what the Python code looks like.  I look forward
to seeing beautiful code written in French, Korean,
etc. under PEP 3131, and I have not opposed anything
in the proposal that affects the code itself.

We're just disagreeing about whether the Dutch tax law
programmer has to uglify his environment with an alias
of Python to "python3.0 -liberal_unicode," or whether
the American programmer in an enterprisy environment
has to uglify his environment with an alias of Python
to "python3.0 -parochial" to mollify his security
auditors.

I guess you could argue that the American programmer
in an enterprisy environment already is dealing with
so much ugliness, it wouldn't matter. ;)  




       

From showell30 at yahoo.com  Fri May 25 15:17:18 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Fri, 25 May 2007 06:17:18 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
Message-ID: <857617.19874.qm@web33514.mail.mud.yahoo.com>


--- Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> If you look at the typical use case for programs
> written in python
> (usually also in rough order of experience)
> A) directly in interpreter (i love that)
> B) small-ish one-off scripts
> C) middle size scripts
> D) multi-module programs made by a single person
> E) large-ish programs made by a group of people
> 

I have a funny dilemma as an ASCII user.  When I write
small-ish one-off scripts (category B), I often start
typing rapid fire, and there's a feature in vim that
if I hit just the wrong combination of keys, I get an
accented e, even though I intend to write unaccented
English.  This happens to me about once a month, and I
forget exactly what Python does when I try to run the
program where one identifier has the accented e, and a
later identifier doesn't.  

I'm not drawing any specific conclusion from this
anecdote about what to do in Py3k; I'm just pointing
out that ascii users can get flustered by non-ascii
characters, and sometimes it's purely accidental that
we introduce them to our code.




       

From gproux+py3000 at gmail.com  Fri May 25 15:53:00 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Fri, 25 May 2007 22:53:00 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705250641j348a42adu974fe4969897761e@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<87odk93w09.fsf@uwakimon.sk.tsukuba.ac.jp>
	<19dd68ba0705250641j348a42adu974fe4969897761e@mail.gmail.com>
Message-ID: <19dd68ba0705250653v2c2a8188jac8c4ccc722fb747@mail.gmail.com>

One issue with the command line argument (and that unfortunately
applies ONLY to the -U case) that I haven't seen properly answered
is this:

On 5/25/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> SyntaxError: 'non-ASCII identifier: invalid unless enabled with the -U option'

Am I the only person on Earth to routinely start my python programs by
double clicking on them??  I don't think my daughter would be able to
understand what happens if the program does not start.

In another, similar universe, Python's mutant little brother Boo is
not seeing many flames erupt from its uncontroversial proposal to
enable Unicode identifiers...
 http://jira.codehaus.org/browse/BOO-633

Cheers,

Guillaume

From ncoghlan at gmail.com  Fri May 25 15:55:01 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 25 May 2007 23:55:01 +1000
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46568CEF.2030900@v.loewis.de>
References: <465615C9.4080505@v.loewis.de>	<320102.38046.qm@web33515.mail.mud.yahoo.com>	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<46568CEF.2030900@v.loewis.de>
Message-ID: <4656EAB5.6080405@gmail.com>

Martin v. Löwis wrote:
>> I think that's a pretty strong reason for making the new, more complex
>> behaviour optional.
> 
> Thus making it simpler????? The more complex behavior still remains,
> to fully understand the language, you have to understand that behavior,
> *plus* you need to understand that it may sometimes not be present.

It's simpler because any existing automated unit tests will flag 
non-ascii identifiers without modification. Not only does it prevent 
surreptitious insertion of malicious code, but existing projects don't 
have to even waste any brainpower worrying about the implications of 
Unicode identifiers (because library code typically doesn't care about 
client code's identifiers, only about the objects the library is asked 
to deal with).

However, what the option *does* enable is for a class of 
users/developers to employ a broader range of characters if they *or 
their teacher or employer* choose to do so.

A free-for-all wasn't even proposed for strings and comments in PEP 263 
- why shouldn't we be equally conservative when it comes to 
progressively enabling Unicode identifiers?

Cheers,
Nick.


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Fri May 25 16:32:57 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 25 May 2007 07:32:57 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070524213605.864B.JCARLSON@uci.edu>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<20070524213605.864B.JCARLSON@uci.edu>
Message-ID: <ca471dc20705250732k14598e57id40c5f95434ecee9@mail.gmail.com>

On 5/24/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> Where else in Python have we made the default
> behavior only desired or useful to 5% of our users?

Where are you getting that statistic? This seems an extremely
backwards, US-centric worldview.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ronaldoussoren at mac.com  Fri May 25 16:24:43 2007
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 25 May 2007 07:24:43 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <788141.82125.qm@web33507.mail.mud.yahoo.com>
References: <788141.82125.qm@web33507.mail.mud.yahoo.com>
Message-ID: <2CB3D8B9-0112-1000-A9BC-2D28C5213B40-Webmail-10016@mac.com>

 
On Friday, May 25, 2007, at 03:03PM, "Steve Howell" <showell30 at yahoo.com> wrote:
>
>
>Remember, you and I have no disagreement whatsoever
>about what the Python code looks like.  I look forward
>to seeing beautiful code written in French, Korean,
>etc. under PEP 3131, and I have not opposed anything
>in the proposal that affects the code itself.
>
>We're just disagreeing about whether the Dutch tax law
>programmer has to uglify his environment with an alias
>of Python to "python3.0 -liberal_unicode," or whether
>the American programmer in an enterprisy environment
>has to uglify his environment with an alias of Python
>to "python3.0 -parochial" to mollify his security
>auditors.
>
>I guess you could argue that the American programmer
>in an enterprisy environment already is dealing with
>so much ugliness, it wouldn't matter. ;)  

This could easily be solved by tool support instead of yet another switch (and in effect language variant). That is, pylint, pychecker or even an svn pre-commit hook could report on code that doesn't use the character range that is valid according to the coding conventions for the project. 
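
As a minimal sketch of the kind of check such a hook might run (purely
illustrative -- not an existing pylint or pychecker feature), one could
scan the token stream and flag any identifier that falls outside the
project's allowed range, here plain ASCII:

    # sketch only: assumes a Python-3-style tokenize module
    import sys, tokenize

    def non_ascii_identifiers(path):
        """Yield (lineno, name) for identifiers containing non-ASCII characters."""
        with open(path, 'rb') as f:
            for tok in tokenize.tokenize(f.readline):
                if tok.type == tokenize.NAME and any(ord(c) > 127 for c in tok.string):
                    yield tok.start[0], tok.string

    if __name__ == '__main__':
        for filename in sys.argv[1:]:
            for lineno, name in non_ascii_identifiers(filename):
                print('%s:%d: non-ASCII identifier %r' % (filename, lineno, name))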

I'm +0.5 on adding Unicode identifier support because it would allow me to use accented characters in localized code whenever appropriate.

Ronald


From stephen at xemacs.org  Fri May 25 17:07:42 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 May 2007 00:07:42 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <105266.7737.qm@web33504.mail.mud.yahoo.com>
References: <87veeh4bw4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<105266.7737.qm@web33504.mail.mud.yahoo.com>
Message-ID: <87k5ux3ps1.fsf@uwakimon.sk.tsukuba.ac.jp>

Steve Howell writes:

 > What I was trying to say here is that there might be
 > precedent for non-ascii users already tolerating
 > command line arguments. 

It's an idea, but it turns out not to correspond to reality.  It only
shows there's a precedent for Japanese tolerating command line
arguments.  The Japanese encoding mess is unique, and shameful.

From jimjjewett at gmail.com  Fri May 25 16:56:52 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 10:56:52 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705241901h23468237md8e81aaa65f9b7a6@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<fb6fbf560705241837m5b03cb0pc27a0bf78c5aa4ff@mail.gmail.com>
	<19dd68ba0705241901h23468237md8e81aaa65f9b7a6@mail.gmail.com>
Message-ID: <fb6fbf560705250756s30f3a841k83ff872922fb3510@mail.gmail.com>

On 5/24/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> Hi Jim,
> On 5/25/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > It isn't strictly security; when I've been burned by cut-and-paste
> > that turned out to be an unexpected character, it didn't cause damage,
> > but it did take me a long time to debug.

> Can you give a longer explanation because I don't understand what is
> the issue. Is it like the issue with confusing 0 and O ? You seemingly
> already have an experience with using something that is now not legal
> in Python. Was it in Java or .NET world?

The really hard-to-debug ones were usually in C.  It happened more
when I was less experienced, or the available tools were limited.

They usually involved something that looked like a quote mark, but
wasn't.  (I worry about the characters that look like a less-than
sign, but I've never had trouble with them in practice.  Problems with
other punctuation were rare enough that I can't say they were worse
than "." vs "," or ":" vs ";".)

This would be less of a problem in python because it takes
triple-quotes to continue a string across multiple lines -- but
it would still be an occasional problem.

This would be less of a problem if I had started out smarter, or if I
never worked with people who used presentation-focused editors (like
MS Word) when discussing code, but those are only theoretical
possibilities.

> > For most people, the appearance of a Greek or Japanese (let alone
> > both) character would be more likely to indicate a typo.  If you know
> > that your project is using both languages, then just allow both; the
> > point is that you have made an explicit decision to do so.

> * Python is dynamic (you can have a e.g. pygtk user interface which
> enables you to load at runtime a new .py file even to use a text view
> to type in a mini-script that will do something specific in your
> application domain): you never know what will get loaded next

I am not missing that -- that is the situation I worry about *most*.
If I'm running something that new, and I've only inspected it
visually, I want a great big warning about unexpected characters that
merely look like what I thought they were.

No, this won't happen often -- but like threading race conditions,
that almost makes it worse.  Because it is rare, people won't remember
to check for it unless the check is an automated default.

If I were in a Japanese environment, regularly getting code written in
Japanese, then Japanese code would be fine, so I would set my
environment to accept Japanese -- but I would still get that warning
for something that appears Latin but actually contains Cyrillic.

> * Python is embeddable: and often it is to bring the power of python
> to less sophisticated users. You can imagine having a global system
> deployed all around the world by a global company enabling each user
> in each subsidiary to create their own extension scripts.

If they can supply their own scripts, they can supply their own data
files -- including an acceptable characters table.  But they wouldn't
really need to -- realistically, the acceptable characters would be a
corporate (or at least site-wide) policy decision that could be set at
install time.

> * There is a runtime cost for checking: the speed vs. security
> tradeoff

True, but if speed is that important, then ASCII-only is better; the
initial file reading will happen faster, as will the parsing to
characters, and the deciding whether characters can be part of an
identifier.  Even a blind "Any code point greater than 127 is
always allowed" is still slower than not having to consider those code
points.

Once you start saying "letters and digits only", you need a
per-character lookup, and the difference between "in this set of 4000
out of several million" vs "in this set of several million out of
several more million" doesn't need to slow things down.

> (for a security benefit that is still very much hypothetical
> in the face of the experience of Java and .NET people)

(a)  Aren't those compiled languages, rather than interpreted?  So a
misleadingly-named identifier doesn't matter as much,  because people
aren't looking at the source anyhow.
(b)  How do you know there haven't been problems that just weren't
caught?  (Perhaps more of the "wonder why that errored out" variety
than security breaches.)

> * In real life, you won't see much python programs that are not
> written in your script.

Exactly.  So when you do, they should be flagged.

-jJ

From jimjjewett at gmail.com  Fri May 25 17:04:24 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 11:04:24 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705241913j1d2f60e1ndc89e05bfd926c52@mail.gmail.com>
References: <19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<250467.66423.qm@web33502.mail.mud.yahoo.com>
	<19dd68ba0705241913j1d2f60e1ndc89e05bfd926c52@mail.gmail.com>
Message-ID: <fb6fbf560705250804j6734d6e5me197fe3f7edccf2a@mail.gmail.com>

On 5/24/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:

> I have a hard time seeing how you could sniff out the willingness to
> accept in a Japanese environment, a piece of code written in Russian
> because your buddy from Siberia has written this cool matrix class
> that is 30% faster than most but contains a bunch of cyrillic
> characters because people are using cyrillic characters for local
> variable identifiers (but not module level identifiers).

You probably can't sniff that out automatically.  What you can do
automatically is say "Whoa!  unexpected characters!  If you're sure
that this code is OK, then do XYZ to allow it (and sufficiently
similar code) to run from now on."

If XYZ is simple enough, that seems a reasonable tradeoff.

The matrix class' distribution could even include the sample lines
that need to be added to your allowed-chars table, so you can do it
automatically at install time, *if* you explicitly indicate that you
know this source is using cyrillic, and that it is OK.

(In theory, you might want to allow cyrillic only for this file, not
for future files; in practice, people that careful can probably be
expected to do the extra work of setting up alternate environments.)

-jJ

From jimjjewett at gmail.com  Fri May 25 17:32:25 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 11:32:25 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
Message-ID: <fb6fbf560705250832m7ce88fc8uc108059b3a770cda@mail.gmail.com>

On 5/24/07, Guido van Rossum <guido at python.org> wrote:

> It doesn't look like any kind of global flag passed to the interpreter
> would scale -- once I am using a known trusted contribution that uses
> a different character set than mine, I would have to change the global
> setting to be more lenient, and the leniency would affect all code I'm
> using.

Are you still thinking about the single on/off switch?

I agree that saying "Japanese identifiers are OK from now on" still
shouldn't turn on Cyrillic identifiers.  I think the current
alternative boils down to some variant of

    python -idchars allowedchars.txt

where allowedchars.txt would look something like


0780..07B1    ; Thaana

or

10000..100FA  ; Linear_B plus some blanks I was too lazy to exclude

(These lines are based on the unicode Scripts.txt, and use character
ranges instead of script names so that you can exclude certain symbols
if you want to.)
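
A sketch of how such a table might be consumed (the parsing below is
only an illustration of one possible reading of the format, not part
of the proposal):

    def load_allowed(path):
        """Build the set of allowed code points from an allowedchars.txt."""
        allowed = set(range(128))                 # ASCII is always permitted
        for line in open(path):
            line = line.split(';')[0].strip()     # drop the trailing script name
            if not line:
                continue
            lo, _, hi = line.partition('..')
            hi = hi or lo
            allowed.update(range(int(lo, 16), int(hi, 16) + 1))
        return allowed

    def identifier_ok(name, allowed):
        return all(ord(ch) in allowed for ch in name)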

-jJ

From jimjjewett at gmail.com  Fri May 25 17:37:36 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 11:37:36 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<20070524213605.864B.JCARLSON@uci.edu>
	<19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
Message-ID: <fb6fbf560705250837i16262513p9bc5a70f9a4c506f@mail.gmail.com>

On 5/25/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:

> If you look at the typical use case for programs written in python
> (usually also in rough order of experience)
> A) directly in interpreter (i love that)
> B) small-ish one-off scripts
> C) middle size scripts
> D) multi-module programs made by a single person
> E) large-ish programs made by a group of people

You're missing "here is this neat code from sourceforge", or "Here is
something I cut-and-pasted from ASPN".  If those use something outside
of ASCII, that's fine -- so long as they tell you about it.

If you didn't realize it was using non-ASCII (or even that it could),
and the author didn't warn you -- then that is an appropriate time for
the interpreter to warn you that things aren't as you expect.

-jJ

From pje at telecommunity.com  Fri May 25 17:54:50 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 25 May 2007 11:54:50 -0400
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <de9ae4950705250225i64fe8a0fga5b86aed62556fae@mail.gmail.co
 m>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<ca471dc20705241710i50e50992m6405ed411e02aaac@mail.gmail.com>
	<4656446F.8030802@canterbury.ac.nz>
	<ca471dc20705241953r5f7dbdb3x8a93b213a142f62a@mail.gmail.com>
	<de9ae4950705250225i64fe8a0fga5b86aed62556fae@mail.gmail.com>
Message-ID: <20070525155302.917F23A4061@sparrow.telecommunity.com>

At 11:25 AM 5/25/2007 +0200, Neville Grech Neville Grech wrote:
> From a user's POV, I'm +1 on having overloadable boolean 
> functions. In many cases I had to resort to overload add or neg 
> instead of and & not, I foresee a lot of cases where the and 
> overload could be used to join objects which represent constraints. 
> Overloadable boolean operators could also be used to implement 
> other types of logic (eg: fuzzy logic). Constraining them to just 
> primitive binary operations in my view will be delimiting for a 
> myriad of use cases.
>
>Sure, in some cases, one could overload the neg operator instead of 
>the not but semantically they have different meanings.

Actually, I think that most of the use cases for this PEP would be 
better served by being able to "quote" code, i.e. to create AST 
objects directly from Python syntax.  Then, you can do anything you 
can do in a Python expression (including conditional expressions, 
generator expressions, yield expressions, lambdas, etc.) without 
having to introduce new special methods for any of that stuff.  In 
fact, if new features are added to the language later, they 
automatically become available in the same way.


From foom at fuhm.net  Fri May 25 17:54:50 2007
From: foom at fuhm.net (James Y Knight)
Date: Fri, 25 May 2007 11:54:50 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705250837i16262513p9bc5a70f9a4c506f@mail.gmail.com>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<20070524213605.864B.JCARLSON@uci.edu>
	<19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
	<fb6fbf560705250837i16262513p9bc5a70f9a4c506f@mail.gmail.com>
Message-ID: <2644E17A-207C-4637-AFAC-B5D27063582A@fuhm.net>


On May 25, 2007, at 11:37 AM, Jim Jewett wrote:
> You're missing "here is this neat code from sourceforge", or "Here is
> something I cut-and-pasted from ASPN".  If those use something outside
> of ASCII, that's fine -- so long as they tell you about it.
>
> If you didn't realize it was using non-ASCII (or even that it could),
> and the author didn't warn you -- then that is an appropriate time for
> the interpreter to warn you that things aren't as you expect.

Why? If, today, I download a python module (say, from pypi) that does  
something I need, I don't read the source code, I just import/run it.  
In the future, why should I even give one whit of concern that a  
module I download and don't inspect the source code of may use non- 
ascii characters internally?

The answer, for me, is simple: I shouldn't care, and the python  
interpreter shouldn't force me to care.

If I later choose to examine the source code, maybe *then* I care,  
but that has nothing to do with the python interpreter.

James


From gproux+py3000 at gmail.com  Fri May 25 17:55:23 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Sat, 26 May 2007 00:55:23 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705250854o40a1025cse3d5f2c38cd76785@mail.gmail.com>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<20070524213605.864B.JCARLSON@uci.edu>
	<19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
	<fb6fbf560705250837i16262513p9bc5a70f9a4c506f@mail.gmail.com>
	<19dd68ba0705250854o40a1025cse3d5f2c38cd76785@mail.gmail.com>
Message-ID: <19dd68ba0705250855r6d2676c6r5e9cb7a49b95b6ac@mail.gmail.com>

(I mistakenly replied in private. here is a copy for the py3000 mailing list.)


Good evening!

On 5/26/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> You're missing "here is this neat code from sourceforge", or "Here is
> something I cut-and-pasted from ASPN".  If those use something outside
> of ASCII, that's fine -- so long as they tell you about it.
>
> If you didn't realize it was using non-ASCII (or even that it could),
> and the author didn't warn you -- then that is an appropriate time for
> the interpreter to warn you that things aren't as you expect.

I fail to see your point. Why should the interpreter warn you?

There is nothing wrong with programs written with identifiers using
accented letters, the Cyrillic alphabet, Morse code?! Why should you be
warned? If the programmer who wrote the code decided to use his own
language to name some of the identifiers ... then.. bygones.

 If you have an actual requirement that everything should be ascii
then do not copy code off ASPN without first sanitizing it and do not
copy neat code from sf.net from people you hardly know without doing a
full ascii-compliance and security review.

but if the code you copy off somewhere else does what you need it to
do... then why do you want to force the author of this code generously
donated to you to downgrade his expressiveness by having to rewrite
all his code to reach ascii purity?

Guillaume

From stephen at xemacs.org  Fri May 25 18:08:48 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 May 2007 01:08:48 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705250653v2c2a8188jac8c4ccc722fb747@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<87odk93w09.fsf@uwakimon.sk.tsukuba.ac.jp>
	<19dd68ba0705250641j348a42adu974fe4969897761e@mail.gmail.com>
	<19dd68ba0705250653v2c2a8188jac8c4ccc722fb747@mail.gmail.com>
Message-ID: <87irag51in.fsf@uwakimon.sk.tsukuba.ac.jp>

Guillaume Proux writes:

 > Am I the only person on Earth to routinely start my python programs by
 > double clicking on them??

Surely not.  So?  If your python programs have non-ASCII identifiers
in them, they'll crash when you double-click them.  So I suspect you
have no programs now where there's a problem.  And there will be very
few for the near future.

For the medium term, there are ways to pass command line arguments to
programs invoked by GUI.  They're more or less ugly, but your daughter
will never see them, only the pretty icons.

Please be aware that I'm an advocate of the feature, and I would be a
bit happier if it were enabled and there was no way to disable it at
all.  However, this is a community, and some of the members are quite
concerned about possible effects on themselves and on the community.

I see little harm in providing the feature, and delaying making it
default for a while, deferring to their concerns until there is more
experience with the feature, and the offline checking programs that
several of us have proposed actually exist and have been field-tested.


From jcarlson at uci.edu  Fri May 25 18:05:07 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 25 May 2007 09:05:07 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705250732k14598e57id40c5f95434ecee9@mail.gmail.com>
References: <20070524213605.864B.JCARLSON@uci.edu>
	<ca471dc20705250732k14598e57id40c5f95434ecee9@mail.gmail.com>
Message-ID: <20070525084117.865D.JCARLSON@uci.edu>


"Guido van Rossum" <guido at python.org> wrote:
> 
> On 5/24/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> > Where else in Python have we made the default
> > behavior only desired or useful to 5% of our users?
> 
> Where are you getting that statistic? This seems an extremely
> backwards, US-centric worldview.

Stephen Turnbull's rough statistics on multilingual use in Emacs...
"""
And that's a big "if".  Most of your users will not see code in a
language the current version of your editor can't deal with in their
working lives, and 90% won't in the usable life of your product.  That
I can tell you from experience.  Emacs has all these wonderful
multilingual features, but you know what?  95% of our users are
monoscript 100% of the time.[1]  90% of the rest use their primary
script 95% of the time.  Emacs being multilingual only means that the
one language might be Japanese or Thai.  If 99% of your users
currently use only ISO-8859-15, that isn't going to change by much just
because Python now allows Thai identifiers.
"""
    http://mail.python.org/pipermail/python-3000/2007-May/007887.html

Which I 'poorly extrapolate' to users who write source using non-ascii
identifiers...
"""
Why?  Primarily because ascii identifiers are what are allowed today,
and have been allowed for 15 years.  But there is this secondary data
point that Stephen Turnbull brought up; 95% of users (of Emacs) never
touch non-ascii code.  Poor extrapolation of statistics aside, to make
the default be something that does not help 95% of users seems a
bit... overenthusiastic.  Where else in Python have we made the default
behavior only desired or useful to 5% of our users?
"""
    http://mail.python.org/pipermail/python-3000/2007-May/007927.html

Apples and oranges to be sure, but there are no other statistics that
anyone else is able to offer about use of non-ascii identifiers in Java,
Javascript, C#, etc.


 - Josiah


From jimjjewett at gmail.com  Fri May 25 18:03:59 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 12:03:59 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <aac2c7cb0705242238o175718f9gd43b9a48f60c72d7@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<aac2c7cb0705242238o175718f9gd43b9a48f60c72d7@mail.gmail.com>
Message-ID: <fb6fbf560705250903n66ddea7cvee0612b654e04579@mail.gmail.com>

On 5/25/07, Adam Olsen <rhamph at gmail.com> wrote:
> On 5/23/07, Jim Jewett <jimjjewett at gmail.com> wrote:

> > > ... range of characters and languages allowed ...

> > Fair enough -- but the problem is that this isn't a solved issue
> > yet; the unicode group themselves make several contradictory
> > recommendations.

> > I can come up with rules that are probably just about right, but I
> > will make mistakes (just as the unicode consortium itself did,
> > which is why they have both ID and XID, and why both have
> > stability characters).  Even having read their reports, my initial
> > rules would still have banned mixed-script, which would have
> > prevented your edict-example.

> If we allowed an underscore as a mixed-script separator
> (allowing "def get_??(self):"), does this let us get away
> with otherwise banning mixed-scripts?

I wondered that, until seeing that it wouldn't really solve the
problem anyhow.  It is possible to write entire words (such as "allow"
or "scope") in multiple scripts.  (Unicode calls these "whole script
confusables".)  You can't stop that without banning one of the scripts
entirely, which would disenfranchise users of some languages.

So I think the least-bad solution is to say "OK, we won't allow these
potentially confusable characters unless you were expecting them."

And once we have a way to say "I'm expecting Cyrillic", we might as
well let the user specify exactly what they're expecting, and make
their own decisions on what is likely to be needed vs likely to be
confused.
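
A rough sketch of the idea (illustrative only; the first word of each
character's Unicode name is just a crude stand-in for its real script
property):

    import unicodedata

    def rough_scripts(identifier):
        """Approximate the set of scripts used in an identifier."""
        scripts = set()
        for ch in identifier:
            if ord(ch) < 128:
                scripts.add('ASCII')
            else:
                scripts.add(unicodedata.name(ch, 'UNKNOWN').split()[0])
        return scripts

    def unexpected_scripts(identifier, expected=frozenset(['ASCII'])):
        # anything outside what the user said they were expecting
        return rough_scripts(identifier) - expected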

For more information, see section 4 of

    http://www.unicode.org/reports/tr39/

and current likely problem characters at

    http://www.unicode.org/reports/tr39/data/confusables.txt
    http://www.unicode.org/reports/tr39/data/confusablesWholeScript.txt

-jJ

From gproux+py3000 at gmail.com  Fri May 25 18:10:02 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Sat, 26 May 2007 01:10:02 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87irag51in.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<87odk93w09.fsf@uwakimon.sk.tsukuba.ac.jp>
	<19dd68ba0705250641j348a42adu974fe4969897761e@mail.gmail.com>
	<19dd68ba0705250653v2c2a8188jac8c4ccc722fb747@mail.gmail.com>
	<87irag51in.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <19dd68ba0705250910o5b56b4f9i9fccd450e37f48fe@mail.gmail.com>

On 5/26/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> For the medium term, there are ways to pass command line arguments to
> programs invoked by GUI.  They're more or less ugly, but your daughter
> will never see them, only the pretty icons.

Is there one right now in Windows?  There is none that I know of, at
least. All I know is that specific extensions are called automatically
using a given interpreter because of bindings defined in the registry.
There is no simple way to add per-file info afaik.

Regards,

Guillaume

From guido at python.org  Fri May 25 18:31:13 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 25 May 2007 09:31:13 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705250832m7ce88fc8uc108059b3a770cda@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<fb6fbf560705250832m7ce88fc8uc108059b3a770cda@mail.gmail.com>
Message-ID: <ca471dc20705250931n6e012c21wf177a7a943e9249f@mail.gmail.com>

On 5/25/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/24/07, Guido van Rossum <guido at python.org> wrote:
>
> > It doesn't look like any kind of global flag passed to the interpreter
> > would scale -- once I am using a known trusted contribution that uses
> > a different character set than mine, I would have to change the global
> > setting to be more lenient, and the leniency would affect all code I'm
> > using.
>
> Are you still thinking about the single on/off switch?
>
> I agree that saying "Japanese identifiers are OK from now on" still
> shouldn't turn on Cyrillic identifiers.  I think the current
> alternative boils down to some variant of
>
>     python -idchars allowedchars.txt
>
> where allowedchars.txt would look something like
>
>
> 0780..07B1    ; Thaana
>
> or
>
> 10000..100FA  ; Linear_B plus some blanks I was too lazy to exclude
>
> (These lines are based on the unicode Scripts.txt, and use character
> ranges instead of script names so that you can exclude certain symbols
> if you want to.)

I still think such a command-line switch (or switches) is the wrong
approach. What if I have *one* module that uses Cyrillic legitimately.
A command-line switch would enable Cyrillic in *all* modules.

Auditing code using a separate tool can be much more flexible.
Organizations can establish their own conventions for flagging
exceptions on a per-module basis.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jcarlson at uci.edu  Fri May 25 18:37:50 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 25 May 2007 09:37:50 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705250910o5b56b4f9i9fccd450e37f48fe@mail.gmail.com>
References: <87irag51in.fsf@uwakimon.sk.tsukuba.ac.jp>
	<19dd68ba0705250910o5b56b4f9i9fccd450e37f48fe@mail.gmail.com>
Message-ID: <20070525092604.8666.JCARLSON@uci.edu>


"Guillaume Proux" <gproux+py3000 at gmail.com> wrote:
> On 5/26/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> > For the medium term, there are ways to pass command line arguments to
> > programs invoked by GUI.  They're more or less ugly, but your daughter
> > will never see them, only the pretty icons.
> 
> Is there right now in Windows?  There is none that I know today at
> least. All I know is that specific extensions are called automatically
> using a given interpreter because of bindin defined in the  registry.
> There is no simple way to add per-file info afaik.

I thought you didn't care what identifiers were in your source? 
Wouldn't you have already changed your environment to automatically
include all of unicode in the allowable identifiers?

But if you really want to muck about with the command line to each
script individually, you can create a shortcut and add 'python <whatever
stuff you want>' to the beginning of the command line.  Or, if you want
a semi-automatic solution, you can change the command line to Python to
a batch file that automatically generates either a shortcut or a batch
file for each .py file that is run, which can then be edited either
using the properties dialog (for shortcuts) or any text editor (for
batch files) to change the command line options to Python.

You may be able to use the shortcuts automatically generated and placed
into your 'Documents and Settings\<username>\Recent' path, but I haven't
tested this.


 - Josiah


From jcarlson at uci.edu  Fri May 25 18:41:53 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 25 May 2007 09:41:53 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4656920F.9040001@v.loewis.de>
References: <20070524234516.8654.JCARLSON@uci.edu>
	<4656920F.9040001@v.loewis.de>
Message-ID: <20070525091105.8663.JCARLSON@uci.edu>


"Martin v. L?wis" <martin at v.loewis.de> wrote:
> 
> >> People should not have to read long system configuration pages
> >> just to run the program that they intuitively wrote correctly
> >> right from the start.
> > 
> > You mean that 5% of users who run into code written using non-ascii
> > identifiers will find this sufficiently burdensome to force the 95% of
> > ascii users to use additional verification and checking tools to make
> > sure that they are not confronted with non-ascii identifiers?  I don't
> > find that a reasonable tradeoff for the majority of (non-unicode) users.
> 
> I think I lost track of what problem you are trying to solve: is it
> the security issue, or is the the problem Ping stated ("you cannot
> know the full lexical rules by heart anymore").
> 
> If it is the latter, I don't understand why the 95% ascii users need
> to run additional verification and checking tools. If they don't
> know the full language, they won't use it - why should they run
> any checking tools?

Say that I have an ascii codebase that I've been happily using (and I
have been getting warnings/errors/whatever whenever non-ascii code is
found during runtime, so I know it is pure). But I want to use a 3rd
party package that offers additional functionality*.  I drop this
package into my tree, add the necessary imports and...

ImportError: non-ascii identifier used without -U option

Huh, apparently this 3rd party package uses non-ascii identifiers.  If I
wanted to keep my codebase ascii-only (a not unlikely case), I can
choose to either look for a different package, look for a variant of
this package with only ascii identifiers, or attempt to convert the
package myself (a tool that does the unicode -> ascii transliteration
process would make this smoother).

For those who don't care about ascii or non-ascii identifiers, they will
likely already have an environment variable or site.py modification that
offers all unicode characters that they want, and they will never see
this message.


> If it is the security issue, I don't see why a warning wouldn't
> address the concerns of these users just as well.

It's partially a security issue, but that's only 1 of the 5 reasons that
Ka-Ping pointed out.  But yes, I want to see a message and I want the
software to halt and tell me that it found something that may be an
issue.  And I want this to *automatically* happen every time I run
Python.


 - Josiah

 * Or I copy and paste code from the Python Cookbook, a blog, etc.


From gproux+py3000 at gmail.com  Fri May 25 18:45:35 2007
From: gproux+py3000 at gmail.com (Guillaume Proux)
Date: Sat, 26 May 2007 01:45:35 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070525091105.8663.JCARLSON@uci.edu>
References: <20070524234516.8654.JCARLSON@uci.edu>
	<4656920F.9040001@v.loewis.de> <20070525091105.8663.JCARLSON@uci.edu>
Message-ID: <19dd68ba0705250945j3dadcefcu8db91b3d2c055fdf@mail.gmail.com>

On 5/26/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> wanted to keep my codebase ascii-only (a not unlikely case), I can

So you have a clear preference for an ascii-only way. *YOU* *really*
want to know when a non-ascii identifier crosses your path.

> For those who don't care about ascii or non-ascii identifiers, they will
> likely already have an environment variable or site.py modification that
> offers all unicode characters that they want, and they will never see
> this message.

I will rephrase your sentence this way.
"For those who DO care about ascii only identifiers, they will likely
have already an
environment variable or site.py modifcation that makes sure that all code ever
imported is pure ascii and are going to see the message they want to see..."

> issue.  And I want this to *automatically* happen every time I run
> Python

"and automatically every time they run Python"...

This argument cuts both ways.

Guillaume

From stephen at xemacs.org  Fri May 25 18:59:33 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 May 2007 01:59:33 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705250910o5b56b4f9i9fccd450e37f48fe@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<87odk93w09.fsf@uwakimon.sk.tsukuba.ac.jp>
	<19dd68ba0705250641j348a42adu974fe4969897761e@mail.gmail.com>
	<19dd68ba0705250653v2c2a8188jac8c4ccc722fb747@mail.gmail.com>
	<87irag51in.fsf@uwakimon.sk.tsukuba.ac.jp>
	<19dd68ba0705250910o5b56b4f9i9fccd450e37f48fe@mail.gmail.com>
Message-ID: <87fy5k4z62.fsf@uwakimon.sk.tsukuba.ac.jp>

Guillaume Proux writes:

 > Is there [a way to pass options to GUI programs] right now in
 > Windows?  There is none that I know today at least.

Can't you click on .BAT files?  (I did say "ugly"!)


From jimjjewett at gmail.com  Fri May 25 19:04:31 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 13:04:31 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <740c3aec0705250255k642d6637re46e3929212f1369@mail.gmail.com>
References: <Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>
	<465667AE.2090000@v.loewis.de> <20070524215742.864E.JCARLSON@uci.edu>
	<740c3aec0705250255k642d6637re46e3929212f1369@mail.gmail.com>
Message-ID: <fb6fbf560705251004q413a7a07k41539cdb197c015f@mail.gmail.com>

On 5/25/07, Björn Lindqvist <bjourne at gmail.com> wrote:

> I think you are forgetting who this feature is intended for.

[I think experienced programmers will in fact use it too, but agree that ...]

> Newbies, on the other hand, would maybe appreciate being able to write:
...
> If Python required a switch for such a program to run, then this
> feature would be totally wasted on them. They might use an IDE,
> program in notepad.exe and dragging the file to the python.exe icon or
> not even know about cmd.exe or what a command line switch is. An error
> message, even an informal one, isn't easy to understand if you don't
> know English.

How about a default file, such as

"on launch, python looks for pyidchar.txt ... if you want to override
this default file do XYZ"

The default default file would be empty (except for comments
explaining the syntax) and allow only ASCII.  A Swedish volunteer
could create and distribute a version for Swedish characters.  (And
since these would be fairly small text files, some could probably be
distributed right in the primary distribution.)

When the teacher installs Python, she just uses the Swedish
distribution, or picks the "also allow Latin1 IDs" option from the
custom MSI install.
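
For instance, the Swedish file might contain little more than the
Latin-1 letter ranges (illustrative content only; the exact syntax is
whatever pyidchar.txt ends up using):

    # pyidchar.txt -- characters allowed in identifiers beyond ASCII
    00C0..00D6    ; Latin-1 uppercase letters (includes Å, Ä, Ö)
    00D8..00F6    ; Latin-1 letters (includes å, ä, ö)
    00F8..00FF    ; remaining Latin-1 lowercase letters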

-jJ

From tjreedy at udel.edu  Fri May 25 19:19:52 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 25 May 2007 13:19:52 -0400
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com><ca471dc20705241710i50e50992m6405ed411e02aaac@mail.gmail.com>
	<4656446F.8030802@canterbury.ac.nz>
Message-ID: <f375rn$n4v$1@sea.gmane.org>


"Greg Ewing" <greg.ewing at canterbury.ac.nz> wrote in message 
news:4656446F.8030802 at canterbury.ac.nz...
| Guido van Rossum wrote:
|
| > Last call for discussion! I'm tempted to reject this -- the ability to
| > generate optimized code based on the shortcut semantics of and/or is
| > pretty important to me.
|
| Please don't be hasty. I've had to think about this issue
| a bit.

I have not seen any response to my suggestion to simplify the to-me overly 
baroque semantics.  Missed it?  Still thinking? Or did I miss something?

Terry Jan Reedy




From jimjjewett at gmail.com  Fri May 25 19:47:44 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 13:47:44 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>

On 5/25/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Jim Jewett writes:

>  > Ideally, it would even be explicit per extra character allowed, though
>  > there should obviously be shortcuts to accept entire scripts.

> How about a regexp character class as starting point?

I'm not sure I understand.  Do you mean that part of localization
should be defining what certain regular expressions should match?
That sounds great from a consistency standpoint, but it would
certainly limit who could create their own reliable tailorings.

>  > So how about

>  > [ ASCII, plus chars in a named table]

> You can specify any character you want, but if it's ASCII, or not in
> the classes PEP 3131 ends up using to define the maximal set, it gets
> deleted from the extension table (ASCII has its own table,
> conceptually).  This permits whole scripts, blocks, or ranges to be
> included.

So long as we allow tailoring, I think the maximal set should be
generous -- and I don't see any reason to pre-exclude anything outside
ASCII.

There are people who like to use names like "Program Files" or
"Summary of Results.Apr-3-2007 version 2.xls"; I expect the same will
be true of identifiers.  So long as the punctuation is not ASCII, we
might as well let them.  (Internally, I expect some communities to say
"that is a bad idea" about certain characters, but *I* don't want to
prejudge which characters those will be.)

>  > If you want to include punctuation or

> Why waste the effort of the Unicode technical committees?

The other committees say to exclude certain scripts, like Linear B and
Ogham.  And not to allow mixed scripts, at least if they're
confusable.  But I really don't want to explain why someone using
Cyrillic can't use certain (apparently to him) randomly determined
identifiers just because it could be confused with ASCII (or
Armenian).

The only set the committees always recommend allowing is ASCII; beyond
that a nest of decisions (and exceptions) is almost unavoidable,
because the committees disagree among themselves.  Since we can't be
completely safe, I would rather err on the side of leniency towards
those concerned enough to make explicit decisions.

>  > undefined characters, so be it.

> -1

> Assuming undefined == reserved for future standardization that
> violates the Unicode standard.

If unicode comes out with a new revision, the new characters should
probably be allowed; I don't want a situation where users of Cham or
Lepcha[1] are told they have to wait another year because their
scripts weren't formally adopted into unicode until after python 3.4.0
was already released.

[1]  http://www.unicode.org/onlinedat/languages-scripts.html says that
these languages have their own scripts (and no alternate script), and
that these scripts have not yet been encoded in unicode.  I won't be
surprised to see Klingon identifiers before we see either of those,
but ... I don't want to contribute to their exclusion.

-jJ

From jimjjewett at gmail.com  Fri May 25 19:59:48 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 13:59:48 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <857617.19874.qm@web33514.mail.mud.yahoo.com>
References: <19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
	<857617.19874.qm@web33514.mail.mud.yahoo.com>
Message-ID: <fb6fbf560705251059l31ca980l42bb23702a2bcfee@mail.gmail.com>

On 5/25/07, Steve Howell <showell30 at yahoo.com> wrote:


> This happens to me about once a month, and I
> forget exactly what Python does when I try to run the
> program where one identifier has the accented e, and a
> later identifier doesn't.

It *should* throw up a syntax error.  If both letters were valid, it
would silently create a second identifier, and you would have some fun
tracking down the bug.
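
A tiny illustration of that failure mode (hypothetical names; under
PEP 3131 both spellings would be legal identifiers):

    resumé = 0              # typed with an accidental accented e
    resume = 1              # later spelled without the accent
    print(resumé, resume)   # two distinct names: prints "0 1"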

I say "*should*" because, at the moment, it seems to accept some
additional characters, in at least some environments.  In particular,
using Idle from 2.5.0, I just noticed that I can apparently use at
least some Latin-1 characters.

>>> ? = 5
>>> print ?
5

>>> ?=7
SyntaxError: invalid syntax
>>> ?=7
>>> ?
7

[And no, this doesn't mean "it's already in use; no big deal", because
the Latin-1 characters are not the biggest concern.]

-jJ

From rhamph at gmail.com  Fri May 25 20:16:46 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 25 May 2007 12:16:46 -0600
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705250903n66ddea7cvee0612b654e04579@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<aac2c7cb0705242238o175718f9gd43b9a48f60c72d7@mail.gmail.com>
	<fb6fbf560705250903n66ddea7cvee0612b654e04579@mail.gmail.com>
Message-ID: <aac2c7cb0705251116w7963a9d7y294e12a4e2b7dc16@mail.gmail.com>

On 5/25/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/25/07, Adam Olsen <rhamph at gmail.com> wrote:
> > If we allowed an underscore as a mixed-script separator
> > (allowing "def get_??(self):"), does this let us get away
> > with otherwise banning mixed-scripts?
>
> I wondered that, until seeing that it wouldn't really solve the
> problem anyhow.  It is possible to write entire words (such as "allow"
> or "scope") in multiple scripts.  (Unicode calls these "whole script
> confusables".)  You can't stop that without banning one of the scripts
> entirely, which would disenfranchise users of some languages.
>
> So I think the least-bad solution is to say "OK, we won't allow these
> potentially confusable characters unless you were expecting them."
>
> And once we have a way to say "I'm expecting Cyrillic", we might as
> well let the user specify exactly what they're expecting, and make
> their own decisions on what it likely to be needed vs likely to be
> confused.

Indeed, the whole-script confusables do create significant holes,
but I think the best solution is still to ban mixed-scripts and accept
that it's only a "75% solution".  Using an "I'm expecting cyrillic"
flag makes it harder for those who need cyrillic AND still leaves them
vulnerable to the same problem we're trying to protect ourselves from.

A more extreme solution would be to introduce a symbol type that
converts whole-script confusables to a canonical form
(as well as mixed-script confusables, if we don't ban them).  For
practicality it would have to coerce any unicode it was compared with
for equality... and probably not support sorting.
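
A very rough sketch of what that might look like (the one-entry table
below stands in for the real confusables.txt data):

    CONFUSABLES = {
        '\u0430': 'a',   # CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
    }

    class Symbol:
        def __init__(self, text):
            self.text = text
            self.canonical = ''.join(CONFUSABLES.get(ch, ch) for ch in text)

        def __eq__(self, other):
            # coerce plain strings so comparisons go through the canonical form
            if isinstance(other, str):
                other = Symbol(other)
            if isinstance(other, Symbol):
                return self.canonical == other.canonical
            return NotImplemented

        def __hash__(self):
            return hash(self.canonical)

        # deliberately no ordering methods, so sorting is unsupported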

-- 
Adam Olsen, aka Rhamphoryncus

From jimjjewett at gmail.com  Fri May 25 20:20:50 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 14:20:50 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705250855r6d2676c6r5e9cb7a49b95b6ac@mail.gmail.com>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<20070524213605.864B.JCARLSON@uci.edu>
	<19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
	<fb6fbf560705250837i16262513p9bc5a70f9a4c506f@mail.gmail.com>
	<19dd68ba0705250854o40a1025cse3d5f2c38cd76785@mail.gmail.com>
	<19dd68ba0705250855r6d2676c6r5e9cb7a49b95b6ac@mail.gmail.com>
Message-ID: <fb6fbf560705251120t73430a96pc98b1e15d03a36c4@mail.gmail.com>

On 5/25/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote:
> On 5/26/07, Jim Jewett <jimjjewett at gmail.com> wrote:

> > You're missing "here is this neat code from sourceforge", or "Here is
> > something I cut-and-pasted from ASPN".  If those use something outside
> > of ASCII, that's fine -- so long as they tell you about it.

> > If you didn't realize it was using non-ASCII (or even that it could),
> > and the author didn't warn you -- then that is an appropriate time for
> > the interpreter to warn you that things aren't as you expect.

> I fail to see your point. Why should the interpreter warn you?

I see some of the confusion now; as James Knight pointed out, some
people already treat python as binary code, and just run without
reading -- but some people don't.

I do read (or at least skim) other people's code before running it.
If nothing else, I want to see whether it has much chance of solving
my actual problem.  By the time I've finished reading it, I have a
fairly good idea what it is doing.  That's less true if I can't read
everything, but at least I know which parts to worry about.

Arbitrary unicode identifiers open up the possibility of code that
*looks* like ASCII, but isn't -- so I don't even realize that I missed
something.

> but if the code you copy off somewhere else does what you need it to
> do... then why do you want to force the author of this code generously
> donated to you to downgrade his expressiveness by having to rewrite
> all his code to reach ascii purity?

I don't mind that he used Sanskrit identifiers; I don't even mind if
he uses Cyrillic identifiers that look like ASCII.

I'll be less likely to use his code, but that is my own problem.  If
his code breaks when retyped ... again, that is mostly my own problem.

What I do mind is if he used identifier characters that look like > or
', and I didn't notice because the rest of the code was ASCII, and
python didn't warn me, because, hey, technically, those lookalikes
*are* letters now.

-jJ

From jimjjewett at gmail.com  Fri May 25 20:38:46 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 14:38:46 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705250931n6e012c21wf177a7a943e9249f@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>
	<320102.38046.qm@web33515.mail.mud.yahoo.com>
	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>
	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<fb6fbf560705250832m7ce88fc8uc108059b3a770cda@mail.gmail.com>
	<ca471dc20705250931n6e012c21wf177a7a943e9249f@mail.gmail.com>
Message-ID: <fb6fbf560705251138i1c66f0abyfd19a17078e93830@mail.gmail.com>

On 5/25/07, Guido van Rossum <guido at python.org> wrote:
> On 5/25/07, Jim Jewett <jimjjewett at gmail.com> wrote:

> > I agree that saying "Japanese identifiers are OK from now on" still
> > shouldn't turn on Cyrillic identifiers.  I think the current
> > alternative boils down to some variant of

> > where allowedchars.txt would look something like

> > 0780..07B1    ; Thaana

> > or

> > 10000..100FA  ; Linear_B plus some blanks I was too lazy to exclude

> I still think such a command-line switch (or switches) is the wrong
> approach. What if I have *one* module that uses Cyrillic legitimately.
> A command-line switch would enable Cyrillic in *all* modules.

Yes.

And that is the desired outcome for a student situation.

> ... Auditing code using a separate tool can ...

Large organizations can do whatever they need to, including an
automated transliteration before import.  The concern is for
relatively small groups, who don't have huge processes in place.

(1)
A new student shouldn't need to learn about import flags just to use
native characters.

Giving such fine-grained control as an advanced option is OK, but it
shouldn't be the *only* way to say "ASCII + characters I use when
reading or writing."

(2)
Someone downloading source code (not binary, source code) shouldn't
have to remember to run that code through an external tool just to see
if it uses unexpected characters (and might be saying something very
different from what she expected).

Note that this applies even to people who do want the extended
identifiers; wanting to write Han Chinese characters does not imply
wanting to accept Greek Coptic characters.

-jJ

From jimjjewett at gmail.com  Fri May 25 20:49:55 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 25 May 2007 14:49:55 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <aac2c7cb0705251116w7963a9d7y294e12a4e2b7dc16@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<aac2c7cb0705242238o175718f9gd43b9a48f60c72d7@mail.gmail.com>
	<fb6fbf560705250903n66ddea7cvee0612b654e04579@mail.gmail.com>
	<aac2c7cb0705251116w7963a9d7y294e12a4e2b7dc16@mail.gmail.com>
Message-ID: <fb6fbf560705251149n7082036et17dc34d193f66d7a@mail.gmail.com>

On 5/25/07, Adam Olsen <rhamph at gmail.com> wrote:
> On 5/25/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > On 5/25/07, Adam Olsen <rhamph at gmail.com> wrote:
> > > If we allowed an underscore as a mixed-script separator
> > > (allowing "def get_??(self):"), does this let us get away
> > > with otherwise banning mixed-scripts?

...

> Indeed, the whole-script confusables do create significant
> holes, but I think the best solution is still to ban mixed-scripts
> and accept that it's only a "75% solution".  Using an "I'm
> expecting cyrillic" flag makes it harder for those who need
> cyrillic AND still leaves them vulnerable to the same problem
> we're trying to protect ourselves from.

hmm... I had thought they should either not include the confusable
letters, or use different fonts -- whatever they normally do.

But I suppose using an _ separator could still be a useful crutch.
Whether it is useful enough ... I'll let others chime in.

> A more extreme solution would be to introduce a symbol type that
> converts whole-script confusables to a canonical
> form

The unicode consortium recommends against this.  I'm not sure if it is
just a presentation issue, or concerns about compatibility; the
"confusables" lists are explicitly allowed to change.

-jJ

From python at zesty.ca  Fri May 25 21:29:50 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Fri, 25 May 2007 14:29:50 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070525084117.865D.JCARLSON@uci.edu>
References: <20070524213605.864B.JCARLSON@uci.edu>
	<ca471dc20705250732k14598e57id40c5f95434ecee9@mail.gmail.com>
	<20070525084117.865D.JCARLSON@uci.edu>
Message-ID: <Pine.LNX.4.58.0705251312480.27740@server1.LFW.org>

On Fri, 25 May 2007, Josiah Carlson wrote:
> Apples and oranges to be sure, but there are no other statistics that
> anyone else is able to offer about use of non-ascii identifiers in Java,
> Javascript, C#, etc.

Let's see what we can find.  I made several attempts to search for
non-ASCII identifiers using google.com/codesearch and here's what I got.


Java or JavaScript (total: about 1480000 files found with "lang:java .")
------------------------------------------------------------------------

1.  lang:java ^[^"]*[^\s!-~].*=    (assignment to non-ASCII name)

    2 files with a UTF-8 BOM at the beginning; 1 file with non-ASCII
    in comments; 5 files with non-ASCII in strings; 2 files with
    non-ASCII elsewhere in source code:

    1.  moin-1.5.8/wiki/htdocs/applets/moinFCKplugins/.../lang/en.js
        UTF-8 BOM in middle of file.

    2.  SMSkyline.wdgt/fr.lproj/localizedStrings.js
        UTF-16 BOM at the beginning of a UTF-8 file. (!)


2.  lang:java ^[^"]*[^\s!-~]\w*\.  (method call on non-ASCII name)

    2 files with a UTF-8 BOM at the beginning; 13 files with non-ASCII
    in comments; 5 files with non-ASCII in strings; 5 files with
    non-ASCII elsewhere in source code:

    1.  struts-2.0.6/src/core/src/.../Editor2Plugin/FindReplaceDialog.js
        UTF-8 BOM in middle of file.

    2.  moin-1.5.8/wiki/htdocs/applets/moinFCKplugins/.../lang/en.js
        UTF-8 BOM in middle of file.

    3.  chickenfoot/chickenscratch/tests/findTest.js
        Non-breaking spaces embedded in indentation.


3.  lang:java ^\s*class.*[^\s!-~]    (class declaration)

    2 files with non-ASCII in strings; no other hits.


4.  lang:javascript ^\s*function.*[^\s!-~]   (function declaration)

    1 non-JavaScript file; 9 files with non-ASCII in comments;
    1 file with non-ASCII in strings; 1 file with non-ASCII elsewhere
    in source code:

    1.  google_hacks_3E_code/hack_61/zoom-google.user.js
        Thin spaces (U+2009) embedded in code.


C# (total: about 266000 files found with "lang:c# .")
-----------------------------------------------------

5.  lang:c# ^[^"]*[^\s!-~].*=      (assignment to non-ASCII name)

    5 non-C# files; 6 files with a UTF-8 BOM at the beginning;
    9 files with non-ASCII in comments; 7 files with non-ASCII
    elsewhere in source code:

    1.  blam-1.8.4pre2/src/PreferencesDialog.cs
        Non-breaking spaces in the middle of the line.

    2.  BildschirmTennis2/BildschirmTennis2/Program1.cs
        Identifier containing non-ASCII.

    3.  Ukazkova reseni CS - Prakticke priklady/.../Exp_2_03/Class2.cs
        Identifier containing non-ASCII.

    4.  Rule.cs
        Identifier containing non-ASCII.

    5.  SharpIntroduction/ComplexExample/Zv?????tko.cs
        Identifier containing non-ASCII.

    6.  WitherwynWebDist/Witherwyn/Map.cs
        "Times" character in expression, probably a typo.

    7.  PDFsharp/XGraphicsLab/MainForm.cs
        Identifier containing non-ASCII.


6.  lang:c# ^[^"]*[^\s!-~]\w*\(    (function call on non-ASCII name)

    4 files with non-ASCII in comments; 6 files with non-ASCII
    elsewhere in source code:

    1.  BildschirmTennis2/BildschirmTennis2/Program1.cs
        Identifier containing non-ASCII.

    2.  SharpIntroduction/ComplexExample/Program.cs
        Identifier containing non-ASCII.

    3.  Ukazkova reseni CS - Prakticke priklady/.../Exp_2_03/Class1.cs
        Identifier containing non-ASCII.

    4.  ActiveRecord/Generator/.../RelationshipBuilderTestCase.cs
        Identifier containing non-ASCII, almost certainly a typo.

    5.  Sample1/Sample1/Program.cs
        Identifier containing non-ASCII.

    6.  Kap11/03/TEXT.CS
        Identifier containing non-ASCII.


7.  lang:c# ^\s*class.*[^\s!-~]    (class declaration)

    1 hit:

    1.  Kap06/03/Kalen.cs
        Identifier containing non-ASCII.


In summary, that means out of around 5.7 million Java, JavaScript,
and C# files that are indexed by Google Code Search, the only use
of non-ASCII identifiers I could find was in 12 C# files, and one
of those 12 occurrences is almost certainly a mistake.
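
(A rough local re-creation of the same idea, in Python, for anyone who
wants to repeat the experiment on their own tree -- this is just a
sketch, not the actual Code Search queries, and like query 1 above it
will also match comments and strings, which is exactly the noise I had
to weed out by hand.)

import re
import sys

# a non-ASCII character somewhere before an '=', outside an obvious string start
ASSIGN_TO_NON_ASCII = re.compile(r'^[^"]*[^\s!-~].*=')

def scan(path):
    with open(path, encoding="utf-8", errors="replace") as f:
        for lineno, line in enumerate(f, 1):
            if ASSIGN_TO_NON_ASCII.match(line):
                print("%s:%d: %s" % (path, lineno, line.rstrip()))

for name in sys.argv[1:]:
    scan(name)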


-- ?!ng

From mike.klaas at gmail.com  Fri May 25 22:16:58 2007
From: mike.klaas at gmail.com (Mike Klaas)
Date: Fri, 25 May 2007 13:16:58 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <788141.82125.qm@web33507.mail.mud.yahoo.com>
References: <788141.82125.qm@web33507.mail.mud.yahoo.com>
Message-ID: <EEBBEDF8-C674-4D15-96E2-512BCA71E2D5@gmail.com>


On 25-May-07, at 6:03 AM, Steve Howell wrote:
>

> We're just disagreeing about whether the Dutch tax law
> programmer has to uglify his environment with an alias
> of Python to "python3.0 -liberal_unicode," or whether
> the American programmer in an enterprisy environment
> has to uglify his environment with an alias of Python
> to "python3.0 -parochial" to mollify his security
> auditors.

Surely if such mollification were necessary, -parochial would be
routinely used for (much more enterprise-y) java?  I have never seen
any such thing done, though my experience is perhaps not universal.

Then again, perhaps the security auditors would object to the use of  
python in the first place <wink>.

-Mike

From baptiste13 at altern.org  Fri May 25 22:42:53 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Fri, 25 May 2007 22:42:53 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <46568116.202@v.loewis.de>
References: <Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>	<465667AE.2090000@v.loewis.de>	<20070524215742.864E.JCARLSON@uci.edu>
	<46568116.202@v.loewis.de>
Message-ID: <f37hra$sik$1@sea.gmane.org>

Martin v. Löwis wrote:
> 
> I don't think there is precedence in Python for such an informational
> error message. It is not pythonic to give an error in the case
> "I know what you want, and I could easily do it, but I don't feel
> like doing it, read these ten pages of text to learn more about the
> problem".
> 
in one word: exit


From baptiste13 at altern.org  Sat May 26 00:00:43 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Sat, 26 May 2007 00:00:43 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>	<320102.38046.qm@web33515.mail.mud.yahoo.com>	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
Message-ID: <f37mda$ek5$1@sea.gmane.org>

Guido van Rossum wrote:
> 
> If there's a security argument to be made for restricting the alphabet
> used by code contributions (even by co-workers at the same company), I
> don't see why ASCII-only projects should have it easier than projects
> in other cultures.
> 

there is only one valid reason: because that's the reasonable choice for open
source code, and you make the political choice to favor open-source.
An ASCII-only default helps open source projects keep their codebase readable,
and also makes it easier to open proprietary codebases after the fact. On the
other hand, a non-ASCII default does help novice users. So you will make someone
unhappy...

My personal data point: in scientific research, where I work, specialized
programs are sometimes not organised by projects, but by codes, which are
developed in-house and open-sourced *as is* after the fact. For this use case,
a non-ASCII default is clearly a nuisance, because non-ASCII identifiers would
be used without much thought when the program is a small in-house project, and
then make it difficult to debug 5 years down the road when it has become
important for the community. In this particular case, non-ASCII identifiers also
have less justification, because all researchers understand English well anyway.
So, for my personal interests, an ASCII-only default would be better.


just my 2 cents,
BC


From timothy.c.delaney at gmail.com  Sat May 26 00:16:35 2007
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Sat, 26 May 2007 08:16:35 +1000
Subject: [Python-3000] Fw: [Python-Dev] PEP 367: New Super
Message-ID: <002301c79f1a$5dc535c0$0201a8c0@mshome.net>

Bah - this should have gone to Python-3000 too, since it's discussing the 
PEP.

Tim Delaney

Tim Delaney wrote:
> Guido van Rossum wrote:
>
>> - This seems to be written from the POV of introducing it in 2.6.
>> Perhaps the PEP could be slightly simpler if it could focus just on
>> Py3k? Then it's up to the 2.6 release managers to decide if and how
>> to backport it.
>
> That was my original intention, but it was assigned a non-Py3k PEP
> number, so I presumed I'd missed an email where you'd decided it
> should be for 2.6.
> We should probably change the PEP number if it's to be targeted at
> Py3K only.
>
>> - Why not make super a keyword, instead of just prohibiting
>> assignment to it? (I'm planning to do the same with None BTW in Py3k
>> -- I find the "it's a name but you can't assign to it" a rather
>> silly business and hardly "the simplest solution".)
>
> That's currently an open issue - I'm happy to make it a keyword - in
> which case I think the title should be changed to "super as a
> keyword" or something like that.
>
>> - "Calling a static method or normal function that accesses the name
>> super will raise a TypeError at runtime." This seems too vague. What
>> if the function is nested within a method? Taking the specification
>> literally, a nested function using super will have its own preamble
>> setting super, which would be useless and wrong.
>
> I'd thought I'd covered that with "This name behaves
> identically to a normal local, including use by inner functions via a
> cell, with the following exceptions:", but re-reading it it's a bit
> clumsy.
> The intention is that functions that do not have access to a 'super'
> cell variable will raise a TypeError. Only methods using the keyword
> 'super' will have a preamble.
>
> The preamble will only be added to functions/methods that cause the
> 'super' cell to exist i.e. for CPython have 'super' in co.cellvars.
> Functions that just have 'super' in co.freevars wouldn't have the
> preamble.
>> - "For static methods and normal functions, <class> will be None,
>> resulting in a TypeError being raised during the preamble." How do
>> you know you're in this situation at run time? By the time the
>> function body is entered the knowledge about whether this was a
>> static or instance method is lost.
>
> The preamble will not technically be part of the function body - it
> occurs after unpacking the parameters, but before entering the
> function body, and has access to the C-level variables of the
> function/method object. So the exception will be raised before
> entering the function body.
> The way I see it, during class construction, a C-level variable on the
> method object would be bound to the (decorated?) class. This really
> needs to be done as the last step in class construction if it's to
> bind to the decorated class - otherwise it can be done as the methods
> are processed.
> I was thinking that by binding that variable to Py_None for static
> methods it would allow someone to do the following:
>
> def modulefunc(self):
>    pass
>
> class A(object):
>    def func(self):
>        pass
>
>    @staticmethod
>    def staticfunc():
>        pass
>
> class B(object):
>    func = A.func
>    staticfunc = A.staticfunc
>    outerfunc = modulefunc
>
> class C(object):
>    outerfunc = B.outerfunc
>
> but that's already going to cause problems when you call the methods
> - they will be being called with instances of the wrong type (raising
> a TypeError).
> So now I think both static methods and functions should just have that
> variable left as NULL. Trying to get __super__(NULL) will throw a
> TypeError.
>> - The reference implementation (by virtue of its bytecode hacking)
>> only applies to CPython. (I'll have to study it in more detail
>> later.)
>
> Yep, and it has quite a few limitations. I'd really like to split it
> out from the PEP itself, but I'm not sure where I should host it.
>
>> I'll probably come up with more detailed feedback later. Keep up the
>> good work!!
>
> Now I've got to find the time to try implementing it. Neal has said
> he's willing to help, but I want to give it a go myself.
>
> Tim Delaney 


From baptiste13 at altern.org  Sat May 26 00:14:51 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Sat, 26 May 2007 00:14:51 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>	<20070524213605.864B.JCARLSON@uci.edu>
	<19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
Message-ID: <f37n7q$h0b$1@sea.gmane.org>

Guillaume Proux wrote:
> I think Martin's and my point is that to get people to level E) there
> is no reason to put any charset restriction on level A ->D. And when
> you are at level E), it is difficult to argue that making a one-time
> test at source code checkin time is a bad practice.
> 
you seem to believe that all useful open source code in the world is written as
part of a well organised project that makes use of all known good practices.
This is simply not true. In my field (research in physics), open source code
sometimes means somebody's in-house tool that he put on the internet at the end
of his PhD. This means no support, little documentation, and definitely no
"tests at source code checkin time". Still, it can be the best tool in its
specialized field. And I want to be able to debug it if needed.

just my 2 cents,
BC


From baptiste13 at altern.org  Sat May 26 00:22:35 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Sat, 26 May 2007 00:22:35 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705250855r6d2676c6r5e9cb7a49b95b6ac@mail.gmail.com>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>	<20070524213605.864B.JCARLSON@uci.edu>	<19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>	<fb6fbf560705250837i16262513p9bc5a70f9a4c506f@mail.gmail.com>	<19dd68ba0705250854o40a1025cse3d5f2c38cd76785@mail.gmail.com>
	<19dd68ba0705250855r6d2676c6r5e9cb7a49b95b6ac@mail.gmail.com>
Message-ID: <f37nmb$i4e$1@sea.gmane.org>

Guillaume Proux wrote:
> (I mistakenly replied in private. here is a copy for the py3000 mailing list.)
> 
> 
> Good evening!
> 
> On 5/26/07, Jim Jewett <jimjjewett at gmail.com> wrote:
>> You're missing "here is this neat code from sourceforge", or "Here is
>> something I cut-and-pasted from ASPN".  If those use something outside
>> of ASCII, that's fine -- so long as they tell you about it.
>>
>> If you didn't realize it was using non-ASCII (or even that it could),
>> and the author didn't warn you -- then that is an appropriate time for
>> the interpreter to warn you that things aren't as you expect.
> 
> I fail to see your point. Why should the interpreter warn you?
> 
> There is nothing wrong to have programs written with identifiers using
> accented letters, cyrillic alphabet, morse code?! Why should you be
> warned? If the programmer who wrote the code decided to use its own
> language to name some of the identifiers ... then.. bygones.
>
sure, until you hit some bug and would like to debug it, and you can't even
distinguish the identifiers from one another...

>  If you have an actual requirement that everything should be ascii
> then do not copy code off ASPN without first sanitizing it and do not
> copy neat code from sf.net from people you hardly know without doing a
> full ascii-compliance and security review.
> 
> but if the code you copy off somewhere else does what you need it to
> do... then why do you want to force the author of this code generously
> donated to you to downgrade his expressiveness by having to rewrite
> all his code to reach ascii purity?
> 
don't make it sound so dramatic. Python programmers already accept limits on
expressiveness in the name of readability. Heck, otherwise we would all be using
Perl.

BC


From python at zesty.ca  Sat May 26 00:45:18 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Fri, 25 May 2007 17:45:18 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <877iqy5v2y.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org>
	<4646FCAE.7090804@v.loewis.de> <f27rmv$k1d$1@sea.gmane.org>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<Pine.LNX.4.58.0705231709231.8399@server1.LFW.org>
	<877iqy5v2y.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <Pine.LNX.4.58.0705251721390.27740@server1.LFW.org>

On Thu, 24 May 2007, Stephen J. Turnbull wrote:
>  > You've got this backwards, and I suspect that's part of the root of
>  > the disagreement.  It's not that "when humans enter the loop they
>  > cause problems."  The purpose of the language is to *serve humans*.
[...]
> N.B. I take offense at your misquote.  *Humans do not cause problems.*
> It is *non-ASCII tokens* that *cause* the (putative) problem.  However,
> the alleged problems only arise when humans are present.

Oh, I apologize.  I misunderstood the antecedent of "they".

>  > The grammar has to be something a human can understand.
>
> There are an infinite number of ASCII-only Python tokens.  Whether
> those tokens are lexically composed of a small fixed finite alphabet
> vs. a large extensible finite alphabet doesn't change anything in
> terms of understanding the *grammar*.

I understand that you're talking about grammar as distinct from
lexical syntax -- I was using the word "grammar" to refer to everything.
I probably should have used the word "syntax" instead.

My point was just that you have to be able to tell what a token is before
you can read the syntax.  That's hard to do if you don't know what
characters are allowed and what characters aren't (and if there isn't
even a consensus on what should be allowed).

> The question is how expensive will the upgrade be, and what are the
> benefits.  My experience suggests that the cost is negligible *because
> most users won't use non-ASCII identifiers*, and they'll just stick
> with their ASCII-only tools.

That's exactly the danger.  It's a change that makes almost everyone's
tools and practices subtly, occasionally, and silently incorrect --
even unconsciously incorrect for many.  That's much worse than a
change that is obvious enough to force a correction in assumptions.

That just means, if we're going to provide this feature, we shouldn't
force subtle wrongness upon people by making it the default.  The
balance you're talking about weighs heavily in favour of ASCII by
default because that is what 100% of Python programs use now, it is
what the vast majority of Python programs will use in the future,
and it is what the vast majority of Python users will assume to be
the case for quite some time.

> And there are cases (Dutch tax law, Japanese morphology) where having
> a judicious selection of non-ASCII identifiers is very convenient.

Yes, granted.

>  > This should be built in to the Python interpreter and on by default,
>  > unless it is turned off by a command-line switch that says "I want to
>  > allow the full set of Unicode identifier characters in identifiers."
>
> I'd make it more tedious and more flexible to relax the restriction,
> actually.  "python" gives you the stdlib, ASCII-only restriction.
> "python -U TABLE" takes a mandatory argument, which is the table of
> allowed characters.  If you want to rule out "stupid file substitution
> tricks", TABLE could take the special arguments "stdlib" and "stduni"
> which refer to built-in tables.  But people really should be able to
> restrict to "Japanese joyo kanji, kana, and ASCII only" or "IBM
> Japanese only" as local standards demand, so -U should also be able to
> take a file name, or a module name, or something like that.

I strongly support this idea.  It's the best proposal I've heard so far.

>  > If we are going to allow Unicode identifiers at all, then I would
>  > recommend only allowing identifiers that are already normalized
>  > (in NFC).
>
> Already in the PEP.

The PEP says that Python will *convert* the identifiers into NFC.
I'd rather there not be lots of different ways to write the same
identifier (TOOWTDI), so this particular recommendation is that
identifiers in source code have to already be normalized.
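
(A minimal sketch of the difference, using the stdlib unicodedata
module; require_nfc is a made-up name, not anything from the PEP.
Rejecting instead of converting is the behaviour I'm arguing for:)

import unicodedata

def require_nfc(name):
    # reject rather than silently convert
    if unicodedata.normalize("NFC", name) != name:
        raise SyntaxError("identifier %r is not in NFC" % name)
    return name

require_nfc("caf\u00e9")            # precomposed e-acute: accepted
try:
    require_nfc("cafe\u0301")       # 'e' + combining acute: rejected
except SyntaxError as exc:
    print(exc)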

>  > The ideas that I'm in favour of include:
>  >
>  >     (e) Use a character set that is fixed over time.
>
> The BASIC that I learned first only had 26 user identifiers.  Maybe
> that's the way we should go?<duck />

The solution you propose solves this nicely.


-- ?!ng

From jcarlson at uci.edu  Sat May 26 01:16:28 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 25 May 2007 16:16:28 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705250945j3dadcefcu8db91b3d2c055fdf@mail.gmail.com>
References: <20070525091105.8663.JCARLSON@uci.edu>
	<19dd68ba0705250945j3dadcefcu8db91b3d2c055fdf@mail.gmail.com>
Message-ID: <20070525095511.866D.JCARLSON@uci.edu>


"Guillaume Proux" <gproux+py3000 at gmail.com> wrote:
> 
> On 5/26/07, Josiah Carlson <jcarlson at uci.edu> wrote:
> > wanted to keep my codebase ascii-only (a not unlikely case), I can
> 
> So you have a clear preference for an ascii-only way. *YOU* *really*
> want to know when a non-ascii identifier crosses your path.
> 
> > For those who don't care about ascii or non-ascii identifiers, they will
> > likely already have an environment variable or site.py modification that
> > offers all unicode characters that they want, and they will never see
> > this message.
> 
> I will rephrase your sentence this way.
> "For those who DO care about ascii only identifiers, they will likely
> have already an
> environment variable or site.py modification that makes sure that all code ever
> imported is pure ascii and are going to see the message they want to see..."
> 
> > issue.  And I want this to *automatically* happen every time I run
> > Python
> 
> "and automatically every time they run Python"...
> 
> This argument cuts both ways.

It does, but it also refuses the temptation to guess that *everyone*
wants to use unicode identifiers by default.  Why?  As Stephen Turnbull
has already stated, the majority of users will have *no use* for and *no
exposure* to unicode identifiers.
well break toolchains, so signaling as soon as possible that "there may
be something you didn't expect here" is the right thing to do.

Baptiste Carvello, in addition to Jim, Ka-Ping, Stephen, and myself,
further discusses why ascii is the only sane default in his most recent
3 posts.


 - Josiah


From guido at python.org  Sat May 26 01:13:17 2007
From: guido at python.org (Guido van Rossum)
Date: Fri, 25 May 2007 16:13:17 -0700
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
In-Reply-To: <001d01c79f15$f0afa140$0201a8c0@mshome.net>
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<ca471dc20705241713n7f348c7eh563b5631e512fd93@mail.gmail.com>
	<017d01c79e98$c6b84090$0201a8c0@mshome.net>
	<ca471dc20705251014r46c86f74j822729e843cef797@mail.gmail.com>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
Message-ID: <ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>

On 5/25/07, Tim Delaney <timothy.c.delaney at gmail.com> wrote:
> Bah - this should have gone to Python-3000 too, since it's discussing the
> PEP.

My fault; I started sending you feedback that only went to you, Calvin
and the PEP editors. I've added python-3000 at python.org back here.

> Guido van Rossum wrote:
>
> > - This seems to be written from the POV of introducing it in 2.6.
> > Perhaps the PEP could be slightly simpler if it could focus just on
> > Py3k? Then it's up to the 2.6 release managers to decide if and how to
> > backport it.
>
> That was my original intention, but it was assigned a non-Py3k PEP number,
> so I presumed I'd missed an email where you'd decided it should be for 2.6.
>
> We should probably change the PEP number if it's to be targeted at Py3K
> only.

Maybe. There are a bunch of PEPs that were originally proposed before
the Py3k work started but that are now slated for inclusion in 3.0. I
don't think we should renumber all of those.

> > - Why not make super a keyword, instead of just prohibiting assignment
> > to it? (I'm planning to do the same with None BTW in Py3k -- I find
> > the "it's a name but you can't assign to it" a rather silly business
> > and hardly "the simplest solution".)
>
> That's currently an open issue - I'm happy to make it a keyword - in which
> case I think the title should be changed to "super as a keyword" or
> something like that.

As it was before. :-)

What's the argument against?

> > - "Calling a static method or normal function that accesses the name
> > super will raise a TypeError at runtime." This seems too vague. What
> > if the function is nested within a method? Taking the specification
> > literally, a nested function using super will have its own preamble
> > setting super, which would be useless and wrong.
>
> I'd thought I'd covered that with "This name behaves
> identically to a normal local, including use by inner functions via a cell,
> with the following exceptions:", but re-reading it it's a bit clumsy.
>
> The intention is that functions that do not have access to a 'super' cell
> variable will raise a TypeError. Only methods using the keyword 'super' will
> have a preamble.
>
> The preamble will only be added to functions/methods that cause the 'super'
> cell to exist i.e. for CPython have 'super' in co.cellvars. Functions that
> just have 'super' in co.freevars wouldn't have the preamble.

I think it's still too vague. For example:

class C:
  def f(s):
    return 1
class D(C):
  pass
def f(s):
  return 2*super.f()
D.f = f
print(D().f())

Should that work? I would be okay if it didn't, and if the super
keyword is only allowed inside a method that is lexically inside a
class. Then the second definition of f() should be a (phase 2)
SyntaxError.

Was it ever decided whether the implicitly bound class should be:

- the class object as produced by the class statement (before applying
class decorators);
- whatever is returned by the last class decorator (if any); or
- whatever is bound to the class name at the time the method is invoked?

I've got a hunch that #1 might be more solid; #3 seems asking for trouble.

There's also the issue of what to do when the method itself is
decorated (the compiler can't know what the decorators mean, even for
built-in decorators like classmethod).

> > - "For static methods and normal functions, <class> will be None,
> > resulting in a TypeError being raised during the preamble." How do you
> > know you're in this situation at run time? By the time the function
> > body is entered the knowledge about whether this was a static or
> > instance method is lost.
>
> The preamble will not technically be part of the function body - it occurs
> after unpacking the parameters, but before entering the function body, and
> has access to the C-level variables of the function/method object. So the
> exception will be raised before entering the function body.
>
> The way I see it, during class construction, a C-level variable on the
> method object would be bound to the (decorated?) class. This really needs to
> be done as the last step in class construction if it's to bind to the
> decorated class - otherwise it can be done as the methods are processed.

We could make the class in question a fourth attribute of the (poorly
named) "bound method" object, e.g. im_class_for_super (im_super would
be confusing IMO). Since this is used both by instance methods and by
the @classmethod decorator, it's just about perfect for this purpose.
(I would almost propose to reuse im_self for this purpose, but that's
probably asking for subtle backwards incompatibilities and not worth
it.)

Then when we're calling a bound method X (bound either to an instance
or to a class, depending on whether it's an instance or class method),
*if* the im_class_for_super is set, and *if* the function (im_func)
has a "free variable" named 'super', *then* we evaluate
__builtin__.__super__(X.im_class_for_super, X.im_self) and bind it to
that variable. If there's no such free variable, we skip this step.
This step could be inserted in call_function() in Python/ceval.c in
the block starting with "if (PyMethod_check(func) && ...)". It also
needs to be inserted into method_call() in Objects/classobject.c, in
the toplevel "else" block. (The ceval version is a speed hack, it
inlines the essence of method_call().)
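
(Very roughly, and with made-up names, the binding step amounts to the
following pure-Python emulation; the real thing would happen in C at
call time rather than by wrapping the function after class creation:)

import functools

def bind_super(cls, name):
    # emulate the __super__(<class>, <inst>) step by wrapping the method;
    # done as the last step of class construction
    func = cls.__dict__[name]
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        sup = super(cls, self)          # what the C-level preamble would compute
        return func(self, sup, *args, **kwargs)
    setattr(cls, name, wrapper)

class A:
    def greet(self):
        return "A"

class B(A):
    def greet(self, sup):               # 'sup' stands in for the implicit 'super' variable
        return "B+" + sup.greet()

bind_super(B, "greet")
print(B().greet())                      # prints B+A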

Now we need to modify the compiler, as follows (assume super is a keyword):

- Consider three types of scopes, which may be nested: the outermost
(module or exec) scope, class scope, and function scope. The latter
two can be nested arbitrarily.

- The super keyword is only usable in an expression (it becomes an
alternative for 'atom' in the grammar). It can not be used as an
assignment target (this is a phase 2 SyntaxError) nor in a nonlocal
statement.

- The super keyword is only allowed in a function that is contained in
a class (directly or nested inside another function). It is not
allowed directly in a class, nor in the outermost scope.

- If a function contains a valid use of super, add a free variable
named 'super' to the function's set of free variables.

- If the function is nested inside another function (not in a class),
add the same free variable to that outer function too, and so on,
until a function is reached that is nested in a class, not in a
function.

- All *uses* of the super keyword are turned into references to this
free variable.

I think this should work; it mostly uses existing machinery; it is
explainable using existing mechanisms.

If a function using super is somehow called without going through the
binding of super, it will just get the normal error message when super
is used:

NameError: free variable 'super' referenced before assignment in enclosing scope

IMO that's good enough; it's pretty hard to produce such a call.
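
(A free variable like this is observable at the Python level once it is
implemented; for instance, in CPython 3 as it eventually shipped, a
method that mentions __class__ gets a closure cell holding its defining
class, filled in when the class statement finishes -- the same cell
machinery described above:)

class A:
    def f(self):
        return __class__                          # the compiler creates a cell for this

print(A.f.__closure__)                            # a one-element closure
print(A.f.__closure__[0].cell_contents is A)      # True: the cell holds the class
print(A().f() is A)                               # True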

> I was thinking that by binding that variable to Py_None for static methods
> it would allow someone to do the following:
>
> def modulefunc(self):
>     pass
>
> class A(object):
>     def func(self):
>         pass
>
>     @staticmethod
>     def staticfunc():
>         pass
>
> class B(object):
>     func = A.func
>     staticfunc = A.staticfunc
>     outerfunc = modulefunc
>
> class C(object):
>     outerfunc = B.outerfunc
>
> but that's already going to cause problems when you call the methods - they
> will be being called with instances of the wrong type (raising a TypeError).

I don't see any references to super in that example -- what's the relevance?

> So now I think both static methods and functions should just have that
> variable left as NULL. Trying to get __super__(NULL) will throw a TypeError.

See my proposal above. It differs slightly in that the __super__ call
is made only when the class is not NULL. On the expectation that a
typical function that references super uses it exactly once per call
(that would be by far the most common case I expect) this is just
fine. In my proposal the 'super' variable contains whatever
__super__(<class>, <inst>) returned, rather than <class> which you
seem to be proposing here.

> > - The reference implementation (by virtue of its bytecode hacking)
> > only applies to CPython. (I'll have to study it in more detail later.)
>
> Yep, and it has quite a few limitations. I'd really like to split it out
> from the PEP itself, but I'm not sure where I should host it.

Submit it as a patch to SourceForge and link to it from the PEP (I did
this for PEP 3119). If you still care about it -- I'm also okay with
just having it in the subversion archives.

> > I'll probably come up with more detailed feedback later. Keep up the
> > good work!!
>
> Now I've got to find the time to try implementing it. Neal has said he's
> willing to help, but I want to give it a go myself.

Great (either way) !

PS if you like my proposal, feel free to edit it into shape for the PEP.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at zesty.ca  Sat May 26 01:20:07 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Fri, 25 May 2007 18:20:07 -0500 (CDT)
Subject: [Python-3000] PEP 3131 normalization forms
In-Reply-To: <465521B6.1050601@v.loewis.de>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com> 
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org> 
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com> 
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org> 
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com> 
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org> 
	<fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>
	<ca471dc20705230945q6d9a31fas9e0011959fa1e643@mail.gmail.com>
	<Pine.LNX.4.58.0705231554480.8399@server1.LFW.org>
	<Pine.LNX.4.58.0705231636230.8399@server1.LFW.org>
	<465521B6.1050601@v.loewis.de>
Message-ID: <Pine.LNX.4.58.0705251816420.27740@server1.LFW.org>

NFKC might be a better choice than NFC for normalizing identifiers.
Do we really want "ﬁnd()" (with the fi-ligature) and "find()"
(without the fi-ligature) to be two different functions?

Martin, is there a reason to prefer NFC over NFKC?


-- ?!ng

From rhamph at gmail.com  Sat May 26 01:29:34 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 25 May 2007 17:29:34 -0600
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705251149n7082036et17dc34d193f66d7a@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<aac2c7cb0705242238o175718f9gd43b9a48f60c72d7@mail.gmail.com>
	<fb6fbf560705250903n66ddea7cvee0612b654e04579@mail.gmail.com>
	<aac2c7cb0705251116w7963a9d7y294e12a4e2b7dc16@mail.gmail.com>
	<fb6fbf560705251149n7082036et17dc34d193f66d7a@mail.gmail.com>
Message-ID: <aac2c7cb0705251629w7cc4b24bk6c47a29c34f08698@mail.gmail.com>

On 5/25/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 5/25/07, Adam Olsen <rhamph at gmail.com> wrote:
> > On 5/25/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > > On 5/25/07, Adam Olsen <rhamph at gmail.com> wrote:
> > > > If we allowed an underscore as a mixed-script separator
> > > > (allowing "def get_??(self):"), does this let us get away
> > > > with otherwise banning mixed-scripts?
>
> ...
>
> > Indeed, the whole-script confusables do create significant
> > holes, but I think the best solution is still to ban mixed-scripts
> > and accept that it's only a "75% solution".  Using an "I'm
> > expecting cyrillic" flag makes it harder for those who need
> > cyrillic AND still leaves them vulnerable to the same problem
> > we're trying to protect ourselves from.
>
> hmm... I had thought they should either not include the confusable
> letters, or use different fonts -- whatever they normally do.

I don't understand.  Are you suggesting that those typing in russian
or ukrainian should switch from cyrillic to latin when typing in 'a'?
Surely I misunderstand.

But as for how likely accidental confusion is, to provide statistics I
installed a ukrainian wordlist and grepped it for words that only
contained characters resembling lowercase latin characters (in my
font).  Of 990736 entries, only 133 matched.  Of those, only one of
them looked like an english word: a lone 'i'.  I'm tempted to suggest
special-casing it, but if that's the worst problem in all of this I
think it can wait until it's proven to be a problem.
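
(Roughly what the filter looked like -- the exact lookalike set is a
judgment call and depends on your font, and the wordlist path is just a
placeholder, so treat this as a sketch rather than the script I ran:)

# Cyrillic letters whose glyphs resemble lowercase Latin letters in many fonts
LOOKALIKES = set("\u0430\u0435\u0456\u043e\u0440\u0441\u0445\u0443")   # a e i o p c x y lookalikes

def looks_latin(word):
    return bool(word) and all(ch in LOOKALIKES for ch in word)

with open("ukrainian.words", encoding="utf-8") as f:    # placeholder wordlist path
    hits = [w for w in (line.strip().lower() for line in f) if looks_latin(w)]

print(len(hits))
print(hits[:10])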


> But I suppose using an _ separator could still be a useful crutch.
> Whether it is useful enough ... I'll let others chime in.

Using _ as a separator is only intended to allow fixed prefixes (or
suffixes) for arbitrary names[1].  I don't see how this becomes a
crutch.


[1] urllib2 uses this style, although it's unlikely to ever have
non-ascii names.  Still, I don't think we should limit the style.

> > A more extreme solution would be to introduce a symbol type that
> > converts whole-script confusables to a canonical
> > form
>
> The unicode consortium recommends against this.  I'm not sure if it is
> just a presentation issue, or concerns about compatibility; the
> "confusables" lists are explicitly allowed to change.

Having the equivalences change between python versions (assuming at
least this aspect is hardcoded) would be quite troublesome.  Perhaps
even more so than the confusion it's intended to prevent!

-- 
Adam Olsen, aka Rhamphoryncus

From greg.ewing at canterbury.ac.nz  Sat May 26 01:50:07 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 26 May 2007 11:50:07 +1200
Subject: [Python-3000] PEP 3131 normalization forms
In-Reply-To: <Pine.LNX.4.58.0705251816420.27740@server1.LFW.org>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
	<fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>
	<ca471dc20705230945q6d9a31fas9e0011959fa1e643@mail.gmail.com>
	<Pine.LNX.4.58.0705231554480.8399@server1.LFW.org>
	<Pine.LNX.4.58.0705231636230.8399@server1.LFW.org>
	<465521B6.1050601@v.loewis.de>
	<Pine.LNX.4.58.0705251816420.27740@server1.LFW.org>
Message-ID: <4657762F.8070307@canterbury.ac.nz>

Ka-Ping Yee wrote:
> NFKC might be a better choice than NFC for normalizing identifiers.
> Do we really want "ﬁnd()" (with the fi-ligature) and "find()"
> (without the fi-ligature) to be two different functions?

Do we really want to allow ligatures at all?

--
Greg

From python at zesty.ca  Sat May 26 02:14:47 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Fri, 25 May 2007 19:14:47 -0500 (CDT)
Subject: [Python-3000] PEP 3131 normalization forms
In-Reply-To: <4657762F.8070307@canterbury.ac.nz>
References: <ca471dc20705170948p5131cc2ew13fccf359e3212fb@mail.gmail.com>
	<Pine.LNX.4.58.0705221622240.14279@server1.LFW.org>
	<B098AAEE-06DD-42CA-BDE9-0C46AB1A8F9F@gmail.com>
	<Pine.LNX.4.58.0705222213340.14279@server1.LFW.org>
	<ca471dc20705222120m602e1da5jecfad2d49738f062@mail.gmail.com>
	<Pine.LNX.4.58.0705222345220.14279@server1.LFW.org>
	<fb6fbf560705230939g1a43524bw44c8273768a0173@mail.gmail.com>
	<ca471dc20705230945q6d9a31fas9e0011959fa1e643@mail.gmail.com>
	<Pine.LNX.4.58.0705231554480.8399@server1.LFW.org>
	<Pine.LNX.4.58.0705231636230.8399@server1.LFW.org>
	<465521B6.1050601@v.loewis.de>
	<Pine.LNX.4.58.0705251816420.27740@server1.LFW.org>
	<4657762F.8070307@canterbury.ac.nz>
Message-ID: <Pine.LNX.4.58.0705251901270.27740@server1.LFW.org>

On Sat, 26 May 2007, Greg Ewing wrote:
> Ka-Ping Yee wrote:
> > NFKC might be a better choice than NFC for normalizing identifiers.
> > Do we really want "ﬁnd()" (with the fi-ligature) and "find()"
> > (without the fi-ligature) to be two different functions?
>
> Do we really want to allow ligatures at all?

If we require identifiers in source code to be in NFKC, I believe
there won't be any ligatures.  The NFKC for the "fi" ligature is the
two-letter sequence "fi".
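
(Easy to check with the stdlib unicodedata module:)

import unicodedata

ligature = "\ufb01nd"                                        # "find" spelled with the fi-ligature
print(unicodedata.normalize("NFC", ligature) == ligature)    # True: NFC keeps the ligature
print(unicodedata.normalize("NFKC", ligature))               # find
print(unicodedata.normalize("NFKC", ligature) == "find")     # True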


-- ?!ng

From showell30 at yahoo.com  Sat May 26 02:42:18 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Fri, 25 May 2007 17:42:18 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <ca471dc20705250931n6e012c21wf177a7a943e9249f@mail.gmail.com>
Message-ID: <28496.42353.qm@web33503.mail.mud.yahoo.com>


--- Guido van Rossum <guido at python.org> wrote:

> On 5/25/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> > On 5/24/07, Guido van Rossum <guido at python.org>
> wrote:
> >
> > > It doesn't look like any kind of global flag
> passed to the interpreter
> > > would scale -- once I am using a known trusted
> contribution that uses
> > > a different character set than mine, I would
> have to change the global
> > > setting to be more lenient, and the leniency
> would affect all code I'm
> > > using.
> >
> > Are you still thinking about the single on/off
> switch?
> >
> > I agree that saying "Japanese identifiers are OK
> from now on" still
> > shouldn't turn on Cyrillic identifiers.  I think
> the current
> > alternative boils down to some variant of
> >
> >     python -idchars allowedchars.txt
> >
> > where allowedchars.txt would look something like
> >
> >
> > 0780..07B1    ; Thaana
> >
> > or
> >
> > 10000..100FA  ; Linear_B plus some blanks I was
> too lazy to exclude
> >
> > (These lines are based on the unicode Scripts.txt,
> and use character
> > ranges instead of script names so that you can
> exclude certain symbols
> > if you want to.)
> 
> I still think such a command-line switch (or
> switches) is the wrong
> approach. What if I have *one* module that uses
> Cyrillic legitimately.
> A command-line switch would enable Cyrillic in *all*
> modules.
> 

I agreed with you at first that once you allow
Cyrillic code from your good, trusted buddy that codes
in Cyrillic, you essentially open the door for all bad
people that code in Cyrillic, so enabling/requiring a
flag that trusts/distrusts Cyrillic code is basically
an exercise in futility.

But why couldn't there be a mechanism to accept only
individual non-ascii modules as trusted modules?




 

From showell30 at yahoo.com  Sat May 26 03:01:55 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Fri, 25 May 2007 18:01:55 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070525095511.866D.JCARLSON@uci.edu>
Message-ID: <236066.59081.qm@web33506.mail.mud.yahoo.com>

--- Josiah Carlson <jcarlson at uci.edu> wrote:

> 
> Baptiste Carvello, in addition to Jim, Ka-Ping,
> Stephen, and myself,
> further discusses why ascii is the only sane default
> in his most recent
> 3 posts.


I will add my much less venerated name to the list of
people who think ascii is the sane default in any
situation.

I think this whole debate could be put to rest by
agreeing to err on the side of ascii in 3.0 beta, and
if in real world experience, that turns out to be the
wrong decision, simply fix it in 3.0 production, 3.1,
or 3.2.

I like incrementism, despite the lofty agenda of 3.0.



       

From showell30 at yahoo.com  Sat May 26 03:12:40 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Fri, 25 May 2007 18:12:40 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <28496.42353.qm@web33503.mail.mud.yahoo.com>
Message-ID: <874523.63607.qm@web33506.mail.mud.yahoo.com>


--- Steve Howell <showell30 at yahoo.com> wrote:
> --- Guido van Rossum <guido at python.org> wrote:
> 
> > On 5/25/07, Jim Jewett <jimjjewett at gmail.com>
> wrote:
> > > On 5/24/07, Guido van Rossum <guido at python.org>
> > wrote:
> > >
> > > > It doesn't look like any kind of global flag
> > passed to the interpreter
> > > > would scale -- once I am using a known trusted
> > contribution that uses
> > > > a different character set than mine, I would
> > have to change the global
> > > > setting to be more lenient, and the leniency
> > would affect all code I'm
> > > > using.
> > >
> > > Are you still thinking about the single on/off
> > switch?
> > >
> > > I agree that saying "Japanese identifiers are OK
> > from now on" still
> > > shouldn't turn on Cyrillic identifiers.  I think
> > the current
> > > alternative boils down to some variant of
> > >
> > >     python -idchars allowedchars.txt
> > >
> > > where allowedchars.txt would look something like
> > >
> > >
> > > 0780..07B1    ; Thaana
> > >
> > > or
> > >
> > > 10000..100FA  ; Linear_B plus some blanks I was
> > too lazy to exclude
> > >
> > > (These lines are based on the unicode
> Scripts.txt,
> > and use character
> > > ranges instead of script names so that you can
> > exclude certain symbols
> > > if you want to.)
> > 
> > I still think such a command-line switch (or
> > switches) is the wrong
> > approach. What if I have *one* module that uses
> > Cyrillic legitimately.
> > A command-line switch would enable Cyrillic in
> *all*
> > modules.
> > 
> 
> I agreed with you at first that once you allow
> Cyrillic code from your good, trusted buddy that
> codes
> in Cyrillic, you essentially open the door for all
> bad
> people that code in Cyrillic, so enabling/requiring
> a
> flag that trusts/distrusts Cyrillic code is
> basically
> an exercise in futility.
> 
> But why couldn't there be a mechanism to accept only
> individual non-ascii modules as trusted modules?
> 

Never mind.  I already know the answer to my question.
 The mechanism to import only "trusted modules" is the
import statement itself, backed by unit tests, trust
models, etc.

I don't think my somewhat fallacious reasoning
invalidates the argument for making Python parochial
by default, though.




       

From greg.ewing at canterbury.ac.nz  Sat May 26 03:15:45 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 26 May 2007 13:15:45 +1200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <f375rn$n4v$1@sea.gmane.org>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<ca471dc20705241710i50e50992m6405ed411e02aaac@mail.gmail.com>
	<4656446F.8030802@canterbury.ac.nz> <f375rn$n4v$1@sea.gmane.org>
Message-ID: <46578A41.4090201@canterbury.ac.nz>

Terry Reedy wrote:

> I have not seen any response to my suggestion to simplify the to-me overly 
> baroque semantics.  Missed it?  Still thinking? Or did I miss something?

Sorry, I've been meaning to reply, but haven't got around
to it.

 > Delete special casing of NotImplemented.

This is the standard way for a binary operator method to
indicate that it doesn't know how to handle the types it's
been given. It signals to the interpreter machinery to give
the other operand a chance to handle the operation. It's
a complexity already present in the system for handling
binary operators, not something introduced by this proposal.

> Delete NeedOtherOperand (where would it even live?)

The same place as NotImplemented, Ellipsis, etc live already.

> The current spelling 
> is True for and and False for or, as with standard semantics.

No, that's not the current spelling. The current 'and' and
'or' know nothing about True and False, only whether their
operands are true or false (with a small 't').

It could possibly be *used* as the spelling for this purpose,
but my feeling is that it would muddy the distinction between
standard boolean semantics and whatever new semantics the
overloaded methods are implementing -- which is supposed to
be completely independent of the standard semantics.

> Delete the reverse methods.  They are only needed for mixed-type 
> operations, like scaler*matrix.  But such seems senseless here.  In any 
> case, they are not needed for any of your motivating applications, which 
> would define both methods without mixing.

I don't agree. For example, if you're implementing operations
on matrices of booleans, it seems reasonable that things like
'b and m' or 'm and b', where b is a standard boolean, should
broadcast the scalar over the matrix, as with all the other
binary operations. To make that work at the Python level, you
need the reversed methods.

As another example, in an SQL expression builder, it doesn't
seem unreasonable that mixing ordinary boolean values with
SQL boolean expressions should give the expected results.

Besides, if the reversed methods weren't there, it would
make these operators a special case with respect to all the
others, for no apparently good reason. So while it would be
a local simplification, I don't think it would simplify
things overall.
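
(To illustrate with machinery that already exists: the bitwise
operators follow the same NotImplemented / reflected-method protocol,
so a boolean-matrix sketch along these lines shows what the reversed
methods buy you.  The PEP's own hooks for 'and'/'or' are separate and
aren't used here -- this is just the analogous '&' case:)

class BoolMatrix:
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def _elementwise(self, other, op):
        if isinstance(other, BoolMatrix):
            return BoolMatrix([op(a, b) for a, b in zip(ra, rb)]
                              for ra, rb in zip(self.rows, other.rows))
        if isinstance(other, bool):
            return BoolMatrix([[op(a, other) for a in r] for r in self.rows])
        return NotImplemented            # let the other operand have a go

    def __and__(self, other):
        return self._elementwise(other, lambda a, b: a and b)

    __rand__ = __and__                   # so bool & matrix broadcasts too

    def __repr__(self):
        return "BoolMatrix(%r)" % self.rows

m = BoolMatrix([[True, False], [True, True]])
print(m & True)       # elementwise via __and__
print(False & m)      # bool.__and__ returns NotImplemented, so __rand__ runs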

 > Delete the 'As a special case' sentence.

That would make the spec shorter, but would make the facility
more complicated to *use* in many cases. So again, I don't
think this would be an overall simplification.

 > Type Slots: someone else can decide if a new flag and 5 new slots are a
 > significant price.

I don't think anyone is worried about the size of type
objects -- they're not something you normally create in large
quantities.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat May 26 03:41:29 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 26 May 2007 13:41:29 +1200
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <fb6fbf560705251233k55d5e193o54969f24d864e0c6@mail.gmail.com>
References: <fb6fbf560705251233k55d5e193o54969f24d864e0c6@mail.gmail.com>
Message-ID: <46579049.5060206@canterbury.ac.nz>

Jim Jewett wrote:

> It currently says that __not__ can return NotImplemented, which falls
> back to the current semantics.

I'm not sure why I put that there. As you observe, it's not
necessary, since you can always get the default semantics
simply by not defining the method.

An experiment suggests that the existing unary operator
methods don't special-case NotImplemented, so I'll remove
that part.

> It does not yet say what will happen for objects that return something
> else outside of {True, False},

There's nothing to say -- whatever you return is the
result. That's the whole point of making it overloadable.

> Is that OK, because "not not X" should now be spelled "bool(x)", and
> you haven't allowed the overriding of __bool__?

Yes, I would say that 'not not x' should indeed be spelled
bool(x), if that's what you intend it to mean.

Whether __bool__ should be overloadable is outside the scope
of this PEP. But if it is overloadable, I would recommend
that it not be allowed to return anything other than a boolean.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat May 26 03:46:13 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 26 May 2007 13:46:13 +1200
Subject: [Python-3000] [Python-Dev] Wither PEP 335 (Overloadable Boolean
 Operators)?
In-Reply-To: <20070525155302.917F23A4061@sparrow.telecommunity.com>
References: <ca471dc20705181102q29329642qb166f076d6d93999@mail.gmail.com>
	<ca471dc20705241710i50e50992m6405ed411e02aaac@mail.gmail.com>
	<4656446F.8030802@canterbury.ac.nz>
	<ca471dc20705241953r5f7dbdb3x8a93b213a142f62a@mail.gmail.com>
	<de9ae4950705250225i64fe8a0fga5b86aed62556fae@mail.gmail.com>
	<20070525155302.917F23A4061@sparrow.telecommunity.com>
Message-ID: <46579165.1080408@canterbury.ac.nz>

Phillip J. Eby wrote:

> Actually, I think that most of the use cases for this PEP would be 
> better served by being able to "quote" code, i.e. to create AST 
> objects directly from Python syntax.

That's been suggested before, but hasn't received
a favourable response.

One problem is that it would force all alternative
implementations to be able to produce an AST with
the same structure as CPython's.

Also it could be considered dangerously close to
"programmable syntax".

--
Greg

From nnorwitz at gmail.com  Sat May 26 04:29:19 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 25 May 2007 19:29:19 -0700
Subject: [Python-3000] Wither PEP 335 (Overloadable Boolean Operators)?
In-Reply-To: <46579049.5060206@canterbury.ac.nz>
References: <fb6fbf560705251233k55d5e193o54969f24d864e0c6@mail.gmail.com>
	<46579049.5060206@canterbury.ac.nz>
Message-ID: <ee2a432c0705251929u144d8e58ne26377c1da88706c@mail.gmail.com>

On 5/25/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>
> > Is that OK, because "not not X" should now be spelled "bool(x)", and
> > you haven't allowed the overriding of __bool__?
>
> Yes, I would say that 'not not x' should indeed be spelled
> bool(x), if that's what you intend it to mean.
>
> Whether __bool__ should be overloadable is outside the scope
> of this PEP. But if it is overloadable, I would recommend
> that it not be allowed to return anything other than a boolean.

There is already a __bool__ method in 3k.  It's the old __nonzero__ method.

>>> 5 .__bool__()
True
>>> 0 .__bool__()
False

>>> class F:
...   def __bool__(self): return 5
>>> if F(): print('is')
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __bool__ should return bool, returned int

n

From bwinton at latte.ca  Sat May 26 05:00:38 2007
From: bwinton at latte.ca (Blake Winton)
Date: Fri, 25 May 2007 23:00:38 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705251312480.27740@server1.LFW.org>
References: <20070524213605.864B.JCARLSON@uci.edu>	<ca471dc20705250732k14598e57id40c5f95434ecee9@mail.gmail.com>	<20070525084117.865D.JCARLSON@uci.edu>
	<Pine.LNX.4.58.0705251312480.27740@server1.LFW.org>
Message-ID: <4657A2D6.3050809@latte.ca>

Ka-Ping Yee wrote:
> On Fri, 25 May 2007, Josiah Carlson wrote:
>> Apples and oranges to be sure, but there are no other statistics that
>> anyone else is able to offer about use of non-ascii identifiers in Java,
>> Javascript, C#, etc.
> Let's see what we can find.  I made several attempts to search for
> non-ASCII identifiers using google.com/codesearch and here's what I got.

I think you've got a selection bias here, since Google isn't likely to 
index code not intended for the whole world, and thus the code you'll be 
searching through is more likely to be in English than code in general.

Perhaps searching the entire web for "class <non-ascii string>", or 
"<non-ascii string> (" or "<non-ascii string> =" would give more 
accurate results, if such a thing is even possible.

Later,
Blake.

From bwinton at latte.ca  Sat May 26 05:45:22 2007
From: bwinton at latte.ca (Blake Winton)
Date: Fri, 25 May 2007 23:45:22 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705251120t73430a96pc98b1e15d03a36c4@mail.gmail.com>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>	<20070524213605.864B.JCARLSON@uci.edu>	<19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>	<fb6fbf560705250837i16262513p9bc5a70f9a4c506f@mail.gmail.com>	<19dd68ba0705250854o40a1025cse3d5f2c38cd76785@mail.gmail.com>	<19dd68ba0705250855r6d2676c6r5e9cb7a49b95b6ac@mail.gmail.com>
	<fb6fbf560705251120t73430a96pc98b1e15d03a36c4@mail.gmail.com>
Message-ID: <4657AD52.5020101@latte.ca>

Jim Jewett wrote:
 >>> If you didn't realize it was using non-ASCII (or even that it
 >>> could), and the author didn't warn you -- then that is an
 >>> appropriate time for the interpreter to warn you that things aren't
 >>> as you expect.
 >> I fail to see your point. Why should the interpreter warn you?
 > Arbitrary Unicode identifier opens up the possibility of code that
 > *looks* like ASCII, but isn't -- so I don't even realize that I missed
 > something.

You already have that problem.  Right now.  And you've had it for at 
least a year (assuming you installed 2.4.3 when it came out).

All screenshots taken on Python 2.4.3, Mac OSX 10.4 Intel.

http://bwinton.latte.ca/temp/Python/File.png
http://bwinton.latte.ca/temp/Python/Run.png
http://bwinton.latte.ca/temp/Python/foo.py

So, what are you doing to mitigate this risk now, and why not do the 
same thing when identifiers are allowed to be arbitrary Unicode?

Later,
Blake.

From ocean at m2.ccsnet.ne.jp  Sat May 26 06:30:21 2007
From: ocean at m2.ccsnet.ne.jp (ocean)
Date: Sat, 26 May 2007 13:30:21 +0900
Subject: [Python-3000] python/trunk/Lib/test/test_urllib.py (for ftpwrapper)
Message-ID: <001101c79f4e$92d35f60$0300a8c0@whiterabc2znlh>

http://mail.python.org/pipermail/python-checkins/2007-May/060507.html

Hello. I'm using Windows 2000, and I did some investigation into
test_ftpwrapper.

After I made the change below, most of the errors were gone.

Index: Lib/urllib.py
===================================================================
--- Lib/urllib.py (revision 55584)
+++ Lib/urllib.py (working copy)
@@ -833,7 +833,7 @@
         self.busy = 0
         self.ftp = ftplib.FTP()
         self.ftp.connect(self.host, self.port, self.timeout)
-        self.ftp.login(self.user, self.passwd)
+#        self.ftp.login(self.user, self.passwd)
         for dir in self.dirs:
             self.ftp.cwd(dir)

I don't know why, but 'login' on Win2000 is probably problematic.

Remaining error is:

  File "e:\python-dev\trunk\lib\threading.py", line 460, in __bootstrap
    self.run()
  File "e:\python-dev\trunk\lib\threading.py", line 440, in run
    self.__target(*self.__args, **self.__kwargs)
  File "test_urllib.py", line 565, in server
    conn.recv(13)
error: (10035, 'The socket operation could not complete without blocking')

And after commenting out the conn.recv block in test_urllib.py, the test passed fine.

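# Excerpt from test_urllib.py -- assumes the module-level "import socket"
# (and "import time" for the disabled block); the triple-quoted string
# below is how the recv loop was commented out.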
def server(evt):
    serv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    serv.settimeout(3)
    serv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    serv.bind(("", 9093))
    serv.listen(5)
    try:
        conn, addr = serv.accept()
        conn.send("1 Hola mundo\n")
        """
        cantdata = 0
        while cantdata < 13:
            data = conn.recv(13-cantdata)
            cantdata += len(data)
            time.sleep(.3)
        """
        conn.send("2 No more lines\n")
        conn.close()
    except socket.timeout:
        pass
    finally:
        serv.close()
        evt.set()


From stephen at xemacs.org  Sat May 26 07:05:36 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 May 2007 14:05:36 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705251721390.27740@server1.LFW.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<19dd68ba0705120827s5415c4dcx12e5862f32cc3e06@mail.gmail.com>
	<4646A3CA.40705@acm.org> <4646FCAE.7090804@v.loewis.de>
	<f27rmv$k1d$1@sea.gmane.org> <464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<Pine.LNX.4.58.0705231709231.8399@server1.LFW.org>
	<877iqy5v2y.fsf@uwakimon.sk.tsukuba.ac.jp>
	<Pine.LNX.4.58.0705251721390.27740@server1.LFW.org>
Message-ID: <87abvs41jz.fsf@uwakimon.sk.tsukuba.ac.jp>

Thank you for the apology.  I have cooled off, and I hope you won't
hold the "take offense" against me.  I was hurt, for sure, but you're
right, that's a legitimate reading in colloquial English.

Ka-Ping Yee writes:

 > That just means, if we're going to provide this feature, we shouldn't
 > force subtle wrongness upon people by making it the default.

I agree wholeheartedly!  But AFAIK this is the first time you have
explicitly limited yourself in principle to discussion of the default.
Up to now you've opposed the whole idea.

 > The PEP says that Python will *convert* the identifiers into NFC.
 > I'd rather there not be lots of different ways to write the same
 > identifier (TOOWTDI), so this particular recommendation is that
 > identifiers in source code have to already be normalized.

A Unicode conforming process may not distinguish between different
representations of a given character.  Ie, the NFC conversion is an
internal optimization.  The characters are the same.  I think Unicode
conformance is close enough to TOOWTDI, and far more important than
the remaining difference.  YMMV.

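(To make the point concrete, a small illustration of mine using the
stdlib: two byte-for-byte different spellings of the same character
compare equal only after NFC normalization.)

import unicodedata

precomposed = "caf\u00e9"     # e-acute as one code point (U+00E9)
decomposed = "cafe\u0301"     # 'e' plus combining acute (U+0301)

print(precomposed == decomposed)                          # False
print(unicodedata.normalize("NFC", precomposed) ==
      unicodedata.normalize("NFC", decomposed))           # True
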
Pragmatically, users are likely not to know how to do it.  I do it
with an explicit call to an external library provided by Mac OS X; I
don't know how to do it (ie, what the (de)composition is, and often
even how to input the resulting characters) without access to the
library canonicalization API.  My input methods do not provide such a
facility.  (And Unicode says that they may refuse to do so.)

Finally, this would also be inconsistent with the definition of Python
implicit in PEP 263, which clearly envisions a Python program as a
sequence of abstract characters which may have an arbitrary
ASCII-compatible encoding on disk.


From stephen at xemacs.org  Sat May 26 08:37:08 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 May 2007 15:37:08 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <20070525095511.866D.JCARLSON@uci.edu>
References: <20070525091105.8663.JCARLSON@uci.edu>
	<19dd68ba0705250945j3dadcefcu8db91b3d2c055fdf@mail.gmail.com>
	<20070525095511.866D.JCARLSON@uci.edu>
Message-ID: <878xbc3xbf.fsf@uwakimon.sk.tsukuba.ac.jp>

Josiah Carlson writes:

 > It does, but it also refuses the temptation to guess that *everyone*
 > wants to use unicode identifiers by default.  Why?  As Stephen Turnbull
 > has already stated, the majority of users will have *no use* and *no
 > exposure* to unicode identifiers.

I'm afraid I conflated two issues in that post.  I'm sorry for the
confusion.

My first claim is that editor (not Python!) users indeed will be
overwhelmingly monoscript for the foreseeable future.  I'd bet serious
money on that (as long as somebody else pays for the survey to make
the judgment :-).

My second claim is that where non-ASCII identifiers are *already*
available, their use is extremely restricted, and the overwhelming
majority of programmers never encounter them.  I predict that once PEP
3131 is implemented, their overall usage in Python programs will
increase very slowly for a few years.  However, there will be pockets
of fast diffusion (CP4E in particular, including programming classes
for history majors at university and the like).

<rant>
By the way, this is an example that shows that the recent injection of
the word "parochial" is truly pernicious, because it's attached to the
wrong set of arguments.

Please note, it is those pockets of Unicode adoption that are truly
parochial, not the ASCII advocates!  Those pockets can be early and
deep adopters precisely because they are small, homogeneous groups,
unconcerned with the world outside.  ASCII advocates are obviously
self-interested ("IAGNI, so *you* can't have it, it would cost me
extra effort"), but they are *not* parochial: they *know* they're
going to exchange code with other cultures, they *welcome* that
exchange, and *they do not want it hindered for "frivolous" reasons*.

Advocates of Unicode want it for themselves and their buddies, and of
course are happy to have it used by other groups---used
*independently* by *equally parochial* groups.

True, "frivolous" is a parochial evaluation of the cultural exchange
that use of Unicode identifiers can foster, but that notion of
"parochial" is on a different level.  IMHO that "cultural exchange"
level is highly relevant to the decision to implement Unicode
identifiers in some way, but it's the "code exchange" level that is
most relevant to the pace of introduction.  And that has to consider
the balance between faster growth within Unicode-using groups, versus
the facilitation of opportunistic[1] exchange among groups using the
(admittedly imperfect) lingua franca of ASCII.
</rant>


Footnotes: 
[1]  Ie, when you look at someone's app and go "I wonder how she does
that?  Can I use her code in my app?"  Obviously in a formal exchange,
the identifier constituent set can and should be negotiated.

From stephen at xemacs.org  Sat May 26 08:53:47 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 May 2007 15:53:47 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705251004q413a7a07k41539cdb197c015f@mail.gmail.com>
References: <Pine.LNX.4.58.0705241759100.8399@server1.LFW.org>
	<465667AE.2090000@v.loewis.de>
	<20070524215742.864E.JCARLSON@uci.edu>
	<740c3aec0705250255k642d6637re46e3929212f1369@mail.gmail.com>
	<fb6fbf560705251004q413a7a07k41539cdb197c015f@mail.gmail.com>
Message-ID: <877iqw3wjo.fsf@uwakimon.sk.tsukuba.ac.jp>

Jim Jewett writes:

 > On 5/25/07, BJörn Lindqvist <bjourne at gmail.com> wrote:
 > > If Python required a switch for such a program to run, then this
 > > feature would be totally wasted on them. They might use an IDE,
 > > program in notepad.exe and dragging the file to the python.exe icon or
 > > not even know about cmd.exe or what a command line switch is. An error
 > > message, even an informal one, isn't easy to understand if you don't
 > > know English.

This can be handled with wrappers, at install time.  Ugly, but workable.
Jim's idea is very suggestive, though:

 > How about a default file, such as
 > 
 > "on launch, python looks for pyidchar.txt ... if you want to override
 > this default file do XYZ"

This still doesn't help to address the "fine-grained" (per-module or
per-file) control issue, right?  Unless you complexified the syntax.
You could allow includes (from a site library of character set
definitions, not arbitrary files), inline table definitions, and a
file or module to table mapping.

Since this would be under the control of the site (distributions could
supply examples, but not install them where Python would pick them
up), maybe such complexity would be OK?  I believe most people's file
would be


    [DEFAULT]

    000000-1FFFFF   # intersection of the full Unicode range and PEP
                    # 3131-permitted characters

(where DEFAULT is a special table used by default for files not mapped
to another table).

How about per-user overrides?
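
(To make the shape of this concrete, here is a rough sketch of mine of
what reading such a file could look like -- the file name, function
names and the one-table simplification are all just illustration, not
an agreed spec.)

import re

def load_ranges(path):
    """Read a pyidchar.txt-style file into a list of (lo, hi) code points.

    Comments after '#' and [SECTION] headers are skipped; for simplicity
    this sketch folds everything into a single table.
    """
    ranges = []
    with open(path, encoding="ascii") as f:
        for line in f:
            line = line.split("#", 1)[0].strip()
            if not line or line.startswith("["):
                continue
            m = re.match(r"([0-9A-Fa-f]+)-([0-9A-Fa-f]+)$", line)
            if m:
                ranges.append((int(m.group(1), 16), int(m.group(2), 16)))
    return ranges

def identifier_allowed(name, ranges):
    return all(any(lo <= ord(ch) <= hi for lo, hi in ranges) for ch in name)

A per-module mapping would then just decide which list of ranges gets
passed to identifier_allowed().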


From stephen at xemacs.org  Sat May 26 09:42:57 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 May 2007 16:42:57 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
Message-ID: <87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>

Jim Jewett writes:

 > > How about a regexp character class as starting point?
 > 
 > I'm not sure I understand.  Do you mean that part of localization
 > should be defining what certain regular expressions should match?

No, I meant simply a list of character ranges, as characters.  The
definition of "safe ASCII" would be something like

    r"\t\r\n -~"

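(A quick sanity check of that class -- my own two lines, nothing
authoritative:)

import re

SAFE_ASCII = re.compile(r"[\t\r\n -~]*\Z")   # tab, CR, LF, space..tilde

print(bool(SAFE_ASCII.match("def spam(): pass\n")))   # True
print(bool(SAFE_ASCII.match("d\u00e9fi = 1\n")))      # False
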
Your table format is better.  If people want to put the actual
characters in comments (maybe in source files to be preprocessed
before installation), let them.

 > So long as we allow tailoring, I think the maximal set should be
 > generous -- and I don't see any reason to pre-exclude anything outside
 > ASCII.

Cf characters?  Are we admitting "stupid bidi tricks", too?<wink>

But I'll tell you what my reason is: we want to be in a position to
avoid prohibiting previously acceptable characters wherever possible.

 > There are people who like to use names like "Program Files" or
 > "Summary of Results.Apr-3-2007 version 2.xls"; I expect the same will
 > be true of identifiers.  So long as the punctuation is not ASCII, we
 > might as well let them.

Why not let them use ASCII punctuation, as long as it's not Python
syntax?

Ie, for one thing, we might want to do something with that punctuation
some day.  For example, I could imagine using guillemets to denote
raw strings or to substitute for triple quotes.  Local parsing (as done
by program editors) would be easier with directed quotes.  Etc.  For
reasons of visual distinctiveness, we might choose to use Chinese or
Arabic versions.

 > The other committees say to exclude certain scripts, like Linear B and
 > Ogham.  And not to allow mixed scripts, at least if they're
 > confusable.  But I really don't want to explain why someone using
 > Cyrillic can't use certain (apparently to him) randomly determined
 > identifiers just because it could be confused with ASCII (or
 > Armenian).

-1 on restrictions according to confusability or the block.  That's a
matter for personal judgement, and there are cheap technical solutions
for those who want to use confusable Cyrillic or Linear B and still
avoid confusion.  I think those restrictions are an idea that must be
available (perhaps as a table we distribute), but I think they'll turn
out to suck pretty badly.

 > If unicode comes out with a new revision, the new characters should
 > probably be allowed; I don't want a situation where users of Cham or
 > Lepcha[1] are told they have to wait another year because their
 > scripts weren't formally adopted into unicode until after python 3.4.0
 > was already released.

Tough call.  I'd say, let's cross that bridge when we come to it.

In any case there will have to be some mechanism to access a Unicode
database at either build time or run time.  Let them munge that
database if they're in a hurry.

Maybe the way to handle this is to allow private-space characters in
identifiers as an option.  That would be doable with your well-known
file scheme.  But it's very dangerous across modules.

By the way, this is what the Japanese call the "gaiji" ("outside
character") problem.  It's a very tough nut to crack; the Japanese
never did.

From timothy.c.delaney at gmail.com  Sat May 26 10:13:51 2007
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Sat, 26 May 2007 18:13:51 +1000
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<ca471dc20705241713n7f348c7eh563b5631e512fd93@mail.gmail.com>
	<017d01c79e98$c6b84090$0201a8c0@mshome.net>
	<ca471dc20705251014r46c86f74j822729e843cef797@mail.gmail.com>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
Message-ID: <002d01c79f6d$ce090de0$0201a8c0@mshome.net>

Guido van Rossum wrote:

>>> - Why not make super a keyword, instead of just prohibiting
>>> assignment to it? (I'm planning to do the same with None BTW in
>>> Py3k -- I find the "it's a name but you can't assign to it" a
>>> rather silly business and hardly "the simplest solution".)
>>
>> That's currently an open issue - I'm happy to make it a keyword - in
>> which case I think the title should be changed to "super as a
>> keyword" or something like that.
>
> As it was before. :-)
>
> What's the argument against?

I don't see any really, especially if None is to become a true keyword. But 
some people have raised objections.

>> The preamble will only be added to functions/methods that cause the
>> 'super' cell to exist i.e. for CPython have 'super' in co.cellvars.
>> Functions that just have 'super' in co.freevars wouldn't have the
>> preamble.
>
> I think it's still too vague. For example:
>
> class C:
>  def f(s):
>    return 1
> class D(C):
>  pass
> def f(s):
>  return 2*super.f()
> D.f = f
> print(D().f())
>
> Should that work? I would be okay if it didn't, and if the super
> keyword is only allowed inside a method that is lexically inside a
> class. Then the second definition of f() should be a (phase 2)
> SyntaxError.

That would simplify things. I'll update the PEP.

> Was it ever decided whether the implicitly bound class should be:
>
> - the class object as produced by the class statement (before applying
> class decorators);
> - whatever is returned by the last class decorator (if any); or
> - whatever is bound to the class name at the time the method is
> invoked?
> I've got a hunch that #1 might be more solid; #3 seems asking for
> trouble.

I think #3 is definitely the wrong thing to do, but there have been 
arguments put forwards for both #1 and #2.

I think I'll put it as an open issue for now.

> There's also the issue of what to do when the method itself is
> decorated (the compiler can't know what the decorators mean, even for
> built-in decorators like classmethod).

I think that may be a different issue. If you do something like:

class A:
    @decorator
    def func(self):
        pass

class B(A):
    @decorator
    def func(self):
        super.func()

then `super.func()` will call whatever `super(B, self).func()` would now, 
which (I think) would result in calling the decorated function.

However, I think the staticmethod decorator would need to be able to modify 
the class instance that's held by the method. Or see my proposal below ...

> We could make the class in question a fourth attribute of the (poorly
> named) "bound method" object, e.g. im_class_for_super (im_super would
> be confusing IMO). Since this is used both by instance methods and by
> the @classmethod decorator, it's just about perfect for this purpose.
> (I would almost propose to reuse im_self for this purpose, but that's
> probably asking for subtle backwards incompatibilities and not worth
> it.)

I'm actually thinking instead that an unbound method should reference an 
unbound super instance for the appropriate class - which we could then call 
im_super.

For a bound instance or class method, im_super would return the appropriate 
bound super instance. In practice, it would work like your autosuper recipe 
using __super.

e.g.

class A:
    def func(self):
        pass

>>> print A.func.im_super
<super: <class 'A'>, NULL>

>>> print A().func.im_super
<super: <class 'A'>, <A object>>

> See my proposal above. It differs slightly in that the __super__ call
> is made only when the class is not NULL. On the expectation that a
> typical function that references super uses it exactly once per call
> (that would be by far the most common case I expect) this is just
> fine. In my proposal the 'super' variable contains whatever
> __super__(<class>, <inst>) returned, rather than <class> which you
> seem to be proposing here.

Think I must have been explaining poorly - if you look at the reference 
implementation in the PEP, you'll see that that's exactly what's held in the 
'super' free variable.

I think your proposal is basically what I was trying to convey - I'll look 
at rewording the PEP so it's less ambiguous. But I'd like your thoughts on 
the above proposal to keep a reference to the actual super object rather 
than the class.

Cheers,

Tim Delaney 


From python at zesty.ca  Sat May 26 12:01:32 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Sat, 26 May 2007 05:01:32 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4657A2D6.3050809@latte.ca>
References: <20070524213605.864B.JCARLSON@uci.edu>
	<ca471dc20705250732k14598e57id40c5f95434ecee9@mail.gmail.com>
	<20070525084117.865D.JCARLSON@uci.edu>
	<Pine.LNX.4.58.0705251312480.27740@server1.LFW.org>
	<4657A2D6.3050809@latte.ca>
Message-ID: <Pine.LNX.4.58.0705260415520.27740@server1.LFW.org>

On Fri, 25 May 2007, Blake Winton wrote:
> Ka-Ping Yee wrote:
> > Let's see what we can find.  I made several attempts to search for
> > non-ASCII identifiers using google.com/codesearch and here's what I got.
>
> I think you've got a selection bias here, since Google isn't likely to
> index code not intended for the whole world, and thus the code you'll be
> searching through is more likely to be in English than code in general.

Indeed.  I couldn't think of a better way to do a search, but if you
come up with any better methods, go for it and let us know what you
find.


-- ?!ng

From python at zesty.ca  Sat May 26 12:33:23 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Sat, 26 May 2007 05:33:23 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <A2620FD5-C79D-4B8D-B824-AFB59A56735F@gmail.com>
References: <20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070524082737.862E.JCARLSON@uci.edu> <4655DD4E.3050809@v.loewis.de>
	<Pine.LNX.4.58.0705241620150.8399@server1.LFW.org>
	<4656129D.5000406@v.loewis.de>
	<A2620FD5-C79D-4B8D-B824-AFB59A56735F@gmail.com>
Message-ID: <Pine.LNX.4.58.0705260518290.27740@server1.LFW.org>

Ka-Ping Yee wrote:
> Alas, the coding directive is not good enough.  Have a look at this:
>
>     http://zesty.ca/python/tricky.png
>
> That's an image of a text editor containing some Python code.  Can you
> tell whether running it (post-PEP-3131) will delete your .bashrc file?

Martin v. Löwis wrote:
> I would think that it doesn't (i.e. allowed should stay at 0).
>
> Why does os.remove get invoked?

Mike Klaas wrote:
> Perhaps a letter in the encoding declaration is non-ascii, nullifying
> the encoding enforcement and allowing a cyrillic 'a' in  allowed = 0?

You got it.

See the actual source file at

    http://zesty.ca/python/tricky.py

There are three things going on here:

    1.  All three occurrences of "allowed" look the same.  And
        it seems they are truly the same, because the coding
        declaration on line 2 says the file is ASCII.  But in
        fact, they aren't the same -- one of them contains a
        Cyrillic "a", which changes the meaning of the program.

    2.  But how is that possible when the coding declaration
        says the file is ASCII?  If you believe it, then you
        also expect the coding declaration itself to be ASCII,
        i.e., a real coding declaration.  But it isn't -- the
        word "coding" contains a Cyrillic "c".

    3.  Then why doesn't Python complain about this non-ASCII
        character on line 2 of the file, since ASCII is supposed
        to be the default encoding?  Because there is a UTF-8 BOM
        at the beginning of the file.

        PEP 263 tries to prevent confusion by making Python complain
        if the coding declaration conflicts with the already-set
        UTF-8 encoding.  But even though line 2 looks like a coding
        declaration, Python doesn't notice it, so you get no warning.

The conclusion is that one cannot rely on the coding declaration
to know what the encoding is, because one cannot know what the
coding declaration says.  We would be able to rely on it, if only
it were encoded in ASCII.  But the enabling of UTF-8 by a BOM at the
beginning of the file is an invisible override.  This invisible
override is the source of the danger.  If we want to be able to
read the coding declaration with any confidence, we should get rid
of the invisible override.
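
(A small script of mine that reconstructs the trick for local
experimentation -- the file name and variable names are made up.)

import codecs

body = (
    "# -*- \u0441oding: ascii -*-\n"   # the 'c' here is Cyrillic U+0441
    "allowed = 0\n"
    "\u0430llowed = 1\n"               # the 'a' here is Cyrillic U+0430
)

# The UTF-8 BOM silently switches on UTF-8 decoding, so the bogus coding
# line and the lookalike identifier both slip through unnoticed.
with open("tricky_demo.py", "wb") as f:
    f.write(codecs.BOM_UTF8 + body.encode("utf-8"))

lines = body.splitlines()
name1 = lines[1].split("=")[0].strip()
name2 = lines[2].split("=")[0].strip()
print(name1, name2, name1 == name2)    # allowed allowed False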


-- ?!ng

From python at zesty.ca  Sat May 26 12:37:46 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Sat, 26 May 2007 05:37:46 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4657AD52.5020101@latte.ca>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<20070524213605.864B.JCARLSON@uci.edu>
	<19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
	<fb6fbf560705250837i16262513p9bc5a70f9a4c506f@mail.gmail.com>
	<19dd68ba0705250854o40a1025cse3d5f2c38cd76785@mail.gmail.com>
	<19dd68ba0705250855r6d2676c6r5e9cb7a49b95b6ac@mail.gmail.com>
	<fb6fbf560705251120t73430a96pc98b1e15d03a36c4@mail.gmail.com>
	<4657AD52.5020101@latte.ca>
Message-ID: <Pine.LNX.4.58.0705260506250.27740@server1.LFW.org>

On Fri, 25 May 2007, Blake Winton wrote:
> Jim Jewett wrote:
>  > Arbitrary Unicode identifier opens up the possibility of code that
>  > *looks* like ASCII, but isn't -- so I don't even realize that I missed
>  > something.
>
> You already have that problem.  Right now.  And you've had it for at
> least a year (assuming you installed 2.4.3 when it came out).
>
> All screenshots taken on Python 2.4.3, Mac OSX 10.4 Intel.
>
> http://bwinton.latte.ca/temp/Python/File.png
> http://bwinton.latte.ca/temp/Python/Run.png
> http://bwinton.latte.ca/temp/Python/foo.py

Yes -- you have demonstrated exactly why the default encoding for
Python files should be 7-bit ASCII, and why a coding declaration should
be required to switch to other encodings, to let the reader know that
the file might not contain what it appears to contain.

Your file, like tricky.py, relies on the invisible enabling of UTF-8
by a UTF-8-encoded BOM at the beginning of the file.

Switching to UTF-8 invisibly (or by default) is dangerous; enabling
non-ASCII identifiers by default augments this problem to a whole
new level.  Neither should be the default.


-- ?!ng

From bwinton at latte.ca  Sat May 26 15:31:42 2007
From: bwinton at latte.ca (Blake Winton)
Date: Sat, 26 May 2007 09:31:42 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705260415520.27740@server1.LFW.org>
References: <20070524213605.864B.JCARLSON@uci.edu>	<ca471dc20705250732k14598e57id40c5f95434ecee9@mail.gmail.com>	<20070525084117.865D.JCARLSON@uci.edu>	<Pine.LNX.4.58.0705251312480.27740@server1.LFW.org>	<4657A2D6.3050809@latte.ca>
	<Pine.LNX.4.58.0705260415520.27740@server1.LFW.org>
Message-ID: <465836BE.8080900@latte.ca>

Ka-Ping Yee wrote:
> On Fri, 25 May 2007, Blake Winton wrote:
>> Ka-Ping Yee wrote:
>>> Let's see what we can find.  I made several attempts to search for
>>> non-ASCII identifiers using google.com/codesearch and here's what I got.
>> I think you've got a selection bias here, since Google isn't likely to
>> index code not intended for the whole world, and thus the code you'll be
>> searching through is more likely to be in English than code in general.
> 
> Indeed.  I couldn't think of a better way to do a search, but if you
> come up with any better methods, go for it and let us know what you
> find.

That was what my second [snipped] paragraph was about.

If you could find tutorials or sample code in other languages, that 
might be less biased.  Or maybe more biased in the other direction.  On 
the other hand, I suspect you might have to work at Google to be able to 
run those sorts of queries.

It's a hard problem, and while I applaud your effort, I just wanted to 
make sure that people knew that it wasn't necessarily representative of 
the real world.

Later,
Blake.

From guido at python.org  Sat May 26 16:08:47 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 26 May 2007 07:08:47 -0700
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
In-Reply-To: <002d01c79f6d$ce090de0$0201a8c0@mshome.net>
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<ca471dc20705241713n7f348c7eh563b5631e512fd93@mail.gmail.com>
	<017d01c79e98$c6b84090$0201a8c0@mshome.net>
	<ca471dc20705251014r46c86f74j822729e843cef797@mail.gmail.com>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
Message-ID: <ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>

Quick, since I'm about to hop on a plane: Thinking about it again,
storing the super instance in the bound method object is fine, as long
as you only do it when the bound function needs it. Using an unbound
super object in an unbound method is also fine.

--Guido

On 5/26/07, Tim Delaney <timothy.c.delaney at gmail.com> wrote:
> Guido van Rossum wrote:
>
> >>> - Why not make super a keyword, instead of just prohibiting
> >>> assignment to it? (I'm planning to do the same with None BTW in
> >>> Py3k -- I find the "it's a name but you can't assign to it" a
> >>> rather silly business and hardly "the simplest solution".)
> >>
> >> That's currently an open issue - I'm happy to make it a keyword - in
> >> which case I think the title should be changed to "super as a
> >> keyword" or something like that.
> >
> > As it was before. :-)
> >
> > What's the argument against?
>
> I don't see any really, especially if None is to become a true keyword. But
> some people have raised objections.
>
> >> The preamble will only be added to functions/methods that cause the
> >> 'super' cell to exist i.e. for CPython have 'super' in co.cellvars.
> >> Functions that just have 'super' in co.freevars wouldn't have the
> >> preamble.
> >
> > I think it's still too vague. For example:
> >
> > class C:
> >  def f(s):
> >    return 1
> > class D(C):
> >  pass
> > def f(s):
> >  return 2*super.f()
> > D.f = f
> > print(D().f())
> >
> > Should that work? I would be okay if it didn't, and if the super
> > keyword is only allowed inside a method that is lexically inside a
> > class. Then the second definition of f() should be a (phase 2)
> > SyntaxError.
>
> That would simplify things. I'll update the PEP.
>
> > Was it ever decided whether the implicitly bound class should be:
> >
> > - the class object as produced by the class statement (before applying
> > class decorators);
> > - whatever is returned by the last class decorator (if any); or
> > - whatever is bound to the class name at the time the method is
> > invoked?
> > I've got a hunch that #1 might be more solid; #3 seems asking for
> > trouble.
>
> I think #3 is definitely the wrong thing to do, but there have been
> arguments put forwards for both #1 and #2.
>
> I think I'll put it as an open issue for now.
>
> > There's also the issue of what to do when the method itself is
> > decorated (the compiler can't know what the decorators mean, even for
> > built-in decorators like classmethod).
>
> I think that may be a different issue. If you do something like:
>
> class A:
>     @decorator
>     def func(self):
>         pass
>
> class B(A):
>     @decorator
>     def func(self):
>         super.func()
>
> then `super.func()` will call whatever `super(B, self).func()` would now,
> which (I think) would result in calling the decorated function.
>
> However, I think the staticmethod decorator would need to be able to modify
> the class instance that's held by the method. Or see my proposal below ...
>
> > We could make the class in question a fourth attribute of the (poorly
> > named) "bound method" object, e.g. im_class_for_super (im_super would
> > be confusing IMO). Since this is used both by instance methods and by
> > the @classmethod decorator, it's just about perfect for this purpose.
> > (I would almost propose to reuse im_self for this purpose, but that's
> > probably asking for subtle backwards incompatibilities and not worth
> > it.)
>
> I'm actually thinking instead that an unbound method should reference an
> unbound super instance for the appropriate class - which we could then call
> im_super.
>
> For a bound instance or class method, im_super would return the appropriate
> bound super instance. In practice, it would work like your autosuper recipe
> using __super.
>
> e.g.
>
> class A:
>     def func(self):
>         pass
>
> >>> print A.func.im_super
> <super: <class 'A'>, NULL>
>
> >>> print A().func.im_super
> <super: <class 'A'>, <A object>>
>
> > See my proposal above. It differs slightly in that the __super__ call
> > is made only when the class is not NULL. On the expectation that a
> > typical function that references super uses it exactly once per call
> > (that would be by far the most common case I expect) this is just
> > fine. In my proposal the 'super' variable contains whatever
> > __super__(<class>, <inst>) returned, rather than <class> which you
> > seem to be proposing here.
>
> Think I must have been explaining poorly - if you look at the reference
> implementation in the PEP, you'll see that that's exactly what's held in the
> 'super' free variable.
>
> I think your proposal is basically what I was trying to convey - I'll look
> at rewording the PEP so it's less ambiguous. But I'd like your thoughts on
> the above proposal to keep a reference to the actual super object rather
> than the class.
>
> Cheers,
>
> Tim Delaney
>
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From showell30 at yahoo.com  Sat May 26 16:32:56 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Sat, 26 May 2007 07:32:56 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <878xbc3xbf.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <19685.97380.qm@web33511.mail.mud.yahoo.com>


--- "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> <rant>
> By the way, this is an example that shows that the
> recent injection of
> the word "parochial" is truly pernicious, because
> it's attached to the
> wrong set of arguments.
> 

Sorry.  I'm one of the folks who has propagated that
term, and I didn't mean for the use of the term to
have any pernicious side effect.  I have used the word
in a context that basically has me labelling myself as
"parochial," so obviously I don't want the word to
carry any baggage.

> Please note, it is those pockets of Unicode adoption
> that are truly
> parochial, not the ASCII advocates!  Those pockets
> can be early and
> deep adopters precisely because they are small,
> homogeneous groups,
> unconcerned with the world outside.  

That's how I see it too.  And again, I don't put any
baggage with the term "parochial."  I accept, and
embrace, the possibility that you could have thriving
small communities of Python somewhere on the other
side of the globe from me, and even though they're
writing code with identifiers that I can't read, they
may indirectly benefit me to the extent that they
eventually contribute back to the community.  Or maybe
they never benefit me at all, but the world is a
better place.

> ASCII advocates
> are obviously
> self-interested ("IAGNI, so *you* can't have it, it
> would cost me
> extra effort"), but they are *not* parochial: they
> *know* they're
> going to exchange code with other cultures, they
> *welcome* that
> exchange, and *they do not want it hindered for
> "frivolous" reasons*.
> 

That describes me perfectly.  I am self-interested to
the extent that my employers just pay me to write
working Python code, so I want the simplicity of ASCII
only.  My whole team is parochial in regards to the
content of the code itself, even though culturally we
are very diverse (American-born programmers are the
minority).

In the open source world, I have in fact exchanged
code with other cultures, I have welcomed the
exchange, and I wouldn't want it hindered for
frivolous reasons.  

> [...]

> True, "frivolous" is a parochial evaluation of the
> cultural exchange
> that use of Unicode identifiers can foster, but that
> notion of
> "parochial" is on a different level.  IMHO that
> "cultural exchange"
> level is highly relevant to the decision to
> implement Unicode
> identifiers in some way, but it's the "code
> exchange" level that is
> most relevant to the pace of introduction.  

Well said.

> And that
> has to consider
> the balance between faster growth within
> Unicode-using groups, versus
> the facilitation of opportunistic[1] exchange among
> groups using the
> (admittedly imperfect) lingua franca of ASCII.
> </rant>
> 

Yep.


From murman at gmail.com  Sat May 26 17:21:04 2007
From: murman at gmail.com (Michael Urman)
Date: Sat, 26 May 2007 10:21:04 -0500
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <Pine.LNX.4.58.0705260518290.27740@server1.LFW.org>
References: <20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070524082737.862E.JCARLSON@uci.edu> <4655DD4E.3050809@v.loewis.de>
	<Pine.LNX.4.58.0705241620150.8399@server1.LFW.org>
	<4656129D.5000406@v.loewis.de>
	<A2620FD5-C79D-4B8D-B824-AFB59A56735F@gmail.com>
	<Pine.LNX.4.58.0705260518290.27740@server1.LFW.org>
Message-ID: <dcbbbb410705260821xd0ff29cu8280662fe60378fb@mail.gmail.com>

On 5/26/07, Ka-Ping Yee <python at zesty.ca> wrote:
> But the enabling of UTF-8 by a BOM at the
> beginning of the file is an invisible override.  This invisible
> override is the source of the danger.  If we want to be able to
> read the coding declaration with any confidence, we should get rid
> of the invisible override.

Do we need to reconsider PEP 3120 "Using UTF-8 as the default source
encoding"? I don't see much difference between not knowing on visual
inspection whether:
    allowed is allowed
or
    "allowed" == "allowed"

I hope that's not your stance, because I still don't expect either to
cause problems in the real world. Of course since it's currently not
possible, it's hard to go trolling for existing use cases of confusing
identifiers in python code.

-- 
Michael Urman

From jimjjewett at gmail.com  Sat May 26 17:47:47 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sat, 26 May 2007 11:47:47 -0400
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
In-Reply-To: <ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<ca471dc20705241713n7f348c7eh563b5631e512fd93@mail.gmail.com>
	<017d01c79e98$c6b84090$0201a8c0@mshome.net>
	<ca471dc20705251014r46c86f74j822729e843cef797@mail.gmail.com>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
Message-ID: <fb6fbf560705260847p764f12c2m9c16bad0dea4ccc3@mail.gmail.com>

On 5/25/07, Guido van Rossum <guido at python.org> wrote:

> We could make the class in question a fourth attribute of the (poorly
> named) "bound method" object, e.g. im_class_for_super (im_super would
> be confusing IMO).

In the past, you have referred to this as the static class.

I think it has other uses as well, such as a class-wide registry
(whose location shouldn't be redirected without overriding the whole
method).

I realize this is the rejected __this_class__ proposal, but I can't
help feeling that if we're going to create the magic attribute anyhow,
it makes sense to have it be generally usable, instead of only as a
token to create a super.

> In my proposal the 'super' variable contains whatever
> __super__(<class>, <inst>) returned, rather than <class> which you
> seem to be proposing here.

That's fine, but the <class> still has to be stored with the method to
generate that super -- so why not expose it too?

-jJ

From jimjjewett at gmail.com  Sat May 26 18:00:02 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sat, 26 May 2007 12:00:02 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <4657AD52.5020101@latte.ca>
References: <Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>
	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>
	<20070524213605.864B.JCARLSON@uci.edu>
	<19dd68ba0705242231j4f391f00n79112a01c0f339bc@mail.gmail.com>
	<fb6fbf560705250837i16262513p9bc5a70f9a4c506f@mail.gmail.com>
	<19dd68ba0705250854o40a1025cse3d5f2c38cd76785@mail.gmail.com>
	<19dd68ba0705250855r6d2676c6r5e9cb7a49b95b6ac@mail.gmail.com>
	<fb6fbf560705251120t73430a96pc98b1e15d03a36c4@mail.gmail.com>
	<4657AD52.5020101@latte.ca>
Message-ID: <fb6fbf560705260900i55cc90bdl12e5e2644614e67e@mail.gmail.com>

On 5/25/07, Blake Winton <bwinton at latte.ca> wrote:
> Jim Jewett wrote:

>  > Arbitrary Unicode identifier opens up the possibility of code that
>  > *looks* like ASCII, but isn't -- so I don't even realize that I missed
>  > something.

> You already have that problem.

> All screenshots taken on Python 2.4.3, Mac OSX 10.4 Intel.

> http://bwinton.latte.ca/temp/Python/File.png
> http://bwinton.latte.ca/temp/Python/Run.png
> http://bwinton.latte.ca/temp/Python/foo.py

> So, what are you doing to mitigate this risk now, and why not do the
> same thing when identifiers are allowed to be arbitrary Unicode?

Looking at foo.py, I didn't even realize at first that it was supposed
to be a lookalike for triple-quotes; in my font, it is different 
enough to draw the eye and tell me that something is wrong.

I don't like counting on that, but it does work -- for ASCII.  It
stops working with unicode, because the glyphs are even closer (or
identical).

This is partly a historical accident -- ASCII has been used long
enough that there are widespread fonts (including the most common
monospaced fonts) which make the distinctions fairly clear, and I'm
already trained to look for the edge cases.

Neither of these safeguards is true for unicode, nor will they become
true in the forseeable future.  Given the sheer size of unicode, these
safeguards may never become available in the general case -- but we
already have them for ASCII.

-jJ

From ncoghlan at gmail.com  Sat May 26 18:31:42 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 May 2007 02:31:42 +1000
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <19dd68ba0705250910o5b56b4f9i9fccd450e37f48fe@mail.gmail.com>
References: <465615C9.4080505@v.loewis.de>	<320102.38046.qm@web33515.mail.mud.yahoo.com>	<19dd68ba0705241805y52ba93fdt284a2c696b004989@mail.gmail.com>	<Pine.LNX.4.58.0705242126440.27740@server1.LFW.org>	<ca471dc20705242009h27882084la242b96222e28b29@mail.gmail.com>	<87odk93w09.fsf@uwakimon.sk.tsukuba.ac.jp>	<19dd68ba0705250641j348a42adu974fe4969897761e@mail.gmail.com>	<19dd68ba0705250653v2c2a8188jac8c4ccc722fb747@mail.gmail.com>	<87irag51in.fsf@uwakimon.sk.tsukuba.ac.jp>
	<19dd68ba0705250910o5b56b4f9i9fccd450e37f48fe@mail.gmail.com>
Message-ID: <465860EE.8050005@gmail.com>

Guillaume Proux wrote:
> On 5/26/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>> For the medium term, there are ways to pass command line arguments to
>> programs invoked by GUI.  They're more or less ugly, but your daughter
>> will never see them, only the pretty icons.
> 
> Is there one right now in Windows?  There is none that I know of today
> at least. All I know is that specific extensions are called automatically
> using a given interpreter because of bindings defined in the registry.
> There is no simple way to add per-file info afaik.

You can edit the action used to launch .py files on double click by 
going into View->Options->File Types in Windows Explorer (that location 
may not be exactly correct - my Windows box isn't switched on at the 
moment).

Or, assuming an environment variable is supported (ala PYTHONINSPECT vs 
the -i switch), you could just set that environment variable to allow 
any character.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From jimjjewett at gmail.com  Sat May 26 18:39:57 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sat, 26 May 2007 12:39:57 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>

On 5/26/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Jim Jewett writes:
>  > So long as we allow tailoring, I think the maximal set should be
>  > generous -- and I don't see any reason to pre-exclude anything
>  > outside ASCII.

> Cf characters?  Are we admitting "stupid bidi tricks", too?<wink>

If Tomer needs them.

Seriously, I wouldn't put Cf characters in the default accepted
table.  (But remember that *I* would limit that default to ASCII.)
Tomer suggested that bidi characters might be needed to get Hebrew and
Arabic working correctly.  Given that someone has already decided to
use Arabic (or even Arabic presentational forms), he or she is better
placed to decide whether Cf characters are needed too.

> But I'll tell you what my reason is: we want to be in a position to
> avoid prohibiting previously acceptable characters wherever possible.

Agreed; but in my opinion, the decision to allow those characters is
local; the decision to rescind them would therefore also be local.

We do want to avoid retracting characters from the default set.  (And
again, if we restrict that default set to ASCII, we'll be fine.)

>  > There are people who like to use names like "Program Files" or
>  > "Summary of Results.Apr-3-2007 version 2.xls"; I expect the same will
>  > be true of identifiers.  So long as the punctuation is not ASCII, we
>  > might as well let them.

> Why not let them use ASCII punctuation, as long as it's not Python
> syntax?

Because there really isn't any unreserved ASCII punctuation.  One
issue with @decorators was that it caused some hassle for (reasonably
well-known) third-party tools which had been using the "@" character.

It would make perfect sense to me if the consensus French table
excluded guillemets.  But I figure that should be their decision.

>  > The other committees say to exclude certain scripts, like
>  >  Linear B and Ogham.

(I should probably have noted that Linear B and Ogham are not used by
any modern language; I *think* the excluded scripts were all for
things that would not represent anyone's primary script or mother
tongue.)

>  > If unicode comes out with a new revision, the new characters should
>  > probably be allowed; I don't want a situation where users of Cham or
>  > Lepcha[1] are told they have to wait another year because their
>  > scripts weren't formally adopted into unicode until after python 3.4.0
>  > was already released.

> Tough call.  I'd say, let's cross that bridge when we come to it.

> In any case there will have to be some mechanism to access a Unicode
> database at either build time or run time.  Let them munge that
> database if they're in a hurry.

I had been thinking of the unicode version as a feature that didn't
change within a python release.  Perhaps that is negotiable?

> Maybe the way to handle this is to allow private-space characters in
> identifiers as an option.  That would be doable with your well-known
> file scheme.  But it's very dangerous across modules.

It turns out that page was out of date; Lepcha and Cham now have code
points which haven't been formally approved, but aren't likely to
change.  Officially, they're still undefined, but using private-space
probably isn't the right answer.  So either we allow these particular
"undefined" characters, or we (for now) disallow Lepcha and Cham.

-jJ

From foom at fuhm.net  Sat May 26 21:49:30 2007
From: foom at fuhm.net (James Y Knight)
Date: Sat, 26 May 2007 15:49:30 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87ps4p3zot.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
	<781A2C3C-011E-4048-A72A-BE631C0C5127@fuhm.net>
	<87ps4p3zot.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <9D017904-5A64-40EC-8A5C-23502FB1E314@fuhm.net>


On May 25, 2007, at 7:33 AM, Stephen J. Turnbull wrote:

>> Adding baroque command line options for users of other languages to
>> do some useless verification at import time is not an acceptable
>> answer. It'd be better to just reject the PEP entirely.
>
> Speaking of exaggeration ....

I am serious. I fully support python having unicode identifier  
support. But I believe it would be far worse for Python to have  
complicated identifier syntax configuration via command line options  
or auxiliary files than to stay restricted to ASCII.

If the identifier syntax is changed to include unicode, all python  
modules are still usable everywhere. Once you start going down the  
road of configurable syntax (worse: globally configurable syntax),  
there will be a "second class" of python modules that won't work on  
some systems without extra pain.

I'm listening to all these proposals for options, and it's just  
getting *worse and worse*.

It started with a simple "-U", grew into a "-U <language>", grew into  
a 'pyidchar.txt' file with a list of character ranges, and now that  
pyidchar.txt file is going to have separate sections based on module  
name? Sorry, but are you !@# kidding me?!?

James

From timothy.c.delaney at gmail.com  Sat May 26 23:04:02 2007
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Sun, 27 May 2007 07:04:02 +1000
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<ca471dc20705241713n7f348c7eh563b5631e512fd93@mail.gmail.com>
	<017d01c79e98$c6b84090$0201a8c0@mshome.net>
	<ca471dc20705251014r46c86f74j822729e843cef797@mail.gmail.com>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
	<ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>
Message-ID: <003f01c79fd9$66948ec0$0201a8c0@mshome.net>

Guido van Rossum wrote:
> Quick, since I'm about to hop on a plane: Thinking about it again,
> storing the super instance in the bound method object is fine, as long
> as you only do it when the bound function needs it. Using an unbound
> super object in an unbound method is also fine.

OTOH, I've got a counter argument to storing the super object - we don't 
want to create a permanent cycle.

If we store the class, we can store it as a weakref - then when the super 
object is created, a strong reference to the class exists.

We can't store a weakref to the super instance though, as there won't be any 
other reference to it.

I still quite like the idea of im_super though, but it would need to be a 
property instead of just a reference.

I also agree with Jim that exposing the class object is useful e.g. for 
introspection.

So I propose the following:

1. Internal weakref to class object.

2. im_type - property that returns a strong ref to the class object.

I went through several names before coming up with im_type (im_staticclass, 
im_classobj, im_classobject, im_boundclass, im_bindingclass). I think 
im_type conveys exactly what we want this attribute to represent - the 
class/type that this method was defined in.

im_class would have also been suitable, but has had another, different 
meaning since 2.2.

3. im_super - property that returns the unbound super object (for an unbound 
method) and bound super object (for a bound method).
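
(A rough pure-Python sketch of the im_type/im_super idea above --
purely illustrative; the wrapper class below is hypothetical, and the
real attributes would live on the C-level method object.)

import weakref

class methodwrapper:
    """Toy stand-in for a (un)bound method that only weakly references
    its defining class, exposing im_type and im_super as properties."""

    def __init__(self, func, cls, instance=None):
        self.__func__ = func
        self._cls_ref = weakref.ref(cls)      # avoids a permanent cycle
        self.__self__ = instance

    @property
    def im_type(self):
        return self._cls_ref()                # strong ref only on demand

    @property
    def im_super(self):
        cls = self._cls_ref()
        if self.__self__ is None:
            return super(cls)                 # unbound super object
        return super(cls, self.__self__)      # bound super object

    def __call__(self, *args, **kwargs):
        if self.__self__ is None:
            return self.__func__(*args, **kwargs)
        return self.__func__(self.__self__, *args, **kwargs)

class A:
    def greet(self):
        return "A"

class B(A):
    def greet(self):
        return "B"

m = methodwrapper(B.greet, B, B())
print(m.im_type)             # <class '__main__.B'>
print(m.im_super.greet())    # 'A' -- the bound super delegates to A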

Tim Delaney

> On 5/26/07, Tim Delaney <timothy.c.delaney at gmail.com> wrote:
>> Guido van Rossum wrote:
>>
>>>>> - Why not make super a keyword, instead of just prohibiting
>>>>> assignment to it? (I'm planning to do the same with None BTW in
>>>>> Py3k -- I find the "it's a name but you can't assign to it" a
>>>>> rather silly business and hardly "the simplest solution".)
>>>>
>>>> That's currently an open issue - I'm happy to make it a keyword -
>>>> in which case I think the title should be changed to "super as a
>>>> keyword" or something like that.
>>>
>>> As it was before. :-)
>>>
>>> What's the argument against?
>>
>> I don't see any really, especially if None is to become a true
>> keyword. But some people have raised objections.
>>
>>>> Th preamble will only be added to functions/methods that cause the
>>>> 'super' cell to exist i.e. for CPython have 'super' in co.cellvars.
>>>> Functions that just have 'super' in co.freevars wouldn't have the
>>>> preamble.
>>>
>>> I think it's still too vague. For example:
>>>
>>> class C:
>>>  def f(s):
>>>    return 1
>>> class D(C):
>>>  pass
>>> def f(s):
>>>  return 2*super.f()
>>> D.f = f
>>> print(D().f())
>>>
>>> Should that work? I would be okay if it didn't, and if the super
>>> keyword is only allowed inside a method that is lexically inside a
>>> class. Then the second definition of f() should be a (phase 2)
>>> SyntaxError.
>>
>> That would simplify things. I'll update the PEP.
>>
>>> Was it ever decided whether the implicitly bound class should be:
>>>
>>> - the class object as produced by the class statement (before
>>> applying class decorators);
>>> - whatever is returned by the last class decorator (if any); or
>>> - whatever is bound to the class name at the time the method is
>>> invoked?
>>> I've got a hunch that #1 might be more solid; #3 seems asking for
>>> trouble.
>>
>> I think #3 is definitely the wrong thing to do, but there have been
>> arguments put forward for both #1 and #2.
>>
>> I think I'll put it as an open issue for now.
>>
>>> There's also the issue of what to do when the method itself is
>>> decorated (the compiler can't know what the decorators mean, even
>>> for built-in decorators like classmethod).
>>
>> I think that may be a different issue. If you do something like:
>>
>> class A:
>>     @decorator
>>     def func(self):
>>         pass
>>
>> class B(A):
>>     @decorator
>>     def func(self):
>>         super.func()
>>
>> then `super.func()` will call whatever `super(B, self).func()` would
>> now, which (I think) would result in calling the decorated function.
>>
>> However, I think the staticmethod decorator would need to be able to
>> modify the class instance that's held by the method. Or see my
>> proposal below ...
>>> We could make the class in question a fourth attribute of the
>>> (poorly named) "bound method" object, e.g. im_class_for_super
>>> (im_super would be confusing IMO). Since this is used both by
>>> instance methods and by the @classmethod decorator, it's just about
>>> perfect for this purpose. (I would almost propose to reuse im_self
>>> for this purpose, but that's probably asking for subtle backwards
>>> incompatibilities and not worth it.)
>>
>> I'm actually thinking instead that an unbound method should
>> reference an unbound super instance for the appropriate class -
>> which we could then call im_super.
>>
>> For a bound instance or class method, im_super would return the
>> appropriate bound super instance. In practice, it would work like
>> your autosuper recipe using __super.
>>
>> e.g.
>>
>> class A:
>>     def func(self):
>>         pass
>>
>>>>> print A.func.im_super
>> <super: <class 'A'>, NULL>
>>
>>>>> print A().func.im_super
>> <super: <class 'A'>, <A object>>
>>
>>> See my proposal above. It differs slightly in that the __super__
>>> call is made only when the class is not NULL. On the expectation
>>> that a typical function that references super uses it exactly once
>>> per call (that would be by far the most common case I expect) this
>>> is just fine. In my proposal the 'super' variable contains whatever
>>> __super__(<class>, <inst>) returned, rather than <class> which you
>>> seem to be proposing here.
>>
>> Think I must have been explaining poorly - if you look at the
>> reference implementation in the PEP, you'll see that that's exactly
>> what's held in the 'super' free variable.
>>
>> I think your proposal is basically what I was trying to convey -
>> I'll look at rewording the PEP so it's less ambiguous. But I'd like
>> your thoughts on the above proposal to keep a reference to the
>> actual super object rather than the class.
>>
>> Cheers,
>>
>> Tim Delaney 


From baptiste13 at altern.org  Sun May 27 00:29:26 2007
From: baptiste13 at altern.org (Baptiste Carvello)
Date: Sun, 27 May 2007 00:29:26 +0200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <9D017904-5A64-40EC-8A5C-23502FB1E314@fuhm.net>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<464FFD04.90602@v.loewis.de>	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>	<46521CD7.9030004@v.loewis.de>	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>	<46527904.1000202@v.loewis.de>	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>	<781A2C3C-011E-4048-A72A-BE631C0C5127@fuhm.net>	<87ps4p3zot.fsf@uwakimon.sk.tsukuba.ac.jp>
	<9D017904-5A64-40EC-8A5C-23502FB1E314@fuhm.net>
Message-ID: <f3acfd$417$1@sea.gmane.org>

James Y Knight wrote:

> there will be a "second class" of python modules that won't work on  
> some systems without extra pain.
> 
modules using unicode identifiers *will be* second class anyway, because most
people won't be able to debug them if the need arises. However, this does not
matter for teaching or for in-house code, which are the most compelling use
cases for the new feature.

BC


From showell30 at yahoo.com  Sun May 27 00:53:59 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Sat, 26 May 2007 15:53:59 -0700 (PDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <f3acfd$417$1@sea.gmane.org>
Message-ID: <73543.28835.qm@web33502.mail.mud.yahoo.com>


--- Baptiste Carvello <baptiste13 at altern.org> wrote:
> However, this does not
> matter for teaching and for in-house code, which are
> the most compelling use
> cases of the new feature.
> 

For the teaching use case, I'm wondering if the
English keywords would already present too high a
barrier for students who don't have first-semester
familiarity with English.

In this example below, altered from Chapter 4 of the
tutorial, I have tried to make the keywords appear
foreign to an English user, so that an
English-speaking person could imagine the opposite
scenario.

fed ask_ok(prompt, retries=4, complaint='Yes or no,
please!'):
    elihw Eurt:
        ok = tupni_war(prompt)
        fi ok ni ('y', 'ye', 'yes'): nruter Eurt
        fi ok ni ('n', 'no', 'nop', 'nope'): nruter
Eslaf
        retries = retries - 1
        fi retries < 0: esiar ROrreoi, 'refusenik
user'
        tnirp complaint

To truly enable Python in a non-English teaching
environment, I think you'd actually want to go a step
further and just internationalize the whole program.




       

From showell30 at yahoo.com  Sun May 27 01:42:46 2007
From: showell30 at yahoo.com (Steve Howell)
Date: Sat, 26 May 2007 16:42:46 -0700 (PDT)
Subject: [Python-3000] some stats on identifiers (PEP 3131)
Message-ID: <601195.60246.qm@web33514.mail.mud.yahoo.com>

Here is a survey of some Python code to see how often
tokens typically get used in Python 2.

Here is the program I used to count the tokens, if you
want to try it out on your own in-house codebase:

import tokenize
import sys
fn = sys.argv[1]
g = tokenize.generate_tokens(open(fn).readline)
dct = {}
for tup in g:
    if tup[0] == 1:   # token type 1 == tokenize.NAME (identifiers and keywords)
        identifier = tup[1]
        dct[identifier] = dct.get(identifier, 0) + 1
identifiers = dct.keys()
identifiers.sort()
for identifier in identifiers:
    print '%4d' % dct[identifier], identifier

The top 15 in gettext.py:

ssslily> python2.5 count.py /usr/local/lib/python2.5/gettext.py | sort -rn | head -15
  98 self
  73 if
  69 return
  39 def
  35 msgid1
  34 tmsg
  33 n
  33 None
  32 domain
  31 message
  29 msgid2
  28 _fallback
  21 else
  20 locale
  20 in

The top 15 in an in-house program that deals with an
American-based format for sending financial
transactions (closest thing I could find to Dutch tax
law):

  23 trackData
  19 ErrorMessages
  18 rest
  16 cuts
  12 encryptedPin
  11 return
  10 request
  10 p2
  10 p1
  10 maskedMessage
  10 j
  10 in
  10 i
   9 len
   9 ccNum



       

From python at zesty.ca  Sun May 27 03:19:50 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Sat, 26 May 2007 20:19:50 -0500 (CDT)
Subject: [Python-3000] PEP 3131 accepted
In-Reply-To: <dcbbbb410705260821xd0ff29cu8280662fe60378fb@mail.gmail.com>
References: <20070523111704.85FC.JCARLSON@uci.edu>
	<87abvu5yfu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20070524082737.862E.JCARLSON@uci.edu> <4655DD4E.3050809@v.loewis.de> 
	<Pine.LNX.4.58.0705241620150.8399@server1.LFW.org>
	<4656129D.5000406@v.loewis.de>
	<A2620FD5-C79D-4B8D-B824-AFB59A56735F@gmail.com> 
	<Pine.LNX.4.58.0705260518290.27740@server1.LFW.org>
	<dcbbbb410705260821xd0ff29cu8280662fe60378fb@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0705261946280.27740@server1.LFW.org>

On Sat, 26 May 2007, Michael Urman wrote:
> On 5/26/07, Ka-Ping Yee <python at zesty.ca> wrote:
> > But the enabling of UTF-8 by a BOM at the
> > beginning of the file is an invisible override.  This invisible
> > override is the source of the danger.  If we want to be able to
> > read the coding declaration with any confidence, we should get rid
> > of the invisible override.
>
> Do we need to reconsider PEP 3120 "Using UTF-8 as the default source
> encoding"? I don't see much difference between not knowing on visual
> inspection whether:
>     allowed is allowed
> or
>     "allowed" == "allowed"

The concern is similar in nature, but there is a difference.  It is
more feasible to tell programmers not to trust the visual appearance
of strings than to tell them not to trust the visual appearance of
identifiers.  Strings are data, which makes them separable from the
structure and logic of a program, whereas identifiers are fundamental
to all programs.  Programmers are already trained to understand that
string literals in source code are non-verbatim representations (e.g.
"it's" == 'it\'s' == 'it' "'s" == "\x69t's"), whereas they have a well
established expectation that identifiers are written verbatim.

As long as you have a way of distinguishing strings reliably from the
rest of the source code, you can know whether your confidence is well
placed.  Blake's example illustrates that ambiguity in strings is
especially dangerous because it can obscure where strings begin and end.

PEP 3120 is problematic.  At the very least, it is definitely missing
a section addressing objections (the problem of not being able to
understand an expression like "allowed" == "allowed") and a section
on security considerations (like those raised by Blake's example).

Since the default encoding is currently ASCII, most Python
programmers are unlikely to be prepared for ambiguity in strings;
thus the best thing to do would be to keep the default as ASCII and
require a visible declaration to activate such ambiguity (enable UTF-8).
Failing that, the next best thing to do would be to forbid all
confusable characters without an explicit declaration to permit them.
And the next best thing after that would be to forbid just the
characters that are confusable with the delimiters that fence off
ambiguous text (' " #) without an explicit declaration to permit them.
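
A contrived illustration of the ambiguity in question - hypothetical Py3k
code assuming PEP 3131 identifiers, where the second assignment uses
U+0430 (CYRILLIC SMALL LETTER A), so the two visually identical names are
in fact different identifiers:

allowed = False      # Latin 'a'
аllowed = True       # Cyrillic 'а' - a different identifier entirely

print(allowed)       # False: the second assignment never touched it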


-- ?!ng

From ncoghlan at gmail.com  Sun May 27 05:28:59 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 May 2007 13:28:59 +1000
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <EEBBEDF8-C674-4D15-96E2-512BCA71E2D5@gmail.com>
References: <788141.82125.qm@web33507.mail.mud.yahoo.com>
	<EEBBEDF8-C674-4D15-96E2-512BCA71E2D5@gmail.com>
Message-ID: <4658FAFB.7010204@gmail.com>

Mike Klaas wrote:
> On 25-May-07, at 6:03 AM, Steve Howell wrote:
> 
>> We're just disagreeing about whether the Dutch tax law
>> programmer has to uglify his environment with an alias
>> of Python to "python3.0 -liberal_unicode," or whether
>> the American programmer in an enterprisy environment
>> has to uglify his environment with an alias of Python
>> to "python3.0 -parochial" to mollify his security
>> auditors.
> 
> Surely if such mollification were necessary, -parochial would be  
> routinely used for (most much enterprise-y) java?  I have never seen  
> any such thing done, though my experience is perhaps not universal.

Java (and C#) are statically typed - a simple assignment statement can't 
introduce a new variable, so the issue of deceptive assignment provides 
far less opportunity for mischief.

A Java or C# equivalent of KPY's deceptive code would either fail to 
compile with an unrecognised identifier error when it encountered the 
undeclared 'allow-with-Cyrillic-a' identifier (or else it would have an 
extra, apparently redundant identifier declaration).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ryan.freckleton at gmail.com  Sun May 27 09:19:20 2007
From: ryan.freckleton at gmail.com (Ryan Freckleton)
Date: Sun, 27 May 2007 01:19:20 -0600
Subject: [Python-3000] Composable abstract base class?
Message-ID: <318072440705270019u5c66ff5u54732c429d4beca8@mail.gmail.com>

I've been following the python-dev and python 3000 lists for over a
year, but this is my first posting.

I think I've found an additional abstract base class to add to PEP 3119:
an ABC for composable data (e.g. list, tuple, set, and perhaps dict)
to inherit from. A composable object can contain instances of other
composable objects. In other words, a composable object can be used as
the outer container in a nested data structure.

The motivating example is when you want to recurse through a nested
list of strings, e.g.

>>> seq = ['my', 'hovercraft', ['is', 'full', 'of', ['eels']]]
>>> def recurse(sequence):
	if isinstance(sequence, list):
		for child in sequence:
			recurse(child)
	else:
		print sequence


>>> recurse(seq)
my
hovercraft
is
full
of
eels

You could solve this by the composite pattern, but I think that using
an ABC may be simpler.

If we had a Composable ABC that set, list and tuple inherit, the above
code could be written as:

def recurse(sequence):
	if isinstance(sequence, Composable):
		for child in sequence:
			recurse(child)
	else:
		print sequence

Which is much more general.

This could be easily introduced by a third party developer, using the
mechanisms outlined in the PEP, the question is: would it be
worthwhile to add this ABC to PEP 3119?

If it were added to PEP 3119, I believe it should be a subtype of
Container. I do not think it should inherit from Iterable, since it
is possible for container types to not support the iterator
protocol but still support composition.

Sincerely,
-- 
=====
--Ryan E. Freckleton

From python at zesty.ca  Sun May 27 10:18:49 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Sun, 27 May 2007 03:18:49 -0500 (CDT)
Subject: [Python-3000] Composable abstract base class?
In-Reply-To: <318072440705270019u5c66ff5u54732c429d4beca8@mail.gmail.com>
References: <318072440705270019u5c66ff5u54732c429d4beca8@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0705270315540.27740@server1.LFW.org>

On Sun, 27 May 2007, Ryan Freckleton wrote:
> I've been following the python-dev and python 3000 lists for over a
> year, but this is my first posting.

Hello!

> I think I've found additional abstract base class to add to PEP 3119.
> An ABC for composable data (e.g. list, tuple, set, and perhaps dict)
> to inherit from. An composable object can contain instances of other
> composable objects. In other words, a composable object can be used as
> the outer container in a nested data structure.
[...]
> def recurse(sequence):
> 	if isinstance(sequence, Composable):
> 		for child in sequence:
> 			recurse(child)
> 	else:
> 		print sequence

I think I understand your example, but I don't understand what makes
it necessary to introduce an ABC for Composable as separate from
Iterable.  What is intended to be different about Composable?  Can
you provide a usage example for Composable where Iterable would not
be sufficient?


-- ?!ng

From guido at python.org  Sun May 27 11:29:08 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 27 May 2007 02:29:08 -0700
Subject: [Python-3000] Composable abstract base class?
In-Reply-To: <Pine.LNX.4.58.0705270315540.27740@server1.LFW.org>
References: <318072440705270019u5c66ff5u54732c429d4beca8@mail.gmail.com>
	<Pine.LNX.4.58.0705270315540.27740@server1.LFW.org>
Message-ID: <ca471dc20705270229h45111dabr2f1f7a5be3f6222a@mail.gmail.com>

On 5/27/07, Ka-Ping Yee <python at zesty.ca> wrote:
> On Sun, 27 May 2007, Ryan Freckleton wrote:
> > I've been following the python-dev and python 3000 lists for over a
> > year, but this is my first posting.
>
> Hello!

Hello too!

> > I think I've found additional abstract base class to add to PEP 3119.
> > An ABC for composable data (e.g. list, tuple, set, and perhaps dict)
> > to inherit from. An composable object can contain instances of other
> > composable objects. In other words, a composable object can be used as
> > the outer container in a nested data structure.
> [...]
> > def recurse(sequence):
> >       if isinstance(sequence, Composable):
> >               for child in sequence:
> >                       recurse(child)
> >       else:
> >               print sequence
>
> I think I understand your example, but I don't understand what makes
> it necessary to introduce an ABC for Composable as separate from
> Iterable.  What is intended to be different about Composable?  Can
> you provide a usage example for Composable where Iterable would not
> be sufficient?

Ryan is repeating the classic flatten example: strings are iterables
but shouldn't be iterated over in this example. This is more the
domain of Generic Functions, PEP 3124. Anyway, the beauty of PEP 3119
is that even if PEP 3124 were somehow rejected, you could add
Composable yourself, and there is no requirement to add it (or any
other category you might want to define) to the "standard" set of
ABCs.
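
To make that concrete, a minimal sketch of a third-party Composable ABC,
assuming the abc module as specified by PEP 3119 (the Composable name and
the choice of registered types are mine, not part of the PEP):

from abc import ABCMeta

class Composable(object):
    __metaclass__ = ABCMeta

# Types are opted in explicitly; strings are simply never registered,
# so they fall through to the else branch of flatten().
Composable.register(list)
Composable.register(tuple)
Composable.register(set)

def flatten(obj):
    if isinstance(obj, Composable):
        for child in obj:
            for leaf in flatten(child):
                yield leaf
    else:
        yield obj

print list(flatten(['my', 'hovercraft', ['is', 'full', 'of', ['eels']]]))
# ['my', 'hovercraft', 'is', 'full', 'of', 'eels']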

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun May 27 11:59:45 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 27 May 2007 02:59:45 -0700
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
In-Reply-To: <003f01c79fd9$66948ec0$0201a8c0@mshome.net>
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<ca471dc20705241713n7f348c7eh563b5631e512fd93@mail.gmail.com>
	<017d01c79e98$c6b84090$0201a8c0@mshome.net>
	<ca471dc20705251014r46c86f74j822729e843cef797@mail.gmail.com>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
	<ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>
	<003f01c79fd9$66948ec0$0201a8c0@mshome.net>
Message-ID: <ca471dc20705270259ke665af6v3b5bdbffbd926330@mail.gmail.com>

On 5/26/07, Tim Delaney <timothy.c.delaney at gmail.com> wrote:
> Guido van Rossum wrote:
> > Quick, since I'm about to hop on a plane: Thinking about it again,
> > storing the super instance in the bound method object is fine, as long
> > as you only do it when the bound function needs it. Using an unbound
> > super object in an unbound method is also fine.
>
> OTOH, I've got a counter argument to storing the super object - we don't
> want to create a permanent cycle.

The bound method object isn't stored in the class -- it's created by
the "C.method" or "inst.method" getattr operation. I don't see how
this would introduce a cycle.

> If we store the class, we can store it as a weakref - then when the super
> object is created, a strong reference to the class exists.
>
> We can't store a weakref to the super instance though, as there won't be any
> other reference to it.
>
> I still quite like the idea of im_super though, but it would need to be a
> property instead of just a reference.
>
> I also agree with Jim that exposing the class object is useful e.g. for
> introspection.
>
> So I propose the following:
>
> 1. Internal weakref to class object.
>
> 2. im_type - property that returns a strong ref to the class object.
>
> I went through several names before coming up with im_type (im_staticclass,
> im_classobj, im_classobject, im_boundclass, im_bindingclass). I think
> im_type conveys exactly what we want this attribute to represent - the
> class/type that this method was defined in.
>
> im_class would have also been suitable, but has had another, different
> meaning since 2.2.

Since class and type are synonyms (as you say), having both im_class and
im_type would be a bad idea.

> 3. im_super - property that returns the unbound super object (for an unbound
> method) and bound super object (for a bound method).
>
> Tim Delaney
>
> > On 5/26/07, Tim Delaney <timothy.c.delaney at gmail.com> wrote:
> >> Guido van Rossum wrote:
> >>
> >>>>> - Why not make super a keyword, instead of just prohibiting
> >>>>> assignment to it? (I'm planning to do the same with None BTW in
> >>>>> Py3k -- I find the "it's a name but you can't assign to it" a
> >>>>> rather silly business and hardly "the simplest solution".)
> >>>>
> >>>> That's currently an open issue - I'm happy to make it a keyword -
> >>>> in which case I think the title should be changed to "super as a
> >>>> keyword" or something like that.
> >>>
> >>> As it was before. :-)
> >>>
> >>> What's the argument against?
> >>
> >> I don't see any really, especially if None is to become a true
> >> keyword. But some people have raised objections.
> >>
> >>>> The preamble will only be added to functions/methods that cause the
> >>>> 'super' cell to exist i.e. for CPython have 'super' in co.cellvars.
> >>>> Functions that just have 'super' in co.freevars wouldn't have the
> >>>> preamble.
> >>>
> >>> I think it's still too vague. For example:
> >>>
> >>> class C:
> >>>  def f(s):
> >>>    return 1
> >>> class D(C):
> >>>  pass
> >>> def f(s):
> >>>  return 2*super.f()
> >>> D.f = f
> >>> print(D().f())
> >>>
> >>> Should that work? I would be okay if it didn't, and if the super
> >>> keyword is only allowed inside a method that is lexically inside a
> >>> class. Then the second definition of f() should be a (phase 2)
> >>> SyntaxError.
> >>
> >> That would simplify things. I'll update the PEP.
> >>
> >>> Was it ever decided whether the implicitly bound class should be:
> >>>
> >>> - the class object as produced by the class statement (before
> >>> applying class decorators);
> >>> - whatever is returned by the last class decorator (if any); or
> >>> - whatever is bound to the class name at the time the method is
> >>> invoked?
> >>> I've got a hunch that #1 might be more solid; #3 seems asking for
> >>> trouble.
> >>
> >> I think #3 is definitely the wrong thing to do, but there have been
> >> arguments put forward for both #1 and #2.
> >>
> >> I think I'll put it as an open issue for now.
> >>
> >>> There's also the issue of what to do when the method itself is
> >>> decorated (the compiler can't know what the decorators mean, even
> >>> for built-in decorators like classmethod).
> >>
> >> I think that may be a different issue. If you do something like:
> >>
> >> class A:
> >>     @decorator
> >>     def func(self):
> >>         pass
> >>
> >> class B(A):
> >>     @decorator
> >>     def func(self):
> >>         super.func()
> >>
> >> then `super.func()` will call whatever `super(B, self).func()` would
> >> now, which (I think) would result in calling the decorated function.
> >>
> >> However, I think the staticmethod decorator would need to be able to
> >> modify the class instance that's held by the method. Or see my
> >> proposal below ...
> >>> We could make the class in question a fourth attribute of the
> >>> (poorly named) "bound method" object, e.g. im_class_for_super
> >>> (im_super would be confusing IMO). Since this is used both by
> >>> instance methods and by the @classmethod decorator, it's just about
> >>> perfect for this purpose. (I would almost propose to reuse im_self
> >>> for this purpose, but that's probably asking for subtle backwards
> >>> incompatibilities and not worth it.)
> >>
> >> I'm actually thinking instead that an unbound method should
> >> reference an unbound super instance for the appropriate class -
> >> which we could then call im_super.
> >>
> >> For a bound instance or class method, im_super would return the
> >> appropriate bound super instance. In practice, it would work like
> >> your autosuper recipe using __super.
> >>
> >> e.g.
> >>
> >> class A:
> >>     def func(self):
> >>         pass
> >>
> >>>>> print A.func.im_super
> >> <super: <class 'A'>, NULL>
> >>
> >>>>> print A().func.im_super
> >> <super: <class 'A'>, <A object>>
> >>
> >>> See my proposal above. It differs slightly in that the __super__
> >>> call is made only when the class is not NULL. On the expectation
> >>> that a typical function that references super uses it exactly once
> >>> per call (that would be by far the most common case I expect) this
> >>> is just fine. In my proposal the 'super' variable contains whatever
> >>> __super__(<class>, <inst>) returned, rather than <class> which you
> >>> seem to be proposing here.
> >>
> >> Think I must have been explaining poorly - if you look at the
> >> reference implementation in the PEP, you'll see that that's exactly
> >> what's held in the 'super' free variable.
> >>
> >> I think your proposal is basically what I was trying to convey -
> >> I'll look at rewording the PEP so it's less ambiguous. But I'd like
> >> your thoughts on the above proposal to keep a reference to the
> >> actual super object rather than the class.
> >>
> >> Cheers,
> >>
> >> Tim Delaney
>
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com  Sun May 27 12:18:45 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 May 2007 20:18:45 +1000
Subject: [Python-3000] Composable abstract base class?
In-Reply-To: <ca471dc20705270229h45111dabr2f1f7a5be3f6222a@mail.gmail.com>
References: <318072440705270019u5c66ff5u54732c429d4beca8@mail.gmail.com>	<Pine.LNX.4.58.0705270315540.27740@server1.LFW.org>
	<ca471dc20705270229h45111dabr2f1f7a5be3f6222a@mail.gmail.com>
Message-ID: <46595B05.8080301@gmail.com>

Guido van Rossum wrote:
> Ryan is repeating the classic flatten example: strings are iterables
> but shouldn't be iterated over in this example. This is more the
> domain of Generic Functions, PEP 3124. Anyway, the beauty of PEP 3119
> is that even if PEP 3124 were somehow rejected, you could add
> Composable yourself, and there is no requirement to add it (or any
> other category you might want to define) to the "standard" set of
> ABCs.

I think this is an interesting example to flesh out though - how would I 
express that most instances of Iterable should be iterated over when 
being Flattened, but that certain instances of Iterable (i.e. strings) 
should be ignored?

For example, it would be nice to be able to write:

   from abc import Iterable

   class Flattenable(Iterable):
       pass

   Flattenable.deregister(basestring)


Reading the PEP as it stands, I believe carving out exceptions like this 
would require either subclassing ABCMeta to change the behaviour, or 
else relying on PEP 3124 or some other generic function mechanism.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From timothy.c.delaney at gmail.com  Sun May 27 13:09:22 2007
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Sun, 27 May 2007 21:09:22 +1000
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<ca471dc20705241713n7f348c7eh563b5631e512fd93@mail.gmail.com>
	<017d01c79e98$c6b84090$0201a8c0@mshome.net>
	<ca471dc20705251014r46c86f74j822729e843cef797@mail.gmail.com>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
	<ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>
	<003f01c79fd9$66948ec0$0201a8c0@mshome.net>
	<ca471dc20705270259ke665af6v3b5bdbffbd926330@mail.gmail.com>
Message-ID: <009c01c7a04f$7e348460$0201a8c0@mshome.net>

Guido van Rossum wrote:

> The bound method object isn't stored in the class -- it's created by
> the "C.method" or "inst.method" getattr operation. I don't see how
> this would introduce a cycle.
>
>> If we store the class, we can store it as a weakref - then when the
>> super object is created, a strong reference to the class exists.

We need to create some relationship between the unbound method and the 
class. So the class has a reference to the unbound method, and the unbound 
method has a reference to the class, thus creating a cycle. Bound methods 
don't come into it - it's the unbound method that's the problem.

> Since class and type are synonyms (as you say), having both im_class and
> im_type would be a bad idea.

I'm struggling to think of another, not too complicated name that conveys 
the same information.

Tim Delaney 


From stephen at xemacs.org  Sun May 27 14:59:08 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 27 May 2007 21:59:08 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <9D017904-5A64-40EC-8A5C-23502FB1E314@fuhm.net>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<464FFD04.90602@v.loewis.de>
	<fb6fbf560705210856x4b73df43g3a2b747538cc83db@mail.gmail.com>
	<46521CD7.9030004@v.loewis.de>
	<fb6fbf560705211629x7b58bf4bg43ebcc69ba68dd6c@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<Pine.LNX.4.58.0705241552450.8399@server1.LFW.org>
	<781A2C3C-011E-4048-A72A-BE631C0C5127@fuhm.net>
	<87ps4p3zot.fsf@uwakimon.sk.tsukuba.ac.jp>
	<9D017904-5A64-40EC-8A5C-23502FB1E314@fuhm.net>
Message-ID: <87wsyu2zj7.fsf@uwakimon.sk.tsukuba.ac.jp>

James Y Knight writes:

 > If the identifier syntax is changed to include unicode, all python  
 > modules are still usable everywhere. Once you start going down the  
 > road of configurable syntax (worse: globally configurable syntax),

The syntax is not "configured", it is "audited".  Just like Unix
passwords, which can be anything in principle, but most distros audit
them (unless assigned by root).

Now, Ka-Ping Yee and Josiah Carlson clearly would like to see the
restriction in the language.  That's not where I'm going.  I see PEP
3131 as defining the language.

However, I do think that a limited amount of *optional* auditing *in
the Python compiler* would be a good idea to have, especially for
Americans who (along with everybody else) have *no* need for Unicode
identifier support now, and are not going to have a need for a long
time on average.  Better they should get a heads-up when the Klingons
arrive.

 > there will be a "second class" of python modules that won't work on  
 > some systems without extra pain.

That's right.  It's all modules that contain non-ASCII identifiers,
because by PEP 3131 they cannot be distributed with Python as part of
the standard library.

The question is how much extra pain, and will it actually hinder use?

 > It started with a simple "-U", grew into a "-U <language>", grew into

Actually, it started with plugging into the codec interface, with
"ASCII-only" and "PEP 3131" auditors available by default.

 > a 'pyidchar.txt' file with a list of character ranges, and now that  
 > pyidchar.txt file is going to have separate sections based on module  
 > name? Sorry, but are you !@# kidding me?!?

The scalability issue was raised by Guido, not the ASCII advocates.

To answer how I view this: no, I'm not kidding - at least not until the
vaporware auditing programs get field-tested, and we've actually seen a
couple of exploits of unwary sites and discovered that they're ones the
auditing programs already catch, not something unexpected.

In any case, I expect that the most commonly used version of that file
will look like

    [DEFAULT]
    000000-1FFFFF    # all of Unicode as restricted by PEP 3131

    # pyidchar.txt ends here

Anything more complicated than that is a convenient standardized
format for filters that can be shared among the seriously paranoid.


From guido at python.org  Sun May 27 14:50:47 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 27 May 2007 05:50:47 -0700
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
In-Reply-To: <009c01c7a04f$7e348460$0201a8c0@mshome.net>
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<017d01c79e98$c6b84090$0201a8c0@mshome.net>
	<ca471dc20705251014r46c86f74j822729e843cef797@mail.gmail.com>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
	<ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>
	<003f01c79fd9$66948ec0$0201a8c0@mshome.net>
	<ca471dc20705270259ke665af6v3b5bdbffbd926330@mail.gmail.com>
	<009c01c7a04f$7e348460$0201a8c0@mshome.net>
Message-ID: <ca471dc20705270550j5e199624xd4e8f6caa9dda93d@mail.gmail.com>

On 5/27/07, Tim Delaney <timothy.c.delaney at gmail.com> wrote:
> Guido van Rossum wrote:
>
> > The bound method object isn't stored in the class -- it's created by
> > the "C.method" or "inst.method" getattr operation. I don't see how
> > this would introduce a cycle.
> >
> >> If we store the class, we can store it as a weakref - then when the
> >> super object is created, a strong reference to the class exists.
>
> We need to create some relationship between the unbound method and the
> class. So the class has a reference to the unbound method, and the unbound
> method has a reference to the class, thus creating a cycle. Bound methods
> don't come into it - it's the unbound method that's the problem.

Still wrong, I think. The unbound method object *also* isn't stored in
the class. It's returned by the C.method operation. Compare C.method
(which returns an unbound method) to C.__dict__['method'] (which
returns the actual function object stored in the class).
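
For readers following along, a quick interactive illustration of that
distinction (Python 2.x semantics; the exact reprs are indicative):

>>> class C(object):
...     def method(self):
...         pass
...
>>> C.method                       # fresh unbound method object per access
<unbound method C.method>
>>> C.__dict__['method']           # the plain function stored in the class
<function method at 0x...>
>>> C.method is C.method           # a new wrapper is created each time
False
>>> C.__dict__['method'] is C.__dict__['method']
True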

> > Since class and type are synonyms (as you say), having both im_class and
> > im_type would be a bad idea.
>
> I'm struggling to think of another, not too complicated name that conveys
> the same information.

Keep trying. im_type is not acceptable. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun May 27 14:57:15 2007
From: guido at python.org (Guido van Rossum)
Date: Sun, 27 May 2007 05:57:15 -0700
Subject: [Python-3000] Composable abstract base class?
In-Reply-To: <46595B05.8080301@gmail.com>
References: <318072440705270019u5c66ff5u54732c429d4beca8@mail.gmail.com>
	<Pine.LNX.4.58.0705270315540.27740@server1.LFW.org>
	<ca471dc20705270229h45111dabr2f1f7a5be3f6222a@mail.gmail.com>
	<46595B05.8080301@gmail.com>
Message-ID: <ca471dc20705270557x502fa771qc155fbff4573d7c1@mail.gmail.com>

On 5/27/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Guido van Rossum wrote:
> > Ryan is repeating the classic flatten example: strings are iterables
> > but shouldn't be iterated over in this example. This is more the
> > domain of Generic Functions, PEP 3124. Anyway, the beauty of PEP 3119
> > is that even if PEP 3124 were somehow rejected, you could add
> > Composable yourself, and there is no requirement to add it (or any
> > other category you might want to define) to the "standard" set of
> > ABCs.
>
> I think this is an interesting example to flesh out though - how would I
> express that most instances of Iterable should be iterated over when
> being Flattened, but that certain instances of Iterable (i.e. strings)
> should be ignored?
>
> For example, it would be nice to be able to write:
>
>    from abc import Iterable
>
>    class Flattenable(Iterable):
>        pass
>
>    Flattenable.deregister(basestring)
>
>
> Reading the PEP as it stands, I believe carving out exceptions like this
> would require either subclassing ABCMeta to change the behaviour, or
> else relying on PEP 3124 or some other generic function mechanism.

You can't do it with the existing ABC class, but you could do it by
overriding __subclasscheck__ in a different way. But it's definitely
much easier to do with GFs.
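
A rough sketch of that kind of override, assuming the __subclasscheck__
and __instancecheck__ hooks from PEP 3119 (the metaclass and its
_excluded blacklist are hypothetical, purely for illustration):

from abc import ABCMeta

class _FlattenableMeta(ABCMeta):
    # Behave like an ordinary ABC, but refuse to match a fixed
    # blacklist of types, no matter what has been registered.
    _excluded = (basestring,)

    def __subclasscheck__(cls, subclass):
        if issubclass(subclass, cls._excluded):
            return False
        return super(_FlattenableMeta, cls).__subclasscheck__(subclass)

    def __instancecheck__(cls, instance):
        return cls.__subclasscheck__(type(instance))

class Flattenable(object):
    __metaclass__ = _FlattenableMeta

Flattenable.register(list)
Flattenable.register(tuple)

print isinstance([1, 2], Flattenable)    # True
print isinstance('abc', Flattenable)     # False: str is blacklisted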

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From stephen at xemacs.org  Sun May 27 16:03:59 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 27 May 2007 23:03:59 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<46527904.1000202@v.loewis.de>
	<fb6fbf560705221329y5c3cad1et48341bd9a820fb14@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>
Message-ID: <87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>

Jim Jewett writes:

 > > Cf characters?  Are we admitting "stupid bidi tricks", too?<wink>
 > 
 > If Tomer needs them.

But that's what I mean by respecting the work of the Unicode technical
committees.  They say he *doesn't* need them, no matter what he thinks.

They do make mistakes.  But they are far less likely to make mistakes
than a non-specialist native speaker.

 > Seriously, I wouldn't put Cf characters in the default accepted
 > table.  (But remember that *I* would limit that default to ASCII.)

It's not the default that matters.  It's what actually gets used that
matters.  If we start by saying "you can't have these characters" and
the users thumb their noses at us, OK, we made a mistake and we fix
it to correspond to what the users actually have shown to be BCP.

If we start by saying "you can have any characters you want", I'm
pretty sure we're making a mistake, and if so, we can't fix it any
more than we can get rid of Reply-To munging.

 > Agreed; but in my opinion, the decision to allow those characters is
 > local; the decision to rescind them would therefore also be local.

It is not a local decision, not in PEP 3131.  PEP 3131 clearly intends
to conform to UAX #31.  (I think it still needs to *explicitly* state
that it's defining a profile of UAX #31, since there are restrictions
on ASCII identifier characters in Python that are not in the basic
definitions of UAX #31.)  Your proposal would return PEP 3131 to a
blank sheet of paper, and ensure non-conformance with an important
normative Annex of Unicode.

 > I had been thinking of the unicode version as a feature that didn't
 > change within a python release.  Perhaps that is negotiable?

I think it's a bad idea to allow it to change within a release.  All I
meant was that there could be a well-known mechanism for using
different tables, either at run-time or at compile-time, so that users
could change it if they want to.

People who need Lepcha and Cham and want to have a Python that uses
unapproved code points for them will have to use a Python which is not
conformant.  Let them, of course, but I don't see why the 6 billion
potential Python users who have never heard of Lepcha, Cham, or the
"IBM corporate extension character set for Japanese" should need to
forego Unicode conformance as well.

 > > Maybe the way to handle this is to allow private-space characters in
 > > identifiers as an option.  That would be doable with your well-known
 > > file scheme.  But it's very dangerous across modules.
 > 
 > It turns out that page was out of date; Lepcha and Cham now have code
 > points which haven't been formally approved, but aren't likely to
 > change.  Officially, they're still undefined, but using private-space
 > probably isn't the right answer.  So either we allow these particular
 > "undefined" characters, or we (for now) disallow Lepcha and Cham.

The law of the excluded middle doesn't apply in that way.  It's
trivial to "cast" the unofficial code points into "private space" as a
block.  This technique was used in XEmacs/CHISE (nee XEmacs/UTF-2000)
to grandfather the old MULE codes while they filled out the Unicode
space, and to map character sets that are not Unicode conformant into
Unicode space while preserving collating order and so on.

Granted, that's a research extension not a production editor, but the
technique seems to work pretty well for the people who need such
things.  Any Python code that doesn't assume a numerical relationship
between the Lepcha block and any other block will work unchanged, and
implementing the changeover for old versions of Python that don't know
about Lepcha simply requires installing a Lepcha compatibility codec
to do the trivial mapping.  Is that cool or what?

The main problem with this technique is that on some platforms you
have to be careful about casting into the BMP, because vendors like
Microsoft and Apple have a penchant for using a lot of the BMP private
space for corporate logos and the like.  And I think Klingon is
standard on Linux (or has the Unicode consortium approved a Klingon
block since I last looked?)


From collinw at gmail.com  Mon May 28 02:41:34 2007
From: collinw at gmail.com (Collin Winter)
Date: Sun, 27 May 2007 17:41:34 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>
	<87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>

On 5/27/07, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Jim Jewett writes:
>
>  > > Cf characters?  Are we admitting "stupid bidi tricks", too?<wink>
>  >
>  > If Tomer needs them.
>
> But that's what I mean by respecting the work of the Unicode technical
> committees.  They say he *doesn't* need them, no matter what he thinks.
>
> They do make mistakes.  But they are far less likely to make mistakes
> than a non-specialist native speaker.

Sincere question: if these characters aren't needed, why are they
provided? From what I can tell by googling, they're needed when, e.g.,
Arabic is embedded in an otherwise left-to-right script. Do I have
that right? That sounds pretty close to what you'd get when using
Arabic identifiers with the English keywords/stdlib.

Collin Winter

From stephen at xemacs.org  Mon May 28 05:51:46 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 28 May 2007 12:51:46 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>
	<87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>
Message-ID: <87fy5h38rx.fsf@uwakimon.sk.tsukuba.ac.jp>

Collin Winter writes:

 > Sincere question: if these characters aren't needed, why are they
 > provided? From what I can tell by googling, they're needed when, e.g.,
 > Arabic is embedded in an otherwise left-to-right script. Do I have
 > that right? That sounds pretty close to what you'd get when using
 > Arabic identifiers with the English keywords/stdlib.

The problem is visual presentation to humans.  It's very much like
unmarshalling little-endian integers from a byte stream.  The byte
stream by definition is big-endian, so when you simply memcpy into the
stream buffer, little-endian integers will come out in reverse byte
order.  Bidi works a little bit differently; in principle it works
both ways (if you start LTR then the RTL is in reverse order in the
stream, and vice versa) since both kinds of script are character
streams.  But in both cases, *inside* the computer, there is a natural
"big-endian" order and the computer does not get confused.  That is
one sense in which format characters are YAGNIs.

Now, identifiers are by definition character streams.  If an English
speaker would pronounce the spelling of an English word "A B C", and
an Arabic speaker an Arabic word as "1 2 3", then *as an identifier*
the combination English then Arabic is spelled "A B C _ 1 2 3".  And
that's all the Python compiler needs to know.  In fact, on the editor
display this would be presented "ABC_321".  In data entry, you'd see
something like this

key     display
 A      A
 B      AB
 C      ABC
 _      ABC_
 1      ABC_1
 2      ABC_21
 3      ABC_321

This can be done algorithmically (this is the "Unicode Technical Annex
#9", aka "UAX #9", that you may have seen references to), to a very close
approximation of what human typesetters do in bidi cultures.

Now suppose you want to see on screen the contents of memory cells as
characters.  Then you would put into memory something like "A B C _
LRO 1 2 3" where LRO is a control character that says "no matter what
directional property has normally, override that with left-to-right
until I say otherwise."  That logical sequence of characters is indeed
displayed "ABC_123".

But how about those as identifiers?  Note that in memory the sequence
of printing characters is "A B C _ 1 2 3" in each case.  So it makes
sense to think of that as the identifier, *ignoring* the presentation
control characters.

Suppose we prohibit the directional control characters.  Then a
Unicode conforming editor will put the characters in logical order "A
B C _ 1 2 3" in the file, and display them naturally (to a speaker of
Arabic) as "ABC_321".  This is going to be by far the most common
case, and the user knows that it works this way.  I don't see a
problem here.  Do you?

OK, now let's consider the cases of breakage.  Consider a malicious
author who uses LRO as "A B C _ 1 2 LRO 3" which displays as "ABC_213"
(IIRC, I haven't actually tried to implement bidi in a very long
time).  Can you think of a genuine use for that?  I can't; I think
it's a bad idea to allow it.

On the other hand, you could have a situation where the printed
documentation uses the UAX #9 bidi algorithm, and discusses the
meaning of the identifier "ABC_321", while the reviewing programmer is
using a broken editor which implements overrides but not the
algorithm, and sees "ABC_123".  So in the case where LRO is permitted,
the author can enforce the visual order that the reviewer will see in
the documents on both the documents and the editor display.  But since
it's the unnatural (to an Arabic reader) "ABC_123", it will be
confusing and hard to read.  Is this a win?

As somebody (I think Jim J) pointed out, bidi is a world of pain
unless and until *all* editors and readers implement a common set of
display conventions.  Python can't do anything that will unambiguously
reduce that pain.  So IMHO it is best to conform to a standard that
can be unambiguously implemented, and is likely to be available to the
majority of programmers who need to work with bidi environments.  That
is UAX #31, which mandates ignoring these format characters (in the
default profile), and strongly recommends prohibiting them in all
profiles.

From stephen at xemacs.org  Mon May 28 06:08:21 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 28 May 2007 13:08:21 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>
	<87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>
Message-ID: <87d50l380a.fsf@uwakimon.sk.tsukuba.ac.jp>

Collin Winter writes:

 > Sincere question: if these characters aren't needed, why are they
 > provided?

I already gave a long jargony answer, but maybe this analogy is
better:

Most of the time automatic line-wrapping gives excellent results, but
sometimes you need the newline character to achieve special effects
(e.g., poetry).  Directional controls are similar: used for "special
effects" that are nonetheless an everyday part of the language.

 > From what I can tell by googling, they're needed when, e.g.,
 > Arabic is embedded in an otherwise left-to-right script.

No, they are unnecessary; there are algorithms that do a fine job --
for most purposes, but not all.  It's the exceptions where the control
characters are needed.

The Unicode technical committee does not think identifiers are
exceptional, and they are experts (including Hebrew and Arabic native
speakers, I am sure).

From alexandre at peadrop.com  Mon May 28 18:56:00 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Mon, 28 May 2007 12:56:00 -0400
Subject: [Python-3000] Lines breaking
Message-ID: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>

Hi,

Just wondering: would it be a good idea to make the string methods
split() and splitlines() break lines as specified by the Unicode
Standard (Section 5.8, Newline Guidelines)?

If you don't have a printed copy, you can read the section here:
http://www.unicode.org/versions/Unicode4.0.0/ch05.pdf

-- Alexandre

From daniel at stutzbachenterprises.com  Mon May 28 21:41:44 2007
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Mon, 28 May 2007 14:41:44 -0500
Subject: [Python-3000] BLists (PEP 3128)
Message-ID: <eae285400705281241q5193b050tad8ba5d169a26450@mail.gmail.com>

On 5/11/07, Raymond Hettinger <python at rcn.com> wrote:
> > Would it be useful if I created an experimental fork of 2.5
> > that replaces array-based lists with BLists,
> >  so that the performance penalty (if any) on existing code
> > can be measured?
>
> That would likely be an informative exercise and would assure that your
> code is truly interchangable with regular lists.  It would also highlight the
> under-the-hood difficulties you'll encounter with the C-API.
>
> That being said, it is a labor intensive exercise and the time might be better
> spent on tweaking the third-party module code and building a happy user-base.

Just to provide a quick update on my adventures with BLists:

I went forward with the exercise of replacing the array-based list
with BLists in the Python interpreter.  As a first pass, I went with a
simple, not-very-efficient redirect of the List API.  I had very few
problems getting this working well enough to compile.

The exercise also had the benefit that I have been able to test BLists
against the entire Python test suite.  Previously, I had adapted only
test_list.  test_builtin was particularly useful.  I was able to find
and fix a couple more bugs in my implementation this way.  Almost all
of the tests pass now.

There are also a handful of test failures where the tests assert
CPython implementation details when the intent is really just to
assert "doesn't crash".  These tests relate to what happens when
references to evil comparison/lookup functions are deleted, or when a
list changes size during iteration.  I'll probably change the BList
code to match CPython's behavior.

I have one genuine bug to fix that inexplicably causes test_shlex to
fail.  Once that is taken care of, I'll get back to looking at
performance.

However, I'm leaving tomorrow for 3 weeks (wedding + honeymoon), so
I'm not going to be able to make any further progress until I get
back.  :-)

-- 
Daniel Stutzbach, Ph.D.             President, Stutzbach Enterprises LLC

From guido at python.org  Tue May 29 00:44:31 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 May 2007 06:44:31 +0800
Subject: [Python-3000] Lines breaking
In-Reply-To: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
Message-ID: <ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>

Can you or someone supply a patch? Put it in the SourceForge patch
manager and post here.

OTOH I don't believe that's how 2.x implements these methods, and
AFAIK nobody's complained. Is it necessary to change? At the very
least I'd be opposed if it changed the behavior of splitting
ASCII-only text.

--Guido

On 5/29/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> Hi,
>
> Just wondering: would it be a good idea to make the string methods
> split() and splitlines() break lines as specified by the Unicode
> Standard (Section 5.8 Newline Guidelines)?
>
> If you don't have a printed copy, you can read the section here:
> http://www.unicode.org/versions/Unicode4.0.0/ch05.pdf
>
> -- Alexandre


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From alexandre at peadrop.com  Tue May 29 01:49:33 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Mon, 28 May 2007 19:49:33 -0400
Subject: [Python-3000] Lines breaking
In-Reply-To: <ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
Message-ID: <acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>

On 5/28/07, Guido van Rossum <guido at python.org> wrote:
> Can you or someone supply a patch? Put it in the SourceForge patch
> manager and post here.

I can't promise anything, since I am quite busy with my SoC project, but I
could try to supply a patch if you and the other developers are in
favor of the change. A few other methods would need to be changed too
to conform fully to the standard -- I am thinking especially of the
file methods readline/readlines. So, the change should probably be
documented in a PEP.

> OTOH I don't believe that's how 2.x implements these methods, and
> AFAIK nobody's complained. Is it necessary to change? At the very
> least I'd be opposed if it changed the behavior of splitting
> ASCII-only text.

The change would extend the line breaking behavior to three other
ASCII characters:
  NEL "Next Line" 85
  VT "Vertical Tab" 0B
  FF "Form Feed" 0C
Of course, it is not really necessary to change, but I think full
conformance to the standard [1] could give Python better support for
multilingual texts. However, full conformance would require a good
amount of work, so it is probably better to postpone it until someone
complains.
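
To make the difference concrete, a small sketch of splitting on the
characters the Unicode newline guidelines mention (an illustration only,
not a proposed implementation of splitlines):

import re

# LF, VT, FF, CR (and CRLF), NEL, LINE SEPARATOR, PARAGRAPH SEPARATOR
_BREAKS = re.compile(u'\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]')

def unicode_splitlines(text):
    return _BREAKS.split(text)

print unicode_splitlines(u'one\ntwo\x85three\x0cfour')
# [u'one', u'two', u'three', u'four']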

-- Alexandre

[1] http://www.unicode.org/reports/tr14/tr14-19.html

From greg.ewing at canterbury.ac.nz  Tue May 29 03:03:48 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 29 May 2007 13:03:48 +1200
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87fy5h38rx.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>
	<87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>
	<87fy5h38rx.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <465B7BF4.1000400@canterbury.ac.nz>

Stephen J. Turnbull wrote:
> If an English
> speaker would pronounce the spelling of an English word "A B C", and
> an Arabic speaker an Arabic word as "1 2 3", then *as an identifier*
> the combination English then Arabic is spelled "A B C _ 1 2 3".

But would an Arabic speaker pronounce the identifier as a whole
as "A B C 1 2 3" or "1 2 3 A B C"? That's where I find it all
gets very confusing.

--
Greg

From greg.ewing at canterbury.ac.nz  Tue May 29 03:26:37 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 29 May 2007 13:26:37 +1200
Subject: [Python-3000] Lines breaking
In-Reply-To: <acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
Message-ID: <465B814D.2060101@canterbury.ac.nz>

Alexandre Vassalotti wrote:

> The change would extend the line breaking behavior to three other
> ASCII characters:
>   NEL "Next Line" 85

That's not an ASCII character.

>   VT "Vertical Tab" 0B
>   FF "Form Feed" 0C

-1 on making these line-breaking characters by default.
I like my ASCII text file lines broken by newline chars
and nothing else.

--
Greg

From guido at python.org  Tue May 29 04:37:47 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 May 2007 10:37:47 +0800
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
In-Reply-To: <ca471dc20705270550j5e199624xd4e8f6caa9dda93d@mail.gmail.com>
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<ca471dc20705251014r46c86f74j822729e843cef797@mail.gmail.com>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
	<ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>
	<003f01c79fd9$66948ec0$0201a8c0@mshome.net>
	<ca471dc20705270259ke665af6v3b5bdbffbd926330@mail.gmail.com>
	<009c01c7a04f$7e348460$0201a8c0@mshome.net>
	<ca471dc20705270550j5e199624xd4e8f6caa9dda93d@mail.gmail.com>
Message-ID: <ca471dc20705281937y48300821u840add9d5454e8d9@mail.gmail.com>

Hi Tim,

I've gone ahead and cooked up a tiny demo patch that uses im_class to
store what you called im_type. Because I don't have the parser changes
ready yet, this requires you to declare a keyword-only arg named
'super'; this triggers special code that sets it to super(im_class,
im_self).

http://python.org/sf/1727209
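
In code, the described usage would look roughly like this (a sketch
only -- the class names are made up and this is not the final PEP 367
syntax):

    class A:
        def save(self):
            print('A.save')

    class B(A):
        def save(self, *, super):
            # With the demo patch, declaring a keyword-only argument
            # named 'super' is the trigger; it arrives pre-bound to
            # super(im_class, im_self).
            super.save()
            print('B.save')

    B().save()      # prints A.save, then B.save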

I haven't tried to discover yet how much breaks due to the change of
semantics for im_class.

--Guido

On 5/27/07, Guido van Rossum <guido at python.org> wrote:
> On 5/27/07, Tim Delaney <timothy.c.delaney at gmail.com> wrote:
> > Guido van Rossum wrote:
> >
> > > The bound method object isn't stored in the class -- it's created by
> > > the "C.method" or "inst.method" getattr operation. I don't see how
> > > this would introduce a cycle.
> > >
> > >> If we store the class, we can store it as a weakref - then when the
> > >> super object is created, a strong reference to the class exists.
> >
> > We need to create some relationship between the unbound method and the
> > class. So the class has a reference to the unbound method, and the unbound
> > method has a reference to the class, thus creating a cycle. Bound methods
> > don't come into it - it's the unbound method that's the problem.
>
> Still wrong, I think. The unbound method object *also* isn't stored in
> the class. It's returned by the C.method operation. Compare C.method
> (which returns an unbound method) to C.__dict__['method'] (which
> returns the actual function object stored in the class).
>
> > > Since class and type are synonyms (as you say) having both im_class and
> > > im_type would be a bad idea.
> >
> > I'm struggling to think of another, not too complicated name that conveys
> > the same information.
>
> Keep trying. im_type is not acceptable. :-)
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From turnbull at sk.tsukuba.ac.jp  Tue May 29 05:57:23 2007
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Tue, 29 May 2007 12:57:23 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <465B7BF4.1000400@canterbury.ac.nz>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>
	<87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>
	<87fy5h38rx.fsf@uwakimon.sk.tsukuba.ac.jp>
	<465B7BF4.1000400@canterbury.ac.nz>
Message-ID: <878xb82sf0.fsf@uwakimon.sk.tsukuba.ac.jp>

Greg Ewing writes:

 > Stephen J. Turnbull wrote:

 > > If an English speaker would pronounce the spelling of an English
 > > word "A B C", and an Arabic speaker an Arabic word as "1 2 3",
 > > then *as an identifier* the combination English then Arabic is
 > > spelled "A B C _ 1 2 3".

 > But would an Arabic speaker pronounce the identifier as a whole
 > as "A B C 1 2 3" or "1 2 3 A B C"? That's where I find it all
 > gets very confusing.

Then "unask the question."<wink>  Bidi is *not* context-free; that
question is not properly formulated.  Pragmatically, in a well-formed
Python program in the overwhelming majority of cases you or she will
be in a LTR context, so it will be read "A B C _ 1 2 3".

The ambiguity you have in mind probably is best expressed "what
happens with a single line program?"  Eg, one that appears on the
display like this:

                               ABC_321

True, a native Arabic speaker would surely (absent any context except
her early upbringing) read that "1 2 3 _ A B C".  And I admit, I'd
read it "A B C _ 1 2 3".  That looks like an ambiguity requiring use
of a direction indicator, but it's not.  According to PEP 263 all
Python programs implicitly start in ASCII (otherwise the optional
coding cookie cannot be parsed, and presumably not the optional
shebang, either).

So since the Python programmer (whether natively English-speaking or
Arabic-speaking) starts in state "LTR", she reads the "A" first, not
the "1", and there are no problems.  Of course, you want to be able to
express the identifier that would be spelled out (and represented in
memory!) as "1 2 3 _ A B C", and you can:

                               321_ABC

Since I'm not an Arabic-speaker at all, I can only say I suspect that
Arabic speakers will learn to do this context initialization very
quickly, and to read comments marked at the *end* of the line, rather
than the beginning.  Ie, to an Arabic speaker an Arabic header comment
will feel like this:

                                     This is the Foomatic program. #
                                     It makes passes at compilers. #
                        It is licentiously speaking a GPL program. #

A smart editor should be able to format that:

This is the Foomatic program.                                      #
It makes passes at compilers.                                      #
It is licentiously speaking a GPL program.                         #

It feels weird, but it's not that bad, to me anyway.

Once again, speakers of bidi languages are in a world of pain anyway;
it's reasonable to suppose that this doesn't really make things worse.
There are ambiguities here, and while a naive native speaker might
resolve them differently in ad hoc cases from the above, I doubt
they'd be lucky enough to come up with a consistent interpretation.
On the other hand, humans are *designed* to learn the arbitrary rules
of languages, as children, at least.  Adults who are fortunate enough
to retain enough of that ability to learn to program probably will
have little trouble with this particular arbitrary rule.  At least,
that's my guess, indirectly supported by the Unicode rules for
identifiers, which suggest that it is reasonable to prohibit direction
indicators in identifiers.

From python at zesty.ca  Tue May 29 05:57:45 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Mon, 28 May 2007 22:57:45 -0500 (CDT)
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <87fy5h38rx.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>
	<87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>
	<87fy5h38rx.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <Pine.LNX.4.58.0705282136380.27740@server1.LFW.org>

On Mon, 28 May 2007, Stephen J. Turnbull wrote:
> Now, identifiers are by definition character streams.  If an English
> speaker would pronounce the spelling of an English word "A B C", and
> an Arabic speaker an Arabic word as "1 2 3", then *as an identifier*
> the combination English then Arabic is spelled "A B C _ 1 2 3".  And
> that's all the Python compiler needs to know.  In fact, on the editor
> display this would be presented "ABC_321".

This draft on internationalized URIs:

    http://www.w3.org/International/iri-edit/draft-duerst-iri.html#anchor5

points out some examples of extremely confusing display orders that
can be caused by digits (which require a left-to-right ordering),
slashes (which can cause digits to be interpreted as fractions), and
other operators near digits.  These strike me as rather awful results
of the bidi algorithm.

Would the display of source code be affected this way as well?


-- ?!ng

From stephen at xemacs.org  Tue May 29 07:30:55 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 29 May 2007 14:30:55 +0900
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <Pine.LNX.4.58.0705282136380.27740@server1.LFW.org>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>
	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>
	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>
	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>
	<87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>
	<87fy5h38rx.fsf@uwakimon.sk.tsukuba.ac.jp>
	<Pine.LNX.4.58.0705282136380.27740@server1.LFW.org>
Message-ID: <877iqs2o34.fsf@uwakimon.sk.tsukuba.ac.jp>

Ka-Ping Yee writes:

 > Would the display of source code be affected this way as well?

Of course!  That's what PEP 3131 proponents *want*.  From the draft
you cite: "certain phenomena in this relationship may look strange to
somebody not familiar with bidirectional behavior, but familiar to
users of Arabic and Hebrew."  Ie, we proponents want to allow programs
that look familiar to native speakers of various languages, but do not
look familiar to monolingual speakers of American English.


From martin at v.loewis.de  Tue May 29 07:24:14 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 29 May 2007 07:24:14 +0200
Subject: [Python-3000] Lines breaking
In-Reply-To: <acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
Message-ID: <465BB8FE.4030604@v.loewis.de>

> The change would extend the line breaking behavior to three other
> ASCII characters:
>   NEL "Next Line" 85
>   VT "Vertical Tab" 0B
>   FF "Form Feed" 0C

Of these, NEL is not an ASCII character, so Guido's "no change
for ASCII-only text" requirement doesn't apply to text containing
NEL.

> Of course, it is not really necessary to change, but I think full
> conformance to the standard [1] could give Python better support for
> multilingual texts. However, full conformance would require a good
> amount of work. So, it is probably better to postpone it until someone
> complains.

Can you please point to the chapter and verse where it says that VT
must be considered? I only found mention of FF, in R4.

Regards,
Martin


From martin at v.loewis.de  Tue May 29 07:26:44 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 29 May 2007 07:26:44 +0200
Subject: [Python-3000] Lines breaking
In-Reply-To: <465B814D.2060101@canterbury.ac.nz>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz>
Message-ID: <465BB994.9050309@v.loewis.de>

>>   VT "Vertical Tab" 0B
>>   FF "Form Feed" 0C
> 
> -1 on making these line-breaking characters by default.
> I like my ASCII text file lines broken by newline chars
> and nothing else.

The question, of course, is what a newline char is; this
whole mess originates from disagreement about this issue.

For example, .splitlines considers carriage-return (CR)
characters as well, and you don't seem to complain about
that.

Regards,
Martin

From guido at python.org  Tue May 29 08:03:59 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 May 2007 14:03:59 +0800
Subject: [Python-3000] Lines breaking
In-Reply-To: <465BB994.9050309@v.loewis.de>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
Message-ID: <ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>

On 5/29/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> >>   VT "Vertical Tab" 0B
> >>   FF "Form Feed" 0C
> >
> > -1 on making these line-breaking characters by default.
> > I like my ASCII text file lines broken by newline chars
> > and nothing else.
>
> The question, of course, is what a newline char is; this
> whole mess originates from disagreement about this issue.
>
> For example, .splitlines considers carriage-return (CR)
> characters as well, and you don't seem to complain about
> that.

Well, I would have complained about that too, except I was too busy
when splitlines() was snuck into the language behind my back. :-) I
should add that it has never caused me grief even though it is in
flagrant disagreement with Python's general concept of line endings.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Tue May 29 08:12:31 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 28 May 2007 23:12:31 -0700
Subject: [Python-3000] Lines breaking
In-Reply-To: <ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>
Message-ID: <ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>

On 5/28/07, Guido van Rossum <guido at python.org> wrote:
>
> Well, I would have complained about that too, except I was too busy
> when splitlines() was snuck into the language behind my back. :-) I

Heh, just today I was wondering if we should kill splitlines:

$ grep splitlines `find Lib -name '*.py'` | egrep -v
'(difflib|/test/|UserString)' | wc
     24     111    1653
$ egrep 'split[^l]' `find Lib -name '*.py'` | egrep -v
'(difflib|/test/|UserString)' | wc
    916    4943   63104

splitlines() is pretty lightly used.  split() has many uses (not surprising).

n

From g.brandl at gmx.net  Tue May 29 08:28:25 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 29 May 2007 08:28:25 +0200
Subject: [Python-3000] Lines breaking
In-Reply-To: <ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>	<465B814D.2060101@canterbury.ac.nz>
	<465BB994.9050309@v.loewis.de>	<ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>
	<ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>
Message-ID: <f3gh61$t2e$2@sea.gmane.org>

Neal Norwitz schrieb:
> On 5/28/07, Guido van Rossum <guido at python.org> wrote:
>>
>> Well, I would have complained about that too, except I was too busy
>> when splitlines() was snuck into the language behind my back. :-) I
> 
> Heh, just today I was wondering if we should kill splitlines:

And perhaps add tuple parameters to .split()?

x.split(("\r", "\n"))

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From martin at v.loewis.de  Tue May 29 08:59:48 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 29 May 2007 08:59:48 +0200
Subject: [Python-3000] Lines breaking
In-Reply-To: <ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>	
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>	
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>	
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>	
	<ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>
	<ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>
Message-ID: <465BCF64.5010400@v.loewis.de>

> Heh, just today I was wondering if we should kill splitlines:
> 
> $ grep splitlines `find Lib -name '*.py'` | egrep -v
> '(difflib|/test/|UserString)' | wc
>     24     111    1653
> $ egrep 'split[^l]' `find Lib -name '*.py'` | egrep -v
> '(difflib|/test/|UserString)' | wc
>    916    4943   63104
> 
> splitlines() is pretty lightly used.  split() has many uses (not
> surprising).

However, I think that splitlines should work consistently with
readlines (for some definition of "consistent").

Regards,
Martin

From stephen at xemacs.org  Tue May 29 10:17:20 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 29 May 2007 17:17:20 +0900
Subject: [Python-3000] Lines breaking
In-Reply-To: <465BB8FE.4030604@v.loewis.de>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465BB8FE.4030604@v.loewis.de>
Message-ID: <874plw2gdr.fsf@uwakimon.sk.tsukuba.ac.jp>

"Martin v. Löwis" writes:

 > Alexandre Vassalotti writes:

 > > The change would extend the line breaking behavior to three other
 > > ASCII characters:
 > >   NEL "Next Line" 85
 > >   VT "Vertical Tab" 0B
 > >   FF "Form Feed" 0C
 > > Of course, it is not really necessary to change, but I think full
 > > conformance to the standard [1] could give Python better support for
 > > multilingual texts. However, full conformance would require a good
 > > amount of work.

I don't understand why full conformance would require much work, not
for the language.  Unicode does not propose to place requirements on
the syntax of Python *including the repertoire of characters allowed*,
only that where a character does occur, it must have the semantics
defined in UAX#14.  (Of course text processing modules in the stdlib
will have some work to do!)

I see no reason in UAX#14 that the Python grammar cannot ignore or
prohibit VT and NEL (see below), prohibit use of LINE SEPARATOR and
PARAGRAPH SEPARATOR, and restrict FORM FEED to occur immediately after
a line break.  (All outside of strings, of course, where there would
be no restriction.  Restrictions *must* apply to comment content,
however.)  Note that given Python's semantics for lines, the algorithm
in Unicode (v4.1, Section 5.8, R1) for remapping to unambiguous use of
LS and PS is well-defined and will leave zero residual ambiguity in a
legal Python program (and no instances of PS).

With the provisions above, you'll get the same display of a legal
Python program as ever when you switch to a UAX#14-conforming text
editor, except that it may provide a more friendly display for strings
containing very long lines.  People who wish to edit Python programs
in Microsoft Word should preprocess with the R1 algorithm.<wink>

 > Can you please point to the chapter and verse where it says that VT
 > must be considered? I only found mention of FF, in R4.

In UAX#14, revision 19, in the descriptions of classes it says:

------------------------------------------------------------------------
  BK: Mandatory Break (A) (Non-tailorable)

  Explicit breaks act independently of the surrounding characters. No
  characters can be added to the BK class as part of tailoring, but
  implementations are not required to support the VT character.

  000C      FORM FEED (FF)
  000B      LINE TABULATION (VT)

  FORM FEED separates pages. The text on the new page starts at the
  beginning of the line. No paragraph formatting is applied.

  2028      LINE SEPARATOR (LS)

  The text after the Line Separator starts at the beginning of the
  line. No paragraph formatting is applied. This is similar to HTML
  <BR>.

  2029      PARAGRAPH SEPARATOR (PS)

  The text of the new paragraph starts at the beginning of the
  line. Paragraph formatting is applied.

  Newline Function (NLF)

  Newline Functions are defined in the Unicode Standard as providing
  additional explicit breaks. They are not individual characters, but
  are encoded as sequences of the control characters NEL, LF, and CR.
------------------------------------------------------------------------

In the descriptions of the singleton classes LF, CR, and NL
(containing NEL), it is indicated that supporting LF and CR is
mandatory, the rules are the ones used by Python's universal newline
feature AFAICT.  And NL need not be supported:

------------------------------------------------------------------------
  NL: Next Line (A) (Non-tailorable) 

  0085      NEXT LINE (NEL)

  The NL class acts like BK in all respects (there is a mandatory break
  after any NEL character). It cannot be tailored, but implementations
  are not required to support the NEL character; see the discussion
  under BK.
------------------------------------------------------------------------

From guido at python.org  Tue May 29 10:20:08 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 May 2007 16:20:08 +0800
Subject: [Python-3000] Lines breaking
In-Reply-To: <f3gh61$t2e$2@sea.gmane.org>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>
	<ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>
	<f3gh61$t2e$2@sea.gmane.org>
Message-ID: <ca471dc20705290120k1041597sa8c59cee787acf55@mail.gmail.com>

What would that do?

On 5/29/07, Georg Brandl <g.brandl at gmx.net> wrote:
> Neal Norwitz schrieb:
> > On 5/28/07, Guido van Rossum <guido at python.org> wrote:
> >>
> >> Well, I would have complained about that too, except I was too busy
> >> when splitlines() was snuck into the language behind my back. :-) I
> >
> > Heh, just today I was wondering if we should kill splitlines:
>
> And perhaps add tuple parameters to .split()?
>
> x.split(("\r", "\n"))
>
> Georg
>
> --
> Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
> Four shall be the number of spaces thou shalt indent, and the number of thy
> indenting shall be four. Eight shalt thou not indent, nor either indent thou
> two, excepting that thou then proceed to four. Tabs are right out.
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From krstic at solarsail.hcs.harvard.edu  Tue May 29 10:26:15 2007
From: krstic at solarsail.hcs.harvard.edu (=?UTF-8?B?SXZhbiBLcnN0acSH?=)
Date: Tue, 29 May 2007 04:26:15 -0400
Subject: [Python-3000] Lines breaking
In-Reply-To: <ca471dc20705290120k1041597sa8c59cee787acf55@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>	<465B814D.2060101@canterbury.ac.nz>
	<465BB994.9050309@v.loewis.de>	<ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>	<ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>	<f3gh61$t2e$2@sea.gmane.org>
	<ca471dc20705290120k1041597sa8c59cee787acf55@mail.gmail.com>
Message-ID: <465BE3A7.3050507@solarsail.hcs.harvard.edu>

Guido van Rossum wrote:
> What would that do?

It would split on all separators in the tuple, so

    x.split(("\r", "\n"))

would do the same thing that x.splitlines() does now.
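
A rough way to play with the idea today, using re.split as a stand-in
for the proposed tuple form (split_on_any is just a made-up name):

    import re

    def split_on_any(text, seps):
        # Try longer separators first, so that '\r\n' wins over
        # '\r' followed by '\n'.
        pattern = '|'.join(re.escape(s)
                           for s in sorted(seps, key=len, reverse=True))
        return re.split(pattern, text)

    print(split_on_any('one\rtwo\nthree', ('\r', '\n')))
    # ['one', 'two', 'three']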

-- 
Ivan Krstić <krstic at solarsail.hcs.harvard.edu> | GPG: 0x147C722D

From python at zesty.ca  Tue May 29 10:36:18 2007
From: python at zesty.ca (Ka-Ping Yee)
Date: Tue, 29 May 2007 03:36:18 -0500 (CDT)
Subject: [Python-3000] Lines breaking
In-Reply-To: <465BE3A7.3050507@solarsail.hcs.harvard.edu>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>
	<ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>
	<f3gh61$t2e$2@sea.gmane.org>
	<ca471dc20705290120k1041597sa8c59cee787acf55@mail.gmail.com>
	<465BE3A7.3050507@solarsail.hcs.harvard.edu>
Message-ID: <Pine.LNX.4.58.0705290333280.27140@server1.LFW.org>

On Tue, 29 May 2007, Ivan Krstić wrote:
> Guido van Rossum wrote:
> > What would that do?
>
> It would split on all separators in the tuple, so
>
>     x.split(("\r", "\n"))
>
> would do the same thing that x.splitlines() does now.

Hmm... would it?  Or should two split points with nothing between
them produce empty strings, i.e. you would have to do

    x.split(('\r\n', '\r', '\n'))

to get the behaviour of x.splitlines()?
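
For what it's worth, a quick check with re.split standing in for the
proposed tuple form suggests exactly that:

    import re

    text = 'one\r\ntwo\n'
    print(re.split('\r|\n', text))        # ['one', '', 'two', '']
    print(re.split('\r\n|\r|\n', text))   # ['one', 'two', '']
    print(text.splitlines())              # ['one', 'two']

Note that splitlines() also drops the final empty piece that a plain
split leaves behind for a trailing newline.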


-- ?!ng

From krstic at solarsail.hcs.harvard.edu  Tue May 29 10:38:55 2007
From: krstic at solarsail.hcs.harvard.edu (=?UTF-8?B?SXZhbiBLcnN0acSH?=)
Date: Tue, 29 May 2007 04:38:55 -0400
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <877iqs2o34.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <19dd68ba0705120817k61788659n83da8d2c09dba0e1@mail.gmail.com>	<87sl9o5dvi.fsf@uwakimon.sk.tsukuba.ac.jp>	<fb6fbf560705230926v4aa719a4x15c4a7047f48388d@mail.gmail.com>	<87646i5td6.fsf@uwakimon.sk.tsukuba.ac.jp>	<fb6fbf560705241114i55ae23afg5b8822abe0f99560@mail.gmail.com>	<87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>	<fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>	<87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>	<43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>	<87fy5h38rx.fsf@uwakimon.sk.tsukuba.ac.jp>	<Pine.LNX.4.58.0705282136380.27740@server1.LFW.org>
	<877iqs2o34.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <465BE69F.1050804@solarsail.hcs.harvard.edu>

Stephen J. Turnbull wrote:
> Ie, we proponents want to allow programs
> that look familiar to native speakers of various languages, but do not
> look familiar to monolingual speakers of American English.

That characterization is overly narrow. I speak and write at least three
languages including English non-natively, and unexpected bidi behavior
still looks unfamiliar and confusing to me.

I haven't had time to participate in this discussion though I've been
following it; FWIW, I'm a loud -1 on Unicode identifiers by default for
just about the exact reasons that Ping enumerated.

-- 
Ivan Krstić <krstic at solarsail.hcs.harvard.edu> | GPG: 0x147C722D

From krstic at solarsail.hcs.harvard.edu  Tue May 29 11:47:04 2007
From: krstic at solarsail.hcs.harvard.edu (=?UTF-8?B?SXZhbiBLcnN0acSH?=)
Date: Tue, 29 May 2007 05:47:04 -0400
Subject: [Python-3000] Lines breaking
In-Reply-To: <Pine.LNX.4.58.0705290333280.27140@server1.LFW.org>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>
	<ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>
	<f3gh61$t2e$2@sea.gmane.org>
	<ca471dc20705290120k1041597sa8c59cee787acf55@mail.gmail.com>
	<465BE3A7.3050507@solarsail.hcs.harvard.edu>
	<Pine.LNX.4.58.0705290333280.27140@server1.LFW.org>
Message-ID: <465BF698.6050703@solarsail.hcs.harvard.edu>

Ka-Ping Yee wrote:
> Hmm... would it?  Or should two split points with nothing between
> them produce empty strings, i.e. you would have to do
>     x.split(('\r\n', '\r', '\n'))
> to get the behaviour of x.splitlines()?

Right, Georg's example would be unintuitive given the current behavior
of str.split, which will happily produce empty strings when it hits
separators in sequence.

Perl bypasses the issue by having split
(http://perldoc.perl.org/functions/split.html) take a regex; I've only
rarely used this for complex matches, though. I tried a Google code
search for

    lang:perl split\(?\s?\/\[        (simple multiple separators)
    lang:python \.splitlines\s?\(
    lang:python \.split\s?\(

but the number of results seems to oscillate between 300 and 100000, so
that didn't help much.

-- 
Ivan Krstić <krstic at solarsail.hcs.harvard.edu> | GPG: 0x147C722D

From g.brandl at gmx.net  Tue May 29 12:51:59 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 29 May 2007 12:51:59 +0200
Subject: [Python-3000] Lines breaking
In-Reply-To: <Pine.LNX.4.58.0705290333280.27140@server1.LFW.org>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>	<465B814D.2060101@canterbury.ac.nz>
	<465BB994.9050309@v.loewis.de>	<ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>	<ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>	<f3gh61$t2e$2@sea.gmane.org>	<ca471dc20705290120k1041597sa8c59cee787acf55@mail.gmail.com>	<465BE3A7.3050507@solarsail.hcs.harvard.edu>
	<Pine.LNX.4.58.0705290333280.27140@server1.LFW.org>
Message-ID: <f3h0k7$hdl$1@sea.gmane.org>

Ka-Ping Yee schrieb:
> On Tue, 29 May 2007, Ivan Krstić wrote:
>> Guido van Rossum wrote:
>> > What would that do?
 >>
>> It would split on all separators in the tuple, so

Exactly, just like .startswith() with a tuple tries all of the elements.

>>     x.split(("\r", "\n"))
>>
>> would do the same thing that x.splitlines() does now.
> 
> Hmm... would it?  Or should two split points with nothing between
> them produce empty strings, i.e. you would have to do
> 
>     x.split(('\r\n', '\r', '\n'))
> 
> to get the behaviour of x.splitlines()?

Yes, that would be the correct analogue. Sorry, I should have made that
clear.

Georg



-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From alexandre at peadrop.com  Tue May 29 17:56:27 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 29 May 2007 11:56:27 -0400
Subject: [Python-3000] Lines breaking
In-Reply-To: <465BB8FE.4030604@v.loewis.de>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465BB8FE.4030604@v.loewis.de>
Message-ID: <acd65fa20705290856u654da609ia814dd040942612b@mail.gmail.com>

On 5/29/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > The change would extend the line breaking behavior to three other
> > ASCII characters:
> >   NEL "Next Line" 85
> >   VT "Vertical Tab" 0B
> >   FF "Form Feed" 0C
>
> Of these, NEL is not an ASCII character, so Guido's "no change
> for ASCII-only text" requirement doesn't apply to text containing
> NEL.

Right. It is defined in the ISO control function standard (ISO 6429).
I have been duped by the format of table 5-1 in the Unicode standard.

> > Of course, it is not really necessary to change, but I think full
> > conformance to the standard [1] could give Python better support for
> > multilingual texts. However, full conformance would require a good
> > amount of work. So, it is probably better to postpone it until someone
> > complains.
>
> Can you please point to the chapter and verse where it says that VT
> must be considered? I only found mention of FF, in R4.
>

Right again. (It is not my day today...) I should have read more
thoroughly, instead of relying on the table.

Here are the two sections for readline and writeline:

  R4 A readline function should stop at NLF, LS, FF, or PS. In the
     typical implementation, it does not include the NLF, LS, PS, or
     FF that caused it to stop.

  R4a A writeline (or newline) function should convert NLF, LS, and PS
      according to the conventions just discussed in "Converting to
      Other Character Code Sets."
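
As a rough illustration only (the helper names and the exact break set
are mine, not anything prescribed by the standard), R4 and R4a could be
approximated in pure Python along these lines:

    import re

    # NLF covers CR, LF, CR+LF and NEL; R4 adds FF, LS and PS.
    _MANDATORY_BREAK = re.compile(u'\r\n|[\r\n\x85\x0c\u2028\u2029]')

    def read_lines(text):
        # R4-ish: stop at any mandatory break and drop the break itself.
        # (A trailing break leaves a trailing empty string here, which a
        # real readline()-style API would not return.)
        return _MANDATORY_BREAK.split(text)

    def write_lines(lines, linesep=u'\n'):
        # Loosely R4a-ish: re-terminate each line with one chosen
        # convention when writing back out.
        return u''.join(line + linesep for line in lines)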

-- Alexandre

From aahz at pythoncraft.com  Tue May 29 19:08:44 2007
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 29 May 2007 10:08:44 -0700
Subject: [Python-3000] Lines breaking
In-Reply-To: <465BF698.6050703@solarsail.hcs.harvard.edu>
References: <acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>
	<ee2a432c0705282312t222f3607wf21b6ff76b76b73c@mail.gmail.com>
	<f3gh61$t2e$2@sea.gmane.org>
	<ca471dc20705290120k1041597sa8c59cee787acf55@mail.gmail.com>
	<465BE3A7.3050507@solarsail.hcs.harvard.edu>
	<Pine.LNX.4.58.0705290333280.27140@server1.LFW.org>
	<465BF698.6050703@solarsail.hcs.harvard.edu>
Message-ID: <20070529170843.GA7598@panix.com>

On Tue, May 29, 2007, Ivan Krstić wrote:
>
> Perl bypasses the issue by having split
> (http://perldoc.perl.org/functions/split.html) take a regex; I've only
> rarely used this for complex matches, though. 

Then perhaps we should just point people at re.split()...
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Look, it's your affair if you want to play with five people, but don't
go calling it doubles."  --John Cleese anticipates Usenet

From aahz at pythoncraft.com  Tue May 29 19:28:36 2007
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 29 May 2007 10:28:36 -0700
Subject: [Python-3000] Support for PEP 3131
In-Reply-To: <465BE69F.1050804@solarsail.hcs.harvard.edu>
References: <87r6p540n4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705251047w3a27bf43nc461c728e051dc09@mail.gmail.com>
	<87646g3u9q.fsf@uwakimon.sk.tsukuba.ac.jp>
	<fb6fbf560705260939x64cd9642qd025c9a01ef7604e@mail.gmail.com>
	<87veee2wj4.fsf@uwakimon.sk.tsukuba.ac.jp>
	<43aa6ff70705271741w2b3eefcbj29921e81822d189@mail.gmail.com>
	<87fy5h38rx.fsf@uwakimon.sk.tsukuba.ac.jp>
	<Pine.LNX.4.58.0705282136380.27740@server1.LFW.org>
	<877iqs2o34.fsf@uwakimon.sk.tsukuba.ac.jp>
	<465BE69F.1050804@solarsail.hcs.harvard.edu>
Message-ID: <20070529172836.GB7598@panix.com>

On Tue, May 29, 2007, Ivan Krstić wrote:
>
> I haven't had time to participate in this discussion though I've been
> following it; FWIW, I'm a loud -1 on Unicode identifiers by default for
> just about the exact reasons that Ping enumerated.

Considering that OLPC is given as an argument in favor of Unicode
identifiers, I think Ivan's vote should be given extra weight.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Look, it's your affair if you want to play with five people, but don't
go calling it doubles."  --John Cleese anticipates Usenet

From alexandre at peadrop.com  Tue May 29 19:29:52 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 29 May 2007 13:29:52 -0400
Subject: [Python-3000] Lines breaking
In-Reply-To: <acd65fa20705290856u654da609ia814dd040942612b@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465BB8FE.4030604@v.loewis.de>
	<acd65fa20705290856u654da609ia814dd040942612b@mail.gmail.com>
Message-ID: <acd65fa20705291029v2c8592a3g68f7176744729a59@mail.gmail.com>

I just thought about something. Would making readline(s) not include
the line-breaking character be too radical an idea? I think that is
what most people expect from a readline function, anyway. I often
see things like [line.strip() for line in open(file).readlines()],
which is not so elegant IMHO.

This should be accompanied by a change to writelines that would make it
append to each line the platform-specific line-breaking character, as
defined by os.linesep.

The main objections I see against the change are obviously breaking
backward compatibility, and losing the closure property of
readlines/writelines -- i.e., after g.writelines(f.readlines()), g
wouldn't be guaranteed to have the same content as f. On the other
hand, this could give Python a neat way to convert line-breaking
characters.

Anyway, that was just a random thought. I don't think the change is
worthwhile enough to break backward compatibility.
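
In helper form, the idea would be roughly this (hypothetical names, not
a proposed API):

    import os

    def read_stripped_lines(f):
        # Like f.readlines(), minus the trailing line-break characters.
        return [line.rstrip('\r\n') for line in f]

    def write_lines(f, lines):
        # Like f.writelines(), but re-terminating each line with
        # os.linesep -- which is exactly where the readlines/writelines
        # round trip stops being guaranteed.
        f.writelines(line + os.linesep for line in lines)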

-- Alexandre

From greg.ewing at canterbury.ac.nz  Wed May 30 03:15:02 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 May 2007 13:15:02 +1200
Subject: [Python-3000] Lines breaking
In-Reply-To: <465BB994.9050309@v.loewis.de>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
Message-ID: <465CD016.7050002@canterbury.ac.nz>

Martin v. Löwis wrote:

> For example, .splitlines considers carriage-return (CR)
> characters as well, and you don't seem to complain about
> that.

That doesn't bother me so much because \r as a line
boundary is a well-established convention on some
platforms. But I've *never* heard of FF or VT being
used as line delimiters. If they were, I would regard
it as an application-specific convention requiring
special coding for that application.

--
Greg

From greg.ewing at canterbury.ac.nz  Wed May 30 03:17:40 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 May 2007 13:17:40 +1200
Subject: [Python-3000] Lines breaking
In-Reply-To: <ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<ca471dc20705282303s2b6c873w917d79988a6215f0@mail.gmail.com>
Message-ID: <465CD0B4.9000204@canterbury.ac.nz>

Guido van Rossum wrote:

> Well, I would have complained about that too, except I was too busy
> when splitlines() was snuck into the language behind my back. :-) I
> should add that it has never caused me grief even though it is in
> flagrant disagreement with Python's general concept of line endings.

Personally I wouldn't object if you reverted that and
only allowed "\n" in splitlines. Having one and only
one internal representation for line endings seems
like a good thing.

--
Greg

From greg.ewing at canterbury.ac.nz  Wed May 30 03:42:22 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 30 May 2007 13:42:22 +1200
Subject: [Python-3000] Lines breaking
In-Reply-To: <acd65fa20705291029v2c8592a3g68f7176744729a59@mail.gmail.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465BB8FE.4030604@v.loewis.de>
	<acd65fa20705290856u654da609ia814dd040942612b@mail.gmail.com>
	<acd65fa20705291029v2c8592a3g68f7176744729a59@mail.gmail.com>
Message-ID: <465CD67E.7080902@canterbury.ac.nz>

Alexandre Vassalotti wrote:
> I often
> see things like [line.strip() for line in open(file).readlines()],

If readline() stripped newlines, there would be no way
to distinguish between an empty line and EOF.
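
Today the trailing newline is exactly what keeps the two cases apart.
A small Python 2 illustration (StringIO only to have something
file-like at hand):

    from StringIO import StringIO

    f = StringIO('first\n\nlast\n')
    while True:
        line = f.readline()
        if line == '':          # readline() returns '' only at EOF
            print('EOF')
            break
        elif line == '\n':      # an empty line still carries its newline
            print('blank line')
        else:
            print(repr(line.rstrip('\n')))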

--
Greg

From stephen at xemacs.org  Wed May 30 05:19:29 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 30 May 2007 12:19:29 +0900
Subject: [Python-3000] Lines breaking
In-Reply-To: <465CD016.7050002@canterbury.ac.nz>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<465CD016.7050002@canterbury.ac.nz>
Message-ID: <87ps4j0zi6.fsf@uwakimon.sk.tsukuba.ac.jp>

Greg Ewing writes:

 > That doesn't bother me so much because \r as a line boundary is a
 > well-established convention on some platforms. But I've *never*
 > heard of FF or VT being used as line delimiters.

The Unicode newline recommendation is all about making the use of
characters match their physical presentation.  If on a printer, you
force a new page with FF, you will see a physical line break at the
end of the page containing the FF.  Similarly with VT.  (It seems that
word processors which interpret LF as a paragraph separator often use
VT as a hard newline.)  The input functions should obey Unicode's
recommendations, IMHO.

OTOH, AIUI Unicode conformance does not require the Python language
(grammar) to allow line breaking characters other than those currently
recognized.  And the grammar may restrict their use (eg, FF only at
the end of an empty line).

From greg.ewing at canterbury.ac.nz  Thu May 31 04:49:39 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 31 May 2007 14:49:39 +1200
Subject: [Python-3000] Lines breaking
In-Reply-To: <87ps4j0zi6.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<465CD016.7050002@canterbury.ac.nz>
	<87ps4j0zi6.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <465E37C3.9070407@canterbury.ac.nz>

Stephen J. Turnbull wrote:
> The Unicode newline recommendation is all about making the use of
> characters match their physical presentation.  If on a printer, you
> force a new page with FF, you will see a physical line break at the
> end of the page containing the FF.  Similarly with VT.

I'm worried here about loss of information. Currently,
a Python-recognised line break character signifies
a line break and nothing else. You can read a file as
lines, strip off the newlines, do some processing, and
add the newlines back in when writing out the results,
without losing anything essential.

But an FF or VT is not *just* a line break, it can
have other semantics attatched to it as well. So
treating it just the same as a \n by default would be
wrong, I think.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From talin at acm.org  Thu May 31 08:37:07 2007
From: talin at acm.org (Talin)
Date: Wed, 30 May 2007 23:37:07 -0700
Subject: [Python-3000] Updating PEP 3101
Message-ID: <465E6D13.2030606@acm.org>

I'm in the process of updating PEP 3101 to incorporate all of the 
various discussions and evolutions that have taken place, and this is 
turning out to be fairly involved, as there are a lot of ideas scattered 
all over the place.

One thing I'd like to do is simplify the PEP a little bit, but at the 
same time address some of the requests that folks have asked for.

The goal here is to keep the basic "string.format" interface as simple 
as possible, but at the same time to allow access to more complex 
formatting for people who need it. My assumption is that people who need 
that more complex formatting would be willing to give up some of the 
syntactical convenience of the simple "string.format" style of formatting.

So for example, one thing that has been asked for is the ability to pass 
in a whole dictionary as a single argument, without using **kwds-style 
keyword parameter expansion (which is inefficient if the dictionary is 
large and only a few entries are being referred to in the format string.)

The most recent proposals have this implemented by a special 'namespace' 
argument to the format function. However, I don't like the idea of 
having certain arguments with 'special' names.

Instead, what I'd like to do is define a "Formatter" class that takes a 
larger number of options and parameters than the normal string.format 
method. People who need the extra power can construct an instance of 
Formatter (or subclass it if needed) and use that.

So for example, for people who want to be able to directly access local 
variables in a format string, you might be able to say something like:

    a = 1
    print(Formatter(locals()).format("The value of a is {a}"))

Where the "Formatter" constructor looks like:

    Formatter(namespace={}, flags=None)
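
For discussion purposes, here is a deliberately tiny sketch of that
interface -- it only does {name} substitution from the stored namespace
plus keyword arguments, with none of PEP 3101's conversion or
format-spec machinery, so treat it purely as an illustration of the
calling convention:

    import re

    _FIELD = re.compile(r'\{([^{}]+)\}')

    class Formatter(object):
        def __init__(self, namespace=None, flags=None):
            self.namespace = dict(namespace or {})
            self.flags = flags

        def format(self, fmt, **kwds):
            def replace(match):
                name = match.group(1)
                if name in kwds:            # explicit kwargs win
                    return str(kwds[name])
                return str(self.namespace[name])
            return _FIELD.sub(replace, fmt)

    print(Formatter({'a': 1}).format("The value of a is {a}"))
    # -> The value of a is 1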

In the case where you want direct access to global variables, you can 
make it even more convenient by caching the Formatter:

     f = Formatter(globals()).format
     a = 1
     print(f("The value of a is {a}"))

(You can't do this with locals() because you can't keep the dict around.)

My question to the groupmind out there is: Do you find this extra syntax 
too inconvenient and wordy, or does it seem acceptable?

-- Talin

From stephen at xemacs.org  Thu May 31 09:22:41 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 31 May 2007 16:22:41 +0900
Subject: [Python-3000] Lines breaking
In-Reply-To: <465E37C3.9070407@canterbury.ac.nz>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<465CD016.7050002@canterbury.ac.nz>
	<87ps4j0zi6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<465E37C3.9070407@canterbury.ac.nz>
Message-ID: <8764691mpq.fsf@uwakimon.sk.tsukuba.ac.jp>

Greg Ewing writes:

 > But an FF or VT is not *just* a line break, it can
 > have other semantics attached to it as well. So
 > treating it just the same as a \n by default would be
 > wrong, I think.

*Python* does the right thing: it leaves the line break character(s)
in place.  It's not Python's problem if programmers go around
stripping characters just because they happen to be at the end of the
line.  If you do care, you're already in trouble if you strip
willy-nilly:

>>> len("a\014\n")
3
>>> len("a\014\n".strip())
1
>>> len("a\014\n".strip() + "\n")
2
>>> "a\r\n"[:-1]
"a\r"

I think the odds are really good that there are already more people
who will expect Python to be Unicode-ly correct than who have
already-defined semantics for FF or VT that just happen to work right
if you strip the terminating LF but not a terminating FF.

The remaining issue, embedding those characters in the interior of
lines but considering them not line breaks, is considered by the
Unicode technical committee a non-issue.  Those characters are
mandatory breaks because the expectation is *very* consistent (they
say).  I gather you think it's reasonable, too, you just worry that
the additional semantics may get lost with current newline-stripping
heuristics.

As for existing programs that will go postal if you hand them a
line that's terminated with FF or VT, I don't see any conceptual problem
with a codec (universal newline) that on input of "a\014" returns
"a\014\n".  Getting the details right (ie, respecting POLA) will
require some thought and maybe some fiddly options, but it will work.
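
Something as small as this post-processing pass captures the intent
(the name and details are mine, purely a sketch):

    def lf_terminate(lines):
        # If a line came back ending in FF or VT (because those are now
        # mandatory breaks), also give it a trailing LF so code written
        # for LF-terminated lines keeps working.
        for line in lines:
            if line.endswith(('\x0b', '\x0c')):
                yield line + '\n'
            else:
                yield line

    print(list(lf_terminate(['a\x0c', 'b\n'])))   # ['a\x0c\n', 'b\n']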

Always-do-right-it-will-gratify-some-people-and-astonish-the-rest-ly y'rs


From guido at python.org  Thu May 31 13:48:48 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 31 May 2007 19:48:48 +0800
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
In-Reply-To: <ca471dc20705281937y48300821u840add9d5454e8d9@mail.gmail.com>
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
	<ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>
	<003f01c79fd9$66948ec0$0201a8c0@mshome.net>
	<ca471dc20705270259ke665af6v3b5bdbffbd926330@mail.gmail.com>
	<009c01c7a04f$7e348460$0201a8c0@mshome.net>
	<ca471dc20705270550j5e199624xd4e8f6caa9dda93d@mail.gmail.com>
	<ca471dc20705281937y48300821u840add9d5454e8d9@mail.gmail.com>
Message-ID: <ca471dc20705310448p5c5cfeds41fdc75e05c21f55@mail.gmail.com>

I've updated the patch; the latest version now contains the grammar
and compiler changes needed to make super a keyword and to
automatically add a required parameter 'super' when super is used.
This requires the latest p3yk branch (r55692 or higher).

Comments anyone? What do people think of the change of semantics for
the im_class field of bound (and unbound) methods?

--Guido

On 5/29/07, Guido van Rossum <guido at python.org> wrote:
> Hi Tim,
>
> I've gone ahead and cooked up a tiny demo patch that uses im_class to
> store what you called im_type. Because I don't have the parser changes
> ready yet, this requires you to declare a keyword-only arg named
> 'super'; this triggers special code that sets it to super(im_class,
> im_self).
>
> http://python.org/sf/1727209
>
> I haven't tried to discover yet how much breaks due to the change of
> semantics for im_class.
>
> --Guido
>
> On 5/27/07, Guido van Rossum <guido at python.org> wrote:
> > On 5/27/07, Tim Delaney <timothy.c.delaney at gmail.com> wrote:
> > > Guido van Rossum wrote:
> > >
> > > > The bound method object isn't stored in the class -- it's created by
> > > > the "C.method" or "inst.method" getattr operation. I don't see how
> > > > this would introduce a cycle.
> > > >
> > > >> If we store the class, we can store it as a weakref - then when the
> > > >> super object is created, a strong reference to the class exists.
> > >
> > > We need to create some relationship between the unbound method and the
> > > class. So the class has a reference to the unbound method, and the unbound
> > > method has a reference to the class, thus creating a cycle. Bound methods
> > > don't come into it - it's the unbound method that's the problem.
> >
> > Still wrong, I think. The unbound method object *also* isn't stored in
> > the class. It's returned by the C.method operation. Compare C.method
> > (which returns an unbound method) to C.__dict__['method'] (which
> > returns the actual function object stored in the class).
> >
> > > > Since class and type are synonyms (as you say) having both im_class and
> > > > im_type would be a bad idea.
> > >
> > > I'm struggling to think of another, not too complicated name that conveys
> > > the same information.
> >
> > Keep trying. im_type is not acceptable. :-)
> >
> > --
> > --Guido van Rossum (home page: http://www.python.org/~guido/)
> >
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com  Thu May 31 13:52:21 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 31 May 2007 21:52:21 +1000
Subject: [Python-3000] Updating PEP 3101
In-Reply-To: <465E6D13.2030606@acm.org>
References: <465E6D13.2030606@acm.org>
Message-ID: <465EB6F5.5000308@gmail.com>

Talin wrote:
> In the case where you want direct access to global variables, you can 
> make it even more convenient by caching the Formatter:
> 
>      f = Formatter(globals()).format
>      a = 1
>      print(f("The value of a is {a}"))
> 
> (You can't do this with locals() because you can't keep the dict around.)
> 
> My question to the groupmind out there is: Do you find this extra syntax 
> too inconvenient and wordy, or does it seem acceptable?

I like it - even with locals, it works well for multi-line output:

   fmt = Formatter(locals()).format
   print(fmt('Count: {count}'))
   print(fmt('Total: {total}'))
   print(fmt('Average: {avg}'))

(Hmm, the extra parentheses on print statements are annoying me 
already... but I imagine I will get over it :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From timothy.c.delaney at gmail.com  Thu May 31 14:25:28 2007
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Thu, 31 May 2007 22:25:28 +1000
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
	<ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>
	<003f01c79fd9$66948ec0$0201a8c0@mshome.net>
	<ca471dc20705270259ke665af6v3b5bdbffbd926330@mail.gmail.com>
	<009c01c7a04f$7e348460$0201a8c0@mshome.net>
	<ca471dc20705270550j5e199624xd4e8f6caa9dda93d@mail.gmail.com>
	<ca471dc20705281937y48300821u840add9d5454e8d9@mail.gmail.com>
	<ca471dc20705310448p5c5cfeds41fdc75e05c21f55@mail.gmail.com>
Message-ID: <016201c7a37e$c941adc0$0201a8c0@mshome.net>

Guido van Rossum wrote:
> I've updated the patch; the latest version now contains the grammar
> and compiler changes needed to make super a keyword and to
> automatically add a required parameter 'super' when super is used.
> This requires the latest p3yk branch (r55692 or higher).
>
> Comments anyone? What do people think of the change of semantics for
> the im_class field of bound (and unbound) methods?

I had problems getting the p3yk branch, which I only resolved yesterday, so I 
haven't actually applied the patch here yet. Turns out I'd grabbed the wrong 
URL for the repository at some point, and couldn't work out why I kept 
getting "prop not found" errors when trying to check out.

If I understand correctly, the patch basically takes im_class back to Python 
2.1 semantics, which I always felt were much more useful than the 2.2 
semantics. As a bonus, it should mean that the repr of a bound or unbound 
method should reflect the class it was defined in. Is this correct?

The patch notes say that you're actually inserting a keyword-only argument - 
is this purely meant to be a stopgap measure so that you've got a local 
(which could be put into a cell)? Presumably with this approach you could 
call the method like:

    A().func(1, 2, super=object())

The final implementation IMO needs to have super be an implicit local, but 
not an argument.

BTW, what made you change your mind on re-using im_class? Previously you'd 
said you didn't want to (although now I can't find the email to back that 
up). I'd written off reusing it for this purpose because of that.

I won't be able to update the PEP until Sunday (visiting family) but I'll 
try to incorporate everything we've discussed. Did we get a decision on 
whether im_class should return the decorated or undecorated class, or did 
you want me to leave that as an open issue?

I'm starting to feel somewhat embarrassed that I haven't had the time 
available to work solidly on this, but don't let that stop you from doing 
it - I'd rather have a good implementation early and not let my ego get in 
the way <wink>.

Cheers,

Tim Delaney 


From guido at python.org  Thu May 31 15:08:16 2007
From: guido at python.org (Guido van Rossum)
Date: Thu, 31 May 2007 21:08:16 +0800
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
In-Reply-To: <016201c7a37e$c941adc0$0201a8c0@mshome.net>
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
	<ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>
	<003f01c79fd9$66948ec0$0201a8c0@mshome.net>
	<ca471dc20705270259ke665af6v3b5bdbffbd926330@mail.gmail.com>
	<009c01c7a04f$7e348460$0201a8c0@mshome.net>
	<ca471dc20705270550j5e199624xd4e8f6caa9dda93d@mail.gmail.com>
	<ca471dc20705281937y48300821u840add9d5454e8d9@mail.gmail.com>
	<ca471dc20705310448p5c5cfeds41fdc75e05c21f55@mail.gmail.com>
	<016201c7a37e$c941adc0$0201a8c0@mshome.net>
Message-ID: <ca471dc20705310608qfbc20f9l4086713bf805905e@mail.gmail.com>

On 5/31/07, Tim Delaney <timothy.c.delaney at gmail.com> wrote:
> Guido van Rossum wrote:
> > I've updated the patch; the latest version now contains the grammar
> > and compiler changes needed to make super a keyword and to
> > automatically add a required parameter 'super' when super is used.
> > This requires the latest p3yk branch (r55692 or higher).
> >
> > Comments anyone? What do people think of the change of semantics for
> > the im_class field of bound (and unbound) methods?
>
> I had problems getting the p3yk branch, which I only resolved yesterday, so I
> haven't actually applied the patch here yet. It turns out I'd grabbed the wrong
> URL for the repository at some point, and couldn't work out why I kept
> getting "prop not found" errors when trying to check out.

svn definitely has some sharp edges when you specify a bad URL.

> If I understand correctly, the patch basically takes im_class back to Python
> 2.1 semantics, which I always felt were much more useful than the 2.2
> semantics. As a bonus, it should mean that the repr of a bound or unbound
> method should reflect the class it was defined in. Is this correct?

Right. (I think that's the main cause of various test failures, which
I haven't corrected yet.)

> The patch notes say that you're actually inserting a keyword-only argument -
> is this purely meant to be a stopgap measure so that you've got a local
> (which could be put into a cell)?

I'm not using a cell because I'm storing the result of calling
super(Class, self) -- that is different for each instance, while a
cell would be shared by all invocations of the same function.
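
A plain-Python illustration of that distinction (nothing here depends
on the patch): the enclosing class is the same for every call, but the
bound super object is tied to a particular self, so it has to be
computed per invocation:

    class Base:
        def who(self):
            return type(self).__name__

    class Derived(Base):
        def who(self):
            s = super(Derived, self)   # bound to this particular self
            return 'via ' + s.who()

    print(Derived().who())   # via Derived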

> Presumably with this approach you could
> call the method like:
>
>     A().func(1, 2, super=object())

No, because that would be a syntax error (super as a keyword is only
allowed as an atom). You could get the same effect with

  A().func(1, 2, **{'super': object()})

but that's so obscure I don't mind.

Hmm, right now the super=object() syntax *is* accepted, but that's a
bug in the code (which I submitted yesterday) that checks for
assignments to keywords like None, True, False, and now super.

> The final implementation IMO needs to have super be an implicit local, but
> not an argument.

I thought so too at first, but there are no APIs to pass the value
along from the point where the super object is created (in the
method_call() function) to the point where the frame exists into which
the object needs to be stored (in PyEval_EvalCodeEx). So I think a
hidden keyword argument is quite convenient.

> BTW, what made you change your mind on re-using im_class? Previously you'd
> said you didn't want to (although now I can't find the email to back that
> up). I'd written off reusing it for this purpose because of that.

I do recall not liking that, but ended up thinking some more about it
after I realized how much work it would be to add another member to
the method struct. When I tried it and saw that only 7 unit test
modules had failures (and mostly only a few out of many tests) I
decided it was worth trying.

> I won't be able to update the PEP until Sunday (visiting family) but I'll
> try to incorporate everything we've discussed. Did we get a decision on
> whether im_class should return the decorated or undecorated class, or did
> you want me to leave that as an open issue?

In my implementation, it will return whatever object is found in the
MRO of the derived class, because that's all that's available  -- I
suppose this means in practice it's the decorated class.

BTW I'm open to a different implementation that stores the class in a
cell and moves the computation of super(Class, self) into the function
body -- but that would be completely different from the current
version, as the changes to im_class and method_call would not be
useful in that case. Instead, something would have to be done with that
cell at class definition time. I fear that it would be much more
complicated to produce that version -- I spent a *lot* of time trying
to understand how symtable.c and compile.c work in order to be able to
add the implied super argument. That code is really difficult to
follow; it uses a different style than most of the rest of Python
(perhaps because I didn't write it :-), and it is quite subtle. For
example, if a nested function inside a method uses super, this
currently doesn't reference the super of the method -- it adds super
to the nested function's parameter list, and this makes it
effectively uncallable.

> I'm starting to feel somewhat embarrassed that I haven't had the time
> available to work solidly on this, but don't let that stop you from doing
> it - I'd rather have a good implementation early and not let my ego get in
> the way <wink>.

Thanks. I realize I sort of took over and was hoping you'd respond
like this. I may not have much time over the weekend (recovering from
an exhausting and mind-bending trip to Beijing) so you're welcome to
catch up!

> Cheers,
>
> Tim Delaney

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From timothy.c.delaney at gmail.com  Thu May 31 15:25:17 2007
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Thu, 31 May 2007 23:25:17 +1000
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
	<ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>
	<003f01c79fd9$66948ec0$0201a8c0@mshome.net>
	<ca471dc20705270259ke665af6v3b5bdbffbd926330@mail.gmail.com>
	<009c01c7a04f$7e348460$0201a8c0@mshome.net>
	<ca471dc20705270550j5e199624xd4e8f6caa9dda93d@mail.gmail.com>
	<ca471dc20705281937y48300821u840add9d5454e8d9@mail.gmail.com>
	<ca471dc20705310448p5c5cfeds41fdc75e05c21f55@mail.gmail.com>
	<016201c7a37e$c941adc0$0201a8c0@mshome.net>
	<ca471dc20705310608qfbc20f9l4086713bf805905e@mail.gmail.com>
Message-ID: <018101c7a387$23f21a90$0201a8c0@mshome.net>

Guido van Rossum wrote:

>> The patch notes say that you're actually inserting a keyword-only
>> argument - is this purely meant to be a stopgap measure so that
>> you've got a local (which could be put into a cell)?
>
> I'm not using a cell because I'm storing the result of calling
> super(Class, self) -- that is different for each instance, while a
> cell would be shared by all invocations of the same function.

I'm actually investigating another (possibly complementary) option at the 
moment - adding an im_super attribute to methods, which would store either a 
bound or unbound super instance when the bound or unbound method object is 
created. method_new becomes:

static PyObject *
method_new(PyTypeObject* type, PyObject* args, PyObject *kw)
{
    PyObject *func;
    PyObject *self;
    PyObject *classObj = NULL;

    if (!_PyArg_NoKeywords("instancemethod", kw))
        return NULL;
    if (!PyArg_UnpackTuple(args, "method", 2, 3,
                  &func, &self, &classObj))
        return NULL;
    if (!PyCallable_Check(func)) {
        PyErr_SetString(PyExc_TypeError,
                "first argument must be callable");
        return NULL;
    }
    if (self == Py_None)
        self = NULL;
    if (self == NULL && classObj == NULL) {
        PyErr_SetString(PyExc_TypeError,
            "unbound methods must have non-NULL im_class");
        return NULL;
    }

    return PyMethod_New(func, self, classObj);
}

then in method_call we could have:

static PyObject *
method_call(PyObject *func, PyObject *arg, PyObject *kw)
{
    PyObject *self = PyMethod_GET_SELF(func);
    PyObject *klass = PyMethod_GET_CLASS(func);
    PyObject *supervalue = PyMethod_GET_SUPER(func);

and populate the `super` argument from supervalue. I think im_super has uses 
on its own (esp. for introspection).

>> Presumably with this approach you could
>> call the method like:
>>
>>     A().func(1, 2, super=object())
>
> No, because that would be a syntax error (super as a keyword is only
> allowed as an atom). You could get the same effect with
>
>  A().func(1, 2, **{'super': object()})
>
> but that's so obscure I don't mind.

I'd prefer to eliminate it, but that's a detail that can be taken care of 
later.

Anyway, need to go to bed - have to be up in 6 hours.

Cheers,

Tim Delaney 


From janssen at parc.com  Thu May 31 16:49:53 2007
From: janssen at parc.com (Bill Janssen)
Date: Thu, 31 May 2007 07:49:53 PDT
Subject: [Python-3000] Lines breaking
In-Reply-To: <465E37C3.9070407@canterbury.ac.nz> 
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<465CD016.7050002@canterbury.ac.nz>
	<87ps4j0zi6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<465E37C3.9070407@canterbury.ac.nz>
Message-ID: <07May31.074957pdt."57996"@synergy1.parc.xerox.com>

> But an FF or VT is not *just* a line break, it can
> have other semantics attached to it as well. So
> treating it just the same as a \n by default would be
> wrong, I think.

I agree.  I have text files which contain lines of FF NL, which
are supposed to be single lines with a FF as their content (to signify
a page break), not two separate lines.

Bill

From pje at telecommunity.com  Thu May 31 19:08:40 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 31 May 2007 13:08:40 -0400
Subject: [Python-3000] [Python-Dev] PEP 367: New Super
In-Reply-To: <ca471dc20705310448p5c5cfeds41fdc75e05c21f55@mail.gmail.com>
References: <001101c79aa7$eb26c130$0201a8c0@mshome.net>
	<001d01c79f15$f0afa140$0201a8c0@mshome.net>
	<ca471dc20705251613p59eb1a71k9aaf13f4b492181e@mail.gmail.com>
	<002d01c79f6d$ce090de0$0201a8c0@mshome.net>
	<ca471dc20705260708t952d820w7473474554c9469b@mail.gmail.com>
	<003f01c79fd9$66948ec0$0201a8c0@mshome.net>
	<ca471dc20705270259ke665af6v3b5bdbffbd926330@mail.gmail.com>
	<009c01c7a04f$7e348460$0201a8c0@mshome.net>
	<ca471dc20705270550j5e199624xd4e8f6caa9dda93d@mail.gmail.com>
	<ca471dc20705281937y48300821u840add9d5454e8d9@mail.gmail.com>
	<ca471dc20705310448p5c5cfeds41fdc75e05c21f55@mail.gmail.com>
Message-ID: <20070531170734.273393A40AA@sparrow.telecommunity.com>

At 07:48 PM 5/31/2007 +0800, Guido van Rossum wrote:
>I've updated the patch; the latest version now contains the grammar
>and compiler changes needed to make super a keyword and to
>automatically add a required parameter 'super' when super is used.
>This requires the latest p3yk branch (r55692 or higher).
>
>Comments anyone? What do people think of the change of semantics for
>the im_class field of bound (and unbound) methods?

Please correct me if I'm wrong, but just looking at the patch it 
seems to me that the descriptor protocol is being changed as well -- 
i.e., the 'type' argument is now the found-in-type in the case of an 
instance __get__ as well as class __get__.

It would seem to me that this change would break classmethods both on 
the instance and class level, since the 'cls' argument is supposed to 
be the derived class, not the class where the method was 
defined.  There also don't seem to be any tests for the use of super 
in classmethods.
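
For reference, the classmethod behaviour in question is simply the
standard one, independent of the patch: 'cls' is the class the lookup
started from, not the defining class.

    class Base:
        @classmethod
        def which(cls):
            return cls.__name__

    class Derived(Base):
        pass

    print(Derived.which())    # Derived
    print(Derived().which())  # Derived -- not Base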

This would seem to make the change unworkable, unless we are also 
getting rid of classmethods, or further change the descriptor 
protocol to add another argument.  However, by the time we get to 
that point, it seems like making 'super' a cell variable might be a 
better option.

Here's a strategy that I think could resolve your difficulties with 
the cell variable approach:

First, when a class is encountered during the symbol setup pass, 
allocate an extra symbol for the class as a cell variable with a 
generated name (e.g. $1, $2, etc.), and keep a pointer to this name 
in the class state information.

Second, when generating code for 'super', pull out the generated 
variable name of the nearest enclosing class, and use it as if it had 
been written in the code.

Third, change the MAKE_FUNCTION for the BUILD_CLASS to a 
MAKE_CLOSURE, and add code after BUILD_CLASS to also store a super 
object in the special variable.  Maybe something like:

      ...
      BUILD_CLASS
      ... apply decorators ...
      DUP_TOP
      STORE_* classname
      ... generate super object ...
      STORE_DEREF $n

Fourth, make sure that the frame initialization code can deal with a 
code object that has a locals dictionary *and* cell variables.  For 
Python 2.5, this constraint is already met as long as CO_OPTIMIZED 
isn't set, and that should already be true for the relevant cases 
(module-level code and class bodies), so we really just need to 
ensure that CO_OPTIMIZED doesn't get set as a side-effect of adding 
cell variables.
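
A rough Python-level analogue of this strategy (names like make_class
and cls_cell are made up for illustration): the one-element list stands
in for the hidden cell ($n), which is only filled in once the class
object exists, and the method closes over it instead of receiving an
extra argument.

    def make_class():
        cls_cell = []                  # the "cell"; empty until the class exists

        class A:
            def describe(self):
                return super(cls_cell[0], self).__repr__()

        cls_cell.append(A)             # roughly what STORE_DEREF $n would do
        return A

    A = make_class()
    print(A().describe())              # object's repr, reached via super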


From g.brandl at gmx.net  Thu May 31 19:34:16 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 31 May 2007 19:34:16 +0200
Subject: [Python-3000] __debug__
Message-ID: <f3n0ul$k49$1@sea.gmane.org>

Guido just fixed a case in the py3k branch where you could assign to
"None" in a function call.

__debug__ has similar problems: it can't be assigned to normally, but via
keyword arguments it is possible.

This should be fixed; or should __debug__ be thrown out anyway?

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From stephen at xemacs.org  Thu May 31 19:50:28 2007
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 01 Jun 2007 02:50:28 +0900
Subject: [Python-3000] Lines breaking
In-Reply-To: <07May31.074957pdt."57996"@synergy1.parc.xerox.com>
References: <acd65fa20705280956j20409bc1qfe2f82f03ca94247@mail.gmail.com>
	<ca471dc20705281544i3be797f7ldab472dac3e1f543@mail.gmail.com>
	<acd65fa20705281649m7a7a871bw8d690456202f7b83@mail.gmail.com>
	<465B814D.2060101@canterbury.ac.nz> <465BB994.9050309@v.loewis.de>
	<465CD016.7050002@canterbury.ac.nz>
	<87ps4j0zi6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<465E37C3.9070407@canterbury.ac.nz>
	<07May31.074957pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <873b1c287v.fsf@uwakimon.sk.tsukuba.ac.jp>

Bill Janssen writes:

 > > But an FF or VT is not *just* a line break, it can
 > > have other semantics attached to it as well. So
 > > treating it just the same as a \n by default would be
 > > wrong, I think.
 > 
 > I agree.  I have text files which contain lines of FF NL, which
 > are supposed to be single lines with a FF as their content (to signify
 > a page break), not two separate lines.

I agree that that looks nice in my editor, but it is not Unicode-
conforming practice, and I suspect that if you experiment with any
printer you'll discover that you get an empty line at the top of the
page.

I also suspect that any program that currently is used to process
those files' content by lines probably simply treats the FF as
whitespace, and throws away empty lines.  If so, it will still work
with FF treated as a hard line break in line-processing mode, since
the trailing NL will now generate a (superfluous) empty line.
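
For what it's worth, splitlines() on text strings already behaves this
way -- FF is a hard line break, so an "FF NL" pair produces exactly
that superfluous empty line:

    text = 'page one\x0c\npage two\n'
    print(text.splitlines())   # ['page one', '', 'page two']
    print(text.split('\n'))    # ['page one\x0c', 'page two', '']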

Given that, is this going to matter to you?

From alexandre at peadrop.com  Thu May 31 20:49:25 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Thu, 31 May 2007 14:49:25 -0400
Subject: [Python-3000] Buffer objects and StringIO
Message-ID: <acd65fa20705311149j7e0d4750h3e086ef8f0b5cc3e@mail.gmail.com>

Hello,

Yesterday I finished the implementations of the BytesIO and StringIO
objects in C. They are both fully working. (The code is available in
my cpy_merge branch in the svn tree.) There is only one thing that is
bothering me with StringIO: it doesn't accept buffer objects. Should I
care about this?

Thanks,
-- Alexandre

From brett at python.org  Thu May 31 20:51:08 2007
From: brett at python.org (Brett Cannon)
Date: Thu, 31 May 2007 11:51:08 -0700
Subject: [Python-3000] __debug__
In-Reply-To: <f3n0ul$k49$1@sea.gmane.org>
References: <f3n0ul$k49$1@sea.gmane.org>
Message-ID: <bbaeab100705311151q5a900afeq7ab1f8988a7eecee@mail.gmail.com>

On 5/31/07, Georg Brandl <g.brandl at gmx.net> wrote:
>
> Guido just fixed a case in the py3k branch where you could assign to
> "None" in a function call.
>
> __debug__ has similar problems: it can't be assigned to normally, but via
> keyword arguments it is possible.
>
> This should be fixed; or should __debug__ be thrown out anyway?



I never use the flag, personally.  When I am debugging I have an
app-specific flag I set.  I am +1 on ditching it.

-Brett

From theller at ctypes.org  Thu May 31 21:59:28 2007
From: theller at ctypes.org (Thomas Heller)
Date: Thu, 31 May 2007 21:59:28 +0200
Subject: [Python-3000] __debug__
In-Reply-To: <bbaeab100705311151q5a900afeq7ab1f8988a7eecee@mail.gmail.com>
References: <f3n0ul$k49$1@sea.gmane.org>
	<bbaeab100705311151q5a900afeq7ab1f8988a7eecee@mail.gmail.com>
Message-ID: <f3n9f1$kqg$1@sea.gmane.org>

Brett Cannon schrieb:
> On 5/31/07, Georg Brandl <g.brandl at gmx.net> wrote:
>>
>> Guido just fixed a case in the py3k branch where you could assign to
>> "None" in a function call.
>>
>> __debug__ has similar problems: it can't be assigned to normally, but via
>> keyword arguments it is possible.
>>
>> This should be fixed; or should __debug__ be thrown out anyway?
> 
> 
> 
> I never use the flag, personally.  When I am debugging I have an
> app-specific flag I set.  I am +1 on ditching it.
> 
> -Brett
> 
> 

I would very much wish that __debug__ stays, because I use it in nearly every larger
program that I later wish to freeze and distribute.

"if __debug__: ..." blocks have the advantage that *no* bytecode is generated
when run or frozen with -O or -OO, so the modules imported in these blocks
are not pulled in by modulefinder.  You cannot get this effect (AFAIK) with
app-specific flags.
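
A sketch of the pattern (the pdb import and the debug_here name are
just examples): under -O the whole block is compiled away, so tools
that walk the bytecode, such as modulefinder, never see the import.

    if __debug__:
        import pdb

        def debug_here():
            # Only exists in non-optimized runs.
            pdb.set_trace()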

Thanks,
Thomas


From nnorwitz at gmail.com  Thu May 31 22:55:39 2007
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 31 May 2007 13:55:39 -0700
Subject: [Python-3000] Buffer objects and StringIO
In-Reply-To: <acd65fa20705311149j7e0d4750h3e086ef8f0b5cc3e@mail.gmail.com>
References: <acd65fa20705311149j7e0d4750h3e086ef8f0b5cc3e@mail.gmail.com>
Message-ID: <ee2a432c0705311355p7e0dadd3nc8edc24ad580b9e1@mail.gmail.com>

On 5/31/07, Alexandre Vassalotti <alexandre at peadrop.com> wrote:
> Hello,
>
> Yesterday I finished the implementations of the BytesIO and StringIO
> objects in C. They are both fully working. (The code is available in
> my cpy_merge branch in the svn tree.) There is only one thing that is
> bothering me with StringIO: it doesn't accept buffer objects. Should I
> care about this?

Yes, but buffer objects are likely to change in 3.0.  See PEP 3118
http://www.python.org/dev/peps/pep-3118/

The PEP isn't accepted yet because it isn't complete, AFAIK.
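
For comparison, a rough check against the io module that eventually
shipped (not the cpy_merge branch under discussion): BytesIO accepts
any bytes-like object, while StringIO accepts only str.

    import io

    print(io.BytesIO(bytearray(b'hello')).read())   # b'hello'
    print(io.BytesIO(memoryview(b'hello')).read())  # b'hello'
    print(io.StringIO('hello').read())              # hello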

n