From grosser.meister.morti at gmx.net  Tue Apr  1 12:36:57 2008
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Tue, 01 Apr 2008 12:36:57 +0200
Subject: [Python-ideas] dictionary unpacking
In-Reply-To: <20080331144309.GA20205@panix.com>
References: <47F0F267.8080005@gmx.net> <20080331144309.GA20205@panix.com>
Message-ID: <47F21049.5050700@gmx.net>

Aahz wrote:
> On Mon, Mar 31, 2008, Mathias Panzenböck wrote:
>> Maybe dictionary unpacking would be a nice thing?
>>
>>  >>> d = {'foo': 42, 'egg': 23}
>>  >>> {'foo': bar, 'egg': spam} = d
>>  >>> print bar, spam
>> 42 23
>>
>> What do you think? Bad idea? Good idea?
> 
> Horrible idea.  ;-)

And your argumentation on this is?

From lists at cheimes.de  Tue Apr  1 12:48:12 2008
From: lists at cheimes.de (Christian Heimes)
Date: Tue, 01 Apr 2008 12:48:12 +0200
Subject: [Python-ideas] dictionary unpacking
In-Reply-To: <47F21049.5050700@gmx.net>
References: <47F0F267.8080005@gmx.net> <20080331144309.GA20205@panix.com>
	<47F21049.5050700@gmx.net>
Message-ID: <fst3jn$dj4$1@ger.gmane.org>

Mathias Panzenböck wrote:
> And your argumentation on this is?

I'm -1, too.

I don't see how it makes code easier to read. In fact it makes it even
harder to understand.

Christian

From facundobatista at gmail.com  Tue Apr  1 15:13:35 2008
From: facundobatista at gmail.com (Facundo Batista)
Date: Tue, 1 Apr 2008 10:13:35 -0300
Subject: [Python-ideas] dictionary unpacking
In-Reply-To: <47F0F267.8080005@gmx.net>
References: <47F0F267.8080005@gmx.net>
Message-ID: <e04bdf310804010613h56d11f19u5253b05602a14778@mail.gmail.com>

2008/3/31, Mathias Panzenböck <grosser.meister.morti at gmx.net>:

> Maybe dictionary unpacking would be a nice thing?
>
>   >>> d = {'foo': 42, 'egg': 23}
>   >>> {'foo': bar, 'egg': spam} = d
>   >>> print bar, spam
>  42 23

Bad idea.

It leaves a lot of details for your brain to take care of when doing
that, for example:

>>> d = {'foo': 42, 'egg': 23}
>>> {'not': bar, 'egg': spam} = d
>>> print bar
???
>>> {'egg': bar, 'egg': spam} = d
>>> print bar
???

I would consider something like the following:

  bar, spam = d.multiple("foo", "egg")

with the semantics of:

  bar, spam = [d[k] for k in ("foo", "egg")]
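
A minimal sketch of those semantics, written in Python 3 as a plain helper function. The name `multiple` is taken from the suggestion above; it is a hypothetical stand-alone function here, not an existing dict method. The sketch also shows how the two problem cases would behave under these semantics: a repeated key yields the same value twice, and a missing key raises KeyError.

```python
def multiple(d, *keys):
    """Return the values of d for the given keys, in order (hypothetical helper)."""
    return [d[k] for k in keys]

d = {'foo': 42, 'egg': 23}
bar, spam = multiple(d, 'foo', 'egg')
print(bar, spam)

# A repeated key is unambiguous: both names receive the same value.
a, b = multiple(d, 'egg', 'egg')

# A missing key raises KeyError instead of silently binding something.
try:
    multiple(d, 'not', 'egg')
except KeyError as exc:
    missing = exc.args[0]
```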

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From tjreedy at udel.edu  Tue Apr  1 19:44:55 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 1 Apr 2008 13:44:55 -0400
Subject: [Python-ideas] dictionary unpacking
References: <47F0F267.8080005@gmx.net>
	<e04bdf310804010613h56d11f19u5253b05602a14778@mail.gmail.com>
Message-ID: <fstsal$fa7$1@ger.gmane.org>

"Facundo Batista" <facundobatista at gmail.com> 
wrote in message 
news:e04bdf310804010613h56d11f19u5253b05602a14778 at mail.gmail.com...

-1 because the need is way too rare for new syntax support and because

|  bar, spam = [d[k] for k in ("foo", "egg")]

already does what is wanted in easily understood code.

tjtr

From gnewsg at gmail.com  Wed Apr  2 04:25:08 2008
From: gnewsg at gmail.com (Giampaolo Rodola')
Date: Tue, 1 Apr 2008 19:25:08 -0700 (PDT)
Subject: [Python-ideas] dictionary unpacking
In-Reply-To: <47F0F267.8080005@gmx.net>
References: <47F0F267.8080005@gmx.net>
Message-ID: <74429728-217d-4f92-9d63-274070489253@a1g2000hsb.googlegroups.com>

On 31 Mar, 16:17, Mathias Panzenböck <grosser.meister.mo... at gmx.net>
wrote:
> Maybe dictionary unpacking would be a nice thing?
>
> >>> d = {'foo': 42, 'egg': 23}
> >>> {'foo': bar, 'egg': spam} = d
> >>> print bar, spam
> 42 23
>
> What do you think? Bad idea? Good idea?

I think this is one of the oddest proposals I've seen so far. :)
Actually, I only realized what it was for after seeing Facundo's
list comprehension translation.

From g.brandl at gmx.net  Wed Apr  2 14:21:39 2008
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 02 Apr 2008 14:21:39 +0200
Subject: [Python-ideas] dictionary unpacking
In-Reply-To: <e04bdf310804010613h56d11f19u5253b05602a14778@mail.gmail.com>
References: <47F0F267.8080005@gmx.net>
	<e04bdf310804010613h56d11f19u5253b05602a14778@mail.gmail.com>
Message-ID: <fsvtog$o59$1@ger.gmane.org>

Facundo Batista wrote:
> 2008/3/31, Mathias Panzenböck <grosser.meister.morti-hi6Y0CQ0nG0 at public.gmane.org>:
> 
>> Maybe dictionary unpacking would be a nice thing?
>>
>>   >>> d = {'foo': 42, 'egg': 23}
>>   >>> {'foo': bar, 'egg': spam} = d
>>   >>> print bar, spam
>>  42 23
> 
> Bad idea.
> 
> It uncovers a lot of details for your brain to take care  when doing
> that, for example:
> 
>>>> d = {'foo': 42, 'egg': 23}
>>>> {'not': bar, 'egg': spam} = d
>>>> print bar
> ???
>>>> {'egg': bar, 'egg': spam} = d
>>>> print bar
> ???
> 
> I would consider something like the following:
> 
>   bar, spam = d.multiple("foo", "egg")
> 
> with the semantics of:
> 
>   bar, spam = [d[k] for k in ("foo", "egg")]

We also have

bar, spam = itemgetter("foo", "egg")(d)

if some functional form is preferred.
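
A short self-contained sketch of the itemgetter form, in Python 3 syntax. The import line is the only thing the one-liner above leaves out; the variable `only` is just an illustration name.

```python
from operator import itemgetter

d = {'foo': 42, 'egg': 23}

# With two or more keys, itemgetter returns a tuple of the looked-up values,
# so ordinary tuple unpacking works on the result.
bar, spam = itemgetter('foo', 'egg')(d)

# With a single key it returns the bare value, not a 1-tuple.
only = itemgetter('foo')(d)
```

A missing key raises KeyError, the same as d[k] would.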

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

From grosser.meister.morti at gmx.net  Wed Apr  2 14:38:57 2008
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Wed, 02 Apr 2008 14:38:57 +0200
Subject: [Python-ideas] dictionary unpacking
In-Reply-To: <fsvtog$o59$1@ger.gmane.org>
References: <47F0F267.8080005@gmx.net>	<e04bdf310804010613h56d11f19u5253b05602a14778@mail.gmail.com>
	<fsvtog$o59$1@ger.gmane.org>
Message-ID: <47F37E61.7080509@gmx.net>

Georg Brandl wrote:
 >
 > We also have
 >
 > bar, spam = itemgetter("foo", "egg")(d)
 >
 > if some functional form is preferred.
 >
 > Georg
 >

Oh, we have that? Yes, that's really a nice way of doing it.

	-panzi

From aaron.watters at gmail.com  Wed Apr  2 17:16:50 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Wed, 2 Apr 2008 11:16:50 -0400
Subject: [Python-ideas] dictionary unpacking
Message-ID: <fc13a6500804020816n40bd9493oce9d30128be8aac3@mail.gmail.com>

"Facundo Batista" <facundobatista at gmail.com>
>bar, spam = d.multiple("foo", "egg")

Terry Reedy:
> -1 because the need is way too rare for new syntax support and because
>
> |  bar, spam = [d[k] for k in ("foo", "egg")]
>
> already does what is wanted in easily understood code.

This is true, but with a fast implementation this
operation is incredibly useful.  If we go waaay back
to kjbuckets (the original "set" module circa 1993 or so)
there was

(v1,v2,v3) = D.dump( (k1,k2,k3))

and also the inverse operation

kjDict.undump( (k1,k2,k3), (v1, v2, v3) )

which are extremely useful for packing/unpacking
large data structures.
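
For readers without kjbuckets at hand, here is a rough plain-Python approximation of the two operations as described above. The function names `dump` and `undump` and their exact behaviour are assumptions made for illustration; they are not the kjbuckets API, and they say nothing about its speed.

```python
def dump(d, keys):
    """Approximate D.dump(keys): the values for keys, in the same order."""
    return tuple(d[k] for k in keys)

def undump(keys, values):
    """Approximate the inverse: build a dict from parallel key/value tuples."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    return dict(zip(keys, values))

record = {'k1': 1, 'k2': 2, 'k3': 3}
v1, v2, v3 = dump(record, ('k1', 'k2', 'k3'))
rebuilt = undump(('k1', 'k2', 'k3'), (v1, v2, v3))
```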

Please see
http://gadfly.sourceforge.net/kjbuckets.html

There are lots of other cool operations which
can be very handy -- transpose, difference, closure...

By the way, I've been delighted to see Python
gradually growing features that kjbuckets provided
more than a decade ago.  It's just too bad you
guys can't figure out how to get them all at once
somehow....

   -- Aaron Watters

===
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=buckets

From grosser.meister.morti at gmx.net  Wed Apr  2 19:44:51 2008
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Wed, 02 Apr 2008 19:44:51 +0200
Subject: [Python-ideas] dictionary unpacking
In-Reply-To: <47F37E61.7080509@gmx.net>
References: <47F0F267.8080005@gmx.net>	<e04bdf310804010613h56d11f19u5253b05602a14778@mail.gmail.com>	<fsvtog$o59$1@ger.gmane.org>
	<47F37E61.7080509@gmx.net>
Message-ID: <47F3C613.9050000@gmx.net>

Maybe just extend the functionality of the __getitem__ and __setitem__ methods of dicts?

 >>> class xdict(dict):
	def __getitem__(self,key):
		if type(key) in (tuple, list):
			return (dict.__getitem__(self,k) for k in key)
		else:
			return dict.__getitem__(self,key)

	def __setitem__(self,key,value):
		if type(key) in (tuple, list):
			for k, v in zip(key,value):
				dict.__setitem__(self,k,v)
		else:
			dict.__setitem__(self,key,value)

 >>> d=xdict({'foo':23,'bar':42,'egg':'spam'})
 >>> d
{'bar': 42, 'foo': 23, 'egg': 'spam'}
 >>> d["foo",]
<generator object at 0x9dbc02c>
 >>> list(d["foo",])
[23]
 >>> list(d["foo","bar"])
[23, 42]
 >>> a,b=d['foo','egg']
 >>> a,b
(23, 'spam')
 >>> d['baken','bar'] = 'tomato', 36
 >>> d
{'bar': 36, 'foo': 23, 'egg': 'spam', 'baken': 'tomato'}

Well, this would break current behaviour, so actually no. Not good.
But maybe that way (no conflict because lists are unhashable):

 >>> class xdict(dict):
	def __getitem__(self,key):
		if isinstance(key,list):
			return (dict.__getitem__(self,k) for k in key)
		else:
			return dict.__getitem__(self,key)

	def __setitem__(self,key,value):
		if isinstance(key,list):
			for k, v in zip(key,value):
				dict.__setitem__(self,k,v)
		else:
			dict.__setitem__(self,key,value)

 >>> d=xdict({'foo':23,'bar':42,'egg':'spam'})
 >>> list(d[["foo"]])
[23]
 >>> list(d[["foo","bar"]])
[23, 42]
 >>> a,b=d[['foo','egg']]
 >>> a,b
(23, 'spam')
 >>> d[['baken','bar']] = 'tomato', 36
 >>> d
{'bar': 36, 'foo': 23, 'egg': 'spam', 'baken': 'tomato'}

The [[ ]] looks almost like a special syntax. Good or bad?
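
A Python 3 sketch of the list-keyed variant, under a few assumptions of its own: the class is renamed `XDict`, lookups return a list rather than a generator, and assignment checks that there is one value per key. It also shows why the tuple variant conflicts with existing behaviour: a tuple is already a valid dict key.

```python
class XDict(dict):
    """dict that treats a list key as 'several keys at once' (sketch)."""

    def __getitem__(self, key):
        if isinstance(key, list):
            return [dict.__getitem__(self, k) for k in key]
        return dict.__getitem__(self, key)

    def __setitem__(self, key, value):
        if isinstance(key, list):
            values = list(value)
            if len(values) != len(key):
                raise ValueError("need exactly one value per key")
            for k, v in zip(key, values):
                dict.__setitem__(self, k, v)
        else:
            dict.__setitem__(self, key, value)

d = XDict({'foo': 23, 'bar': 42, 'egg': 'spam'})
a, b = d[['foo', 'egg']]
d[['baken', 'bar']] = 'tomato', 36

# A tuple stays an ordinary key, so this stores one entry under ('foo', 'egg').
d['foo', 'egg'] = 'tuple key'
```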

	-panzi

From matt-python at theory.org  Thu Apr  3 02:49:49 2008
From: matt-python at theory.org (Matt Chisholm)
Date: Wed, 2 Apr 2008 17:49:49 -0700
Subject: [Python-ideas] Lambda again: Anonymous function definition
In-Reply-To: <mailman.47.1207044010.18976.python-ideas@python.org>
References: <mailman.47.1207044010.18976.python-ideas@python.org>
Message-ID: <20080403004949.GC22302@tesla.theory.org>

On Mon, 31 Mar 2008 08:19:35, Eli Courtwright wrote:
>On Mon, Mar 31, 2008 at 3:23 AM, Leszek Dubiel <leszek at dubiel.pl> wrote:
>
>> Lambda should have the same syntax as ordinary functions. The only
>> difference should be: you don't have to put the name of the function.
>>
>> def f (x, y): return x ** 2 + y ** 2
>>
>> g = f
>>
>> h = def (x, y): return x ** 2 + y ** 2
>>
>> Functions f, g and h are doing the same.
>
>
>
>Javascript handles anonymous functions this way as well:
>
>function f(x, y) { return x*x + y*y; }
>
>g = f;
>
>h = function(x, y) { return x*x + y*y; }
>
>With that being said, it makes sense for the return statement to be omitted
>in lambdas (or anonymous defs, as I hope they will eventually be called),
>since those functions are limited to one statement.
>
>- Eli

+1 to handling anonymous functions/lambdas the way JavaScript does.
It's the only thing I like better about JavaScript than Python.

I don't know if I agree about leaving off the return statement, though. 

-matt

From talin at acm.org  Thu Apr  3 03:49:51 2008
From: talin at acm.org (Talin)
Date: Wed, 02 Apr 2008 18:49:51 -0700
Subject: [Python-ideas] Lambda again: Anonymous function definition
In-Reply-To: <20080403004949.GC22302@tesla.theory.org>
References: <mailman.47.1207044010.18976.python-ideas@python.org>
	<20080403004949.GC22302@tesla.theory.org>
Message-ID: <47F437BF.90500@acm.org>

Matt Chisholm wrote:
> +1 to handling anonymous functions/lambdas the way JavaScript does.
> It's the only thing I like better about JavaScript than Python.

Unfortunately, there's no way in Python to have a statement inside of an 
expression (because statements are delimited by line ends and 
indentation). Many people have attempted this, none have succeeded. 
Don't go there, you'll just be opening up old wounds...
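
A small sketch of the restriction, using the built-in compile() in "eval" mode to ask whether a piece of source is a single expression. The helper name `is_expression` is made up for this illustration.

```python
def is_expression(source):
    """Return True if source parses as a single Python expression."""
    try:
        compile(source, "<sketch>", "eval")
    except SyntaxError:
        return False
    return True

# A lambda whose body is an expression is itself an expression.
ok = is_expression("lambda x, y: x * x + y * y")

# Statements (assignment, return) are rejected inside a lambda body.
bad_assign = is_expression("lambda x: y = x")
bad_return = is_expression("lambda x: return x")
```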

-- Talin

From taleinat at gmail.com  Thu Apr  3 15:20:46 2008
From: taleinat at gmail.com (Tal Einat)
Date: Thu, 3 Apr 2008 16:20:46 +0300
Subject: [Python-ideas] Lambda again: Anonymous function definition
In-Reply-To: <47F437BF.90500@acm.org>
References: <mailman.47.1207044010.18976.python-ideas@python.org>
	<20080403004949.GC22302@tesla.theory.org> <47F437BF.90500@acm.org>
Message-ID: <7afdee2f0804030620v293acdc5oa8ee28b46bb074db@mail.gmail.com>

Talin wrote:
> Matt Chisholm wrote:
>  > +1 to handling anonymous functions/lambdas the way JavaScript does.
>  > It's the only thing I like better about JavaScript than Python.
>
>  Unfortunately, there's no way in Python to have a statement inside of an
>  expression (because statements are delimited by line ends and
>  indentation). Many people have attempted this, none have succeeded.
>  Don't go there, you'll just be opening up old wounds...
>
From my experience anonymous functions are very useful when writing
event-based code, such as client-side event handlers for web browsers
with JavaScript, Expect event handlers in Tcl/Perl, and GUI event
handlers in any language. I think having a simple, readable way to
define anonymous functions in Python would allow writing more concise
and readable Python code for such applications, thus making Python an
even better Jack-of-all-trades.

However, even if it were easy to implement with clear, simple syntax,
I'm not sure this would be a good idea; I'd leave this to more
experienced language developers.

- Tal

From eli at courtwright.org  Thu Apr  3 15:57:31 2008
From: eli at courtwright.org (Eli Courtwright)
Date: Thu, 3 Apr 2008 09:57:31 -0400
Subject: [Python-ideas] Lambda again: Anonymous function definition
In-Reply-To: <47F437BF.90500@acm.org>
References: <mailman.47.1207044010.18976.python-ideas@python.org>
	<20080403004949.GC22302@tesla.theory.org> <47F437BF.90500@acm.org>
Message-ID: <3f6c86f50804030657g760d4309i57f4be2de9d4276c@mail.gmail.com>

On Wed, Apr 2, 2008 at 9:49 PM, Talin <talin at acm.org> wrote:

> Matt Chisholm wrote:
> > +1 to handling anonymous functions/lambdas the way JavaScript does.
> > It's the only thing I like better about JavaScript than Python.
>
> Unfortunately, there's no way in Python to have a statement inside of an
> expression (because statements are delimited by line ends and
> indentation). Many people have attempted this, none have succeeded.
> Don't go there, you'll just be opening up old wounds...

Well what's being proposed is for

def f(x, y): return x*x + y*y

to be a statement, but for

def(x,y): x*x + y*y

to be an expression.  In other words, anonymous def will behave exactly like
lambda does now.  This seems doable, though I don't know Python's parsing
code well enough to understand how difficult it would be to implement.
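
For comparison, a sketch of what exists today: the named def statement next to the lambda expression that the anonymous def would mirror. The `points` list and the sort are only there to show the expression form used inline.

```python
def f(x, y):
    return x * x + y * y

# The existing expression form: no name, no return keyword, one expression.
h = lambda x, y: x * x + y * y

# Because a lambda is an expression, it can appear inline, e.g. as a sort key.
points = [(3, 4), (1, 1), (0, 2)]
by_norm = sorted(points, key=lambda p: p[0] ** 2 + p[1] ** 2)
```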

Of course, we all understand that lambda won't change anytime soon; this is
really more of a hypothetical "what if we did this in 10 years if there's
ever a Python 4000" discussion.

- Eli

From arne_bab at web.de  Mon Apr  7 22:49:54 2008
From: arne_bab at web.de (Arne Babenhauserheide)
Date: Mon, 7 Apr 2008 22:49:54 +0200
Subject: [Python-ideas] Fwd: Re:  simpler super() syntax
Message-ID: <200804072249.57454.arne_bab@web.de>

--- this message got only to Guido by error - I only now found out, so I 
forwarded it here. --- 

(a) Great! 

(b) I thought about this a lot, but only two weeks ago did I realize why my idea
was moot. It just took a bit of time.

The reason is simple (though it took me quite some time to find it):

There are times when I might want to put an additional argument between the
*args and the **kwds.

class A(): 
	def __init__(a, b, c=d): 
		pass

class B(A): 
	def __init__(a, c=d): 
		super().__init__(*args, b, **kwds)

So the way Python does it is the simplest possibility that doesn't cripple
the language in the long run.
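
A runnable sketch of that situation, with assumptions spelled out: the methods take an explicit self, B accepts *args and **kwds, and the value 'inserted' is a made-up stand-in for the extra argument. Placing a positional argument after *args in a call needs Python 3.5 or later.

```python
class A:
    def __init__(self, a, b, c=None):
        self.a, self.b, self.c = a, b, c

class B(A):
    def __init__(self, *args, **kwds):
        b = 'inserted'  # hypothetical value supplied by B itself
        # The extra argument goes between the positional and keyword arguments.
        super().__init__(*args, b, **kwds)

obj = B('first', c='last')
```

With implicit pass-through of all arguments there would be no place to slot `b` in, which is the point being made.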

And that's a nice result, too :) 

Thank you for answering, even though my idea was flawed. 

Best wishes, 
Arne

Written on Friday, 22 February 2008, 17:12:31:
> (a) In Py3k, you will be able to use super() itself without args, e.g.
> super().__init__(*args)
>
> (b) There are lots of reasons why you would not want to pass the args
> to your super method *unchanged*. Also, super methods may have
> defaults for all args. So super.__init__() would be ambiguous -- does
> he want to pass all args or none?
>
> Because of this I am strongly against this.
>
> On Fri, Feb 22, 2008 at 12:53 AM, Arne Babenhauserheide <arne_bab at web.de> 
wrote:
> > Hi,
> >
> >  I just spent some time figuring out how and why super needs to be called
> > with *args and **kwds in any class, when I use multiple inheritance (or
> > when some subclass wants to use it), and I got the impression, that
> > simply every class should take *args and **kwds and that super should be
> > called inside the init of every class.
> >
> >  Would it make sense to make the init of any class take *args and **kwds
> >  implicitly?
> >
> >  With that, arguments and keywords would always be passed on (the
> > behaviour we need as soon as we use any multiple inheritance) and the
> > code would look cleaner (I think).
> >
> >
> >  At the moment the code for a class with MI looks like this:
> >
> >  class Blah(Blubb):
> >         def __init__(self, *args, **kwds):
> >                 super(Blah, self).__init__(*args, **kwds)
> >
> >  with implicit *args and **kwds, it would look like this:
> >
> >  class Blah(Blubb):
> >         def __init__(self):
> >                 super(Blah, self).__init__()
> >
> >  And by calling super, I implicitly say that I want to pass on any
> > leftover args or kwds, which (to my knowledge) I must do anyway, since
> > otherwise I am in danger of getting MI bugs.
> >
> >  What do you think?
> >
> >  Best wishes,
> >  Arne
> >  --
> >  Being apolitical
> >  means being political
> >  without noticing it.
> >  - Arne Babenhauserheide ( http://draketo.de )
> >  -- Weblog: http://blog.draketo.de
> >
> >  -- My public key (PGP/GnuPG):
> >  http://draketo.de/inhalt/ich/pubkey.txt
> >
> > _______________________________________________
> >  Python-ideas mailing list
> >  Python-ideas at python.org
> >  http://mail.python.org/mailman/listinfo/python-ideas

-- 
Being apolitical
means being political
without noticing it.
- Arne Babenhauserheide ( http://draketo.de )
-- Weblog: http://blog.draketo.de

-- My public key (PGP/GnuPG):
http://draketo.de/inhalt/ich/pubkey.txt


From artomegus at gmail.com  Wed Apr  9 03:55:17 2008
From: artomegus at gmail.com (Anthony Tolle)
Date: Tue, 8 Apr 2008 21:55:17 -0400
Subject: [Python-ideas] superarg: an alternative to super()
Message-ID: <47857e720804081855o53152c27kf8ce5c510bea70dc@mail.gmail.com>

I was testing the new functionality of super() in Python 3.0a4, and
noted the current "limitations":

------------------------------------------------------------
Test 1 - define method outside class definition

>>> class A:
...  def f(self):
...   return 'A'
...
>>> def B_f(self):
...  return super().f() + 'B'
...
>>> class B(A):
...  f = B_f
...
>>> B().f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in B_f
SystemError: super(): __class__ cell not found

------------------------------------------------------------
Test 2 - Call super() from inner function

>>> class A:
...  def f(self):
...   return 'A'
...
>>> class B(A):
...  def f(self):
...   def inner():
...    return super().f() + 'B'
...   return inner()
...
>>> B().f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in f
  File "<stdin>", line 4, in inner
SystemError: super(): no arguments

------------------------------------------------------------
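
For reference, a sketch of the usual workaround for both cases: the explicit two-argument form super(cls, obj), which does not depend on the __class__ cell or on the enclosing function's first argument. The class C is added here for the inner-function case and is not part of the tests above.

```python
class A:
    def f(self):
        return 'A'

# Case 1: a method defined outside the class body names the class explicitly.
def B_f(self):
    return super(B, self).f() + 'B'

class B(A):
    f = B_f

# Case 2: an inner function passes the class and the instance explicitly.
class C(A):
    def f(self):
        def inner():
            return super(C, self).f() + 'C'
        return inner()

result_b = B().f()
result_c = C().f()
```

The cost is that the class name is repeated inside the method, which is what the zero-argument form was meant to avoid.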

Not satisfied, I started work on another way to simplify using
super().  I call it superarg, and it doesn't have any of the above
limitations.  It only takes twenty-six lines to implement in Python,
and doesn't rely on bytecode hacks or stack frame inspection.

What superarg does do is provide a decorator that inserts the super
object into the method's argument list, before the self argument.  In
the following Python-only implementation, correct behavior does depend
on using a special metaclass.

Here it is, including a test suite (tested in Python 3.0a4):

------------------------------------------------------------
# superarg.py

from types import MethodType

class superarg_method():
    def __init__(self, callable, cls=None):
        self.callable = callable
        self.cls = cls
    def __get__(self, obj, objtype=None):
        return self if obj is None else MethodType(self, obj)
    def __call__(self, obj, *args, **kwargs):
        return self.callable(super(self.cls, obj), obj, *args, **kwargs)
    def withclass(self, cls):
        return self.__class__(self.callable, cls)

class superarg_classmethod(superarg_method):
    def __get__(self, obj, objtype=None):
        return MethodType(self, type(obj) if objtype is None else objtype)

class superarg_meta(type):
    def __init__(cls, name, bases, dict):
        for name, value in dict.items():
            if isinstance(value, superarg_method):
                type.__setattr__(cls, name, value.withclass(cls))
    def __setattr__(cls, name, value):
        if isinstance(value, superarg_method):
            value = value.withclass(cls)
        type.__setattr__(cls, name, value)

# implementation ends here - test suite follows:

if __name__ == '__main__':
    class A(metaclass=superarg_meta):
        def f(self):
            return 'A(%s)' % (self.name,)
        @classmethod
        def cm(cls):
            return 'A(%s)' % (cls.__name__,)

    # Standard use
    class B(A):
        @superarg_method
        def f(super, self):
            return 'B' + super.f()
        @superarg_classmethod
        def cm(super, cls):
            return 'B' + super.cm()

    # Reference super in inner function
    class C(A):
        @superarg_method
        def f(super, self):
            def inner():
                return 'C' + super.f()
            return inner()
        @superarg_classmethod
        def cm(super, cls):
            def inner():
                return 'C' + super.cm()
            return inner()

    # Define functions before class definition
    @superarg_method
    def D_f(super, self):
        return 'D' + super.f()

    @superarg_classmethod
    def D_cm(super, cls):
        return 'D' + super.cm()

    class D(B, C):
        def __init__(self, name):
            self.name = name
        f = D_f
        cm = D_cm

    # Define functions after class definition
    class E(C, B):
        def __init__(self, name):
            self.name = name

    @superarg_method
    def E_f(super, self):
        return 'E' + super.f()

    @superarg_classmethod
    def E_cm(super, cls):
        return 'E' + super.cm()

    E.f = E_f
    E.cm = E_cm

    # Test D
    d = D('d')
    assert d.f()  == 'DBCA(d)' # Normal method, instance binding
    assert D.f(d) == 'DBCA(d)' # Normal method, class binding
    assert d.cm() == 'DBCA(D)' # Class method, instance binding
    assert D.cm() == 'DBCA(D)' # Class method, class binding

    # Test E
    e = E('e')
    assert e.f()  == 'ECBA(e)'
    assert E.f(e) == 'ECBA(e)'
    assert e.cm() == 'ECBA(E)'
    assert E.cm() == 'ECBA(E)'

    # Test using D's methods in E
    E.cm = D_cm
    E.f = D_f
    assert e.f()  == 'DCBA(e)'
    assert E.f(e) == 'DCBA(e)'
    assert e.cm() == 'DCBA(E)'
    assert E.cm() == 'DCBA(E)'

    # Why not use E.cm = D.cm?  Because D.cm returns a bound method
    # (this would also be the case without using superarg).  Instead,
    # one would need to use E.cm = D.__dict__['cm']

------------------------------------------------------------

The decorators only need to be applied where necessary.  If a method
doesn't need the super object, then it doesn't need the decorator.

I also tested implementing the decorators in C (as built-in types),
which works even better because I was able to put the class fix-ups in
type_new and type_setattro.  Thus, the special metaclass is no longer
needed.

The only times it wouldn't work right is if the end-user implemented a
metaclass that never calls type's __new__ or __setattr__ methods.
Avoiding type.__new__ should be fairly rare, but I'm not sure about
__setattr__.  I could probably extend the types so users can do their
own fix-ups if need be.  However, if a user is customizing a class
that much, then I wonder if they would make use of superarg anyway.

Whether or not there is any perceived value in adding this to a future
version of Python, perhaps someone will find it useful enough to add
to one of their own projects (and back-porting it to Python 2.x should
be fairly simple).

--
Anthony Tolle

From guido at python.org  Wed Apr  9 04:00:56 2008
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 Apr 2008 19:00:56 -0700
Subject: [Python-ideas] superarg: an alternative to super()
In-Reply-To: <47857e720804081855o53152c27kf8ce5c510bea70dc@mail.gmail.com>
References: <47857e720804081855o53152c27kf8ce5c510bea70dc@mail.gmail.com>
Message-ID: <ca471dc20804081900r1cc9fc64p61b9b10870b71923@mail.gmail.com>

Sorry, not interested.

On Tue, Apr 8, 2008 at 6:55 PM, Anthony Tolle <artomegus at gmail.com> wrote:
> I was testing the new functionality of super() in Python 3.0a4, and
>  noted the current "limitations":
>
>  ------------------------------------------------------------
>  Test 1 - define method outside class definition
>
>  >>> class A:
>  ...  def f(self):
>  ...   return 'A'
>  ...
>  >>> def B_f(self):
>  ...  return super().f() + 'B'
>  ...
>  >>> class B(A):
>  ...  f = B_f
>  ...
>  >>> B().f()
>  Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 2, in B_f
>  SystemError: super(): __class__ cell not found
>
>  ------------------------------------------------------------
>  Test 2 - Call super() from inner function
>
>  >>> class A:
>  ...  def f(self):
>  ...   return 'A'
>  ...
>  >>> class B(A):
>  ...  def f(self):
>  ...   def inner():
>  ...    return super().f() + 'B'
>  ...   return inner()
>  ...
>  >>> B().f()
>  Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 5, in f
>   File "<stdin>", line 4, in inner
>  SystemError: super(): no arguments
>
>  ------------------------------------------------------------
>
>  Not satisfied, I started work on another way to simplify using
>  super().  I call it superarg, and it doesn't have any of the above
>  limitations.  It only takes twenty-six lines to implement in Python,
>  and doesn't rely on bytecode hacks or stack frame inspection.
>
>  What superarg does do is provide a decorator that inserts the super
>  object into the method's argument list, before the self argument.  In
>  the following Python-only implementation, correct behavior does depend
>  on using a special metaclass.
>
>  Here it is, including a test suite (tested in Python 3.0a4):
>
>  ------------------------------------------------------------
>  # superarg.py
>
>  from types import MethodType
>
>  class superarg_method():
>     def __init__(self, callable, cls=None):
>         self.callable = callable
>         self.cls = cls
>     def __get__(self, obj, objtype=None):
>         return self if obj is None else MethodType(self, obj)
>     def __call__(self, obj, *args, **kwargs):
>         return self.callable(super(self.cls, obj), obj, *args, **kwargs)
>     def withclass(self, cls):
>         return self.__class__(self.callable, cls)
>
>  class superarg_classmethod(superarg_method):
>     def __get__(self, obj, objtype=None):
>         return MethodType(self, type(obj) if objtype is None else objtype)
>
>  class superarg_meta(type):
>     def __init__(cls, name, bases, dict):
>         for name, value in dict.items():
>             if isinstance(value, superarg_method):
>                 type.__setattr__(cls, name, value.withclass(cls))
>     def __setattr__(cls, name, value):
>         if isinstance(value, superarg_method):
>             value = value.withclass(cls)
>         type.__setattr__(cls, name, value)
>
>  # implementation ends here - test suite follows:
>
>  if __name__ == '__main__':
>     class A(metaclass=superarg_meta):
>         def f(self):
>             return 'A(%s)' % (self.name,)
>         @classmethod
>         def cm(cls):
>             return 'A(%s)' % (cls.__name__,)
>
>     # Standard use
>     class B(A):
>         @superarg_method
>         def f(super, self):
>             return 'B' + super.f()
>         @superarg_classmethod
>         def cm(super, cls):
>             return 'B' + super.cm()
>
>     # Reference super in inner function
>     class C(A):
>         @superarg_method
>         def f(super, self):
>             def inner():
>                 return 'C' + super.f()
>             return inner()
>         @superarg_classmethod
>         def cm(super, cls):
>             def inner():
>                 return 'C' + super.cm()
>             return inner()
>
>     # Define functions before class definition
>     @superarg_method
>     def D_f(super, self):
>         return 'D' + super.f()
>
>     @superarg_classmethod
>     def D_cm(super, cls):
>         return 'D' + super.cm()
>
>     class D(B, C):
>         def __init__(self, name):
>             self.name = name
>         f = D_f
>         cm = D_cm
>
>     # Define functions after class definition
>     class E(C, B):
>         def __init__(self, name):
>             self.name = name
>
>     @superarg_method
>     def E_f(super, self):
>         return 'E' + super.f()
>
>     @superarg_classmethod
>     def E_cm(super, cls):
>         return 'E' + super.cm()
>
>     E.f = E_f
>     E.cm = E_cm
>
>     # Test D
>     d = D('d')
>     assert d.f()  == 'DBCA(d)' # Normal method, instance binding
>     assert D.f(d) == 'DBCA(d)' # Normal method, class binding
>     assert d.cm() == 'DBCA(D)' # Class method, instance binding
>     assert D.cm() == 'DBCA(D)' # Class method, class binding
>
>     # Test E
>     e = E('e')
>     assert e.f()  == 'ECBA(e)'
>     assert E.f(e) == 'ECBA(e)'
>     assert e.cm() == 'ECBA(E)'
>     assert E.cm() == 'ECBA(E)'
>
>     # Test using D's methods in E
>     E.cm = D_cm
>     E.f = D_f
>     assert e.f()  == 'DCBA(e)'
>     assert E.f(e) == 'DCBA(e)'
>     assert e.cm() == 'DCBA(E)'
>     assert E.cm() == 'DCBA(E)'
>
>     # Why not use E.cm = D.cm?  Because D.cm returns a bound method
>     # (this would also be the case without using superarg).  Instead,
>     # one would need to use E.cm = D.__dict__['cm']
>
>  ------------------------------------------------------------
>
>  The decorators only need to be applied where necessary.  If a method
>  doesn't need the super object, then it doesn't need the decorator.
>
>  I also tested implementing the decorators in C (as built-in types),
>  which works even better because I was able to put the class fix-ups in
>  type_new and type_setattro.  Thus, the special metaclass is no longer
>  needed.
>
>  The only times it wouldn't work right is if the end-user implemented a
>  metaclass that never calls type's __new__ or __setattr__ methods.
>  Avoiding type.__new__ should be fairly rare, but I'm not sure about
>  __setattr__.  I could probably extend the types so users can do their
>  own fix-ups if need be.  However, if a user is customizing a class
>  that much, then I wonder if they would make use of superarg anyway.
>
>  Whether or not there is any perceived value in adding this to a future
>  version of Python, perhaps someone will find it useful enough to add
>  to one of their own projects (and back-porting it to Python 2.x should
>  be fairly simple).
>
>  --
>  Anthony Tolle

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From artomegus at gmail.com  Wed Apr  9 04:34:00 2008
From: artomegus at gmail.com (Anthony Tolle)
Date: Tue, 8 Apr 2008 22:34:00 -0400
Subject: [Python-ideas] superarg: an alternative to super()
In-Reply-To: <ca471dc20804081900r1cc9fc64p61b9b10870b71923@mail.gmail.com>
References: <47857e720804081855o53152c27kf8ce5c510bea70dc@mail.gmail.com>
	<ca471dc20804081900r1cc9fc64p61b9b10870b71923@mail.gmail.com>
Message-ID: <47857e720804081934w6f175456x8c5fdb197bb272ae@mail.gmail.com>

That sure was fast.  I guess I should be glad it even got noticed :)

On Tue, Apr 8, 2008 at 10:00:56 PM EST, Guido van Rossum
<guido at python.org> wrote:
> Sorry, not interested.
>
>  On Tue, Apr 8, 2008, at 9:55:17 PM EST, Anthony Tolle <artomegus at gmail.com> wrote:
>  > (incendiary idea that lights up the sky as it plummets to the earth below)

*sings: When You Wish Upon a Star* :)

--
Anthony Tolle

From aaron.watters at gmail.com  Thu Apr 10 16:13:57 2008
From: aaron.watters at gmail.com (Aaron Watters)
Date: Thu, 10 Apr 2008 10:13:57 -0400
Subject: [Python-ideas] stealing "var" from javascript
Message-ID: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>

I know it's too late but as long as
py3k will break everything, why not
add explicit introduction of local variables?

This is one place where Perl actually
gets something right with "use strict;" and
"my variable=...;"

In particular there is a big conceptual difference
between creating a new variable and changing
a binding to an existing variable.

You could make this difference explicit by saying

   # create local myvariable (error if exists) and assign value
   var myvariable=value;

   # rebind local myvariable (error if doesn't exist)
   myvariable = value;

This would allow catching a lot of nameerrors at
compile time, and also name mistakes which don't
show up as nameerrors (like accidentally typing a
variable name wrong in such a way that it doesn't
fail the test cases, but only fails after deployment...).
Any assignment with no associated "my" declaration
would automatically be an error at compile time.

As I say, I know it's too late.  I also know it's probably
not a new idea, but I couldn't locate it in a brief attempt.
I also know that various python lint tools catch most name
problems like the above...

hmmm... I actually wouldn't mind the "my" keyword
instead of "var" since it is shorter and probably less
frequently used in existing code... (btw, "yield" broke
every existing financial application :( ).

Just a thought.
   -- Aaron Watters

From taleinat at gmail.com  Thu Apr 10 17:23:58 2008
From: taleinat at gmail.com (Tal Einat)
Date: Thu, 10 Apr 2008 18:23:58 +0300
Subject: [Python-ideas] stealing "var" from javascript
In-Reply-To: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>
References: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>
Message-ID: <7afdee2f0804100823i285a758sa86a6f48a56fe36e@mail.gmail.com>

Aaron Watters wrote:
> I know it's too late but as long as
> py3k will break everything, why not
> add explicit introduction of local variables?
>
> This is one place where Perl actually
> gets something right with "use strict;" and
>  "my variable=...;"
>
> In particular there is a big conceptual difference
> between creating a new variable and changing
> a binding to an existing variable.
>
> You could make this difference explicit by saying
>
>    # create local myvariable (error if exists) and assign value
>    var myvariable=value;
>
>    # rebind local myvariable (error if doesn't exist)
>    myvariable = value;
>
> This would allow catching a lot of nameerrors at
>  compile time, and also name mistakes which don't
> show up as nameerrors (like accidentally typing a
> variable name wrong in such a way that it doesn't
> fail the test cases, but only fails after deployment...).
>  Any assignment with no associated "my" declaration
> would automatically be an error at compile time.
>
[snip]

I've worked with Perl more than I would have liked to, and IMO this is
useful in Perl mainly because Perl code tends to grow into long
functions and/or a lot of top-level code; better management of
variables is essential simply because one needs more variables in a
single scope to get things done. I was repeatedly surprised to find
myself writing very long functions in Perl, unlike my style in every
other language, and "use strict" really did catch such errors of mine
many times.

I've also recently worked a lot with JavaScript and the "var" thing is
HORRIBLE!!!

Ok, so after venting some steam, allow me to explain: JavaScript uses
"var" for local variable declaration, and it has closures, and it has
anonymous functions. Also, note that JavaScript's closures hold
variables "by reference" rather than "by value". The result is that if
you ever forget to write "var" in a function, you're very likely to
end up overriding a variable from an outer scope, or various other
problems. Such errors can be extremely hard to debug!

Python's current scoping rules are confusing as well. But the lack of
anonymous functions means you're more aware of the current scope (and
outer scopes). And the "freezing" of variable values in closures makes
using closures MUCH simpler and less bug-prone. Empirically, I don't
think I've ever been bitten by this in Python, even though I've
defined a lot of functions inside functions (inside functions...).

- Tal

From grosser.meister.morti at gmx.net  Thu Apr 10 17:47:50 2008
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Thu, 10 Apr 2008 17:47:50 +0200
Subject: [Python-ideas] stealing "var" from javascript
In-Reply-To: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>
References: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>
Message-ID: <47FE36A6.2080500@gmx.net>

It would also help with local functions!

def foo():
    var x = 1

    def bar(y):
       x += y * 2

    bar(55)

From idadesub at users.sourceforge.net  Thu Apr 10 17:52:33 2008
From: idadesub at users.sourceforge.net (Erick Tryzelaar)
Date: Thu, 10 Apr 2008 08:52:33 -0700
Subject: [Python-ideas] stealing "var" from javascript
In-Reply-To: <47FE36A6.2080500@gmx.net>
References: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>
	<47FE36A6.2080500@gmx.net>
Message-ID: <1ef034530804100852p379d7af5y93b479b29ea6b9ff@mail.gmail.com>

On Thu, Apr 10, 2008 at 8:47 AM, Mathias Panzenböck
<grosser.meister.morti at gmx.net> wrote:
> It would also help with local functions!
>
>  def foo():
>     var x = 1
>
>     def bar(y):
>        x += y * 2
>
>     bar(55)

They've added something to py3k to handle this:

def foo():
  x = 1

  def bar(y):
    nonlocal x
    x += y * 2

  bar(55)
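
For reference, here is a runnable Python 3 version of the example above. The
return statement and the foo_without_nonlocal variant are additions for
illustration only; they are not part of the original post.

```python
def foo():
    x = 1

    def bar(y):
        nonlocal x          # rebind foo's x rather than create a new local
        x += y * 2

    bar(55)
    return x


def foo_without_nonlocal():
    x = 1

    def bar(y):
        x += y * 2          # x is treated as local to bar: UnboundLocalError

    bar(55)
    return x


print(foo())                # 1 + 55 * 2 == 111
try:
    foo_without_nonlocal()
except UnboundLocalError as exc:
    print("without nonlocal:", exc)
```

Without the declaration, the augmented assignment makes x a local of bar, so
reading it before it is bound fails at call time rather than at compile time.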

From rhamph at gmail.com  Thu Apr 10 19:23:00 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 10 Apr 2008 11:23:00 -0600
Subject: [Python-ideas] stealing "var" from javascript
In-Reply-To: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>
References: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>
Message-ID: <aac2c7cb0804101023l17541404hb998b46546cccaaf@mail.gmail.com>

On Thu, Apr 10, 2008 at 8:13 AM, Aaron Watters <aaron.watters at gmail.com> wrote:
> This would allow catching a lot of nameerrors at
>  compile time, and also name mistakes which don't
> show up as nameerrors (like accidentally typing a
> variable name wrong in such a way that it doesn't
> fail the test cases, but only fails after deployment...).
>  Any assignment with no associated "my" declaration
> would automatically be an error at compile time.

If they don't support it already, the various lint tools could
probably be extended to catch this.  Just find out how similar the
variables used in a function are, especially ones assigned only once
and never used.

-- 
Adam Olsen, aka Rhamphoryncus

From tjreedy at udel.edu  Thu Apr 10 22:18:39 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 10 Apr 2008 16:18:39 -0400
Subject: [Python-ideas] superarg: an alternative to super()
References: <47857e720804081855o53152c27kf8ce5c510bea70dc@mail.gmail.com>
Message-ID: <ftlsmr$gvi$1@ger.gmane.org>

"Anthony Tolle" <artomegus at gmail.com> wrote 
in message 
news:47857e720804081855o53152c27kf8ce5c510bea70dc at mail.gmail.com...
|I was testing the new functionality of super() in Python 3.0a4, and
| noted the current "limitations":

| Whether or not there is any perceived value in adding this to a future
| version of Python, perhaps someone will find it useful enough to add
| to one of their own projects (and back-porting it to Python 2.x should
| be fairly simple).

I think the Python Cookbook at O'Reilly might be a good place for this.

From ntoronto at cs.byu.edu  Fri Apr 11 09:43:55 2008
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Fri, 11 Apr 2008 00:43:55 -0700
Subject: [Python-ideas] stealing "var" from javascript
In-Reply-To: <1ef034530804100852p379d7af5y93b479b29ea6b9ff@mail.gmail.com>
References: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>	<47FE36A6.2080500@gmx.net>
	<1ef034530804100852p379d7af5y93b479b29ea6b9ff@mail.gmail.com>
Message-ID: <47FF16BB.8020807@cs.byu.edu>

Erick Tryzelaar wrote:
> On Thu, Apr 10, 2008 at 8:47 AM, Mathias Panzenböck
> <grosser.meister.morti at gmx.net> wrote:
>> It would also help with local functions!
>>
>>  def foo():
>>     var x = 1
>>
>>     def bar(y):
>>        x += y * 2
>>
>>     bar(55)
> 
> They've added something to py3k to handle this:
> 
> def foo():
>   x = 1
> 
>   def bar(y):
>     nonlocal x
>     x += y * 2
> 
>   bar(55)

Indeed. Funny enough, it was a similar plea for a variable declaration 
keyword (mine, in fact) that was the final straw. Thus, after a bit of 
bike shedding and collision checking on the keyword spelling, "nonlocal" 
was born...

Since the foremost use case has been handled, I expect a "var" proposal 
to go nowhere. The rationale, IIRC, is rather compression-oriented: in a 
language where only functions define inner scopes, it's far more likely 
that an assignment is meant to refer to the current scope than an outer 
scope. A declaration keyword is provided for the unlikely case, leaving 
the likely case declaration-free.

In general, the more compressed your code is, the harder it is to detect 
encoding errors. In this case, Python's designers decided on compression 
over easy error detection.

Neil

From grosser.meister.morti at gmx.net  Sat Apr 12 22:11:16 2008
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Sat, 12 Apr 2008 22:11:16 +0200
Subject: [Python-ideas] adopt an enum type for the standard library?
In-Reply-To: <200801240824.29041.mark@qtrac.eu>
References: <200801230859.54917.mark@qtrac.eu>	<200801231512.41281.mark@qtrac.eu>	<7afdee2f0801231114t1b4f5464m6f02f7fc90efa37d@mail.gmail.com>
	<200801240824.29041.mark@qtrac.eu>
Message-ID: <48011764.1080605@gmx.net>

Mark Summerfield wrote:
> 
> This is the kind of thing I'm looking for:
> 
>     flags = Enum("OK", "ERROR", "OTHER") # defaults to sequential ints
>     flags.OK == 0
>     flags.ERROR == 1
>     flags.OTHER == 2
>     flags.OK = 5 # exception raised
>     flags.FOO # exception raised
>     str(flags.OK) == "OK"
> 

I once wrote this:
http://twoday.tuwien.ac.at/pub/files/enum

I don't know how good this is, though.

	-panzi

From ggpolo at gmail.com  Sat Apr 12 22:30:40 2008
From: ggpolo at gmail.com (Guilherme Polo)
Date: Sat, 12 Apr 2008 17:30:40 -0300
Subject: [Python-ideas] adopt an enum type for the standard library?
In-Reply-To: <48011764.1080605@gmx.net>
References: <200801230859.54917.mark@qtrac.eu>
	<200801231512.41281.mark@qtrac.eu>
	<7afdee2f0801231114t1b4f5464m6f02f7fc90efa37d@mail.gmail.com>
	<200801240824.29041.mark@qtrac.eu> <48011764.1080605@gmx.net>
Message-ID: <ac2200130804121330l1fa05a43k30b049cd7739fbd6@mail.gmail.com>

2008/4/12, Mathias Panzenböck <grosser.meister.morti at gmx.net>:
> Mark Summerfield wrote:
>  >
>  > This is the kind of thing I'm looking for:
>  >
>  >     flags = Enum("OK", "ERROR", "OTHER") # defaults to sequential ints
>  >     flags.OK == 0
>  >     flags.ERROR == 1
>  >     flags.OTHER == 2
>  >     flags.OK = 5 # exception raised
>  >     flags.FOO # exception raised
>  >     str(flags.OK) == "OK"
>  >
>
>  I once wrote this:
>  http://twoday.tuwien.ac.at/pub/files/enum
>

Also:

http://norvig.com/python-iaq.html
http://www.python.org/doc/essays/metaclasses/Enum.py
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/67107
http://www.faqts.com/knowledge_base/view.phtml/aid/4415

>  I don't know how good this is, though.
>
>         -panzi

-- 
-- Guilherme H. Polo Goncalves

From python at rcn.com  Sat Apr 12 23:23:24 2008
From: python at rcn.com (Raymond Hettinger)
Date: Sat, 12 Apr 2008 14:23:24 -0700
Subject: [Python-ideas] adopt an enum type for the standard library?
References: <200801230859.54917.mark@qtrac.eu>	<200801231512.41281.mark@qtrac.eu>	<7afdee2f0801231114t1b4f5464m6f02f7fc90efa37d@mail.gmail.com><200801240824.29041.mark@qtrac.eu>
	<48011764.1080605@gmx.net>
Message-ID: <008201c89ce3$71df54b0$6800a8c0@RaymondLaptop1>

> http://twoday.tuwien.ac.at/pub/files/enum
>
> I don't know how good this is, though.

It's not needed.  The namedtuple() function already makes enums trivially simple for anyone who wants to go down this path:

  >>> from collections import namedtuple
  >>> flags = namedtuple('flags', 'OK ERROR OTHER')(*range(3))
  >>> flags
  flags(OK=0, ERROR=1, OTHER=2)
  >>> flags.OTHER
  2

I'm strongly against adding any variant of enum() to the standard library.  It distracts users from existing, better approaches like 
module level constants, classes, dicts, and function attributes. While enums make some sense in compiled languages, there is almost 
zero need for them in Python.  Module level constants are often the way to go -- they look cleaner and they save an unnecessary 
level of indirection (that is cost-free in a compiled language but expensive in Python):

   re.IGNORECASE   is better than   re.flags.IGNORECASE
   decimal.Overflow   is better than   decimal.exceptions.Overflow
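
As a rough check of how the namedtuple approach lines up with the behaviour
requested earlier in the thread, the sketch below tries the same operations.
The class name Flags and the helper raises_attribute_error are made up for
this illustration.

```python
from collections import namedtuple

Flags = namedtuple('Flags', 'OK ERROR OTHER')
flags = Flags(*range(3))          # sequential ints: OK=0, ERROR=1, OTHER=2


def raises_attribute_error(action):
    """Return True if calling action() raises AttributeError."""
    try:
        action()
    except AttributeError:
        return True
    return False


def rebind_ok():
    flags.OK = 5                  # namedtuple fields are read-only


def read_missing():
    return flags.FOO              # no such field


print(flags.OK, flags.ERROR, flags.OTHER)
print(raises_attribute_error(rebind_ok))
print(raises_attribute_error(read_missing))
print(str(flags.OK), Flags._fields)
```

Rebinding a field and reading an unknown name both raise AttributeError. The
values are plain ints, though, so str(flags.OK) gives "0" rather than "OK";
the names are only available through Flags._fields.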

Raymond

From jimjjewett at gmail.com  Mon Apr 14 02:04:32 2008
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sun, 13 Apr 2008 20:04:32 -0400
Subject: [Python-ideas] stealing "var" from javascript
In-Reply-To: <47FF16BB.8020807@cs.byu.edu>
References: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>
	<47FE36A6.2080500@gmx.net>
	<1ef034530804100852p379d7af5y93b479b29ea6b9ff@mail.gmail.com>
	<47FF16BB.8020807@cs.byu.edu>
Message-ID: <fb6fbf560804131704j1dd5073aq1ada3ccce72b493c@mail.gmail.com>

On 4/11/08, Neil Toronto <ntoronto at cs.byu.edu> wrote:

>  In general, the more compressed your code is, the harder it is to detect
>  encoding errors. In this case, Python's designers decided on compression
>  over easy error detection.

My experience is more the opposite.  Yes, I have seen bugs from
too-short identifiers, or wrongly assumed defaults.

But I have seen far more errors from simple mistakes that would have
been avoided if two parts of the code were visible at the same time.
And I have seen plenty of bugs from things that people skipped over
reading because they were skimming past boilerplate.  And, of course,
there are tons of (admittedly shallow) bugs that wouldn't be bugs in
the first place if you didn't have to specify everything.  (e.g.,
things fixed by explicit casts.)

"var" by itself won't shove much off the screen.  But changing from:

"""
def fn(a=4):
    x=5
    ...
"""
to:
"""
def fn(a=4):

    var x
    var y
    var z

    x=5
    ...
"""

is a big step in that direction.

-jJ

From arne_bab at web.de  Mon Apr 14 21:59:10 2008
From: arne_bab at web.de (Arne Babenhauserheide)
Date: Mon, 14 Apr 2008 21:59:10 +0200
Subject: [Python-ideas] stealing "var" from javascript
In-Reply-To: <fb6fbf560804131704j1dd5073aq1ada3ccce72b493c@mail.gmail.com>
References: <fc13a6500804100713v569cc8al64b785f71e196a0@mail.gmail.com>
	<47FF16BB.8020807@cs.byu.edu>
	<fb6fbf560804131704j1dd5073aq1ada3ccce72b493c@mail.gmail.com>
Message-ID: <200804142159.14091.arne_bab@web.de>

On Monday, 14 April 2008 02:04:32 Jim Jewett wrote:
> "var" by itself won't shove much off the screen.  But changing from:
>
> """
> def fn(a=4):
>     x=5
>     ...
> """
> to:
> """
> def fn(a=4):
>
>     var x
>     var y
>     var z
>
>     x=5
>     ...
> """

This looks kinda bloated to me. 

If I just want to do

def f(b=2): 
  # First assign x  
  x=2*b

  # Now define function we'll use again later
  def fn(a=4): 
      x=5
      return a+x

  # and use x again. 
  y = fn(x)

Your idea would needlessly fill up lines here, because I'd have to declare x
as local in the inner function (as I understand it).

"When any function is defined, its variables are local" is a nice and simple
rule for me.

(I'm sure there are some cases I miss, though... )

Best wishes, 
Arne
-- 
To be unpolitical
means to be political
without noticing it.
- Arne Babenhauserheide ( http://draketo.de )
-- Weblog: http://blog.draketo.de

-- My public key (PGP/GnuPG):
http://draketo.de/inhalt/ich/pubkey.txt

From ntoronto at cs.byu.edu  Tue Apr 15 07:24:26 2008
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Mon, 14 Apr 2008 22:24:26 -0700
Subject: [Python-ideas] Associative arrays in hardware
Message-ID: <48043C0A.3090809@cs.byu.edu>

Full-on associative memory wouldn't be necessary. If there were a 
machine-level instruction in current chips that did a dict-style lookup 
almost as fast as a base+index*size, Python could *scream*. It might 
work like 80x86's lea (load effective address):

     hashaddr   base, recsize, dictsize, value, destreg

It would be very CISCy, and there are a whole host of practical problems 
dealing with hashing functions, good collision resolution, etc. But any 
pointer/index-based (e.g. interned string) lookup could go almost as 
fast as C++'s "this->". Mainstream is moving toward dynamic languages, 
and it makes sense to support that in hardware.

Heck, I hear routers and switches already have things like this. Does 
anybody have the ear of an Intel chip designer?

Neil

From ironfroggy at socialserve.com  Tue Apr 15 15:08:49 2008
From: ironfroggy at socialserve.com (Calvin Spealman)
Date: Tue, 15 Apr 2008 09:08:49 -0400
Subject: [Python-ideas] Associative arrays in hardware
In-Reply-To: <48043C0A.3090809@cs.byu.edu>
References: <48043C0A.3090809@cs.byu.edu>
Message-ID: <B94D0046-10BD-4BFC-8CC2-A564FDD78900@socialserve.com>

Given how modern hardware is implemented, with a large share of its
instructions being translated into even lower-level instructions, I
don't know that there would actually be a huge benefit. Just because there
is an instruction for something doesn't mean it's absolutely blazing.
Most likely this instruction would take several cycles to execute,
just like the equivalent instructions currently implementing a dict
lookup. This also can't handle the fact that most (useful) values in
a dictionary are not atomic (they are not a single int, float, etc.)
and require custom __hash__ functions. How do you propose a hardware
solution would clear those hurdles?

On Apr 15, 2008, at 1:24 AM, Neil Toronto wrote:

> Full-on associative memory wouldn't be necessary. If there were a
> machine-level instruction in current chips that did a dict-style lookup
> almost as fast as a base+index*size, Python could *scream*. It might
> work like 80x86's lea (load effective address):
>
>      hashaddr   base, recsize, dictsize, value, destreg
>
> It would be very CISCy, and there are a whole host of practical problems
> dealing with hashing functions, good collision resolution, etc. But any
> pointer/index-based (e.g. interned string) lookup could go almost as
> fast as C++'s "this->". Mainstream is moving toward dynamic languages,
> and it makes sense to support that in hardware.
>
> Heck, I hear routers and switches already have things like this. Does
> anybody have the ear of an Intel chip designer?
>
> Neil

From greg.ewing at canterbury.ac.nz  Wed Apr 16 05:00:58 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 16 Apr 2008 15:00:58 +1200
Subject: [Python-ideas] Associative arrays in hardware
In-Reply-To: <48043C0A.3090809@cs.byu.edu>
References: <48043C0A.3090809@cs.byu.edu>
Message-ID: <48056BEA.3030008@canterbury.ac.nz>

Neil Toronto wrote:
> Full-on associative memory wouldn't be necessary. If there were a 
> machine-level instruction in current chips that did a dict-style lookup 
> almost as fast as a base+index*size,

But how would you make it that fast without some kind of
associative lookup hardware?

Without that, you'd just be implementing the Python dict
lookup algorithm in microcode, which might be a bit
faster, but not spectacularly so -- the same number of
memory accesses would still be needed.
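
To make the "same number of memory accesses" point concrete, here is a
simplified sketch of an open-addressing lookup that counts how many table
slots it touches. It uses linear probing for brevity; CPython's dict uses a
different, perturbation-based probe sequence, and the names build_table and
lookup are invented for this illustration.

```python
def build_table(items, size=8):
    """Store (key, value) pairs in a fixed-size open-addressing table.

    The table must have more slots than items, or insertion never ends.
    """
    table = [None] * size
    for key, value in items:
        i = hash(key) % size
        while table[i] is not None and table[i][0] != key:
            i = (i + 1) % size            # linear probing: try the next slot
        table[i] = (key, value)
    return table


def lookup(table, key):
    """Return (value, probes), where probes is the number of slots read."""
    size = len(table)
    i = hash(key) % size
    for probes in range(1, size + 1):
        entry = table[i]                  # one memory access per probe
        if entry is None:
            raise KeyError(key)
        if entry[0] == key:
            return entry[1], probes
        i = (i + 1) % size
    raise KeyError(key)


# Small ints hash to themselves, so 1 and 9 collide in a table of 8 slots.
table = build_table([(1, 'a'), (9, 'b')])
print(lookup(table, 1))
print(lookup(table, 9))
```

Each probe is a dependent read from the table, so a colliding key costs an
extra access however the loop itself is implemented.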

-- 
Greg

From ntoronto at cs.byu.edu  Wed Apr 16 08:47:45 2008
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Tue, 15 Apr 2008 23:47:45 -0700
Subject: [Python-ideas] Associative arrays in hardware
In-Reply-To: <48052CF7.70503@gmx.net>
References: <48043C0A.3090809@cs.byu.edu> <48052CF7.70503@gmx.net>
Message-ID: <4805A111.5070803@cs.byu.edu>

Mathias Panzenböck wrote:
> However, there still could be a collision so you have to compare the key
> by value anyway, so the performance might not be that much better. And
> Python's strings are immutable so the hash code is cached anyway and does
> not have to be calculated every time. But this instruction could only
> calculate a hash of a sequence of bytes (like a string), not of a complex
> object which has some fields that form the key (and others that don't).
> So it's pretty useless in my opinion. (But I'm so tired I can hardly
> think, maybe I'm missing something.)

Yes: what you're missing is that most language identifiers are "interned" strings:
just a pointer to them is sufficient for equality comparison. (Ruby and 
every Lisp dialect on the planet call these "symbols".) Python is so 
very close to using nothing but interned strings for all namespace dicts 
that it wouldn't be that difficult to change it.

It's entirely possible that a hash of the string pointer would be 
sufficient, so it wouldn't even be necessary to hash the string and send 
it along.

> Neil Toronto wrote:
>> Full-on associative memory wouldn't be necessary. If there were a 
>> machine-level instruction in current chips that did a dict-style 
>> lookup almost as fast as a base+index*size, Python could *scream*. It 
>> might work like 80x86's lea (load effective address):
>>
>>      hashaddr   base, recsize, dictsize, value, destreg
>>
>> It would be very CISCy, and there are a whole host of practical 
>> problems dealing with hashing functions, good collision resolution, 
>> etc. But any pointer/index-based (e.g. interned string) lookup could 
>> go almost as fast as C++'s "this->". Mainstream is moving toward 
>> dynamic languages, and it makes sense to support that in hardware.
>>
>> Heck, I hear routers and switches already have things like this. Does 
>> anybody have the ear of an Intel chip designer?
>>
>> Neil
> 

From ntoronto at cs.byu.edu  Wed Apr 16 09:00:58 2008
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Wed, 16 Apr 2008 00:00:58 -0700
Subject: [Python-ideas] Associative arrays in hardware
In-Reply-To: <48056BEA.3030008@canterbury.ac.nz>
References: <48043C0A.3090809@cs.byu.edu> <48056BEA.3030008@canterbury.ac.nz>
Message-ID: <4805A42A.5000804@cs.byu.edu>

Greg Ewing wrote:
> Neil Toronto wrote:
>> Full-on associative memory wouldn't be necessary. If there were a 
>> machine-level instruction in current chips that did a dict-style lookup 
>> almost as fast as a base+index*size,
> 
> But how would you make it that fast without some kind of
> associative lookup hardware?

Good point. They'd have to add that to the RAM. :D Or at least the cache...

> Without that, you'd just be implementing the Python dict
> lookup algorithm in microcode, which might be a bit
> faster, but not spectacularly so -- the same number of
> memory accesses would still be needed.

Right, it's memory access that's the bottleneck. A hardware dict lookup 
could know its own memory access characteristics and plan for them. In 
particular, it would be much better at predicting what to bring into 
cache next than current CPUs are at predicting based on software 
algorithms, if they predict at all. On the other side of it, they could 
optimize their lookups for their own hardware, particularly collision 
resolution methods.

Neil

From bvidinli at gmail.com  Thu Apr 24 10:36:52 2008
From: bvidinli at gmail.com (bvidinli)
Date: Thu, 24 Apr 2008 11:36:52 +0300
Subject: [Python-ideas] [Python-Dev] annoying dictionary problem,
	non-existing keys
In-Reply-To: <fupfne$heg$1@ger.gmane.org>
References: <36e8a7020804240113m7cbd10efibfe9f9b68f6135ac@mail.gmail.com>
	<fupfne$heg$1@ger.gmane.org>
Message-ID: <36e8a7020804240136r6e42f9fdv1f3d4dc3e947169a@mail.gmail.com>

I posted to so many lists because

this issue is related to all of them:
it is an idea for Python, and
it is related to the development of Python...

Why are you so defensive?

I think all ideas are important for the development of Python and of software...
I am sorry anyway... I hope this will be helpful.

2008/4/24, Terry Reedy <tjreedy at udel.edu>:
> Python-dev is for discussion of development of future Python.  Use
>  python-list / comp.lang.python / gmane.comp.python.general for usage
>  questions.

-- 
?.Bahattin Vidinli
Elk-Elektronik Müh.
-------------------
Contact information (in order of preference):
skype: bvidinli (for voice calls, www.skype.com)
msn: bvidinli at iyibirisi.com
yahoo: bvidinli

+90.532.7990607
+90.505.5667711

From bvidinli at gmail.com  Thu Apr 24 10:13:25 2008
From: bvidinli at gmail.com (bvidinli)
Date: Thu, 24 Apr 2008 11:13:25 +0300
Subject: [Python-ideas] annoying dictionary problem, non-existing keys
Message-ID: <36e8a7020804240113m7cbd10efibfe9f9b68f6135ac@mail.gmail.com>

I use dictionaries to hold some config data,
such as:

conf={'key1':'value1','key2':'value2'}
and so on...

When I try to process conf, I have to write this every time:
if conf.has_key('key1'):
     if conf['key1']<>'':
         other commands....

This is very annoying.
In PHP, I was able to write just:
if conf['key1']=='someth'

In Python this fails because, if key1 does not exist, it raises an exception.

My question:
is there a way to get the value of an array/tuple/dict item directly by key,
as in PHP above, even if the key may not exist? I should not have to check
whether the key exists; I should just use it, and if it does not exist, the
lookup should simply return an empty value, just as in PHP...

I hope you understand my question...
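
For what it's worth, the standard dict.get method already covers this case.
A minimal sketch, reusing the conf dictionary from above (the key 'nokey' and
the default '' are chosen only for illustration):

```python
conf = {'key1': 'value1', 'key2': 'value2'}

# dict.get returns the given default instead of raising KeyError
if conf.get('key1', '') == 'value1':
    print('key1 is set to value1')

missing = conf.get('nokey', '')   # no exception; falls back to ''
print(repr(missing))

# membership can still be tested explicitly when absence matters
print('nokey' in conf)
```

With no second argument, get returns None for a missing key.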

-- 
?.Bahattin Vidinli
Elk-Elektronik Müh.
-------------------
Contact information (in order of preference):
skype: bvidinli (for voice calls, www.skype.com)
msn: bvidinli at iyibirisi.com
yahoo: bvidinli

+90.532.7990607
+90.505.5667711

From gagsl-py2 at yahoo.com.ar  Fri Apr 25 12:44:19 2008
From: gagsl-py2 at yahoo.com.ar (Gabriel Genellina)
Date: Fri, 25 Apr 2008 07:44:19 -0300
Subject: [Python-ideas] A new .pyc file format
Message-ID: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>

Hello

(Sorry if you get this twice, I can't see my original post from gmane)

I want to propose a new .pyc file format. Currently .pyc files use a very
simple format:

- MAGIC number (4 bytes, little-endian)
- last modification time of source file (4 bytes, little-endian)
- code object (marshaled)

The problem is that this format is *too* simple. It can't be changed, nor
can it accommodate other fields if desired. I propose using a more flexible
.pyc format (resembling RIFF files with multiple levels). The layout would
be as follows:

- A file contains a sequence of sections.
- A section has an identifier (4 bytes, usually ASCII letters), followed
by its size (4 bytes, not counting the section identifier nor the size
itself), followed by the actual section content.
- The layout inside each section is arbitrary, but it's suggested to use
the same technique: a sequence of (identifier x 4 bytes, size x 4 bytes,
actual value)

The outer section is called "PYCO" (from Python Code, or a contraction of
pyc+pyo) and contains at least 4 subsections:

   - "VERS": import's "MAGIC number", now seen as a "code version number" (4
bytes, same format as before)
   - "DATE": last modification time of source file (4 bytes, same format as
before)
   - "COFL": the code.co_flags attribute (4 bytes)
   - "CODE": the marshaled code object

             4 bytes                 4 bytes
    +-----.-----.-----.-----+-----.-----.-----.-----+
    | "P" | "Y" | "C" | "O" | size of whole section |
    +-----------------------+-----------------------+

    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
    | "V" | "E" | "R" | "S" |   4                   | import "MAGIC number" |
    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
    | "D" | "A" | "T" | "E" |   4                   | source file st_mtime  |
    +-----.-----.-----.-----+-----.-----.-----.-----+-----------------------+
    | "C" | "O" | "F" | "L" |   4                   | code.co_flags         |
    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
    | "C" | "O" | "D" | "E" | size of marshaled code| marshaled code object |
    +-----.-----.-----.-----+-----.-----.-----.-----+     ... ... ...       |
                                                    |     ... ... ...       |
                                                    +-----------------------+

New sections (or subsections inside a section) can be defined in the
future. No implied knowledge of a section's meaning or structure is
required to read the file; readers can safely skip over sections they
don't understand and never lose synchronization.

Compared with the current format, it has an overhead of 44 bytes.
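
A small sketch of the proposed layout, to make the size accounting concrete.
It is only an illustration of the description above, not an existing API: the
function names section, build_pyco and iter_sections, the placeholder magic
bytes and the sample timestamp are all made up, and no padding is applied.

```python
import marshal
import struct


def section(ident, payload):
    """One section: 4-byte identifier, 4-byte little-endian size, content."""
    assert len(ident) == 4
    return ident + struct.pack('<I', len(payload)) + payload


def build_pyco(magic, mtime, co_flags, code_bytes):
    """Wrap the four subsections in an outer "PYCO" section."""
    body = (section(b'VERS', magic)
            + section(b'DATE', struct.pack('<I', mtime))
            + section(b'COFL', struct.pack('<I', co_flags))
            + section(b'CODE', code_bytes))
    return section(b'PYCO', body)


def iter_sections(data):
    """Yield (identifier, content) pairs without interpreting the content."""
    pos = 0
    while pos < len(data):
        ident = data[pos:pos + 4]
        (size,) = struct.unpack_from('<I', data, pos + 4)
        yield ident, data[pos + 8:pos + 8 + size]
        pos += 8 + size                   # skip to the next section header


code = compile('x = 1', '<example>', 'exec')
code_bytes = marshal.dumps(code)
magic = b'\x00\x00\r\n'                   # placeholder, not a real magic number
blob = build_pyco(magic, 1209120259, code.co_flags, code_bytes)

outer = list(iter_sections(blob))         # a single "PYCO" section
fields = dict(iter_sections(outer[0][1]))
old_size = 4 + 4 + len(code_bytes)        # magic + mtime + marshaled code
print(len(blob) - old_size)               # overhead of the new layout
```

A reader that only wants "VERS" or "COFL" can step from header to header and
ignore any identifier it does not recognise.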

The format above can replace the current format used for .pyc/.pyo files
(but see below). Of course it's totally incompatible with the old format.
Apart from changing every place where .pyc files are read or written in
the Python sources (not so many, I've identified all of them), 3rd party
libraries and tools using the old format would have to be updated. Perhaps
a new module should be provided to read and write pyc files.
Anyway the change is "safe", in the sense that any old code expecting the
MAGIC number in the first 4 bytes will reject the new format as invalid
and not process it.
Due to this incompatibility, this should be aimed at Python 3.x; I hope we
are still in time to implement this for 3.0?

A step further:

Currently, the generated code object depends on the Python version and the
optimize flag; it used to depend on the Unicode flag too, but that's not
the case for Python 3.
The Python version determines the base MAGIC number; the Unicode flag
increments that number by 1; the optimize flag determines the file
extension used (.pyc/.pyo).
With this new format, there is no need to use two different extensions
anymore: all of this can be gathered from the attributes above, so several
variants of the same code object can be stored in a single file. The
importer can choose which one to load based on those attributes. The
selection can be made rather quickly: only the relevant attributes
actually have to be read; all other subsections can be skipped entirely
without further parsing.
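
How such a selection might look is sketched below. It assumes each variant
is stored as its own "PYCO" section, that "VERS" and "COFL" hold 32-bit
little-endian integers, and that "CODE" comes after them; iter_chunks() and
select_code() are hypothetical names used only for illustration.

```python
import struct

def iter_chunks(data, start=0, end=None):
    """Yield (tag, content_offset, size) for each section in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        tag = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        yield tag, pos + 8, size
        pos += 8 + size  # jump over the content without parsing it

def select_code(data, wanted_vers, wanted_flags):
    """Return the marshaled bytes of the first matching variant, or None."""
    for tag, off, size in iter_chunks(data):
        if tag != b"PYCO":
            continue  # unknown top-level section: skipped, never parsed
        attrs = {}
        for sub, sub_off, sub_size in iter_chunks(data, off, off + size):
            if sub in (b"VERS", b"COFL"):
                (attrs[sub],) = struct.unpack_from("<I", data, sub_off)
            elif sub == b"CODE":
                if (attrs.get(b"VERS") == wanted_vers
                        and attrs.get(b"COFL") == wanted_flags):
                    return data[sub_off:sub_off + sub_size]
                break  # attributes do not match: try the next variant
    return None
```

Only the 8-byte headers and the two small attributes are ever decoded; the
marshaled code of a rejected variant is stepped over by its size.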

Some issues:

- endianness: .pyc files currently store the modification time and magic
number in little-endian; probably one should just stick to it.
- making the size of all sections a multiple of 4 may be a good idea, so
marshaled code should be padded with up to 3 NUL bytes at the end.
- section ordering, and subsection ordering inside a section: should not
be relevant; what if one can't seek to an earlier part of the file? (Ok,
unlikely, but currently import.c appears to handle such cases). If "CODE"
comes before any of "VERS", "COFL", "DATE" it would be necessary to rewind
the file to read the code section. The easy fix is to forbid that
situation: "CODE" must come after all of those subsections.
- The co_flags attribute of code objects is now externally visible; future
Python versions should not redefine those flags.
- There is no provision for explicit attribute types: "VERS" is a number,
"CODE" is a marshaled code object... The reader has to *know* that
(although it can completely skip over unknown attributes). No string
attributes were defined (nor required). For the *current* needs, it's
enough as it is. But perhaps in the future this will turn out to be a
shortcoming, and the .pyc format will have to be changed *again*, and I'd
hate that.
- Perhaps the source modification date should be stored in a more portable
way?
- a naming problem: currently, the code version number defined in import.c
is called "MAGIC", and is written at the very beginning of the file. It
identifies the file as having a valid code object. In the proposed format,
the file will begin with the letters "PYCO" instead, and the current magic
number is buried inside a subsection... it's not a "magic" anymore, just a
version number, and the "magic" in the sense used by file(1) would be the
4 bytes "PYCO". So the name "MAGIC" should be changed everywhere... it
seems too drastic.
- 32 bits should be enough for all sizes (and 640k should be enough for
all people...)
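
The padding item in the list above can be illustrated with a short sketch.
It assumes the padding is simply appended to the content (whether the size
field should count it is left open here), and pad4() is a hypothetical
helper. The marshal documentation states that extra bytes after the
marshaled value are ignored by marshal.loads(), so trailing NULs do not
disturb loading.

```python
import marshal

def pad4(payload):
    """Append 0 to 3 NUL bytes so the length becomes a multiple of 4."""
    return payload + b"\x00" * (-len(payload) % 4)

original = ("spam", 42)
padded = pad4(marshal.dumps(original))
restored = marshal.loads(padded)  # trailing padding is ignored by marshal
```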

Implementation:

I don't have a complete implementation yet, but if this format is approved
(as is or with any changes) I could submit a patch. I've made a small but
incompatible modification to the currently used .pyc format in order to
detect all the places this change would affect, and there aren't actually
that many.

-- 
Gabriel Genellina

From guido at python.org  Fri Apr 25 16:29:13 2008
From: guido at python.org (Guido van Rossum)
Date: Fri, 25 Apr 2008 07:29:13 -0700
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
References: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
Message-ID: <ca471dc20804250729i75b683b7sb3b398647bce29c3@mail.gmail.com>

I think this is a reasonable thing to do, but I'd like to hear more
motivation. Maybe you can write it all up in PEP format and add a
section that explains what features we want from .pyc files?

I like that this would get rid of .pyo files BTW.

--Guido

On Fri, Apr 25, 2008 at 3:44 AM, Gabriel Genellina
<gagsl-py2 at yahoo.com.ar> wrote:
> Hello
>
>  (Sorry if you get this twice, I can't see my original post from gmane)
>
>  I want to propose a new .pyc file format. Currently .pyc files use a very
>  simple format:
>
>  - MAGIC number (4 bytes, little-endian)
>  - last modification time of source file (4 bytes, little-endian)
>  - code object (marshaled)
>
>  The problem is that this format is *too* simple. It can't be changed, nor
>  can accomodate other fields if desired. I propose using a more flexible
>  .pyc format (resembling RIFF files with multiple levels). The layout would
>  be as follows:
>
>  - A file contains a sequence of sections.
>  - A section has an identifier (4 bytes, usually ASCII letters), followed
>  by its size (4 bytes, not counting the section identifier nor the size
>  itself), followed by the actual section content.
>  - The layout inside each section is arbitrary, but it's suggested to use
>  the same technique: a sequence of (identifier x 4 bytes, size x 4 bytes,
>  actual value)
>
>  The outer section is called "PYCO" (from Python Code, or a contraction of
>  pyc+pyo) and contains at least 4 subsections:
>
>   - "VERS": import's "MAGIC number", now seen as a "code version number" (4
>  bytes, same format as before)
>   - "DATE": last modification time of source file (4 bytes, same format as
>  before)
>   - "COFL": the code.co_flags attribute (4 bytes)
>   - "CODE": the marshaled code object
>
>             4 bytes                 4 bytes
>    +-----.-----.-----.-----+-----.-----.-----.-----+
>    | "P" | "Y" | "C" | "O" | size of whole section |
>    +-----------------------+-----------------------+
>
>    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
>    | "V" | "E" | "R" | "S" |   4                   | import "MAGIC number" |
>    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
>    | "D" | "A" | "T" | "E" |   4                   | source file st_mtime  |
>    +-----.-----.-----.-----+-----.-----.-----.-----+-----------------------+
>    | "C" | "O" | "F" | "L" |   4                   | code.co_flags         |
>    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
>    | "C" | "O" | "D" | "E" | size of marshaled code| marshaled code object |
>    +-----.-----.-----.-----+-----.-----.-----.-----+     ... ... ...       |
>                                                    |     ... ... ...       |
>                                                    +-----------------------+
>
>  New sections -or subsections inside a section- can be defined in the
>  future. No implied knowledge of section meanings or its structure is
>  required to read the file; readers can safely skip over sections they
>  don't understand, and never lost synchronism.
>
>  Compared with the current format, it has an overhead of 44 bytes.
>
>  The format above can replace the current format used for .pyc/.pyo files
>  (but see below). Of course it's totally incompatible with the old format.
>  Apart from changing every place where .pyc files are read or written in
>  the Python sources (not so many, I've identified all of them), 3rd party
>  libraries and tools using the old format would have to be updated. Perhaps
>  a new module should be provided to read and write pyc files.
>  Anyway the change is "safe", in the sense that any old code expecting the
>  MAGIC number in the first 4 bytes will reject the new format as invalid
>  and not process it.
>  Due to this incompatibility, this should be aimed at Python 3.x; I hope we
>  are on time to implement this for 3.0?
>
>
>  A step further:
>
>  Currently, the generated code object depends on the Python version and the
>  optimize flag; it used to depend on the Unicode flag too, but that's not
>  the case for Python 3.
>  The Python version determines the base MAGIC number; the Unicode flag
>  increments that number by 1; the optimize flag determines the file
>  extension used (.pyc/.pyo).
>  With this new format, there is no need to use two different extensions
>  anymore: all of this can be gathered from the attributes above, so several
>  variants of the same code object can be stored in a single file. The
>  importer can choose which one to load based on those attributes. The
>  selection can be made rather quickly, just the relevant attributes have to
>  be read actually; all other subsections can be entirely skipped without
>  further parsing.
>
>
>  Some issues:
>
>  - endianness: .pyc files currently store the modification time and magic
>  number in little-endian; probably one should just stick to it.
>  - making the size of all sections multiple of 4 may be a good idea, so
>  marshaled code should be padded with up to 3 NUL bytes at the end.
>  - section ordering, and subsection ordering inside a section: should not
>  be relevant; what if one can't seek to an earlier part of the file? (Ok,
>  unlikely, but currently import.c appears to handle such cases). If "CODE"
>  comes before any of "VERS", "COFL", "DATE" it should be necesary to rewind
>  the file to read the code section. The easy fix is to forbid that
>  situation: "CODE" must come after all of those subsections.
>  - The co_flags attribute of code objects is now externally visible; future
>  Python versions should not redefine those flags.
>  - There is no provision for explicit attribute types: "VERS" is a number,
>  "CODE" is a marshaled code object... The reader has to *know* that
>  (although it can completely skip over unknown attributes). No string
>  attributes were defined (nor required). For the *current* needs, it's
>  enough as it is. But perhaps in the future this reveals as a shortcoming,
>  and the .pyc format has to be changed *again*, and I'd hate that.
>  - Perhaps the source modification date should be stored in a more portable
>  way?
>  - a naming problem: currently, the code version number defined in import.c
>  is called "MAGIC", and is written at the very beginning of the file. It
>  identifies the file as having a valid code object. In the proposed format,
>  the file will begin with the letters "PYCO" instead, and the current magic
>  number is buried inside a subsection... it's not a "magic" anymore, just a
>  version number, and the "magic" in the sense used by file(1) would be the
>  4 bytes "PYCO". So the name "MAGIC" should be changed everywhere... it
>  seems too drastic.
>  - 32 bits should be enough for all sizes (and 640k should be enough for
>  all people...)
>
>  Implementation:
>
>  I don't have a complete implementation yet, but if this format is approved
>  (as is or with any changes) I could submit a patch. I've made a small but
>  incompatible modification in the currently used .pyc format in order to
>  detect all places where this change would impact, and they're not so many
>  actually.
>
>  --
>  Gabriel Genellina
>

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bwinton at latte.ca  Fri Apr 25 16:33:16 2008
From: bwinton at latte.ca (Blake Winton)
Date: Fri, 25 Apr 2008 10:33:16 -0400
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
References: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
Message-ID: <4811EBAC.2040600@latte.ca>

Gabriel Genellina wrote:
> I propose using a more flexible .pyc format (resembling RIFF files with
> multiple levels). The layout would be as follows:
> 
> - A file contains a sequence of sections.
> - A section has an identifier (4 bytes, usually ASCII letters),

[snip...]

> New sections -or subsections inside a section- can be defined in the
> future. No implied knowledge of section meanings or its structure is
> required to read the file; readers can safely skip over sections they
> don't understand, and never lost synchronism.

As a side suggestion, the PNG spec makes the capitalization of each
identifier indicate extra meta-data about the section.  (See:
http://www.libpng.org/pub/png/spec/1.2/PNG-Structure.html#Chunk-naming-conventions )

For instance, an identifier that starts with a capital letter means that
the decoder must understand this chunk to process the contents of the
file, whereas an identifier with a lowercase first letter can safely be
skipped.

An uppercase second letter means that the identifier is defined by 
Python, whereas a lowercase second letter would indicate a 
third-party-defined chunk.

(The PNG spec reserves the case of the third letter, and forces it to be 
uppercase.  The case of the fourth letter indicates whether it's safe to 
copy this chunk.  I don't think either of those are particularly useful 
to Python, and so could conveniently be skipped.)

Would this extra meta-data be useful?  I think so, for the "safe to 
ignore" flag, at least.
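
A reader could test those two case bits as sketched here. This assumes
4-byte ASCII identifiers and borrows PNG's convention that bit 5 (0x20) of a
letter is set when it is lowercase; chunk_properties() and its key names are
hypothetical.

```python
def chunk_properties(tag):
    """Interpret the case of the first two letters of a 4-byte identifier."""
    first, second = tag[0], tag[1]  # indexing bytes yields integers
    return {
        # uppercase first letter: the reader must understand this section
        "must_understand": not (first & 0x20),
        # uppercase second letter: defined by Python, not by a third party
        "python_defined": not (second & 0x20),
    }

print(chunk_properties(b"CODE"))
print(chunk_properties(b"xTra"))
```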

Later,
Blake.

From facundobatista at gmail.com  Fri Apr 25 17:30:06 2008
From: facundobatista at gmail.com (Facundo Batista)
Date: Fri, 25 Apr 2008 12:30:06 -0300
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
References: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
Message-ID: <e04bdf310804250830t1e71a985pe7aa8bd3f9cd0bc8@mail.gmail.com>

2008/4/25, Gabriel Genellina <gagsl-py2 at yahoo.com.ar>:

>  The problem is that this format is *too* simple. It can't be changed, nor
>  can accomodate other fields if desired. I propose using a more flexible

But how do you expect these extended .pyc files to be used? I mean,
are there use cases for this more complex format? Or will they just be
more complex, carrying the same information as before, for years,
because nobody needs this flexibility?

>  Anyway the change is "safe", in the sense that any old code expecting the
>  MAGIC number in the first 4 bytes will reject the new format as invalid
>  and not process it.

Maybe what we can do here is that, for some Python versions (say, 3.0,
and maybe 3.1), the importer tries to import the file in the new format,
and if it finds the file invalid *and* finds one of the old MAGIC numbers
in the first 4 bytes, it just imports it the old-fashioned way.
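
As a rough sketch of that fallback, assuming the new format always starts
with the bytes "PYCO" and that the importer keeps a set of magic values it
still accepts (detect_format() and known_old_magics are hypothetical names,
and the magic bytes below are placeholders):

```python
def detect_format(header, known_old_magics):
    """Classify a .pyc file from its first 4 bytes."""
    prefix = header[:4]
    if prefix == b"PYCO":
        return "new"
    if prefix in known_old_magics:
        return "old"      # fall back to the current loader
    return "invalid"

old_magics = {b"\x01\x02\r\n"}  # placeholder value, not a real magic number
print(detect_format(b"PYCO\x34\x00\x00\x00", old_magics))
print(detect_format(b"\x01\x02\r\n\x00\x00\x00\x00", old_magics))
```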

>  - making the size of all sections multiple of 4 may be a good idea, so
>  marshaled code should be padded with up to 3 NUL bytes at the end.

Why? Does this have something to do with memory alignment? I don't see
the benefit of this extra rule.

>  - 32 bits should be enough for all sizes (and 640k should be enough for
>  all people...)

Regarding this, and the 44 bytes of overhead you mentioned, I checked the
average size of the .pyc files on the Linux system I had at hand:

$ locate pyc | xargs ls -l | awk 'BEGIN {a=0; c=0} {a += $5; c+=1} END {print c, a/c}'
4960 9069.12

I think that both overhead and 32b for size are ok.

>  Implementation:
>
>  I don't have a complete implementation yet, but if this format is approved
>  (as is or with any changes) I could submit a patch. I've made a small but
>  incompatible modification in the currently used .pyc format in order to
>  detect all places where this change would impact, and they're not so many
>  actually.

Yes, you should start writing a PEP (if you need any help, we can
talk about it at the next Python Argentina meeting ;).

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From mwm at mired.org  Fri Apr 25 16:35:48 2008
From: mwm at mired.org (Mike Meyer)
Date: Fri, 25 Apr 2008 10:35:48 -0400
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
References: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
Message-ID: <20080425103548.13cf7cbb@bhuda.mired.org>

On Fri, 25 Apr 2008 07:44:19 -0300
"Gabriel Genellina" <gagsl-py2 at yahoo.com.ar> wrote:

> Hello
> 
> (Sorry if you get this twice, I can't see my original post from gmane)
> 
> I want to propose a new .pyc file format. Currently .pyc files use a very
> simple format:
> 
> - MAGIC number (4 bytes, little-endian)
> - last modification time of source file (4 bytes, little-endian)
> - code object (marshaled)
> 
> The problem is that this format is *too* simple. It can't be changed, nor
> can accomodate other fields if desired. I propose using a more flexible
> .pyc format (resembling RIFF files with multiple levels). The layout would
> be as follows:

Ok, *why* is this a problem? What proposed other fields do you have,
other than putting in multiple code segments with different flags?

Beyond that:

> - A section has an identifier (4 bytes, usually ASCII letters), followed
> by its size (4 bytes, not counting the section identifier nor the size
> itself), followed by the actual section content.

AKA Tag/Length/Value triples. While TLV is the common order, it's
slightly easier to deal with them if you go with LTV. You *have* to
deal with the length in order to read things in. Beyond that, you can
treat TV as atomic if you don't care about the tag for some reason.
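
A minimal sketch of reading LTV records, assuming a 4-byte little-endian
length that counts the 4-byte tag plus the value (read_ltv() is a
hypothetical helper):

```python
import io
import struct

def read_ltv(stream):
    """Yield (tag, value) pairs from a stream of length-tag-value records."""
    while True:
        raw = stream.read(4)
        if len(raw) < 4:
            return  # end of stream
        (length,) = struct.unpack("<I", raw)
        record = stream.read(length)  # tag and value arrive as one blob
        if len(record) < length:
            raise ValueError("truncated record")
        yield record[:4], record[4:]

def make_record(tag, value):
    """Build one record for the demonstration below."""
    return struct.pack("<I", len(tag) + len(value)) + tag + value

stream = io.BytesIO(make_record(b"VERS", b"\x01\x00\x00\x00")
                    + make_record(b"CODE", b"payload"))
records = list(read_ltv(stream))
```

A caller that does not care about tags can copy or skip each record without
ever splitting it.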

> - 32 bits should be enough for all sizes (and 640k should be enough for
> all people...)

Given that there are people who write code that writes code, and the
memory and disk capacities of modern systems, I'd say this is likely
to cause problems. Given those capacities, 8 byte lengths instead of 4
shouldn't be a problem. For embedded devices - well, they're not going
to like the idea in the first place.

   <mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

From facundobatista at gmail.com  Fri Apr 25 20:21:43 2008
From: facundobatista at gmail.com (Facundo Batista)
Date: Fri, 25 Apr 2008 15:21:43 -0300
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <20080425103548.13cf7cbb@bhuda.mired.org>
References: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
	<20080425103548.13cf7cbb@bhuda.mired.org>
Message-ID: <e04bdf310804251121o1df2a405ib5bf39364fb67d90@mail.gmail.com>

2008/4/25, Mike Meyer <mwm at mired.org>:

>  > - 32 bits should be enough for all sizes (and 640k should be enough for
>  > all people...)
>
> Given that there are people who write code that writes code, and the
>  memory and disk capacities of modern systems, I'd say this is likely
>  to cause problems. Given those capacities, 8 byte lengths instead of 4
>  shouldn't be a problem. For embedded devices - well, they're not going
>  to like the idea in the first place.

Well, ASN.1 has well-defined semantics for a multi-byte L in a TLV
construction; we could adopt that. Note, though, that I think doing so
is too complex for this purpose.
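
For reference, the definite-length encoding used by ASN.1 BER works roughly
as sketched below: lengths under 128 take one byte, and larger ones take a
prefix byte (0x80 plus the number of length bytes) followed by the length in
big-endian order. The function names are hypothetical, and the indefinite
form is deliberately not handled.

```python
def encode_length(n):
    """Encode a non-negative length in BER definite form."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n < 0x80:
        return bytes([n])                      # short form: one byte
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body    # long form: count, then bytes

def decode_length(data, pos=0):
    """Return (length, position just after the length field)."""
    first = data[pos]
    if first < 0x80:
        return first, pos + 1
    count = first & 0x7F
    value = int.from_bytes(data[pos + 1:pos + 1 + count], "big")
    return value, pos + 1 + count

print(encode_length(5).hex())
print(encode_length(300).hex())
```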

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From brett at python.org  Fri Apr 25 23:20:50 2008
From: brett at python.org (Brett Cannon)
Date: Fri, 25 Apr 2008 14:20:50 -0700
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
References: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
Message-ID: <bbaeab100804251420k4e96a417t4f459077936413dd@mail.gmail.com>

On Fri, Apr 25, 2008 at 3:44 AM, Gabriel Genellina
<gagsl-py2 at yahoo.com.ar> wrote:
> Hello
>
>  (Sorry if you get this twice, I can't see my original post from gmane)
>
>  I want to propose a new .pyc file format. Currently .pyc files use a very
>  simple format:
>
>  - MAGIC number (4 bytes, little-endian)
>  - last modification time of source file (4 bytes, little-endian)
>  - code object (marshaled)
>
>  The problem is that this format is *too* simple. It can't be changed, nor
>  can accomodate other fields if desired. I propose using a more flexible
>  .pyc format (resembling RIFF files with multiple levels). The layout would
>  be as follows:
>
>  - A file contains a sequence of sections.
>  - A section has an identifier (4 bytes, usually ASCII letters), followed
>  by its size (4 bytes, not counting the section identifier nor the size
>  itself), followed by the actual section content.
>  - The layout inside each section is arbitrary, but it's suggested to use
>  the same technique: a sequence of (identifier x 4 bytes, size x 4 bytes,
>  actual value)
>
>  The outer section is called "PYCO" (from Python Code, or a contraction of
>  pyc+pyo) and contains at least 4 subsections:
>
>   - "VERS": import's "MAGIC number", now seen as a "code version number" (4
>  bytes, same format as before)
>   - "DATE": last modification time of source file (4 bytes, same format as
>  before)
>   - "COFL": the code.co_flags attribute (4 bytes)
>   - "CODE": the marshaled code object
>
>             4 bytes                 4 bytes
>    +-----.-----.-----.-----+-----.-----.-----.-----+
>    | "P" | "Y" | "C" | "O" | size of whole section |
>    +-----------------------+-----------------------+
>
>    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
>    | "V" | "E" | "R" | "S" |   4                   | import "MAGIC number" |
>    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
>    | "D" | "A" | "T" | "E" |   4                   | source file st_mtime  |
>    +-----.-----.-----.-----+-----.-----.-----.-----+-----------------------+
>    | "C" | "O" | "F" | "L" |   4                   | code.co_flags         |
>    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
>    | "C" | "O" | "D" | "E" | size of marshaled code| marshaled code object |
>    +-----.-----.-----.-----+-----.-----.-----.-----+     ... ... ...       |
>                                                    |     ... ... ...       |
>                                                    +-----------------------+
>
>  New sections -or subsections inside a section- can be defined in the
>  future. No implied knowledge of section meanings or its structure is
>  required to read the file; readers can safely skip over sections they
>  don't understand, and never lost synchronism.
>

While I think having a more flexible format is important to allow for
modifying the AST before bytecode write-out, I don't know if it needs
to go quite this far. The magic number, timestamp, and marshaled code
are not about to go away. Thus the current format can basically stay,
but we can add a flexible addition between the timestamp and the code
object. This saves some memory and simplifies the format slightly in
the case where the guaranteed requirements of .pyc regeneration can be
quickly checked (e.g., a quick 8 byte read off the file will quickly
tell if the magic number or timestamp are out of date, thus skipping
having to read the entire header for these two critical sanity
checks).
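
That quick check can be sketched as follows, assuming the first 8 bytes keep
today's layout (4-byte magic, then a 4-byte little-endian mtime); is_stale()
and the sample values are hypothetical.

```python
import struct

def is_stale(first8, expected_magic, source_mtime):
    """Decide from the first 8 bytes whether the .pyc must be regenerated."""
    magic, mtime = struct.unpack("<4sI", first8)
    return magic != expected_magic or mtime != (int(source_mtime) & 0xFFFFFFFF)

magic = b"\x01\x02\r\n"  # placeholder value, not a real magic number
header = magic + struct.pack("<I", 1209120259)
print(is_stale(header, magic, 1209120259))
print(is_stale(header, magic, 1209120260))
```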

The thing I think that the new format needs to easily support is not
just the removal of .pyo, but of user-defined AST transformations
prior to .pyc generation. Now that this can be done at the Python
level, some people might start coming up with compiler optimizations
that change semantics. That means there needs to be a clear way to
register an AST transformation as having occurred. I am just worried
that the 4 bytes for labeling something won't be enough. We could say
that all optimizations are labeled "OPTO" and that what format is used
is specified in the value, but
that means supporting multiple instances of the same label in the
header (which I think is fine since this is going to be read linearly
off disk 99% of the time).

So I guess this boils down to I think we don't need to label what MUST
be in the header, and that we should allow for multiple instances of
the same label (whether this is always true or we use some way to flag
that through capitalization).

-Brett

From lists at cheimes.de  Sat Apr 26 01:18:14 2008
From: lists at cheimes.de (Christian Heimes)
Date: Sat, 26 Apr 2008 01:18:14 +0200
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <ca471dc20804250729i75b683b7sb3b398647bce29c3@mail.gmail.com>
References: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
	<ca471dc20804250729i75b683b7sb3b398647bce29c3@mail.gmail.com>
Message-ID: <481266B6.3030603@cheimes.de>

Guido van Rossum schrieb:
> I think this is a reasonable thing to do, but I'd like to hear more
> motivation. Maybe you can write it all up in PEP format an add a
> section that explains what features we want from .pyc files?
> 
> I like that this would get rid of .pyo files BTW.

Indeed, the general idea sounds very promising.

Christian

From collinw at gmail.com  Sat Apr 26 01:52:23 2008
From: collinw at gmail.com (Collin Winter)
Date: Fri, 25 Apr 2008 16:52:23 -0700
Subject: [Python-ideas] [Python-Dev] annoying dictionary problem,
	non-existing keys
In-Reply-To: <36e8a7020804240136r6e42f9fdv1f3d4dc3e947169a@mail.gmail.com>
References: <36e8a7020804240113m7cbd10efibfe9f9b68f6135ac@mail.gmail.com>
	<fupfne$heg$1@ger.gmane.org>
	<36e8a7020804240136r6e42f9fdv1f3d4dc3e947169a@mail.gmail.com>
Message-ID: <43aa6ff70804251652r4f0deb39he5b2e1c71e8aabee@mail.gmail.com>

2008/4/24 bvidinli <bvidinli at gmail.com>:
> I posted to so many lists because,
>
>  this issue is related to all lists,
>  this is an idea for python,
>  this is related to development of python...
>
>  why are you so much defensive ?
>
>  i think ideas all important for development of python, software....
>  i am sory anyway.... hope will be helpful.

Please consult the documentation first:
http://docs.python.org/lib/typesmapping.html . You're looking for the
get() method.

This attribute of PHP is hardly considered a feature, and is not
something Python wishes to emulate.
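
A short illustration of get() for missing keys (the dictionary contents are
made up for the example):

```python
settings = {"host": "localhost"}

host = settings.get("host")        # key present: its value is returned
port = settings.get("port", 8080)  # key missing: the given default is returned
user = settings.get("user")        # key missing, no default: None is returned
```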

Collin Winter

>  2008/4/24, Terry Reedy <tjreedy at udel.edu>:
>
>
> > Python-dev is for discussion of development of future Python.  Use
>  >  python-list / comp.lang.python / gmane.comp.python.general for usage
>  >  questions.
>  >

From jimjjewett at gmail.com  Sat Apr 26 18:21:59 2008
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sat, 26 Apr 2008 12:21:59 -0400
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
References: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
Message-ID: <fb6fbf560804260921m383f43d4n45aab7a0b3ac4e7a@mail.gmail.com>

On 4/25/08, Gabriel Genellina <gagsl-py2 at yahoo.com.ar> wrote:

>  The problem is that this [pyc] format is *too* simple. It can't be changed, nor
>  can accomodate other fields if desired.

Why do you need to?  Except for bootstrapping, can't you make all
these changes with a custom loader/importer?

Shipping python with default support for a new format may be
reasonable as well -- the interpreter already handles both pure python
and extension modules.  Even hooking it in as an alternate generated
format just extends the pyo/pyc decision.

Or were you suggesting that the stdlib should use this new format by
default, or even strictly instead of the current format?  If so, what
are the advantages in the normal case?  (Deferring the load of
docstrings?  Better categorization by some external tool?)

-jJ

From bvidinli at gmail.com  Sat Apr 26 21:16:50 2008
From: bvidinli at gmail.com (bvidinli)
Date: Sat, 26 Apr 2008 22:16:50 +0300
Subject: [Python-ideas] Apologize: annoying dictionary problem,
	non-existing keys
Message-ID: <36e8a7020804261216k41e749c4sdad93cc6c008b435@mail.gmail.com>

Thank you all for your answers. I got many useful answers, and someone
reminded me to avoid cross-posting.
I apologize to all of you for cross-posting and for the disturbance.
That day was not a good day for me; it is my fault.

Sorry, and have a nice day.

On Saturday, 26 April 2008 at 02:52, Collin Winter <collinw at gmail.com> wrote:
> 2008/4/24 bvidinli <bvidinli at gmail.com>:
>  > I posted to so many lists because,
>  >
>  >  this issue is related to all lists,
>  >  this is an idea for python,
>  >  this is related to development of python...
>  >
>  >  why are you so much defensive ?
>  >
>  >  i think ideas all important for development of python, software....
>  >  i am sory anyway.... hope will be helpful.
>
>  Please consult the documentation first:
>  http://docs.python.org/lib/typesmapping.html . You're looking for the
>  get() method.
>
>  This attribute of PHP is hardly considered a feature, and is not
>  something Python wishes to emulate.
>
>  Collin Winter
>
>  >  2008/4/24, Terry Reedy <tjreedy at udel.edu>:
>  >
>  >
>  > > Python-dev is for discussion of development of future Python.  Use
>  >  >  python-list / comp.lang.python / gmane.comp.python.general for usage
>  >  >  questions.

-- 
?.Bahattin Vidinli
Elk-Elektronik M?h.
-------------------
contact information (in order of preference):
skype: bvidinli (for voice calls, www.skype.com)
msn: bvidinli at iyibirisi.com
yahoo: bvidinli

+90.532.7990607
+90.505.5667711

From ironfroggy at socialserve.com  Mon Apr 28 16:30:52 2008
From: ironfroggy at socialserve.com (Calvin Spealman)
Date: Mon, 28 Apr 2008 10:30:52 -0400
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <ca471dc20804250729i75b683b7sb3b398647bce29c3@mail.gmail.com>
References: <op.t95xf5jzx6zn5v@gabriel2.softlabbsas.com.ar>
	<ca471dc20804250729i75b683b7sb3b398647bce29c3@mail.gmail.com>
Message-ID: <5570F2C7-DC49-4A7F-9332-8711BAFE64FD@socialserve.com>

I'll play my part here and toss out some ideas we could use this for.  
I'm not really advocating it, yet, but I'll say I am +0. In either  
case, if we did, here are some possible uses:

We could break up the code into multiple sections and allow  
alternatives for sections with different versions. Different versions  
could be used for a few different things, including different  
optimization levels, supporting multiple bytecode versions, or  
storing both pre and post AST transformations of the code.

Meta-ish data like docstrings could be pulled out of the code objects
and injected in non-optimized modes. This might also include the original
source for the code, which could be helpful when you change the source
while the old code is still running and a traceback would otherwise show
the wrong lines.

Bookkeeping data could sit in some sections, detailing things like  
call stats (average call time, frequency, etc) and other information  
that could be useful for optimizers and JIT compilers like psyco.

I am not saying any of these are good ideas or good uses of the  
original idea. I'm just giving thought fodder for the hypothetical.

On Apr 25, 2008, at 10:29 AM, Guido van Rossum wrote:

> I think this is a reasonable thing to do, but I'd like to hear more
> motivation. Maybe you can write it all up in PEP format and add a
> section that explains what features we want from .pyc files?
>
> I like that this would get rid of .pyo files BTW.
>
> --Guido
>
> On Fri, Apr 25, 2008 at 3:44 AM, Gabriel Genellina
> <gagsl-py2 at yahoo.com.ar> wrote:
>> Hello
>>
>>  (Sorry if you get this twice, I can't see my original post from gmane)
>>
>>  I want to propose a new .pyc file format. Currently .pyc files use a
>>  very simple format:
>>
>>  - MAGIC number (4 bytes, little-endian)
>>  - last modification time of source file (4 bytes, little-endian)
>>  - code object (marshaled)
>>
>>  The problem is that this format is *too* simple. It can't be changed,
>>  nor can it accommodate other fields if desired. I propose using a more
>>  flexible .pyc format (resembling RIFF files with multiple levels). The
>>  layout would be as follows:
>>
>>  - A file contains a sequence of sections.
>>  - A section has an identifier (4 bytes, usually ASCII letters),
>>  followed by its size (4 bytes, not counting the section identifier
>>  nor the size itself), followed by the actual section content.
>>  - The layout inside each section is arbitrary, but it's suggested to
>>  use the same technique: a sequence of (identifier x 4 bytes, size x 4
>>  bytes, actual value)
>>
>>  The outer section is called "PYCO" (from Python Code, or a contraction
>>  of pyc+pyo) and contains at least 4 subsections:
>>
>>   - "VERS": import's "MAGIC number", now seen as a "code version number"
>>  (4 bytes, same format as before)
>>   - "DATE": last modification time of source file (4 bytes, same format
>>  as before)
>>   - "COFL": the code.co_flags attribute (4 bytes)
>>   - "CODE": the marshaled code object
>>
>>             4 bytes                 4 bytes
>>    +-----.-----.-----.-----+-----.-----.-----.-----+
>>    | "P" | "Y" | "C" | "O" | size of whole section |
>>    +-----------------------+-----------------------+
>>
>>    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
>>    | "V" | "E" | "R" | "S" |   4                   | import "MAGIC number" |
>>    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
>>    | "D" | "A" | "T" | "E" |   4                   | source file st_mtime  |
>>    +-----.-----.-----.-----+-----.-----.-----.-----+-----------------------+
>>    | "C" | "O" | "F" | "L" |   4                   | code.co_flags         |
>>    +-----.-----.-----.-----+-----.-----.-----.-----+-----.-----.-----.-----+
>>    | "C" | "O" | "D" | "E" | size of marshaled code| marshaled code object |
>>    +-----.-----.-----.-----+-----.-----.-----.-----+     ... ... ...       |
>>                                                    |     ... ... ...       |
>>                                                    +-----------------------+
>>
>>  New sections -or subsections inside a section- can be defined in the
>>  future. No implied knowledge of section meanings or its structure is
>>  required to read the file; readers can safely skip over sections they
>>  don't understand, and never lose synchronization.
>>
>>  Compared with the current format, it has an overhead of 44 bytes.
>>
>>  The format above can replace the current format used for .pyc/.pyo
>>  files (but see below). Of course it's totally incompatible with the
>>  old format. Apart from changing every place where .pyc files are read
>>  or written in the Python sources (not so many, I've identified all of
>>  them), 3rd party libraries and tools using the old format would have
>>  to be updated. Perhaps a new module should be provided to read and
>>  write pyc files.
>>  Anyway the change is "safe", in the sense that any old code expecting
>>  the MAGIC number in the first 4 bytes will reject the new format as
>>  invalid and not process it.
>>  Due to this incompatibility, this should be aimed at Python 3.x; I
>>  hope we are in time to implement this for 3.0?
>>
>>
>>  A step further:
>>
>>  Currently, the generated code object depends on the Python version
>>  and the optimize flag; it used to depend on the Unicode flag too, but
>>  that's not the case for Python 3.
>>  The Python version determines the base MAGIC number; the Unicode flag
>>  increments that number by 1; the optimize flag determines the file
>>  extension used (.pyc/.pyo).
>>  With this new format, there is no need to use two different
>>  extensions anymore: all of this can be gathered from the attributes
>>  above, so several variants of the same code object can be stored in a
>>  single file. The importer can choose which one to load based on those
>>  attributes. The selection can be made rather quickly, just the
>>  relevant attributes have to be read actually; all other subsections
>>  can be entirely skipped without further parsing.
>>
>>
>>  Some issues:
>>
>>  - endianness: .pyc files currently store the modification time and
>>  magic number in little-endian; probably one should just stick to it.
>>  - making the size of all sections a multiple of 4 may be a good idea,
>>  so marshaled code should be padded with up to 3 NUL bytes at the end.
>>  - section ordering, and subsection ordering inside a section: should
>>  not be relevant; what if one can't seek to an earlier part of the
>>  file? (Ok, unlikely, but currently import.c appears to handle such
>>  cases). If "CODE" comes before any of "VERS", "COFL", "DATE" it would
>>  be necessary to rewind the file to read the code section. The easy
>>  fix is to forbid that situation: "CODE" must come after all of those
>>  subsections.
>>  - The co_flags attribute of code objects is now externally visible;
>>  future Python versions should not redefine those flags.
>>  - There is no provision for explicit attribute types: "VERS" is a
>>  number, "CODE" is a marshaled code object... The reader has to *know*
>>  that (although it can completely skip over unknown attributes). No
>>  string attributes were defined (nor required). For the *current*
>>  needs, it's enough as it is. But perhaps in the future this turns out
>>  to be a shortcoming, and the .pyc format has to be changed *again*,
>>  and I'd hate that.
>>  - Perhaps the source modification date should be stored in a more
>>  portable way?
>>  - a naming problem: currently, the code version number defined in
>>  import.c is called "MAGIC", and is written at the very beginning of
>>  the file. It identifies the file as having a valid code object. In
>>  the proposed format, the file will begin with the letters "PYCO"
>>  instead, and the current magic number is buried inside a
>>  subsection... it's not a "magic" anymore, just a version number, and
>>  the "magic" in the sense used by file(1) would be the 4 bytes "PYCO".
>>  So the name "MAGIC" should be changed everywhere... it seems too
>>  drastic.
>>  - 32 bits should be enough for all sizes (and 640k should be enough
>>  for all people...)
>>
>>  Implementation:
>>
>>  I don't have a complete implementation yet, but if this format is
>>  approved (as is or with any changes) I could submit a patch. I've
>>  made a small but incompatible modification in the currently used .pyc
>>  format in order to detect all places where this change would impact,
>>  and they're not so many actually.
>>
>>  --
>>  Gabriel Genellina
>>
>>  _______________________________________________
>>  Python-ideas mailing list
>>  Python-ideas at python.org
>>  http://mail.python.org/mailman/listinfo/python-ideas
>>
>
>
>
> -- 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
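
The section layout quoted above can be tried out in a few lines of
Python. This is only a rough sketch under stated assumptions, not the
implementation anyone has proposed: the helper names (section,
build_pyco, iter_sections) are made up for illustration, the magic
and mtime values are placeholders, and the padding to a multiple of
4 bytes mentioned under "Some issues" is left out.

```python
import marshal
import struct


def section(tag: bytes, payload: bytes) -> bytes:
    """Pack one section: 4-byte identifier, 4-byte little-endian size, content."""
    if len(tag) != 4:
        raise ValueError("section identifiers are exactly 4 bytes")
    return tag + struct.pack("<I", len(payload)) + payload


def build_pyco(code, magic: int, mtime: int) -> bytes:
    """Wrap a code object in the outer "PYCO" section with its 4 subsections."""
    body = (
        section(b"VERS", struct.pack("<I", magic))
        + section(b"DATE", struct.pack("<I", mtime))
        + section(b"COFL", struct.pack("<I", code.co_flags))
        + section(b"CODE", marshal.dumps(code))  # placed last, as the proposal suggests
    )
    return section(b"PYCO", body)


def iter_sections(data: bytes):
    """Yield (identifier, content) pairs; the size field says how far to skip."""
    pos = 0
    while pos < len(data):
        tag = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        yield tag, data[pos + 8:pos + 8 + size]
        pos += 8 + size


# Placeholder values: neither number belongs to a real interpreter or file.
code = compile("x = 6 * 7", "<example>", "exec")
blob = build_pyco(code, magic=12345, mtime=1209000000)

[(outer_tag, outer_body)] = list(iter_sections(blob))
fields = dict(iter_sections(outer_body))

# Old layout: 4-byte MAGIC + 4-byte mtime + marshaled code.
old_size = 8 + len(fields[b"CODE"])
overhead = len(blob) - old_size
```

Here overhead works out to the 44 bytes stated in the proposal: the
new layout spends 8 bytes on the outer header, 3 x 12 bytes on the
fixed subsections and 8 bytes on the "CODE" header (52 in total),
against the 8 header bytes of the old layout.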

From grosser.meister.morti at gmx.net  Tue Apr 29 02:33:19 2008
From: grosser.meister.morti at gmx.net (=?ISO-8859-9?Q?Mathias_Panzenb=F6ck?=)
Date: Tue, 29 Apr 2008 02:33:19 +0200
Subject: [Python-ideas] annoying dictionary problem, non-existing keys
In-Reply-To: <36e8a7020804240113m7cbd10efibfe9f9b68f6135ac@mail.gmail.com>
References: <36e8a7020804240113m7cbd10efibfe9f9b68f6135ac@mail.gmail.com>
Message-ID: <48166CCF.90306@gmx.net>

bvidinli wrote:
> i use dictionaries to hold some config data,
> such as:
> 
> conf={'key1':'value1','key2':'value2'}
> and so on...
> 
> when i try to process conf, i have to code every time like:
> if conf.has_key('key1'):
>      if conf['key1']<>'':
>          other commands....
> 
> 
> this is very annoying.
> in php, i was able to code only like:
> if conf['key1']=='someth'
> 
> in python, this fails, because, if key1 does not exists, it raises an exception.
> 
> MY question:
> is there a way to directly get value of an array/tuple/dict  item by key,
> as in php above, even if key may not exist,  i should not check if key exist,
> i should only use it, if it does not exist, it may return only empty,
> just as in php....
> 
> i hope you understand my question...
> 

if conf.get('key1','') == 'someth':
	...
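
Spelled out a bit more, as a small sketch that reuses the conf dict
from the question (the key "key3" is just an example of a key that
is not present):

```python
conf = {"key1": "value1", "key2": "value2"}

# Plain indexing raises KeyError for a missing key ...
try:
    conf["key3"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# ... while .get() falls back to the supplied default.
value = conf.get("key3", "")
matches = conf.get("key1", "") == "value1"
```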

From gagsl-py2 at yahoo.com.ar  Tue Apr 29 11:26:22 2008
From: gagsl-py2 at yahoo.com.ar (gagsl-py2 at yahoo.com.ar)
Date: Tue, 29 Apr 2008 06:26:22 -0300 (ART)
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <ca471dc20804250729i75b683b7sb3b398647bce29c3@mail.gmail.com>
Message-ID: <9033.57657.qm@web32804.mail.mud.yahoo.com>

Replying to all posts jointly (and directly to the
list, looks like gmane doesn't like my posts on the
newsgroup...)

On Fri, 25 Apr 2008 11:29:13 -0300, Guido van Rossum
<guido-+ZN9ApsXKcEdnm+yROfE0A at public.gmane.org> wrote:

> I think this is a reasonable thing to do, but I'd
> like to hear more motivation. Maybe you can write 
> it all up in PEP format and add a section that 
> explains what features we want from .pyc files?

Ok.

> I like that this would get rid of .pyo files BTW.

Yes, both .pyc and .pyo versions could coexist in 
the same file, among other things.

On Fri, 25 Apr 2008 11:33:16 -0300, Blake Winton
<bwinton-D8CoGe09WXY at public.gmane.org> wrote:

> As a side suggestion, the PNG spec makes the 
> capitalization of each  identifier indicate extra 
> meta-data about the section.  (See:
> http://www.libpng.org/pub/png/spec/1.2/PNG-Structure.html#Chunk-naming-conventions
> )
> 
> For instance, an identifier that starts with a 
> capital letter means that the decoder must 
> understand this chunk to process the contents of 
> the file, whereas an identifier with a lowercase 
> first letter can safely be skipped.
> 
> An uppercase second letter means that the 
> identifier is defined by Python, whereas a 
> lowercase second letter would indicate a 
> third-party-defined chunk.
> 
> (The PNG spec reserves the case of the third 
> letter, and forces it to be uppercase.  The case 
> of the fourth letter indicates whether it's safe 
> to copy this chunk.  I don't think either of those
> are particularly useful to Python, and so could 
> conveniently be skipped.)
> 
> Would this extra meta-data be useful?  I think so,
> for the "safe to ignore" flag, at least.

Yes, we can reserve the case of the last two letters
(always uppercase now) until any useful meaning 
emerges.
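
As a rough sketch of how a reader could test those case bits,
assuming 4-byte ASCII identifiers (the function names and the sample
identifiers are made up for illustration; in ASCII, bit 0x20 is set
for lowercase letters):

```python
def is_optional(tag: bytes) -> bool:
    """Lowercase first letter: a reader may safely skip this section."""
    return bool(tag[0] & 0x20)


def is_third_party(tag: bytes) -> bool:
    """Lowercase second letter: the section is not defined by Python itself."""
    return bool(tag[1] & 0x20)


# "CODE" must be understood; a hypothetical "stAT" section could be skipped.
must_understand_code = not is_optional(b"CODE")
can_skip_stats = is_optional(b"stAT")
```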

On Fri, 25 Apr 2008 12:30:06 -0300, Facundo Batista
<facundobatista-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote:

> 2008/4/25, Gabriel Genellina  
> <gagsl-py2-/E1597aS9LQMlKAeRRkD2Q at public.gmane.org>:

>>  The problem is that this format is *too* simple.
>>  It can't be changed, nor can it accommodate other 
>>  fields if desired. I propose using a more 
>>  flexible

> But how do you think that these extended pyc's 
> will be used? I mean, are there use cases for this
> more complex pyc? Or they just will be more 
> complex, but with the same information as 
> before, for years, because nobody needs this 
> flexibility?

.pyc and .pyo files could be merged into a single 
file, always using the .pyc extension. This way the 
logic to locate the right file to load would be 
simpler (at the cost of making it more complex to 
locate the *section* inside the file that must be 
loaded!). In the past, zipimport got it wrong in 
some cases (see http://bugs.python.org/issue1346572).

Another example is python -U; it changes the magic 
number, so modules compiled in this mode are 
incompatible with modules compiled in the "normal" 
mode. If a mechanism like this proposal had existed 
in the past, both variants could have been stored 
in the same .pyc file.
(I think that nobody *really* uses python -U, but 
the same argument applies to any alternate code 
generation method: pyc files are unable to contain 
more than one code variant at a time)

>>  Anyway the change is "safe", in the sense that 
>>  any old code expecting the MAGIC number in the 
>>  first 4 bytes will reject the new format as 
>>  invalid and not process it.

> Maybe what we can do here is that, for some Python
> versions (say, 3.0, and maybe 3.1), the importer 
> will try to import in the new form, and if it 
> recognizes it as invalid *and* finds a known 
> MAGIC number in the first 4 bytes, just import it
> the old-fashioned way.

That could be done, but why? Isn't it the same 
situation as a change in the magic number? That 
invalidates all existing .pyc files and they all 
must be recompiled. If this new .pyc format were 
implemented, it's the same thing; all existing .pyc 
files must be recompiled. Old .pyc files have always
been discarded, and the same should apply to this 
new format, I think.

> Yes, you should start writing a PEP (any help you 
> need here, we can talk about it in the next Python
> Argentina meeting, ;).

(Mmm, I would not rely on that, given the past 
statistics... :-( )

On Fri, 25 Apr 2008 11:35:48 -0300, Mike Meyer
<mwm-tkOQc4lHIczYtjvyW6yDsg at public.gmane.org> wrote:

>> - A section has an identifier (4 bytes, usually 
>> ASCII letters), followed by its size (4 bytes, 
>> not counting the section identifier nor the size
>> itself), followed by the actual section content.

> AKA Tag/Length/Value triples. While TLV is the 
> common order, it's slightly easier to deal with 
> them if you go with LTV. You *have* to deal with 
> the length in order to read things in. Beyond 
> that, you can treat TV as atomic if you don't care 
> about the tag for some reason.

Ok, the proposed order was that of the RIFF format, 
and the only reason I chose it is because it can be 
read using the chunk.py standard module. But it's 
not a very convincing argument. I like LTV more.
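
A minimal sketch of what length-first reading could look like. The
assumption here (not something stated in the thread) is that the
length counts the 4-byte tag plus the value, so a reader can pull
in tag and value as one blob; pack_ltv and read_ltv are hypothetical
helper names.

```python
import io
import struct


def pack_ltv(tag: bytes, value: bytes) -> bytes:
    """Length first; here the length counts the 4-byte tag plus the value."""
    return struct.pack("<I", len(tag) + len(value)) + tag + value


def read_ltv(stream):
    """Yield (tag, value) pairs from a stream of length-first records."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # end of stream
        (length,) = struct.unpack("<I", header)
        record = stream.read(length)  # tag and value arrive as one blob
        yield record[:4], record[4:]


stream = io.BytesIO(
    pack_ltv(b"VERS", b"\x01\x00\x00\x00") + pack_ltv(b"xtra", b"ignored")
)
records = list(read_ltv(stream))
```

A reader that does not care about a record never has to look inside
the blob at all, which is the convenience Mike describes.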

>> - 32 bits should be enough for all sizes (and 
>> 640k should be enough for all people...)

> Given that there are people who write code that 
> writes code, and the memory and disk capacities 
> of modern systems, I'd say this is likely to cause
> problems. Given those capacities, 8 byte lengths 
> instead of 4 shouldn't be a problem. For embedded 
> devices - well, they're not going to like the idea
> in the first place.

I've found that using more than 32 bits would 
require changes in other places too, including 
the marshal format.
According to this thread from last year 
http://mail.python.org/pipermail/python-dev/2007-May/073157.html
looks like huge code objects are not supported, 
unless something has changed in the meantime.
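
For reference, this is where the limit of a 4-byte size field sits,
shown with the struct module. It is only an illustration of the
field widths under discussion and says nothing about what marshal
itself accepts.

```python
import struct

MAX_U32 = 2**32 - 1  # largest size a 4-byte unsigned field can hold

four_byte = struct.pack("<I", MAX_U32)
try:
    struct.pack("<I", MAX_U32 + 1)
    fits_in_32 = True
except struct.error:
    fits_in_32 = False  # one byte more no longer fits

eight_byte = struct.pack("<Q", MAX_U32 + 1)  # an 8-byte field has room to spare
```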

-- 
Gabriel Genellina

Gabriel Genellina
Softlab SRL

From gagsl-py2 at yahoo.com.ar  Tue Apr 29 11:44:38 2008
From: gagsl-py2 at yahoo.com.ar (gagsl-py2 at yahoo.com.ar)
Date: Tue, 29 Apr 2008 06:44:38 -0300 (ART)
Subject: [Python-ideas] A new .pyc file format
In-Reply-To: <fb6fbf560804260921m383f43d4n45aab7a0b3ac4e7a@mail.gmail.com>
Message-ID: <386677.47115.qm@web32807.mail.mud.yahoo.com>

On Sat, 26 Apr 2008 13:21:59 -0300, Jim Jewett
<jimjjewett-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org> wrote:
> On 4/25/08, Gabriel Genellina
> <gagsl-py2-/E1597aS9LQMlKAeRRkD2Q at public.gmane.org> wrote:
>
>>  The problem is that this [pyc] format is *too* 
>>  simple. It can't be changed, nor can it accommodate 
>>  other fields if desired.
>
> Why do you need to?  Except for bootstrapping, 
> can't you make all these changes with a custom 
> loader/importer?
> 
> Shipping python with default support for a new 
> format may be reasonable as well -- the 
> interpreter already handles both pure python and 
> extension modules.  Even hooking it in as an 
> alternate generated format just extends the 
> pyo/pyc decision.

I want to unify pyc+pyo so they coexist in the same
file.

> Or were you suggesting that the stdlib should use 
> this new format by default, or even strictly 
> instead of the current format?  If so, what are 
> the advantages in the normal case?  (Deferring 
> the load of docstrings?  Better categorization by 
> some external tool?)

Yes, the idea is to *replace* the current format
completely, not to add another alternative.
There are now 4 different code variants: using -O or
not, and using -U or not; the first one changes the 
file extension, the second changes the magic number.
If the .pyc format could handle more than one 
variant, they all could be stored in a single file. 
Other kinds of data can be stored too.
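
To make the "several variants, one container" idea concrete, here is
a small sketch. It only covers the -O axis, and it assumes a Python
where compile() accepts the optimize argument; the variants dict and
load_variant are hypothetical stand-ins for sections stored in one
file, and an importer would presumably pick the level from
sys.flags.optimize.

```python
SOURCE = "assert False, 'debug-only check'\nresult = 'loaded'\n"

# Two code variants of the same source, keyed by optimization level.
variants = {
    0: compile(SOURCE, "<example>", "exec", optimize=0),  # asserts kept
    1: compile(SOURCE, "<example>", "exec", optimize=1),  # asserts stripped
}


def load_variant(optimize_level: int) -> dict:
    """Execute the variant matching the requested level; return its namespace."""
    namespace = {}
    exec(variants[1 if optimize_level else 0], namespace)
    return namespace


optimized_ns = load_variant(1)
```

The non-optimized variant raises AssertionError when loaded, while
the optimized one runs to completion, so the two really are
different code objects that today would live in separate files.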

Deferring docstrings can't be done without changing 
the marshal format, and that's out of the scope of 
this proposal (for now).

-- 
Gabriel Genellina

Gabriel Genellina
Softlab SRL

From gagsl-py2 at yahoo.com.ar  Tue Apr 29 12:02:13 2008
From: gagsl-py2 at yahoo.com.ar (gagsl-py2 at yahoo.com.ar)
Date: Tue, 29 Apr 2008 07:02:13 -0300 (ART)
Subject: [Python-ideas] A new .pyc file format
Message-ID: <349638.45938.qm@web32801.mail.mud.yahoo.com>

On Mon, 28 Apr 2008 11:30:52 -0300, Calvin Spealman
<ironfroggy-+z4E91SeU6Qg+LkckTzjXg at public.gmane.org> wrote:

> I'll play my part here and toss out some ideas we could use this for.
> I'm not really advocating it, yet, but I'll say I am +0. In either
> case, if we did, here are some possible uses:
>
> We could break up the code into multiple sections and allow
> alternatives for sections with different versions. Different versions
> could be used for a few different things, including different
> optimization levels, supporting multiple bytecode versions, or
> storing both pre and post AST transformations of the code.

Thanks for the examples! The idea was to allow storing
such things, but I could not think of a concrete
alternative example.

> Meta-ish data like docstrings could be pulled out of the code objects
> and injected in non-optimized modes. This might also include the
> original source for the code, which could be helpful when you change
> the source and still-running code produces tracebacks that show
> invalid lines.
>
> Bookkeeping data could sit in some sections, detailing things like
> call stats (average call time, frequency, etc) and other information
> that could be useful for optimizers and JIT compilers like psyco.

Do you mean, to update the file after the code is
executed, to store such statistics? When would it be
done?

-- 
Gabriel Genellina