Modifying the {} and [] tokens

Sat Aug 23 12:37:33 EDT 2003

On Sat, 23 Aug 2003 13:56:13 +0000 (UTC), kern at taliesen.caltech.edu
(Robert Kern) wrote:

>The parser compiles {} into the BUILD_MAP opcode which the eval loop 
>interprets by calling PyDict_New(). This creates a dict object
>independently of whatever __builtins__.dict is bound to.
>
>I discovered this by disassembling some code and looking up the opcode
>in the bytecode interpreter eval loop (Python/ceval.c).
>
>>>> import dis
>>>> code = compile("{}", "<test>", "single")
>>>> dis.dis(code)
>  1           0 BUILD_MAP                0
>              3 PRINT_EXPR
>              4 LOAD_CONST               0 (None)
>              7 RETURN_VALUE

This is very cool; I've never looked at this before.  It will be a
helpful learning tool for seeing how Python does things.  Thanks!  :)
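
A minimal sketch of the behaviour Robert describes, assuming a current
Python 3 interpreter, where the module is spelled builtins (in the 2.x
series it is __builtin__).  MyDict is a hypothetical subclass used only
for illustration.  Rebinding the builtin name changes what dict()
returns, but the {} literal should still produce a plain dict:

```python
import builtins


class MyDict(dict):
    """Hypothetical subclass, present only for this illustration."""


original_dict = builtins.dict   # keep a handle on the real type
builtins.dict = MyDict          # rebind the builtin name
try:
    literal = {}                # compiled to BUILD_MAP, ignores the name
    called = dict()             # looks the name up, so it finds MyDict
finally:
    builtins.dict = original_dict   # always restore the real type

literal_type = type(literal)
called_type = type(called)
```

The real type is saved first and restored in the finally block so that
nothing else in the process keeps seeing the rebinding.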

>Reading the Language Reference can help a lot, too.

I searched around in it, but didn't read the parts where I wasn't
finding answers.  Mostly I just found the grammar and a little
explanation of the parsing, but not how it kicks off operations based
on these tokens.

>I think that's precisely the reason why this feature will probably
>never make it into Python. I, for one, don't want a module that I import
>to change the meaning of a literal just because that module wants to use
>a funky form of dict internally.

I can definitely understand that point of view, though if it stays
restricted to the module you are importing it into, it doesn't seem
like it would be very destructive.  It would also aid in making other
types of special purpose languages out of Python itself (instead of
having to create parsers or something).

True, could lead to messiness and confusion, but also could lead to
beauty and elegance.  Just a question of how much rope it's acceptable
to be given.

>In the end, it's just not worth the trouble. 99.9% of the time, I think
>you will find that subclassing the builtin types and giving the new
>classes short names will suffice. I mean, how much of a hardship is it
>to do the following:
>
>class d(dict):
>    def __add__(self, other):
>        # stuff
>
>d({1:2}) + d({3:4})

It's not just the hardship of doing this; there is also the
conversion of everything that doesn't create a MyDict by default.  I
would have to keep converting them to perform the same manipulations,
or to use other functions (add is only one important example; others
could be handy as well).

Now I will need additional lines to do this, or make the lines that
exist more complicated.  It's not undue hardship in a lot of ways, but
it would be nice not to have to, and to be able to keep the characters
used to a minimum.  Especially when trying to write sub-50 line scripts
and such.
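
A rough sketch of the conversion burden described above, written for a
current Python 3 interpreter.  MyDict and its key-clobbering __add__
are assumptions made for illustration, not anything that exists in the
language:

```python
class MyDict(dict):
    """Hypothetical dict subclass with a key-clobbering __add__."""

    def __add__(self, other):
        merged = MyDict(self)
        merged.update(other)    # keys from `other` win on conflict
        return merged


# Works when the left operand is already a MyDict.
combined = MyDict({1: 2}) + {3: 4}

# A plain literal on the left has no __add__, so it must be wrapped first.
try:
    {1: 2} + MyDict({3: 4})
    literal_add_failed = False
except TypeError:
    literal_add_failed = True

# Inherited methods such as copy() hand back a plain dict again.
copied = MyDict({1: 2}).copy()
needs_rewrap = type(copied) is dict
```

Every literal, and every result of an inherited method that builds a
new mapping, has to be wrapped in MyDict(...) again before + can be
used on it.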

>If you're still not convinced, ask yourself these questions:

I actually feel confident I would be happy with my provided answers as
real functionality.  I don't believe Python will be changed to allow
that, and that is fine with me.  I just wanted to know if it could be
done the way Python is now.

>* How would you apply the new subclass?
>  - Only on new literals after the subclass definition (and registry of
>    the subclass with some special hook)?
>  - On every new object that would normally have been the base type?

Yes and yes.

I expect that, if this were possible at all, it could be made to work
only for code that imported it and asked for it.  Explicitly requested
only, etc.

>  - On every previously existing object with the base type?

No, they're already defined.  Unless my understanding of the way types
are done is lacking, and they are referencing the same base object
code, and then maybe yes to this as well, but it would be passive.

>  - In just the one module? or others which import it? or also in
>    modules imported after the one with the subclass?

In the module, and any that import it.  It fails to be useful in
making things clean-and-neat if in order to make a clean-and-neat
script you have to first have a bunch of stuff that modifies the
language norms.

>* How does your choice above work with code compiled on-the-fly with
>  eval, exec or execfile?

It would depend on how Python allowed this change to work.  If it did
it the way "I imagine" it would, then the change would affect all of
these circumstances.
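
For comparison, a small sketch of how code compiled on the fly behaves
as things stand, assuming Python 3's exec() function (exec was a
statement in 2003).  MyDict is again a hypothetical subclass.  Binding
the name dict in the namespace handed to exec() changes the call, not
the literal:

```python
class MyDict(dict):
    """Hypothetical subclass, present only for this illustration."""


# The executed code sees `dict` bound to MyDict in its globals.
namespace = {"dict": MyDict}
exec("literal = {}\ncalled = dict()", namespace)

literal_type = type(namespace["literal"])   # built by the literal syntax
called_type = type(namespace["called"])     # built by the name lookup
```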

>* How do you deal with multiple subclasses being defined and registered?

If they re-wrote the same thing and didn't just augment it (or
overwrote each other), then I would expect them to clobber each other.
Problems could ensue.

However, why isn't having {} + {} or [].len() part of the language
anyway?  If it were, people wouldn't have to keep writing their own.

{} + {} doesn't have to solve ALL cases, which is why I'm assuming it
isn't already implemented as [] + [] is.  It just has to solve the
basic case and provide the mechanism for determining the difference.
For example, a key-clobbering add, and an intersect() function to see
what will be clobbered and do something about it if you care.
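
The "basic case" above could be sketched as two plain functions.  The
names add_dicts and intersect, and the choice that the right-hand
operand wins, are assumptions for illustration only:

```python
def add_dicts(left, right):
    """Return a new dict; on shared keys the value from `right` wins."""
    merged = dict(left)
    merged.update(right)
    return merged


def intersect(left, right):
    """Return the keys of `left` that an add with `right` would clobber."""
    return [key for key in left if key in right]


defaults = {"host": "localhost", "port": 80}
overrides = {"port": 8080, "debug": True}

clobbered = intersect(defaults, overrides)   # check before adding
merged = add_dicts(defaults, overrides)
```

Calling intersect() first lets a caller who cares about the clobbered
keys handle them before the add; a caller who doesn't care just adds.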

Isn't it better to have some default solution than to have none and
everyone keeps implementing it themselves anew every time?

For the [].len() type of thing, this is obviously a matter of taste,
and I have always been OK with len([]), but in trying to get teammates
to work with Python I have to concede that [].len() is more consistent
and a good thing.  Same for other builtins that apply.
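
A minimal sketch of what this looks like with subclassing as it stands
today.  MethodList is a hypothetical name; the len() method simply
forwards to the builtin:

```python
class MethodList(list):
    """Hypothetical list subclass that exposes len() as a method."""

    def len(self):
        return len(self)


items = MethodList([1, 2, 3])
method_length = items.len()
builtin_length = len(items)

# A bare [] literal is still a plain list, so it has no .len() method.
literal_has_len = hasattr([], "len")
```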

>* How would you pass in initialization information?
>  E.g. say I want to limit the length of lists
>
>  class LimitList(list):
>    def __init__(self, maxlength, data=[]):
>      self.maxlength = maxlength
>      list.__init__(self, data)
>    def append(self, value):
>      if len(self) == self.maxlength:
>        raise ValueError, "list at maximum length"
>      else:
>        list.append(self, value)

I am actually more interested in adding new features than in modifying
current ones, but I suppose you would have this ability as well, since
you have access.

I would say if you break the initialization, or any other method of
accessing the type in the default way, then you have just given
yourself broken code.  No other function that uses it, and didn't
expect the change, will work properly.
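
A sketch of that breakage, using a Python 3 spelling of the quoted
LimitList (raise with parentheses, and data=None in place of the
mutable default).  reversed_copy is a hypothetical generic helper that
assumes the usual list(iterable) constructor:

```python
class LimitList(list):
    """Python 3 spelling of the quoted class, for illustration."""

    def __init__(self, maxlength, data=None):
        self.maxlength = maxlength
        list.__init__(self, data if data is not None else [])

    def append(self, value):
        if len(self) == self.maxlength:
            raise ValueError("list at maximum length")
        list.append(self, value)


def reversed_copy(seq):
    """Hypothetical helper: rebuild `seq` reversed, keeping its type."""
    return type(seq)(reversed(seq))


plain_result = reversed_copy([1, 2, 3])

limited = LimitList(3, [1, 2, 3])
# The iterator lands in `maxlength` and the data is silently dropped.
limited_result = reversed_copy(limited)
```

With the changed constructor signature the helper does not raise; it
quietly returns an empty LimitList whose maxlength is an iterator.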

You will see it doesn't work, and stop.  I didn't say I wanted to
change ALL the behavior of the language, I just wanted to add features
that don't currently exist and that I think should.

>* Can I get a real dict again if I wanted to?

Save it; don't clobber the original out of existence.  The coder could
mess this up (more rope).

>* Given your choices above, how can you implement it in such a way that
>  you don't interfere with other people's code by accident?

There could be module limitations, but when I imagined this it was
globally reaching.  I believe that as long as you only add features,
and any changes you make don't break things, then you will be safe.
If you mess with features that already exist then you will break
things and your programs will not work.

This seems natural enough to me.  You ARE changing a default behavior,
there are risks, but if you don't change things that other functions
use (pre-existing methods), then you are safe.

>Okay, so the last one isn't really fair, but the answers you give on the
>other questions should help define in your mind the kind of behavior you
>want. Reading up on Python internals with the Language Reference, the
>dis module documentation, and some source code should give you an idea
>of how one might go about an implementation and (more importantly) the
>compromises one would have to make. 

It seems the changes would have to be to the Python source, and I don't
want to make a new Python.  I don't even want to "change" Python, I
want to augment it.  I think these are reasonable additions, and I'm
sure I missed the discussions on why they aren't there now.

There may be very good reasons, and I could end up retracting my
desired feature set because of them, but I still wanted to see if I
could make it work for my own code.

>Then compare these specific features with the current way. Which is
>safer? Which is more readable by someone unfamiliar with the code? Which
>is more flexible? Which saves the most typing?

I think my answers above are sound.  If it worked the way I hoped it
might, you could change things on a global scale and you still
wouldn't break anything.

>My condolences on having to deal with such a team. It seems a little
>silly to me to equate OO with Everything-Must-Be-a-Method-Call. But if
>they insist, point them at [].__len__(). len([]) just calls [].__len__()
>anyways. After writing code like that for a while, they'll probably get
>over their method fixation.

I never had a problem with this myself, but people are different.  I
have a great team, everyone has their own opinions.  No need to
belittle them for differing.

Some people never get over the whitespacing.  It's just the way
people work.  No need to see them as flawed.  :)

-Geoff Howland
http://ludumdare.com/