[Python-3000] enhanced descriptors

Sat Jun 10 16:43:53 CEST 2006

well, adding bytecodes is out-of-the-question for me.

i did think of doing a position-proxy class, but it has lots of
drawbacks as well:
* lots of methods to implement (to make it look like an int)

* lazy evaluation -- should only perform tell() when requested,
not before. for example, calling __repr__ or __add__ would have
to tell(), while __iadd__ would not... nasty code

* it would be slower: adding logic to __set__, and an int-like
object (never as fast as a real int), etc.

* and, worst of all, it would have unavoidable undesired
behavior:

desired behavior:
    f.position += 2

undesired behavior:
    p = f.position
    p += 2     # this would seek()!!!

any good solution would require lots of magic, so i guess
i'm just gonna drop the += optimization. two system calls
are not worth writing such ugly code.
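
a rough sketch of the proxy idea and its flaw, for illustration only.
`PositionProxy` and `File` are hypothetical names, and io.BytesIO stands
in for a real file. only __int__ and __iadd__ are implemented; a real
proxy would need the whole int interface.

```python
import io


class PositionProxy:
    """hypothetical int-like proxy; only __int__ and __iadd__ are shown."""

    def __init__(self, fileobj):
        self._file = fileobj

    def __int__(self):
        # lazy: tell() runs only when the value is actually needed
        return self._file.tell()

    def __iadd__(self, offset):
        # one relative seek, no tell()
        self._file.seek(offset, io.SEEK_CUR)
        return self


class File:
    """hypothetical wrapper exposing `position` over a seekable stream."""

    def __init__(self, raw):
        self._raw = raw

    @property
    def position(self):
        return PositionProxy(self._raw)

    @position.setter
    def position(self, value):
        # the write-back after += hands us the proxy itself: nothing to do
        if not isinstance(value, PositionProxy):
            self._raw.seek(int(value))


f = File(io.BytesIO(b"0123456789"))
f.position += 2      # desired: a single relative seek
p = f.position
p += 2               # undesired: augmenting a plain name also seeks
print(int(f.position))
```

the setter has to recognize the proxy to avoid a second seek, and the
`p += 2` line still moves the file, which is exactly the problem.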

the solution must come from "enhanced descriptors".

- - - - - - - - -

> What would be needed is to combine the attribute access
> and += operator into a single "add to attribute" operation.
> So there would be an ADD_TO_ATTRIBUTE bytecode, and a
> corresponding __iaddattr__ method or some such implementing
> it.

i'm afraid that's not possible, because the compiler can't tell
that x.y+=z is a descriptor assignment.

> Then of course you'd want corresponding methods for all
> the other inplace operators applied to attributes. And
> probably a third set for obj[index] += value etc.

no, i don't think so. indexing should first __get__ the object,
and then index it. these are two separate operations. only the
inplace operators should be optimized into one function.

- - - - - - - - -

> It might be worth writing a PEP about this.

well, you asked for it :)

preliminary pep: STORE_INPLACE_ATTR

today, x.y += z is translated to
x.__setattr__("y",  x.__getattr__("y") +/+=  z)
depending on y (if it supports __iadd__ or only __add__)
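
to make the current behavior concrete, here is a small sketch with a
logging data descriptor. `Traced` and `X` are made-up names for this
example; the point is only that one += costs a __get__ plus a __set__.

```python
class Traced:
    """hypothetical data descriptor that logs every access."""

    def __init__(self):
        self.log = []

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self          # class-level access returns the descriptor
        self.log.append("get")
        return obj.__dict__.get("_y", 0)

    def __set__(self, obj, value):
        self.log.append("set")
        obj.__dict__["_y"] = value


class X:
    y = Traced()


x = X()
x.y += 5
calls = list(X.y.log)    # snapshot before any further reads
print(calls)
```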

the proposed change is to replace this scheme by
__setiattr__ - set inplace attr. it takes three arguments:
name, operation, and value. it is invoked by the new
bytecode instruction: STORE_INPLACE_ATTR.

the new instruction's layout looks like this:
TOS+2: value
TOS+1: operation code (1=add, 2=sub, 3=mul, ...)
TOS: object
STORE_INPLACE_ATTR <nameindex>

the purpose of this new special method is to optimize the
inplace operators, for both normal attributes and descriptors.

examples:
for normal assignment, the normal behavior is retained
x.y = 5 ==> x.__setattr__("y", 5)

for augmented assignment, the inplace version (__setiattr__)
is used instead:
x.y += 5  ==> x.__setiattr__("y", operator.add, 5)

the STORE_INPLACE_ATTR instruction would convert the
operation code into the corresponding function from the
`operator` module, to make __setiattr__ simpler.

descriptors:
the descriptor protocol is also extended with the __iset__
method -- inplace __set__.
if the attribute is a data descriptor, __setiattr__ will try
to call __iset__; if it does not exist, it would default to
__get__ and then __set__.

sketch implementation:

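def __setiattr__(self, name, op, value):
    # find the raw class attribute, so a descriptor's __get__
    # is not triggered by the lookup itself
    attr = None
    for cls in type(self).__mro__:
        if name in cls.__dict__:
            attr = cls.__dict__[name]
            break

    # descriptors
    if hasattr(attr, "__iset__"):
        attr.__iset__(self, op, value)
        return

    if hasattr(attr, "__set__"):
        result = op(attr.__get__(self, type(self)), value)
        attr.__set__(self, result)
        return

    # normal attributes
    attr = getattr(self, name)
    inplace_op_name = "__i%s__" % (op.__name__,) # ugly!!
    if hasattr(attr, inplace_op_name):
        # store the result back, as += does today
        setattr(self, name, getattr(attr, inplace_op_name)(value))
    else:
        setattr(self, name, op(attr, value))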
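
a usage sketch for the file-position case, under some assumptions:
`Position` and `File` are hypothetical, io.BytesIO stands in for a real
file, and since no STORE_INPLACE_ATTR bytecode exists, __setiattr__ is
called by hand. the `calls` list only records which file operations
ran.

```python
import io
import operator


class Position:
    """hypothetical data descriptor with the proposed __iset__ hook."""

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        obj.calls.append("tell")
        return obj.raw.tell()

    def __set__(self, obj, value):
        obj.calls.append("seek")
        obj.raw.seek(value)

    def __iset__(self, obj, op, value):
        # only += and -= map onto a single relative seek
        if op is operator.add:
            offset = value
        elif op is operator.sub:
            offset = -value
        else:
            # any other operation: plain get-then-set
            self.__set__(obj, op(self.__get__(obj, type(obj)), value))
            return
        obj.calls.append("seek_cur")
        obj.raw.seek(offset, io.SEEK_CUR)


class File:
    position = Position()

    def __init__(self, raw):
        self.raw = raw
        self.calls = []

    def __setiattr__(self, name, op, value):
        # simplified: only the __iset__ branch of the sketch
        descr = type(self).__dict__[name]
        descr.__iset__(self, op, value)


f = File(io.BytesIO(b"0123456789"))
f.position += 2                                # today: tell() then seek()
f.__setiattr__("position", operator.add, 2)    # proposed: one relative seek
print(f.calls, f.raw.tell())
```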

issues:
should it be just one special method (__setiattr__) or a
method per-operation (__setiadd__, __setisub__)?
multiple methods mean each method is simpler, but also
cause code duplication and lots of new method slots.

notes:
if the STORE_INPLACE_ATTR instruction does not find
__setiattr__, it can always default to __setattr__, the same
way it's done today.
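
a pure-python stand-in for what the instruction would do, as a sketch
only. `store_inplace_attr`, `OPS`, `Plain` and `Hooked` are hypothetical
names. the table holds just the three operation codes listed earlier,
and the fallback uses the plain operator rather than the in-place
variant, to keep it short.

```python
import operator

# only the three codes the draft spells out; the rest are left open
OPS = {1: operator.add, 2: operator.sub, 3: operator.mul}


def store_inplace_attr(obj, name, opcode, value):
    """hypothetical stand-in for the STORE_INPLACE_ATTR instruction."""
    op = OPS[opcode]
    hook = getattr(type(obj), "__setiattr__", None)
    if hook is not None:
        hook(obj, name, op, value)
    else:
        # fallback: a get followed by a set (simplified, see lead-in)
        setattr(obj, name, op(getattr(obj, name), value))


class Plain:
    y = 10


class Hooked:
    def __init__(self):
        self.seen = []

    def __setiattr__(self, name, op, value):
        self.seen.append((name, op.__name__, value))


p = Plain()
store_inplace_attr(p, "y", 1, 5)     # p.y += 5 via the fallback
h = Hooked()
store_inplace_attr(h, "y", 3, 4)     # h.y *= 4 via the hook
print(p.y, h.seen)
```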

-tomer

On 6/10/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> tomer filiba wrote:
>
> > so my suggestion is as follows:
> > data descriptors must define __get__ and __set__. if they
> > also define one of the inplace-operators (__iadd__, etc),
> > it will be called instead of first __get__()ing and then
> > __set__()ing.
> >
> > however, the inplace operators would have to use a different
> > signature than the normal operators -- instead of
> > __iadd__(self, other)
> > they would be defined as
> > __iadd__(self, obj, value).
>
> This could be done, although it would require some large
> changes to the way things work. Currently the attribute
> access and inplace operation are done by separate bytecodes,
> so by the time the += gets processed, the whole descriptor
> business is finished with.
>
> What would be needed is to combine the attribute access
> and += operator into a single "add to attribute" operation.
> So there would be an ADD_TO_ATTRIBUTE bytecode, and a
> corresponding __iaddattr__ method or some such implementing
> it.
>
> Then of course you'd want corresponding methods for all
> the other inplace operators applied to attributes. And
> probably a third set for obj[index] += value etc.
>
> That's getting to be a ridiculously large set of methods.
> It could be cut down considerably by having just one
> in-place method of each kind, parameterised by a code
> indicating the arithmetic operation (like __richcmp__):
>
>     Syntax                Method
>     obj.attr OP= value    obj.__iattr__(op, value)
>     obj[index] OP= value  obj.__iitem__(op, value)
>
> It might be worth writing a PEP about this.
>
> Getting back to the problem at hand, there's another way
> it might be handled using current Python. Instead of a
> normal int, the position descriptor could return an
> instance of an int subclass with an __iadd__ method that
> manipulates the file position.
>
> There's one further problem with all of this, though.
> Afterwards, the result of the += is going to get assigned
> back to the position property. If you want to avoid
> making another redundant system call, you'll somehow
> have to detect when the value being assigned is the
> result of a += and ignore it.
>
> --
> Greg
>