Access the name of the variable that is being assigned

Hi,

Is it possible at all to define a class in Python that can read the name of the variable it is assigned to on init?

For this to work, SomeClass.__init__ needs to know which variable name is currently waiting to be assigned. But at the time __init__ is executed, MyObject has not been entered into the global or local namespace yet. I know that this is possible to do at the AST level, but the AST is inaccessible when the program is running, so what is the corresponding structure to track that? (Links to source are appreciated.)

1. Is it possible to do this in CPython and PyPy?
2. Is it possible to do this in a generic way?
3. Is there any stack of assignments?
3.1. Is this stack accessible?

Thanks.
--
anatoly t.

On Tue, Jul 15, 2014 at 5:05 PM, anatoly techtonik <techtonik@gmail.com> wrote:
I think in general a normal object in Python does not have a name, and there's nothing special about the name of the variable it is assigned to first. To see why it is not going to work, what do you expect your print function to do if the object is created like

    some_function(SomeClass())

or

    some_other_object.some_attribute = SomeClass()

or

    some_variable = another_variable = SomeClass()

or

    some_variable = (SomeClass(),)

or even

    SomeClass() # not assigning to anything

etc.

I guess it would be better if you can describe what you really want to do.

Hello,

Is there any way to have fast dispatch based on the type of a variable? I'm talking about code of the form:

    t = type(var)
    if t is int:
        i(var)
    elif t is long:
        l(var)
    elif t is float:
        f(var)
    elif t is str:
        s(var)
    elif t is unicode:
        u(var)
    ...

I have tried these ideas:

- Having the types as keys in a dict and the functions as lambdas.
- Creating a list from min(type_hashes) to max(type_hashes) (with lambdas as list values) and indexing into it with hash(var_type) - min(type_hashes).

But both were slower than the multiple ifs.

The ideal case would be to have an optimization like the one C/C++ compilers apply to switch statements, where they create a binary search over the multiple cases, like below.

Assuming that hash(int) < hash(long) < hash(float) ...:

    t = hash(type(var))
    if t < hash(float):
        if t < hash(long):
            i(var)
        else:
            l(var)
    else:
        if t < hash(unicode):
            s(var)
        else:
            u(var)

The problem in Python is that the order of the type hashes is not constant, so it is not possible to write the binary search code by hand.

Kind regards,
l.
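For reference, the dict-based variant mentioned above can be sketched as follows. This is only a minimal sketch written for Python 3 (so `long` and `unicode` are replaced here by `bytes`), and the handler names are hypothetical stand-ins for i(), f(), s() and u(). It says nothing about relative speed, which has to be measured on the interpreter in question.

```python
# Hypothetical handlers standing in for i(), f(), s(), u() in the message above.
def handle_int(v):
    return "int:%d" % v

def handle_float(v):
    return "float:%.1f" % v

def handle_str(v):
    return "str:" + v

def handle_bytes(v):
    return "bytes:%d" % len(v)

# Exact-type lookup table. Subclasses (e.g. bool) are deliberately not
# matched, just as with the "t is int" tests in the if/elif ladder.
HANDLERS = {
    int: handle_int,
    float: handle_float,
    str: handle_str,
    bytes: handle_bytes,
}

def dispatch(var):
    """Call the handler registered for the exact type of var."""
    try:
        handler = HANDLERS[type(var)]
    except KeyError:
        raise TypeError("no handler for %s" % type(var).__name__)
    return handler(var)
```

Unlike the if/elif ladder, this version fails loudly on an unregistered type instead of silently doing nothing.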

Hi, On 15 July 2014 18:37, Eleytherios Stamatogiannakis <estama@gmail.com> wrote:
This should already give you the fastest possible execution on PyPy, because the first type inspection should promote the type in the JIT. All subsequent "if" checks are constant-folded. However, to be sure, you need to check with jitviewer. Note however that if all paths are eventually compiled by the JIT, the promotion will have a number of different cases, and searching through them is again done by linear search for now. This can be regarded as a bug waiting for improvement. A bientôt, Armin.

On 15/7/2014 8:28 μμ, Armin Rigo wrote:
The above code gets hit millions of times with different variable types. So in our case all paths are compiled and we are linear.

Another idea that I have is the following. At startup I could sort all the hash(types), create (in a string) a Python function that does a binary search over them, and eval it. Would the JIT be able to handle eval gymnastics like that?

Thank you.
l.
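The "generate the binary search at startup" idea could look roughly like the sketch below. It is written for Python 3 and uses exec rather than eval, because a def statement is not an expression. The names `build_dispatcher`, `emit` and `f0`, `f1`, ... are made up for illustration. It assumes that distinct types have distinct hashes within one process, and it does not detect unregistered types. Whether a JIT handles the generated function any better than the plain if/elif ladder is not something this sketch can answer.

```python
def build_dispatcher(handlers):
    """Generate a binary-search dispatch function over hash(type(var)).

    `handlers` maps exact types to one-argument functions. Unregistered
    types are NOT detected: they end up in some neighbouring handler.
    """
    if not handlers:
        raise ValueError("need at least one handler")
    # The ordering is only valid inside this process, so it is computed here,
    # at startup, instead of being hard-coded in the source.
    items = sorted(handlers.items(), key=lambda kv: hash(kv[0]))
    keys = [hash(t) for t, _ in items]

    def emit(lo, hi, depth):
        # Emit source lines that dispatch among items[lo:hi].
        pad = "    " * depth
        if hi - lo == 1:
            return ["%sreturn f%d(var)" % (pad, lo)]
        mid = (lo + hi) // 2
        return (["%sif h < %d:" % (pad, keys[mid])]
                + emit(lo, mid, depth + 1)
                + ["%selse:" % pad]
                + emit(mid, hi, depth + 1))

    source = "\n".join(["def dispatch(var):",
                        "    h = hash(type(var))"]
                       + emit(0, len(items), 1))
    # The handlers are visible to the generated code as globals f0, f1, ...
    namespace = {"f%d" % i: f for i, (_, f) in enumerate(items)}
    exec(source, namespace)
    return namespace["dispatch"]
```

A call such as `build_dispatcher({int: i, float: f, str: s})` returns a function whose body is a nested if/else tree of depth about log2 of the number of registered types.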

Hi, On 15 July 2014 19:11, Elefterios Stamatogiannakis <estama@gmail.com> wrote:
You need to try. There are far too many variations to be able to give a clear yes/no answer. For example, a linear search through only 6 items is incredibly fast anyway. But here's what I *think* should occur with your piece of code (untested!):

    t = type(x)

Here, in this line, in order to get the application-level type, we need to know the exact RPython class of x. This is because the type is not always written explicitly: a Python-level int object, for example, is in RPython an instance of the class W_IntObject. We know that all instances of W_IntObject have the Python type 'int'; it doesn't need to be written explicitly as a field every time. So at the line above, there is promotion of the RPython class of x. Right now this is done with linear searching through all cases seen so far. If there are 5-6 different cases it's fine. (Note that RPython class != Python class in general, but for built-in types like int, str, etc. there is a one-to-one correspondence.) So at the line above, assuming that x is an instance of a built-in type, we end up with t being a constant already (a different one in each of the 5-6 paths).

    if t is int: ...
    elif t is long: ...

In all the rest of the function, the "if... elif..." are constant-folded away. You don't gain anything by doing more complicated logic with t.

A bientôt,
Armin.

On 16/07/14 17:31, Armin Rigo wrote:
Could this be made faster with binary search or jump tables, like what C++ compilers use to optimize switches? I also noticed that "if" ladders checking multiple "isinstance" calls happen a lot in Python's standard library. Maybe an optimization like that would generally improve the speed of PyPy?

Hi, On 17 July 2014 13:32, Eleytherios Stamatogiannakis <estama@gmail.com> wrote:
> Could this be made faster with binary search or jump tables, like what C++ compilers use to optimize switches?
Yes, it's a long-term plan we have to enable this in the JIT. Let me repeat again that for 6 items it's a bit unclear that it would be better, but it would definitely be an improvement if the number of cases grows larger. A bientôt, Armin.

On Tue, Jul 15, 2014 at 12:50 PM, Yichao Yu <yyc1992@gmail.com> wrote:
> some_function(SomeClass())

I don't need this case, so I can ignore it. But reliably detecting it, to distinguish it from other situations, would be nice.

> or some_other_object.some_attribute = SomeClass()

Detect that the name is an attribute and handle it distinctly if needed; for my purpose, print 'some_other_object.some_attribute'.

> or some_variable = another_variable = SomeClass()

Print the closest assigned variable name, i.e. 'another_variable'.

> or some_variable = (SomeClass(),)

There is no direct assignment. Don't need this.

> or even SomeClass() # not assigning to anything

Good to detect. Don't need. Actually, all the cases that I don't need are the same case as this one: there is no direct assignment to a variable.

> I guess it would be better if you can describe what you really want to do.

I described it. Or do you need a use case or user story? I think I want to link object instances to variable names so that those names can be printed (if possible) in __repr__ and debug messages, to save time on troubleshooting.
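For the debugging use case described above, one option that needs no interpreter support is to look the object up after the fact in a given namespace, by identity. The sketch below is only an illustration of that idea in Python 3. `names_of` and `describe` are hypothetical names, and the approach only sees plain names in the namespace it is handed (for example `globals()` or `vars(module)`), not attributes or container elements.

```python
def names_of(obj, namespace):
    """Return, sorted, every name in `namespace` currently bound to `obj`."""
    return sorted(name for name, value in namespace.items() if value is obj)


class SomeClass(object):
    def describe(self, namespace):
        # The names are looked up lazily, when the message is built,
        # so rebinding or deleting a variable is reflected automatically.
        names = names_of(self, namespace)
        return "<SomeClass bound to: %s>" % (", ".join(names) or "(no name)")
```

An object bound to several names reports all of them, and an object that was never assigned reports none, which covers the "no direct assignment" cases listed above.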

On Tue, Jul 15, 2014 at 5:05 PM, anatoly techtonik <techtonik@gmail.com> wrote:
This feature would be useful for things like namedtuple, where we currently have to write the name twice:

    record = namedtuple('record', 'a b c d')

But I'm not sure why Anatoly is asking here. It would be a change in the semantics of Python, and while I suppose it's possible for PyPy to lead the way with a semantic change for Python 3.5 or higher, or even an implementation-specific feature that other Pythons don't offer, I would expect that normally this idea should go through CPython first.

--
Steven
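A narrower form of this did later reach the language for class bodies only: since Python 3.6 (PEP 487), an object assigned as a class attribute gets its `__set_name__` method called with the owning class and the attribute name. The sketch below shows that hook. The class names `Named` and `Config` are hypothetical, and the hook is not invoked for ordinary module-level or local assignments, so it does not answer the general question.

```python
class Named(object):
    """Learns the attribute name it is bound to inside a class body."""

    def __init__(self):
        self.name = None  # filled in by __set_name__, if it is ever called

    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created (Python 3.6+).
        self.name = "%s.%s" % (owner.__name__, name)

    def __repr__(self):
        return "<Named %s>" % (self.name or "(unassigned)")


class Config(object):  # hypothetical owner class
    my_object = Named()


loose = Named()  # plain assignment: the hook is not called
```

Here `repr(Config.my_object)` includes the name `Config.my_object`, while `loose` stays unassigned.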

I guess this feature is mainly useful for debugging, since it is really hard to do it consistently in Python. And IMHO, for debugging, it might be more useful to record the position (file + line number etc.) where the object is created, which has less ambiguity and can probably already be done by tracing back the call stack in the constructor (until the most derived constructor, or even just record the whole call stack).

For not writing the name of a named tuple twice, won't it be a better idea to just use it as a base class, i.e.:

    class TupleName(named_tuple_base('a', 'b', 'c', 'd')):
        pass

or maybe even using the syntax and trick of the new Enum class introduced in Python 3.4 (or 3.3?):

    class TupleName(TupleBase):
        a = 1
        b = 1

IMHO, this fits the Python syntax better (if the current one is not good enough :) )

Yichao Yu

On Wed, Jul 16, 2014 at 10:03 AM, Steven D'Aprano <steve@pearwood.info> wrote:
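Recording the creation site in the constructor, as suggested above, can be sketched with the standard `traceback` module. This is a minimal Python 3 sketch under a simplifying assumption: it records only the immediate caller of `__init__`, so a subclass whose own `__init__` calls this one would record the subclass constructor rather than the user's line. `Tracked` and `make` are hypothetical names.

```python
import traceback


class Tracked(object):
    """Remembers the file, line and function where it was instantiated."""

    def __init__(self):
        # limit=2 keeps the two innermost frames, oldest first:
        # [caller of __init__, __init__ itself].
        caller = traceback.extract_stack(limit=2)[0]
        self.created_at = (caller.filename, caller.lineno, caller.name)

    def __repr__(self):
        return "<Tracked created at %s:%d in %s>" % self.created_at


def make():  # hypothetical factory, used only to give the caller a name
    return Tracked()
```

Storing the full result of `traceback.extract_stack()` instead of a single frame would give the "whole call stack" variant, at a higher cost per object.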

Hi Anatoly,

Haven't done any python in a loong time now, but I lurk sometimes on the pypy list, and thought ok, I'll play with that, and see if it's any use to you.

I wrote techtonik.py (included at the end) and used it interactively to show some features, first trying to approximate the interaction you gave as an example (well, except the ns. prefix ;-)

I may have gone a little overboard, tracking re-assignments etc ;-)

HTH

On 07/15/2014 11:05 AM anatoly techtonik wrote:
---
Python 2.7.3 (default, Jul 3 2012, 19:58:39)
[GCC 4.7.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
If you run techtonik.py, it runs a little test, whose output is:
----
[04:37 ~/wk/py]$ techtonik.py
'MyObject'
<SomeClass obj: (1, 2) {'three': 3} assigned to: 'MyObject'>
'MyObject', aka 'mycopy'
<SomeClass obj: (1, 2) {'three': 3} assigned to: 'MyObject', 'mycopy'>
'myinstance'
<SomeClass obj: ('same class new instance',) {} assigned to: 'myinstance'>
'a', aka 'b', aka 'c'
<SomeClass obj: ('multi',) {} assigned to: 'a', 'b', 'c'>
10 20 'a', aka '\\b', aka 'c'
<SomeClass obj: ('multi',) {} assigned to: 'a', '\\b', 'c'>
[04:37 ~/wk/py]$
----
Here is the techtonik.py source:
============================================
#!/usr/bin/python
# debugging ideas for anatoly
# 2014-07-16 00:54:06
#
# The goal is to make something like this:
#
# >>> MyObject = SomeClass()
# >>> print(MyObject)
# 'MyObject'
#
# we can do it with an attribute name space:
# if you can live with prefixing the name space
# name to the names you want to track assignments of:

class NameSetter(object):
    def __setattr__(self, name, val):
        # if our namespace already has the name being assigned ...
        if hasattr(self, name):
            # if that is a nametracking object like a SomeClass instance...
            tgt = getattr(self, name)
            if hasattr(tgt, '_names') and isinstance(tgt._names, list):
                # update its name list to reflect clobbering (prefix '\')
                for i,nm in enumerate(tgt._names):
                    if nm==name:
                        tgt._names[i] = '\\'+ nm
        # if value being assigned has a _names list attribute like SomeClass
        if hasattr(val, '_names') and isinstance(val._names, list):
            val._names.append(name) # add name to assignment list
        # now store the value, whatever the type
        object.__setattr__(self, name, val) # avoid recursive loop

class SomeClass(object):
    def __init__(self, *args, **kw):
        self.args = args
        self.kw = kw
        self._names = []
    def __str__(self):
        return ', aka '.join(repr(s) for s in (self._names or ["(unassigned)"]))
    def __repr__(self):
        return '<%s obj: %r %r %s>' %(
            self.__class__.__name__,
            self.args,
            self.kw,
            'assigned to: %s'%(
                ', '.join(repr(s) for s in (self._names or ["(unassigned)"]))))

def test():
    ns = NameSetter()
    ns.target = 'value'
    ns.MyObject = SomeClass(1,2,three=3)
    print ns.MyObject
    print repr(ns.MyObject)
    ns.mycopy = ns.MyObject
    print ns.MyObject
    print repr(ns.MyObject)
    ns.myinstance = SomeClass('same class new instance')
    print ns.myinstance
    print repr(ns.myinstance)
    ns.a = ns.b = ns.c = SomeClass('multi')
    print ns.a
    print repr(ns.b)
    ns.ten = 10
    ns.b = 20
    print ns.ten, ns.b, ns.a
    print repr(ns.a)

if __name__ == '__main__':
    test()
======================================================
Have fun.

Regards,
Bengt Richter

participants (7)
- anatoly techtonik
- Armin Rigo
- Bengt Richter
- Elefterios Stamatogiannakis
- Eleytherios Stamatogiannakis
- Steven D'Aprano
- Yichao Yu