Re: [Python-ideas] [Python-Dev] matrix operations on dict :)

On Wed, Feb 8, 2012 at 9:54 AM, julien tayon <julien@tayon.net> wrote:
Okay, I guess I did not make myself very clear. What I'm proposing probably will (eventually) require changes to the "object" model of Python and may require (or want) the addition of the "compound" data-type (as in Python's predecessor ABC). The symbol that denotes a compound would be the colon (":"); it associates a left-hand side with a right-hand-side value, a NAME with a VALUE. A dictionary would (then) be a SET of these. (Voila! things have already gotten simplified.) Eventually, I also think this will segue and integrate nicely into Mark Shannon's "shared-key dict" proposal (PEP 412).
The compound data-type would act as the articulation point around which the recursive, fractal data structure would revolve, much like the decimal point in a (non-integer) number. (In theory, you could even do a reciprocal or __INVERT__ operation on this type.) OR perhaps a closer comparison is whatever separates the imaginary from the real part in a complex number on the complex plane -- by virtue of that separation, it creates two orthogonal and independent spaces. We want to do the same with this new fractal dictionary type.
While in the abstract one might think to allow any arbitrary data-type for right-hand-side values, in PRACTICE, integers are sufficient. The reason is thus. In the fractal data type, you simply need to define the bottom-most and top-most layers of the fractalset abstraction, which is the same as saying the relationship between the *atomic* and the *group* -- everything in between will be taken care of by the power of the type itself. It makes sense to use a maximally atomic INTEGER data type (starting with the UNIT 1) for the bottom-most level, and a maximally abstract top-most level -- this is simply an abstract grouping type (i.e. a collection). I'm going to suggest a SET is the most abstract (i.e. sufficient) because it does not impose an order, and for reasons regarding the CONFLATION rule (RULE1).
The CONFLATION rule is thus: items of the same name are combined ({'a':1, 'a':3} ==> {'a':4}), and non-named (atomic) items are summed. To simplify representation, values should be conflated as much as possible; the idea is maximizing reduction. This rule separates a set from a list, because non-unique items will be conflated into one. Such a set or grouping should be looked at as an arbitrary n-dimensional space. An interesting thing to think about is how this space can be mapped into a unique 1-dimensional, ordered list and vice versa.
Reflectively, a list can be converted uniquely into this fractal set thusly: All non-integer, non-collection items will be considered NAMES and counted. If an item is another list, it will recurse and create another set. If it is a set, it will simply be added as is. These rules could be important in object serialization (we'll call this EXPANSION).
In any case, for the sake of your example: in the above KABOOM example, unnamed, atomic elements can just be considered ANONYMOUS (using None as the key). In this case, the new dict becomes: { "a" : 1 } + { "a" : { "b" : 1 } } ==> { "a" : { None : 1, "b" : 1 } }, OR if we have a compound data-type, we can remove the redundant pseudo-name: { "a" : { 1, "b" : 1 } }. Furthermore we can assume a default value of 1 for non-valued "names", so we could express this more simply: { 'a' } + { 'a' : { 'b' } } ==> { 'a' : { 1, 'b' } }
No ambiguity! as long as we determine a convention. As noted, one element is named, and the other is not. Consider unnamed values within a grouping like a GAS and *named* values as a SOLID. You're adding them into the same room where they can co-exist just fine. No confusion!
To clarify the properties of this fractal data type: there is only 1 key in the second, inner set ('b'). We can remove the values() method, as the values will always be the atomic INTEGER type and conflate to a single number. (We'll call this other thing, this property, "mass"; in this case = 2.)
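The conflation rule and the flat part of the list conversion described above can be sketched in today's Python. This is only an illustration under assumptions: the function names `conflate` and `from_flat_list` are made up here, anonymous integers are stored under the key None as the post suggests, and nested lists and sets are left out because the post does not say how the resulting inner set would be stored.

```python
from collections import Counter


def conflate(pairs):
    """Combine (name, value) pairs; values sharing a name are summed."""
    totals = Counter()
    for name, value in pairs:
        totals[name] += value
    return dict(totals)


def from_flat_list(items):
    """Convert a flat list: integers are summed anonymously (key None),
    every other item is treated as a NAME and counted."""
    totals = Counter()
    for item in items:
        if isinstance(item, int):
            totals[None] += item   # non-named (atomic) items are summed
        else:
            totals[item] += 1      # names are counted
    return dict(totals)


# A dict literal cannot hold 'a' twice, so the input is a list of pairs.
print(conflate([("a", 1), ("a", 3)]))          # {'a': 4}
print(from_flat_list(["a", "b", "a", 2, 3]))   # {'a': 2, 'b': 1, None: 5}
```

Counter is used only as a convenient summing dict; nothing in the proposal depends on it.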
The use of a physical analog is helpful and will inform the definition. (Could one represent a python CLASS hierarchy more simply with this fractalset object somehow....?)
Further definitions:
RULE2: When an atomic is added to a compound, a grouping must be created: 1 + "b" : 1 = { None : 1, "b" : 1 }
RULE3: Preserve groupings where present: 'b' : 7 + { 'b' : 1 } = { 'b' : 8 }
I think this might be sufficient. Darn, I hope it makes some sense....
mark
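One possible reading of RULE2, RULE3, the KABOOM addition and the "mass" property, written against ordinary dicts. The names `add` and `mass` are hypothetical, a compound such as "b" : 1 is modelled as the one-item dict {"b": 1} because Python has no compound type, and None stands in for the anonymous name. This is a sketch of the rules as stated, not a proposed implementation.

```python
def add(left, right):
    """Add two values, each either an int or a dict of name -> int/dict."""
    if isinstance(left, int) and isinstance(right, int):
        return left + right
    # RULE2: an atomic value meeting a grouping becomes an anonymous member.
    if isinstance(left, int):
        left = {None: left}
    if isinstance(right, int):
        right = {None: right}
    # RULE3 / conflation: same-named entries are combined recursively.
    merged = dict(left)
    for name, value in right.items():
        merged[name] = add(merged[name], value) if name in merged else value
    return merged


def mass(value):
    """Total of all integers held anywhere inside the value."""
    if isinstance(value, int):
        return value
    return sum(mass(inner) for inner in value.values())


print(add(1, {"b": 1}))                    # RULE2: {None: 1, 'b': 1}
print(add({"b": 7}, {"b": 1}))             # RULE3: {'b': 8}
print(add({"a": 1}, {"a": {"b": 1}}))      # {'a': {None: 1, 'b': 1}}
print(mass({None: 1, "b": 1}))             # 2
```

The inputs are never mutated; each merge builds a new dict.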

On Thu, Feb 9, 2012 at 4:11 PM, Mark Janssen <dreamingforward@gmail.com> wrote:
That was not a user-visible data type in ABC. ABC had dictionaries (with somewhat different semantics due to the polymorphic static typing) and the ':' was part of the dictionary syntax, not of the type system.
A dictionary would (then) be a SET of these. (Voila! things have already gotten simplified.)
Really? So {a:1, a:2} would be a dict of length 2?
Maybe you should reduce your coffee intake. There's too much SHOUTING in your post... :-)
--Guido van Rossum (python.org/~guido)
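For reference, the length question can be checked against current Python semantics. The snippet uses the string key "a" in place of the bare name a; the variable names are only for illustration.

```python
# Today's dict: a repeated key keeps only the last value.
as_dict = {"a": 1, "a": 2}
print(len(as_dict), as_dict)   # 1 {'a': 2}

# A plain set of (name, value) pairs keeps both entries.
as_pair_set = {("a", 1), ("a", 2)}
print(len(as_pair_set))        # 2
```

So "a set of compounds" has length 2 unless something like the conflation rule is applied on top of it.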

On Thu, Feb 9, 2012 at 7:11 PM, Mark Janssen <dreamingforward@gmail.com> wrote:
I have the problem looking for this solution!
The application for this functionality is in coding a fractal graph (or "multigraph" in the literature).
I think that would be better represented using an object of some sort, such as a MultiGraphNode and/or MultiGraphEdge, instead of re-purposing dict.
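A rough sketch of what such objects might look like. Only the class names MultiGraphNode and MultiGraphEdge come from the suggestion above; the fields, the `connect` method and the integer weight are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)   # compare nodes by identity, not by field values
class MultiGraphNode:
    name: str
    edges: list = field(default_factory=list)

    def connect(self, other, weight=1):
        """Add one more edge to `other`; parallel edges are allowed."""
        edge = MultiGraphEdge(self, other, weight)
        self.edges.append(edge)
        return edge


@dataclass(eq=False)
class MultiGraphEdge:
    source: MultiGraphNode
    target: MultiGraphNode
    weight: int = 1


a = MultiGraphNode("a")
b = MultiGraphNode("b")
a.connect(b)
a.connect(b, weight=3)   # a second, parallel edge between the same nodes
print(len(a.edges))      # 2
```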
Okay, I guess I did not make myself very clear. What I'm proposing probably will (eventually) require changes to the "object" model of Python
That means you're talking about Python 4, at a minimum, and you would need to show how valuable it is by building a workaround version and getting people to use that extensively in Python 3. And frankly, you should probably do that anyhow; this feels to me like a bad plan for language defaults, but it is still a valid use case -- and I don't think this sort of math exploration should (or will) wait for Python 4; people will model it somehow in an existing language.
That sounds like an association list. I think you're dealing with sufficiently abstract problems that you don't want to restrict your keys to hashable things, and it is worth suffering a bit slower performance in return.
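A minimal association list, to make the trade-off concrete: keys only need to support ==, not hashing, and lookup is a linear scan. The helper name `assoc_get` is made up for this sketch.

```python
def assoc_get(pairs, key, default=None):
    """Return the value of the first pair whose key equals `key`."""
    for candidate, value in pairs:
        if candidate == key:
            return value
    return default


# Lists and dicts are unhashable, so they cannot be dict keys,
# but they work fine as keys in an association list.
pairs = [(["x", "y"], 1), ({"a": 1}, 2)]
print(assoc_get(pairs, {"a": 1}))     # 2
print(assoc_get(pairs, ["x", "y"]))   # 1
print(assoc_get(pairs, "missing"))    # None
```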
Eventually, I also think this will segue and integrate nicely into Mark Shannon's "shared-key dict" proposal (PEP 412).
I'm pretty sure he doesn't intend to change the semantics of dict at all. He does want to make the implementation more efficient, at least in terms of space; any semantic differences are considered either bugs or costs worth paying for that efficiency.
While in the abstract one might think to allow any arbitrary data-type for right-hand-side values, in PRACTICE, integers are sufficient.
By integers, do you really mean pointers or (possibly abstract) references to other structures? Because if you do, then ordinary arithmetic isn't the right solution, but if you don't, then I don't see them as sufficient.
I agree that it is the most abstract (at least of the well-known) type, and that all the other types can be represented in terms of sets. The catch is that these representations may be massively inefficient. If you're doing mathematical exploration, that may be a reasonable tradeoff, but Python also caters to other use cases.
(Could one represent a python CLASS hierarchy more simply with this fractalset object somehow....?)
Depending on what you want to represent, probably. But if you want to represent the ancestors of a given class for efficient method and attribute access, then no; it is hard to beat an array for efficiency of sequential access. -jJ
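As an illustration of the sequential-access point: Python exposes a class's ancestors as the tuple `__mro__`, and attribute lookup conceptually walks it in order. The class names and the helper `find_attribute` below are made up for the example, and the helper is a simplification that ignores descriptors and metaclasses.

```python
class Base:
    greeting = "base"

class Left(Base):
    pass

class Right(Base):
    greeting = "right"

class Child(Left, Right):
    pass


def find_attribute(cls, name):
    """Walk the ancestors in order and return the first match."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    raise AttributeError(name)


print([klass.__name__ for klass in Child.__mro__])
# ['Child', 'Left', 'Right', 'Base', 'object']
print(find_attribute(Child, "greeting"))   # right
```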

On Fri, Feb 10, 2012 at 8:38 AM, Jim Jewett <jimjjewett@gmail.com> wrote:
Those would be good strategies in general, but the issue is how things hook together in the object model. These things are very abstract; it's exactly the thing which has made metaclasses difficult to "grok" at times. I'll probably just have to try to implement them in PyPy or abandon the idea.
really happened. So this is really still for what was dreamed to happen in version 3.
information model, everything can be represented by atomic units (where integers come in) and groups (or a collection type). Compare with how all the complexity of the physical world is a product of small-massed electrons and protons. I'm arguing that all the uses of data can be represented in a similar way. Thanks for the reply, but I think I'll shelve the discussion for now....
mark

Participants (3): Guido van Rossum, Jim Jewett, Mark Janssen