[Python-ideas] [Python-Dev] hello, new dict addition for new eve ?

julien tayon julien at tayon.net
Sat Dec 31 17:07:46 CET 2011


Hello,

I had a tough time finding an internet connection :)

@Arnaud + Terry
Thanks, the code is corrected to use the right terminology.
At least my code really tests for the right rules :)
But being precise matters, so I changed all the terms to fit.

@nathan

<< is a symbol indicating an asymmetry between the left and right
values, and I carefully tried to get symmetrical behaviour between
left & right: associativity, distributivity, commutativity.

The whole point of using a symbol is that it follows rules.

That is the reason why the consistent_addition class is all about
testing linear algebraic rules (hopefully the right way and, I hope,
now with the right terminology).
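
A minimal sketch of what such rule checks could look like. The helpers
vadd and vmul are hypothetical names for illustration, not the actual
consistent_addition code; it assumes leaf values are numbers and nested
dicts are added recursively.

```python
def vadd(a, b):
    """Add two dicts key by key, recursing into nested dicts."""
    out = dict(a)
    for key, value in b.items():
        if key in out and isinstance(out[key], dict) and isinstance(value, dict):
            out[key] = vadd(out[key], value)
        elif key in out:
            out[key] = out[key] + value
        else:
            out[key] = value
    return out


def vmul(k, a):
    """Multiply every leaf value of a (possibly nested) dict by the scalar k."""
    return {key: vmul(k, v) if isinstance(v, dict) else k * v
            for key, v in a.items()}


a = {"x": 1, "sub": {"y": 2}}
b = {"x": 10, "z": 5}
c = {"sub": {"y": 3, "w": 1}}

# Commutativity, associativity, and distributivity of scalar multiplication.
assert vadd(a, b) == vadd(b, a)
assert vadd(vadd(a, b), c) == vadd(a, vadd(b, c))
assert vmul(2, vadd(a, b)) == vadd(vmul(2, a), vmul(2, b))
```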

But I have reason to think that there is more than one way to do
addition. Linear algebra is the most used and the most intuitive one
for non-developers, while developers find set operations more intuitive.
I guess it all comes down to one point: what is a dict as a
mathematical object? If there is more than one choice, which has the
most benefits? Should there be more than one definition?



@Guido  & Eric

In my book there are rules for sets.
The question is: is a dict the same as a vector, or as a set?

Since sets use the logical operators for what are roughly set
operations, why not use & ^ | for set operations on dicts?
It would be pretty consistent with the set operations.

( @jakkob )
In this case I would expect an inconsistency error when doing
dict( a = 1 ) | dict( a = 2 , b = 2 ) (thus each key -> value pair is
seen as an atomic element of the set, unless a "value collision"
operator is given, and this collision operator would recursively and
naturally apply to the descendants),
because I expect both a | b == b | a and a + b == b + a.
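
A rough sketch of that behaviour, under stated assumptions: union and
its collide parameter are hypothetical names, not an existing dict API.
Without a collision operator, conflicting values raise an error; with
one, the result is commutative only if the operator itself is.

```python
def union(a, b, collide=None):
    """Set-like union of two dicts.

    A key present in both dicts with different values is an error,
    unless `collide` is given to merge the two values. Nested dicts
    are merged recursively with the same rule.
    """
    out = dict(a)
    for key, value in b.items():
        if key not in out or out[key] == value:
            out[key] = value
        elif isinstance(out[key], dict) and isinstance(value, dict):
            out[key] = union(out[key], value, collide)
        elif collide is None:
            raise ValueError(f"conflicting values for key {key!r}")
        else:
            out[key] = collide(out[key], value)
    return out


try:
    union(dict(a=1), dict(a=2, b=2))
except ValueError as exc:
    print(exc)  # no collision operator given, so the conflict is an error

# With a commutative collision operator, a | b == b | a holds again.
left = union(dict(a=1), dict(a=2, b=2), collide=max)
right = union(dict(a=2, b=2), dict(a=1), collide=max)
assert left == right
```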

The problem would indeed be
list + list
Vectorish behaviour is my preferred one, of course.
But if we shift the record algebra addition to another symbol (let's
assume << ) we resolve the conflict. But then we have a problem of
backward compatibility.

We may then be tempted to use & | ^ (since they would be used for dict),
and then we have a problem:
should [ 1 , 2 ] & [ 3 , 2 ] mean a set operation, or applying & to
the elements of the same rank?
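
To make the ambiguity concrete, here are the two readings side by side.
This is only an illustration: lists do not support & today, so both
readings are spelled out by hand.

```python
left, right = [1, 2], [3, 2]

# Reading 1: treat the lists as sets and intersect them.
as_sets = sorted(set(left) & set(right))

# Reading 2: apply & (bitwise and) to the elements of the same rank.
elementwise = [x & y for x, y in zip(left, right)]

print(as_sets)      # [2]
print(elementwise)  # [1, 2]
```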

I'm pretty stuck myself trying to be consistent.

Just for the record, why did you not use "." (dot) for concatenation?
  I know it is typographically unreadable on a messy screen, but is
there a better reason?

@terry & Nathan Regarding Counter.
At
https://github.com/jul/ADictAdd_iction/blob/master/demonstrate.py#L120
you may notice that I do have all the properties of Counter, but Counter
does not, as this dict does, aggregate values and (sub)totals at the
same time.
It is very convenient for map reduce since it does some re-reduce
operations at reduce time. Smart, no? <:o)

@terry
Sorry for making assertive and carefree statements regarding strong
resentment. My bad.



@bruce,

The truth is that I am aiming at cosine similarity for dicts. I imagine
a map reduce on dicts representing the values of a model, where the one
you want is ideal = dict(
blue = 1 , height = 180, weight = 70, wage = 500 )
In the filter before the map reduce you would want to keep all the
records close to this one by using cosine similarity in this way:
filter( lambda model : cos( ideal, model) > .7 , all_model )
http://en.wikipedia.org/wiki/Cosine_similarity
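
A small sketch of how cos could be defined for flat dicts of numbers.
The dot and cos functions are assumptions written for this example, and
the two candidate models are made-up data; missing keys are treated as
zero components.

```python
from math import sqrt


def dot(a, b):
    """Dot product over the keys the two dicts share (missing keys count as 0)."""
    return sum(a[k] * b[k] for k in a.keys() & b.keys())


def cos(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    norm = sqrt(dot(a, a)) * sqrt(dot(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / norm


ideal = dict(blue=1, height=180, weight=70, wage=500)

# Made-up records, for illustration only.
all_model = [
    dict(blue=1, height=175, weight=72, wage=520),
    dict(blue=0, height=160, weight=50, wage=0),
]

close = list(filter(lambda model: cos(ideal, model) > .7, all_model))
assert close == [all_model[0]]
```

Note that without normalising the components, the largest values (here
wage and height) dominate the similarity.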

Of course I will try to advocate the dot product, and metrics.

As a result my real goal is to make people consider dicts as vectors of
path to value => value, not sets of key => value.

To solve issues such as weighted decisions and non-linear choices
(based on a trigger or on a value belonging to a set) I can fairly
easily conceive a projector (which would be a class transforming a
vector into a vector, not with a matrix but with computed rules).



@Mark Jansen

+1, are you a telepath?

Yes, a key collision operator would be nice.

Given for instance an Apache log, I will have segments of path, and I
may want the keys to be the accumulated graph of a user on a website.
So my collision rule might be
(the key being the referer, the value the page visited afterwards):
dict( a = b, a = c ) + dict( b = e ) = dict( a = dict( b = e ), a = c)

I thought of it, but I really loved the conservation rule. And I
feared people would think of it as too complicated. Keep It Simple I
learnt :)

@all
Wrapping everything up into something logical.

I was thinking of supersets of objects (thus outside the stdlib) that
would have different algebras and could be used as casts on objects to
redefine + - * / cos & | ^
and would behave consistently for different objects, e.g.:
vector( dict ) vector( list ) vector( string )
would make dict & list & string behave like vectors, thus mainly
supporting elementwise operations.

sets( dict ) , sets( string ) ...
would make dict and string be sets of elements ... (with the original
dict addition design of GvR et al)

Each superset would have a unit test for its operations verifying that
the behaviour is consistent (associativity, distributivity, ...). What
about this solution?
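
A very rough sketch of the idea, limited to + on dict and list. The
Vector class and check_addition_rules are hypothetical names; the sketch
assumes numeric values, lists of equal length, and both operands wrapping
the same kind of container. A sets( ... ) wrapper could follow the same
pattern with & | ^.

```python
class Vector:
    """Cast a dict or a list so that + works elementwise."""

    def __init__(self, data):
        self.data = data

    def __add__(self, other):
        a, b = self.data, other.data
        if isinstance(a, dict):
            # A missing key counts as a zero component.
            keys = a.keys() | b.keys()
            return Vector({k: a.get(k, 0) + b.get(k, 0) for k in keys})
        return Vector([x + y for x, y in zip(a, b)])

    def __eq__(self, other):
        return self.data == other.data


def check_addition_rules(a, b, c):
    """The same rule checks are reused for every wrapped type."""
    assert a + b == b + a              # commutativity
    assert (a + b) + c == a + (b + c)  # associativity


check_addition_rules(Vector({"x": 1}), Vector({"x": 2, "y": 3}), Vector({"y": 4}))
check_addition_rules(Vector([1, 2]), Vector([3, 4]), Vector([5, 6]))
```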

-- 
Friendly
jul


