Implementing a set of operations (+, /, -, *) on dict consistent with linear algebra
Hello :)

The idea is described here: http://jul.github.io/cv/pres.html#printable

Summary of the idea:

Take a linear algebra book and implement all the rules as tests, TDD style. https://github.com/jul/archery/blob/master/consistent_algebrae.py

Make it work based on abstract base classes and sets of mixins. https://archery.readthedocs.io/en/latest/

Then see whether cos/__abs__/dot can be built on top, and whether they naively give the intended results (spoiler: yes).

Make it work with dict and with "other" dictionaries such as Counter by using inheritance. https://archery.readthedocs.io/en/latest/#advanced-usage

My idea is: wouldn't it be nice if we introduced geometries as sets of mixins for objects? (Hilbertian algebras could be nice too, and we could make MutableMapping behave like bras/kets.)

So I am proposing a soft discussion: could we agree that it would be nice to treat operator overloading as a whole set of behaviours that would profit from being consistent in a categorized way? (For instance, the + of [] could be the + of "RecordAlgebrae".) Meaning we could define sets of "expected consistent interactions between operators", as we did for the ABCs, and call them algebras?

I offer the LinearAlgebrae mixins as a POC, and was thinking of creating a unittest to qualify whether an object follows the rules of linear algebra.

What are your opinions? I don't actually see a lot of use cases, except that it was fun to build. But maybe it can be of use.

Regards
-- Julien
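The "unittest to qualify whether an object follows the rules" could look roughly like the sketch below. It is not taken from archery: `VecDict` and `failed_rules` are hypothetical names invented for illustration, and only a handful of vector-space rules are checked, on sample values rather than in general.

```python
class VecDict(dict):
    """Hypothetical dict that adds key by key and scales by a number."""

    def __add__(self, other):
        # missing keys count as 0, so the result covers the union of keys
        keys = set(self) | set(other)
        return VecDict({k: self.get(k, 0) + other.get(k, 0) for k in keys})

    def __mul__(self, scalar):
        return VecDict({k: v * scalar for k, v in self.items()})

    __rmul__ = __mul__  # lets "2 * d" work as well as "d * 2"


def failed_rules(u, v, w, a, b):
    """Return the names of the rules that these sample values break."""
    rules = {
        "u + v == v + u": u + v == v + u,
        "(u + v) + w == u + (v + w)": (u + v) + w == u + (v + w),
        "a * (u + v) == a*u + a*v": a * (u + v) == a * u + a * v,
        "(a + b) * u == a*u + b*u": (a + b) * u == a * u + b * u,
        "a * (b * u) == (a * b) * u": a * (b * u) == (a * b) * u,
        "1 * u == u": 1 * u == u,
    }
    return [name for name, holds in rules.items() if not holds]


u, v, w = VecDict(a=1, b=2), VecDict(b=3, c=4), VecDict(a=5, c=6)
print(failed_rules(u, v, w, 2, 3))        # [] : every rule held
print(failed_rules([1], [2], [3], 2, 3))  # plain lists break two of them
```

Running the same check on built-in lists shows why the + of [] belongs to a different family of rules: concatenation is associative but not commutative, and it does not distribute over scaling.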
julien tayon wrote:

What are your opinions? I don't actually see a lot of use cases, except that it was fun to build. But maybe it can be of use.

This list is for suggesting additions and changes to Python. Broad usefulness is a prerequisite. So please build your lib, but this seems off topic on this list.

/ Anders
Julien, your article is very pleasant to read (and funny), but as others say, the mailing list is not there to share articles; it is for proposals for the Python standard library.

Do your own lib on GitHub and PyPI first if you want to share some code with the world!

And if the project becomes super useful to everyone one day, it may come to the standard library so that everybody will have it.

Cheers,
Robert
Thanks, Robert, for the praise. It feels nice.

I may be bold, but I really hate to come empty-handed to a discussion. So this lib is nothing more than doing my homework, since I don't have a PhD.

Actually, science (in my opinion) is about measuring. What I propose is nothing more than giving native metrics to objects (if you add the Vector traits). Having coded in Perl for too long, I still see objects as a hierarchy of blessed MutableMappings, I am sorry. And I think that measurement is a cornerstone of science, and thus of data science (an opinion you may not share).

Thus it could be a way of extending some of the concepts of dataclasses (https://www.python.org/dev/peps/pep-0557/) with additional default behaviour that could be subscribed to optionally.

As an everyday coder, I find this behaviour solves problems I can illustrate with code (like aggregating data, checking whether I might have duplicates in a set of datasets, or transforming objects into objects).

I do not want to force-feed the community with my "brilliant" ideas; I would much rather plead my case for how adopting "consistent geometric behaviours" at the language level would ease our lives as coders, if this is not inappropriate.

Please don't look at the lib. Look at the idea of making operators behave in a consistent way that gives the properties of well-known mathematical constructions to the core of the language.

It also enables parallelisation without side effects (aka the poor man's map reduce), which is a first-order consequence of linear algebra.

I may not be gifted at writing long dissertations; however, I have a pragmatic mind. So I don't mind being challenged a tad, as long as we talk about things like: how does it profit Python coders for this to be standard, and can you show me a real-life example?

However, if a "no (answer)" is a "no", I do understand. I like Python the way it is, and I don't want to introduce friction in the process of improving Python by being off topic.
Thus if no one is interested, I still have a last word: keep up the good work! And thank you all for what you bring us.

Cheers

On Tue, 30 Oct 2018 at 19:11, Robert Vanden Eynde <robertve92@gmail.com> wrote:

Julien, your article is very pleasant to read (and funny), but as others say, the mailing list is not there to share articles; it is for proposals for the Python standard library.

Do your own lib on GitHub and PyPI first if you want to share some code with the world!

And if the project becomes super useful to everyone one day, it may come to the standard library so that everybody will have it.

Cheers,
Robert
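Julien's point about aggregation and "parallelisation without side effects" can be illustrated with nothing but the standard library. The sketch below uses collections.Counter as a stand-in (it is not archery), and the sample sentences are made up: each chunk is counted on its own, and because + is associative and commutative here, the partial results can be merged in any order or grouping.

```python
from collections import Counter
from functools import reduce
from operator import add

chunks = [
    "the cat sat",
    "the dog sat",
    "the cat ran",
]

# "map": every chunk is counted independently, so this step could run in parallel
partials = [Counter(chunk.split()) for chunk in chunks]

# "reduce": merge the partial counts, once left to right and once regrouped
left_to_right = reduce(add, partials, Counter())
regrouped = partials[0] + (partials[2] + partials[1])

print(left_to_right == regrouped)                   # True
print(left_to_right["the"], left_to_right["sat"])   # 3 2
```

No partial result is mutated along the way, which is what makes the merge order irrelevant.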
julien tayon wrote:
like the + of [] could be the + of "RecordAlgebrae"
If you're proposing to change the behaviour of '+' on the built-in list type, that's not going to happen.
I don't suggest changing something that already exists and works (I am pretty conservative too, and expect stuff not to be broken by any change).

And all behaviours can coexist quite peacefully.

"RecordAlgebra"

In [5]: [2] + [2]
Out[5]: [2, 2]

In [6]: [2] * 2
Out[6]: [2, 2]

In [7]: "a" + "a"
Out[7]: 'aa'

In [8]: "a" * 2
Out[8]: 'aa'

(adding the same value n times is equal to multiplying by n; that is totally consistent to me)

Mixed scenario:

In [12]: a= mdict(a=[2], b='a')
In [13]: a+a
Out[14]: {'a': [2, 2], 'b': 'aa'}

In [17]: a * 4
Out[17]: {'a': [2, 2, 2, 2], 'b': 'aaaa'}

I propose that operators be propagated, with every value still following its own logic. A LinearAlgebraic MutableMapping would be as algebraic as its values. No more.

On Tue, 30 Oct 2018 at 22:33, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:

--
Greg
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
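The propagation described in the "mixed scenario" can be sketched as a small mixin. This is an illustration only, not archery's implementation: `Propagate` and `PDict` are hypothetical names, and the mixin merely forwards + and * to the values, so each value keeps whatever + and * already mean for its own type.

```python
class Propagate:
    """Hypothetical mixin: forward + and * to the values of a mapping."""

    def __add__(self, other):
        result = type(self)(self)  # shallow copy, the operands stay untouched
        for key, value in other.items():
            # shared keys are combined with the value's own +, new keys are copied
            result[key] = result[key] + value if key in result else value
        return result

    def __mul__(self, scalar):
        return type(self)({k: v * scalar for k, v in self.items()})


class PDict(Propagate, dict):
    pass


a = PDict(a=[2], b="a", n=3)
print(a + a)   # {'a': [2, 2], 'b': 'aa', 'n': 6}
print(a * 4)   # {'a': [2, 2, 2, 2], 'b': 'aaaa', 'n': 12}
```

The list and string values concatenate and repeat while the integer adds and multiplies, so the mapping is exactly "as algebraic as its values".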
Counter doesn't QUITE do the same thing as this `mdict`. But it's pretty close. I think if .__add__() became a synonym for .update() that wouldn't break anything that currently works. But I'm probably wrong, and missing a case in my quick thought:
>>> from collections import Counter
>>> c = Counter(a=[2], b='a')
>>> c.update(c)
>>> c
Counter({'a': [2, 2], 'b': 'aa'})
>>> c2 = Counter(a=1, b=2)
>>> c2 + c2
Counter({'b': 4, 'a': 2})
>>> c2.update(c2)
>>> c2
Counter({'b': 4, 'a': 2})
>>> c + c
Traceback (most recent call last):
  File "<ipython-input-18-e88785f3c342>", line 1, in <module>
    c + c
  File "/anaconda3/lib/python3.6/collections/__init__.py", line 705, in __add__
    if newcount > 0:
TypeError: '>' not supported between instances of 'list' and 'int'
On Tue, Oct 30, 2018 at 6:54 PM Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:

In [12]: a= mdict(a=[2], b='a')
In [13]: a+a

Aren't you reinventing the Counter type?

>>> from collections import Counter
>>> c = Counter(a=1,b=2)
>>> c + c
Counter({'b': 4, 'a': 2})
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
Actually, they are definitely different, as in-place mutation versus returning a new Counter. But in some arithmetic way they look mostly the same.
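The difference between Counter's + and .update() goes a little beyond mutation, as a short standard-library session shows. The values are made up for illustration; the behaviour is that of collections.Counter itself: + builds a new Counter and keeps only positive counts, while update() adds in place and keeps everything.

```python
from collections import Counter

c = Counter(a=1, b=-1)

total = c + c   # a new object; c itself is unchanged
print(total)    # Counter({'a': 2}): 'b' is dropped because its sum is not positive
print(c)        # Counter({'a': 1, 'b': -1})

c.update(c)     # adds the counts in place
print(c)        # Counter({'a': 2, 'b': -2})
```

So making __add__ a synonym for update() would change results for any Counter holding zero or negative counts.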
On Wed, 31 Oct 2018 at 00:20, David Mertz <mertz@gnosis.cx> wrote:
Counter doesn't QUITE do the same thing as this `mdict`. But it's pretty close.
I think if .__add__() became a synonym for .update() that wouldn't break anything that currently works. But I'm probably wrong, and missing a case in my quick thought:
My quick thought too is that it coincidentally achieves Counter's features as a subset of its own. I had never noticed it, and both approaches seem consistent in their results (phew, a close one, since I had not thought of checking it).

And if you add the trait to Counter, you get the following results:

>>> from collections import Counter
>>> from archery.quiver import LinearAlgebrae
>>> class ACounter(LinearAlgebrae, Counter): pass
...
>>> c = ACounter(a=[2], b='a')
>>> c.update(c)
>>> c
ACounter({'a': [2, 2], 'b': 'aa'})

(same)

>>> c2 = ACounter(a=1, b=2)
>>> c2 + c2
ACounter({'b': 4, 'a': 2})
>>> c2.update(c2)
>>> c2
ACounter({'b': 4, 'a': 2})
>>> c2 + c2
ACounter({'a': 4, 'b': 8})
>>> c2 + .5 * c2
ACounter({'a': 1.5, 'b': 3.0})
On Tue, Oct 30, 2018 at 6:54 PM Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:

In [12]: a= mdict(a=[2], b='a')
In [13]: a+a

Aren't you reinventing the Counter type?
Nope. It is an unintended subset of the possibilities. I do have, though:

c2 / 2
Out[17]: ACounter({'a': 0.5, 'b': 1.0})
c / 2
TypeError: can't multiply sequence by non-int of type 'float'
And talking about Counter: by inheriting from the Vector mixins (dot, abs, cos) we give it cosine similarity out of the box. Given how widely cosine similarity is used in text indexing, that is pretty reassuring. It would also make it easy to normalize a Counter (with values that support truediv) by writing:

class VCounter(LinearAlgebrae, Vector, Counter): pass

c2 = VCounter(a=1, b=2)
c2/abs(c2)
Out[20]: VCounter({'a': 0.4472135954999579, 'b': 0.8944271909999159})

And since these are mixins, they touch nothing in the MutableMapping class they rely on. They just provide the behaviours associated with operators. (Of course, c2.cos(c) will normally raise a TypeError, since it would make no sense.)

It really is a proof of concept of adding linear/vector algebra to ANY kind of mutable mapping, be it dict, Counter, OrderedDict, defaultdict ... It relies only on what MutableMapping (from abc) offers and makes do with that.
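What dot, abs and cos amount to for a Counter can be written out with plain functions. This is a standard-library sketch of the mathematics only, not the Vector mixin from archery: `dot`, `norm` and `cos` are free functions named here for illustration, and the two sample sentences are invented.

```python
from collections import Counter
from math import sqrt


def dot(u, v):
    # only keys present in both mappings contribute to the product
    return sum(u[k] * v[k] for k in u.keys() & v.keys())


def norm(u):
    return sqrt(dot(u, u))


def cos(u, v):
    return dot(u, v) / (norm(u) * norm(v))


doc1 = Counter("the cat sat on the mat".split())
doc2 = Counter("the cat ate the rat".split())
print(cos(doc1, doc2))   # similarity of the two word-count vectors

# normalising, the counterpart of c2/abs(c2) in the message above
c2 = Counter(a=1, b=2)
unit = {k: v / norm(c2) for k, v in c2.items()}
print(unit)
```

Here the shared words are "the" (2 × 2) and "cat" (1 × 1), so the dot product is 5, and the result is 5 / sqrt(8 × 7).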
Julien, would I be correct if I summarized the changes you have in mind like this:

for dictionaries d1 and d2, non-Mapping ("scalar") sc, binary operation ⊛, and unary operation 𝓊 (such as negation or abs()):

d1 ⊛ sc == {k: (v ⊛ sc) for k, v in d1.items()}
sc ⊛ d1 == {k: (sc ⊛ v) for k, v in d1.items()}
𝓊(d1) == {k: 𝓊(v) for k, v in d1.items()}
d1 ⊛ d2 == {k: (d1[k] ⊛ d2[k]) for k in d1.keys() & d2.keys()}
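The four rules above translate almost literally into two helper functions. The sketch follows the summary as written, including the intersection of keys for d1 ⊛ d2; whether that matches archery's actual behaviour is not confirmed here, and `apply_binary` and `apply_unary` are hypothetical names.

```python
import operator
from collections.abc import Mapping


def apply_binary(op, left, right):
    """Apply op to mappings and scalars following the summarised rules."""
    left_is_map = isinstance(left, Mapping)
    right_is_map = isinstance(right, Mapping)
    if left_is_map and right_is_map:
        # d1 ⊛ d2: only keys present on both sides survive
        return {k: op(left[k], right[k]) for k in left.keys() & right.keys()}
    if left_is_map:
        # d1 ⊛ sc
        return {k: op(v, right) for k, v in left.items()}
    if right_is_map:
        # sc ⊛ d1
        return {k: op(left, v) for k, v in right.items()}
    return op(left, right)


def apply_unary(op, d):
    """𝓊(d1): apply op to every value."""
    return {k: op(v) for k, v in d.items()}


d1 = {"a": 1, "b": 2}
d2 = {"b": 10, "c": 20}
print(apply_binary(operator.mul, d1, 3))    # {'a': 3, 'b': 6}
print(apply_binary(operator.sub, 10, d1))   # {'a': 9, 'b': 8}
print(apply_binary(operator.add, d1, d2))   # {'b': 12}
print(apply_unary(operator.neg, d1))        # {'a': -1, 'b': -2}
```

Note that with the intersection rule, d1 + d2 silently drops 'a' and 'c', which differs from the Counter-style union seen earlier in the thread.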
And with libraries like funcoperators (pip install funcoperators) or infix (pip install infix), you can even write it infix :D

from funcoperators import infix

@infix
def superop(d1, sc):
    # plain multiplication stands in here for the operation applied to each value
    return {k: v * sc for k, v in d1.items()}

print({'a': 8} *superop* 5)

On Wed, 31 Oct 2018 at 18:35, Vladimir Filipović <hemflit@gmail.com> wrote:
Julien, would I be correct if I summarized the changes you have in mind like this:

for dictionaries d1 and d2, non-Mapping ("scalar") sc, binary operation ⊛, and unary operation 𝓊 (such as negation or abs()):

d1 ⊛ sc == {k: (v ⊛ sc) for k, v in d1.items()}
sc ⊛ d1 == {k: (sc ⊛ v) for k, v in d1.items()}
𝓊(d1) == {k: 𝓊(v) for k, v in d1.items()}
d1 ⊛ d2 == {k: (d1[k] ⊛ d2[k]) for k in d1.keys() & d2.keys()}
Julien,

Personally, I would be able to use the module you are proposing to accumulate arbitrarily-named measures. I cannot think of a use case for division, but it would be nice for completeness.

I have made my own library that implements a small part of what you propose [1].

I was looking through the pstats.py [2] source code, and I thought it could benefit from vector operations. I have seen other code that collects measures follow the same redundant pattern. Maybe some fancy regex can identify other += code sequences that would benefit. If you make a module and show how it can simplify pstats.py, maybe you have a winner?

[1] "vector" addition? - https://github.com/klahnakoski/mo-dots/blob/dev/tests/test_dot.py#L610
[2] pstats.py source code - https://github.com/python/cpython/blob/3.7/Lib/pstats.py#L156

On 2018-10-30 11:31, julien tayon wrote:
Hello :)

The idea is described here: http://jul.github.io/cv/pres.html#printable

Summary of the idea:

Take a linear algebra book and implement all the rules as tests, TDD style. https://github.com/jul/archery/blob/master/consistent_algebrae.py

Make it work based on abstract base classes and sets of mixins. https://archery.readthedocs.io/en/latest/

Then see whether cos/__abs__/dot can be built on top, and whether they naively give the intended results (spoiler: yes).

Make it work with dict and with "other" dictionaries such as Counter by using inheritance. https://archery.readthedocs.io/en/latest/#advanced-usage

My idea is: wouldn't it be nice if we introduced geometries as sets of mixins for objects? (Hilbertian algebras could be nice too, and we could make MutableMapping behave like bras/kets.)

So I am proposing a soft discussion: could we agree that it would be nice to treat operator overloading as a whole set of behaviours that would profit from being consistent in a categorized way? (For instance, the + of [] could be the + of "RecordAlgebrae".) Meaning we could define sets of "expected consistent interactions between operators", as we did for the ABCs, and call them algebras?

I offer the LinearAlgebrae mixins as a POC, and was thinking of creating a unittest to qualify whether an object follows the rules of linear algebra.

What are your opinions? I don't actually see a lot of use cases, except that it was fun to build. But maybe it can be of use.

Regards
-- Julien
participants (8)
- Alexander Belopolsky
- Anders Hovmöller
- David Mertz
- Greg Ewing
- julien tayon
- Kyle Lahnakoski
- Robert Vanden Eynde
- Vladimir Filipović