PEP-603: Adding a frozenmap type to collections

Hi!
I've just published a PEP to add a frozenmap type to Python; it should be online shortly.
Read it here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/...
Here's an excerpt from the Abstract:
A *persistent data structure* is defined as a data structure that preserves the previous version of the data when the data is modified. Such data structures are effectively *immutable*, as operations on them do not update the structure in-place, but instead always yield a new updated structure (see [0]_ for more details.)
This PEP proposes to add a new fully persistent and immutable mapping type called ``frozenmap`` to the ``collections`` module.
The bulk of ``frozenmap``'s reference implementation is already used in CPython to implement the ``contextvars`` module.
Yury

I'll let other people comment on this wrt Python (even though I think it looks cool and useful to me). Out of curiosity, why does frozenmap not have the same runtime and ordering guarantees as dict?
On Thursday, September 12, 2019 at 7:47:25 AM UTC-4, Yury Selivanov wrote:
Hi!
I've just published a PEP to add a frozenmap type to Python; it should be online shortly.
Read it here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/...
Here's an excerpt from the Abstract:
A *persistent data structure* is defined as a data structure that preserves the previous version of the data when the data is modified. Such data structures are effectively *immutable*, as operations on them do not update the structure in-place, but instead always yield a new updated structure (see [0]_ for more details.)
This PEP proposes to add a new fully persistent and immutable mapping type called ``frozenmap`` to the ``collections`` module.
The bulk of ``frozenmap``'s reference implementation is already used in CPython to implement the ``contextvars`` module.
Yury _______________________________________________ Python-ideas mailing list -- python...@python.org javascript: To unsubscribe send an email to python-id...@python.org javascript: https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/HYFR6Q... Code of Conduct: http://python.org/psf/codeofconduct/

On Thu, 12 Sep 2019 at 14:58, Neil Girdhar mistersheik@gmail.com wrote:
I'll let other people comment on this wrt Python (even though I think it looks cool and useful to me). Out of curiosity, why does frozenmap not have the same runtime and ordering guarantees as dict?
Presumably because the implementation needs to be different to handle the immutability, and that puts constraints on the most efficient implementation techniques? Paul

On 12 Sep 2019, at 13:46, Yury Selivanov yselivanov.ml@gmail.com wrote:
Hi!
I've just published a PEP to add a frozenmap type to Python; it should be online shortly.
Read it here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/...
My two cents: the motivation for "frozenmap" over "frozendict" isn't very compelling. In python we call them dicts. map() is a function. This is super confusing in other languages that have Map and map. Python doesn't have this problem. Let's not introduce it!
I would like some comparison and discussion of this data structure and its API compared to the corresponding data structure and API in pyrsistent if possible. (I am biased here because I've been involved a bit there and the author is a colleague but I think this is a relatively popular library for this use case...)
/ Anders

I think pyrsistent uses the same data structure -- HAMT. Regardless of that, using it is a nonstarter because (a) we already use the bulk of frozenmap implementation to power the contextvars module; (b) pyrsistent is 15x slower, see https://gist.github.com/1st1/be5a1c10aceb0775d0406e879cf87344#gistcomment-30...
Yury
On Thu, Sep 12, 2019 at 3:52 PM Anders Hovmöller boxed@killingar.net wrote:
On 12 Sep 2019, at 13:46, Yury Selivanov yselivanov.ml@gmail.com wrote:
Hi!
I've just published a PEP to add a frozenmap type to Python; it should be online shortly.
Read it here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/...
My two cents: the motivation for "frozenmap" over "frozendict" isn't very compelling. In python we call them dicts. map() is a function. This is super confusing in other languages that have Map and map. Python doesn't have this problem. Let's not introduce it!
I would like some comparison and discussion of this data structure and its API compared to the corresponding data structure and API in pyrsistent if possible. (I am biased here because I've been involved a bit there and the author is a colleague but I think this is a relatively popular library for this use case...)
/ Anders

BTW, I forget to mention, but please post replies here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/...
I'm checking this mailing list on a far less regular schedule. Besides, splitting the discussion between two different mediums isn't a good idea anyways.
Yury
On Thu, Sep 12, 2019 at 12:46 PM Yury Selivanov yselivanov.ml@gmail.com wrote:
Hi!
I've just published a PEP to add a frozenmap type to Python; it should be online shortly.
Read it here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/...
Here's an excerpt from the Abstract:
A *persistent data structure* is defined as a data structure that preserves the previous version of the data when the data is modified. Such data structures are effectively *immutable*, as operations on them do not update the structure in-place, but instead always yield a new updated structure (see [0]_ for more details.)
This PEP proposes to add a new fully persistent and immutable mapping type called ``frozenmap`` to the ``collections`` module.
The bulk of ``frozenmap``'s reference implementation is already used in CPython to implement the ``contextvars`` module.
Yury

On Sep 12, 2019, at 04:46, Yury Selivanov yselivanov.ml@gmail.com wrote:
Hi!
I've just published a PEP to add a frozenmap type to Python; it should be online shortly.
Read it here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/...
How does this compare in performance, features, API, etc. to pyrsistent, immutables, hamt, and other third-party implementations on PyPI?
Not all of the features provided by all of the third-party libs are necessary (Haskell-style zippers; items and friends that return a lazy list instead of an iterator; instar-style transforms), but it would be nice to know that they were considered and rejected, and why.
And, similarly, it would be nice to have a rationale for why we _do_ need the evolve (or whatever it’s called—the temporarily mutable copy via chain thing) thing, and if/ how/why it differs from the other implementations.
And for features that are included, it would be nice to know how easy it is to translate to and from the other implementations, and if it’s not just a trivial respelling (changing setting to including, emacs can take care of for me; if evolving and locking is a whole different flow, it can’t), why it can’t be. After all, it’s sometimes been handy that you can pretty quickly migrate code from, e.g., ipaddress or simplejson or the old OrderedDict recipe when you can use a newer Python, or the other way when you learn that you need to deploy on an older Python. If the same is doable for persistent mappings, that would be a nice-to-have feature (but not if it would require making the API worse or making everything slower or more complicated).
Finally, it’s probably not worth adding HAMT-backed sets, lists, etc., just because we can, even though I think most libraries both in Python and in other languages do. But it might be worth adding a rationale for _why_ they aren’t necessary (and mentioning that if we see a lot of people using frozenmaps with None values as sets in place of frozenset, we can always easily add something later).

As Yuri said the proposed implementation is 15x times faster than pyrsistent. hamt library doesn't exist on PyPI. immutables library is a code written by Yuri, it uses hamt datastructure internally. Basically, the proposed frozenmap is a port of immutables (with the class renaming). In turn, immutables is a CPython hamt.c with python wrappers.
On Thu, Sep 12, 2019 at 6:55 PM Andrew Barnert via Python-ideas python-ideas@python.org wrote:
On Sep 12, 2019, at 04:46, Yury Selivanov yselivanov.ml@gmail.com wrote:
Hi!
I've just published a PEP to add a frozenmap type to Python; it should be online shortly.
Read it here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/...
How does this compare in performance, features, API, etc. to pyrsistent, immutables, hamt, and other third-party implementations on PyPI?
Not all of the features provided by all of the third-party libs are necessary (Haskell-style zippers; items and friends that return a lazy list instead of an iterator; instar-style transforms), but it would be nice to know that they were considered and rejected, and why.
And, similarly, it would be nice to have a rationale for why we _do_ need the evolve (or whatever it’s called—the temporarily mutable copy via chain thing) thing, and if/ how/why it differs from the other implementations.
And for features that are included, it would be nice to know how easy it is to translate to and from the other implementations, and if it’s not just a trivial respelling (changing setting to including, emacs can take care of for me; if evolving and locking is a whole different flow, it can’t), why it can’t be. After all, it’s sometimes been handy that you can pretty quickly migrate code from, e.g., ipaddress or simplejson or the old OrderedDict recipe when you can use a newer Python, or the other way when you learn that you need to deploy on an older Python. If the same is doable for persistent mappings, that would be a nice-to-have feature (but not if it would require making the API worse or making everything slower or more complicated).
Finally, it’s probably not worth adding HAMT-backed sets, lists, etc., just because we can, even though I think most libraries both in Python and in other languages do. But it might be worth adding a rationale for _why_ they aren’t necessary (and mentioning that if we see a lot of people using frozenmaps with None values as sets in place of frozenset, we can always easily add something later). _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/2CIPO3... Code of Conduct: http://python.org/psf/codeofconduct/

On Sep 12, 2019, at 11:15, Andrew Svetlov andrew.svetlov@gmail.com wrote:
As Yuri said the proposed implementation is 15x times faster than pyrsistent. hamt library doesn't exist on PyPI.
I probably didn’t remember the names of all of the relevant PyPI projects off the top of my head. But shouldn’t that be included in the PEP so people don’t have to remember and/or search?
immutables library is a code written by Yuri, it uses hamt datastructure internally. Basically, the proposed frozenmap is a port of immutables (with the class renaming).
OK, the fact that immutables uses the same C data structure implementation that’s already in Python surely outweighs the fact that it’s not a category-killer.
(i assume that either there’s a pure-Python version for PyPy and other implementations, or that those implementations are expected to already include some code that’s equivalent to what hamt.c does that can be accessed the same way?)
But that doesn’t answer any of the questions about the API design. Shouldn’t the PEP explain why various features were included, or left out, or designed differently than in existing third-party libraries (and equivalent features in other languages)? Just “one of the existing libs made these choices for reasons that aren’t documented anywhere” doesn’t seem like enough of a rationale.
Again, I don’t think there’s anything wrong with the decisions. (For example, evolve, or whatever he calls it, is useful more often than zippers, and harder to build yourself on top of a library that doesn’t provide it; etc.) Just that the PEP should make the case for each such decision.

Please bring your questions to https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/... to have the single discussion site.
On Fri, Sep 13, 2019 at 5:06 AM Andrew Barnert abarnert@yahoo.com wrote:
On Sep 12, 2019, at 11:15, Andrew Svetlov andrew.svetlov@gmail.com wrote:
As Yuri said the proposed implementation is 15x times faster than pyrsistent. hamt library doesn't exist on PyPI.
I probably didn’t remember the names of all of the relevant PyPI projects off the top of my head. But shouldn’t that be included in the PEP so people don’t have to remember and/or search?
immutables library is a code written by Yuri, it uses hamt datastructure internally. Basically, the proposed frozenmap is a port of immutables (with the class renaming).
OK, the fact that immutables uses the same C data structure implementation that’s already in Python surely outweighs the fact that it’s not a category-killer.
(i assume that either there’s a pure-Python version for PyPy and other implementations, or that those implementations are expected to already include some code that’s equivalent to what hamt.c does that can be accessed the same way?)
But that doesn’t answer any of the questions about the API design. Shouldn’t the PEP explain why various features were included, or left out, or designed differently than in existing third-party libraries (and equivalent features in other languages)? Just “one of the existing libs made these choices for reasons that aren’t documented anywhere” doesn’t seem like enough of a rationale.
Again, I don’t think there’s anything wrong with the decisions. (For example, evolve, or whatever he calls it, is useful more often than zippers, and harder to build yourself on top of a library that doesn’t provide it; etc.) Just that the PEP should make the case for each such decision.

Yury Selivanov wrote:
A *persistent data structure* is defined as a data structure that preserves the previous version of the data when the data is modified. Such data structures are effectively *immutable*, as operations on them do not update the structure in-place, but instead always yield a new updated structure (see [0]_ for more details.)
You might want to find another term for this -- it's very different from what "persistent data structure" usually means (i.e. something that gets saved to a file or database with some degree of automagicness).

On 13 Sep 2019, at 00:38, Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Yury Selivanov wrote:
A *persistent data structure* is defined as a data structure that preserves the previous version of the data when the data is modified. Such data structures are effectively *immutable*, as operations on them do not update the structure in-place, but instead always yield a new updated structure (see [0]_ for more details.)
You might want to find another term for this -- it's very different from what "persistent data structure" usually means (i.e. something that gets saved to a file or database with some degree of automagicness).
Unfortunately this is actually the correct terminology from all existing literature! :(
/ Anders
participants (7)
-
Anders Hovmöller
-
Andrew Barnert
-
Andrew Svetlov
-
Greg Ewing
-
Neil Girdhar
-
Paul Moore
-
Yury Selivanov