PEP-603: Adding a frozenmap type to collections

Hi! I've just published a PEP to add a frozenmap type to Python; it should be online shortly. Read it here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/... Here's an excerpt from the Abstract: A *persistent data structure* is defined as a data structure that preserves the previous version of the data when the data is modified. Such data structures are effectively *immutable*, as operations on them do not update the structure in-place, but instead always yield a new updated structure (see [0]_ for more details.) This PEP proposes to add a new fully persistent and immutable mapping type called ``frozenmap`` to the ``collections`` module. The bulk of ``frozenmap``'s reference implementation is already used in CPython to implement the ``contextvars`` module. Yury

I'll let other people comment on this wrt Python (even though I think it looks cool and useful to me). Out of curiosity, why does frozenmap not have the same runtime and ordering guarantees as dict? On Thursday, September 12, 2019 at 7:47:25 AM UTC-4, Yury Selivanov wrote:

On Thu, 12 Sep 2019 at 14:58, Neil Girdhar <mistersheik@gmail.com> wrote:
I'll let other people comment on this wrt Python (even though I think it looks cool and useful to me). Out of curiosity, why does frozenmap not have the same runtime and ordering guarantees as dict?
Presumably because the implementation needs to be different to handle the immutability, and that puts constraints on the most efficient implementation techniques? Paul

My two cents: the motivation for "frozenmap" over "frozendict" isn't very compelling. In python we call them dicts. map() is a function. This is super confusing in other languages that have Map and map. Python doesn't have this problem. Let's not introduce it! I would like some comparison and discussion of this data structure and its API compared to the corresponding data structure and API in pyrsistent if possible. (I am biased here because I've been involved a bit there and the author is a colleague but I think this is a relatively popular library for this use case...) / Anders

I think pyrsistent uses the same data structure -- HAMT. Regardless of that, using it is a nonstarter because (a) we already use the bulk of frozenmap implementation to power the contextvars module; (b) pyrsistent is 15x slower, see https://gist.github.com/1st1/be5a1c10aceb0775d0406e879cf87344#gistcomment-30... Yury On Thu, Sep 12, 2019 at 3:52 PM Anders Hovmöller <boxed@killingar.net> wrote:
-- Yury

BTW, I forget to mention, but please post replies here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/... I'm checking this mailing list on a far less regular schedule. Besides, splitting the discussion between two different mediums isn't a good idea anyways. Yury On Thu, Sep 12, 2019 at 12:46 PM Yury Selivanov <yselivanov.ml@gmail.com> wrote:
-- Yury

On Sep 12, 2019, at 04:46, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
How does this compare in performance, features, API, etc. to pyrsistent, immutables, hamt, and other third-party implementations on PyPI? Not all of the features provided by all of the third-party libs are necessary (Haskell-style zippers; items and friends that return a lazy list instead of an iterator; instar-style transforms), but it would be nice to know that they were considered and rejected, and why. And, similarly, it would be nice to have a rationale for why we _do_ need the evolve (or whatever it’s called—the temporarily mutable copy via chain thing) thing, and if/ how/why it differs from the other implementations. And for features that are included, it would be nice to know how easy it is to translate to and from the other implementations, and if it’s not just a trivial respelling (changing setting to including, emacs can take care of for me; if evolving and locking is a whole different flow, it can’t), why it can’t be. After all, it’s sometimes been handy that you can pretty quickly migrate code from, e.g., ipaddress or simplejson or the old OrderedDict recipe when you can use a newer Python, or the other way when you learn that you need to deploy on an older Python. If the same is doable for persistent mappings, that would be a nice-to-have feature (but not if it would require making the API worse or making everything slower or more complicated). Finally, it’s probably not worth adding HAMT-backed sets, lists, etc., just because we can, even though I think most libraries both in Python and in other languages do. But it might be worth adding a rationale for _why_ they aren’t necessary (and mentioning that if we see a lot of people using frozenmaps with None values as sets in place of frozenset, we can always easily add something later).

As Yuri said the proposed implementation is 15x times faster than pyrsistent. hamt library doesn't exist on PyPI. immutables library is a code written by Yuri, it uses hamt datastructure internally. Basically, the proposed frozenmap is a port of immutables (with the class renaming). In turn, immutables is a CPython hamt.c with python wrappers. On Thu, Sep 12, 2019 at 6:55 PM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
-- Thanks, Andrew Svetlov

On Sep 12, 2019, at 11:15, Andrew Svetlov <andrew.svetlov@gmail.com> wrote:
As Yuri said the proposed implementation is 15x times faster than pyrsistent. hamt library doesn't exist on PyPI.
I probably didn’t remember the names of all of the relevant PyPI projects off the top of my head. But shouldn’t that be included in the PEP so people don’t have to remember and/or search?
OK, the fact that immutables uses the same C data structure implementation that’s already in Python surely outweighs the fact that it’s not a category-killer. (i assume that either there’s a pure-Python version for PyPy and other implementations, or that those implementations are expected to already include some code that’s equivalent to what hamt.c does that can be accessed the same way?) But that doesn’t answer any of the questions about the API design. Shouldn’t the PEP explain why various features were included, or left out, or designed differently than in existing third-party libraries (and equivalent features in other languages)? Just “one of the existing libs made these choices for reasons that aren’t documented anywhere” doesn’t seem like enough of a rationale. Again, I don’t think there’s anything wrong with the decisions. (For example, evolve, or whatever he calls it, is useful more often than zippers, and harder to build yourself on top of a library that doesn’t provide it; etc.) Just that the PEP should make the case for each such decision.

Please bring your questions to https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/... to have the single discussion site. On Fri, Sep 13, 2019 at 5:06 AM Andrew Barnert <abarnert@yahoo.com> wrote:
-- Thanks, Andrew Svetlov

I'll let other people comment on this wrt Python (even though I think it looks cool and useful to me). Out of curiosity, why does frozenmap not have the same runtime and ordering guarantees as dict? On Thursday, September 12, 2019 at 7:47:25 AM UTC-4, Yury Selivanov wrote:

On Thu, 12 Sep 2019 at 14:58, Neil Girdhar <mistersheik@gmail.com> wrote:
I'll let other people comment on this wrt Python (even though I think it looks cool and useful to me). Out of curiosity, why does frozenmap not have the same runtime and ordering guarantees as dict?
Presumably because the implementation needs to be different to handle the immutability, and that puts constraints on the most efficient implementation techniques? Paul

My two cents: the motivation for "frozenmap" over "frozendict" isn't very compelling. In python we call them dicts. map() is a function. This is super confusing in other languages that have Map and map. Python doesn't have this problem. Let's not introduce it! I would like some comparison and discussion of this data structure and its API compared to the corresponding data structure and API in pyrsistent if possible. (I am biased here because I've been involved a bit there and the author is a colleague but I think this is a relatively popular library for this use case...) / Anders

I think pyrsistent uses the same data structure -- HAMT. Regardless of that, using it is a nonstarter because (a) we already use the bulk of frozenmap implementation to power the contextvars module; (b) pyrsistent is 15x slower, see https://gist.github.com/1st1/be5a1c10aceb0775d0406e879cf87344#gistcomment-30... Yury On Thu, Sep 12, 2019 at 3:52 PM Anders Hovmöller <boxed@killingar.net> wrote:
-- Yury

BTW, I forget to mention, but please post replies here: https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/... I'm checking this mailing list on a far less regular schedule. Besides, splitting the discussion between two different mediums isn't a good idea anyways. Yury On Thu, Sep 12, 2019 at 12:46 PM Yury Selivanov <yselivanov.ml@gmail.com> wrote:
-- Yury

On Sep 12, 2019, at 04:46, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
How does this compare in performance, features, API, etc. to pyrsistent, immutables, hamt, and other third-party implementations on PyPI? Not all of the features provided by all of the third-party libs are necessary (Haskell-style zippers; items and friends that return a lazy list instead of an iterator; instar-style transforms), but it would be nice to know that they were considered and rejected, and why. And, similarly, it would be nice to have a rationale for why we _do_ need the evolve (or whatever it’s called—the temporarily mutable copy via chain thing) thing, and if/ how/why it differs from the other implementations. And for features that are included, it would be nice to know how easy it is to translate to and from the other implementations, and if it’s not just a trivial respelling (changing setting to including, emacs can take care of for me; if evolving and locking is a whole different flow, it can’t), why it can’t be. After all, it’s sometimes been handy that you can pretty quickly migrate code from, e.g., ipaddress or simplejson or the old OrderedDict recipe when you can use a newer Python, or the other way when you learn that you need to deploy on an older Python. If the same is doable for persistent mappings, that would be a nice-to-have feature (but not if it would require making the API worse or making everything slower or more complicated). Finally, it’s probably not worth adding HAMT-backed sets, lists, etc., just because we can, even though I think most libraries both in Python and in other languages do. But it might be worth adding a rationale for _why_ they aren’t necessary (and mentioning that if we see a lot of people using frozenmaps with None values as sets in place of frozenset, we can always easily add something later).

As Yuri said the proposed implementation is 15x times faster than pyrsistent. hamt library doesn't exist on PyPI. immutables library is a code written by Yuri, it uses hamt datastructure internally. Basically, the proposed frozenmap is a port of immutables (with the class renaming). In turn, immutables is a CPython hamt.c with python wrappers. On Thu, Sep 12, 2019 at 6:55 PM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
-- Thanks, Andrew Svetlov

On Sep 12, 2019, at 11:15, Andrew Svetlov <andrew.svetlov@gmail.com> wrote:
As Yuri said the proposed implementation is 15x times faster than pyrsistent. hamt library doesn't exist on PyPI.
I probably didn’t remember the names of all of the relevant PyPI projects off the top of my head. But shouldn’t that be included in the PEP so people don’t have to remember and/or search?
OK, the fact that immutables uses the same C data structure implementation that’s already in Python surely outweighs the fact that it’s not a category-killer. (i assume that either there’s a pure-Python version for PyPy and other implementations, or that those implementations are expected to already include some code that’s equivalent to what hamt.c does that can be accessed the same way?) But that doesn’t answer any of the questions about the API design. Shouldn’t the PEP explain why various features were included, or left out, or designed differently than in existing third-party libraries (and equivalent features in other languages)? Just “one of the existing libs made these choices for reasons that aren’t documented anywhere” doesn’t seem like enough of a rationale. Again, I don’t think there’s anything wrong with the decisions. (For example, evolve, or whatever he calls it, is useful more often than zippers, and harder to build yourself on top of a library that doesn’t provide it; etc.) Just that the PEP should make the case for each such decision.

Please bring your questions to https://discuss.python.org/t/pep-603-adding-a-frozenmap-type-to-collections/... to have the single discussion site. On Fri, Sep 13, 2019 at 5:06 AM Andrew Barnert <abarnert@yahoo.com> wrote:
-- Thanks, Andrew Svetlov
participants (7)
-
Anders Hovmöller
-
Andrew Barnert
-
Andrew Svetlov
-
Greg Ewing
-
Neil Girdhar
-
Paul Moore
-
Yury Selivanov