syntax difference
Peter J. Holzer
hjp-python at hjp.at
Wed Jun 27 07:42:15 EDT 2018
On 2018-06-27 11:11:37 +1200, Gregory Ewing wrote:
> Bart wrote:
> > x = set(range(10_000_000))
> >
> > This used up 460MB of RAM (the original 100M I tried exhausted the memory).
> >
> > The advantage of Pascal-style sets is that that same set will occupy
> > only 1.25MB, as it is a bit-map.
>
> That's true, but they're also extremely limited compared to
> the things you can do with Python sets. They're really two
> quite different things, designed for different use cases.
>
> Compact sets of integers are possible in Python, but not
> as a built-in type. This suggests that they're not needed
> much
Also, when such sets are needed, a simple array of bits may not be
sufficient. For example, sets may be sparse, or they may have almost all
except a few bits set. In these cases a tree or RLE representation is
much smaller and faster. There are a number of such implementations
(Judy Arrays and Roaring Bitmaps come to mind[1]). Each has its
advantages and limitations, so its probably better to leave the choice
to the author.
hp
[1] Judy is a C library. Roaring bitmaps are a data structure which has
been implemented in several languages. I haven't checked whether
there are packages on pypi. Shouldn't be too hard to implement if
one needs it.
--
_ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | hjp at hjp.at | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20180627/70ce2e42/attachment.sig>
More information about the Python-list
mailing list