[Python-ideas] Trie ABC?
Geoffrey Sneddon
foolistbar at googlemail.com
Sun May 5 22:03:13 CEST 2013
Currently there are a large number of trie implementations for Python,
all with slightly different APIs. It would be nice to introduce a ABC
for Tries to attempt to unify these.
Why do people want tries in Python? Typically because they provide quick
lookup of prefixes and keys starting with a prefix. html5lib uses a
pluggable trie based around a pseudo-ABC for tokenizing entities, for
example (see below for links).
It would likely make sense to introduce Trie and MutableTrie ABCs
inheriting from Mapping and (MutableMapping, Trie) respectively.
It is suggested to add at least two mixins:
longest_prefix(prefix) returning the longest prefix of "prefix" that is
a member of the Trie.
keys_with_prefix(prefix) (or maybe keys_starting_with, but I'm
bike-shedding myself now!) returning an iterable of keys that start with
with prefix. Some implementations simply override keys for this, adding
an optional first argument of prefix.
Many implementations also include something like has_keys_with_prefix
returning len(keys_with_prefix(prefix)) > 0.
A selection of existing trie implementations for Python:
https://github.com/kmike/datrie
https://github.com/buriy/python-chartrie
https://github.com/kmike/hat-trie
https://github.com/dhain/trie
I can attempt to summarize the various APIs these provide if there is
any interest whatsoever (it seems like there's little use if it's
decided against on principle!).
It would be nice to get an implementation into the stdlib if people are
in favour of an ABC, but given there's no real singular solution that
has been proven in the field I'm not rushing to force this.
<http://bugs.python.org/issue9520> was a case of trying to get a trie
implementation into the stdlib.
There's something vaguely along the lines of the above ABC that I
implemented for html5lib in
<https://github.com/html5lib/html5lib-python/blob/master/html5lib/trie/_base.py>,
as well as a light-weight pure-Python implementation in
<https://github.com/html5lib/html5lib-python/blob/master/html5lib/trie/py.py>.
/gsnedders
More information about the Python-ideas
mailing list