[Baypiggies] Which tree or trie library?

Mark Voorhies mvoorhie at yahoo.com
Tue Mar 12 02:12:41 CET 2013


On 03/11/2013 04:48 PM, Stephen McInerney wrote:
> I'm looking for advice (and also good overviews) on tree and trie implementations in
> Python. I already did some reading and it's a tower of Babel out there (StackOverflow , pypi, macports etc.); I can't find any good overviews. Apparently figuring out which package to use causes severe grief, especially to people from other languages where these exist in stdlib.
>
> In my current use case I need an n-ary tree or trie to store 32,000 UK regions and placenames.
> Max depth is 6. The keys are strings, possibly multiword. The values should be arbitrary objects. Must run on a Mac.
>
> Requirements:
>
> - doesn't need to be fancy. Only needs to support insert, and lookup by name,
>    and also hierarchical lookup by name, e.g. there is a location UK->North East England->Tyne & Wear->Newcastle Upon Tyne, hence we should be able to lookup 'Newcastle Upon Tyne'
>    at any node from the root downwards. There may be multiple hits.
>
> - not esssential, but ideally it can understand names can have synonyms/aliases,
>    e.g. UK->SW England is an alias to UK->South West England or UK->Cymru to UK->Wales
>    (however if we walk the tree, only the set of unique canonical names should get returned)
>    (Fail that, I can manually merge nodes.)
>
> - efficiency (memory or CPU) is not a concern in this application, but I'd like to know
>    which packages are efficient so I use a decent one.
>
> - later on I may want to import geolocations and postcodes, then figure out which regions are close, query-by-proximity. But that's a nice-to-have.
>
> Does any package spring to mind?

I can vouch for NetworkX as very good for large graph/digraph/tree problems.  It is very dict-like in terms of
associating objects with nodes.  It depends on numpy, so if you can install that, you should be fine
(this might help: http://fperez.org/py4science/starter_kit.html)

This review may also be useful (Fast Non-Standard Data Structures for Python):

http://kmike.ru/python-data-structures/

--Mark



More information about the Baypiggies mailing list