Storing a big amount of path names
rgaddi at highlandtechnology.invalid
Thu Feb 11 21:17:49 EST 2016
Tim Chase wrote:
> On 2016-02-12 00:31, Paulo da Silva wrote:
>> What is the best (shortest memory usage) way to store lots of
>> pathnames in memory where:
>> 1. Path names are pathname=(dirname,filename)
>> 2. There are many different dirnames but far fewer than pathnames
>> 3. dirnames have in general many chars
>> The idea is to share the common dirnames.
> Well, you can create a dict that has dirname->list(filenames) which
> will reduce the dirname to a single instance. You could store that
> dict in the class, shared by all of the instances, though that starts
> to pick up a code-smell.
> But unless you're talking about an obscenely large number of
> dirnames & filenames, or a severely resource-limited machine, just
> use the default built-ins. If you start to push the boundaries of
> system resources, then I'd try the "anydbm" module or use the
> "shelve" module to marshal them out to disk. Finally, you *could*
> create an actual sqlite database on disk if size really does exceed
> reasonable system specs.
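Tim's dict suggestion might look like this (a minimal sketch; the sample paths are made up for illustration):

```python
import os
from collections import defaultdict

paths = [
    "/usr/local/lib/file_a.txt",
    "/usr/local/lib/file_b.txt",
    "/home/user/docs/notes.txt",
]

# Map each dirname to the list of filenames under it, so the
# dirname string is stored only once per directory.
by_dir = defaultdict(list)
for p in paths:
    dirname, filename = os.path.split(p)
    by_dir[dirname].append(filename)
```

If the data outgrows RAM, the same dict shape drops straight into the `shelve` module, since the keys are strings.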
Probably more memory efficient to make a list of lists, and just declare
that the first element of each list is the dirname. That way you're not
wasting memory on the unused entries of the hashtable.
But unless the OP has both a) upwards of a million entries and b) let's
say at least 20 filenames per dirname, it's not worth doing.
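A rough sketch of that list-of-lists layout (again with made-up paths): element 0 of each inner list is the shared dirname, and the remaining elements are the filenames in that directory.

```python
import os

# Each inner list: [dirname, filename, filename, ...]
entries = [
    ["/usr/local/lib", "file_a.txt", "file_b.txt"],
    ["/home/user/docs", "notes.txt"],
]

# Full paths are reconstructed on demand rather than stored,
# so each dirname string exists exactly once.
full_paths = [
    os.path.join(group[0], name)
    for group in entries
    for name in group[1:]
]
```

The trade-off versus the dict is losing O(1) lookup by dirname, which is why it only pays off at the scale described above.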
Now, if you do really have a million entries, one thing that would help
with memory is setting __slots__ for MyFile rather than letting it
create an instance dictionary for each one.
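The OP's MyFile class isn't shown in the thread, so the fields below are guesses, but the `__slots__` mechanism itself works like this:

```python
class MyFilePlain:
    # Normal class: every instance carries its own __dict__,
    # which costs extra memory per instance.
    def __init__(self, dirname, filename):
        self.dirname = dirname
        self.filename = filename


class MyFileSlots:
    # __slots__ suppresses the per-instance __dict__; the named
    # attributes live in fixed slots instead.
    __slots__ = ("dirname", "filename")

    def __init__(self, dirname, filename):
        self.dirname = dirname
        self.filename = filename
```

With a million instances the per-object savings add up; the cost is that you can no longer attach arbitrary new attributes to a slotted instance.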
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order. See above to fix.