walking a directory with very many files

Scott David Daniels Scott.Daniels at Acm.Org
Wed Jun 17 09:05:12 EDT 2009


Mike Kazantsev wrote:
> On Wed, 17 Jun 2009 14:52:28 +1200
> Lawrence D'Oliveiro <ldo at geek-central.gen.new_zealand> wrote:
> 
>> In message 
>> <234b19ac-7baf-4356-9fe5-37d00146d982 at z9g2000yqi.googlegroups.com>,
>> thebjorn wrote:
>>
>>> Not proud of this, but...:
>>>
>>> [django] www4:~/datakortet/media$ ls bfpbilder|wc -l
>>>  174197
>>>
>>> all .jpg files between 40 and 250KB with the path stored in a
>>> database field... *sigh*
>> Why not put the images themselves into database fields?
>>
>>> Oddly enough, I'm relieved that others have had similar folder
>>> sizes ...
>> One of my past projects had 400000-odd files in a single folder. They
>> were movie frames, to allow assembly of movie sequences on demand.
> 
> For both scenarios:
> Why not use hex representation of md5/sha1-hashed id as a path,
> arranging them like /path/f/9/e/95ea4926a4 ?
> ...
> In fact, on modern filesystems it doesn't matter whether you are
> accessing /path/f9e95ea4926a4 with a million files in /path or
> /path/f/9/e/95ea with only a hundred of them in each path.
Probably better to use:
     /path/f9/e9/5ea4926a4
if you want hundreds of entries per layer: two hex digits per level
branch 256 ways.  Branching only 16 ways seems silly.
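
A minimal sketch of that layout, assuming the file name is the hex
digest of a hashed id as suggested earlier in the thread.  The function
names split_hex and sharded_path, the choice of SHA-1 over MD5, and the
id "image-42" are illustrative assumptions, not anything posted here.

```python
import hashlib


def split_hex(root, hex_name, levels=2, width=2):
    """Split hex_name into `levels` directories of `width` hex digits each.

    With width=2 every directory level branches 16**2 = 256 ways.
    (Hypothetical helper, named for this example only.)
    """
    parts = [hex_name[i * width:(i + 1) * width] for i in range(levels)]
    parts.append(hex_name[levels * width:])  # remainder is the file name
    return "/".join([root] + parts)


def sharded_path(root, file_id, levels=2, width=2):
    """Hash file_id with SHA-1 and lay the hex digest out under root."""
    digest = hashlib.sha1(file_id.encode("utf-8")).hexdigest()
    return split_hex(root, digest, levels, width)


# The name used in this thread:
print(split_hex("/path", "f9e95ea4926a4"))
# An arbitrary id; the directory names depend on its digest:
print(sharded_path("/path", "image-42"))
```

The first call reproduces the /path/f9/e9/5ea4926a4 layout above; the
parent directories still have to be created (for example with
os.makedirs) before a file is written there.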

--Scott David Daniels
Scott.Daniels at Acm.Org


