Creating 50K text files in python

Tim Chase python.list at tim.thechases.com
Wed Mar 18 09:43:10 EDT 2009


>           I've an application where I need to create 50K files spread
> uniformly across 50 folders in python. The content can be the name of
> file itself repeated 10 times.I wrote a code using normal for loops
> but it is taking hours together for that. Can some one please share
> the code for it using Multithreading. As am new to Python, I know
> little about threading concepts.

I'm not sure multi-threading will assist much with what appears 
to be a filesystem & I/O bottleneck.  Far more important will be 
factors like

- what OS are you running?

- what filesystem are you writing to and does it journal?  (NTFS? 
  FAT?  ext[234]?  JFS?  XFS?  tempfs/tmpfs?  etc...)  Some 
filesystems deal much better with lots of small files while 
others perform better with fewer-but-larger files.

- how is that filesystem mounted?  Are syncs forced for all 
writes?  How large are your write caches?

- what sort of hardware is it hitting?  A 4500 RPM drive will 
have far worse performance than a 10k RPM drive; or you could try 
against a flash drive or RAID-0 which should have higher 
sustained write throughput.

- how sparse is the drive layout beforehand?  If you're cramped 
for space, the FS drivers have to scatter your files across 
disparate space; while lots of free-space may allow it to spew 
the files without great consideration.

Hope this gives you some ways to test your configuration...

-tkc








More information about the Python-list mailing list