Scanning directories for new files?
Martin Gregorie
martin at address-in-sig.invalid
Tue Dec 21 14:51:25 EST 2010
On Tue, 21 Dec 2010 14:17:40 -0500, Matty Sarro wrote:
> Hey everyone.
> I'm in the midst of writing a parser to clean up incoming files, remove
> extra data that isn't needed, normalize some values, etc. The base files
> will be uploaded via FTP.
> How does one go about scanning a directory for new files? For now we're
> looking to run it as a cron job but eventually would like to move away
> from that into making it a service running in the background.
>
Make sure each file is initially uploaded under a name that the parser
isn't looking for, and rename it when the upload is finished. This way the
parser won't try to process a partially loaded file.
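A minimal sketch of the uploading side, assuming the sender is a Python script using the standard ftplib module. The function name upload_then_rename and the ".part" suffix are made up for illustration; any name the parser ignores would do.

```python
def upload_then_rename(ftp, fileobj, final_name, temp_suffix=".part"):
    """Upload under a temporary name, then rename once the transfer is done.

    ftp is expected to behave like a connected, logged-in ftplib.FTP
    instance; fileobj is a file object opened in binary mode.
    """
    temp_name = final_name + temp_suffix
    # The parser never sees final_name until the whole file has arrived.
    ftp.storbinary("STOR " + temp_name, fileobj)
    ftp.rename(temp_name, final_name)
    return final_name
```

With a real connection this would be called roughly as upload_then_rename(ftp, open("myfile", "rb"), "myfile"), where ftp is an ftplib.FTP object that has already logged in.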
If you are uploading to a *nix machine, the rename can move the file
between directories, provided both directories are in the same filing
system. Under those conditions rename is always an atomic operation with
no copying involved. This would allow you to, say, upload the file to
"temp/myfile" and rename it to "uploaded/myfile", with your parser only
scanning the uploaded directory and, presumably, renaming processed files
to move them to a third directory ready for further processing.
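The parser's side of this could look something like the sketch below. It assumes both directories already exist on the same filesystem; the function name claim_new_files and the idea of a separate work directory are illustrative, not a fixed recipe.

```python
import os

def claim_new_files(uploaded_dir, work_dir):
    """Move every regular file from uploaded_dir into work_dir.

    Returns the list of new paths, ready to be handed to the parser.
    """
    claimed = []
    for name in sorted(os.listdir(uploaded_dir)):
        src = os.path.join(uploaded_dir, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories and other non-files
        dst = os.path.join(work_dir, name)
        # Atomic on *nix when both directories are on the same filesystem;
        # note that an existing file of the same name in work_dir is replaced.
        os.rename(src, dst)
        claimed.append(dst)
    return claimed
```

A cron job would call this once per run; a background service could call it in a loop with a short time.sleep() between passes.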
I've used this technique reliably with files arriving via FTP at quite
high rates.
--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |