Suggestions on mechanism or existing code - maintain persistence of file download history
R.Wieser
address at not.available
Thu Jan 30 03:35:07 EST 2020
jkn,
> MRAB's scheme does have the disadvantages to me that Chris has pointed
> out.
Nothing that can't be countered by keeping copies of the last X number of
to-be-downloaded-URLs files.
As for rewriting every time: you will /have/ to write something for every
action (and flush the file!) if you think you should be able to Ctrl-C (or
worse) out of the program.
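
A minimal sketch of that idea (the file name, function name and
one-URL-per-line format are just illustrative):

    import os

    def record_download(history_path, url):
        # Append the URL and push it to disk immediately, so that a
        # Ctrl-C right after a download still leaves the file complete.
        with open(history_path, "a", encoding="utf-8") as f:
            f.write(url + "\n")
            f.flush()
            os.fsync(f.fileno())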
But, you could opt to write this session's successfully downloaded URLs to a
separate file, and only merge that with the original one at program start.
That, together with an integrity check of the separate file (possibly on a
line-by-line (URL) basis), should make corruption of the original file
rather unlikely.
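
The merge-at-start could look something like this (a sketch; the file names
and the per-line validity check are assumptions):

    import os

    def looks_valid(line):
        # Assumed per-line integrity check: a plausible URL with no
        # stray control characters left over from a torn write.
        line = line.strip()
        return line.startswith(("http://", "https://")) and line.isprintable()

    def merge_session_file(main_path, session_path):
        # Fold this session's successfully downloaded URLs into the main
        # history, skipping any line that fails the integrity check.
        if not os.path.exists(session_path):
            return
        with open(session_path, encoding="utf-8") as f:
            good = [ln.strip() for ln in f if looks_valid(ln)]
        with open(main_path, "a", encoding="utf-8") as f:
            for url in good:
                f.write(url + "\n")
            f.flush()
            os.fsync(f.fileno())
        os.remove(session_path)  # next session starts with a fresh file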
A database /sounds/ good, but what happens when you Ctrl-C out of a
non-atomic operation? How do you fix that? IOW: databases can be corrupted
for pretty much the same reasons as a simple data file (but with much worse
consequences).
Also think of the old adage: "I had a problem, and then I thought I could
use X. Now I have two problems..." - with X traditionally being "regular
expressions". In other words: do KISS (keep it ....)
By the way: the "just write the URLs in a folder" method is not at all a bad
one. /Very/ easy to maintain, resilient (especially when you consider the
self-repairing capabilities of some filesystems) and the polar opposite of
"customer lock-in". :-)
Regards,
Rudy Wieser