Removing outdated files

Jan Danielsson jan.m.danielsson at gmail.com
Tue Jan 23 00:17:15 EST 2007


Hello all,

   I have a backup system which produces files using the following pattern:

   <basename>.<date>.<extension>

   For instance:

   documents.2007-01-01.tar.bz2.gpg
   documents.2007-01-02.tar.bz2.gpg
   .
   .
   .
   system_files.2007-01-01.tar.bz2.gpg
   system_files.2007-01-02.tar.bz2.gpg
   .
   .
   .
   etc.
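
   A minimal sketch of splitting such a name into its parts. It assumes
the date is always written as YYYY-MM-DD and that the extension may itself
contain dots. The names BACKUP_NAME and parse_backup_name are made up for
illustration.

```python
import re
from datetime import date

# Assumed layout: <basename>.<YYYY-MM-DD>.<extension>, where the extension
# may contain further dots (e.g. "tar.bz2.gpg").
BACKUP_NAME = re.compile(
    r"^(?P<base>.+?)\.(?P<date>\d{4}-\d{2}-\d{2})\.(?P<ext>.+)$"
)

def parse_backup_name(filename):
    """Return (basename, date, extension), or None if the name does not fit."""
    match = BACKUP_NAME.match(filename)
    if match is None:
        return None
    year, month, day = map(int, match.group("date").split("-"))
    try:
        parsed = date(year, month, day)
    except ValueError:
        # Looks like a date but is not a valid one (e.g. month 13).
        return None
    return match.group("base"), parsed, match.group("ext")
```

   Names that do not fit the pattern come back as None, so unrelated files
in the backup directory can simply be skipped.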

   Obviously, I have little need for *all* those files. What I want to
do is to delete old files according to this pattern:

   - Keep all backup files that are two weeks old or less.
   - If backups are more than two weeks old, keep only the latest one
     for each week.
   - If backups are more than two months old, keep only the latest one
     for each month.
   - If backups are more than two years old, keep only the latest one
     for each year.
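
   One possible way to express these rules is to map every date to a
"bucket" key and keep only the latest date per bucket. This is a sketch
under assumptions: the day counts used for "two months" and "two years"
are approximations, weeks are taken to be ISO weeks, and the names
bucket_key and dates_to_keep are hypothetical.

```python
from datetime import date, timedelta

# Assumed thresholds; the rules only say "two weeks/months/years",
# so the day counts for months and years are approximations.
TWO_WEEKS = timedelta(days=14)
TWO_MONTHS = timedelta(days=61)
TWO_YEARS = timedelta(days=730)

def bucket_key(d, today):
    """Return the retention bucket that date d falls into."""
    age = today - d
    if age <= TWO_WEEKS:
        return ("day", d)                      # recent: every file is its own bucket
    if age <= TWO_MONTHS:
        iso_year, iso_week, _ = d.isocalendar()
        return ("week", iso_year, iso_week)    # one bucket per ISO week
    if age <= TWO_YEARS:
        return ("month", d.year, d.month)      # one bucket per calendar month
    return ("year", d.year)                    # one bucket per calendar year

def dates_to_keep(dates, today):
    """Return the set of dates to keep: the latest date in each bucket."""
    latest = {}
    for d in dates:
        key = bucket_key(d, today)
        if key not in latest or d > latest[key]:
            latest[key] = d
    return set(latest.values())
```

   Everything in the date-list that is not in the returned set would be a
candidate for deletion. It is probably wise to print the candidates first
and only delete once the output looks right.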

   I have generated a list of files, parsed the date from each entry,
and created date objects from them. I have a list in which I have grouped
the "basenames" together, i.e. (lists in a list):

documents
	2007-01-01
	2007-01-02
	.
	.
	.
system_files
	2007-01-01
	2007-01-02
	.
	.
	.
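
   For reference, a grouping like the one above could be built roughly as
follows. A dict keyed by basename is used here instead of nested lists,
which is a substitution on my part, and group_by_basename is a made-up
name.

```python
from collections import defaultdict
from datetime import date

def group_by_basename(entries):
    """Turn (basename, date) pairs into {basename: [dates in ascending order]}."""
    groups = defaultdict(list)
    for basename, d in entries:
        groups[basename].append(d)
    # Sort each date-list so the latest backup is always the last element.
    return {name: sorted(ds) for name, ds in groups.items()}
```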

   Now all I have to do is iterate through the date-lists for each of
the basenames, and apply the rules! Well..

   How does one group by week, for instance? I'd like to create a new
set of lists which looks like this:

   basename:documents
      week:01
        2007-01-01
        2007-01-02
      week:02
        2007-01-07
        2007-01-08
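
   A sketch of how such a per-week grouping might look, assuming ISO 8601
week numbering (weeks start on Monday) is acceptable. The function name
group_by_iso_week is hypothetical. Note that under ISO numbering
2007-01-07 is a Sunday and therefore still belongs to week 01, unlike in
the hand-written outline above.

```python
from collections import defaultdict
from datetime import date

def group_by_iso_week(dates):
    """Return {(iso_year, iso_week): [dates in ascending order]}."""
    weeks = defaultdict(list)
    for d in sorted(dates):
        iso_year, iso_week, _weekday = d.isocalendar()
        weeks[(iso_year, iso_week)].append(d)
    return dict(weeks)
```

   Because each list is sorted, the latest backup of a week is simply the
last element of that week's list.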

   Date-grouping is dead easy in SQL, but I don't feel like resorting to
postgresql just for this. :-)

   I've been looking at the datetime.date class, but I can't see any
easy way to get the week number from it. I could calculate this
information by brute force -- but I get a feeling that there are
functions in Python to extract week numbers from a date.
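
   One candidate is date.isocalendar(), which returns the ISO 8601 year,
week number and weekday. A short sketch, with example dates picked purely
for illustration:

```python
from datetime import date

d = date(2007, 1, 8)
# isocalendar() yields (ISO year, ISO week number, ISO weekday), Monday = 1.
iso_year, iso_week, iso_weekday = d.isocalendar()

# Near a year boundary the ISO year can differ from the calendar year,
# so it is safer to group on (iso_year, iso_week) than on the week alone.
boundary = tuple(date(2008, 12, 29).isocalendar())
```

   If a different week convention is wanted, strftime("%U") (weeks start
on Sunday) and strftime("%W") (weeks start on Monday) are alternatives,
though they number the days before the first full week as week 00.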

-- 
Kind regards,
Jan Danielsson