Generating generations of files

MRAB python at mrabarnett.plus.com
Mon Apr 29 16:17:33 EDT 2019


On 2019-04-29 20:59, DL Neil wrote:
> Are you aware of a library/utility which will generate and maintain the
> file names of multiple generations of a file?
> 
> 
> The system generates multiple output files. For example, one might be
> called "output.rpt". However, we do not want to 'lose' the output
> file(s) from any previous run(s). In this case we want access to both
> files, eg "output.rpt" (the latest) and "output.rpt.1" (the previous
> generation, or maybe we could refer to it as a 'backup'). Similarly, if
> we run again, we want to maintain the three generations of the output
> file, including "output.rpt.2". (etc)
> 
> Backup systems/VCS often simply add a number to the original file-name,
> as above.
> 
> In logging, we only "rotate" the generations of a log file daily. Thus
> can add stat-data eg output.rpt.CCYY-MM-DD.
> 
> Yes, there are totally-unique naming ideas involving UUIDs. Um, "no"!
> 
> This application is multi-tenant (so 'your' output won't ever mix with
> 'mine'). It is not multi-entrant/does not need to be 'thread-safe'.
> However, a single user might frequently make several runs in a day.
> Thus, leaning toward the 'add a number', rather than looking to extend
> the 'stat' idea down to hours and minutes.
> 
> OTOH, using generation-numbers when there are many versions, (?surely)
> requires a 'ripple' of renaming; whereas the date-time idea is
> one-time-only rename.
> 
> The users want the (latest) output files to continue to use their
> 'traditional names'. The 'generations' idea is (part of) this sprint's
> extension to the existing system. They have no particular preference for
> which/whatever generational naming system is used - as long as "it makes
> sense". They have undertaken responsibility for managing their disk-space(!)
> 
> 
> Have looked in the PSL, eg os, os.path, shutil, pathlib... Hope I didn't
> miss something.
> 
> Any ideas (before we 'reinvent the wheel') please?
> 
Why would generation numbers result in a 'ripple' of renaming?

You're assuming that "output.rpt.1" comes after "output.rpt.2", but it 
could just as well come before (generation 1 precedes generation 2, 
etc.). You're just left with the exception of the unnumbered 
"output.rpt" being the latest.
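
A minimal sketch of that ascending scheme, assuming the generations sit in the same directory as the latest file and carry a plain ".N" suffix. The helper names next_generation and archive_latest are made up for illustration and are not from any library. Before each run, the current unnumbered file is renamed once to the next unused number, so nothing ripples:

```python
import re
from pathlib import Path


def next_generation(path):
    """Return one more than the highest ".N" suffix found beside path."""
    # Match names such as "output.rpt.7", but not "output.rpt.bak".
    pattern = re.compile(re.escape(path.name) + r"\.(\d+)")
    numbers = []
    for candidate in path.parent.iterdir():
        match = pattern.fullmatch(candidate.name)
        if match:
            numbers.append(int(match.group(1)))
    # With no numbered generations yet, the first one is 1.
    return max(numbers, default=0) + 1


def archive_latest(path):
    """Rename the unnumbered file to the next free generation.

    Returns the new Path, or None if there was no file to archive.
    Not safe against concurrent runs in the same directory.
    """
    if not path.exists():
        return None
    target = path.with_name(f"{path.name}.{next_generation(path)}")
    path.rename(target)
    return target
```

Calling archive_latest(Path("output.rpt")) just before writing a new report would leave "output.rpt.1" as the oldest generation, higher numbers as newer ones, and the unnumbered "output.rpt" as the latest.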

