[CentralOH] Automated Patches
jep200404 at columbus.rr.com
Mon Oct 29 21:53:34 CET 2012
On Thu, 25 Oct 2012 15:48:16 -0700, Austin Godber <godber at gmail.com> wrote:
> I am not sure if I have fully grokked your problem, ...
I am archiving data files as originally received.
It is a requirement to preserve the original files
as received, regardless of how good or bad they are.
They are all compressed. There are many files.
Some of them are around 1/2 Gigabyte.
We have enough data that we have to move files that are not
likely to be needed from production servers to off-line
storage.
Some of the data files have some bad content,
and the corrections to the (uncompressed) content are tiny,
so diffs are a nice way to keep track of corrections,
using little extra storage while preserving the original
(bad) files.
Conventional version control systems don't seem to be a good
fit. Many don't handle compressed files well. I don't know
how well they would handle moving older data to off-line
storage.
I'm looking for a Pythonic way to automate the application of
individual patches, when they exist, without modifying, creating,
or renaming any files[2]. Pipes are good, whether à la Unix
or within Python[1].
Instead of using open(filename), I would call open_patched(filename)
which would look for both filename and filename + '.patch'.
If both files exist, open_patched would return a file-like object
that would yield the output of something like[3]
    cat $filename | patcheroo $filename.patch
otherwise open_patched() would work just like open(),
returning a file-like object for just filename.
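
A minimal sketch of how open_patched() might look in pure Python, with no
temporary files. Everything here is an assumption for illustration: the
data files are taken to be gzip-compressed, the .patch files to be
uncompressed unified diffs, and apply_unified_diff() is a hypothetical
stand-in for patcheroo that is strict (no fuzz, no offset search, hunks in
order). It returns an iterator of lines rather than a full file-like
object, so it covers "for line in f" but not read() or seek().

```python
import gzip
import os
import re
from itertools import islice

# Unified-diff hunk header, e.g. "@@ -12,3 +12,4 @@"; a missing length means 1.
_HUNK = re.compile(r'^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@')


def apply_unified_diff(original_lines, patch_lines):
    """Lazily yield the patched lines (hypothetical stand-in for patcheroo).

    Strict: hunks must be in order and match exactly (no fuzz, no offsets).
    """
    original = iter(original_lines)
    patch = iter(patch_lines)
    consumed = 0  # number of original lines read so far
    for header in patch:
        m = _HUNK.match(header)
        if m is None:
            continue  # '---'/'+++' headers and anything else between hunks
        start = int(m.group(1))
        old_left = int(m.group(2) or 1)
        new_left = int(m.group(4) or 1)
        # A pure insertion (old length 0) names the line it goes after.
        copy_until = start if old_left == 0 else start - 1
        yield from islice(original, copy_until - consumed)
        consumed = copy_until
        while old_left or new_left:
            line = next(patch, '')
            tag, text = line[:1], line[1:]
            if tag == '\\':
                continue  # "\ No newline at end of file" marker
            if tag in (' ', '-'):
                if next(original, None) != text:
                    raise ValueError('patch does not match line %d'
                                     % (consumed + 1))
                consumed += 1
                old_left -= 1
            if tag in (' ', '+'):
                yield text
                new_left -= 1
            elif tag != '-':
                raise ValueError('malformed hunk line: %r' % line)
    yield from original  # everything after the last hunk is unchanged


def open_patched(filename, opener=gzip.open):
    """Return an iterator over the lines of filename, patched on the fly
    if filename + '.patch' exists. Assumes gzip data unless told otherwise."""
    patch_name = filename + '.patch'

    def lines():
        with opener(filename, 'rt') as original:
            if os.path.exists(patch_name):
                with open(patch_name) as patch:
                    yield from apply_unified_diff(original, patch)
            else:
                yield from original

    return lines()
```

Because the patched lines are generated as the original is read, only one
hunk's worth of state is held at a time, which matters for files around
half a gigabyte. Files whose last line lacks a trailing newline are not
handled by this sketch.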
[1] http://mail.python.org/pipermail/centraloh/2012-August/001369.html
Thanks again Neil.
[2] Of course, I could copy or write the uncompressed file in some
temporary directory, modify it, consume it, then delete it.
It'd be nice to avoid temporary files by using
pipes or pipe-like goodness instead.
The patch command might require temp files.
[3] I use the fictitious patcheroo, which reads the unpatched
data from stdin and writes the patched data to stdout,
to avoid dealing with patch's grammar and behavior.
[4] And now for something completely different. NATs are good.
http://www.youtube.com/watch?v=v26BAlfWBm8
Thanks Rick.