[Moin-user] VCS as storage backend? Bundle Storage idea.

Thomas Waldmann tw-public at gmx.de
Fri Feb 2 07:47:58 EST 2007


> Speaking of the storage level - I noticed that todays revisions are
> stored as full copies of all revisions. Has there been any thoughts on
> using a full-blown VC (Subversion or similar) as backend, or is that a
> stupid idea for reasons I don't know anything about? 
>   
Well, currently this simply is not possible. Some day we'll have a 
storage plugin API, so in theory one could plug in anything one wants 
(assuming you are also willing to write the code if it is not already 
there).

As for VCSes, a full VCS is probably overkill for moin because we don't 
need most of its functions. It might also be slower and create new 
dependencies. But if somebody wants to try (when the time has come), we 
won't hold them back. :)

What I personally thought of doing (once the storage API is alive and 
working well) is a filesystem storage that transparently compresses 
old revisions into bundles.

E.g. if you have page revisions 1 .. 2345, we could maybe have this:
00000100.tar.gz - has page files 00000001 .. 00000100 inside
00000200.tar.gz - has page files 00000101 .. 00000200 inside
...
00002300.tar.gz - ... up to 2300
00002301 - page file, not compressed yet
...
00002345 - page file, not compressed yet
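
To make the naming scheme above concrete, here is a small sketch of how 
a revision number could map to its bundle. The function names and the 
bundle size of 100 are assumptions taken from the example, not existing 
moin code:

```python
BUNDLE_SIZE = 100  # assumption: 100 revisions per bundle, as in the example


def bundle_name(rev, bundle_size=BUNDLE_SIZE):
    """Return the name of the bundle that would hold revision `rev`.

    A bundle is named after the last revision it contains.
    """
    last_in_bundle = ((rev - 1) // bundle_size + 1) * bundle_size
    return "%08d.tar.gz" % last_in_bundle


def layout(latest_rev, bundle_size=BUNDLE_SIZE):
    """Return (bundle names, loose file names) for revisions 1..latest_rev.

    Only complete bundles are created; the remaining newest revisions
    stay as plain, uncompressed page files.
    """
    full = latest_rev // bundle_size  # number of complete bundles
    bundles = ["%08d.tar.gz" % (i * bundle_size) for i in range(1, full + 1)]
    loose = ["%08d" % r for r in range(full * bundle_size + 1, latest_rev + 1)]
    return bundles, loose
```

For 2345 revisions, layout(2345) gives the bundles up to 00002300.tar.gz 
and the loose files 00002301 .. 00002345.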

This would be quite efficient because a page evolving over time usually 
has similar content all the time - and that compresses well.
Also, the file system overhead would be much lower than when storing all 
those tiny files separately.

Maybe this is also interesting for revisioned "attachments" (file 
items), assuming that there is also some similarity in file contents. 
That could be a big space saver, as files tend to be much bigger than 
wiki pages. So if you update a file often, the revisions could cost 
quite some space.
For the files, we maybe also need to set purge criteria (e.g. only keep 
the last 20 revisions, or delete everything older than a year, but keep 
at least the last 2 revisions).
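
Such purge criteria could look roughly like the following sketch. The 
function name, its parameters and the (revno, mtime) tuples are all 
made up for illustration; nothing like this exists in moin yet:

```python
import time


def revisions_to_purge(revs, keep_last=None, max_age=None, keep_min=2, now=None):
    """Pick the revisions that a purge policy would delete.

    revs      -- list of (revno, mtime) tuples, mtime in seconds since epoch
    keep_last -- if set, keep only this many newest revisions
    max_age   -- if set, purge revisions older than this many seconds
    keep_min  -- never purge the newest `keep_min` revisions
    Returns the revision numbers to delete, oldest first.
    """
    if now is None:
        now = time.time()
    ordered = sorted(revs)  # oldest first, by revision number
    total = len(ordered)
    purge = []
    for idx, (revno, mtime) in enumerate(ordered):
        newer = total - idx - 1  # how many revisions are newer than this one
        if newer < keep_min:
            continue  # protected, whatever the other criteria say
        too_many = keep_last is not None and newer >= keep_last
        too_old = max_age is not None and now - mtime > max_age
        if too_many or too_old:
            purge.append(revno)
    return purge
```

"Keep the last 20" would be keep_last=20, and "older than a year, but 
keep at least 2" would be max_age=365*24*3600 with keep_min=2.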

Python has built-in support for tar files and gzip compression, so it 
wouldn't need external stuff.
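
As a rough sketch of what that could look like with the standard 
tarfile module: the helper names, the directory layout and the bundle 
size of 100 are assumptions following the example above, not a real 
moin backend:

```python
import os
import tarfile


def pack_bundle(rev_dir, first, last):
    """Pack revision files first..last into <last>.tar.gz, then remove them."""
    bundle_path = os.path.join(rev_dir, "%08d.tar.gz" % last)
    names = ["%08d" % r for r in range(first, last + 1)]
    with tarfile.open(bundle_path, "w:gz") as tar:
        for name in names:
            tar.add(os.path.join(rev_dir, name), arcname=name)
    # Only delete the loose files after the bundle was written completely.
    for name in names:
        os.remove(os.path.join(rev_dir, name))
    return bundle_path


def read_revision(rev_dir, rev, bundle_size=100):
    """Return the content of a revision, whether it is loose or bundled."""
    name = "%08d" % rev
    loose = os.path.join(rev_dir, name)
    if os.path.exists(loose):
        with open(loose, "rb") as f:
            return f.read()
    # Not a loose file: look in the bundle named after its last revision.
    last = ((rev - 1) // bundle_size + 1) * bundle_size
    bundle_path = os.path.join(rev_dir, "%08d.tar.gz" % last)
    with tarfile.open(bundle_path, "r:gz") as tar:
        return tar.extractfile(name).read()
```

Note that reading a single old revision means decompressing part of the 
bundle, so access to old revisions gets slower in exchange for the 
saved space.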

OTOH: storage is cheap (and getting cheaper all the time) and modern 
file systems can cope with many files, so we maybe have more important 
stuff to do in moin than implementing a space-saving storage backend. 
So maybe we'll leave this as an exercise for the reader. <g>




