Looking for new members to join a grid of storage nodes using Tahoe-LAFS, which is implemented in Python
billy.earney at gmail.com
Wed Feb 2 21:58:15 EST 2011
Would you like to be able to back up ~1 TB of data to secure,
highly-reliable, off-site storage, for free?
By "highly-reliable" I mean that your data should survive pretty much
anything short of the collapse of civilization. By "free" I mean "no
monetary cost, although you have to provide some hardware and a little of
your time."
There's a very cool distributed file system project implemented in python
called Tahoe-LAFS (http://tahoe-lafs.org). The idea is that when you store
files in this file system, they actually get split into multiple pieces and
moved to servers "elsewhere" on the Internet. Even better, the splitting is
done in such a way that even if some of the pieces get lost or destroyed,
your data is still perfectly intact and retrievable at any time. Oh and all
of your pieces are encrypted so that no one who has access to the servers
hosting them can see your data.
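To give a feel for how pieces can be lost without losing data: Tahoe-LAFS uses Reed-Solomon-style erasure coding (via the zfec library) that survives the loss of several shares. A full implementation is beyond a toy example, but a minimal single-parity sketch in Python shows the core idea; the function names here are mine, not Tahoe's API, and this sketch tolerates only one lost piece rather than several.

```python
from functools import reduce

def split_with_parity(data, k):
    """Split data into k equal-size chunks plus one XOR parity chunk.

    Toy stand-in for erasure coding: any single lost piece can be
    rebuilt by XOR-ing the surviving pieces together.
    """
    chunk_len = -(-len(data) // k)               # ceiling division
    padded = data.ljust(k * chunk_len, b"\x00")  # pad to a multiple of k
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))
    return chunks + [parity]

def reconstruct(pieces):
    """Recover one missing piece (marked None) by XOR-ing the others."""
    missing = pieces.index(None)
    others = [p for p in pieces if p is not None]
    pieces[missing] = bytes(
        reduce(lambda a, b: a ^ b, col) for col in zip(*others)
    )
    return pieces
```

Real Tahoe-LAFS encoding goes further: it encrypts the file first, then produces n shares of which any k suffice to reconstruct it, so up to n - k servers can vanish without data loss.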
Where are these "other servers"? Well, they can be anywhere, but in this
case, the server grid that I'm using is provided by a group of volunteers,
all of whom want highly-reliable off-site backup storage. The cost of using
this grid is that you have to provide a backup server of your own, which
will host pieces of other people's data. We call it "volunteergrid2". The
two is because there's also a "volunteergrid1", which is working fine but
which I've found doesn't meet my requirements. Basically, I want to store a
lot more stuff than that grid is set up to accept, so I've been pushing for
this new grid.
Here are the basic requirements to join the grid:
1. You have to have a computer which is always on and always connected to
the Internet. In fact, you have to commit to keeping that machine up and
connected at least 95% of the time (no more than 8 hours per week or 1.5
days per month downtime). It's easier if the machine runs a Unix operating
system, though Windows works, too.
2. You have to provide at least 500 GB of storage to the grid.
3. You have to be able to open a couple of ports through your
router/firewall to expose that portion of your server to the Internet. If
you need help with this, the members of the grid can probably help.
4. You may consume as much storage from the grid as what you provide to
the grid, with one caveat and one issue to consider. The issue is that when
your files are split into pieces, depending on the splitting parameters you
choose (and we'll be glad to help you make good choices), your files will
get larger. So if your parameters cause your files to grow by a factor of
three, and you provide 500 GB of storage to the grid, then you can store about
170 GB of your data in the grid. The caveat is that there is a 1 TB cap. You may
not store more than 1 TB in the grid -- not without approval of the rest of
the grid members, anyway. The group makes decisions by consensus.
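The arithmetic above can be sketched in a few lines of Python. The function and parameter names are mine (k-of-n is the usual way of describing the encoding: any k of the n shares suffice to recover a file), and I'm assuming the 1 TB cap applies to the amount of your own data you store:

```python
def usable_storage_gb(provided_gb, k, n, cap_gb=1000):
    """How much of your own data you can store in the grid.

    With k-of-n encoding, every file expands by a factor of n / k.
    You may consume as much raw grid storage as you provide, so your
    own data is limited to provided_gb / (n / k), subject to the
    grid's 1 TB cap (assumed here to be a cap on stored data).
    """
    expansion = n / k
    return min(provided_gb / expansion, cap_gb)
```

For example, with shares that triple your files' size (say 3-of-9 encoding) and a 500 GB contribution, `usable_storage_gb(500, 3, 9)` gives roughly 167 GB, matching the figure above.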
So far, the grid has about 10 storage servers, each contributing 500 GB, for
a total of 5 TB. We'd like to get that up to at least 20 servers and 10-15
TB, and I'd really like to get to 40 servers and 30-40 TB, because with more
servers you can use more efficient splitting so you get the same level of
reliability but with less expansion.
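Why do more servers mean less expansion? To survive the loss of f servers you need n = k + f shares, so the expansion factor is (k + f) / k, which shrinks as k grows. A quick sketch (the numbers here are illustrative, not the grid's actual parameters):

```python
def expansion_for_tolerance(k, f):
    """Expansion factor of k-of-(k+f) encoding that survives f lost shares."""
    return (k + f) / k

# On a small grid you might use 3-of-10 encoding: it tolerates 7 lost
# shares but triples your files' size. On a larger grid, 10-of-17
# tolerates the same 7 losses at only 1.7x expansion.
for k in (3, 5, 10, 20):
    print(f"k={k}: {expansion_for_tolerance(k, 7):.2f}x")
```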
Over time, I expect we'll gradually raise the bar, requiring all
participants to provide at least 750 GB, then 1 TB, and so on, so that our
available storage keeps pace with the growth in common drive capacities.
If you are interested, you can find out more information at
http://bigpig.org. We have a subscription list which members use as their
main line of communication to other members. You can subscribe at
More information about the Python-list