mech422 at gmail.com
Thu Aug 31 00:39:08 CEST 2006
On 8/30/06, Shannon -jj Behrens <jjinux at gmail.com> wrote:
> parallelizable pieces, but I don't understand the admin side of it. My
> current data set is about 16 gigs, and I need to do things like run
> filters over strings, make sure strings are unique, etc. I'll be
> using Python wherever possible.
Sounds like fun :-)
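For the uniqueness check, one way to split the work might be to hash each
string to a node, so duplicates always land on the same box, and then dedupe
locally on each node. A rough sketch, assuming each node's share of the strings
fits in memory; shard_for, partition and unique_on_node are names made up for
illustration, not part of any cluster toolkit:

```python
import zlib
from collections import defaultdict


def shard_for(s, num_nodes):
    """Pick a node index for a string.

    Uses CRC32 rather than the built-in hash(), because hash() of a str
    can differ between Python processes and every node must agree.
    """
    return zlib.crc32(s.encode("utf-8")) % num_nodes


def partition(strings, num_nodes):
    """Group strings by the node that should own them."""
    shards = defaultdict(list)
    for s in strings:
        shards[shard_for(s, num_nodes)].append(s)
    return shards


def unique_on_node(strings, keep=lambda s: True):
    """Apply a filter and drop duplicates, preserving first-seen order."""
    seen = set()
    out = []
    for s in strings:
        if keep(s) and s not in seen:
            seen.add(s)
            out.append(s)
    return out


# Toy data standing in for the real 16 gig data set.
data = ["spam", "eggs", "spam", "", "ham", "eggs"]
shards = partition(data, num_nodes=3)
result = sorted(
    s for shard in shards.values() for s in unique_on_node(shard, keep=bool)
)
print(result)  # ['eggs', 'ham', 'spam']
```

Because identical strings always hash to the same shard, no node needs to talk
to another one to decide whether a string is a duplicate.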
> * Do I have to run a particular Linux distro? Do they all have to be
> the same, or can I just set up a daemon on each machine?
You can use just about any Linux distro - it's easier if all the 'compute'
nodes run the same distro. This allows you to boot the nodes via TFTP and
have only one 'compute root image' to juggle.
> * What does "Beowulf" do for me?
It's the basic cluster infrastructure.
> * How do I admin all the boxes without having to enter the same command n
TFTP boot with a single 'compute' image. There are also a bunch of cluster
admin tools - check freshmeat
(it also has tools for building images for cluster nodes).
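If you don't want a full admin tool, a small Python loop over ssh can cover the
"same command on every box" case. This is only a sketch, assuming key-based
ssh logins already work on every node; the host names and the helper names
ssh_argv and run_everywhere are hypothetical:

```python
import subprocess


def ssh_argv(host, command):
    """Build the ssh command line for one node.

    BatchMode=yes makes ssh fail instead of prompting for a password,
    so an unattended loop never hangs waiting for input.
    """
    return ["ssh", "-o", "BatchMode=yes", host, command]


def run_everywhere(hosts, command, runner=subprocess.run):
    """Run one command on each host in turn and collect the results.

    `runner` is injectable so the loop can be exercised without a cluster.
    Returns {host: (exit_status, stdout)}.
    """
    results = {}
    for host in hosts:
        proc = runner(ssh_argv(host, command), capture_output=True, text=True)
        results[host] = (proc.returncode, proc.stdout)
    return results


if __name__ == "__main__":
    # Hypothetical node names; replace with the real compute nodes.
    for host, (status, output) in run_everywhere(
        ["node01", "node02"], "uptime"
    ).items():
        print(host, status, output.strip())
```

This runs the nodes one after another, which is fine for a handful of boxes;
the dedicated cluster tools do the same thing in parallel.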
> * I've heard that MPI is good and standard. Should I use it? Can I
> use it with Python programs?
I've never worked with it - but it does appear to be the 'standard' for
> * Is there anything better than NFS that I could use to access the data?
Personally, I just s/NFS/Samba/ these days. Given some higher-end hardware,
you might want to look at GFS.