"HJL" == Hung Jung Lu <hungjunglu@yahoo.com> writes:
HJL> Again, I have a tangential question. I am hitting the
HJL> physical limit of the CPU (meaning things have been optimized
HJL> down to assembly level), in order to achieve even higher
HJL> performance, the only way to go is hardware.

HJL> Is there any recommendation for fast machines at the price
HJL> range of a few thousand dollars? (I cannot afford
HJL> supercomputers or connection machines.) My purpose is to run
HJL> Monte Carlo simulation. This means that a lot of scenarios
HJL> can be run in parallel fashion. Of course I can just use
HJL> regular cheap Pentium boxes... but they are kind of bulky,
HJL> and I don't need any of the video, audio, USB features (I
HJL> think 10 machines at 1GHz each would be the size of
HJL> calculation power I need, or equivalently, a single machine
HJL> at an equivalent 10GHz. Heck, if there are some specialized
HJL> racks/boxes, I can wire the motherboards myself.) I am
HJL> wondering what you people do for heavy number crunching? Are
HJL> there any cheap yet specialized machines? What about machines
HJL> with dual processor? I would imagine a lot of people in the
HJL> number crunching world run into my situation, and since the
HJL> number crunching machines don't require much beyond a
HJL> motherboard and a small hard-drive, maybe there are already
HJL> some cheap solutions out there.

The usual way is to build some "blackboxes", i.e. mobo/cpu/memory/NIC,
diskless or nearly diskless (you don't want to maintain machines :-).
Connect them using 100bT or faster networks (though 100bT should be
fine).

Do such things exist? Sort of -- they tend to be more expensive than
building them yourself, but if you've got a reliable local supplier,
they can build them fairly cheaply for you.

I'd go with single or dual athlons, myself :-). If power and
maintenance are an issue, duals, and if not, maybe singles.
We use MOSIX (www.mosix.org) for transparent load balancing between
Linux machines, and it could be used on the machines I described
(using a floppy or CD to boot).

The next question is whether some form of parallel RNG will help. The
answer is "maybe". I worked with a student who evaluated coupled
chains, and we couldn't do too much better.

After that, the question is whether you want to figure out how to
post-process the results.

If you want to automate the whole thing (and it isn't clear that it
would be worth it, but...), you could use PyPVM to front-end the
sub-processes distributed on the network, load-balanced at the system
level by MOSIX.

Now for the problems -- MOSIX seems to have difficulties with Python.
Severe difficulties. I don't know if that still holds true for recent
MOSIX releases.

(Note that I use R (www.r-project.org) for most of my simulation work
these days, but am looking at Python for stat analyses, of which MCMC
tools are of interest.)

best,
-tony

--
A.J. Rossini                         Rsrch. Asst. Prof. of Biostatistics
U. of Washington Biostatistics       rossini@u.washington.edu
FHCRC/SCHARP/HIV Vaccine Trials Net  rossini@scharp.org
-------------- http://software.biostat.washington.edu/ --------------
FHCRC: M-W: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email
UW:   T-Th: 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX
Rosen: (Mullins' Lab) Fridays, and I'm unreachable except by email.
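
For illustration only, here is a minimal Python sketch of the "many
independent scenarios in parallel" idea discussed in this thread. It is
not taken from the message: the pi-estimation scenario, the function
names estimate_pi and run_scenarios, and the seeding scheme (base seed
plus scenario index) are all assumptions made for the example. Giving
each scenario its own generator avoids shared RNG state, but distinct
integer seeds are only a simple convention, not a proof that the
streams are statistically independent. The sketch uses the standard
library on a single multi-CPU machine and does not involve MOSIX or
PyPVM.

```python
import random
from multiprocessing import Pool


def estimate_pi(job):
    """Run one independent scenario: estimate pi from random points.

    `job` is a (seed, n_samples) tuple so it can be passed through Pool.map.
    """
    seed, n_samples = job
    rng = random.Random(seed)  # private generator, no shared global state
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / n_samples


def run_scenarios(n_scenarios, n_samples, base_seed=12345, processes=None):
    """Run the scenarios and return (mean estimate, per-scenario estimates).

    Assumption for this sketch: scenario i is seeded with base_seed + i.
    processes=1 runs serially; otherwise a process pool is used
    (processes=None lets Pool pick one worker per CPU).
    """
    jobs = [(base_seed + i, n_samples) for i in range(n_scenarios)]
    if processes == 1:
        results = [estimate_pi(job) for job in jobs]
    else:
        with Pool(processes) as pool:
            results = pool.map(estimate_pi, jobs)
    return sum(results) / len(results), results


if __name__ == "__main__":
    mean, parts = run_scenarios(n_scenarios=10, n_samples=100_000)
    print("per-scenario estimates:", parts)
    print("combined estimate:", mean)
```

Because every scenario is fully described by its (seed, n_samples)
pair, the same jobs could in principle be handed to separate machines
instead of local processes, and the post-processing step reduces to
combining the returned numbers.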