[Tutor] code efficiency and biological databases
Oleksandr Moskalenko
malex@tagancha.org
Fri Apr 25 14:24:01 2003
Danny,
* Danny Yoo <dyoo@hkn.eecs.berkeley.edu> [2003-04-24 15:15:53 -0700]:
>
>
> On Thu, 24 Apr 2003 pan@uchicago.edu wrote:
>
> > Thx Danny for pointing out the rate limiting step in the code I
> > presented earlier.
>
> The computer scientist Alan Perlis once quipped: "Lisp programmers know
> the value of everything, and the cost of nothing." Let's make sure that
> that generalization doesn't apply so strongly to Python programmers.
> *grin*
>
>
> > I am heading toward the world of genome/evolution analysis
>
> Very cool! Yes, biologists often have to deal with enormous databases, so
> I think it can be effective to be aware of program efficiency.
>
> The Institute for Genomic Research (TIGR) keeps a repository of many
> genomes available on their FTP site; what's sorta neat is that a lot of
> their data is in XML. But what sorta sucks is that a lot of their data is
> in XML. *grin*
>
> If you're ever interested in the model organism 'Arabidopsis thaliana',
> you can check out a concrete example of a medium-sized dataset:
>
> ftp://ftp.tigr.org/pub/data/a_thaliana/ath1/BACS/
>
> I'm using the 'gzip' and 'pulldom' modules to open and parse out
> individual sections of each "Bacterial Artificial Chromosome" at work.
> But the library documentation on 'pulldom' is so laughably sparse at the
> moment --- I'm thinking of writing a small tutorial on it when I get the
> chance.
>
This would be a great tutorial to write! You have a supporting vote from
me.
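
For anyone following along, here is a minimal sketch of how the 'gzip' and
'pulldom' combination mentioned above might look. It is not Danny's actual
code. The element names (ASSEMBLY, GENE, NAME), the helper names and the
inline sample data are all made up for illustration and are not taken from
the TIGR files; a real BAC file would be opened by path with gzip.open()
instead of being built in memory.

```python
import gzip
import io
from xml.dom import pulldom

# Hypothetical stand-in for a compressed BAC file. The tag names below are
# assumptions for this sketch, not the real TIGR schema.
SAMPLE_XML = b"""<?xml version="1.0"?>
<ASSEMBLY>
  <GENE id="g1"><NAME>alpha</NAME></GENE>
  <GENE id="g2"><NAME>beta</NAME></GENE>
</ASSEMBLY>
"""


def iter_sections(stream, tag):
    """Yield one expanded DOM node per <tag> element, one at a time."""
    events = pulldom.parse(stream)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            # Build the DOM subtree for this element only, so the whole
            # document never has to be held in memory at once.
            events.expandNode(node)
            yield node


def gene_names(stream):
    """Collect (id, name) pairs from every GENE section in the stream."""
    names = []
    for gene in iter_sections(stream, "GENE"):
        name = gene.getElementsByTagName("NAME")[0]
        names.append((gene.getAttribute("id"), name.firstChild.data))
    return names


# Compress the sample in memory, then read it back through gzip the same
# way a .gz file on disk would be read.
compressed = gzip.compress(SAMPLE_XML)
with gzip.open(io.BytesIO(compressed), "rb") as stream:
    result = gene_names(stream)

print(result)
```

The point of pulldom here is that only the section currently being
examined is expanded into a DOM tree, which matters when the files get
large.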
>
> Sorry for being so off topic; I just like talking about my work... *grin*
> Talk to you later!
Alex.
--
The lyf so short, the craft so long to lerne.
-- Chaucer