What strategy for random accession of records in massive FASTA file?

Robert Kern rkern at ucsd.edu
Thu Jan 13 19:41:45 EST 2005

Jeff Shannon wrote:

> (Plus, if this format might be used for RNA sequences as well as DNA 
> sequences, you've got at least a fifth base to represent, which means 
> you need at least three bits per base, which means only two bases per 
> byte (or else base-encodings split across byte-boundaries).... That gets 
> ugly real fast.)

Not to mention all the IUPAC symbols for incompletely specified bases 
(e.g. R = A or G).
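
To put a rough number on it: each IUPAC nucleotide symbol names a non-empty subset of {A, C, G, T}, so there are 15 of them and they fit in 4 bits, which still packs two per byte. Below is a minimal sketch of that idea, not anything from an existing library. The names pack, unpack, and IUPAC_MASK are made up for illustration. It assumes U can be folded into T (which loses the DNA/RNA distinction), that gap characters do not occur, and that the sequence length is stored somewhere else.

```python
# One bit per canonical base; an ambiguity code is the OR of the
# bases it allows (e.g. R = A or G).
A, C, G, T = 1, 2, 4, 8

IUPAC_MASK = {
    "A": A, "C": C, "G": G, "T": T,
    "R": A | G, "Y": C | T, "S": C | G, "W": A | T,
    "K": G | T, "M": A | C,
    "B": C | G | T, "D": A | G | T, "H": A | C | T, "V": A | C | G,
    "N": A | C | G | T,
}
MASK_IUPAC = {mask: sym for sym, mask in IUPAC_MASK.items()}


def pack(seq):
    """Pack a sequence into bytes, two 4-bit symbols per byte."""
    # Assumption: U is treated as T, so RNA and DNA become indistinguishable.
    codes = [IUPAC_MASK[ch] for ch in seq.upper().replace("U", "T")]
    if len(codes) % 2:
        codes.append(0)  # pad the last byte; 0 is not a valid symbol
    return bytes((hi << 4) | lo for hi, lo in zip(codes[::2], codes[1::2]))


def unpack(data, length):
    """Recover the first `length` symbols from packed bytes."""
    codes = []
    for byte in data:
        codes.append(byte >> 4)
        codes.append(byte & 0x0F)
    return "".join(MASK_IUPAC[c] for c in codes[:length])
```

Under those assumptions, unpack(pack(s), len(s)) gives back s in upper case with U replaced by T. The unused value 0 could stand for a gap if one is needed.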


Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter
