[Baypiggies] reading files quickly and efficiently
mvoorhie at yahoo.com
Wed Nov 17 22:30:08 CET 2010
On Wednesday, November 17, 2010 01:18:34 pm Tony Cappellini wrote:
> Don't read the entire file into memory.
> readlines() does that.
> Take a look at Dave Beazley's slides on generators and how he
> processes multi-GB sized files.
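
A minimal sketch of the generator approach suggested above, assuming the input is a plain FASTA file. The function name read_fasta and the file name in the usage note are hypothetical, chosen only for illustration. Iterating over the file object pulls in one line at a time, so only the current record is held in memory.

```python
def read_fasta(path):
    """Lazily yield (header, sequence) tuples from a FASTA file.

    Hypothetical helper for illustration; it assumes each record starts
    with a '>' header line followed by one or more sequence lines.
    """
    header = None
    chunks = []
    with open(path) as handle:
        # A file object is its own line iterator, so the file is never
        # loaded whole (unlike readlines()).
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    # A new header closes the previous record.
                    yield header, "".join(chunks)
                header = line[1:]
                chunks = []
            elif header is not None:
                chunks.append(line)
    # Emit the final record, if the file contained any.
    if header is not None:
        yield header, "".join(chunks)
```

Usage would look something like: for header, seq in read_fasta("nr.fasta"): print(header, len(seq))
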
For NR, it can also be convenient to convert the FASTA to BLAST
database format (via formatdb or downloading the pre-generated
databases from NCBI) and extract sequences with fastacmd
(formatdb and fastacmd are both included in the NCBI BLAST