[Baypiggies] reading files quickly and efficiently

Mark Voorhies mvoorhie at yahoo.com
Wed Nov 17 22:30:08 CET 2010

On Wednesday, November 17, 2010 01:18:34 pm Tony Cappellini wrote:
> Don't read the entire file into memory.
> readlines() does that.
> Take a look at Dave Beazley's slides on generators and how he
> processes multi-GB sized files.
> http://www.dabeaz.com/generators/
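
A minimal sketch of the line-by-line approach described above, assuming
the input is a FASTA file whose records start with a ">" header line.
read_fasta is a hypothetical helper name used here for illustration; it
is not taken from the slides. Iterating over the file object pulls one
line at a time, so only the current record is held in memory.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs one record at a time.

    `handle` is any iterable of text lines, e.g. an open file.
    Assumes FASTA layout: a '>' header line followed by sequence lines.
    """
    header = None
    chunks = []
    for line in handle:  # file objects yield lines lazily
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line.strip())
    # Emit the final record once the input is exhausted.
    if header is not None:
        yield header, "".join(chunks)
```

To use it, open the file and loop over read_fasta(fh) inside a
"with open(path) as fh:" block, where path is whatever FASTA file you
have; each iteration gives one (header, sequence) pair.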

For NR, it can also be convenient to convert the FASTA to BLAST
database format (via formatdb or downloading the pre-generated
databases from NCBI) and extract sequences with fastacmd
(formatdb and fastacmd are both included in the NCBI BLAST

