[Baypiggies] reading files quickly and efficiently

Glen Jarvis glen at glenjarvis.com
Wed Nov 17 22:35:02 CET 2010


Oops.. I meant SeqIO.. Thanks Brent!!! I was doing that quickly...


Cheers,

Glen

On Wed, Nov 17, 2010 at 1:24 PM, Brent Pedersen <bpederse at gmail.com> wrote:

> On Wed, Nov 17, 2010 at 1:13 PM, Glen Jarvis <glen at glenjarvis.com> wrote:
> > Biopython will also do all of this for you:
> > >>> from Bio import SeqIO
> > >>> record = SeqIO.read("NC_005816.fna", "fasta")
> > >>> record
> > SeqRecord(seq=Seq('TGTAACGAACGGTGCAATAGTGATCCACACCCAACGCCTGAAATCAGATCCAGG...CTG',
> > SingleLetterAlphabet()), id='gi|45478711|ref|NC_005816.1|',
> > name='gi|45478711|ref|NC_005816.1|',
> > description='gi|45478711|ref|NC_005816.1| Yersinia pestis biovar Microtus
> > ... sequence',
> > dbxrefs=[])
> >
> > You can also access particular fields (record.id, record.description, and
> > record.seq).
> >
> > Look at this tutorial:
> > http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc16
> >
> > Cheers,
> >
> > Glen
>
> i agree with glen that you should use a library. however, that example
> is for a single-entry fasta file. if you want random access to a
> multi-fasta, use the SeqIO.index in biopython:
> http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc56
>
> if you just want an iterator, use SeqIO.parse
> http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc11
>
> -brent
>
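
For anyone following along, Brent's two suggestions can be sketched roughly as
below. This is only a minimal sketch, assuming Biopython is installed. The
helper names count_records and fetch_sequence, and whatever path you pass in,
are made up for illustration and are not part of Biopython.

```python
from Bio import SeqIO  # third-party package: Biopython


def count_records(fasta_path):
    """Stream a multi-FASTA file with SeqIO.parse, one record at a time.

    count_records is a hypothetical helper name, not a Biopython function.
    """
    total = 0
    for record in SeqIO.parse(fasta_path, "fasta"):
        # Each record is a SeqRecord; only one is held in memory at a time.
        total += 1
    return total


def fetch_sequence(fasta_path, record_id):
    """Fetch a single record by id with SeqIO.index (random access).

    fetch_sequence is a hypothetical helper name, not a Biopython function.
    """
    index = SeqIO.index(fasta_path, "fasta")  # dict-like, keyed by record.id
    try:
        return str(index[record_id].seq)
    finally:
        index.close()  # release the underlying file handle
```

SeqIO.parse fits when you need to walk every record once. SeqIO.index fits
when you want a few records by id from a large file.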



-- 
Whatever you can do or imagine, begin it;
boldness has beauty, magic, and power in it.

-- Goethe